# How to create a sample from an R data frame if weights are assigned to the row values?

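To draw a random sample in R, we can use the sample function. If weights are provided for the values, we pass them to its prob argument so that rows with larger weights are more likely to be selected. The weights do not need to sum to 1, but they must be non-negative and not all zero. For example, if we have a data frame df that contains a column x with some values and another column weight_x with the corresponding weights, then a random sample of size 10 can be generated with `df[sample(seq_len(nrow(df)), 10, prob = df$weight_x), ]`. The examples that follow apply the same call to a 20-row data frame with different sample sizes.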
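The definition of df is not shown in this article, so the following is only a minimal sketch of the weighted-sampling call on stand-in data. The seed, the generated values, and the helper names `idx` and `sampled` are assumptions made for illustration, and the result will not match the outputs printed here.

```r
# Hypothetical data: the article's own df is not shown, so these values are
# stand-ins and will not reproduce the outputs printed in this article.
set.seed(123)                                   # assumed seed, for repeatability only
x <- rnorm(20, mean = 5, sd = 1)                # 20 arbitrary values
weight_x <- sample(1:10, 20, replace = TRUE)    # integer weights between 1 and 10
df <- data.frame(x, weight_x)

# Pick 10 distinct row indices (sample() draws without replacement by default).
# prob takes the raw weights; they do not have to sum to 1.
idx <- sample(seq_len(nrow(df)), size = 10, prob = df$weight_x)

# Subset the data frame by the chosen row indices, keeping all columns.
sampled <- df[idx, ]
print(sampled)
```

Sampling row indices instead of the x column directly keeps each value together with its weight in the result.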



## Example

df[sample(seq_len(nrow(df)), 5, prob = df$weight_x), ]

## Output

  x weight_x
11 5.257177 10
19 5.401021 9
13 5.334041 10
10 4.416107 6
5 6.593158 2



## Example

df[sample(seq_len(nrow(df)), 7, prob = df$weight_x), ]

## Output

  x weight_x
9 4.504645 10
19 5.401021 9
12 5.836453 1
5 6.593158 2
15 3.406828 7
11 5.257177 10
6 4.298533 10



## Example

df[sample(seq_len(nrow(df)), 9, prob = df$weight_x), ]

## Output

  x weight_x
8 4.136517 5
11 5.257177 10
7 6.196574 4
4 5.980315 8
9 4.504645 10
6 4.298533 10
19 5.401021 9
18 4.820102 10
16 4.149746 2



## Example

df[sample(seq_len(nrow(df)), 15, prob = df$weight_x), ]

## Output

  x weight_x
3 5.768463 10
15 3.406828 7
19 5.401021 9
16 4.149746 2
9 4.504645 10
8 4.136517 5
11 5.257177 10
10 4.416107 6
18 4.820102 10
6 4.298533 10
4 5.980315 8
17 4.657464 4
1 4.126636 10
20 6.718216 6
13 5.334041 10



## Example

df[sample(seq_len(nrow(df)), 12, prob = df$weight_x), ]

## Output

  x weight_x
1 4.126636 10
3 5.768463 10
8 4.136517 5
11 5.257177 10
10 4.416107 6
6 4.298533 10
13 5.334041 10
4 5.980315 8
20 6.718216 6
12 5.836453 1
18 4.820102 10
19 5.401021 9

## Example

df[sample(seq_len(nrow(df)), 18, prob = df$weight_x), ]


## Output

 x weight_x
5 6.593158 2
4 5.980315 8
6 4.298533 10
20 6.718216 6
15 3.406828 7
3 5.768463 10
9 4.504645 10
10 4.416107 6
13 5.334041 10
19 5.401021 9
8 4.136517 5
11 5.257177 10
18 4.820102 10
1 4.126636 10
7 6.196574 4
12 5.836453 1
17 4.657464 4
16 4.149746 2
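To see how the weights influence selection, one rough check is to draw a single index many times with replacement and compare the observed shares with the shares the weights imply. The sketch below uses four made-up weights and an assumed seed, not the data from this article.

```r
# Hypothetical weights for four rows; not taken from the article's df.
set.seed(42)                       # assumed seed, for repeatability only
w <- c(1, 2, 3, 4)

# Draw one row index per trial, many times, using the weights as prob.
draws <- sample(seq_along(w), size = 100000, replace = TRUE, prob = w)

# Observed share of each index next to the share implied by the weights.
observed <- as.numeric(table(factor(draws, levels = seq_along(w)))) / length(draws)
expected <- w / sum(w)
print(round(cbind(expected, observed), 3))
```

With replacement, each draw picks an index with probability equal to its weight divided by the sum of the weights. Without replacement, as in the examples above, the weights are applied again to the remaining rows after each pick, so the shares over a whole sample are only approximately proportional to the weights.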

Updated on: 07-Nov-2020
