How to create a sample using a probability distribution in R?


A probability distribution assigns a probability to each value a variable can take. For example, if a variable X takes the values 1, 2, and 3 with probabilities 0.25, 0.50, and 0.25 respectively, then the function that gives the probability of occurrence of each value of X is its probability distribution. In R, we can create a sample from a probability distribution either by supplying predefined probabilities for each value or by using known distributions such as Normal, Poisson, and Exponential. To create a sample from predefined probabilities, follow the steps below −

  • Create a vector of the values to sample from.
  • Draw the sample with the sample function, passing the probabilities through the prob argument.
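
As a minimal sketch of these two steps, the code below uses the three-value variable X from the introduction. The object names X, p, and draws, the seed, and the sample size of 1000 are illustrative assumptions, not part of the original examples.

```r
# Illustrative sketch: names, seed, and sample size are assumptions
set.seed(1)                      # fix the random number stream so the run is repeatable

X <- c(1, 2, 3)                  # step 1: the values to sample from
p <- c(0.25, 0.50, 0.25)         # the probability attached to each value

# step 2: draw 1000 values with replacement according to p
draws <- sample(X, size = 1000, replace = TRUE, prob = p)

# observed proportions; these should land close to 0.25, 0.50, 0.25
prop.table(table(draws))
```

Comparing the observed proportions with p is a quick sanity check that the probabilities were passed in the intended order.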

Example 1

Creating a vector x1 −

x1<-1:100
x1

On executing, the above script generates the output below −

[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
[55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
[91] 91 92 93 94 95 96 97 98 99 100

Creating the probability distribution

Use the sample function with the prob argument to draw 200 values from x1 with replacement, giving each of the 100 values a probability of 0.01 −

x1<-1:100
sample(x1,200,replace=TRUE,prob=rep(0.01,100))

Output (this will vary on your system due to randomization)

[1] 25 34 96 3 19 91 50 5 78 31 81 11 60 99 66 78 18 61
[19] 41 75 78 85 80 93 32 8 66 86 27 60 34 60 32 60 8 8
[37] 93 71 65 20 39 70 29 45 39 8 54 57 29 24 62 54 80 54
[55] 14 69 46 87 30 55 27 100 68 58 66 89 64 52 39 67 96 76
[73] 93 86 93 96 79 94 53 96 41 90 65 16 74 51 76 35 34 60
[91] 39 85 62 72 29 59 52 74 49 18 26 10 22 95 55 47 1 52
[109] 83 73 15 85 58 48 14 77 85 90 3 35 59 42 86 53 93 38
[127] 2 47 34 76 63 70 20 62 75 8 61 47 70 4 64 65 21 49
[145] 5 77 7 1 42 83 75 48 45 2 36 100 91 29 86 59 78 97
[163] 4 45 28 15 70 14 23 87 35 21 87 61 14 60 80 55 62 17
[181] 67 70 85 41 100 31 27 82 96 61 56 5 58 11 92 13 61 14
[199] 96 85
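
Because every value in Example 1 receives the same weight, the result is a discrete uniform distribution, which is also what sample uses when prob is omitted. The sketch below draws under both calls; the names s_explicit and s_default and the seed are assumptions added for illustration.

```r
# Illustrative sketch: object names and seed are assumptions
set.seed(42)

x1 <- 1:100

# equal weights given explicitly, as in Example 1
s_explicit <- sample(x1, 200, replace = TRUE, prob = rep(0.01, 100))

# no prob argument: sample() treats all values as equally likely
s_default <- sample(x1, 200, replace = TRUE)

# both samples follow the same uniform distribution,
# although the individual values drawn will generally differ
range(s_explicit)
range(s_default)
```

Spelling out prob only becomes necessary when the values should not be equally likely, as in the examples that follow.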

Example 2

Creating a vector x2 −

x2<-c(2,4,5,3,5,8,7,1,2,3,6,6,9,10)
x2

On executing, the above script generates the output below −

[1] 2 4 5 3 5 8 7 1 2 3 6 6 9 10

Use the sample function with the prob argument to draw 200 values from x2 with replacement, supplying one probability per element of x2 −

x2<-c(2,4,5,3,5,8,7,1,2,3,6,6,9,10)
sample(x2,200,replace=TRUE,prob=c(0.2,0.1,0.1,0.1,0.05,0.05,0.05,0.05,0.1,0.05,0.05,0.04,0.02,0.04))

Output (this will vary on your system due to randomization)

[1] 2 2 10 2 2 2 3 3 4 1 2 1 5 2 4 3 9 2 4 2 5 4 2 5 3
[26] 1 2 2 6 8 3 6 5 9 3 5 1 4 2 10 5 5 3 4 9 3 8 10 2 5
[51] 4 2 6 3 5 4 5 3 2 2 10 7 2 2 4 5 5 5 1 4 6 2 2 5 6
[76] 5 7 3 4 1 8 3 2 2 2 3 5 3 5 6 2 2 6 6 3 5 10 6 5 2
[101] 1 4 9 6 7 4 6 5 4 6 2 8 8 4 6 1 3 8 9 5 1 4 2 2 3
[126] 5 4 7 2 1 3 5 2 5 1 3 7 3 8 5 3 1 2 4 5 9 4 3 5 6
[151] 8 2 5 10 4 5 8 5 3 3 1 3 8 2 2 5 5 2 6 5 4 3 6 8 2
[176] 9 3 3 1 4 2 5 8 3 6 2 2 1 8 2 4 9 5 10 2 5 2 2 2 6
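
Note that x2 contains repeated values (2, 3, 5, and 6 each appear twice), and prob assigns a weight to each position rather than to each distinct value. The probability of drawing a given value is therefore the sum of the weights of its positions. A short sketch of that calculation, where the names p2 and effective are assumptions added for illustration:

```r
# Illustrative sketch: p2 and effective are hypothetical names
x2 <- c(2, 4, 5, 3, 5, 8, 7, 1, 2, 3, 6, 6, 9, 10)
p2 <- c(0.2, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05,
        0.1, 0.05, 0.05, 0.04, 0.02, 0.04)

# add up the weights of all positions that hold the same value
effective <- tapply(p2, x2, sum)
effective
```

For instance, the value 2 sits at two positions with weights 0.2 and 0.1, so it is drawn with probability 0.3, which is why it is the most frequent value in the output above.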

Example 3

Creating a vector x3 −

x3<-c(0.05,0.17,0.24,0.36,0.21)
x3

On executing, the above script generates the output below −

[1] 0.05 0.17 0.24 0.36 0.21

Use the sample function with the prob argument to draw 200 values from x3 with replacement −

x3<-c(0.05,0.17,0.24,0.36,0.21)
sample(x3,200,replace=TRUE,prob=c(0.24,0.26,0.20,0.10,0.10))

Output (this will vary on your system due to randomization)

[1] 0.17 0.17 0.17 0.21 0.36 0.36 0.21 0.17 0.24 0.17 0.24 0.21 0.21 0.21 0.21
[16] 0.17 0.24 0.05 0.17 0.21 0.17 0.05 0.24 0.05 0.24 0.17 0.36 0.21 0.21 0.24
[31] 0.17 0.21 0.24 0.17 0.17 0.17 0.05 0.05 0.17 0.17 0.05 0.05 0.17 0.24 0.17
[46] 0.21 0.05 0.24 0.24 0.17 0.36 0.36 0.05 0.21 0.17 0.24 0.05 0.05 0.17 0.24
[61] 0.17 0.21 0.17 0.05 0.24 0.21 0.17 0.21 0.05 0.17 0.21 0.17 0.17 0.24 0.17
[76] 0.05 0.05 0.21 0.36 0.36 0.05 0.17 0.17 0.17 0.24 0.17 0.21 0.05 0.05 0.21
[91] 0.24 0.24 0.36 0.17 0.36 0.05 0.17 0.17 0.05 0.24 0.24 0.05 0.17 0.05 0.05
[106] 0.24 0.05 0.05 0.05 0.05 0.05 0.17 0.17 0.21 0.36 0.24 0.17 0.05 0.24 0.17
[121] 0.17 0.24 0.24 0.05 0.24 0.17 0.17 0.21 0.36 0.17 0.17 0.24 0.21 0.24 0.05
[136] 0.24 0.36 0.36 0.05 0.05 0.17 0.17 0.17 0.05 0.17 0.05 0.21 0.05 0.24 0.05
[151] 0.24 0.05 0.05 0.05 0.24 0.17 0.05 0.05 0.17 0.24 0.05 0.05 0.24 0.05 0.36
[166] 0.17 0.05 0.17 0.17 0.17 0.36 0.17 0.21 0.17 0.05 0.05 0.05 0.17 0.05 0.17
[181] 0.17 0.17 0.21 0.05 0.24 0.05 0.17 0.17 0.21 0.05 0.21 0.05 0.05 0.17 0.17
[196] 0.05 0.05 0.24 0.21 0.17
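
The weights passed to prob in Example 3 add up to 0.90 rather than 1. The sample function accepts this because it treats prob as a vector of weights that need not sum to one and rescales them internally. The sketch below shows the probabilities that are effectively used; the names w3 and p3 are assumptions added for illustration.

```r
# Illustrative sketch: w3 and p3 are hypothetical names
x3 <- c(0.05, 0.17, 0.24, 0.36, 0.21)
w3 <- c(0.24, 0.26, 0.20, 0.10, 0.10)   # weights used in Example 3

sum(w3)                # total weight is 0.9, not 1

# dividing by the total gives the probabilities actually applied
p3 <- w3 / sum(w3)
round(p3, 3)
```

If the intention is for the stated numbers to be the exact probabilities, it is safer to make them sum to 1 before calling sample.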

Example 4

Creating a vector x4 −

x4<-c(102,81,39,122,97,109)
x4

On executing, the above script generates the output below −

[1] 102 81 39 122 97 109

Use the sample function with the prob argument to draw 200 values from x4 with replacement −

x4<-c(102,81,39,122,97,109)
sample(x4,200,replace=TRUE,prob=c(0.10,0.10,0.25,0.25,0.15,0.15))

Output (this will vary on your system due to randomization)

[1] 97 97 109 81 39 97 109 39 97 109 81 122 39 81 97 39 97 122
[19] 122 109 122 122 122 97 81 39 39 39 81 39 39 97 39 39 81 81
[37] 122 81 97 122 39 109 81 109 102 109 102 97 109 109 97 122 122 102
[55] 39 102 39 109 122 109 109 122 97 122 109 97 97 39 109 39 122 39
[73] 122 81 39 81 39 102 39 122 122 122 39 97 97 81 122 97 39 39
[91] 122 122 39 109 109 81 109 122 122 39 122 102 39 81 39 122 39 122
[109] 97 39 122 109 81 122 39 122 122 109 122 122 102 97 97 122 109 39
[127] 109 102 102 39 109 109 39 39 122 81 122 122 39 81 122 39 81 97
[145] 122 122 97 109 81 102 39 39 102 97 97 109 109 97 39 109 97 102
[163] 97 109 122 102 109 109 122 122 122 81 97 97 122 97 97 122 109 122
[181] 109 39 81 39 39 97 122 39 122 122 39 122 39 97 39 109 39 109
[199] 102 97
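
Since the output changes on every run, it can help to fix the random seed when a sample needs to be reproduced. The sketch below, which assumes an arbitrary seed of 2021 and the hypothetical names p4, first, and second, draws the Example 4 sample twice from the same seed.

```r
# Illustrative sketch: seed value and object names are assumptions
x4 <- c(102, 81, 39, 122, 97, 109)
p4 <- c(0.10, 0.10, 0.25, 0.25, 0.15, 0.15)

set.seed(2021)                                   # start from a known state
first <- sample(x4, 200, replace = TRUE, prob = p4)

set.seed(2021)                                   # reset to the same state
second <- sample(x4, 200, replace = TRUE, prob = p4)

identical(first, second)                         # TRUE: same seed, same sample
```

Without the calls to set.seed, the two draws would almost certainly differ.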

Example 5

Creating a vector x5 −

x5<-c(25,84,71,69,23)
x5

On executing, the above script generates the output below −

[1] 25 84 71 69 23

Use the sample function with the prob argument to draw 200 values from x5 with replacement −

x5<-c(25,84,71,69,23)
sample(x5,200,replace=TRUE,prob=c(0.25,0.25,0.10,0.20,0.20))

Output (this will vary on your system due to randomization)

[1] 69 69 69 25 71 84 23 25 25 25 84 25 69 23 71 23 25 23 84 69 84 23 25 84 84
[26] 69 23 25 25 69 69 84 23 23 69 69 71 25 25 25 69 23 84 25 84 23 84 25 84 69
[51] 69 69 25 69 69 23 69 71 84 84 84 69 69 69 69 23 84 25 84 25 84 25 23 69 84
[76] 25 25 71 71 23 23 69 25 69 84 84 25 71 71 23 69 25 84 69 23 25 25 84 25 84
[101] 69 25 69 84 23 23 84 25 23 25 69 84 69 23 25 84 25 25 25 69 23 25 69 23 23
[126] 69 84 23 69 71 84 69 23 25 25 84 25 71 69 84 25 84 84 25 25 25 25 25 23 69
[151] 23 71 23 25 84 84 84 84 23 25 84 23 25 25 84 84 23 71 69 23 25 84 25 23 69
[176] 84 84 25 25 23 23 69 84 25 69 71 23 84 71 23 23 25 69 71 69 23 25 84 69 25
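
The introduction also mentions sampling from known distributions such as Normal, Poisson, and Exponential. Base R provides rnorm, rpois, and rexp for these. The sketch below is only an illustration: the seed, the sample size of 200, the parameter values, and the object names are all assumptions.

```r
# Illustrative sketch: seed, parameters, and object names are assumptions
set.seed(123)

normal_draws  <- rnorm(200, mean = 0, sd = 1)   # Normal with mean 0 and sd 1
poisson_draws <- rpois(200, lambda = 3)         # Poisson with mean 3
exp_draws     <- rexp(200, rate = 0.5)          # Exponential with rate 0.5

# look at the first few values of each sample
head(normal_draws)
head(poisson_draws)
head(exp_draws)
```

Unlike sample, these functions do not need a vector of values or a prob argument, because the distribution is fully described by its parameters.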

Updated on: 13-Aug-2021
