Why should we use set.seed in R?


set.seed makes the results of randomization reproducible. Whenever we randomly select observations in R, or in any other statistical software, each run generally produces different values, because the random number generator starts from a different state each time. To keep the values produced by the first random selection, we can either store them in an object after the randomization, or fix the state of the random number generator with set.seed so that the same code gives the same results every time.
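
The first option, storing the result in an object, could look like the minimal sketch below. The object names (picked, first_use, second_use) are hypothetical and chosen only for illustration.

```r
# Draw a random permutation of 1 to 10 once and keep it in an object.
picked <- sample(1:10)

# Reusing the object does not draw again, so the values stay the same.
first_use  <- picked
second_use <- picked
identical(first_use, second_use)  # TRUE
```

Note that this only preserves the values for as long as the object exists. Running the script again from the start draws a new sample, which is where set.seed comes in.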

Example

Randomization without set.seed

> sample(1:10)
[1] 4 10 5 3 1 6 9 2 8 7
> sample(1:10)
[1] 1 4 2 5 8 3 7 9 6 10
> sample(1:10)
[1] 6 3 9 5 10 2 7 1 8 4

Here we called sample(1:10) three times. Each call returns a random permutation of the numbers 1 to 10, and the order is different every time.

Randomization with set.seed

> set.seed(99)
> sample(1:10)
[1] 6 2 10 7 4 5 3 1 8 9
> set.seed(99)
> sample(1:10)
[1] 6 2 10 7 4 5 3 1 8 9
> set.seed(99)
> sample(1:10)
[1] 6 2 10 7 4 5 3 1 8 9

Since we called set.seed with the same value (99) before each of the three samples, we obtained the same values every time.
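
The seed fixes the whole sequence of random draws that follows it, not just the next one, so it is usually enough to call set.seed once at the top of a script. The sketch below illustrates this under that assumption. The seed value 99 is taken from the example above and the object names are hypothetical.

```r
# Set the seed once, then make two draws in a row.
set.seed(99)
run1_a <- sample(1:10)
run1_b <- sample(1:10)

# Reset to the same seed and repeat the same two draws.
set.seed(99)
run2_a <- sample(1:10)
run2_b <- sample(1:10)

identical(run1_a, run2_a)  # TRUE: the first draws match
identical(run1_b, run2_b)  # TRUE: the second draws match as well
```

The exact values produced for a given seed can depend on the R version and the random number generator settings, so the comparison here is between two runs in the same session rather than against fixed numbers.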

Updated on: 06-Jul-2020
