Find the missing numbers in a sequence in an R data frame column.


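To find the missing numbers in a sequence in an R data frame column, we can use the setdiff function. setdiff(x, y) returns the elements of x that are not in y, so passing the full sequence as the first argument and the column as the second returns the values of the sequence that are absent from the column.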

For example, if we have a data frame called df that contains a column X and we want to check which values from 1 to 20 are missing from this column, we can use the following command:

setdiff(1:20,df$X)
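
As a quick illustration of this command, here is a minimal sketch that assumes a small hypothetical data frame df whose column X holds made-up values (they do not come from the examples in this article):

```r
# Hypothetical data frame; the values are made up for illustration
df <- data.frame(X = c(1, 2, 4, 5, 7, 10, 12, 15, 18, 20))

# Values in 1:20 that do not appear in df$X
missing_values <- setdiff(1:20, df$X)
print(missing_values)  # 3 6 8 9 11 13 14 16 17 19
```

The order of the arguments matters: setdiff(df$X, 1:20) would instead return the values of X that fall outside 1 to 20.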

Example 1

The following snippet creates a sample data frame:

# Draw 20 random values from a Poisson distribution with mean 5
x <- rpois(20, 5)
df1 <- data.frame(x)
df1

The following data frame is created (rpois generates random values, so your data will differ):

   x
1  4
2  4
3  5
4  5
5  6
6  6
7  7
8  7
9  5
10 7
11 4
12 5
13 5
14 4
15 4
16 7
17 7
18 1
19 7
20 5

To find the numbers between 1 and 10 that are missing from column x of the data frame created above, add the following code to the above snippet:

# Reuse df1 from above; calling rpois() again would draw new random values
setdiff(1:10, df1$x)

Output

For the data frame shown above, this produces the following output (a fresh run draws different random values, so your result may differ):

[1]  2  3  8  9 10
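
Because rpois() draws new values on every run, the exact output above can only be reproduced from the values printed in the table. A minimal sketch, assuming we enter those 20 values by hand (the names x_fixed and df1_fixed are chosen only for illustration):

```r
# Values copied from the data frame printed in Example 1
x_fixed <- c(4, 4, 5, 5, 6, 6, 7, 7, 5, 7, 4, 5, 5, 4, 4, 7, 7, 1, 7, 5)
df1_fixed <- data.frame(x = x_fixed)

# Numbers from 1 to 10 that never occur in the column
missing_x <- setdiff(1:10, df1_fixed$x)
print(missing_x)  # 2 3 8 9 10
```

Duplicates in the column do not matter here, since setdiff only checks whether each value of the sequence occurs at least once.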

Example 2

The following snippet creates a sample data frame:

y <- c(1:3, 5:10, 21:30, 35)
df2 <- data.frame(y)
df2

The following data frame is created:

    y
1   1
2   2
3   3
4   5
5   6
6   7
7   8
8   9
9  10
10 21
11 22
12 23
13 24
14 25
15 26
16 27
17 28
18 29
19 30
20 35

To find the numbers between 1 and 35 that are missing from column y of the data frame created above, add the following code to the above snippet:

# df2 was created above, so only the setdiff call needs to be added
setdiff(1:35, df2$y)

Output

Running the snippets above as a single program produces the following output:

[1]  4 11 12 13 14 15 16 17 18 19 20 31 32 33 34
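
In this example the bounds 1 and 35 happen to be the smallest and largest values in the column. As a variation, the sequence can be built from the column itself instead of hard-coding the bounds. This is only a sketch of one possible approach, and the names full_range and missing_y are illustrative:

```r
y <- c(1:3, 5:10, 21:30, 35)
df2 <- data.frame(y)

# Build the full sequence from the smallest to the largest observed value
full_range <- seq(min(df2$y), max(df2$y))

# Values inside that range that do not occur in the column
missing_y <- setdiff(full_range, df2$y)
print(missing_y)  # 4 11 12 13 14 15 16 17 18 19 20 31 32 33 34
```

If the column may contain NA values, min() and max() need na.rm = TRUE, otherwise they return NA and seq() fails.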

Example 3

The following snippet creates a sample data frame:

# Draw 20 distinct values at random from 1 to 100
z <- sample(1:100, 20)
df3 <- data.frame(z)
df3

The following data frame is created (sample generates random values, so your data will differ):

    z
1  84
2   7
3  40
4  87
5   9
6  51
7   3
8  97
9  78
10 69
11 26
12  4
13 61
14 99
15 91
16 81
17 48
18 47
19 80
20 22

To find the numbers between 1 and 100 that are missing from column z of the data frame created above, add the following code to the above snippet:

# Reuse df3 from above; calling sample() again would draw new random values
setdiff(1:100, df3$z)

Output

For the data frame shown above, this produces the following output (a fresh run draws different random values, so your result will differ):

 [1]   1   2   5   6   8  10  11  12  13  14  15  16  17  18  19  20  21  23  24
[20]  25  27  28  29  30  31  32  33  34  35  36  37  38  39  41  42  43  44  45
[39]  46  49  50  52  53  54  55  56  57  58  59  60  62  63  64  65  66  67  68
[58]  70  71  72  73  74  75  76  77  79  82  83  85  86  88  89  90  92  93  94
[77]  95  96  98 100
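
To make a random example like this one repeatable, the random number generator can be seeded before sampling. The sketch below assumes an arbitrary seed of 123, chosen only for illustration, so the values it draws will not match the table above:

```r
set.seed(123)  # arbitrary seed, an assumption made only for this sketch
z <- sample(1:100, 20)
df3 <- data.frame(z)

missing_z <- setdiff(1:100, df3$z)

# sample() draws without replacement by default, so the 20 values are
# distinct and exactly 80 of the 100 numbers are missing
length(missing_z)
```

Counting the result with length() is a quick sanity check: the number of missing values plus the number of distinct values in the column should equal the length of the sequence.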

Updated on: 03-Nov-2021
