How to find the number of unique values for each column in a data.table object in R?


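To find the number of unique values in each column of a data.table object, we can use the uniqueN function together with lapply. For example, if we have a data.table object called DT with five columns, each containing some duplicate values, then the number of unique values in each of these columns can be found with DT[,lapply(.SD,uniqueN)]. Here .SD stands for "Subset of Data"; when no grouping is used it holds every column of DT, so lapply applies uniqueN to each column in turn and the result is a one-row data.table of counts.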
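
As a minimal sketch of the call described above, the snippet below uses a small hand-written table so that the counts can be checked by eye. The table DT and its column names id, grp and flag are hypothetical and chosen only for illustration.

```r
library(data.table)

# Hypothetical table with hand-picked values so the counts are easy to verify
DT <- data.table(
  id   = c(1, 1, 2, 2, 3),           # 3 distinct values
  grp  = c("x", "x", "x", "y", "y"), # 2 distinct values
  flag = rep(TRUE, 5)                # 1 distinct value
)

# .SD holds all columns of DT; lapply() calls uniqueN() on each one
counts <- DT[, lapply(.SD, uniqueN)]
print(counts)
```

The result is a data.table with one row and one column per original column, which keeps it easy to combine with other data.table operations.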

Example

Consider the data.table object below −

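library(data.table)  # provides data.table() and uniqueN()
# Two columns of 20 Poisson draws each; no seed is set, so values differ between runs
x1<-rpois(20,2)
x2<-rpois(20,5)
DT1<-data.table(x1,x2)
DT1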

Output

    x1 x2
 1:  3 11
 2:  1 10
 3:  3  5
 4:  0  1
 5:  0  7
 6:  2  5
 7:  2  4
 8:  3  6
 9:  2  4
10:  4  7
11:  1  6
12:  0  7
13:  2  5
14:  3  2
15:  2  2
16:  1  9
17:  1  2
18:  1  7
19:  2  7
20:  4  5

Finding the number of unique values in each column of DT1 −

Example

DT1[,lapply(.SD,uniqueN)]

Output

   x1 x2
1:  5  9
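
One way to sanity-check such counts is to compare uniqueN with base R's length(unique()). The sketch below does this on a hypothetical table DT_small that reuses only the first five rows of DT1 as printed above, and it swaps lapply for sapply so that the counts come back as a named vector instead of a one-row data.table.

```r
library(data.table)

# Hypothetical small table: first five rows of DT1 as printed above
DT_small <- data.table(x1 = c(3, 1, 3, 0, 0), x2 = c(11, 10, 5, 1, 7))

# sapply() in j returns a named vector rather than a data.table
n_dt <- DT_small[, sapply(.SD, uniqueN)]

# Base R equivalent, applied column by column
n_base <- sapply(DT_small, function(col) length(unique(col)))

print(n_dt)
print(n_base)
```

Both approaches should agree; the sapply form can be handy when the counts are needed as a plain vector, for example for sorting or plotting.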

Example

# Standard normal draws rounded to one decimal place, so some values repeat
y1<-round(rnorm(20),1)
y2<-round(rnorm(20),1)
DT2<-data.table(y1,y2)
DT2

Output

      y1   y2
 1:  1.0 -0.5
 2: -1.1  0.5
 3:  0.0  0.4
 4: -1.0  0.1
 5: -1.0 -1.4
 6:  0.4 -0.7
 7:  0.6 -0.2
 8:  0.0 -0.3
 9:  0.0  0.6
10: -0.2 -0.2
11: -0.2  1.8
12:  0.8  0.7
13:  0.5  0.6
14: -1.6 -0.4
15:  0.1 -0.2
16:  0.6 -1.3
17:  0.0  0.8
18:  1.4 -0.6
19:  0.5 -0.2
20:  0.9 -0.7

Finding the number of unique values in each column of DT2 −

Example

DT2[,lapply(.SD,uniqueN)]

Output

   y1 y2
1: 13 15

Example

# Four columns of uniform draws between 2 and 5, rounded to one decimal place
z1<-round(runif(20,2,5),1)
z2<-round(runif(20,2,5),1)
z3<-round(runif(20,2,5),1)
z4<-round(runif(20,2,5),1)
DT3<-data.table(z1,z2,z3,z4)
DT3

Output

     z1  z2  z3  z4
 1: 3.3 3.2 4.6 3.4
 2: 4.1 4.4 2.9 2.7
 3: 2.3 4.4 4.6 3.6
 4: 5.0 3.6 2.6 2.6
 5: 4.2 4.1 2.8 4.2
 6: 3.7 4.4 2.9 3.1
 7: 3.1 3.1 2.0 4.6
 8: 4.7 2.7 3.5 5.0
 9: 2.1 3.0 4.0 3.7
10: 2.3 2.5 3.2 2.7
11: 4.1 2.1 2.7 2.3
12: 2.4 2.7 4.2 3.2
13: 4.4 3.7 3.5 4.3
14: 3.7 3.1 3.3 3.3
15: 4.3 4.1 4.4 3.4
16: 3.9 2.7 2.9 3.6
17: 2.1 3.6 2.2 4.1
18: 3.0 3.6 2.3 3.4
19: 4.1 3.3 4.3 4.5
20: 2.4 3.4 3.7 3.6

Finding the number of unique values in each column of DT3 −

Example

DT3[,lapply(.SD,uniqueN)]

Output

   z1 z2 z3 z4
1: 14 12 16 15
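
When only some columns are of interest, the same call can be limited with data.table's .SDcols argument, which controls which columns end up in .SD. The sketch below assumes a hypothetical table DT_part built from the first four rows of DT3 as printed above, and counts unique values in z1 and z3 only.

```r
library(data.table)

# Hypothetical table: first four rows of DT3 as printed above
DT_part <- data.table(
  z1 = c(3.3, 4.1, 2.3, 5.0),
  z2 = c(3.2, 4.4, 4.4, 3.6),
  z3 = c(4.6, 2.9, 4.6, 2.6),
  z4 = c(3.4, 2.7, 3.6, 2.6)
)

# .SDcols restricts .SD to the named columns before lapply() runs
subset_counts <- DT_part[, lapply(.SD, uniqueN), .SDcols = c("z1", "z3")]
print(subset_counts)
```

Restricting .SD this way avoids computing counts for columns that are not needed, which may matter on wide tables.
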
raja
Published on 16-Mar-2021 12:32:11