How to deal with error “undefined columns selected” while subsetting data in R?


When we subset a data frame with single square brackets, we need to be careful about where the comma goes. To select rows by a condition on a column, the condition must be placed before the comma, as in df[condition, ]. If the comma is left out, R treats the index as a selection of columns, and the “undefined columns selected” error occurs when that index refers to columns that do not exist. Check out the examples below to understand how it works.

Consider the below data frame −

Example

x1<-rpois(20,5)
x2<-rpois(20,2)
df1<-data.frame(x1,x2)
df1

Output

   x1 x2
1  7  0
2  6  4
3  5  3
4  6  1
5  3  0
6  4  1
7  6  1
8  5  1
9  7  3
10 4  2
11 6  3
12 9  2
13 5  2
14 0  0
15 7  4
16 7  3
17 6  2
18 6  3
19 4  3
20 3  3

df1[which(x1>5)]

Error in `[.data.frame`(df1, which(x1 > 5)) : undefined columns selected

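Here the error occurs because there is no comma inside the square brackets, so R treats the values returned by which(x1>5) as column numbers. df1 has only two columns, so values such as 4, 7 and 9 refer to columns that do not exist.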
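
The behaviour described above can be reproduced on a small scale. The following is a minimal sketch that uses a hypothetical four-row data frame with fixed values (not the random data shown above) and tryCatch() to capture the error message instead of stopping the script. The message text is assumed to be in English; it may differ in a translated R session.

```r
# Hypothetical fixed data so that the result is reproducible
df <- data.frame(x1 = c(7, 6, 5, 3), x2 = c(0, 4, 3, 0))

# Positions where x1 is greater than 4: 1, 2 and 3
idx <- which(df$x1 > 4)

# No comma: idx is read as column numbers, but df has only two columns,
# so column 3 does not exist and an error is raised
msg <- tryCatch(df[idx], error = function(e) conditionMessage(e))
print(msg)

# Comma after the index: idx is read as row numbers
rows <- df[idx, ]
print(rows)
```

With the comma in place, the same index returns the three matching rows with both columns.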

Subsetting df1 based on x1 values that are greater than 5 −

Example

df1[which(x1>5),]

Output

   x1 x2
1  7  0
2  6  4
4  6  1
7  6  1
9  7  3
11 6  3
12 9  2
15 7  4
16 7  3
17 6  2
18 6  3
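
The same row subset can be written in a few other ways. The sketch below, again on a hypothetical fixed data frame, compares which() with plain logical indexing and with subset(). It refers to the column as df$x1 rather than a free-standing vector x1, which is an assumption made here so that the code does not depend on a variable in the global environment.

```r
# Hypothetical fixed data so that the result is reproducible
df <- data.frame(x1 = c(7, 6, 5, 3), x2 = c(0, 4, 3, 0))

# Row positions from which(), followed by a comma
by_which <- df[which(df$x1 > 5), ]

# Logical vector used directly as the row index
by_logical <- df[df$x1 > 5, ]

# subset() evaluates the condition inside the data frame
by_subset <- subset(df, x1 > 5)

print(by_which)
print(all.equal(by_which, by_logical))
print(all.equal(by_which, by_subset))
```

When the column contains no missing values, all three forms select the same rows. With NA values present, logical indexing returns extra NA rows, while which() and subset() drop them.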

Consider another data frame −

Example

y1<-rnorm(20)
y2<-rnorm(20)
y3<-rnorm(20)
y4<-rnorm(20)
df2<-data.frame(y1,y2,y3,y4)
df2

Output

             y1            y2            y3            y4
1  -1.409601559   -0.40597308   -0.64615777   -1.22078887
2   0.266717714   -0.57865012    0.76654025   -1.76430465
3  -0.064943594   -1.82008803   -1.10213671   -1.40020872
4   0.809783619   -0.08933758   -0.20752297   -1.11327480
5  -0.034361207   -1.45135447   -1.16066436    0.01539031
6  -0.082024227   -1.96856577   -0.09511484    0.77846417
7   0.259362498   -0.09326561   -0.40534748    0.39772236
8   1.116127337    0.80943746   -1.01315198   -0.60320454
9   0.236156881    0.48847386   -0.72174393   -0.29582895
10  1.762595310   -0.54977615   -0.90530123    0.65145594
11 -0.321092438    0.63080804   -0.76475103   -0.30353104
12  0.020150610    1.59757420   -0.75559972   -1.96075329
13  0.164084351   -0.11924416   -0.72052393    0.14890162
14  0.658193888   -1.32640467   -0.06000406   -0.89518512
15  1.230021633   -0.73053679    2.28237747    0.24679498
16 -0.530892825   -0.69954922   -0.98488545    0.37360026
17  0.563701048    0.67395747   -0.38809559    3.50620870
18  0.001154061   -0.19090813    0.49855009    0.56542930
19  1.821508804   -0.42088642    0.75174472   -0.93212634
20 -0.118279565    1.16474884   -0.60869426    0.95720193

Subsetting df2 based on y2 values that are less than 0.5 −

Example

df2[which(y2<0.5),]

Output

             y1            y2            y3            y4
1  -1.409601559   -0.40597308   -0.64615777   -1.22078887
2   0.266717714   -0.57865012    0.76654025   -1.76430465
3  -0.064943594   -1.82008803   -1.10213671   -1.40020872
4   0.809783619   -0.08933758   -0.20752297   -1.11327480
5  -0.034361207   -1.45135447   -1.16066436    0.01539031
6  -0.082024227   -1.96856577   -0.09511484    0.77846417
7   0.259362498   -0.09326561   -0.40534748    0.39772236
9   0.236156881    0.48847386   -0.72174393   -0.29582895
10  1.762595310   -0.54977615   -0.90530123    0.65145594
13  0.164084351   -0.11924416   -0.72052393    0.14890162
14  0.658193888   -1.32640467   -0.06000406   -0.89518512
15  1.230021633   -0.73053679    2.28237747    0.24679498
16 -0.530892825   -0.69954922   -0.98488545    0.37360026
18  0.001154061   -0.19090813    0.49855009    0.56542930
19  1.821508804   -0.42088642    0.75174472   -0.93212634
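
A missing comma does not always produce an error. If every value in the index happens to be a valid column number, R silently returns columns instead of rows. The following sketch illustrates this with a hypothetical three-column data frame; the names and values are made up for the illustration.

```r
# Hypothetical fixed data: three rows, three columns
df <- data.frame(y1 = c(0.2, -1.4, 0.8),
                 y2 = c(-0.4, 0.9, 1.2),
                 y3 = c(1, 2, 3))

# Only the first row has y2 below 0.5, so idx is 1
idx <- which(df$y2 < 0.5)

# No comma: idx selects column 1 and keeps every row, with no error
wrong <- df[idx]

# With the comma: idx selects row 1 and keeps every column
right <- df[idx, ]

print(dim(wrong))
print(dim(right))
```

Checking dim() or names() of the result is a quick way to confirm that rows, not columns, were selected.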

Updated on: 11-Feb-2021
