How to create a subset of a data frame in R without using column names?


A data frame can be subset by column name or by column number (position). The column numbers may be consecutive, as in 1:3, or non-consecutive, as in c(1,3). For example, if a data frame df contains the columns x, y and z, in that order, then df[,c(1,3)] returns a subset containing x and z.

Example

Consider the data frame below:

> set.seed(191)
> x1<-rnorm(20,1)
> x2<-rnorm(20,5)
> x3<-rnorm(20,2)
> x4<-rnorm(20,4)
> df1<-data.frame(x1,x2,x3,x4)
> df1

Output

     x1        x2        x3        x4
1 0.8464828 5.517463 1.3510192 3.879824
2 1.7157414 4.902044 1.7288418 4.915879
3 2.0612258 5.343704 3.4476224 3.198662
4 0.9817547 5.310376 0.7360361 4.191265
5 1.3137032 4.690344 1.8930611 3.195032
6 3.2946391 5.356714 0.7507614 2.762971
7 1.1292996 3.956172 1.3893677 3.472453
8 0.5938585 3.524826 2.4999638 3.442268
9 2.5721891 3.986746 2.1758887 3.065743
10 0.3154647 2.602883 2.2014771 4.111108
11 0.6326024 6.630669 2.4982478 2.310966
12 1.9772099 4.863338 3.0983665 3.976421
13 2.4442273 3.390198 3.7922736 3.743440
14 1.1505010 4.512891 2.7232374 3.528800
15 2.2532166 4.969238 2.1687148 3.691669
16 0.5104193 4.440487 1.9766220 4.120722
17 0.9377628 2.559686 3.1919780 2.755742
18 -0.3147257 4.919251 3.0462375 2.625914
19 0.3678290 4.088426 3.3926200 3.797904
20 2.0272953 4.151505 3.1796609 2.771270

Subsetting the columns of data frame df1 by column number:

Example

> df1[,1]

Output

 [1]  0.8464828  1.7157414  2.0612258  0.9817547  1.3137032  3.2946391
 [7]  1.1292996  0.5938585  2.5721891  0.3154647  0.6326024  1.9772099
[13]  2.4442273  1.1505010  2.2532166  0.5104193  0.9377628 -0.3147257
[19]  0.3678290  2.0272953
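
Note that df1[,1] returns a plain numeric vector rather than a data frame, because R drops the data frame structure by default when a single column is selected. The following is a minimal sketch of how to keep the result as a one-column data frame. It rebuilds df1 with the same code as above so that it runs on its own; the names v, d and l are only illustrative.

```r
# Rebuild the example data frame so this block is self-contained
set.seed(191)
x1 <- rnorm(20, 1)
x2 <- rnorm(20, 5)
x3 <- rnorm(20, 2)
x4 <- rnorm(20, 4)
df1 <- data.frame(x1, x2, x3, x4)

v <- df1[, 1]                # numeric vector: the data frame structure is dropped
d <- df1[, 1, drop = FALSE]  # one-column data frame
l <- df1[1]                  # list-style indexing also keeps a data frame

is.data.frame(v)
is.data.frame(d)
is.data.frame(l)
```

Using drop = FALSE is usually the safer choice inside functions, where the number of selected columns may vary.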

Example

> df1[,1:2]

Output

       x1      x2
1 0.8464828 5.517463
2 1.7157414 4.902044
3 2.0612258 5.343704
4 0.9817547 5.310376
5 1.3137032 4.690344
6 3.2946391 5.356714
7 1.1292996 3.956172
8 0.5938585 3.524826
9 2.5721891 3.986746
10 0.3154647 2.602883
11 0.6326024 6.630669
12 1.9772099 4.863338
13 2.4442273 3.390198
14 1.1505010 4.512891
15 2.2532166 4.969238
16 0.5104193 4.440487
17 0.9377628 2.559686
18 -0.3147257 4.919251
19 0.3678290 4.088426
20 2.0272953 4.151505

Example

> df1[,1:3]

Output

      x1        x2       x3
1 0.8464828 5.517463 1.3510192
2 1.7157414 4.902044 1.7288418
3 2.0612258 5.343704 3.4476224
4 0.9817547 5.310376 0.7360361
5 1.3137032 4.690344 1.8930611
6 3.2946391 5.356714 0.7507614
7 1.1292996 3.956172 1.3893677
8 0.5938585 3.524826 2.4999638
9 2.5721891 3.986746 2.1758887
10 0.3154647 2.602883 2.2014771
11 0.6326024 6.630669 2.4982478
12 1.9772099 4.863338 3.0983665
13 2.4442273 3.390198 3.7922736
14 1.1505010 4.512891 2.7232374
15 2.2532166 4.969238 2.1687148
16 0.5104193 4.440487 1.9766220
17 0.9377628 2.559686 3.1919780
18 -0.3147257 4.919251 3.0462375
19 0.3678290 4.088426 3.3926200
20 2.0272953 4.151505 3.1796609

Example

> df1[,2:4]

Output

       x2       x3      x4
1 5.517463 1.3510192 3.879824
2 4.902044 1.7288418 4.915879
3 5.343704 3.4476224 3.198662
4 5.310376 0.7360361 4.191265
5 4.690344 1.8930611 3.195032
6 5.356714 0.7507614 2.762971
7 3.956172 1.3893677 3.472453
8 3.524826 2.4999638 3.442268
9 3.986746 2.1758887 3.065743
10 2.602883 2.2014771 4.111108
11 6.630669 2.4982478 2.310966
12 4.863338 3.0983665 3.976421
13 3.390198 3.7922736 3.743440
14 4.512891 2.7232374 3.528800
15 4.969238 2.1687148 3.691669
16 4.440487 1.9766220 4.120722
17 2.559686 3.1919780 2.755742
18 4.919251 3.0462375 2.625914
19 4.088426 3.3926200 3.797904
20 4.151505 3.1796609 2.771270
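
When the number of columns is not known in advance, the positions can be computed instead of typed. The sketch below, which assumes a data frame built like df1 above, selects the last two columns with ncol(); the names n and last_two are only illustrative.

```r
# Rebuild the example data frame so this block is self-contained
set.seed(191)
x1 <- rnorm(20, 1)
x2 <- rnorm(20, 5)
x3 <- rnorm(20, 2)
x4 <- rnorm(20, 4)
df1 <- data.frame(x1, x2, x3, x4)

n <- ncol(df1)                # total number of columns
last_two <- df1[, (n - 1):n]  # parentheses are needed: n - 1:n means n - (1:n)
names(last_two)
```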

Example

> df1[,c(1,3)]

Output

        x1      x3
1 0.8464828 1.3510192
2 1.7157414 1.7288418
3 2.0612258 3.4476224
4 0.9817547 0.7360361
5 1.3137032 1.8930611
6 3.2946391 0.7507614
7 1.1292996 1.3893677
8 0.5938585 2.4999638
9 2.5721891 2.1758887
10 0.3154647 2.2014771
11 0.6326024 2.4982478
12 1.9772099 3.0983665
13 2.4442273 3.7922736
14 1.1505010 2.7232374
15 2.2532166 2.1687148
16 0.5104193 1.9766220
17 0.9377628 3.1919780
18 -0.3147257 3.0462375
19 0.3678290 3.3926200
20 2.0272953 3.1796609

Example

> df1[,c(2,4,1)]

Output

      x2       x4      x1
1 5.517463 3.879824 0.8464828
2 4.902044 4.915879 1.7157414
3 5.343704 3.198662 2.0612258
4 5.310376 4.191265 0.9817547
5 4.690344 3.195032 1.3137032
6 5.356714 2.762971 3.2946391
7 3.956172 3.472453 1.1292996
8 3.524826 3.442268 0.5938585
9 3.986746 3.065743 2.5721891
10 2.602883 4.111108 0.3154647
11 6.630669 2.310966 0.6326024
12 4.863338 3.976421 1.9772099
13 3.390198 3.743440 2.4442273
14 4.512891 3.528800 1.1505010
15 4.969238 3.691669 2.2532166
16 4.440487 4.120722 0.5104193
17 2.559686 2.755742 0.9377628
18 4.919251 2.625914 -0.3147257
19 4.088426 3.797904 0.3678290
20 4.151505 2.771270 2.0272953
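
Column numbers can also be negative, which drops the given positions and keeps the rest. This is a minimal sketch under the same assumptions as the examples above; the names no_x2 and first_last are only illustrative.

```r
# Rebuild the example data frame so this block is self-contained
set.seed(191)
x1 <- rnorm(20, 1)
x2 <- rnorm(20, 5)
x3 <- rnorm(20, 2)
x4 <- rnorm(20, 4)
df1 <- data.frame(x1, x2, x3, x4)

no_x2 <- df1[, -2]             # every column except the second
first_last <- df1[, -c(2, 3)]  # drop the second and third columns
names(no_x2)
names(first_last)
```

Positive and negative numbers cannot be mixed in the same index vector.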
raja
Published on 21-Nov-2020 05:22:28