How to get row index or column index based on their names in R?


Rows and columns are usually easier to remember by name than by position, but some operations need the numeric index. We can get the index that corresponds to a name with the help of the grep function. This is especially helpful with a large data set, which has many rows and columns, so counting positions by hand is impractical. Column indexes are needed most often; row indexes are required only in special cases, such as analysing a particular case. Note that grep matches a pattern, so it returns the index of every name that contains the given text.

Example

Consider the below data frame −

> set.seed(1)
> x1<-rnorm(50,0.5)
> x2<-rnorm(50,0.8)
> x3<-rpois(50,2)
> x4<-rpois(50,5)
> x5<-runif(50,5,10)
> df<-data.frame(x1,x2,x3,x4,x5)
> head(df,20)
           x1          x2 x3 x4       x5
1  -0.1264538  1.19810588  1  6 8.368561
2   0.6836433  0.18797361  1  9 5.474289
3  -0.3356286  1.14111969  2  5 7.462981
4   2.0952808 -0.32936310  1  5 7.307759
5   0.8295078  2.23302370  1  5 6.876083
6  -0.3204684  2.78039990  2  2 9.955496
7   0.9874291  0.43277852  2  3 5.881754
8   1.2383247 -0.24413463  0  5 9.067176
9   1.0757814  1.36971963  1  4 5.342233
10  0.1946116  0.66494540  3  9 7.002249
11  2.0117812  3.20161776  5  5 5.705722
12  0.8898432  0.76076000  0  4 5.966549
13 -0.1212406  1.48973936  3  4 9.206759
14 -1.7146999  0.82800216  5  7 8.599570
15  1.6249309  0.05672679  3  6 6.336060
16  0.4550664  0.98879230  1  3 7.475008
17  0.4838097 -1.00495863  2  2 5.415569
18  1.4438362  2.26555486  5  6 6.769421
19  1.3212212  0.95325334  5  6 9.846044
20  1.0939013  2.97261167  1  3 8.123571

Finding the index of columns based on their names −

> grep("x3", colnames(df))
[1] 3
> grep("x5", colnames(df))
[1] 5
> grep("x1", colnames(df))
[1] 1

By default the rows are named 1 to 50, so each row's index is the same as its name. To make the difference visible, let’s reverse the row names so that they run from 50 to 1, as shown below −

> rownames(df)<-50:1
> head(df,20)
           x1          x2 x3 x4       x5
50 -0.1264538  1.19810588  1  6 8.368561
49  0.6836433  0.18797361  1  9 5.474289
48 -0.3356286  1.14111969  2  5 7.462981
47  2.0952808 -0.32936310  1  5 7.307759
46  0.8295078  2.23302370  1  5 6.876083
45 -0.3204684  2.78039990  2  2 9.955496
44  0.9874291  0.43277852  2  3 5.881754
43  1.2383247 -0.24413463  0  5 9.067176
42  1.0757814  1.36971963  1  4 5.342233
41  0.1946116  0.66494540  3  9 7.002249
40  2.0117812  3.20161776  5  5 5.705722
39  0.8898432  0.76076000  0  4 5.966549
38 -0.1212406  1.48973936  3  4 9.206759
37 -1.7146999  0.82800216  5  7 8.599570
36  1.6249309  0.05672679  3  6 6.336060
35  0.4550664  0.98879230  1  3 7.475008
34  0.4838097 -1.00495863  2  2 5.415569
33  1.4438362  2.26555486  5  6 6.769421
32  1.3212212  0.95325334  5  6 9.846044
31  1.0939013  2.97261167  1  3 8.123571

Now let’s find the indexes of the rows whose names contain the digit 5 −

> grep(5, rownames(df))
[1] 1 6 16 26 36 46

Finding the index of the row named 10 −

> grep(10, rownames(df))
[1] 41
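The call above returns a single index only because "10" is the only row name among 50 to 1 that contains the text "10". As a rough sketch of what happens otherwise, the hypothetical data frame big below (an assumption for illustration, not part of the example above) has 120 rows named 120 to 1, so names such as 100 and 110 also contain "10".

```r
# Hypothetical data frame with 120 rows, named 120 down to 1
big <- data.frame(v = 1:120)
rownames(big) <- 120:1

# Pattern match: also picks up names such as "110", "105" and "100"
pattern_rows <- grep(10, rownames(big))

# Exact match: only the row whose name is exactly "10"
exact_row <- which(rownames(big) == "10")

print(length(pattern_rows) > 1)  # TRUE
print(exact_row)                 # 111
```

Comparing with == (or anchoring the pattern as "^10$") keeps the lookup tied to a single row name regardless of how many rows the data frame has.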

Finding the indexes of the rows whose names contain the digit 4 −

> grep(4, rownames(df))
[1] 2 3 4 5 6 7 8 9 10 11 17 27 37 47
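grep returns positions, not names. If the matching row names themselves are wanted, the positions can be used to subset the names, or grep can be called with value = TRUE. The sketch below works on a plain character vector rn (a stand-in for rownames(df), introduced here for illustration) so that it runs on its own.

```r
# Stand-in for rownames(df): the names "50", "49", ..., "1"
rn <- as.character(50:1)

# Positions of the names that contain the digit 4
idx <- grep("4", rn)

# Two equivalent ways to get the names themselves
names_by_index <- rn[idx]
names_by_value <- grep("4", rn, value = TRUE)

print(idx)
print(names_by_value)
```

Both approaches give the same names; value = TRUE simply skips the intermediate index vector.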
raja
Published on 11-Aug-2020 07:57:06