Check which column names contain a specific string in an R data frame.


If a data frame has columns whose names share a common string, we might want to find those column names. For this purpose, we can use the grepl function together with the colnames function: grepl tests each column name against the string, and the result is used to subset the names.
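As a minimal sketch of the idea, grepl returns one TRUE or FALSE per column name, and that logical vector is then used to index the names. The data frame demo and its column names are made up here purely for illustration.

```r
# Hypothetical data frame used only for illustration
demo <- data.frame(Height_cm = c(150, 160),
                   Weight_kg = c(50, 60),
                   Arm_cm = c(60, 65))

# grepl() returns one logical value per column name
has_cm <- grepl("cm", colnames(demo))
print(has_cm)

# Indexing the names with that logical vector keeps only the matches
matched <- colnames(demo)[has_cm]
print(matched)
```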

Check out the Examples given below to understand how it can be done.

Example 1

The following snippet creates a sample data frame −

Students_Score<-sample(1:50,20)
Teachers_Rank<-sample(1:5,20,replace=TRUE)
Teachers_Score<-sample(1:50,20)
df1<-data.frame(Students_Score,Teachers_Rank,Teachers_Score)
df1

The following dataframe is created

   Students_Score Teachers_Rank Teachers_Score
 1             20             5              8
 2             28             1             26
 3             42             2             49
 4             25             2             11
 5              7             4             19
 6              4             5             37
 7             48             1              9
 8             33             4             35
 9             23             5             38
10             31             3             29
11             43             1              6
12              6             4             13
13             15             5             33
14              9             1             40
15             41             3             43
16             11             4             34
17             46             5             42
18             44             1              5
19             21             3             48
20             29             4             15

To check which columns of df1 contain the string Score, add the following code to the above snippet −

Students_Score<-sample(1:50,20)
Teachers_Rank<-sample(1:5,20,replace=TRUE)
Teachers_Score<-sample(1:50,20)
df1<-data.frame(Students_Score,Teachers_Rank,Teachers_Score)
colnames(df1)[grepl("Score",colnames(df1))]

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "Students_Score" "Teachers_Score"
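Two closely related variations may be useful here; this is a sketch rather than part of the original example. grep with value = TRUE returns the matching names directly, and the same logical vector from grepl can select the matching columns themselves. The set.seed call is an assumption added only so the sampled values are repeatable; the matched names do not depend on it.

```r
set.seed(1)  # assumed seed, only so the sampled values are repeatable
Students_Score <- sample(1:50, 20)
Teachers_Rank <- sample(1:5, 20, replace = TRUE)
Teachers_Score <- sample(1:50, 20)
df1 <- data.frame(Students_Score, Teachers_Rank, Teachers_Score)

# grep() with value = TRUE returns the matching names directly
score_names <- grep("Score", colnames(df1), value = TRUE)
print(score_names)

# The same logical vector can select the columns themselves;
# drop = FALSE keeps the result a data frame even for a single match
score_cols <- df1[, grepl("Score", colnames(df1)), drop = FALSE]
print(dim(score_cols))
```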

Example 2

The following snippet creates a sample data frame −

Hot_Temp<-sample(33:50,20,replace=TRUE)
Cold_Temp<-sample(1:10,20,replace=TRUE)
Group<-sample(c("First","Second","Third"),20,replace=TRUE)
df2<-data.frame(Hot_Temp,Cold_Temp,Group)
df2

The following dataframe is created

   Hot_Temp Cold_Temp Group
 1       47         7 First
 2       38         5 Third
 3       48         7 Third
 4       36        10 First
 5       46         6 Third
 6       45         2 First
 7       35         8 Second
 8       33         1 Second
 9       33         4 First
10       34         5 First
11       34         6 Third
12       39        10 Third
13       47        10 First
14       41         6 Third
15       48         3 First
16       36         2 Third
17       49         9 Second
18       35         5 Second
19       33         1 Second
20       49        10 Third

To check which columns of df2 contain the string Temp, add the following code to the above snippet −

Hot_Temp<-sample(33:50,20,replace=TRUE)
Cold_Temp<-sample(1:10,20,replace=TRUE)
Group<-sample(c("First","Second","Third"),20,replace=TRUE)
df2<-data.frame(Hot_Temp,Cold_Temp,Group)
colnames(df2)[grepl("Temp",colnames(df2))]

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "Hot_Temp" "Cold_Temp"
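By default grepl treats the string as a regular expression, which affects how some patterns behave. The sketch below uses only the column names from this example (held in an illustrative vector called nms) to show an end-of-name anchor, a case-insensitive match, and a literal match with fixed = TRUE.

```r
# Column names taken from Example 2; no data is needed to test the patterns
nms <- c("Hot_Temp", "Cold_Temp", "Group")

# The pattern is a regular expression by default, so "$" anchors the match at the end
ends_with_temp <- nms[grepl("Temp$", nms)]

# ignore.case = TRUE makes the match case-insensitive
any_case <- nms[grepl("temp", nms, ignore.case = TRUE)]

# fixed = TRUE treats the pattern as literal text, which matters for characters like "."
literal_dot <- grepl(".", nms, fixed = TRUE)

print(ends_with_temp)
print(any_case)
print(literal_dot)
```

Without fixed = TRUE, the pattern "." would match every name, because a dot stands for any single character in a regular expression.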

Example 3

The following snippet creates a sample data frame −

x1_Rate<-sample(1:10,20,replace=TRUE)
x2_Rate<-sample(1:10,20,replace=TRUE)
Category<-sample(c("Normal","Abnormal"),20,replace=TRUE)
df3<-data.frame(x1_Rate,x2_Rate,Category)
df3

The following dataframe is created

   x1_Rate x2_Rate Category
 1       5       9 Normal
 2       8       8 Normal
 3       7      10 Normal
 4       3       3 Normal
 5       6       6 Normal
 6       4       9 Abnormal
 7       6       5 Abnormal
 8       2       9 Abnormal
 9       3      10 Abnormal
10       7       4 Abnormal
11       1       3 Normal
12       9      10 Abnormal
13       3       3 Normal
14       8      10 Abnormal
15       3       5 Normal
16       2       5 Abnormal
17       2       1 Normal
18       5       7 Abnormal
19       7       1 Abnormal
20       5       8 Normal

To check which columns of df3 contain the string Rate, add the following code to the above snippet −

x1_Rate<-sample(1:10,20,replace=TRUE)
x2_Rate<-sample(1:10,20,replace=TRUE)
Category<-sample(c("Normal","Abnormal"),20,replace=TRUE)
df3<-data.frame(x1_Rate,x2_Rate,Category)
colnames(df3)[grepl("Rate",colnames(df3))]

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "x1_Rate" "x2_Rate"
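
Since the same expression is repeated in all three examples, it could be wrapped in a small helper. This is only a sketch: cols_containing is a hypothetical name, not a base R function, and the small df3 built here uses fixed values instead of sample so the result is repeatable.

```r
# cols_containing is a hypothetical helper name, not part of base R
cols_containing <- function(df, pattern) {
  nms <- colnames(df)
  nms[grepl(pattern, nms)]
}

# Small stand-in for df3 with fixed values instead of sample()
df3 <- data.frame(x1_Rate = 1:3,
                  x2_Rate = 4:6,
                  Category = c("Normal", "Abnormal", "Normal"))

rate_cols <- cols_containing(df3, "Rate")
none <- cols_containing(df3, "Price")  # no column name contains "Price"

print(rate_cols)
print(length(none))
```

When nothing matches, the result is an empty character vector, so checking its length is a simple way to detect that case.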
raja
Updated on 10-Nov-2021 06:57:31