How to remove a common suffix from column names in an R data frame?


To remove a common suffix from column names, we can use the gsub function. For example, if a data frame df contains columns named x1df, x2df, x3df, and x4df, we can remove df from the end of every column name with the command below. The $ anchors the pattern to the end of the name, so only a trailing df is removed:

colnames(df)<-gsub("df$","",colnames(df))
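
The difference between an anchored and an unanchored pattern only shows up when the suffix text also occurs elsewhere in a name. The following is a minimal sketch of that caveat; the column names in cols are made up for illustration and are not part of the examples in this article.

```r
# Hypothetical column names: "df" appears both as a prefix and as a suffix.
cols <- c("dfx1df", "x2df", "x3")

# Unanchored pattern: every occurrence of "df" is removed, including the prefix.
unanchored <- gsub("df", "", cols)

# Pattern anchored with "$": only a trailing "df" is removed.
anchored <- sub("df$", "", cols)

print(unanchored)
print(anchored)
```

With the anchor, sub and gsub behave the same, because a name can end with the suffix only once.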

Example

Consider the below data frame:

> x1Data<-rnorm(20,25,4)
> x2Data<-rnorm(20,25,1.2)
> x3Data<-runif(20,2,5)
> df1<-data.frame(x1Data,x2Data,x3Data)
> df1

Output

x1Data x2Data x3Data
1 29.26500 26.64124 2.598983
2 21.82170 23.41442 4.134393
3 22.71918 25.21586 4.442823
4 19.88633 25.23487 3.338448
5 20.48989 23.33683 3.829757
6 29.07910 25.54084 3.519393
7 24.28573 23.67258 4.667397
8 27.99849 22.97148 4.100405
9 23.48148 25.36574 2.618030
10 26.39401 23.80191 4.235092
11 29.39867 24.36261 2.782559
12 30.11137 24.62702 4.873779
13 23.56623 25.29017 2.255684
14 24.18464 25.59862 2.247147
15 23.35541 25.38190 4.704027
16 25.02549 25.76776 4.706971
17 18.24187 23.53798 2.411423
18 24.12003 25.27751 2.137409
19 27.58055 24.80092 3.992380
20 23.70832 25.73701 2.577801

Removing the suffix Data from the column names of data frame df1 (the $ in the pattern anchors the match to the end of each name):

Example

> colnames(df1)<-gsub("Data$","",colnames(df1))
> df1

Output

x1 x2 x3
1 29.26500 26.64124 2.598983
2 21.82170 23.41442 4.134393
3 22.71918 25.21586 4.442823
4 19.88633 25.23487 3.338448
5 20.48989 23.33683 3.829757
6 29.07910 25.54084 3.519393
7 24.28573 23.67258 4.667397
8 27.99849 22.97148 4.100405
9 23.48148 25.36574 2.618030
10 26.39401 23.80191 4.235092
11 29.39867 24.36261 2.782559
12 30.11137 24.62702 4.873779
13 23.56623 25.29017 2.255684
14 24.18464 25.59862 2.247147
15 23.35541 25.38190 4.704027
16 25.02549 25.76776 4.706971
17 18.24187 23.53798 2.411423
18 24.12003 25.27751 2.137409
19 27.58055 24.80092 3.992380
20 23.70832 25.73701 2.577801

Let’s have a look at another example:

Example

> a_treatment<-rpois(20,5)
> b_treatment<-rpois(20,10)
> c_treatment<-rpois(20,2)
> d_treatment<-rpois(20,8)
> df2<-data.frame(a_treatment,b_treatment,c_treatment,d_treatment)
> df2

Output

a_treatment b_treatment c_treatment d_treatment
1 3 18 3 4
2 1 9 1 5
3 3 13 0 5
4 2 14 0 9
5 4 9 1 10
6 6 8 0 8
7 4 7 5 9
8 2 13 1 8
9 5 4 4 7
10 6 11 1 7
11 4 9 3 12
12 7 6 4 10
13 6 20 3 6
14 4 10 1 4
15 13 11 0 12
16 6 11 1 10
17 6 8 1 16
18 4 8 1 14
19 8 11 2 8
20 3 7 0 9

Removing the suffix _treatment from the column names of data frame df2, with $ in the pattern so that only a trailing match is removed:

Example

> colnames(df2)<-gsub("_treatment$","",colnames(df2))
> df2

Output

a b c d
1 3 18 3 4
2 1 9 1 5
3 3 13 0 5
4 2 14 0 9
5 4 9 1 10
6 6 8 0 8
7 4 7 5 9
8 2 13 1 8
9 5 4 4 7
10 6 11 1 7
11 4 9 3 12
12 7 6 4 10
13 6 20 3 6
14 4 10 1 4
15 13 11 0 12
16 6 11 1 10
17 6 8 1 16
18 4 8 1 14
19 8 11 2 8
20 3 7 0 9
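
The patterns used above are regular expressions, so a suffix that contains a metacharacter such as . needs extra care. As a hedged sketch, the suffix can instead be treated as a literal string with the base R functions endsWith and substr. The helper strip_suffix and the data frame df3 below are hypothetical names introduced only for this illustration.

```r
# strip_suffix() is a hypothetical helper, not part of base R.
# It removes `suffix` only from names that literally end with it.
strip_suffix <- function(x, suffix) {
  has_suffix <- endsWith(x, suffix)
  x[has_suffix] <- substr(x[has_suffix], 1,
                          nchar(x[has_suffix]) - nchar(suffix))
  x
}

# Hypothetical data frame: two columns carry the suffix ".x", one does not.
df3 <- data.frame(a.x = 1:3, b.x = 4:6, max = 7:9)

# As a regular expression, ".x$" matches any character followed by "x",
# so it would also shorten the column name "max".
regex_result <- sub(".x$", "", colnames(df3))

# The literal comparison leaves "max" untouched.
colnames(df3) <- strip_suffix(colnames(df3), ".x")

print(regex_result)
print(colnames(df3))
```

An alternative that stays with gsub is to escape the metacharacter, for example gsub("\\.x$","",colnames(df3)).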

Updated on: 07-Nov-2020
