How to extract a number from a string that is in a different format than normal in an R data frame?


To extract a number from a string that also contains non-numeric text (for example, a prefix such as "grp_") in an R data frame, we can follow the steps below −

  • First of all, create a data frame.

  • Then, use the gsub function to remove the non-numeric part of each string and the as.numeric function to convert what remains to a number.
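
The two steps above can be sketched on a small fixed vector, which makes the intermediate result easy to inspect. This is only an illustration: the names `labels`, `demo` and `digits` are hypothetical and do not appear in the example that follows.

```r
# Hypothetical fixed input so the result is the same on every run
labels <- c("grp_07", "grp_11", "grp_01")
demo <- data.frame(x = labels)

# Remove everything up to and including the first underscore
digits <- gsub("^[^_]*_", "", demo$x)
digits                                # "07" "11" "01"

# Convert the remaining text to numeric; leading zeros are dropped
demo$x_numeric <- as.numeric(digits)
demo$x_numeric                        # 7 11 1
```

Note that gsub returns character values, so the as.numeric call is what turns "07" into the number 7.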

Example

Create the data frame

Let’s create a data frame as shown below −

# Draw 25 group labels at random, with replacement
x<-sample(c("grp_12","grp_01","grp_05","grp_03","grp_04","grp_09","grp_10","grp_11","grp_02","grp_06","grp_07","grp_08"),25,replace=TRUE)
df<-data.frame(x)
df

Output

On executing, the above script generates the below output (this output will vary on your system because the values are sampled at random) −

        x
1  grp_07
2  grp_06
3  grp_01
4  grp_03
5  grp_04
6  grp_03
7  grp_09
8  grp_07
9  grp_03
10 grp_11
11 grp_09
12 grp_01
13 grp_08
14 grp_03
15 grp_11
16 grp_05
17 grp_11
18 grp_05
19 grp_11
20 grp_05
21 grp_06
22 grp_07
23 grp_02
24 grp_10
25 grp_03
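
If a reproducible sample is preferred, one option is to call set.seed before sample. The following is a minimal sketch, not part of the original script: the seed value 123 and the names `groups`, `df_seeded` and `x_again` are assumptions chosen for illustration.

```r
# Fix the random number generator state; the seed value 123 is arbitrary
set.seed(123)
groups <- sprintf("grp_%02d", 1:12)   # "grp_01" ... "grp_12"
x <- sample(groups, 25, replace = TRUE)
df_seeded <- data.frame(x)

# Resetting the same seed reproduces the same draw
set.seed(123)
x_again <- sample(groups, 25, replace = TRUE)
identical(x, x_again)                 # TRUE
```

The drawn values themselves can still differ between R versions, so the seed only guarantees repeatability within the same setup.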

Extract the number

Use the gsub function along with the as.numeric function to extract the number from column x of data frame df −

# Continue with the data frame df created above (do not resample x).
# Remove everything up to and including the underscore, then convert to numeric.
df$x_numeric<-as.numeric(gsub("^[^_]*_","",df$x))
df

Output

        x x_numeric
1  grp_07         7
2  grp_06         6
3  grp_01         1
4  grp_03         3
5  grp_04         4
6  grp_03         3
7  grp_09         9
8  grp_07         7
9  grp_03         3
10 grp_11        11
11 grp_09         9
12 grp_01         1
13 grp_08         8
14 grp_03         3
15 grp_11        11
16 grp_05         5
17 grp_11        11
18 grp_05         5
19 grp_11        11
20 grp_05         5
21 grp_06         6
22 grp_07         7
23 grp_02         2
24 grp_10        10
25 grp_03         3
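
When the strings do not all share the same prefix and separator, a pattern tied to the underscore no longer applies. One possible alternative, shown here as a sketch rather than as the method used above, is to delete every non-digit character. The data frame `mixed` and its labels are hypothetical.

```r
# Hypothetical labels whose prefixes and separators differ
mixed <- data.frame(x = c("grp_07", "item-12", "id3", "batch 10"))

# Delete every character that is not a digit, then convert
mixed$x_numeric <- as.numeric(gsub("[^0-9]", "", mixed$x))
mixed$x_numeric                       # 7 12 3 10
```

This variant assumes each string contains a single run of digits. If digits appear in more than one place, they are joined together, so "a1_b2" would become 12.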

Updated on: 11-Nov-2021
