- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How to extract columns based on particular column values of an R data frame that match
a pattern?
The column values of an R data frame can be easily extracted by subsetting with single square brackets but if we want to extract the column values that match a pattern then we need to use grepl function inside single square brackets, this will help us to match the pattern of the values in the data frame columns.
Example
Consider the below data frame:
> set.seed(271) > x1<-sample(c("A187","A321","A123","A242"),20,replace=TRUE) > x2<-sample(c("B87","B74","B81","B71"),20,replace=TRUE) > df1<-data.frame(x1,x2) > df1
Output
x1 x2 1 A242 B71 2 A123 B71 3 A242 B81 4 A242 B87 5 A123 B71 6 A321 B71 7 A187 B74 8 A321 B74 9 A123 B71 10 A242 B74 11 A321 B74 12 A187 B87 13 A321 B74 14 A321 B74 15 A123 B74 16 A187 B87 17 A123 B81 18 A242 B87 19 A242 B71 20 A187 B74
Extracting values that starts with A1:
Example
> df1[grepl("A1",x1),]
Output
x1 x2 2 A123 B71 5 A123 B71 7 A187 B74 9 A123 B71 12 A187 B87 15 A123 B74 16 A187 B87 17 A123 B81 20 A187 B74
Extracting values that starts with A2:
Example
> df1[grepl("A2",x1),]
Output
x1 x2 1 A242 B71 3 A242 B81 4 A242 B87 10 A242 B74 18 A242 B87 19 A242 B71
Extracting values that starts with A3:
Example
> df1[grepl("A3",x1),]
Output
x1 x2 6 A321 B71 8 A321 B74 11 A321 B74 13 A321 B74 14 A321 B74
Extracting values that starts with B7:
Example
> df1[grepl("B7",x2),]
Output
x1 x2 1 A242 B71 2 A123 B71 5 A123 B71 6 A321 B71 7 A187 B74 8 A321 B74 9 A123 B71 10 A242 B74 11 A321 B74 13 A321 B74 14 A321 B74 15 A123 B74 19 A242 B71 20 A187 B74
Extracting values that starts with B8:
Example
> df1[grepl("B8",x2),]
Output
x1 x2 3 A242 B81 4 A242 B87 12 A187 B87 16 A187 B87 17 A123 B81 18 A242 B87
Let’s have a look at another example:
Example
> y1<-sample(c("ID1","ID2","ID3","ID4"),20,replace=TRUE) > y2<-sample(c("C1","C2"),20,replace=TRUE) > df2<-data.frame(y1,y2) > df2
Output
y1 y2 1 ID1 C1 2 ID3 C1 3 ID1 C1 4 ID2 C2 5 ID4 C2 6 ID2 C2 7 ID2 C1 8 ID3 C1 9 ID1 C1 10 ID1 C1 11 ID2 C2 12 ID2 C2 13 ID4 C1 14 ID3 C2 15 ID2 C1 16 ID4 C1 17 ID4 C1 18 ID2 C2 19 ID2 C2 20 ID3 C2
Example
> df2[grepl("ID1",y1),]
Output
y1 y2 1 ID1 C1 3 ID1 C1 9 ID1 C1 10 ID1 C1
Example
> df2[grepl("ID2",y1),]
Output
y1 y2 4 ID2 C2 6 ID2 C2 7 ID2 C1 11 ID2 C2 12 ID2 C2 15 ID2 C1 18 ID2 C2 19 ID2 C2
Example
> df2[grepl("ID3",y1),]
Output
y1 y2 2 ID3 C1 8 ID3 C1 14 ID3 C2 20 ID3 C2
Example
> df2[grepl("ID4",y1),]
Output
y1 y2 5 ID4 C2 13 ID4 C1 16 ID4 C1 17 ID4 C1
Example
> df2[grepl("C1",y2),]
Output
y1 y2 1 ID1 C1 2 ID3 C1 3 ID1 C1 7 ID2 C1 8 ID3 C1 9 ID1 C1 10 ID1 C1 13 ID4 C1 15 ID2 C1 16 ID4 C1 17 ID4 C1
Example
> df2[grepl("C2",y2),]
Output
y1 y2 4 ID2 C2 5 ID4 C2 6 ID2 C2 11 ID2 C2 12 ID2 C2 14 ID3 C2 18 ID2 C2 19 ID2 C2 20 ID3 C2
- Related Articles
- How to extract a particular value based on index from an R data frame column?
- How to replace NA values in columns of an R data frame form the mean of that column?
- How to subset rows of an R data frame based on duplicate values in a particular column?
- How to change row values based on column values in an R data frame?
- How to remove rows that contains NA values in certain columns of an R data frame?
- How to fill NA values with previous values in an R data frame column?
- How to delete rows of an R data frame based on string match?
- How to divide row values of a numerical column based on categorical column values in an R data frame?
- How to subset an R data frame by specifying columns that contains NA?
- How to subset an R data frame based on string match in two columns with OR condition?
- How to find the number of columns of an R data frame that satisfies a condition based on row values?
- How to subset row values based on columns name in R data frame?
- How to find the column means of a column based on another column values that represent factor in an R data frame?
- How to extract a data frame’s column value based on a column value of another data frame in R?
- Extract a particular level from factor column in an R data frame.

Advertisements