How to select the first and last row based on group column in an R data frame?
Extracting rows is a common step in data analysis because it lets us keep only the information we need from a data set. That information may be the first and last row of each group, for example when we want to compare the initial and final values across groups. We can select the first and last row for each level of a group column with the slice function of the dplyr package: after group_by, slice(c(1,n())) keeps row 1 and row n() of every group, where n() is the number of rows in the current group.
Example
Consider the below data frame:
> x1<-rep(1:4,each=10)
> x2<-rpois(40,5)
> df1<-data.frame(x1,x2)
> head(df1,12)
Output
   x1 x2
1   1  3
2   1  4
3   1  6
4   1  6
5   1  3
6   1  4
7   1  7
8   1  8
9   1  7
10  1  2
11  2  8
12  2  7
Example
> tail(df1,12)
Output
   x1 x2
29  3  4
30  3  5
31  4  4
32  4  6
33  4  7
34  4  5
35  4  5
36  4  4
37  4  9
38  4  4
39  4  3
40  4  6
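The values in x2 come from rpois(), so they change every time the code is run and will not match the output shown above. A minimal sketch of how the data frame could be made reproducible, assuming a fixed seed (the value 123 is an arbitrary choice for illustration and is not part of the original example):

```r
# Fix the random number generator state so rpois() returns the same draws each run.
# The seed value 123 is arbitrary and used only for illustration.
set.seed(123)
x1 <- rep(1:4, each = 10)
x2 <- rpois(40, 5)
df1 <- data.frame(x1, x2)

# Re-creating the data with the same seed gives an identical data frame.
set.seed(123)
df1_again <- data.frame(x1 = rep(1:4, each = 10), x2 = rpois(40, 5))
same_data <- identical(df1, df1_again)
print(same_data)
```

With a seed in place, the first and last rows selected per group stay the same between sessions, which makes the result easier to check.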
Loading dplyr package −
> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union
Selecting first and last row based on group column x1 −
Example
> df1%>%group_by(x1)%>%slice(c(1,n()))
Output
# A tibble: 8 x 2
# Groups:   x1 [4]
     x1    x2
  <int> <int>
1     1     3
2     1     2
3     2     8
4     2     4
5     3     5
6     3     5
7     4     4
8     4     6
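One caveat with slice(c(1,n())): when a group contains a single row, the index vector becomes c(1,1), and slice() then returns that row twice. A minimal sketch of one way to guard against this, wrapping the indices in unique(). The data frame scores and its columns grp and val are hypothetical and exist only for this illustration:

```r
library(dplyr)

# Hypothetical data: group "c" has only one row.
scores <- data.frame(
  grp = c("a", "a", "a", "b", "b", "c"),
  val = c(10, 20, 30, 40, 50, 60)
)

# unique() collapses c(1, 1) to 1, so a single-row group appears only once.
first_last <- scores %>%
  group_by(grp) %>%
  slice(unique(c(1, n()))) %>%
  ungroup()

print(first_last)
```

Groups with two or more rows are unaffected by unique(), so the same call can be used whether or not single-row groups are expected.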
Let’s have a look at another example −
Example
> y1<-rep(c("A","B","C"),each=10)
> y2<-rnorm(30)
> df2<-data.frame(y1,y2)
> head(df2,12)
Output
   y1         y2
1   A -1.1640927
2   A  0.3146504
3   A -1.5213974
4   A -1.3728970
5   A -0.9964678
6   A -0.5022738
7   A -0.4225463
8   A -0.3501037
9   A  0.3043838
10  A -1.5216102
11  B -0.2425732
12  B  0.5554217
Example
> tail(df2,12)
Output
   y1          y2
19  B  0.30172320
20  B  1.68341427
21  C  0.55127997
22  C -1.77840803
23  C  0.03001296
24  C -1.19246335
25  C  0.03612258
26  C -0.35468216
27  C -0.63579743
28  C -1.90074403
29  C  0.50072577
30  C  0.31911138
Example
> df2%>%group_by(y1)%>%slice(c(1,n()))
Output
# A tibble: 6 x 2
# Groups:   y1 [3]
  y1        y2
  <fct>  <dbl>
1 A     -1.16
2 A     -1.52
3 B     -0.243
4 B      1.68
5 C      0.551
6 C      0.319
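If dplyr is not available, the same selection can be approximated in base R. The sketch below is an alternative technique, not the method used in the examples above: it relies on duplicated(), which flags every occurrence of a value after the first (or, with fromLast = TRUE, every occurrence before the last). The seed value is an arbitrary assumption added so the data can be rebuilt.

```r
# Rebuild a data frame shaped like df2 (the seed 1 is arbitrary, for illustration only).
set.seed(1)
df2 <- data.frame(y1 = rep(c("A", "B", "C"), each = 10), y2 = rnorm(30))

# A row is the first of its group if its key has not appeared above it,
# and the last of its group if its key does not appear again below it.
is_first <- !duplicated(df2$y1)
is_last  <- !duplicated(df2$y1, fromLast = TRUE)

# Keep rows that are first or last; a single-row group is kept once.
first_last_base <- df2[is_first | is_last, ]
print(first_last_base)
```

The original row names are preserved in the result, so it is easy to see which positions (here rows 1, 10, 11, 20, 21 and 30) were kept.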