Find the percentiles for multiple columns in an R data frame.
To find the percentiles for multiple columns in an R data frame, we can use the apply function with the quantile function, passing the quantile probabilities through the probs argument.
For example, if we have a data frame called df that contains multiple numeric columns and we want to find three percentiles (the 25th, 70th, and 90th), we can use the command given below:
apply(df, 2, quantile, probs = c(0.25, 0.70, 0.90))
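To make the shape of the result concrete, here is a minimal sketch on a small fixed data frame. The data frame and its column names a and b are made up for illustration and are not part of the original example. The result is a matrix with one row per requested percentile and one column per data frame column.

```r
# Hypothetical data frame, used only for illustration
df <- data.frame(a = 1:10, b = seq(10, 100, by = 10))

# MARGIN = 2 applies quantile() to each column;
# probs is passed through apply() to quantile()
res <- apply(df, 2, quantile, probs = c(0.25, 0.70, 0.90))

# Rows are labelled 25%, 70%, 90%; columns are a and b
print(res)
```

A single value can then be picked out by name, for instance res["70%", "a"].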
Example 1
The following snippet creates a sample data frame:
x1 <- rnorm(20)
x2 <- rnorm(20)
x3 <- rnorm(20)
df1 <- data.frame(x1, x2, x3)
df1
The following data frame is created:
           x1          x2          x3
1  -1.1681428 -0.28065525 -0.53819110
2   0.2318993 -1.15544267  0.17944881
3  -0.5333789  0.36613560 -1.48668050
4  -1.3099335  0.43108366  1.08802308
5   0.6470196  0.08830738 -0.25840686
6  -0.1701303  1.87160281 -0.48819826
7   0.2818403  0.13090818 -0.96722760
8  -3.0132800 -2.09431074  0.31341228
9  -0.4261333 -1.16471217  0.93643827
10 -1.0134820  0.60068445 -1.57522191
11  0.7188261 -0.09290046 -1.21396318
12  0.0877293 -1.10543055 -1.03759785
13 -0.8056363  2.37757742  0.27509481
14 -1.1005749 -0.64515153 -0.86212935
15  2.1567133  0.92086077  0.70579629
16  0.3628198  0.01760350  0.51998078
17 -0.7449807 -0.88991305 -0.91787379
18  1.6731441  0.02442096 -0.03178033
19  1.1367622  1.00582342 -0.25280294
20  2.7935713 -0.19143469 -0.14149516
To find several percentiles for every column of df1, add the following code to the above snippet:
# df1 is the data frame created above; calling rnorm() again would produce different values
apply(df1, 2, quantile,
      probs = c(0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95))
Output
Running the snippets above as a single program on the data frame shown produces the following output. Because rnorm() draws random values, a fresh run will give different numbers:
             x1          x2          x3
5%  -1.39510080 -1.21119210 -1.49110757
10% -1.18232188 -1.15636962 -1.24123491
20% -1.03090060 -0.93301655 -0.98130165
25% -0.85759773 -0.70634191 -0.93021224
30% -0.76317738 -0.39000413 -0.87885268
40% -0.46903156 -0.13231415 -0.50819540
50% -0.04120049  0.02101223 -0.25560490
60%  0.25187572  0.10534770 -0.09760923
70%  0.44807978  0.38562001  0.20814261
75%  0.66497124  0.47348386  0.28467418
80%  0.80241330  0.66471971  0.35472598
90%  1.72150104  1.09240136  0.72886048
95%  2.18855624  1.89690154  0.94401751
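The percentiles in this table fall between observed values because quantile() by default (type = 7) interpolates linearly between neighbouring order statistics. As a rough check, the sketch below recomputes the 5% value for x1 by hand, using the x1 values copied from the printed data frame. The variable names p, s, h, lo, manual and builtin are illustrative only, and because the printed values are rounded, the result agrees with the table only to about six decimal places.

```r
# x1 values copied from the data frame printed in Example 1 (rounded as displayed)
x1 <- c(-1.1681428, 0.2318993, -0.5333789, -1.3099335, 0.6470196,
        -0.1701303, 0.2818403, -3.0132800, -0.4261333, -1.0134820,
        0.7188261, 0.0877293, -0.8056363, -1.1005749, 2.1567133,
        0.3628198, -0.7449807, 1.6731441, 1.1367622, 2.7935713)

p <- 0.05
s <- sort(x1)                    # order statistics
h <- (length(s) - 1) * p + 1     # fractional position in the sorted data, 1.95 here
lo <- floor(h)                   # lower neighbour, the smallest value here

# Linear interpolation between the two neighbouring order statistics
manual <- s[lo] + (h - lo) * (s[lo + 1] - s[lo])

# The same percentile from quantile() with its default settings
builtin <- unname(quantile(x1, probs = p))

print(c(manual = manual, builtin = builtin))  # both approximately -1.3951
```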
Example 2
The following snippet creates a sample data frame:
y1 <- rpois(20, 1)
y2 <- rpois(20, 2)
y3 <- rpois(20, 5)
y4 <- rpois(20, 5)
df2 <- data.frame(y1, y2, y3, y4)
df2
The following data frame is created:
   y1 y2 y3 y4
1   1  2  4  6
2   1  6  7  5
3   2  2  6  4
4   0  4  2  5
5   1  1  5  6
6   1  2  2  7
7   3  2  4  6
8   2  3  6  5
9   1  2  7  3
10  0  3  6  4
11  1  0  8  5
12  1  1  3  4
13  4  0  3  3
14  0  1  3  8
15  2  4  5  1
16  2  2  2  1
17  0  1  5  6
18  2  3  2  3
19  0  2  2 11
20  0  2  5  3
To find several percentiles for every column of df2, add the following code to the above snippet:
# df2 is the data frame created above; calling rpois() again would produce different values
apply(df2, 2, quantile,
      probs = c(0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95))
Output
Running the snippets above as a single program on the data frame shown produces the following output. Because rpois() draws random values, a fresh run will give different numbers:
      y1  y2   y3   y4
5%  0.00 0.0 2.00 1.00
10% 0.00 0.9 2.00 2.80
20% 0.00 1.0 2.00 3.00
25% 0.00 1.0 2.75 3.00
30% 0.70 1.7 3.00 3.70
40% 1.00 2.0 3.60 4.00
50% 1.00 2.0 4.50 5.00
60% 1.00 2.0 5.00 5.00
70% 2.00 2.3 5.30 6.00
75% 2.00 3.0 6.00 6.00
80% 2.00 3.0 6.00 6.00
90% 2.10 4.0 7.00 7.10
95% 3.05 4.1 7.05 8.15
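Both examples use data frames whose columns are all numeric and complete. apply() first converts the data frame to a matrix, so a character column would turn every value into text, and quantile() stops with an error on missing values unless na.rm = TRUE is given. The sketch below shows one possible way to handle both cases. The data frame df3 and its columns id, v1 and v2 are hypothetical and not taken from the examples above.

```r
# Hypothetical mixed-type data frame with a missing value (illustration only)
df3 <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  v1 = c(2, 4, 6, 8, 10),
  v2 = c(1, NA, 3, 5, 7)
)

# Keep only the numeric columns so the matrix passed to quantile() stays numeric
num_cols <- sapply(df3, is.numeric)

# na.rm = TRUE is forwarded to quantile() and drops the NA in v2
res <- apply(df3[num_cols], 2, quantile,
             probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
print(res)
```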
