How to get the summary statistics including all basic statistical values for R data frame columns?
When we apply the summary function to an R data frame, the output gives the minimum, first quartile, median, mean, third quartile, and maximum of each column. Many other basic statistics also help us understand a variable, such as the range, sum, standard error of the mean, variance, standard deviation, and coefficient of variation. To get all of these values at once, we can use the stat.desc function of the pastecs package, as shown in the examples below.
Example 1
Consider the below data frame. Because rnorm draws random values, the numbers you get will differ from the ones shown here −
> x1<-rnorm(20)
> x2<-rnorm(20)
> x3<-rnorm(20)
> df1<-data.frame(x1,x2,x3)
> df1
Output
            x1          x2         x3
1   1.37057327  0.96585723 -1.6824440
2   0.43258556 -2.54077794 -1.5962218
3   0.68188832  1.08144561 -0.9956110
4   0.24553258  0.07541754 -0.3527252
5  -0.19946765  0.49262220 -0.7946248
6  -1.93924451  0.13544724 -0.4184053
7   0.27443524  0.08363552  0.8696729
8  -2.02613035 -0.67827697 -0.8940207
9   0.33772301 -1.51171368  0.4032073
10 -0.44463177  1.69245587  1.7037202
11  1.69256604 -0.60384845  0.7247898
12  0.11356829  1.05543184  0.9780191
13 -0.01516246  0.92529906  0.4805570
14 -0.78159893 -0.55414738 -0.4680645
15 -0.08974609  0.76847977 -0.2780631
16 -0.45456509  1.08361106 -1.6672789
17  1.13920983  0.24680491  1.3922984
18  0.55562889 -0.06529163 -0.7083794
19 -0.11607439  1.09421670  2.1602874
20 -0.78351132  0.48005020  0.3453250
Finding the summary of df1 using the summary function −
> summary(df1)
Output
       x1                   x2                x3
 Min.   :-2.0261304   Min.   :-2.5408   Min.   :-1.6824
 1st Qu.:-0.4471151   1st Qu.:-0.1875   1st Qu.:-0.8195
 Median : 0.0492029   Median : 0.3634   Median :-0.3154
 Mean   :-0.0003211   Mean   : 0.2113   Mean   :-0.0399
 3rd Qu.: 0.4633464   3rd Qu.: 0.9883   3rd Qu.: 0.7610
 Max.   : 1.6925660   Max.   : 1.6925   Max.   : 2.1603
Loading the pastecs package and finding the statistical summary of df1 using the stat.desc function −
> library(pastecs)
> stat.desc(df1)
Output
                        x1         x2           x3
nbr.val       2.000000e+01 20.0000000  20.00000000
nbr.null      0.000000e+00  0.0000000   0.00000000
nbr.na        0.000000e+00  0.0000000   0.00000000
min          -2.026130e+00 -2.5407779  -1.68244397
max           1.692566e+00  1.6924559   2.16028742
range         3.718696e+00  4.2332338   3.84273139
sum          -6.421540e-03  4.2267187  -0.79796158
median        4.920292e-02  0.3634276  -0.31539416
mean         -3.210770e-04  0.2113359  -0.03989808
SE.mean       2.103941e-01  0.2262258   0.25081489
CI.mean.0.95  4.403600e-01  0.4734961   0.52496160
var           8.853137e-01  1.0235624   1.25816219
std.dev       9.409111e-01  1.0117126   1.12167829
coef.var     -2.930484e+03  4.7872246 -28.11359138
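The x1 column above is printed in scientific notation, which makes the table harder to read. One possible way to tidy it is to round the result and, if the counts and extremes are not needed, to drop them with the basic argument of stat.desc. The sketch below is only an illustration: the data frame df and its values are made up, and the call is guarded so that it is skipped when pastecs is not installed.

```r
# Small illustrative data frame (values are invented for this sketch).
df <- data.frame(a = c(1.5, 2.5, 0, 4), b = c(10, 20, 30, 40))

rounded <- NULL
if (requireNamespace("pastecs", quietly = TRUE)) {
  # basic = FALSE leaves out the count, min, max, range and sum rows;
  # round() keeps three decimals and avoids scientific notation.
  rounded <- round(pastecs::stat.desc(df, basic = FALSE), 3)
  print(rounded)
}
```

Rounding changes only how the table is displayed, so keep the unrounded result if the values feed into further calculations.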
Example 2
> y1<-rpois(20,5)
> y2<-rpois(20,2)
> y3<-rpois(20,10)
> y4<-rpois(20,8)
> df2<-data.frame(y1,y2,y3,y4)
> df2
Output
   y1 y2 y3 y4
1   4  4 10  6
2   4  1  9  8
3   2  3 12  9
4   4  0 11  4
5   7  3  7  7
6   6  0  9 18
7   5  1  7  3
8   6  2  5 10
9   5  1 10  5
10  6  1 12  7
11 11  2  8  7
12  4  2 10 11
13  4  3  7  6
14  4  0 11 15
15 10  1  8  8
16  5  0  6  8
17  3  1 13 14
18  4  1  8  5
19  5  1  5  4
20  8  2 13  5
Finding the statistical summary of df2 using the stat.desc function −
> stat.desc(df2)
Output
                      y1         y2          y3          y4
nbr.val       20.0000000 20.0000000  20.0000000  20.0000000
nbr.null       0.0000000  4.0000000   0.0000000   0.0000000
nbr.na         0.0000000  0.0000000   0.0000000   0.0000000
min            2.0000000  0.0000000   5.0000000   3.0000000
max           11.0000000  4.0000000  13.0000000  18.0000000
range          9.0000000  4.0000000   8.0000000  15.0000000
sum          107.0000000 29.0000000 181.0000000 160.0000000
median         5.0000000  1.0000000   9.0000000   7.0000000
mean           5.3500000  1.4500000   9.0500000   8.0000000
SE.mean        0.4988144  0.2562380   0.5547641   0.8795932
CI.mean.0.95   1.0440305  0.5363122   1.1611345   1.8410097
var            4.9763158  1.3131579   6.1552632  15.4736842
std.dev        2.2307657  1.1459310   2.4809803   3.9336604
coef.var       0.4169656  0.7902973   0.2741415   0.4917076
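To see what each row of the stat.desc output means, the values for y1 can be recomputed with base R alone. The helper describe_column below is hypothetical (it is not part of pastecs) and is only a sketch of the usual formulas: nbr.null counts zeros, SE.mean is the standard deviation divided by the square root of n, CI.mean is the half-width of the t-based confidence interval, and coef.var is the standard deviation divided by the mean.

```r
# Hypothetical helper (not part of pastecs): recomputes the stat.desc
# rows for one numeric vector using base R only.
describe_column <- function(x, p = 0.95) {
  n_na <- sum(is.na(x))          # missing values
  x <- x[!is.na(x)]              # statistics use the non-missing values
  n <- length(x)
  se <- sd(x) / sqrt(n)          # standard error of the mean
  c(
    nbr.val  = n,
    nbr.null = sum(x == 0),      # number of zeros
    nbr.na   = n_na,
    min      = min(x),
    max      = max(x),
    range    = max(x) - min(x),
    sum      = sum(x),
    median   = median(x),
    mean     = mean(x),
    SE.mean  = se,
    CI.mean  = qt(0.5 + p / 2, df = n - 1) * se,  # CI half-width
    var      = var(x),
    std.dev  = sd(x),
    coef.var = sd(x) / mean(x)
  )
}

# The y1 column of df2, copied from the output above.
y1 <- c(4, 4, 2, 4, 7, 6, 5, 6, 5, 6, 11, 4, 4, 4, 10, 5, 3, 4, 5, 8)
y1_stats <- describe_column(y1)
round(y1_stats, 4)
```

Because coef.var divides by the mean, it becomes very large and hard to interpret when the mean is close to zero, which is why the x1 column in Example 1 shows a value near -2930.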