How to find the correlation matrix in R using all variables of a data frame?
A correlation matrix shows the direction and strength of the linear relationship between every pair of variables at once. This makes it easier to decide which variables to use in a linear model and which could be dropped. To find the correlation matrix for all variables of a data frame, pass the data frame to the cor function, as in cor(df). Every column of the data frame must be numeric.
Example
Consider the following data frame of continuous variables −
> set.seed(9)
> x1 <- rnorm(20)
> x2 <- rnorm(20, 0.2)
> x3 <- rnorm(20, 0.5)
> x4 <- rnorm(20, 0.8)
> x5 <- rnorm(20, 1)
> df <- data.frame(x1, x2, x3, x4, x5)
> df
            x1          x2          x3           x4          x5
1  -0.76679604  1.95699294 -0.30845634  1.081222227  1.11407587
2  -0.81645834  0.38225214 -1.51938169 -0.402708626 -0.05365988
3  -0.14153519 -0.06688875 -0.23872407  1.265163691  1.15599915
4  -0.27760503  1.12642163  0.88288656  1.152016386  2.30039421
5   0.43630690 -0.49333188  2.23086367  0.210143783 -0.15588645
6  -1.18687252  2.88199007  0.29691805 -0.053599959  1.21604185
7   1.19198691  0.42252448 -0.49639735  0.553267880  1.80447819
8  -0.01819034 -0.50667241 -0.80653629  2.339338571  0.26788427
9  -0.24808460  0.61721325 -0.49783160  1.346077684 -0.61809812
10 -0.36293689  0.56955678 -0.06502873  2.364961851  1.83906927
11  1.27757055 -0.71376435  2.25205784  1.049670178  0.64856205
12 -0.46889715 -0.11691475 -0.04777135 -1.162418630  0.28371561
13  0.07105410  1.24905921 -0.35852571 -0.009060223  0.05970815
14 -0.26603845  0.36811181  0.54929453  0.301314912  1.73016571
15  1.84525720  0.23144021  0.29995552  1.105121769  0.56212952
16 -0.83944966 -0.81033054 -0.60395445  0.510792758  0.75061790
17 -0.07744806  0.58275153  0.74058804  2.257714201  0.32792906
18 -2.61770553 -0.61969653  0.88111362  1.673755484  1.80101407
19  0.88788403  0.56171109  2.73045895 -0.152956042 -0.48886193
20 -0.70749145  0.29337136  1.69920239  0.768324524  1.45401160
Finding the correlation matrix for all variables in df −
> cor(df)
            x1         x2          x3          x4          x5
x1  1.00000000 -0.1332350  0.25115920 -0.04210749 -0.28891754
x2 -0.13323501  1.0000000 -0.15071432 -0.15398933  0.14759671
x3  0.25115920 -0.1507143  1.00000000 -0.05268172 -0.02505888
x4 -0.04210749 -0.1539893 -0.05268172  1.00000000  0.27861734
x5 -0.28891754  0.1475967 -0.02505888  0.27861734  1.00000000
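When the matrix is used to decide which variables to keep, it can help to list each pair of variables once, ordered by the strength of its correlation. The following is a minimal sketch, assuming the same df as in the example (recreated here so the block runs on its own). The names cm, idx and pair_tbl are illustrative and not part of the original example.

```r
# Recreate the example data so this block is self-contained
set.seed(9)
x1 <- rnorm(20)
x2 <- rnorm(20, 0.2)
x3 <- rnorm(20, 0.5)
x4 <- rnorm(20, 0.8)
x5 <- rnorm(20, 1)
df <- data.frame(x1, x2, x3, x4, x5)

# Pearson correlation matrix (the default method of cor)
cm <- cor(df)

# The matrix is symmetric, so keep only positions above the diagonal
idx <- which(upper.tri(cm), arr.ind = TRUE)

# Illustrative table: one row per pair of variables
pair_tbl <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Sort by absolute correlation, strongest pair first
pair_tbl <- pair_tbl[order(-abs(pair_tbl$r)), ]
print(pair_tbl, row.names = FALSE)
```

In the matrix printed for df, every off-diagonal value is below 0.3 in absolute value, so no pair of variables stands out as strongly related.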
Consider the following data frame of discrete (count) variables generated from Poisson distributions −
> a1 <- rpois(20, 2)
> a2 <- rpois(20, 5)
> a3 <- rpois(20, 8)
> a4 <- rpois(20, 10)
> a5 <- rpois(20, 15)
> df_new <- data.frame(a1, a2, a3, a4, a5)
> df_new
   a1 a2 a3 a4 a5
1   2  8  9  5 13
2   1  4  7 11 16
3   2  2  5 12 11
4   1  3 12  9 15
5   1  4  8  4 14
6   0  6  9  8 14
7   2  6 12 10  9
8   7  5 13 11 20
9   0  6  6 13 19
10  4  7 10  8 12
11  0  3 14  8 20
12  3  2 10 15 13
13  2  8  7 12 14
14  2  6 10 11 14
15  2  1  5 10 21
16  2  3 12 10 14
17  3  6  7  9 17
18  0  7  6 14 16
19  2  6  6  9 15
20  2  3  7  8 12
Finding the correlation matrix for all variables in df_new −
> cor(df_new)
            a1          a2          a3          a4           a5
a1 1.000000000  0.02485671  0.26409706  0.05617819  0.009229284
a2 0.024856710  1.00000000 -0.04540504 -0.10727065 -0.184062998
a3 0.264097059 -0.04540504  1.00000000 -0.17991092 -0.013487095
a4 0.056178192 -0.10727065 -0.17991092  1.00000000  0.115063107
a5 0.009229284 -0.18406300 -0.01348709  0.11506311  1.000000000
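By default, cor computes the Pearson coefficient. For count data such as this, a rank-based coefficient is sometimes preferred, and cor can compute one through its method argument. The sketch below is illustrative only: the set.seed(1) call is an assumption added for reproducibility, so the values will differ from the df_new printed above, and the names pearson_cm and spearman_cm are hypothetical.

```r
# Count data generated the same way as df_new in the example.
# set.seed(1) is an added assumption; the original example set no
# seed before rpois, so these values differ from the printed df_new.
set.seed(1)
df_new <- data.frame(
  a1 = rpois(20, 2),
  a2 = rpois(20, 5),
  a3 = rpois(20, 8),
  a4 = rpois(20, 10),
  a5 = rpois(20, 15)
)

# Default Pearson correlation, rounded to two decimals for readability
pearson_cm <- round(cor(df_new), 2)

# Rank-based Spearman correlation via the method argument of cor
spearman_cm <- round(cor(df_new, method = "spearman"), 2)

print(pearson_cm)
print(spearman_cm)
```

Rounding with round only changes how the matrix is displayed here; keep the unrounded result of cor if the values feed into further calculations.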