 
Show Pearson Correlation Test Between Two Variables using Python
The Pearson correlation test is a statistical method that measures the strength and direction of the linear relationship between two variables. It tells you whether the two variables are related and, if so, how strongly. In Python, the Pearson correlation can be computed with the pearsonr() function from scipy.stats.
The coefficient falls between -1 and 1: -1 is a perfect negative correlation, 0 indicates no linear relationship, and 1 is a perfect positive correlation.
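As a rough illustration of what the coefficient measures, the sketch below computes it from the textbook definition in plain Python: the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The name pearson_r is a hypothetical helper written for this illustration, not a library function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of every value from its own mean
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # Numerator: sum of the products of paired deviations
    numerator = sum(a * b for a, b in zip(dx, dy))
    # Denominator: square root of the product of the summed squared deviations
    # (this is zero, and the division fails, if either input is constant)
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


print(pearson_r([2, 4, 6, 8], [1, 3, 5, 7]))  # prints 1.0
```

The two sample lists rise in lockstep, so the result is a perfect positive correlation.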
Syntax
This syntax is used in all the following examples.
pearsonr(variable1,variable2)
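pearsonr() returns two values: the correlation coefficient and a p-value for the test of no correlation. The short sketch below unpacks both; the sample lists are illustrative values chosen for this sketch.

```python
from scipy.stats import pearsonr

# Illustrative sample data
var1 = [2.2, 4.6, 6.8, 7.8]
var2 = [1.3, 3.2, 5.6, 9.7]

# The first value is the coefficient, the second is the p-value
r, p_value = pearsonr(var1, var2)
print(f"r = {r:.4f}, p-value = {p_value:.4f}")
```

When only the coefficient is needed, the second value can be discarded by unpacking it into _.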
Algorithm
- Step 1: Import the required modules and libraries.
- Step 2: Define the variables or datasets, either as lists:
   var1 = [ ]
   var2 = [ ]
  or, if you want to work from a CSV file, as a DataFrame:
   df = pd.read_csv("file_name.csv")
- Step 3: Apply the pearsonr() function to calculate the correlation.
- Step 4: Print the result.
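The four steps above can be sketched end to end for the CSV case. To keep the sketch self-contained, the CSV content is read from an in-memory string instead of a file; the data and the column names hours and score are made up for illustration. With a real file, you would pass its path to pd.read_csv() instead.

```python
import io

# Step 1: import the libraries
import pandas as pd
from scipy.stats import pearsonr

# Step 2: define the dataset (hypothetical data and column names)
csv_text = "hours,score\n1,52\n2,58\n3,61\n4,70\n5,74\n"
df = pd.read_csv(io.StringIO(csv_text))

# Step 3: apply pearsonr() to the two columns
correlation, p_value = pearsonr(df["hours"], df["score"])

# Step 4: print the result
print("Pearson correlation:", correlation)
```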
Method 1: Computing the correlation from variables
Example 1
Computing the Pearson correlation between two lists of integers.
from scipy.stats import pearsonr
var1 = [2, 4, 6, 8]   #1st variable
var2 = [1, 3, 5, 7]   #2nd variable
# find Pearson correlation 
correlation,_ = pearsonr(var1, var2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: 1.0
In this code, the 'pearsonr' function is imported from 'scipy.stats'. Two lists named var1 and var2 are created and passed to 'pearsonr()', which calculates the Pearson correlation between them. The coefficient is stored in correlation, the second return value (the p-value) is discarded with _, and the coefficient is then printed. A result of 1.0 means the two lists are perfectly positively correlated.
Example 2
Computing the Pearson correlation between two lists of decimal values.
from scipy.stats import pearsonr
var1 = [2.2, 4.6, 6.8, 7.8]   #1st variable
var2 = [1.3, 3.2, 5.6, 9.7]   #2nd variable
# find Pearson correlation 
correlation,_ = pearsonr(var1, var2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: 0.9385130127002226
In this code, the 'pearsonr' function is imported from 'scipy.stats'. Two lists of decimal values, var1 and var2, are created and passed to 'pearsonr()', which calculates the Pearson correlation between them. The coefficient is stored in correlation and then printed. A result close to 1 indicates a strong positive relationship.
Example 3
Computing the Pearson correlation between two lists of negative values.
from scipy.stats import pearsonr
var1 = [-2, -5, -1, -7]   #1st variable
var2 = [-8, -3, -6, -9]   #2nd variable
# find Pearson correlation 
correlation,_ = pearsonr(var1, var2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: 0.11437725271791938
In this code, the 'pearsonr' function is imported from 'scipy.stats'. Two lists of negative values, var1 and var2, are created and passed to 'pearsonr()', which calculates the Pearson correlation between them. The coefficient is stored in correlation and then printed. The result is close to 0, so the linear relationship between the two lists is weak.
Example 4
Computing the Pearson correlation between two lists with mixed positive and negative values.
from scipy.stats import pearsonr
var1 = [-2, 5, -1, -7]   #1st variable
var2 = [-4, -3, -6, 2]   #2nd variable
# find Pearson correlation 
correlation,_ = pearsonr(var1, var2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: -0.5717997297136825
In this code, the two lists mix positive and negative values. The negative coefficient indicates that var2 tends to decrease as var1 increases.
Method 2: Computing the correlation from a dataset
Example 1
Computing the Pearson correlation between two columns of a CSV dataset.
You can download the csv file from here - student_data
import pandas as pd
from scipy.stats import pearsonr
# load the dataset
df = pd.read_csv("student_clustering.csv")
# select the two numeric columns (each one is a pandas Series)
column1 = df['cgpa']
column2 = df['iq']
# find Pearson correlation 
correlation,_ = pearsonr(column1, column2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: 0.5353007092636304
This value indicates a moderate positive relationship between the variables.
In this code, we first load the dataset (student_clustering.csv) from its path. We then select two numeric columns of the same length, cgpa and iq, apply the pearsonr() function to them, and print the correlation value.
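Real CSV files often contain missing values, which can leave the two columns with gaps. One possible way to handle this, sketched below, is to drop every row where either column is missing before calling pearsonr(). The data and the column names x and y are hypothetical, and the CSV is read from an in-memory string so the sketch runs without a file.

```python
import io

import pandas as pd
from scipy.stats import pearsonr

# Hypothetical data: one value is missing in each column
csv_text = "x,y\n1.0,2.1\n2.0,\n3.0,6.2\n,8.0\n5.0,9.9\n6.0,12.1\n"
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows where both columns are present, so the two
# series stay paired row by row and have the same length
clean = df[["x", "y"]].dropna()

correlation, _ = pearsonr(clean["x"], clean["y"])
print("Rows used:", len(clean))
print("Pearson correlation:", correlation)
```

Dropping rows is only one option; whether it is appropriate depends on why the values are missing.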
Example 2
Computing the Pearson correlation between two columns of a CSV dataset.
You can download the csv file from here - cardata
import pandas as pd
from scipy.stats import pearsonr
# load the dataset
df = pd.read_csv("cardata.csv")
# select the two numeric columns (each one is a pandas Series)
column1 = df['Selling_Price']
column2 = df['Present_Price']
# find Pearson correlation 
correlation,_ = pearsonr(column1, column2)
print('Pearson correlation:', correlation)
Output
Pearson correlation: 0.8252819190808663
This value indicates a strong positive relationship between the variables because it is close to 1.
In this code, we first load the dataset (cardata.csv) from its path. We then select two numeric columns of the same length, Selling_Price and Present_Price, apply the pearsonr() function to them, and print the correlation value.
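To make the reading of these coefficients more concrete, the sketch below turns a value into a short label. The function describe_correlation and its cutoffs (0.3 and 0.7) are assumptions chosen for illustration; they are a rough rule of thumb, not a standard, and different fields use different thresholds.

```python
def describe_correlation(r):
    """Label a Pearson coefficient using rough, assumed cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"
    size = abs(r)
    # Assumed rule-of-thumb thresholds, not a universal standard
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.8252819190808663))  # strong positive
print(describe_correlation(0.5353007092636304))  # moderate positive
```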
Conclusion
In conclusion, the Pearson correlation test is a useful tool for anyone working with data who wants to understand how two variables are related. Using Python and the scipy library, you can run this test in a few lines and learn the direction and strength of the linear relationship between two variables.
