Write a program in Python to compute the covariance of grouped data and the grouped covariance between two columns in a given dataframe
Assume you have a dataframe with a subjects column and two marks columns. The covariance of the grouped data, and the grouped covariance between the two marks columns, are as follows:
Grouped data covariance is:
                mark1       mark2
subjects
maths    mark1   25.0   12.500000
         mark2   12.5  108.333333
science  mark1   28.0   50.000000
         mark2   50.0  233.333333
Grouped data covariance between two columns:
subjects
maths      12.5
science    50.0
dtype: float64
Solution
To solve this, we will follow the steps given below −
Define a dataframe
Apply the groupby function to the dataframe's subjects column
df.groupby('subjects')
Apply the covariance function to the grouped data and store the result in group_data,
group_data = df.groupby('subjects').cov()
Apply a lambda function to each subjects group to compute the covariance between the mark1 and mark2 columns. It is defined below,
df.groupby('subjects').apply(lambda x: x['mark1'].cov(x['mark2']))
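The pairwise value from the last step also appears in the grouped covariance matrix, so it can be read from group_data instead of being recomputed. The following is a minimal sketch of that alternative, not part of the original solution; it assumes the second level of the row index holds the column names (as in the result shown earlier), and pair_cov is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'subjects': ['maths', 'maths', 'maths', 'science', 'science', 'science'],
                   'mark1': [80, 90, 85, 95, 93, 85],
                   'mark2': [85, 90, 70, 75, 95, 65]})

# One covariance matrix per subject, stacked under a two-level row index
group_data = df.groupby('subjects').cov()

# Take the 'mark1' row of every matrix (level 1 holds the column names),
# then keep only its 'mark2' entry: cov(mark1, mark2) per subject
pair_cov = group_data.xs('mark1', level=1)['mark2']
print(pair_cov)
```

This gives one covariance per subject, indexed by the subjects values, without calling apply.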
Example
Let’s see the following code to get a better understanding −
import pandas as pd
df = pd.DataFrame({'subjects':['maths','maths','maths','science','science','science'],
                   'mark1':[80,90,85,95,93,85],
                   'mark2':[85,90,70,75,95,65]})
print("DataFrame is:\n",df)
group_data = df.groupby('subjects').cov()
print("Grouped data covariance is:\n", group_data)
result = df.groupby('subjects').apply(lambda x: x['mark1'].cov(x['mark2']))
print("Grouped data covariance between two columns:\n",result)
Output
DataFrame is:
   subjects  mark1  mark2
0    maths     80     85
1    maths     90     90
2    maths     85     70
3  science     95     75
4  science     93     95
5  science     85     65
Grouped data covariance is:
                mark1       mark2
subjects
maths    mark1   25.0   12.500000
         mark2   12.5  108.333333
science  mark1   28.0   50.000000
         mark2   50.0  233.333333
Grouped data covariance between two columns:
subjects
maths      12.5
science    50.0
dtype: float64
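As a cross-check on the figures above, the per-group covariance can be reproduced without pandas. The sketch below uses the sample covariance, which divides by n - 1 (the default for pandas' cov). The names sample_cov, rows, groups and manual are illustrative and do not come from the original code.

```python
# The example's data as (subject, mark1, mark2) rows
rows = [
    ("maths", 80, 85), ("maths", 90, 90), ("maths", 85, 70),
    ("science", 95, 75), ("science", 93, 95), ("science", 85, 65),
]

def sample_cov(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Collect the mark1 and mark2 values of each subject
groups = {}
for subject, m1, m2 in rows:
    marks1, marks2 = groups.setdefault(subject, ([], []))
    marks1.append(m1)
    marks2.append(m2)

# Covariance between mark1 and mark2 within each subject
manual = {subject: sample_cov(m1, m2) for subject, (m1, m2) in groups.items()}
print(manual)
```

Up to floating-point rounding, the values agree with the 12.5 (maths) and 50.0 (science) reported by pandas.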