Find the Geometric Mean of a Given Pandas DataFrame



Pandas is an open-source Python library for storing, deleting, modifying, and updating data in tabular form; its DataFrame is the two-dimensional table structure used throughout this article. Pandas is designed so that it integrates easily with Python programs for data analysis, and it provides a range of tools and techniques for manipulating and processing data.

The geometric mean is a useful measure of the average, or central tendency, of a set of positive numbers. It is computed by multiplying all the numbers in the data set together and then taking the nth root of the product, where n is the total number of values in the set.
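As a small worked illustration of this definition, the geometric mean of 2 and 8 is the square root of 2 × 8 = 16. The values below are chosen only for the example, and the variable names are not part of any library. A minimal sketch in plain Python:

```python
import math

# Illustrative values only
values = [2, 8]

# Multiply all the values together, then take the nth root of the product
product = math.prod(values)              # 2 * 8 = 16
gm_small = product ** (1 / len(values))  # square root, because n = 2

print(gm_small)
```

For comparison, the arithmetic mean of the same two values is 5, so the geometric mean is pulled less strongly toward the larger value.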

Syntax

Syntax to create DataFrame

df = pandas.DataFrame(data, index, columns)
  • "pandas.DataFrame" creates a DataFrame object

  • "data" is the data to store. It can be a list or a dictionary

  • "index" and "columns" are optional and specify the row and column labels
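For instance, the three arguments might be combined as in the sketch below. The row labels and the name df_example are made up for illustration:

```python
import pandas as pd

# data: a dictionary mapping column names to lists of values
data = {'A': [2, 4, 6, 8], 'B': [1, 3, 5, 7]}

# index: optional row labels (hypothetical names for this sketch)
row_labels = ['w', 'x', 'y', 'z']

# columns: optional column labels; here they match the dictionary keys
df_example = pd.DataFrame(data, index=row_labels, columns=['A', 'B'])

print(df_example)
```

If "index" is omitted, Pandas labels the rows 0, 1, 2, and so on.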

Approach 1 - Using NumPy

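The following program finds the geometric mean of each column of a DataFrame using NumPy. It takes the natural logarithm of every value, averages the logarithms in each column, and exponentiates the result, which is equivalent to taking the nth root of the product: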

Algorithm

Step 1 - Import the Pandas and NumPy modules

Step 2 - Create a Pandas DataFrame to hold the values

Step 3 - Use NumPy functions to compute the geometric mean of each column and store the result in a variable called geometric_mean

Step 4 - Print the output

Example

import pandas as pd
import numpy as np

# create a sample dataframe
df = pd.DataFrame({
   'A': [2, 4, 6, 8],
   'B': [1, 3, 5, 7]
})

# calculate the geometric mean for each column
geometric_mean = np.exp(np.log(df).mean())

# display the result
print("Geometric mean for each column:")
print(geometric_mean)

Output

Geometric mean for each column:
A    4.426728
B    3.201086
dtype: float64
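One way to check the log-based expression is to compare it with the direct definition, the nth root of each column's product. The sketch below reuses the sample data from this example. The names gm_log and gm_direct are chosen here for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [2, 4, 6, 8],
    'B': [1, 3, 5, 7]
})

# Log-based form: exponential of the mean of the logarithms, per column
gm_log = np.exp(np.log(df).mean())

# Direct form: product of each column, then the nth root (n = number of rows)
gm_direct = df.prod() ** (1 / len(df))

# The two forms agree up to floating-point rounding
print(np.allclose(gm_log, gm_direct))
```

The log-based form assumes every value is strictly positive, because the logarithm of zero or of a negative number is not a finite real number.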

Approach 2 - Using a custom function

The following program defines a custom function called 'geometric_mean' that accepts a Pandas DataFrame as input and uses a loop to calculate a single geometric mean over all the values in the DataFrame, rather than one value per column.

Algorithm

Step 1 - Import the Pandas library

Step 2 - Create a DataFrame and store values in it

Step 3 - Define the custom function

Step 4 - Call the function and store the result in a new variable "gm"

Step 5 - Print "gm"

Example

import pandas as pd

# create sample dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# define a function to calculate geometric mean
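def geometric_mean(data):
    # flatten every column into a single 1-D array of values
    values = data.to_numpy().ravel()
    product = 1
    for val in values:
        product *= val
    # take the nth root of the product, where n is the number of values
    return product ** (1.0 / len(values))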

# calculate geometric mean of dataframe using custom function
gm = geometric_mean(df)

print(gm)

Output

3.764350599503129
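With many values or large values, the running product in a loop like this may overflow. One possible alternative, sketched below, sums logarithms instead of multiplying. The function name geometric_mean_log is hypothetical, and the sketch assumes every value is strictly positive:

```python
import math
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

def geometric_mean_log(data):
    # Hypothetical helper: assumes every value is strictly positive
    values = data.to_numpy().ravel()
    # Sum the logarithms instead of multiplying the raw values
    log_sum = math.fsum(math.log(v) for v in values)
    # Exponentiate the mean of the logarithms
    return math.exp(log_sum / len(values))

gm_safe = geometric_mean_log(df)
print(gm_safe)
```

The result should match the loop-based function up to floating-point rounding.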

Approach 3 - Using Scipy Library

SciPy is a Python library for scientific computing that provides numerical algorithms, optimization routines, and statistical functions.

The following code computes the geometric mean of all the values in a Pandas DataFrame by flattening it into a one-dimensional array and passing that array to the gmean() function from the scipy.stats module.

Algorithm

Step 1 - Import the Pandas library and the gmean() function from scipy.stats

Step 2 - Create the DataFrame "df"

Step 3 - Flatten the DataFrame into a one-dimensional array and pass it to gmean()

Step 4 - Print the output

Example

from scipy.stats import gmean
import pandas as pd

# create sample dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# calculate geometric mean of dataframe using Scipy
gm = gmean(df.to_numpy().ravel())

print(gm)

Output

3.764350599503128
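gmean() also accepts an axis argument, so it can return one geometric mean per column instead of a single value for the whole DataFrame. The sketch below reuses the sample data from this example. The name per_column is chosen here for illustration:

```python
import pandas as pd
from scipy.stats import gmean

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# axis=0 computes one geometric mean per column of the 2-D array
column_means = gmean(df.to_numpy(), axis=0)

# Wrap the result in a Series so each value keeps its column label
per_column = pd.Series(column_means, index=df.columns)

print(per_column)
```

Passing axis=1 instead would give one geometric mean per row.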

Conclusion

The geometric mean, the nth root of the product of n values, is a useful measure of central tendency when analyzing data in Pandas DataFrames. It is especially convenient when there are multiple columns to analyze, because a single expression can compute the geometric mean of every column at once. The NumPy, custom-function, and SciPy approaches shown here give you several ways to calculate it, depending on whether you need one value per column or one value for the whole DataFrame.

Updated on: 2023-08-10T15:34:34+05:30
