How can data be summarized with Pandas in Python?


Many individual functions can be used to inspect a dataset. To get the main summary statistics in a single call, the ‘describe’ function can be used.

By default, this function reports the ‘count’, ‘mean’, ‘standard deviation’, minimum, 25th percentile, 50th percentile (the median), 75th percentile, and maximum of every numeric column. Non-numeric columns, such as a column of names, are left out.

Example


import pandas as pd

# Build a dataframe from a dictionary of Series (one Series per column)
my_data = {
    'Name': pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
    'Age': pd.Series([45, 67, 89, 12, 23]),
    'value': pd.Series([8.79, 23.24, 31.98, 78.56, 90.20]),
}
my_df = pd.DataFrame(my_data)
print("The dataframe is :")
print(my_df)

# Summary statistics for the numeric columns
print("The description of data is :")
print(my_df.describe())

Output

The dataframe is :
   Name  Age  value
0   Tom   45   8.79
1  Jane   67  23.24
2   Vin   89  31.98
3   Eve   12  78.56
4  Will   23  90.20
The description of data is :
             Age      value
count   5.000000   5.000000
mean   47.200000  46.554000
std    31.499206  35.747102
min    12.000000   8.790000
25%    23.000000  23.240000
50%    45.000000  31.980000
75%    67.000000  78.560000
max    89.000000  90.200000
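
Each row of the description above corresponds to a method that can also be called on its own. The following is a minimal sketch, assuming the same numeric data as in the example; the names ‘summary’ and ‘age_stats’ are illustrative only and not part of the original example.

```python
import pandas as pd

# Same numeric columns as in the example above
my_df = pd.DataFrame({
    'Age': [45, 67, 89, 12, 23],
    'value': [8.79, 23.24, 31.98, 78.56, 90.20],
})

summary = my_df.describe()

# Compute each statistic for 'Age' with its own method
age = my_df['Age']
age_stats = {
    'count': age.count(),
    'mean': age.mean(),
    'std': age.std(),            # sample standard deviation (ddof=1)
    'min': age.min(),
    '25%': age.quantile(0.25),
    '50%': age.median(),
    '75%': age.quantile(0.75),
    'max': age.max(),
}

# Print the individual result next to the matching row of describe()
for name, value in age_stats.items():
    print(name, value, summary.loc[name, 'Age'])
```

Calling the methods one by one is useful when only one or two statistics are needed, while ‘describe’ is convenient for a quick overview.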

Explanation

  • The ‘pandas’ library is imported under the alias ‘pd’ for ease of use.
  • A dictionary is created in which each key is a column name and each value is a Series data structure.
  • This dictionary is passed as a parameter to the ‘DataFrame’ constructor present in the ‘pandas’ library.
  • The dataframe is printed on the console.
  • To get the summary statistics of the data, the ‘describe’ function is called on the dataframe. It summarizes only the numeric columns, so ‘Name’ does not appear in the result.
  • The description is printed on the console.
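
The ‘Name’ column is missing from the description because ‘describe’ skips non-numeric columns by default. The sketch below shows two optional parameters of ‘describe’ that change this behaviour: ‘include’ and ‘percentiles’. The variable names ‘full_summary’ and ‘tail_summary’ are illustrative assumptions, not part of the original example.

```python
import pandas as pd

my_df = pd.DataFrame({
    'Name': ['Tom', 'Jane', 'Vin', 'Eve', 'Will'],
    'Age': [45, 67, 89, 12, 23],
    'value': [8.79, 23.24, 31.98, 78.56, 90.20],
})

# include='all' also summarizes non-numeric columns such as 'Name',
# adding rows like 'unique', 'top' and 'freq'
full_summary = my_df.describe(include='all')
print(full_summary)

# percentiles= replaces the default 25%/50%/75% rows;
# 0.5 is listed explicitly so that the median is kept
tail_summary = my_df.describe(percentiles=[0.1, 0.5, 0.9])
print(tail_summary)
```

With ‘include='all'’, statistics that do not apply to a column (for example, the mean of ‘Name’) are shown as NaN.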

Updated on: 10-Dec-2020
