Python – Sort a Grouped Pandas DataFrame by Group Size in Descending Order
Sorting a grouped Pandas DataFrame by group size in descending order shows which values occur most often in a column. We use groupby() to group the rows, size() to count the rows in each group, and sort_values(ascending=False) to put the largest groups first.
Creating a Sample DataFrame
Let's start by creating a DataFrame with car information:
import pandas as pd
# Create dataframe with Car and Registration Price columns
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})
print("DataFrame:")
print(dataFrame)
DataFrame:
Car Reg_Price
0 BMW 1000
1 Lexus 1400
2 Audi 1000
3 Mercedes 900
4 Jaguar 1700
5 Bentley 900
Grouping and Sorting by Size
Now we'll group by the 'Reg_Price' column and sort the groups by their size in descending order:
import pandas as pd
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})
# Group by Reg_Price and sort by group size in descending order
sorted_groups = dataFrame.groupby('Reg_Price').size().sort_values(ascending=False)
print("Groups sorted by size (descending):")
print(sorted_groups)
Groups sorted by size (descending):
Reg_Price
1000 2
900 2
1700 1
1400 1
dtype: int64
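The sample data contains ties (two groups of size 2 and two of size 1). The code above does not control the relative order of tied groups, so that order may differ between pandas versions. The following is a minimal sketch of one way to make it deterministic, assuming ties should be broken by ascending Reg_Price. The names sizes and ordered and the column label size are illustrative, not part of the original example.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})

# Turn the group sizes into a two-column frame: Reg_Price and size
# ('size' is an illustrative column name)
sizes = dataFrame.groupby('Reg_Price').size().reset_index(name='size')

# Sort by size (largest first), then by Reg_Price (smallest first) to break ties
ordered = sizes.sort_values(['size', 'Reg_Price'], ascending=[False, True])
print(ordered)
```

With this ordering, the 900 group is listed before the 1000 group, and 1400 before 1700, regardless of how ties would otherwise fall.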
Understanding the Process
The process involves three key steps:
- groupby('Reg_Price') − Groups rows by registration price
- size() − Counts the number of rows in each group
- sort_values(ascending=False) − Sorts groups by count in descending order
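The three steps above can be run one at a time to inspect each intermediate result. This is a small sketch using the same sample data; the variable names grouped, counts, and sorted_counts are chosen here for illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})

# Step 1: groupby() returns a DataFrameGroupBy object; nothing is counted yet
grouped = dataFrame.groupby('Reg_Price')
print(type(grouped).__name__)

# Step 2: size() returns a Series indexed by Reg_Price with the row count per group
counts = grouped.size()
print(counts)

# Step 3: sort_values() reorders that Series by count, largest first
sorted_counts = counts.sort_values(ascending=False)
print(sorted_counts)
```

Splitting the chain this way makes it easier to see that the sort operates on the Series of counts, not on the original rows.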
Alternative Approach with Group Details
To see the actual data within each group along with the group sizes:
import pandas as pd
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})
# Get group sizes and sort
group_sizes = dataFrame.groupby('Reg_Price').size().sort_values(ascending=False)
print("Group sizes (sorted):")
print(group_sizes)
print("\nDetailed groups:")
for price in group_sizes.index:
    group_data = dataFrame[dataFrame['Reg_Price'] == price]
    print(f"\nReg_Price {price} (size: {group_sizes[price]}):")
    print(group_data[['Car', 'Reg_Price']])
Group sizes (sorted):
Reg_Price
1000 2
900 2
1700 1
1400 1
dtype: int64
Detailed groups:
Reg_Price 1000 (size: 2):
Car Reg_Price
0 BMW 1000
2 Audi 1000
Reg_Price 900 (size: 2):
Car Reg_Price
3 Mercedes 900
5 Bentley 900
Reg_Price 1700 (size: 1):
Car Reg_Price
4 Jaguar 1700
Reg_Price 1400 (size: 1):
Car Reg_Price
1 Lexus 1400
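If the goal is a single DataFrame whose rows are ordered by the size of their group, rather than a printout per group, one possible sketch is to attach the group size to every row and sort on it. The helper column name group_size and the variables with_size and rows_sorted are assumptions made for this example; ties between equally sized groups are broken by ascending Reg_Price.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})

# Attach each row's group size as a helper column (name is illustrative)
with_size = dataFrame.assign(
    group_size=dataFrame.groupby('Reg_Price')['Car'].transform('size')
)

# Rows from the largest groups first; equal-sized groups ordered by Reg_Price
rows_sorted = with_size.sort_values(
    ['group_size', 'Reg_Price'], ascending=[False, True]
)
print(rows_sorted)
```

The helper column can be removed afterwards with rows_sorted.drop(columns='group_size') if it is not needed in the result.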
Conclusion
Use groupby().size().sort_values(ascending=False) to sort grouped DataFrames by group size in descending order. This approach is particularly useful for identifying the most common categories in your dataset.
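When grouping on a single column, value_counts() is a shorter route to the same counts, since it sorts by count in descending order by default. This is a swapped-in alternative to the groupby() chain, not the method used above; the variable names counts and by_group are illustrative.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})

# value_counts() counts occurrences of each Reg_Price, largest count first
counts = dataFrame['Reg_Price'].value_counts()
print(counts)

# The same counts obtained through groupby().size(), for comparison
by_group = dataFrame.groupby('Reg_Price').size()
print(counts.to_dict() == by_group.to_dict())
```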
