Python – Sort grouped Pandas dataframe by group size?
To group the rows of a Pandas DataFrame, use groupby(). Calling size() on the grouped object returns a Series holding the number of rows in each group, indexed by the group key. To order the groups by that count, chain sort_values() and set ascending to True or False.
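To make the role of size() concrete, the short sketch below contrasts the DataFrame's own size attribute with size() called on a grouped object. It reuses the car data that appears throughout this article; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Car": ["BMW", "Lexus", "Audi", "Mercedes", "Jaguar", "Bentley"],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})

# DataFrame.size is an attribute: total number of cells (rows * columns)
total_cells = df.size

# GroupBy.size() is a method: number of rows in each group,
# returned as a Series indexed by the group key
group_sizes = df.groupby("Reg_Price").size()

print(total_cells)
print(group_sizes)
```

Only the second result is useful for ranking groups, because it carries one count per distinct Reg_Price value.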
Steps Involved
Sorting a grouped DataFrame by group size involves the following steps:
- Import the pandas library and create a DataFrame.
- Group the rows with groupby(), count the rows in each group with size(), and sort the counts in descending order with sort_values(ascending=False).
- Sort the same counts in ascending order with sort_values(ascending=True).
Creating a Pandas DataFrame
First, import the pandas library and create a DataFrame:
import pandas as pd
# DataFrame with one of the columns as Reg_Price
dataFrame = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
"Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})
print("DataFrame...\n", dataFrame)
DataFrame...
Car Reg_Price
0 BMW 1000
1 Lexus 1400
2 Audi 1000
3 Mercedes 900
4 Jaguar 1700
5 Bentley 900
Grouping and Sorting by Group Size
Sorting in Descending Order
Group by the Reg_Price column, count the rows in each group with size(), and sort the counts in descending order by setting ascending=False:
import pandas as pd
dataFrame = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
"Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})
# Group by Reg_Price and sort group sizes in descending order
result_desc = dataFrame.groupby('Reg_Price').size().sort_values(ascending=False)
print("Sorted in Descending order...\n", result_desc)
Sorted in Descending order...
Reg_Price
1000 2
900 2
1700 1
1400 1
dtype: int64
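Groups of equal size, such as 1000 and 900 in the output above, are ties, and their relative order may differ between pandas versions and sort algorithms. If a reproducible order matters, one option is to convert the sizes to a DataFrame and use the group key as a secondary sort column. The following is a minimal sketch under that assumption; the column name count is made up for illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ["BMW", "Lexus", "Audi", "Mercedes", "Jaguar", "Bentley"],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})

# Turn the Series of group sizes into a two-column DataFrame;
# "count" is a hypothetical name chosen for this example
sizes = dataFrame.groupby("Reg_Price").size().reset_index(name="count")

# Largest groups first; ties are broken by ascending Reg_Price
ordered = sizes.sort_values(["count", "Reg_Price"], ascending=[False, True])
print(ordered)
```

Sorting on two columns makes the result independent of how ties happen to fall in a single-column sort.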
Sorting in Ascending Order
Group by the Reg_Price column and sort the group sizes in ascending order by setting ascending=True (the default):
import pandas as pd
dataFrame = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
"Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})
# Group by Reg_Price and sort group sizes in ascending order
result_asc = dataFrame.groupby('Reg_Price').size().sort_values(ascending=True)
print("Sorted in Ascending order...\n", result_asc)
Sorted in Ascending order...
Reg_Price
1400 1
1700 1
900 2
1000 2
dtype: int64
Complete Example
The following example combines the steps. The DataFrame contains the columns Car and Reg_Price. We group by Reg_Price with groupby(), count the rows in each group with size(), and then sort the group sizes in both descending and ascending order with sort_values():
import pandas as pd
# DataFrame with one of the columns as Reg_Price
dataFrame = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
"Reg_Price": [1000, 1400, 1000, 900, 1700, 900]
})
print("DataFrame...\n", dataFrame)
# Group according to Reg_Price column and sort in descending order
print("\nSorted in Descending order...")
print(dataFrame.groupby('Reg_Price').size().sort_values(ascending=False))
# Group according to Reg_Price column and sort in ascending order
print("\nSorted in Ascending order...")
print(dataFrame.groupby('Reg_Price').size().sort_values(ascending=True))
DataFrame...
Car Reg_Price
0 BMW 1000
1 Lexus 1400
2 Audi 1000
3 Mercedes 900
4 Jaguar 1700
5 Bentley 900
Sorted in Descending order...
Reg_Price
1000 2
900 2
1700 1
1400 1
dtype: int64
Sorted in Ascending order...
Reg_Price
1400 1
1700 1
900 2
1000 2
dtype: int64
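The examples above sort the group sizes themselves. If the goal is instead to reorder the original rows so that cars from the largest groups come first, one possible approach is to map each row to its group size and sort on that value. This is a sketch rather than the only way to do it; group_size is a temporary, hypothetical column name.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ["BMW", "Lexus", "Audi", "Mercedes", "Jaguar", "Bentley"],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})

# Number of rows per Reg_Price group
sizes = dataFrame.groupby("Reg_Price").size()

# Attach each row's group size, sort on it (ties broken by Reg_Price),
# then drop the helper column again
rows_by_group_size = (
    dataFrame.assign(group_size=dataFrame["Reg_Price"].map(sizes))
    .sort_values(["group_size", "Reg_Price"], ascending=[False, True])
    .drop(columns="group_size")
)
print(rows_by_group_size)
```

The rows priced at 900 and 1000 (two cars each) come first, followed by the single-car groups 1400 and 1700.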
Comparison
| Sort Order | sort_values() Argument | Result |
|---|---|---|
| Descending | ascending=False | Largest groups first |
| Ascending | ascending=True | Smallest groups first |
Conclusion
Use groupby().size().sort_values() to sort groups by their size. Set ascending=False for largest groups first, or ascending=True for smallest groups first. This technique is useful for identifying the most or least common categories in your data.
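When only the few most or least common categories are needed, a full sort can be skipped. Here is a brief sketch using Series.nlargest() and Series.nsmallest() on the group sizes; n=2 is an arbitrary choice for this small dataset.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ["BMW", "Lexus", "Audi", "Mercedes", "Jaguar", "Bentley"],
    "Reg_Price": [1000, 1400, 1000, 900, 1700, 900],
})

sizes = dataFrame.groupby("Reg_Price").size()

# The two prices shared by the most cars
most_common = sizes.nlargest(2)

# The two prices shared by the fewest cars
least_common = sizes.nsmallest(2)

print(most_common)
print(least_common)
```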
