Python - Ranking Rows of Pandas DataFrame


Adding a column that holds the rank of each row makes it easier to sort a data frame and to find the position of a particular element. For example:

Our Dataframe

   Name            Play Time (in hours)  Rate
0  Call Of Duty    45                    Better than Average
1  Total Overdose  46                    Good
2  GTA 3           52                    Best
3  Bully           22                    Average

Output

   Name            Play Time (in hours)  Rate                 ranking
0  Call Of Duty    45                    Better than Average  3.0
1  Total Overdose  46                    Good                 2.0
2  GTA 3           52                    Best                 1.0
3  Bully           22                    Average              4.0

As the example shows, the ranks here are whole numbers but are displayed with a decimal point, because .rank() returns floating-point values. Fractional ranks appear when two or more elements tie: by default, the tied elements share the average of the positions they occupy, so their rank may not be a whole number.
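To make the tie case concrete, here is a minimal sketch. The play times below are hypothetical (they are not the tutorial's data) and were chosen so that two entries tie at 46 hours.

```python
import pandas as pd

# Hypothetical play times; the second and third entries tie at 46 hours.
hours = pd.Series([45, 46, 46, 22])

# With the default method ('average'), the tied values occupy positions
# 1 and 2 in descending order, so each receives (1 + 2) / 2 = 1.5.
ranks = hours.rank(ascending=False)
print(ranks.tolist())  # [3.0, 1.5, 1.5, 4.0]
```

The next distinct value gets rank 3.0, not 2.0, because the two tied entries together take up the first two positions.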

How do we assign ranks to our data frame?

To rank the elements of a data frame, we use the pandas .rank() method. We call it on the column that serves as the ranking criterion, and it returns a Series holding the rank of each row, which we can then store in a new column.
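As a small illustration of what .rank() hands back, the sketch below uses a hypothetical 'Score' column (not part of the tutorial's data). It also passes method='min', one of the tie-handling options pandas offers, so that tied rows share the lowest position and the result can be stored as integers if whole-number ranks are preferred.

```python
import pandas as pd

# Hypothetical data for illustration; rows B and D tie on Score.
df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D'],
                   'Score': [10, 30, 20, 30]})

# rank() returns a Series aligned with the data frame's index.
ranks = df['Score'].rank(ascending=False, method='min')
print(type(ranks).__name__)  # Series

# With method='min' every rank is a whole number, so casting to int is safe.
df['ranking'] = ranks.astype(int)
print(df)
```

Whether 'min', 'dense', 'first', or the default 'average' is the right choice depends on how ties should be reported; this sketch picks 'min' only as an example.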

Example

The code for using the .rank() function is:

import pandas as pd

# Play times are stored as integers so they are ranked numerically
games = {'Name': ['Call Of Duty', 'Total Overdose', 'GTA 3', 'Bully'],
         'Play Time(in hours)': [45, 46, 52, 22],
         'Rate': ['Better than Average', 'Good', 'Best', 'Average']}
df = pd.DataFrame(games)

# ascending=False gives rank 1 to the longest play time
df['ranking'] = df['Play Time(in hours)'].rank(ascending=False)
print(df)

Output

             Name  Play Time(in hours)                 Rate  ranking
0    Call Of Duty                   45  Better than Average      3.0
1  Total Overdose                   46                 Good      2.0
2           GTA 3                   52                 Best      1.0
3           Bully                   22              Average      4.0
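One thing to watch for: if the play times arrive as text (for example, when read from a file), they may not be ranked by numeric value. A minimal sketch of converting them first, using hypothetical values and pandas' to_numeric function:

```python
import pandas as pd

# Hypothetical play times stored as strings.
raw = pd.Series(['9', '10', '100'])

# Convert to numbers so the ranking follows numeric order.
hours = pd.to_numeric(raw)

ranks = hours.rank(ascending=False)
print(ranks.tolist())  # [3.0, 2.0, 1.0]
```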

Explanation of the Above Code

In this code, we use the built-in .rank() function of the pandas library to rank each element of the data frame. The ranking criterion is the ‘Play Time(in hours)’ column.

We add a column named ‘ranking’ to the data frame and fill it with the result of calling .rank() on the column we want to rank by (here, the Play Time(in hours) column). Since the ascending argument is turned off, the ranks run in descending order, so the longest play time gets rank 1. Once the new column is created, we print the data frame.
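Once the ‘ranking’ column exists, it can also be used to order the rows. The sketch below rebuilds the tutorial's data frame and sorts it with sort_values; the variable name sorted_df is only illustrative.

```python
import pandas as pd

games = {'Name': ['Call Of Duty', 'Total Overdose', 'GTA 3', 'Bully'],
         'Play Time(in hours)': [45, 46, 52, 22],
         'Rate': ['Better than Average', 'Good', 'Best', 'Average']}
df = pd.DataFrame(games)
df['ranking'] = df['Play Time(in hours)'].rank(ascending=False)

# Sort so that rank 1 comes first, then renumber the index from 0.
sorted_df = df.sort_values('ranking').reset_index(drop=True)
print(sorted_df)
```

After sorting, the rows appear in the order GTA 3, Total Overdose, Call Of Duty, Bully.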

Conclusion

In this tutorial, we ranked the rows of a data frame using the pandas library's built-in .rank() function and printed the result. Ranking the rows of a pandas dataframe is straightforward: choose the column to rank by, call .rank() on it, and store the result in a new column.

raja
Updated on 26-Nov-2021 06:03:47
