How to get rows/index names in Pandas dataframe?
Pandas is a widely used Python library for data manipulation and analysis. It offers an efficient set of tools for handling structured data, including support for data wrangling, cleaning, and visualization. One of its main features is working with tabular data, that is, data organized into rows and columns. Each row and column in a Pandas DataFrame has a label, which makes it simple to refer to specific rows and columns. The row labels are collectively called the "index", and the column labels are called the "column names".
Getting the row names is a common task when working with Pandas DataFrames. Operations such as filtering, joining, and grouping can all depend on it. It is also useful to know the row names when the data has been imported from other sources, such as CSV files or databases. Pandas provides several tools for this, including the index attribute for reading the row names and the set_index() and reset_index() methods for changing them. These make it simple to adapt the row names to specific requirements, such as using a column as the row labels or restoring the default numbering.
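As a minimal sketch of how these methods affect the row names, the example below uses a small made-up DataFrame (the column names and values are assumptions chosen for illustration). It also uses rename(), which is not mentioned above, as one way to relabel individual rows.

```python
import pandas as pd

# hypothetical data, used only for illustration
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})

# set_index() returns a new DataFrame whose row names come from the 'name' column
by_name = df.set_index('name')
print(by_name.index.tolist())

# reset_index() moves the row names back into a column and restores the default numbering
restored = by_name.reset_index()
print(restored.index.tolist())

# rename() relabels individual rows through a mapping of old name to new name
renamed = by_name.rename(index={'Alice': 'Alicia'})
print(renamed.index.tolist())
```

All three methods return a new DataFrame by default, so ‘df’ itself is left unchanged.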
Algorithm
Using the ‘index’, ‘df.index.values’, or ‘df.axes’ attribute
Create a pandas DataFrame
Get the row names
Print the row names
Optionally, convert the Index object to a list using the ‘tolist()’ method
Print the list of row names
Using a for loop
Create a pandas DataFrame
Get the row names using the ‘index’ attribute
Loop through the row names and print them
Optionally, convert the Index object into a list using the ‘tolist()’ method
Loop through the list of row names and print them
Approaches
Using ‘index’ attribute
Using ‘df.index.values’ attribute
Using ‘df.axes’ attribute
Using a for loop
Approach 1: Using ‘index’ attribute
Import the pandas library as pd. Create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Read the row names into the ‘row_names’ variable using the ‘index’ attribute and print them with ‘print()’. We could stop here, but the ‘index’ attribute returns a Pandas Index object rather than a plain list. The dtype='object' in the output indicates that the labels are stored as generic Python objects, which here are strings. If we want the row names as a list, we can convert them using the ‘tolist()’ method and print the result.
Example
import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['X', 'Y', 'Z'])

# get the row names
row_names = df.index
print(row_names)

# convert the row names into list
row_names_list = df.index.tolist()
print(row_names_list)
Output
Index(['X', 'Y', 'Z'], dtype='object')
['X', 'Y', 'Z']
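The row names are not always strings. As a small sketch (the DataFrame names ‘df_default’ and ‘df_labeled’ are chosen here for illustration), a DataFrame created without an explicit index gets a default integer index, and ‘tolist()’ then returns integers instead of strings.

```python
import pandas as pd

# no index argument: pandas assigns the default labels 0, 1, 2
df_default = pd.DataFrame({'A': [1, 2, 3]})
print(df_default.index)
print(df_default.index.tolist())

# explicit string labels: tolist() returns plain Python strings
df_labeled = pd.DataFrame({'A': [1, 2, 3]}, index=['X', 'Y', 'Z'])
labels = df_labeled.index.tolist()
print(labels)
print(type(labels[0]))
```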
Approach 2: Using ‘df.index.values’ attribute
Import the pandas library as pd. Create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Here, ‘df.index.values’ returns the row names as an array, and ‘tolist()’ converts that array to a list in the same line. We store the result in a variable called ‘row_names’ and then print it.
Example
import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.index.values.tolist()

# print the row names
print(row_names)
Output
['row1', 'row2', 'row3']
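The pandas documentation recommends ‘to_numpy()’ over ‘values’ when a NumPy array is needed. The sketch below, which is an alternative and not part of the approach above, shows that both routes give the same row names for this DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]}, index=['row1', 'row2', 'row3'])

# to_numpy() returns the row names as a NumPy array
as_array = df.index.to_numpy()
print(type(as_array))
print(as_array.tolist())

# the labels obtained through 'values' match the ones from tolist()
print(list(df.index.values) == df.index.tolist())
```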
Approach 3: Using ‘df.axes’ attribute
Import the pandas library as pd. Create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Here, we use the ‘df.axes’ attribute, which returns a list containing both the row axis and the column axis of the DataFrame. The row axis is the first element, so we access it with the index [0]. We then convert the resulting Pandas Index object into a list with the ‘tolist()’ method. The ‘df.axes’ attribute is useful when we need both the row and column axes of the DataFrame in a single step. Finally, we print the row names using ‘print()’.
Example
import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.axes[0].tolist()

# print the row names
print(row_names)
Output
['row1', 'row2', 'row3']
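To illustrate reading both axes in a single step, here is a short sketch that unpacks ‘df.axes’ into two variables (the names ‘row_axis’ and ‘column_axis’ are chosen here for illustration).

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['row1', 'row2', 'row3'])

# df.axes is a two-element list: the row axis first, then the column axis
row_axis, column_axis = df.axes

print(row_axis.tolist())
print(column_axis.tolist())

# the row axis holds the same labels as df.index
print(row_axis.equals(df.index))
```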
Approach 4: Using a For Loop
Import the pandas library as pd. Create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Here, we simply use the ‘index’ attribute to get the row names and store them in the ‘row_names’ variable. We then use a for loop to print each row name one by one. We don't need the ‘tolist()’ method here because an Index object can be iterated over directly.
Example
import pandas as pd

# create a DataFrame
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]}, index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.index

# print the row names
for row_name in row_names:
    print(row_name)
Output
row1
row2
row3
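A loop often needs more than the row name alone. As a sketch of two common variations that go beyond the approach above, ‘enumerate()’ pairs each row name with its position, and ‘iterrows()’ yields each row name together with the row's data.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]}, index=['row1', 'row2', 'row3'])

# pair each row name with its position in the DataFrame
for position, row_name in enumerate(df.index):
    print(position, row_name)

# iterrows() yields (row name, row data) pairs
for row_name, row in df.iterrows():
    print(row_name, row['name'], row['age'])
```

Note that ‘iterrows()’ is convenient for small DataFrames but can be slow on large ones.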
Real-World Application
Consider a scenario in which we have a DataFrame containing the test results of the students in a class and want the average score for each student. Each student's scores can be identified by the row name, after which we can calculate the average.
The example below creates a DataFrame with three rows (Alice, Bob, and Charlie) and three columns (test1, test2, and test3) and assigns it to the variable df. It then uses the index attribute to retrieve the row names and converts them to a list with the tolist() method. Finally, it loops through the list, selects each student's row with df.loc, and calls mean() on that row to get the average score.
Example
import pandas as pd

# Create a DataFrame with test scores
df = pd.DataFrame({'test1': [85, 90, 95], 'test2': [80, 85, 90], 'test3': [75, 80, 85]}, index=['Alice', 'Bob', 'Charlie'])

# Get the row names of the DataFrame
students = df.index.tolist()

# Compute the average score for each student
for student in students:
    avg_score = df.loc[student].mean()
    print(f"{student}: {avg_score}")
Output
Alice: 80.0
Bob: 85.0
Charlie: 90.0
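The same averages can also be computed without an explicit loop. The sketch below is an alternative to the example above, not a replacement for it: ‘df.mean(axis=1)’ averages across the columns of each row and returns a Series that keeps the row names as its index.

```python
import pandas as pd

df = pd.DataFrame({'test1': [85, 90, 95], 'test2': [80, 85, 90], 'test3': [75, 80, 85]}, index=['Alice', 'Bob', 'Charlie'])

# axis=1 averages across the columns, giving one value per row
averages = df.mean(axis=1)

# the result is labelled with the same row names as the DataFrame
print(averages.index.tolist())
print(averages.to_dict())
```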
Conclusion
In conclusion, Pandas provides convenient tools for working with structured data, including the ability to quickly read and modify a DataFrame's row names. Knowing how to access and work with the row labels makes data manipulation and analysis in Pandas more effective.
The primary techniques for obtaining the row names in a Pandas DataFrame are the 'index', 'index.values', and 'df.axes' attributes, along with iterating over the row names in a for loop. These techniques can be adapted to specific use cases, such as calculating the average score for every student in a class.