How to get row/index names in a Pandas DataFrame?


Pandas is a widely used Python library for data manipulation and analysis. It offers an efficient set of tools for handling structured data, including data wrangling, cleaning, and visualization. Working with tabular data, which is data organized into rows and columns, is one of Pandas' main features. Each row and column in a Pandas DataFrame has a label, making it simple to refer to specific rows and columns. The row labels are collectively called the "index", whereas the column labels are called the "column names".

Getting the row names is a common task when working with Pandas DataFrames. Operations like filtering, joining, and grouping can all benefit from it. When working with data imported from other sources, such as CSV files or databases, it is also useful to know the row names. Pandas provides several tools for accessing and modifying the row names, including the index attribute and the reset_index() and set_index() methods. These make it simple to change the row names to suit particular requirements, such as renaming rows or using a column as the row labels.
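As a rough illustration of how these three tools relate, here is a minimal sketch. The DataFrame contents and the column names ‘city’ and ‘temp’ are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"city": ["Oslo", "Lima", "Pune"], "temp": [4, 19, 31]})

# With no explicit index, pandas assigns default integer labels 0, 1, 2.
default_labels = df.index.tolist()

# set_index() returns a new DataFrame that uses a column as the row labels.
by_city = df.set_index("city")
city_labels = by_city.index.tolist()

# reset_index() moves the labels back into a regular column
# and restores the default integer labels.
restored = by_city.reset_index()
restored_labels = restored.index.tolist()

print(default_labels)
print(city_labels)
print(restored_labels)
```

Both set_index() and reset_index() return a new DataFrame by default, so ‘df’ itself is left unchanged here.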

Algorithm

Using the ‘index’, ‘index.values’ and ‘axes’ attributes

  • Create a pandas dataframe

  • Get row names

  • Print the row names

  • Optionally convert the index object to a list using the ‘tolist()’ method

  • Print the list of row names

Using a for loop

  • Create a pandas dataframe

  • Get row names using ‘index’ attribute

  • Loop through the row names and print them

  • Optionally, convert the index object into a list using ‘tolist()’ method

  • Loop through the list of row names and print them

Approaches

  • Using ‘index’ attribute

  • Using ‘df.index.values’ attribute

  • Using ‘df.axes’ attribute

  • Using a for loop

Approach 1: Using ‘index’ attribute

Import the pandas library as pd and create a DataFrame named ‘df’ using ‘pd.DataFrame()’, passing the row labels through the ‘index’ argument. Read the row names from the ‘index’ attribute, store them in the ‘row_names’ variable, and print them using ‘print()’. We could stop here, but the ‘index’ attribute returns a pandas Index object rather than a plain Python list; the dtype='object' in the output below is the dtype pandas reports for the string labels in this example. If we want the row names as a list, we can convert them using the ‘tolist()’ method and print the result.

Example

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['X', 'Y', 'Z'])

# get the row names
row_names = df.index

print(row_names)

# convert the row names into list
row_names_list = df.index.tolist()

print(row_names_list)

Output

Index(['X', 'Y', 'Z'], dtype='object')
['X', 'Y', 'Z']
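The Index object is often usable as-is, without converting it to a list. The following is a small sketch of a few operations it supports; the variable names are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['X', 'Y', 'Z'])

row_names = df.index

# Positional access works like a sequence.
first_label = row_names[0]

# Membership tests check the labels, not the data.
has_y = 'Y' in row_names

# len() gives the number of rows.
count = len(row_names)

# get_loc() returns the integer position of a label (for unique labels).
position_of_z = row_names.get_loc('Z')

print(first_label, has_y, count, position_of_z)
```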

Approach 2: Using ‘df.index.values’ attribute

Import the pandas library as pd and create a DataFrame named ‘df’ using ‘pd.DataFrame()’. The ‘values’ attribute of the index returns the row labels as an array (a NumPy array for an ordinary index like this one). Here, we read that array and convert it to a list with ‘tolist()’ in the same line, store the result in a variable called ‘row_names’, and print it.

Example

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.index.values.tolist()

# print the row names
print(row_names)

Output

['row1', 'row2', 'row3']
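If an array is what we actually need, the conversion to a list can be skipped. The sketch below uses ‘to_numpy()’, the method the pandas documentation recommends over ‘values’ when a NumPy array is wanted; the exact array type returned by ‘values’ can depend on the index dtype and pandas version, so treat that part as an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]}, index=['row1', 'row2', 'row3'])

# .values exposes the underlying array of labels.
values = df.index.values

# .to_numpy() always returns a NumPy ndarray.
as_array = df.index.to_numpy()

print(type(as_array).__name__)
print(list(values))
```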

Approach 3: Using ‘df.axes’ attribute

Import the pandas library as pd and create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Here, we use the ‘df.axes’ attribute, which returns a list containing the DataFrame’s row axis followed by its column axis. We access the row axis by taking the first element, ‘df.axes[0]’, and convert the resulting pandas Index object into a list with the ‘tolist()’ method. The ‘df.axes’ attribute is useful when we need both the row and column axes of the DataFrame in a single step. Finally, we print the row names using ‘print()’.

Example

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.axes[0].tolist()

# print the row names
print(row_names)

Output

['row1', 'row2', 'row3']
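Since ‘df.axes’ holds both axes, they can be unpacked together. This is a minimal sketch of that single-step access; the names ‘row_axis’ and ‘column_axis’ are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=['row1', 'row2', 'row3'])

# df.axes is a two-element list: [row axis, column axis].
row_axis, column_axis = df.axes

row_names = row_axis.tolist()
column_names = column_axis.tolist()

print(row_names)
print(column_names)
```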

Approach 4: Using a For Loop

Import the pandas library as pd and create a DataFrame named ‘df’ using ‘pd.DataFrame()’. Here, we use the ‘index’ attribute to get the row names and store them in the ‘row_names’ variable. We then use a for loop to print each row name one at a time. We don’t need the ‘tolist()’ method here because an Index object can be iterated over directly.

Example

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]},
                  index=['row1', 'row2', 'row3'])

# get the row names
row_names = df.index

# print the row names
for row_name in row_names:
    print(row_name)

Output

row1
row2
row3
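A loop over the index can also be combined with position numbers or with the row data. The following sketch shows two possible variations, using Python's built-in ‘enumerate()’ and the DataFrame's ‘iterrows()’ method; the ‘lines’ and ‘ages’ variables are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]},
                  index=['row1', 'row2', 'row3'])

# Pair each row name with its integer position.
lines = []
for position, row_name in enumerate(df.index):
    lines.append(f"{position}: {row_name}")

# iterrows() yields (row name, row as a Series) pairs.
ages = {}
for row_name, row in df.iterrows():
    ages[row_name] = row['age']

print(lines)
print(ages)
```

Note that ‘iterrows()’ can be slow on large DataFrames, so it is best suited to small tables or quick inspection.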

Real-World Application

Consider a scenario in which we want the average score for each student in a class, using a DataFrame that contains the students' test results. Each student's scores can be looked up by row name, after which we can calculate the average.

In the example below, a DataFrame with three rows (Alice, Bob, and Charlie) and three columns (test1, test2, and test3) is created and assigned to the variable df. The index attribute retrieves the row names, and tolist() converts them to a list. The code then loops through that list, selects each student's row with df.loc, and calls mean() on it to get the average score. Here is how to do it:

Example

import pandas as pd

# Create a DataFrame with test scores
df = pd.DataFrame({'test1': [85, 90, 95], 'test2': [80, 85, 90], 'test3': [75, 80, 85]}, index=['Alice', 'Bob', 'Charlie'])

# Get the row names of the DataFrame
students = df.index.tolist()

# Compute the average score for each student
for student in students:
    avg_score = df.loc[student].mean()
    print(f"{student}: {avg_score}")

Output

Alice: 80.0
Bob: 85.0
Charlie: 90.0
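The same averages can also be computed without an explicit loop. As an alternative sketch (not the approach used above), ‘mean(axis=1)’ averages across the columns of every row at once and keeps the row names as the labels of the result.

```python
import pandas as pd

df = pd.DataFrame({'test1': [85, 90, 95], 'test2': [80, 85, 90], 'test3': [75, 80, 85]},
                  index=['Alice', 'Bob', 'Charlie'])

# axis=1 averages across columns, giving one value per row.
averages = df.mean(axis=1)

# The result is a Series indexed by the row names.
best_student = averages.idxmax()

print(averages)
print(best_student)
```

Because the result is indexed by student name, a single student's average can be read with ‘averages['Bob']’.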

Conclusion

In conclusion, Pandas provides convenient tools for working with structured data, including the ability to quickly extract and modify a DataFrame's row names. Knowing how to access and work with the row labels makes data manipulation and analysis in Pandas more effective.

Reading the 'index', 'index.values' and 'axes' attributes and iterating over the row names in a for loop are the primary techniques for obtaining the row names of a Pandas DataFrame. These techniques can be adapted to specific use cases, such as calculating the average score for every student in a class.

Updated on: 24-Jul-2023
