Filter Pandas DataFrame Based on Index
Pandas DataFrame filtering based on index is a fundamental operation for data analysis. The filter() method and boolean indexing provide flexible ways to select specific rows and columns based on their index labels.
Syntax
df.filter(items=None, like=None, regex=None, axis=None)
Parameters
items: List of labels to keep. Returns only rows/columns with matching names.
like: String pattern. Keeps labels containing this substring.
regex: Regular expression pattern for matching labels.
axis: 0 (or 'index') for rows, 1 (or 'columns') for columns. Default is None, which for a DataFrame means columns, so pass axis=0 to filter on the row index.
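The parameters above can be tried on a small frame. The following is a minimal sketch, assuming a hypothetical three-row DataFrame with labels 'x', 'y', 'z'; it shows items= with and without axis=0, and that pandas rejects a call that combines more than one of items, like, and regex.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6]},
    index=["x", "y", "z"],
)

# items= keeps exact label matches; axis=0 applies it to the row index
rows = df.filter(items=["x", "z"], axis=0)
print(rows)

# With axis omitted, a DataFrame is filtered on its column labels
cols = df.filter(items=["A"])
print(cols)

# items, like, and regex are mutually exclusive
raised = False
try:
    df.filter(items=["x"], like="x", axis=0)
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

Because the default axis is the column axis, forgetting axis=0 when the labels live in the row index typically yields an empty selection rather than an error.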
Filtering by Numeric Index Positions
Use iloc[] to filter rows by their numeric positions:
import pandas as pd
# Create a sample DataFrame
data = {'A': [1, 2, 3, 4, 5],
        'B': [6, 7, 8, 9, 10],
        'C': [11, 12, 13, 14, 15]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print("\nFiltered rows at positions 1 and 3:")
# Filter rows based on numeric positions
filtered_df = df.iloc[[1, 3]]
print(filtered_df)
Original DataFrame:
   A   B   C
0  1   6  11
1  2   7  12
2  3   8  13
3  4   9  14
4  5  10  15

Filtered rows at positions 1 and 3:
   A  B   C
1  2  7  12
3  4  9  14
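In the example above the positions and the index labels happen to coincide, because the DataFrame uses the default 0-based index. A short sketch, assuming a hypothetical frame whose labels are 10 to 50, makes the distinction between positions (iloc) and labels (loc) visible:

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions
df = pd.DataFrame({"A": [1, 2, 3, 4, 5]}, index=[10, 20, 30, 40, 50])

# iloc counts positions from 0, regardless of the index labels
by_position = df.iloc[[1, 3]]   # rows labelled 20 and 40

# A slice also works; the stop position is excluded
first_three = df.iloc[0:3]      # rows labelled 10, 20, 30

# loc selects the same rows by label instead of by position
by_label = df.loc[[20, 40]]

print(by_position)
print(first_three)
print(by_position.equals(by_label))
```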
Filtering by Custom Index Labels
Filter DataFrames with custom string indices using boolean conditions:
import pandas as pd
# Create DataFrame with custom index labels
data = {'A': [1, 2, 3, 4, 5],
        'B': [6, 7, 8, 9, 10],
        'C': [11, 12, 13, 14, 15]}
df = pd.DataFrame(data, index=['apple', 'banana', 'orange', 'grape', 'kiwi'])
print("DataFrame with custom indices:")
print(df)
print("\nFiltered rows containing letter 'a':")
# Filter rows where index contains 'a'
filtered_df = df[df.index.str.contains('a')]
print(filtered_df)
DataFrame with custom indices:
        A   B   C
apple   1   6  11
banana  2   7  12
orange  3   8  13
grape   4   9  14
kiwi    5  10  15

Filtered rows containing letter 'a':
        A  B   C
apple   1  6  11
banana  2  7  12
orange  3  8  13
grape   4  9  14
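Substring matching is only one way to build a boolean mask from the index. As a rough sketch on the same fruit labels (the variable names here are illustrative, not part of the original example), str.startswith() and isin() produce masks in the same way, and masks can be combined with & and ~:

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 4, 5]},
    index=["apple", "banana", "orange", "grape", "kiwi"],
)

# Keep rows whose label starts with 'g'
starts_with_g = df[df.index.str.startswith("g")]

# Keep rows whose label is in an explicit list
picked = df[df.index.isin(["apple", "kiwi"])]

# Combine masks: labels containing 'a', excluding 'apple'
a_but_not_apple = df[df.index.str.contains("a") & ~df.index.isin(["apple"])]

print(starts_with_g)
print(picked)
print(a_but_not_apple)
```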
Using filter() Method
The filter() method provides pattern-based filtering for index labels:
import pandas as pd
# Create DataFrame with descriptive indices
data = {'Sales': [100, 200, 150, 300],
        'Profit': [20, 40, 30, 60]}
df = pd.DataFrame(data, index=['Jan_2023', 'Feb_2023', 'Jan_2024', 'Feb_2024'])
print("Original DataFrame:")
print(df)
# Filter rows containing 'Jan'
jan_data = df.filter(like='Jan', axis=0)
print("\nRows containing 'Jan':")
print(jan_data)
# Filter using regex for 2024 data
data_2024 = df.filter(regex='.*2024$', axis=0)
print("\nRows ending with '2024':")
print(data_2024)
Original DataFrame:
          Sales  Profit
Jan_2023    100      20
Feb_2023    200      40
Jan_2024    150      30
Feb_2024    300      60

Rows containing 'Jan':
          Sales  Profit
Jan_2023    100      20
Jan_2024    150      30

Rows ending with '2024':
          Sales  Profit
Jan_2024    150      30
Feb_2024    300      60
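The regex= argument is matched with re.search, so a pattern may match anywhere in the label unless it is anchored. The sketch below reuses the month labels from the example above to show that the leading '.*' is optional and that '^' anchors a match to the start of the label:

```python
import pandas as pd

df = pd.DataFrame(
    {"Sales": [100, 200, 150, 300]},
    index=["Jan_2023", "Feb_2023", "Jan_2024", "Feb_2024"],
)

# re.search scans the whole label, so both patterns select the same rows
short_pattern = df.filter(regex="2024$", axis=0)
long_pattern = df.filter(regex=".*2024$", axis=0)
print(short_pattern.equals(long_pattern))

# '^' anchors the match to the start of the label
jan = df.filter(regex="^Jan", axis=0)
print(jan)

# filter() returns a new object; the original frame is unchanged
print(len(df))
```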
Comparison of Methods
| Method | Use Case | Index Type |
|---|---|---|
| iloc[] | Numeric positions | Any |
| filter(like=) | Substring matching | String labels |
| filter(regex=) | Pattern matching | String labels |
| boolean indexing | Complex conditions | Any |
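Boolean indexing is listed as working with any index type. As a small illustration with a numeric index (a sketch assuming the default 0 to 4 index, with a hypothetical mask), comparison and arithmetic conditions on df.index can be combined into a single mask:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})  # default index 0..4

# Keep rows whose label is at least 1 and odd
mask = (df.index >= 1) & (df.index % 2 == 1)
odd_rows = df[mask]

print(odd_rows)
```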
Conclusion
DataFrame index filtering enables precise data selection using iloc[] for positions, filter() for patterns, and boolean indexing for complex conditions. Choose the method that best fits your filtering requirements and index structure.
