Filter the rows – Python Pandas

In Pandas, filtering rows by some criterion is a common data manipulation task. The Series.str.contains() method is particularly useful for string columns: it checks whether each value contains a given substring or matches a pattern.

Basic Row Filtering with contains()

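The str.contains() method returns a boolean Series (a mask) with one True or False value per row, which can be used to select the matching DataFrame rows. The examples in this article use the following sample DataFrame: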

import pandas as pd

# Create sample DataFrame
data = {
    'Car': ['Lamborghini', 'Ferrari', 'Lamborghini', 'Porsche', 'BMW'],
    'Model': ['Huracan', 'F8', 'Aventador', '911', 'M3'],
    'Year': [2020, 2021, 2019, 2020, 2018],
    'Price': [240000, 280000, 400000, 150000, 70000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
           Car      Model  Year   Price
0  Lamborghini    Huracan  2020  240000
1      Ferrari         F8  2021  280000
2  Lamborghini  Aventador  2019  400000
3      Porsche        911  2020  150000
4          BMW         M3  2018   70000
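
To make the boolean mask itself visible, the following minimal sketch prints it before using it for selection. It reuses the 'Car' and 'Price' columns of the sample DataFrame; the variable names mask and matches are chosen here for illustration only.

```python
import pandas as pd

# Same 'Car' and 'Price' values as the sample DataFrame
df = pd.DataFrame({
    'Car': ['Lamborghini', 'Ferrari', 'Lamborghini', 'Porsche', 'BMW'],
    'Price': [240000, 280000, 400000, 150000, 70000],
})

# str.contains() yields one True/False value per row
mask = df['Car'].str.contains('Lamborghini')
print(mask.tolist())            # [True, False, True, False, False]

# Indexing the DataFrame with the mask keeps only the rows marked True
matches = df[mask]
print(matches.index.tolist())   # [0, 2]
```

Storing the mask in a variable first is optional, but it makes the filter easier to inspect and to reuse.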

Filtering Rows with Specific Text

Filter rows where the 'Car' column contains 'Lamborghini':

import pandas as pd

# Create sample DataFrame
data = {
    'Car': ['Lamborghini', 'Ferrari', 'Lamborghini', 'Porsche', 'BMW'],
    'Model': ['Huracan', 'F8', 'Aventador', '911', 'M3'],
    'Year': [2020, 2021, 2019, 2020, 2018],
    'Price': [240000, 280000, 400000, 150000, 70000]
}
df = pd.DataFrame(data)

# Filter rows containing 'Lamborghini'
filtered_df = df[df['Car'].str.contains('Lamborghini')]
print("Filtered DataFrame (Lamborghini only):")
print(filtered_df)

Output:

Filtered DataFrame (Lamborghini only):
           Car      Model  Year   Price
0  Lamborghini    Huracan  2020  240000
2  Lamborghini  Aventador  2019  400000
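
Real data often has missing values in string columns. Depending on the pandas version and the column dtype, a missing value can produce a missing entry in the mask, and indexing with such a mask raises an error. The sketch below, which uses hypothetical data with one missing manufacturer, passes na=False so that missing values simply count as "no match".

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row has no manufacturer name
df = pd.DataFrame({
    'Car': ['Lamborghini', np.nan, 'Lamborghini', 'Porsche'],
    'Model': ['Huracan', 'F8', 'Aventador', '911'],
})

# na=False treats missing values as non-matching, so the mask stays boolean
mask = df['Car'].str.contains('Lamborghini', na=False)
filtered_df = df[mask]
print(filtered_df)
```

Passing na=False explicitly is a defensive choice: it keeps the filter working whether or not the column happens to contain missing values.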

Advanced Filtering Options

Case-Insensitive Filtering

Pass case=False for case-insensitive matching:

import pandas as pd

data = {
    'Car': ['lamborghini', 'Ferrari', 'LAMBORGHINI', 'Porsche'],
    'Model': ['Huracan', 'F8', 'Aventador', '911']
}
df = pd.DataFrame(data)

# Case-insensitive filtering
filtered_df = df[df['Car'].str.contains('lambo', case=False)]
print("Case-insensitive filtering:")
print(filtered_df)

Output:

Case-insensitive filtering:
           Car      Model
0  lamborghini    Huracan
2  LAMBORGHINI  Aventador
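
For regex patterns, case=False has the same effect as passing the re.IGNORECASE flag through the flags parameter of str.contains(). The sketch below compares the two on the same data; the variable names are illustrative.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Car': ['lamborghini', 'Ferrari', 'LAMBORGHINI', 'Porsche'],
    'Model': ['Huracan', 'F8', 'Aventador', '911'],
})

# Option 1: the case parameter
mask_case = df['Car'].str.contains('lambo', case=False)

# Option 2: a flag from the re module, forwarded to the regex engine
mask_flags = df['Car'].str.contains('lambo', flags=re.IGNORECASE)

# Both masks select the same rows
print(mask_case.tolist() == mask_flags.tolist())
```

The flags parameter is mainly useful when other re flags are needed as well; for plain case-insensitive matching, case=False is the shorter spelling.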

Using Regular Expressions

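str.contains() interprets the pattern as a regular expression by default (regex=True), so passing regex=True only makes that intent explicit. Regex patterns allow more complex matching: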

import pandas as pd

data = {
    'Car': ['Lamborghini Huracan', 'Ferrari F8', 'Lamborghini Aventador', 'Porsche 911'],
    'Price': [240000, 280000, 400000, 150000]
}
df = pd.DataFrame(data)

# Filter using a regex pattern: '^' anchors the match to the start of the string
filtered_df = df[df['Car'].str.contains(r'^Lamborghini', regex=True)]
print("Regex filtering:")
print(filtered_df)

Output:

Regex filtering:
                     Car   Price
0    Lamborghini Huracan  240000
2  Lamborghini Aventador  400000
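
Because patterns are treated as regular expressions by default, search text that contains regex metacharacters such as ( or . may not behave as plain text. The sketch below uses hypothetical car names with parentheses and shows two ways to match literally: regex=False, or escaping the text with re.escape().

```python
import re
import pandas as pd

# Hypothetical names that contain regex metacharacters
df = pd.DataFrame({
    'Car': ['Porsche 911 (992)', 'Porsche 911 GT3', 'Ferrari F8 (Tributo)'],
})

# regex=False: the search text is matched character for character
literal_mask = df['Car'].str.contains('911 (', regex=False)

# Alternative: escape the text and keep regex matching enabled
escaped_mask = df['Car'].str.contains(re.escape('911 ('))

print(df[literal_mask])
```

As a regex, the unescaped text '911 (' would be an invalid pattern because of the unbalanced parenthesis, which is why literal matching is the safer choice for user-supplied search terms.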

Multiple Conditions

Combine multiple filtering conditions with the element-wise logical operators & (and), | (or), and ~ (not), wrapping each condition in parentheses:

import pandas as pd

data = {
    'Car': ['Lamborghini', 'Ferrari', 'Lamborghini', 'Porsche', 'BMW'],
    'Year': [2020, 2021, 2019, 2020, 2018],
    'Price': [240000, 280000, 400000, 150000, 70000]
}
df = pd.DataFrame(data)

# Multiple conditions: Lamborghini cars from 2020 or later
filtered_df = df[(df['Car'].str.contains('Lamborghini')) & (df['Year'] >= 2020)]
print("Lamborghini cars from 2020+:")
print(filtered_df)

Output:

Lamborghini cars from 2020+:
           Car  Year   Price
0  Lamborghini  2020  240000
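
The same idea extends to "or" and "not" conditions. The sketch below, using the same data, shows regex alternation with |, the equivalent combination of two masks, and negation with ~. The variable names and the 200000 price threshold are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Lamborghini', 'Ferrari', 'Lamborghini', 'Porsche', 'BMW'],
    'Year': [2020, 2021, 2019, 2020, 2018],
    'Price': [240000, 280000, 400000, 150000, 70000],
})

# OR inside one call: regex alternation matches either name
ferrari_or_porsche = df[df['Car'].str.contains('Ferrari|Porsche')]

# The same selection written as two masks joined with |
same_rows = df[df['Car'].str.contains('Ferrari') | df['Car'].str.contains('Porsche')]

# NOT combined with AND: non-Lamborghini cars priced under 200000
affordable_others = df[~df['Car'].str.contains('Lamborghini') & (df['Price'] < 200000)]

print(ferrari_or_porsche.index.tolist())   # [1, 3]
print(affordable_others.index.tolist())    # [3, 4]
```

Regex alternation keeps the expression short when several substrings target one column, while separate masks are easier to read when the conditions involve different columns.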

Common Use Cases

Method                            | Use Case                  | Example
----------------------------------|---------------------------|----------------------------
str.contains('text')              | Basic substring matching  | Find cars containing "BMW"
str.contains('text', case=False)  | Case-insensitive matching | Find "bmw", "BMW", "Bmw"
str.contains(pattern, regex=True) | Pattern matching          | Find cars starting with "L"

Conclusion

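The str.contains() method is a convenient way to filter DataFrame rows based on string patterns. Use case=False for case-insensitive searches, rely on regular-expression patterns (enabled by default through regex=True) for advanced matching, and pass regex=False when the search text should be matched literally.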

Updated on: 2026-03-26T13:25:44+05:30
