Select rows that contain specific text using Pandas

To select rows that contain specific text in Pandas, use the str.contains() method. This is useful for filtering DataFrames based on text patterns or substrings within columns.

Basic Syntax

The basic syntax for selecting rows with specific text is ?

df = df[df['column_name'].str.contains('text')]

Example with Sample Data

Let's create a sample DataFrame and select rows containing "BMW" ?

import pandas as pd

# Creating a sample DataFrame
data = {
    'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
    'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
    'UnitsSold': [80, 110, 100, 95, 80, 80, 100, 120, 100]
}

dataFrame = pd.DataFrame(data)
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
         Car       Place  UnitsSold
0       Audi   Bangalore         80
1    Porsche      Mumbai        110
2  RollsRoyce        Pune        100
3        BMW       Delhi         95
4   Mercedes   Hyderabad         80
5 Lamborghini  Chandigarh         80
6       Audi      Mumbai        100
7   Mercedes        Pune        120
8 Lamborghini       Delhi        100

Selecting Rows with Specific Text

Now let's select rows that contain "BMW" in the Car column ?

import pandas as pd

# Creating the same DataFrame
data = {
    'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
    'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
    'UnitsSold': [80, 110, 100, 95, 80, 80, 100, 120, 100]
}

dataFrame = pd.DataFrame(data)

# Select rows containing text "BMW"
bmw_rows = dataFrame[dataFrame['Car'].str.contains('BMW')]
print("Rows containing BMW:")
print(bmw_rows)
Rows containing BMW:
  Car  Place  UnitsSold
3 BMW  Delhi         95

Case-Insensitive Search

For case-insensitive matching, use the case=False parameter ?

import pandas as pd

data = {
    'Car': ['Audi', 'bmw', 'BMW', 'mercedes'],
    'Place': ['Bangalore', 'Mumbai', 'Delhi', 'Pune'],
    'UnitsSold': [80, 90, 95, 85]
}

dataFrame = pd.DataFrame(data)

# Case-insensitive search for BMW
bmw_rows = dataFrame[dataFrame['Car'].str.contains('bmw', case=False)]
print("Case-insensitive search for 'bmw':")
print(bmw_rows)
Case-insensitive search for 'bmw':
  Car   Place  UnitsSold
1 bmw  Mumbai         90
2 BMW   Delhi         95

Multiple Text Patterns

You can search for multiple patterns using the pipe operator | ?

import pandas as pd

data = {
    'Car': ['Audi', 'Porsche', 'BMW', 'Mercedes', 'Lamborghini'],
    'Place': ['Bangalore', 'Mumbai', 'Delhi', 'Hyderabad', 'Chandigarh'],
    'UnitsSold': [80, 110, 95, 80, 80]
}

dataFrame = pd.DataFrame(data)

# Select rows containing either BMW or Mercedes
luxury_cars = dataFrame[dataFrame['Car'].str.contains('BMW|Mercedes')]
print("Rows containing BMW or Mercedes:")
print(luxury_cars)
Rows containing BMW or Mercedes:
       Car      Place  UnitsSold
2      BMW      Delhi         95
3 Mercedes  Hyderabad         80

Conclusion

Use str.contains() to filter DataFrame rows based on text patterns. Add case=False for case-insensitive searches and use | for multiple patterns. This method is essential for text-based data filtering in Pandas.

Updated on: 2026-03-26T13:34:42+05:30

897 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements