Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Selected Reading
Select rows that contain specific text using Pandas
To select rows that contain specific text in Pandas, use the str.contains() method. This is useful for filtering DataFrames based on text patterns or substrings within columns.
Basic Syntax
The basic syntax for selecting rows with specific text is ?
df = df[df['column_name'].str.contains('text')]
Example with Sample Data
Let's create a sample DataFrame and select rows containing "BMW" ?
import pandas as pd
# Creating a sample DataFrame
data = {
'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
'UnitsSold': [80, 110, 100, 95, 80, 80, 100, 120, 100]
}
dataFrame = pd.DataFrame(data)
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
Car Place UnitsSold
0 Audi Bangalore 80
1 Porsche Mumbai 110
2 RollsRoyce Pune 100
3 BMW Delhi 95
4 Mercedes Hyderabad 80
5 Lamborghini Chandigarh 80
6 Audi Mumbai 100
7 Mercedes Pune 120
8 Lamborghini Delhi 100
Selecting Rows with Specific Text
Now let's select rows that contain "BMW" in the Car column ?
import pandas as pd
# Creating the same DataFrame
data = {
'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
'UnitsSold': [80, 110, 100, 95, 80, 80, 100, 120, 100]
}
dataFrame = pd.DataFrame(data)
# Select rows containing text "BMW"
bmw_rows = dataFrame[dataFrame['Car'].str.contains('BMW')]
print("Rows containing BMW:")
print(bmw_rows)
Rows containing BMW: Car Place UnitsSold 3 BMW Delhi 95
Case-Insensitive Search
For case-insensitive matching, use the case=False parameter ?
import pandas as pd
data = {
'Car': ['Audi', 'bmw', 'BMW', 'mercedes'],
'Place': ['Bangalore', 'Mumbai', 'Delhi', 'Pune'],
'UnitsSold': [80, 90, 95, 85]
}
dataFrame = pd.DataFrame(data)
# Case-insensitive search for BMW
bmw_rows = dataFrame[dataFrame['Car'].str.contains('bmw', case=False)]
print("Case-insensitive search for 'bmw':")
print(bmw_rows)
Case-insensitive search for 'bmw': Car Place UnitsSold 1 bmw Mumbai 90 2 BMW Delhi 95
Multiple Text Patterns
You can search for multiple patterns using the pipe operator | ?
import pandas as pd
data = {
'Car': ['Audi', 'Porsche', 'BMW', 'Mercedes', 'Lamborghini'],
'Place': ['Bangalore', 'Mumbai', 'Delhi', 'Hyderabad', 'Chandigarh'],
'UnitsSold': [80, 110, 95, 80, 80]
}
dataFrame = pd.DataFrame(data)
# Select rows containing either BMW or Mercedes
luxury_cars = dataFrame[dataFrame['Car'].str.contains('BMW|Mercedes')]
print("Rows containing BMW or Mercedes:")
print(luxury_cars)
Rows containing BMW or Mercedes:
Car Place UnitsSold
2 BMW Delhi 95
3 Mercedes Hyderabad 80
Conclusion
Use str.contains() to filter DataFrame rows based on text patterns. Add case=False for case-insensitive searches and use | for multiple patterns. This method is essential for text-based data filtering in Pandas.
Advertisements
