Python - Filter Pandas DataFrame with numpy
The numpy.where() function can be used to filter a Pandas DataFrame: given a boolean condition, it returns the positions of the rows where that condition is True. This approach is particularly useful when you need to combine several filtering conditions in one expression.
Setup
First, let's import the required libraries and create a sample DataFrame:
import pandas as pd
import numpy as np
# Create a DataFrame with product records
dataFrame = pd.DataFrame({
    "Product": ["SmartTV", "ChromeCast", "Speaker", "Earphone"],
    "Opening_Stock": [300, 700, 1200, 1500],
    "Closing_Stock": [200, 500, 1000, 900]
})
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
      Product  Opening_Stock  Closing_Stock
0     SmartTV            300            200
1  ChromeCast            700            500
2     Speaker           1200           1000
3    Earphone           1500            900
Filtering with Two Conditions
Use np.where() to find rows where Opening_Stock >= 700 AND Closing_Stock < 1000:
import pandas as pd
import numpy as np
dataFrame = pd.DataFrame({
    "Product": ["SmartTV", "ChromeCast", "Speaker", "Earphone"],
    "Opening_Stock": [300, 700, 1200, 1500],
    "Closing_Stock": [200, 500, 1000, 900]
})
# Filter with 2 conditions
indices = np.where((dataFrame['Opening_Stock'] >= 700) & (dataFrame['Closing_Stock'] < 1000))
print("Filtered DataFrame (2 conditions):")
print(dataFrame.loc[indices])
Filtered DataFrame (2 conditions):
      Product  Opening_Stock  Closing_Stock
1  ChromeCast            700            500
3    Earphone           1500            900
Filtering with Three Conditions
Add a third condition to keep only products whose name starts with the letter 'C'. Note that this example also lowers the Opening_Stock threshold from 700 to 500:
import pandas as pd
import numpy as np
dataFrame = pd.DataFrame({
    "Product": ["SmartTV", "ChromeCast", "Speaker", "Earphone"],
    "Opening_Stock": [300, 700, 1200, 1500],
    "Closing_Stock": [200, 500, 1000, 900]
})
# Filter with 3 conditions
indices = np.where((dataFrame['Opening_Stock'] >= 500) &
                   (dataFrame['Closing_Stock'] < 1000) &
                   (dataFrame['Product'].str.startswith('C')))
print("Filtered DataFrame (3 conditions):")
print(dataFrame.loc[indices])
Filtered DataFrame (3 conditions):
      Product  Opening_Stock  Closing_Stock
1  ChromeCast            700            500
How It Works
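When called with only a condition, np.where() returns a tuple of arrays containing the positions where the condition is True. Passing that result to dataFrame.loc[] selects only the rows that meet all specified conditions. Because np.where() yields positions rather than index labels, this works with .loc[] only when the DataFrame has the default 0-based integer index, as it does here; with any other index, select by position using .iloc[] instead.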
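To make the return value concrete, here is a minimal sketch that reuses the sample data and inspects what np.where() produces for the two-condition filter. The names mask, positions, and relabeled are illustrative and not part of the original example; relabeled is an assumed variant of the DataFrame indexed by product name, included only to show position-based selection.

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame({
    "Product": ["SmartTV", "ChromeCast", "Speaker", "Earphone"],
    "Opening_Stock": [300, 700, 1200, 1500],
    "Closing_Stock": [200, 500, 1000, 900]
})

# Boolean Series with one True/False value per row
mask = (dataFrame["Opening_Stock"] >= 700) & (dataFrame["Closing_Stock"] < 1000)

# With a single argument, np.where returns a tuple with one array per dimension
indices = np.where(mask)
positions = indices[0]  # row positions where mask is True
print(type(indices).__name__, len(indices))
print(positions)

# Hypothetical variant: the same data indexed by product name.
# The positions are not labels here, so select by position with .iloc
relabeled = dataFrame.set_index("Product")
print(relabeled.iloc[positions])
```

For a one-dimensional condition the tuple has a single element, so indices[0] is the array of matching row positions.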
Key Points
- Use & for AND operations between conditions
- Use | for OR operations between conditions
- Wrap each condition in parentheses when combining multiple conditions
- The result is a tuple of indices that can be used with .loc[]
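The examples in this article only combine conditions with &, so the following sketch shows the | operator on the same sample data. The thresholds (below 500, at least 1000) and the name either are chosen purely for illustration. The parentheses matter because & and | bind more tightly than comparison operators such as < and >=.

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame({
    "Product": ["SmartTV", "ChromeCast", "Speaker", "Earphone"],
    "Opening_Stock": [300, 700, 1200, 1500],
    "Closing_Stock": [200, 500, 1000, 900]
})

# OR: keep rows where at least one condition holds.
# Each comparison is wrapped in parentheses so it is evaluated before |
either = np.where((dataFrame["Opening_Stock"] < 500) |
                  (dataFrame["Closing_Stock"] >= 1000))

# either[0] is the array of matching row positions; with the default
# integer index these are also valid labels for .loc[]
print("Filtered DataFrame (OR):")
print(dataFrame.loc[either[0]])
```

With this data the filter keeps SmartTV (Opening_Stock below 500) and Speaker (Closing_Stock of 1000).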
Conclusion
The numpy.where() function lets you filter Pandas DataFrames on several combined conditions. It returns the positions of the rows that meet your criteria, which you can then pass to .loc[] to select the filtered rows, provided the DataFrame uses the default integer index.
