Write a Python program to trim the minimum and maximum threshold value in a dataframe
Sometimes you need to limit values in a DataFrame to fall within specific minimum and maximum thresholds. Pandas provides the clip() method to trim values that exceed these boundaries.
Understanding DataFrame Clipping
The clip() method constrains values between a lower and upper limit:
- The lower parameter sets the minimum threshold.
- The upper parameter sets the maximum threshold.
- Values below the lower limit are replaced with the lower limit.
- Values above the upper limit are replaced with the upper limit.
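To make the replacement rule concrete, here is a minimal sketch that applies the same rule by hand and compares it with clip(). The variable names (values, raised, manual) are illustrative only, and the manual version is shown for comparison, not as how pandas implements clip() internally.

```python
import pandas as pd

values = pd.Series([12, 34, 56, 78, 28])

# Built-in clipping between 30 and 50
clipped = values.clip(lower=30, upper=50)

# The same rule by hand: where() keeps entries that satisfy the
# condition and replaces the rest with the given bound
raised = values.where(values >= 30, 30)   # values below 30 become 30
manual = raised.where(raised <= 50, 50)   # values above 50 become 50

print(clipped.tolist())
print(manual.tolist())
```

Both lists should match, which shows that clip() is a shorthand for the two conditional replacements.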
Syntax
DataFrame.clip(lower=None, upper=None, axis=None)
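The lower and upper arguments also accept list-like values such as a Series, and axis then controls whether those values are matched against rows or columns. The sketch below assumes hypothetical per-column bounds (per_column_min and per_column_max are made-up names and numbers) and uses axis=1 to align them with the column labels.

```python
import pandas as pd

df = pd.DataFrame({"Column1": [12, 34, 56, 78, 28],
                   "Column2": [23, 30, 25, 50, 90]})

# Hypothetical per-column bounds, indexed by column name
per_column_min = pd.Series({"Column1": 20, "Column2": 30})
per_column_max = pd.Series({"Column1": 60, "Column2": 80})

# axis=1 aligns the Series labels with the DataFrame's columns
clipped = df.clip(lower=per_column_min, upper=per_column_max, axis=1)
print(clipped)
```

Here Column1 is limited to the range 20 to 60 and Column2 to the range 30 to 80, so each column gets its own thresholds in one call.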
Creating Sample Data
Let's create a DataFrame with values that need trimming:
import pandas as pd
data = {"Column1": [12, 34, 56, 78, 28],
"Column2": [23, 30, 25, 50, 90]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Original DataFrame:
   Column1  Column2
0       12       23
1       34       30
2       56       25
3       78       50
4       28       90
Applying Minimum Threshold
Setting a lower bound of 30 replaces every value below 30 with 30:
import pandas as pd
data = {"Column1": [12, 34, 56, 78, 28],
"Column2": [23, 30, 25, 50, 90]}
df = pd.DataFrame(data)
min_threshold = df.clip(lower=30)
print("Minimum threshold (lower=30):")
print(min_threshold)
Minimum threshold (lower=30):
   Column1  Column2
0       30       30
1       34       30
2       56       30
3       78       50
4       30       90
Applying Maximum Threshold
Setting an upper bound of 50 replaces every value above 50 with 50:
import pandas as pd
data = {"Column1": [12, 34, 56, 78, 28],
"Column2": [23, 30, 25, 50, 90]}
df = pd.DataFrame(data)
max_threshold = df.clip(upper=50)
print("Maximum threshold (upper=50):")
print(max_threshold)
Maximum threshold (upper=50):
   Column1  Column2
0       12       23
1       34       30
2       50       25
3       50       50
4       28       50
Applying Both Thresholds
Combining the lower and upper bounds constrains every value to the range 30 to 50:
import pandas as pd
data = {"Column1": [12, 34, 56, 78, 28],
"Column2": [23, 30, 25, 50, 90]}
df = pd.DataFrame(data)
clipped_df = df.clip(lower=30, upper=50)
print("Clipped DataFrame (30 ≤ values ≤ 50):")
print(clipped_df)
Clipped DataFrame (30 ≤ values ≤ 50):
   Column1  Column2
0       30       30
1       34       30
2       50       30
3       50       50
4       30       50
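The examples so far clip the whole DataFrame and store the result in a new variable. The sketch below illustrates two related points under the same sample data: clip() returns a new object and leaves the original untouched, and a single column can be clipped by calling clip() on that column. The name one_column is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Column1": [12, 34, 56, 78, 28],
                   "Column2": [23, 30, 25, 50, 90]})

# clip() returns a new DataFrame; df itself keeps its original values
clipped_df = df.clip(lower=30, upper=50)

# Clip only Column1, working on a copy so df stays unchanged
one_column = df.copy()
one_column["Column1"] = one_column["Column1"].clip(lower=30, upper=50)

print(df)
print(one_column)
```

In one_column, only Column1 is limited to the range 30 to 50, while Column2 keeps values such as 23 and 90.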
Complete Example
Here's the complete program demonstrating all three clipping operations:
import pandas as pd
data = {"Column1": [12, 34, 56, 78, 28],
"Column2": [23, 30, 25, 50, 90]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print("\nMinimum threshold (lower=30):")
print(df.clip(lower=30))
print("\nMaximum threshold (upper=50):")
print(df.clip(upper=50))
print("\nClipped DataFrame (30 ≤ values ≤ 50):")
print(df.clip(lower=30, upper=50))
Original DataFrame:
   Column1  Column2
0       12       23
1       34       30
2       56       25
3       78       50
4       28       90

Minimum threshold (lower=30):
   Column1  Column2
0       30       30
1       34       30
2       56       30
3       78       50
4       30       90

Maximum threshold (upper=50):
   Column1  Column2
0       12       23
1       34       30
2       50       25
3       50       50
4       28       50

Clipped DataFrame (30 ≤ values ≤ 50):
   Column1  Column2
0       30       30
1       34       30
2       50       30
3       50       50
4       30       50
Conclusion
The clip() method is useful for data preprocessing when you need to constrain values within specific bounds. Use lower for minimum thresholds, upper for maximum thresholds, or both together for range limiting.
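As one possible preprocessing use, the bounds can be derived from the data instead of being fixed numbers. The sketch below assumes, purely for illustration, that the 25th and 75th percentiles of each column are suitable limits; that choice is not from the examples above and should be adapted to the data at hand.

```python
import pandas as pd

df = pd.DataFrame({"Column1": [12, 34, 56, 78, 28],
                   "Column2": [23, 30, 25, 50, 90]})

# Assumed limits: per-column 25th and 75th percentiles
low = df.quantile(0.25)
high = df.quantile(0.75)

# axis=1 matches each bound to its column by label
trimmed = df.clip(lower=low, upper=high, axis=1)
print(trimmed)
```

Because the quantiles are computed per column, each column is trimmed to its own range, and extreme values such as 90 in Column2 are pulled back to that column's upper limit.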
