Python Pandas – Remove numbers from string in a DataFrame column
To remove numbers from strings in a DataFrame column, we can use the str.replace() method with regular expressions. This is useful for cleaning data where numbers need to be stripped from text fields.
Basic Setup
First, let's import pandas and create a sample DataFrame with student records:
import pandas as pd
# Create DataFrame with student records
dataFrame = pd.DataFrame({
"Id": ['S01', 'S02', 'S03', 'S04', 'S05', 'S06', 'S07'],
"Name": ['Jack', 'Robin', 'Ted', 'Robin', 'Scarlett', 'Kat', 'Ted'],
"Result": ['Pass', 'Fail', 'Pass', 'Fail', 'Pass', 'Pass', 'Pass']
})
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
Id Name Result
0 S01 Jack Pass
1 S02 Robin Fail
2 S03 Ted Pass
3 S04 Robin Fail
4 S05 Scarlett Pass
5 S06 Kat Pass
6 S07 Ted Pass
Removing Numbers from a Specific Column
Use str.replace() with the regular expression \d+ to remove all digits from the "Id" column:
import pandas as pd
# Create DataFrame
dataFrame = pd.DataFrame({
"Id": ['S01', 'S02', 'S03', 'S04', 'S05', 'S06', 'S07'],
"Name": ['Jack', 'Robin', 'Ted', 'Robin', 'Scarlett', 'Kat', 'Ted'],
"Result": ['Pass', 'Fail', 'Pass', 'Fail', 'Pass', 'Pass', 'Pass']
})
# Remove numbers from the Id column.
# The raw string (r'...') keeps Python from treating \d as a string escape.
dataFrame['Id'] = dataFrame['Id'].str.replace(r'\d+', '', regex=True)
print("Updated DataFrame:")
print(dataFrame)
Updated DataFrame:
Id Name Result
0 S Jack Pass
1 S Robin Fail
2 S Ted Pass
3 S Robin Fail
4 S Scarlett Pass
5 S Kat Pass
6 S Ted Pass
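The "Id" values above only have digits at the end. As a minimal sketch of how the same call behaves on less tidy input, the example below uses a hypothetical "Label" column (not part of the original data) with digits at the start, middle, and end of the strings, plus one missing value:

```python
import pandas as pd

# Hypothetical sample data: digits in different positions, plus a missing value
df = pd.DataFrame({"Label": ["12abc", "a1b22c333", "xyz789", None, "plain"]})

# \d+ matches every run of digits, wherever it occurs in the string
df["Clean"] = df["Label"].str.replace(r"\d+", "", regex=True)

print(df)
```

Every run of digits is removed regardless of position, strings without digits are left unchanged, and the missing value stays missing instead of raising an error.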
Alternative Methods
Using regex=False for Simple Replacements
To remove a specific literal substring rather than a regex pattern, pass regex=False:
import pandas as pd
# Create DataFrame with mixed content
dataFrame = pd.DataFrame({
"Code": ['A123B', 'C456D', 'E789F'],
"Value": [100, 200, 300]
})
# Remove specific numbers (e.g., only '123')
dataFrame['Code'] = dataFrame['Code'].str.replace('123', '', regex=False)
print(dataFrame)
Code Value
0 AB 100
1 C456D 200
2 E789F 300
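To make the effect of the regex flag concrete, here is a small sketch that applies the same pattern text in both modes. The Series and its values are illustrative assumptions; the third entry deliberately contains the literal characters \d+ so that the non-regex call has something to match:

```python
import pandas as pd

# Hypothetical codes; the last one contains the literal text \d+
codes = pd.Series(["A123B", "C456D", r"E\d+F"])

# regex=True: \d+ is a pattern, so each run of digits is removed
as_regex = codes.str.replace(r"\d+", "", regex=True)

# regex=False: \d+ is searched for as three literal characters (\, d, +)
as_literal = codes.str.replace(r"\d+", "", regex=False)

print(as_regex.tolist())
print(as_literal.tolist())
```

With regex=False the digits in the first two codes are untouched, which is why this mode only suits cases where the exact text to remove is known in advance.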
Key Points
| Pattern | Type | Description |
|---|---|---|
| \d+ | Regex | Removes each run of consecutive digits as a single match |
| \d | Regex | Removes digits one match at a time |
| [0-9]+ | Regex | Alternative to \d+ for digit removal |
| String literal | No regex | Removes exact string matches |
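The regex patterns in the table can be compared side by side. The sketch below uses a made-up Series to show that all three give the same result when the replacement is an empty string, and that the difference between \d+ and \d only becomes visible when a placeholder (here "#", chosen purely for illustration) is substituted instead:

```python
import pandas as pd

# Hypothetical sample values
s = pd.Series(["A12B345", "room101", "no digits"])

# Removing digits: all three patterns produce the same cleaned strings
patterns = [r"\d+", r"\d", r"[0-9]+"]
results = {p: s.str.replace(p, "", regex=True).tolist() for p in patterns}
for p, cleaned in results.items():
    print(p, cleaned)

# Substituting a placeholder shows how the matches differ
per_run = s.str.replace(r"\d+", "#", regex=True).tolist()    # one "#" per run of digits
per_digit = s.str.replace(r"\d", "#", regex=True).tolist()   # one "#" per digit
print(per_run)
print(per_digit)
```

For plain removal the choice between these patterns is largely a matter of readability; \d+ simply makes fewer, longer matches.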
Conclusion
Use str.replace(r'\d+', '', regex=True) to remove all numbers from DataFrame string columns. The \d+ pattern matches one or more consecutive digits, and writing it as a raw string avoids Python's invalid escape sequence warning. For removing a known, fixed substring, regex=False is enough.
