
Replacing strings with numbers in Python for Data Analysis
In data analysis, it is often necessary to convert string values to numbers (int/float). One approach is to assign each distinct string a unique integer so that the values can be told apart numerically.
For this example, the data comes from a Comma Separated Values (CSV) file. Say we have a CSV file containing the following data:
Company | Industry | Recommendation |
---|---|---|
HDFC Bank | Finance | Hold |
Apollo | Healthcare | Buy |
Hero | Automobile | Underperform |
Yes Bank | Finance | Hold |
M&M | Automobile | Underperform |
Fortis | Healthcare | Buy |
Maruti | Automobile | Underperform |
These are just a few rows from a large dataset. We need to give each recommendation (i.e. Buy, Hold, Underperform, etc.) an integer value that links to our metadata. For the above input, the expected output is something like:
Company | Industry | Recommendation |
---|---|---|
HDFC Bank | Finance | 2 |
Apollo | Healthcare | 1 |
Hero | Automobile | 3 |
Yes Bank | Finance | 2 |
M&M | Automobile | 3 |
Fortis | Healthcare | 1 |
Maruti | Automobile | 3 |
Here is one way to replace the string values in a column with integers.
Code 1
    # Import required library
    import pandas as pd

    # Import the CSV file into Python using read_csv() from pandas
    dataframe = pd.read_csv("data_pandas1.csv")

    # Create the dictionary of key-value pairs, where the key is
    # your old value (string) and the value is your new value (integer).
    Recommendation = {'Buy': 1, 'Hold': 2, 'Underperform': 3}

    # Assign the values from the above dictionary to the column of your table
    dataframe.Recommendation = [Recommendation[item] for item in dataframe.Recommendation]

    # New table
    print(dataframe)
Result
         Company    Industry  Recommendation
    0  HDFC Bank     Finance               2
    1     Apollo  Healthcare               1
    2       Hero  Automobile               3
    3   Yes Bank     Finance               2
    4        M&M  Automobile               3
    5     Fortis  Healthcare               1
    6     Maruti  Automobile               3
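The dictionary lookup in Code 1 raises a `KeyError` as soon as the column contains a label that is missing from the dictionary. As a minimal sketch of an alternative, the same mapping can be applied with `Series.map`, which returns NaN for unmapped labels so they can be checked explicitly. The sample rows are built inline here instead of being read from `data_pandas1.csv`, and the name `recommendation_codes` and the choice to raise on unknown labels are assumptions made for this example.

```python
import pandas as pd

# Sample rows copied from the table above; built inline so no CSV file is needed.
dataframe = pd.DataFrame({
    "Company": ["HDFC Bank", "Apollo", "Hero", "Yes Bank", "M&M", "Fortis", "Maruti"],
    "Industry": ["Finance", "Healthcare", "Automobile", "Finance",
                 "Automobile", "Healthcare", "Automobile"],
    "Recommendation": ["Hold", "Buy", "Underperform", "Hold",
                       "Underperform", "Buy", "Underperform"],
})

# Same key-value pairs as the dictionary in Code 1.
recommendation_codes = {"Buy": 1, "Hold": 2, "Underperform": 3}

# Series.map looks each value up in the dictionary; labels that are
# missing from the dictionary become NaN instead of raising KeyError.
mapped = dataframe["Recommendation"].map(recommendation_codes)

# Assumed policy for this sketch: stop if any label has no code defined.
unmapped = dataframe.loc[mapped.isna(), "Recommendation"].unique()
if len(unmapped) > 0:
    raise ValueError(f"No code defined for: {list(unmapped)}")

# Every label was mapped, so the column can safely be stored as integers.
dataframe["Recommendation"] = mapped.astype(int)
print(dataframe)
```

Whether to raise, fill in a default code, or leave NaN for unknown labels depends on the dataset; raising is simply the strictest option.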
There is another way to write the above code that does not use a dictionary: we directly assign a new value to the entries of the column (Recommendation here) wherever a condition matches.
    # Import required library
    import pandas as pd

    # Import the CSV file into Python using read_csv() from pandas
    dataf = pd.read_csv("data_pandas1.csv")

    # Directly assign a different integer to the entries of the Recommendation
    # column wherever a condition matches, e.g. rows where the column holds
    # "Buy" are assigned the integer 1.
    dataf.loc[dataf.Recommendation == 'Buy', 'Recommendation'] = 1
    dataf.loc[dataf.Recommendation == 'Hold', 'Recommendation'] = 2
    dataf.loc[dataf.Recommendation == 'Underperform', 'Recommendation'] = 3

    print(dataf)
Result
         Company    Industry  Recommendation
    0  HDFC Bank     Finance               2
    1     Apollo  Healthcare               1
    2       Hero  Automobile               3
    3   Yes Bank     Finance               2
    4        M&M  Automobile               3
    5     Fortis  Healthcare               1
    6     Maruti  Automobile               3
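Both approaches above require the integer for each label to be written out by hand. When the specific integers do not matter and each distinct string only needs its own unique code, `pd.factorize` can assign them automatically. The following is a sketch under that assumption. The codes it produces follow the order in which labels first appear, so they differ from the hand-picked values used earlier (here Hold becomes 1 and Buy becomes 2), and the shift to start at 1 and the name `code_to_label` are choices made only for this example.

```python
import pandas as pd

# Recommendation column from the sample table, built inline for this sketch.
recommendations = pd.Series(
    ["Hold", "Buy", "Underperform", "Hold", "Underperform", "Buy", "Underperform"],
    name="Recommendation",
)

# factorize returns integer codes (starting at 0, in order of first
# appearance) together with the unique labels those codes refer to.
codes, labels = pd.factorize(recommendations)

# Shift by one so the codes start at 1 (a choice made for this example only).
codes = codes + 1

# Keep a lookup table so the integers can be translated back to labels later.
code_to_label = {position + 1: label for position, label in enumerate(labels)}

print(codes.tolist())
print(code_to_label)
```

Keeping `code_to_label` alongside the data plays the role of the metadata mentioned earlier: without it, the meaning of each integer is lost.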
The above are just a couple of ways to replace string data in a table (a CSV file) with integer values; the same requirement comes up in many situations where a data field has to be converted from string to integer.
