Replacing strings with numbers in Python for Data Analysis

In data analysis, converting categorical strings to numerical values is essential for machine learning algorithms and statistical analysis. Python provides several methods to map string values to integers efficiently.

Consider this sample dataset with stock recommendations:

Company    Industry    Recommendation
HDFC Bank  Finance     Hold
Apollo     Healthcare  Buy
Hero       Automobile  Underperform
Yes Bank   Finance     Hold
M&M        Automobile  Underperform
Fortis     Healthcare  Buy

We need to convert the Recommendation column to numerical values: Buy=1, Hold=2, Underperform=3.

Method 1: Using Dictionary Mapping with List Comprehension

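Create a mapping dictionary and apply it with a list comprehension. Any label missing from the dictionary raises a KeyError, so the mapping must cover every value in the column: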

import pandas as pd

# Sample data
data = {
    'Company': ['HDFC Bank', 'Apollo', 'Hero', 'Yes Bank', 'M&M', 'Fortis'],
    'Industry': ['Finance', 'Healthcare', 'Automobile', 'Finance', 'Automobile', 'Healthcare'],
    'Recommendation': ['Hold', 'Buy', 'Underperform', 'Hold', 'Underperform', 'Buy']
}

dataframe = pd.DataFrame(data)

# Create mapping dictionary
recommendation_map = {'Buy': 1, 'Hold': 2, 'Underperform': 3}

# Apply mapping using list comprehension
dataframe['Recommendation'] = [recommendation_map[item] for item in dataframe['Recommendation']]

print(dataframe)
     Company    Industry  Recommendation
0  HDFC Bank     Finance               2
1     Apollo  Healthcare               1
2       Hero  Automobile               3
3   Yes Bank     Finance               2
4        M&M  Automobile               3
5     Fortis  Healthcare               1
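If the column might contain a label that the dictionary does not cover, the list comprehension above would stop with a KeyError. A minimal sketch of one way to tolerate that, assuming a fallback code is acceptable for your analysis; the 'Sell' label and the UNKNOWN sentinel are hypothetical and not part of the sample dataset:

```python
import pandas as pd

recommendation_map = {'Buy': 1, 'Hold': 2, 'Underperform': 3}

# 'Sell' is a hypothetical label that the mapping does not cover
recommendations = pd.Series(['Hold', 'Buy', 'Sell'])

# Hypothetical sentinel used for any label missing from the mapping
UNKNOWN = -1

# dict.get() returns the fallback instead of raising KeyError
encoded = [recommendation_map.get(item, UNKNOWN) for item in recommendations]

print(encoded)  # [2, 1, -1]
```

Whether a sentinel is appropriate depends on the downstream model; failing loudly on unexpected labels is often the safer default.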

Method 2: Using Pandas map() Function

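The Series.map() method performs the same dictionary lookup without an explicit Python loop and is generally the faster choice on large datasets. Values missing from the dictionary become NaN instead of raising an error: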

import pandas as pd

# Sample data
data = {
    'Company': ['HDFC Bank', 'Apollo', 'Hero', 'Yes Bank', 'M&M', 'Fortis'],
    'Industry': ['Finance', 'Healthcare', 'Automobile', 'Finance', 'Automobile', 'Healthcare'],
    'Recommendation': ['Hold', 'Buy', 'Underperform', 'Hold', 'Underperform', 'Buy']
}

dataframe = pd.DataFrame(data)

# Create mapping dictionary
recommendation_map = {'Buy': 1, 'Hold': 2, 'Underperform': 3}

# Apply mapping using map() function
dataframe['Recommendation'] = dataframe['Recommendation'].map(recommendation_map)

print(dataframe)
     Company    Industry  Recommendation
0  HDFC Bank     Finance               2
1     Apollo  Healthcare               1
2       Hero  Automobile               3
3   Yes Bank     Finance               2
4        M&M  Automobile               3
5     Fortis  Healthcare               1
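Unlike a plain dictionary lookup, map() does not raise an error for a label that is missing from the dictionary; it produces NaN for that row. The sketch below shows one way to detect such rows, assuming you want to inspect them before modelling. The 'Sell' label is hypothetical and not part of the sample dataset:

```python
import pandas as pd

recommendation_map = {'Buy': 1, 'Hold': 2, 'Underperform': 3}

# 'Sell' is a hypothetical label that the mapping does not cover
recommendations = pd.Series(['Hold', 'Buy', 'Sell'])

# Unmapped labels become NaN, which turns the result into floats
mapped = recommendations.map(recommendation_map)

# Collect the original labels that had no entry in the mapping
unmapped = sorted(set(recommendations[mapped.isna()]))
print(unmapped)  # ['Sell']

# Optional: the nullable integer dtype keeps whole numbers alongside missing values
mapped_int = mapped.astype('Int64')
```

Checking for NaN right after the mapping step catches typos and new categories before they silently propagate into later calculations.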

Method 3: Using Conditional Assignment

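Assign each value directly with a boolean mask and loc. Because integers are written into a column that originally held strings, an object-dtype column typically keeps the object dtype unless it is converted afterwards: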

import pandas as pd

# Sample data
data = {
    'Company': ['HDFC Bank', 'Apollo', 'Hero', 'Yes Bank', 'M&M', 'Fortis'],
    'Industry': ['Finance', 'Healthcare', 'Automobile', 'Finance', 'Automobile', 'Healthcare'],
    'Recommendation': ['Hold', 'Buy', 'Underperform', 'Hold', 'Underperform', 'Buy']
}

dataframe = pd.DataFrame(data)

# Apply conditional assignments
dataframe.loc[dataframe['Recommendation'] == 'Buy', 'Recommendation'] = 1
dataframe.loc[dataframe['Recommendation'] == 'Hold', 'Recommendation'] = 2
dataframe.loc[dataframe['Recommendation'] == 'Underperform', 'Recommendation'] = 3

print(dataframe)
     Company    Industry Recommendation
0  HDFC Bank     Finance              2
1     Apollo  Healthcare              1
2       Hero  Automobile              3
3   Yes Bank     Finance              2
4        M&M  Automobile              3
5     Fortis  Healthcare              1
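After conditional assignment the values print as numbers, but an object-dtype column does not become numeric on its own. A minimal sketch of checking and converting the dtype, assuming every label has been replaced; the explicit dtype=object is an assumption that mirrors how pandas has traditionally stored string columns:

```python
import pandas as pd

# dtype=object is set explicitly so the behaviour does not depend on the pandas version
recommendations = pd.Series(['Hold', 'Buy', 'Underperform'], dtype=object)

# Same conditional assignments as above, applied to a single Series
recommendations.loc[recommendations == 'Buy'] = 1
recommendations.loc[recommendations == 'Hold'] = 2
recommendations.loc[recommendations == 'Underperform'] = 3

print(recommendations.dtype)  # object

# Convert once every label has been replaced by an integer
as_int = recommendations.astype(int)
print(as_int.tolist())  # [2, 1, 3]
```

The astype(int) call fails if any string is left in the column, which doubles as a check that no category was overlooked.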

Comparison

Method                           Performance  Best For
Dictionary + List Comprehension  Good         Small datasets, clear mapping
map() function                   Excellent    Large datasets, efficient mapping
Conditional Assignment           Fair         Few categories, complex conditions
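Since the three methods differ in speed and style rather than in result, it can be reassuring to confirm that they agree on the sample data. The sketch below is an illustrative check, not a benchmark; the variable names are hypothetical:

```python
import pandas as pd

recommendation_map = {'Buy': 1, 'Hold': 2, 'Underperform': 3}
labels = pd.Series(
    ['Hold', 'Buy', 'Underperform', 'Hold', 'Underperform', 'Buy'],
    dtype=object,
)

# Method 1: dictionary lookup in a list comprehension
by_comprehension = [recommendation_map[item] for item in labels]

# Method 2: Series.map() with the same dictionary
by_map = labels.map(recommendation_map).tolist()

# Method 3: conditional assignment, one boolean mask per label
conditional = labels.copy()
for label, code in recommendation_map.items():
    conditional.loc[conditional == label] = code
by_condition = conditional.astype(int).tolist()

print(by_comprehension == by_map == by_condition)  # True
```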

Conclusion

Use the map() method for efficient string-to-number conversion in data analysis. A dictionary with a list comprehension gives clear, readable code, while conditional assignment works best for complex transformations.

Updated on: 2026-03-25T05:20:59+05:30
