Merge Python Pandas dataframe with a common column and set NaN for unmatched values

To merge two Pandas DataFrames with a common column, use the merge() function and set the on parameter as the column name. To set NaN for unmatched values, use the how parameter and set it to left or right for left or right joins respectively.

Understanding Merge Types

The how parameter determines which rows to include ?

  • left ? Keep all rows from the left DataFrame, set NaN for unmatched right values
  • right ? Keep all rows from the right DataFrame, set NaN for unmatched left values
  • inner ? Keep only matching rows from both DataFrames
  • outer ? Keep all rows from both DataFrames, set NaN where no match

Example: Left Merge with NaN for Unmatched Values

Let's create two DataFrames and perform a left merge ?

import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame1:")
print(dataFrame1)

# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Tesla', 'Mustang', 'Mercedes', 'Jaguar'],
    "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})

print("\nDataFrame2:")
print(dataFrame2)

# Merge DataFrames with left join - keeps all rows from dataFrame1
mergedRes = pd.merge(dataFrame1, dataFrame2, on='Car', how="left")
print("\nMerged DataFrame (left join):")
print(mergedRes)
DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90

DataFrame2:
       Car  Reg_Price
0      BMW       7000
1    Lexus       1500
2    Tesla       5000
3  Mustang       8000
4 Mercedes       9000
5   Jaguar       6000

Merged DataFrame (left join):
       Car  Units  Reg_Price
0      BMW    100     7000.0
1    Lexus    150     1500.0
2     Audi    110        NaN
3  Mustang     80     8000.0
4  Bentley    110        NaN
5   Jaguar     90     6000.0

Example: Right Merge

A right merge keeps all rows from the right DataFrame ?

import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Tesla', 'Mustang', 'Mercedes', 'Jaguar'],
    "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})

# Right merge - keeps all rows from dataFrame2
mergedRes = pd.merge(dataFrame1, dataFrame2, on='Car', how="right")
print("Merged DataFrame (right join):")
print(mergedRes)
Merged DataFrame (right join):
       Car  Units  Reg_Price
0      BMW  100.0       7000
1    Lexus  150.0       1500
2  Mustang   80.0       8000
3   Jaguar   90.0       6000
4    Tesla    NaN       5000
5 Mercedes    NaN       9000

Comparison of Merge Types

Merge Type Keeps Rows From NaN Location
left Left DataFrame Right columns for unmatched left rows
right Right DataFrame Left columns for unmatched right rows
inner Both (matching only) None (no NaN values)
outer Both DataFrames Any column for unmatched rows

Conclusion

Use how="left" to keep all rows from the first DataFrame and set NaN for unmatched values from the second. Use how="right" for the opposite behavior. The merge operation provides flexible control over how to handle missing data.

Updated on: 2026-03-26T01:38:24+05:30

6K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements