How to Merge Two Pandas DataFrames on Index?

Merging two Pandas DataFrames on index is essential when you have datasets sharing common row identifiers but containing different columns. This operation combines data horizontally, creating a unified dataset for analysis.

In this article, we will explore three methods to merge Pandas DataFrames based on their index: merge(), join(), and concat().

What is a Pandas DataFrame?

A DataFrame is Pandas' primary two-dimensional data structure, similar to a spreadsheet or SQL table. It consists of labeled rows and columns where each column can contain different data types (integers, floats, strings, etc.). The row labels form the index, while column labels are the column names.

Method 1: Using merge() Function

The merge() function combines DataFrames using left_index=True and right_index=True parameters to merge on index ?

import pandas as pd

# Create two DataFrames with same index
df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]}, index=['x', 'y', 'z'])
df2 = pd.DataFrame({'C': [70, 80, 90], 'D': [100, 110, 120]}, index=['x', 'y', 'z'])

print("DataFrame 1:")
print(df1)
print("\nDataFrame 2:")
print(df2)

# Merge on index
merged_df = pd.merge(df1, df2, left_index=True, right_index=True)
print("\nMerged DataFrame:")
print(merged_df)
DataFrame 1:
    A   B
x  10  40
y  20  50
z  30  60

DataFrame 2:
    C    D
x  70  100
y  80  110
z  90  120

Merged DataFrame:
    A   B   C    D
x  10  40  70  100
y  20  50  80  110
z  30  60  90  120

Method 2: Using join() Function

The join() function is specifically designed for index-based merging and provides a more concise syntax ?

import pandas as pd

# Create DataFrames with different indexes
df1 = pd.DataFrame({'Sales': [100, 200, 150]}, index=['Jan', 'Feb', 'Mar'])
df2 = pd.DataFrame({'Profit': [20, 40, 30]}, index=['Jan', 'Feb', 'Apr'])

print("Sales DataFrame:")
print(df1)
print("\nProfit DataFrame:")
print(df2)

# Join using outer join to include all indexes
joined_df = df1.join(df2, how='outer')
print("\nJoined DataFrame (outer join):")
print(joined_df)
Sales DataFrame:
     Sales
Jan    100
Feb    200
Mar    150

Profit DataFrame:
     Profit
Jan      20
Feb      40
Apr      30

Joined DataFrame (outer join):
     Sales  Profit
Jan  100.0    20.0
Feb  200.0    40.0
Mar  150.0     NaN
Apr    NaN    30.0

Method 3: Using concat() Function

The concat() function with axis=1 concatenates DataFrames horizontally along the index ?

import pandas as pd

# Create sample DataFrames
df1 = pd.DataFrame({'Temperature': [25, 30, 28]}, index=['Mon', 'Tue', 'Wed'])
df2 = pd.DataFrame({'Humidity': [60, 65, 70]}, index=['Mon', 'Tue', 'Wed'])

print("Temperature DataFrame:")
print(df1)
print("\nHumidity DataFrame:")
print(df2)

# Concatenate horizontally
concat_df = pd.concat([df1, df2], axis=1)
print("\nConcatenated DataFrame:")
print(concat_df)
Temperature DataFrame:
     Temperature
Mon           25
Tue           30
Wed           28

Humidity DataFrame:
     Humidity
Mon        60
Tue        65
Wed        70

Concatenated DataFrame:
     Temperature  Humidity
Mon           25        60
Tue           30        65
Wed           28        70

Comparison of Methods

Method Best For Handles Different Indexes Join Types
merge() Complex merging with multiple options Yes inner, outer, left, right
join() Simple index-based joining Yes left, right, outer, inner
concat() Combining multiple DataFrames Yes outer, inner

Conclusion

Use join() for simple index-based merging, merge() for complex operations with multiple parameters, and concat() when combining multiple DataFrames. All methods effectively combine DataFrames horizontally based on their index values.

Updated on: 2026-03-27T10:27:56+05:30

3K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements