How to Combine DataFrames in Pandas

Pandas provides several ways to combine DataFrames. The three most common are concat() with axis=1 and an inner join for side-by-side (column-wise) combination, concat() for vertical stacking, and merge() for database-style joins.

Using concat() with Inner Join

With axis=1 and join='inner', the concat() function places DataFrames side by side and keeps only the rows whose index labels appear in every input:

import pandas as pd

# Create sample DataFrames
df1_data = {'Player': ['Jacob', 'Steve', 'David', 'John', 'Kane'], 
            'Age': [29, 25, 31, 26, 27]}
df2_data = {'Rank': [1, 2, 3, 4, 5], 
            'Points': [100, 87, 80, 70, 50]}

df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)

print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)

# Combine DataFrames horizontally using inner join
result = pd.concat([df1, df2], axis=1, join='inner')
print("\nCombined DataFrames (Inner Join):")
print(result)

Output:

DataFrame1:
  Player  Age
0  Jacob   29
1  Steve   25
2  David   31
3   John   26
4   Kane   27

DataFrame2:
   Rank  Points
0     1     100
1     2      87
2     3      80
3     4      70
4     5      50

Combined DataFrames (Inner Join):
  Player  Age  Rank  Points
0  Jacob   29     1     100
1  Steve   25     2      87
2  David   31     3      80
3   John   26     4      70
4   Kane   27     5      50
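
In the example above both DataFrames share the index 0 to 4, so the inner join drops nothing. The following is a minimal sketch with hypothetical data (the names left and right and the custom index labels are assumptions, not part of the original example) showing how join='inner' and join='outer' differ when the indexes only partly overlap:

```python
import pandas as pd

# Hypothetical frames: only index labels 1 and 2 appear in both
left = pd.DataFrame({'Player': ['Jacob', 'Steve', 'David']}, index=[0, 1, 2])
right = pd.DataFrame({'Points': [87, 80, 70]}, index=[1, 2, 3])

# Inner join keeps only the index labels present in both frames
inner = pd.concat([left, right], axis=1, join='inner')

# Outer join keeps every index label and fills the gaps with NaN
outer = pd.concat([left, right], axis=1, join='outer')

print("Inner join:")
print(inner)
print("\nOuter join:")
print(outer)
```

Here inner contains only rows 1 and 2, while outer contains rows 0 to 3 with NaN wherever one side has no matching label.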

Using concat() for Vertical Combination

By default (axis=0), the concat() function stacks DataFrames vertically, aligning them by column name. Passing ignore_index=True renumbers the rows of the result from 0:

import pandas as pd

# Create DataFrames with same structure
df1_data = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
df2_data = {'Player': ['John', 'Kane'], 'Age': [31, 27]}

df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)

print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)

# Combine DataFrames vertically
result = pd.concat([df1, df2], ignore_index=True)
print("\nCombined DataFrames (Vertical):")
print(result)

Output:

DataFrame1:
  Player  Age
0  Steve   29
1  David   25

DataFrame2:
  Player  Age
0   John   31
1   Kane   27

Combined DataFrames (Vertical):
  Player  Age
0  Steve   29
1  David   25
2   John   31
3   Kane   27
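
The example above uses identical columns in both DataFrames. As a rough sketch of what happens otherwise, the snippet below uses hypothetical frames (df_a and df_b are illustrative names, and the Points column is an assumption) whose columns only partly match, and also shows the effect of leaving out ignore_index:

```python
import pandas as pd

# Hypothetical frames: 'Player' is shared, 'Age' and 'Points' are not
df_a = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df_b = pd.DataFrame({'Player': ['John', 'Kane'], 'Points': [70, 50]})

# Without ignore_index the original row labels are kept, so 0 and 1 repeat
kept_index = pd.concat([df_a, df_b])

# Default outer join on columns: missing values are filled with NaN
stacked = pd.concat([df_a, df_b], ignore_index=True)

# join='inner' keeps only the columns present in every frame
common_only = pd.concat([df_a, df_b], ignore_index=True, join='inner')

print(kept_index)
print(stacked)
print(common_only)
```

In stacked, the Age values for John and Kane and the Points values for Steve and David are NaN, whereas common_only keeps just the Player column.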

Using merge() for Database-Style Joins

The merge() function performs database-style joins on one or more common columns. With how='inner', only the keys present in both DataFrames are kept:

import pandas as pd

# Create DataFrames with a common column
df1_data = {'Player': ['Steve', 'David', 'John'], 'Age': [29, 25, 31]}
df2_data = {'Player': ['Steve', 'David', 'Kane'], 'Points': [100, 87, 50]}

df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)

print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)

# Merge DataFrames on 'Player' column
result = pd.merge(df1, df2, on='Player', how='inner')
print("\nMerged DataFrames:")
print(result)

Output:

DataFrame1:
  Player  Age
0  Steve   29
1  David   25
2   John   31

DataFrame2:
  Player  Points
0  Steve     100
1  David      87
2   Kane      50

Merged DataFrames:
  Player  Age  Points
0  Steve   29     100
1  David   25      87
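
The inner join above drops John and Kane because each appears in only one DataFrame. As an illustrative sketch reusing the same data, the how parameter can keep those rows instead, and indicator=True adds a _merge column recording where each row came from:

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David', 'John'], 'Age': [29, 25, 31]})
df2 = pd.DataFrame({'Player': ['Steve', 'David', 'Kane'], 'Points': [100, 87, 50]})

# Left join: keep every row of df1; Points is NaN where df2 has no match
left = pd.merge(df1, df2, on='Player', how='left')

# Outer join: keep rows from both sides; _merge records the source of each row
outer = pd.merge(df1, df2, on='Player', how='outer', indicator=True)

print("Left join:")
print(left)
print("\nOuter join with indicator:")
print(outer)
```

In the outer result, the _merge column reads both for Steve and David, left_only for John, and right_only for Kane.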

Comparison of Methods

Method                Use Case                  Key Parameter              Best For
concat() with axis=1  Side-by-side combination  join='inner'/'outer'       DataFrames with matching indexes
concat() with axis=0  Vertical stacking         ignore_index=True          Same column structure
merge()               Database-style joins      on='column', how='inner'   Common key columns

Conclusion

Use concat() to stack DataFrames along an axis: side by side with axis=1, where rows are aligned on the index, or vertically with axis=0, where columns are aligned by name. Use merge() when rows must be matched on the values of one or more key columns.

Updated on: 2026-03-26T21:33:47+05:30
