Python Pandas – Fetch the Common rows between two DataFrames with concat()

To fetch the common rows between two DataFrames, use the concat() function. This method involves concatenating both DataFrames, grouping by all columns, and identifying rows that appear in both DataFrames.

Understanding the Approach

The process involves several steps:

  • Concatenate both DataFrames using concat()
  • Reset the index to avoid conflicts
  • Group by all columns to identify duplicates
  • Filter groups with more than one occurrence

Step-by-Step Implementation

Creating Sample DataFrames

Let's create two DataFrames with car data:

import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

print("DataFrame1:")
print(dataFrame1)

Output:

DataFrame1:
      Car  Reg_Price
0     BMW       1000
1   Lexus       1500
2    Audi       1100
3   Tesla        800
4  Bentley       1100
5  Jaguar        900

import pandas as pd

# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

print("DataFrame2:")
print(dataFrame2)

Output:

DataFrame2:
      Car  Reg_Price
0     BMW       1200
1   Lexus       1500
2    Audi       1000
3   Tesla        800
4  Bentley       1100
5  Jaguar       1000

Finding Common Rows

Now we'll concatenate and identify common rows:

import pandas as pd

# Create DataFrames
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Concatenate both DataFrames
dfRes = pd.concat([dataFrame1, dataFrame2])

# Reset index to avoid conflicts
dfRes = dfRes.reset_index(drop=True)

# Group by all columns to find duplicates
dfGroup = dfRes.groupby(list(dfRes.columns))

# For each group that appears more than once (i.e. a common row),
# keep the index of its first occurrence; sort for display order
res = sorted(k[0] for k in dfGroup.groups.values() if len(k) > 1)

# Display common rows
print("Common rows between DataFrames:")
print(dfRes.reindex(res))

Output:

Common rows between DataFrames:
      Car  Reg_Price
1   Lexus       1500
3   Tesla        800
4  Bentley       1100

How It Works

The algorithm identifies rows with identical values across all columns:

| Step             | Purpose                    | Result                         |
| ---------------- | -------------------------- | ------------------------------ |
| concat()         | Combine both DataFrames    | Single DataFrame with all rows |
| reset_index()    | Avoid index conflicts      | Sequential integer index       |
| groupby()        | Group identical rows       | Groups of duplicate rows       |
| Filter by length | Find groups with >1 member | Common rows only               |
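The same idea can be written more compactly with duplicated(). This is a sketch, not part of the original walkthrough, and it assumes neither DataFrame contains duplicate rows of its own (an internal duplicate would otherwise be misreported as common):

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Stack both DataFrames with a fresh integer index
combined = pd.concat([dataFrame1, dataFrame2], ignore_index=True)

# keep=False marks every member of a duplicated group, i.e. rows
# present in both frames (given no duplicates within either frame);
# drop_duplicates() then keeps one copy of each common row
common = combined[combined.duplicated(keep=False)].drop_duplicates()
print(common)
```

This avoids the explicit groupby and index bookkeeping, at the cost of the no-internal-duplicates assumption.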

Alternative Approach Using merge()

You can also use merge() for a more direct approach:

import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Find common rows using merge (inner join)
common_rows = pd.merge(dataFrame1, dataFrame2, how='inner')
print("Common rows using merge():")
print(common_rows)

Output:

Common rows using merge():
      Car  Reg_Price
0   Lexus       1500
1   Tesla        800
2  Bentley       1100
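merge() can also label where each row came from via its indicator parameter. The sketch below is not part of the original example; it performs an outer join and keeps only the rows flagged as present in both DataFrames:

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# indicator=True adds a _merge column with values
# 'left_only', 'right_only', or 'both'
merged = pd.merge(dataFrame1, dataFrame2, how='outer', indicator=True)

# Keep rows found in both DataFrames, then drop the helper column
common = merged[merged['_merge'] == 'both'].drop(columns='_merge')
print(common)
```

The indicator column is also handy for the opposite task, such as finding rows unique to one DataFrame ('left_only' or 'right_only').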

Conclusion

Use concat() with groupby to find common rows when you need more control over the process. The merge() approach is simpler and more efficient for basic common row identification. Both methods identify rows that exist in both DataFrames with identical values.

Updated on: 2026-03-26T01:36:25+05:30
