Python Pandas – Fetch the Common rows between two DataFrames with concat()
To fetch the common rows between two DataFrames, you can use the concat() function: concatenate both DataFrames, group by all columns, and keep the rows that appear more than once.
Understanding the Approach
The process involves several steps:
- Concatenate both DataFrames using concat()
- Reset the index to avoid conflicts
- Group by all columns to identify duplicates
- Filter groups with more than one occurrence
Step-by-Step Implementation
Creating Sample DataFrames
Let's create two DataFrames with car data:
import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})

print("DataFrame1:")
print(dataFrame1)
DataFrame1:
Car Reg_Price
0 BMW 1000
1 Lexus 1500
2 Audi 1100
3 Tesla 800
4 Bentley 1100
5 Jaguar 900
import pandas as pd

# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

print("DataFrame2:")
print(dataFrame2)
DataFrame2:
Car Reg_Price
0 BMW 1200
1 Lexus 1500
2 Audi 1000
3 Tesla 800
4 Bentley 1100
5 Jaguar 1000
Finding Common Rows
Now we'll concatenate and identify common rows:
import pandas as pd

# Create DataFrames
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Concatenate both DataFrames
dfRes = pd.concat([dataFrame1, dataFrame2])

# Reset the index to avoid duplicate index labels
dfRes = dfRes.reset_index(drop=True)

# Group by all columns so identical rows fall into the same group
dfGroup = dfRes.groupby(list(dfRes.columns))

# Keep the first index of every group with more than one member (common rows)
res = [k[0] for k in dfGroup.groups.values() if len(k) > 1]

# Display common rows
print("Common rows between DataFrames:")
print(dfRes.reindex(res))
Common rows between DataFrames:
Car Reg_Price
1 Lexus 1500
3 Tesla 800
4 Bentley 1100
How It Works
The algorithm identifies rows with identical values across all columns:
| Step | Purpose | Result |
|---|---|---|
| concat() | Combine both DataFrames | Single DataFrame with all rows |
| reset_index() | Avoid index conflicts | Sequential integer index |
| groupby() | Group identical rows | Groups of duplicate rows |
| Filter by length | Find groups with >1 member | Common rows only |
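The grouping logic above can also be expressed more compactly with duplicated(): after concatenation, a row is flagged as a duplicate exactly when it occurs in both frames. Note this shortcut assumes neither DataFrame contains internal duplicate rows (true for the sample data); otherwise the groupby approach is safer.

```python
import pandas as pd

# Same sample data as above
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Stack both frames, then keep only the second copy of each repeated row;
# a row repeats exactly when it appears in both DataFrames
both = pd.concat([dataFrame1, dataFrame2], ignore_index=True)
common = both[both.duplicated(keep='first')].reset_index(drop=True)

print(common)
```

This avoids the explicit groupby and index bookkeeping, at the cost of the no-internal-duplicates assumption.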
Alternative Approach Using merge()
You can also use merge() for a more direct approach:
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# Find common rows using merge (inner join on all shared columns)
common_rows = pd.merge(dataFrame1, dataFrame2, how='inner')

print("Common rows using merge():")
print(common_rows)
Common rows using merge():
Car Reg_Price
0 Lexus 1500
1 Tesla 800
2 Bentley 1100
Conclusion
Use concat() with groupby() when you need more control over the process, such as inspecting the duplicate groups themselves. The merge() approach is simpler and generally more efficient for straightforward common-row identification. Both methods return rows whose values are identical across all columns in both DataFrames.
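When you also want to know which rows are unique to each DataFrame, not just the shared ones, merge's indicator parameter is a useful sketch of a middle ground: an outer join labels every row with its origin, and filtering on that label recovers the common rows.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
})

# An outer merge with indicator=True adds a _merge column labelling
# each row 'both', 'left_only', or 'right_only'
merged = pd.merge(dataFrame1, dataFrame2, how='outer', indicator=True)

# Rows labelled 'both' are the common rows
common = merged[merged['_merge'] == 'both'].drop(columns='_merge')

print(common)
```

Filtering on 'left_only' or 'right_only' instead yields the rows exclusive to one DataFrame, which the plain inner join cannot report.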
