How are dataframes in Pandas merged?


Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures. A DataFrame in Pandas is a two-dimensional data structure in which data is arranged in a table of rows and columns.

In this article, we will see how to merge DataFrames in Python using the merge() method. Its syntax is as follows:

dataframe.merge(right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)

Here,

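Parameter | Value | Description
--- | --- | ---
right | DataFrame or named Series | The object to merge with
how | 'left', 'right', 'outer', 'inner' (default), 'cross' | How to merge
on | String or list | The column or index level names to join on; they must be present in both DataFrames
left_on | String or list | The column or index level names to join on in the left DataFrame
right_on | String or list | The column or index level names to join on in the right DataFrame
left_index | True or False | Whether to use the index from the left DataFrame as the join key
right_index | True or False | Whether to use the index from the right DataFrame as the join key
sort | True or False | Whether to sort the result by the join keys
suffixes | List | Strings to append to overlapping column names from the left and right DataFrames
copy | True or False | Whether to copy the data
indicator | True, False, or string | Whether to add a column (named _merge by default) that records the source of each row
validate | String | Checks that the merge is of the given type, such as 'one_to_one' or 'one_to_many'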
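As an illustration of the left_on, right_on and suffixes parameters, here is a minimal sketch. The frames players and teams, and the column name Name, are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical frames whose key columns have different names
players = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
teams = pd.DataFrame({'Name': ['Steve', 'Kane'], 'Age': [31, 27]})

# Match Player (left) against Name (right); the overlapping Age
# columns receive the custom suffixes instead of the default _x / _y
res = players.merge(
    teams,
    how='inner',
    left_on='Player',
    right_on='Name',
    suffixes=('_left', '_right'),
)
print(res)
```

Only Steve appears in both key columns, so the result should contain a single row with the columns Player, Age_left, Name and Age_right.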

Merge Dataframes using the merge() method with keys from right dataframe

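To merge DataFrames, we will use the merge() method. Setting the how parameter to 'right' uses only the keys from the right frame, similar to a SQL right outer join. Because the on parameter is not specified, the merge joins on every column the two frames share, here both Player and Age.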

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# how='right' keeps only the keys from the right frame
res = df1.merge(df2, how='right')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
   Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
   Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
   Player  Age
0  Steve   31
1   Kane   27
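One way to read a right merge is as a left merge with the two frames swapped. The sketch below, which reuses the same sample data, compares the two calls. It compares row records rather than the frames themselves, since the comparison is only meant to show that the rows agree.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Right merge: keep the keys of df2
res_right = df1.merge(df2, how='right')

# Left merge with the frames swapped: also keeps the keys of df2
res_swapped = df2.merge(df1, how='left')

# Compare the rows of both results
print(res_right.to_dict('records') == res_swapped.to_dict('records'))
```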

Merge Dataframes using the merge() method with keys from left dataframe

To merge DataFrames, we will use the merge() method. Setting the how parameter to 'left' uses only the keys from the left frame, similar to a SQL left outer join.

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to left
res = df1.merge(df2, how='left')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
   Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
   Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
   Player  Age
0  Steve   29
1  David   25
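To see which rows of a left merge found a match in the right frame, the indicator parameter can help. The following is a minimal sketch that joins on Player only (an assumption made here so that one key matches and one does not).

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# indicator=True adds a _merge column that says where each row came from:
# 'both' for matched keys, 'left_only' for keys found only in df1
res = df1.merge(df2, how='left', on='Player', indicator=True)
print(res[['Player', '_merge']])
```

Steve exists in both frames and David only in the left one, so the _merge column should read both and left_only respectively.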

Merge Dataframes with union of keys from both dataframes

To merge DataFrames, we will use the merge() method. Setting the how parameter to 'outer' uses the union of keys from both frames, similar to a SQL full outer join.

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to outer, i.e. the union of keys
res = df1.merge(df2, how='outer')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
   Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
   Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
   Player  Age
0  Steve   29
1  David   25
2  Steve   31
3   Kane   27
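The row order of an outer merge may differ from the output above depending on the pandas version, because newer releases sort the result by the join keys. If a stable order matters, one option is to sort explicitly, as in this minimal sketch:

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Outer merge on all shared columns, then sort explicitly so the
# row order does not depend on the pandas version in use
res = (
    df1.merge(df2, how='outer')
       .sort_values(['Player', 'Age'])
       .reset_index(drop=True)
)
print(res)
```

The result still holds the same four rows; only their order is fixed by the sort.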

Merge Dataframes with intersection of keys from both dataframes

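To merge DataFrames, we will use the merge() method. Setting the how parameter to 'inner' uses the intersection of keys from both frames, similar to a SQL inner join. Since the merge joins on both Player and Age, and no (Player, Age) pair appears in both frames, the result in this example is empty.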

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to inner
res = df1.merge(df2, how='inner')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
   Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
   Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
Empty DataFrame
Columns: [Player, Age]
Index: []
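The result above is empty because merge() joins on every shared column when on is omitted, and Steve's age differs between the two frames. A minimal sketch of the same inner merge restricted to the Player column:

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Join on Player only; the non-key Age columns from each frame
# are kept side by side with the default suffixes _x and _y
res = df1.merge(df2, how='inner', on='Player')
print(res)
```

Here Steve is the only player present in both frames, so the result should have one row with Age_x taken from df1 and Age_y taken from df2.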

Merge Dataframes with cartesian product from both dataframes

To merge DataFrames, we will use the merge() method. Setting the how parameter to 'cross' creates the Cartesian product of both frames, pairing every row of the left frame with every row of the right frame:

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to cross, i.e. the Cartesian product
res = df1.merge(df2, how='cross')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
   Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
   Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
  Player_x  Age_x Player_y  Age_y
0    Steve     29    Steve     31
1    Steve     29     Kane     27
2    David     25    Steve     31
3    David     25     Kane     27
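A quick way to sanity-check a cross merge is to confirm that the number of result rows equals the product of the two input lengths. The sketch below assumes pandas 1.2 or later, where how='cross' is available.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Every row of df1 is paired with every row of df2; no join key is used
res = df1.merge(df2, how='cross')

# The row count of a Cartesian product is len(df1) * len(df2)
print(len(res) == len(df1) * len(df2))
print(list(res.columns))
```

Because no key is involved, all four original columns overlap in name and each receives a suffix.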

Updated on: 15-Sep-2022
