Different Types of Joins in Pandas


Pandas is one of the popular libraries used to perform data analysis and data manipulation. There are many advanced features to work with the tabular data such as join multiple data frames into one depending upon the common columns or indices of columns.

In python, there are different types of joins available which can be performed by using the merge() function along with the how parameter of the pandas library. Following are the different joins.

  • Inner Join

  • Outer Join

  • Left Join

  • Right Join

  • Cross Join

Inner Join

An Inner Join in the pandas library will return the rows which have the same values in both defined data frames and in the output. We can observe the columns of the dataframe that have the same columns. The following is the syntax for applying the inner join on the data frames.

import pandas as pd
pd.merge(dataframe1,dataframe2,how=’inner’

Where,

  • pandas is the name of the library.

  • pd is the alias name.

  • merge is the function to perform the join operation.

  • how specifies the type of join operation.

Example

In this example, we will pass two data frames to the merge() function along with the ‘how’ parameter by assigning inner join. Following is the code.

import pandas as pd
dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"],
       "Age": [20,40,78,90,38],
       "Language" : ["Python","Java","C","Python","Salesforce"]}

dic2 = {"Name": ["John","Mark","Nancy","Sita","Ram"],
       "Age": [20,40,78,90,38],
       "class" : [10,9,12,9,5]}

data1 = pd.DataFrame(dic1)
print(data1)
data2 = pd.DataFrame(dic2)
print(data2)
inner_join = pd.merge(data1,data2,how = "inner")
inner_join.head()

Output

    Name  Age    Language
0   John   20      Python
1   Mark   40        Java
2  Nancy   78           C
3   Sita   90      Python
4    Ram   38  Salesforce

    Name  Age  class
0   John   20     10
1   Mark   40      9
2  Nancy   78     12
3   Sita   90      9
4    Ram   38      5

Outer Join

The outer join will return all the rows of both the data frames and if any missing data in the rows then it fills the null values. Following syntax is used for applying the outer join on data frames.

import pandas as pd
pd.merge(dataframe1,dataframe2,how=’outer’

Example

import pandas as pd
dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"],
       "Age": [20,40,78,90,38],
       "Language" : ["Python","Java","C","Python","Salesforce"]}

dic2 = {"Name": ["John","Nancy","Sita","Ram"],
       "Age": [20,40,70,38],
       "class" : [10,12,9,5]}

data1 = pd.DataFrame(dic1)
print(data1)
data2 = pd.DataFrame(dic2)
print(data2)
inner_join = pd.merge(data1,data2,how = "outer")
inner_join.head()

Output

    Name  Age    Language
0   John   20      Python
1   Mark   40        Java
2  Nancy   78           C
3   Sita   90      Python
4   Ram   38  Salesforce
    Name  Age  class
0   John   20     10
1  Nancy   40     12
2   Sita   70      9
3    Ram   38      5

Left Join

The left join returns the entire rows of the left dataframe and all the matching rows of the right dataframe. If in case, there are no matching rows in the right data frame then the output data frame will be filled with the null values in the corresponding columns. The following is the syntax for applying the left join.

import pandas as pd
pd.merge(dataframe1,dataframe2,how=’left’

Example

Here in this example, we will apply the left join on the two input data frames then the output of the left join will be returned.

import pandas as pd
dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"],
       "Age": [2,40,78,90,38],
       "Language" : ["Python","Java","C","Python","Salesforce"]}

dic2 = {"Name": ["John","Nancy","Sita","Ram"],
       "Age": [20,40,70,38],
       "class" : [10,12,9,5]}

data1 = pd.DataFrame(dic1)
print(data1)
data2 = pd.DataFrame(dic2)
print(data2)
inner_join = pd.merge(data1,data2,how = "left")
inner_join.head()

Output

    Name  Age    Language
0   John    2      Python
1   Mark   40        Java
2  Nancy   78           C
3   Sita   90      Python
4    Ram   38  Salesforce
    Name  Age  class
0   John   20     10
1  Nancy   40     12
2   Sita   70      9
3    Ram   38      5

Right Join

The right join returns the entire rows of the right dataframe and all the matching rows of the left dataframe. If in case there are no matching rows in the left data frame then the output data frame will be filled with the null values in the corresponding columns. The following is the syntax for applying the right join.

import pandas as pd
pd.merge(dataframe1,dataframe2,how=’right’

Example

In this example, we will apply the right join on the two input data frames then the output of the right join will be returned.

import pandas as pd
dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"],
       "Age": [2,40,78,90,38],
       "Language" : ["Python","Java","C","Python","Salesforce"]}

dic2 = {"Name": ["John","Nancy","Sita","Ram"],
       "Age": [20,40,70,38],
       "class" : [10,12,9,5]}

data1 = pd.DataFrame(dic1)
print(data1)
data2 = pd.DataFrame(dic2)
print(data2)
inner_join = pd.merge(data1,data2,how = "right")
inner_join.head()

Output

   Name  Age    Language
0   John    2      Python
1   Mark   40        Java
2  Nancy   78           C
3   Sita   90      Python
4    Ram   38  Salesforce
    Name  Age  class
0   John   20     10
1  Nancy   40     12
2   Sita   70      9
3    Ram   38      5

Cross Join

The cross join returns the Cartesian product of the two data frames; meaning it returns all the possible combinations of rows from both data frames.

Updated on: 20-Oct-2023

71 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements