Python – Display only non-duplicate values from a DataFrame


We will see how to display only non-duplicated values. At first, we will create a DataFrame with duplicate values −

dataFrame = pd.DataFrame(
   {
      "Student": ['Jack', 'Robin', 'Ted', 'Robin', 'Scarlett', 'Kat', 'Ted'],"Result": ['Pass', 'Fail', 'Pass', 'Fail', 'Pass', 'Pass', 'Pass']
   }
)

Above, we have created 2 columns. To display only non-duplicated values, use the duplicated() method and logical NOT. Through this, non-duplicated values will be fetched −

dataFrame[~dataFrame.duplicated('Student')]

Example

Following is the complete code −

import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame(
   {
      "Student": ['Jack', 'Robin', 'Ted', 'Robin', 'Scarlett', 'Kat', 'Ted'],"Result": ['Pass', 'Fail', 'Pass', 'Fail', 'Pass', 'Pass', 'Pass']
   }
)

print"DataFrame ...\n",dataFrame

# displaying non-duplicates
res = dataFrame[~dataFrame.duplicated('Student')]
print"\nDataFrame after removing duplicates ...\n",res

Output

This will produce the following output −

DataFrame ...
   Result   Student
0    Pass      Jack
1    Fail     Robin
2    Pass       Ted
3    Fail     Robin
4    Pass  Scarlett
5    Pass       Kat
6    Pass       Ted

DataFrame after removing duplicates ...
   Result   Student
0    Pass      Jack
1    Fail     Robin
2    Pass       Ted
4    Pass  Scarlett
5    Pass       Kat

Updated on: 20-Sep-2021

3K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements