Checking if a Value Exists in a DataFrame using 'in' and 'not in' Operators in Python Pandas


Pandas is a powerful Python library widely used for data manipulation and analysis. When working with DataFrames, it is often necessary to check whether a specific value exists within the dataset. In this tutorial, we will explore how to use the 'in' and 'not in' operators in Pandas to determine the presence or absence of a value in a DataFrame.

Checking for a Value Using the "in" Operator

The 'in' operator in Python checks whether a value is present in an iterable object. Applied directly to a DataFrame or a Series, it tests the column labels or the index labels rather than the stored data. To check the data itself, we apply 'in' to the NumPy array exposed by the ".values" attribute. Let's consider two examples that demonstrate this usage of the 'in' operator.
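The difference between labels and stored values is easy to overlook, so here is a minimal sketch of it. It reuses the sample data from the examples in this tutorial; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'],
                   'Age': [25, 30, 28, 35]})

# On a DataFrame, `in` tests the column labels.
label_in_df = 'Name' in df             # True: 'Name' is a column label
value_in_df = 'Alice' in df            # False: 'Alice' is a cell value, not a label

# On a Series, `in` tests the index labels (0, 1, 2, 3 here).
index_in_series = 1 in df['Name']      # True: 1 is an index label
value_in_series = 'Alice' in df['Name']  # False: 'Alice' is not an index label

# On the underlying NumPy array, `in` tests the stored values.
value_in_values = 'Alice' in df['Name'].values  # True

print(label_in_df, value_in_df, index_in_series, value_in_series, value_in_values)
```

This is why every check in this tutorial goes through ".values" rather than testing the DataFrame or Series directly.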

Example 1: Checking for a Value in a DataFrame Column

In this example, we create a DataFrame with two columns: 'Name' and 'Age'. We want to check if the value 'Alice' exists in the 'Name' column. The ".values" attribute returns the column's contents as a NumPy array, and the 'in' operator tests the value against the elements of that array.

Consider the code shown below.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'], 'Age': [25, 30, 28, 35]})

# Check if a value exists in the 'Name' column
value = 'Alice'
if value in df['Name'].values:
    print(f"{value} exists in the DataFrame.")
else:
    print(f"{value} does not exist in the DataFrame.")

Output

If the value is found, the first message is printed; otherwise, the second message is printed.

When you execute this code, it will produce the following output:

Alice exists in the DataFrame.
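As an alternative to the approach above, the same column check can be sketched with pandas' own methods instead of the 'in' operator. The snippet below assumes the same sample data; `Series.isin` and the `==` comparison each produce a boolean Series, which `.any()` reduces to a single truth value.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'],
                   'Age': [25, 30, 28, 35]})

value = 'Alice'

# isin() marks each row whose 'Name' matches one of the listed values;
# any() is True if at least one row matched.
found_isin = df['Name'].isin([value]).any()

# An element-wise equality comparison followed by any() gives the same answer.
found_eq = (df['Name'] == value).any()

if found_isin and found_eq:
    print(f"{value} exists in the DataFrame.")
else:
    print(f"{value} does not exist in the DataFrame.")
```

One reason to consider this form is that `isin` accepts several candidate values at once, which the single-value 'in' check does not.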

Example 2: Checking for a Value across a DataFrame

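In this example, we want to check if the integer value 28 exists anywhere within the DataFrame. The ".values" attribute returns all the cells of the DataFrame as a single NumPy array, and the "in" operator tests the value against every element of that array. The type matters here: the integer 28 matches the entry in the 'Age' column, whereas the string '28' would not.

Consider the code shown below.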

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'], 'Age': [25, 30, 28, 35]})

# Check if a value exists in the DataFrame
value = 28
if value in df.values:
    print(f"{value} exists in the DataFrame.")
else:
    print(f"{value} does not exist in the DataFrame.")

Output

If the value is present, the first message is printed; otherwise, the second message is printed.

When you execute this code, it will produce the following output:

28 exists in the DataFrame.
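The whole-DataFrame check can also be sketched with `DataFrame.isin`, shown here as an alternative rather than as the method used above. The snippet assumes the same sample data and also illustrates that the check is type-sensitive: the integer 28 is found, while the string '28' is not.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'],
                   'Age': [25, 30, 28, 35]})

# isin() returns a boolean DataFrame; the first any() reduces each column,
# the second any() reduces the per-column results to one value.
found_int = df.isin([28]).any().any()      # 28 is present in 'Age'
found_missing = df.isin([99]).any().any()  # 99 is not present anywhere

# The string '28' is a different value from the integer 28,
# so the 'in' check on the underlying array does not match it.
found_str = '28' in df.values

print(found_int, found_missing, found_str)
```

If the data may contain numbers stored as strings, it is worth converting the column or the search value to a common type before checking.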

Checking for a Value Using the "not in" Operator

In this example, we create a DataFrame with two columns: "Name" and "Age". We want to confirm that the value "Michael" is absent from the "Name" column.

The "not in" operator is the negation of "in", so we test the value against the contents of the "Name" column, again accessed through the ".values" attribute.

Consider the code shown below.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'], 'Age': [25, 30, 28, 35]})

# Check if a value does not exist in the 'Name' column
value = 'Michael'
if value not in df['Name'].values:
    print(f"{value} does not exist in the DataFrame.")
else:
    print(f"{value} exists in the DataFrame.")

Output

If the value is not found, the first message is printed; otherwise, the second message is printed.

When you execute this code, it will produce the following output:

Michael does not exist in the DataFrame.
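The same "not in" test extends naturally to several values. The following minimal sketch reports which names from a list are missing from the "Name" column. The `candidates` list is hypothetical and exists only for illustration, and converting the column to a set is an optional choice made here so that each lookup is a simple membership test.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emily'],
                   'Age': [25, 30, 28, 35]})

# Hypothetical list of names to look up.
candidates = ['Alice', 'Michael', 'Bob', 'Sara']

# Collect the distinct names stored in the column.
known_names = set(df['Name'])

# Keep only the candidates that are absent from the column.
missing = [name for name in candidates if name not in known_names]

print(missing)  # ['Michael', 'Sara']
```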

Conclusion

In this tutorial, we explored how to use Python's "in" and "not in" operators to check for the existence or non-existence of a value within a Pandas DataFrame. Because these operators test labels when applied directly to a DataFrame or Series, each check was made against the array returned by the ".values" attribute.

Through the code examples provided, we demonstrated how to use the 'in' operator to check if a value exists in a single DataFrame column or anywhere in the DataFrame, and how to use the 'not in' operator to confirm that a value is absent.

With these checks, analysts and data scientists can validate whether expected values are present in their data before relying on them in further exploration and analysis.

Updated on: 01-Sep-2023
