Drop Empty Columns in Pandas

Pandas DataFrames often contain empty columns, filled entirely with NaN values, that clutter your data analysis. The pandas library provides several efficient ways to identify and remove these columns, leaving a cleaner, more relevant dataset.

What Are Empty Columns?

In pandas, a column is considered empty when every value in it is missing, that is, NaN (Not a Number), None, or NaT. Columns that hold empty strings, zeros, or spaces are not considered empty, since these values may carry meaningful information about your dataset.
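As a minimal sketch of this distinction (the column names below are made up for illustration), isna() flags only genuinely missing values, so a column of blank strings is not reported as empty unless you first convert those strings to NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one all-NaN column, one column of blank strings, one of zeros
df = pd.DataFrame({
    "all_nan": [np.nan, np.nan, np.nan],
    "blank_text": ["", " ", ""],
    "zeros": [0, 0, 0],
})

# Only the all-NaN column counts as empty
empty_before = df.isna().all()
print(empty_before)

# Optional step: treat blank or whitespace-only strings as missing
converted = df.replace(r"^\s*$", np.nan, regex=True)
empty_after = converted.isna().all()
print(empty_after)
```

Whether blank strings should be converted at all depends on your data; the regular expression here is just one possible definition of "blank".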

Creating a DataFrame with Empty Columns

Let's start by creating a sample DataFrame that includes an empty column filled with NaN values:

import pandas as pd
import numpy as np

dataset = {
    "Hostel ID": ["DSC224", "DSC124", "DSC568", "DSC345"], 
    "Hostel Rating": [8, 6, 10, 5], 
    "Hostel price": [35000, 32000, 50000, 24000], 
    "Hostel location": [np.nan, np.nan, np.nan, np.nan]
}

df = pd.DataFrame(dataset, index=["Hostel 1", "Hostel 2", "Hostel 3", "Hostel 4"])
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
         Hostel ID  Hostel Rating  Hostel price  Hostel location
Hostel 1    DSC224              8         35000              NaN
Hostel 2    DSC124              6         32000              NaN
Hostel 3    DSC568             10         50000              NaN
Hostel 4    DSC345              5         24000              NaN
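Before removing anything, it can help to list which columns are actually empty. The following is a small sketch that rebuilds the same sample data and combines isna() with all(); the variable names is_empty and empty_columns are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Hostel ID": ["DSC224", "DSC124", "DSC568", "DSC345"],
        "Hostel Rating": [8, 6, 10, 5],
        "Hostel price": [35000, 32000, 50000, 24000],
        "Hostel location": [np.nan, np.nan, np.nan, np.nan],
    },
    index=["Hostel 1", "Hostel 2", "Hostel 3", "Hostel 4"],
)

# True for each column in which every value is missing
is_empty = df.isna().all(axis=0)

# Names of the columns that would be dropped
empty_columns = df.columns[is_empty].tolist()
print(empty_columns)  # ['Hostel location']
```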

Method 1: Using dropna() with how='all'

The dropna() method with how='all' and axis=1 removes every column in which all values are NaN:

import pandas as pd
import numpy as np

dataset = {
    "Hostel ID": ["DSC224", "DSC124", "DSC568", "DSC345"], 
    "Hostel Rating": [8, 6, 10, 5], 
    "Hostel price": [35000, 32000, 50000, 24000], 
    "Hostel location": [np.nan, np.nan, np.nan, np.nan]
}

df = pd.DataFrame(dataset, index=["Hostel 1", "Hostel 2", "Hostel 3", "Hostel 4"])

# Drop columns where all values are NaN
cleaned_df = df.dropna(how='all', axis=1)
print("After dropping empty columns:")
print(cleaned_df)

Output:

After dropping empty columns:
         Hostel ID  Hostel Rating  Hostel price
Hostel 1    DSC224              8         35000
Hostel 2    DSC124              6         32000
Hostel 3    DSC568             10         50000
Hostel 4    DSC345              5         24000

Note: To modify the original DataFrame instead of creating a new one, pass inplace=True. In that case dropna() returns None, so do not assign its result to a variable.

df.dropna(how='all', axis=1, inplace=True)
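The how argument matters here. As a rough illustration on a made-up three-column frame, how='all' removes only columns that are entirely NaN, while how='any' also removes columns that contain a single missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "partial" has one missing value, "empty" has no values at all
df = pd.DataFrame({
    "full": [1, 2, 3],
    "partial": [1.0, np.nan, 3.0],
    "empty": [np.nan, np.nan, np.nan],
})

# how='all' drops only the columns that are entirely NaN
only_all_nan_dropped = df.dropna(how="all", axis=1)
print(list(only_all_nan_dropped.columns))  # ['full', 'partial']

# how='any' also drops columns with even one NaN
any_nan_dropped = df.dropna(how="any", axis=1)
print(list(any_nan_dropped.columns))  # ['full']
```

Using how='any' by mistake can silently discard columns that are mostly populated, so it is worth checking the resulting column list.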

Method 2: Using notnull() with Column Selection

This approach uses notnull() to identify columns that contain at least one non-null value:

import pandas as pd
import numpy as np

dataset = {
    "Hostel ID": ["DSC224", "DSC124", "DSC568", "DSC345"], 
    "Hostel Rating": [8, 6, 10, 5], 
    "Hostel price": [35000, 32000, 50000, 24000], 
    "Hostel location": [np.nan, np.nan, np.nan, np.nan]
}

df = pd.DataFrame(dataset, index=["Hostel 1", "Hostel 2", "Hostel 3", "Hostel 4"])

# Select columns that have at least one non-null value
cleaned_df = df.loc[:, df.notnull().any(axis=0)]
print("Using notnull() method:")
print(cleaned_df)

Output:

Using notnull() method:
         Hostel ID  Hostel Rating  Hostel price
Hostel 1    DSC224              8         35000
Hostel 2    DSC124              6         32000
Hostel 3    DSC568             10         50000
Hostel 4    DSC345              5         24000
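Because notnull() returns a Boolean mask, the selection rule can be adapted. As one possible sketch, the code below keeps only columns in which at least half of the values are present; the data, the min_fraction cutoff, and the variable names are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with different amounts of missing data per column
df = pd.DataFrame({
    "mostly_filled": [1.0, 2.0, 3.0, np.nan],
    "mostly_missing": [np.nan, np.nan, np.nan, 4.0],
    "empty": [np.nan, np.nan, np.nan, np.nan],
})

min_fraction = 0.5  # assumed cutoff; adjust to your own requirements

# Fraction of non-null values in each column (True counts as 1, False as 0)
filled_fraction = df.notnull().mean(axis=0)

# Keep columns whose non-null fraction meets the cutoff
kept = df.loc[:, filled_fraction >= min_fraction]
print(list(kept.columns))  # ['mostly_filled']
```

Note that this rule is stricter than removing empty columns alone: it also drops sparsely populated ones.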

Comparison of Methods

Method      Syntax                                 Best For
dropna()    df.dropna(how='all', axis=1)           Simple and readable approach
notnull()   df.loc[:, df.notnull().any(axis=0)]    More control over selection logic

Conclusion

Use dropna(how='all', axis=1) for the simplest approach to remove empty columns. The notnull() method provides more flexibility when you need custom filtering logic. Both methods help maintain clean, relevant DataFrames for better data analysis.

Updated on: 2026-03-27T06:13:21+05:30
