Python Pandas - Filling missing column values with mode


The mode is the value that appears most often in a set of values. To fill the missing values in a column with its mode, compute the mode with mode() and pass it to the fillna() method. First, import the required libraries with their usual aliases −

import pandas as pd
import numpy as np
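Before building the DataFrame, it may help to see what mode() returns on its own. The sketch below uses a small hypothetical Series (the values are made up for illustration and are not part of this article's data) to show that mode() skips NaN by default and returns every tied value in sorted order, which is why the first element is picked with [0].

```python
import pandas as pd
import numpy as np

# Hypothetical sample data: 1 and 3 both appear twice, 2 appears once
values = pd.Series([3, 1, 3, 2, np.nan, 1])

# mode() ignores NaN by default (dropna=True) and returns a Series,
# because more than one value can be the most frequent
modes = values.mode()
print(modes.tolist())

# Tied modes come back sorted, so [0] picks the smallest of them
first_mode = modes[0]
print(first_mode)
```

When the tie matters for your data, inspect the whole result of mode() instead of taking the first element blindly.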

Create a DataFrame with two columns. The missing values are set with NumPy's np.nan (the older np.NaN alias was removed in NumPy 2.0) −

dataFrame = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
      "Units": [100, 150, np.nan, 80, np.nan, np.nan]
   }
)

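Find the mode of the column that contains NaN values, here the Units column. mode() ignores NaN and returns a Series, because several values can tie for most frequent, and [0] selects the first of them. In this data 80, 100 and 150 each appear once, so the first mode is the smallest value, 80. Pass that value to fillna() to replace the NaNs. Called on the whole DataFrame, fillna() fills NaN in every column with this value, which is safe here because only Units has missing values −

dataFrame.fillna(dataFrame['Units'].mode()[0], inplace = True)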
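If other columns could also contain NaN, the fill can be limited to a single column by passing fillna() a dict that maps column names to fill values. The following is a minimal sketch under that assumption. The frame df and its values are hypothetical and chosen only to show that the Car column is left untouched.

```python
import pandas as pd
import numpy as np

# Hypothetical frame with gaps in both columns (not the article's data)
df = pd.DataFrame(
    {
        "Car": ["BMW", None, "Lexus", "Lexus"],
        "Units": [100, 150, np.nan, 100],
    }
)

# 100 appears twice in Units, so it is the mode
units_mode = df["Units"].mode()[0]

# A dict restricts the fill to the named column;
# the missing value in Car is not replaced
filled = df.fillna({"Units": units_mode})
print(filled)
```

This variant returns a new DataFrame instead of modifying df in place, which tends to be easier to reason about in longer scripts.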

Example

Following is the complete code −

import pandas as pd
import numpy as np

# Create DataFrame; np.nan marks the missing values
dataFrame = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
      "Units": [100, 150, np.nan, 80, np.nan, np.nan]
   }
)

print("DataFrame ...\n", dataFrame)

# Find the mode of the Units column (NaN is ignored; [0] takes the first mode)
# and replace the NaNs with it
dataFrame.fillna(dataFrame['Units'].mode()[0], inplace = True)

print("\nUpdated Dataframe after filling NaN values with mode...\n", dataFrame)

Output

This will produce the following output −

DataFrame ...
       Car   Units
0      BMW   100.0
1    Lexus   150.0
2    Lexus     NaN
3  Mustang    80.0
4  Bentley     NaN
5  Mustang     NaN

Updated Dataframe after filling NaN values with mode...
       Car   Units
0      BMW   100.0
1    Lexus   150.0
2    Lexus    80.0
3  Mustang    80.0
4  Bentley    80.0
5  Mustang    80.0
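
The same idea can be extended to several columns at once, each filled with its own mode. The sketch below is one possible approach, not part of the example above: DataFrame.mode() computes the modes per column, and its first row is passed to fillna() as a Series keyed by column name. The frame df and its values are hypothetical.

```python
import pandas as pd
import numpy as np

# Hypothetical frame with one gap in each column
df = pd.DataFrame(
    {
        "Car": ["BMW", "Lexus", "Lexus", None],
        "Units": [100, np.nan, 100, 80],
    }
)

# First row of DataFrame.mode() holds the first mode of every column
per_column_modes = df.mode().iloc[0]

# fillna() matches the Series index against the column names,
# so each column receives its own mode
filled = df.fillna(per_column_modes)
print(filled)
```
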
raja
Published on 21-Sep-2021 06:51:07