Python Pandas - Filling missing column values with mode
Mode is the value that appears most often in a set of values. To fill the missing values of a column with its mode, compute the mode with mode() and pass it to the fillna() method. At first, let us import the required libraries with their respective aliases −
import pandas as pd
import numpy as np
Create a DataFrame with 2 columns. The missing values are set using NumPy's np.nan (the np.NaN alias was removed in NumPy 2.0) −
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
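Find the mode of the column that contains NaN values, i.e. the Units column here. mode() returns a Series because several values can tie for most frequent, and [0] selects the first of them. Here 100, 150 and 80 each occur once, so all three tie and [0] picks the smallest, 80. Replace the NaNs in the Units column with that value −
dataFrame['Units'] = dataFrame['Units'].fillna(dataFrame['Units'].mode()[0])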
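The tie-breaking behaviour of mode() can be checked in isolation. The following is a minimal sketch, assuming the same Units values as above; the names units, modes, fill_value and clear are only illustrative and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Same Units values as in the tutorial: 100, 150 and 80 each occur once
units = pd.Series([100, 150, np.nan, 80, np.nan, np.nan])

# mode() skips NaN by default and returns every tied value, sorted ascending
modes = units.mode()
print(modes.tolist())

# [0] therefore picks the smallest of the tied values
fill_value = modes[0]
print(fill_value)

# Illustrative series with a clear winner: only one mode is returned
clear = pd.Series([150, 150, 80, np.nan])
print(clear.mode().tolist())
```

If a tie should be resolved differently (for example by taking the largest value), the Series returned by mode() can be indexed or aggregated accordingly before it is passed to fillna().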
Example
Following is the complete code −
import pandas as pd
import numpy as np

# Create DataFrame
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)

print("DataFrame ...", dataFrame, sep="\n")

# Find the mode of the column with NaN values, i.e. the Units column here,
# and replace the NaNs in that column with it
dataFrame['Units'] = dataFrame['Units'].fillna(dataFrame['Units'].mode()[0])

print("\nUpdated Dataframe after filling NaN values with mode...", dataFrame, sep="\n")
Output
This will produce the following output −
DataFrame ...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus    NaN
3  Mustang   80.0
4  Bentley    NaN
5  Mustang    NaN

Updated Dataframe after filling NaN values with mode...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus   80.0
3  Mustang   80.0
4  Bentley   80.0
5  Mustang   80.0
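When more than one column has missing values, each column can be filled with its own mode in a single call. The following is a sketch of one possible approach, not part of the original example: the helper name fill_with_column_modes and the sample data are hypothetical. It assumes that DataFrame.mode() lists each column's modes row by row, so the first row holds the first mode of every column.

```python
import numpy as np
import pandas as pd

def fill_with_column_modes(df):
    """Return a copy of df with NaNs replaced by each column's first mode."""
    # First row of df.mode() holds one mode per column (hypothetical helper)
    first_modes = df.mode().iloc[0]
    # fillna with a Series matches its index against the column names
    return df.fillna(first_modes)

# Illustrative data with gaps in both columns
sample = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', None, 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 150, np.nan, 80]
    }
)

filled = fill_with_column_modes(sample)
print(filled)
```

Unlike passing a single scalar to fillna() on the whole DataFrame, this keeps a numeric mode from being written into a text column.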