Get the Data Type of a Column in Pandas - Python
Pandas is a popular and powerful Python library commonly used for data analysis and manipulation. It offers data structures like Series and DataFrame for working with tabular data.
Each column in a Pandas DataFrame can contain a different data type. This article covers several methods for determining column data types in a DataFrame.
Sample DataFrame
Let's create a sample DataFrame to demonstrate the different approaches:
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})
print(df)
  Vehicle name    price  mileage  year
0        Supra  5000000     12.5  2020
1        Honda   600000     18.2  2019
2  Lamborghini  7000000      9.8  2021
Using the dtypes Attribute
The dtypes attribute returns a Series that holds the data type of each column in the DataFrame, indexed by column name:
Getting All Column Data Types
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

# Get data types of all columns
print("Data types of each column:")
print(df.dtypes)
Data types of each column:
Vehicle name     object
price             int64
mileage         float64
year              int64
dtype: object
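Because dtypes is an ordinary Series, the usual Series operations apply to it. The sketch below is illustrative only: the names dtype_map and dtype_counts are assumptions introduced here, not part of the original example. It converts the dtypes to strings, builds a column-to-type dictionary, and counts how many columns share each type.

```python
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

# Convert each dtype object to its string name, e.g. 'int64'
dtype_names = df.dtypes.astype(str)

# Map each column name to the name of its data type (hypothetical helper name)
dtype_map = dtype_names.to_dict()
print(dtype_map)

# Count how many columns share each data type (hypothetical helper name)
dtype_counts = dtype_names.value_counts()
print(dtype_counts)
```

A dictionary like this can be handy when logging a schema or comparing two DataFrames column by column.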
Getting Single Column Data Type
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8]
})

# Get data type of a specific column
print("Data type of 'price' column:", df.dtypes['price'])
print("Data type of 'mileage' column:", df.dtypes['mileage'])
Data type of 'price' column: int64
Data type of 'mileage' column: float64
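A single column's type can also be read from the column itself through the Series dtype attribute, and the helper functions in pandas.api.types can test which kind of type it is. The following is a minimal sketch rather than a recommended pattern; the variable name price_dtype is an assumption made for illustration.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype, is_numeric_dtype

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8]
})

# Read the dtype straight from the column (a Series)
price_dtype = df['price'].dtype
print("Data type of 'price' column:", price_dtype)

# A dtype can be compared with its string name
print("Is 'price' int64?", price_dtype == 'int64')

# Type-checking helpers avoid hard-coding exact dtype names
print("Is 'price' an integer column?", is_integer_dtype(df['price']))
print("Is 'mileage' a float column?", is_float_dtype(df['mileage']))
print("Is 'Vehicle name' numeric?", is_numeric_dtype(df['Vehicle name']))
```

The helper functions are useful when the exact width of a type (for example int32 versus int64) does not matter, only its general kind.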
Using select_dtypes() Method
The select_dtypes() method filters columns based on their data types and returns a DataFrame containing only the matching columns:
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

# Select numeric columns
numeric_cols = df.select_dtypes(include=['int64', 'float64']).columns
print("Numeric columns:", list(numeric_cols))

# Select object columns
object_cols = df.select_dtypes(include=['object']).columns
print("Object columns:", list(object_cols))

# Print data types of numeric columns
for col in numeric_cols:
    print(f"Data type of '{col}': {df[col].dtype}")
Numeric columns: ['price', 'mileage', 'year']
Object columns: ['Vehicle name']
Data type of 'price': int64
Data type of 'mileage': float64
Data type of 'year': int64
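Listing every numeric type by name can become tedious. As a hedged alternative, select_dtypes() also accepts the generic selector 'number', and its exclude parameter inverts the filter. The variable names numeric_df and non_numeric_df below are assumptions chosen for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

# 'number' matches all numeric columns, whatever their exact width
numeric_df = df.select_dtypes(include='number')
print("Numeric columns:", list(numeric_df.columns))

# exclude keeps every column that is NOT of the given type
non_numeric_df = df.select_dtypes(exclude='number')
print("Non-numeric columns:", list(non_numeric_df.columns))
```

Using exclude='number' selects the text column without naming its dtype explicitly.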
Using the info() Method
The info() method prints a summary of the DataFrame that includes each column's data type and non-null count, along with overall memory usage:
import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

print("DataFrame Info:")
df.info()
DataFrame Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   Vehicle name  3 non-null      object
 1   price         3 non-null      int64
 2   mileage       3 non-null      float64
 3   year          3 non-null      int64
dtypes: float64(1), int64(2), object(1)
memory usage: 224.0+ bytes
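Note that info() writes its summary to standard output and returns None, so the text cannot be assigned to a variable directly. One possible way to capture it, sketched here with the assumed names buffer and info_text, is to pass an in-memory text buffer through the buf parameter. Setting memory_usage='deep' additionally asks pandas to measure the actual size of object columns rather than report a lower bound (the "+" in the output above).

```python
import io

import pandas as pd

df = pd.DataFrame({
    'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
    'price': [5000000, 600000, 7000000],
    'mileage': [12.5, 18.2, 9.8],
    'year': [2020, 2019, 2021]
})

# Redirect the summary into an in-memory buffer instead of stdout
buffer = io.StringIO()
result = df.info(buf=buffer, memory_usage='deep')

# info() itself returns None; the text lives in the buffer
info_text = buffer.getvalue()
print(info_text)
```

The captured string can then be written to a log file or searched for a particular column's type.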
Comparison of Methods
| Method | Output | Best For |
|---|---|---|
| dtypes | Series of data types | Quick data type check |
| select_dtypes() | Filtered columns | Filtering by data type |
| info() | Complete summary | Comprehensive overview |
Conclusion
Use dtypes for quick data type checks, select_dtypes() for filtering columns by type, and info() for comprehensive DataFrame information. Each method serves different purposes in data analysis workflows.
