Get the data type of column in Pandas - Python
Pandas is a popular and powerful Python library for data analysis and manipulation. Its two core data structures, the Series and the DataFrame, are designed for working with tabular and time-series data (the older Panel structure was removed in pandas 0.25).
A Pandas DataFrame is a two-dimensional tabular data structure in which each column can hold a different data type. In this article, we'll go through several methods for determining a column's data type, a check that is often needed before cleaning, converting, or analyzing data.
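To illustrate that columns of one DataFrame can carry different data types, here is a minimal sketch. The DataFrame name mixed and its columns (mileage, in_stock, listed_on) are hypothetical and are not part of the article's sample data. The exact type labels printed can vary between pandas versions.

```python
import pandas as pd

# Hypothetical sample data: each column holds a different kind of value
mixed = pd.DataFrame({
    "model": ["Supra", "Honda"],                                  # text
    "price": [5000000, 600000],                                   # integers
    "mileage": [12.5, 18.2],                                      # floats
    "in_stock": [True, False],                                    # booleans
    "listed_on": pd.to_datetime(["2023-01-15", "2023-02-01"]),    # timestamps
})

# one data type per column
print(mixed.dtypes)
```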
Before moving forward, let's create a sample DataFrame whose column data types we will inspect.
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})
print(df)
Output
This Python script prints the DataFrame that we have created.
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
The task can be completed using any of the following approaches.
Approaches
Using the dtypes attribute
Using select_dtypes()
Using the info() method
Using the describe() function
Now let's discuss each approach and how it can be used to get the data type of a column in Pandas.
Method 1: Using the dtypes attribute
We can use the dtypes attribute to get the data type of each column in the DataFrame. This attribute returns a Series whose index holds the column names and whose values are the corresponding data types. The syntax is:
Syntax
df.dtypes
Return Type: a Series containing the data type of each column in the DataFrame.
Algorithm
Import the Pandas library.
Create a DataFrame using the pd.DataFrame() function and pass the sample data as a dictionary.
Use the dtypes attribute to get the data types of each column in the DataFrame.
Print the result to check the data types of each column.
Example 1
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# get the data types of each column
print("\nData types of each column:")
print(df.dtypes)
Output
DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000

Data types of each column:
Vehicle name    object
price            int64
dtype: object
Example 2
In this example, we get the data type of a single column of the DataFrame by indexing the dtypes Series with the column name.
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# get the data type of the column named price
print("\nData types of column named price:")
print(df.dtypes['price'])
Output
DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000

Data types of column named price:
int64
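As a complement to indexing df.dtypes, a single column's type can also be read from the column itself through the Series dtype attribute. The sketch below additionally uses is_numeric_dtype from pandas.api.types to test the kind of a column. The variable name price_dtype is only illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# same sample data as in the article
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# read the dtype straight from the column (a Series)
price_dtype = df['price'].dtype
print(price_dtype)

# both access paths give the same dtype
print(price_dtype == df.dtypes['price'])

# check the kind of data a column holds without comparing type names
print(is_numeric_dtype(df['price']))
print(is_numeric_dtype(df['Vehicle name']))
```

Checking with is_numeric_dtype avoids hard-coding names such as int64 or float64 when the code only needs to know whether a column is numeric.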
Method 2: Using select_dtypes()
We can use the select_dtypes() method to filter columns by data type. Given the data types passed through its include and exclude parameters, select_dtypes() returns a subset of the DataFrame containing only the matching columns. We can then read the data type of each selected column.
Algorithm
Import the Pandas library.
Create a DataFrame using pd.DataFrame() function and pass the given data as a dictionary.
Print the DataFrame to check the created data.
Use the select_dtypes() method to select all the numeric columns from the DataFrame. Pass the list of data types that we want to select as an argument using the include parameter.
Loop over the selected columns and print the data type of each one.
Example
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# select the numeric columns
numeric_cols = df.select_dtypes(include=['float64', 'int64']).columns

# get the data type of each numeric column
for col in numeric_cols:
    print("Data Type of column", col, "is", df[col].dtype)
Output
DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
Data Type of column price is int64
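Listing 'float64' and 'int64' explicitly would miss other numeric types such as int32. As a sketch of an alternative, select_dtypes() also accepts the generic string 'number', and its exclude parameter selects everything else. The variable names numeric_part and other_part are only illustrative.

```python
import pandas as pd

# same sample data as in the article
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# keep every numeric column, whatever its exact width
numeric_part = df.select_dtypes(include='number')

# keep every column that is not numeric
other_part = df.select_dtypes(exclude='number')

print(list(numeric_part.columns))
print(list(other_part.columns))
```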
Method 3: Using the info() method
We can also use the info() method for our task. The info() method prints a concise summary of a DataFrame, including the data type and non-null count of each column. The syntax is:
Syntax
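DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)
Return Value: None. The summary is printed to standard output, or written to buf if one is supplied. Older pandas releases named the last parameter null_counts; current releases use show_counts.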
Algorithm
Import the Pandas library.
Create a DataFrame using the pd.DataFrame() function and pass the above data as a dictionary.
Print the DataFrame to check the created data.
Use the info() method to get information about the DataFrame.
Call info() directly rather than wrapping it in print(): it prints the summary itself and returns None, so print(df.info()) would only add a stray None line.
Example
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# use the info() method to get the data type of each column
df.info()
Output
DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   Vehicle name  3 non-null      object
 1   price         3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
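Because info() returns None, its summary cannot be assigned to a variable directly. One way to keep the text, sketched here, is to pass a writable buffer through the buf parameter. The names buffer and summary are only illustrative.

```python
import io
import pandas as pd

# same sample data as in the article
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# write the summary into an in-memory text buffer instead of stdout
buffer = io.StringIO()
df.info(buf=buffer)

# the summary is now an ordinary string that can be logged or searched
summary = buffer.getvalue()
print(summary)
```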
Method 4: Using the describe() function
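The describe() method generates descriptive statistics of a DataFrame and returns them as a new DataFrame. It does not report the data types of the original columns directly: reading dtypes on its result gives the data types of the summary table, which can differ from the source. In the example below, price is int64 in the original DataFrame but float64 in the summary, because the summary column holds values such as the mean and NaN placeholders. Use this approach only as a rough indication, and prefer df.dtypes when the exact type matters.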
Algorithm
Import the Pandas library using the import statement.
Create a DataFrame using the pd.DataFrame() function and pass the given data as a dictionary.
Print the DataFrame to check the created data.
Use the describe() method to get the descriptive statistics of the DataFrame.
Set the include parameter of the describe() method to 'all' to include all the columns in the descriptive statistics.
Get the data type of each column of the resulting summary DataFrame using the dtypes attribute.
Print the data type of each column.
Example
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# use the describe() method to get the descriptive statistics of the dataframe
desc_stats = df.describe(include='all')

# get the data type of each column
dtypes = desc_stats.dtypes

# print the data type of each column
print("Data type of each column in the descriptive statistics:\n", dtypes)
Output
DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
Data type of each column in the descriptive statistics:
 Vehicle name     object
price           float64
dtype: object
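The output above lists price as float64, while the original column is int64. To make the difference visible, the sketch below places the original column types next to the types of the describe() result. The name comparison and its column labels are only illustrative.

```python
import pandas as pd

# same sample data as in the article
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# put the two sets of dtypes side by side, aligned on column name
comparison = pd.DataFrame({
    "original": df.dtypes,
    "describe_result": df.describe(include="all").dtypes,
})
print(comparison)
```

Statistics such as the mean are fractional, and the rows that do not apply to a numeric column are filled with NaN, so the summary column is stored as floating point even when the source column holds integers.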
Conclusion
Knowing how to get the data type of each column makes data manipulation and analysis more reliable. The dtypes attribute is the most direct option, select_dtypes() is useful when you want only the columns of certain types, info() prints a summary of the whole DataFrame, and describe() reports statistics whose own data types may differ from those of the original columns. Choose the method that fits the task and your preferred coding style.