What kind of data does Python pandas handle?



Anyone working in fields such as machine learning or data science has to deal with data, because data is the foundation of these fields. Handling data in practice is difficult because real-world data is messy.

The main advantage of the Python pandas package is that it provides numerous functions for handling data. Real-world data can take many forms: characters, integers, floating-point values, categorical data, and more.

Pandas is well suited to handling and manipulating tabular data because of its DataFrame object, which offers many functions for working with such data. A DataFrame is a 2-dimensional data structure that stores tabular data, and its columns can hold values of different types (integers, characters, floating-point values, categorical data, and more).
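
As a minimal sketch of the idea that one DataFrame can hold several kinds of data at once, the snippet below builds a small table in memory. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df_mixed = pd.DataFrame({
    "units": [10, 25, 7],                     # integers
    "price": [2.5, 4.0, 3.75],                # floating-point values
    "product": ["pen", "book", "cup"],        # text
    "size": pd.Categorical(["S", "L", "M"]),  # categorical data
})

# Each column keeps its own data type
print(df_mixed.dtypes)
```

Each column reports its own dtype, which is what lets pandas store mixed tabular data in a single object.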

Example

import pandas as pd
# Read the CSV file into a DataFrame
data = pd.read_csv('sales_data.csv')
# Display the data type of each column
print(data.dtypes)

Explanation

We import the pandas package with the import keyword and then use the read_csv function to read the CSV file. Here, sales_data.csv is our data file. It has 10 columns named Customer Number, Customer Name, 2016, 2017, Percent Growth, Jan Units, Month, Day, Year, and Active.

Each column holds a different type of data. To get the data type of every column, we use the dtypes attribute of the DataFrame.

Output

Customer Number   float64
Customer Name     object
2016              object
2017              object
Percent Growth    object
Jan Units         object
Month              int64
Day                int64
Year               int64
Active            object
dtype: object

The output above lists the column names and data types of our input data set (sales_data.csv). Three columns store integer values (int64), one column stores floating-point values (float64), and the remaining six columns have the object dtype, which pandas typically uses for text (string) data.
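
The per-type column counts described above can also be computed rather than read off by eye. The following is a rough sketch that uses a small, made-up stand-in for a few columns of sales_data.csv (the values are invented, since the file contents are not shown here).

```python
import pandas as pd

# Hypothetical stand-in for a few columns of sales_data.csv; values are made up
sample = pd.DataFrame({
    "Customer Number": [10002.0, 552278.0],
    "Customer Name": ["Customer A", "Customer B"],
    "Month": [1, 6],
    "Day": [10, 15],
    "Active": ["Y", "N"],
})

# Number of columns per data type
dtype_counts = sample.dtypes.value_counts()
print(dtype_counts)

# Names of the columns that hold numeric data (integers and floats)
numeric_columns = sample.select_dtypes(include="number").columns.tolist()
print(numeric_columns)
```

The same two calls should work on the DataFrame returned by read_csv, assuming the file loads as described.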

Example

import pandas as pd
df = pd.DataFrame({'datetime': [pd.Timestamp('20190210')], 'boolean': True})

print(df)
print()  # print a blank line to separate the two outputs
print(df.dtypes)

Explanation

The code above creates a DataFrame with two columns of different data types: datetime and boolean. The datetime value is created with pd.Timestamp.

Output

    datetime  boolean
0 2019-02-10     True

datetime    datetime64[ns]
boolean               bool
dtype: object

The output block above contains two outputs. The first shows the data stored in the DataFrame object df, and the second shows the dtype of each column of that DataFrame.
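
In practice, dates often arrive as text rather than as Timestamp objects. As a hedged sketch, assuming a hypothetical frame named events with made-up values, pd.to_datetime can convert such a text column to a datetime column:

```python
import pandas as pd

# Hypothetical data: dates stored as text, plus a boolean column
events = pd.DataFrame({
    "when": ["2019-02-10", "2019-03-15"],
    "done": [True, False],
})

# Convert the text column to a datetime column
events["when"] = pd.to_datetime(events["when"])

print(events.dtypes)
```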

These examples show what kinds of data pandas handles and how it represents them.

