How do StringDtype objects differ from object dtype in Python Pandas?
Pandas uses the object dtype not only for text data but also for any other data it does not recognize as a more specific type. So when a column has the object dtype, it does not mean that every value in that column is a string. The values may be numbers, or a mixture of strings, integers, and floats. Because of this, string operations on such a column are unreliable: the .str accessor returns a missing value for every element that is not a string.
To address this problem, pandas 1.0 introduced a dedicated string dtype, StringDtype, but we need to specify it explicitly.
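As a minimal sketch of what specifying the dtype explicitly can look like, the snippet below requests the string dtype at construction time and also converts an existing Series afterwards. The variable names are only illustrative, and the dtype printed for the default Series depends on the pandas version in use.

```python
import pandas as pd

words = ["python", "string", "dtype"]

# No dtype given: pandas infers one (object in pandas 1.x and 2.x).
default_series = pd.Series(words)

# "string" is the alias for pd.StringDtype().
explicit_series = pd.Series(words, dtype="string")

# An existing Series can be converted afterwards with astype.
converted_series = default_series.astype("string")

print(default_series.dtype)
print(explicit_series.dtype)
print(converted_series.dtype)
```

Passing dtype=pd.StringDtype() is equivalent to passing the "string" alias; the alias is simply shorter to type.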
See some examples to understand how StringDtype and object dtype differ.
Example
import pandas as pd

list_ = ['python', 90, 'string', 2]  # a list mixing strings and integers
ds = pd.Series(list_)                # create a Series; the dtype is inferred
print(ds)                            # print the Series
print()
print(type(ds[1]))                   # display the type of the 2nd element of the Series
Explanation
The code above creates a pandas Series from a list of 4 elements that contains both strings and integers. The last line prints the data type of the 2nd element.
Output
0    python
1        90
2    string
3         2
dtype: object

<class 'int'>
We can clearly see that the dtype of the ds Series is object, but the type of its 2nd element is int, not str. This shows that the object dtype does not store only text data; it can hold a mixture of any Python objects.
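To illustrate why this mixture matters for string operations, here is a small sketch that applies a .str method to the same kind of mixed Series. The names mixed and upper are only illustrative. The call does not fail; the non-string entries simply come back as missing values.

```python
import pandas as pd

# Same mixed data as in the example above; the dtype is object.
mixed = pd.Series(["python", 90, "string", 2])

# .str methods skip non-string elements and return a missing value for them.
upper = mixed.str.upper()
print(upper)

# Number of entries that were not strings and came back missing.
print(upper.isna().sum())
```

Because no error is raised, a mistake like this can go unnoticed until the missing values show up later in the analysis.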
Example
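Here we pass pd.StringDtype() explicitly to the dtype parameter of the pandas Series constructor.
import pandas as pd

list_ = ['python', 90, 'string']                 # a list mixing strings and an integer
ds = pd.Series(list_, dtype=pd.StringDtype())    # request the string dtype explicitly
print(ds)
print()
print(type(ds[1]))                               # display the type of the 2nd element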
Explanation
In this example, we override the default dtype of the Series by setting the dtype parameter to pd.StringDtype(). We then display the type of the 2nd element of ds again. Note that converting non-string values such as 90 in this way requires pandas 1.1 or later.
Output
0    python
1        90
2    string
dtype: string

<class 'str'>
The dtype of the Series ds is string, and the type of its 2nd element is str as well: the integer 90 was converted to the string '90'. So we can see that StringDtype converts every value in the Series to a string.
Storing textual data with StringDtype guarantees that every value is a string, so string operations work without any difficulty. That is why StringDtype is recommended for storing textual data.
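As a brief sketch of such string operations, the snippet below reuses the list from the second example and calls two .str methods on the resulting Series. It assumes pandas 1.1 or later, where the integer 90 is converted to '90' on construction. The names upper and is_digit are only illustrative.

```python
import pandas as pd

# Same data as the second example; every value is stored as a string.
ds = pd.Series(["python", 90, "string"], dtype="string")

upper = ds.str.upper()       # result keeps the string dtype
is_digit = ds.str.isdigit()  # True only for the converted '90'

print(upper)
print(is_digit)
```

Every element takes part in the operation here, whereas an object column with the same data would leave the integer out.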
