Python Data Types for Data Science



Overview

Python is a high-level language used for many purposes, such as web and software development, data visualization, data analysis, and task automation. It also supports scientific, statistical, and mathematical computing, and it offers mature libraries for data science applications. Machine learning practitioners also favour Python.

Python has numerous built-in data types. The most commonly used are list, dict (dictionary), int (integer), str (string), bool (boolean), and float (floating point). Many libraries are also used in data science, including NumPy, pandas, Matplotlib, and SciPy.

Python in Data Science

Programming in data science calls for a flexible language that is simple to learn and capable of sophisticated mathematical operations. Python has already established itself as a language for both general and scientific computing, so it is well suited to these requirements. Its collection of libraries also keeps growing, with additions tailored to different programming needs.

Python Data Types for Data Science

Data types categorize or classify data items. A data type is the kind of value a piece of data holds, and it defines the operations that can be performed on that data.

In other words, a data type is a class of data items identified by the values it can take and the operations that can be carried out on it.

In Python, everything is an object, so data types are classes and variables refer to instances (objects) of those classes. This section covers the built-in data types and their categories.

Python's standard data types fall into six main categories, given below
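As a small illustration of the idea that data types are classes and values are their instances, the following sketch uses the built-in type() and isinstance() functions. The variable name x is chosen only for this example.

```python
# Every value is an instance of some class.
x = 5
print(type(x))               # <class 'int'>

# isinstance() checks whether a value belongs to a given class.
print(isinstance(x, int))    # True

# The classes themselves are objects too: int is an instance of type.
print(isinstance(int, type)) # True
```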

  • Numeric − int, float, complex

  • Dictionary − dict

  • Boolean − bool

  • Set − set

  • Sequence Type − list, tuple, range

  • String − str

Let’s discuss each of them in depth.

Python Numeric Data Type

Python's numeric data types represent data that has a numeric value. There are three of them: integers belong to the int class, floating-point numbers to the float class, and complex numbers to the complex class.

Integer − Positive and negative whole numbers with no fractional part. They belong to the int class, and Python places no fixed limit on their size.

Float − A real number in floating-point representation, written with a decimal point. The character e or E followed by a positive or negative integer may be appended to designate scientific notation.

Complex Number − The complex class represents complex numbers, written as (real part) + (imaginary part)j. For example, 4+5j.

Note − To identify the type of a value, use the built-in type() function.

Example

# Check the type of an integer, a float, and a complex number
numb1 = 2
print("Type of", numb1, "is", type(numb1))
numb2 = 1.0
print("Type of", numb2, "is", type(numb2))
numb3 = 2+3j
print("Type of", numb3, "is", type(numb3))

Output

Type of 2 is <class 'int'>
Type of 1.0 is <class 'float'>
Type of (2+3j) is <class 'complex'>
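The properties of the three numeric types described above can be sketched as follows. The variable names are illustrative only.

```python
# Integers have arbitrary precision, so large values do not overflow.
big = 2 ** 100
print(big)             # 1267650600228229401496703205376

# Scientific notation with e or E always produces a float.
avogadro = 6.022e23
small = 1.5E-3
print(type(avogadro))  # <class 'float'>
print(small)           # 0.0015

# A complex number exposes its real and imaginary parts as floats.
z = 4 + 5j
print(z.real, z.imag)  # 4.0 5.0
```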

Python Dictionary

A dictionary in Python is a collection of key-value pairs, similar to a map in other languages. Unlike data types that hold a single value per element, each dictionary entry pairs a key with a value. Since Python 3.7, dictionaries preserve insertion order.

Storing data as key-value pairs makes lookups by key efficient. In a dictionary literal, a colon separates each key from its value, and a comma separates each key-value pair from the next.

Creation of dictionary

In Python, a dictionary can be created by enclosing comma-separated key-value pairs in curly braces. Values can be of any data type and can be repeated, whereas keys must be unique and immutable (hashable). A dictionary can also be created with the built-in function dict(). A pair of empty curly braces creates an empty dictionary.

Example

data = {'f_name': 'Prabhdeep', 'l_name': 'Singh', 'age': 25}
print(type(data))

Output

<class 'dict'>

The above code snippet creates a dictionary named data with three key-value pairs.
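The other ways of creating a dictionary, along with the requirement that keys be hashable, can be sketched as follows. The names used here are illustrative and do not come from a particular dataset.

```python
# An empty dictionary, written two equivalent ways.
empty_a = {}
empty_b = dict()

# dict() also accepts keyword arguments or an iterable of key-value pairs.
from_kwargs = dict(f_name='Prabhdeep', age=25)
from_pairs = dict([('f_name', 'Prabhdeep'), ('age', 25)])
print(from_kwargs == from_pairs)  # True

# Keys must be hashable: a tuple works as a key, a list does not.
points = {(0, 0): 'origin'}
try:
    points[[1, 1]] = 'invalid'
except TypeError as err:
    print('TypeError:', err)
```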

Access Dictionary Values using Keys

You can use the keys to access the respective values in the dictionary.

Example

data = {'f_name': 'Prabhdeep', 'l_name': 'Singh', 'age': 25}
# Access the value of first key - f_name
print(data['f_name'])
# Access the value of second key - l_name
print(data['l_name'])
# Access the value of last/ third key - age
print(data['age'])
# Print the entire dictionary
print(data)

Output

Prabhdeep
Singh
25
{'f_name': 'Prabhdeep', 'l_name': 'Singh', 'age': 25}

Note − Dictionary keys are case-sensitive, meaning that keys that differ only in letter case are treated as different keys.
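A short sketch of case sensitivity follows. The dictionary record and its keys are made up for this example; the get() method is used to look up a key that may be missing without raising a KeyError.

```python
# 'Name' and 'name' are two distinct keys.
record = {'Name': 'Prabhdeep', 'name': 'Singh'}
print(len(record))         # 2
print(record['Name'])      # Prabhdeep
print(record['name'])      # Singh

# A key with different casing is simply absent; get() returns None for it.
print(record.get('NAME'))  # None
```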

Python Boolean

The Boolean data type has two predefined values, True and False. Objects equal to True are truthy, and objects equal to False are falsy. Non-Boolean objects can also be evaluated in a Boolean context and classified as truthy or falsy. The bool class represents this type.

Note − Boolean values must start with a capital T or F. The lowercase names true and false are not Boolean values; unless they have been defined as variables, Python raises a NameError for them. Look at the example below

Example

# define a boolean variable
b = False
print(type(b))

Output

<class 'bool'>
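Evaluating non-Boolean objects in a Boolean context, and the error raised by a lowercase true, can be sketched like this. The variable names are illustrative only.

```python
# bool() evaluates any object in a Boolean context.
falsy = [bool(0), bool(''), bool([])]
truthy = [bool(42), bool('a'), bool([0])]
print(falsy)   # [False, False, False]
print(truthy)  # [True, True, True]

# Lowercase true is just an undefined name, so it raises NameError.
try:
    flag = true
except NameError as err:
    flag = None
    print('NameError:', err)
```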

Python Set Data Type

In Python, a set is a mutable, iterable collection with no duplicate elements. A set can hold elements of various types, but the order of its elements is not fixed.

A set groups unordered objects. No element can appear more than once, and each element must be immutable (hashable), although the set itself can be changed.

Because a set is unordered, its elements have no positions, so the indexing and slicing operator [] is not supported.

Creation of set

Sets can be created by passing an iterable, such as a list, to the built-in set() function, or by enclosing comma-separated elements in curly braces. The elements of a set don't have to be of the same type; a set can mix values of different data types. Note that empty curly braces create an empty dictionary, so an empty set must be created with set().

Example

# Create a set from a list using the set() function
s = set([1, 2, 3, 4, 5])
print(s) # Output: {1, 2, 3, 4, 5}
# Create a set using curly braces
s = {1, 2, 3, 4, 5}
print(s) # Output: {1, 2, 3, 4, 5}

Output

{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5}
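The set properties described in this section (no duplicates, mutability, hashable elements, and no indexing) can be sketched as follows. The variable names are illustrative only.

```python
# Duplicates are dropped when the set is built.
s = {1, 2, 2, 3, 3, 3}
print(len(s))     # 3

# The set itself can be changed after creation.
s.add(4)
s.discard(1)
print(sorted(s))  # [2, 3, 4]

# Elements may be of mixed types, but each must be hashable.
mixed = {1, 'two', (3, 4)}
try:
    bad = {[1, 2]}
except TypeError as err:
    print('TypeError:', err)

# Sets have no positions, so indexing raises TypeError.
try:
    s[0]
except TypeError as err:
    print('TypeError:', err)
```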

Python Sequence

A sequence in Python is an ordered collection of values of the same or different data types. Sequences allow several values to be stored in order and accessed efficiently. Python has several sequence types, given below −

  • List

  • Tuple

  • Range

List Data Type

A list is formed by placing comma-separated elements inside square brackets. Elements can be of any data type, including another list. They can be traversed with an iterator or accessed individually by index.

Example

# Create a list using square brackets
l = [1, 2, 3, 4, 5]
print(l) # Output: [1, 2, 3, 4, 5]
# Access an item in the list using its index
print(l[1]) # Output: 2

Output

[1, 2, 3, 4, 5]
2
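A short sketch of the remaining list features mentioned above, namely mixed element types, nested lists, and traversal, follows. The list nested and its contents are made up for this example.

```python
# A list can mix data types and contain another list.
nested = [1, 'two', [3, 4]]
print(nested[2][0])  # 3

# Lists are mutable: elements can be replaced and appended.
nested[0] = 'one'
nested.append(5.0)
print(nested)        # ['one', 'two', [3, 4], 5.0]

# Traverse the list with a for loop.
for item in nested:
    print(type(item).__name__, item)

# A negative index counts from the end.
print(nested[-1])    # 5.0
```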

Tuple Data Type

Tuples are similar to lists, but they can’t be modified once they are created. Tuples are commonly used to store data that should not be modified, such as configuration settings or data that is read from a database.

Here is an example of creating a tuple and accessing its elements

Example

# Create a tuple using parentheses
t = (1, 2, 3, 4)
print(t) # Output: (1, 2, 3, 4)
# Access an item in the tuple using its index
print(t[1]) # Output: 2

Output

(1, 2, 3, 4)
2
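The immutability of tuples can be demonstrated with a short sketch. The tuple settings and its values are hypothetical, chosen to echo the configuration use case mentioned above.

```python
# A hypothetical configuration stored as a tuple.
settings = ('localhost', 8080)

# Assigning to an element raises TypeError because tuples are immutable.
try:
    settings[1] = 9090
except TypeError as err:
    print('TypeError:', err)

# Tuples can be unpacked into separate variables.
host, port = settings
print(host, port)    # localhost 8080

# A one-element tuple needs a trailing comma.
single = (1,)
print(type(single))  # <class 'tuple'>
```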

Python Range

The range data type represents an immutable sequence of numbers. It is similar to a list, but it is more memory-efficient because it computes its values on demand instead of storing them all.

Here is an example of the range data type in Python

Example

# Create a range using the range() function
r = range(10)
print(r) # Output: range(0, 10)
# Access an item in the range using its index
print(r[3]) # Output: 3

Output

range(0, 10)
3
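The sketch below shows a range with a start, stop, and step, and compares its memory footprint with that of an equivalent list. The size comparison uses sys.getsizeof(), whose exact numbers depend on the Python implementation, so only the relative result is checked here.

```python
import sys

# range(start, stop, step): the stop value is excluded.
r = range(2, 10, 3)
print(list(r))        # [2, 5, 8]
print(len(r), 5 in r) # 3 True

# A range stores only start, stop, and step, not every value.
big_range = range(1_000_000)
big_list = list(big_range)
smaller = sys.getsizeof(big_range) < sys.getsizeof(big_list)
print(smaller)        # True
```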

Python String

A string is a sequence of Unicode characters enclosed in single, double, or triple quotation marks. The str class represents it. Python has no separate character data type; a character is simply a string of length 1.

Strings support a variety of operations, including concatenation, slicing, and repetition.

  • Concatenation − Joining two or more strings together.

  • Slicing − Extracting parts of a string.

  • Repetition − Repeating a string a certain number of times.
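The three string operations listed above can be sketched as follows. The variable names are illustrative only.

```python
first = 'data'
second = 'science'

# Concatenation with the + operator.
joined = first + ' ' + second
print(joined)        # data science

# Slicing with [start:stop]; negative indices count from the end.
print(joined[0:4])   # data
print(joined[-7:])   # science

# Repetition with the * operator.
repeated = 'ab' * 3
print(repeated)      # ababab

# A single character is just a string of length 1.
print(type('a'), len('a'))  # <class 'str'> 1
```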

Conclusion

Data types categorize or classify data items, and a data type defines the operations that can be performed on a given piece of data. This article covered Python's built-in data types and their categories. In Python, everything is an object, so data types are classes and variables refer to objects of those classes. The six main categories of standard data types are Numeric, Dictionary, Boolean, Set, Sequence Type, and String.

