Different ways to create a Pandas DataFrame


Pandas is a Python library used for data analysis and data manipulation. Data in pandas is held in one of two main structures: the DataFrame and the Series.
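To make the difference between the two structures concrete, here is a minimal sketch. The variable names and values (ages, people) are made up for illustration and are not part of the original article.

```python
import pandas as pd

# A Series is one-dimensional: a single column of values with an index.
ages = pd.Series([25, 32, 47], name="age")

# A DataFrame is two-dimensional: several labeled columns sharing one index.
people = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 32, 47]})

print(ages.ndim)            # number of dimensions of the Series
print(people.ndim)          # number of dimensions of the DataFrame
print(type(people["age"]))  # selecting one column gives back a Series
```

Selecting a single column of a DataFrame returns a Series, so a DataFrame can be thought of as a collection of Series that share the same index.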

The DataFrame is a two-dimensional labeled data structure in pandas. It is used for data manipulation and data analysis, and its columns can hold different data types such as integers, floats and strings. Each column is identified by a label and each row by an index value. These labels are usually unique, which makes it possible to access a specific row or column directly.
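The following sketch illustrates mixed column types and label-based row access. The column names, index labels and values are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Three columns with different data types and a custom row index.
df = pd.DataFrame(
    {
        "item": ["pen", "book", "lamp"],  # strings
        "price": [1.5, 12.0, 30.0],       # floats
        "qty": [10, 4, 2],                # integers
    },
    index=["r1", "r2", "r3"],
)

print(df.dtypes)     # one data type per column

row = df.loc["r2"]   # select a row by its index label
print(row["price"])  # read one value from that row
```

Here .loc selects by label. Position-based selection is also available through .iloc, for example df.iloc[1] for the second row.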

The DataFrame is widely used in machine learning tasks because it lets users manipulate and analyze large data sets. It supports operations such as filtering, sorting, merging, grouping and transforming data.
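As a rough sketch of the operations listed above, the snippet below applies each one to a small DataFrame. The data and the names sales and managers are assumptions made up for the example.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "units": [5, 3, 8, 1],
})
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Ann", "Bob"]})

big = sales[sales["units"] > 2]                   # filtering: keep rows with units > 2
ordered = sales.sort_values("units")              # sorting: ascending by units
merged = sales.merge(managers, on="region")       # merging: join on the region column
totals = sales.groupby("region")["units"].sum()   # grouping: total units per region
sales["units_x2"] = sales["units"] * 2            # transforming: derive a new column

print(big)
print(ordered)
print(merged)
print(totals)
print(sales)
```

Apart from the last line, which adds a column in place, each of these calls returns a new object and leaves sales unchanged.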

The following sections show different ways to create a pandas DataFrame. Let's look at them one by one.

From a NumPy array

We can create a DataFrame from a NumPy array by passing the array to the DataFrame() constructor of the pandas library. The syntax is as follows.

pandas.DataFrame(array)

Where,

  • pandas is the name of the library

  • DataFrame is the constructor that builds the DataFrame

  • array is the NumPy array to convert

Example

In this example we pass a NumPy array to the DataFrame() constructor along with the column names. Each row of the array becomes a row of the DataFrame.

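import pandas as pd
import numpy as np

# A 2 x 3 array: two rows, three columns
arr = np.array([[20, 30, 40], [70, 80, 40]])
# Convert the array to a DataFrame and name the columns
data = pd.DataFrame(arr, columns=['a1', 'a2', 'a3'])
print(data.head())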

Output

   a1  a2  a3
0  20  30  40
1  70  80  40
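The example above sets the column names explicitly. As a small supplementary sketch, the snippet below shows what happens when the names are omitted and how row labels can be set with the index parameter. The row labels "x" and "y" are hypothetical.

```python
import numpy as np
import pandas as pd

arr = np.array([[20, 30, 40], [70, 80, 40]])

# Without column names, pandas numbers the columns 0, 1, 2.
default_df = pd.DataFrame(arr)

# Both column labels and row labels can be supplied.
labeled_df = pd.DataFrame(arr, columns=["a1", "a2", "a3"], index=["x", "y"])

print(default_df)
print(labeled_df)
print(labeled_df.loc["y", "a2"])  # value in row "y", column "a2"
```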

From a dictionary

A DataFrame can be created from a dictionary by passing the dictionary to the DataFrame() constructor. Each key becomes a column label and each value (a list of equal length) becomes the data of that column. The syntax is as follows.

pandas.DataFrame(dictionary)

Example

In this example we pass a dictionary to the DataFrame() constructor. The keys 'b', 'c' and 'a' become the column labels, in the order in which they appear in the dictionary.

import pandas as pd

# Keys become column labels; each list holds the values of one column
dic = {'b': [2, 3], 'c': [3, 5], 'a': [1, 6]}
data = pd.DataFrame(dic)
print(data.head())

Output

   b  c  a
0  2  3  1
1  3  5  6
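Dictionaries can be laid out in other ways too. The sketch below shows two variations that go beyond the original example: a list of dictionaries (one per row), and DataFrame.from_dict() with orient="index", which treats the keys as row labels instead of column labels. The column names "first" and "second" are hypothetical.

```python
import pandas as pd

# One dictionary per row: the keys are still the column labels.
records = [{"b": 2, "c": 3, "a": 1}, {"b": 3, "c": 5, "a": 6}]
records_df = pd.DataFrame(records)

# orient="index" turns each key into a row label instead of a column label.
dic = {"b": [2, 3], "c": [3, 5], "a": [1, 6]}
by_rows = pd.DataFrame.from_dict(dic, orient="index", columns=["first", "second"])

print(records_df)
print(by_rows)
```

The first form produces the same table as the example above, while the second one is its transpose with named columns.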

From a CSV file

We can also create a DataFrame from the contents of a CSV file. The pandas library provides the read_csv() function, which reads a CSV file from a local path or a URL and returns a DataFrame. The syntax is as follows.

pandas.read_csv(csv_file)

Example

In this example we create a DataFrame from a CSV file hosted at a URL by using the read_csv() function, then print the first 20 rows.

import pandas as pd

# read_csv() accepts a URL as well as a local file path
data = pd.read_csv("https://raw.githubusercontent.com/Opensourcefordatascience/Data-sets/master/blood_pressure.csv")
# Show the first 20 rows
print(data.head(20))

Output

    patient   sex agegrp  bp_before  bp_after
0         1  Male  30-45        143       153
1         2  Male  30-45        163       170
2         3  Male  30-45        153       168
3         4  Male  30-45        153       142
4         5  Male  30-45        146       141
5         6  Male  30-45        150       147
6         7  Male  30-45        148       133
7         8  Male  30-45        153       141
8         9  Male  30-45        153       131
9        10  Male  30-45        158       125
10       11  Male  30-45        149       164
11       12  Male  30-45        173       159
12       13  Male  30-45        165       135
13       14  Male  30-45        145       159
14       15  Male  30-45        143       153
15       16  Male  30-45        152       126
16       17  Male  30-45        141       162
17       18  Male  30-45        176       134
18       19  Male  30-45        143       136
19       20  Male  30-45        162       150
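The example above needs network access. As an offline sketch, the snippet below feeds read_csv() an in-memory text buffer that holds the header and the first three rows shown in the output above, and uses the usecols and nrows parameters to limit what is loaded. The variable name csv_text is an assumption for this illustration.

```python
import io
import pandas as pd

# The header and first three rows from the output above, as CSV text.
csv_text = """patient,sex,agegrp,bp_before,bp_after
1,Male,30-45,143,153
2,Male,30-45,163,170
3,Male,30-45,153,168
"""

# read_csv() accepts any file-like object, not only paths and URLs.
data = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["patient", "bp_before", "bp_after"],  # load only these columns
    nrows=2,                                       # read only the first two data rows
)
print(data)
```

Restricting columns and rows in this way can be useful when a file is large and only part of it is needed.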

Updated on: 20-Oct-2023
