How to create an empty DataFrame and append rows & columns to it in Pandas?


Pandas is a Python library used for data manipulation and analysis. It is built on top of the NumPy library and provides an efficient implementation of a dataframe. A dataframe is a two-dimensional data structure in which data is aligned in rows and columns in tabular form, similar to a spreadsheet, an SQL table, or the data.frame in R. The DataFrame is the most commonly used pandas object. In most cases, data is imported into a pandas dataframe from other sources such as CSV files, Excel files, or SQL databases. In this tutorial, we will learn how to create an empty dataframe and how to append rows and columns to it in Pandas.

Syntax

To create an empty dataframe and append rows and columns to it, use the following syntax:

import pandas as pd

# syntax for creating an empty dataframe
df = pd.DataFrame()

# syntax for appending rows to a dataframe
df = pd.concat([df, pd.DataFrame([['row1_col1', 'row1_col2', 'row1_col3']], columns=['col1', 'col2', 'col3'])], ignore_index=True)

# syntax for appending columns to a dataframe
df['col_name'] = pd.Series([col1_val1, col1_val2, col1_val3, col1_val4], index=df.index)

We used the pd.concat function to append rows to the dataframe. Its first argument is a list of the dataframes to concatenate; here, that is the existing dataframe followed by a one-row dataframe built with the same column names. Setting ignore_index=True discards the original index labels and renumbers the rows of the result starting from 0.

The pd.Series constructor creates a Series from a list, and passing index=df.index gives the Series the same index labels as the dataframe. The column values can also be assigned directly as a list, without creating a Series.
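
To make the effect of ignore_index concrete, here is a minimal sketch that concatenates two one-row dataframes with and without it. The names first, second, kept, and reset are made up for this illustration.

```python
import pandas as pd

# Two one-row dataframes; each gets the default index label 0.
first = pd.DataFrame([['a', 1]], columns=['col1', 'col2'])
second = pd.DataFrame([['b', 2]], columns=['col1', 'col2'])

# Without ignore_index, the original labels are kept, so 0 appears twice.
kept = pd.concat([first, second])

# With ignore_index=True, the result is renumbered from 0.
reset = pd.concat([first, second], ignore_index=True)

print(list(kept.index))   # [0, 0]
print(list(reset.index))  # [0, 1]
```

Duplicate index labels are legal in pandas, but they make label-based lookups return several rows, which is usually not what you want when appending rows one at a time.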

Example 1

In this example, we first create a completely empty dataframe and then replace it with an empty dataframe that has 2 columns, by passing the column names ['Name', 'Age'] to the columns parameter of the DataFrame constructor. Next, we use the pd.concat function to append 3 rows, ['John', 25], ['Mary', 30], and ['Peter', 28], to the dataframe. The ignore_index parameter is set to True so that the rows are renumbered from 0 after each append.

Then, we appended 2 columns ['Salary', 'City'] to the dataframe. The 'Salary' column values were passed as a Series. The index of the Series was set to the index of the dataframe. The column values for the 'City' column were passed as a list.

import pandas as pd

# Create an empty dataframe with no rows and no columns
df = pd.DataFrame()
# Replace it with an empty dataframe that has two named columns
df = pd.DataFrame(columns=['Name', 'Age'])

# Append three rows, one at a time
df = pd.concat([df, pd.DataFrame([['John', 25]], columns=['Name', 'Age'])], ignore_index=True)
df = pd.concat([df, pd.DataFrame([['Mary', 30]], columns=['Name', 'Age'])], ignore_index=True)
df = pd.concat([df, pd.DataFrame([['Peter', 28]], columns=['Name', 'Age'])], ignore_index=True)

# Append two columns: one as a Series, one as a plain list
df['Salary'] = pd.Series([50000, 60000, 70000], index=df.index)
df['City'] = ['New York', 'Los Angeles', 'Chicago']

print(df)

Output

    Name Age  Salary         City
0   John  25   50000     New York
1   Mary  30   60000  Los Angeles
2  Peter  28   70000      Chicago
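
The 'Salary' and 'City' columns above were added in two different ways, and the difference matters when the index does not line up. The following sketch, which uses a small dataframe and a deliberately mismatched index chosen only for illustration, shows that a Series is aligned by index label while a list is assigned by position.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Mary', 'Peter']})

# A Series is aligned on index labels. Label 5 does not exist in df,
# so the value 70000 is dropped and the last row gets NaN.
df['Salary'] = pd.Series([50000, 60000, 70000], index=[0, 1, 5])

# A list is assigned by position and must have one value per row.
df['City'] = ['New York', 'Los Angeles', 'Chicago']

# A list of the wrong length is rejected with a ValueError.
try:
    df['Country'] = ['USA', 'USA']
except ValueError as err:
    print('Rejected:', err)

print(df)
```

Passing index=df.index when building the Series, as in the example above, guarantees that the labels match and no values are lost.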

Example 2

In this example, we first create a completely empty dataframe and then replace it with an empty dataframe that has 5 columns, by passing the column names ['Batsman', 'Runs', 'Balls', '4s', '6s'] to the columns parameter of the DataFrame constructor. Next, we use the pd.concat function to append 4 rows, ['MS Dhoni', 100, 80, 8, 1], ['Virat Kohli', 120, 100, 10, 2], ['Rohit Sharma', 100, 80, 8, 1], and ['Shikhar Dhawan', 80, 60, 6, 0], to the dataframe. Then, we append 2 columns, 'Strike Rate' and 'Average', to the dataframe.

The values for the 'Strike Rate' column are passed as a Series whose index is set to the index of the dataframe. The values for the 'Average' column are passed as a list. A list has no index of its own, so its values are assigned to the rows by position.

import pandas as pd

# Create an empty dataframe, then replace it with one that has five named columns
df = pd.DataFrame()
df = pd.DataFrame(columns=['Batsman', 'Runs', 'Balls', '4s', '6s'])

df = pd.concat([df, pd.DataFrame([['MS Dhoni', 100, 80, 8, 1]], columns=['Batsman', 'Runs', 'Balls', '4s', '6s'])], ignore_index=True)
df = pd.concat([df, pd.DataFrame([['Virat Kohli', 120, 100, 10, 2]], columns=['Batsman', 'Runs', 'Balls', '4s', '6s'])], ignore_index=True)
df = pd.concat([df, pd.DataFrame([['Rohit Sharma', 100, 80, 8, 1]], columns=['Batsman', 'Runs', 'Balls', '4s', '6s'])], ignore_index=True)
df = pd.concat([df, pd.DataFrame([['Shikhar Dhawan', 80, 60, 6, 0]], columns=['Batsman', 'Runs', 'Balls', '4s', '6s'])], ignore_index=True)

# Append two columns: one as a Series, one as a plain list
df['Strike Rate'] = pd.Series([125, 120, 125, 133], index=df.index)
df['Average'] = [100, 120, 100, 80]

print(df)

Output

          Batsman Runs Balls  4s 6s  Strike Rate  Average
0        MS Dhoni  100    80   8  1          125      100
1     Virat Kohli  120   100  10  2          120      120
2    Rohit Sharma  100    80   8  1          125      100
3  Shikhar Dhawan   80    60   6  0          133       80
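
Calling pd.concat once per row copies the whole dataframe each time, which becomes slow as the number of rows grows. When all the rows are known up front, one alternative is to collect them in a plain Python list and build the dataframe in a single step. The sketch below is one way to do that, not the approach used in the example above; it also derives 'Strike Rate' from the 'Runs' and 'Balls' columns (runs per 100 balls) instead of typing the values by hand, so the last value is 133.33 rather than the truncated 133.

```python
import pandas as pd

columns = ['Batsman', 'Runs', 'Balls', '4s', '6s']

# Collect the rows in an ordinary list first.
rows = [
    ['MS Dhoni', 100, 80, 8, 1],
    ['Virat Kohli', 120, 100, 10, 2],
    ['Rohit Sharma', 100, 80, 8, 1],
    ['Shikhar Dhawan', 80, 60, 6, 0],
]

# Build the dataframe from all rows at once.
df = pd.DataFrame(rows, columns=columns)

# Derive a new column from existing ones: runs scored per 100 balls.
df['Strike Rate'] = (df['Runs'] / df['Balls'] * 100).round(2)

print(df)
```

Because the rows are passed to the constructor directly, the numeric columns are stored as integers rather than generic Python objects, which keeps arithmetic such as the strike-rate calculation straightforward.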

Conclusion

We learned how to create an empty dataframe and how to append rows and columns to it using the Pandas library in Python. We also looked at the pd.DataFrame constructor, the pd.concat function, and the pd.Series constructor, along with the parameters used in the examples. This should be helpful for anyone who is starting to work with dataframes using the Pandas library in Python.

Updated on: 11-May-2023
