Create a Pandas DataFrame from lists
A Pandas DataFrame is a two-dimensional table with labeled rows and columns. It is mutable: values can be updated, and columns or rows can be added or removed, after the DataFrame is created. Creating a DataFrame from scratch with lists is a common task in data science and information technology. A list is an ordered collection of elements, and it is one of the most commonly used data structures in Python. A list can store values of any type, such as numbers, strings and Boolean values.
In this document, I will explain in detail how to create a Pandas DataFrame from lists, using simple examples, step-by-step instructions, code snippets and an explanation of each subsection.
What are the key differences between dataframe and list?
A list is a basic data structure in Python that can hold a collection of elements of any data type, while a dataframe is a two-dimensional table-like structure, similar to a spreadsheet or SQL table, that stores data in rows and columns. Here are some key differences between a dataframe and a list −
Structure − A list is a simple, one-dimensional collection of values, while a dataframe is a two-dimensional table-like structure that has rows and columns.
Data types − A list can hold elements of any data type, including numbers, strings, and even other lists, while a dataframe is designed to hold data in a tabular format, with columns of specific data types, such as integers, floats, and strings.
Size − Both a list and a dataframe are limited only by available memory, but a dataframe stores each column in a typed array, which makes it practical to work with tables of many thousands or millions of rows.
Operations − A list supports basic operations such as indexing, slicing, and appending, while a dataframe supports more complex operations such as filtering, joining, and grouping.
Data manipulation − A list provides basic functionality for data manipulation, while a dataframe provides powerful tools for data manipulation, such as filtering, sorting, and aggregating data based on specific criteria.
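To make these differences concrete, here is a minimal sketch that performs the same kind of task on a plain list and on a dataframe. It reuses the sample names and ages from later in this article; the variable names `people`, `over_30` and `mean_age` are chosen only for illustration.

```python
import pandas as pd

# A plain list is one-dimensional and supports indexing, slicing and appending.
ages = [32, 25, 41, 29, 36]
first_two = ages[:2]                                   # slicing
ages_over_30 = [age for age in ages if age > 30]       # manual filtering

# A dataframe is two-dimensional: each column has a name and a data type.
people = pd.DataFrame({
    "Name": ["John", "Mary", "Peter", "Jane", "Daniel"],
    "Age": ages,
})

# Filtering and aggregating work on whole columns at once.
over_30 = people[people["Age"] > 30]
mean_age = people["Age"].mean()

print(first_two)
print(ages_over_30)
print(over_30)
print(mean_age)
```

With the list, filtering has to be written by hand as a loop or comprehension, while the dataframe lets us express the condition on the whole `Age` column and keeps the matching `Name` values attached to each row.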
Prerequisites
Before we dive into the task, a few things are expected to be installed on your system −
List of recommended settings −
pip install pandas bokeh
Only pandas is needed for the examples in this article.
It is expected that the user will have access to an IDE or code editor such as VS Code, PyCharm, Atom or Sublime Text.
Online Python environments such as Kaggle.com or Google Cloud Platform can also be used.
A recent version of Python. At the time of writing this article, I used version 3.10.9.
Knowledge of the use of Jupyter notebook.
Knowledge and application of virtual environment would be beneficial but not required.
It is also expected that the person will have a good understanding of statistics and mathematics.
Steps required
Importing Libraries
To create a DataFrame in Pandas, we need to import the Pandas library. The following code is used to import the Pandas library −
import pandas as pd
Creating Lists
Before we can create a DataFrame using lists, we first need to create lists to store the data. In this section, I will show you how to create lists with real-world examples using simple data.
Creating a List of Names
names = ['John', 'Mary', 'Peter', 'Jane', 'Daniel']
In the code snippet above, we created a list called `names` that contains five string values representing the names of individuals.
Creating a List of Ages
ages = [32, 25, 41, 29, 36]
In the code snippet above, we created a list called `ages` that contains five integer values representing the ages of individuals.
Creating a List of Boolean Values
current_status = [True, False, True, False, True]
In the code snippet above, we created a list called `current_status` that contains five Boolean values representing the current status of individuals.
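Because the three lists describe the same five people, they should all have the same length before they are combined into rows. The following is a small sanity check, offered as a sketch rather than a required step; the names `lengths` and `same_length` are illustrative.

```python
names = ['John', 'Mary', 'Peter', 'Jane', 'Daniel']
ages = [32, 25, 41, 29, 36]
current_status = [True, False, True, False, True]

# Collect the length of each list and confirm they all match.
lengths = [len(names), len(ages), len(current_status)]
same_length = len(set(lengths)) == 1

print(lengths)
print(same_length)

if not same_length:
    raise ValueError(f"Lists have different lengths: {lengths}")
```

This matters because `zip()` stops at the shortest input by default, so a list that is too short would silently drop rows instead of raising an error.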
Creating a DataFrame from Lists
Once we have the lists containing the data, we can use the `pd.DataFrame()` function to create a DataFrame in Pandas. We can pass the lists as arguments to the `pd.DataFrame()` function. The following code is used to create a DataFrame from lists −
df = pd.DataFrame(list(zip(names, ages, current_status)), columns=['Name', 'Age', 'Current_Status'])
In the code snippet above, we first combined the lists with the `zip()` function. `zip()` returns an iterator that pairs up the elements at the same position in each list, and wrapping it in `list()` turns that iterator into a list of tuples, one tuple per row. We then passed this list of tuples as the first argument to `pd.DataFrame()`.
The `columns` keyword argument to `pd.DataFrame()` is a list of column names for the DataFrame. In this case, we used `columns=['Name', 'Age', 'Current_Status']` to specify the column names as `Name`, `Age` and `Current_Status`.
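As an alternative sketch, the same DataFrame can be built by passing a dictionary that maps each column name to its list. This is not the approach used in the rest of this article, but it is a common way to achieve the same result; the names `df_from_zip` and `df_from_dict` are chosen only for this comparison.

```python
import pandas as pd

names = ['John', 'Mary', 'Peter', 'Jane', 'Daniel']
ages = [32, 25, 41, 29, 36]
current_status = [True, False, True, False, True]

# Approach from this article: rows built with zip(), column names given separately.
df_from_zip = pd.DataFrame(
    list(zip(names, ages, current_status)),
    columns=['Name', 'Age', 'Current_Status'],
)

# Alternative: each dictionary key becomes a column name, each list a column.
df_from_dict = pd.DataFrame({
    'Name': names,
    'Age': ages,
    'Current_Status': current_status,
})

# Compare the two results.
print(df_from_zip.equals(df_from_dict))
```

The dictionary form keeps each column name next to its data, which can be easier to read when there are many columns. The `zip()` form is convenient when the data is already organized row by row.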
Viewing the DataFrame
After creating the DataFrame, we can use the `.head()` method to view its first rows. By default, `.head()` returns the first five rows. The following code is used to display the first few rows of the DataFrame −
print(df.head())
Because our DataFrame has exactly five rows, `.head()` displays all of it here.
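Beyond `.head()`, a few other attributes are useful for a quick check that the DataFrame was built as intended. The sketch below rebuilds the same DataFrame so that it runs on its own, then passes a row count to `.head()` and inspects `shape` and `dtypes`.

```python
import pandas as pd

names = ['John', 'Mary', 'Peter', 'Jane', 'Daniel']
ages = [32, 25, 41, 29, 36]
current_status = [True, False, True, False, True]

df = pd.DataFrame(
    list(zip(names, ages, current_status)),
    columns=['Name', 'Age', 'Current_Status'],
)

# head(n) returns only the first n rows.
print(df.head(2))

# shape is a (rows, columns) tuple.
print(df.shape)

# dtypes lists the data type pandas inferred for each column.
print(df.dtypes)
```

Checking `dtypes` is a quick way to confirm that the ages were read as integers and the status flags as Booleans rather than as generic objects.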
Output
     Name  Age  Current_Status
0    John   32            True
1    Mary   25           False
2   Peter   41            True
3    Jane   29           False
4  Daniel   36            True
In the above output, we can see the DataFrame that was created from the lists.
Conclusion
In this document, I provided a step-by-step guide on how to create a Pandas DataFrame from lists. I demonstrated how to import the Pandas library, how to create lists, and how to create a DataFrame using `pd.DataFrame()`. Additionally, I showed how to view the first few rows of the DataFrame using the `.head()` method. By following these instructions, you should now be able to create a Pandas DataFrame from your own lists.