Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Importing Data in Python
When working with data analysis in Python, importing data from external sources is a fundamental skill. Python provides several powerful modules for reading different file formats including CSV, Excel, and databases. Let's explore the most common approaches for importing data.
Importing CSV Files
The built-in csv module allows us to read CSV files row by row using a specified delimiter. We open the file in read mode and iterate through each row ?
import csv
import io
# Sample CSV data
csv_data = """customerID,gender,Contract,PaperlessBilling,Churn
7590-VHVEG,Female,Month-to-month,Yes,No
5575-GNVDE,Male,One year,No,No
3668-QPYBK,Male,Month-to-month,Yes,Yes
7795-CFOCW,Male,One year,No,No"""
# Read CSV from string (simulating file read)
csv_file = io.StringIO(csv_data)
rows = csv.reader(csv_file, delimiter=',')
for row in rows:
print(row)
['customerID', 'gender', 'Contract', 'PaperlessBilling', 'Churn'] ['7590-VHVEG', 'Female', 'Month-to-month', 'Yes', 'No'] ['5575-GNVDE', 'Male', 'One year', 'No', 'No'] ['3668-QPYBK', 'Male', 'Month-to-month', 'Yes', 'Yes'] ['7795-CFOCW', 'Male', 'One year', 'No', 'No']
Using Pandas for Data Import
Pandas is the most versatile library for data import, supporting CSV, Excel, JSON, and many other formats. Here's how to read CSV data with pandas ?
import pandas as pd import io # Sample CSV data csv_data = """customerID,gender,Contract,PaperlessBilling,Churn 7590-VHVEG,Female,Month-to-month,Yes,No 5575-GNVDE,Male,One year,No,No 3668-QPYBK,Male,Month-to-month,Yes,Yes 7795-CFOCW,Male,One year,No,No 9237-HQITU,Female,Month-to-month,Yes,Yes""" # Read CSV using pandas df = pd.read_csv(io.StringIO(csv_data)) print(df.head())
customerID gender Contract PaperlessBilling Churn 0 7590-VHVEG Female Month-to-month Yes No 1 5575-GNVDE Male One year No No 2 3668-QPYBK Male Month-to-month Yes Yes 3 7795-CFOCW Male One year No No 4 9237-HQITU Female Month-to-month Yes Yes
Reading Excel Files
Pandas can also handle Excel files directly using read_excel() ?
import pandas as pd
# Read Excel file
df = pd.read_excel("customers.xlsx", sheet_name="customers")
print(df.head(10))
Database Connectivity with PyODBC
For importing data from databases, pyodbc provides database connectivity. Combined with pandas, you can execute SQL queries and load results directly into DataFrames ?
import pyodbc
import pandas as pd
# Establish database connection
connection_string = "Driver={SQL Server};Server=serverName;UID=UserName;PWD=Password;Database=sqldb;"
sql_conn = pyodbc.connect(connection_string)
# Execute SQL query and load into DataFrame
query = "SELECT * FROM customers LIMIT 10"
data_sql = pd.read_sql_query(query, sql_conn)
print(data_sql.head())
# Close connection
sql_conn.close()
Common File Formats Comparison
| Format | Pandas Function | Best For |
|---|---|---|
| CSV | read_csv() |
Simple tabular data |
| Excel | read_excel() |
Formatted spreadsheets |
| JSON | read_json() |
Web APIs, nested data |
| Database | read_sql() |
Large datasets, complex queries |
Conclusion
Pandas is the most efficient choice for data import in Python, supporting multiple formats with simple syntax. Use the built-in csv module for basic CSV processing, and pyodbc with pandas for database connectivity and SQL queries.
