Importing Data in Python

When working with data analysis in Python, importing data from external sources is a fundamental skill. Python provides several powerful modules for reading different file formats including CSV, Excel, and databases. Let's explore the most common approaches for importing data.

Importing CSV Files

The built-in csv module allows us to read CSV files row by row using a specified delimiter. We open the file in read mode and iterate through each row ?

import csv
import io

# Sample CSV data
csv_data = """customerID,gender,Contract,PaperlessBilling,Churn
7590-VHVEG,Female,Month-to-month,Yes,No
5575-GNVDE,Male,One year,No,No
3668-QPYBK,Male,Month-to-month,Yes,Yes
7795-CFOCW,Male,One year,No,No"""

# Read CSV from string (simulating file read)
csv_file = io.StringIO(csv_data)
rows = csv.reader(csv_file, delimiter=',')

for row in rows:
    print(row)
['customerID', 'gender', 'Contract', 'PaperlessBilling', 'Churn']
['7590-VHVEG', 'Female', 'Month-to-month', 'Yes', 'No']
['5575-GNVDE', 'Male', 'One year', 'No', 'No']
['3668-QPYBK', 'Male', 'Month-to-month', 'Yes', 'Yes']
['7795-CFOCW', 'Male', 'One year', 'No', 'No']

Using Pandas for Data Import

Pandas is the most versatile library for data import, supporting CSV, Excel, JSON, and many other formats. Here's how to read CSV data with pandas ?

import pandas as pd
import io

# Sample CSV data
csv_data = """customerID,gender,Contract,PaperlessBilling,Churn
7590-VHVEG,Female,Month-to-month,Yes,No
5575-GNVDE,Male,One year,No,No
3668-QPYBK,Male,Month-to-month,Yes,Yes
7795-CFOCW,Male,One year,No,No
9237-HQITU,Female,Month-to-month,Yes,Yes"""

# Read CSV using pandas
df = pd.read_csv(io.StringIO(csv_data))
print(df.head())
  customerID  gender        Contract PaperlessBilling Churn
0  7590-VHVEG  Female  Month-to-month              Yes    No
1  5575-GNVDE    Male        One year               No    No
2  3668-QPYBK    Male  Month-to-month              Yes   Yes
3  7795-CFOCW    Male        One year               No    No
4  9237-HQITU  Female  Month-to-month              Yes   Yes

Reading Excel Files

Pandas can also handle Excel files directly using read_excel() ?

import pandas as pd

# Read Excel file
df = pd.read_excel("customers.xlsx", sheet_name="customers")
print(df.head(10))

Database Connectivity with PyODBC

For importing data from databases, pyodbc provides database connectivity. Combined with pandas, you can execute SQL queries and load results directly into DataFrames ?

import pyodbc
import pandas as pd

# Establish database connection
connection_string = "Driver={SQL Server};Server=serverName;UID=UserName;PWD=Password;Database=sqldb;"
sql_conn = pyodbc.connect(connection_string)

# Execute SQL query and load into DataFrame
query = "SELECT * FROM customers LIMIT 10"
data_sql = pd.read_sql_query(query, sql_conn)
print(data_sql.head())

# Close connection
sql_conn.close()

Common File Formats Comparison

Format Pandas Function Best For
CSV read_csv() Simple tabular data
Excel read_excel() Formatted spreadsheets
JSON read_json() Web APIs, nested data
Database read_sql() Large datasets, complex queries

Conclusion

Pandas is the most efficient choice for data import in Python, supporting multiple formats with simple syntax. Use the built-in csv module for basic CSV processing, and pyodbc with pandas for database connectivity and SQL queries.

Updated on: 2026-03-15T18:13:20+05:30

17K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements