Extract csv file specific columns to list in Python
To extract specific columns from a CSV file into a list in Python, we can use the pandas read_csv() function with the usecols parameter. This loads only the columns we need, which reduces memory use and parsing time on wide files.
Steps to Extract Specific Columns
Create a list of column names that need to be extracted
Use the read_csv() method with the usecols parameter to extract specific columns
Convert the extracted columns to lists if needed
Process or visualize the extracted data
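The steps above can be sketched as follows. This is a minimal illustration, assuming the CSV content is held in an in-memory string (the hypothetical `csv_text`) so that it runs without a file on disk. With a real file, the path would be passed to read_csv() instead of the io.StringIO object.

```python
import io

import pandas as pd

# Hypothetical CSV content, kept in memory so the sketch is self-contained
csv_text = """Name,Marks,Age,City
Arun,98,25,Mumbai
Shyam,75,23,Delhi
Govind,54,24,Bangalore
"""

# Step 1: list the column names to extract
columns_to_extract = ["Name", "Marks"]

# Step 2: read only those columns; Age and City are skipped while parsing
df = pd.read_csv(io.StringIO(csv_text), usecols=columns_to_extract)

# Step 3: convert each extracted column to a plain Python list
names = df["Name"].tolist()
marks = df["Marks"].tolist()

print(names)
print(marks)
```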
Basic Column Extraction
The examples in this article build a small DataFrame in memory to stand in for data loaded from a CSV file:
import pandas as pd
# Sample data creation for demonstration
data = {
'Name': ['Arun', 'Shyam', 'Govind', 'Javed', 'Raju'],
'Marks': [98, 75, 54, 92, 87],
'Age': [25, 23, 24, 26, 22],
'City': ['Mumbai', 'Delhi', 'Bangalore', 'Chennai', 'Pune']
}
df_sample = pd.DataFrame(data)
print("Sample CSV data:")
print(df_sample)
Sample CSV data:
Name Marks Age City
0 Arun 98 25 Mumbai
1 Shyam 75 23 Delhi
2 Govind 54 24 Bangalore
3 Javed 92 26 Chennai
4 Raju 87 22 Pune
Extracting Specific Columns
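Once the data is loaded, select only the columns you need by indexing the DataFrame with a list of column names, then convert each column to a list with tolist(). When reading from a file, passing the same list to the usecols parameter of read_csv() selects the same columns at load time: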
import pandas as pd
# Sample data
data = {
'Name': ['Arun', 'Shyam', 'Govind', 'Javed', 'Raju'],
'Marks': [98, 75, 54, 92, 87],
'Age': [25, 23, 24, 26, 22],
'City': ['Mumbai', 'Delhi', 'Bangalore', 'Chennai', 'Pune']
}
df_full = pd.DataFrame(data)
# Extract specific columns
columns_to_extract = ["Name", "Marks"]
df_selected = df_full[columns_to_extract]
print("Extracted columns:")
print(df_selected)
# Convert columns to lists
names_list = df_selected['Name'].tolist()
marks_list = df_selected['Marks'].tolist()
print("\nName column as list:", names_list)
print("Marks column as list:", marks_list)
Extracted columns:
Name Marks
0 Arun 98
1 Shyam 75
2 Govind 54
3 Javed 92
4 Raju 87
Name column as list: ['Arun', 'Shyam', 'Govind', 'Javed', 'Raju']
Marks column as list: [98, 75, 54, 92, 87]
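When several columns are needed as lists at once, calling tolist() on each one gets repetitive. One possible alternative, shown here as a sketch rather than the article's own method, is DataFrame.to_dict(orient="list"), which returns a dictionary mapping each column name to a list of its values. The variable name `column_lists` is illustrative only.

```python
import pandas as pd

# Same sample data as in the example above
data = {
    "Name": ["Arun", "Shyam", "Govind", "Javed", "Raju"],
    "Marks": [98, 75, 54, 92, 87],
    "Age": [25, 23, 24, 26, 22],
    "City": ["Mumbai", "Delhi", "Bangalore", "Chennai", "Pune"],
}
df_full = pd.DataFrame(data)

# Select the wanted columns, then convert all of them to lists in one call
column_lists = df_full[["Name", "Marks"]].to_dict(orient="list")

print(column_lists["Name"])
print(column_lists["Marks"])
```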
Visualizing the Extracted Data
Create a simple plot using the extracted columns:
import pandas as pd
import matplotlib.pyplot as plt
# Sample data
data = {
'Name': ['Arun', 'Shyam', 'Govind', 'Javed', 'Raju'],
'Marks': [98, 75, 54, 92, 87]
}
df = pd.DataFrame(data)
# Configure plot settings
plt.figure(figsize=(8, 5))
plt.plot(df['Name'], df['Marks'], marker='o', linewidth=2, markersize=8)
plt.title('Student Marks')
plt.xlabel('Student Names')
plt.ylabel('Marks')
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
# Display basic info
print("Extracted data:")
print(df)
print(f"\nTotal students: {len(df)}")
print(f"Average marks: {df['Marks'].mean():.1f}")
Extracted data:
Name Marks
0 Arun 98
1 Shyam 75
2 Govind 54
3 Javed 92
4 Raju 87
Total students: 5
Average marks: 81.2
Multiple Column Extraction Methods
| Method | Syntax | Use Case |
|---|---|---|
| Column indexing | df[['col1', 'col2']] | After loading full DataFrame |
| usecols parameter | pd.read_csv('file.csv', usecols=['col1']) | During CSV reading (memory efficient) |
| Column positions | pd.read_csv('file.csv', usecols=[0, 2]) | When you know column positions |
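The three methods in the table can be compared side by side. The sketch below assumes a small in-memory CSV (the hypothetical `csv_text`) in place of 'file.csv'. It also illustrates one detail worth knowing: usecols ignores the order of the names it is given and keeps the file's column order, so a specific order has to be applied by indexing afterwards.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'file.csv'
csv_text = "Name,Marks,Age,City\nArun,98,25,Mumbai\nShyam,75,23,Delhi\n"

# Method 1: load every column, then index the DataFrame
full = pd.read_csv(io.StringIO(csv_text))
by_indexing = full[["Name", "Age"]]

# Method 2: select by column name while reading
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["Name", "Age"])

# Method 3: select by zero-based column position while reading
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

# All three approaches yield the same two columns
print(by_indexing.equals(by_name))
print(by_name.equals(by_position))

# usecols keeps the file's column order regardless of the order requested
file_order = pd.read_csv(io.StringIO(csv_text), usecols=["Age", "Name"])
print(list(file_order.columns))

# Index afterwards to enforce a specific column order
reordered = file_order[["Age", "Name"]]
print(list(reordered.columns))
```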
Conclusion
Use pd.read_csv() with the usecols parameter for memory-efficient column extraction. Convert DataFrame columns to lists with the .tolist() method when plain Python lists are needed for further processing.
