How to convert CSV columns to text in Python?

CSV (Comma Separated Values) files are commonly used to store and exchange tabular data. However, there may be situations where you need to convert the data in CSV columns to text format, for example, to use it as input for natural language processing tasks or data analysis.

Python provides several tools and libraries that can help with this task. In this tutorial, we will explore different methods for converting CSV columns to text in Python using the Pandas library.

Approach

  • Load the CSV file into a pandas DataFrame using the read_csv() function.

  • Extract the desired column from the DataFrame using indexing, and convert it to text using astype(str).

  • Join the resulting strings using the join() method to create a single text string.

This approach reads in the CSV file with pandas, converts the desired column to text format, and then joins the resulting strings into a single text string for further processing.

Suppose we have a CSV file named input.csv that contains the following data:

input.csv

Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson
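
The examples in this tutorial embed the sample data in a string so that they run without any files. To read the same data from disk instead, a minimal sketch might look like the following. The temporary directory is an assumption made only to keep the sketch self-contained; in practice input.csv would already exist and only the read_csv() call would be needed.

```python
import tempfile
from pathlib import Path

import pandas as pd

csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson
"""

# Write the sample to a temporary input.csv so the sketch is self-contained;
# normally the file would already be on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "input.csv"
    csv_path.write_text(csv_data, encoding="utf-8")

    # read_csv() accepts a file path as well as a file-like object
    df = pd.read_csv(csv_path)

# Convert the Age column to text and join the values with spaces
age_text = " ".join(df["Age"].astype(str))
print(age_text)
```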

Converting Specific Column of CSV into Text

Here's how to select a specific column (the Age column in this case) and convert it to text format:

Example

import pandas as pd
import io

# Sample CSV data
csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson"""

# Read the CSV data into a pandas DataFrame
df = pd.read_csv(io.StringIO(csv_data))

# Select the second column (Age) and convert it to text
text_series = df.iloc[:, 1].astype(str)

# Join the text Series into a single string
text_string = ' '.join(text_series)

# Print the resulting text string
print("Age column as text:", text_string)
print("Data type:", type(text_string))

Output

Age column as text: 32 28 45
Data type: <class 'str'>

How It Works

  • Import the Pandas library and use read_csv() to read the CSV data into a DataFrame.

  • Use iloc[:, 1] to select the second column (Age). iloc selects by integer position, and positions start at 0.

  • Convert the selected column to text using the astype(str) method.

  • Join all values into a single string using the join() method with a space as the separator.
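
Selecting by position stops working as intended if the column order changes, so selecting by name is often safer. Real files may also contain blank cells, which do not always survive a text conversion cleanly. The sketch below uses a hypothetical variant of the sample data in which Jane's age is missing. Dropping the missing entry is an assumption; filling it with a placeholder may suit other cases better.

```python
import io

import pandas as pd

# Hypothetical variant of the sample data: Jane's age is missing
csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,,Teacher
Bob,45,Salesperson"""

# dtype=str reads every value as text, so 32 stays "32" instead of
# becoming 32.0 when the column contains a missing value
df = pd.read_csv(io.StringIO(csv_data), dtype=str)

# Select the column by name and drop missing entries before joining
age_values = df["Age"].dropna()
age_text = " ".join(age_values)

print("Age column as text:", age_text)
```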

Converting All Columns of CSV into Text

To convert all columns of the CSV file into separate text strings, we can iterate through each column and apply the same conversion process:

Example

import pandas as pd
import io

# Sample CSV data
csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson"""

# Read the CSV data into a pandas DataFrame
df = pd.read_csv(io.StringIO(csv_data))

# Convert all columns to text Series
text_series_list = [df[col].astype(str) for col in df.columns]

# Join each text Series into a single string
text_strings = [' '.join(text_series) for text_series in text_series_list]

# Print the resulting text strings
for col, text_string in zip(df.columns, text_strings):
    print(f"{col} column: {text_string}")

Output

Name column: John Jane Bob
Age column: 32 28 45
Occupation column: Engineer Teacher Salesperson
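
If the text needs to be looked up by column name later, a dictionary can be more convenient than a list. The following is one possible variation on the example above, not the only way to structure it; the name column_text is chosen here for illustration.

```python
import io

import pandas as pd

csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson"""

df = pd.read_csv(io.StringIO(csv_data))

# Map each column name to the space-joined text of its values
column_text = {col: " ".join(df[col].astype(str)) for col in df.columns}

print(column_text["Occupation"])
```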

Converting Columns with Custom Separator

You can also use different separators when joining the text values:

Example

import pandas as pd
import io

# Sample CSV data
csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson"""

df = pd.read_csv(io.StringIO(csv_data))

# Convert Name column with different separators
name_column = df['Name'].astype(str)

print("With comma separator:", ', '.join(name_column))
print("With pipe separator:", ' | '.join(name_column))
print("With newline separator:")
print('\n'.join(name_column))

Output

With comma separator: John, Jane, Bob
With pipe separator: John | Jane | Bob
With newline separator:
John
Jane
Bob
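
When the same conversion is needed in several places, the steps can be wrapped in a small function that takes the separator as a parameter. The helper column_to_text below is hypothetical (it is not part of pandas) and is only a sketch of how this might be organized.

```python
import io

import pandas as pd


def column_to_text(df: pd.DataFrame, column: str, sep: str = " ") -> str:
    """Join the values of one column into a single string (hypothetical helper)."""
    return sep.join(df[column].astype(str))


csv_data = """Name,Age,Occupation
John,32,Engineer
Jane,28,Teacher
Bob,45,Salesperson"""

df = pd.read_csv(io.StringIO(csv_data))

print(column_to_text(df, "Name", sep=", "))
print(column_to_text(df, "Age"))
```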

Comparison of Methods

Method           | Use Case                             | Output Format
Single Column    | Extract specific column data         | Single text string
All Columns      | Convert entire CSV to text           | List of text strings
Custom Separator | Format text with specific delimiters | Formatted text string

Conclusion

Converting CSV columns to text in Python is straightforward using Pandas. Use astype(str) to convert columns to text format and join() to combine values into single strings. This method is useful for text analysis and natural language processing tasks.

Updated on: 2026-03-27T01:20:04+05:30
