Join two text columns into a single column in Pandas

Combining text columns is a common data manipulation task in Pandas. When working with datasets containing multiple text fields like first name and last name, or address components, you'll often need to merge them into a single column for analysis or presentation.

Basic Syntax

The simplest way to join two text columns is the + operator:

# Basic concatenation
df['new_column'] = df['column1'] + df['column2']

# With separator
df['new_column'] = df['column1'] + ' ' + df['column2']

Method 1: Using the + Operator

The + operator provides direct string concatenation. You can add separators like spaces or other characters between columns:

import pandas as pd

# Create sample data
data = {'Name': ['John', 'Jane', 'Alice'],
        'Surname': ['Doe', 'Smith', 'Johnson']}
df = pd.DataFrame(data)

# Join columns with space separator
df['Full Name'] = df['Name'] + ' ' + df['Surname']

print(df)
    Name  Surname      Full Name
0   John      Doe       John Doe
1   Jane    Smith     Jane Smith
2  Alice  Johnson  Alice Johnson
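
The + operator assumes both columns already hold strings. As a minimal sketch, a numeric column can be converted with astype(str) before joining. The Age column and the Label column name are hypothetical additions for illustration, not part of the example above:

```python
import pandas as pd

# Hypothetical data: 'Age' is numeric, so it cannot be added to a string column directly
df = pd.DataFrame({'Name': ['John', 'Jane'], 'Age': [30, 25]})

# Convert the numeric column to strings first, then concatenate
df['Label'] = df['Name'] + ' (' + df['Age'].astype(str) + ')'

print(df['Label'].tolist())  # ['John (30)', 'Jane (25)']
```

Without the astype(str) call, adding a string column to an integer column raises an error instead of producing text.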

Method 2: Using str.cat() Method

The str.cat() method is designed specifically for string concatenation. It takes the separator as the sep argument and, through na_rep, lets you choose how missing values are represented:

import pandas as pd

# Create sample data
data = {'Name': ['John', 'Jane', 'Alice'],
        'Surname': ['Doe', 'Smith', 'Johnson']}
df = pd.DataFrame(data)

# Join using str.cat() with separator
df['Full Name'] = df['Name'].str.cat(df['Surname'], sep=' ')

print(df)
    Name  Surname      Full Name
0   John      Doe       John Doe
1   Jane    Smith     Jane Smith
2  Alice  Johnson  Alice Johnson

Handling Missing Values

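Both approaches return a missing value when either input is missing. The difference is that str.cat() accepts an na_rep argument, which substitutes a string for each missing value before joining:

import pandas as pd

# Data with missing values
data = {'First': ['John', 'Jane', None, 'Bob'],
        'Last': ['Doe', None, 'Johnson', 'Brown']}
df = pd.DataFrame(data)

# Using + operator (result is NaN if either side is missing)
df['Full_Plus'] = df['First'] + ' ' + df['Last']

# Using str.cat() with na_rep='' (missing values become empty strings)
df['Full_Cat'] = df['First'].str.cat(df['Last'], sep=' ', na_rep='')

print(df)
   First     Last  Full_Plus   Full_Cat
0   John      Doe   John Doe   John Doe
1   Jane     None        NaN      Jane
2   None  Johnson        NaN    Johnson
3    Bob    Brown  Bob Brown  Bob Brown

Because the separator is still inserted, row 1 of Full_Cat is 'Jane ' with a trailing space and row 2 is ' Johnson' with a leading space. Depending on the pandas version, the missing entries in First and Last may print as NaN rather than None.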
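
To make the effect of na_rep explicit, the following sketch compares three variants on the same data: str.cat() without na_rep, str.cat() with na_rep='', and the latter followed by str.strip(). The variable names and the strip-based cleanup are illustrative choices, one possible approach rather than the only one:

```python
import pandas as pd

df = pd.DataFrame({'First': ['John', 'Jane', None, 'Bob'],
                   'Last': ['Doe', None, 'Johnson', 'Brown']})

# Without na_rep, a missing value on either side makes the result missing
default = df['First'].str.cat(df['Last'], sep=' ')

# With na_rep='', missing values become empty strings, but the separator stays
filled = df['First'].str.cat(df['Last'], sep=' ', na_rep='')

# Strip the leftover leading or trailing separator
cleaned = filled.str.strip()

print(default.isna().tolist())  # [False, True, True, False]
print(filled.tolist())          # ['John Doe', 'Jane ', ' Johnson', 'Bob Brown']
print(cleaned.tolist())         # ['John Doe', 'Jane', 'Johnson', 'Bob Brown']
```

Stripping works here because the separator is a single space. A separator such as ', ' would need a different cleanup step.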

Joining Multiple Columns

You can concatenate more than two columns using either method:

import pandas as pd

# Create sample data with multiple columns
data = {'Title': ['Mr.', 'Ms.', 'Dr.'],
        'First': ['John', 'Jane', 'Alice'],
        'Last': ['Doe', 'Smith', 'Johnson']}
df = pd.DataFrame(data)

# Method 1: Using + operator
df['Full_Name_Plus'] = df['Title'] + ' ' + df['First'] + ' ' + df['Last']

# Method 2: Using str.cat() with list
df['Full_Name_Cat'] = df['Title'].str.cat([df['First'], df['Last']], sep=' ')

print(df)
   Title  First     Last     Full_Name_Plus      Full_Name_Cat
0    Mr.   John      Doe       Mr. John Doe       Mr. John Doe
1    Ms.   Jane    Smith     Ms. Jane Smith     Ms. Jane Smith
2    Dr.  Alice  Johnson  Dr. Alice Johnson  Dr. Alice Johnson
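
When the number of columns grows, listing each one by hand becomes tedious. As a sketch of one way around that, str.cat() also accepts a DataFrame for its others argument, so the columns can be driven from a list. The cols list and the Full_Name column are illustrative names, and the example assumes every selected column holds strings:

```python
import pandas as pd

df = pd.DataFrame({'Title': ['Mr.', 'Ms.', 'Dr.'],
                   'First': ['John', 'Jane', 'Alice'],
                   'Last': ['Doe', 'Smith', 'Johnson']})

# Columns to join, in order (illustrative name)
cols = ['Title', 'First', 'Last']

# Start from the first column and pass the remaining ones as a DataFrame
df['Full_Name'] = df[cols[0]].str.cat(df[cols[1:]], sep=' ')

print(df['Full_Name'].tolist())
# ['Mr. John Doe', 'Ms. Jane Smith', 'Dr. Alice Johnson']
```

Changing the join then only requires editing the list, and sep and na_rep behave the same as in the two-column case.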

Comparison

Method     | Syntax       | NaN Handling                                 | Best For
+ operator | Simple       | Result is NaN if any input is missing        | Clean data, simple concatenation
str.cat()  | More options | Same by default; na_rep substitutes a string | Missing values, complex joining

Conclusion

Use the + operator for simple text concatenation when data is clean. Choose str.cat() when you need better control over separators and missing value handling for more robust data processing.

Updated on: 2026-03-27T15:01:46+05:30
