Reshape Data: Concatenate - Problem
DataFrame Concatenation Challenge

You're working as a data analyst and need to combine student records from two separate DataFrames into one comprehensive dataset. Both DataFrames share the same structure, with the columns student_id, name, and age.

Goal: Concatenate two DataFrames vertically (stack them on top of each other) to create a single unified DataFrame containing all student records.

Input: Two DataFrames with identical column structures
Output: One DataFrame containing all rows from both input DataFrames
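
A minimal sketch of one way to meet this input/output contract, assuming pandas is imported as pd. The function name concatenate_tables is a hypothetical choice made for illustration; the problem statement does not prescribe one.

```python
import pandas as pd


def concatenate_tables(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Stack df2 below df1 (hypothetical helper name, not from the problem)."""
    # ignore_index=True discards the original row labels and renumbers 0..n-1
    return pd.concat([df1, df2], ignore_index=True)


df1 = pd.DataFrame({"student_id": [1, 2], "name": ["Alice", "Bob"], "age": [20, 21]})
df2 = pd.DataFrame({"student_id": [3, 4], "name": ["Charlie", "Diana"], "age": [22, 19]})

combined = concatenate_tables(df1, df2)
print(combined)
```

Passing the inputs as a list lets the same call extend to more than two DataFrames without changing the function body.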

This is a fundamental data manipulation operation used frequently in data preprocessing and ETL (Extract, Transform, Load) processes.

Input & Output

basic_concatenation.py - Python
Input:
    df1 = pd.DataFrame({'student_id': [1, 2], 'name': ['Alice', 'Bob'], 'age': [20, 21]})
    df2 = pd.DataFrame({'student_id': [3, 4], 'name': ['Charlie', 'Diana'], 'age': [22, 19]})
Output:
       student_id     name  age
    0           1    Alice   20
    1           2      Bob   21
    2           3  Charlie   22
    3           4    Diana   19
Note: The two DataFrames are stacked vertically, and the result has a new sequential index from 0 to 3.

single_row_dataframes.py - Python
Input:
    df1 = pd.DataFrame({'student_id': [100], 'name': ['Eve'], 'age': [25]})
    df2 = pd.DataFrame({'student_id': [200], 'name': ['Frank'], 'age': [23]})
Output:
       student_id   name  age
    0         100    Eve   25
    1         200  Frank   23
Note: Concatenation works the same way with single-row DataFrames, producing a two-row result.

empty_dataframe_edge_case.py - Python
Input:
    df1 = pd.DataFrame({'student_id': [1, 2], 'name': ['Alice', 'Bob'], 'age': [20, 21]})
    df2 = pd.DataFrame(columns=['student_id', 'name', 'age'])
Output:
       student_id   name   age
    0           1  Alice  20.0
    1           2    Bob  21.0
Note: When one DataFrame is empty, the result contains only the rows of the non-empty DataFrame. Column data types may change (age is shown here as a float rather than an integer), and the exact result dtypes can depend on the pandas version.
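
If the dtype change in the empty-DataFrame case is unwanted, one possible workaround is to drop empty inputs before concatenating. The sketch below assumes that approach; concat_skipping_empty is a hypothetical helper name, not part of pandas or of the problem statement.

```python
import pandas as pd


def concat_skipping_empty(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Concatenate only the non-empty inputs (hypothetical helper)."""
    frames = [df for df in (df1, df2) if not df.empty]
    if not frames:
        # Both inputs are empty: return an empty frame with df1's columns and dtypes
        return df1.iloc[0:0].copy()
    return pd.concat(frames, ignore_index=True)


df1 = pd.DataFrame({"student_id": [1, 2], "name": ["Alice", "Bob"], "age": [20, 21]})
df2 = pd.DataFrame(columns=["student_id", "name", "age"])  # no rows

result = concat_skipping_empty(df1, df2)
print(result.dtypes)
```

Because the empty DataFrame never reaches pd.concat, the column dtypes of df1 carry over to the result unchanged.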

Constraints

  • Both DataFrames have identical column structure
  • DataFrames can have 0 to 10^6 rows each
  • Column names must match exactly
  • Data types should be compatible for proper concatenation
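
The column-name constraint matters because pd.concat does not reject mismatched columns: it aligns on column labels and fills the gaps with NaN. The following sketch shows that behavior alongside an explicit check; check_same_columns is a hypothetical helper, and whether to raise an error or handle the mismatch differently is an assumption made for illustration.

```python
import pandas as pd


def check_same_columns(df1: pd.DataFrame, df2: pd.DataFrame) -> None:
    """Raise if the column names differ (hypothetical validation helper)."""
    if list(df1.columns) != list(df2.columns):
        raise ValueError(
            f"Column mismatch: {list(df1.columns)} vs {list(df2.columns)}"
        )


df1 = pd.DataFrame({"student_id": [1], "name": ["Alice"], "age": [20]})
df_bad = pd.DataFrame({"student_id": [2], "Name": ["Bob"], "age": [21]})  # 'Name' != 'name'

# Without a check, concat keeps both 'name' and 'Name' and fills gaps with NaN
mismatched = pd.concat([df1, df_bad], ignore_index=True)
has_missing = bool(mismatched.isna().any().any())

# With the check, the mismatch is reported before any data is combined
try:
    check_same_columns(df1, df_bad)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```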

Visualization

[Figure: Vertical DataFrame Concatenation. Student Records A (Alice, 20; Bob, 21) and Student Records B (Charlie, 22; Diana, 19) are passed to pd.concat([df1, df2], ignore_index=True), producing Combined Records with a new sequential index: 0 Alice 20, 1 Bob 21, 2 Charlie 22, 3 Diana 19. Caption: O(n) time complexity.]

Understanding the Visualization

  1. Identify Source DataFrames: two separate DataFrames with identical column structure.
  2. Apply concat(): pd.concat([df1, df2]) combines the rows of both inputs into a new DataFrame.
  3. Index Management: a new sequential index (0, 1, 2, 3, ...) is created when ignore_index=True.
  4. Return Unified DataFrame: a single DataFrame containing all rows from both inputs.
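
The index-management step is easiest to see by running the same concatenation with and without ignore_index. This is a small illustrative sketch using the student records from the first example; the variable names kept and renumbered are chosen here for clarity.

```python
import pandas as pd

df1 = pd.DataFrame({"student_id": [1, 2], "name": ["Alice", "Bob"], "age": [20, 21]})
df2 = pd.DataFrame({"student_id": [3, 4], "name": ["Charlie", "Diana"], "age": [22, 19]})

# Default behavior: each input keeps its own row labels, so labels repeat
kept = pd.concat([df1, df2])

# ignore_index=True: original labels are dropped and rows are renumbered
renumbered = pd.concat([df1, df2], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0, 1]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Duplicate labels in the default result mean that a label-based lookup such as kept.loc[0] returns two rows, which is why the expected output for this problem uses the renumbered form.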

Key Takeaway

Key Insight: A single pd.concat() call takes time roughly linear, O(n), in the total number of rows, because each input is copied into the result once. This is why passing all DataFrames to pd.concat() in one list is the preferred way to combine them, rather than growing a DataFrame piece by piece in a loop.
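
To illustrate the performance point, the sketch below builds the same result two ways: by concatenating inside a loop, which recopies the accumulated rows on every iteration, and by concatenating a list once. The batches data is made up for the example and is not part of the problem.

```python
import pandas as pd

# Three small made-up batches of records (illustrative data only)
batches = [
    pd.DataFrame({"student_id": [2 * i + 1, 2 * i + 2], "age": [20 + i, 21 + i]})
    for i in range(3)
]

# Pattern to avoid: every iteration copies all rows accumulated so far
grown = batches[0]
for batch in batches[1:]:
    grown = pd.concat([grown, batch], ignore_index=True)

# Preferred: collect the pieces, then copy each row once in a single call
once = pd.concat(batches, ignore_index=True)

print(once.equals(grown))  # True
```

Both approaches give identical output; the difference is only in how much copying happens as the number of batches grows.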