Reshape Data: Concatenate

Topic: Database | Difficulty: Easy

You are given two DataFrames with identical structure:

DataFrame df1

Column Name	Type
student_id	int
name	object
age	int

DataFrame df2

Column Name	Type
student_id	int
name	object
age	int

Write a function to concatenate these two DataFrames vertically into one DataFrame. The rows from df2 should be appended below the rows from df1.

Return the concatenated DataFrame.

Input & Output

Example 1 — Basic Concatenation

Input: df1 = [{"student_id": 1, "name": "Alice", "age": 20}, {"student_id": 2, "name": "Bob", "age": 22}], df2 = [{"student_id": 3, "name": "Charlie", "age": 19}, {"student_id": 4, "name": "Diana", "age": 21}]

Output: [{"student_id": 1, "name": "Alice", "age": 20}, {"student_id": 2, "name": "Bob", "age": 22}, {"student_id": 3, "name": "Charlie", "age": 19}, {"student_id": 4, "name": "Diana", "age": 21}]

💡 Note: All rows from df1 come first, followed by all rows from df2; the original order within each DataFrame is preserved.

Example 2 — Single Row DataFrames

Input: df1 = [{"student_id": 1, "name": "Eve", "age": 23}], df2 = [{"student_id": 2, "name": "Frank", "age": 24}]

Output: [{"student_id": 1, "name": "Eve", "age": 23}, {"student_id": 2, "name": "Frank", "age": 24}]

💡 Note: Each DataFrame has only one row, so the result is a two-row DataFrame containing both.

Example 3 — Empty DataFrame

Input: df1 = [{"student_id": 5, "name": "Grace", "age": 20}], df2 = []

Output: [{"student_id": 5, "name": "Grace", "age": 20}]

💡 Note: When one DataFrame is empty, the result contains only the rows from the non-empty DataFrame.
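The empty-input case from Example 3 can be reproduced with a short sketch. It assumes pandas is imported as `pd` and that the empty frame still carries the shared column names and types; the variable names are illustrative only and not part of the problem statement.

```python
import pandas as pd

df1 = pd.DataFrame([{"student_id": 5, "name": "Grace", "age": 20}])

# An empty frame with the same columns, typed to match df1 (an assumption
# made here so the result's column types stay predictable).
df2 = pd.DataFrame(columns=["student_id", "name", "age"]).astype(
    {"student_id": "int64", "name": "object", "age": "int64"}
)

result = pd.concat([df1, df2], ignore_index=True)
print(len(result))  # 1
print(result.to_dict("records"))
```

If the empty frame had no type information, the column types of the result could differ between pandas versions, so the sketch avoids relying on them.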

Constraints

Both DataFrames have identical column structure
0 ≤ number of rows in each DataFrame ≤ 1000
Column names are consistent across both DataFrames
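Because the constraints guarantee identical columns, a solution does not need to validate its inputs. Outside that guarantee, `pd.concat` aligns frames by column name and fills missing cells with NaN instead of raising an error. The sketch below shows that behaviour together with a hypothetical guard, `check_same_columns`, which is not part of pandas or of the problem.

```python
import pandas as pd


def check_same_columns(df1: pd.DataFrame, df2: pd.DataFrame) -> None:
    """Hypothetical helper: raise ValueError if column names or order differ."""
    if list(df1.columns) != list(df2.columns):
        raise ValueError(
            f"column mismatch: {list(df1.columns)} vs {list(df2.columns)}"
        )


a = pd.DataFrame([{"student_id": 1, "name": "Alice", "age": 20}])
b = pd.DataFrame([{"student_id": 2, "name": "Bob"}])  # no "age" column

# Without a check, concat succeeds and fills the missing cell with NaN.
mixed = pd.concat([a, b], ignore_index=True)
print(mixed["age"].isna().tolist())  # [False, True]

try:
    check_same_columns(a, b)
except ValueError as err:
    print(err)
```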

Asked in

Meta (15), Google (12)

The key insight is that pandas already provides a built-in for this task: pd.concat() stacks DataFrames vertically in a single call. The best approach is Built-in Concatenation, with Time: O(n + m) and Space: O(n + m), where n and m are the row counts of df1 and df2.

Common Approaches

✓ Built-in Concatenation

⏱️ Time: O(n + m) Space: O(n + m)

Use pandas' built-in concat() function, which is designed for combining DataFrames vertically. This is the standard and most efficient approach.

Row-by-Row Manual Copy

⏱️ Time: O(n + m) Space: O(n + m)

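Create an empty result DataFrame and manually copy each row from df1, then each row from df2. This mirrors what concatenation does conceptually, but it is much slower in practice: every row is handled in Python, and growing a DataFrame one row at a time can copy the existing data on each step.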
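For comparison, here is a minimal sketch of a manual copy. It swaps in a plain Python list as the accumulator and builds the DataFrame once at the end, rather than growing a DataFrame row by row. The function name `concat_rows_manually` is hypothetical.

```python
import pandas as pd


def concat_rows_manually(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Row-by-row copy, shown only for comparison (hypothetical name)."""
    rows = []
    for frame in (df1, df2):
        # to_dict("records") yields one {column: value} dict per row, in order.
        for record in frame.to_dict("records"):
            rows.append(record)
    # Build the DataFrame once; passing columns keeps the column names
    # even when both inputs are empty.
    return pd.DataFrame(rows, columns=list(df1.columns))


df1 = pd.DataFrame([{"student_id": 1, "name": "Eve", "age": 23}])
df2 = pd.DataFrame([{"student_id": 2, "name": "Frank", "age": 24}])

manual = concat_rows_manually(df1, df2)
print(manual.to_dict("records"))
```

Collecting rows in a list avoids repeated DataFrame copies, but it still loops in Python, so the built-in call remains the better choice.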

Built-in Concatenation — Algorithm Steps

Call pd.concat([df1, df2]) to stack the two DataFrames vertically
Pass ignore_index=True so the row index of the result is renumbered from 0
Return the concatenated result
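The steps above can be sketched in pandas as follows. This is a minimal sketch, assuming pandas is imported as `pd`; the function name `concat_frames` is hypothetical, since the problem statement does not fix one.

```python
import pandas as pd


def concat_frames(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Stack df2's rows below df1's rows (hypothetical function name)."""
    # ignore_index=True discards the original row labels and renumbers the
    # result from 0; without it, df2's labels would repeat df1's.
    return pd.concat([df1, df2], ignore_index=True)


df1 = pd.DataFrame([
    {"student_id": 1, "name": "Alice", "age": 20},
    {"student_id": 2, "name": "Bob", "age": 22},
])
df2 = pd.DataFrame([
    {"student_id": 3, "name": "Charlie", "age": 19},
    {"student_id": 4, "name": "Diana", "age": 21},
])

result = concat_frames(df1, df2)
print(result.to_dict("records"))
print(list(result.index))                   # [0, 1, 2, 3]
print(list(pd.concat([df1, df2]).index))    # [0, 1, 0, 1]
```

The last two lines show why the second step matters: leaving out ignore_index keeps each frame's original labels, so the combined index contains duplicates.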

Step-by-Step Walkthrough

Input DataFrames: two DataFrames with the same structure

Concat Function: the built-in function combines them efficiently

Result: a single DataFrame with all rows

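Code

solution.c (C)

The listing below is a plain-C analogue of the approach: it reads two JSON arrays from stdin (one per line), appends the rows of the second to the rows of the first, and prints the combined array.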

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

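/* Fixed capacities for this demo parser. They are smaller than the stated
   constraints (up to 1000 rows per DataFrame), so inputs beyond these limits
   are not handled correctly. */
#define MAX_LINE 10000   /* max characters per input line */
#define MAX_OBJECTS 100  /* max rows stored in one Array */
#define MAX_FIELDS 10    /* max key/value pairs per row */
#define MAX_STRING 100   /* max length of a key or value, including '\0' */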

typedef struct {
    char key[MAX_STRING];
    char value[MAX_STRING];
    int is_number;
} Field;

typedef struct {
    Field fields[MAX_FIELDS];
    int field_count;
} Object;

typedef struct {
    Object objects[MAX_OBJECTS];
    int count;
} Array;

/* ----------- JSON PARSER ----------- */

Array parseJSON(const char* json) {
    Array result = {0};

    if (strcmp(json, "[]") == 0)
        return result;

    const char* pos = json;

    while (*pos && *pos != '[') pos++;
    if (*pos == '[') pos++;

    while (*pos && *pos != ']') {

        if (*pos == '{') {
            Object obj = {0};
            pos++;  // skip {

            while (*pos && *pos != '}') {

                while (*pos == ' ' || *pos == ',') pos++;

                if (*pos == '"') {
                    pos++;  // skip opening quote

                    char key[MAX_STRING] = {0};
                    int i = 0;

                    while (*pos && *pos != '"' && i < MAX_STRING - 1) {
                        key[i++] = *pos++;
                    }
                    key[i] = '\0';

                    if (*pos == '"') pos++;  // skip closing quote

                    while (*pos == ' ' || *pos == ':') pos++;

                    char value[MAX_STRING] = {0};
                    int is_number = 0;

                    if (*pos == '"') {
                        pos++;
                        i = 0;
                        while (*pos && *pos != '"' && i < MAX_STRING - 1) {
                            value[i++] = *pos++;
                        }
                        value[i] = '\0';
                        if (*pos == '"') pos++;
                    } else {
                        is_number = 1;
                        i = 0;
                        while (*pos && *pos != ',' && *pos != '}' && i < MAX_STRING - 1) {
                            if (*pos != ' ')
                                value[i++] = *pos;
                            pos++;
                        }
                        value[i] = '\0';
                    }

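                    // Store the pair only if the row still has room.
                    if (obj.field_count < MAX_FIELDS) {
                        strcpy(obj.fields[obj.field_count].key, key);
                        strcpy(obj.fields[obj.field_count].value, value);
                        obj.fields[obj.field_count].is_number = is_number;
                        obj.field_count++;
                    }
                } else if (*pos && *pos != '}') {
                    pos++;  // skip unexpected characters so the loop always advances
                }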
            }

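            if (*pos == '}') pos++;
            // Drop rows beyond capacity rather than writing past the array.
            if (result.count < MAX_OBJECTS)
                result.objects[result.count++] = obj;
        } else {
            pos++;  // not an object start: skip it so the loop always advances
        }

        while (*pos == ' ' || *pos == ',') pos++;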
    }

    return result;
}

/* ----------- PRINT JSON ----------- */

void printJSON(const Array* data) {
    printf("[");

    for (int i = 0; i < data->count; i++) {
        if (i > 0) printf(", ");
        printf("{");

        for (int j = 0; j < data->objects[i].field_count; j++) {
            if (j > 0) printf(", ");

            printf("\"%s\": ", data->objects[i].fields[j].key);

            if (data->objects[i].fields[j].is_number)
                printf("%s", data->objects[i].fields[j].value);
            else
                printf("\"%s\"", data->objects[i].fields[j].value);
        }

        printf("}");
    }

    printf("]");
}

/* ----------- CONCAT SOLUTION ----------- */

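Array solution(Array* df1, Array* df2) {
    // Start from a copy of df1 so its rows come first, in original order.
    Array result = *df1;

    // Append df2's rows in order; stop once the fixed capacity is reached.
    for (int i = 0; i < df2->count && result.count < MAX_OBJECTS; i++) {
        result.objects[result.count++] = df2->objects[i];
    }

    return result;
}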

/* ----------- MAIN ----------- */

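int main(void) {
    char line1[MAX_LINE], line2[MAX_LINE];

    // Treat a missing input line as an empty array.
    if (!fgets(line1, MAX_LINE, stdin)) strcpy(line1, "[]");
    if (!fgets(line2, MAX_LINE, stdin)) strcpy(line2, "[]");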

    line1[strcspn(line1, "\n")] = 0;
    line2[strcspn(line2, "\n")] = 0;

    Array df1 = parseJSON(line1);
    Array df2 = parseJSON(line2);

    Array result = solution(&df1, &df2);

    printJSON(&result);
    printf("\n");

    return 0;
}

Time & Space Complexity

Time Complexity

O(n + m)

Efficiently combines n rows from df1 and m rows from df2 in linear time

✓ Linear Growth

Space Complexity

O(n + m)

Creates a new DataFrame containing all rows from both input DataFrames

⚡ Linear Space
