Select Data - Problem
You are working as a data analyst for a university, and you need to extract specific student information from a large DataFrame containing student records.
Given a DataFrame students with columns student_id, name, and age, your task is to select and return only the name and age of the student with student_id = 101.
DataFrame Schema:
| Column Name | Type |
|---|---|
| student_id | int |
| name | object |
| age | int |
The result should be a DataFrame containing only the name and age columns for the specified student.
Input & Output
example_1.py - Basic Selection
Input:
students = pd.DataFrame({
'student_id': [101, 102, 103],
'name': ['Bob', 'Carol', 'David'],
'age': [21, 22, 20]
})
Output:
name age
0 Bob 21
Note:
Student with ID 101 is found in the first row, so we return their name 'Bob' and age 21.
example_2.py - Student Not First
Input:
students = pd.DataFrame({
'student_id': [100, 101, 102],
'name': ['Alice', 'Bob', 'Carol'],
'age': [20, 21, 22]
})
Output:
name age
1 Bob 21
Note:
Student with ID 101 is in the second row (index 1), so we return their name 'Bob' and age 21.
example_3.py - Large Dataset
Input:
students = pd.DataFrame({
'student_id': [100, 99, 101, 103, 104],
'name': ['Alice', 'Zoe', 'Bob', 'Carol', 'Eve'],
'age': [20, 19, 21, 22, 18]
})
Output:
name age
2 Bob 21
Note:
Even with more rows, boolean indexing finds student 101 in the third row (index 2), so we return their name 'Bob' and age 21.
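The outputs above keep the matching row's original index label (0, 1, or 2) rather than renumbering from 0. The following is a minimal sketch of that behavior using the data from example_3.py. The `reset_index(drop=True)` step is an optional extra, not something the problem statement asks for.

```python
import pandas as pd

# Data from example_3.py above.
students = pd.DataFrame({
    'student_id': [100, 99, 101, 103, 104],
    'name': ['Alice', 'Zoe', 'Bob', 'Carol', 'Eve'],
    'age': [20, 19, 21, 22, 18]
})

# Filter rows and select columns; the row keeps its original label (2).
result = students.loc[students['student_id'] == 101, ['name', 'age']]
print(result)

# Optional: renumber from 0 when the original position is not needed.
renumbered = result.reset_index(drop=True)
print(renumbered)
```

Whether to keep or reset the index depends on what the caller expects. The examples in this problem show the original label preserved.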
Visualization
Understanding the Visualization
1. Boolean Indexing: Create a True/False mask for each row based on student_id == 101
2. Row Filtering: Apply the boolean mask to select only matching rows
3. Column Selection: From the filtered rows, select only the 'name' and 'age' columns
4. Result DataFrame: Return the filtered and projected DataFrame
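The four steps can be sketched as one small function. This is an illustrative version, not necessarily the reference solution. The function name `select_data` and the `target_id` parameter are assumptions introduced here.

```python
import pandas as pd

def select_data(students: pd.DataFrame, target_id: int = 101) -> pd.DataFrame:
    """Return the name and age of the student whose student_id equals target_id."""
    # Step 1: build a boolean mask, True where the ID matches.
    mask = students['student_id'] == target_id
    # Step 2: keep only the rows where the mask is True.
    matching_rows = students[mask]
    # Step 3: keep only the two requested columns.
    result = matching_rows[['name', 'age']]
    # Step 4: return the filtered and projected DataFrame.
    return result

# Usage with the data from example_2.py.
students = pd.DataFrame({
    'student_id': [100, 101, 102],
    'name': ['Alice', 'Bob', 'Carol'],
    'age': [20, 21, 22]
})
print(select_data(students))
```

Steps 1 to 3 can also be combined into a single expression, `students.loc[students['student_id'] == 101, ['name', 'age']]`, which gives the same result.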
Key Takeaway
Key Insight: Pandas boolean indexing filters rows and selects columns in a single vectorized expression that is both readable and fast.
Time & Space Complexity
Time Complexity
O(n)
The comparison checks every row once. Pandas runs it in vectorized, compiled code, which is typically much faster than a Python loop.
Linear Growth
Space Complexity
O(n)
The boolean mask holds one value per row; the result itself holds only the matching row(s).
Linear Space
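To make the linear scan concrete, the sketch below compares the vectorized filter with an explicit Python loop over the same columns. Both visit every row once and produce the same values; the loop is there only for illustration, and the variable names are assumptions.

```python
import pandas as pd

students = pd.DataFrame({
    'student_id': [100, 101, 102],
    'name': ['Alice', 'Bob', 'Carol'],
    'age': [20, 21, 22]
})

# Vectorized: one comparison over the whole column yields a mask with one entry per row.
mask = students['student_id'] == 101
vectorized = students.loc[mask, ['name', 'age']]

# Explicit loop: the same O(n) scan, but each comparison runs in the Python interpreter.
rows = [
    {'name': name, 'age': age}
    for sid, name, age in zip(students['student_id'], students['name'], students['age'])
    if sid == 101
]
looped = pd.DataFrame(rows, columns=['name', 'age'])

print(len(mask) == len(students))
print(vectorized.values.tolist() == looped.values.tolist())
```

The loop version discards the original index labels, so the comparison above looks only at the values.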
Constraints
- 1 ≤ students.shape[0] ≤ 10^5
- student_id values are integers
- Student ID 101 may or may not exist in the DataFrame
- Name values are non-empty strings
- Age values are positive integers
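Because student ID 101 may be absent, it helps to know what boolean indexing returns in that case. The sketch below uses made-up sample data with no student 101. The result is an empty DataFrame that still has the name and age columns, and no exception is raised.

```python
import pandas as pd

# Hypothetical sample data with no student_id equal to 101.
students = pd.DataFrame({
    'student_id': [100, 102, 103],
    'name': ['Alice', 'Carol', 'David'],
    'age': [20, 22, 20]
})

result = students.loc[students['student_id'] == 101, ['name', 'age']]

print(result.empty)           # whether any row matched
print(list(result.columns))   # the selected columns are still present
```

Callers that need to tell "not found" apart from a match can check `result.empty` before using the result.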