Select Data - Problem

You are working as a data analyst for a university, and you need to extract specific student information from a large DataFrame containing student records.

Given a DataFrame students with columns student_id, name, and age, your task is to select and return only the name and age of the student with student_id = 101.

DataFrame Schema:

Column Name | Type
student_id | int
name | object
age | int

The result should be a DataFrame containing only the name and age columns for the specified student.

Input & Output

example_1.py: Basic Selection
Input: students = pd.DataFrame({ 'student_id': [101, 102, 103], 'name': ['Bob', 'Carol', 'David'], 'age': [21, 22, 20] })
Output:
      name  age
    0  Bob   21
Note: Student with ID 101 is found in the first row, so we return their name 'Bob' and age 21.

example_2.py: Student Not First
Input: students = pd.DataFrame({ 'student_id': [100, 101, 102], 'name': ['Alice', 'Bob', 'Carol'], 'age': [20, 21, 22] })
Output:
      name  age
    1  Bob   21
Note: Student with ID 101 is in the second row (index 1), so we return their name 'Bob' and age 21.

example_3.py: Large Dataset
Input: students = pd.DataFrame({ 'student_id': [100, 99, 101, 103, 104], 'name': ['Alice', 'Zoe', 'Bob', 'Carol', 'Eve'], 'age': [20, 19, 21, 22, 18] })
Output:
      name  age
    2  Bob   21
Note: Even with more students, we efficiently find student 101 (Bob, age 21) using pandas indexing.

Visualization

DataFrame Selection Process

1. Original Data
ID | Name | Age
100 | Alice | 20
101 | Bob | 21
102 | Carol | 22

2. Boolean Mask (student_id == 101)
False
True
False

3. Filtered Rows
ID | Name | Age
101 | Bob | 21

4. Final Result
Name | Age
Bob | 21

Pandas Code Breakdown
students[students['student_id'] == 101][['name', 'age']]
Step 1: students['student_id'] == 101 creates a boolean mask
Step 2: students[mask] filters rows where the mask is True
Step 3: [['name', 'age']] selects only the specified columns
Result: a new DataFrame with the filtered rows and selected columns

Understanding the Visualization
1. Boolean Indexing: Create a True/False mask for each row based on student_id == 101
2. Row Filtering: Apply the boolean mask to select only matching rows
3. Column Selection: From the filtered rows, select only 'name' and 'age' columns
4. Result DataFrame: Return the filtered and projected DataFrame
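
The four steps above can be sketched as a small function. The function name select_data and the intermediate variable names are illustrative assumptions, not part of the problem statement; the sample data is taken from the second example.

```python
import pandas as pd


def select_data(students: pd.DataFrame) -> pd.DataFrame:
    # Step 1: build a True/False mask, one entry per row
    mask = students["student_id"] == 101
    # Step 2: keep only the rows where the mask is True
    matching_rows = students[mask]
    # Step 3: keep only the 'name' and 'age' columns
    # Step 4: the projected DataFrame is the result
    return matching_rows[["name", "age"]]


students = pd.DataFrame({
    "student_id": [100, 101, 102],
    "name": ["Alice", "Bob", "Carol"],
    "age": [20, 21, 22],
})

result = select_data(students)
print(result)
```

Note that the original index label (1 in this case) is preserved in the result, which matches the expected output shown in the examples.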
Key Takeaway
Key Insight: Pandas boolean indexing lets us filter rows and select columns in one concise, vectorized expression that is both readable and fast.
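
As a possible alternative to chaining two sets of brackets, the row filter and the column selection can be passed to DataFrame.loc in a single call. This is a sketch of an equivalent form, not necessarily the solution the problem intends; the data comes from the third example.

```python
import pandas as pd

students = pd.DataFrame({
    "student_id": [100, 99, 101, 103, 104],
    "name": ["Alice", "Zoe", "Bob", "Carol", "Eve"],
    "age": [20, 19, 21, 22, 18],
})

# Row condition and column list in one .loc call
selected = students.loc[students["student_id"] == 101, ["name", "age"]]

# The chained form from the code breakdown, for comparison
chained = students[students["student_id"] == 101][["name", "age"]]

print(selected)
print(selected.equals(chained))
```

Both forms select the same row and columns here; the .loc form avoids building the intermediate filtered DataFrame.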

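Time & Space Complexity

Time Complexity
O(n)

Pandas compares every student_id against 101 in one vectorized pass implemented in compiled code. This is typically much faster than an explicit Python loop, but the work still grows linearly with the number of rows n.

Space Complexity
O(n)

The boolean mask holds one value per row, so auxiliary space grows linearly with n. The returned DataFrame itself holds only the matching row(s).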
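
One way to see how the working memory scales is to inspect the intermediate boolean mask directly. The sketch below uses a made-up DataFrame (the row count n, the generated names, and the ages are assumptions for illustration only) and checks that the mask has one entry per row while the result keeps just the matching row.

```python
import pandas as pd

n = 1000  # hypothetical row count, chosen only for illustration

students = pd.DataFrame({
    "student_id": list(range(1, n + 1)),
    "name": [f"student_{i}" for i in range(1, n + 1)],
    "age": [18 + (i % 10) for i in range(1, n + 1)],
})

# The mask is a boolean Series with one entry per input row
mask = students["student_id"] == 101
print(len(mask), mask.dtype)

# The result contains only the rows where the mask is True
result = students.loc[mask, ["name", "age"]]
print(len(result))
```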


Constraints

  • 1 ≤ students.shape[0] ≤ 10^5
  • student_id values are integers
  • Student ID 101 may or may not exist in the DataFrame
  • Name values are non-empty strings
  • Age values are positive integers
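
Since ID 101 may be absent, it can help to check what boolean indexing returns in that case. The sketch below uses made-up data with no student 101; the problem statement does not say what output is expected here, so treating an empty DataFrame as the answer is an assumption.

```python
import pandas as pd

# Hypothetical data in which student_id 101 does not appear
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ann", "Ben", "Cy"],
    "age": [19, 20, 21],
})

result = students.loc[students["student_id"] == 101, ["name", "age"]]

# No row matches, so the result has zero rows but keeps both columns
print(result.empty)
print(list(result.columns))
```

No exception is raised when nothing matches: the mask is all False, and the selection yields an empty DataFrame with the name and age columns.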