Python - Minimum K Records for Nth Index in Tuple List
When working with lists of tuples in Python, you often need the K smallest records ranked by the value at a specific index. This can be done with the built-in sorted() function, the heapq module, or a list comprehension combined with sorting.
Problem Overview
Given a list of tuples, we want to extract the K smallest records based on the value at the Nth index. For example, if we have tuples representing (name, score) pairs, we might want the 3 lowest scores.
Method 1: Using Sorting and Slicing
The most straightforward approach is to sort the entire list based on the Nth index and slice the first K elements:
def extract_minimum_records_sorting(tuple_list, K, N):
    sorted_list = sorted(tuple_list, key=lambda x: x[N])
    return sorted_list[:K]
# Example data
tuple_list = [('apple', 5), ('banana', 2), ('cherry', 9), ('durian', 4), ('elderberry', 1)]
K = 3
N = 1
result = extract_minimum_records_sorting(tuple_list, K, N)
print("Minimum", K, "records based on index", N, ":")
for item in result:
    print(item)
Output:
Minimum 3 records based on index 1 :
('elderberry', 1)
('banana', 2)
('durian', 4)
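The lambda key can also be written with operator.itemgetter from the standard library. The following is a minimal sketch of that variant; the helper name k_smallest_by_index and the variable names are hypothetical and chosen only for illustration.

```python
from operator import itemgetter


def k_smallest_by_index(records, k, n):
    """Return the k records with the smallest value at index n.

    Hypothetical helper: same behavior as sorting with a lambda key,
    but using itemgetter(n) to pick out the Nth field.
    """
    return sorted(records, key=itemgetter(n))[:k]


records = [('apple', 5), ('banana', 2), ('cherry', 9), ('durian', 4), ('elderberry', 1)]
lowest_three = k_smallest_by_index(records, 3, 1)
print(lowest_three)

# Slicing never raises when k exceeds the list length;
# the whole list comes back sorted instead.
everything = k_smallest_by_index(records, 10, 1)
print(len(everything))
```

Because slicing is forgiving, a K larger than the list length simply returns every record in sorted order.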
Method 2: Using heapq Module
The heapq module provides efficient heap operations. We can use nsmallest() to directly get the K smallest elements:
import heapq
def extract_minimum_records_heap(tuple_list, K, N):
    return heapq.nsmallest(K, tuple_list, key=lambda x: x[N])
# Example data
tuple_list = [('apple', 5), ('banana', 2), ('cherry', 9), ('durian', 4), ('elderberry', 1)]
K = 3
N = 1
result = extract_minimum_records_heap(tuple_list, K, N)
print("Minimum", K, "records using heapq:")
for item in result:
    print(item)
Output:
Minimum 3 records using heapq:
('elderberry', 1)
('banana', 2)
('durian', 4)
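The Python documentation describes heapq.nsmallest(n, iterable, key=key) as equivalent to sorted(iterable, key=key)[:n]. The sketch below checks that on sample data containing duplicate values at the Nth index; the data and variable names are made up for illustration.

```python
import heapq

# Sample records with repeated values at index 1 (illustrative data only)
records = [('a', 2), ('b', 1), ('c', 2), ('d', 1), ('e', 3)]
k, n = 3, 1

by_heap = heapq.nsmallest(k, records, key=lambda r: r[n])
by_sort = sorted(records, key=lambda r: r[n])[:k]

print(by_heap)
print(by_heap == by_sort)

# Asking for more records than exist returns the full list in sorted order
all_records = heapq.nsmallest(10, records, key=lambda r: r[n])
print(len(all_records))
```

Records that share a value keep their original relative order in both results, so the two approaches can be swapped without changing the output.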
Method 3: Using List Comprehension with Sorted Values
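This approach first extracts and sorts the Nth index values, then filters the original list against the K smallest of them. Unlike the previous two methods, it returns the records in their original list order rather than sorted by the Nth value:
def extract_minimum_records_comprehension(tuple_list, K, N):
    # Extract and sort the Nth index values
    nth_values = sorted([item[N] for item in tuple_list])
    # Keep the K smallest values (duplicates included)
    remaining = nth_values[:K]
    # Walk the original list, consuming one value per matching record so that
    # ties at the cutoff cannot crowd out a smaller value that appears later
    result = []
    for item in tuple_list:
        if item[N] in remaining:
            remaining.remove(item[N])
            result.append(item)
    return result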
# Example data
tuple_list = [('apple', 5), ('banana', 2), ('cherry', 9), ('durian', 4), ('elderberry', 1)]
K = 3
N = 1
result = extract_minimum_records_comprehension(tuple_list, K, N)
print("Minimum", K, "records using list comprehension:")
for item in result:
    print(item)
Output:
Minimum 3 records using list comprehension:
('banana', 2)
('durian', 4)
('elderberry', 1)
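If keeping the original list order is the goal, one alternative is to select records by position instead of by value, which avoids any ambiguity when several records share the same value. This is a sketch under that assumption; the function name k_smallest_keep_order is hypothetical.

```python
def k_smallest_keep_order(records, k, n):
    """Return the k smallest records by index n, in original list order.

    Hypothetical helper: ranks positions rather than values, so records
    with equal values are never confused with one another.
    """
    # Positions of the k records with the smallest Nth value
    ranked_positions = sorted(range(len(records)), key=lambda i: records[i][n])
    keep = set(ranked_positions[:k])
    # A single list comprehension filters the original list by position
    return [record for i, record in enumerate(records) if i in keep]


records = [('apple', 5), ('banana', 2), ('cherry', 9), ('durian', 4), ('elderberry', 1)]
print(k_smallest_keep_order(records, 3, 1))

# Two records tie at the cutoff value 2; the smaller value 1 is still kept
tied = [('x', 2), ('y', 2), ('z', 1)]
print(k_smallest_keep_order(tied, 2, 1))
```

Using a set of positions keeps the membership test at constant time per record, so the overall cost stays at the O(n log n) of the sort.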
Performance Comparison
| Method | Time Complexity | Space Complexity | Best For |
|---|---|---|---|
| Sorting + Slicing | O(n log n) | O(n) | Simple implementation |
| heapq.nsmallest() | O(n log K) | O(K) | Large lists, small K |
| List Comprehension | O(n log n + nK) | O(n) | Keeping the original list order |
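The complexities in the table can be checked informally with the standard timeit module. The sketch below is a rough comparison only: the data size, the value of k, and the repeat count are arbitrary assumptions, and the actual timings depend on the machine and the Python version.

```python
import heapq
import random
import timeit

random.seed(0)  # fixed seed so the generated data is reproducible

# Synthetic (name, value) records; the size is an arbitrary choice
records = [(f"item{i}", random.randint(0, 1_000_000)) for i in range(50_000)]
k = 10


def by_sorting():
    # Full sort, then slice: O(n log n)
    return sorted(records, key=lambda r: r[1])[:k]


def by_heap():
    # Bounded heap of size k: O(n log k)
    return heapq.nsmallest(k, records, key=lambda r: r[1])


sort_time = timeit.timeit(by_sorting, number=3)
heap_time = timeit.timeit(by_heap, number=3)

print(f"sorted + slice:  {sort_time:.4f} s")
print(f"heapq.nsmallest: {heap_time:.4f} s")
```

Rerunning with k close to len(records) is a quick way to see where the full sort becomes the better choice.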
Complete Example with Different Data
import heapq
# Student data: (name, age, score)
students = [
    ('Alice', 20, 85),
    ('Bob', 19, 92),
    ('Charlie', 21, 78),
    ('Diana', 20, 96),
    ('Eve', 22, 81)
]
# Find 3 youngest students (index 1 = age)
youngest = heapq.nsmallest(3, students, key=lambda x: x[1])
print("3 Youngest students:")
for student in youngest:
    print(f"{student[0]}: {student[1]} years old")
print()
# Find 2 students with lowest scores (index 2 = score)
lowest_scores = heapq.nsmallest(2, students, key=lambda x: x[2])
print("2 Students with lowest scores:")
for student in lowest_scores:
    print(f"{student[0]}: {student[2]} points")
Output:
3 Youngest students:
Bob: 19 years old
Alice: 20 years old
Diana: 20 years old

2 Students with lowest scores:
Charlie: 78 points
Eve: 81 points
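Alice and Diana are both 20, so which of them ranks first depends only on their position in the list. If ties should be resolved by another field, the key function can return a tuple. The sketch below assumes, purely for illustration, that the higher score should win among students of the same age.

```python
import heapq

# Student data: (name, age, score), same values as the example above
students = [
    ('Alice', 20, 85),
    ('Bob', 19, 92),
    ('Charlie', 21, 78),
    ('Diana', 20, 96),
    ('Eve', 22, 81)
]

# Rank by age first; negate the score so the higher score sorts earlier
# among students of equal age (an assumed tie-breaking rule)
youngest_two = heapq.nsmallest(2, students, key=lambda s: (s[1], -s[2]))
print(youngest_two)
```

With the tuple key, Diana is selected ahead of Alice because her score is higher, whereas a key on age alone would return Alice, who appears first in the list.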
Conclusion
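To find the minimum K records in a list of tuples, use heapq.nsmallest() when K is much smaller than the total number of records, since it runs in O(n log K) time and avoids sorting the whole list. Use sorted() with slicing for simple cases or when K is close to the list length, where a full sort is usually as fast or faster. For K = 1, the built-in min() with the same key function is sufficient.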
