Kth Largest Element in a Stream in Python

The Kth Largest Element in a Stream problem involves designing a class that efficiently finds the kth largest element as new elements are added to a data stream. This is particularly useful in real-time data processing scenarios.

The KthLargest class maintains a stream of numbers and returns the kth largest element each time a new number is added. Here "kth largest" means the kth element in descending sorted order, with duplicates counted separately, not the kth distinct value.

Problem Understanding

Given k = 3 and initial elements [4, 5, 8, 2], when we add elements 3, 5, 10, 9, 4 sequentially, we should get 4, 5, 5, 8, 8 as the 3rd largest elements respectively.
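As a quick check of the expected values, here is a minimal brute-force sketch. The helper names kth_largest_reference and kth_distinct_reference are made up for illustration; they re-sort the whole list on every call, so they are only a reference, not an efficient solution.

```python
def kth_largest_reference(values, k):
    """Return the kth largest value, counting duplicates separately."""
    return sorted(values, reverse=True)[k - 1]


def kth_distinct_reference(values, k):
    """Return the kth largest distinct value (NOT what this problem asks for)."""
    return sorted(set(values), reverse=True)[k - 1]


stream = [4, 5, 8, 2]
results = []
for val in [3, 5, 10, 9, 4]:
    stream.append(val)
    results.append(kth_largest_reference(stream, 3))
print(results)  # [4, 5, 5, 8, 8]

# Duplicates matter: the two definitions disagree on this list.
print(kth_largest_reference([2, 3, 4, 5, 5, 8], 3))   # 5
print(kth_distinct_reference([2, 3, 4, 5, 5, 8], 3))  # 4
```

The last two lines show why the distinction matters: with the duplicate 5 counted, the 3rd largest is 5, while the 3rd largest distinct value would be 4.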

Basic Implementation

The straightforward approach involves maintaining an array and sorting it after each addition:

class KthLargest:
    def __init__(self, k, nums):
        self.array = list(nums)  # copy so the caller's list is not modified
        self.k = k
    
    def add(self, val):
        self.array.append(val)
        self.array.sort()
        return self.array[len(self.array) - self.k]

# Example usage
kth_largest = KthLargest(3, [4, 5, 8, 2])
print(kth_largest.add(3))   # 3rd largest among [2,3,4,5,8]
print(kth_largest.add(5))   # 3rd largest among [2,3,4,5,5,8]
print(kth_largest.add(10))  # 3rd largest among [2,3,4,5,5,8,10]
print(kth_largest.add(9))   # 3rd largest among [2,3,4,5,5,8,9,10]
print(kth_largest.add(4))   # 3rd largest among [2,3,4,4,5,5,8,9,10]

Output:

4
5
5
8
8

Optimized Implementation Using Min-Heap

A more efficient approach uses a min-heap of size k. Each addition then costs O(log k) instead of the O(n log n) needed to re-sort the whole array, where n is the number of elements seen so far:

import heapq

class KthLargestOptimized:
    def __init__(self, k, nums):
        self.k = k
        self.heap = list(nums)  # copy so heapify does not reorder the caller's list
        heapq.heapify(self.heap)
        
        # Keep only k largest elements
        while len(self.heap) > k:
            heapq.heappop(self.heap)
    
    def add(self, val):
        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        return self.heap[0]  # Root of min-heap is kth largest

# Example usage
kth_largest_opt = KthLargestOptimized(3, [4, 5, 8, 2])
print(kth_largest_opt.add(3))
print(kth_largest_opt.add(5))
print(kth_largest_opt.add(10))
print(kth_largest_opt.add(9))
print(kth_largest_opt.add(4))

Output:

4
5
5
8
8
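The heap-based class returns the heap root even when fewer than k elements have been seen, in which case the root is not yet the kth largest. The following is a hedged sketch of one way to handle that case; the class name KthLargestSafe and the choice to return None are assumptions made for illustration, not part of the original problem statement.

```python
import heapq


class KthLargestSafe:
    """Hypothetical variant that returns None until k elements have been seen."""

    def __init__(self, k, nums):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        # nlargest returns a new list, so the caller's nums is left untouched.
        self.heap = heapq.nlargest(k, nums)
        heapq.heapify(self.heap)

    def add(self, val):
        if len(self.heap) < self.k:
            # Still filling up: keep every value.
            heapq.heappush(self.heap, val)
        else:
            # Push val, then drop the smallest of the k + 1 values.
            heapq.heappushpop(self.heap, val)
        return self.heap[0] if len(self.heap) == self.k else None


safe = KthLargestSafe(3, [4])
print(safe.add(5))  # None (only 2 elements so far)
print(safe.add(8))  # 4
print(safe.add(2))  # 4
print(safe.add(6))  # 5
```

Using heapq.heappushpop combines the push and the pop into one call once the heap is full, which avoids growing the heap to k + 1 elements.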

How the Min-Heap Approach Works

The min-heap holds at most k elements: the k largest values seen so far. Once it holds k elements, its root (the smallest value in the heap) is the kth largest element overall, because the other k - 1 kept values are all greater than or equal to it.

Figure: min-heap for k = 3 holding the values 4, 5, 8. The root (4) is the 3rd largest element; the heap keeps only the 3 largest elements.
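To make the mechanism concrete, the sketch below traces the heap contents after each addition for the running example. The function name trace_kth_largest is hypothetical; it repeats the same push-then-pop logic as the heap-based class and records the kept values in sorted order for readability.

```python
import heapq


def trace_kth_largest(k, nums, additions):
    """Return (added value, sorted heap contents, kth largest) for each addition."""
    heap = list(nums)
    heapq.heapify(heap)
    while len(heap) > k:
        heapq.heappop(heap)  # discard everything below the k largest

    steps = []
    for val in additions:
        heapq.heappush(heap, val)
        if len(heap) > k:
            heapq.heappop(heap)  # drop the smallest of the k + 1 values
        steps.append((val, sorted(heap), heap[0]))
    return steps


for val, kept, kth in trace_kth_largest(3, [4, 5, 8, 2], [3, 5, 10, 9, 4]):
    print(f"add({val}): keeps {kept}, kth largest = {kth}")
# add(3): keeps [4, 5, 8], kth largest = 4
# add(5): keeps [5, 5, 8], kth largest = 5
# add(10): keeps [5, 8, 10], kth largest = 5
# add(9): keeps [8, 9, 10], kth largest = 8
# add(4): keeps [8, 9, 10], kth largest = 8
```

Values smaller than the current root, such as 3 and the final 4, are pushed and immediately popped, so they never change the answer.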

Comparison

Approach | Time Complexity (add) | Space Complexity | Best For
Array + Sort | O(n log n) | O(n) | Simple implementation
Min-Heap | O(log k) | O(k) | Frequent additions, large streams
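The space column can be checked with a rough experiment. The sketch below runs both strategies on the same random stream, confirms they return identical answers, and compares how many elements each one stores. The function names, the seed, and the stream sizes are arbitrary choices for illustration.

```python
import heapq
import random


def kth_by_sorting(k, nums, additions):
    """Array + sort: keeps every element ever seen."""
    data = list(nums)
    out = []
    for val in additions:
        data.append(val)
        data.sort()
        out.append(data[-k])
    return out, len(data)


def kth_by_heap(k, nums, additions):
    """Min-heap: keeps only the k largest elements."""
    heap = list(nums)
    heapq.heapify(heap)
    while len(heap) > k:
        heapq.heappop(heap)
    out = []
    for val in additions:
        heapq.heappush(heap, val)
        if len(heap) > k:
            heapq.heappop(heap)
        out.append(heap[0])
    return out, len(heap)


random.seed(0)
k = 5
initial = [random.randint(0, 100) for _ in range(20)]
additions = [random.randint(0, 100) for _ in range(1000)]

sorted_out, sorted_size = kth_by_sorting(k, initial, additions)
heap_out, heap_size = kth_by_heap(k, initial, additions)

print(sorted_out == heap_out)    # True
print(sorted_size, heap_size)    # 1020 5
```

Both approaches agree on every answer, but the sorted array grows with the stream while the heap stays at k elements.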

Conclusion

For finding the kth largest element in a stream, use a min-heap of size k for optimal performance. The basic sorting approach works for small datasets, but the heap-based solution scales much better with larger streams and frequent additions.

Updated on: 2026-03-25T08:47:12+05:30
