Program to find total duration of K most watched shows in Python
Suppose we have a list of strings called shows, a list of integers called durations, and a value k. Here shows[i] and durations[i] represent a show and its duration watched by the ith person. We need to find the total duration watched of the k most watched shows.
For example, if the input is shows = ["The BGT", "Jack jumper", "The BGT", "Jokers Company", "Music magic"], durations = [10, 8, 10, 18, 9], and k = 2, then the output is 38: the two most watched shows are "The BGT" and "Jokers Company", with total durations of 10 + 10 = 20 and 18, giving 20 + 18 = 38.
Algorithm
To solve this, we will follow these steps:
If shows is empty or durations is empty or k is 0, then return 0
Create a dictionary to store total duration for each show
For each show and duration pair, add the duration to the show's total
Extract all duration values and sort them in descending order
Sum the top k durations and return the result
Example
Let us look at the following implementation to get a better understanding:
from collections import defaultdict

def solve(shows, durations, k):
    if not shows or not durations or not k:
        return 0

    # Dictionary to store total duration for each show
    show_durations = defaultdict(int)

    # Calculate total duration for each show
    for i in range(len(shows)):
        show_durations[shows[i]] += durations[i]

    # Get all durations and sort in descending order
    all_durations = list(show_durations.values())
    all_durations.sort(reverse=True)

    # Sum the top k durations
    total = 0
    for i in range(min(k, len(all_durations))):
        total += all_durations[i]

    return total

# Test the function
shows = ["The BGT", "Jack jumper", "The BGT", "Jokers Company", "Music magic"]
durations = [10, 8, 10, 18, 9]
k = 2
result = solve(shows, durations, k)
print(f"Total duration of top {k} shows: {result}")
The output of the above code is:
Total duration of top 2 shows: 38
How It Works
The algorithm works by first aggregating the total watch time for each show using a dictionary. For our example:
"The BGT": 10 + 10 = 20
"Jack jumper": 8
"Jokers Company": 18
"Music magic": 9
Then we sort these totals in descending order [20, 18, 9, 8] and sum the top k=2 values: 20 + 18 = 38.
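The intermediate values described above can be inspected with a small sketch. The helper name aggregate_totals is hypothetical and used only for illustration; it is not part of the original solution.

```python
from collections import defaultdict

def aggregate_totals(shows, durations):
    """Hypothetical helper: map each show to its summed watch time."""
    totals = defaultdict(int)
    for show, duration in zip(shows, durations):
        totals[show] += duration
    return dict(totals)

shows = ["The BGT", "Jack jumper", "The BGT", "Jokers Company", "Music magic"]
durations = [10, 8, 10, 18, 9]

totals = aggregate_totals(shows, durations)
ranked = sorted(totals.values(), reverse=True)

print(totals)           # per-show totals
print(ranked)           # totals in descending order: [20, 18, 9, 8]
print(sum(ranked[:2]))  # sum of the top 2 totals: 38
```

Slicing with ranked[:2] also handles the case where k exceeds the number of distinct shows, since a slice never runs past the end of the list.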
Alternative Approach Using heapq
When there are many distinct shows and k is comparatively small, Python's heapq module can pick out the k largest totals without sorting the whole list:
import heapq
from collections import defaultdict

def solve_optimized(shows, durations, k):
    if not shows or not durations or not k:
        return 0

    # Calculate total duration for each show
    show_durations = defaultdict(int)
    for show, duration in zip(shows, durations):
        show_durations[show] += duration

    # Get top k durations using heap
    top_k_durations = heapq.nlargest(k, show_durations.values())

    return sum(top_k_durations)

# Test the optimized function
shows = ["The BGT", "Jack jumper", "The BGT", "Jokers Company", "Music magic"]
durations = [10, 8, 10, 18, 9]
k = 2
result = solve_optimized(shows, durations, k)
print(f"Total duration (optimized): {result}")
The output of the above code is:
Total duration (optimized): 38
Conclusion
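The problem is solved by aggregating show durations in a dictionary and then summing the k largest totals. With n distinct shows, a full sort costs O(n log n), while heapq.nlargest() takes roughly O(n log k), so the heap-based version tends to pay off when k is small relative to n. When k is close to n, a full sort is typically just as fast or faster.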
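As a rough sanity check, the two selection strategies can be compared on randomly generated totals. This is only a sketch: the function names top_k_sorted and top_k_heap and the random test data are assumptions made for illustration, not part of the original article.

```python
import heapq
import random

def top_k_sorted(totals, k):
    """Sum the k largest values by sorting the whole list."""
    return sum(sorted(totals, reverse=True)[:k])

def top_k_heap(totals, k):
    """Sum the k largest values using heapq.nlargest."""
    return sum(heapq.nlargest(k, totals))

random.seed(0)  # fixed seed so the run is repeatable
totals = [random.randint(1, 1000) for _ in range(10_000)]

# Both strategies should agree, including when k exceeds len(totals).
for k in (1, 5, 100, 20_000):
    assert top_k_sorted(totals, k) == top_k_heap(totals, k)

print("Both approaches agree")
```

Both functions return the same sum for every k, so the choice between them is purely about running time on the data at hand.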
