
Problem
Solution
Submissions
MapReduce Framework for Distributed Computing
Certification: Advanced Level
Accuracy: 50%
Submissions: 2
Points: 20
Implement a simplified version of the MapReduce framework with Mapper and Reducer classes for parallel processing and aggregation.
Example 1
- Input:
data = ["hello world", "big data"]
mapper_function = lambda x: ...
reducer_function = lambda x, y: ... - Output:
[("hello", 1), ("world", 1), ("big", 1), ("data", 1)] - Explanation:
- Map words into key-value (word, 1).
- Reduce by summing word counts.
- Return aggregated word frequencies.
Example 2
- Input:
data = [10, 20, 30, 40, 50]
mapper_function = lambda x: ...
reducer_function = lambda x, y: ... - Output:
[("even", 150)] - Explanation:
- Tag numbers as even or odd.
- Sum values for each tag.
- Return totals for each category.
Constraints
- 1 ≤ len(data) ≤ 10^6
- Mapper returns list of key-value pairs
- Reducer returns one pair per key
- Time Complexity: O(n * m)
- Space Complexity: O(n * k)
Editorial
My Submissions
All Solutions
Lang | Status | Date | Code |
---|---|---|---|
You do not have any submissions for this problem. |
User | Lang | Status | Date | Code |
---|---|---|---|---|
No submissions found. |
Solution Hints
- Use Python's multiprocessing module to parallelize the mapping phase
- Implement a partitioning mechanism to distribute intermediate key-value pairs
- Group values by keys after the mapping phase
- Use multiple processes for both mapping and reducing phases
- Implement proper exception handling and resource management