
Bloom Filter for Efficient Set Membership Testing
Certification: Advanced Level
Write a Python program to implement a Bloom filter, a space-efficient probabilistic data structure used to test whether an element is a member of a set. A Bloom filter may report false positives, but false negatives are impossible: an element that was added is never reported as absent. Your task is to implement the BloomFilter class with an add method for inserting elements and a contains method for testing membership.
Example 1
- Input:
      bloom = BloomFilter(size=10, hash_count=3)
      bloom.add("apple")
      bloom.add("banana")
      print(bloom.contains("apple"))
      print(bloom.contains("orange"))
- Output: True False
- Explanation:
- Step 1: Initialize a Bloom filter with size 10 and 3 hash functions.
- Step 2: Add "apple" and "banana" to the filter.
- Step 3: Check if "apple" exists (returns True as expected).
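- Step 4: Check if "orange" exists (expected to return False because it wasn't added, although a filter this small can occasionally return a false positive, depending on the hash functions used).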
Example 2
- Input:
      bloom = BloomFilter(size=100, hash_count=5)
      words = ["hello", "world", "python", "programming", "algorithm"]
      for word in words:
          bloom.add(word)
      tests = ["hello", "python", "java", "bloom", "filter"]
      results = [bloom.contains(word) for word in tests]
- Output: [True, True, False, False, False]
- Explanation:
- Step 1: Initialize a Bloom filter with size 100 and 5 hash functions.
- Step 2: Add multiple words to the filter.
- Step 3: Test various words. Added words always return True; words that were not added are expected to return False, with a small chance of a false positive.
Constraints
- 10 ≤ size ≤ 10^6 (size of the bit array)
- 1 ≤ hash_count ≤ 10 (number of hash functions)
- Input elements can be any hashable Python object
- The false positive rate depends on the size of the filter, the number of hash functions, and the number of elements added
- Time Complexity: O(k) for add and contains operations, where k is the number of hash functions
- Space Complexity: O(m) where m is the size of the bit array
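To make the false positive constraint concrete, the sketch below applies the standard approximation p ≈ (1 - e^(-kn/m))^k, where m is the bit array size, k the number of hash functions, and n the number of elements added. It assumes the hash functions behave as independent, uniform hashes, which a real implementation only approximates. The function name `estimated_false_positive_rate` is hypothetical and not part of the required interface.

```python
import math


def estimated_false_positive_rate(size, hash_count, item_count):
    """Approximate false positive rate (1 - e^(-k*n/m))^k.

    Assumes idealized, independent, uniformly distributed hash functions.
    """
    fraction_of_bits_set = 1.0 - math.exp(-hash_count * item_count / size)
    return fraction_of_bits_set ** hash_count


# The two configurations used in the examples above.
print(estimated_false_positive_rate(size=10, hash_count=3, item_count=2))
print(estimated_false_positive_rate(size=100, hash_count=5, item_count=5))
```

Under this approximation, the filter in Example 1 has a false positive rate of roughly 9%, while the one in Example 2 is at about 0.05%, which is why a size-10 filter should not be relied on to reject every absent element.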
Solution Hints
- Use a bit array (or a list of booleans) to represent the Bloom filter
- Implement multiple hash functions using combinations of existing hash functions
- Handle different types of input objects by converting them to strings first
- Choose the size and number of hash functions based on the expected number of elements and desired false positive rate
- Consider using the third-party mmh3 library (MurmurHash3 bindings for Python) for fast, seedable hash functions
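The hints above can be combined into a minimal sketch, shown here under several assumptions rather than as a reference solution. It packs the bits into a `bytearray`, converts each item to a string with `repr()` (so that `1` and `"1"` hash differently; objects whose `repr` is not stable for equal values would need different handling), and derives the k positions by double hashing two halves of a SHA-256 digest. `hashlib` is used instead of mmh3 so the sketch has no third-party dependency. The helper `suggest_parameters` is hypothetical and uses the standard sizing formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, size, hash_count):
        if size <= 0 or hash_count <= 0:
            raise ValueError("size and hash_count must be positive")
        self.size = size
        self.hash_count = hash_count
        # One bit per slot, packed eight slots to a byte.
        self.bits = bytearray((size + 7) // 8)

    def _positions(self, item):
        # Assumption of this sketch: repr() is a stable string form of the item.
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        # Force the step to be odd so it is never zero.
        h2 = int.from_bytes(digest[8:16], "big") | 1
        # Double hashing: position i is (h1 + i * h2) mod size.
        for i in range(self.hash_count):
            yield (h1 + i * h2) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, item):
        # True only if every probed bit is set; may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def suggest_parameters(expected_items, target_fp_rate):
    """Return (size, hash_count) from the standard Bloom filter formulas."""
    size = math.ceil(
        -expected_items * math.log(target_fp_rate) / (math.log(2) ** 2)
    )
    hash_count = max(1, round(size / expected_items * math.log(2)))
    return size, hash_count


bloom = BloomFilter(size=100, hash_count=5)
for word in ["hello", "world", "python", "programming", "algorithm"]:
    bloom.add(word)

print(bloom.contains("hello"))  # True: added items are never reported absent
print(bloom.contains("java"))   # expected False, but a false positive is possible
print(suggest_parameters(expected_items=1000, target_fp_rate=0.01))
```

Both `add` and `contains` compute one digest and probe k positions, which keeps them at O(k) per call, and the `bytearray` uses about m / 8 bytes. Swapping the SHA-256 digest for two seeded mmh3 hashes would leave the rest of the class unchanged.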