Tutorialspoint
Problem
Solution
Submissions

Bloom Filter for Efficient Set Membership Testing

Certification: Advanced Level Accuracy: 100% Submissions: 2 Points: 20

Write a Python program to implement a Bloom filter, a space-efficient probabilistic data structure used to test whether an element is a member of a set. A Bloom filter may have false positive matches, but false negatives are impossible. Your task is to implement the BloomFilter class with methods for adding elements and testing membership.

Example 1
  • Input: bloom = BloomFilter(size=10, hash_count=3) bloom.add("apple") bloom.add("banana") print(bloom.contains("apple")) print(bloom.contains("orange"))
  • Output: True False
  • Explanation:
    • Step 1: Initialize a Bloom filter with size 10 and 3 hash functions.
    • Step 2: Add "apple" and "banana" to the filter.
    • Step 3: Check if "apple" exists (returns True as expected).
    • Step 4: Check if "orange" exists (returns False as it wasn't added).
Example 2
  • Input: bloom = BloomFilter(size=100, hash_count=5) words = ["hello", "world", "python", "programming", "algorithm"] for word in words: bloom.add(word) tests = ["hello", "python", "java", "bloom", "filter"] results = [bloom.contains(word) for word in tests]
  • Output: [True, True, False, False, False]
  • Explanation:
    • Step 1: Initialize a Bloom filter with size 100 and 5 hash functions.
    • Step 2: Add multiple words to the filter.
    • Step 3: Test various words - correctly identifies existing and non-existing members.
Constraints
  • 10 ≤ size ≤ 10^6 (size of the bit array)
  • 1 ≤ hash_count ≤ 10 (number of hash functions)
  • Input elements can be any hashable Python object
  • The false positive rate depends on the size of the filter and the number of hash functions
  • Time Complexity: O(k) for add and contains operations, where k is the number of hash functions
  • Space Complexity: O(m) where m is the size of the bit array
ArraysHash MapAmazonGoogle
Editorial

Login to view the detailed solution and explanation for this problem.

My Submissions
All Solutions
Lang Status Date Code
You do not have any submissions for this problem.
User Lang Status Date Code
No submissions found.

Please Login to continue
Solve Problems

 
 
 
Output Window

Don't have an account? Register

Solution Hints

  • Use a bit array (or a list of booleans) to represent the Bloom filter
  • Implement multiple hash functions using combinations of existing hash functions
  • Handle different types of input objects by converting them to strings first
  • Choose the size and number of hash functions based on the expected number of elements and desired false positive rate
  • Consider using the mmh3 library for efficient hash functions

Steps to solve by this approach:

 Step 1: Initialize the BloomFilter with a bit array and specified number of hash functions
 Step 2: Implement hash functions that generate different hash values for each item
 Step 3: Create the add method that sets bits in the array based on hash values
 Step 4: Implement the contains method that checks if all relevant bits are set
 Step 5: Handle different input types by converting them to strings before hashing

Submitted Code :