Three performance metrics of a Bloom filter can be traded off against one another: computation time (determined by the number k of hash functions), filter size (the number m of bits), and probability of error (the false positive rate $f = (1 - p)^k$, where p here is the probability that a given bit is still 0).
The Bloom filter (BF) tolerates a small error rate in exchange for faster lookups and better space efficiency. A membership query returns either true or false, so every result falls into one of four classes: true positive, false positive, true negative, or false negative. Most of the errors a Bloom filter makes are false positives. Both false positives and false negatives add overhead to a system. The Bloom filter uses a bit array to store information about its elements. A false positive occurs when the filter returns true for an element that is not in the set. A false negative occurs when the filter returns false for an element that is in the set. Because its answers can be wrong with some probability, the Bloom filter is a probabilistic data structure.
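The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the class name `BloomFilter`, the use of salted SHA-256 to derive the k positions, and the sizes m = 1024 and k = 3 are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch with m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        # One byte per bit keeps the sketch readable; a real filter would pack bits.
        self.bits = bytearray(m)

    def _positions(self, item: str):
        # Assumption: k hash functions are simulated by salting SHA-256 with an index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # True only if every one of the k positions is set.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=3)
for word in ("apple", "banana", "cherry"):
    bf.add(word)

print("apple" in bf)   # True: the element was inserted (true positive)
print("durian" in bf)  # most likely False; True here would be a false positive
```

Because this sketch only ever sets bits and never clears them, an inserted element is always reported as present, so it cannot produce a false negative.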
Bloom filter size and number of hash functions
If the Bloom filter is too small, all of its bits will soon be set to 1, and the filter will then return true for every queried value, so every query for an element that was never inserted becomes a false positive. The size of the filter is therefore an important design decision: a larger filter produces fewer false positives, and a smaller one produces more.
So the size of the Bloom filter should be chosen according to the false positive rate we are willing to accept.
Another important parameter is the number of hash functions. The more hash functions we use, the slower the Bloom filter becomes and the more quickly it fills up. With too few, however, we may suffer from many false positives.
We can compute the false positive rate, p, from the size of the filter, m, the number of hash functions, k, and the number of elements inserted, n, with the approximation
$$p \approx \left(1 - e^{-kn/m}\right)^k$$
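A short numerical sketch of this relationship follows, assuming the standard approximation (1 - e^(-kn/m))^k for the false positive rate. The function name `false_positive_rate` and the values of m and n are hypothetical, chosen only for illustration.

```python
import math


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Approximate false positive rate for m bits, k hashes, n inserted elements."""
    return (1.0 - math.exp(-k * n / m)) ** k


m, n = 10_000, 1_000  # illustrative sizes: 10 bits per element
for k in (1, 3, 7, 14, 28):
    print(f"k={k:2d}  p~{false_positive_rate(m, k, n):.5f}")
```

With these numbers the rate first falls as k grows and then rises again, which matches the point made earlier: too few hash functions give many false positives, while too many fill the filter too quickly.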
In practice we mostly need to determine m and k. If we fix an error tolerance p and the expected number of elements n ourselves, we can use the following formulas to calculate these parameters:
$$m = -\frac{n \ln p}{(\ln 2)^2}$$
$$k = \frac{m}{n} \ln 2$$
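As a worked example, the sizing step can be sketched in Python. The function name `optimal_parameters` is hypothetical, and the companion formula k = (m/n) ln 2 for the number of hash functions is the standard one, assumed here rather than taken from the text above. Rounding m up and k to the nearest integer is also a choice made for this sketch.

```python
import math
from typing import Tuple


def optimal_parameters(n: int, p: float) -> Tuple[int, int]:
    """Return (m, k) for n expected elements and target false positive rate p."""
    if n <= 0 or not 0.0 < p < 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    # m = -(n ln p) / (ln 2)^2, rounded up to a whole number of bits.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # Assumed companion formula: k = (m / n) ln 2, at least one hash function.
    k = max(1, round((m / n) * math.log(2)))
    return m, k


m, k = optimal_parameters(n=1_000_000, p=0.01)
print(f"bits: {m}, hash functions: {k}, bits per element: {m / n_ if (n_ := 1_000_000) else 0:.2f}")
```

For one million elements and a 1% target rate this gives roughly 9.6 bits per element and about 7 hash functions.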