What are collision avoidance techniques (DBMS)?


A collision is a problem that occurs when two different keys are mapped by the hash function to the same location in the hash table.
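As a minimal illustration, the sketch below shows two keys colliding. It assumes the remainder-based hash function k % 5 that is used later in the chaining example; the function name h and the table size of 5 are only illustrative.

```python
def h(key, table_size=5):
    """Hypothetical hash function: remainder of the key divided by the table size."""
    return key % table_size


# 44 and 29 are different keys, yet both map to location 4: a collision.
print(h(44), h(29))  # 4 4
```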

Two techniques are commonly used to handle collisions. They are −

  • Linear probing.
  • Chaining.

Let us discuss each technique in detail.

Linear probing

Linear probing is a strategy for resolving collisions. With this strategy, the new key is placed in the closest following empty cell.

Each element is stored at the location that the hash function maps it to. If that cell is already filled, the following consecutive locations are searched until an empty one is found, and the value is stored there. The table is generally implemented as an array.

Step 1 − Let us take a table T that stores all the records in memory.

Step 2 − If a memory location T(h) is already filled, then we store the record in the next empty location.

Step 3 − To find that empty location, we apply a linear search to table T, checking T(h), T(h+1), T(h+2), … in turn.

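For example, consider the following records and their hash addresses H(k), stored in a table with 11 locations −

Record: A, B, C, D, E, X, Y, Z

H(k): 4, 8, 2, 11, 4, 11, 5, 1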

The table for linear probing is given below −

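Location   Record
1          X
2          C
3          Z
4          A
5          E
6          Y
7          (empty)
8          B
9          (empty)
10         (empty)
11         D

E collides with A at location 4 and moves to location 5. X collides with D at location 11 and wraps around to location 1. Y finds location 5 taken by E and moves to location 6. Z finds locations 1 and 2 taken and moves to location 3.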
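The example above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions, not a definitive implementation: the function name linear_probe_insert is hypothetical, locations are numbered 1 to 11 as in the table, and the search is assumed to wrap around from location 11 to location 1, which is what places X at location 1.

```python
def linear_probe_insert(table, record, h):
    """Place record at location h, or at the next empty location after it.

    Locations are numbered 1..len(table). The search wraps from the last
    location back to location 1 (an assumption, see the text above).
    """
    size = len(table)
    for step in range(size):
        location = (h - 1 + step) % size + 1  # 1-based location with wrap-around
        if table[location - 1] is None:
            table[location - 1] = record
            return location
    raise RuntimeError("hash table is full")


records = ["A", "B", "C", "D", "E", "X", "Y", "Z"]
hashes = [4, 8, 2, 11, 4, 11, 5, 1]

table = [None] * 11
for record, h in zip(records, hashes):
    linear_probe_insert(table, record, h)

# Print every location with its record (blank when the location is empty).
for location, record in enumerate(table, start=1):
    print(location, record or "")
```

Running the sketch places the records at the same locations as in the table: X, C, Z, A, E, Y in locations 1 to 6, B in location 8 and D in location 11.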

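The advantage of linear probing is that it is very fast, because consecutive probes touch neighbouring memory locations and so benefit from locality of reference.

The disadvantage is that linear probing suffers from clustering: filled cells tend to form long runs, so later insertions and searches have to probe further. Linear probing also needs five-way independence in the hash function to guarantee constant expected time per operation.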

Methods to minimize clustering

Two methods are used to minimize clustering. They are as follows −

  • Quadratic probing

Suppose the location at hash address h is already filled. We then search the memory locations with addresses h, h+1, h+4, h+9, h+16, …, h+i², … Spreading the probes out in this way reduces clustering.

  • Double hashing

The collision is resolved by hashing the hash address again. If the second hash function gives Hash(h) = h’, we search the memory locations with addresses h, h+h’, h+2h’, h+3h’, …
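The two probe sequences described above can be compared with a small Python sketch. The function names, the table size of 11, the 0-based addresses, and the values h = 4 and h’ = 3 are all assumptions chosen for illustration. The wrap-around with the modulo operator is also an assumption, since the text does not say what happens when an address runs past the end of the table.

```python
def quadratic_probe_sequence(h, table_size, probes):
    """Addresses h, h+1, h+4, h+9, ... (h + i*i), wrapped to the table size."""
    return [(h + i * i) % table_size for i in range(probes)]


def double_hash_probe_sequence(h, h2, table_size, probes):
    """Addresses h, h+h', h+2h', h+3h', ... wrapped to the table size."""
    return [(h + i * h2) % table_size for i in range(probes)]


# Hypothetical values: 11 locations (0..10), h = 4, second hash value h' = 3.
print(quadratic_probe_sequence(4, 11, 5))       # [4, 5, 8, 2, 9]
print(double_hash_probe_sequence(4, 3, 11, 5))  # [4, 7, 10, 2, 5]
```

In both cases the probes jump across the table instead of walking through neighbouring cells, which is why these methods build up fewer long runs of filled cells than linear probing.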

Advantages of double hashing

  • Double Hashing drastically reduces clustering.

  • Double Hashing requires fewer comparisons.

  • Smaller hash tables can be used.

  • Double Hashing minimizes repeated collisions as well as the effects of clustering.

Disadvantages of double hashing

  • As the hash table fills up, the performance of double hashing degrades.

  • Computing a second hash function for every collision makes the processing slower.

Chaining

Chaining is also known as the chained hash table mechanism. As the name suggests, each index of the hash table holds a pointer to the head of a linked list.

Records with the same hash address are kept in the same linked list. Each record has two parts, which are as follows −

  • Data part to store data.

  • Next part to link the records having the same hash address.

Example

The keys 26, 44, 38, 29 and 16 are stored in the hash table using the chaining method.

Here,

H(k) = k % 5

H(26) = 26 % 5 = 1

H(44) = 44 % 5 = 4

H(38) = 38 % 5 = 3

H(29) = 29 % 5 = 4

H(16) = 16 % 5 = 1

The table for chaining will be as shown below −

Index   Chain
0       (empty)
1       26 → 16 → NULL
2       (empty)
3       38 → NULL
4       44 → 29 → NULL
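The chaining example can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the names Node, chain_insert and chain_keys are hypothetical, and appending each new record at the end of its list is an assumption that matches the order shown in the table (26 before 16, 44 before 29).

```python
class Node:
    """One record: a data part and a next part linking to the following record."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


def chain_insert(table, key):
    """Append key to the end of the list stored at index key % len(table)."""
    index = key % len(table)
    node = Node(key)
    if table[index] is None:
        table[index] = node  # first record at this index becomes the head
        return
    current = table[index]
    while current.next is not None:  # walk to the last record of the chain
        current = current.next
    current.next = node


def chain_keys(head):
    """Collect the keys of one chain, from head to tail."""
    keys = []
    while head is not None:
        keys.append(head.data)
        head = head.next
    return keys


table = [None] * 5  # H(k) = k % 5, so indexes 0..4
for key in [26, 44, 38, 29, 16]:
    chain_insert(table, key)

for index, head in enumerate(table):
    print(index, chain_keys(head))
```

The printed chains are empty for indexes 0 and 2, [26, 16] for index 1, [38] for index 3 and [44, 29] for index 4, which agrees with the table.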

Advantages of Chaining

The advantages of chaining are as follows −

  • Chained hash tables remain effective even when many keys are stored at the same shared location.

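  • Collisions are contained within a single list, so they do not spill over into neighbouring locations.

  • Performance degrades gradually as more keys are added, rather than sharply as the table fills up.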

Disadvantages of Chaining

The disadvantages of chaining are as follows −

  • More keys are stored, since the chained hash table has to keep a separate key with every record.

  • Space overhead, since every record carries an extra next part.

  • All the disadvantages of linked lists also apply to chained hash tables, since they use linked-list logic.

raja
Published on 08-Jul-2021 07:50:11