Overview of Concurrency Control and Recovery in Distributed Databases


In a distributed DBMS environment, several challenges arise in concurrency control and recovery, which are not present in a centralized DBMS environment. This article will discuss these challenges and their potential solutions.

Multiple Copies of Data Items

Dealing with multiple copies of data items is a significant challenge in distributed DBMS environments. The concurrency control method must keep these copies consistent with one another, and the recovery method must bring a copy back in line with the others if the site storing it fails and later recovers.

Failure of Individual Sites

When one or more sites fail, the distributed DBMS should continue to operate with its running sites if possible. When a site recovers, its local database must be brought up to date with the rest of the sites before it rejoins the system.

Failure of Communication Links

The system must be able to deal with the failure of one or more of the communication links that connect the sites. Network partitioning may occur, breaking up the sites into two or more partitions, where the sites within each partition can communicate only with each other.
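One way to picture network partitioning is as the connected components of the graph formed by the links that are still working. The following is a minimal sketch under that assumption; the site names, the link list, and the `partitions` function are illustrative and not taken from any particular system.

```python
from collections import defaultdict


def partitions(sites, working_links):
    """Group sites into partitions: sets of sites that can still reach each other."""
    neighbours = defaultdict(set)
    for a, b in working_links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, result = set(), []
    for site in sites:
        if site in seen:
            continue
        # Depth-first search over the links that are still up.
        group, stack = set(), [site]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        result.append(group)
    return result


sites = ["A", "B", "C", "D"]
all_links = [("A", "B"), ("B", "C"), ("C", "D")]
# Assume the B-C link fails: sites A and B can no longer reach C and D.
after_failure = partitions(sites, [link for link in all_links if link != ("B", "C")])
```

With all links up, the function returns a single group containing every site; after the assumed B-C failure it returns two groups whose members can communicate only among themselves.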

Distributed Commit

Problems can arise with committing a transaction that is accessing databases stored on multiple sites if some sites fail during the commit process. The two-phase commit protocol is often used to deal with this problem.

Distributed Deadlock

Deadlock may occur among several sites, so techniques for dealing with deadlocks must be extended to take this into account.
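To see why single-site techniques are not enough, consider wait-for graphs. The sketch below assumes one possible approach, which the article does not name: each site reports its local wait-for graph, and a detector merges them and looks for a cycle. The functions `has_cycle` and `merge` and the transaction names are hypothetical.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph (txn -> set of txns it waits on) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in wait_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:
                return True  # back edge: a cycle of waiting transactions
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))


def merge(*local_graphs):
    """Union the local wait-for graphs reported by individual sites."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_on in graph.items():
            merged.setdefault(txn, set()).update(waits_on)
    return merged


site1 = {"T1": {"T2"}}  # at site 1, T1 waits for a lock held by T2
site2 = {"T2": {"T1"}}  # at site 2, T2 waits for a lock held by T1
```

Neither local graph contains a cycle, so each site on its own sees no deadlock; only the merged graph reveals that T1 and T2 are waiting on each other.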

Distributed Concurrency Control and Recovery Techniques

Distributed concurrency control and recovery techniques must deal with the challenges described above, among others. The rest of this article reviews some of the techniques that have been suggested for DDBMSs: concurrency control based on a distinguished copy of a data item, concurrency control based on voting, and distributed recovery.

Distributed Concurrency Control Based on a Distinguished Copy of a Data Item

In distributed databases, replicated data items pose a challenge for concurrency control. To address this issue, several concurrency control techniques have been proposed that extend the concurrency control methods used for centralized databases. These techniques involve designating a particular copy of each data item as a distinguished copy, with locks for the data item associated with the distinguished copy. All locking and unlocking requests are sent to the site that contains the distinguished copy.
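The routing rule can be sketched as follows. This is a simplified illustration, not a production lock manager: the `LockManager` class, the `distinguished_site` placement table, and the site names are assumptions, and conflicting requests are simply refused rather than queued.

```python
class LockManager:
    """Lock table kept at one site for the distinguished copies stored there."""

    def __init__(self):
        self.holders = {}  # item -> (mode, set of transaction ids)

    def lock(self, item, txn, mode):
        """Try to grant a 'read' (shared) or 'write' (exclusive) lock."""
        if item not in self.holders:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = self.holders[item]
        if mode == "read" and held_mode == "read":
            txns.add(txn)
            return True
        return False  # conflict; a real system would queue the request or abort

    def unlock(self, item, txn):
        _, txns = self.holders.get(item, (None, set()))
        txns.discard(txn)
        if not txns:
            self.holders.pop(item, None)


# Hypothetical placement: which site holds the distinguished copy of each item.
distinguished_site = {"X": "site1", "Y": "site2"}
managers = {"site1": LockManager(), "site2": LockManager()}


def request_lock(item, txn, mode):
    # Every request for `item` goes to the one site holding its distinguished copy.
    return managers[distinguished_site[item]].lock(item, txn, mode)


def release_lock(item, txn):
    managers[distinguished_site[item]].unlock(item, txn)
```

Because only one site decides for each item, the centralized locking rules apply unchanged at that site; the other copies of the item hold no lock state at all.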

Different methods have been proposed for choosing the distinguished copies, including the primary site technique, primary site with backup site, and primary copy technique. In the primary site technique, all distinguished copies are kept at a single primary site, which acts as the coordinator site for concurrency control. However, this approach has certain disadvantages, such as overloading the primary site with locking requests and causing system bottlenecks. Failure of the primary site also paralyzes the system, limiting reliability and availability.

The primary site with backup site approach addresses primary site failure by designating a second site as a backup. All locking information is maintained at both the primary and the backup site, and the backup takes over as the primary site if the primary fails. The primary copy technique instead distributes the load of lock coordination by storing the distinguished copies of different data items at different sites. Failure of one site then affects only those transactions that need locks on items whose primary copies reside at that site.
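The difference in failure impact between the primary site and primary copy techniques can be illustrated with a small sketch. The item names, site names, and the `blocked_items` helper are hypothetical, and the layout of primary copies is an assumed example.

```python
def blocked_items(placement, failed_site):
    """Items whose locks cannot be requested because their coordinating site is down."""
    return {item for item, site in placement.items() if site == failed_site}


items = ["X", "Y", "Z"]
# Primary site technique: one site coordinates the locks for every item.
primary_site = {item: "site1" for item in items}
# Primary copy technique: distinguished copies spread over several sites (assumed layout).
primary_copy = {"X": "site1", "Y": "site2", "Z": "site3"}
```

If `site1` fails, every item is blocked under the first placement, while under the second only `X` is affected.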

If the coordinator site fails and no backup site exists, or if both the primary and backup sites are down, the remaining sites must choose a new coordinator through an election, which is a fairly complex algorithm. Any site that repeatedly tries and fails to communicate with the coordinator can assume that the coordinator is down and start the election by proposing itself to the running sites as the new coordinator. As soon as it receives a majority of yes votes, it declares itself the new coordinator.
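The majority rule at the heart of the election can be sketched as below. This is a deliberately simplified illustration: the `elect` function and the `agrees` callback (standing in for a network round trip) are hypothetical, the candidate is assumed to vote for itself, and the majority is assumed to be counted over all sites in the system rather than only the running ones. Competing candidates and failures during the election are ignored.

```python
def elect(candidate, all_sites, reachable, agrees):
    """Return True if `candidate` collects yes votes from a majority of all sites."""
    needed = len(all_sites) // 2 + 1
    yes = 1  # assumption: the candidate votes for itself
    for site in all_sites:
        if site == candidate or site not in reachable:
            continue  # no vote can arrive from a site that cannot be reached
        if agrees(site, candidate):
            yes += 1
        if yes >= needed:
            return True  # declare victory as soon as a majority is reached
    return yes >= needed


all_sites = ["S1", "S2", "S3", "S4", "S5"]  # S1 is the failed coordinator


def always_yes(site, candidate):
    return True
```

With five sites, three yes votes are needed, so a candidate that can reach only one other site cannot declare itself coordinator under this assumption.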

Distributed Concurrency Control Based on Voting

The voting method differs from the distinguished copy methods in that no single copy maintains the locks for a replicated data item. Instead, a lock request is sent to all sites that hold a copy of the data item; each copy maintains its own lock and can grant or deny the request.

If a transaction requesting a lock is granted the lock by a majority of the copies, it holds the lock and informs all copies that it has been granted the lock. However, if a transaction does not receive a majority of votes granting it a lock within a certain time-out period, it cancels its request and informs all sites of the cancellation.
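The majority rule described above can be sketched as follows. The `Copy` class, the `request_lock` function, and the `unreachable` set (modelling sites that do not answer before the time-out) are hypothetical names for illustration; only exclusive locks are modelled, and the messages informing all copies of the outcome are reduced to a comment and a release loop.

```python
class Copy:
    """One replica of a data item, keeping its own lock."""

    def __init__(self):
        self.locked_by = None

    def vote(self, txn):
        # Grant the lock if this copy is free or already locked by the same transaction.
        if self.locked_by in (None, txn):
            self.locked_by = txn
            return True
        return False

    def release(self, txn):
        if self.locked_by == txn:
            self.locked_by = None


def request_lock(txn, copies, unreachable=frozenset()):
    """Ask every copy for the lock; succeed only with a majority of grants."""
    granted = [site for site, copy in copies.items()
               if site not in unreachable and copy.vote(txn)]
    if len(granted) > len(copies) / 2:
        return True  # the transaction would now inform all copies of the grant
    # No majority before the time-out: cancel and give back any votes received.
    for site in granted:
        copies[site].release(txn)
    return False


copies = {site: Copy() for site in ("S1", "S2", "S3")}
```

Note that the majority is taken over all copies, so a transaction that can reach only one of three copies is refused even if that copy grants the lock.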

The voting method is considered a truly distributed concurrency control method, as the responsibility for a decision resides with all the sites involved. Simulation studies have shown that voting has higher message traffic among sites than do the distinguished copy methods. If the algorithm takes into account possible site failures during the voting process, it becomes extremely complex.

Distributed Recovery

Distributed recovery raises several issues of its own. One major difficulty is determining whether a site is down, which often requires exchanging messages with other sites. For example, if site X sends a message to site Y and receives no response, X cannot easily tell which of three things happened: the message was never delivered because of a communication failure, site Y is down and could not respond, or site Y responded but the response was not delivered.
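The ambiguity can be made concrete with a toy model. The function `observed_by_x` and its three boolean flags are hypothetical; the sketch only shows that three different failures produce the same observation at site X.

```python
def observed_by_x(request_delivered, y_is_up, reply_delivered):
    """What site X sees once its time-out expires."""
    got_reply = request_delivered and y_is_up and reply_delivered
    return "reply" if got_reply else "timeout"


scenarios = {
    "request lost on the way to Y": (False, True, True),
    "Y is down": (True, False, True),
    "Y's reply lost on the way back": (True, True, False),
}
outcomes = {name: observed_by_x(*flags) for name, flags in scenarios.items()}
```

Every entry in `outcomes` is `"timeout"`, so a time-out alone does not tell X whether Y has failed.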

Another significant problem in distributed recovery is distributed commit. When a transaction modifies data at multiple sites, it cannot commit until it is sure that its effects at every site cannot be lost. To achieve this, each site must first record the local effects of the transaction permanently in its local log on disk. The two-phase commit protocol is often used to ensure the correctness of distributed commit.
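The basic shape of two-phase commit can be sketched as below. This is a minimal, failure-free illustration: the `Participant` class, its in-memory `log` list (standing in for the site's log on disk), and the `two_phase_commit` function are hypothetical, and the time-outs and recovery steps that a real protocol needs are omitted.

```python
class Participant:
    """A site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # stands in for the site's local log on disk

    def prepare(self, txn):
        # Phase 1: record local effects durably before voting yes.
        if self.can_commit:
            self.log.append(("ready", txn))
            return True
        return False

    def finish(self, txn, decision):
        # Phase 2: apply the coordinator's global decision.
        self.log.append((decision, txn))


def two_phase_commit(txn, participants):
    """Coordinator logic: commit only if every participant votes yes."""
    votes = [p.prepare(txn) for p in participants]
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        p.finish(txn, decision)
    return decision


a, b = Participant("site1"), Participant("site2")
c = Participant("site3", can_commit=False)
```

Calling `two_phase_commit("T1", [a, b])` yields a commit, while including `c` forces every participant to abort, which is what keeps the outcome the same at all sites.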

Conclusion

In conclusion, distributed DBMS environments pose several challenges in concurrency control and recovery that are not present in centralized DBMS environments. These challenges include dealing with multiple copies of data items, site failures, communication link failures, distributed commits, and distributed deadlock. To address these challenges, effective distributed concurrency control and recovery techniques must be implemented. The article reviewed some of the suggested techniques, such as distributed concurrency control based on a distinguished copy of a data item, distributed concurrency control based on voting, and distributed recovery. By implementing these techniques, a distributed DBMS can provide a reliable and robust solution for organizations dealing with large amounts of data.

Updated on: 18-May-2023
