Introduction to Disjoint Set Data Structure or Union-Find Algorithm


The disjoint set data structure, also known as the Union-Find algorithm, is a fundamental concept in computer science that provides an efficient way to solve problems related to partitioning and connectivity. It is especially useful for problems that involve sets of elements and determining how those elements are connected. In this article, we explore the syntax, the algorithm, and two different approaches to implementing the disjoint set data structure in C++. We also provide fully executable code examples to illustrate these approaches.

Syntax

Before diving into the algorithm, let's familiarize ourselves with the syntax used in the following code examples. Note that DisjointSet is the class defined in those examples, not a standard library type −

// Create a disjoint set
DisjointSet ds(n);

// Perform union operation
ds.unionSets(a, b);

// Find the representative of a set
ds.findSet(x);

Algorithm

The disjoint set data structure is useful for managing a collection of non-overlapping sets. Each set is identified by a representative element. Initially, every element forms its own single-element set and is its own representative. The two primary operations on disjoint sets are union and find.

Union operation

  • Find the representatives of the two sets to be merged.

  • If the representatives are different, make one representative point to the other, effectively merging the sets.

  • If the representatives are the same, the sets are already merged, and no further action is needed.

Find operation

  • Given an element, find the representative of the set it belongs to.

  • Follow the parent pointers until reaching the representative.

  • Return the representative as the result.
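The two operations above can be sketched in their most basic form, before any optimization is added. This is a minimal illustration, not an implementation from this article: the class name NaiveDisjointSet is hypothetical, and the choice to always attach the second root under the first is an arbitrary assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical name: a bare-bones disjoint set with no rank, size,
// or path compression, shown only to mirror the steps listed above.
class NaiveDisjointSet {
   std::vector<int> parent;  // parent[i] == i means i is a representative

public:
   explicit NaiveDisjointSet(int n) : parent(n) {
      for (int i = 0; i < n; ++i)
         parent[i] = i;  // every element starts as its own set
   }

   // Follow parent pointers until reaching the representative.
   int findSet(int x) const {
      assert(x >= 0 && x < static_cast<int>(parent.size()));
      while (parent[x] != x)
         x = parent[x];
      return x;
   }

   // Make one representative point to the other if they differ.
   void unionSets(int a, int b) {
      int aRoot = findSet(a);
      int bRoot = findSet(b);
      if (aRoot != bRoot)
         parent[bRoot] = aRoot;  // assumption: always attach b's root under a's
   }
};
```

With this version, an unlucky sequence of unions can build a long chain of parent pointers, so a single find may take time proportional to the number of elements. The two approaches that follow are designed to avoid that.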

Approach 1: Union by Rank and Path Compression

One efficient approach to implementing the disjoint set data structure is using the Union by Rank and Path Compression technique.

In this approach, the representative of each set has an associated rank, initially set to 0. The rank is an upper bound on the height of the tree that represents the set.

When two sets are merged, the root with the lower rank is attached under the root with the higher rank, and the rank of the resulting root does not change. If both roots have the same rank, either one may become the new root, and only in that case does its rank increase by 1. In addition, path compression speeds up later find operations: during a find, every node visited is re-pointed directly at the representative, which flattens the tree. With both techniques combined, the amortized cost per operation is nearly constant, O(α(n)), where α is the inverse Ackermann function.

Example

#include <iostream>
#include <vector>

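// Disjoint set with union by rank and path compression.
class DisjointSet {
   std::vector<int> parent;  // parent[i] == i means i is a representative
   std::vector<int> rank;    // upper bound on the height of the tree rooted at i

public:
   DisjointSet(int n) {
      parent.resize(n);
      rank.resize(n, 0);
      for (int i = 0; i < n; ++i)
         parent[i] = i;  // every element starts as its own set
   }

   // Returns the representative of x's set, compressing the path on the way back.
   int findSet(int x) {
      if (parent[x] != x)
         parent[x] = findSet(parent[x]);
      return parent[x];
   }

   // Merges the sets containing x and y, if they differ.
   void unionSets(int x, int y) {
      int xRoot = findSet(x);
      int yRoot = findSet(y);

      if (xRoot == yRoot)
         return;  // already in the same set

      // Attach the lower-rank root under the higher-rank root.
      if (rank[xRoot] < rank[yRoot])
         parent[xRoot] = yRoot;
      else if (rank[xRoot] > rank[yRoot])
         parent[yRoot] = xRoot;
      else {
         // Equal ranks: pick xRoot as the new root and increase its rank.
         parent[yRoot] = xRoot;
         rank[xRoot]++;
      }
   }
};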

int main() {
   // Example usage of DisjointSet
   int n = 5;  // Number of elements

   DisjointSet ds(n);

   ds.unionSets(0, 1);
   ds.unionSets(2, 3);
   ds.unionSets(3, 4);

   std::cout << ds.findSet(0) << std::endl;  // representative of {0, 1}
   std::cout << ds.findSet(2) << std::endl;  // representative of {2, 3, 4}

   return 0;
}

Output

0
2
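The recursive findSet above uses one stack frame per node on the path, which can matter for very deep trees. As an alternative, the same path compression can be sketched iteratively in two passes. The free function findSetIterative and its signature are assumptions made for illustration; it operates on a parent vector laid out like the one in the class above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: iterative find with full path compression.
// Assumes parent[i] == i marks a representative, as in the class above.
int findSetIterative(std::vector<int>& parent, int x) {
   assert(x >= 0 && x < static_cast<int>(parent.size()));

   // Pass 1: walk up to the representative.
   int root = x;
   while (parent[root] != root)
      root = parent[root];

   // Pass 2: re-point every node on the path directly at the representative.
   while (parent[x] != root) {
      int next = parent[x];
      parent[x] = root;
      x = next;
   }
   return root;
}
```

For a chain such as parent = {0, 0, 1, 2, 3}, calling findSetIterative(parent, 4) returns 0 and leaves every element on the path pointing directly at 0.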

Approach 2: Union by Size and Path Compression

Another approach to the disjoint set data structure is using the Union by Size and Path Compression technique.

  • In this approach, each set has an associated size, initially set to 1.

  • During the union operation, the smaller set is merged into the larger set.

  • The size of the resulting set is updated accordingly.

  • Path compression is applied during the find operation to flatten the tree structure, similar to the previous approach.

Example

#include <iostream>
#include <vector>

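// Disjoint set with union by size and path compression.
class DisjointSet {
   std::vector<int> parent;  // parent[i] == i means i is a representative
   std::vector<int> size;    // size[r] is the element count of the set rooted at r

public:
   DisjointSet(int n) {
      parent.resize(n);
      size.resize(n, 1);
      for (int i = 0; i < n; ++i)
         parent[i] = i;  // every element starts as its own set
   }

   // Returns the representative of x's set, compressing the path on the way back.
   int findSet(int x) {
      if (parent[x] != x)
         parent[x] = findSet(parent[x]);
      return parent[x];
   }

   // Merges the sets containing x and y, if they differ.
   void unionSets(int x, int y) {
      int xRoot = findSet(x);
      int yRoot = findSet(y);

      if (xRoot == yRoot)
         return;  // already in the same set

      // Attach the smaller set under the larger one and update the size.
      if (size[xRoot] < size[yRoot]) {
         parent[xRoot] = yRoot;
         size[yRoot] += size[xRoot];
      }
      else {
         // Ties also land here: xRoot becomes the new root.
         parent[yRoot] = xRoot;
         size[xRoot] += size[yRoot];
      }
   }
};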

int main() {
   // Example usage of DisjointSet
   int n = 5;  // Number of elements

   DisjointSet ds(n);

   ds.unionSets(0, 1);
   ds.unionSets(2, 3);
   ds.unionSets(3, 4);

   std::cout << ds.findSet(0) << std::endl;  // representative of {0, 1}
   std::cout << ds.findSet(2) << std::endl;  // representative of {2, 3, 4}
   return 0;
}

Output

0
2
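Because the size-based version already stores how many elements each set holds, it extends naturally to the connectivity questions mentioned in the introduction. The sketch below is one possible extension, not part of the article's original class: the name TrackedDisjointSet, the methods connected, setSize, and componentCount, and the bool return value of unionSets are all assumptions added for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical extension of the size-based approach that also answers
// connectivity and set-count queries.
class TrackedDisjointSet {
   std::vector<int> parent;
   std::vector<int> size;
   int components;  // number of disjoint sets currently present

public:
   explicit TrackedDisjointSet(int n) : parent(n), size(n, 1), components(n) {
      for (int i = 0; i < n; ++i)
         parent[i] = i;
   }

   int findSet(int x) {
      assert(x >= 0 && x < static_cast<int>(parent.size()));
      if (parent[x] != x)
         parent[x] = findSet(parent[x]);  // path compression
      return parent[x];
   }

   // Returns true if two different sets were merged, false otherwise.
   bool unionSets(int x, int y) {
      int xRoot = findSet(x);
      int yRoot = findSet(y);
      if (xRoot == yRoot)
         return false;
      if (size[xRoot] < size[yRoot])
         std::swap(xRoot, yRoot);  // make xRoot the larger set
      parent[yRoot] = xRoot;
      size[xRoot] += size[yRoot];
      --components;
      return true;
   }

   bool connected(int x, int y) { return findSet(x) == findSet(y); }
   int setSize(int x) { return size[findSet(x)]; }
   int componentCount() const { return components; }
};
```

After the same three unions used in the example above, (0, 1), (2, 3), and (3, 4), componentCount() would report 2, connected(2, 4) would be true, connected(0, 2) would be false, and setSize(4) would be 3.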

Conclusion

The disjoint set data structure, or Union-Find algorithm, is a powerful tool for solving problems involving sets and connectivity. This article covered its syntax and algorithm in C++, along with two implementation approaches: Union by Rank with Path Compression, and Union by Size with Path Compression. By understanding and implementing these approaches, you can efficiently solve a wide range of problems that require tracking disjoint sets.

Updated on: 25-Jul-2023
