Graph Theory - Tarjan's Algorithm



Tarjan's Algorithm

Tarjan's Algorithm is used to find strongly connected components (SCCs) in a directed graph. A strongly connected component of a directed graph is a maximal subset of vertices such that every vertex is reachable from every other vertex within the subset.
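The definition above can be checked directly, if inefficiently, by grouping vertices that can reach each other. The following is a minimal sketch for illustration only; the helper names reachable_from and brute_force_sccs are hypothetical, and the graph is assumed to be an adjacency list indexed by vertex number. It is not Tarjan's algorithm, just a slow baseline that mirrors the definition.

```python
def reachable_from(graph, start):
    """Return the set of vertices reachable from start (including start)."""
    seen = {start}
    frontier = [start]
    while frontier:
        v = frontier.pop()
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return seen


def brute_force_sccs(graph):
    """Group vertices by mutual reachability in O(V * (V + E)) time."""
    reach = [reachable_from(graph, v) for v in range(len(graph))]
    assigned = set()
    components = []
    for v in range(len(graph)):
        if v in assigned:
            continue
        # u shares v's SCC exactly when each can reach the other.
        component = sorted(u for u in reach[v] if v in reach[u])
        components.append(component)
        assigned.update(component)
    return components


graph = [[1], [2], [0, 3], [4], []]
print(brute_force_sccs(graph))  # [[0, 1, 2], [3], [4]]
```

A baseline like this is mainly useful for cross-checking a faster implementation on small graphs.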

This algorithm uses depth-first search (DFS) to identify SCCs. It keeps a stack of vertices that have been visited but not yet assigned to an SCC, and it maintains a discovery index and a low-link value for each vertex to track connectivity.

Overview of Tarjan's Algorithm

Tarjan's algorithm identifies SCCs in a directed graph by using a depth-first search to explore the graph. It uses the following key concepts −

  • DFS Index: Each vertex is assigned an index that represents the order of its discovery during DFS.
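  • Low-Link Value: Each vertex is associated with a "low-link value," which is the smallest DFS index among the vertices still on the stack that can be reached from that vertex through its DFS subtree followed by at most one further edge. It starts out equal to the vertex's own index.
  • Stack: A stack holds the vertices that have been visited but not yet assigned to an SCC. This includes the current DFS path as well as vertices from finished subtrees whose SCC is not complete yet.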

The algorithm uses these properties to identify SCCs as the DFS progresses. When the DFS finishes exploring a vertex whose low-link value still equals its DFS index, that vertex is the root of an SCC: vertices are popped from the stack, up to and including the root, and together they form the SCC.
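To make the index and low-link values concrete, the sketch below runs the DFS on a small graph and prints both values for every vertex. The function name trace_low_links and the sample graph are assumptions made for this illustration; the graph is an adjacency list in which graph[v] lists the successors of v.

```python
def trace_low_links(graph):
    """Run Tarjan's DFS and return (index, low_link) for inspection."""
    n = len(graph)
    index = [None] * n
    low_link = [None] * n
    on_stack = [False] * n
    stack = []
    next_index = 0

    def dfs(v):
        nonlocal next_index
        index[v] = low_link[v] = next_index
        next_index += 1
        stack.append(v)
        on_stack[v] = True
        for w in graph[v]:
            if index[w] is None:
                dfs(w)
                low_link[v] = min(low_link[v], low_link[w])
            elif on_stack[w]:
                low_link[v] = min(low_link[v], index[w])
        if low_link[v] == index[v]:
            # v is the root of an SCC: pop down to and including v.
            while True:
                w = stack.pop()
                on_stack[w] = False
                if w == v:
                    break

    for v in range(n):
        if index[v] is None:
            dfs(v)
    return index, low_link


graph = [[1], [2], [0, 3], [4], []]
index, low_link = trace_low_links(graph)
for v in range(len(graph)):
    role = "SCC root" if index[v] == low_link[v] else ""
    print(v, index[v], low_link[v], role)
```

For this graph, vertices 0, 1, and 2 all end with low-link 0, while vertices 3 and 4 keep low-link values equal to their own indices, so 0, 3, and 4 are the SCC roots.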

Properties of Tarjan's Algorithm

Tarjan's algorithm has several important properties, including −

  • Efficient Time Complexity: The algorithm runs in O(V + E) time, where V is the number of vertices and E is the number of edges.
  • Single DFS Pass: The algorithm requires only a single DFS traversal of the graph to find all SCCs.
  • Stack-Based Exploration: The stack tracks every vertex that has been visited but not yet assigned to an SCC, so a finished SCC can be removed in one run of pops.
  • Low-Link Value Calculation: The low-link values provide critical information about the connectivity of vertices, allowing for SCC detection.

Steps of Tarjan's Algorithm

Let us break down the steps of Tarjan's Algorithm in detail −

Initialization

Before starting the DFS traversal, initialize data structures to track DFS indices, low-link values, stack membership, and the stack itself. No separate "visited" array is needed: a vertex whose index is still None has not been visited yet. The stack holds the vertices that have been visited but not yet assigned to an SCC.

def initialize_tarjan(graph):
   n = len(graph)
   index = [None] * n
   low_link = [None] * n
   on_stack = [False] * n
   stack = []
   sccs = []
   return index, low_link, on_stack, stack, sccs

In the above code, the "initialize_tarjan" function prepares the required data structures for the algorithm. Each vertex's index and low-link value are initialized to None, and the "on_stack" array tracks whether a vertex is in the stack.
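The data structures above assume that the graph is an adjacency list whose vertices are numbered 0 to n - 1. If the input is an edge list instead, it can be converted first. The helper below is a hypothetical sketch of that conversion, not part of the algorithm itself.

```python
def build_adjacency_list(num_vertices, edges):
    """Convert directed (u, v) pairs into a list of successor lists."""
    graph = [[] for _ in range(num_vertices)]
    for u, v in edges:
        graph[u].append(v)  # Directed edge u -> v
    return graph


edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
graph = build_adjacency_list(5, edges)
print(graph)  # [[1], [2], [0, 3], [4], []]
```

Passing the vertex count explicitly keeps isolated vertices, such as vertex 4 here, in the graph even though they have no outgoing edges.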

Depth-First Search (DFS)

The algorithm uses DFS to explore the graph. Each vertex is assigned a unique index during its first visit, and its low-link value is calculated based on its neighbors. Vertices are pushed onto the stack as they are visited.

def tarjan_dfs(v, graph, index, low_link, stack, on_stack, sccs, counter):
   # counter is a shared itertools.count(), so every vertex gets a distinct index
   index[v] = low_link[v] = next(counter)
   stack.append(v)
   on_stack[v] = True

   for neighbor in graph[v]:
      if index[neighbor] is None:  # Neighbor not visited: recurse, then inherit its low-link
         tarjan_dfs(neighbor, graph, index, low_link, stack, on_stack, sccs, counter)
         low_link[v] = min(low_link[v], low_link[neighbor])
      elif on_stack[neighbor]:  # Neighbor is on the stack, so its SCC is not finished yet
         low_link[v] = min(low_link[v], index[neighbor])

The "tarjan_dfs" function recursively explores neighbors of the current vertex. If a neighbor has not been visited, it continues the DFS and then takes over the neighbor's low-link value if it is smaller. If a neighbor is already on the stack, it updates the low-link value of the current vertex with the neighbor's index. Neighbors that were visited but are no longer on the stack belong to an SCC that is already complete and are ignored. The index counter is passed as a shared iterator rather than a plain integer: an integer argument would be copied into each call, so sibling subtrees would reuse the same indices and vertices could end up in the wrong SCC.

Detect Strongly Connected Components

After completing the DFS for a vertex, the algorithm checks whether the vertex is the root of an SCC. If the vertex's index equals its low-link value, an SCC is detected, and the vertices are popped from the stack to form the SCC. This check is the final part of the "tarjan_dfs" function body, placed after the loop over the neighbors.

   if low_link[v] == index[v]:  # v is the root of an SCC
      scc = []
      while True:
         w = stack.pop()
         on_stack[w] = False
         scc.append(w)
         if w == v:  # Stop once the root itself has been popped
            break
      sccs.append(scc)

The above code identifies an SCC by popping vertices from the stack until the root vertex is reached. The SCC is then added to the list of SCCs.

Main Function

The main function iterates through all vertices of the graph and performs DFS on unvisited vertices. It collects all SCCs in the process.

from itertools import count

def tarjan_scc(graph):
   index, low_link, on_stack, stack, sccs = initialize_tarjan(graph)
   counter = count()  # Shared source of DFS indices: 0, 1, 2, ...

   for v in range(len(graph)):
      if index[v] is None:
         tarjan_dfs(v, graph, index, low_link, stack, on_stack, sccs, counter)

   return sccs

The "tarjan_scc" function initializes the data structures and iterates through all vertices, invoking DFS for unvisited vertices. It returns the list of SCCs in the graph.

Complete Python Implementation

Following is the complete Python implementation of Tarjan's algorithm −

from itertools import count

def initialize_tarjan(graph):
   n = len(graph)
   index = [None] * n       # DFS discovery index; None means not visited
   low_link = [None] * n
   on_stack = [False] * n
   stack = []               # Visited vertices not yet assigned to an SCC
   sccs = []
   return index, low_link, on_stack, stack, sccs

def tarjan_dfs(v, graph, index, low_link, stack, on_stack, sccs, counter):
   # counter is shared by all calls, so every vertex gets a distinct index
   index[v] = low_link[v] = next(counter)
   stack.append(v)
   on_stack[v] = True

   for neighbor in graph[v]:
      if index[neighbor] is None:
         tarjan_dfs(neighbor, graph, index, low_link, stack, on_stack, sccs, counter)
         low_link[v] = min(low_link[v], low_link[neighbor])
      elif on_stack[neighbor]:
         low_link[v] = min(low_link[v], index[neighbor])

   if low_link[v] == index[v]:  # v is the root of an SCC
      scc = []
      while True:
         w = stack.pop()
         on_stack[w] = False
         scc.append(w)
         if w == v:
            break
      sccs.append(scc)

def tarjan_scc(graph):
   index, low_link, on_stack, stack, sccs = initialize_tarjan(graph)
   counter = count()  # Shared source of DFS indices: 0, 1, 2, ...

   for v in range(len(graph)):
      if index[v] is None:
         tarjan_dfs(v, graph, index, low_link, stack, on_stack, sccs, counter)

   return sccs

# Example graph representation
example_graph = [
   [1],       # 0 -> 1
   [2],       # 1 -> 2
   [0, 3],    # 2 -> 0, 3
   [4],       # 3 -> 4
   []         # 4 has no outgoing edges
]

# Find and display SCCs
sccs = tarjan_scc(example_graph)
print("Strongly Connected Components:", sccs)

Running the above implementation on the example graph prints the SCCs it finds. For the given graph, the output is as follows −

Strongly Connected Components: [[4], [3], [2, 1, 0]]

In the example, the graph contains three SCCs: vertex 4 as a single SCC, vertex 3 as another SCC, and the set {0, 1, 2} forming a larger SCC where each vertex is reachable from the others as shown in the graph below −

[Figure: the example graph for Tarjan's algorithm, with SCCs {0, 1, 2}, {3}, and {4}]
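The order of the output is also informative: Tarjan's algorithm emits SCCs in reverse topological order of the condensation, the acyclic graph obtained by collapsing each SCC into a single node. The sketch below builds that graph from the SCC list shown above. The function name condensation is hypothetical and the SCC list is copied from the example output rather than recomputed.

```python
def condensation(graph, sccs):
    """Collapse each SCC into one node and return the resulting DAG."""
    component_of = {}
    for cid, scc in enumerate(sccs):
        for v in scc:
            component_of[v] = cid

    dag = [set() for _ in sccs]
    for u, neighbors in enumerate(graph):
        for v in neighbors:
            cu, cv = component_of[u], component_of[v]
            if cu != cv:  # Skip edges that stay inside one SCC
                dag[cu].add(cv)
    return [sorted(targets) for targets in dag]


graph = [[1], [2], [0, 3], [4], []]
sccs = [[4], [3], [2, 1, 0]]  # Output of the example above
dag = condensation(graph, sccs)
print(dag)  # [[], [0], [1]]
```

Every edge in the result points from a component with a higher position in the list to one with a lower position, which is what reverse topological order means here.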

Complexity of Tarjan's Algorithm

Tarjan's algorithm has the following complexity characteristics −

  • Time Complexity: The time complexity of the algorithm is O(V + E), where V is the number of vertices and E is the number of edges. The DFS visits each vertex once and examines each edge once, and every vertex is pushed onto and popped from the stack exactly once, which makes the algorithm well suited to large graphs.
  • Space Complexity: The space complexity is O(V) in addition to the graph itself, as the algorithm requires storage for the DFS index, low-link values, stack, and auxiliary arrays like "on_stack." A recursive implementation also uses up to O(V) call-stack frames, since the DFS can go as deep as the number of vertices.
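
Because the recursion depth can reach the number of vertices, a recursive Python implementation may hit the interpreter's recursion limit on long paths. One common workaround is to manage the DFS with an explicit list of frames. The following is a sketch of that idea under the same adjacency-list assumption; the function name tarjan_scc_iterative and the frame layout are choices made for this illustration.

```python
def tarjan_scc_iterative(graph):
    """Tarjan's SCC algorithm with an explicit DFS stack instead of recursion."""
    n = len(graph)
    index = [None] * n
    low_link = [0] * n
    on_stack = [False] * n
    stack = []   # Visited vertices not yet assigned to an SCC
    sccs = []
    next_index = 0

    for root in range(n):
        if index[root] is not None:
            continue
        index[root] = low_link[root] = next_index
        next_index += 1
        stack.append(root)
        on_stack[root] = True
        # Each frame is (vertex, iterator over its remaining neighbors).
        work = [(root, iter(graph[root]))]
        while work:
            v, neighbors = work[-1]
            descended = False
            for w in neighbors:
                if index[w] is None:
                    # Tree edge: discover w and explore it before resuming v.
                    index[w] = low_link[w] = next_index
                    next_index += 1
                    stack.append(w)
                    on_stack[w] = True
                    work.append((w, iter(graph[w])))
                    descended = True
                    break
                if on_stack[w]:
                    low_link[v] = min(low_link[v], index[w])
            if descended:
                continue
            # All neighbors of v are done: finish v and report to its parent.
            work.pop()
            if work:
                parent = work[-1][0]
                low_link[parent] = min(low_link[parent], low_link[v])
            if low_link[v] == index[v]:
                scc = []
                while True:
                    w = stack.pop()
                    on_stack[w] = False
                    scc.append(w)
                    if w == v:
                        break
                sccs.append(scc)
    return sccs


# A path of 5000 vertices would exceed Python's default recursion limit
# in a recursive DFS, but it is handled without recursion here.
long_path = [[i + 1] for i in range(4999)] + [[]]
print(len(tarjan_scc_iterative(long_path)))  # 5000
```

The time and space bounds are unchanged: the explicit frame list takes the place of the call stack and never holds more than V frames.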