# Longest Common Subsequence

The longest common subsequence (LCS) problem is to find the longest sequence that appears as a subsequence of both given strings.

## Subsequence

Let us consider a sequence S = <s_{1}, s_{2}, s_{3}, s_{4}, …,s_{n}>.

A sequence Z = <z_{1}, z_{2}, z_{3}, z_{4}, …,z_{m}> is called a subsequence of S if and only if it can be derived from S by deleting some elements (possibly none) without changing the order of the remaining elements.
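The definition above can be checked mechanically. The following is a minimal sketch, assuming the sequences are Python strings or lists; the function name `is_subsequence` is a hypothetical helper, not part of the original text.

```python
def is_subsequence(z, s):
    """Return True if z can be derived from s by deleting zero or more elements."""
    it = iter(s)
    # `item in it` consumes the iterator up to the first match, so each
    # element of z must be found strictly after the previous one.
    return all(item in it for item in z)


print(is_subsequence("BCB", "BACDB"))  # True: delete A and D
print(is_subsequence("CA", "BACDB"))   # False: no A occurs after C
```

The check scans `s` once, so it runs in time linear in the length of `s`.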

## Common Subsequence

Suppose ***X*** and ***Y*** are two sequences over a finite set of elements. We can say that ***Z*** is a common subsequence of ***X*** and ***Y*** if ***Z*** is a subsequence of both ***X*** and ***Y***.

## Longest Common Subsequence

If a set of sequences is given, the longest common subsequence problem is to find a common subsequence of all the sequences that has maximal length.
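To make the definition concrete, the sketch below lists every longest common subsequence of two short strings by exhaustive enumeration. It is restricted to two sequences and is only meant for tiny inputs; the function names are hypothetical and chosen for illustration.

```python
from itertools import combinations


def subsequences(s):
    """All distinct subsequences of s (up to 2**len(s) of them)."""
    return {
        "".join(s[i] for i in idx)
        for r in range(len(s) + 1)
        for idx in combinations(range(len(s)), r)
    }


def longest_common_subsequences(x, y):
    common = subsequences(x) & subsequences(y)
    best = max(len(z) for z in common)  # the empty string is always common
    return sorted(z for z in common if len(z) == best)


print(longest_common_subsequences("ABC", "ACB"))  # ['AB', 'AC']
```

As the output suggests, a longest common subsequence need not be unique; only its length is.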

The longest common subsequence problem is a classic computer science problem. It is the basis of data comparison programs such as the diff utility and has applications in bioinformatics. It is also widely used by revision control systems, such as SVN and Git, for reconciling multiple changes made to a revision-controlled collection of files.

## Naïve Method

Let ***X*** be a sequence of length ***m*** and ***Y*** a sequence of length ***n***. Check for every subsequence of ***X*** whether it is a subsequence of ***Y***, and return the longest common subsequence found.

There are ***2^{m}*** subsequences of ***X***. Testing whether a sequence is a subsequence of ***Y*** takes ***O(n)*** time. Thus, the naïve algorithm would take ***O(n2^{m})*** time.

## Dynamic Programming

Let *X = <x_{1}, x_{2}, x_{3}, …, x_{m}>* and *Y = <y_{1}, y_{2}, y_{3}, …, y_{n}>* be the sequences. To compute the length of a longest common subsequence, the following algorithm is used.

In this procedure, table ***C[m, n]*** is computed in row-major order, and another table ***B[m, n]*** is computed to construct an optimal solution. Entry C[i, j] holds the LCS length of the first i elements of X and the first j elements of Y.

    Algorithm: LCS-Length-Table-Formulation (X, Y)
    m := length(X)
    n := length(Y)
    for i = 0 to m do
       C[i, 0] := 0
    for j = 0 to n do
       C[0, j] := 0
    for i = 1 to m do
       for j = 1 to n do
          if x_{i} = y_{j}
             C[i, j] := C[i - 1, j - 1] + 1
             B[i, j] := ‘D’
          else
             if C[i - 1, j] ≥ C[i, j - 1]
                C[i, j] := C[i - 1, j]
                B[i, j] := ‘U’
             else
                C[i, j] := C[i, j - 1]
                B[i, j] := ‘L’
    return C and B

    Algorithm: Print-LCS (B, X, i, j)
    if i = 0 or j = 0
       return
    if B[i, j] = ‘D’
       Print-LCS(B, X, i - 1, j - 1)
       Print(x_{i})
    else if B[i, j] = ‘U’
       Print-LCS(B, X, i - 1, j)
    else
       Print-LCS(B, X, i, j - 1)

Calling Print-LCS(B, X, m, n) prints a longest common subsequence of **X** and **Y**.
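The two procedures can be sketched in Python as follows. This is one possible rendering, not a reference implementation: the names `lcs_tables` and `read_lcs` are hypothetical, the recurrence copies C[i - 1, j] unchanged in the ‘U’ case, and the recursive Print-LCS is replaced by an iterative walk that returns the subsequence instead of printing it.

```python
def lcs_tables(x, y):
    """Fill the length table c and the direction table b, row by row."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:  # x_i = y_j (Python is 0-indexed)
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "D"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "U"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "L"
    return c, b


def read_lcs(b, x, i, j):
    """Follow the directions in b from cell (i, j) back to a border cell."""
    picked = []
    while i > 0 and j > 0:
        if b[i][j] == "D":
            picked.append(x[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "U":
            i -= 1
        else:
            j -= 1
    # Elements were collected from last to first, so reverse them.
    return "".join(reversed(picked))


x, y = "AGGTAB", "GXTXAYB"
c, b = lcs_tables(x, y)
print(c[len(x)][len(y)], read_lcs(b, x, len(x), len(y)))  # 4 GTAB
```

The iterative walk avoids Python's recursion limit on long inputs; it visits the same cells as the recursive procedure.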

## Analysis

To populate the table, the outer **for** loop iterates ***m*** times and the inner **for** loop iterates ***n*** times, and each cell is filled in constant time. Hence, the time complexity of the algorithm is ***O(mn)***, where ***m*** and ***n*** are the lengths of the two strings.

## Example

In this example, we have two strings ***X = BACDB*** and ***Y = BDCB*** to find the longest common subsequence.

Following the algorithm LCS-Length-Table-Formulation (as stated above), we have calculated table C (shown on the left hand side) and table B (shown on the right hand side).

In table B, instead of ‘D’, ‘L’ and ‘U’, we are using the diagonal arrow, left arrow and up arrow, respectively. After generating table B, the LCS is determined by the procedure Print-LCS. The result is BCB.
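The tables for this example can be regenerated with a short script. The sketch below is an illustration under stated assumptions: `build_tables` is a hypothetical helper, ties are resolved towards ‘U’ as in the algorithm, and ASCII characters stand in for the arrows. Each printed cell combines the direction from table B with the value from table C.

```python
# ASCII stand-ins for the arrows: "\" diagonal ('D'), "^" up ('U'), "<" left ('L').
ARROWS = {"D": "\\", "U": "^", "L": "<"}


def build_tables(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[" "] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "D"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "U"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "L"
    return c, b


x, y = "BACDB", "BDCB"
c, b = build_tables(x, y)

# One header line for Y, then one line per element of X.
lines = ["    " + "   ".join(y)]
for i in range(1, len(x) + 1):
    cells = [ARROWS[b[i][j]] + str(c[i][j]) for j in range(1, len(y) + 1)]
    lines.append(x[i - 1] + "  " + "  ".join(cells))
print("\n".join(lines))
# Output:
#     B   D   C   B
# B  \1  <1  <1  \1
# A  ^1  ^1  ^1  ^1
# C  ^1  ^1  \2  <2
# D  ^1  \2  ^2  ^2
# B  \1  ^2  ^2  \3
```

Starting at the bottom-right cell and following the arrows, the diagonal steps occur at cells (5, 4), (3, 3) and (1, 1), which pick up B, C and B in reverse order and give BCB, with length 3 in the bottom-right cell.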