Minimize removals to remove another string as a subsequence of a given string


A subsequence refers to a sequence that can be obtained from another sequence by removing zero or more elements, without altering the order of the remaining elements. In simpler terms, a subsequence is derived by selecting elements from the original sequence, while preserving their relative order.

For example, consider the sequence [1, 2, 3, 4]. Some possible subsequences of this sequence are: [1, 2], [1, 3, 4], [2, 4], [1, 2, 3, 4], [3], and [4].
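
For strings, the same definition can be checked with a single left-to-right scan. The following is a minimal sketch; the helper name isSubsequence is an illustrative choice and is not used elsewhere in this article.

```cpp
#include <cstddef>
#include <string>

// Returns true if `pattern` can be obtained from `text` by deleting
// zero or more characters without reordering the remaining ones.
bool isSubsequence(const std::string& pattern, const std::string& text) {
    std::size_t matched = 0;  // number of leading pattern characters matched so far
    for (char c : text) {
        // Greedily match the next pattern character as early as possible.
        if (matched < pattern.size() && c == pattern[matched]) {
            ++matched;
        }
    }
    return matched == pattern.size();
}
```

For instance, isSubsequence("bd", "abcd") is true, while isSubsequence("db", "abcd") is false because the order of the characters would have to change.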

Problem Statement

The objective is to determine the minimum number of characters that must be removed from string s1 so that s2 no longer occurs in s1 as a subsequence. s2 is assumed to be non-empty, because the empty string is a subsequence of every string.

Examples

Input: s1 = “jjkklll”, s2 = “jkl”
Output: 2

Explanation

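The string s1 contains the subsequence "jkl". To make s1 not contain s2 as a subsequence, we can remove either both occurrences of 'j' or both occurrences of 'k' from s1. Removing the 'l' characters instead would take three removals, and removing a single character always leaves a 'j', a 'k', and an 'l' in that order. As a result, the minimum number of characters that need to be removed is 2.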

Input: s1 = “”, s2 = “q”
Output: 0

Explanation

Since s1 is an empty string, it cannot contain s2 as a subsequence. No characters need to be removed, so the answer is 0.
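
For short inputs, expected outputs such as the ones above can be cross-checked with an exhaustive search over every set of characters to keep. The sketch below is only meant for verification. The function names are hypothetical, and it assumes that s2 is non-empty and that s1 has fewer than 32 characters (in practice about 20 or fewer, since the running time is exponential).

```cpp
#include <cstddef>
#include <string>

// Two-pointer check: is `pattern` a subsequence of `text`?
static bool containsSubsequence(const std::string& text, const std::string& pattern) {
    std::size_t matched = 0;
    for (char c : text) {
        if (matched < pattern.size() && c == pattern[matched]) {
            ++matched;
        }
    }
    return matched == pattern.size();
}

// Tries all 2^N subsets of kept characters of s1 and returns the fewest
// removals after which s2 is no longer a subsequence.
// Assumes s2 is non-empty and s1.size() < 32.
int minRemovalsBruteForce(const std::string& s1, const std::string& s2) {
    const int n = static_cast<int>(s1.size());
    int best = n;  // removing every character always works for a non-empty s2
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        std::string kept;
        for (int i = 0; i < n; ++i) {
            if (mask & (1u << i)) {
                kept += s1[i];  // bit i set means character i is kept
            }
        }
        if (!containsSubsequence(kept, s2)) {
            const int removed = n - static_cast<int>(kept.size());
            if (removed < best) {
                best = removed;
            }
        }
    }
    return best;
}
```

Calling minRemovalsBruteForce("jjkklll", "jkl") returns 2, and minRemovalsBruteForce("", "q") returns 0, in line with the two examples.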

Solution Approach

To find the minimum number of characters that need to be removed from string s1 such that it does not contain string s2 as a subsequence, we can use a dynamic programming approach. The idea is to scan s1 from left to right while tracking how many leading characters of s2 the kept characters have matched so far, and to never let that count reach the full length of s2. It can be implemented as follows −

  • Create a 2D array named "dp" with dimensions (N + 1) x M, where N represents the length of string s1 and M represents the length of string s2. dp[i][j] stores the minimum number of removals among the first i characters of s1 such that the kept characters match exactly the first j characters of s2. Since j only ranges from 0 to M - 1, s2 is never matched completely.

  • Fill the first row of dp: set dp[0][0] to 0, because nothing has been read and nothing has been matched. Set every other entry to a large value ("infinity") to mark it as unreachable.

  • Iterate over the rows of dp starting from the second row. For each row i, let c = s1[i - 1] be the character being processed, and iterate over the columns j.

  • Calculate the minimum of two values −

    • Staying in column j: take dp[i - 1][j] from the previous row. If c matches s2[j], keeping c would extend the match, so c has to be removed and 1 is added. Otherwise c is kept at no cost.

    • Advancing from column j - 1: if j > 0 and c matches s2[j - 1], c can be kept, which extends the match from j - 1 to j characters at no cost and gives dp[i - 1][j - 1].

  • Removing a character that does not extend the match never helps, so no other transitions are needed.

  • Once we have iterated over all elements of the dp array, the answer is the smallest value in the last row, dp[N][0] to dp[N][M - 1]. Any matched count below M means that s2 is not a subsequence of the remaining string.

  • Finally, we can output the minimum number of characters that need to be removed from s1 in order to ensure that it does not contain s2 as a subsequence.

The dynamic programming approach considers, for every character of s1, both keeping and removing it wherever that choice matters, and retains the cheapest option for each state.

Algorithm

function printMinimumRemovals(s1, s2):
   N = length(s1)
   M = length(s2)
   INF = N + 1
   Create a 2D array dp of size (N + 1) x (M), filled with INF

   // Step 2
   Set dp[0][0] to 0

   // Step 3
   For i = 1 to N:
      For j = 0 to M - 1:
         // Step 4: stay in column j (remove s1[i-1] if it would extend the match)
         If s1[i-1] equals s2[j]:
            Set stay to dp[i-1][j] + 1
         Else:
            Set stay to dp[i-1][j]
         // Step 5: advance from column j-1 by keeping s1[i-1]
         If j > 0 and s1[i-1] equals s2[j-1]:
            Set advance to dp[i-1][j-1]
         Else:
            Set advance to INF
         Set dp[i][j] to min(stay, advance)

   // Step 6
   minimumRemovals = minimum of dp[N][0], ..., dp[N][M-1]

   // Step 7
   print minimumRemovals

function main:
   initialise string s1 and s2
   function call printMinimumRemovals(s1, s2)

Example: C++ Program

The given code finds the least number of characters that need to be removed from string s1 so that string s2 is no longer a subsequence of it. The code uses dynamic programming to calculate this minimum removal count.

The printMinimumRemovals function takes two strings, s1 and s2, as input. It creates a 2D array, dp, in which dp[i][j] stores the minimum number of removals among the first i characters of s1 that leave exactly the first j characters of s2 matched. After all rows are filled, the answer is the smallest value in the last row, dp[N]. For an empty s2 no answer exists, and this version prints -1 as a convention.

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;
// This function prints the minimum number of characters that need to be removed from the string `s1` so that the string `s2` is no longer a subsequence of `s1`.
void printMinimumRemovals(const string &s1, const string &s2){
   int N = s1.length();
   int M = s2.length();
   // An empty `s2` is a subsequence of every string, so no answer exists. Printing -1 is a convention chosen for this example.
   if (M == 0){
      cout << -1 << endl;
      return;
   }
   // At most N characters can be removed, so any value above N marks an unreachable state.
   const int INF = N + 1;
   // dp[i][j] is the minimum number of removals among the first `i` characters of `s1` such that the kept characters match exactly the first `j` characters of `s2` (j < M).
   vector<vector<int>> dp(N + 1, vector<int>(M, INF));
   // Before any character is read, nothing is matched and nothing is removed.
   dp[0][0] = 0;
   // Iterate over the rows of the array.
   for (int i = 1; i <= N; i++){
      char c = s1[i - 1];
      // Iterate over the columns of the array.
      for (int j = 0; j < M; j++){
         // Stay in column j: if c matches s2[j], it must be removed (cost 1), otherwise it is kept for free.
         int stay = dp[i - 1][j] + (c == s2[j] ? 1 : 0);
         // Advance from column j - 1: keeping c extends the match when c matches s2[j - 1].
         int advance = (j > 0 && c == s2[j - 1]) ? dp[i - 1][j - 1] : INF;
         dp[i][j] = min(stay, advance);
      }
   }
   // Any column of the last row leaves s2 incompletely matched, so print the smallest value.
   cout << *min_element(dp[N].begin(), dp[N].end()) << endl;
}
int main(){
   // Input
   string s1 = "bb";
   string s2 = "b";
   // Function call to obtain the minimum number of character removals.
   printMinimumRemovals(s1, s2);
   return 0;
}

Output

2

Time and Space Complexity Analysis

Time Complexity − O(N*M)

  • Creating and initializing the 2D vector dp takes O(N*M) time.

  • Filling the first row of dp takes O(M) time as it iterates over the length of s2.

  • The nested loops for iterating over the rows and columns of dp take O(N*M) time in total, as each element of the 2D array is processed once.

  • The code's time complexity is primarily determined by the nested loops, resulting in an overall time complexity of O(N*M).

Space Complexity − O(N*M)

  • The input strings s1 and s2 occupy O(N + M) space, where N and M are their lengths.

  • The 2D vector dp holds O(N*M) entries, which dominates the total and gives an overall space complexity of O(N*M).
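
Because each row of the table depends only on the row before it, the space can in principle be reduced to O(M) with a single rolling array. The sketch below is one possible variant, not the program analyzed above. It uses the state "number of leading characters of s2 matched so far by the kept characters", the function name minRemovalsRolling is hypothetical, and returning -1 for an empty s2 is an assumed convention.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Minimum removals from s1 so that s2 is no longer a subsequence of s1,
// using O(M) extra space. Returns -1 if s2 is empty (no answer exists).
int minRemovalsRolling(const std::string& s1, const std::string& s2) {
    const int N = static_cast<int>(s1.size());
    const int M = static_cast<int>(s2.size());
    if (M == 0) {
        return -1;
    }
    const int INF = N + 1;  // more than the maximum possible number of removals
    // dp[j]: fewest removals so far such that exactly j characters of s2 are matched.
    std::vector<int> dp(M, INF);
    dp[0] = 0;
    for (char c : s1) {
        // Go from right to left so that dp[j - 1] still holds the previous row.
        for (int j = M - 1; j >= 0; --j) {
            // Stay at j: remove c if keeping it would extend the match.
            const int stay = dp[j] + (c == s2[j] ? 1 : 0);
            // Advance from j - 1 by keeping c when it matches s2[j - 1].
            const int advance = (j > 0 && c == s2[j - 1]) ? dp[j - 1] : INF;
            dp[j] = std::min(stay, advance);
        }
    }
    // Every state below M leaves s2 incompletely matched.
    return *std::min_element(dp.begin(), dp.end());
}
```

With this variant, minRemovalsRolling("jjkklll", "jkl") returns 2. The time complexity stays at O(N*M); only the memory footprint changes.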

Conclusion

This article presented a dynamic programming approach to finding the minimum number of characters to remove from s1 so that s2 is no longer a subsequence of it. The problem was illustrated with examples, followed by the solution steps, the algorithm, a C++ implementation, and a time and space complexity analysis.

Updated on: 25-Oct-2023
