Modify array of strings by replacing characters repeating in the same or remaining strings

Data Structure C++Server Side Programming Programming

Modifying an array of strings by replacing characters repeating in the same or remaining strings is a common problem in programming. It can be solved using hash tables, sets, arrays etc. The aim is to improve the time and space requirements while providing the same functionality. This problem can be encountered in many real-life scenarios, such as processing large text or cleaning up datasets with duplicates.

Problem Statement

Given an input string array arr[] containing lowercase and uppercase characters. The goal is to modify the array by removing characters from the strings which are repeating in the same string or another string in the array.

Sample Example 1

Input

arr[] = {“apple”, “Banana”, “coding”}

Output

arr[] = {“aple”, “Bn”, “codig”}

Explanation

For arr[0] = “apple”, character ‘p’ is the only repeating character in the string, thus modified arr[0] = “aple”

For arr[1] = “Banana”, character ‘a’ is repeating and ‘n’ is also repeating, thus modified arr[1] = “Bn”

For arr[2] = “coding”, character ‘n’ is repeating, thus modified arr[2] = “codig”

Sample Example 2

Input

arr[] = {“Stay”, “This”, “Hill”}

Output

arr[] = {“Stay”, “This”, “Hl”}

Explanation

For arr[0] = “Stay”, no repeating character, thus modified arr[0] = “Stay”

For arr[1] = “This”, no repeating character, thus modified arr[1] = “This”

For arr[2] = “Hill”, character ‘i’ and ‘l’ are repeating, thus modified arr[2] = “Hl”

Approach: Using Sets

In order to remove duplicate characters across a given array of strings a set can be used. An unordered set can be used to record the distinct characters encountered as we iterate over the string of the array. Iterating over each string in the array and then over each character in the string, check for the occurrence of each character in the set, if present then add that character to the set and append the modified string otherwise skip it.

Algorithm

Here's the algorithm for the given C++ program −

Define a function removeDuplicates that takes a vector of strings arr as input.
Declare an unordered set of characters uniqueChar to store the distinct characters.
Determine the size of the array n.
Declare an empty vector of strings ans to store the list of modified strings.
Traverse the array using a for loop and iterate over each string str in arr.
Declare an empty string temp to store the modified string.
Traverse each character ch in str using a for loop.
If ch is already present in uniqueChar, continue to the next character.
Otherwise, append ch to temp and insert ch into uniqueChar`.
If the temp is not an empty string, push it into the vector ans.
Traverse the ans vector using a for loop and print each string.
In the main function, define an array of strings arr with given input strings.
Call the removeDuplicates function passing arr as the input.

Example: C++ Implementation

In the following program, we use an unordered set to keep track of all the unique characters in the array of strings. Each character is checked first in the set to check for previous occurrences and then a decision is made as to whether keep it in the modified string or not.

#include <bits/stdc++.h>
using namespace std;
// Function to remove duplicate characters across the strings
void removeDuplicates(vector<string> arr){

   // Unordered set stores the unique characters in the given array of strings
   unordered_set<char> uniqueChar;
   int n = arr.size();
   vector<string> ans;
   
   // traversing the array to get all the strings individually
   for (auto str : arr) {
      string temp = "";
      
      // Iterating over the characters of the string
      for (auto ch : str) {
      
         // If character has not occurred before and is not present in the set
         if (uniqueChar.find(ch) == uniqueChar.end()) {
         
            // Adding the character to the modified string and the set
            temp += ch;
            uniqueChar.insert(ch);
         }
         
         // If ch is present in the set then it had occurred before thus just skip it
      }

      ans.push_back(temp);
   }
   
   // Printing the list of modified strings
   for (int i = 0; i < ans.size(); i++) {
      cout << ans[i] << " ";
   }
}

int main(){
   vector<string> arr= { "Stay", "This", "Hill" };
   removeDuplicates(arr);
}

Output

Stay This Hl

Time Complexity − O(nm) where n is the size of input array and m is the size of the largest string.

Space Complexity − O(n)

Conclusion

In conclusion, we can modify an array of strings by replacing duplicate characters across the strings with a simple algorithm. We can use a set to store the distinct characters. This algorithm is efficient, with a time complexity of O(nm) but it can be optimised further by

Pass the vector by reference instead of by value to avoid copying the entire vector.
Use a constant reference when iterating over the vector to avoid copying each element.

Divya Sahni

Updated on: 25-Jul-2023

129 Views

Kickstart Your Career

Get certified by completing the course

Get Started