Remove comments in a string using C++


Given a C++ program as input, remove the comments from it. The input, source, is a vector in which source[i] is the i-th line of the source code, i.e. the result of splitting the source code string at each newline character \n. C++ has two kinds of comments: line comments and block comments.

The string // starts a line comment: it and everything to its right on the same line are ignored by the program.

The string /* starts a block comment: everything from /* up to and including the next */ is ignored, even when the comment spans several lines.

The first effective comment takes precedence over others: if the string // occurs in a block comment, it is ignored. Similarly, if the string /* occurs in a line or block comment, it is also ignored. If a line of code is empty after removing comments, you must not output that line: each string in the answer list will be non-empty.

For Example −

Input-1

source = ["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"]
The line by line code is as follows:
/*Test program */
int main()
{
 // variable declaration
int a, b, c;
/* This is a test
 multiline
 comment for
 testing */
a = b + c;
}

Output

["int main()","{ "," ","int a, b, c;","a = b + c;","}"]
The line by line code is as follows (the third line contains only a space):
int main()
{
 
int a, b, c;
a = b + c;
}

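Explanation − The string /* opens a block comment, which covers line 1 and lines 6-9. The string // starts a line comment on line 4. The space in front of that // is not part of the comment, so line 4 still contributes " " to the output.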

Approach to Solve this Problem

  • We will parse the source line by line and character by character, as a compiler would. When we encounter // we ignore the rest of the line, and when we encounter /* we ignore every character up to and including the matching */.

  • A function removeComments(vector<string>&source) takes the source code as input and returns the code after removing its comments.

  • A Boolean variable comment, initialized to false, records whether the character we are looking at lies inside a block comment.

  • If we start a block comment and we are not in a block, then we will skip over the next two characters and change our state to be in a block.

  • If we end a block comment and we are in a block, we will skip over the next two characters and change our state to not be in a block.

  • If we start a line comment and aren't in a block, we will ignore the rest of the line.

  • If we aren't in a block comment (and it wasn't the start of a comment), we will record the character we are at.

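  • If we aren't in a block at the end of a line and we have recorded at least one character, we will record the line. While a block comment is still open at the end of a line, the recorded characters carry over, so code before /* and code after a later */ is joined into one line.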

  • The algorithm runs in O(S) time, where S is the total number of characters in source.
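The steps above can also be read as a small state machine whose only state is "inside a block comment or not". The sketch below is one possible arrangement, not the article's own code: it uses the hypothetical names scanLine and stripComments to show how that state, and any text already kept, carries from one line to the next.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: scan one line and append the kept characters to `kept`.
// `inBlock` carries the "inside a block comment" state across lines.
void scanLine(const std::string& line, bool& inBlock, std::string& kept) {
    for (std::size_t j = 0; j < line.size(); ++j) {
        const bool hasNext = j + 1 < line.size();
        if (!inBlock && hasNext && line[j] == '/' && line[j + 1] == '*') {
            inBlock = true;            // block comment opens: skip both characters
            ++j;
        } else if (inBlock && hasNext && line[j] == '*' && line[j + 1] == '/') {
            inBlock = false;           // block comment closes: skip both characters
            ++j;
        } else if (!inBlock && hasNext && line[j] == '/' && line[j + 1] == '/') {
            break;                     // line comment: ignore the rest of the line
        } else if (!inBlock) {
            kept.push_back(line[j]);   // ordinary code character
        }
    }
}

// Hypothetical driver: apply scanLine to every line of the source.
std::vector<std::string> stripComments(const std::vector<std::string>& source) {
    std::vector<std::string> result;
    std::string kept;
    bool inBlock = false;
    for (const std::string& line : source) {
        scanLine(line, inBlock, kept);
        // Record only when no block comment is open, so text before "/*"
        // and text after a later "*/" end up in the same output string.
        if (!inBlock && !kept.empty()) {
            result.push_back(kept);
            kept.clear();
        }
    }
    return result;
}
```

With this sketch, the three lines "a/*comment", "line" and "more_comment*/b" collapse to the single string "ab", the line "/* // still a block */x" becomes "x", and "int a; // note /* not a block" becomes "int a; " without opening a block comment.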

Example

#include <iostream>
#include <string>
#include <vector>
using namespace std;

vector<string> removeComments(vector<string>& source) {
   vector<string> ans;
   string s;               // characters kept so far for the current output line
   bool comment = false;   // true while we are inside a /* ... */ block
   for (size_t i = 0; i < source.size(); i++) {
      for (size_t j = 0; j < source[i].size(); j++) {
         bool hasNext = j + 1 < source[i].size();
         if (!comment && hasNext && source[i][j] == '/' && source[i][j+1] == '/') {
            break;   // line comment: ignore the rest of the line
         } else if (!comment && hasNext && source[i][j] == '/' && source[i][j+1] == '*') {
            comment = true;    // block comment starts: skip both characters
            j++;
         } else if (comment && hasNext && source[i][j] == '*' && source[i][j+1] == '/') {
            comment = false;   // block comment ends: skip both characters
            j++;
         } else if (!comment) {
            s.push_back(source[i][j]);   // ordinary code character
         }
      }
      // Record the line only if no block comment is open and it is non-empty.
      if (!comment && s.size()) {
         ans.push_back(s);
         s.clear();
      }
   }
   return ans;
}

int main() {
   vector<string> source = {"/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"};
   vector<string> res = removeComments(source);
   for (const auto& x : res) {
      cout << x << "\n";   // print each remaining line on its own line
   }
   return 0;
}

Output

The function returns the list ["int main()","{ "," ","int a, b, c;","a = b + c;","}"].
Written one string per line, the result is as below (the third line contains only a space):
int main()
{
 
int a, b, c;
a = b + c;
}

Updated on: 05-Feb-2021
