How to validate GST (Goods and Services Tax) number using Regular Expression?


A GST number is a unique identification number that the government issues to businesses and individuals who have registered for GST. Validating it means checking that the number is correctly formed and, where possible, genuine. Format validation checks that the number conforms to the structure required by the tax authorities of the relevant country. Full verification may also cross-reference the number with the database of the tax office, to confirm that it exists and belongs to the person presenting it. Together, these checks support tax compliance and help to avoid fraud.

GST validation by regular expression

A valid Indian GST number (GSTIN) is 15 characters long. The first two digits represent the state code, and the next ten characters are the PAN (Permanent Account Number) of the business entity. The thirteenth character is the entity number, which counts the registrations made under the same PAN within the state. The fourteenth is an alphabetic character ('Z' by default), and the fifteenth is the checksum character. Here's an example of a regular expression that might be used to validate an Indian GST number.

^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[0-9]{1}[A-Z]{1}[0-9A-Z]{1}$

The following is a description of the pattern −

^ − Marks the start of the string.

[0-9]{2} − Matches two digits (the state code).

[A-Z]{5} − Matches five capital letters (the first five characters of the PAN).

[0-9]{4} − Matches four digits (the next four characters of the PAN).

[A-Z]{1} − Matches a single capital letter (the last character of the PAN).

[0-9]{1} − Matches a single digit (the number of registrations under the same PAN in the state).

[A-Z]{1} − Matches a single capital letter ('Z' by default).

[0-9A-Z]{1} − Matches a single alphanumeric character (the checksum).

$ − Marks the end of the string.

You may use a regular expression library, or write a function in your chosen programming language, to test whether the supplied text matches this GST number pattern.
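As a minimal sketch of how the parts of the pattern line up with the fields of the number, the same expression can be written with capture groups, one per field. The names GstParts and splitGST are hypothetical and exist only for this illustration. The three PAN sub-patterns are merged into a single group here.

```cpp
#include <regex>
#include <string>

// Hypothetical container for the fields of a GST number (not a library type).
struct GstParts {
    std::string stateCode;  // characters 1-2
    std::string pan;        // characters 3-12
    char entityNumber;      // character 13
    char defaultChar;       // character 14
    char checksum;          // character 15
};

// Splits a GST number using the pattern above, grouped by field.
// Returns false, leaving `out` untouched, when the text does not match.
bool splitGST(const std::string& gst, GstParts& out) {
    static const std::regex pattern(
        "^([0-9]{2})([A-Z]{5}[0-9]{4}[A-Z])([0-9])([A-Z])([0-9A-Z])$");
    std::smatch m;
    if (!std::regex_match(gst, m, pattern)) {
        return false;
    }
    out.stateCode = m[1].str();
    out.pan = m[2].str();
    out.entityNumber = m[3].str()[0];
    out.defaultChar = m[4].str()[0];
    out.checksum = m[5].str()[0];
    return true;
}
```

For the input 29ABCDE1234F1Z5, the split gives the state code 29, the PAN ABCDE1234F, and the single characters 1, Z, and 5. A successful split only shows that the text is well-formed; it says nothing about whether the checksum character is correct.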

Algorithm

Step 1: Define the Structure of a GST Number

To start, the structure of a GST number is defined. This involves specifying the number of characters, the types of characters (alphabetic or numeric), and their specific positions in the string.

Step 2: Write a Regular Expression

Next, a regular expression is written based on the structure defined in step 1. The regular expression is used to match the input string with the defined structure.

Step 3: Validate the GST Number

In the final step, the input string is validated using the regular expression. The input string is compared to the pattern defined in the regular expression. If the input string matches the pattern, the function returns true, indicating that the GST number is valid. If the input string does not match the pattern, the function returns false, indicating that the GST number is not valid.

Approach

  • Using a regular expression pattern.

  • Using a checksum test.

The first approach matches the whole input against a single regular expression to check that the GST number has the mandatory structure: the number of characters, the character types, and their positions. It is fast, since the input is compared with the pattern once, but it checks the format only.

The second approach does not rely on a pattern. It runs a Luhn-style checksum over the characters of the number, so it can catch a mistyped character that still leaves the number well-formed. It takes slightly more work, as every character contributes to a running sum.

The two approaches are complementary: the pattern confirms the format of the GST number, and the checksum confirms that its characters are consistent with one another.

Approach 1: Using a regular expression pattern

A single regular expression decides whether an input string is a well-formed GST number. The pattern requires exactly 15 characters: two digits, five capital letters, four digits, one capital letter, one digit, one capital letter, and one final alphanumeric character. If the input string deviates from this pattern, the function returns false, indicating an invalid GST number. If the input string matches the pattern, the function returns true.

Example

This code removes any spaces from the input and then matches it against the pattern. The sample input 29ABCDE1234F1Z5 has the required 15 characters in the required positions, so the function returns true and the program reports that the GST number is valid. A match confirms the format only, not that the number has actually been issued.

// C++ program to validate the
// GST (Goods and Services Tax) number
// using Regular Expression
#include <algorithm>
#include <iostream>
#include <regex>
#include <string>

bool validateGST(std::string gstNumber) {
   // remove any spaces in the input string
   gstNumber.erase(std::remove(gstNumber.begin(), gstNumber.end(), ' '), gstNumber.end());

   // 2 digits (state code), 10-character PAN (5 letters, 4 digits, 1 letter),
   // 1 digit, 1 letter, and 1 alphanumeric checksum character
   static const std::regex pattern("^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[0-9]{1}[A-Z]{1}[0-9A-Z]{1}$");

   // regex_match succeeds only if the whole string fits the pattern,
   // so the length of 15 characters is checked implicitly
   return std::regex_match(gstNumber, pattern);
}
int main() {
   std::string gstNumber = "29ABCDE1234F1Z5";

   if (validateGST(gstNumber)) {
      std::cout << "The GST number is valid." << std::endl;
   } else {
      std::cout << "The GST number is not valid." << std::endl;
   }
   return 0;
}

Output

The GST number is valid.
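The example above removes spaces before checking the number. User input may also arrive in lowercase or with tabs and newlines, so a slightly broader clean-up step could run first. The sketch below is one way to do that. It assumes that lowercase input should be accepted and upper-cased, which is a choice for the application and not a rule of the GST format, and normalizeGST is a hypothetical helper name.

```cpp
#include <cctype>
#include <string>

// normalizeGST is a hypothetical helper used only in this sketch.
// It drops whitespace and upper-cases letters. Every other character is kept,
// so malformed input still fails the later pattern match.
std::string normalizeGST(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (unsigned char c : raw) {
        if (std::isspace(c)) {
            continue;  // skip spaces, tabs, and newlines
        }
        out.push_back(static_cast<char>(std::toupper(c)));
    }
    return out;
}
```

With this helper, an input such as " 29abcde 1234f1z5" becomes 29ABCDE1234F1Z5 before it is matched. Characters such as hyphens are left in place on purpose, so the validator rejects them and nothing is silently repaired.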

Approach 2: Using a checksum test

This approach does not use a pattern. The function isValidGST accepts the GST number as a string and returns a boolean indicating whether its last character, the checksum, is consistent with the other fourteen characters. Luhn's algorithm, widely used for credit card numbers, works on decimal digits only, so it cannot be applied directly to an alphanumeric GST number. The code below therefore uses the base-36 variant of the algorithm (Luhn mod N with N = 36), which is how the GST checksum is commonly described. Each character is mapped to a value from 0 to 35 (0-9, then A-Z), every second value counting from the right is doubled and folded back into a single base-36 digit, and the total must be divisible by 36.

Example

This example program defines isValidGST and calls it from main on two numbers that differ only in the last character. Both are well-formed, but only 29ABCDE1234F1ZW has a checksum character that agrees with the rest of the number, so 29ABCDE1234F1Z5 is reported as invalid. Passing this test shows that the number is internally consistent, not that it has been issued to a taxpayer.

#include <iostream>
#include <string>
using namespace std;

// map '0'-'9' to 0-9 and 'A'-'Z' to 10-35; return -1 for anything else
int charValue(char c) {
   if (c >= '0' && c <= '9') return c - '0';
   if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
   return -1;
}

bool isValidGST(const string& gst) {
   if (gst.length() != 15) {
      return false;
   }
   int sum = 0;
   bool alternate = false;

   // walk from the checksum character (rightmost) to the left
   for (int i = 14; i >= 0; i--) {
      int num = charValue(gst[i]);
      if (num < 0) {
         return false;
      }
      if (alternate) {
         // double every second value and fold it back into one base-36 digit
         num *= 2;
         num = (num / 36) + (num % 36);
      }
      sum += num;
      alternate = !alternate;
   }
   return (sum % 36 == 0);
}
int main() {
   // same first 14 characters, different checksum character
   const string samples[] = {"29ABCDE1234F1Z5", "29ABCDE1234F1ZW"};
   for (const string& gst : samples) {
      if (isValidGST(gst)) {
         cout << gst << ": Valid GST number" << endl;
      } else {
         cout << gst << ": Invalid GST number" << endl;
      }
   }
   return 0;
}

Output

29ABCDE1234F1Z5: Invalid GST number
29ABCDE1234F1ZW: Valid GST number
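In practice the two approaches might be chained, with the format test first and the checksum test second, so that a caller can tell which test failed. The following is a rough sketch of that idea, not a definitive implementation. GstCheck and checkGST are hypothetical names, and the checksum step assumes the base-36 variant of Luhn's algorithm (Luhn mod 36) instead of the decimal one.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
enum class GstCheck { Ok, BadFormat, BadChecksum };

// Runs the format test first (the pattern from this article), then the checksum.
// Assumption: the checksum character follows the Luhn mod 36 scheme.
GstCheck checkGST(const std::string& gst) {
    static const std::regex pattern(
        "^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[0-9]{1}[A-Z]{1}[0-9A-Z]{1}$");
    if (!std::regex_match(gst, pattern)) {
        return GstCheck::BadFormat;
    }
    // After the format test, every character is known to be 0-9 or A-Z.
    int sum = 0;
    bool doubled = false;  // the rightmost (checksum) character is not doubled
    for (auto it = gst.rbegin(); it != gst.rend(); ++it) {
        int value = (*it <= '9') ? (*it - '0') : (*it - 'A' + 10);
        if (doubled) {
            value *= 2;
            value = value / 36 + value % 36;  // fold into one base-36 digit
        }
        sum += value;
        doubled = !doubled;
    }
    return (sum % 36 == 0) ? GstCheck::Ok : GstCheck::BadChecksum;
}
```

Here checkGST("AA0123456789A1Z9") reports BadFormat, checkGST("29ABCDE1234F1Z5") reports BadChecksum, and checkGST("29ABCDE1234F1ZW") reports Ok. Running the pattern first keeps the checksum loop simple, because it never has to handle unexpected characters.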

Conclusion

Verifying GST numbers is a crucial step in confirming that the identification numbers the government supplies to people and businesses registered for GST are correct. It reduces fraud, improves compliance, expedites tax-related processes, and fosters confidence between taxpayers and tax authorities. A regular expression confirms only that a number is well-formed, and a checksum only that it is internally consistent. Confirming that a number has actually been issued still requires a check against the records of the tax office. Although validating a GST number takes some effort, the advantages outweigh the disadvantages.

To keep the tax system fair and efficient for both taxpayers and tax authorities, GST data should be checked thoroughly.

Updated on: 20-Jul-2023
