Efficient Construction of Finite Automata

A pattern can be searched for in a text by first building a finite automaton from the pattern. We start by filling a 2D array that serves as the automaton's transition table: the entry for a given state and character holds the next state. Once the table is built, searching is simple. Starting from state 0, we follow one transition per character of the text; whenever the automaton reaches the final state (the state whose number equals the pattern length), an occurrence of the pattern ends at the current character.

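Constructing the automaton takes O(M*K) time, where M is the pattern length and K is the number of distinct characters in the alphabet, because each of the (M+1)*K table entries is filled once. The search itself takes O(n) time, where n is the length of the text, since each character causes exactly one table lookup.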

Input and Output

Input:
Main String: “ABAAABCDBBABCDDEBCABC”, Pattern: “ABC”
Output:
Pattern found at position: 4
Pattern found at position: 10
Pattern found at position: 18

Algorithm

fillTransTable(pattern, transTable)

Input − The pattern and the transition table to fill with transitions

Output − The filled transition table

Begin
   longPS := 0
   clear all entries of transition table with 0
   transTable[0, pattern[0]] := 1      //for the first character of the pattern

   for i := 1 to pattern size, do
      for all possible characters j, do
         transTable[i, j] := transTable[longPS, j]      //on a mismatch, behave like the fallback state
      done

      if i < pattern size, then
         transTable[i, pattern[i]] := i+1      //on a match, move to the next state
         longPS := transTable[longPS, pattern[i]]
   done
End
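
To make the table concrete, here is a minimal C++ sketch of the procedure above that builds the table and prints only the columns for characters occurring in the pattern. The names buildTable and printTable are illustrative and not part of the original listing, and the alphabet is assumed to be the 256 possible byte values.

```cpp
#include <array>
#include <iostream>
#include <string>
#include <vector>

const int ALPHABET = 256;  // assumption: one column per possible byte value
using Row = std::array<int, ALPHABET>;

// Hypothetical helper: returns table[state][ch] = next state, for states 0..M.
std::vector<Row> buildTable(const std::string &pattern) {
   const int m = pattern.size();
   std::vector<Row> table(m + 1);
   for (Row &row : table) row.fill(0);     // clear all entries
   if (m == 0) return table;               // empty pattern: one state, no transitions

   int longPS = 0;                         // fallback state (longest prefix that is also a suffix)
   table[0][(unsigned char)pattern[0]] = 1;

   for (int i = 1; i <= m; i++) {
      table[i] = table[longPS];            // copy the whole fallback row
      if (i < m) {
         unsigned char ch = pattern[i];
         table[i][ch] = i + 1;             // on a match, advance to the next state
         longPS = table[longPS][ch];       // fallback state for the next row
      }
   }
   return table;
}

// Hypothetical helper: prints one line per state, restricted to the pattern's characters.
void printTable(const std::string &pattern) {
   std::vector<Row> table = buildTable(pattern);
   std::string cols;                       // distinct pattern characters, in order of first use
   for (char c : pattern)
      if (cols.find(c) == std::string::npos) cols += c;

   for (std::size_t s = 0; s < table.size(); s++) {
      std::cout << "state " << s << ":";
      for (char c : cols)
         std::cout << " " << c << "->" << table[s][(unsigned char)c];
      std::cout << "\n";
   }
}
```

Calling printTable("ABC") from main prints four rows, one per state. In state 2, for instance, the row reads A->1 B->0 C->3: a C completes the match, while an A falls back to state 1 because it may start a new occurrence.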

patternSearch(text, pattern)

Input − The main text and the pattern

Output − The indices at which the pattern is found.

Begin
   patLen := pattern length
   strLen := string length
   call fillTransTable(pattern, transTable)
   present := 0

   for each index i of text, do
      present := transTable[present, text[i]]
      if present = patLen, then
         print the location (i - patLen + 1), where the pattern starts
   done
End
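
The search loop can be traced by hand on a small case. The sketch below assumes a hard-coded table for the pattern "ABC", restricted to the alphabet {A, B, C} and worked out by following fillTransTable. The names TABLE_ABC, traceStates and printTrace are hypothetical and not part of the original listing.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Transition table for the pattern "ABC" over the alphabet {A, B, C}.
// Rows are states 0..3; columns are the characters A, B, C.
// Assumption: derived by hand from fillTransTable, for illustration only.
const int TABLE_ABC[4][3] = {
   {1, 0, 0},   // state 0: nothing matched yet
   {1, 2, 0},   // state 1: "A" matched
   {1, 0, 3},   // state 2: "AB" matched
   {1, 0, 0},   // state 3: "ABC" matched (final state)
};

// Hypothetical helper: returns the state reached after each character of text.
// Assumes text contains only the characters 'A', 'B' and 'C'.
std::vector<int> traceStates(const std::string &text) {
   std::vector<int> states;
   int present = 0;
   for (char c : text) {
      present = TABLE_ABC[present][c - 'A'];   // one table lookup per character
      states.push_back(present);
   }
   return states;
}

// Prints each step and reports a match whenever the final state (3) is reached.
void printTrace(const std::string &text) {
   std::vector<int> states = traceStates(text);
   for (std::size_t i = 0; i < text.size(); i++) {
      std::cout << text[i] << " -> state " << states[i];
      if (states[i] == 3) std::cout << "  (match at position " << i - 2 << ")";
      std::cout << "\n";
   }
}
```

Calling printTrace("ABABCABC") from main walks through the states 1, 2, 1, 2, 3, 1, 2, 3 and reports matches at positions 2 and 5. The third character shows the fallback at work: the A after "AB" does not reset the automaton to state 0 but moves it to state 1.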

Example

#include <iostream>
#include <string>
#include <vector>
#define MAXCHAR 256
using namespace std;

// Fill the transition table: transTable[state][ch] is the next state.
// The table must have pattern.size() + 1 rows of MAXCHAR zero-initialised entries.
void fillTransitionTable(const string &pattern, vector<vector<int>> &transTable) {
   int patLen = pattern.size();
   int longPS = 0;   // state for the longest proper prefix that is also a suffix

   // cast to unsigned char so that characters above 127 never give a negative index
   transTable[0][(unsigned char)pattern[0]] = 1;   // move to state 1 on the first character

   for (int i = 1; i <= patLen; i++) {
      for (int j = 0; j < MAXCHAR; j++)   // on a mismatch, behave like the fallback state
         transTable[i][j] = transTable[longPS][j];

      if (i < patLen) {                   // the final state has no "next pattern character"
         unsigned char ch = pattern[i];
         transTable[i][ch] = i + 1;       // on a match, move to the next state
         longPS = transTable[longPS][ch]; // update the fallback state for the next row
      }
   }
}

// Return the starting index of every occurrence of pattern in mainString.
vector<int> FAPatternSearch(const string &mainString, const string &pattern) {
   vector<int> locations;
   int patLen = pattern.size();
   int strLen = mainString.size();
   if (patLen == 0)
      return locations;                   // nothing to search for

   // one row per state (0..patLen), one column per character
   vector<vector<int>> transTable(patLen + 1, vector<int>(MAXCHAR, 0));
   fillTransitionTable(pattern, transTable);
   int presentState = 0;

   for (int i = 0; i < strLen; i++) {
      presentState = transTable[presentState][(unsigned char)mainString[i]];
      if (presentState == patLen)         // final state reached: pattern ends at index i
         locations.push_back(i - patLen + 1);
   }
   return locations;
}

int main() {
   string mainString = "ABAAABCDBBABCDDEBCABC";
   string pattern = "ABC";
   vector<int> locations = FAPatternSearch(mainString, pattern);

   for (int loc : locations)
      cout << "Pattern found at position: " << loc << endl;
}

Output

Pattern found at position: 4
Pattern found at position: 10
Pattern found at position: 18
raja
Published on 09-Jul-2018 12:48:56