Program to construct DFA for Regular Expression C(A + B)+


In this article, we construct a Deterministic Finite Automaton (DFA) for the Regular Expression C(A + B)+. We start with the problem and the theory behind it, then move to the implementation, and close with a test case that shows the DFA at work.

Understanding the Problem Statement

A Deterministic Finite Automaton (DFA) is a theoretical model of computation used in automata theory, a branch of theoretical computer science. It's one of the simplest types of automata and an essential concept in the study of compilers and parsers.

The task here is to program a DFA for the Regular Expression C(A + B)+. Inside the parentheses, '+' means "or", and the trailing '+' means "one or more". The expression therefore reads as 'C' followed by one or more occurrences of either 'A' or 'B'. Our goal is to write programs that check whether a given input string matches this Regular Expression.
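As a quick cross-check of this reading, the same language can be written for a regex engine. The sketch below is not part of the DFA construction; it assumes that C[AB]+ is an equivalent spelling of C(A + B)+ in Python's re syntax, and the helper name matches_with_regex is made up for illustration.

```python
import re

# Assumption: C[AB]+ is the regex-engine spelling of the textbook
# expression C(A + B)+ ('C', then one or more of 'A' or 'B').
PATTERN = re.compile(r"C[AB]+")

def matches_with_regex(s):
    # fullmatch requires the whole string to match, not just a prefix.
    return PATTERN.fullmatch(s) is not None

for s in ["CA", "CABAB", "C", "AB", "CABC"]:
    print(s, matches_with_regex(s))
```

A DFA for the expression should give the same verdicts as this helper, which makes it a handy reference when testing the programs in this article.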

Theoretical Background

A DFA consists of a set of states and transitions between those states on input symbols. It starts from an initial state and reads the input symbols. For each input symbol, it transitions to a new state until all input symbols have been read. The DFA accepts the input if and only if it ends in a final (or accepting) state.

In this case, the DFA for the Regular Expression C(A + B)+ can be visualized as follows −

  • Start state: q0

  • Accepting state: q2

  • Transitions −

    • q0 on input 'C' goes to q1

    • q1 on input 'A' or 'B' goes to q2

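    • q2 on input 'A' or 'B' stays at q2

    • any other input, from any state, goes to a dead (trap) state q3, which never leads back to an accepting state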
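The transitions above can also be written down as a lookup table. The following is a minimal sketch under a few assumptions: states are plain strings, and any (state, symbol) pair missing from the table moves to a dead state, called q3 here to match the programs in this article. The names TRANSITIONS, run_dfa, and accepts are illustrative only.

```python
# Transition table for the DFA described above.
TRANSITIONS = {
    ("q0", "C"): "q1",
    ("q1", "A"): "q2",
    ("q1", "B"): "q2",
    ("q2", "A"): "q2",
    ("q2", "B"): "q2",
}
START = "q0"
ACCEPTING = {"q2"}
DEAD = "q3"  # assumed name for the trap state

def run_dfa(s):
    """Return the state the DFA is in after reading all of s."""
    state = START
    for symbol in s:
        # A missing entry means an invalid move, so fall into the dead state.
        # The dead state has no entries, so it can never be left.
        state = TRANSITIONS.get((state, symbol), DEAD)
    return state

def accepts(s):
    return run_dfa(s) in ACCEPTING

print(run_dfa("CAB"), accepts("CAB"))  # q2 True
print(run_dfa("CC"), accepts("CC"))    # q3 False
```

Keeping the transitions in a table separates the automaton's definition from the loop that runs it, so the same loop could drive a different DFA by swapping the table.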

Example

Now let's implement this DFA in different programming languages. Note that these programs are case-sensitive and accept only uppercase 'A', 'B', and 'C'; any other character sends the DFA to the dead state q3.

C Program

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

// q0: start, q1: 'C' has been read, q2: accepting, q3: dead state
typedef enum { q0, q1, q2, q3 } State;

// Returns the state reached from currentState on the given input symbol.
State getNextState(State currentState, char input) {
   switch (currentState) {
      case q0: return (input == 'C') ? q1 : q3;
      case q1: return (input == 'A' || input == 'B') ? q2 : q3;
      case q2: return (input == 'A' || input == 'B') ? q2 : q3;
      default: return q3;   // once in q3, stay in q3
   }
}

// Runs the DFA over s and reports whether it ends in the accepting state.
bool matchesRE(const char* s) {
   State currentState = q0;
   size_t len = strlen(s);
   for (size_t i = 0; i < len; i++) {
      currentState = getNextState(currentState, s[i]);
   }
   return currentState == q2;
}

int main(void) {
   const char* test = "CABAB";
   printf("%s\n", matchesRE(test) ? "Matches Regular Expression" : "Does not match Regular Expression");
   return 0;
}

Output

Matches Regular Expression

C++ Program

#include <iostream>
#include <string>
using namespace std;

// q0: start, q1: 'C' has been read, q2: accepting, q3: dead state
enum State { q0, q1, q2, q3 };

// Returns the state reached from currentState on the given input symbol.
State getNextState(State currentState, char input) {
   switch (currentState) {
      case q0: return (input == 'C') ? q1 : q3;
      case q1: return (input == 'A' || input == 'B') ? q2 : q3;
      case q2: return (input == 'A' || input == 'B') ? q2 : q3;
      default: return q3;   // once in q3, stay in q3
   }
}

// Runs the DFA over s and reports whether it ends in the accepting state.
bool matchesRE(const string& s) {
   State currentState = q0;
   for (char c : s) {
      currentState = getNextState(currentState, c);
   }
   return currentState == q2;
}

int main() {
   string test = "CABAB";
   cout << (matchesRE(test) ? "Matches Regular Expression" : "Does not match Regular Expression") << endl;
   return 0;
}

Output

Matches Regular Expression

Java Program

public class Main {
   // q0: start, q1: 'C' has been read, q2: accepting, q3: dead state
   enum State { q0, q1, q2, q3 }

   // Returns the state reached from currentState on the given input symbol.
   static State getNextState(State currentState, char input) {
      switch (currentState) {
         case q0:
            return (input == 'C') ? State.q1 : State.q3;
         case q1:
            return (input == 'A' || input == 'B') ? State.q2 : State.q3;
         case q2:
            return (input == 'A' || input == 'B') ? State.q2 : State.q3;
         default:
            return State.q3;   // once in q3, stay in q3
      }
   }

   // Runs the DFA over s and reports whether it ends in the accepting state.
   static boolean matchesRE(String s) {
      State currentState = State.q0;
      for (char c : s.toCharArray()) {
         currentState = getNextState(currentState, c);
      }
      return currentState == State.q2;
   }

   public static void main(String[] args) {
      String test = "CABAB";
      System.out.println(matchesRE(test) ? "Matches Regular Expression" : "Does not match Regular Expression");
   }
}

Output

Matches Regular Expression

Python Program

class State:
   # q0: start, q1: 'C' has been read, q2: accepting, q3: dead state
   q0, q1, q2, q3 = range(4)

def get_next_state(current_state, symbol):
   """Return the state reached from current_state on the given symbol."""
   if current_state == State.q0:
      return State.q1 if symbol == 'C' else State.q3
   elif current_state == State.q1:
      return State.q2 if symbol in ('A', 'B') else State.q3
   elif current_state == State.q2:
      return State.q2 if symbol in ('A', 'B') else State.q3
   else:
      return State.q3  # once in q3, stay in q3

def matches_re(s):
   """Run the DFA over s and report whether it ends in the accepting state."""
   current_state = State.q0
   for c in s:
      current_state = get_next_state(current_state, c)
   return current_state == State.q2

test = 'CABAB'
print('Matches Regular Expression' if matches_re(test) else 'Does not match Regular Expression')

Output

Matches Regular Expression

Test Case

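Let's use the string "CABAB" as an example. This string starts with 'C' and is followed by a non-empty sequence of 'A's and 'B's. The DFA moves from q0 to q1 on 'C', to q2 on the first 'A', and then stays in q2 for the remaining symbols. Because it ends in the accepting state, the string matches the Regular Expression C(A + B)+, and the output of the program will be: "Matches Regular Expression".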
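To see how the DFA treats other inputs, including ones it should reject, it can help to print the path of states for each string. The sketch below is illustrative: it uses plain integers for the states (3 plays the role of the dead state q3), and the names next_state and trace as well as the extra test strings are assumptions, not part of the programs above.

```python
def next_state(state, ch):
    # Same transitions as the DFA in this article; 3 stands for q3.
    if state == 0:
        return 1 if ch == 'C' else 3
    if state in (1, 2):
        return 2 if ch in ('A', 'B') else 3
    return 3

def trace(s):
    """Return the list of states visited while reading s, starting at q0."""
    states = [0]
    for ch in s:
        states.append(next_state(states[-1], ch))
    return states

for s in ["CABAB", "C", "", "CABC", "cab"]:
    visited = trace(s)
    path = " -> ".join(f"q{n}" for n in visited)
    verdict = "accept" if visited[-1] == 2 else "reject"
    print(f"{s!r}: {path} ({verdict})")
```

Only "CABAB" ends in q2. "C" stops in q1 and the empty string stops in q0, so both are rejected without ever reaching the dead state, while "CABC" and the lowercase "cab" end in q3.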

Conclusion

In this article, we have taken a closer look at the DFA as a theoretical model of computation and at its use in checking strings against a Regular Expression. We focused on the Regular Expression C(A + B)+ and wrote programs in C, C++, Java, and Python that check whether an input string matches it. We hope this discussion has been informative and has helped you better understand DFAs and how to implement them.

Updated on: 27-Oct-2023