What is the Chomsky Hierarchy in compiler design?

The Chomsky hierarchy is a classification of formal grammars into four classes, each of which generates a class of formal languages. Each class of languages is recognized by a corresponding type of machine: Turing machines, linear bounded automata, pushdown automata, and finite state automata, for type-0 through type-3 respectively.

Chomsky proposed four classes of phrase structure grammar, as follows −

  • Type-0 Grammar (Unrestricted Grammar) − A type-0 grammar places no restrictions on its replacement (production) rules, except that at least one non-terminal must appear in the string on the left side. The language generated is called a recursively enumerable language.

    Thus, a type-0 grammar consists of

    • An alphabet ∑ of terminal symbols.

    • An alphabet V of non-terminals, including a start symbol.

    • A set of productions of the form α→β, where α and β are strings over ∑ ∪ V, α contains at least one non-terminal, and there are no restrictions on β.

    The languages generated by type-0 grammars are recognized by Turing machines. Let us consider an example; a grammar G can be represented as follows −

V=set of non−terminals={A,B,C}
T=set of terminals={a}
S=start symbol=A

and production P as follows −




  • Type-1 Grammar (Context Sensitive Grammar) − A grammar is said to be a type-1 grammar or context-sensitive grammar if it satisfies the following conditions −

    • Each production is of the form α→β, where the length of α is less than or equal to the length of β, i.e., there are no empty productions (those in which the right side is the empty string ε).

    • Each production is of the form α1Aα2→α1βα2, with β≠ε, where A is a non-terminal.

    A Turing machine whose tape is bounded by the length of the input (a linear bounded automaton) can be constructed to recognize the context-sensitive language generated by a context-sensitive grammar (CSG). The grammar G(V, T, P, S) is an example of a context-sensitive grammar, where




and the productions are given by




  • Type-2 Grammar (Context Free Grammar) − A grammar is said to be a context-free grammar (type-2 grammar) if every production is of the form A→α, where A is a non-terminal and α is a sentential form, i.e., α ∈ (V ∪ T)∗, so α can be ε. The left-hand side of a production must contain only one non-terminal.

The languages generated by type-2 grammars are recognized by pushdown automata. The grammar G(V,T,P,S)=({S},{a,b},P,S), where P consists of S→aSa|bSb|a|b, is an example of a context-free grammar; it generates the odd-length palindromes over {a,b}.
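To see how the productions S→aSa|bSb|a|b determine which strings belong to the language, the following is a minimal Python sketch of a recognizer written directly from those four productions. The function name `derives` is an assumption made for illustration; it is not part of any library, and a general context-free recognizer would need a stack-based or table-based parser rather than this grammar-specific recursion.

```python
def derives(s: str) -> bool:
    """Return True if s can be derived from S using S -> aSa | bSb | a | b."""
    # Base productions: S -> a and S -> b.
    if s in ("a", "b"):
        return True
    # Recursive productions: S -> aSa and S -> bSb.
    # The first and last symbols must be the same terminal,
    # and the middle part must itself be derivable from S.
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])
    return False


if __name__ == "__main__":
    for word in ["a", "aba", "abbba", "abba", "ab", ""]:
        print(repr(word), derives(word))
```

Strings such as "aba" and "abbba" are accepted, while "abba" is rejected because every derivation in this grammar ends with the single symbol a or b in the middle.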

  • Type-3 Grammar (Regular Grammar) − A grammar is said to be a type-3 grammar if every production is of the form A→a or A→aB, i.e., the left-hand side of each production contains only one non-terminal, and the right-hand side is a single terminal, optionally followed by a single non-terminal.

The language generated by this grammar is recognized by a finite state machine. Regular languages can also be described more compactly by regular expressions.
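The restrictions that separate the four types can be checked mechanically. The sketch below is one possible way to do this in Python, under several assumptions made only for illustration: each symbol is a single character, a production is a pair of strings (left side, right side), the empty string "" stands for ε, and any symbol not listed as a non-terminal is treated as a terminal. The function name `chomsky_type` is hypothetical. For type-1 it tests only the length condition |α| ≤ |β|, and it follows the definitions exactly as stated above, so it does not model refinements such as a permitted S→ε rule.

```python
def chomsky_type(productions, nonterminals):
    """Return the highest type number (3, 2, 1 or 0) whose restrictions
    are satisfied by every production in the list."""

    def has_nonterminal(s):
        return any(sym in nonterminals for sym in s)

    def is_type3(lhs, rhs):
        # Regular: A -> a or A -> aB.
        if len(lhs) != 1 or lhs not in nonterminals:
            return False
        if len(rhs) == 1:
            return rhs not in nonterminals
        if len(rhs) == 2:
            return rhs[0] not in nonterminals and rhs[1] in nonterminals
        return False

    def is_type2(lhs, rhs):
        # Context-free: a single non-terminal on the left, any right side.
        return len(lhs) == 1 and lhs in nonterminals

    def is_type1(lhs, rhs):
        # Context-sensitive (length condition): |lhs| <= |rhs|.
        return has_nonterminal(lhs) and len(lhs) <= len(rhs)

    def is_type0(lhs, rhs):
        # Unrestricted: the left side contains at least one non-terminal.
        return has_nonterminal(lhs)

    checks = [(3, is_type3), (2, is_type2), (1, is_type1), (0, is_type0)]
    for type_number, check in checks:
        if all(check(lhs, rhs) for lhs, rhs in productions):
            return type_number
    raise ValueError("some left-hand side contains no non-terminal")


if __name__ == "__main__":
    palindromes = [("S", "aSa"), ("S", "bSb"), ("S", "a"), ("S", "b")]
    print(chomsky_type(palindromes, {"S"}))
    print(chomsky_type([("A", "aB"), ("B", "b")], {"A", "B"}))
```

The checks run from the most restrictive type to the least restrictive, so the first type whose condition holds for all productions is the one reported.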