Removal of Null Productions in CFG



In this chapter, we will see how to simplify context-free grammars (CFGs) by removing null productions. In the last two chapters, we covered the other two simplification methods: the removal of unit productions and the removal of unwanted productions. Removing null productions is the remaining important step in CFG simplification.

We will start by defining null productions, then move on to a step-by-step procedure for their removal, illustrated with a clear example for a better understanding.

What are Null Productions?

In a CFG, we might encounter productions where a non-terminal symbol derives the empty string, represented by the symbol 'ε' (epsilon). These are known as null productions.

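A non-terminal symbol 'A' in a CFG is called nullable if either of the following holds −

  • There exists a production rule A → ε (A directly derives null).
  • There exists a derivation starting from A that ultimately leads to ε (for example, A → BC where both B and C are nullable).

The productions of the form A → ε are the null productions we aim to eliminate. Productions whose right-hand side contains nullable variables must be adjusted at the same time so that the grammar still generates the same language.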
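The definition of a nullable variable can be turned into a small fixed-point computation. The following Python sketch is only an illustration: the dictionary representation (each non-terminal mapped to a list of right-hand sides, with the empty tuple standing for ε), the function name nullable_symbols, and the tiny sample grammar are our own assumptions, not part of the chapter's example.

```python
def nullable_symbols(grammar):
    """Return the set of non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body () is ε; all() over an empty tuple is True,
            # so the direct case A → ε is covered by the same test.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S → AB, A → ε, B → A | b
sample = {
    "S": [("A", "B")],
    "A": [()],
    "B": [("A",), ("b",)],
}
print(sorted(nullable_symbols(sample)))  # ['A', 'B', 'S']
```

Here A is nullable directly, B is nullable through B → A, and S becomes nullable only once both A and B are known to be nullable.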

Steps to Remove Null Productions

Let us see the steps to remove null productions. We first identify them and then systematically replace them with equivalent productions that do not include epsilon.

Let's break down the procedure into three simple steps −

Step 1: Identify Null Productions

Check the grammar for productions of the form A → ε. Identify any non-terminal symbols that can derive ε through a series of productions.

Step 2: Find Productions Containing Nullable Variables

For each identified null production (A → ε), locate all other productions where 'A' appears on the right-hand side.

Step 3: Replace Nullable Variables with Epsilon

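For every production found in Step 2, create new productions by replacing occurrences of the nullable variable ('A' in our example) with ε, that is, by dropping them. If the variable occurs more than once, cover every combination of occurrences. Add these newly generated productions to the grammar and delete the null production A → ε itself.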
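Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration under our own assumptions: right-hand sides are tuples of symbols, the empty tuple stands for ε, the set of nullable variables is passed in explicitly, and the names expand_body and remove_null_productions are hypothetical. It also ignores the case where the start symbol itself is nullable, in which ε would have to be kept in the language separately.

```python
from itertools import product


def expand_body(body, nullable):
    """Return every non-empty variant of body with nullable symbols dropped."""
    # A nullable symbol may be kept or dropped; any other symbol must be kept.
    choices = [((sym,), ()) if sym in nullable else ((sym,),) for sym in body]
    variants = set()
    for pick in product(*choices):
        variant = tuple(sym for part in pick for sym in part)
        if variant:  # skip the empty variant, i.e. a new null production
            variants.add(variant)
    return variants


def remove_null_productions(grammar, nullable):
    """Rebuild the grammar without ε-productions (start symbol not nullable)."""
    return {
        head: sorted({v for body in bodies for v in expand_body(body, nullable)})
        for head, bodies in grammar.items()
    }


# Hypothetical grammar: S → AbA, A → a | ε
small = {"S": [("A", "b", "A")], "A": [("a",), ()]}
for head, bodies in remove_null_productions(small, {"A"}).items():
    print(head, "→", " | ".join("".join(body) for body in bodies))
```

Processing all nullable variables in one pass gives the same productions as eliminating them one at a time, as long as every combination of dropped occurrences is generated.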

Example of Removing Null Productions in CFG

Let us see the idea through an example for a clear understanding.

Consider the following grammar −

  • Start Symbol: S
  • Non-terminal Symbols: A, B, C
  • Terminal Symbols: a, b, c
  • Production Rules:
    • S → ABAC
    • A → aA | ε
    • B → bB | ε
    • C → c

Step 1. Identifying Null Productions

We can directly observe two null productions −

  • A → ε
  • B → ε

Step 2. Eliminating 'A → ε'

Productions containing 'A' on the right-hand side −

  • S → ABAC
  • A → aA

Replacing 'A' with ε

  • Replacing first A: S → BAC
  • Replacing second A: S → ABC
  • Replacing both A: S → BC
  • From A → aA: A → a

Updating the grammar −

  • S → ABAC | ABC | BAC | BC
  • A → aA | a
  • B → bB | ε
  • C → c

Step 3. Eliminating 'B → ε'

Productions containing 'B' on the right-hand side −

  • S → ABAC | ABC | BAC | BC
  • B → bB

Replacing 'B' with ε

  • From S → ABAC, replacing B: S → AAC
  • From S → ABC, replacing B: S → AC
  • From S → BAC, replacing B: S → AC (already generated above)
  • From S → BC, replacing B: S → C
  • From B → bB, replacing B: B → b

Updating the grammar (final grammar) −

  • S → ABAC | ABC | BAC | BC | AAC | AC | C
  • A → aA | a
  • B → bB | b
  • C → c

This grammar is free of null productions, but it still contains one unit production, S → C. Since C → c, we can replace S → C with S → c. The reduced grammar is then −

  • S → ABAC | ABC | BAC | BC | AAC | AC | c
  • A → aA | a
  • B → bB | b
  • C → c
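One way to sanity-check the result is to enumerate all short strings generated by the original grammar and by the reduced grammar and compare the two sets. The Python sketch below is an illustration under our own assumptions: the function name words_up_to and the tuple-based grammar encoding are hypothetical, and the search is only guaranteed to stop for grammars like these two, where a sentential form cannot keep growing without gaining terminals. Agreement up to a length bound is evidence, not a proof of equivalence.

```python
from collections import deque


def words_up_to(grammar, start, max_len):
    """Collect every terminal string of length <= max_len derivable from start.

    Keys of `grammar` are non-terminals; every other symbol is a terminal.
    Bodies are tuples of symbols, with () standing for ε.
    """
    seen = {(start,)}
    queue = deque([(start,)])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if there is one.
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:
            words.add("".join(form))
            continue
        for body in grammar[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            # Terminals never disappear, so overly long forms can be dropped.
            if sum(s not in grammar for s in new_form) > max_len:
                continue
            if new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


original = {
    "S": [tuple("ABAC")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ()],
    "C": [("c",)],
}
reduced = {
    "S": [tuple(w) for w in "ABAC ABC BAC BC AAC AC c".split()],
    "A": [("a", "A"), ("a",)],
    "B": [("b", "B"), ("b",)],
    "C": [("c",)],
}
print(words_up_to(original, "S", 5) == words_up_to(reduced, "S", 5))
```

Both grammars generate strings of the form a…ab…ba…ac, where each run may be empty and the final c is always present, so the empty string is in neither language.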

Conclusion

In this chapter, we explained in detail how to remove null productions from context-free grammars. We removed the null productions from our example CFG step by step. The final grammar is equivalent to the original but does not contain any productions that directly derive epsilon.

We also removed one unit production that appeared in the resulting grammar. This simplification is beneficial for various CFG applications, including parsing and language processing tasks.
