What is Design of Lexical Analysis in Compiler Design?
A lexical analyzer can be designed using transition diagrams.
Finite Automata (Transition Diagram) − A directed graph or flowchart used to recognize a token.
A transition diagram has two parts −
States − Represented by circles.
Edges − Arrows that connect the states.
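As a minimal sketch of how states and edges can be held in a program, the diagram can be stored as a dictionary: each key is a state (a circle) and each inner entry is a labelled edge (an arrow). The state numbers, the names `DIAGRAM`, `FINAL_STATES` and `accepts`, and the choice of the letters "i" and "f" as edge labels are assumptions made for illustration.

```python
# States are the dictionary keys; each inner entry is one labelled edge.
# Illustrative diagram: 0 --i--> 1 --f--> 2, with state 2 as the final state.
DIAGRAM = {
    0: {"i": 1},
    1: {"f": 2},
}
FINAL_STATES = {2}


def accepts(text):
    """Follow one edge per character from state 0; accept only in a final state."""
    state = 0
    for ch in text:
        state = DIAGRAM.get(state, {}).get(ch)
        if state is None:      # no edge with this label leaves the current state
            return False
    return state in FINAL_STATES


print(accepts("if"))   # True
print(accepts("i"))    # False
```

This version only checks a whole string against the diagram. It does not yet look at the character after the token, which is what the "if" example requires.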
Example − Draw Transition Diagram for "if" keyword.
To recognize the token ("if"), the lexical analyzer must also read the character that follows "f". That next character decides whether the input is the keyword "if" or something else.
So, a blank space after "if" confirms that "if" is a keyword.
The "*" on final state 3 means Retract, i.e., the input pointer moves back one character to where it was in state 2. Therefore, the blank space is not part of the token ("if").
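The walk through states 0 to 3, including the retract step, might look like the following sketch. The function name `match_if`, the `forward` pointer, and the use of an empty string for end of input are assumptions for illustration. As in the text, only a blank space is accepted as the character after "f".

```python
def match_if(source, pos=0):
    """Return (lexeme, next_pos) if the keyword "if" starts at pos, else None."""
    expected = {0: "i", 1: "f", 2: " "}   # label of the edge leaving each state
    state, forward = 0, pos
    while state != 3:
        # "" stands for end of input (an assumption of this sketch).
        ch = source[forward] if forward < len(source) else ""
        if ch != expected[state]:
            return None                    # no matching edge: not the keyword
        state += 1
        forward += 1
    forward -= 1                           # "*" on state 3: retract one character
    return source[pos:forward], forward


print(match_if("if x"))   # ('if', 2)
print(match_if("ifx"))    # None
```

After the retract, `next_pos` points at the blank space, so the blank is left in the input and is not part of the returned lexeme.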
Transition Diagram for an Identifier − An identifier starts with a letter, followed by any number of letters or digits. The transition diagram will be:
For example, in the statement int a2; the transition diagram for the identifier a2 will be:
Since (;) is not part of the identifier ("a2"), "*" is used for Retract, i.e., the input pointer moves back one character so that everything read up to state 1 is recognized as the identifier ("a2").
The transition diagram for an identifier can be converted to program code as −
Coding

    State 0: C = Getchar()
             If letter(C) then goto state 1
             else fail
    State 1: C = Getchar()
             If letter(C) or Digit(C) then goto state 1
             else if Delimiter(C) goto state 2
             else Fail
    State 2: Retract()
             return (6, Install())
In state 2, Retract() moves the input pointer one character back, so the delimiter is not consumed, and declares that whatever was read up to state 1 is the token.
The lexical analyzer returns the token to the parser not as an English word but as a pair, i.e., (integer code, value).
For an identifier, the integer code returned to the parser is 6, as shown in the table.
Install() − Returns a pointer to the symbol table, i.e., the address of the token's entry.
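The three states of the pseudocode above can be sketched in Python as follows. This is one possible rendering, not a definitive implementation: the list-based `symbol_table`, the use of a list index as the "pointer", the delimiter set in `is_delimiter`, and the treatment of end of input as a delimiter are all assumptions.

```python
IDENTIFIER = 6        # integer code for identifiers, as given in the text

symbol_table = []     # hypothetical symbol table: a plain list of lexemes


def install(lexeme):
    """Store the lexeme if it is new and return its index (its "pointer")."""
    if lexeme not in symbol_table:
        symbol_table.append(lexeme)
    return symbol_table.index(lexeme)


def is_delimiter(ch):
    """Assumed delimiter set; "" stands for end of input."""
    return ch == "" or ch in " \t\n;,()=<>+-*/"


def scan_identifier(source, pos=0):
    """Return ((code, value), next_pos), or None if no identifier starts at pos."""
    forward = pos
    # State 0: the first character must be a letter.
    ch = source[forward] if forward < len(source) else ""
    if not ch.isalpha():
        return None
    forward += 1
    # State 1: loop on letters and digits until a delimiter appears.
    while True:
        ch = source[forward] if forward < len(source) else ""
        forward += 1
        if ch.isalpha() or ch.isdigit():
            continue
        if is_delimiter(ch):
            break
        return None                      # fail: neither letter, digit nor delimiter
    # State 2: retract so the delimiter is not consumed, then return the pair.
    forward -= 1
    return (IDENTIFIER, install(source[pos:forward])), forward


print(scan_identifier("int a2;", 4))   # ((6, 0), 6)
```

Here the value 0 is simply the index of "a2" in the list. The returned position 6 points at the semicolon, which stays in the input for the next token.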
The following table shows the integer code and value of various tokens returned by lexical analysis to the parser.
Integer Codes for different Tokens
Token | Integer Code | Value |
---|---|---|
Begin | 1 | - |
End | 2 | - |
If | 3 | - |
Then | 4 | - |
Else | 5 | - |
Identifier | 6 | Pointer to Symbol Table |
Constants | 7 | Pointer to Symbol Table |
< | 8 | 1 |
<= | 8 | 2 |
= | 8 | 3 |
<> | 8 | 4 |
> | 8 | 5 |
>= | 8 | 6 |
These integer values are not fixed. Different programmers can choose other integer codes and values when designing the lexical analyzer.
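As an illustration of how the relational-operator rows of the table could be used, the sketch below maps each operator to its (integer code, value) pair. It uses a dictionary lookup that tries two-character operators first, rather than an explicit transition diagram with retract. The names `RELOP`, `RELOP_VALUES` and `scan_relop` are assumptions.

```python
RELOP = 8   # integer code shared by all relational operators in the table

# Values copied from the table above.
RELOP_VALUES = {"<": 1, "<=": 2, "=": 3, "<>": 4, ">": 5, ">=": 6}


def scan_relop(source, pos=0):
    """Return ((code, value), next_pos) for the longest operator at pos, else None."""
    # Try two characters first so "<=" is not split into "<" and "=".
    for length in (2, 1):
        lexeme = source[pos:pos + length]
        if lexeme in RELOP_VALUES:
            return (RELOP, RELOP_VALUES[lexeme]), pos + len(lexeme)
    return None


print(scan_relop("<= b"))   # ((8, 2), 2)
print(scan_relop("<5"))     # ((8, 1), 1)
```

Checking the longer lexeme first plays the same role as reading one extra character and retracting when it does not extend the operator.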
Suppose the identifier is stored at location 236 in the symbol table. Then
Integer code = 6
Install() = 236, i.e., the pair will be (6, 236)
Similarly, if a constant is stored at location 238, then
Integer code = 7
Install() = 238, i.e., the pair will be (7, 238)
Transition Diagram (Finite Automata) for Tokens −
