What is Lexical Analysis?
Lexical Analysis is the first phase of a compiler. It reads the source code one character at a time and transforms it into a stream of tokens.
Token − A token is a meaningful sequence of characters in a program. Tokens can be keywords (do, if, while, etc.), identifiers (x, num, count, etc.), operator symbols (>, >=, +, etc.), and punctuation symbols such as parentheses and commas. The output of the lexical analysis phase is passed to the next phase, called syntax analysis or parsing.
Example − The statement a = b + 5 is broken into the tokens a, =, b, +, and 5, as worked out in Example 1 below.
Role of Lexical Analysis
The main functions of lexical analysis are as follows −
- It separates tokens from the program and returns them to the parser as requested.
- It eliminates comments, whitespace, newline characters, etc. from the source string.
- It inserts tokens into the symbol table.
- It returns an integer code for each token to the parser.
- It correlates error messages produced by the compiler with the source program, for example by keeping track of line numbers.
- It implements the expansion of macros when a macro pre-processor is used in the source code.
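The scanning and whitespace-elimination functions above can be illustrated with a minimal lexer sketch for statements like a = b + 5. The token names and the single-character scanning strategy are assumptions of this sketch, not any particular compiler's implementation.

```python
# Minimal sketch of a lexical analyzer for statements like "a = b + 5".
# Token names (IDENTIFIER, CONSTANT, ...) are illustrative assumptions.

def tokenize(source):
    """Scan the source one character at a time and return a list of
    (token_type, lexeme) pairs, skipping whitespace."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():                      # eliminate whitespace/newlines
            i += 1
        elif ch.isalpha():                    # identifier: a letter, then letters/digits
            j = i
            while j < len(source) and source[j].isalnum():
                j += 1
            tokens.append(("IDENTIFIER", source[i:j]))
            i = j
        elif ch.isdigit():                    # integer constant
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("CONSTANT", source[i:j]))
            i = j
        elif ch == "=":
            tokens.append(("ASSIGN-OPERATOR", ch))
            i += 1
        elif ch == "+":
            tokens.append(("ADD-OPERATOR", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return tokens

print(tokenize("a = b + 5"))
# [('IDENTIFIER', 'a'), ('ASSIGN-OPERATOR', '='), ('IDENTIFIER', 'b'),
#  ('ADD-OPERATOR', '+'), ('CONSTANT', '5')]
```

A real scanner is usually generated from regular expressions (e.g., by a tool such as Flex) rather than hand-written this way, but the character-by-character loop shows the idea.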
Example 1 − What operations will the lexical analysis phase perform on the input string a = b + 5?
Solution
Identify the tokens and their types.
Token Type | Value
---|---
IDENTIFIER | a
IDENTIFIER | b
ASSIGN-OPERATOR | =
ADD-OPERATOR | +
CONSTANT | 5
Put information about the tokens into the symbol table.

Address | Token Type, Value
---|---
330 | id, integer, value = a
332 | id, integer, value = b
. . . | . . .
360 | constant, integer, value = 5
After finding the tokens and storing them in the symbol table, a token stream is generated as follows −
[id, 330] = [id, 332] + [const, 360]
where each pair is of the form [token-type, index]
token-type − It tells whether the token is a constant, identifier, label, etc.
index − It gives the address of the token's entry in the symbol table.
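The symbol-table and token-stream steps above can be sketched in code. The `build_symbol_table` helper and its token encoding are hypothetical; the starting address 330 and the two-byte spacing mirror the table above but are otherwise arbitrary choices of this sketch.

```python
# Hypothetical sketch: build a symbol table and emit a token stream of
# [token-type, index] pairs for "a = b + 5". Addresses (base 330, step 2)
# follow the worked example above and are illustrative only.

def build_symbol_table(tokens, base=330, step=2):
    table = {}                 # lexeme -> symbol-table address
    address = base
    stream = []
    for kind, lexeme in tokens:
        if kind in ("id", "const"):
            if lexeme not in table:        # one entry per distinct name/value
                table[lexeme] = address
                address += step
            stream.append((kind, table[lexeme]))
        else:
            stream.append((kind, lexeme))  # operators get no table entry
    return table, stream

tokens = [("id", "a"), ("op", "="), ("id", "b"), ("op", "+"), ("const", "5")]
table, stream = build_symbol_table(tokens)
print(table)   # {'a': 330, 'b': 332, '5': 334}
print(stream)  # [('id', 330), ('op', '='), ('id', 332), ('op', '+'), ('const', 334)]
```

Note that this sketch assigns consecutive addresses, whereas the table above leaves a gap before the constant; real symbol tables lay entries out according to their own storage policy.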
Example 2 − What operations will lexical analysis perform on the statement If (A=10) then GOTO 200?
Solution
Tokens will be
Token Type | Values
---|---
Keywords | If, then, GOTO
Identifiers | A
Assign-Operators | =
Labels | 200
Delimiters | (, )
Symbol Table
Address | Token Type, Value
---|---
236 | id, integer, value = A
238 | constant, integer, value = 10
. . . | . . .
288 | label, value = 200
The token stream will be −
If ([id, 236] = [constant, 238]) then GOTO [label, 288]
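The classification in Example 2 can be sketched with a small regex-based scanner. The `classify` helper is hypothetical, and treating the number that follows GOTO as a label (rather than a constant) is an assumption of this sketch.

```python
import re

# Illustrative sketch: classify the tokens of "If (A=10) then GOTO 200"
# into the categories used in the table above. The keyword set and the
# GOTO-makes-a-label rule are assumptions of this sketch.

KEYWORDS = {"If", "then", "GOTO"}

def classify(source):
    """Return (category, lexeme) pairs for a simple conditional statement."""
    tokens = []
    after_goto = False
    for m in re.finditer(r"[A-Za-z]+|\d+|[()=]", source):
        lex = m.group()
        if lex in KEYWORDS:
            tokens.append(("Keyword", lex))
            after_goto = (lex == "GOTO")
        elif lex.isalpha():
            tokens.append(("Identifier", lex))
        elif lex.isdigit():
            # a number right after GOTO is a jump target, i.e. a label
            tokens.append(("Label" if after_goto else "Constant", lex))
        elif lex == "=":
            tokens.append(("Assign-Operator", lex))
        else:
            tokens.append(("Delimiter", lex))
    return tokens

print(classify("If (A=10) then GOTO 200"))
# [('Keyword', 'If'), ('Delimiter', '('), ('Identifier', 'A'),
#  ('Assign-Operator', '='), ('Constant', '10'), ('Delimiter', ')'),
#  ('Keyword', 'then'), ('Keyword', 'GOTO'), ('Label', '200')]
```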