# What is Lexical Analysis?


Lexical analysis is the first phase of a compiler. It reads the source code one character at a time and groups the characters into a sequence of tokens.

Token − A token is a meaningful sequence of characters in a program. Tokens can be keywords such as do, if, and while; identifiers such as x, num, and count; operator symbols such as >, >=, and +; and punctuation symbols such as parentheses and commas. The output of the lexical analyzer is passed to the next phase, the syntax analyzer (parser).

Example − The statement a = b + 5 has the tokens a, =, b, +, and 5.
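The grouping of characters into tokens can be sketched with regular expressions. The following is a minimal illustration, not how any particular compiler does it; the category names (KEYWORD, IDENTIFIER, and so on), the `TOKEN_SPEC` table, and the `tokenize` function are assumptions made for this example.

```python
import re

# Hypothetical token categories, tried in order; keywords come before identifiers.
TOKEN_SPEC = [
    ("KEYWORD",     r"\b(?:do|if|while)\b"),
    ("IDENTIFIER",  r"[A-Za-z_]\w*"),
    ("CONSTANT",    r"\d+"),
    ("OPERATOR",    r">=|[>+=]"),
    ("PUNCTUATION", r"[(),]"),
    ("SKIP",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("a = b + 5"))
# [('IDENTIFIER', 'a'), ('OPERATOR', '='), ('IDENTIFIER', 'b'), ('OPERATOR', '+'), ('CONSTANT', '5')]
```

Listing the keyword pattern before the identifier pattern is a design choice of this sketch: it lets reserved words win when both patterns could match the same text.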

## Role of Lexical Analysis

The main functions of lexical analysis are as follows −

• It separates tokens from the program and returns them to the parser as the parser requests them.

• It eliminates comments, whitespace, newline characters, etc. from the source program.

• It can insert tokens, such as identifiers, into the symbol table.

• It returns an integer code for each token to the parser.

• It correlates the error messages produced by the compiler with the source program, for example by keeping track of line numbers.

• It can expand macros when the source language uses a macro preprocessor.
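Several of the functions above (discarding comments and whitespace, returning an integer code per token, and tying error messages to source lines) can be sketched together. This is only an illustration under stated assumptions: the integer codes in `TOKEN_CODES`, the `//` comment syntax, and the `scan` function are all hypothetical, and symbol table insertion and macro expansion are left out.

```python
import re

# Hypothetical integer codes for token classes; a real compiler defines its own.
TOKEN_CODES = {"IDENTIFIER": 1, "CONSTANT": 2, "OPERATOR": 3}

TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*"),   # assumed comment syntax: // to end of line
    ("NEWLINE",    r"\n"),
    ("SPACE",      r"[ \t]+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("CONSTANT",   r"\d+"),
    ("OPERATOR",   r"[=+]"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def scan(source):
    """Return (tokens, errors); each token is a (code, lexeme, line) triple."""
    tokens, errors = [], []
    line, pos = 1, 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            # Report the error against the current source line, then skip the character.
            errors.append(f"line {line}: unexpected character {source[pos]!r}")
            pos += 1
            continue
        kind = match.lastgroup
        if kind == "NEWLINE":
            line += 1                      # counted for error messages, not returned
        elif kind not in ("COMMENT", "SPACE"):
            tokens.append((TOKEN_CODES[kind], match.group(), line))
        pos = match.end()
    return tokens, errors


tokens, errors = scan("a = b + 5 // add\nc = a $ 2\n")
print(tokens)
print(errors)  # ["line 2: unexpected character '$'"]
```

The scanner keeps going after an unknown character instead of stopping, so a single run can report more than one lexical error.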

Example1 − What operations does the lexical analysis phase perform on the input string a = b + 5?

Solution

• Find out tokens and their types.

| Token Type | Values |
|---|---|
| IDENTIFIER | a |
| IDENTIFIER | b |
| ASSIGN-OPERATOR | = |
| ADD-OPERATOR | + |
| CONSTANT | 5 |

• Put information about the tokens into the symbol table.

| Address | Entry |
|---|---|
| 330 | id, integer, value = a |
| 332 | id, integer, value = b |
| ... | ... |
| 360 | constant, integer, value = 5 |
| ... | ... |

• After finding out tokens and storing them into the symbol table, a token stream is generated as follows −

[id, 330] = [id, 332] + [const, 360]

where each pair is of the form [token-type, index]

token-type − It tells whether the token is a constant, identifier, label, etc.

index − It gives the address of the token's entry in the symbol table.
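The symbol table step and the resulting token stream can be sketched as follows. This is a minimal sketch under assumptions: the `SymbolTable` class and `token_stream` function are hypothetical names, and indices are handed out sequentially from 0, so they will not match the addresses 330, 332, and 360 used in the example, whose allocation scheme is not given.

```python
import re

# One token per match: identifier, integer constant, or one of the operators = and +.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<const>\d+)|(?P<op>[=+]))")


class SymbolTable:
    """Hypothetical symbol table: each distinct (kind, lexeme) gets one index."""

    def __init__(self):
        self.entries = []    # index -> (kind, lexeme)
        self.index_of = {}   # (kind, lexeme) -> index

    def insert(self, kind, lexeme):
        key = (kind, lexeme)
        if key not in self.index_of:
            self.index_of[key] = len(self.entries)
            self.entries.append(key)
        return self.index_of[key]


def token_stream(source, table):
    """Replace identifiers and constants with [token-type, index] pairs."""
    source = source.rstrip()
    stream = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = match.lastgroup
        lexeme = match.group(kind)
        if kind == "op":
            stream.append(lexeme)                      # operators stay as they are
        else:
            stream.append(f"[{kind}, {table.insert(kind, lexeme)}]")
        pos = match.end()
    return " ".join(stream)


table = SymbolTable()
print(token_stream("a = b + 5", table))  # [id, 0] = [id, 1] + [const, 2]
print(table.entries)                     # [('id', 'a'), ('id', 'b'), ('const', '5')]
```

Because `insert` reuses an existing entry, an identifier that appears twice in the input refers to the same index both times.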

Example2 − What operations does lexical analysis perform on the statement If (A=10) then GOTO 200?

Solution

• Tokens will be

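| Token Type | Values |
|---|---|
| Keywords | If, then, GOTO |
| Identifiers | A |
| Assign-Operators | = |
| Constants | 10 |
| Label | 200 |
| Delimiters | (, ) |

• After the identifier, constant, and label are entered into the symbol table, the token stream is −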

If ([id, 236] = [constant, 238]) then GOTO [label, 288]
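The classification in this example can be sketched in code. Note that 10 and 200 are both digit strings, so something beyond the characters themselves has to separate a constant from a label. The sketch below assumes, purely for illustration, that a number directly after GOTO is a label and that keywords are case-insensitive; the `classify` function and the category names are hypothetical.

```python
import re

# Assumed to be case-insensitive, since the statement writes both "If" and "GOTO".
KEYWORDS = {"if", "then", "goto"}

# Words, digit strings, and the single-character symbols used in the statement.
# Note: findall silently skips any character that matches none of these.
LEXEME_RE = re.compile(r"[A-Za-z_]\w*|\d+|[=()]")


def classify(statement):
    """Return a list of (token_type, lexeme) pairs for the statement."""
    result = []
    previous = None
    for lexeme in LEXEME_RE.findall(statement):
        if lexeme.lower() in KEYWORDS:
            kind = "keyword"
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            kind = "identifier"
        elif lexeme.isdigit():
            # Assumption: a number right after GOTO names a label, otherwise a constant.
            kind = "label" if previous == "goto" else "constant"
        elif lexeme == "=":
            kind = "assign-operator"
        else:
            kind = "delimiter"
        result.append((kind, lexeme))
        previous = lexeme.lower()
    return result


for kind, lexeme in classify("If (A=10) then GOTO 200"):
    print(f"{kind:16}{lexeme}")
```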