Working of Preprocessor in C



When we write a C program, it goes through multiple steps before becoming an executable file. The very first step is preprocessing, which is handled by the C Preprocessor. It prepares the source code for compilation by processing all instructions that start with the # symbol, called preprocessor directives.

Here is the overall build process of a C program, showing where the preprocessor comes in −

[Figure: build flow of a C program]

In this chapter, we'll see what the preprocessor does, how it works, and why it is important in a C program.

What is a preprocessor?

The preprocessor is a program that runs as the very first stage of compilation. It looks for preprocessor directives and performs tasks like including header files, defining constants, or selectively compiling parts of the code. It outputs the modified source code, conventionally saved with the '.i' extension, which then goes to the compiler.

For example −

#include <stdio.h>
#define PI 3.14

Here, #include and #define are preprocessor directives. #include inserts the contents of the standard header stdio.h, which declares functions such as printf, and #define creates a macro named PI that stands for the constant 3.14.
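
As a minimal sketch of how these two directives are typically used together, the helpers below rely on printf from <stdio.h> and on the PI macro. The names circle_area and print_circle_area are hypothetical and only serve the illustration −

```c
#include <stdio.h>   /* brings in the declaration of printf */
#define PI 3.14      /* every later PI token is replaced by 3.14 */

/* Hypothetical helper: area of a circle with the given radius. */
double circle_area(double radius) {
    return PI * radius * radius;   /* compiler sees: 3.14 * radius * radius */
}

/* Hypothetical helper: prints the area using printf from <stdio.h>. */
void print_circle_area(double radius) {
    printf("Area = %f\n", circle_area(radius));
}
```

By the time the compiler runs, the name PI no longer exists in the code; only the number 3.14 remains.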

Main Functions of Preprocessor

The preprocessor works purely with text and does not understand C language syntax, variables, functions, or data types. It performs text manipulation based on preprocessor directives. Its main functions are −

  • It removes all comments (// or /* */).
  • It expands macros wherever they are used by replacing them with their definitions.
  • It includes files based on standard header files or custom files.
  • It selects only the required parts of the code for compilation.
  • It handles special directives that adjust reported line numbers (#line), raise errors (#error), or pass instructions to the compiler (#pragma).
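
Because the preprocessor only substitutes text, it knows nothing about operator precedence. The sketch below illustrates this with two made-up macros, DOUBLE_BAD and DOUBLE_OK; the function names are also hypothetical −

```c
/* Unparenthesized macro: the replacement text is pasted in as-is. */
#define DOUBLE_BAD(x) x + x
/* Parenthesized version keeps the expansion together as one operand. */
#define DOUBLE_OK(x) ((x) + (x))

int triple_of_double_bad(void) {
    return 3 * DOUBLE_BAD(2);   /* expands to: 3 * 2 + 2 */
}

int triple_of_double_ok(void) {
    return 3 * DOUBLE_OK(2);    /* expands to: 3 * ((2) + (2)) */
}
```

The first function returns 8 and the second returns 12, even though both look like "three times double of two" in the source.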

Working of Preprocessor in C

Let's go through the steps below to understand how the preprocessor works internally before compilation begins. The steps are a simplified model; a real preprocessor may interleave some of them.

Step 1: Reading the File

The preprocessor opens your .c source file and reads it as plain text. It breaks the program into tokens, which are the smallest units like keywords, symbols, and numbers.

For example −

int main() {
    return 0;
}

The tokens for this program will be: int, main, (, ), {, return, 0, ;, }. These tokens are used in the next steps to identify preprocessor instructions.
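
Since the preprocessor works on tokens rather than on single characters, it can also build new tokens. The sketch below uses the ## (token-pasting) operator, which this chapter does not otherwise cover; MAKE_NAME and the variable names are made up for illustration −

```c
/* ## joins two preprocessing tokens into one new token. */
#define MAKE_NAME(prefix, id) prefix##id

int MAKE_NAME(value, 1) = 10;   /* becomes: int value1 = 10; */
int MAKE_NAME(value, 2) = 20;   /* becomes: int value2 = 20; */

int sum_of_values(void) {
    return value1 + value2;     /* the pasted names are ordinary identifiers */
}
```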

Step 2: Looking for Preprocessor Instructions

The preprocessor scans the program line by line to identify preprocessor directives. These lines start with # (optionally preceded by whitespace), such as #include, #define, and #ifdef. At this step, it only marks these lines for further processing and does not act on them yet.

For example −

#include <stdio.h>   // Preprocessor instruction
#define MAX 100      // Preprocessor instruction
int x = MAX;         // Normal C code

Here, only the lines starting with # are recognized as preprocessor instructions. The rest of the code is ignored at this step.
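
A small sketch of what does and does not count as a directive line is shown below. It assumes nothing beyond standard C; the names INDENTED_LIMIT, INSIDE_STRING, and the two functions are hypothetical −

```c
  #  define INDENTED_LIMIT 42   /* still a directive: spaces around # are allowed */

/* Not a directive: this # sits inside a string literal, not at the start of a line. */
static const char *not_a_directive = "#define INSIDE_STRING 1";

#ifdef INSIDE_STRING
#error "INSIDE_STRING should not be defined"   /* never reached */
#endif

int indented_limit(void) { return INDENTED_LIMIT; }
const char *literal_text(void) { return not_a_directive; }
```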

Step 3: Removing Comments

Before processing any directives, the preprocessor removes all comments from the code. This includes both single-line (//) and multi-line (/* */) comments.

For example −

// This is a single line comment
int a = 10;  /* This is a block comment */

After preprocessing, it becomes −

int a = 10;
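
Two details are easy to miss here: a removed comment leaves a single space behind, so the tokens on either side stay separate, and comment markers inside a string literal are not comments at all. The following sketch shows both; the variable and function names are hypothetical −

```c
#include <string.h>

int/* a comment between two tokens */answer = 42;   /* seen as: int answer = 42; */

/* Comment markers inside a string literal are ordinary characters and stay put. */
static const char *text = "/* not a comment */ // neither is this";

/* Returns 1 if both comment markers are still present in the string. */
int markers_survive(void) {
    return strstr(text, "/*") != NULL && strstr(text, "//") != NULL;
}
```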

Step 4: File Inclusion

Now the preprocessor processes the #include directives that were identified earlier. It opens the specified file and copies its entire content into the program.

For example, consider a main.c program and a myheader.h header file −

main.c

#include "myheader.h"
int main() {
    return 0;
}

myheader.h

#define PI 3.14
int calculate();

After the file inclusion step, the text looks like this (the #define line is still present because macros are handled in the next step) −

#define PI 3.14
int calculate();
int main() {
    return 0;
}

The #include line is replaced by the actual content of the header file.
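
Since inclusion is a plain copy, including the same header twice pastes its content twice. Headers usually protect themselves with an include guard. The sketch below simulates, in a single file, what the compiler would see after a double inclusion; the guard name MYHEADER_H, struct point, and point_sum are assumptions made for this illustration and are not part of the myheader.h shown above −

```c
/* ---- first copy of the header, as pasted by the first #include ---- */
#ifndef MYHEADER_H
#define MYHEADER_H
struct point { int x; int y; };
#endif

/* ---- second copy, as pasted by a repeated #include ---- */
#ifndef MYHEADER_H                /* false now: MYHEADER_H is already defined */
#define MYHEADER_H
struct point { int x; int y; };   /* skipped, so no redefinition error */
#endif

int point_sum(struct point p) {
    return p.x + p.y;
}
```

Without the #ifndef lines, the second struct definition would reach the compiler and cause a redefinition error.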

Step 5: Macro Definition and Expansion

Macros are named text replacements defined with #define. They can be object-like or function-like, and they are replaced wherever they appear in the code. Let's see an example of each below −

Object-like Macros

#define MAX 100
#define MIN 1

int array[MAX];
int value = MIN + 5;

After preprocessing, the code becomes −

int array[100];
int value = 1 + 5;

Here, the preprocessor replaces MAX with 100 and MIN with 1.

Function-like Macros

#define SQUARE(x) ((x) * (x))
double result = SQUARE(5);

After preprocessing, it becomes −

double result = ((5) * (5));

The macro looks like a small inline function, but it is not a real function call: the preprocessor replaces the function-like macro call with the actual expression. Here, SQUARE(5) is replaced with ((5) * (5)).
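
One consequence of this substitution is that a macro argument is copied into every place where the parameter appears, so an argument with side effects can run more than once. The sketch below compares SQUARE with an ordinary function; next_value, square_fn, and the two counting functions are hypothetical helpers −

```c
#define SQUARE(x) ((x) * (x))

static int calls = 0;

/* Hypothetical helper with a visible side effect: counts how often it runs. */
static int next_value(void) {
    calls++;
    return 3;
}

/* A real function evaluates its argument exactly once. */
static inline int square_fn(int x) {
    return x * x;
}

int calls_made_by_macro(void) {
    calls = 0;
    (void)SQUARE(next_value());      /* expands to ((next_value()) * (next_value())) */
    return calls;
}

int calls_made_by_function(void) {
    calls = 0;
    (void)square_fn(next_value());   /* argument evaluated once, then passed in */
    return calls;
}
```

With the macro, next_value runs twice; with the function, it runs once.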

Step 6: Conditional Compilation

The preprocessor can also include or skip certain parts of the program depending on the conditions provided using directives like #if, #ifdef, and #ifndef.

For example −

#define DEBUG

#ifdef DEBUG
    printf("Debug mode\n");
#else
    printf("Release mode\n");
#endif

If DEBUG is defined, only the first printf line reaches the compiler and the program prints Debug mode; if it is not defined, only the second line is kept and the program prints Release mode.
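
The same idea can be sketched as complete functions. The example below also shows #if with the defined operator; LOG_LEVEL and both function names are hypothetical and only serve the illustration −

```c
#define DEBUG          /* comment this line out to get the release branch */
#define LOG_LEVEL 2    /* hypothetical setting, tested with #if below */

const char *build_mode(void) {
#ifdef DEBUG
    return "debug";     /* kept, because DEBUG is defined */
#else
    return "release";   /* dropped before the compiler sees it */
#endif
}

int verbose_logging_enabled(void) {
#if defined(DEBUG) && LOG_LEVEL >= 2
    return 1;
#else
    return 0;
#endif
}
```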

Step 7: Line Control and Special Directives

The preprocessor also handles special directives that control the compiler or show custom messages. For example, #line can set custom line numbers in error messages, #error stops compilation and displays a message, and #pragma sends instructions to the compiler. These directives give us more control over compilation and error reporting.

For example −

#error This program requires C99

Here, #error stops compilation and prints the message "This program requires C99". In practice, #error is usually placed inside a conditional block so that it triggers only when the requirement is not met.
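
A minimal sketch of a guarded #error together with #line is given below. It assumes a compiler that defines the standard __STDC_VERSION__ macro for C99 and later; the file name generated.c and the function names are made up −

```c
/* Stop the build only when the compiler does not report C99 or later. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#error "This program requires C99 or later"
#endif

/* The line after this directive is treated as line 100 of "generated.c". */
#line 100 "generated.c"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```

Here reported_line returns 100 and reported_file returns "generated.c", which is also what the compiler would show in any diagnostic for those lines.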

Step 8: Producing Modified Source Code

After processing all directives, expanding macros, including files, handling conditional compilation, and removing comments, the preprocessor generates the modified source code. This code is what the compiler actually works with.

For example, if main.c contains −

#include <stdio.h>
#define MAX 5

int main() {
    int arr[MAX];
    printf("Array size is %d\n", MAX);
    return 0;
}

The preprocessor turns this into preprocessed source, conventionally stored in a .i file. Normally, we don't see this file, but we can generate it using GCC like this −

gcc -E main.c -o main.i

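Here, the -E option stops compilation after preprocessing and writes the result to main.i. The real file begins with the full contents of <stdio.h>, plus marker lines starting with # that record file names and line numbers; these are abbreviated below −

[... contents of <stdio.h> ...]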

int main() {
    int arr[5];
    printf("Array size is %d\n", 5);
    return 0;
}

In this file, the macro MAX is replaced with 5, the content of <stdio.h> is included, and comments are removed.
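
Besides running gcc -E, a macro's expansion can also be checked from inside the program. The sketch below uses the # (stringizing) operator, which this chapter does not otherwise cover; STR, XSTR, and the function names are conventional but hypothetical helpers −

```c
#define MAX 5

/* STR turns its argument into a string literal without expanding it. */
#define STR(x) #x
/* XSTR expands its argument first, then stringizes the result. */
#define XSTR(x) STR(x)

const char *max_name(void)      { return STR(MAX);  }   /* "MAX" */
const char *max_expansion(void) { return XSTR(MAX); }   /* "5"   */
```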

Final Example: Working of Preprocessor

Let's take a complete program that uses different preprocessor directives and see how the preprocessor transforms the code for the next step of the compilation process.

Original Code (main.c) −

#include <stdio.h>
#define PI 3.14
#define AREA(r) (PI * (r) * (r))

#define DEBUG

int main() {
#ifdef DEBUG
    printf("Debugging Mode\n");
#endif
    double result = AREA(5);
    printf("Area = %f\n", result);
    return 0;
}

Preprocessed Code (main.i), with the expanded contents of <stdio.h> abbreviated −

[... contents of <stdio.h> ...]

int main() {
    printf("Debugging Mode\n");
    double result = (3.14 * (5) * (5));
    printf("Area = %f\n", result);
    return 0;
}

In the above program, the preprocessor performed the following tasks before generating the file with the .i extension −

  • It replaced the macro PI with 3.14.
  • It expanded the macro AREA(5) into (3.14 * (5) * (5)).
  • It included the printf("Debugging Mode\n"); line because DEBUG was defined.
  • It processed the #include <stdio.h> directive by inserting the contents of the standard I/O header file.

In this chapter, we learned about the preprocessor, which is the first stage of the build process and prepares the source code for compilation. It takes the source file, processes all directives, expands macros, removes comments, and generates preprocessed source code (the .i file). This output is passed to the compiler, and then on to the assembler and linker, to produce the final executable.
