Process Multiple Input Files Using Awk

Awk is a powerful text processing tool widely used by developers, system administrators, and analysts to manipulate data. It can process text files, extract data, and transform it into different formats. One of its key features is the ability to process multiple input files in a single invocation, which makes it well suited to batch processing tasks.

How Awk Handles Multiple Input Files

When given several input files, Awk reads them one after another, in the order they appear on the command line, and applies the same program to every record. This lets you process files that share a format in one pass instead of running Awk once per file.

Awk provides several built-in variables to track file processing:

FILENAME: the name of the current input file
FNR: the record (line) number within the current file; it resets to 1 at the start of each new file
NR: the total number of records read so far across all files; it never resets
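
The difference between FNR and NR is easiest to see side by side. The following is a minimal sketch, assuming two small sample files created in a temporary directory; the file names, their contents, and the `counters` shell variable are illustrative only.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Hypothetical sample inputs.
printf 'apple\nbanana\norange\n' > file1.txt
printf 'carrot\npotato\n' > file2.txt

# For every record print: file name, per-file number (FNR), overall number (NR), line.
counters=$(awk '{ print FILENAME, FNR, NR, $0 }' file1.txt file2.txt)
printf '%s\n' "$counters"

# Remove the temporary directory again.
cd / && rm -rf "$tmpdir"
```

In the printed output, FNR restarts at 1 on the first line of file2.txt while NR continues at 4.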

Basic File Processing

To read data from multiple input files, pass the filenames as arguments to Awk. Consider these sample files:

file1.txt

apple
banana
orange

file2.txt

carrot
potato

Process both files with this command:

awk '{print FILENAME ": " $0}' file1.txt file2.txt

This prints each line prefixed with its filename, producing:

file1.txt: apple
file1.txt: banana
file1.txt: orange
file2.txt: carrot
file2.txt: potato

Processing Structured Data

For structured data such as CSV files, set the field separator with -F so that each column is available as a numbered field in every file. Consider these files:

sales1.csv

product,quantity,price
apple,10,0.50
banana,15,0.40

sales2.csv

product,quantity,price
orange,8,0.60
grape,12,0.80

Calculate the total revenue from both files:

awk -F',' 'FNR==1 {next} {total += $2 * $3} END {print "Total Revenue: $" total}' sales1.csv sales2.csv

This command skips the header row of each file (the line where FNR is 1) and adds up quantity multiplied by price for every remaining row. For the sample data it prints Total Revenue: $25.4.
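
The same pass can also keep a subtotal per file by indexing an array with FILENAME. The sketch below is one possible variation, not part of the original command: the `subtotal` array, the `revenue` variable, the `report` shell variable, and the two-decimal formatting are all assumptions made for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Recreate the sample CSV files from the text above.
printf 'product,quantity,price\napple,10,0.50\nbanana,15,0.40\n' > sales1.csv
printf 'product,quantity,price\norange,8,0.60\ngrape,12,0.80\n' > sales2.csv

report=$(awk -F',' '
    FNR == 1 { next }                    # skip the header row of every file
    {
        revenue = $2 * $3                # quantity * price for this row
        subtotal[FILENAME] += revenue    # running total for the current file
        total += revenue                 # running total across all files
    }
    END {
        # File names are hard-coded here to keep the output order fixed.
        printf "sales1.csv: %.2f\n", subtotal["sales1.csv"]
        printf "sales2.csv: %.2f\n", subtotal["sales2.csv"]
        printf "Total Revenue: $%.2f\n", total
    }
' sales1.csv sales2.csv)
printf '%s\n' "$report"

# Remove the temporary directory again.
cd / && rm -rf "$tmpdir"
```

Using printf with %.2f gives a fixed number of decimal places, which plain print does not guarantee.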

File-Specific Processing

You can perform different operations depending on which file is being processed:

awk '{
    if (FILENAME == "file1.txt") 
        print "Fruit: " $0
    else if (FILENAME == "file2.txt") 
        print "Vegetable: " $0
}' file1.txt file2.txt

Advanced Examples

Merging CSV Files with Headers

To merge multiple CSV files while keeping only one header:

awk 'FNR==1 && NR!=1 {next} {print}' file1.csv file2.csv > merged.csv

This skips the header row from the second file onwards, ensuring only one header appears in the merged output.
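
As a quick check of that pattern, the sketch below applies it to the two sales files shown earlier. The temporary directory and the `merged` shell variable are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Recreate the sample CSV files from the text above.
printf 'product,quantity,price\napple,10,0.50\nbanana,15,0.40\n' > sales1.csv
printf 'product,quantity,price\norange,8,0.60\ngrape,12,0.80\n' > sales2.csv

# Keep the very first line (NR == 1) and drop the first line of every later file.
awk 'FNR == 1 && NR != 1 { next } { print }' sales1.csv sales2.csv > merged.csv

merged=$(cat merged.csv)
printf '%s\n' "$merged"

# Remove the temporary directory again.
cd / && rm -rf "$tmpdir"
```

The merged file contains one header followed by the four data rows. If the inputs are selected with a glob such as *.csv, it is safer to write the output to a name or directory the glob does not match, so that a later run does not pick up the merged file as an input.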

Calculating Statistics Across Files

Process log files to count errors per file:

awk '/ERROR/ {errors[FILENAME]++} END {
    for (file in errors) 
        print file ": " errors[file] " errors"
}' log1.txt log2.txt log3.txt
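
Two details of the command above are worth knowing: a file with no matching lines never gets an entry in the array, so it is missing from the report, and the order in which for (file in errors) visits entries is unspecified. The sketch below is one way to address both, assuming three small hypothetical log files; the `order` array, the counter `n`, and the `summary` shell variable are names introduced for this example.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Hypothetical log files; log2.txt contains no ERROR lines.
printf 'INFO start\nERROR disk full\nERROR timeout\n' > log1.txt
printf 'INFO start\nINFO done\n' > log2.txt
printf 'ERROR crash\n' > log3.txt

summary=$(awk '
    # Register each file on its first line and start its count at zero.
    # A completely empty file has no first line, so it is still not listed.
    FNR == 1 { order[++n] = FILENAME; errors[FILENAME] = 0 }
    /ERROR/  { errors[FILENAME]++ }
    END {
        # Report in the order the files were given on the command line.
        for (i = 1; i <= n; i++)
            print order[i] ": " errors[order[i]] " errors"
    }
' log1.txt log2.txt log3.txt)
printf '%s\n' "$summary"

# Remove the temporary directory again.
cd / && rm -rf "$tmpdir"
```

Keeping a separate numerically indexed array is a portable way to get stable output order without relying on implementation-specific sorting features.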

Combining Data with File Tracking

Append the source filename to each record so the combined output shows which file every line came from:

awk '{print $0 "," FILENAME}' data1.txt data2.txt > combined_with_source.csv

Best Practices

Technique	Use Case	Example
Use FILENAME variable	File-specific processing	`if (FILENAME == "config.txt")`
Check FNR vs NR	Handle headers in multiple files	`FNR==1 && NR!=1 {next}`
Use associative arrays	Track data by filename	`data[FILENAME]++`
END block processing	Generate final reports	`END {print summary}`

Conclusion

Awk's ability to process multiple input files in one invocation makes it a good fit for batch processing and data analysis tasks. With the built-in variables FILENAME, FNR, and NR, a single program can tell files apart, skip per-file headers, and aggregate results across all inputs.

Satish Kumar

Updated on: 2026-03-17T09:01:38+05:30
