Process Multiple Input Files Using Awk

Awk is a powerful text processing tool widely used by developers, system administrators, and analysts to manipulate data. It can process text files, extract data, and transform it into different formats. One of its key features is the ability to process multiple input files in a single invocation, making it ideal for batch processing tasks.

How Awk Handles Multiple Input Files

When given multiple input files, Awk reads them one after another, in the order they appear on the command line, and runs the same program on every line. This lets you process files with the same type of data in one pass, rather than invoking Awk once per file.

Awk provides several built-in variables to track file processing:

  • FILENAME: the name of the current input file

  • FNR: the line number within the current file (resets for each new file)

  • NR: the total number of lines read so far across all files
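
To make the difference between these counters concrete, the following sketch prints all three next to each input line. The file names and contents are sample data chosen for illustration, and the scratch directory created with mktemp is only there to keep the example self-contained.

```shell
#!/bin/sh
# Work in a scratch directory so the sample files do not clutter anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Illustrative sample data: a 3-line file and a 2-line file.
printf 'apple\nbanana\norange\n' > file1.txt
printf 'carrot\npotato\n' > file2.txt

# Print the three tracking variables next to each line.
counters=$(awk '{ print FILENAME, "FNR=" FNR, "NR=" NR, $0 }' file1.txt file2.txt)
printf '%s\n' "$counters"
```

In the output, FNR restarts at 1 on the first line of file2.txt while NR keeps counting up to 5.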

Basic File Processing

To read data from multiple input files, specify the filenames as arguments to Awk. Consider these sample files:

file1.txt

apple
banana
orange

file2.txt

carrot
potato

Process both files with this command:

awk '{print FILENAME ": " $0}' file1.txt file2.txt

This prints each line prefixed with its filename, producing:

file1.txt: apple
file1.txt: banana
file1.txt: orange
file2.txt: carrot
file2.txt: potato

Processing Structured Data

For structured data such as CSV files, set the field separator with -F so that the same fields are parsed in every file. Consider these files:

sales1.csv

product,quantity,price
apple,10,0.50
banana,15,0.40

sales2.csv

product,quantity,price
orange,8,0.60
grape,12,0.80

Calculate total revenue from both files:

awk -F',' 'FNR==1 {next} {total += $2 * $3} END {printf "Total Revenue: $%.2f\n", total}' sales1.csv sales2.csv

This command skips the header row of each file (FNR==1 is true on the first line of every file, so a separate NR==1 test is unnecessary) and sums quantity times price over the remaining rows. For the sample data it prints Total Revenue: $25.40.
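
As a variation on the revenue calculation above, the sketch below also keeps a subtotal per file by indexing an array with FILENAME. The array name subtotal and the scratch directory are illustrative choices, not part of the original command.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the two sample CSV files from the article.
printf 'product,quantity,price\napple,10,0.50\nbanana,15,0.40\n' > sales1.csv
printf 'product,quantity,price\norange,8,0.60\ngrape,12,0.80\n' > sales2.csv

report=$(awk -F',' '
    FNR == 1 { next }                    # skip the header row of every file
    {
        subtotal[FILENAME] += $2 * $3    # revenue per file
        total += $2 * $3                 # revenue across all files
    }
    END {
        # File names are spelled out here to keep the output order fixed.
        printf "sales1.csv: %.2f\n", subtotal["sales1.csv"]
        printf "sales2.csv: %.2f\n", subtotal["sales2.csv"]
        printf "Total Revenue: %.2f\n", total
    }' sales1.csv sales2.csv)
printf '%s\n' "$report"
```

Using printf with %.2f keeps the amounts at two decimal places regardless of how the floating-point sum happens to be represented.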

File-Specific Processing

You can perform different operations based on which file is being processed:

awk '{
    if (FILENAME == "file1.txt") 
        print "Fruit: " $0
    else if (FILENAME == "file2.txt") 
        print "Vegetable: " $0
}' file1.txt file2.txt
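
The same logic can also be written with Awk's pattern-action rules instead of an if/else chain. This is a sketch of an equivalent form, using the sample files from earlier recreated in a scratch directory.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf 'apple\nbanana\norange\n' > file1.txt
printf 'carrot\npotato\n' > file2.txt

# Each rule runs only for lines read from the named file.
labeled=$(awk '
    FILENAME == "file1.txt" { print "Fruit: " $0 }
    FILENAME == "file2.txt" { print "Vegetable: " $0 }
' file1.txt file2.txt)
printf '%s\n' "$labeled"
```

Note that FILENAME holds the name exactly as it was passed on the command line, so an argument written as ./file1.txt would not match the string "file1.txt".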

Advanced Examples

Merging CSV Files with Headers

To merge multiple CSV files while keeping only one header:

awk 'FNR==1 && NR!=1 {next} {print}' file1.csv file2.csv > merged.csv

This skips the header row from the second file onwards, ensuring only one header appears in the merged output.
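
A quick way to check the merge is to run it on two small files and count the header lines in the result. The CSV contents below are illustrative sample data.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Two small files that share the same header row.
printf 'product,quantity,price\napple,10,0.50\n' > file1.csv
printf 'product,quantity,price\norange,8,0.60\n' > file2.csv

# Same merge command as above: drop line 1 of every file except the first.
awk 'FNR==1 && NR!=1 {next} {print}' file1.csv file2.csv > merged.csv

cat merged.csv
# Count how many header rows survived the merge.
header_count=$(grep -c '^product,' merged.csv)
echo "header rows: $header_count"
```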

Calculating Statistics Across Files

Process log files to count errors per file:

awk '/ERROR/ {errors[FILENAME]++} END {
    for (file in errors) 
        print file ": " errors[file] " errors"
}' log1.txt log2.txt log3.txt
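
The command above only reports files that contain at least one matching line, and the order of a for (file in errors) loop is not specified. The sketch below is one possible way to also list files with zero errors and to get a stable order by piping through sort. The log contents are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Made-up log files: two, zero, and one ERROR lines respectively.
printf 'INFO start\nERROR disk full\nERROR timeout\n' > log1.txt
printf 'INFO all good\n' > log2.txt
printf 'ERROR crash\n' > log3.txt

counts=$(awk '
    FNR == 1 { errors[FILENAME] += 0 }   # register each non-empty file with a count of 0
    /ERROR/  { errors[FILENAME]++ }      # count matching lines per file
    END {
        for (file in errors)
            print file ": " errors[file] " errors"
    }' log1.txt log2.txt log3.txt | sort)
printf '%s\n' "$counts"
```

A completely empty file never triggers the FNR == 1 rule, so it would still be missing from this report.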

Combining Data with File Tracking

Create a summary that tracks which file each record came from:

awk '{print $0 "," FILENAME}' data1.txt data2.txt > combined_with_source.csv
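
If the position within the source file matters as well, FNR can be appended alongside FILENAME. The following is a small sketch with invented comma-separated input. The extra FNR column is an addition for illustration, not part of the command above.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Invented input records.
printf '1,alpha\n2,beta\n' > data1.txt
printf '3,gamma\n' > data2.txt

# Append the source file name and the line number within that file.
awk '{ print $0 "," FILENAME "," FNR }' data1.txt data2.txt > combined_with_source.csv
cat combined_with_source.csv
```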

Best Practices

Technique               | Use Case                          | Example
Use FILENAME variable   | File-specific processing          | if (FILENAME == "config.txt")
Check FNR vs NR         | Handle headers in multiple files  | FNR==1 && NR!=1 {next}
Use associative arrays  | Track data by filename            | data[FILENAME]++
END block processing    | Generate final reports            | END {print summary}

Conclusion

Awk's ability to process multiple input files makes it an excellent tool for batch processing and data analysis tasks. By leveraging built-in variables like FILENAME, FNR, and NR, you can create sophisticated data processing workflows that handle multiple files efficiently while maintaining full control over the processing logic.

Updated on: 2026-03-17T09:01:38+05:30
