How to Write Scripts Using Awk Programming Language?
Awk is a text-processing language named after its three original authors: Alfred Aho, Peter Weinberger, and Brian Kernighan. It is designed for pattern scanning and processing, and it is a staple of Unix scripting for tasks such as data extraction, reporting, and data transformation.
Awk scripts are quick to write and perform well for small to medium-sized tasks. This article introduces you to the basics of writing scripts using the Awk programming language.
Basic Syntax
An Awk program consists of a sequence of pattern-action pairs, written as
pattern { action }
Here, the pattern is a condition. If the input line matches the pattern, then the action is performed.
For example
awk '/search_pattern/ { print $0 }' file_name
In this example, awk scans file_name for lines that match search_pattern and prints each matching line in full ($0).
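Both halves of a pattern-action pair are optional, and a program may contain several pairs. The sketch below illustrates this on a hypothetical sample file, fruits.txt, which is created only for this example and is not part of the article's data.

```shell
# Hypothetical sample data, created only for this illustration
printf 'apple 3\nbanana 7\ncherry 12\n' > fruits.txt

# Pattern only: the default action prints the matching line
awk '/an/' fruits.txt

# Action only: with no pattern, the action runs for every line
awk '{ print $1 }' fruits.txt

# Several pairs: each pair is tested against every input line
awk '$2 > 5 { print $1, "is plentiful" } /^a/ { print $1, "starts with a" }' fruits.txt
```

The first command prints only the banana line, the second prints the first field of every line, and the third prints one message per line depending on which pattern matched.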
Built-in Variables
Awk provides built-in variables and field references that you can use to read input and format output. Some of the most common are
$0 The entire current record (by default, one line).
$1, $2, ... The individual fields of the record (split on whitespace by default).
FS Input field separator (defaults to a single space, which splits on runs of spaces and tabs).
OFS Output field separator (defaults to a space).
NR Number of records read so far.
NF Number of fields in the current record.
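As a minimal sketch of how these variables work together, the example below uses a hypothetical colon-separated file, people.txt, created only for this illustration. The -F option sets FS from the command line, and OFS is assigned in a BEGIN block.

```shell
# Hypothetical colon-separated sample data, created only for this illustration
printf 'john:doe:18\njane:smith:19\n' > people.txt

# -F: sets FS to a colon; OFS is inserted wherever print sees a comma
# $NF refers to the last field of the current record
awk -F: 'BEGIN { OFS = " | " } { print NR, NF, $1, $NF }' people.txt
```

Each output line shows the record number, the field count, the first field, and the last field, joined by the chosen output separator.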
Example Processing Student Data
Assume we have a text file named students.txt with the following content
John Doe 18
Jane Smith 19
We can use awk to print the names and ages separately
awk '{ print "Name: " $1 " " $2 ", Age: " $3 }' students.txt
Name: John Doe, Age: 18
Name: Jane Smith, Age: 19
Control Flow
Awk supports common control flow mechanisms like if, else, while, and for. Here's an example using if and else
awk '{ if ($3 > 18) print $1 " is an adult"; else print $1 " is a minor"}' students.txt
John is a minor
Jane is an adult
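The for and while loops mentioned above follow C-like syntax. The sketch below is one possible illustration, using a hypothetical file scores.txt (a name followed by a variable number of scores) that is not part of the article's data.

```shell
# Hypothetical sample data: a name followed by a variable number of scores
printf 'ana 90 80\nben 70 60 50\n' > scores.txt

# for loop: walk fields 2..NF and total them for each line
awk '{ total = 0; for (i = 2; i <= NF; i++) total += $i; print $1, total }' scores.txt

# while loop: print the fields of each line in reverse order
awk '{ i = NF; while (i > 0) { printf "%s%s", $i, (i > 1 ? " " : "\n"); i-- } }' scores.txt
```

Because NF is recomputed for every record, both loops adapt to lines with different numbers of fields.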
Functions
Awk has built-in functions for string manipulation, arithmetic operations, and input/output. You can also define your own functions.
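To give a feel for the built-in string functions, here is a short sketch that applies a few standard ones (length, toupper, substr, index, and sprintf) to a single input line. The input text is made up for the illustration.

```shell
# Apply several standard awk built-ins to one line of made-up input
echo 'hello world' | awk '{
    print length($0)               # number of characters in the line
    print toupper($1)              # first field in upper case
    print substr($2, 1, 3)         # first three characters of the second field
    print index($0, "world")       # 1-based position of "world" in the line
    print sprintf("%.2f", 10 / 4)  # number formatted as a string
}'
```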
User-Defined Function Example
Here's an example of a user-defined function that converts temperatures from Fahrenheit to Celsius
function toCelsius(fahrenheit) {
    return (fahrenheit - 32) * 5/9
}
BEGIN { print "Fahrenheit Celsius" }
{ print $1, toCelsius($1) }
Suppose we have an input file temperatures.txt with one Fahrenheit temperature per line
32
212
Running the script on this file with awk -f prints
Fahrenheit Celsius
32 0
212 100
Regular Expressions
Awk supports regular expression syntax for pattern matching. Here's an example where we search for lines in our students.txt that start with the letter 'J'
awk '/^J/ { print $0 }' students.txt
The caret (^) symbol represents the start of a line. This script will output
John Doe 18
Jane Smith 19
Arrays
Awk supports associative arrays, whose keys can be arbitrary strings or numbers, for more complex data manipulation. Let's count the occurrence of ages in our students.txt file
awk '{ count[$3]++ } END { for (age in count) print age " appears " count[age] " times." }' students.txt
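18 appears 1 times.
19 appears 1 times.
In this script, count[$3]++ uses the age (third field) as the array key and increments its value each time it appears. The order in which for (age in count) visits the keys is unspecified, so the lines may appear in a different order.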
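The same keyed-array idea extends to sums and averages per key. The sketch below assumes a hypothetical file marks.txt (a course name and a score per line) created only for this illustration. It pipes the result through sort to get a stable order, and it also shows the in operator for testing whether a key exists.

```shell
# Hypothetical sample data: a course name and a score per line
printf 'math 80\nart 70\nmath 90\n' > marks.txt

# Accumulate a sum and a count per course, then print the averages
# sort gives a stable order, since for-in order is unspecified
awk '{ sum[$1] += $2; n[$1]++ }
     END { for (c in sum) print c, sum[c] / n[c] }' marks.txt | sort

# The in operator tests for a key without creating it
awk '{ seen[$1] = 1 }
     END {
         if ("art" in seen) print "art found"
         if (!("music" in seen)) print "music missing"
     }' marks.txt
```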
Advanced Pattern Matching
You can use the ~ operator to match a field against a regular expression. Consider a students.txt file with an additional field for the course
John Doe 18 ComputerScience
Jane Smith 19 Mathematics
To find students studying Computer Science
awk '$4 ~ /ComputerScience/ { print $1 " " $2 " is studying Computer Science." }' students.txt
John Doe is studying Computer Science.
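Field matches can also be negated with !~ and combined with other conditions using && and ||. The following sketch reuses the four-field layout above in a hypothetical file named courses.txt, so that it does not overwrite students.txt.

```shell
# Hypothetical file name; the data mirrors the four-field layout above
printf 'John Doe 18 ComputerScience\nJane Smith 19 Mathematics\n' > courses.txt

# !~ selects records whose field does NOT match the expression
awk '$4 !~ /^Computer/ { print $1, $2 }' courses.txt

# A regular expression match combined with a numeric comparison
awk '$4 ~ /Science$/ && $3 >= 18 { print $1, "is", $3 }' courses.txt
```

The first command prints Jane Smith, and the second prints a line for John only, since his course ends in "Science" and his age passes the comparison.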
Writing Awk Scripts
For complex operations, it's convenient to write scripts in separate files. Create a file, conventionally with the .awk extension, and start it with a shebang line that points to the awk binary on your system
#!/usr/bin/awk -f
Example Script Average Age Calculator
Create an Awk script called students.awk that calculates the average age
#!/usr/bin/awk -f
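BEGIN {
    sum = 0
    count = 0
}
{
    # Add the age (third field) of every record to the running total
    sum += $3
    count++
}
END {
    # Guard against division by zero when the input is empty
    if (count > 0) print "Average age: " sum/count
}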
Make it executable and run
chmod +x students.awk
./students.awk students.txt
Average age: 18.5
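If awk lives at a different path on your system, or you prefer not to mark the file executable, the script can also be passed to awk with the -f option. The sketch below recreates the sample data and uses a hypothetical script name, average.awk, with a condensed version of the same logic.

```shell
# Recreate the sample data used throughout the article
printf 'John Doe 18\nJane Smith 19\n' > students.txt

# Hypothetical script file with a condensed version of the average calculation
cat > average.awk <<'EOF'
{ sum += $3; count++ }
END { if (count > 0) print "Average age: " sum / count }
EOF

# -f reads the program from a file, so no shebang or chmod is needed
awk -f average.awk students.txt
```

Uninitialized awk variables start out as zero or the empty string, which is why this condensed version can omit the BEGIN block.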
Combining Awk with Unix Commands
You can combine Awk scripts with other Unix commands using pipes (|)
cat students.txt | awk '{ print $1 }' | sort | uniq
This command prints first names, sorts them, and removes duplicates
Jane
John
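Two variations on this pipeline are worth knowing. Awk can read the file directly, so cat is optional, and duplicates can be removed either with sort -u or inside awk itself. The sketch below uses a hypothetical file names.txt that contains a repeated first name, so the effect is visible.

```shell
# Hypothetical sample data with a repeated first name
printf 'John Doe 18\nJane Smith 19\nJohn Roe 20\n' > names.txt

# awk reads the file itself; sort -u sorts and removes duplicates in one step
awk '{ print $1 }' names.txt | sort -u

# De-duplicate inside awk, keeping names in first-seen order
# seen[$1]++ is 0 (false) the first time a name appears, so !seen[$1]++ is true
awk '!seen[$1]++ { print $1 }' names.txt
```

The first command prints Jane then John (sorted), while the second prints John then Jane (input order).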
Conclusion
Awk is a powerful tool for text processing on Unix-based systems. Its strength lies in its simplicity and straightforward syntax for pattern matching and data manipulation. Whether you're processing log files, extracting data, or performing calculations, Awk provides an efficient solution for text processing tasks.
