AWK is a domain-specific programming language for text processing that can be used as a Linux command-line tool or within a shell script. It works much like the sed and grep commands, extracting data from its input using regular expressions.
It can read from a file or from another command through a pipe. An AWK program is normally enclosed in single quotes, with the action written inside curly brackets “{action}” and an optional regular expression written between forward slashes in front of it. By default, AWK splits each line into columns (fields) on whitespace.
Tutorial Details
Description | AWK (Aho, Weinberger, and Kernighan) |
Difficulty Level | High |
Root or Sudo Privileges | Maybe |
OS Compatibility | Ubuntu, Manjaro, Fedora, etc. |
Prerequisites | awk |
Internet Required | No |
Displaying the Single Column
The ps command output consists of four columns (PID, TTY, TIME, and CMD) separated by whitespace. AWK numbers these columns starting from 1, so you can refer to each one by its number.
To print the first column using AWK, specify the column number after the “$” dollar symbol inside the curly brackets, as shown.
$ ps | awk '{print $1}'
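If you want to try this without depending on whatever is running on your machine, here is a minimal sketch that feeds AWK a fixed sample instead. The ps_sample variable and the process IDs in it are made up for illustration; only the four-column layout mirrors the ps output.

```shell
# Hypothetical stand-in for `ps` output; the PIDs are invented.
ps_sample='PID TTY TIME CMD
4021 pts/0 00:00:00 bash
4388 pts/0 00:00:00 ps'

# $1 is the first whitespace-separated field of every line,
# so this prints PID, 4021, and 4388 on separate lines.
printf '%s\n' "$ps_sample" | awk '{print $1}'
```

Note that the header line is treated like any other line, which is why “PID” shows up in the result.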
You can output the fourth column by replacing “$1” with “$4”.
$ ps | awk '{print $4}'
To print the first and fourth columns together, list both and put a space enclosed in double quotation marks between them.
$ ps | awk '{print $1" "$4}'
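A small variation you may find handy: if you separate the fields with a comma inside print, AWK joins them with its output field separator (the OFS variable), which is a single space by default. The sketch below again uses a made-up ps_sample variable rather than real ps output.

```shell
# Hypothetical stand-in for `ps` output; the PIDs are invented.
ps_sample='PID TTY TIME CMD
4021 pts/0 00:00:00 bash
4388 pts/0 00:00:00 ps'

# A comma between fields inserts OFS (a space unless you change it).
printf '%s\n' "$ps_sample" | awk '{print $1, $4}'

# Setting OFS with -v changes the joiner, here to a comma.
printf '%s\n' "$ps_sample" | awk -v OFS=',' '{print $1, $4}'
```

The first command gives the same result as writing the space by hand; the second produces comma-separated pairs such as “4021,bash”.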
If you replace “$1” and “$4” with “$0”, it will print the complete command output, because “$0” holds the entire line.
Displaying the Columns Using a Custom Separator
If you want to read a file like “/etc/passwd”, which separates the fields of every entry with a “:” colon, the whitespace-based method above will not work.
To print the first column of “/etc/passwd”, pass the “:” colon to the “-F” flag to set it as the field separator, and then use the “$” dollar symbol to print the first column, as shown.
$ awk -F ":" '{print $1}' /etc/passwd
To print the sixth column (the home directory), replace “$1” with “$6”.
$ awk -F ":" '{print $6}' /etc/passwd
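To see what the “-F” flag does without touching the real file, you can run the same idea on a couple of passwd-style lines. This is only a sketch: the passwd_sample variable and the “alice” account are invented for the example.

```shell
# Hypothetical passwd-style entries (seven colon-separated fields each).
passwd_sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# -F ':' makes the colon the field separator,
# so $1 is the user name and $6 is the home directory.
printf '%s\n' "$passwd_sample" | awk -F ':' '{print $1, $6}'
```

With this sample, the command prints “root /root” and “alice /home/alice”.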
Limiting the Entries Using the “NR” Variable
In the previous command, you printed a complete column from the “/etc/passwd” file. However, you can limit the output to a range of line numbers using the “NR” variable, which holds the number of the line currently being read.
$ awk -F ":" 'NR==1, NR==5 {print $6}' /etc/passwd
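The “start, end” form used here is a range pattern: it matches from the line where the first condition is true through the line where the second one is true. Here is a small sketch on five made-up input lines, together with an equivalent comparison that some people find easier to read.

```shell
# Range pattern: lines 2 through 4, prefixed with their line number.
printf '%s\n' one two three four five | awk 'NR==2, NR==4 {print NR ": " $0}'

# The same selection written as a plain comparison on NR.
printf '%s\n' one two three four five | awk 'NR>=2 && NR<=4 {print NR ": " $0}'
```

Both commands print “2: two”, “3: three”, and “4: four”.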
The above command prints the sixth column of only the first through fifth lines.
In a file like “/etc/shells”, the first line is a comment that gets in the way when printing columns; you can easily skip it using the “NR” variable, as shown.
Reading the “/etc/shells” file without removing the first line:
$ cat /etc/shells
Reading “/etc/shells”, skipping the first line using the “NR” variable, and printing the rest of the output:
$ awk 'NR==2, NR==0' /etc/shells
The range pattern “NR==2, NR==0” starts at the second line. Because “NR” is never 0, the end condition never matches, so AWK keeps printing through the last line of the file.
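If the never-matching end condition feels like a trick, a plain comparison does the same job. The sketch below assumes a made-up shells_sample variable with a comment on its first line, and runs both forms so you can compare them.

```shell
# Hypothetical contents of a shells-style file.
shells_sample='# /etc/shells: valid login shells
/bin/sh
/bin/bash'

# A pattern with no action prints every line it matches: here, line 2 onward.
printf '%s\n' "$shells_sample" | awk 'NR>1'

# The range form from the tutorial selects the same lines.
printf '%s\n' "$shells_sample" | awk 'NR==2, NR==0'
```

Both commands print “/bin/sh” and “/bin/bash” and leave out the comment.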
Using Regular Expressions
Earlier, I explained how to skip the first row or any row using the “NR” variable with the example of the “/etc/shells” file.
However, if you look at the same “shells” file, you will find that the lines we are interested in each start with a “/” forward slash. You can use a regular expression to print only the lines that begin with a “/”, as shown.
$ awk -F "/" '/^\// {print $NF}' /etc/shells
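The “-F "/"” option splits each line on the slash, “NF” holds the number of columns in the current line, and “$NF” therefore refers to the last column of each matching line, which here is the name of the shell.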
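To make the relationship between “NF” and “$NF” concrete, here is a minimal sketch that prints both for each matching line. The shells_sample variable is a made-up stand-in for the real file.

```shell
# Hypothetical contents of a shells-style file.
shells_sample='# /etc/shells: valid login shells
/bin/sh
/usr/bin/bash'

# /^\// keeps only lines that start with a slash.
# With "/" as the separator, the text before the first slash is an
# empty first field, so "/bin/sh" has 3 fields and "/usr/bin/bash" has 4.
printf '%s\n' "$shells_sample" | awk -F '/' '/^\// {print NF, $NF}'
```

This prints “3 sh” and “4 bash”; the comment line is dropped because it does not start with a slash.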
Displaying the Entries Based on Line Length
In the previous command, you used the “$NF” variable to print the last column; if you remove it or replace it with “$0”, it will print each complete matching line.
$ awk -F "/" '/^\// {print $0}' /etc/shells
In the above output, some lines are longer than others.
You can use this length to print only the lines with a certain number of characters using the “>”, “<”, or “==” comparison operators (a single “=” is an assignment in AWK, not a comparison).
$ awk -F "/" '/^\// {if(length($0)>10) print}' /etc/shells
The above command will only display the entries that are longer than ten characters.
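The same filter can be written without the if statement by using the comparison itself as the pattern. The sketch below assumes a made-up list of shell paths so that the line lengths are known in advance.

```shell
# Hypothetical shell paths with lengths 7, 9, 13, and 12 characters.
shells_sample='/bin/sh
/bin/bash
/usr/bin/bash
/usr/bin/zsh'

# Lines longer than ten characters.
printf '%s\n' "$shells_sample" | awk 'length($0) > 10'

# Lines that are exactly seven characters long (note the double ==).
printf '%s\n' "$shells_sample" | awk 'length($0) == 7'
```

The first command prints the two “/usr/bin” entries, and the second prints only “/bin/sh”.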
Displaying Entries Beginning with a Specific Keyword
If you execute the “ps -ef” command, you will get a detailed list of running processes.
$ ps -ef
From the above list, if you want to output only the lines whose first column is exactly “systemd+”, you can do that with the help of an if statement, as shown.
$ ps -ef | awk '{if($1=="systemd+") print}'
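As a sketch of two related forms: the comparison can stand on its own as a pattern, and the “~” operator matches a column against a regular expression when you only know how the value begins. The ps_ef_sample variable and the processes in it are invented for the example.

```shell
# Hypothetical stand-in for `ps -ef` output; the values are invented.
ps_ef_sample='UID PID PPID C STIME TTY TIME CMD
root 1 0 0 09:00 ? 00:00:02 /sbin/init
systemd+ 612 1 0 09:00 ? 00:00:00 /lib/systemd/systemd-resolved'

# Exact match on the first column, written as a pattern instead of an if.
printf '%s\n' "$ps_ef_sample" | awk '$1 == "systemd+"'

# Regular-expression match: first column begins with "systemd".
printf '%s\n' "$ps_ef_sample" | awk '$1 ~ /^systemd/ {print $2, $8}'
```

The first command prints the whole “systemd+” line, and the second prints only its PID and command.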
Displaying the Line with Matching Text
If only some lines of a large text file contain a particular string, you can use that string as a pattern to display just those lines, as shown.
$ awk '/line/ {print}' file.txt
The above command will only display the lines that contain the “line” string.
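Since “file.txt” is just a placeholder, here is a minimal sketch with three made-up lines that you can run as is. It also shows that the “{print}” action is optional, and that a leading “!” inverts the match.

```shell
# Hypothetical file contents.
text_sample='first line
second row
third line'

# Printing is the default action, so the braces can be left out.
printf '%s\n' "$text_sample" | awk '/line/'

# Inverted match: only the lines that do NOT contain "line".
printf '%s\n' "$text_sample" | awk '!/line/'
```

The first command prints “first line” and “third line”; the second prints “second row”.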
Showing Entries by Line Number
Earlier, I showed you how to use the “NR” variable to skip the first line of the “/etc/shells” file. You can use the same variable to print the line number in front of each line, as shown.
$ awk '{print NR,$0}' file.txt
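One detail worth knowing is that “NR” counts input lines, not printed lines, so the numbers stay accurate even when you filter. A quick sketch on three made-up words:

```shell
# Number every line of the input.
printf '%s\n' alpha beta gamma | awk '{print NR, $0}'

# Filter first: the surviving line keeps its original line number.
printf '%s\n' alpha beta gamma | awk '/^b/ {print NR ": " $0}'
```

The first command prints “1 alpha”, “2 beta”, and “3 gamma”; the second prints “2: beta”.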
For Loop in AWK
As in other programming languages, you can write a for loop in AWK by specifying a start value, an end condition, and an increment, as shown.
$ awk 'BEGIN { for (i = 1; i <= 5; i++) print i }'
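A common practical use of the for loop is walking over the columns of a line, with “NF” as the upper bound. This is only a sketch on a made-up input line:

```shell
# Loop from the first field to the last and print each with its position.
printf '%s\n' 'red green blue' | awk '{ for (i = 1; i <= NF; i++) print i, $i }'
```

Here “$i” means “the column whose number is stored in i”, so the command prints “1 red”, “2 green”, and “3 blue”.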
While Loop in AWK
You can also write a while loop in AWK by initializing a variable and repeating the loop body for as long as a condition holds.
$ awk 'BEGIN {i = 1; while (i <= 5) { print i; i++ } }'
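As a slightly more useful sketch of the same loop, the version below accumulates a running total instead of printing each value. The variable name sum is just an illustrative choice.

```shell
# Add up the integers 1 through 5 with a while loop, then print the total.
awk 'BEGIN { i = 1; sum = 0; while (i <= 5) { sum += i; i++ } print sum }'
```

Because everything happens in the BEGIN block, AWK does not wait for any input, and the command prints 15.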
There is a lot more you can do with AWK once you are comfortable with regular expressions.
Although I tried to explain it with the best examples, if you still have any doubts, do let us know in the comment section.