Beginner's Guide to the AWK Command on Linux

Linux TLDR

AWK is a domain-specific programming language that can be used as a Linux command-line tool or within a shell script. It works much like the sed and grep commands, extracting data from an input file using regular expressions.

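It can be piped with other commands. The AWK program is written between single quotes, with the action inside curly brackets, as in '{action}'; a regular expression, when used, goes between forward slashes in front of the action, as in '/regular_expression/ {action}'. By default, AWK splits each line into columns on whitespace.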

Tutorial Details

Description: AWK (Aho, Weinberger, and Kernighan)
Difficulty Level: High
Root or Sudo Privileges: Maybe
OS Compatibility: Ubuntu, Manjaro, Fedora, etc.
Prerequisites: awk
Internet Required: No

Displaying the Single Column

The ps command output consists of four columns separated by spaces. AWK numbers these columns starting from 1, so you can refer to each column by its position.

To print the first column using AWK, specify the column number after the "$" dollar symbol inside the curly brackets, as shown.

$ ps | awk '{print $1}'

Output:

Displaying the first column using the awk command

You can output the fourth column by replacing "$1" with "$4".

$ ps | awk '{print $4}'

Output:

Displaying the fourth column using the awk command

To print the first and fourth columns together, specify both and place a space inside double quotation marks between them.

$ ps | awk '{print $1" "$4}'

Output:

Displaying the first and fourth columns together using the awk command

If you replace "$1" and "$4" with "$0", it will print the complete command output.
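As a small sketch of the same idea, the snippet below uses a made-up sample in place of real ps output (the process IDs and commands are hypothetical). It also assumes the comma form of print, which joins fields with a single space by default and is a common alternative to writing " " by hand.

```shell
# Hypothetical sample standing in for the output of ps
ps_sample='PID TTY TIME CMD
101 pts/0 00:00:00 bash
202 pts/0 00:00:00 ps'

# A comma between fields makes print join them with a single space
first_and_fourth=$(printf '%s\n' "$ps_sample" | awk '{print $1, $4}')
printf '%s\n' "$first_and_fourth"
```

Each printed line holds the first and fourth columns of the sample, separated by one space.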

Displaying the Columns Using a Custom Separator

If you want to read files like "/etc/passwd", which separates the fields of every entry with a ":" colon, the above method of splitting columns on spaces will not work.

To print the first column of "/etc/passwd", pass the ":" colon separator to the "-F" flag and then use the "$" dollar symbol to print the first column, as shown.

$ awk -F ":" '{print $1}' /etc/passwd

Output:

Specifying the custom separator while displaying the column

To print the sixth column, replace "$1" with "$6".

$ awk -F ":" '{print $6}' /etc/passwd

Output:

Displaying the sixth column using a custom separator
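To try the custom separator without depending on the contents of a real system file, here is a minimal sketch that uses two invented entries written in the "/etc/passwd" layout (the user "alice" and both lines are hypothetical).

```shell
# Hypothetical entries in the /etc/passwd layout (seven ":"-separated fields)
passwd_sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# Print the user name (field 1) next to the home directory (field 6)
users_and_homes=$(printf '%s\n' "$passwd_sample" | awk -F ":" '{print $1 " -> " $6}')
printf '%s\n' "$users_and_homes"
```

The text between the double quotes is printed as-is, so any label or separator can be placed between the two columns.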

Limiting the Entries Using the "NR" Variable

In the previous commands, you printed a complete column from the "/etc/passwd" file. However, you can limit the output to a range of line numbers using the "NR" variable, which holds the number of the current line.

$ awk -F ":" 'NR==1, NR==5 {print $6}' /etc/passwd

Output:

Limiting the entries using the "NR" variable

The above command prints the sixth column for the first through fifth lines only.
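The range form can be checked on a few throwaway lines. The sketch below feeds six single-letter lines (invented input) to AWK and compares the "start, end" range with an explicit comparison on "NR", which should select the same lines.

```shell
# Six made-up input lines: a, b, c, d, e, f
letters=$(printf '%s\n' a b c d e f)

# Range pattern: from the line where NR==2 through the line where NR==4
range_form=$(printf '%s\n' "$letters" | awk 'NR==2, NR==4')

# The same selection written as a single condition on NR
compare_form=$(printf '%s\n' "$letters" | awk 'NR>=2 && NR<=4')

printf '%s\n' "$range_form"
```

When no action is given, AWK prints the whole matching line, which is why neither program needs a print statement.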

If you read a file like "/etc/shells", the first line is a comment that gets in the way when printing columns; you can easily skip it using the "NR" variable, as shown.

Reading the "/etc/shells" file without removing the first line:

$ cat /etc/shells 

Output:

Reading the "/etc/shells" file

Reading the "/etc/shells" file, removing the first line using the "NR" variable, and printing the rest of the output:

$ awk 'NR==2, NR==0' /etc/shells 

Output:

Removing the first line from the "/etc/shells" file

Here, "NR==2, NR==0" is a range that starts at the second line. Since line numbers start at 1, "NR==0" never matches, so the range never ends and AWK prints the rest of the file.
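A shorter way to get the same effect is the condition "NR>1". The sketch below runs both forms on an invented three-line sample in the style of "/etc/shells" and stores the results so they can be compared.

```shell
# Hypothetical sample in the style of /etc/shells
shells_sample='# /etc/shells: valid login shells
/bin/sh
/bin/bash'

# Open-ended range starting at line 2 (the end condition never matches)
range_skip=$(printf '%s\n' "$shells_sample" | awk 'NR==2, NR==0')

# Simpler condition: every line after the first
simple_skip=$(printf '%s\n' "$shells_sample" | awk 'NR>1')

printf '%s\n' "$simple_skip"
```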

Using Regular Expressions

Earlier, I explained how to ignore the first row or any row using the "NR" variable with the example of the "/etc/shells" file.

However, if you look in the same "shells" file, you will find that the content we are interested in starts with a "/" forward slash. We can use this to print only the lines that begin with a "/" forward slash, as shown.

$ awk -F "/" '/^\// {print $NF}' /etc/shells

Output:

Filtering the entries that start with a ("/") forward slash

The "$NF" variable refers to the last column of each filtered line, since "NF" holds the number of columns in the current line.
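To see how "NF" and "$NF" relate, the following sketch prints both for an invented sample in the style of "/etc/shells". With "/" as the separator, the text before the leading slash counts as an empty first column, so "/bin/sh" has three columns.

```shell
# Hypothetical sample in the style of /etc/shells
shells_sample='# /etc/shells: valid login shells
/bin/sh
/bin/bash
/usr/bin/zsh'

# For lines starting with "/", print the column count and the last column
shell_names=$(printf '%s\n' "$shells_sample" | awk -F "/" '/^\// {print NF, $NF}')
printf '%s\n' "$shell_names"
```

The comment line is skipped because it does not match the regular expression between the slashes.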

Displaying the Entries Based on Line Length

In the previous command, you used the "$NF" variable to print the last column; if you remove it or replace it with "$0", it will print the complete line.

$ awk -F "/" '/^\// {print $0}' /etc/shells

Output:

Displaying the complete entries that start with a ("/") forward slash

In the above output, some lines are longer than others.

You can use this length to print only the lines with a certain number of characters, using the ">", "<", or "==" comparison operators.

$ awk -F "/" '/^\// {if(length($0)>10) print}' /etc/shells

The above command will only display the entries whose lines are longer than ten characters.

Displaying the entries whose length is greater than ten characters
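The length test can also be written as part of the pattern instead of inside an if statement. The sketch below does that on an invented sample in the style of "/etc/shells" and prints each line's length in front of it, so the comparison is easy to verify by eye.

```shell
# Hypothetical sample in the style of /etc/shells
shells_sample='# /etc/shells: valid login shells
/bin/sh
/bin/bash
/usr/bin/zsh'

# Keep lines that start with "/" and are longer than ten characters
long_entries=$(printf '%s\n' "$shells_sample" | awk '/^\// && length($0) > 10 {print length($0), $0}')
printf '%s\n' "$long_entries"
```

Only "/usr/bin/zsh" (12 characters) passes both conditions in this sample; the comment line is longer than ten characters but does not start with a slash.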

Displaying Entries Beginning with a Specific Keyword

If you execute the β€œps -ef” command, you will get a detailed list of running processes.

$ ps -ef

Output:

Displaying the running processes

From the above list, if you want to output only the lines whose first column is exactly "systemd+", you can do that with the help of an if statement, as shown.

$ ps -ef | awk '{if($1=="systemd+") print}'

Output:

Displaying the entries whose first column matches the specified string
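The same comparison can be tried on fixed input. The sketch below uses a few invented lines loosely shaped like "ps -ef" output (the users, process IDs, and commands are hypothetical, and several real columns are left out) and writes the test as a bare condition, which prints matching lines without an explicit if.

```shell
# Hypothetical, shortened lines loosely shaped like ps -ef output
ps_ef_sample='UID PID PPID CMD
root 1 0 /sbin/init
systemd+ 612 1 /lib/systemd/systemd-resolved
alice 2040 1 bash'

# Print only the lines whose first column is exactly "systemd+"
matching=$(printf '%s\n' "$ps_ef_sample" | awk '$1 == "systemd+"')
printf '%s\n' "$matching"
```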

Displaying the Line with Matching Text

If you have a text file in which only some lines contain a particular string, you can use that string to display only those lines, as shown.

$ awk '/line/ {print}' file.txt 

The above command will only display the lines that contain the "line" string.

Displaying the entries with matching strings
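If you do not have a suitable "file.txt" at hand, the sketch below creates a temporary file with three invented lines, runs the same match against it, and then deletes the file.

```shell
# Create a temporary file with three made-up lines
tmpfile=$(mktemp)
printf '%s\n' 'first line' 'second row' 'third line' > "$tmpfile"

# Print only the lines that contain the string "line"
matched=$(awk '/line/ {print}' "$tmpfile")
printf '%s\n' "$matched"

# Clean up the temporary file
rm -f "$tmpfile"
```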

Showing Entries by Line Number

Earlier, I showed you how to use the "NR" variable to ignore the first line of the "/etc/shells" file. You can also use the same variable to print the line number in front of each line, as shown.

$ awk '{print NR,$0}' file.txt 

Output:

Displaying the entries with line numbers
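Line numbers become more useful when combined with a match, because "NR" keeps counting every input line, not just the printed ones. The sketch below shows this on three invented lines.

```shell
# Print the original line number in front of each line that contains "line"
numbered=$(printf '%s\n' 'first line' 'second row' 'third line' | awk '/line/ {print NR ": " $0}')
printf '%s\n' "$numbered"
```

The second input line is skipped, so the printed numbers are 1 and 3 rather than 1 and 2.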

For Loop in AWK

Like other programming languages, AWK lets you write a for loop by specifying a start value, a condition, and an increment, as shown.

$ awk 'BEGIN { for (i = 1; i <= 5; i++) print i }'

Output:

for loop statement using the awk command
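A for loop is often used to walk through the columns of a line, with "NF" as the upper limit. The following is a minimal sketch on one invented input line.

```shell
# Loop over every column of the line and print its position and value
fields=$(printf '%s\n' 'alpha beta gamma' | awk '{ for (i = 1; i <= NF; i++) print i, $i }')
printf '%s\n' "$fields"
```

Here "$i" means "the column whose number is stored in i", so the loop visits the columns from first to last.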

While Loop in AWK

A while loop can also be written in AWK by initializing a variable and repeating the body for as long as a condition holds.

$ awk 'BEGIN {i = 1; while (i <= 5) { print i; i++ } }'

Output:

While loop statement using awk command
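As a slightly larger sketch of the same loop, the program below adds up the numbers from 1 to 5 instead of printing them (the variable name "sum" is just an illustration).

```shell
# Sum the integers 1 through 5 with a while loop
total=$(awk 'BEGIN {
    i = 1
    sum = 0
    while (i <= 5) {   # body runs for i = 1, 2, 3, 4, 5
        sum += i
        i++
    }
    print sum
}')
printf '%s\n' "$total"
```

Because everything happens in the BEGIN block, AWK does not wait for any input file.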

You can do a great deal more with AWK if you are comfortable with regular expressions.

I have tried to explain it with the best examples I could, but if you still have any doubts, do let us know in the comment section.
