Easy steps for optimizing shell scripts

Computation

You have several options for summing numbers contained in a text file. In the example in Listing 14, only the lines that contain the string "fiftieth" are of interest. The script evaluates a file of one million lines, in which every fiftieth line contains that string, and totals the sixth field of each matching line.
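The article does not show the layout of largefile, so the following is only a sketch for creating comparable test data. The function name make_largefile, the seven-field layout, and the choice of a small number in field 6 are all assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical test-data generator; the real layout of largefile is not given.
# Assumption: seven space-separated fields, a small number in field 6, and
# the word "fiftieth" as field 3 on every 50th line ("other" elsewhere).
make_largefile() {
  local lines=${1:-1000000} out=${2:-largefile}
  awk -v n="$lines" 'BEGIN {
    for (i = 1; i <= n; i++) {
      word = (i % 50 == 0) ? "fiftieth" : "other"
      print "line", i, word, "a", "b", i % 7, "end"
    }
  }' > "$out"
}
```

Called without arguments, make_largefile writes one million lines to largefile in the current directory; a smaller line count and a different file name can be passed as the first and second arguments.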

Listing 14

Looking for a String

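#!/bin/bash
typeset -i sum=0
# Read only the lines that grep has already filtered
while read -r line; do
  # Split the line into positional parameters $1, $2, ...
  set -- $line
  # Add field 6 to the running total
  sum=$((sum+$6))
done < <(grep " fiftieth " largefile)
echo "Sum total: $sum"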

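Here, too, you can use Awk as a tool for quick summation (Listing 15). However, Awk interprets its program rather than running it as native machine code, so its pattern matching is slower than that of grep, which is a compiled tool specialized in searching. For this reason, it makes sense to hand the search for the character string to the grep command and then use Awk only to add up the lines found (Listing 16).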

Listing 15

Faster with Awk

# time awk '/ fiftieth / {sum += $6} END {print "Sum total:", sum}' largefile

Listing 16

Fastest: Awk with grep

# time awk '{sum += $6} END {print "Sum total:", sum}' < <(grep " fiftieth " largefile)
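The same division of labor also works with an ordinary pipe instead of Bash process substitution. The sketch below wraps it in a function; the name sum_with_pipe is hypothetical, and the use of grep -F (fixed-string matching) is an addition that the article does not measure.

```bash
#!/bin/bash
# Sketch: same split of work as Listing 16, but with a plain pipe.
# sum_with_pipe is a hypothetical helper; the file name is passed as $1.
sum_with_pipe() {
  # grep -F treats the pattern as a fixed string, not a regular expression.
  # "sum + 0" makes Awk print 0 instead of an empty value if nothing matches.
  grep -F " fiftieth " "$1" | awk '{sum += $6} END {print "Sum total:", sum + 0}'
}
```

A call such as sum_with_pipe largefile prints the total in the same format as the listings; prefixing the call with time allows a comparison on your own data.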

In this example, too, optimization achieved significant speed gains. The variant from Listing 15 reduces the runtime to about one third; the variant from Listing 16 runs roughly twenty times faster than Listing 14 (see Table 3).

Table 3

Awk Timer

Category   Listing 14   Listing 15   Listing 16
real       0m4.471s     0m1.408s     0m0.231s
user       0m2.374s     0m1.348s     0m0.050s
sys        0m1.956s     0m0.013s     0m0.010s
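The speed-up factors quoted in the text follow from the real row of Table 3. As a small sketch, Awk itself can do the division; the function name speedups is hypothetical.

```bash
#!/bin/bash
# Speed-up factors derived from the "real" row of Table 3 (times in seconds).
speedups() {
  awk 'BEGIN {
    base = 4.471    # real time of Listing 14
    printf "Listing 15: %.1fx\n", base / 1.408
    printf "Listing 16: %.1fx\n", base / 0.231
  }'
}
speedups
```

This prints factors of 3.2 for Listing 15 and 19.4 for Listing 16.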

Conclusions

The examples in this article show that you can drastically increase the speed of your scripts by skillfully using multifunctional tools such as Awk, Python, or Perl instead of complex constructs built from Tr, Sed, or Grep. In this way, you consistently avoid many context switches.

However, not every anonymous pipe is detrimental to throughput, as the last example shows. Instead, it is more important to use the strengths of the various tools and keep the number of subprocesses as low as possible.
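To illustrate what keeping the number of subprocesses low can mean inside a loop, the following sketch contrasts two ways of totaling field 6. Both function names are hypothetical, and the code assumes space-separated input with at least six fields per line and no leading zeros in field 6.

```bash
#!/bin/bash
# Hypothetical helpers: one set of subprocesses per line versus none in the loop.

# Costly pattern: each iteration forks a subshell and starts cut.
sum_forking() {
  local sum=0 line
  while read -r line; do
    sum=$((sum + $(echo "$line" | cut -d' ' -f6)))
  done < "$1"
  echo "$sum"
}

# Leaner pattern: read splits the line itself, so the loop starts no processes.
sum_builtin() {
  local sum=0 f6
  # Fields 1-5 and everything after field 6 are discarded into "_"
  while read -r _ _ _ _ _ f6 _; do
    sum=$((sum + f6))
  done < "$1"
  echo "$sum"
}
```

Both functions return the same total for the same file; the difference lies only in how many processes the shell has to start along the way.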

It is also important to remember that you can improve the readability and thus the maintainability of the scripts by doing without complex chains of commands.
