Making your scripts interactive
Bash Communication
Letting your scripts ask complex questions and give user feedback makes them more effective.
The final installment in my shell script tutorial series focuses on conversations between Bash scripts and the human users who interact with them while they run. Part of what you will learn here can be applied to conversations between shell scripts and other programs, but that is a different problem, which is often best solved with other tools, such as Expect [1].
As with every other programming feature, the most important questions are not "what" or "how," but "when" and "why" a script would need conversations. For the purpose of this tutorial, I will distinguish three broad cases, some of which may overlap in several ways.
In the first case, there are certain scripts that should not hold conversations at all (reporting outcomes and errors in logfiles is a different issue, of course). Obvious examples are scripts that should run as cron jobs or that must do exactly the same thing every time they run.
The second case involves scripts that must ask questions (i.e., receive different arguments every time they start) but require no other input after that.
The third case involves actual conversations, which occur in scripts that, regardless of initial arguments, must also carry on a more or less flexible dialog with users or programs throughout their execution.
Before I start, there is one common rule for all three cases: Never trust user input – even if that user is you! Due to space constraints, I have omitted any validation code from my examples. Do not, as they say on TV, try this at home! If you ask a user to enter a number in your script, include validation code to verify that it is a number; if it isn't a number, either ask again for a correct value or abort the script. To do this, use the test operators explained in the fourth installment of this series [2].
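As a hedged illustration of the kind of check described above, the following sketch accepts a value only if it consists of exactly four digits. The function name is_four_digit_year is an assumption made for this example, and it relies on Bash's [[ ... =~ ... ]] regular expression test; it is not code taken from this series.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if its argument is exactly four digits.
is_four_digit_year() {
  [[ $1 =~ ^[0-9]{4}$ ]]
}

# Non-interactive demonstration with sample values instead of real user input.
for candidate in 1815 NONE 0 "19 87"; do
  if is_four_digit_year "$candidate"; then
    echo "accepted: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

In an interactive script, the same function could guard a loop that asks again until the answer passes, or it could trigger an early exit.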
Command-Line Arguments
The simplest way to pass a script some data is on the command line when you launch the script. As previously explained in this series, all the arguments passed to a script as follows
#> scientists-directory.sh Ada Lovelace mathematician 1815
are available inside the script in the special variables $1, $2, and so on or in one array called $@. In the above example, the scientists-directory.sh script should begin by copying all those values into other variables with better, self-documenting names
NAME=$1
SURNAME=$2
PROFESSION=$3
YEAR_OF_BIRTH=$4
and then only use those other variables as needed. One limit of this approach is that it is easy to make mistakes, for example, by passing arguments in the wrong order. You can greatly mitigate this risk by making your script require command-line switches, which are named parameters that may or may not have an assigned value. In both cases, to distinguish the parameter names from their values, the former always start with one or two dashes. An argument like -v may mean "enable verbose logging," and a pair of strings like --newuser marco would tell the script to assign the value marco to the $NEWUSER variable.
Listing 1, which is taken from a backup script capable of both total and incremental backups, shows the simplest way to parse such arguments. If the first command-line argument passed to the script ($1) has a value of -t, --total, -i, or --incremental, the case statement sets the $BACKUP_TYPE accordingly, so the rest of the script knows which backup procedure it should execute. If there is no first argument, the script exits. There could, of course, be a default $BACKUP_TYPE to run when no arguments are given. Whether that is a good idea or not depends on your individual needs.
Listing 1
Parsing Command-Line Switches
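A minimal sketch of the approach just described might look like the following. It is an illustration written under assumptions, not the original listing: The function name select_backup_type and the messages are hypothetical, and the parsing is wrapped in a function only so that it can be tried on sample values without ending the shell.

```shell
#!/usr/bin/env bash
# Hypothetical function: maps the first switch to a backup type.
# Prints the type on success; returns 1 if the switch is missing or unknown.
select_backup_type() {
  case "${1:-}" in
    -t|--total)       echo "total" ;;
    -i|--incremental) echo "incremental" ;;
    *)                return 1 ;;
  esac
}

# In a real script, "$1" would be passed here; sample values keep the demo unattended.
for switch in -t --incremental ""; do
  if BACKUP_TYPE=$(select_backup_type "$switch"); then
    echo "switch '$switch' selects a $BACKUP_TYPE backup"
  else
    echo "switch '$switch' is missing or unknown; a real script would exit here"
  fi
done
```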
The method shown in Listing 1 can be extended to handle multiple command-line arguments (see Listing 2), each of which may have any value and appear in any order.
Listing 2
Multiple Command-Line Arguments
The special variable $# in line 1 holds the number of parameters passed on the command line that are still available in the $@ array. The while loop in the same line continues until $# is equal to zero (i.e., until all arguments have been parsed and removed). The commands in lines 6 or 11 are executed every time $1 (the current first argument saved in $key) matches one of the accepted switches (-f, -u, and so on). When that happens, the following parameter ($2) is copied to the internal variable corresponding to that switch. If the code in Listing 2 were in a script that makes on-demand backups of only some file types (e.g., images, audio, etc.) for one user, launching it as follows
#> custom_backup.sh -u marco -f images
would set $# to 4, and the variables from $1 to $4 to -u, marco, -f, and images, respectively. Consequently, the first execution of the case statement in line 5 would set $USER to marco and then "shift away" (i.e., remove) the first two elements of $@ (lines 8 and 9). This would bring $# to 2, causing one more iteration of the while loop. This time, however, $key would be equal to -f, because the original first two elements of $@ (-u and marco) were removed. This would set $FILETYPE to images. It is easy to see that changing the order of the pairs of arguments
#> custom_backup.sh -f images -u marco
would yield exactly the same result. Any argument in $@ after the ones explicitly mentioned in the case statement (or before the first pair) would be dumped into the $OTHERS variable.
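The loop described above can be sketched as follows. This is an assumption-laden illustration rather than the original listing, so the line numbers cited in the text do not apply to it. The function name parse_switches, the long switches --user and --filetype, and the variable name BACKUP_USER are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical function wrapping the loop so it can be tried on sample arguments.
parse_switches() {
  BACKUP_USER="" FILETYPE="" OTHERS=""
  while [ "$#" -gt 0 ]; do
    key=$1
    case "$key" in
      -u|--user)
        [ "$#" -ge 2 ] || { echo "missing value for $key" >&2; return 1; }
        BACKUP_USER=$2
        shift 2               # drop the switch and its value
        ;;
      -f|--filetype)
        [ "$#" -ge 2 ] || { echo "missing value for $key" >&2; return 1; }
        FILETYPE=$2
        shift 2
        ;;
      *)
        OTHERS="$OTHERS $key" # anything unrecognized is collected here
        shift
        ;;
    esac
  done
}

parse_switches -f images -u marco extra
echo "user=$BACKUP_USER filetype=$FILETYPE others=$OTHERS"
```

The sketch stores the user name in BACKUP_USER instead of USER, because most login environments already keep the current account name in $USER and overwriting it can confuse other commands.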
A more sophisticated option to parse command-line arguments is the getopts built-in command, whose main features are shown in Listing 3. The real difference between this while loop and the one in Listing 2 lies in the arguments of getopts (line 1). If there are only two arguments (as in this example), getopts works on the actual command-line switches passed to the script. Optionally, you may pass getopts further arguments (for example, the expanded elements of an array), and it will parse those instead.
Listing 3
getopts
The second argument in line 1 (opt) is the variable that must store the current option. The initial, admittedly cryptic string ":ht:" contains the recognized (single-character) switches (h and t in the example) with some qualifiers. Starting this string with a colon puts getopts in silent error-reporting mode: It prints no error messages of its own, and whenever a command-line switch other than -h or -t is found, it sets $opt to ? and stores the offending character in $OPTARG. Of course, this will make a difference to the script only if it contains code to handle such errors (line 11).
As far as the actual command-line switches are concerned, please note the important difference in the processing of -h and -t. The first switch only works, when present, as an actual on/off switch of some function. The colon after t in line 1, instead, means that this option must have a value, which getopts will automatically copy into the special $OPTARG variable. When that value after -t is missing, $opt is set to a colon, causing the error message in line 9.
Line 15 has the same purpose as the shift commands in Listing 2: The special variable $OPTIND holds the index of the next command-line argument to be processed, so shifting by one less than its value, as shown here, removes from $@ all the arguments that getopts has already parsed. A short, but clear and complete description of how getopts works is available online [3].
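The getopts behavior described above can be sketched as follows. This is not the original listing, so its line numbers differ from those cited in the text. The function name parse_options and the variables HELP, TARGET, and REMAINING are assumptions, because the text does not say what -h and -t stand for in the original script.

```shell
#!/usr/bin/env bash
# Hypothetical function: parses -h (a flag) and -t VALUE with the ":ht:" option string.
parse_options() {
  local opt OPTIND=1        # reset OPTIND so the function can be called repeatedly
  HELP=no TARGET="" REMAINING=""
  while getopts ":ht:" opt; do
    case "$opt" in
      h)  HELP=yes ;;
      t)  TARGET=$OPTARG ;;
      :)  echo "option -$OPTARG requires a value" >&2; return 1 ;;
      \?) echo "unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))     # discard everything getopts has already consumed
  REMAINING="$*"
}

parse_options -h -t /tmp/demo file1 file2
echo "help=$HELP target=$TARGET remaining=$REMAINING"
```

Because the option string starts with a colon, both error branches can build their own messages from $OPTARG, which holds the offending option character in each case.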
Asking Multiple Questions
Command-line arguments are very flexible, but, by definition, you can only use them once and only before the script starts running. When that is not enough, the easiest way to get user input that runs entirely in the terminal without any third-party program is the read built-in command, discussed in the second, third, and sixth installments of this series [4], which covered shell arrays, flow control, and functions, respectively. You can call read as many times as you like in a script, and each invocation can set many variables, with several options controlling how the command behaves.
Inside the already mentioned scientists-directory.sh script, for example, a single read call may load all the data about one scientist:
read NAME SURNAME PROFESSION YEAR_OF_BIRTH
What happens here is that read loads an entire line of input and splits it into words according to the value of the $IFS variable [5]. Next, read saves each of those words inside the variable in the same position in the argument list it received ($NAME, $SURNAME, etc.). If there are fewer words than variables, the extra variables remain empty. In the opposite case, all the excess words are appended to the last variable ($YEAR_OF_BIRTH in the command above). If you call read without any argument, instead, the entire line is saved into the special variable $REPLY. Alternatively, you may call read with the -a ARRAYNAME option, and it will save all the words it reads into one array called $ARRAYNAME.
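The splitting rules above can be tried without typing anything by feeding read from here-strings (<<<), as in the following sketch. The sample data and the variable names FIRST, REST, A, B, C, and WORDS are assumptions for illustration; the -r option, which is not used in the command above, only stops read from treating backslashes as escape characters.

```shell
#!/usr/bin/env bash
# Here-strings (<<<) stand in for keyboard input so the demo runs unattended.

# Exactly four words: one per variable.
read -r NAME SURNAME PROFESSION YEAR_OF_BIRTH <<< "Ada Lovelace mathematician 1815"
echo "name=$NAME surname=$SURNAME profession=$PROFESSION year=$YEAR_OF_BIRTH"

# More words than variables: the surplus ends up in the last variable.
read -r FIRST REST <<< "Ada Lovelace mathematician 1815"
echo "first=$FIRST rest=$REST"

# Fewer words than variables: the extra variables are left empty.
read -r A B C <<< "Ada"
echo "a=$A b=[$B] c=[$C]"

# No variable names at all: the whole line goes into REPLY.
read -r <<< "Ada Lovelace"
echo "reply=$REPLY"

# -a stores every word in one array.
read -r -a WORDS <<< "Ada Lovelace mathematician 1815"
echo "word count=${#WORDS[@]} third=${WORDS[2]}"
```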
Beware! The same power that makes this command very handy can cause problems if you are not careful. To make this point, Listing 4 deliberately uses bad code, which collects voter data inside a generic voter registration script.
Listing 4
Misusing read
Let's see how this works with a few different inputs before explaining the code in detail. If the user is a child, the following appears in the terminal window:
Please type your year of birth (4 digits), followed by [ENTER]: 2010
sorry, you cannot vote because you are only 9 years old. Quitting...
So far, so good. But what happens with bogus, unchecked data?
Please type your year of birth (4 digits), followed by [ENTER]: 0
You may vote, because you are 2019 years old.
What is your nationality? [ENTER]: ...
Not so good now, right? Even worse is the fact that you would obtain exactly the same result by entering something like NONE as your year of birth. A quick and dirty explanation of why this happens is that, inside an arithmetic expression, Bash treats a non-numeric value such as NONE as the name of another variable; because that variable is unset, it evaluates to zero. But wait! I saved the best for last. This is what happens with another set of inputs, assuming that the user's nationality is Italian:
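The following sketch reproduces that arithmetic behavior in isolation. The function name age_in_2019 is hypothetical, and the reference year 2019 is inferred from the sample output above; the sketch also assumes that no variable called NONE exists in the shell running it.

```shell
#!/usr/bin/env bash
# Hypothetical function reproducing the age computation with a fixed reference year.
age_in_2019() {
  local year_of_birth=$1
  echo $(( 2019 - year_of_birth ))
}

age_in_2019 2010   # a plausible year of birth
age_in_2019 0      # accepted without complaint
age_in_2019 NONE   # NONE is looked up as an unset variable, which counts as zero
```

This is why a numeric check has to come before the subtraction: The arithmetic itself raises no error for this kind of input.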
Please type your year of birth (4 digits), followed by [ENTER]: 1987
You may vote, because you are 32 years old.
What is your nationality? [ENTER]: i
Please type "t" if you are a terrorist, followed by [ENTER]:t
you are a citizen of i, 32 years old but you are a TERRORIST! Please freeze, the police are coming for you!
Please enter any comment you may have here, followed by [ENTER]: Rats! How did you know???
Rats! How did you know???
#>
When you call read with the -p option, it prints the corresponding prompt (Listing 4, line 34). When you use -n <NUMBER>, instead, it returns after grabbing at most that number of input characters. Therefore, typing italian as the answer to the question on line 17 made only the first character (i) go into $nationality (line 18), and only the second character (t) go into $goodguy (line 22), thus labeling me as a terrorist. Without input validation between lines 18 and 22, Bash took everything I typed and merrily distributed it among the several commands requesting user input according to those commands' options. Because read is a Bash built-in, its other options are documented in the Bash man page and in the output of help read. In short, using read or any other tool in a Bash script makes it easy to ask questions in order to update many variables quickly, but you need to ask unambiguous questions and never use the answers without checking that they at least look correct.
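The way consecutive read -n 1 calls share one typed line can be reproduced without a keyboard, as in the sketch below. It is not the code of Listing 4: A here-document plays the part of a user typing italian, and the names leftover, answer, and is_valid_nationality are assumptions. The second half shows one possible remedy, which is to read the whole line and check it before use.

```shell
#!/usr/bin/env bash
# A here-document stands in for a user who types "italian" and presses [ENTER].
{
  read -r -n 1 nationality   # takes only "i"
  read -r -n 1 goodguy       # takes the next pending character, "t"
  read -r leftover           # takes whatever remains on the line
} <<'EOF'
italian
EOF
echo "nationality=$nationality goodguy=$goodguy leftover=$leftover"

# One possible remedy: read the whole line, then validate it before using it.
is_valid_nationality() {
  [[ $1 =~ ^[[:alpha:]]+$ ]]   # letters only; a real script may need a stricter list
}
read -r answer <<< "italian"
if is_valid_nationality "$answer"; then
  echo "accepted nationality: $answer"
fi
```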
Requesting Input in a GUI
Commands like read, echo, and printf are the simplest way to ask questions and receive answers inside a shell script, not to mention the only ones that will work with console connections to remote Unix computers where only the very basic tools are installed. However, you don't have to be limited by these commands. Figures 1 and 2, which were generated with the code in Listing 5, show that you can create pop-up windows that perform the same functions with the Zenity tool [6].
Listing 5
Zenity Dialogs
In addition to those in Figures 1 and 2, Zenity can produce many more window types. Comparing Listing 5 with Figures 1 and 2 also shows how simple Zenity is to use once you have had a quick look at its documentation [6]. Of note in Listing 5, line 9 shows how Zenity makes the answers received from the user available to the script that calls it. The Yes/No buttons in Figure 2 just produce an exit code, but when the user types something, the data is sent to standard output, which you can redirect to a file like voter-data.csv. Inside that file, fields are separated by a pipe (|), so the content of the file generated by Figure 1 would be 1981|Italian.
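Once such a pipe-separated record exists, a script can split it back into variables with read, as in this minimal sketch. The record is hard-coded here so that the example runs without a graphical session, and the variable names are assumptions; a real script would read the line from the file that received Zenity's standard output.

```shell
#!/usr/bin/env bash
# Sample record in the format described above.
record="1981|Italian"

# Split the record on the pipe character, changing IFS only for this one read.
IFS='|' read -r year_of_birth nationality <<< "$record"
echo "year of birth: $year_of_birth"
echo "nationality: $nationality"
```

As with any other input, both fields should still be validated before the script relies on them.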