Exploring the latest version of the great Bourne-again shell

File Handling

Listing 4 contains a typical programming pattern. The while loop in lines 5 through 8 reads an input file line by line and stores each line in an array. This construction occurs frequently, but unfortunately, it is imperfect. If the last line does not end with a newline character, the loop will not store it: The built-in read command does assign the partial line to the variable, but it returns a non-zero status at end-of-file, so the loop body never runs for that line. In addition, a plain read without the -r option interprets backslashes and strips leading and trailing whitespace.

Listing 4

Parsing Files into an Array – Legacy Approach

01 #!/bin/bash
02
03 inputFile="$1"
04 i=0
05 while read line; do
06    lines[$i]="$line"
07    let i++
08 done < "$inputFile"
09
10 # Ongoing processing of the array lines ...
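
For readers who have to stay with the loop, the following is a minimal sketch of a more defensive variant. The sample file and the variable names are assumptions made for illustration; the extra test after read is a common idiom, not something taken from Listing 4.

```shell
#!/usr/bin/env bash
# Sketch: read every line of a file into an array, including a final
# line that lacks a trailing newline. The sample file is created here
# only to keep the example self-contained.

inputFile="$(mktemp)"
printf 'first\nsecond\nlast without newline' > "$inputFile"

lines=()
# IFS= preserves leading and trailing whitespace, -r keeps backslashes
# literal. The "|| [[ -n $line ]]" part runs the body one more time when
# read reaches end-of-file after collecting a partial line.
while IFS= read -r line || [[ -n $line ]]; do
    lines+=("$line")
done < "$inputFile"

rm -f "$inputFile"
printf '%d lines read\n' "${#lines[@]}"
```

The += append form avoids the separate counter variable used in the legacy loop.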

The new Bash version not only saves programmers some typing, but also provides a cleaner implementation. Instead of the while loop, a single line is all it takes (see line 4 in Listing 5). The mapfile command is also available under the name readarray, which describes its purpose more aptly.

Listing 5

Bash 4 Parses Files into an Array

01 #!/bin/bash
02
03 inputFile="$1"
04 mapfile -n 0 lines < "$inputFile"
05
06 # Ongoing processing of the array lines  ...

The mapfile command can do even more. If you are interested in more detail, type help mapfile or consult the Bash man page. The command can read a file in batches of several lines and call a callback function for each batch, so that a script can process the input while it is being read. Unfortunately, the developers did not provide an especially elegant implementation of this function. Bash 4 calls the callback before reading a batch and not after. This approach gives the script a callback before the shell has actually read anything, but loses a callback after the last line. In spite of this complication, the function is still useful for reading complete files.
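
The following sketch shows how the callback options fit together. The function name progress and the sample data are hypothetical. The callback timing described above applies to Bash 4.0; the current documentation states that the callback is evaluated after a line is read but before the array element is assigned, so the number of callbacks can differ between releases.

```shell
#!/usr/bin/env bash
# Sketch: mapfile with a callback. The sample file is created here only
# to keep the example self-contained.

inputFile="$(mktemp)"
printf 'alpha\nbeta\ngamma\n' > "$inputFile"

seen=0
# Hypothetical callback. Bash passes the index of the array element that
# is about to be assigned as the first argument.
progress() {
    seen=$((seen + 1))
    printf 'callback for index %s\n' "$1"
}

# -t strips the trailing newline from each element,
# -c 1 sets the callback interval to one line,
# -C names the command to evaluate at each interval.
mapfile -t -c 1 -C progress lines < "$inputFile"

rm -f "$inputFile"
printf '%d lines stored, %d callbacks\n' "${#lines[@]}" "$seen"
```

Without the -t option, every array element keeps its trailing newline, which is easy to overlook when comparing strings later.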

Dependencies and Compatibility

Fortunately, the only dependency the new, lean Bash package needs to resolve is Version 6 of the Readline library. The installation is particularly easy on openSUSE; working as the system administrator, you simply replace the existing bash and bash-doc packages with the new versions and install libreadline6 in parallel with libreadline5. The packages are available from the Build Service [3]. Debian offers a package in its experimental branch [4]. Users of other distributions might prefer to wait or build the package from the source code.

The first place to look for compatibility information is the COMPAT file in the documentation directory. It lists eight items, none of which are fundamental; they relate instead to Posix compatibility. To discover whether or not they are an issue for you, you will need to run your own tests. Another important compatibility test relates to the init system with its many start and stop scripts. For example, some issues with openSUSE's /etc/init.d/network script cropped up. The check_firewall() function returns negative values, which doesn't make sense for two reasons: The range of valid values is between 0 and 255, and the script does not actually evaluate the return value that precisely. This appears to be an openSUSE error.

This does not faze the Bash 3.2 parser, but the negative return codes cause errors in Bash 4. As a consequence, openSUSE still starts the firewall, even if the system configuration tells it to do otherwise. An error like this could lead to problems for systems that the administrator has not actually changed.

Upper and Lower Case

If you have ever had to implement a robust approach to processing configuration files or user input, you will be aware of the following issue: Is the value in the configuration file or the result of a read operation YES, yes, Yes, true, TRUE, or True? Up to and including Bash 3.2, script authors used a call to tr to convert the value unambiguously to upper or lower case.

Bash 4 takes a far easier approach. The declare -u Varname instruction automatically converts all assignments to the Varname variable to upper case. The similar declare -l option converts to lower case. These conversion tools save the programmer the trouble of calling an external program and improve the script's performance. If you just want to convert once, rather than globally, you can use the new parameter expansion:

foo=yes
echo ${foo^}
echo ${foo^^}

The first command converts the first letter to uppercase, thus outputting Yes. The second command converts all the letters and thus returns a result of YES.

If you need to convert to lowercase letters, you can use:

foo=YES
echo ${foo,}
echo ${foo,,}

The ${Var^Pattern} syntax supports even more complex replacements.
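
A short sketch of both techniques follows: normalizing a yes/no answer with declare -l, and restricting the conversion to characters that match a pattern. The variable names and the list of accepted answers are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: case conversion without calling tr (needs Bash 4 or newer).

# Every value assigned to "answer" is converted to lower case.
declare -l answer
answer="YES"

is_yes=no
case "$answer" in
    yes|true|y|1) is_yes=yes ;;
esac

# With a pattern, only matching characters are converted.
text="yes please"
vowels_upper="${text^^[aeiou]}"   # all vowels to upper case
first_only="${text^y}"            # first character, only if it is a "y"

printf '%s %s\n' "$answer" "$is_yes"
printf '%s\n' "$vowels_upper" "$first_only"
```

The pattern is matched against one character at a time, so it behaves like a character filter rather than a substring search.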

Co-Processes and Lists of Values

Most hardware manufacturers today offer multicore CPUs, and techniques for exploiting multicore functionality are starting to appear. Bash 4 allows programmers to launch what are referred to as co-processes:

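coproc pipes { command; }

Because a name is given, the command must be written as a compound command (here, a brace group); with a simple command, Bash would treat pipes as the command to run. The shell stores two file descriptors in the pipes array: pipes[0] is connected to the standard output of the co-process, and pipes[1] is connected to its standard input. The main Bash process uses them to communicate with the co-process, which is particularly useful for parallel processing in shell scripts [5].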
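
As a minimal sketch of this communication, the following script starts a co-process, sends it one line, and reads the reply. The loop inside the braces is a stand-in for a real worker program, and the names request, reply, and worker_pid are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch: talking to a co-process through its two file descriptors.

# A plain Bash loop stands in for a real worker program.
coproc pipes {
    while IFS= read -r request; do
        printf 'echo: %s\n' "$request"
    done
}
# Save the PID now; Bash unsets pipes_PID once the co-process has exited.
worker_pid=$pipes_PID

# pipes[1] leads to the worker's standard input,
# pipes[0] comes from its standard output.
printf 'hello\n' >&"${pipes[1]}"
IFS= read -r reply <&"${pipes[0]}"
printf '%s\n' "$reply"

# Closing the write end gives the worker end-of-file, so its loop ends.
eval "exec ${pipes[1]}>&-"
wait "$worker_pid"
```

Programs that buffer their output when writing to a pipe might not answer until their input is closed, which is why the sketch uses a shell loop rather than an external filter.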

Automatic value lists, or brace expansions, have been around for a while, but most users don't know about them: The call to echo {5..15} counts from the first to the last number. Many Bash programmers still use a call to seq for this:

echo $(seq -s " " 5 15)

The seq command is not just slower; the call is also more difficult to read. And if you need to sort this kind of output, you have to contend with the drawback of missing leading zeros. Bash 4 shell users can now write

echo {05..15}

and, if needed, receive two-digit values with leading zeros as the return value.
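
The sketch below stores two such sequences in arrays; the array names are chosen for illustration. The second sequence uses the optional increment that Bash 4 also accepts as a third element of the range.

```shell
#!/usr/bin/env bash
# Sketch: zero-padded brace expansion (needs Bash 4 or newer).

# A leading zero on either endpoint pads all values to the same width.
padded=( {08..11} )

# A third element sets the increment.
stepped=( {00..20..5} )

printf '%s\n' "${padded[*]}"
printf '%s\n' "${stepped[*]}"
```

Brace expansion takes place before variable expansion, so the endpoints have to be literal numbers; a range such as {1..$n} is not expanded.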

The new version of the standard Linux shell offers some useful new features for programmers and command-line fans. Despite minor incompatibilities, the Bash maintainer encourages users to upgrade (see the "Interview with Bash 4 Developer Chet Ramey" box).

The new version of Bash is not exactly lean and mean – the binary now weighs in at 730KB compared with 590KB in the previous version. It is measurably, but not noticeably, slower; the slower execution times will hardly be a factor on today's hardware.

If you want to use the new functions today, you need full control of your target environment to avoid having to replace the whole Bash environment. In some scripts, you might want to query the BASH_VERSION or BASH_VERSINFO shell variables just to be on the safe side.

Let the script terminate gracefully and output an error message if the conditions for Bash 4 are not fulfilled.
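
Such a guard could look like the following sketch. The function name require_bash is hypothetical. The check derives the major version from BASH_VERSION with a plain parameter expansion, so that the test itself also parses in shells that lack arrays; in Bash, BASH_VERSINFO[0] holds the same number.

```shell
#!/usr/bin/env bash
# Sketch: stop early with a clear message if the shell is too old.

# Hypothetical helper: succeeds if the running Bash has at least the
# requested major version, otherwise prints a message and returns 1.
require_bash() {
    require_bash_wanted=$1
    # Strip everything after the first dot, for example 4.0.10(1)-release -> 4.
    # The result is empty if the script is not running under Bash.
    require_bash_major=${BASH_VERSION%%.*}
    if [ "${require_bash_major:-0}" -lt "$require_bash_wanted" ]; then
        printf 'This script needs Bash %s or newer (found: %s).\n' \
            "$require_bash_wanted" "${BASH_VERSION:-not Bash}" >&2
        return 1
    fi
}

require_bash 4 || exit 1
# Bash 4 features are safe to use from this point on.
```

Placing the check at the top of the script, before any Bash 4 syntax is executed, keeps the error message readable on older systems.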

Interview with Bash 4 Developer Chet Ramey

Chet Ramey is manager of the Network Engineering and Security Group in the IT services division of Case Western Reserve University. He has served as the Bash maintainer since 1990. Linux Magazine's Nils Magnus asked him for some thoughts on the latest Bash release.

Q: The free software community has not seen a new release of Bash in some time. Some say the 3.2 release was suitable for virtually all necessary tasks. So why something new?

A: I'm glad people thought so highly of bash-3.2. I don't like to release new versions too often, since the shell is so basic to many vendor distributions, but it was time. Bash-4.0 offers a number of new features (I tend not to put major new features into minor "point" releases), lots of bug fixes that did not make it as patches to 3.2, and additional functionality for existing features.

Q: Please name the three improvements or new features you are most excited about.

A: Let's see. That's a tough one, since there are a number of good ones.

1. Associative arrays.

2. The fix for the last long-standing piece of Posix non-compliance. The shell no longer requires that parentheses balance when parsing $(…)-style command substitutions (e.g., when parsing a case statement inside a command substitution). I'm excited about it because it was easily the most complicated thing to implement. It was not easy to do with a yacc-generated parser.

3. The improvements possible with bind -x. When you execute a command bound to a key sequence with bind -x, that command has access to the readline line buffer and current cursor position and can change them. A shell function could call an external program to rearrange the words on the command line, for instance, and have that reflected back into the editing buffer. I don't think this has gotten wide use yet, but there are a lot of possibilities.

Q: Bash 4 added a number of new features that ease programming. Do you think it can compete with scripting languages like Perl or Python?

A: I think that Perl and Python are richer languages, in that they have much more built-in functionality. Shells in general are designed to tie together external programs or shell functions and provide an environment in which this is easy. However, you can write very complex programs using the shell language: Look at the bash debugger, for example.

Q: Bash 4 added a number of new features for command-line users. Do you think it can compete with shells specializing in this use case, like Zsh?

A: I think so. Bash may not provide as much functionality built-in, but I think it provides enough tools to make it as rich an interactive environment as a shell like Zsh.

Q: Some discussion has taken place about programming style and the use of features in shell scripts. Some traditionalists insist on plain Bourne shell compatibility; others make use of many features of Bash. What is your point of view? What about backward compatibility with Bash 3.2? Do you recommend moving on immediately or running 3.2 and 4.0 in parallel?

A: I suppose it depends on your goals. It's definitely the case that when people advocate for "plain Bourne shell compatibility" they mean the version of sh running on the machine they use most frequently. There are different versions of the Bourne shell: v7, SVR2, SVR3, SVR4, SVR4.2, for starters, and different vendors have amalgams of features from different Bourne shell versions. It's hard to decide exactly what people mean when they say "vanilla Bourne shell."

For folks interested in writing portable scripts, I would say code to the Posix standard. That can be considered a lowest common denominator that most of the popular shells implement. Many, if not all, vendors ship a shell that conforms to Posix. For those that don't, shells like Bash run on just about every platform out there.

As for backwards compatibility with bash-3.2, I've tried to keep as much backwards compatibility as possible. There are places where I felt that the bash-3.2 behavior was wrong and corrected it, sacrificing some backwards compatibility in the process. There is also the notion of the shell's "compatibility level," which explicitly preserves some old behavior when set (look at the compat31 and compat32 shell options). I think the level of compatibility with bash-3.2 is quite high and should not affect portability of scripts.

I think the compatibility is sufficient that users can upgrade to bash-4.0 right away and gradually get accustomed to the new features.

Thanks for the opportunity to contribute to the magazine.

Infos

  1. Zsh: http://www.zsh.org
  2. Bash: http://tiswww.case.edu/php/chet/bash/bashtop.html
  3. Bash 4 on openSUSE Build Service: http://software.opensuse.org/search/search?baseproject=ALL&q=bash
  4. Bash 4 in Debian's Experimental repository: http://packages.debian.org/experimental/bash-builtins
  5. "Parallel Bash" by Bernhard Bablok, Linux Pro Magazine, March 2009, pg. 56

The Author

Bernhard Bablok manages a data warehouse for Allianz Shared Infrastructure Services with technical performance metrics from mainframes to servers. When he is not listening to music, cycling, or walking, Bernhard enjoys working with Linux and object-oriented software.
