Expand Your Command Line

Tutorials – moreutils

Article from Issue 204/2017
Author(s):

Upgrade your Bash sessions with extra features and power.

Way back in the distant past (the 1970s), a group of programmers at Bell Labs created the first version of Unix. This operating system came with a set of utilities to help use the shell-driven interface. Those utilities proved incredibly useful, and we still have them today. Things like ls, rm, and cat are all descendants of the first pieces of Unix software, and they've changed remarkably little over the years. The GNU versions found on most Linux systems have more features than their ancestors, but the basic functionality remains the same.

These utilities have remained fairly static, because they stick to the basic Unix philosophy of "do one thing well." When you do one thing, there's far less to change or optimize. For this to work well, though, you need enough different tools that each does one thing well. While there's new Linux software being created all the time, there's surprisingly little of the sort of utility software that makes it easy to build powerful commands. In this tutorial, I'm going to look at the work of one project looking to change that: moreutils. Essentially, this project is just looking to expand the basic set of utilities. You should find it in your package manager (probably in a package called moreutils), or you can download it directly from the project website [1].

The aim of each of the utilities is to do just one thing well, so none of them are particularly complicated to use, and each utility has a well-written man page for guidance (Figure 1). The first of the commands I'll look at is combine, which takes two sets of input and combines them using a single logic rule to form the output.

Figure 1: All the utilities in moreutils have well-written man pages to help you out if you forget how to use them.

For example,

combine file1 and file2

outputs every line in file1 that is also in file2, whereas

combine file1 not file2

outputs every line that's in file1 but not in file2. The other operators are or and xor. You can also replace a filename with a "-" to get input from stdin, which makes combine particularly useful for whitelisting (or blacklisting) output from a verbose command. For example, run the command once and send the output to a file called file1. Run it a second time with

| combine - not file1

and you'll just get the output that's different from the first time you ran it.
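If you want to try the operators before pointing them at real output, here is a minimal sketch. The temporary directory, the file contents, and the variable names are all made up for illustration, and the script assumes moreutils is installed.

```shell
#!/bin/sh
# Sketch only: sample files and variable names are invented for this demo.
if command -v combine >/dev/null 2>&1; then
  tmpdir=$(mktemp -d)
  printf 'alpha\nbeta\ngamma\n' > "$tmpdir/file1"
  printf 'beta\ndelta\n' > "$tmpdir/file2"

  # Lines of file1 that also appear in file2
  in_both=$(combine "$tmpdir/file1" and "$tmpdir/file2")

  # Lines of file1 that do not appear in file2
  only_first=$(combine "$tmpdir/file1" not "$tmpdir/file2")

  # "-" stands for stdin: keep only the lines that file1 has not seen before
  new_lines=$(printf 'alpha\nepsilon\n' | combine - not "$tmpdir/file1")

  printf 'and:\n%s\n' "$in_both"
  printf 'not:\n%s\n' "$only_first"
  printf 'stdin not file1:\n%s\n' "$new_lines"

  rm -r "$tmpdir"
else
  echo "combine not found; install the moreutils package first" >&2
fi
```

The last call is the "what changed since the first run" trick from above, with printf standing in for the verbose command.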

The next command I'll look at is pee. OK, take a moment to snigger at the name, and then I'll move on to what it does. The name comes from the fact that it works a little like tee, but for processes: Where tee copies stdin to one or more files, pee copies it to one or more commands. A ridiculously simple example is this command:

echo "hello" | pee cat cat cat

The result is hello printed three times. Note that all of this output goes to stdout, so the following (wholly useless) command only outputs hello once:

echo "hello" | pee cat cat cat | uniq
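For something slightly less silly, the sketch below feeds one stream to two different commands. It assumes moreutils is installed and that pee hands each argument to the shell as a complete command line, which is why commands with their own arguments are quoted. The variable name is made up.

```shell
#!/bin/sh
# Sketch only: one input stream, two consumers.
if command -v pee >/dev/null 2>&1; then
  # Quote each command so its arguments stay attached to it.
  # The two commands run side by side, so the order of their
  # output lines is not guaranteed.
  pee_out=$(printf 'hello\n' | pee 'tr a-z A-Z' 'wc -c')
  printf '%s\n' "$pee_out"
else
  echo "pee not found; install the moreutils package first" >&2
fi
```

One command prints the uppercased text and the other prints the byte count, both on the same stdout.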

Have you ever left a long-running command only to come back to it with no idea when the last line of output was printed? Or tweaked some settings and wanted to know what effect they had on the time between two lines of output? Well, thanks to ts, that's easy! This command does one very little thing that proves to be surprisingly useful – it adds a timestamp to the start of every line in stdin. For the above examples, all you would need to do is pipe the commands to ts, and you'll be able to see exactly when each line of output reached ts. It's like logging for lazy people. The -i option outputs the time since the previous line of stdin; this is useful for profiling changes to settings in software.
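A quick way to see both modes is to fake a slow command with sleep. This is only a sketch: It assumes the ts on your path is the one from moreutils, and the echo lines and variable names are placeholders.

```shell
#!/bin/sh
# Sketch only: a two-line "slow command" piped through ts.
if command -v ts >/dev/null 2>&1; then
  # Default mode: the current date and time in front of each line
  stamped=$( { echo "step one"; sleep 1; echo "step two"; } | ts )

  # -i: the time elapsed since the previous line instead
  deltas=$( { echo "step one"; sleep 1; echo "step two"; } | ts -i )

  printf '%s\n' "$stamped"
  printf '%s\n' "$deltas"
else
  echo "ts not found; install the moreutils package first" >&2
fi
```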

Most standard Unix commands work with text files. Some have the additional ability to work with compressed text files, but not all do. In fact, why should command-line tools come with the ability to work with compressed files? That is, after all, against the basic principle of doing one thing well. zrun is the solution. It's a tool that does just one thing: It uncompresses a file to a temporary file and then runs a command with that temporary file. That all sounds a little more complicated than it actually is, so I'll look at an example. If you have a gzip-compressed file, hello.txt.gz, you can cat it with the command:

zrun cat hello.txt.gz

With zrun, you can run any Linux command on compressed files, as long as zrun recognizes the compression format from the file extension (.gz, .bz2, .xz, and a few others, but not .zip archives).
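Here is a small sketch you can run end to end. It creates its own gzip-compressed test file in a temporary directory, so the paths and variable names are invented, and it assumes both moreutils and gzip are installed.

```shell
#!/bin/sh
# Sketch only: build a throwaway .gz file, then read it through zrun.
if command -v zrun >/dev/null 2>&1 && command -v gzip >/dev/null 2>&1; then
  tmpdir=$(mktemp -d)
  printf 'hello\nworld\n' > "$tmpdir/hello.txt"
  gzip "$tmpdir/hello.txt"              # leaves hello.txt.gz behind

  # cat never sees the compressed data, only zrun's temporary copy
  zrun_text=$(zrun cat "$tmpdir/hello.txt.gz")

  # The same works for any command that takes a filename, such as grep
  zrun_matches=$(zrun grep -c hello "$tmpdir/hello.txt.gz")

  printf '%s\n' "$zrun_text"
  printf 'matching lines: %s\n' "$zrun_matches"

  rm -r "$tmpdir"
else
  echo "zrun or gzip not found; install moreutils and gzip first" >&2
fi
```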

There are some commands that you run all the time, and most of the time they work. You want them to just get on with their job and not spam you with information about what's going on. I'm thinking of things that go in cron jobs or systemd timer units. However, every once in a while, they'll break, and then you want to know everything that happened. The old-fashioned way of dealing with this situation was either to direct all the output to /dev/null and cross your fingers or to write the output to a logfile and just deal with the fact that most of the data there was pointless. The chronic utility solves this problem. By default, it'll just run a command and drop all of the output. However, if the command fails (that is, it exits with a nonzero status or crashes), chronic shows everything the command wrote to stdout and stderr. In this way, you can run commands with all the details switched on without clogging up logfiles, but still have the information available if you need it. Just run it like this:

chronic <command>
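To watch the difference, you can wrap one command that succeeds and one that fails. The two sh -c one-liners below are stand-ins for a real job, the messages and variable names are made up, and the sketch assumes moreutils is installed.

```shell
#!/bin/sh
# Sketch only: a succeeding and a failing stand-in job under chronic.
if command -v chronic >/dev/null 2>&1; then
  # Success: chronic swallows the output, so nothing is captured
  quiet=$(chronic sh -c 'echo "all fine"; exit 0' 2>&1)

  # Failure: chronic replays what the job wrote to stdout and stderr
  noisy=$(chronic sh -c 'echo "backup log line"; echo "disk full" >&2; exit 3' 2>&1)
  rc=$?

  printf 'successful job printed: [%s]\n' "$quiet"
  printf 'failed job printed:\n%s\n' "$noisy"
  printf 'exit status seen by the caller: %s\n' "$rc"
else
  echo "chronic not found; install the moreutils package first" >&2
fi
```

In a crontab, the same idea means cron only has something to mail you when the job actually breaks.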

I've taken a look at just some of my favorite utilities in moreutils, but I haven't covered all of them (see the "Even More Utils" box). In fact, the team behind this software is still on the lookout for more commands to help bolster their collection of tools that set out to make our lives easier.

Even More Utils

moreutils isn't the only source of new command-line tools. Here are a few more of my favorites:

  • jq: works as a complicated, but very powerful command-line JSON parser
  • pv: views the progress of data through piped commands
  • autossh: automatically reconnects ssh connections and tunnels after network disruptions or timeouts
  • tmux: runs multiple terminal sessions inside a single window

If you occasionally have to work on a Windows machine, you can use any of these commands with Windows Subsystem for Linux [2] (Figure 2).

Figure 2: If you can't install Linux on your machine, Windows Subsystem for Linux lets you use moreutils (and, indeed, any other Linux command-line tool).
