Data processor

Open Source Gem

© Lead Image © Mikhail Avlasenko,


Article from Issue 278/2024

A little-known, very powerful data processor for your scripts, datamash makes long, complex calculations simple.

GNU datamash [1] is a command-line program that analyzes, summarizes, and transforms tables of numbers, with or without accompanying text, stored in plaintext files. For these kinds of tasks, datamash is often a faster, more productive alternative to tools like AWK, sed, or a general-purpose scripting language.

Like those other tools, datamash is a good team player in the traditional Unix and Linux sense: You can use it interactively at the prompt, run it automatically in shell scripts, and even connect it directly to other programs (including itself!) via Unix pipes.
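As a minimal sketch of what this looks like in practice, the following commands summarize a small table and then chain two datamash calls with a pipe. The sample data and the variable names (data, summary, top) are made up for illustration; the options used (-s to sort, -g to group, and the sum, mean, and max operations) are standard datamash features, and tab is the default field separator.

```shell
# Hypothetical sample data: tab-separated name/score pairs, made up for illustration.
data=$(printf 'alice\t10\nbob\t20\nalice\t30\nbob\t40')

# Sort the input (-s), group by column 1 (-g 1), then print the sum and mean of column 2.
summary=$(printf '%s\n' "$data" | datamash -s -g 1 sum 2 mean 2)
printf '%s\n' "$summary"

# Pipe datamash into itself: per-name totals first, then the largest of those totals.
top=$(printf '%s\n' "$data" | datamash -s -g 1 sum 2 | datamash max 2)
printf '%s\n' "$top"
```

The first command prints one line per name with its total and average; the second reduces those totals to a single number. An equivalent AWK script would need explicit arrays and an END block to do the same work.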

Moreover, in almost all the cases I have seen or can imagine, datamash does what you need with less typing, often a lot less. Last but not least, datamash lets you easily perform basic quality checks on raw data. I'll show you how to do all this from scratch, starting with datamash's basic options and ways of working and then moving on to more complicated examples.
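One possible form of such a quality check, assuming a reasonably recent datamash release that includes the check operation, is to verify that every line of a file has the same number of fields. The two inputs below and the variable names (good, verdict) are hypothetical; this is a sketch, not necessarily the approach the full article takes.

```shell
# Hypothetical well-formed table: check reports the line and field counts.
good=$(printf 'a\t1\nb\t2\n' | datamash check)
printf '%s\n' "$good"

# Hypothetical broken table: line 2 lacks its second field, so check exits non-zero.
if printf 'a\t1\nb\n' | datamash check >/dev/null 2>&1; then
    verdict="consistent"
else
    verdict="inconsistent"
fi
printf '%s\n' "$verdict"
```

Because check signals problems through its exit status, it can serve as a guard at the top of a script, stopping the run before malformed data reaches any calculation.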



