File Compression for Modern Computing
Command Line – zstd
In an effort to meet modern computing needs, zstd offers a greater degree of compression at a faster compression rate, with unique options to enhance performance.
Many standard Linux tools have been around so long that second-generation tools are being developed to meet modern needs. For instance, Neovim is an update of the Vim text editor, and apt is a rearrangement of the basic tools for apt-get, the Debian package manager. Similarly, Zstandard (zstd) [1] is a revision of compression tools like gzip (the compressor usually paired with the tar archiver), except with higher degrees of compression at a faster rate. Additionally, zstd includes several unique tools for enhanced performance, such as advanced compression features, compression levels and strategies, and dictionaries.
zstd was written by Facebook employee Yann Collet and released in August 2016. Briefly, it is a lossless compression algorithm based loosely on the earlier LZ77 algorithm [2]. The command's syntax is deliberately similar to that of gzip, down to variations on the basic command that are the equivalent of popular options. For example, zstdmt is the same as zstd -T0 (use the same number of threads as detected cores), whereas unzstd is the same as zstd -d (decompress), and zstdcat is the same as zstd -dcf (decompress, write to standard output, and overwrite without prompting).
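As a quick way to see the equivalence described above, the following sketch compresses a throwaway file and reads it back with both the long form and the alias. The file name note.txt and the temporary directory are hypothetical, and the sketch assumes that your distribution installs the zstdcat alias alongside zstd.

```shell
# Work in a throwaway directory so that no existing file is touched.
workdir=$(mktemp -d)
printf 'hello zstd\n' > "$workdir/note.txt"

# Compress quietly; the original note.txt is kept by default.
zstd -q "$workdir/note.txt"

# Long form: decompress (-d) to standard output (-c), forcing (-f).
long_form=$(zstd -dcf "$workdir/note.txt.zst")

# Short form, only if the zstdcat alias is installed; otherwise reuse
# the long-form result so the comparison below still makes sense.
if command -v zstdcat >/dev/null 2>&1; then
    short_form=$(zstdcat "$workdir/note.txt.zst")
else
    short_form=$long_form
fi

[ "$long_form" = "$short_form" ] && echo "both forms give the same output"
```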
The Basics
Getting started with zstd is as simple as typing:
zstd FILE
Multiple files can be specified using a space-separated list. Unless you add --rm as an option, the original file is not deleted. A progress bar is displayed as a single file is compressed; unless -q is added to the command, an error produces a short help page. Unless otherwise specified, level 3 compression is used with the default number of threads (see below), and a checksum is stored in the compressed file so that its integrity can be verified when it is decompressed. The result is a file with the same name as the original file, but ending in .zst (Figure 1).
To decompress, type:
zstd -d FILE
and a decompressed file is created without the .zst extension. If you specify more than one .zst file to decompress, each one is restored to its own file; the results are combined into a single stream only when you send them to standard output with -c. Another option is to run --test (-t) to check the integrity of compressed files without creating or deleting any files.
Wherever an option takes a numeric size, you can add a suffix to specify it in kilobytes (using KiB, Ki, K, or KB) or in megabytes (using MiB, Mi, M, or MB). All of these suffixes are multiples of 1,024 rather than 1,000.
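The basic cycle described in this section can be sketched as a round trip: compress, test, decompress, and compare. The file names and the temporary directory are hypothetical, and -q is used only to keep the output short.

```shell
workdir=$(mktemp -d)

# Build a small, repetitive sample file.
for i in 1 2 3 4 5 6 7 8 9 10; do
    printf 'line %s: the quick brown fox\n' "$i"
done > "$workdir/sample.txt"

# Compress: writes sample.txt.zst and keeps sample.txt.
zstd -q "$workdir/sample.txt"

# Integrity test only; no file is created or deleted.
zstd -q -t "$workdir/sample.txt.zst"
test_status=$?

# Move the original aside so that decompression can recreate sample.txt.
mv "$workdir/sample.txt" "$workdir/sample.orig"
zstd -q -d "$workdir/sample.txt.zst"

# The restored file should be byte-for-byte identical to the original.
cmp -s "$workdir/sample.orig" "$workdir/sample.txt" && echo "round trip OK"
```

Adding --rm to the compression step would delete sample.txt once sample.txt.zst has been written, which makes the mv step unnecessary.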
Basic Options
Most of zstd's basic behaviors can be modified by options. To start, zstd has both verbose (-v) and quiet (-q) modes for running the command. You can also use -o FILE to specify any file name you want for an output file, placing the option after the original file's name, instead of directly after the basic command with the rest of the options. Additionally, if you are aware that a file with the same name as the output file already exists, you can add --force (-f) to overwrite it without confirming the operation first.
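A minimal sketch of these options working together follows. The names report.txt and backup.zst are made up for the example; the second compression relies on -f because backup.zst already exists at that point.

```shell
workdir=$(mktemp -d)
printf 'first version\n' > "$workdir/report.txt"

# -o names the output file explicitly and follows the input file's name.
zstd -q "$workdir/report.txt" -o "$workdir/backup.zst"

# Change the input, then compress to the same output name again.
# -f overwrites backup.zst without asking for confirmation.
printf 'second version\n' > "$workdir/report.txt"
zstd -q -f "$workdir/report.txt" -o "$workdir/backup.zst"

# Read the result back to confirm which version was stored.
restored=$(zstd -dc "$workdir/backup.zst")
echo "$restored"
```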
Several options help speed up commands. You can save a little time by turning off the integrity checksum during compression with --no-check. The increased speed, of course, comes at a price: corruption in the compressed file can no longer be detected when it is decompressed. A somewhat safer option is --sparse, which uses the filesystem's sparse file support so that long runs of zeroes in an output file take up less room on disk; it applies when output is written to a file rather than to standard output. The gain depends entirely on how many zeroes the data contains. With a text file, it may save a couple of percentage points; for a graphics file, however, --sparse saves so little that it hardly seems worth using unless you are determined to save every bit of hard drive space possible.
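To see what --no-check changes, the sketch below compresses the same hypothetical file with and without the checksum and prints both sizes. In the Zstandard frame format the checksum occupies only a few bytes per frame, so expect a very small difference; the exact figures are not shown here because they depend on your zstd version.

```shell
workdir=$(mktemp -d)
yes 'checksum demo line' | head -c 100000 > "$workdir/data.txt"

# Default behavior: a checksum is stored in the compressed file.
zstd -q "$workdir/data.txt" -o "$workdir/checked.zst"

# Same input, but with the checksum omitted.
zstd -q --no-check "$workdir/data.txt" -o "$workdir/unchecked.zst"

# The arithmetic expansion strips any padding that wc adds.
checked=$(( $(wc -c < "$workdir/checked.zst") ))
unchecked=$(( $(wc -c < "$workdir/unchecked.zst") ))
echo "with checksum: $checked bytes, without: $unchecked bytes"

# Both files still decompress to the same data.
zstd -dc "$workdir/unchecked.zst" | cmp -s - "$workdir/data.txt" && echo "contents match"
```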
As a recently created utility, zstd can also be compiled to use multiple CPU cores to make compression faster. By default, only one core is used, but you can adjust the number with the option -T# (--threads=#), where # is the number of threads; for example, -T4. If the value is 0, then zstd will detect the number of cores and try to use all of them. Should the online help appear, you will know that the zstd version you are using was compiled without threading.
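The thread options might be tried out as follows. This is only a sketch: the 1MB sample file is far too small to show a speed difference, and the file names are hypothetical. Its purpose is to show the option forms and to confirm that the thread count does not affect what is restored.

```shell
workdir=$(mktemp -d)

# Roughly 1MB of repetitive text as a stand-in for a larger file.
yes 'multithreaded compression sample' | head -c 1000000 > "$workdir/big.txt"

# -T0: detect the number of cores and try to use all of them.
zstd -q -T0 "$workdir/big.txt" -o "$workdir/big.auto.zst"

# -T2: use exactly two threads.
zstd -q -T2 "$workdir/big.txt" -o "$workdir/big.two.zst"

# Long form of the same option, here with a single thread.
zstd -q --threads=1 "$workdir/big.txt" -o "$workdir/big.one.zst"

# Whatever the thread count, decompression restores the same data.
for f in auto two one; do
    zstd -dc "$workdir/big.$f.zst" | cmp -s - "$workdir/big.txt" && echo "$f: OK"
done
```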
Compression Levels, Strategies, and Advanced Options
zstd approaches compression in two different ways. The more conventional tactic is to specify a compression level with -#, where # is the level number (for example, -5); --compress (-z) merely selects compression, which is already the default action. The default level 3 can be overridden with any number from 1 to 19, with 1 being the quickest and least compressed, and 19 the slowest and most compressed. To give a sense of the choices involved, a compression setting of 1 reduced the size of a 42MB .png file by five percent in about six seconds, whereas a setting of 19 compressed the same file by just under eight percent in about 20 seconds. With a plain text file of 4,600 bytes, level 1 compression produced a compressed file 55 percent smaller, whereas level 19 compression created a file that was 59 percent smaller, both requiring only a few seconds. This difference between graphic and text files is typical, because formats such as PNG are already compressed and leave little redundancy to remove.
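To run a comparison like this on your own data, a loop along the following lines might help. The sample file here is hypothetical, generated text, so its sizes will not match the figures quoted above; substitute a real file to get meaningful numbers.

```shell
workdir=$(mktemp -d)
yes 'the same sentence, over and over' | head -c 200000 > "$workdir/text.txt"

original=$(( $(wc -c < "$workdir/text.txt") ))
echo "original: $original bytes"

# Compress the same input at a fast, the default, and a slow level.
for level in 1 3 19; do
    zstd -q -"$level" "$workdir/text.txt" -o "$workdir/text.$level.zst"
    size=$(( $(wc -c < "$workdir/text.$level.zst") ))
    echo "level $level: $size bytes"
done
```

Prefixing the zstd line with the shell's time keyword gives a rough idea of the speed cost of each level.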
You also have the option of adding --ultra to enable the higher, more memory-intensive compression levels 20-22 (for example, --ultra -22). However, when used by themselves, the advanced compression levels are no more efficient than level 19 compression. To get the most from the ultra-compression levels, you need to experiment with the advanced options.
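A short sketch of the ultra levels, using a hypothetical generated file: the same input is compressed at level 19 and at level 22 so that you can compare the sizes yourself. The result depends heavily on the data, so no particular outcome is claimed here.

```shell
workdir=$(mktemp -d)
yes 'ultra level sample text' | head -c 200000 > "$workdir/text.txt"

# Highest regular level.
zstd -q -19 "$workdir/text.txt" -o "$workdir/text.19.zst"

# Levels 20-22 are only used when --ultra is given as well.
zstd -q --ultra -22 "$workdir/text.txt" -o "$workdir/text.22.zst"

for level in 19 22; do
    echo "level $level: $(( $(wc -c < "$workdir/text.$level.zst") )) bytes"
done

# The ultra-compressed file decompresses like any other .zst file.
zstd -dc "$workdir/text.22.zst" | cmp -s - "$workdir/text.txt" && echo "round trip OK"
```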
The advanced options for compression are defined in the option:
--zstd=OPTION=SETTING,OPTION=SETTING
The easiest to use is strategy= (strat=). This option can be completed with a number from 0 to 7, in which 0 is the fastest and 7 the most compressed. (Recent releases number the strategies from 1 to 9 instead, so check the man page for your version.) Each strategy contains a number of methods and searches the file being compressed for an opportunity to use them. This search greatly increases both the time and the memory required to compress the file. However, the use of 7 can double the compression for a file.
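The strategy option could be exercised as shown below. The value 5 is chosen only because it is valid under both the older and the newer numbering; the file names are hypothetical. Because strat= is documented as a short form of strategy=, the two commands are expected to produce identical output.

```shell
workdir=$(mktemp -d)
yes 'strategy option sample text' | head -c 200000 > "$workdir/text.txt"

# Long form of the advanced option.
zstd -q --zstd=strategy=5 "$workdir/text.txt" -o "$workdir/long.zst"

# Short form of the same option.
zstd -q --zstd=strat=5 "$workdir/text.txt" -o "$workdir/short.zst"

cmp -s "$workdir/long.zst" "$workdir/short.zst" && echo "strategy= and strat= match"
```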
Other advanced options for compression can override any of the parameters used in zstd's compression algorithm. For instance, hashLog=BITS sets the maximum number of bits for a hash table; a larger table causes fewer collisions, which usually makes compression faster at the cost of more memory. Unfortunately, the man page lists the algorithm options with only a brief explanation of what they do, so most users will have to experiment blindly or else find other sources of information to understand what is being adjusted. Any algorithm option not specifically altered will use its default settings.
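Several advanced options can be combined in one comma-separated list, as in this sketch. The values strategy=4 and hashLog=18 are arbitrary choices for illustration rather than recommendations, and the file names are hypothetical. Whatever parameters are used for compression, decompression needs no matching options.

```shell
workdir=$(mktemp -d)
yes 'advanced parameter sample text' | head -c 200000 > "$workdir/text.txt"

# Start from level 3, then override two algorithm parameters;
# every parameter not listed keeps the default for that level.
zstd -q -3 --zstd=strategy=4,hashLog=18 "$workdir/text.txt" -o "$workdir/tuned.zst"

# Baseline at plain level 3 for comparison.
zstd -q -3 "$workdir/text.txt" -o "$workdir/plain.zst"

echo "tuned: $(( $(wc -c < "$workdir/tuned.zst") )) bytes"
echo "plain: $(( $(wc -c < "$workdir/plain.zst") )) bytes"

# A tuned file is decompressed in the usual way.
zstd -dc "$workdir/tuned.zst" | cmp -s - "$workdir/text.txt" && echo "round trip OK"
```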