Improving Linux package management

Delivery Service

© Photo by Mika Baumeister on Unsplash

© Photo by Mika Baumeister on Unsplash

Article from Issue 247/2021
Author(s):

Linux package managers work too slowly. The experimental distri research project investigates ways to speed up package management.

Package managers differ from each other not only in terms of the package formats they use, but also in their execution speed. Developer Michael Stapelberg has been working on how to streamline package managers such as Debian's Apt or Fedora's DNF to make them faster. He has written blog posts on the subject, given talks, and created an experimental distribution, distri [1] to explore the problem.

Distri is a minimal, command-line distribution for reviewing package management concepts in Linux. This is purely a feasibility study and is not suitable for production use. Distri seeks to be the simplest distribution that is still useful.

Criticism of Debian

Stapelberg, currently a Google developer, was a package maintainer at Debian from 2012 to 2019. Besides maintaining packages of Debian he wrote the i3 Window Manager and the Debian Code Search engine.

In March 2019, he announced in frustration his withdrawal from Debian development with a harsh criticism of the Debian project [2]. He referred to the practices and tools used to develop, manage, and support the software in the distribution saying that they were often more of a hindrance than a help.

In Stapelberg's opinion, there is a lack of effective tools to implement comprehensive changes in a timely manner. Also, according to Stapelberg, the specifications laid down in the Debian guidelines and pushed by the quality assurance tool Lintian [3] unduly hinder the implementation of necessary technical changes.

Too Much, Too Slow

In particular, Stapelberg vehemently criticizes the package management – not only in Debian, but Linux in general. Above all, he dislikes that package managers do too much and do it too slowly. To this end, he first conducted a series of tests with small and larger packages using the Apt, DNF, pacman, Nix, and apk package managers, contrasting the metadata downloaded and the time and bandwidth used [4].

What he discovered was that the metadata downloaded was often out of proportion to the package's size, and even more so the smaller the package. Any Debian user can easily check this by looking in the terminal at the end of an apt update to see how many megabytes of metadata the operation downloads. This amount often exceeds the total size of the actual packages to be updated (Figure 1).

Figure 1: When installing and updating packages, the package manager also downloads metadata. The volume of metadata is particularly high in Fedora and Debian, which delays package installation.

The maintainer scripts that the package manager runs during installation also slow down the process, as well as prevent parallel installation of packages. In Stapelberg's view, these scripts can just as easily run the first time the application is launched. Debian processes maintainer scripts using files such as preinst and postinst when installing or updating packages; they contain distribution-specific customizations [5]. Debian has 8,620 maintainer scripts. The Debian Policy Manual web page provides flowcharts that exemplify the underlying complexity here [6]. In Fedora, scriptlets perform this task [7].

Inefficient

At Google, Stapelberg learned a great deal about effectively updating large amounts of data. Updates to distributions proved fairly ineffective in his research. He did a test using common systems to install one small package and one large package, recording times.

Although the test computer's network connection supported speeds of around 115MBps, none of the distributions achieved more than just under 11MBps of throughput, with most achieving around 3MBps. Alpine Linux performed best, being the fastest in both tests at 10.8MBps and also requiring the lowest volume of metadata to complete the task. Arch Linux and NixOS were in the middle, while Debian and Fedora performed worst.

As an example, for the small 75KB ack package, Fedora had to download a massive 114MB and the install took 33 seconds. Alpine, on the other hand, was content with 10MB installed in one second. When installing the virtualization software Qemu, Alpine managed with 26MB, while Fedora needed almost 10 times the amount of data at 226MB.

In view of such numbers, it is little wonder that Fedora is pretty sluggish when it comes to updates and installations. In both scenarios, Debian came in second to last, because downloading the large volume of metadata also delayed the process. Besides Alpine, Arch Linux also had one of the faster package managers in the test.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Welcome

    When I started this job 15 years ago, Debian was a really big deal. The whole Linux community paid attention to Debian elections, and this magazine even had a regular monthly column devoted to Debian affairs. It would not be an exaggeration to reveal that I even knew a guy who had the Debian spiral tattooed on his neck.

  • Listaller

    A single uniform software installation tool for all distributions is still just a dream, but thanks to Listaller, this goal has come a little bit closer.

  • Why Debian Policy is important to package quality
  • LINUX WORLD NEWS
  • Getting Rid of Old Kernels

    When you update the kernel, the old version remains on the disk. If you clean up, the reward is several hundred megabytes of free disk space.

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95

News