The future of Linux updates

Upgrade 2.0

Article from Issue 107/2009
Author(s):

Constant security updates can give you peace of mind, but the inconvenience of download – install – reboot can be a pain. We show you how to avoid downtime while staying up to date.

Few things are more annoying than testing something and having a software package change versions unexpectedly, so I have automatic updates disabled on my Fedora 11 test machine. Recently, however, when I ran yum update, I received a bit of a shock: Although I had updated the machine recently, I still had another 250MB to download and install (Figure 1). Thank goodness for broadband – if I were still on dial-up or on a slower form of broadband, I would have been up the creek.

Smaller Is Better

Who cares if I need to download 100MB every week or so and occasionally reboot to keep my system up to date? If you can make updates small enough that they don't significantly affect the amount of bandwidth being used (and for wireless systems, this means extending battery life), then it's much more likely that you can automate updates and keep everyone secure. Furthermore, with embedded devices becoming more complicated (running Java, web servers, email servers, and so forth), there is a greater chance that everything needs security updates.

For package files such as RPM and DPKG, one simple solution is simply to leave out any files in the update that haven't changed. For example, on Fedora 11 the package openoffice.org-core is 92MB in size, but between the original release and the first update of this file there are about 30.3MB of identical files (images, XML files, and so on) in addition to the overhead of including these files within the RPM (path information, hash values of the files, etc.). But even this isn't much of a gain.

How can you get the update really small? Binary diff tools such as bsdiff [1] can be used to compare two executable files and create a diff file that allows you to upgrade the old binary with a relatively small diff file. For example, the httpd binary as shipped with the Apache package in Fedora 11 is a 322KB executable; however, the bsdiff for the most recent update is a mere 8KB.

Unfortunately, bsdiff doesn't always work well – a single changed function call or new function at the beginning of the binary file can result in two binary files that look different enough to cause the patch file to be extremely large (for example, comparing two recent Fedora 11 kernel files that are 1.8MB in size results in a bsdiff patch file that is 1.8MB in size).

A Better Way

Google's new Chrome web browser includes an automatic update functionality, Courgette [2], which will no doubt be rolled into their Google Chrome operating system (a lightweight Linux platform with the Chrome web browser on top). Courgette basically does the same thing bsdiff does, but with slightly more intelligence. Instead of just comparing two raw executables in binary format, Courgette converts the binaries to primitive assembly language and does the diffing at that level. By bringing it up a level, Courgette is able to compare the symbol tables for the executables with a much higher chance of finding matching strings, which results in a smaller patch because it can ignore more of what hasn't changed, even if the location has changed slightly. Google claims a roughly 90 percent decrease in the size of patches when compared with the use of just bsdiff.

Although you can solve the problem of size, that still leaves you with the rather sticky problem of dealing with kernel updates that require a system reboot. Although the kernel might reboot quickly, some services, such as VMware or various databases, can take minutes to come online fully. In the case of a database, it could take hours for the cache to become properly populated.

Other services, such as VOIP servers, might simply be in use 24 hours a day, seven days a week, with no ability to be shut down. Luckily, this problem has led to the creation of Ksplice [3]. Ksplice, which is relatively simple in theory, creates replacement code (what you have when compared with the update), resolves symbols in the replacement code, and then checks the safety of the patch (e.g., determining whether any functions being replaced have not been in-lined elsewhere and need updating there too).

Ksplice then inserts jump (JMP) instructions into the original kernel that point to the new code, and – presto – you have an updated kernel without rebooting. The interesting thing about Ksplice is that it can also be applied to services running in user space, which could eventually result in systems that can be updated without rebooting or even having to restart services. If you want to give Ksplice a spin, it's available for Ubuntu from the package archive on their website [4].

Role of Vendors

On the other hand, why don't vendors ship a full file and a partial file that only contains the update? Part of the problem is the overhead: Do you only keep partial packages for each possible upgrade (i.e., a partial for version 1.0 to 1.1, 1 to 1.1.1, 1.0 to 1.2, and so on), or just for each step of the upgrade (i.e., a partial for version 1.0 to 1.1, 1.1 to 1.1.1, 1.1.1 to 1.2, and so on). The good news is that certain update-related technologies, such as geolocation, are being handled, and I suspect that as tools like Courgette and Ksplice mature, you'll see vendors embracing them.

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus

Direct Download

Read full article as PDF:

News