Kernel Issues - The "Hurricane Katrina" of programming

Paw Prints: Writings of the maddog
Last week I spent two days at the Red Hat Summit in Boston. Unlike at a lot of the conferences I attend, I actually spent much of my time in technical talks, listening to some of the things that Red Hat will be putting into RHEL 6.0, which is due out in a short time.1
I enjoy listening to technical talks, particularly ones about kernel issues, since I used to teach operating system design. I have taught other types of programming (database, compiler design, networking, graphics), but in my opinion most application-level programming (including libraries) is a “calm sea” compared to the “Hurricane Katrina” of kernel programming.
One of the areas of interest to me was the various file systems supported in the upcoming RHEL: not only the attributes of the filesystems (which I could also get by reading various white papers and reports on the Internet), but also some of the lower-level “grunt work” that needs to be done to make sure a file system is dependable and efficient under different loads.
As an example, in order to get better performance out of some newer filesystems, the device drivers had been changed to take advantage of various hardware features in iSCSI and SATA. These standard features had been defined in the hardware architecture for some time, but had never been utilized by Linux. Unfortunately, some of the earlier implementations of these standards did not implement the features correctly, so the Red Hat engineers had to go back and re-qualify some of the hardware controllers to make sure they would work as expected with the modified drivers.
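The talks did not spell out exactly which features were involved, but TRIM/discard on SATA drives is a good example of the kind of long-standing standard capability that drivers and filesystems only later began to use. As a rough sketch (not code shown at the Summit, and the device name is just a placeholder), the following program reads the discard attributes the kernel exports in sysfs to see whether a given block device advertises the capability:

```c
/* Minimal sketch: check whether a block device advertises TRIM/discard
 * support by reading the queue attributes the kernel exports in sysfs.
 * The default device name "sda" is only an example. */
#include <stdio.h>
#include <stdlib.h>

static long read_queue_attr(const char *dev, const char *attr)
{
    char path[256];
    FILE *f;
    long value = -1;

    snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", dev, attr);
    f = fopen(path, "r");
    if (!f)
        return -1;                 /* attribute not present on this kernel */
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "sda";
    long granularity = read_queue_attr(dev, "discard_granularity");
    long max_bytes   = read_queue_attr(dev, "discard_max_bytes");

    if (granularity > 0 && max_bytes > 0)
        printf("%s: discard supported (granularity %ld, max %ld bytes)\n",
               dev, granularity, max_bytes);
    else
        printf("%s: discard not reported by this device or kernel\n", dev);
    return 0;
}
```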
Another area of testing is file system limits. Some of the newer filesystems can expand to 100 TeraBytes or more.2 It is one thing to write code that can “theoretically” handle 100 TeraBytes, but quite another to actually set up 100 TB of disk space and exercise the code to verify that it really works correctly, or to get true measured performance.
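One common way to approximate a huge filesystem without owning the disks is to build it on top of a large sparse file, typically attached to a loop device. The sketch below just creates the sparse backing file with ftruncate(); the path and size are arbitrary, and the host filesystem must itself allow files that large. It is a handy trick for exercising code paths, but, as noted above, it is no substitute for real hardware when you want true measured performance:

```c
/* Minimal sketch: create a large sparse file with ftruncate(). A mkfs tool
 * can then be pointed at it (via a loop device) to exercise the
 * large-filesystem code paths without that much physical disk. */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/tmp/huge-backing-file";
    /* 100 TB expressed in bytes; the file consumes almost no real space
     * until blocks are actually written. */
    off_t size = (off_t)100 * 1024 * 1024 * 1024 * 1024;

    int fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }
    if (ftruncate(fd, size) != 0) {  /* fails if the host fs caps file size */
        perror("ftruncate");
        close(fd);
        return EXIT_FAILURE;
    }
    printf("created %s as a sparse %lld-byte file\n", path, (long long)size);
    close(fd);
    return 0;
}
```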
As another example, the issues of kernel programming have been accentuated by virtualization. Kernel locks, which have always been a touchy subject because of race conditions, and clock skew, already a delicate matter on bare-metal operating systems, take on a life of their own when running on a virtualized system.
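As a user-space analogy (not kernel code), the fragment below shows why locking is such a touchy subject: two threads increment a shared counter, and without the mutex the read-modify-write races and the final count comes out wrong. Inside the kernel the same hazard is handled with spinlocks and friends, and under virtualization a virtual CPU holding a lock can be descheduled by the host at the worst possible moment, which is exactly why these problems “take on a life of their own”:

```c
/* Minimal sketch of a race condition: two threads hammer a shared counter.
 * Without the mutex the result is usually wrong; with it the result is
 * deterministic. Compile with: cc -O2 -pthread race.c */
#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000

/* volatile keeps the compiler from collapsing the loop, so the race
 * stays visible in this demonstration. */
static volatile long counter;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    int use_lock = *(int *)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        if (use_lock) pthread_mutex_lock(&lock);
        counter++;                      /* the racy read-modify-write */
        if (use_lock) pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static long run(int use_lock)
{
    pthread_t t1, t2;
    counter = 0;
    pthread_create(&t1, NULL, worker, &use_lock);
    pthread_create(&t2, NULL, worker, &use_lock);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}

int main(void)
{
    printf("without lock: %ld (expected %d)\n", run(0), 2 * ITERATIONS);
    printf("with lock:    %ld (expected %d)\n", run(1), 2 * ITERATIONS);
    return 0;
}
```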
One of my favorite talks of the two-day session was “Kernel Optimizations for KVM” by Rik van Riel. It was Rik's talk that illuminated the issues of clock programming that I alluded to earlier. Rik pointed out that the periodic clock interrupt (typically firing 100 to 1,000 times a second) is not a big deal in a normal Linux kernel running on bare metal, as the CPU has enough time to go to sleep between clock ticks to save energy. However, when you have many virtual machines running at one time on a single piece of hardware, each with a normal Linux “clock”, the virtual machines would wake up the host just to have a “clock tick” registered, which keeps the host CPU from being efficient with electricity.

The other issue arises if a guest system happens to be scheduled out at the moment its clock tick should be delivered: the guest misses the tick and ends up with an incorrect time. As Rik pointed out, an “incorrect time” on a series of virtual servers might mean that a packet going from one system to another appears to arrive either “very late” or “before it was sent”. With current changes in the Linux KVM kernel, the guest system can simply ask the host for the time-of-day information and get the correct “time of day” for its activities.
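On Linux guests this mechanism shows up as the kvm-clock paravirtual clocksource. As a quick sketch (the output is configuration-dependent, so treat it as an example), a guest can check which clocksource it is actually using by reading the standard sysfs attribute:

```c
/* Minimal sketch: inside a guest, report the kernel's current clocksource.
 * On a KVM guest with the paravirtual clock enabled this typically prints
 * "kvm-clock", the mechanism that lets the guest obtain correct time from
 * the host instead of counting its own (missable) ticks. */
#include <stdio.h>

int main(void)
{
    const char *path =
        "/sys/devices/system/clocksource/clocksource0/current_clocksource";
    char name[64];
    FILE *f = fopen(path, "r");

    if (!f) {
        perror("fopen");
        return 1;
    }
    if (!fgets(name, sizeof(name), f)) {
        fprintf(stderr, "could not read clocksource\n");
        fclose(f);
        return 1;
    }
    printf("current clocksource: %s", name);  /* e.g. "kvm-clock" or "tsc" */
    fclose(f);
    return 0;
}
```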
These are the types of issues that make kernel programming “interesting”.
Carpe Diem!
1 Tim Burke, VP of Platform Engineering (and an old friend of mine), struck a great balance, going over the major features in only an hour's time without talking faster than a tobacco auctioneer.
2 Is it just me, or does “100 TeraBytes” not seem that large any more? Perhaps it is because I can buy 2 TB disks at such an inexpensive price....