Kernel Issues - The "Hurricane Katrina" of programming
Paw Prints: Writings of the maddog
Last week I spent two days that the Red Hat Summit in Boston. Unlike a lot of conferences I attend, I actually spent much of my time in technical talks listening to some of the things that Red Hat was going to be putting into RHEL 6.0 which is due out in a short time1.
I enjoy listening to technical talks, particularly ones talking about kernel issues since I used to teach operating system design. I taught other types of programming (database, compiler design, networking, graphics) but in my opinion most application-level programming (including libraries) is a “calm sea” versus the “Hurricane Katrina” of kernel programming.
One of the areas of interest to me was the various file systems being supported in the upcoming RHEL, not only the various attributes of the filesystems (that I could also get by reading various white papers and reports on the Internet) but some of the lower-level “grunt work” that needs to be done to make sure the files system is dependable and efficient under different loads.
As an example, in order to get better performance out of some newer filesystems, the device drivers had been changed to utilize various hardware features in iSCSI and SATA. These standard features had been defined in the hardware architecture for some time, but had never been utilized by Linux. Unfortunately some of the earlier implementations of these standards did not implement these features correctly, so the Red Hat engineers had to go back and re-qualify some of the hardware controllers to make sure that they would work as expected with the modified drivers.
Another area of testing is in file system limits. Some of the newer filesystems can expand to 100 TeraBytes or more2. It is one thing to write the code that “theoretically” can handle 100 TeraBytes, but quite another to actually set up 100 TB of disk space and exercise the code to get actual verification that it is working correctly, or to get true measured performance.
As another example, issues of programming the kernel have been accentuated by issues around virtualization. Kernel locks, which have always been a touchy issue with race conditions, and clock skew issues in bare metal operating systems take on a life of their own when running on a virtualized system.
One of my favorite talks of the two-day session was “Kernel Optimizations for KVM” by Rik van Riel. It was Rik's talk that illuminated the issues of clock programming that I alluded to earlier. Rik pointed out that the normal issue of the clock causing an interrupt every 100 milliseconds was not a big deal in a normal Linux kernel running on bare metal, as the CPU has enough time to go to sleep between clock tics to save energy. However, when you have many virtual machines running at one time on a piece of hardware, with a normal Linux “clock” the virtual machines would wake up the host just to have a “clock tick” registered, which would keep the host cpu from being efficient with electricity. The other issues happens if the client system happens to be swapped out when its clock tic should be registered, the client system will miss its clock tic and have an incorrect time. As Rik pointed out, an “incorrect time” on a series of virtual servers might mean that the packet that goes from one system to another may show up either “very late” or “before it was sent”. With current changes in the Linux KVM kernel the guest system can simply ask the host for the time-of-day information, and get the correct “time-of-day” for its activities.
These are the types of issues that make kernel programming “interesting”.
1Tim Burke, VP of Platform Engineering (and an old friend of mine), did a great balance of going over the major features in only an hour's time without talking faster than a tabacco auctioneer.
2- is it just me, or is it that “100 TeraBytes” does not seem that large any more? Perhaps it is because I can buy 2 TB disks at such an inexpensive price....comments powered by Disqus
Version 16 of the popular Linux desktop reveals new tools, edge-snapping, and performance improvements.
Symantec says Linux-Darlioz burrows in through PHP.
Dell renews its quest for the ultimate developer machine.
Innovative back door looks like normal SSH traffic.
One of CeBITs most successful forums opens the new year with a new name. The popular Open Source Forum continues in 2014 under the name Special Conference: Open Source. This year, the forum will be bigger and offer a wider range of possibilities for sponsors.
New release offers better graphics drivers and expands filesystem support.
New mail protocol will shut out the NSA and prevent snooping on metadata.
A new web application helps users visualize distributed denial-of-service attacks.
Ubuntu 13.10 takes a step toward convergence, with lots of mobility, but Mir only partly here.
Galileo board is targeted to embedded developers and educational institutions.