Zack's Kernel News
Auto-Suspend
Rafael J. Wysocki pointed out that phones and other small devices necessitated the ability to suspend the system to RAM or to a low-power mode automatically after a certain period of idleness. He said that to implement these features in Linux, developers had to figure out what parts should be in the kernel and what parts in user-land. He enumerated the various questions, such as how to tell when the system was idle. In response, Arjan van de Ven and Benjamin Herrenschmidt opened up the debate, essentially agreeing that a device driver would have to make those kind of determinations. But as Roland Dreier pointed out, the situation is more complex in the case of highly integrated hardware because different parts of the system can be suspended at different times, and various parts of the system can be cleared to be suspended if various other parts of the system are already suspended. On a PC-like system, he said, a device driver would be sufficient to manage system suspension for the whole machine, but for highly integrated systems, many more options must be considered.
The discussion got very technical. The problem appears to be that the kernel shouldn't concern itself with the state of various user processes (i.e., userspace code should watch all that), whereas userspace code shouldn't concern itself with shutting down various internal parts of the system (i.e., the kernel should handle all that). This issue supplies plenty of examples, counter-examples, and special historical cases to back up everyone's point of view. Ultimately, a fascinating topological border will no doubt be established between the user-side and kernel-side obligations. And whatever decision is ultimately implemented, it will undoubtedly set precedent for the next big question of where the kernel ends and the rest of the world begins.
To Dream the Unbreakable System
Tarkan Erimer had the idea of a "failover" kernel, in which two kernels would run simultaneously, with the primary kernel doing all the work and the failover kernel taking over in the event of a panic or other crash and simply continuing to run the system without interruption. Willy Tarreau pointed out that scheduling both kernels could be a thorny problem. Also, he explained that most system crashes were the result of non-recoverable errors, like memory corruption or a driver bug. In those cases, a failover kernel wouldn't be able to sanely recover the system because many conditions would require a reset to restore to a known state. Memory corruption could end up worse than before. He also mentioned that he has implemented a similar concept with the use of a watchdog timer that would reboot to a second kernel image if the system crashed for any reason.
« Previous 1 2 3
Buy this article as PDF
(incl. VAT)