Zack's Kernel News

Kernel News

Article from Issue 170/2015

Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

To Add Capsicum or Not To Add Capsicum

David Drysdale noticed that FreeBSD had a new security feature called Capsicum that might work well in Linux. He gave a link to a paper from the 19th USENIX Security Symposium in 2010 describing the project [1].

The idea was to implement fine-grained security privileges so that applications could isolate their own abilities and prevent an attacker from forcing them to do the wrong thing. David gave the example of tcpdump constraining itself to read only from the network file descriptor and write only to standard output. An interesting aspect of this type of security is that the application must be aware of the security features provided by the operating system and include code to take advantage of them.
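The self-restriction David describes can be modeled in a few lines of Python. This is only a toy sketch of the idea, not Capsicum's actual API: the `Right` flags, the `Descriptor` class, and its `limit` and `allows` methods are hypothetical names invented for illustration. The point it shows is that a program narrows the rights on its own descriptors, and a narrowed descriptor cannot be widened again.

```python
from enum import Flag, auto


class Right(Flag):
    """Hypothetical per-descriptor rights (not the real Capsicum rights)."""
    READ = auto()
    WRITE = auto()
    SEEK = auto()


class Descriptor:
    """Toy stand-in for a file descriptor that carries a rights mask."""

    def __init__(self, name, rights):
        self.name = name
        self.rights = rights

    def limit(self, rights):
        # Rights may only shrink: requesting a right not already held fails.
        if rights & ~self.rights:
            raise PermissionError(f"{self.name}: cannot add rights")
        self.rights = rights

    def allows(self, needed):
        # The check that every operation on the descriptor would pass through.
        return (needed & self.rights) == needed


# A tcpdump-like program restricting itself after opening its descriptors.
net = Descriptor("network socket", Right.READ | Right.WRITE | Right.SEEK)
out = Descriptor("standard output", Right.READ | Right.WRITE)
net.limit(Right.READ)    # read only from the network
out.limit(Right.WRITE)   # write only to standard output

# An attacker who takes over the process cannot widen the rights again.
try:
    net.limit(Right.READ | Right.WRITE)
    widened = True
except PermissionError:
    widened = False
```

In this model, the application has to call `limit` itself, which mirrors the observation that the program must know about the security feature and opt in.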

David posted some of his implementation ideas, but Eric W. Biederman felt that most of these were badly conceived. For example, Capsicum required that the kernel police the rights checks of file descriptors, and David thought that the best place to do that was in the code that converted a userspace file descriptor into a kernel-space file pointer structure. This, in turn, David said, would require implementing an extensive and invasive abstraction layer within the kernel code. However, Eric pointed out that the abstraction layer wasn't necessary, because filesystem "capabilities" had already existed for 20 years and (with some modifications) could perform a similar function.

Eric had similar issues with other implementation details in David's proposal. Eric also felt that Capsicum itself was not perfectly designed and was missing fundamental features like the ability to revoke privileges.

David disagreed with Eric's take on the issue. He agreed that "capabilities" were similar, but he said that Capsicum offered much finer-grained control, which was useful and would be difficult to implement on top of a coarse-grained system like capabilities.

David also pointed out that the issue of whether or not to support revoking privileges was part of an ongoing debate in the Capsicum community, and not simply an oversight or poor design. As Ben Laurie put it in an email discussion on the cl-capsicum-discuss list in 2011, "It would require additional book-keeping to find and revoke outstanding capabilities, which requires knowing how to reach capabilities, and then whether they are derived from the capability being revoked. It also requires an authorisation model for revocation. The former two points mean additional overhead in terms of data structure operations and synchronisation."
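The bookkeeping Ben Laurie mentions can be illustrated with a small Python sketch. It is an assumption-laden toy, not a description of how Capsicum or any kernel tracks capabilities: the `Capability` class and its `derive` and `revoke` methods are hypothetical. Each capability records the capabilities derived from it, so that revocation can find them, and that extra list is exactly the overhead under discussion.

```python
class Capability:
    """Toy capability that remembers everything derived from it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.revoked = False
        # Extra bookkeeping needed only for revocation: who was derived from me?
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def derive(self, name):
        # A revoked capability cannot hand out new derived capabilities.
        if self.revoked:
            raise PermissionError(f"{self.name} has been revoked")
        return Capability(name, parent=self)

    def revoke(self):
        # Walk the derivation tree and revoke this capability and every
        # capability derived from it; return how many were newly revoked.
        count = 0
        stack = [self]
        while stack:
            cap = stack.pop()
            if not cap.revoked:
                cap.revoked = True
                count += 1
            stack.extend(cap.children)
        return count


root = Capability("root")
a = root.derive("a")
b = a.derive("b")      # derived from a, so revoking a must reach b
c = root.derive("c")   # sibling of a, unaffected by revoking a
revoked_count = a.revoke()
```

Even this toy version leaves out the authorization model (who is allowed to call `revoke`) and any synchronization between concurrent users, the other costs named in the quote.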

Apparently the debate has not progressed much further in the past few years. Eric also made the point that users could get the kind of fine-grained security they wanted in other ways. As he put it, "What you can implement today if you want fine grained limitations like this is to create a mount namespace with exactly the subdirectory tree you want to allow access to and to return a file descriptor that points into that mount namespace. … In fact that solution is sufficiently performant and simple that even if you came up with a better user space interface for it that is how we would want to implement it."

The discussion was slightly disjointed and didn't come to a firm conclusion, but it was clear that Capsicum is an ongoing controversy. It's possible that the Capsicum design is good and that its limitations can be overcome by simple modifications to the basic design. Or, it's possible that there are simpler alternatives to Capsicum that implement similar security measures. One thing both sides of the debate seem to agree on is that finer-grained security would be a good thing.

Microcode Byte Alignment

Henrique de Moraes Holschuh recently noticed what seemed to be a discrepancy between the Intel Software Developer's Manual [2] and the way the Linux kernel behaved on the relevant architectures.

Specifically, section 9.11.6 described an update loader that would load microcode updates into several different architectures – the Pentium 4, the Intel Xeon, and the P6 family processors. At the very end of section 9.11.6, the document stated that microcode updates required 16-byte boundary alignment.

Henrique had been adding some stricter checks to the Intel microcode driver when he noticed that the driver did not enforce the 16-byte alignment constraint.

Looking deeper, Henrique found that certain architectures in the specified group didn't actually require 16-byte alignment. The Xeon X5550 and the second-generation i5 seemed to require only 4-byte alignment rather than 16.
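The difference between the two constraints is easy to state in code. The following Python sketch is for illustration only and is not taken from the kernel driver; the helper names `is_aligned` and `align_up` and the sample addresses are assumptions. It shows that an address can satisfy 4-byte alignment while failing the 16-byte requirement stated in the manual, and how an address would be rounded up to the next 16-byte boundary.

```python
def is_aligned(address, boundary):
    """Return True if address is a multiple of boundary (a power of two)."""
    return (address & (boundary - 1)) == 0


def align_up(address, boundary):
    """Round address up to the next multiple of boundary (a power of two)."""
    return (address + boundary - 1) & ~(boundary - 1)


# A hypothetical buffer address that is 4-byte aligned but not 16-byte aligned.
buffer_address = 0x1004
ok_for_4 = is_aligned(buffer_address, 4)
ok_for_16 = is_aligned(buffer_address, 16)
fixed_address = align_up(buffer_address, 16)
```

A loader that enforces the documented rule would either reject the address above or copy the update to a 16-byte-aligned location before handing it to the processor.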

Henrique suggested some code fixes for the kernel to match the documented behavior. Borislav Petkov, however, said that the kernel itself seemed to be working fine and that, instead, it was Intel's documentation that needed to be updated to reflect the fact that certain architectures didn't require 16-byte alignment for microcode updates.

Henrique replied, "I often wonder how much of the Intel SDM is really a fairy tale … it certainly has enough legends from times long past inside ;-) But just like old stories, should you forget all about them, they sometimes grow fangs back and get you when you're least prepared."

He insisted that this was a kernel problem, because the kernel code was not actively checking for alignment. This meant that userspace code could selectively do the wrong thing and potentially mess up the kernel.

Borislav replied, "It seems to me you're looking for issues where there are none. We simply have to ask Intel people what's with the 16-byte alignment and fix the SDM, apparently. If the processor accepts the non-16-byte-aligned update, why do you care?"

At this point, H. Peter Anvin chimed in, saying that even if the current kernel behavior seemed to work fine, it should still be fixed so as to conform to the SDM. As he put it, "The SDM is the contract between the hardware and the software. This doesn't mean that not following the SDM doesn't work, but following the procedure in the SDM is what is guaranteed to work."

Bill Davidsen also agreed, remarking, "if the requirement is enforced in some future revision [of the chip], and [microcode] updates then fail in some insane way, the vendor is justified in claiming 'I told you so'."

In a later thread, Henrique submitted patches to fix the alignment issue.

It's interesting to me that in some cases, like certain parts of POSIX, the Linux kernel simply goes its own way; but, for Intel's developer manual, there seems to be a much stronger push to match it precisely in the kernel code. This could be because POSIX defines software behaviors, while Intel's developer manuals define hardware behaviors. The kernel developers can't really redesign hardware behaviors.

Zack Brown

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.


  1. "Capsicum: Practical Capabilities for UNIX" by Robert Watson et al., 19th USENIX Security Symposium, 2010
  2. Intel Corporation, Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A
