Zack's Kernel News
Zack Brown reports on: Trusted Computing and Linux; Load Balancer Improvements; and New Random Number Handling.
Trusted Computing and Linux
Sumit Garg posted a new version of the Trusted Keys subsystem for the Linux kernel, essentially targeting support for Trusted Platform Module (TPM) devices.
The general idea behind TPM technology is that the TPM chip manages access to a given device by encrypting its firmware and creating a corresponding hash value that is stored on a central server. When the system tries to use the device, the TPM hashes the firmware and compares it with what's on the central server. If they match, the user can use the device. Otherwise, they can't.
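The hash-and-compare step described above can be sketched in a few lines. This is only an illustration of the idea as this article describes it, not of how a real TPM works. The function names, the in-memory dictionary standing in for the central server, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import hmac


def measure(firmware: bytes) -> str:
    """Return a hex digest of a firmware image (SHA-256 is an assumption)."""
    return hashlib.sha256(firmware).hexdigest()


def is_allowed(device_id: str, firmware: bytes, reference: dict) -> bool:
    """Compare the current measurement with the stored reference value."""
    expected = reference.get(device_id)
    if expected is None:
        return False  # unknown device: nothing to compare against
    # compare_digest avoids leaking, through timing, where the digests differ
    return hmac.compare_digest(measure(firmware), expected)


# Hypothetical reference store, standing in for the central server
reference_store = {"nic0": measure(b"firmware v1")}
```

With this store, an unmodified image for nic0 passes the check, while a changed image or an unknown device is refused.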
The goal is to prevent computer system owners from controlling their own systems and to give control to large companies such as Microsoft, who can then make decisions about what software is or is not allowed to be used on that system.
The benefits are enormous! For example, streaming copyrighted content can be handled without fear of piracy, because the large company can prevent pirating software from running on the system. That's the theory.
The drawback is that users can't control their own computers, and they get locked into a dependent relationship with whichever company controls their system. Naturally, there is a lot of money and energy being put into getting these types of patches through the gauntlet of kernel maintainers and up through Linus Torvalds, for inclusion in the main kernel tree.
Linus has traditionally been willing to accept Trusted Platform patches, but only to the extent that they helped, rather than hindered, users' abilities to control their own systems. You can imagine the debates between developers trying to implement features to wrest control of users' systems, and Linus cherry-picking only those aspects of those patches that actually kept control in the hands of users.
In the current discussion, Mimi Zohar asked for more information about the key signing and verification process. In particular, she wanted to know if the TPM's secret key, which it used for generating all the other keys, could ever be accessed by the user. Sumit replied that this was not possible. The key was permanently locked into the TPM chip and represented part of the system-on-a-chip (SoC) service offered by the company producing the TPM device.
Sumit's code was split into several patches, and these were examined independently.
The first two patches enabled registering shared memory with the Trusted Execution Environment (TEE). The TEE is the environment that needs to be created by the various TPM devices, such that it has control over the movement of all data, to ensure that nothing happens that goes against the controlling company's policies. If a fully isolated environment cannot be created, the company can't verify its own control.
The third patch added support for blocking user access to the TEE to obtain the TPM's trusted keys. If the users could access those trusted keys, they could potentially violate the integrity of the TEE.
And Sumit's remaining several patches added support for the TEE's trusted keys.
Janne Karhunen remarked that he had implemented something similar to this. However, instead of supporting an external controlling company, he said, "my thought was to support any type of trust source. Remote, local, or both. Just having one particular type of locally bound 'TEE' sounded very limited, especially when nothing from the TEE execution side is really needed for supporting the kernel crypto. What you really need is the seal/unseal transaction going somewhere and where that somewhere is does not matter much. With the user mode helper in between, anyone can easily add their own thing in there."
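Janne's point that the seal/unseal transaction only has to go "somewhere" can be pictured as a small interface with interchangeable back ends. The sketch below is hypothetical throughout: the class names are invented, and the toy back end uses a SHAKE-256 keystream with an HMAC tag purely to keep the example self-contained. It is not real cryptography and is not taken from either developer's code.

```python
import hashlib
import hmac
import os
from abc import ABC, abstractmethod


class TrustSource(ABC):
    """Hypothetical interface: any back end that can seal and unseal."""

    @abstractmethod
    def seal(self, plaintext: bytes) -> bytes: ...

    @abstractmethod
    def unseal(self, blob: bytes) -> bytes: ...


class ToyLocalSource(TrustSource):
    """Toy stand-in for a local trust source. NOT secure, illustration only."""

    def __init__(self, root_key: bytes):
        self._root_key = root_key  # never leaves this object

    def _keystream(self, nonce: bytes, length: int) -> bytes:
        return hashlib.shake_256(self._root_key + nonce).digest(length)

    def seal(self, plaintext: bytes) -> bytes:
        nonce = os.urandom(16)
        stream = self._keystream(nonce, len(plaintext))
        body = bytes(p ^ s for p, s in zip(plaintext, stream))
        tag = hmac.new(self._root_key, nonce + body, hashlib.sha256).digest()
        return nonce + tag + body  # 16-byte nonce, 32-byte tag, then body

    def unseal(self, blob: bytes) -> bytes:
        nonce, tag, body = blob[:16], blob[16:48], blob[48:]
        expected = hmac.new(self._root_key, nonce + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("sealed blob failed integrity check")
        stream = self._keystream(nonce, len(body))
        return bytes(c ^ s for c, s in zip(body, stream))
```

A caller that only holds a TrustSource reference does not need to know whether the back end is local, remote, or both, which is the flexibility Janne was arguing for.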
Sumit pointed out that a generic TEE, of the sort Janne had described, was already in the Linux kernel and pointed to Documentation/tee.txt for reference.
Sumit also mentioned that his patches supported arbitrary "trust sources," so long as they implemented a few special library functions.
But Sumit also questioned some of Janne's statements, particularly the idea of having a user-mode helper standing in the middle of the trusted network. Sumit said, "Isn't actual purpose to have trusted keys is to protect user-space from access to kernel keys in plain format? Doesn't user mode helper defeat that purpose in one way or another?"
Janne remarked in reply, "CPU is in the user mode while running the code, but the code or the secure keydata being [used] is not available to the 'normal' userspace. It's like microkernel service/driver this way. The usermode driver is part of the kernel image and it runs on top of a invisible rootfs." Janne continued, "I chose the userspace plugin due to this; you can use userspace aids to provide any type of service. Use the crypto library you desire to do the magic you want."
The debate continued in very polite terms, but the battle lines were, once again, clearly drawn. Janne underscored the issue in a subsequent email, saying, "Does the TEE you work with actually support GP [Global Platform standards] properly? Can I take a look at the code?" The Global Platform TEE standard is an open framework for multiple service providers to work together to include their separate products in the secured TEE environment.
Janne continued, "Normally the TEE implementations are well-guarded secrets and the state of the implementation is quite random. In many cases keeping things secret is fine from my point of view, given that it is a RoT [Root of Trust] after all. The secrecy is the core business here. So, this is why I opted the userspace 'secret' route – no secrets in the kernel, but it's fine for the userspace."
That's the key debate: "No secrets in the kernel" means the human owner of the computer has control of the system and can implement anything they want in conjunction with the TEE.
Janne also remarked, "The fundamental problem with these things is that there are infinite amount of ways how TEEs and ROTs can be done in terms of the hardware and software. I really doubt there are 2 implementations in existence that are even remotely compatible in real life. As such, all things TEE/ROT would logically really belong in the userland."
From the perspective of the corporate control advocates, however, giving the machine owner this level of control reduces the TEE's security. As Sumit put it:
"In our case TEE is based on ARM TrustZone which only allows TEE communications to be initiated from privileged mode. So why would you like to route communications via user-mode (which is less secure) when we have standardized TEE interface available in kernel?"
He asked Janne to "elaborate here with an example regarding how this user-mode helper will securely communicate with a hardware based trust source with other user-space processes denied access to that trust source?"
Janne explained, "The other user mode processes will never see the device node to open. There is none in existence for them; it only exists in the ramfs based root for the user mode helper."
Janne added, "Layered security is generally a good thing, and the userspace pass actually adds a layer, so not sure which is really safer?"
As I read this exchange, Janne is attempting to goad Sumit into affirming that the additional security he wants is exactly the elimination of the machine owner's ability to keep control. Janne is essentially saying, "Security issues? What security issues?" and inviting Sumit to say that it's still possible for the machine owner to insert whatever they want into the TEE pipeline, which, of course, is exactly what Linux itself is supposed to be able to do.
But Janne got more and more explicit as the conversation proceeded. At one point he said, "The fundamental problem with the 'standardized kernel tee' still exists – it will never be generic in real life. Getting all this [patch submission] in the kernel will solve your problem and sell this particular product, but it is quite unlikely to help that many users."
And, even more explicitly, Janne remarked, "there is no way to convince op-tee or any other tee to be adopted by many real users. Every serious user can and will do their own thing, or at very best, buy it from someone who did their own thing and is trusted. There is zero chance that samsung, huawei, apple, nsa, google, rambus, payment system vendors … would actually share the tee (or probably even the interfaces). It is just too vital and people do not trust each other anymore."
The discussion petered out shortly afterwards. However, Sumit did not give up and submitted more patches later. Again, the owner-friendly elements were seen as acceptable, while the rest was seen as still problematic.
In this kind of debate, I ask myself if these sorts of features are inevitable in Linux. Will Linux definitely some day support the kind of Trusted Computing platform that could lock users out of controlling their own system? In other words, is there some sort of conceivable scenario in which these companies sneak a certain set of features through the development process and then Linus finds himself unable to undo those changes, because it would break too much user space that has already come to depend on it?
Another way of putting it might be: What if we discovered, today, that a basic element of networking could be used to implement this kind of Trusted Computing in Linux? Would Linus be willing to remove that element, knowing that it was generally regarded as essential? Or would he accept as inevitable the creation of these Linux-based Trusted Computing features?
Load Balancer Improvements
Vincent Guittot pointed out that the Linux load balancer had gotten a bit out of whack recently. Various improvements had made certain heuristics pointless, but those heuristics had not yet been removed. He also pointed out that not all CPU imbalances were based on load, while the load balancer calculated everything based on load. Consequently, Vincent felt there was room for further improvement along those lines.
He posted some patches to clean things up. Among other things, he consolidated the balancing logic into only three functions: one to identify the busiest group of processes, another to check if there's an imbalance, and a third to decide which processes to move in order to balance the load better.
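The three-step structure can be illustrated with a toy model. This is a sketch under heavy simplifying assumptions, not Vincent's code: the function names are hypothetical, each task is reduced to a single integer load, and imbalance is measured by load alone, even though Vincent's point was that not every imbalance is load-based.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    tasks: list = field(default_factory=list)  # one integer load per task

    @property
    def load(self) -> int:
        return sum(self.tasks)


def busiest_group(groups):
    """Step 1: identify the group carrying the most load."""
    return max(groups, key=lambda g: g.load)


def imbalance(busiest, local):
    """Step 2: load to shift so both groups end up roughly equal (0 if none)."""
    return max(0, (busiest.load - local.load) // 2)


def tasks_to_move(busiest, amount):
    """Step 3: greedily pick tasks whose combined load stays within amount."""
    picked, moved = [], 0
    for load in sorted(busiest.tasks, reverse=True):
        if moved + load <= amount:
            picked.append(load)
            moved += load
    return picked
```

For a group with task loads 4, 3, 2, and 1 next to a group with a single task of load 2, the sketch computes an imbalance of 4 and selects the load-4 task for migration.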
Peter Zijlstra was very happy to see these patches; he and Valentin Schneider offered technical suggestions and documentation fixes. The three of them went back and forth for a while, without disputes or controversies. It was a very forward-moving collaboration.
This is no guarantee that the code will go into the kernel. Yes, it's excellent to make the load balancer more meaningful and remove arbitrary logic and so on. And it's excellent to see unidirectional progress in the mailing list discussion. However, there are still obstacles that might arise between Vincent's patch set and inclusion in the main kernel tree – security issues and whatnot.
The main problem, especially with something like the load balancing code, is simply the impossibility of knowing how people use their systems. Obviously, if the kernel knew exactly how the system would be used, it would be trivial to balance out all of those processes between the various CPUs. But since use cases vary from person to person, we can never have such knowledge. And often the final obstacle to improving the load balancer is simply that, regardless of the intelligence behind a given patch, there is simply no way to know if it's actually better than what was there previously. So, to be accepted, a load balancer patch might need to make a large, clearly noticeable improvement, when, ironically, more subtle and delicate changes might in fact be the better way to go.
New Random Number Handling
Andy Lutomirski submitted some patches to improve the Linux kernel's random number generation routines. First, he added a getentropy() function to provide a little entropy for use in generating a stream of random numbers. The idea is that entropy is itself a bit of randomness, taken from, for instance, the time delays between keyboard key presses. Then that number can be fed into a random number generator that will produce a stream of random numbers based on it. If you feed the same entropy in each time, you get the same stream of "random" numbers – not so random anymore. But if you have a good source of entropy, you can always have a fresh random number to start with, and therefore a truly unpredictable stream of random numbers.
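The relationship between seed and stream is easy to demonstrate in user space. The sketch below uses Python's random module, which is a Mersenne Twister and not a cryptographic generator, so it only illustrates the principle; the stream() helper is a name made up for this example.

```python
import os
import random


def stream(seed: bytes, count: int = 8) -> list:
    """Generate `count` pseudo-random bytes from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(count)]


# The same seed always reproduces the same "random" stream.
fixed_seed = b"same entropy every time"
print(stream(fixed_seed) == stream(fixed_seed))

# A fresh seed from the operating system gives a different stream each run.
print(stream(os.urandom(32)))
```

The first print shows True on every run, while the second list changes each time, because os.urandom() hands back fresh bytes from the kernel.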
However, as Andy pointed out, you don't always want this. Sometimes a bit of code wants random numbers, but not because they need to be cryptographically secure. Sometimes it just wants something, anything, so long as it is different from what came before. Andy's code would make a "best effort" at obtaining entropy, without actually requiring anything like true entropy.
The point of this is that the Linux kernel would normally wait for enough entropy to build up in the system before allowing one of these entropy requests to return to the calling routine. And this is definitely still important for various cases. But in the cases where it's not, Andy's patches speed things up by not forcing the user code to wait for the buildup of a suitable amount of entropy.
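A best-effort request that never waits can be approximated from user space with interfaces that already exist. The sketch below does not use anything from Andy's patches: it relies on Python's os.getrandom() with the existing GRND_NONBLOCK flag and falls back to reading /dev/urandom directly. The helper name is hypothetical, and the fallback output is deliberately not guaranteed to be cryptographically strong early in boot.

```python
import os


def best_effort_random(nbytes: int) -> bytes:
    """Return random bytes without waiting for the entropy pool to fill."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        return os.urandom(nbytes)  # platform without getrandom()
    try:
        # GRND_NONBLOCK: fail immediately instead of blocking.
        # Note: very large requests may return fewer bytes than asked for.
        return getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        # Pool not initialized yet: take whatever /dev/urandom offers now.
        with open("/dev/urandom", "rb") as dev:
            return dev.read(nbytes)
```

Code that needs key material should not use a helper like this; it suits callers that only need values that differ from one call to the next.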
Andy added reassuringly, "This series should not break any existing programs. /dev/urandom is unchanged. /dev/random will still block just after booting, but it will block less than it used to. getentropy() with existing flags will return output that is, for practical purposes, just as strong as before."
Theodore Y. Ts'o remarked that this was actually a really big change. He felt that the timing was not right in the development cycle for a patch "of this magnitude." He added, "The reason for this is because at the moment, there are some PCI compliance labs who believe that the 'true randomness' of /dev/random is necessary for PCI compliance and so they mandate the use of /dev/random over /dev/urandom's 'cryptographic randomness' for that reason. A lot of things which are thought to be needed for PCI compliance that are about as useful as eye of newt and toe of frog, but nothing says that PCI compliance (and enterprise customer requirements :-) have to make sense."
Ted added, "It may be that what we might need to really support people (or stupid compliance labs) who have a fetish for 'true randomness' [is] to get a better interface for hardware random number generators than /dev/hwrng. Specifically, one which allows for a more sane way of selecting which hardware random number generator to use if there are multiple available, and also one where we mix in some CRNG as a whitening step just [in] case the hardware number generator is busted in some way. (And to fix the issue that at the moment, if someone evil fakes up a USB device with the USB manufacturer and minor device number for a ChaosKey device that generates a insecure sequence, it will still get blindly trusted by the kernel without any kind of authentication of said hardware device.)"
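The whitening step Ted mentions can be sketched as hashing the hardware output together with bytes from the operating system's own generator, so that a broken or malicious device cannot fully determine the result. This is a user-space illustration only: the function name, the use of SHA-256, and the 32-byte limit are assumptions, and it does not reflect how the kernel actually mixes its inputs.

```python
import hashlib
import os


def whitened(hw_bytes: bytes, nbytes: int = 32) -> bytes:
    """Mix possibly untrustworthy hardware output with OS CSPRNG bytes."""
    if not 0 < nbytes <= 32:
        raise ValueError("this sketch produces at most 32 bytes per call")
    crng_bytes = os.urandom(32)  # stands in for the kernel CRNG
    h = hashlib.sha256()
    h.update(len(hw_bytes).to_bytes(8, "big"))  # unambiguous framing
    h.update(hw_bytes)
    h.update(crng_bytes)
    return h.digest()[:nbytes]
```

Even if the hardware source returns nothing but zeros, two calls produce different output, because the CRNG contribution changes each time.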
Ted's idea was to find a way to hook /dev/random into any available hardware random number generator to satisfy those users who needed truly random numbers.
But Andy thought this might not be a kernel issue at all. He saw no reason why the PCI folks couldn't be satisfied by a userspace source of randomness. He remarked, "it should be straightforward to write a little CUSE program that grabs bytes from RDSEED or RDRAND, TPM, ChaosKey (if enabled, with a usb slot selected!), and whatever other sources are requested and, configurable to satisfy whoever actually cares, mixes some or all with a FIPS-compliant, provably-indistinguishable-from-random, definitely not Dual-EC mixer, and spits out the result. And filters it and checks all the sources for credibility, and generally does whatever the user actually needs. And the really over-the-top auditors can symlink it to /dev/random."
Pavel Machek also replied to Andy's original post, asking for some better justification of the patches than Andy had given. And Andy explained, "The random code is extremely security sensitive, and it's made considerably more complicated by the need to support the blocking semantics for /dev/random. My primary argument is that there is no real reason for the kernel to continue to support it."
There was no further discussion, but Ted was right that Andy's patch would be a big change – not necessarily to the behavior of the kernel at all, but just to the resources offered by the kernel to user code. Depending on how much time Linus Torvalds wanted to give users to adapt their code to this new randomness situation, Andy's patches would have to be timed carefully, to appear early in the development cycle leading to the next official kernel release.