Zack's Kernel News

Article from Issue 232/2020

Zack Brown reports on: Line Ending Issues; Hardware Hinting; and Simplifying the Command Line.

Line Ending Issues

Jonathan Corbet posted some kernel documentation updates gathered from many contributors. However, Linus Torvalds spotted some file format problems. Specifically, some of the patches used CRLF (carriage return, line feed) character pairs as line endings. This convention dates back to MS-DOS of the elder days, and it's hard to imagine these line endings appearing in any Linux development tool chain.

Jonathan C. said it wasn't his tool chain, but that Jonathan Neuschäfer had submitted the patches to him with those line endings. More importantly, he said, "The problem repeats if I apply those patches now, even if I add an explicit '--no-keep-cr' to the 'git am' command line. It seems like maybe my version of git is somehow broken?"

Linus replied, also ccing the git development mailing list. He speculated, "I wonder if the CRLF removal is broken in general, or if the emails are somehow unusual (patches in attachments or MIME-encoded or something)? Maybe the CRLF was removed from the envelope email lines, but if the patch is then decoded from an attachment or something it's not removed again from there?"

Meanwhile, Jonathan N. was equally confused, saying he was "not sure why, or where in the mails' path this happened, but the base64 with CR/LF inside is also present in the copies that went directly to me, rather than via the mailing lists."

Jonathan C. checked his email history and reported that although the email containing the patch was plain text, the patch itself was a base64-encoded attachment. Git's --no-keep-cr would therefore never have seen the CRs hidden inside that encoded text, let alone stripped them.
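The effect Jonathan C. describes can be reproduced in miniature. The following Python sketch is not git's code. It relies only on Python's standard base64 and email modules, and the helper name strip_cr_from_envelope, the addresses, and the sample patch are made up for illustration. It builds a message whose body is a base64-encoded patch with CRLF line endings, strips CR from the raw message lines (roughly what CR removal at the mail-splitting stage amounts to), and then decodes the payload.

```python
import base64
import email

# A tiny "patch" whose lines end in CRLF, as in the problematic submissions.
patch = b"-old line\r\n+new line\r\n"

# Raw message as it might arrive: CRLF-terminated lines, base64-encoded body.
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: [PATCH] docs: example\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.encodebytes(patch).replace(b"\n", b"\r\n")
)


def strip_cr_from_envelope(message_bytes):
    """Hypothetical helper: turn CRLF into LF on the raw message lines only."""
    return message_bytes.replace(b"\r\n", b"\n")


cleaned = strip_cr_from_envelope(raw)

# Decode the base64 body only after the envelope has been cleaned.
msg = email.message_from_bytes(cleaned)
decoded = msg.get_payload(decode=True)

print(b"\r" in cleaned)    # False: no CR is left in the raw message
print(b"\r\n" in decoded)  # True: the decoded patch still has CRLF endings
```

The raw message ends up free of CRs, yet the decoded patch comes out exactly as the sender encoded it, CRLFs included.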

Linus agreed this was probably the case – but wondered why the patches would have successfully applied to Jonathan C.'s tree in the first place. Linus said, "I'm surprised, though – when git applies patches, it really wants the surrounding lines to match exactly. The extra CR at the end of the lines should have made that test fail."

Jonathan C. confirmed that he used the git flag --ignore-whitespace in his process. When he applied these patches without that flag, git indeed choked on it. Jonathan explained, "Docs patches often come from relatively new folks, and I've found that I needed that to apply a lot of their patches. But clearly that was not a good choice; among other things, I've lost the opportunity to tell people when their patches have the types of whitespace issues that this option covers over. I've taken it out of the script and will only use it by hand in cases where I'm sure that it won't cause problems."
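Linus's point about exact context matching, and the way a whitespace-tolerant comparison hides a stray CR, can be illustrated with a simplified Python sketch. This is not how git implements --ignore-whitespace. The function context_matches and its ignore_whitespace parameter are hypothetical, and only trailing whitespace is considered here.

```python
def context_matches(tree_lines, patch_context, ignore_whitespace=False):
    """Return True if each patch context line matches the tree line at the same position."""
    if len(tree_lines) != len(patch_context):
        return False
    for tree_line, ctx_line in zip(tree_lines, patch_context):
        if ignore_whitespace:
            # rstrip() with no arguments drops trailing whitespace, including "\r".
            tree_line, ctx_line = tree_line.rstrip(), ctx_line.rstrip()
        if tree_line != ctx_line:
            return False
    return True


# Lines as they exist in the tree (LF endings already removed).
tree = ["Documentation title", "==================="]

# The same lines as they appear in a patch that carries a trailing CR.
context_with_cr = [line + "\r" for line in tree]

strict = context_matches(tree, context_with_cr)
tolerant = context_matches(tree, context_with_cr, ignore_whitespace=True)
print(strict, tolerant)  # False True
```

The strict comparison rejects the patch, which is the failure Linus expected. The tolerant one accepts it, and the CRs ride along unnoticed.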

Junio C. Hamano, the git maintainer, affirmed that the behavior Jonathan C. had run into was normal for git. But Linus did not agree. Linus said:

"I think it's a mistake that --no-keep-cr (which is the default) only acts on the outer envelope.

"Now, *originally* the outer envelope was all that existed so it makes sense in a historical context of "CR removal happens when splitting emails in an mbox". And that's the behavior we have.

"But then git learnt to do MIME decoding and extracting things from base64 etc, and the CR removal wasn't updated to that change."

Junio was skeptical about making this change. He replied:

"What was the reason why '--no-keep-cr' was invented and made default? Wasn't it because RFC says that each line of plaintext transfer of an e-mail is terminated with CRLF? It would mean that, whether the payload originally had CRLF terminated or LF terminated, we would not be able to tell – the CR may have been there from the beginning, or it could have been added in transit. And because we (the projects Git was originally designed to serve well) wanted our patches with LF terminated lines most of the time, it made sense to strip CR from CRLF (i.e. assuming that it would be rare that the sender wants to transmit CRLF terminated lines).

"If the contents were base64 protected from getting munged during transit, we _know_ CRLF in the payload after we decode MIME is what the sender _meant_ to give us, no?"

The question was not decided in the discussion, which petered out around here.

An interesting tidbit is that although Linus is not the git maintainer, he has enormous, probably decisive, influence over its course of development. Linus originally wrote git himself over the course of a few weeks in 2005, after Larry McVoy canceled the BitKeeper license, and no other open source revision control system proved capable of the speed and efficiency Linus needed for kernel development.

After totally and forever transforming revision control across the known universe, Linus handed off leadership of the project to its most active contributor, Junio, who has maintained it ever since. Junio generally has the last word in git development decisions – but there's an unspoken reality that if Linus wants git to move in a certain direction, that's the direction Junio moves it. This makes some amount of sense, since the entire reason for git's creation was to support Linux kernel development; it's difficult to imagine a legitimate change to git that would make git better but Linux development worse.

In this particular case, the question of how to handle line endings and file encodings will probably take the natural course of reasoned discussion, rather than anyone simply pronouncing an arbitrary decision.

Hardware Hinting

Given hardware vulnerabilities like Meltdown and Spectre, Linux has had to take strong measures to work around those problems. But recently Vitaly Kuznetsov asked, what about container-based virtual systems that only appear to be running on affected hardware? Depending on the true CPU topology of a given system, one or another security strategy would be best – but as Vitaly pointed out, the user would need to know the true CPU topology in order to make that choice.

It's not just a question of protecting against security vulnerabilities. The hardcore STIBP (Single Thread Indirect Branch Predictors) patch solves the Spectre and Meltdown problems entirely and simply, but at a high cost in speed. The various available alternatives are all ways to accept a more complicated solution while regaining some of the speed lost by STIBP.

Of course, more complicated solutions have other trade-offs, and there are several floating around. One option, Vitaly said, would be to simply opt in to the STIBP patch and be done with it. That's a legitimate option, especially if the user is not certain what hardware their system is truly running on.

One attempt to speed things up is Simultaneous Multithreading (SMT). It allows one physical core to appear as two CPUs and lets them take part in normal kernel threading and load balancing. This does result in better overall CPU utilization, but it has been shown to carry some potential security risks. Then there is Peter Zijlstra's core scheduling patch, which tries to mitigate SMT's security risks by grouping certain of these virtual CPUs together, to avoid potentially risky interactions with other virtual CPUs on the same system.

All of these possibilities, Vitaly said, would be legitimate options for a user to consider, if only they knew what hardware the system actually used. As he put it, the question boiled down to, "does the topology the guest see[s] match hardware or if it is 'fake' and two vCPUs which look like different cores from guest's perspective can actually be scheduled on the same physical core. Disabling SMT or doing core scheduling only makes sense when the topology is trustworthy."

Liran Alon remarked that this was not only a security issue – the same considerations applied to other speed optimizations as well. For example, a guest would need to ask the same questions when deciding whether to run tasks that share memory on the same Non-Uniform Memory Access (NUMA) node.

There were a few criticisms of Vitaly's patch. Peter felt that "The only way virt topology can make any sense what so ever is if the vcpus are pinned to physical CPUs. And I was under the impression we already had a bit for that [...]. So I would much rather you have a bit that indicates the 1:1 vcpu/cpu mapping and if that is set accept the topology information and otherwise completely ignore it."
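Peter's suggestion amounts to a simple rule: believe the reported topology only when a flag says vCPUs are pinned 1:1 to physical CPUs, and ignore it otherwise. The Python sketch below illustrates that rule and nothing more. It is not kernel code; the GuestHints class, its fields, and the strategy names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GuestHints:
    # Hypothetical flag: True when each vCPU is pinned 1:1 to a physical CPU.
    vcpus_pinned: bool
    # Topology as the guest sees it: vCPU id -> core id.
    reported_core_of: Dict[int, int] = field(default_factory=dict)


def trusted_topology(hints: GuestHints) -> Optional[Dict[int, int]]:
    """Accept the reported topology only if the pinning flag is set."""
    return dict(hints.reported_core_of) if hints.vcpus_pinned else None


def candidate_strategies(hints: GuestHints) -> List[str]:
    """List the options worth weighing (names are illustrative only)."""
    if trusted_topology(hints) is None:
        # Topology may be fake, so topology-based choices make no sense.
        return ["stibp"]
    return ["stibp", "disable-smt", "core-scheduling"]


pinned = GuestHints(vcpus_pinned=True, reported_core_of={0: 0, 1: 0, 2: 1, 3: 1})
floating = GuestHints(vcpus_pinned=False, reported_core_of={0: 0, 1: 1})

print(candidate_strategies(pinned))    # ['stibp', 'disable-smt', 'core-scheduling']
print(candidate_strategies(floating))  # ['stibp']
```

With the flag unset, the guest falls back to the one option that does not depend on knowing the real hardware layout.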

The conversation meandered a bit. Since it's about security, objections can come from anywhere, and the final concept may look like anything. In this case, it's clear that giving hints about the underlying architecture to higher level code would be useful for both speed and security. But it looks like Vitaly's ideas about how to do this may need significant revision before they get into the kernel.

One thing I find fascinating about this type of discussion is that when it comes to new security features, someone generally needs to take the plunge and actually implement something, with the full knowledge that once they do, there will be a set of nearly random objection-vectors coming at them. And only then will it start to become clear what the proper solution will end up looking like. So in a way, developers need to create something they know will be destroyed, just in order to then be able to create it again.

Simplifying the Command Line

Masami Hiramatsu posted a patch to support Extra Boot Config (XBC), which would allow users to pass a configuration file to the kernel at boot time. Configuration would have a tree structure, and users could access it via Linux's Supplemental Kernel Cmdline (SKC) API. The great benefit of this patch would be simplifying the kernel command line. Pretty much everyone agrees the command line has gotten out of hand. Other attempts to simplify it – including support for arbitrary binary blobs of data right there in the command line – have come and gone.
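To see why a tree-structured file can stand in for a long command line, consider this Python sketch. It is not the kernel's parser and does not use the real file syntax. The nested dictionary, the option names, and the flatten helper are all made up for illustration. It shows how nested keys collapse into the dotted key=value form familiar from the command line.

```python
def flatten(tree, prefix=""):
    """Walk a nested dict and return a list of dotted 'key=value' strings."""
    items = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the subtree, carrying the path so far.
            items.extend(flatten(value, path))
        else:
            items.append(f"{path}={value}")
    return items


# Hypothetical configuration tree; none of these names come from the patch.
config = {
    "ftrace": {
        "tracer": "function",
        "buffer": {"size_kb": 4096},
    },
    "console": "ttyS0",
}

flat = flatten(config)
print(" ".join(flat))
# ftrace.tracer=function ftrace.buffer.size_kb=4096 console=ttyS0
```

Grouping related options under one subtree keeps them together in the file, where on the command line each one has to repeat its full prefix.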

Randy Dunlap had no immediate objection, though he noticed that Masami had set XBC support to be enabled by default in all kernel builds. Normally, Randy said, such a decision would need a lot of justification. Masami said he'd change it to be disabled by default, but he also remarked, "I thought that was OK because most of the memories for the bootconfig support were released after initialization. If user doesn't pass the bootconfig, only the code for /proc/bootconfig remains on runtime memory."

Steven Rostedt also affirmed Randy's idea that new features are normally disabled by default. But Steven said in this case, Masami's patch should be enabled by default. Steven stated his case:

"This is not some new fancy feature, or device that Linus complains about 'my X is important!'. I will say this X *is* important! This will (I hope) become standard in all kernel configs. One could even argue that there shouldn't even be a config for this at all (forced 'y'). This would hurt more not to have than to have. I would hate to try to load special options only to find out that the kernel was compiled with default configs and this wasn't enabled.

"This is extended boot config support that can be useful for most developers. The only ones that should say 'n' are those that are working to get a 'tiny' kernel at boot up. As Masami said, the memory is freed after init, thus this should not be an issue for 99.9% of kernel users."

And Masami agreed with Steven, saying, "Yes, for the users point of view, it is hard to notice that their kernel can accept the boot config or not before boot. To provide consistent system usability, I think it is better to be enabled by default. Anyway, if there is no boot config, almost all buffers and code are released after init (except for /proc/bootconfig entry point, which will return an empty buffer)."

But Masami did acknowledge that the actual compiled kernel binary would be about 15 or 20KB larger with this patch than without it.

There was no further discussion, but it seems clear that this is a very welcome patch, at least for some developers. Whether it will make it all the way through the gauntlet is another question. Certainly everyone including Linus would like to see a better way of dealing with the kernel command line. Maybe Masami's patch will be it.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
