Zack's Kernel News

Zack's Kernel News

Article from Issue 265/2022

Chronicler Zack Brown reports on the little links that bring us closer within the Linux kernel community.

The Little Links That Bring Us Closer

The Linux kernel development process continues to change over time in little ways that are fun to observe. Over the past year, one such change has centered around the information included in patch submissions. It may not always make sense to talk about the exact moment a given series of events began, but once upon a time, Steven Rostedt submitted a patch to help the kernel detect infinite recursion loops. His patch included a couple of standard Link: tags and a cc list of relevant developers who might care about the patch.

Linus Torvalds pointed out that in the cc list, Steven had included the string =?utf-8?b?546L6LSH?= where it should have displayed Yun Wang's name as:

Linus said to Steven, "Either let the email tools do proper decoding of the headers and cut-and-paste from that, or use one of the explicit tools that do email header decoding (there's at least a few online ones). Yeah, yeah, I know, we're much too used to US-ASCII (or, in my case, the slightly expanded Western Latin1), and there's a couple of other examples of this in the git history, but we really should strive to get peoples names right."

Steven said he'd re-examine his scripts and fix the issue. After a moment, he added, "It's the copying of the header Cc list into the Cc list of the commit that is causing my issue. Will investigate it more." And he added, "I probably could just stop doing that, as it also adds the Link: tag to the lore email, which includes all the Cc's."

This is where things got interesting. In terms of stopping including the cc lines, Linus replied:

"I like the cc: lines, and I don't think it's a great argument to say that 'the data is in the thread in the link'.

"Generally the commit message should stand on its own, and contain enough of the relevant information that the link data isn't needed.

"So _primarily_ the 'Link:' line should be about background – and for 'oh, there was discussion about this patch after it was committed'.

"So it should not be seen as a _replacement_ for any information in the commit itself, or as an excuse to leave relevant information out."

Linus also argued against using scripts to automate generating the cc lines and Link: tags. He said, "people who do the 'Link:' and 'Cc:' information based on the automation they picked up the patch from often don't look at the part that is _really_ relevant, namely the discussion that _caused_ the patch. IOW [in other words], the original report, possibly earlier versions of the patch etc etc. Mindless 'take the cc list from the emailed submission of the patch' is still somewhat useful in that it avoids having to look it up later when something happens, but really, a bit of human mindful editing is probably called for."

A couple of months later in a different thread, Thorsten Leemhuis attempted to codify Linus's preferences into actual kernel documentation that could be followed by developers composing patches for submission.

In his patch, Thorsten said the Link: tag was overloaded. He identified two new tags, Reported: and Reviewed:, that could take some of the burden off of Link:. The Reported: tag would include a URL to an actual bug report that was addressed by the patch being submitted.

The Reviewed: tag was explicitly not supposed to be used by developers. As Thorsten explained, the maintainer would fill in that tag to point to the most recent public review of the patch, as of the time the maintainer accepted the patch.

The Link: tag, as Thorsten wrote in the documentation, "points to websites providing additional backgrounds or details, for example a document with a specification implemented by the patch."

But Jonathan Corbet replied, "So this is a serious change from how Link: is used now, and runs counter to the scripts used by a lot of maintainers. I suspect that this thread is only as short as it is because a lot of people haven't seen this yet; it could be a hard change to sell."

To which Thorsten replied, "Yeah, I'm aware of that. And to be honest: I don't have a strong interest in this, just think it might be the right thing to do. And I just got the impression that regzbot's dependence on the Link: tag for linking to regression reports is making the ambiguity of the tag worse. That lead to the thought: well, simply bring it up now and see what people think; if they don't like it, I can tell myself 'well, I tried to improve it, but it was not welcomed' and sleep well at night. At least as long as my cat allows me to. :-)"

Geert Uytterhoeven remarked, "The power of the 'Link' tag is that it can refer to a variety of related content. I.e. the meaning is derived from the link target, which can be an email discussion, a bug report, a bug tracker page, … A proliferation of tags complicates life for patch authors and commit analyzers. IMHO adding tags should only be done as a last resort, as it doesn't come without a cost."

That was the end of it for that thread, but some time later in a different thread on a different topic, Michael S. Tsirkin posted a patch, in response to which Linus kept the issue alive, saying:

"And – once again – I want to complain about the 'Link:' in that commit.

"It points to a completely useless patch submission. It doesn't point to anything useful at all.

"I think it's a disease that likely comes from 'b4', and people decided that 'hey, I can use the -l parameter to add that Link: field', and it looks better that way.

"And then they add it all the time, whether it makes any sense or not.

"I've mainly noticed it with the -tip tree, but maybe that's just because I've happened to look at it.

"I really hate those worthless links that basically add zero actual information to the commit.

"The 'Link' field is for _useful_ links. Not 'let's add a link just because we can'."

Michael said he'd stop using Link: that way. And Nathan Chancellor also said:

"[A]s someone who is frequently tracking down and reporting issues, a link to the mailing list post in the commit message makes it much easier to get these reports into the right hands, as the original posting is going to have all relevant parties in one location and it will usually have all the context necessary to triage the problem. While has made it much easier to find patch postings with the 'all' list and the search syntax that public-inbox offers, it is simpler to just import the thread with 'b4 mbox' using the link directly.

"However, I do agree that it should be easier for people to tell whether or not the link is additional context or information or just a link to the original patch posting on the mailing list. Perhaps there should be a new tag like 'Archived-at:', 'Posted-at:', or 'Submitted-at:' that makes this clearer?"

Linus replied to Nathan, saying, "I really don't find much value in the 'Link to original submission', because that thing is *already* trivial to find, and the lore search is actually better in many ways (it also tends to find people *reporting* that commit, which is often what you really want[...)]." And he added:

"[B]ut what *is* interesting, and where the 'Link:' line is very useful, is finding where the original problem that *caused* that patch to be posted in the first place.

"Yes, obviously you can find that original problem by searching too if the commit message has enough other information.

"For example, if there is an oops quoted in the commit message, I have personally searched for parts of that kind of information to find the original report and discussion."

Michael Ellerman, on the other hand, felt that the link to the original submission was indeed very useful, because:

"It means I can easily go from a commit back to the original submission. That's useful for automating various things like replies and patchwork status updates.

"It allows me to find the exact patch I applied, even if what I committed is slightly different (due to fuzz or editing), which would be harder with a search based approach.

"It gives us a way to essentially augment the change log after the fact, by replying to the original patch with things we didn't know at the time of commit – eg. this patch was reverted because it caused a bug, etc."

Jörg Rödel also felt that using Link: to point back to the original patch submission had real value. He said:

"1) First of all it is an easy proof that the patch was actually submitted somewhere for public review before it went into a maintainers tree.

"2) The patch submission is often the entry point to the discussion which lead to this patch. From that email I can see what was discussed and often there is even a link to previous versions and the discussions that happened there. It helps to better understand how a patch came to be the way it is. I know this should ideally be part of the commit message, but in reality this is what I also use the link tag for.

"3) When backporting a patch to a downstream kernel it often helps a lot to see the whole patch-set the change was submitted in, especially when it comes to fixes. With the Link: tag the whole submission thread is easy to find."

Konstantin Ryabitsev joined Michael E. and Jörg in voting for a link to the original patch submission, as opposed to Linus's idea of linking to the post identifying the problem that a given patch was meant to fix. Konstantin said to Linus, "I think the disconnect here is that you're approaching this from the perspective of a human being, while what many want is a dumb and reliable way to match commits to ML submissions, which will allow improving unattended automation." And he added, "I really, really like having a fool-proof way to match commits directly to the exact ML submissions. :( Even a 99%-reliable fuzzy matching algorithm has enough of a failure rate that causes maintainers to get annoyed."

Konstantin suggested – if the meaning of the Link: tag was really going to change – that there should be a new Message-ID: tag, containing the actual email Message-ID header of the original patch submission. This way Konstantin, Michael E., Jörg, and others could continue their various processes without additional pain.

Linus, however, was unmoved. He said of Link: tags:

"They are wonderful when they link to the original problem.

"They are *really* wonderful when they link to some long discussion about how to solve the problem.

"They are completely useless when they link to 'this is the patch submission of the SAME DAMN PATCH THAT THE COMMIT IS'.

"See the difference?

"The two first links add actual new information.

"That last link adds absolutely nothing. It's a link to the same email that was just applied."

Michael E. argued that Linus was not being consistent here. He said, "Folks wanted to add Change-Id: tags to every commit. You said we didn't need to, because we have the Link: to the original patch submission, which includes the Message-Id and therefore is a de facto change id. Links to other random places don't serve that function."

Linus replied:

"[W]hen people asked for change ID's and I said 'we have links', I by no means meant that 'you can just add random worthless links to commits'.

"For example, if you have a (public-facing) Gerrit system that tracks a patch before it gets committed, BY ALL MEANS add a link to that as the 'change ID' that you tracked in Gerrit.

"That's a Link: that actually adds *information*. It shows some real history to the commit, and shows who approved it and when, and gives you all the Gerrit background.

"But a link to the email on lkml that just contains the patch and the same commentary that was introduced into the commit? Useless garbage. It adds no actual information.

"THAT is my argument. Why do people think I'm arguing against the Link: tag? No. I'm arguing against adding links with no relevant new information behind them.

"I don't argue against links to lore. Not at all. If those links are about the background that caused the patch, they are great. Maybe they are to a long thread about the original problem and how to solve it. Thats WONDERFUL.

"But here's the deal: when I look at a commit that I wonder 'why is it doing this, it seems wrong' (possibly after there's been a bug report about it, but possibly just because I'm reviewing it as part of doing the pull), and I see a 'Link:' tag, and it just points back to the SAME DAMN DATA that I already have in the commit, then that Link: tag not only wasn't helpful, it was ACTIVELY DETRIMENTAL and made me waste time and just get irritated."

Eric W. Biederman also came out against Linus on this issue. He said that having to do a search to track down the original patch submission was itself difficult and time consuming. And in terms of what Linus seemed to really want from the Link: tag, Eric remarked, "As for finding the original problem that can be very hard. I recently had someone report a problem in code that had not changed in a decade or so that had just appeared. They had just happened to run ltp on a big enough machine where a poorly written test stressed the hardware on a large enough machine in just the right way that things started falling over. It took asking several times to find that out."

Eric commented further:

"[S]ure let's aim at getting more and better information in commits and in the urls that we place in commits. But let's not throw the baby out with the bath water and stop doing the part we can automate, because we have done such a good job of automating the indexing that we can usually find it with a simple search.

"Let's instead aim to keep the conversation connected, and the threads not broken so that following the url that is the easy thing to create gives us much more information."

The discussion did not end there. Instead, in another thread, Thomas Gleixner submitted a patch, to which Linus remarked, "So I have to once more complain about the -tip tree 'Link:' usage," and he added, "I _really_ wish the -tip tree had more links to the actual problem reports and explanations, rather than links to the patch submission."

Regarding this particular patch, Linus said, "I don't have a clue what the actual report was, and there really isn't enough information in the commit itself, except for a fairly handwavy 'Device drivers might, for instance, still need to flush operations…' I don't want to know what device drivers _might_ do. I would want to have an actual pointer to what they do and where."

Thorsten thanked Linus for raising the issue again; he said, "that's a problem in a lot or subsystems and makes my regression tracking efforts hard, as my tracking bot relies on the 'Link:' tag. If it's missing I thus have to manually search if patches were posted or committed to fix a regression, which makes the tracking hard and annoying. :-/" And he added, "It seems quite a few developers are under the impressions that 'Link:' is just for the patch submission and not to be used for other things. That's why some people invented other tags. I see 'BugLink' quite often these days, but there are also other tags in use (some drm people used 'References:' for a while)."

Michael E., however, argued that "that doesn't scale though, it puts more work on maintainers, who already don't have enough time. The *submitter* should be putting all the relevant info in the patch, including any links to other discussions, previous versions etc. etc. Then the maintainer can automatically add the 'Link:' tag pointing to the submission, and everything is there in the archive."

Theodore Ts'o made the same point, saying, "I would argue that it should be the patch submitter's responsibility to explicitly add a URL to the problem report. In some cases this might be a pointer to a bug tracking system; in other cases it might be a URL to"

And Linus replied:

"I agree in the perfect case.

"But in practice, we have a lot more patch submitters than we have maintainers, and not all 'leaf developers' necessarily know how to do everything.

"So the maintainer should probably expect to fix things up. Not always, but also not a 'the developer should have done this, so I won't do it'.

"This isn't so different from the fact that not everybody writes English proficiently – people do hopefully end up fixing things up as they get passed onwards."

The debate, discussion, and consideration of ways to make everyone happy is ongoing. One thing I like about Linux development is that developers generally have no compunction about standing up to Linus when they think he's wrong. In this case, a bunch of them argued against his main point, articulating their own situation and needs very clearly. And while Linus didn't back down, neither did he lay down the law. And as in many other cases where kernel development practices have changed, the pressure itself tends to lead to good ideas that hadn't been considered before.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Kernel News

    The Kernel Development Process

  • Kernel News

    Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

  • Kernel News

    Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

  • Kernel News

    Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

  • Kernel News

    Zack Brown reports on bug hunting and process appreciation, and refusing "useful" patches.

comments powered by Disqus
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters

Support Our Work

Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

Learn More