Zack's Kernel News

Chronicler Zack Brown reports on kernel efficiency improvements and tracking compiler plugin problems.
Kernel Efficiency Improvements
Linux developers sometimes struggle to eke out a few microseconds of speed-up in one part of the kernel or another. These efficiency improvements may not seem to matter to the average user, but they do add up. For larger organizations constantly doing massive computations, they can mean the difference between hours and days. At the same time, the question of where a given improvement should be made – in the kernel itself, in the C compiler, or elsewhere in the toolchain – may also mean the difference between a tight, clean solution and a complex, easily broken one.
Recently, Mateusz Guzik noticed that "The kernel is chock full of inlined rep movsq and rep stosq, including in hot paths and these are known to be detrimental to performance below certain sizes." He added, "When issuing hello-world compiles in a loop this is over 1% of total CPU time as reported by perf. With the kernel recompiled to instead do a copy with regular stores this drops to 0.13%." Mateusz also added that these were only his personal results, and he'd be interested to see if similar slowdowns were seen on other CPUs.
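To make the distinction concrete, here is a minimal C sketch of the two copy styles under discussion. It is not taken from Mateusz's patch: the struct and function names are hypothetical, and whether the memcpy() version actually compiles to rep movsq depends on the compiler version, the target, and the tuning flags in use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-byte record standing in for a small kernel structure. */
struct small_rec {
    uint64_t w[4];
};

/*
 * Fixed-size copy through memcpy(). The compiler chooses how to expand
 * this: it may emit rep movsq, a run of plain moves, or a library call,
 * depending on its version and tuning options.
 */
void copy_with_memcpy(struct small_rec *dst, const struct small_rec *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* The same copy written as explicit 8-byte loads and stores. */
void copy_with_stores(struct small_rec *dst, const struct small_rec *src)
{
    dst->w[0] = src->w[0];
    dst->w[1] = src->w[1];
    dst->w[2] = src->w[2];
    dst->w[3] = src->w[3];
}

/* Returns 1 if both copy styles produce byte-identical results. */
int copies_match(void)
{
    struct small_rec src = { { 1, 2, 3, 4 } };
    struct small_rec a = { { 0 } };
    struct small_rec b = { { 0 } };

    copy_with_memcpy(&a, &src);
    copy_with_stores(&b, &src);
    return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Both functions do the same work. The only difference is the instruction sequence the compiler picks, which you can inspect by compiling with gcc -O2 -S and reading the generated assembly.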
He posted a tentative patch to fix the issue, but said he wanted to hear from other people before making it a real submission. However, Linus Torvalds replied, "Please make this a gcc bug-report instead – I really don't want to have random compiler-specific tuning options in the kernel. Because that whole memcpy-strategy thing is something that gets tuned by a lot of other compiler options (ie -march and different versions)."
[...]