We are not programming in 1991 anymore!

Paw Prints: Writings of the maddog
As I write this I am also copying a talk given in February of 1996 at Digital Equipment Corporation (DEC) about the port of Linux to DEC's Alpha AXP processor.
It is interesting to hear Linus Torvalds and other people talking about spending three thousand dollars (or more) to buy a “high-end” PC, and to have that PC consist of a 32-bit address machine with eight megabytes of main memory, two or three gigabytes of storage on a hard drive and disk transfer rates of two megabytes per second.
Linus talks about the “Big Kernel Lock” and how this issue was not too important since Linux was aimed toward “low end systems” and most of those systems did not have multiple CPUs per board nor (at that time) multiple cores per CPU.
Another recurring theme is the lack of optimization and performance of the GNU compiler suite versus the “commercial compilers”.
In this time frame (and even before) some of the earliest pieces of GNU/Linux code were written.
Today we have 64-bit virtual address spaces, CPUs with multiple cores and threads, memories that are gigabytes in size (and cost much less), disks (and controllers) that are much larger and faster, and GNU compilers that are very good at optimization.
Finally, the GNU/Linux system itself was much simpler back in the days when some of these programs were written or ported to the platform. Many of the APIs and system facilities we enjoy today did not appear until later in the life of the Linux kernel.
In many GNU/Linux distributions, such as Ubuntu and Fedora, approximately 1400 programs contain assembly language. This assembly language was sometimes inserted into the program a long time ago, in response to the slower CPUs, smaller memories, and less capable compilers of the time.
When CPUs were single core (and the systems were not SMP), assembly language was relatively straightforward, but once you have multiple cores it becomes much more difficult to code in assembly correctly. Compilers can keep track of the data and data flow in a parallel environment much more successfully than the typical programmer using assembly language.
Some of these modules were written so long ago that the first assembly language used was for either the IBM 360/370 or the DEC VAX architecture. Over time these modules were ported to other architectures, but the assembly language “port” was often poorly done. Instead of using instructions that would have been optimal for the new architecture, the port tended to match the instructions of the existing assembly language one-for-one to those of the new machine, often producing a less-than-optimum solution. In other cases higher-level code was created as a “fall back”, but the existing assembly language code was left in-line, so the module never took advantage of the multi-core capabilities of compiled and optimized code.
In addition to all of this, in days gone past the sole criterion of a good program might have been speed of execution or the size of the memory in which it ran. These days another criterion has emerged, that of efficiency. How much electricity does your server need? How much cooling will you need for your server farm? How long do you want the batteries in your phone to last? These are other needs that affect programs being written today.
Now it is the twenty-first century, and ARM is designing a new 64-bit chip which will need these 1400 modules ported to it, or at least certified to work on it. This is a perfect time to:
- make sure that the 1400 modules run on ARM-64
- remove the old, crufty assembly language from other architectures whenever possible
- look at new algorithms that could be used with larger memory sizes (while still maintaining sensitivity to embedded system applications that require smaller footprints)
- look at compiler intrinsics, add new libraries or make changes to old libraries that would eliminate having redundant code scattered throughout the operating system
- think about how these programs might be operating in different environments
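As an illustration of the compiler-intrinsics idea in the list above, here is a minimal sketch of a bit-counting routine. The function names (`popcount64`, `popcount_portable`, `popcount_selftest`) are made up for this example and do not come from any of the 1400 modules. It assumes GCC or Clang, which provide the `__builtin_popcountll` builtin; on any other compiler it falls back to plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: each pass through the loop clears the lowest set bit. */
unsigned popcount_portable(uint64_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Preferred path: let the compiler pick the instruction sequence for the
 * target instead of hand-writing assembly for every architecture. */
unsigned popcount64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcountll(x);
#else
    return popcount_portable(x);
#endif
}

/* Cross-check the two paths on a few fixed inputs. */
void popcount_selftest(void)
{
    static const uint64_t samples[] = {
        0, 1, 0xFF, 0x8000000000000000ULL, UINT64_MAX
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(popcount64(samples[i]) == popcount_portable(samples[i]));
}
```

The point of the sketch is that one source file serves every architecture: the compiler decides whether a dedicated instruction exists, and the fallback keeps the code building everywhere else.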
Linaro, an association of companies that build ARM chips, has been working on making sure that GNU/Linux works well on ARM's new 64-bit architecture. In doing so, they created a contest to have members of the community help with porting and certifying the existing modules of GNU/Linux.
In setting up this contest, however, the real issues behind these ancient pieces of assembly-language-ridden code came to light, and Linaro extended the contest to try to help GNU/Linux become more efficient and more portable.
There are now two parts to the contest.
Porting
One part has to do with porting and verification. Contestants are encouraged to first register at our site, then select a piece of code to work on from the list of code modules at the web site (http://performance.linaro.org). If that code compiles and works on ARM-64, then the module can be marked “ported” and the contestant is authorized to receive an “entry prize” of a Linaro T-shirt, as well as having the glory of having worked on the GNU/Linux operating system.
If the module does not work, then the contestant should file a bug against the module and start the process of fixing the code so it works on ARM-64. This could be done by writing ARM-64 assembly code or by writing a fall-back set of higher-level “C” code that would work not only for ARM-64 but for other architectures as well. After confirming that the module runs at least as fast on the various supported architectures and on ARM-64, the contestant should submit the patch to the maintainers of the code and mark the module as “patched” at the Linaro site.
The more modules that you test and patch, and the earlier in the contest you do the work, the more likely you are to win the grand prize of an all-expenses-paid trip to a Connect meeting to be held in the United States of America or Asia, depending on the time of year. Please see the official contest rules at the web site for more details.
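The “fall-back” approach described above might look something like the following sketch. Everything here is illustrative: `swap32`, `swap32_c`, `swap32_selftest` and the `USE_LEGACY_ASM` macro are hypothetical names, not taken from any real module. The existing x86 assembly stays behind an explicit opt-in guard, while ARM-64 and every other architecture get the plain C version.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C fall-back: reverses the byte order of a 32-bit value.
 * Works on any architecture, ARM-64 included. */
uint32_t swap32_c(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Public entry point. The legacy x86 assembly is compiled only when the
 * (hypothetical) USE_LEGACY_ASM macro is defined on an x86 target;
 * every other build uses the C version above. */
uint32_t swap32(uint32_t x)
{
#if defined(USE_LEGACY_ASM) && (defined(__i386__) || defined(__x86_64__))
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    return swap32_c(x);
#endif
}

/* Porting check: whichever path was built must agree with the C
 * fall-back, and swapping twice must give back the original value. */
void swap32_selftest(void)
{
    assert(swap32(0x12345678u) == swap32_c(0x12345678u));
    assert(swap32(swap32(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```

With optimization enabled, current compilers can often turn the C version into a single byte-swap instruction on their own, which is one reason the fall-back may be all a module needs. That should be measured rather than assumed.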
Performance
Another part of the contest is dedicated to improving the performance of GNU/Linux. While getting the code to work on ARM-64 is important to Linaro, so is the goal of having GNU/Linux perform very well on every architecture.
Linaro recognized that many of the modules which used assembly language did so because that particular part of the code was performance-critical, and the compilers of the day could not match the efficiency of small amounts of hand-written assembly. However, for the reasons stated previously, Linaro feels that these modules (and the options on their compile lines) should be examined again to see if they can be made more efficient or run faster.
In this case the modules may or may not have first been ported to the ARM-64 architecture. If they have not been ported, then Linaro would assume that the performance work would be done in such a way as to make sure the code works on ARM-64. However, ARM does not want this work to penalize any of the other existing architectures or environments for this code, so contestants are strongly encouraged to discuss their enhancement plans with the existing upstream maintainers/developers to see if the contestant's ideas match up with what the maintainers/developers have envisioned for the code.
After the contestant has obtained a “go ahead” from the upstream maintainers/developers, they should measure the performance of the code on various architectures, do the optimization, then measure the performance again. These measurements (and perhaps input and output data) may have to be submitted to the contest site, and in every case a report of how much more efficient the code is, the work done, and an affidavit stating that the code was accepted by the upstream maintainers/developers will be mandatory for the contest submission.
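The measure, optimize, measure again cycle described above can be sketched with nothing more than the standard C `clock()` function. This is only a rough outline under stated assumptions: the `bench_fn` signature, `sum_bytes` (a stand-in for whatever routine is being optimized), `time_routine` and `percent_improvement` are all invented for this example, and the contest may well require a different measurement method.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for the routine being measured. */
typedef unsigned long (*bench_fn)(const unsigned char *buf, size_t len);

/* Stand-in workload: adds up the bytes of a buffer. */
unsigned long sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* Returns the CPU seconds used by `reps` calls of fn. The accumulated
 * result is stored in *sink so the calls have a visible effect. */
double time_routine(bench_fn fn, const unsigned char *buf, size_t len,
                    int reps, unsigned long *sink)
{
    unsigned long acc = 0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        acc += fn(buf, len);
    clock_t end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Percentage improvement of `after` over `before`; a positive value
 * means the new code is faster, a negative value means it is slower. */
double percent_improvement(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

A contestant would time the original routine, time the optimized one with the same buffer and repetition count on each architecture, and feed the two results to `percent_improvement`. Short runs are dominated by timer resolution, so the repetition count should be large enough for each run to take a measurable amount of time.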
As with the “porting” part of the contest, the first thing the contestant should do is go to the site (http://performance.linaro.org), sign up for the contest and choose one of the code segments to work on from the 1400 listed.
Every performance update completed will also be entered into the porting part of the contest to win a trip to Linaro's Connect meeting. However, there will be a second way for the performance person to win. Twice a year the contestant with the greatest percentage of performance improvement in their code module will also win a free, all-expenses-paid trip to Connect.
Over the next couple of months, examples of code speedups, new algorithms, and ways of improving code (including some “classics” from maddog's own history) will appear here in maddog's blog.
We hope this will be an exciting, educational and useful exercise for people who wish to join the GNU/Linux programming community.
Welcome aboard!