Ode to Machine Architecture
Paw Prints: Writings of the maddog
I have been writing lately about the importance of learning the underlying tenets of computing if you are going to be a great programmer, and in particular some machine language and computer architecture.
It typically does not make a difference which architecture you learn, or which machine language, as long as the architecture and machine language can illustrate the basic concepts of computing to a level that is useful in future studies of operating systems design and compiler theory, helping you to understand issues like cache management, interrupt handling and I/O.
This blog entry, however, is not going to talk about those issues. Instead it will talk about a few instances in my life where knowing assembly language helped me immensely in solving problems.
It was 1973, and I had just graduated from Drexel University and had taken a job with Aetna Life and Casualty, an insurance company in Hartford, Connecticut. Like a lot of other students, I had studied a lot of the different high-level languages of that day, such as COBOL, FORTRAN, APL, and taken some of the rudimentary “computer science” (more like “computer black magic”) courses offered at that time. In addition to these courses, I had taught myself my first programming language (FORTRAN) and my second programming language (PDP-8 Assembler) by reading books and practicing, no courses or professors involved.
My first job at Aetna was in the casualty division, working on an online claim-payment system called “Safari”. It was one of the first real-time transaction processing systems in the world, and I was hired to do assembly-language programming for reports. My boss, John Silliman, wanted us to write in assembly language even though the reports could have been written faster and with fewer mistakes in a high-level language, because he was grooming system programmers for other positions inside Aetna.
In those days most of the insurance companies in Hartford, Connecticut used IBM mainframes, with MVS as an operating system, and COBOL as their main applications language. The average programmer would stay for 1.5 years with one company, then “jump ship” to another, work for them for 1.5 years, and go to a third...increasing their salary each time. On a rainy day you could see people coming into the building carrying a Travelers umbrella, wrapped in a Nationwide blanket, and probably carrying a “piece of the rock” (Prudential) in their briefcase.
John Silliman's strategy worked, since most of the people in John's group stayed with Aetna for a much longer time than the average “applications programmer”, sometimes moving from division to division as needs for systems programmers were determined.
I did not know IBM assembler when I arrived at Aetna. I had purchased a book called “Basic Assembly Language for the IBM 360 Computer” (or something like that) and had started to read it. When I arrived at Aetna, John gave me a week or two to finish learning BAL (as it was called), then he gave me my first assignment.
A program that another programmer had written was demonstrating some odd characteristics. Not only did his program run slowly, but when it started to run every other program on the system ran a LOT slower, and every other program on every computer system even remotely connected with that mainframe ran slower.
The programmer who had written this was a more senior person, and my immediate manager cautioned me not to let him know that a “wet-behind-the-ears college student” was looking at his program. I was not allowed to ask him any questions, or let him know I was looking at “his” program. This was the direct opposite of “ego-less programming”.
As I started to look at the program, I realized it was written in five different languages. Part was FORTRAN, part was COBOL, part was APL, part was ALGOL and part was assembler language. I think part may even have been SNOBOL with some APL thrown in, but those languages were too painful to think about, so my mind has rejected that possibility over the years.
We had some fairly amazing tools even in those early years. Aetna was the largest commercial user of IBM equipment in the free world, and even in 1973 we had massive investments in both hardware and software. We had 500,000 nine-track tapes in our tape library on site, with another 100,000 tapes in a salt mine in Idaho. We automatically ordered two of everything that IBM announced, tape drive to mainframe. No salesperson had to call us......two units just showed up on our shipping docks.
One investment was in a piece of software that could profile a binary and find out where the computer was spending its time. From this piece of code I found that 99.999 percent of the time was being used by one machine language instruction....in one location in memory.
Curious as to what this instruction did, I pulled out the manual “The IBM 360 Principles of Operation” (POA) which told what every instruction in the system did, how much space it took up in memory, how long it took to execute and many other things. In fact the POA told everything about the hardware that a programmer would want to know.
I found out that the instruction was “Read The Clock”. Finding it odd that such a simple instruction would have such a big impact, I continued to read, and found out that this particular “Read the Clock” instruction was also used for synchronizing multiprocessor applications, and for setting up semaphores inside of the operating system. Before the program could “read the clock”, the entire system had to come to a grinding halt. All the I/O had to be effectively complete, all the other CPUs had to be halted, all the cache had to be flushed, on every system that was in any way linked to the system where the instruction was being executed. Then you could “read the clock”, and continue with your program. "Read The Clock" was an "Atomic Operation", one that could not be interrupted, and therefore would return one unique value. After that caches would have to re-fill, tape drives had to start up again, and buffers fill in core (yes, core memory). You might guess this was an “expensive” instruction, and could only be executed by a program running in a privileged mode.
My friend's program was running in that privileged mode. Those were heady times.....
I found out that the programmer was “reading the clock” for every one of the hundreds of thousands of records he was reading off the magnetic input tapes (the tapes themselves were a big bottleneck, but we will get to that later).
I forget how I managed to casually engage the programmer in conversation about his program. Perhaps it was at the lunch table, with a feigned interest in his career, or his life, but eventually I found out that he was reading this clock tens of thousands of times a day just to find out the date of execution of the program...the date it was running. I casually suggested that perhaps he could move that instruction outside the tape reading loop, do it one time at the beginning of the program and store the result in a variable. “Oh yeah, that is a good idea”, he said.
The next time the program ran, it was finished so fast the operators thought it had crashed without processing anything. No other programs on any of the systems showed any signs of slowing down.
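In today's terms, the fix was to hoist a loop-invariant call out of the loop. Here is a minimal Python sketch of the idea. The `read_clock` function is a hypothetical stand-in for the expensive instruction (it only returns today's date and counts its own calls), and the record names are invented for illustration:

```python
import datetime

clock_reads = 0  # counts how often the (hypothetical) expensive call is made


def read_clock():
    """Stand-in for the expensive instruction; it just returns today's date."""
    global clock_reads
    clock_reads += 1
    return datetime.date.today()


def report_slow(records):
    # Original shape: the clock is read once for every record.
    return [(read_clock(), rec) for rec in records]


def report_fast(records):
    # Fixed shape: read the clock once and keep the result in a variable.
    run_date = read_clock()
    return [(run_date, rec) for rec in records]


records = ["claim-%d" % i for i in range(1000)]  # made-up input records

clock_reads = 0
report_slow(records)
slow_reads = clock_reads

clock_reads = 0
report_fast(records)
fast_reads = clock_reads

print(slow_reads, fast_reads)
```

The two versions produce the same report; the only difference is how many times the expensive call is made, which is what mattered on that mainframe.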
I also asked this older programmer why he wrote the program in so many different languages. It turns out he was studying for his Master's Degree at night, and that month he had a course in “comparative languages”, so as they studied each language over the course of 11 weeks, he used that language in that part of the report. I congratulated him for creating such a good “job security” program, because as far as I knew there were only two programmers in Aetna's staff of 400 programmers that knew all five languages, and both of them were sitting at that lunch table.
I later reported to my boss what I had found, and that programmer was not long on Aetna's payroll.
To be fair, the “knowledge” of machine/assembly language worked against this programmer. If he had coded his program in a high-level language I am fairly sure the compiler would not have chosen that instruction to find out the “date of execution”. There would have been a library routine or intrinsic that would have provided the same information without having such a great impact. On the other hand, without my knowledge of machine language I probably would not have found the answer.
Another time I had a COBOL applications programmer come to me. He had looked his program over about a thousand times and could not find what was wrong. I looked at his COBOL code, and I could not see what was wrong either.
“It must be the compiler”, I said. “The COMPILER,” the applications programmer exclaimed, “how could that be?” I gently explained that the compiler is a program, just like his programs, written by the same type of people who go out on a given night, drink a lot of beer, and sometimes are not in the best shape the next day. Just like his programs, sometimes the compiler creates an error.
So we turned on the compiler flag that generated the assembly language code for the COBOL program, and sure enough, we could see the incorrect code being generated. I think the COBOL programmer drank a little more beer than normal that night, as one of his unshakable truths had just been shaken.
The final story to this blog post came from a good friend of mine who will remain nameless in case he is still alive. He was a very good assembly language programmer, and he had been asked to develop a routine that would access data in Aetna's database and allow the COBOL programmers to update it quickly. To do this he used a VERY large array with an index into it.
Where he made his mistake was specifying the index of the array as a COBOL “Picture 9” field.
In IBM COBOL an index is normally a binary number, and you would specify it and its size for storage by how many “9”s in the field definition. Unfortunately if that is where you stop, the number is stored as EBCDIC, a code that is somewhat akin to ASCII, and is used to map characters to bit strings. IBM assembler had four instructions dedicated to EBCDIC to binary conversion: EBCDIC to packed decimal, then packed decimal to binary and the reverse. Unfortunately those four instructions were some of the slowest in the entire architecture.
If my friend had defined his number as “Picture 999 COMP”, the number would have been passed and stored as a binary number. But defining it as “Picture 999” it was always passed and stored as an EBCDIC number. Therefore every time it had to be used as an index it had to have two of the slowest machine instructions act on it, then it would be used, incremented, and converted back again for storage (two more of some of the slowest instructions). This happened literally TENS OF MILLIONS of times every day.
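To make the cost concrete, here is a rough Python model of the round trip a “Picture 999” index takes every time it is incremented. It is only a sketch: the function names are made up for illustration, sign handling is simplified to an assumed positive sign nibble, and Python's `cp037` codec is used as one common EBCDIC code page. On the real hardware each step would be a single (slow) machine instruction rather than a function.

```python
def _pack_nibbles(digits):
    """Pack decimal digits plus a trailing positive-sign nibble (0xC) into bytes."""
    nibbles = list(digits) + [0x0C]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad on the left to a whole number of bytes
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def _unpack_digits(packed):
    """Return the decimal digits of a packed field, dropping the sign nibble."""
    nibbles = []
    for byte in packed:
        nibbles += [byte >> 4, byte & 0x0F]
    return nibbles[:-1]


def zoned_to_packed(zoned):
    # EBCDIC digits are 0xF0..0xF9, so the low nibble is the digit itself.
    return _pack_nibbles(b & 0x0F for b in zoned)


def packed_to_binary(packed):
    value = 0
    for digit in _unpack_digits(packed):
        value = value * 10 + digit
    return value


def binary_to_packed(value, ndigits):
    # Keep only the low-order digits, as a fixed-width field would.
    text = str(value).zfill(ndigits)[-ndigits:]
    return _pack_nibbles(int(c) for c in text)


def packed_to_zoned(packed, ndigits):
    return bytes(0xF0 | d for d in _unpack_digits(packed)[-ndigits:])


def bump_display_index(zoned):
    """Increment an index stored as EBCDIC digits: four conversions per use."""
    n = packed_to_binary(zoned_to_packed(zoned))
    n += 1
    return packed_to_zoned(binary_to_packed(n, len(zoned)), len(zoned))


index = "123".encode("cp037")          # how "Picture 999" would hold 123
bumped = bump_display_index(index)
print(index.hex(), "->", bumped.hex(), "=", bumped.decode("cp037"))
```

With a binary (“COMP”) field, the whole of `bump_display_index` collapses to `n += 1`.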
When I realized what the programmer had done, I went over to his desk and discussed it with him. He had realized his mistake, but it was too late since the API for the routine had been published, and the COBOL programmers were happily coding their subroutine calls.
In the end it was not as bad as “read the clock”, but not far from it, and was definitely not really what the company wanted. My lasting memory of this person (an otherwise great programmer and great human being) was with his head between his hands looking down at his listing....
There are other examples of why it is good to learn machine/assembly language. Coding in a high-level language, then inspecting the output of the compiler for what instructions were generated will give you an idea of the efficiency of your algorithms and your coding style.
You can also use your knowledge of assembler and machine architecture to make sense of the command-line options on gcc (and other) optimizing compilers.
In the near future I will write some examples of how knowledge of machine architecture made some programs run 10,000 (or more) times faster, and cut storage considerably.
In the meantime I am sure that there will be many readers that will contribute their own stories of how machine architecture and assembly language knowledge helped them solve problems in the past in the comment section. I will probably enjoy reading them.