The ARM architecture – yesterday, today, and tomorrow
ARM Architecture
Like the x86 architecture, the ARM architecture has been extended over time to meet new requirements. Each version of the architecture defines which of these extensions are mandatory and which are optional. The extensions might not be as varied as with the x86, but the scope is still large (Figure 1), which is why this article focuses on the hitherto most-used architectures: ARMv4 through ARMv7. ARMv8, which you'll learn about later in this article, differs significantly from its predecessors with an extension to 64 bits.
32 Bits for Everything
The ARM architecture was designed from the beginning as a 32-bit architecture, which is expressed in particular in the 32-bit processing width and 32-bit address space (from ARMv3 onward; 26-bit addressing before this). An ARM core thus addresses a maximum of 4GB of memory, although most implementations actually use only a part of it. Only the Cortex A15 avoids this limit with a few tricks.
As with other processor architectures, ARM has several processor modes. Ordinary programs run in user mode, and system mode is reserved for privileged operating system code. Some modes also handle exceptions and, starting with ARMv7, hardware-assisted virtualization. One special feature of the ARM architecture is that each mode has its specific registers, which the system automatically remaps during a mode change. ARM systems thus implement interrupts in a very efficient way.
Most implementations also have an MMU (Memory Management Unit) for storage virtualization and memory protection, but some only have an MPU (Memory Protection Unit) for the implementation of memory protection. Some very simple microcontrollers do without both.
The main difference between x86 and ARM is that ARM is a RISC architecture (see the "RISC" box), whereas the x86 is a member of the CISC (Complex Instruction Set Computer) family. ARM leverages the RISC concept to the max; it is a load-store architecture with a relatively large number of registers and a very small number of commands.
In combination with the restricted addressing modes, this means that all commands can be encoded with exactly 32 bits (i.e., one word) and aligned with word boundaries. The instruction decoder can thus be designed very simply: All it has to do for a command is read a word from memory and then decode it.
With an x86, the overhead is far greater because commands here have lengths of between 1 and 15 bytes (some even have instruction set extensions). The processor thus has to decide, depending on the start of a command, how long the command will be – alignment at word boundaries is not possible.
Current x86 implementations solve this problem by breaking down the complex x86 instructions into simple RISC instructions (known as micro-operations). These steps are not necessary for an ARM processor, which reduces both the hardware overhead and energy consumption.
Conditional Instructions
The mode-specific registers lead to the need for a large number of registers (about 40), but in any mode, only 16 registers (R0 to R15) can be addressed directly (Figure 2). The programmer can use registers R0 to R12 freely, whereas R13 acts as the stack pointer in most cases, R14 is used as the link register for storing the return address for procedure calls, and R15 is the program counter, which the processor can also access directly, just like any other register.
The instruction set avoids unnecessary redundancy, providing standard instructions for arithmetic and logical operations, memory access, program flow control, exception handling, controlling the various modes, and accessing coprocessors. The instructions themselves are no different from those of other architectures, so in this article, we just highlight a few features.
Unlike most other architectures, in which only branch instructions allow execution as a function of conditions, almost any ARM instruction is conditional. To allow this to happen, the command code uses a 4-bit mask to specify which conditions (negative, zero, carry, overflow) must be met for execution, allowing for very compact code and avoiding jumps (Listing 2).
Listing 2
Conditional Execution
In addition to the load-store instructions that allow a memory word to be transferred between memory and the registers, load-store multiple instructions allow a series of contiguous words in memory between a set of registers. This means that the processor can write a small variable field to the registers with one command. This approach also lends itself to very effective use of the stack because it can store or read multiple registers at once. This is of particular interest in the programming of interrupt handlers or context switching in an operating system, since the entire register set can be replaced with just two commands.
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Thousands of Linux Servers Infected with Stealth Malware Since 2021
Perfctl is capable of remaining undetected, which makes it dangerous and hard to mitigate.
-
Halcyon Creates Anti-Ransomware Protection for Linux
As more Linux systems are targeted by ransomware, Halcyon is stepping up its protection.
-
Valve and Arch Linux Announce Collaboration
Valve and Arch have come together for two projects that will have a serious impact on the Linux distribution.
-
Hacker Successfully Runs Linux on a CPU from the Early ‘70s
From the office of "Look what I can do," Dmitry Grinberg was able to get Linux running on a processor that was created in 1971.
-
OSI and LPI Form Strategic Alliance
With a goal of strengthening Linux and open source communities, this new alliance aims to nurture the growth of more highly skilled professionals.
-
Fedora 41 Beta Available with Some Interesting Additions
If you're a Fedora fan, you'll be excited to hear the beta version of the latest release is now available for testing and includes plenty of updates.
-
AlmaLinux Unveils New Hardware Certification Process
The AlmaLinux Hardware Certification Program run by the Certification Special Interest Group (SIG) aims to ensure seamless compatibility between AlmaLinux and a wide range of hardware configurations.
-
Wind River Introduces eLxr Pro Linux Solution
eLxr Pro offers an end-to-end Linux solution backed by expert commercial support.
-
Juno Tab 3 Launches with Ubuntu 24.04
Anyone looking for a full-blown Linux tablet need look no further. Juno has released the Tab 3.
-
New KDE Slimbook Plasma Available for Preorder
Powered by an AMD Ryzen CPU, the latest KDE Slimbook laptop is powerful enough for local AI tasks.