The ARM architecture – yesterday, today, and tomorrow
ARM Architecture
Like the x86 architecture, the ARM architecture has been extended over time to meet new requirements. Each version of the architecture defines which of these extensions are mandatory and which are optional. The extensions might not be as varied as with the x86, but the scope is still large (Figure 1), which is why this article focuses on the hitherto most-used architectures: ARMv4 through ARMv7. ARMv8, which you'll learn about later in this article, differs significantly from its predecessors with an extension to 64 bits.
32 Bits for Everything
The ARM architecture was designed from the beginning as a 32-bit architecture, which shows in particular in the 32-bit processing width and the 32-bit address space (from ARMv3 onward; 26-bit addressing before this). An ARM core thus addresses a maximum of 4GB of memory, although most implementations actually use only a part of it. Only the Cortex-A15 avoids this limit with a few tricks.
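The address-space figures above follow directly from the address width. The short Python sketch below works them out; the 40-bit entry is an assumption added for illustration (the width commonly associated with extended physical addressing on later cores) and is not taken from the text, and the function names are hypothetical.

```python
# Address-space sizes for the address widths discussed above.
# The 40-bit entry is an illustrative assumption, not from the article.

def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given width."""
    return 1 << address_bits

def human(n: int) -> str:
    """Format a power-of-two byte count with binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"

for bits in (26, 32, 40):
    print(f"{bits}-bit addresses: {human(address_space_bytes(bits))}")
```

The 32-bit case gives the 4GB limit mentioned above; the 26-bit case shows how small the address space of the earliest architecture versions was.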
As with other processor architectures, ARM has several processor modes. Ordinary programs run in user mode, whereas privileged modes such as system mode are reserved for operating system code. Other modes handle exceptions and, starting with ARMv7, hardware-assisted virtualization. One special feature of the ARM architecture is that the exception modes have their own copies of some registers, which the processor switches in automatically on a mode change. ARM systems can thus handle interrupts very efficiently.
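The register remapping can be pictured with a small model. The Python sketch below assumes the classic mode set and bank layout (ARMv4 through ARMv6, without the modes added later); the names `BANKED`, `physical_register`, and `count_general_registers` are hypothetical and chosen only for this illustration.

```python
# Simplified model of mode-specific ("banked") registers in classic ARM.
# Assumption: the commonly documented ARMv4-ARMv6 layout; later modes
# and the status registers are left out.

MODES = ("usr", "sys", "fiq", "irq", "svc", "abt", "und")

# Registers that each mode replaces with a private copy.
BANKED = {
    "usr": set(),
    "sys": set(),               # system mode shares all registers with user mode
    "fiq": set(range(8, 15)),   # R8 to R14
    "irq": {13, 14},
    "svc": {13, 14},
    "abt": {13, 14},
    "und": {13, 14},
}

def physical_register(mode: str, reg: int) -> tuple:
    """Map a visible register number (0-15) to the physical copy used in a mode."""
    if not 0 <= reg <= 15:
        raise ValueError("only R0 to R15 are directly addressable")
    bank = mode if reg in BANKED[mode] else "usr"
    return (bank, reg)

def count_general_registers() -> int:
    """Count the distinct physical registers reachable as R0-R15 in any mode."""
    return len({physical_register(m, r) for m in MODES for r in range(16)})

print(physical_register("usr", 13))   # user-mode stack pointer
print(physical_register("irq", 13))   # IRQ mode sees its own stack pointer
print(physical_register("irq", 0))    # R0 is shared with user mode
print(count_general_registers())
```

Because an interrupt handler gets its own stack pointer and link register simply by entering its mode, it does not have to save and restore those registers first. In this simplified model, the seven modes share 31 general-purpose registers.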
Most implementations also have an MMU (Memory Management Unit) for virtual memory and memory protection, but some only have an MPU (Memory Protection Unit), which provides memory protection alone. Some very simple microcontrollers do without both.
The main difference between x86 and ARM is that ARM is a RISC architecture (see the "RISC" box), whereas the x86 is a member of the CISC (Complex Instruction Set Computer) family. ARM leverages the RISC concept to the max; it is a load-store architecture with a relatively large number of registers and a very small number of instructions.
In combination with the restricted addressing modes, this means that all instructions can be encoded with exactly 32 bits (i.e., one word) and aligned on word boundaries. The instruction decoder can thus be designed very simply: All it has to do for an instruction is read a word from memory and then decode it.
With an x86, the overhead is far greater because instructions here have lengths of between 1 and 15 bytes (prefixes and instruction set extensions add to the length). The processor thus has to work out from the start of an instruction how long it will be; alignment on word boundaries is not possible.
Current x86 implementations solve this problem by breaking down the complex x86 instructions into simple RISC-like instructions (known as micro-operations). This step is not necessary for an ARM processor, which reduces both the hardware overhead and energy consumption.
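To illustrate how little work a fixed-width encoding demands, the following Python sketch fetches 32-bit words and splits one class of instruction (data processing) into its fields. The bit positions follow the classic 32-bit ARM encoding; the function names and dictionary keys are hypothetical, and the sketch assumes the word really is a data-processing instruction rather than checking for it.

```python
# Fetching and decoding fixed-width instructions: every instruction is one
# aligned 32-bit word, so instruction n lives at byte offset 4 * n.
# Assumption: the word is a classic ARM data-processing instruction.

def fetch(memory: bytes, index: int) -> int:
    """Fetch instruction number `index` from little-endian memory."""
    return int.from_bytes(memory[4 * index: 4 * index + 4], "little")

def decode_data_processing(word: int) -> dict:
    """Split a 32-bit data-processing instruction into its fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction must fit in 32 bits")
    return {
        "cond":      (word >> 28) & 0xF,   # condition field
        "immediate": (word >> 25) & 0x1,   # 1 = operand2 is an immediate
        "opcode":    (word >> 21) & 0xF,   # 0b0100 = ADD
        "set_flags": (word >> 20) & 0x1,   # S bit
        "rn":        (word >> 16) & 0xF,   # first operand register
        "rd":        (word >> 12) & 0xF,   # destination register
        "operand2":  word & 0xFFF,         # second operand
    }

# ADD r0, r1, r2 (0xE0810002) followed by ADDEQ r0, r1, r2 (0x00810002)
program = bytes.fromhex("020081e0" "02008100")

for i in range(len(program) // 4):
    print(decode_data_processing(fetch(program, i)))
```

The decoder never has to inspect an instruction to find out where the next one starts, which is exactly the step a variable-length instruction set cannot skip.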
Conditional Instructions
The mode-specific registers lead to the need for a large number of registers (about 40), but in any mode, only 16 registers (R0 to R15) can be addressed directly (Figure 2). The programmer can use registers R0 to R12 freely, whereas R13 acts as the stack pointer in most cases, R14 is used as the link register for storing the return address for procedure calls, and R15 is the program counter, which the processor can also access directly, just like any other register.
The instruction set avoids unnecessary redundancy, providing standard instructions for arithmetic and logical operations, memory access, program flow control, exception handling, controlling the various modes, and accessing coprocessors. The instructions themselves are no different from those of other architectures, so in this article, we just highlight a few features.
Unlike most other architectures, in which only branch instructions can execute as a function of conditions, almost any ARM instruction is conditional. To allow this to happen, the instruction encoding contains a 4-bit condition field that specifies which combination of the status flags (negative, zero, carry, overflow) must be present for the instruction to execute, allowing for very compact code and avoiding jumps (Listing 2).
Listing 2
Conditional Execution
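As a rough illustration of the idea (a sketch, not the article's original listing), the following Python code models how condition mnemonics are evaluated against the four flags and applies them to the well-known greatest-common-divisor loop, which needs only one branch. The function names are hypothetical, and the inputs are assumed to be positive integers below 2^31.

```python
# Condition evaluation on the N, Z, C, V flags, applied to this loop:
#
#   gcd:  CMP   r0, r1
#         SUBGT r0, r0, r1
#         SUBLT r1, r1, r0
#         BNE   gcd

MASK = 0xFFFFFFFF

def cmp_flags(a: int, b: int) -> dict:
    """Flags produced by CMP a, b (a 32-bit subtraction a - b)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "N": result >> 31,                    # result negative
        "Z": int(result == 0),                # result zero
        "C": int(a >= b),                     # no borrow occurred
        "V": ((a ^ b) & (a ^ result)) >> 31,  # signed overflow
    }

def condition_passed(cond: str, f: dict) -> bool:
    """Evaluate a condition mnemonic against the current flags."""
    table = {
        "EQ": f["Z"] == 1,
        "NE": f["Z"] == 0,
        "GE": f["N"] == f["V"],
        "LT": f["N"] != f["V"],
        "GT": f["Z"] == 0 and f["N"] == f["V"],
        "LE": f["Z"] == 1 or f["N"] != f["V"],
        "AL": True,
    }
    return table[cond]

def gcd(r0: int, r1: int) -> int:
    """Run the four-instruction loop above on two positive integers."""
    if not (0 < r0 < 2**31 and 0 < r1 < 2**31):
        raise ValueError("sketch assumes positive 31-bit inputs")
    while True:
        flags = cmp_flags(r0, r1)              # CMP   r0, r1
        if condition_passed("GT", flags):      # SUBGT r0, r0, r1
            r0 = (r0 - r1) & MASK
        if condition_passed("LT", flags):      # SUBLT r1, r1, r0
            r1 = (r1 - r0) & MASK
        if not condition_passed("NE", flags):  # BNE   gcd
            return r0

print(gcd(48, 18))
```

Only a subset of the condition mnemonics is modeled here. The two subtractions do not update the flags, so both are judged against the same comparison, and at most one of them takes effect per pass through the loop.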
In addition to the load-store instructions that transfer a single word between memory and a register, load-store multiple instructions transfer a series of contiguous words between memory and a set of registers. This means that the processor can load a small array into registers with one instruction. This approach also lends itself to very effective use of the stack because a single instruction can store or read multiple registers at once. This is of particular interest in the programming of interrupt handlers or context switching in an operating system, because the entire register set can be replaced with just two instructions.
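The stack usage described above can be sketched in a few lines of Python. The model assumes a full descending stack, that is, the behavior of a store-multiple that decrements the stack pointer before storing and a load-multiple that increments it afterward; memory is a plain dictionary keyed by word address, and all names are hypothetical.

```python
# Simplified model of store-multiple / load-multiple on a full descending
# stack. Assumption: the stack pointer itself is not in the register list.

SP = 13  # R13 acts as the stack pointer

def store_multiple(regs: list, memory: dict, reglist: list) -> None:
    """Push the listed registers; the lowest register lands at the lowest address."""
    if SP in reglist:
        raise ValueError("sketch does not model storing the stack pointer")
    addr = regs[SP] - 4 * len(reglist)
    regs[SP] = addr                    # write the new stack pointer back
    for r in sorted(reglist):
        memory[addr] = regs[r]
        addr += 4

def load_multiple(regs: list, memory: dict, reglist: list) -> None:
    """Pop the listed registers in ascending address order."""
    if SP in reglist:
        raise ValueError("sketch does not model loading the stack pointer")
    addr = regs[SP]
    for r in sorted(reglist):
        regs[r] = memory[addr]
        addr += 4
    regs[SP] = addr                    # write the new stack pointer back

regs = [0] * 16
memory = {}
regs[SP] = 0x8000
regs[4], regs[5], regs[6] = 11, 22, 33

store_multiple(regs, memory, [4, 5, 6])   # one instruction saves three registers
regs[4] = regs[5] = regs[6] = 0           # the handler overwrites them
load_multiple(regs, memory, [4, 5, 6])    # one instruction restores them

print(regs[4], regs[5], regs[6], hex(regs[SP]))
```

Saving and restoring a working set thus costs one instruction each way, regardless of how many registers are in the list.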