Facebook releases its own OOM implementation
Contract Killer
When a Linux system runs out of memory, a special agent, the out-of-memory killer, rushes to its aid. Facebook has now introduced its own OOM killer. What makes it different from its kernel-based counterpart? And what is an OOM killer really?
If you have not placed an order for a large server for a long time, you will probably rub your eyes in amazement the next time you order a new device: Configurations with terabytes instead of gigabytes of RAM are easy to get, and you don't need to be a millionaire to buy them. Gone are the days when people were proud of every single gigabyte (Figure 1).
Some buyers don't even worry about RAM anymore and just assume the system will have enough; however, this might be a little too optimistic, even on a modern system. Servers still sometimes come up short on RAM, and when they do, it can have dramatic consequences: If a component such as systemd needs RAM and cannot allocate it, the system will malfunction or stop working. To prevent a RAM shortage from bringing computers to their knees, the Linux kernel has a watchdog on board: the out-of-memory killer, or OOM killer for short. In an emergency, the OOM killer frees up memory by shooting down processes in a targeted way; the memory is then available for other, presumably more important purposes.
Many legends and horror stories center on the OOM killer, and admins rarely see the funny side when kernel messages in the log report that the killer has struck again (Figure 2). The reason for the anxiety is that large applications, such as Java, are the OOM killer's preferred victims.
Java is not famed for being very sparing with resources, but it is usually necessary for running the application for which the server exists. If the OOM killer shoots down Java on a Tomcat system, a load balancer usually catches the problem, but the server taken out in this way is still gone at the end of the day.
This article introduces the current OOM implementation in Linux and explains how it works. I will then compare this standard implementation with an alternative approach chosen by Facebook.
How OOM Situations Occur
Even servers with huge amounts of RAM can get into situations where the available system RAM is not sufficient. This is because the Linux kernel goes to some lengths to put the installed memory to use as efficiently as possible. If you have ever called top and looked at its RAM statistics, you will have noticed that even on systems with plenty of RAM and very little load, the RAM utilization figure is often close to the 100 percent mark (Figure 3).
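If you want to see where the seemingly occupied RAM actually goes, a look at /proc/meminfo is more revealing than the summary line in top. The small C program below is only a minimal sketch, assuming nothing more than a Linux system with /proc mounted: It prints the MemTotal, MemFree, and MemAvailable lines. MemFree is usually tiny even on an idle machine, because the kernel puts otherwise unused RAM to work as cache, whereas MemAvailable estimates how much memory could really be handed out to new programs.

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* /proc/meminfo is a virtual file the kernel fills on every read */
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        perror("fopen /proc/meminfo");
        return 1;
    }

    char line[256];
    while (fgets(line, sizeof(line), f)) {
        /* print only the three lines relevant to the discussion */
        if (strncmp(line, "MemTotal:", 9) == 0 ||
            strncmp(line, "MemFree:", 8) == 0 ||
            strncmp(line, "MemAvailable:", 13) == 0)
            fputs(line, stdout);
    }

    fclose(f);
    return 0;
}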
The Linux kernel is the interface between the hardware on one side and the programs on the other. If a program wants memory, it asks for it with a library call such as malloc(), which in turn relies on system calls like brk() or mmap(). However, it would take far too long if the kernel first had to search for free memory and only then make the requested amount available.

Instead, the kernel plans ahead: It divides the entire available memory into segments, known as memory pages, and keeps track of which pages are already assigned to running programs and which are still free. If a program now comes along and wants RAM, the kernel simply assigns it pages from the list of free pages. Because memory pages come in more than one size (the usual 4KB pages plus much larger huge pages), the kernel also has a certain degree of flexibility and can ensure that not too much memory is wasted.
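The effect of this lazy assignment is easy to observe. The following C sketch is purely illustrative (the 512MB buffer size and the sleep pauses are arbitrary choices): malloc() succeeds immediately, but the resident set size (RSS) shown by top only grows once the program actually writes to the buffer and the kernel has to back the virtual pages with physical ones.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    size_t size = 512UL * 1024 * 1024;      /* request 512MB */

    char *buf = malloc(size);               /* virtual memory only for now */
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    printf("Allocated %zu MB; RSS has barely changed\n", size >> 20);
    sleep(10);                              /* check RSS in top */

    memset(buf, 0x42, size);                /* writing forces the kernel to
                                               back every page with real RAM */
    printf("Touched every page; RSS now covers the whole buffer\n");
    sleep(10);                              /* compare RSS in top again */

    free(buf);
    return 0;
}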
Waste Is Bad
It is important to avoid waste to the greatest extent possible. Even if you have a seemingly arbitrary amount of RAM at your disposal, you will still want to use it as efficiently as possible. For many years, the Linux kernel has therefore supported a feature that many admins consider equivalent to opening the proverbial Pandora's box: overbooking RAM.
Roughly speaking, it works like this: The kernel assigns memory pages to requesting programs as usual, but in total it hands out more than the physically installed RAM could actually back. This alone does not cause OOM problems; they are caused by programs that actually use too much RAM.
However, RAM overcommitment increases the risk of OOM situations because the kernel does not rigorously deal with potential difficulties in advance. If Linux did not allow applications to allocate more memory than actually exists, crashes due to a lack of memory would be unthinkable because applications would simply see an error message when they tried to claim more memory than available.
The Linux approach is different: The kernel speculates that allocated memory will never be fully used. The vm.overcommit_memory sysctl variable manages everything else. If it is set to 0, which is the default value, the kernel uses a heuristic approach to calculate how much RAM is actually free and sets this in relation to the memory that a requesting application wants to have. If the calculation works out, the program gets the memory, even if the total amount of allocated memory grows larger than the memory actually present in the system.
Setting vm.overcommit_memory=1 makes the kernel even more radical: In this case, it skips the heuristic analysis and approves every request for RAM. A value of 2, on the other hand, effectively switches RAM overbooking off: The kernel then rejects any allocation that would push the total committed memory beyond a fixed limit, made up of the swap space plus a configurable percentage of the physical RAM.
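How generous the kernel is can be demonstrated with a few lines of C. The sketch below is only an illustration (the 1GB chunk size and the 1TB upper bound are arbitrary): It keeps allocating memory without ever writing to it. With vm.overcommit_memory set to 0 or 1, the loop typically runs far past the amount of RAM physically installed in the machine; with a value of 2, malloc() starts returning NULL much earlier. The setting itself can be changed at run time with sysctl or by writing to /proc/sys/vm/overcommit_memory.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t chunk = 1024UL * 1024 * 1024;    /* 1GB per request */
    size_t total = 0;

    for (int i = 0; i < 1024; i++) {        /* at most 1TB of requests */
        void *p = malloc(chunk);            /* allocated but never written */
        if (p == NULL) {
            printf("malloc() failed after %zu GB\n", total >> 30);
            return 0;
        }
        total += chunk;
    }

    printf("Received %zu GB of purely virtual memory\n", total >> 30);
    return 0;
}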
What Really Helps
If you think that it is sufficient to deactivate RAM overbooking on the basis of the previous explanations, you are wrong. The OOM problem is not caused by overbooking RAM, but by programs that keep allocating too much RAM, and unfortunately, they usually do this unpredictably and for a variety of reasons. Often the root of the problem is simply a programming error, such as a memory leak, that causes the affected program to eat up more and more RAM. Occasionally, it actually happens that a system genuinely needs more RAM than is available to process incoming requests.
If you are confronted with OOM situations, you should first try very carefully to find the cause. If the emergency is not based on a programming error and the OOM situations occur regularly and reproducibly, the long-term solution can only be more hardware. You can either put more RAM into the affected servers or scale the setup horizontally.
If you are dealing with a programming error, it is a good idea to find it and repair it – in collaboration with the developers if necessary. Troubleshooting in such cases can be tough and time consuming. But if OOM problems occur after an update where there were none before, a bug is most likely the trigger.