In the battle for web visitors, milliseconds count. A few simple changes will help your site stay popular.
Four seconds might not be a long time, but it seems like forever when you are waiting for a page to load. If you are running a commercial website and users perceive your site as slow, you might just lose a customer. With the expectation of fast load times, website administrators know that every second (or fraction of a second) counts. If you look closely at your web server environment, you might just find that half the average wait time is unnecessary. But tuning is often a matter of trade-offs: performance improvements sometimes require sacrifice in other areas. Before you make any changes, be sure you understand the implications.
Web server performance is often measured by three common indicators:
- the number of requests served per second;
- site throughput in bytes per second;
- the response latency for each new request (including new connections).
These factors interact in complex ways to produce the performance you experience at your website. It isn't possible to cover every issue affecting web server performance here, but I will highlight some tips for a faster Apache.
The Front Line
The most common Apache default configuration uses the prefork Multi-Processing Module (MPM), which implements a non-threaded, "pre-forking" web server. The prefork MPM is useful on sites that need to avoid threading because they use non-thread-safe libraries, as well as where you need to ensure that individual requests do not interfere with each other. If you run ps on your system, you will see several processes running a command like /usr/sbin/httpd2-prefork, which indicates the pre-fork version. Pre-fork means that server processes are started before they are actually needed.
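A quick way to check this on a running system is to count the processes per Apache binary name. The sketch below is only an illustration: the helper name count_apache_procs is made up for this example, and the pattern assumes the binary name contains httpd or apache2, which varies by distribution.

```shell
# count_apache_procs (hypothetical helper): read command names on stdin,
# one per line, and print how many processes run each Apache binary.
count_apache_procs() {
    awk '/httpd|apache2/ { n[$1]++ } END { for (c in n) print n[c], c }'
}

# ps -eo comm= prints the command name of every process without a header.
ps -eo comm= | count_apache_procs
```

On a prefork system like the one described here, you would expect a single line of output: the process count followed by a name such as httpd2-prefork.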
The big four Apache performance tuning directives are StartServers, MinSpareServers, MaxSpareServers, and MaxClients. The number of spare processes is defined by the values MaxSpareServers and MinSpareServers, and the number to start is defined by StartServers.
Usually you do not need to change these settings because the prefork MPM is self-regulating. However, it is important to set the maximum number of clients (MaxClients) high enough to cover the number of concurrent requests you expect. Keep in mind, though, that you risk running out of memory if you set this value too high.
During normal operation, Apache has a single control process that is responsible for starting the child processes that serve the remote requests. Because starting a child process takes time, often it is useful to have a number of them already running and waiting for new connections so visitors don't have to wait for a new process to start. These servers that are running but not currently servicing clients are called spares, and a couple of directives configure them.
Some distributions make things a little easier by collecting the most common performance-related directives in a single file; SUSE-based systems, for example, use /etc/apache2/server-tuning.conf.
If you have a lot of requests, you might be tempted to change the values of these directives. For example, you might want to increase the value of StartServers so that more server processes are running immediately after startup. Beyond the initial StartServers processes, Apache adds children gradually: It spawns one, waits a second, spawns two, and keeps doubling the rate up to a fixed limit, so a busy server needs a short ramp-up period after a restart. If you need to restart Apache often enough that this becomes an issue, you probably have other problems that are affecting your server more than the number of Apache processes. For the most part, the default settings are fine. If, however, you want the wait for your users right after a restart to be as short as possible, set StartServers high enough that most of the processes you need start immediately.
The MinSpareServers and MaxSpareServers directives define the minimum and maximum number of spare servers, respectively, that should be running. As connections are created and dropped, Apache starts and stops servers as needed to stay within these values. Here, too, you rarely need to change the defaults. If your server needs to handle more than the default 256 simultaneous connections, however, you will need to increase MaxClients.
Note that if you increase MaxClients, you might need to raise ServerLimit as well. ServerLimit sets the upper bound that MaxClients can reach during the lifetime of the Apache process. Whereas you can change MaxClients and gracefully restart Apache, you cannot change ServerLimit without actually stopping and restarting the Apache process.
So, if a lot of simultaneous requests are causing problems and MaxClients is still below ServerLimit, you can raise MaxClients without stopping Apache completely. If you set it higher than ServerLimit, however, Apache reduces MaxClients to the value of ServerLimit.
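As a rough sketch of how these directives fit together, a prefork section might look like the following. The values are illustrative only (they mirror commonly shipped defaults, including the limit of 256 mentioned above) and are not recommendations; the file that holds the block depends on your distribution.

```apache
<IfModule prefork.c>
    # Processes created at startup
    StartServers          5
    # Idle (spare) processes Apache tries to keep available
    MinSpareServers       5
    MaxSpareServers      10
    # Upper bound for MaxClients; changing it requires a full stop and start
    ServerLimit         256
    # Maximum number of requests served simultaneously
    MaxClients          256
</IfModule>
```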
Memory
The biggest problem you are likely to encounter is not enough RAM. Particularly with web servers for which even slight delays affect the user's "experience," you should avoid swapping at all costs. If half the time of each request is spent loading existing processes back in from swap space, it is definitely time to consider more RAM.
As a baseline, measure the amount of memory in use without Apache running. (Use a tool like top.) Subtracting this amount from your total memory gives you the approximate amount you have for the Apache processes. Next, monitor the system with Apache running to get an idea of the average amount of memory each Apache process needs. Once you know this, divide the amount of memory available for Apache processes by the per-process memory to get the approximate maximum value you can set for MaxClients.
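This arithmetic is simple enough to sketch as a small shell function. The function name and the sample figures (4096MB total, 1024MB in use without Apache, 30MB per child) are invented for illustration; substitute your own measurements.

```shell
# estimate_max_clients TOTAL_MB BASELINE_MB PER_CHILD_MB
# (hypothetical helper) Prints a rough upper bound for MaxClients.
estimate_max_clients() {
    if [ "$3" -le 0 ]; then
        echo "per-child memory must be greater than 0" >&2
        return 1
    fi
    # Memory left for Apache, divided by the size of one child.
    # Shell integer division rounds down, which errs on the safe side.
    echo $(( ($1 - $2) / $3 ))
}

# Sample figures only: 4096MB total, 1024MB baseline, 30MB per child.
estimate_max_clients 4096 1024 30    # prints 102
```

Treat the result as a ceiling rather than a target.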
Keep in mind that other processes might start up and need some memory, so it is better not to give all of the remaining RAM to the Apache processes. Setting MaxClients too low, on the other hand, can really kill performance. If all child processes are busy, new connection requests are put in the TCP queue. If the system cannot respond fast enough, the connection will time out.
MaxRequestsPerChild is another two-edged sword. This directive controls the number of requests that each child process will accept before it exits and is replaced by a fresh process. This limit on the number of requests is a protection mechanism. If your application has problems (e.g., memory leaks), the amount of memory used increases with each request. With no limit, you eventually run out of memory. If the child process is stopped after a predefined number of requests, the memory is freed by the system. Obviously, if you have a lot of requests and some severe memory leaks, you might run into problems more quickly.
Setting MaxRequestsPerChild too low is likely to cause performance problems of its own, and if you really need to set it low because of memory leaks or other problems, you should correct those problems first. A value of 0 means no limit; a common setting is 10000. That might seem like a lot, but the number refers to requests, not pages. On any given page, you might have several images plus CSS or JavaScript files that are treated as separate requests.
The top utility gives you a good idea of how each individual process is behaving. Typically the child processes all run as a specific user, so you could tell top to monitor only processes from the user wwwrun, for example (see Figure 1):
top -U wwwrun
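If you want a single figure rather than a live display, the per-process sizes can be averaged with ps and awk. This is a minimal sketch: The helper name avg_rss_mb is hypothetical, wwwrun is simply the example user from above, and the three sample values are made up.

```shell
# avg_rss_mb (hypothetical helper): read resident set sizes in KB on stdin,
# one per line, and print the process count and the mean size in MB.
avg_rss_mb() {
    awk '{ sum += $1; n++ }
         END {
             if (n > 0) {
                 printf "%d processes, %.1f MB average\n", n, sum / n / 1024
             } else {
                 print "no processes found"
             }
         }'
}

# Live use (adjust the user name to match your system):
#   ps -u wwwrun -o rss= | avg_rss_mb

# Demonstration with three made-up RSS values in KB:
printf '20480\n30720\n40960\n' | avg_rss_mb
```

Resident set size includes memory pages shared between the child processes, so the average tends to overstate what each additional child really costs.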
Keep Alive
The KeepAlive and KeepAliveTimeout directives work closely with the server directives discussed earlier. By turning on KeepAlive, you allow Apache to serve multiple requests for each connection. If KeepAlive is not turned on, the client needs to open up a new TCP connection for each request. Keep in mind that a request is not just a page, but everything on it. So if the page includes CSS files and several images, each of them needs its own connection without KeepAlive, and you can expect delays. If the client has to reconnect for each request, the server will seem a lot slower than it is.
The KeepAliveTimeout directive tells Apache how long to wait for a further request before closing the connection. If this value is set too high and the user ends up reading the contents of a page without loading anything new, the process will sit idle and possibly make other users wait. Setting the value low is therefore a good idea if you expect users to stop and read a large portion of the page before continuing. The standard value is between three and seven seconds.
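In the configuration file, the two directives might look like the following sketch. The timeout of five seconds is just one value from the range mentioned above, not a recommendation for every site.

```apache
# Let each connection carry multiple requests
KeepAlive On
# Seconds to wait for the next request before closing an idle connection
KeepAliveTimeout 5
```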