Monitoring with the Sysstat tool collection
The Sysstat tools, featuring sar, iostat, mpstat, and pidstat, acquire system parameters and calculate statistics.
Server performance is down or non-existent. To look for clues, you check the logfiles, take a quick look at the /proc directory, and run tools such as Vmstat or Top, but to no avail: time to launch the Sysstat collection. Sysstat is a group of simple Linux command-line tools for performance analysis and monitoring. According to the Sysstat project, the toolset "… contains various utilities common to many commercial Unixes, and tools you can schedule via cron to collect and historize performance and activity data."
The Sysstat set collects system information, stores it for a period of time, and calculates mean values, letting you query individual system parameters at specific times for more flexible troubleshooting. The tools work well with cron so that you can take readings of system performance at predefined intervals for a flexible, customizable approach to data collection. The Sysstat project defines the collection as follows:
- iostat – reports input/output statistics for devices, partitions, and network filesystems.
- mpstat – monitors processor statistics.
- pidstat – reports on system processes.
- sar and a supporting cast of related utilities – monitor, collect, and report on system activities related to CPU, memory, interrupts, interfaces, kernel tables, and other factors.
The sar set includes the data collector sadc; sa1, which collects binary data and stores it in the daily system activity file; sa2, which writes a daily activity report; and sadf, a tool used to display sar data in formats such as CSV and XML. kSar, which is available as a separate project, is a convenient graphing tool for sar data.
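The cron integration described above usually comes down to two entries: one that runs sa1 at short intervals and one that runs sa2 once a day. The following sketch prints such entries in /etc/cron.d format. The helper name print_sysstat_cron, the default directory /usr/lib/sa, and the schedule are assumptions for illustration only; distributions install sa1 and sa2 in different places (for example, /usr/lib64/sa or /usr/lib/sysstat), so check where your packages put them.

```shell
#!/bin/sh
# Print cron.d-style entries for periodic sar data collection.
# The default directory is an assumption; pass the real location
# of sa1 and sa2 as the first argument.
print_sysstat_cron() {
    sa_dir=${1:-/usr/lib/sa}
    cat <<EOF
# Collect one sample of system activity every 10 minutes
*/10 * * * * root $sa_dir/sa1 1 1
# Write the daily activity report shortly before midnight
53 23 * * * root $sa_dir/sa2 -A
EOF
}

print_sysstat_cron "$@"
```

Redirecting the output to a file below /etc/cron.d (as root) would activate the schedule; many distribution packages already ship an equivalent file, in which case nothing needs to be added.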
The Sysstat tools provide a practical, simple set of building blocks that are easy to integrate into the everyday life of the network.
Get a Recent Linux
Pidstat lets administrators query and monitor I/O load. Because this level of functionality requires a special kernel option, older distributions will not support it. In fact, if you have Ubuntu, you will need at least Hardy, and openSUSE fans will need version 11 or a modified kernel to run the recently released version 8 of Sysstat by Sébastien Godard and the team of developers backing him up.
Iostat, pidstat, mpstat, and the sar utilities provide information on system parameters, such as:
- Input/output and transfer rates: global or by device, partition, NFS drive, process ID, or process name
- CPU load: global, per CPU, or by process
- Virtual and physical memory and swap file usage
- Paging, memory load, and pagefault counts: global, for individual processes or process trees
- The speed at which the system is spawning new processes
- The number of interrupts: global, per CPU, or by interrupt or APIC source
- Network interfaces
- NFS servers and clients
- Run queue and system load
- Internal kernel tables
- Number of context switches
- TTY activity
The information provided by the tools is typically grouped into the categories of memory, network, processes, CPU, and I/O. The pidstat options are as follows: -d outputs the I/O load, -r the page faults and memory use, -u the CPU usage, and -w the kernel's task-switching activity. Sar and iostat also include options for displaying data for network filesystems, and sar lets the administrator monitor TTY console use.
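As a memory aid for the four pidstat options just listed, the following sketch wraps them in a small shell helper. The function names pidstat_flag and pidstat_report and the category keywords are hypothetical and not part of Sysstat; only the -d, -r, -u, and -w options come from pidstat itself.

```shell
#!/bin/sh
# Map a report category (hypothetical keywords) to the pidstat
# option described in the text.
pidstat_flag() {
    case "$1" in
        io)     echo "-d" ;;   # I/O load per task
        faults) echo "-r" ;;   # page faults and memory use
        cpu)    echo "-u" ;;   # CPU usage
        switch) echo "-w" ;;   # task (context) switching
        *)      echo "unknown category: $1" >&2; return 1 ;;
    esac
}

# Run pidstat for one category.
# Usage: pidstat_report <category> <interval-seconds> <count>
pidstat_report() {
    flag=$(pidstat_flag "$1") || return 1
    pidstat "$flag" "$2" "$3"
}
```

With these definitions loaded, pidstat_report io 5 3 would sample per-task I/O every five seconds, three times, provided the kernel supports I/O accounting.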
Sar and kSar
Unix diehards might complain that you can gather the same data with the use of internal Linux tools. Admins can use Awk to calculate the load average for any given period of time. All you need is some shell survival skills and the data from the proc filesystem (Figure 2). Although this is true, sar makes the task much simpler. The command
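sar -q 120 2
gives you the run queue length and load average. The arguments 120 2 tell sar to take two samples at an interval of 120 seconds; each sample shows the load averages for the last 1, 5, and 15 minutes, and a final line shows the mean of all samples. The load average is a system-wide figure, so no processor needs to be selected; the -P option (for example, -P 0 for the first CPU) applies to per-processor reports such as sar -u. The output is shown in Figure 2.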
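To make the comparison concrete, here is a minimal sketch of both routes: the do-it-yourself Awk approach on /proc/loadavg, and a filter that pulls one figure out of the mean line that sar prints. The function names mean_load1, sample_loadavg, and sar_average are made up for this example, and the sar filter assumes English-language output in which the summary line starts with "Average:".

```shell
#!/bin/sh
# Average the 1-minute load figure (first field) over lines in
# /proc/loadavg format read from standard input.
mean_load1() {
    awk '{ sum += $1; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# Print <count> samples of /proc/loadavg, <interval> seconds apart.
sample_loadavg() {
    interval=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        cat /proc/loadavg
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Print one column of the "Average:" line from sar output on stdin.
# The column is located by its header name and counted from the
# right, because the header line begins with a timestamp that can
# occupy one field or two (time plus AM/PM).
sar_average() {
    awk -v col="$1" '
        !found {
            for (i = 1; i <= NF; i++)
                if ($i == col) { from_end = NF - i; found = 1 }
        }
        $1 == "Average:" && found { print $(NF - from_end) }
    '
}
```

The manual route would then be sample_loadavg 120 2 | mean_load1, and the sar route LC_ALL=C sar -q 120 2 | sar_average ldavg-1. Both report the mean of the sampled 1-minute load figures, but sar does the sampling, timestamping, and averaging itself.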
If inexplicable hardware problems occur, you can run
sar -I interrupt
to investigate individual interrupts, where interrupt stands for an interrupt number or for one of the keywords SUM and ALL. If the CPU load, which you can get by running sar -u, tells you that the server is working as hard as it can, this does not necessarily mean it is time to buy a new machine. Long-term monitoring might be preferable: options are available for both the console and the graphical front end.
The kSar graphing tool is well suited to the task of graphing sar results. kSar gives you a fast overview of system performance and can even show you multiple servers at the same time with an SSH command (Figure 1). The Java tool will run sar -A on a remote server, receive a full set of sar data, and display the data for point and click navigation in a tree view. One really nice feature of kSar is that child windows are synchronized. Clicking on a sub-item in a tree diagram will automatically update the graphs for all the hosts currently shown. This feature is perfect for administrators wanting to compare multiple machines.
Bugs or loops in active programs often bring a server to its knees. Formerly, Linux admins had no alternative to time-consuming monitoring with Top, Strace, and other debugging tools, and that meant sitting in front of the screen. Today, one of the Sysstat tools makes this unnecessary. Pidstat creates statistics reports for individual Linux tasks. To monitor processes over extended periods of time, you can use pidstat and output the mean values.
The Sysstat developers describe how they even used pidstat to track down a memory leak in the Sysstat tools. If you use, say, pidstat -u 20 2 -C processname to monitor an individual process, pidstat takes two samples 20 seconds apart and then reports the average load over the resulting 40-second period (see Figure 2).