Alpha Beast
Charly’s Column – Sysdig
In this issue, sys admin columnist and tool veterinarian Charly Kühnast invites Sysdig, the jack-of-all-trades among system diagnostic tools, into his surgery for a quick checkup. The project promises to unite the functionality of lsof, iftop, netstat, tcpdump, and others.
Where an alpha beast claims to replace an entire herd, the bar is naturally fairly high. Of course, the Wireshark authors, who are also the people behind the Sysdig [1] project, are no beginners. The software only performs well if you have root privileges; otherwise, it can't access all the required system areas. If you launch the tool without parameters, a steady stream of system messages scrolls by: It meticulously logs every single syscall. To thin out the thicket, Sysdig uses what it calls chisels. You can find out which chisels exist with the sysdig -cl command.
The chisels are sorted into categories (Net, IO, application, logs, and so on). For example, the Performance category has a chisel named netlower. I decided to pass in a time value of 10 milliseconds as a parameter:
sysdig -c netlower 10
Sysdig now continuously lists the processes whose network I/O calls take longer than 10 milliseconds. On my home network, this means the SmokePing probes to the garden Raspberry Pis and some Munin connections.
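If you are unsure which arguments a chisel expects, Sysdig's -i option prints its description and parameters. The following is only a minimal sketch; the helper name chisel_info is made up for illustration and is not part of Sysdig:

```shell
# Hypothetical helper: show the description and arguments of one chisel.
# Usage: chisel_info netlower
chisel_info() {
    sysdig -i "$1"
}
```

Calling chisel_info netlower should confirm that the chisel expects a threshold in milliseconds before you run it.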
You can output a list of the processes with the most frequent mass storage accesses by typing:
sysdig -c topprocs_file
The following reveals the connections causing the most network traffic:
sysdig -c topconns
A replacement for top can be found in:
sysdig -c topprocs_cpu
The built-in automatic analysis of bottlenecks is particularly informative. Typing
sysdig -c bottlenecks
generates a list of processes whose syscalls take a suspiciously long time. This is a great approach to searching for bottlenecks.
Depth on the Interface
If you like a more interactive approach, try csysdig. The tool displays the information provided by Sysdig in a continuously updated ncurses interface. Called without parameters, the start screen is reminiscent of htop, but pressing F2 takes you to a list of Views that correspond to the categories to which Sysdig assigns its chisels, and you can access them quickly and easily.
For example, if you choose the Spectrogram-File view, you are treated to a graphic like that shown in Figure 1: It shows the file access latency distribution, in which each line represents one second. At the time of grabbing the screenshot, an apt dist-upgrade was running, hence the high read and write load highlighted in red.
The Views overview showcases one of the specialities of Sysdig and Csysdig: You can restrict analyses to applications that run in containerized systems such as Docker or Kubernetes. Thus, admins can quickly and easily identify any performance fluctuations in containerized software.
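On the command line, the same restriction can be sketched with a filter expression. The container.name filter field and the -pc option (container-aware output) are part of Sysdig; the container name web and the function name are placeholders chosen purely for illustration:

```shell
# Placeholder: replace with the name of one of your running containers.
CONTAINER=web

# Hypothetical helper: CPU usage per process, limited to that container.
# Needs root privileges, like the other sysdig calls.
top_cpu_in_container() {
    sysdig -pc -c topprocs_cpu "container.name=$CONTAINER"
}
```

The filter is simply appended after the chisel name, so the same pattern should work for other chisels that take no arguments of their own.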
My conclusions: Used only as a replacement for top and netstat, Sysdig is like taking a sledgehammer to crack a nut, but the many easily parameterized analyses of file and network latencies are a real help. If I have to dig down into individual syscalls, I can save a trace file and filter it until I find what I want. Here, at last, you can see the signature of the Wireshark makers.
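Saving and filtering a trace could look like the following sketch. The -w (write) and -r (read) options and the proc.name filter field belong to Sysdig; the file name trace.scap, the process name apt in the usage comment, and both function names are assumptions made for the example:

```shell
# Placeholder file name for the capture.
TRACE=trace.scap

# Hypothetical helper: record all events into the trace file.
# Run as root and stop the capture with Ctrl+C.
capture() {
    sysdig -w "$TRACE"
}

# Hypothetical helper: replay the trace offline and keep only events
# that match the filter passed as arguments.
# Usage: replay proc.name=apt
replay() {
    sysdig -r "$TRACE" "$@"
}
```

Because the trace is read from disk, you can rerun replay with ever narrower filters without having to reproduce the problem on the live system.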
Infos
[1] Sysdig: https://www.sysdig.org