Network Sniffers
Core Technology
Learn what's going on in your network, using Linux and its arsenal of packet capture tools.
We are always told that eavesdropping is bad. In social relations, that's probably true, but in computing (especially networking), where this activity is known as sniffing, it's an indispensable debugging technique. What goes over the wire is the ultimate answer to "What did you throw at my service?" and "How did I reply?" Debugging aside, network sniffers may collect statistics or perform security monitoring. Linux comes with many tools of this kind, both GUI and terminal based. In this Core Tech, we'll explore perhaps the most popular one (or just my favorite).
Although sniffing is a legitimate technique, it is still largely prohibited in corporate environments. Many ISPs deem it illegal, too, so be careful when experimenting. A dedicated test lab built from virtual machines is the safest option. In any case, employ common sense and don't sniff traffic that could be sensitive, even if your housemates or colleagues are careless enough to send it unencrypted.
Back to the Beginning
Any network sniffer relies on the operating system's ability to forward to it all packets the network card receives, regardless of which process (or even host) they really target. The exact way of doing this is platform-specific. In Linux, packet sockets are the standard mechanism [1].
The AF_PACKET address family operates on raw Layer 2 packets (or frames). An "address" of this family (struct sockaddr_ll) tells the kernel from which interface to sniff packets and in which Layer 2 protocols you are interested. Packet sockets come into play early, shortly after the network card receives a frame and well before the data goes through the Linux networking stack. Capturing from the wire and crafting raw Layer 2 datagrams are sensitive operations, so only processes with the CAP_NET_RAW capability can perform them. We discussed capabilities in the last issue of Linux Magazine [2], but typically it just means you need to be root.
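To make the mechanism concrete, here is a minimal Python sketch of a packet socket. It assumes Linux and Python 3. The helper names parse_ethernet_header and capture_one are mine and not part of any library, the ETH_P_ALL value (0x0003) comes from <linux/if_ether.h>, and the (interface, protocol) tuple passed to bind() is Python's stand-in for struct sockaddr_ll. The capture function needs CAP_NET_RAW, whereas the parser works on any bytes you hand it.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # "every protocol", as defined in <linux/if_ether.h>


def _mac(raw: bytes) -> str:
    """Format six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes):
    """Split a raw Ethernet frame into (dst MAC, src MAC, EtherType, payload)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return _mac(dst), _mac(src), ethertype, frame[14:]


def capture_one(ifname: str):
    """Receive one frame from ifname through a packet socket.

    Linux only; the process needs CAP_NET_RAW (typically root).
    """
    proto = socket.htons(ETH_P_ALL)
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, proto) as sock:
        sock.bind((ifname, 0))  # protocol 0 keeps the one given above
        frame, _addr = sock.recvfrom(65535)
    return parse_ethernet_header(frame)
```

Run as root, capture_one("wlan0") should return the addresses, the EtherType, and the payload of the next frame seen on that interface (the interface name is only an example).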
Few network sniffers use packet sockets directly. Most rely on libpcap [3], a library that abstracts away platform specifics and adds some bonus features. For example, it can save captured packets in so-called .pcap (packet capture, you guessed it) files, which you can later read or "replay." This way, you (or someone acting on your behalf) can capture traffic on one box and analyze it on another. The libpcap library is designed so that, from the application's point of view, reading a .pcap file is almost indistinguishable from capturing content live, so replay is usually doable with any sniffer. Because .pcap files are, well, just files, reading from them does not typically require root privileges.
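Because a .pcap is just a file, you can read one with nothing but the standard library. The following Python sketch assumes the classic .pcap layout (a 24-byte file header with magic number 0xa1b2c3d4, then a 16-byte header in front of every packet) and microsecond timestamps; it does not handle the newer pcapng format. The function names write_pcap and read_pcap are my own.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic format, microsecond timestamps
LINKTYPE_ETHERNET = 1


def write_pcap(stream, packets, linktype=LINKTYPE_ETHERNET, snaplen=262144):
    """Write (seconds, microseconds, raw_bytes) tuples as a little-endian .pcap."""
    # magic, version 2.4, timezone offset, timestamp accuracy, snaplen, link type
    stream.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, snaplen, linktype))
    for sec, usec, data in packets:
        # captured length and original length are equal: nothing is truncated
        stream.write(struct.pack("<IIII", sec, usec, len(data), len(data)))
        stream.write(data)


def read_pcap(stream):
    """Yield (seconds, microseconds, raw_bytes) for every record in a .pcap."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated .pcap header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == PCAP_MAGIC:
        endian = "<"
    elif magic == 0xD4C3B2A1:  # same magic, written on a big-endian box
        endian = ">"
    else:
        raise ValueError("not a classic .pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return
        sec, usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        yield sec, usec, stream.read(incl_len)
```

To try it on a real capture, open the file in binary mode and loop over read_pcap(f); each item carries the timestamp and the raw frame exactly as they were stored.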
Another problem libpcap solves elegantly for you is getting only the data you want. Modern networks may run at a whopping 40Gbit/sec or more, yet you might be interested only in that small VoIP session your boss is complaining about. In other words, sniffing is closely intertwined with filtering.
A technology called Berkeley Packet Filters (BPF) was conceived long ago just for these purposes, and libpcap implements a high-level filtering language that it compiles into BPF bytecode. This doesn't sound particularly intriguing until you realize that you can safely put arbitrary BPF bytecode in the kernel: It can never hang or crash (by design). Moreover, Linux compiles BPF bytecode into native CPU instructions, so these filters are also fast. On the rare occasion when libpcap finds it impossible to execute a filter in the kernel, it resorts to userspace emulation, which is a bit slower. BPF is a really powerful concept, and it was recently extended to include tasks that have nothing to do with network filtering, such as performance profiling. This is a whole set of opportunities, but we'll leave them for another time (drop us a line if you are interested in seeing more).
Dumping TCP (and Whatever Else)
If you wanted me to vote for a single network sniffer in Linux, I'd go for tcpdump (Figure 1). It may not be as good at decoding protocols as some commercial ones (although I never felt it was limited), and it has less eye candy than Wireshark (Figure 2). However, it seems to be installed on every Linux box I have access to, so it wouldn't be an exaggeration to call tcpdump a de facto standard.
The tcpdump sniffer is a text-mode command-line tool. You pass it some options and read the dump from your terminal window or put it in a file. Despite the name, tcpdump is not only about TCP. It understands about 150 network protocols, spanning the majority of OSI Model layers: from Ethernet and ARP through IPv4/IPv6, TCP, UDP, SCTP, and friends to DHCP, HTTP, and BGP. For something really exotic or new, you can always get a raw hex dump.
Before anything else, you should tell tcpdump from which network interface to capture packets. You do this with the -i switch, and it is also possible to monitor all NICs in the system with -i any. Moreover, tcpdump can list the network interfaces it can capture from:
$ tcpdump --list-interfaces
1.wlan0 [Up, Running]
...
Note that the interface must be up (ip link set up dev wlan0) to be usable in tcpdump. As I mentioned previously, live packet capture implies root permissions, so don't forget to prefix your commands with sudo. I do this in Figure 1, where you can see some flows from my home wireless network. By default, a tcpdump session lasts until you press Ctrl+C, but you may instruct the tool to stop after a predefined number of packets (-c). tcpdump also puts the device from which you capture into so-called "promiscuous" mode, in which the device receives frames destined for other hosts. Usually, that's what you want, but if you are interested in local traffic only, consider adding -p to disable promiscuous mode.
You see that tcpdump prints a human-readable summary for each Layer 3 datagram it receives. Add -e to include Layer 2 header information, such as MAC addresses (and the respective NIC manufacturers). You can also make tcpdump dump more data with -v, -vv, or even -vvv, where v stands for verbosity, and the more vs you have, the more verbose tcpdump is. (This is a recurring pattern in its CLI.) The maximum verbosity level appears in Figure 3. I usually run tcpdump in this mode myself and also add -n to keep host, port, and other names from being resolved. Without it, 8.8.8.8 would become google-public-dns-a.google.com, and 80 would read http. Name resolution is useful most of the time, but when I do network troubleshooting, I prefer to see the raw numbers.
tcpdump does a great job of decoding network protocols, but if it prints some packets as garbage, you may want to look at the raw bytes to see what's wrong. You may find that the packet is indeed well-formed, with just a few extra bytes at the beginning. tcpdump facilitates this workflow with the -x and -X switches. Both print raw hex dumps, but the latter also adds ASCII, as most hex editors do (Figure 3). The corresponding double-x versions (-xx and -XX) include Layer 2 headers in the dump.
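If you ever need the same kind of view inside your own scripts, a hex-plus-ASCII dump takes only a few lines of Python. This sketch is in the spirit of -X, but the exact column layout is my own choice and not a reproduction of tcpdump's output; the function name hex_ascii_dump is hypothetical.

```python
def hex_ascii_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex pairs grouped by two bytes, and printable ASCII."""
    # Width of the hex column: two digits per byte plus one space between groups
    pad = width * 2 + (width + 1) // 2 - 1
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(chunk[i:i + 2].hex() for i in range(0, len(chunk), 2))
        # Non-printable bytes are shown as dots, as hex editors usually do
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"0x{offset:04x}:  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hex_ascii_dump(b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"))
```

A stray header in front of an otherwise valid packet tends to stand out in such a dump, because the familiar bytes (0x45 for a typical IPv4 header, for instance) appear a few positions later than expected.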
So far, tcpdump has been dumping everything it sees on the wire to your terminal. It is also possible to store packets in .pcap files:
$ sudo tcpdump -i wlan0 -vn -w first.pcap
tcpdump: listening on wlan0, link-type EN10MB (Ethernet), capture size 262144 bytes
Got 77
The counter indicates how many packets were captured. If you are going to grab a lot of data, tcpdump can spread it across multiple files that act as a "ring buffer." Consult the tcpdump(1) man page [4] for details. (See also the "PF_RING" box.)
PF_RING
Packet sockets are standard in Linux but not particularly fast. This is usually not an issue for debugging but may cause problems in network monitoring scenarios. If the capture mechanism isn't fast enough, you'll lose events you were meant to monitor. Not good.
PF_RING [5] is one approach to making packet capture in Linux faster. It probably doesn't come with your distribution kernel, but you can build it yourself or download a pre-packaged version for Ubuntu/Debian or Red Hat/CentOS. PF_RING is a kernel module with its own userspace API, but it also provides a custom libpcap, which you can use as a drop-in replacement. PF_RING plugs into the Linux kernel's NAPI subsystem to poll packets from the network card and pass them to userspace via a set of ring buffers. This helps to distribute packets better on multicore CPUs, resulting in higher throughput (and CPU usage). The authors claim that a PF_RING-aware libpcap is 12-34 percent faster than a vanilla one, depending on the NIC driver. A zero-copy variant, PF_RING ZC, bypasses the kernel almost entirely, but it's proprietary software (PF_RING is GPLv2). This "drawback" aside, it claims to be able to operate at line rate, that is, to capture packets at 10Gbit/sec on a 10G network.
To read a .pcap file, use tcpdump -r (recall that it doesn't require root permissions). Note that .pcap stores not only packets but also their timestamps, so the dump shows when each packet was originally captured; tcpdump itself reads the file as fast as it can rather than pacing the output in real time. You can also open a .pcap in another sniffer, as shown in Figure 2.
Sometimes, you may see tcpdump complaining about checksum errors in packets coming from your local Linux box. This is completely normal: Many modern network cards compute checksums in silicon (this is called "checksum offloading"), so the kernel doesn't bother filling in the respective fields. Packet sockets see the packet before it is handed to the NIC, so checksum fields may contain incorrect values. If these complaints annoy you, disable checksum verification in tcpdump with -K.
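To see what a sniffer actually checks here, consider the IPv4 header checksum: the one's complement of the one's complement sum of all 16-bit words in the header. The Python sketch below follows that definition; the function names are mine, and the sample header in the usage note is a made-up UDP packet between two private addresses, not traffic from this article.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_ok(header: bytes) -> bool:
    """A header whose checksum field is filled in correctly sums to zero."""
    return internet_checksum(header) == 0


if __name__ == "__main__":
    good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
    # Bytes 10-11 hold the checksum; zero them as if the NIC were to fill them in
    offloaded = good[:10] + b"\x00\x00" + good[12:]
    print(ipv4_header_ok(good), ipv4_header_ok(offloaded))
```

The second header is what a packet socket may see when the field is left for the hardware to fill in: the rest of the header is fine, and only the checksum bytes are not yet meaningful.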
Filters Galore
If you tried the preceding examples on a busy network, you may have noticed that they produce too much data to be really useful. As I mentioned earlier, libpcap (and tcpdump, for that matter) provides a high-level filtering language that compiles into BPF. The pcap-filter(7) man page [6] describes the formal syntax, so here I go through some typical examples only.
Filters consist of primitives joined by logical operations, such as && (and) or || (or). Some filters consist of a single primitive. Conventional relational and Boolean operators are also supported. At the next level, primitives consist of an ID, which is the value you filter for (say, an IP address or a port), and some qualifiers. The latter tell whether the ID is a host address, a port, or something else; whether it refers to the source or the destination; and which protocols you are after. If omitted, the type of ID is assumed to be host, the direction is src or dst (i.e., any), and any protocol where the ID makes sense would match. For example, port 80 is the same as (tcp or udp) port 80, and host 1.2.3.4 matches ip, arp, or rarp, but not ip6.
Most of the time, you would filter by host: host 192.168.0.2 does the job. You are free to use a DNS name instead, and tcpdump will resolve it when it compiles the filter. Such a filter grabs both ingress and egress traffic for the host. If you are interested only in packets where 192.168.0.2 is the source, the destination, or both, just prepend src, dst, or src and dst to the filter, respectively. The last case is quite rare. It is also a good idea to filter out ARP, which you are not normally interested in, to reduce the clutter: host 192.168.0.2 and not arp.
You may also be interested in traffic for a particular service, such as a web server. Here you need two primitives: one saying that the server's IP address must be involved as a source or a destination, and another saying you want TCP traffic to or from port 80. In tcpdump's parlance, this reads host 192.168.0.1 and tcp port 80. For trickier protocols like BitTorrent, you may want to specify multiple ports with portrange.
You can also filter by network addresses (net 192.168.0.0/24) rather than hosts. You may set limits on the packet size with len. For advanced cases, you may access individual bytes at known offsets using array index notation. The man page has a good example: ip[6:2] & 0x1fff = 0 drops IPv4 fragments other than the first (or the only) one. This filter takes the 16-bit half-word at offset 6 in the IPv4 header and masks off the highest three bits (the flags) to get the fragment offset value. If it's zero, the packet is either non-fragmented (most likely) or the first fragment in a series. To tell the former from the latter, you can check the MF bit among the flags that were masked off.
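The bit arithmetic behind ip[6:2] & 0x1fff = 0 is easy to replay in Python. In this sketch, the function names are hypothetical, and the input is assumed to be a raw IPv4 header (at least 8 bytes) with the Layer 2 header already stripped.

```python
import struct

MF_BIT = 0x2000       # "more fragments" flag
DF_BIT = 0x4000       # "don't fragment" flag
OFFSET_MASK = 0x1FFF  # low 13 bits: fragment offset in 8-byte units


def fragment_info(ip_header: bytes) -> dict:
    """Decode the flags and fragment offset stored at bytes 6-7 of an IPv4 header."""
    (word,) = struct.unpack("!H", ip_header[6:8])
    return {
        "dont_fragment": bool(word & DF_BIT),
        "more_fragments": bool(word & MF_BIT),
        "offset_bytes": (word & OFFSET_MASK) * 8,
    }


def passes_filter(ip_header: bytes) -> bool:
    """Same test as the ip[6:2] & 0x1fff = 0 filter."""
    (word,) = struct.unpack("!H", ip_header[6:8])
    return (word & OFFSET_MASK) == 0


def is_first_of_many(ip_header: bytes) -> bool:
    """Offset zero with MF set: the first fragment of a fragmented datagram."""
    info = fragment_info(ip_header)
    return info["offset_bytes"] == 0 and info["more_fragments"]
```

A header with 0x4000 in that field (DF set, offset zero) passes the filter and is not a fragment at all, 0x2000 passes as the first fragment of a series, and 0x20b9 is rejected because it is a middle fragment starting 1,480 bytes into the datagram.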
For those of you wanting to know how all of this works under the hood, tcpdump can print a low-level BPF representation of your filter (Figure 3). Three modes are available. With tcpdump -d, you get a human-readable BPF disassembly; -dd dumps a C code snippet to use in your programs; and -ddd produces plain decimal numbers you can paste into whatever BPF interpreter you want.
Look at Listing 1 for a BPF disassembly of a simple filter: src 1.2.3.4. I got this output with sudo tcpdump -d src 1.2.3.4. Note that while the compiler itself is unprivileged code, tcpdump still needs CAP_NET_RAW and otherwise complains of insufficient permissions to capture. This is minor but annoying.
Listing 1
A BPF Assembly Code for "src 1.2.3.4" Filter
First, the code loads a half-word at offset 14. It's the EtherType, which identifies the Layer 3 protocol the Ethernet frame encapsulates. If it's 0x800, or IPv4, a 32-bit word at offset 28 is loaded and checked against 0x1020304 (1.2.3.4 in network byte order). That's where the Source Address field in IPv4 is: Ethernet header is 16 bytes long, and Source Address is at offset 12 within the IPv4 header. Other branching instructions check for ARP (EtherType 0x806) and RARP (0x8035). We discussed these protocols in a previous issue [7]. The program returns the number of bytes to take from the packet. If the condition matches, it's 256K, which in practice means the whole packet (in IPv4, it is never more than 64K; typically 1 to 10K). If not, it's zero, and the packet is discarded.
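If you want to step through such a program by hand, a toy interpreter helps. The Python sketch below is not Listing 1: it is a hand-assembled program that covers only the IPv4 branch of the walk-through and leaves out ARP and RARP. The opcode values are the classic BPF ones from <linux/filter.h>; the function names are mine. The offsets are parameters because they depend on the link-layer header of the capture device. The defaults assume a plain 14-byte Ethernet header with the EtherType at offset 12, and you can pass ethertype_off=14 and l2_len=16 to mirror the offsets quoted above.

```python
import struct

# Classic BPF opcodes (see <linux/filter.h>)
LDH_ABS = 0x28  # A = 16-bit value at absolute offset k
LD_ABS = 0x20   # A = 32-bit value at absolute offset k
JEQ_K = 0x15    # if A == k, skip jt instructions, else skip jf
RET_K = 0x06    # return k (number of bytes to keep; 0 drops the packet)


def src_ipv4_filter(addr, ethertype_off=12, l2_len=14, snaplen=262144):
    """Hand-assemble 'IPv4 packet whose source address is addr'."""
    (src,) = struct.unpack("!I", bytes(int(part) for part in addr.split(".")))
    return [
        (LDH_ABS, 0, 0, ethertype_off),  # load the protocol field
        (JEQ_K, 0, 3, 0x0800),           # not IPv4: jump to "reject"
        (LD_ABS, 0, 0, l2_len + 12),     # load the IPv4 Source Address
        (JEQ_K, 0, 1, src),              # mismatch: jump to "reject"
        (RET_K, 0, 0, snaplen),          # accept: keep up to snaplen bytes
        (RET_K, 0, 0, 0),                # reject
    ]


def run_bpf(program, packet: bytes) -> int:
    """Interpret the four instructions above; out-of-range loads reject."""
    acc, pc = 0, 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code in (LDH_ABS, LD_ABS):
            size = 2 if code == LDH_ABS else 4
            if k + size > len(packet):
                return 0
            acc = int.from_bytes(packet[k:k + size], "big")
        elif code == JEQ_K:
            pc += jt if acc == k else jf  # jumps are relative to the next instruction
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    raise ValueError("program ended without a return")
```

Feeding run_bpf a frame and the program from src_ipv4_filter("1.2.3.4") yields either the snapshot length or zero, which is exactly the accept-or-drop decision described above.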
Command of the Month: Scapy
tcpdump and friends do a decent job, but they are mere observers. Scapy [8], on the other hand, can not only capture and decode packets but also forge them and send them over the wire.
We devoted an entire Core Tech to Scapy more than a year ago [9], but it never hurts to refresh the basics. Scapy is a Python framework for dissecting network protocols and an associated interactive tool, scapy, built on top of it. Each protocol is represented as a Python class, and you combine them with / to build protocol stacks or read their properties to get the protocol fields you want. Knowing Python is a bonus for using Scapy (and many other tools, in fact), but not a must. Indeed, do you need it to read the following?
>>> Ether() / IP(dst='1.2.3.4')
<Ether type=0x800 |<IP dst=1.2.3.4 |>>
That's self-explanatory. Scapy could be a sniffer if you tell it to read packets from the network or a .pcap file. Or it could be a fuzzer if you use it to forge random packets and throw them at the target. It can even build beautiful packet visualizations for you, as seen in Figure 4. It doesn't matter whether you are a network professional or just a hobbyist: Having Scapy under your belt may save you some day.
Infos
[1] packet(7) man page: https://linux.die.net/man/7/packet
[2] "Core Technology" by Valentine Sinitsyn, Linux Magazine, Issue 198, pg. 76
[3] libpcap library home: http://www.tcpdump.org/pcap.html
[4] tcpdump man page: http://www.tcpdump.org/tcpdump_man.html
[5] PF_RING: http://www.ntop.org/products/packet-capture/pf_ring/
[6] pcap-filter(7) man page: https://linux.die.net/man/7/pcap-filter
[7] "Core Technology" by Valentine Sinitsyn, Linux Voice, Issue 31, pg. 94
[8] Scapy homepage: http://www.secdev.org/projects/scapy/
[9] "Core Technology" by Valentine Sinitsyn, Linux Voice, Issue 28, pg. 94