Network Sniffers

Core Technology

Article from Issue 199/2017
Author(s): Valentine Sinitsyn

Learn what's going on in your network, using Linux and its arsenal of packet capture tools.

We are always told that eavesdropping is bad. In social relations, that's probably true, but in computing (especially networking), where this activity is known as sniffing, it's an indispensable debugging technique. What goes over the wire is the ultimate answer to "What have you thrown at my service?" and "How did I reply?" Debugging aside, network sniffers can collect statistics or perform security monitoring. Linux comes with many tools of this kind, both GUI and terminal based. In this Core Tech, we'll discover perhaps the most popular one (or just my favorite).

Although sniffing is a legitimate technique, it is still largely prohibited in corporate environments. Many ISPs deem it illegal, too, so be careful when experimenting. A dedicated test lab built from virtual machines is the safest option. In any case, employ common sense and don't sniff traffic that could be sensitive, even if your housemates or colleagues are careless enough to send it unencrypted.

Back to the Beginning

Any network sniffer relies on the operating system's ability to forward to it all packets the network card receives, regardless of which process (or even host) they really target. The exact way of doing this is platform-specific. In Linux, packet sockets are the standard mechanism [1].

The AF_PACKET address family operates on raw Layer 2 packets (or frames). An "address" of this family (struct sockaddr_ll) tells the kernel from which interface to sniff packets and in which Layer 2 protocols you are interested. Packet sockets come into play early, shortly after the network card receives a frame, and well before the data goes through the Linux networking stack. Capturing from the wire and crafting raw Layer 2 datagrams is a sensitive operation, so only processes with the CAP_NET_RAW capability can do it. We discussed capabilities in the last issue of Linux Magazine [2], but typically it just means you need to be root.
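To make this more tangible, here is a minimal Python sketch of a packet socket. The function names are mine, the ETH_P_ALL value is copied from the kernel headers, and the capture part only works on Linux with CAP_NET_RAW; treat it as an illustration rather than a complete sniffer.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # value from <linux/if_ether.h>: every Layer 2 protocol


def open_packet_socket(interface):
    """Open a raw packet socket on one interface (Linux only, needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    # The (interface, protocol) pair becomes a struct sockaddr_ll in the kernel.
    # Protocol 0 here keeps the one given when the socket was created.
    sock.bind((interface, 0))
    return sock


def format_mac(raw):
    """Render six raw bytes as the usual colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_ethernet_header(frame):
    """Split the 14-byte Ethernet header into destination, source, and EtherType."""
    destination, source, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(destination), format_mac(source), ethertype
```

With root privileges, something like frame = open_packet_socket("wlan0").recv(65535) followed by parse_ethernet_header(frame) would decode the first frame seen on the interface; the parsing function itself needs no privileges and works on any bytes you feed it.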

Few network sniffers use packet sockets directly. Most rely on libpcap [3], a library that abstracts away platform specifics and adds some bonus features. For example, it can save captured packets in so-called .pcap (packet capture, you guessed it) files, which you can later read, or "replay." This way, you (or someone acting on your behalf) can capture traffic on one box and analyze it on another. The libpcap library is designed so that, from the application's point of view, reading a .pcap file is almost indistinguishable from capturing content live, so replay is usually doable with any sniffer. Because .pcap files are, well, just files, reading from them does not typically require root privileges.
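The classic .pcap layout is simple enough to handle by hand: a 24-byte file header followed by one 16-byte header per packet. The sketch below writes and reads it with nothing but the Python standard library. It assumes the little-endian, microsecond-resolution variant only; libpcap itself copes with more variants, and the function names are mine.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4   # classic pcap with microsecond timestamps
LINKTYPE_ETHERNET = 1     # called EN10MB in tcpdump's output

# magic, version major, version minor, thiszone, sigfigs, snaplen, link type
GLOBAL_HEADER = struct.Struct("<IHHiIII")
# seconds, microseconds, captured length, original length
RECORD_HEADER = struct.Struct("<IIII")


def write_pcap(stream, packets, snaplen=262144, linktype=LINKTYPE_ETHERNET):
    """Write (seconds, microseconds, frame) tuples to a binary stream."""
    stream.write(GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, snaplen, linktype))
    for seconds, micros, frame in packets:
        captured = frame[:snaplen]  # frames longer than snaplen are truncated
        stream.write(RECORD_HEADER.pack(seconds, micros, len(captured), len(frame)))
        stream.write(captured)


def read_pcap(stream):
    """Yield (seconds, microseconds, frame) tuples from a binary stream."""
    magic, *_rest = GLOBAL_HEADER.unpack(stream.read(GLOBAL_HEADER.size))
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian, microsecond-resolution pcap file")
    while True:
        record = stream.read(RECORD_HEADER.size)
        if len(record) < RECORD_HEADER.size:
            break  # end of file
        seconds, micros, captured_len, _original_len = RECORD_HEADER.unpack(record)
        yield seconds, micros, stream.read(captured_len)


# Round trip through an in-memory buffer instead of a real capture file.
buffer = io.BytesIO()
write_pcap(buffer, [(1500000000, 250000, bytes(60))])
buffer.seek(0)
for seconds, micros, frame in read_pcap(buffer):
    print(f"{seconds}.{micros:06d}  {len(frame)} bytes")
```

Replace the in-memory buffer with open("first.pcap", "rb") to walk through a file written by tcpdump, provided it uses the same variant of the format.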

Another problem libpcap solves elegantly for you is getting only the data you want. Modern networks may run at a whopping 40Gbit/sec or more, yet you might be interested only in that small VoIP session your boss is complaining about. In other words, sniffing is closely intermingled with filtering.

A technology called Berkeley Packet Filters (BPF) was conceived long ago just for these purposes, and libpcap implements a high-level filtering language it compiles into BPF bytecode. This doesn't sound particularly intriguing until you realize that you can safely put arbitrary BPF bytecode in the kernel: It can never hang or crash (by design). Moreover, Linux compiles BPF bytecode into native CPU instructions, so these filters are also fast. On the rare occasion when libpcap finds it impossible to execute a filter in the kernel, it resorts to userspace emulation, which is a bit slower. BPF is a really powerful concept, and it was recently extended to include tasks that have nothing to do with network filtering, such as performance profiling. This is a whole set of opportunities, but we'll leave them for another time (drop us a line if you are interested in seeing more).

Dumping TCP (and Whatever Else)

If you asked me to vote for a single network sniffer in Linux, I'd go for tcpdump (Figure 1). It may not be as good at decoding protocols as some commercial ones (although I never felt it was limited), and it has less eye candy than Wireshark (Figure 2). However, it seems to be installed on every Linux box I have access to, so it wouldn't be an exaggeration to call tcpdump a de facto standard.

Figure 1: The tcpdump tool, dumping flows from a wireless network. You may see reverse DNS requests and some mDNS exchange with the local printer.
Figure 2: Wireshark is a popular GUI network sniffer for Linux. Its cousin, tshark, does almost the same from the command line.

The tcpdump sniffer is a text-mode command-line tool. You pass it some options and read the dump from your terminal window or put it in a file. Despite the name, tcpdump is not about TCP. It understands about 150 network protocols, spanning the majority of OSI Model layers: from Ethernet and ARP through IPv4/IPv6, TCP, UDP, SCTP, and friends to DHCP, HTTP, and BGP. For something really exotic or new, you can always get a raw hex dump.

Before anything else, you should tell tcpdump from which network interface to capture packets. You do this with the -i switch, and it is also possible to monitor all NICs in the system with -i any. Moreover, tcpdump can list network interfaces it can capture from:

$ tcpdump --list-interfaces
1.wlan0 [Up, Running]
...

Note that the interface must be up (ip link set up dev wlan0) to be usable in tcpdump. As I mentioned previously, live packet capture implies root permissions, so don't forget to prefix your commands with sudo. I do this in Figure 1, where you can see some flows from my home wireless network. By default, a tcpdump session lasts until you press Ctrl+C, but you may instruct the tool to stop after a predefined number of packets (-c). It also puts the device from which you capture in so-called "promiscuous" mode, where the device receives frames destined for other hosts. Usually, that's what you want, but if you are interested in local traffic only, consider adding -p to disable promiscuous mode.

You see that tcpdump prints a human-readable summary for each Layer 3 datagram it receives. Add -e to include Layer 2 header information, such as MAC addresses (and the respective NIC manufacturers). You can also make tcpdump dump more data with -v, -vv, or even -vvv, where v stands for verbosity, and the more vs you have, the more verbose tcpdump is. (This is a recurring pattern in its CLI.) The maximum verbosity level appears in Figure 3. I usually run tcpdump in this mode myself and also add -n to stop it from resolving host, port, and other names. Without it, 8.8.8.8 would become google-public-dns-a.google.com, and 80 would read http. Name resolution is useful most of the time, but when I do network troubleshooting, I prefer a clear picture.

Additionally, tcpdump does a great job of decoding network protocols, but if it prints some packets as garbage, you may want to look at raw bytes to see what's wrong. You may see that the packet is indeed well-formed, with just a few extra bytes at the beginning. tcpdump facilitates this workflow with the -x and -X switches. Both print raw hex dumps, but the latter also adds ASCII as most hex editors do (Figure 3). The corresponding double-x versions (-xx and -XX) include Layer 2 headers in the dump.
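If you ever need a similar view inside your own scripts, a hex-plus-ASCII dump takes only a few lines of Python. The sketch below imitates the general layout of -X (offset, bytes grouped in pairs, printable characters); the exact spacing is my own choice and not a copy of tcpdump's output, and it assumes an even number of bytes per line.

```python
def hexdump(data, width=16):
    """Return dump lines: offset, hex byte pairs, then printable ASCII."""
    hex_width = (width // 2) * 5 - 1  # room for a full row of "xxxx " groups
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Group bytes in pairs, as in "4500 0054 ...".
        pairs = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
        hex_part = " ".join(pairs)
        # Anything outside printable ASCII is shown as a dot.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"0x{offset:04x}:  {hex_part:<{hex_width}}  {ascii_part}")
    return lines


for line in hexdump(b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"):
    print(line)
```

Seeing the payload as both bytes and text makes it easy to spot "a few extra bytes at the beginning" of an otherwise sane packet.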

Figure 3: tcpdump can be quite verbose. Here, you get a full protocol decode and a complete hexdump of two Dropbox LAN sync discovery packets.

So far, tcpdump has been dumping everything it sees on the wire to your terminal. It is also possible to store packets in .pcap files:

$ sudo tcpdump -i wlan0 -vn -w first.pcap
tcpdump: listening on wlan0, link-type EN10MB (Ethernet), capture size 262144 bytes
Got 77

The counter indicates how many packets were captured. If you are going to grab a lot of data, tcpdump can spread it across multiple files that act as a "ring buffer." Consult the tcpdump(1) man page [4] for details. (See also the "PF_RING" box.)

PF_RING

Packet sockets are standard in Linux but not particularly fast. This is usually not an issue for debugging but may cause problems in network monitoring scenarios. If the capture mechanism isn't fast enough, you'll lose the very events you were supposed to monitor. Not good.

PF_RING [5] is one approach to making packet capture in Linux faster. It probably doesn't come with your distribution kernel, but you can build it yourself or download a prepackaged version for Ubuntu/Debian or Red Hat/CentOS. PF_RING is a kernel module with its own userspace API, but it also provides a custom libpcap, which you can use as a drop-in replacement. PF_RING plugs into the Linux kernel's NAPI subsystem to poll packets from the network card and pass them to userspace via a set of ring buffers. This helps to distribute packets better on multicore CPUs, resulting in higher throughput (and CPU usage). The authors claim that a PF_RING-aware libpcap is 12-34 percent faster than a vanilla one, depending on the NIC driver. A zero-copy variant, PF_RING ZC, bypasses the kernel almost entirely, but it's proprietary software (PF_RING is GPLv2). This "drawback" aside, it claims to be able to operate at line rates, that is, capture packets at 10Gbit/sec on a 10G network.

To read a .pcap file, use tcpdump -r (recall it doesn't require root permissions). Note that .pcap stores not only packets but also timestamps, so the dump shows when each packet was originally captured; tcpdump itself reads the file as fast as it can rather than replaying the session in real time. You can also open a .pcap in another sniffer, as shown in Figure 2.

Sometimes, you may see tcpdump complaining about checksum errors in packets coming from your local Linux box. This is completely normal: Many modern network cards compute checksums in silicon (this is called "checksum offloading"), so the kernel doesn't bother filling the respective fields. Packet sockets operate before the packet is handed to the NIC, so checksum fields may contain incorrect values. If these complaints annoy you, disable checksum verification in tcpdump with -K.
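To see what such a verification involves, here is a small Python sketch of the Internet checksum (RFC 1071) applied to an IPv4 header. The sample header is a commonly used textbook example, not one taken from the captures in this article, and the function names are mine.

```python
import struct


def internet_checksum(data):
    """One's complement of the one's complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def ipv4_header_checksum_ok(header):
    """A header with a correct checksum field sums to 0xFFFF, so the result is 0."""
    return internet_checksum(header) == 0


# 20-byte IPv4 header with the checksum field (bytes 10-11) left at zero,
# the way an offloading kernel might hand it to a packet socket.
unfilled = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(unfilled)
filled = unfilled[:10] + struct.pack("!H", checksum) + unfilled[12:]

print(f"computed checksum: {checksum:#06x}")
print("unfilled header verifies:", ipv4_header_checksum_ok(unfilled))
print("filled header verifies:", ipv4_header_checksum_ok(filled))
```

The unfilled header is what a sniffer sees on the sending host when offloading is active, which is why the complaint appears there and not on the receiving side.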

Filters Galore

If you tried the preceding examples on a busy network, you may have noticed that they produce too much data to be really useful. As I mentioned earlier, libpcap (and therefore tcpdump) provides a high-level filtering language that compiles into BPF. The pcap-filter(7) man page [6] describes the formal syntax, so I'll just go through some typical examples here.

Filters consist of primitives joined by logical operations, such as &&/and or ||/or. Some filters consist of a single primitive. Conventional relational and Boolean operators are also supported. At the next level, primitives consist of an ID, which is the value you filter for (say, an IP address or a port), and some qualifiers. The qualifiers tell whether the ID is a host address, a port, or something else; whether it refers to the source or the destination; and which protocols you are after. If omitted, the type of ID is assumed to be host, the direction is src or dst (i.e., any), and any protocol where ID makes sense would match. For example, port 80 is the same as (tcp or udp) port 80, and host 1.2.3.4 matches ip, arp, or rarp, but not ip6.

Most of the time, you would filter by host: host 192.168.0.2 does the job. You are free to use a DNS name instead, and tcpdump will resolve it at compile time. Such a filter grabs both ingress and egress traffic for the host. If you are interested only in packets where 192.168.0.2 is the source or the destination (or both), just prepend the filter with src, dst, or src and dst, respectively. The latter is quite rare. It is also a good idea to filter out ARP – which you are not normally interested in – to reduce the clutter: host 192.168.0.2 and not arp.

You may also be interested in traffic for a particular service, such as a web server. Here you need two primitives: one saying that the server's IP address must be involved as a source or a destination, and another saying you want TCP traffic to or from port 80. In tcpdump's parlance, this reads host 192.168.0.1 and tcp port 80. For trickier protocols like BitTorrent, you may want to specify multiple ports with portrange.

You can also filter by network addresses (192.168.0.0/24) rather than hosts. You may set limits on the packet size with len. For advanced cases, you may access individual bytes at known offsets using array index notation. The man page has a good example: ip[6:2] & 0x1fff = 0 drops IPv4 fragments other than the first (or the only) one. This filter takes the 16-bit half-word at offset 6 in the IPv4 header and masks out the three highest bits (the flags) to get the fragment offset value. If it's zero, the packet is either non-fragmented (most likely) or the first fragment in a series. To tell the former from the latter, you can check the MF bit within the masked-out flags.
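The same bit arithmetic is easy to replay in Python. In this sketch, the constant and function names are mine; the masks follow the IPv4 header layout, where the low 13 bits of that half-word are the fragment offset and bit 0x2000 is MF ("More Fragments").

```python
import struct

MF_FLAG = 0x2000      # "More Fragments" bit
OFFSET_MASK = 0x1FFF  # low 13 bits: fragment offset in 8-byte units


def fragment_info(ip_header):
    """Return (fragment_offset, more_fragments) from a raw IPv4 header."""
    # Bytes 6 and 7 are exactly what the filter expression ip[6:2] reads.
    (flags_and_offset,) = struct.unpack("!H", ip_header[6:8])
    return flags_and_offset & OFFSET_MASK, bool(flags_and_offset & MF_FLAG)


def classify(ip_header):
    """Tell apart the three cases discussed in the text."""
    offset, more_fragments = fragment_info(ip_header)
    if offset != 0:
        return "later fragment"  # this is what ip[6:2] & 0x1fff = 0 drops
    if more_fragments:
        return "first fragment"
    return "not fragmented"
```

Feeding classify() the IPv4 header of a captured packet (that is, the bytes right after the Layer 2 header) tells you which case you are looking at.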

For those of you wanting to know how all of this works under the hood, tcpdump can print a low-level BPF representation of your filter (Figure 3). Three modes are available. With tcpdump -d, you get a human-readable BPF disassembly, -dd dumps a C code snippet to use in your programs, and -ddd produces plain decimal numbers you can paste into whatever BPF interpreter you want.

Look at Listing 1 for a BPF disassembly of a simple filter: src 1.2.3.4. I got this output with sudo tcpdump -d src 1.2.3.4. Note that while the compiler itself is unprivileged code, tcpdump still needs CAP_NET_RAW and complains of insufficient permissions to capture otherwise. This is minor but annoying.

Listing 1

BPF Assembly Code for the "src 1.2.3.4" Filter

 

First, the code loads a half-word at offset 12. It's EtherType, and it stores the Layer 3 protocol the Ethernet frame encapsulates. If it's 0x800, or IPv4, a 32-bit word at offset 26 is loaded and checked against 0x1020304 (1.2.3.4 in network byte order). That's where the Source Address field in IPv4 is: The Ethernet header is 14 bytes long, and Source Address is at offset 12 within the IPv4 header. Other branching instructions check for ARP (EtherType 0x806) and RARP (0x8035). We discussed these protocols in a previous issue [7]. The program returns the number of bytes to take from the packet. If the condition matches, it's 256K or the whole packet, whichever is smaller (in IPv4, a packet is never more than 64K; typically 1 to 10K). If not, it's zero, and the packet is discarded.
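To get a feel for how such a program executes, you can emulate it in a few lines of Python. The instruction list below is assembled by hand for this sketch and is not tcpdump's output: it assumes a plain Ethernet frame with a 14-byte header, which puts EtherType at offset 12, the IPv4 Source Address at offset 26, and the ARP/RARP sender address at offset 28. The tuple encoding and the tiny interpreter are my own simplification of classic BPF, covering only the four instructions this filter needs.

```python
import struct

# (operation, arguments...); jump targets are absolute instruction indexes.
PROGRAM = [
    ("ldh", 12),                # 0: load EtherType
    ("jeq", 0x0800, 2, 4),      # 1: IPv4? then 2, else 4
    ("ld", 26),                 # 2: load IPv4 Source Address
    ("jeq", 0x01020304, 8, 9),  # 3: is it 1.2.3.4?
    ("jeq", 0x0806, 6, 5),      # 4: ARP? then 6, else 5
    ("jeq", 0x8035, 6, 9),      # 5: RARP? then 6, else reject
    ("ld", 28),                 # 6: load ARP sender protocol address
    ("jeq", 0x01020304, 8, 9),  # 7: is it 1.2.3.4?
    ("ret", 262144),            # 8: accept, keep up to 256K
    ("ret", 0),                 # 9: reject
]


def run_filter(program, packet):
    """Run the program on a packet; return how many bytes to keep (0 = drop)."""
    accumulator = 0
    pc = 0
    while True:
        operation, *args = program[pc]
        if operation == "ret":
            return args[0]
        if operation in ("ldh", "ld"):
            size, fmt = (2, "!H") if operation == "ldh" else (4, "!I")
            if args[0] + size > len(packet):
                return 0  # a load past the end of the packet rejects it
            (accumulator,) = struct.unpack_from(fmt, packet, args[0])
            pc += 1
        elif operation == "jeq":
            value, if_true, if_false = args
            target = if_true if accumulator == value else if_false
            if target <= pc:
                raise ValueError("backward jump")  # forward only: no loops
            pc = target
        else:
            raise ValueError(f"unknown operation {operation!r}")
```

Because jumps only go forward, every run ends after at most as many steps as there are instructions, which is the property that makes it safe to execute such filters inside the kernel.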

Command of the Month: Scapy

tcpdump and friends do a decent job, but they are mere observers. On the other hand, Scapy [8] can not only capture and decode packets but also forge them and send them over the wire.

We devoted an entire Core Tech to Scapy more than a year ago [9], but it never hurts to refresh the basics. Scapy is a Python framework for dissecting network protocols, and an associated interactive tool, scapy, built on top of it. Each protocol is represented as a Python class, and you combine them with / to build protocol stacks or read their properties to get protocol fields you want. Knowing Python is a bonus for using Scapy (and many other tools, in fact), but not a must. Indeed, do you need it to read the following?

>>> Ether() / IP(dst='1.2.3.4')
<Ether type=0x800 |<IP dst=1.2.3.4 |>>

That's self-explanatory. Scapy could be a sniffer if you tell it to read packets from the network or a .pcap file. Or it could be a fuzzer if you use it to forge random packets and throw them at the target. It can even build beautiful packet visualizations for you, as seen in Figure 4. It doesn't matter whether you are a network professional or just a hobbyist: Having Scapy under your belt may save you some day.

Figure 4: Nearly all sniffers are network protocol decoders, but Scapy also shows you where each value came from.

Infos

  1. packet(7) man page: https://linux.die.net/man/7/packet
  2. "Core Technology" by Valentine Sinitsyn, Linux Magazine, Issue 198, pg. 76
  3. libpcap library home: http://www.tcpdump.org/pcap.html
  4. tcpdump man page: http://www.tcpdump.org/tcpdump_man.html
  5. PF_RING: http://www.ntop.org/products/packet-capture/pf_ring/
  6. pcap-filter(7) man page: https://linux.die.net/man/7/pcap-filter
  7. "Core Technology" by Valentine Sinitsyn, Linux Voice, Issue 31, pg. 94
  8. Scapy homepage: http://www.secdev.org/projects/scapy/
  9. "Core Technology" by Valentine Sinitsyn, Linux Voice, Issue 28, pg. 94

The Author

Valentine Sinitsyn works in a cloud infrastructure team and teaches students completely unrelated subjects. He also has a KDE developer account he never really used.
