A study in detecting network intruders

Console Shark

The graphical front end of Wireshark is very well suited to offline analysis, but if you are dealing with a large volume of data, the graphical display can be very memory intensive. The command-line counterpart, TShark [6], can jump into the breach; it supports the same protocol analyzers as Wireshark. Like tcpdump, TShark delivers its output at the command line, such as all HTTP packets that use the SEARCH request method. Figure 2 shows TShark inspecting a capture file.
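
Such a search can be phrased as a display filter. A minimal example, run against the capture file from Figure 2:

tshark -r camera2.pcap -Y 'http.request.method == "SEARCH"'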

Figure 2: Using a display filter, TShark works its way through a capture file (camera2.pcap).

If the search at the beginning of the analysis is still somewhat imprecise, you can apply the full gamut of standard Linux tools, such as grep, Awk, Perl, or Python, to look for patterns.
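
For example, a quick frequency count of the source addresses combines TShark's field output with the usual pipeline tools (the field choice here is just an example):

tshark -r camera2.pcap -Y ip -T fields -e ip.src | sort | uniq -c | sort -rn | head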

There are several ways to filter; the most common is the -Y option, which takes a Wireshark display filter. If you also specify the -e option, TShark only displays selected fields. This means that the output only contains information relevant to the search (for example, the source and destination IP addresses) and does not unnecessarily grow what is potentially already a large haystack.
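
Combined with the -T fields output format (more on -T in a moment), a call like the following reduces each HTTP request to its endpoints and host header:

tshark -r camera2.pcap -Y 'http.request' -T fields -e ip.src -e ip.dst -e http.host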

The -e option requires the -T option, which defines the output format. A call can then look something like the following:

tshark -T json -e ip.addr -r Capture_File

Thanks to the -T json parameter, TShark outputs structured JSON data, so that further processing software no longer needs a custom parser and can rely on the JSON support in its standard library.
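
A few lines of Python are then enough to consume the output. The following sketch assumes the layout that current TShark versions emit, with the selected fields stored below _source.layers, and prints the address pair of each packet:

tshark -r camera2.pcap -T json -e ip.addr | python3 -c '
import json, sys
for pkt in json.load(sys.stdin):
    print(pkt["_source"]["layers"].get("ip.addr", []))'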

Later versions of the malware used DNS to find the C&C server. To do so, the attackers used the local DNS server. This approach gives the admin two options: you can evaluate all DNS traffic, or, if Bind is used as a recursive resolver, you can take a look at the database and see if you come across any suspicious entries.

The rndc dumpdb command creates a dump of all cached entries in the named_dump.db file. The results are stored in the directory defined for the name server below Options | Directory in named.conf. This file contains a large zone file with all the records that the name server knows at this time. You will want to inspect the A records, which are used to associate a domain name with an IP address.
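
A rough first pass over the dump could look like the following sketch; it checks two field positions because the record type column shifts depending on whether the dump includes the IN class, and the path assumes the directory setting used below:

awk '$3 == "A" || $4 == "A" { print $1, $NF }' /var/lib/bind/named_dump.db | sort | less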

If the attackers are more cunning, you should also use grep to check whether machines on the local network are requesting unusual record types from the Internet. The following command returns a simple list of all requested hostnames (but also of other records):

egrep '^[a-zA-Z0-9]' /var/lib/bind/named_dump.db | grep -v PTR | sort | less

Sorting helps with manual analysis. However, a search for the domains is also useful. The following command searches for the cached SOA records:

grep SOA /etc/bind/dns/named_dump.db | awk '{print $1}' | sort

Unfortunately, this file does not contain any information about who might have made a suspicious request. But the results of the analysis can be fed back into TShark to filter the DNS requests for the requester. Since attackers who use DNS names for their C&C servers often limit their validity to a short period of time and repeatedly move their servers, this form of analysis can be useful. But the question is whether DNS is used at all.
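
For example, once the dump analysis turns up a candidate name (the hostname here is just a placeholder), TShark reveals which internal machine asked for it:

tshark -r camera2.pcap -Y 'dns.qry.name == "suspicious.example.com"' -T fields -e ip.src -e dns.qry.name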

NetFlow

An alternative to TShark and tcpdump is to collect NetFlow data. The NetFlow protocol [7] originates from Cisco and sometimes goes by other names (such as J-Flow at Juniper). The technology collects connection data and exports the data as a UDP stream to a NetFlow collector, which then processes the results.

Several different versions of the NetFlow protocol exist. Version 9 is standardized in RFC 3954. Many manufacturers support version 9, and a matching third-party kernel module, ipt_netflow, is available for Linux. Version 10 of the protocol is better known under the name IPFIX.

The exporter sends the data over IP. For each flow, it reports which source IP address and port communicated with which destination IP address and port. In addition, the exporter transmits the number of packets sent and the bytes transmitted in each direction.

A NetFlow collector helps with the analysis. Many different tools can act as the collector, which is convenient because forensic investigators often have to merge several data sources. This article looks at the NetFlow collector in the ELK stack.

ELK stands for a combination of Elasticsearch, Logstash, and Kibana. The Elastic website [8] describes the simple setup for a single host, where all three ELK components run on one machine. The following call:

bin/logstash --modules netflow --setup -M netflow.var.input.udp.port=port_number

initializes the dashboards in Kibana and also creates some mappings. At the same time, the call starts a Logstash instance that accepts NetFlow packets on the specified port. Listing 1 shows an addition to the conf.d directory that activates the NetFlow service when Logstash is launched.

Listing 1

Logstash with NetFlow

# Minimal pipeline sketch (assumed): receive NetFlow on UDP port 2055
# and hand the decoded flows to a local Elasticsearch instance.
input {
  udp {
    port  => 2055
    codec => netflow
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

If Linux gateways are used, you have two ways to produce NetFlow data: Open vSwitch [9] lets you export NetFlow data for each bridge (see the sketch below). Alternatively, the ipt_netflow kernel module exports flow data in conjunction with iptables rules. You can either install the package that comes with your distribution or use the Git archive [10]. If you build the module yourself, you first need to install the packages required for the build.
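
For the Open vSwitch route, a single ovs-vsctl call attaches an exporter to a bridge. A sketch, assuming a bridge named br0 and the collector address used below:

ovs-vsctl -- set Bridge br0 netflow=@nf -- --id=@nf create NetFlow targets=\"172.25.1.117:2055\"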

The ipt_netflow kernel module needs to be told where to send the data. The call to load it looks like:

modprobe ipt_netflow destination=172.25.1.117:2055,protocol=10

The protocol parameter selects the export format; a value of 10 stands for IPFIX. 2055 is the port number, which matches the Logstash configuration.

Loading the module does not yet make the Linux computer send data; you also need iptables rules with a NETFLOW target. In the simplest case, you would issue the following command:

iptables -I FORWARD -j NETFLOW

If the rule is more specific, the gateway only exports flow data for matching packets. As a forensics expert, you will want to restrict the capture to packets that pass through the gateway and leave the internal network: The task is to find suspicious connections to the outside world.
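
A sketch of such a rule, assuming eth0 faces the internal network and eth1 the Internet:

iptables -I FORWARD -i eth0 -o eth1 -j NETFLOW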

Network Hardware

Classic network hardware often supports the NetFlow protocol without additional modules, although some vendors require an additional license. With network hardware, however, you should make sure that the typically weak CPU of the management unit is not forced to handle the task of collecting the traffic. A router that serves several 100Gbit/s interfaces in hardware can quickly be overwhelmed by this volume of traffic.

The sample rate is the key to controlling the balance between data collection and system overload. If tweaking the sample rate does not solve the performance issues, you would have to set up a mirror port and install a Linux computer to do the exporting. In the first step, this installation collects the data; Kibana (Figure 3) is then used to view it. The NetFlow: Conversation Partners and NetFlow: Traffic Analysis dashboards provide an overview.

Figure 3: You can adapt the Kibana dashboard to fit your different needs.

When searching for the needle(s) in the haystack, you can now use the Kibana filter function.
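
For example, a Lucene-style query in the Kibana search bar masks out internal targets; the field name here assumes the schema that the Logstash NetFlow module creates:

NOT netflow.ipv4_dst_addr:"10.0.0.0/8"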

Timeline visualizations that plot the number of packets or bytes flowing from the network to various targets over time can help discover conspicuous behavior in normal office operation. Even without a proxy, admins should schedule updates for off-peak hours and then take a close look at any peaks that still occur during those quiet periods. Again, network admins will want to ignore known connections and inspect the remaining traffic (Figure 4).

Figure 4: This 24-hour graph lets you drill down to focus on night-time peaks.

Searches that Kibana cannot map directly, such as more complex statistical evaluations, are possible in the ELK constellation through the Elasticsearch API.
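
A sketch of such a query, with the index pattern and field names assuming the defaults of the Logstash NetFlow module, sums up the bytes sent to the ten busiest destinations:

curl -s 'localhost:9200/netflow-*/_search' -H 'Content-Type: application/json' -d '{
  "size": 0,
  "aggs": {
    "top_talkers": {
      "terms": { "field": "netflow.ipv4_dst_addr", "size": 10 },
      "aggs": { "bytes": { "sum": { "field": "netflow.in_bytes" } } }
    }
  }
}'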

