Finding strategy patterns in packet captures

Packet capture tools like Wireshark collect network traffic. The resulting capture files are large, too large to be examined packet by packet. Here we show how you can use strategy patterns to analyze these files.

We will use an example from the National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition, specifically this file (194M). The data is from competitions where one team tries to defend an organization's network from attacks by another team.

Even though the data file consists of millions of packets, we can organize the data into "flows". A flow is a group of related packets that make up one conversation between two computers. We use the term Source for the computer that initiates the conversation and Destination for the one that receives the initial call. The packet capture file can be read using libpcap or the related Java package jNetPcap. This file captures less than one hour of traffic, yet it contains nearly a million flows. Even at the flow level, then, there is too much data to analyze by inspecting the flows one at a time.
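As a rough illustration of how packets can be grouped into flows, here is a minimal Python sketch. It assumes the packets have already been decoded into simple records. The names Packet, Flow and group_into_flows are hypothetical and are not part of libpcap or jNetPcap. The sketch also ignores timeouts and TCP connection state, which a real flow builder would have to handle.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical decoded packet record (not a libpcap structure).
    time: float
    src: str
    sport: int
    dst: str
    dport: int
    proto: str


class Flow(NamedTuple):
    source: str        # endpoint that sent the first packet
    destination: str   # endpoint that received the first packet
    dest_port: int     # port the source connected to
    proto: str
    packets: int       # packets seen in both directions


def group_into_flows(packets):
    """Group packets into flows keyed by protocol and the two endpoints."""
    flows = {}
    for p in sorted(packets, key=lambda pkt: pkt.time):
        # Sort the endpoints so both directions share one key.
        endpoints = tuple(sorted([(p.src, p.sport), (p.dst, p.dport)]))
        key = (p.proto,) + endpoints
        if key not in flows:
            # The first packet of a conversation decides who the Source is.
            flows[key] = [p.src, p.dst, p.dport, p.proto, 0]
        flows[key][4] += 1
    return [Flow(*fields) for fields in flows.values()]


# Toy data: one HTTP conversation (request and reply) and one HTTPS packet.
sample = [
    Packet(0.0, "10.0.0.5", 40000, "10.0.0.9", 80, "tcp"),
    Packet(0.1, "10.0.0.9", 80, "10.0.0.5", 40000, "tcp"),
    Packet(0.2, "10.0.0.5", 40001, "10.0.0.9", 443, "tcp"),
]
for flow in group_into_flows(sample):
    print(flow)
```

Sorting the two endpoints makes the request and the reply land in the same flow, while the earliest packet still records which side is the Source.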

Even though there is a lot of data here, there are only a few sources: 62 in this case. Let us consider the sources to be potential troublemakers. Each source may connect to a number of destinations (there are over 700 destinations in this data). When connecting to a destination, the source connects to a specific TCP or UDP port on that destination. For example, HTTP connections are usually made to port 80 and HTTPS connections to port 443.

The following two pictures show two different behaviors of sources. The first one is a "bad" source, which is trying to get into some destination by trying a lot of different ports. This is like a real-life burglar testing each window of a house to see if any of them has been left open.

Example 1

This shows part of the strategy for the first source, called Src8. This source tries many destinations, and on many of them it tries to connect to a large number of ports. If we draw this behavior as a tree, we see a deep and wide tree: somebody is looking for any way into any computer within a pool of destination computers.
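The tree described above can be sketched in a few lines of Python. This is only an illustration under assumptions: flows are reduced to (source, destination, port) triples, the function names are hypothetical, and the data is a toy example rather than the competition capture.

```python
from collections import defaultdict


def build_strategy_tree(flows):
    """Map each source to its destinations, and each destination to its ports."""
    tree = defaultdict(lambda: defaultdict(set))
    for source, destination, port in flows:
        tree[source][destination].add(port)
    return tree


def tree_shape(tree, source):
    """Return (number of destinations, number of distinct destination/port leaves)."""
    destinations = tree.get(source, {})
    leaves = sum(len(ports) for ports in destinations.values())
    return len(destinations), leaves


def render(tree, source):
    """Draw the tree for one source as indented text."""
    lines = [source]
    destinations = tree.get(source, {})
    for destination in sorted(destinations):
        lines.append("  " + destination)
        for port in sorted(destinations[destination]):
            lines.append("    port " + str(port))
    return "\n".join(lines)


# Toy flows: Src8 probes several ports, Src9 makes one ordinary connection.
toy_flows = [
    ("Src8", "D1", 21), ("Src8", "D1", 22), ("Src8", "D1", 23),
    ("Src8", "D2", 21), ("Src8", "D2", 22),
    ("Src9", "D1", 80),
]
toy_tree = build_strategy_tree(toy_flows)
print(render(toy_tree, "Src8"))
print(tree_shape(toy_tree, "Src8"))
```

Repeated flows to the same port collapse into a single leaf, so the size of the tree reflects how many different ports were tried rather than how much traffic was sent.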

By contrast, when somebody wants to visit a house, they come in through one of the few doors and do not go around the house looking for an unlocked window.

Example 2

The second picture shows the other source, Src9. This one connects to several destinations, but on each destination it connects only to one or two ports. The strategy of a normal visitor is to simply transact some expected operations.

Our analysis reads the large capture file containing about a million flows. From this it identifies a few sources that are "bad actors", trying to get into computers in a local area network by probing various ports. The first picture shows an attacker's strategy. The second picture shows the strategy of a normal visitor. The two are quite distinct.
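One simple way to separate the two kinds of strategy is to count how many different ports a source tries on any single destination. The sketch below is not the analysis used for the pictures. It is a minimal illustration that assumes flows are given as (source, destination, port) triples, and the max_ports threshold of 5 is an arbitrary, hypothetical cutoff.

```python
from collections import defaultdict


def widest_port_fanout(flows):
    """For each source, find the largest number of distinct ports tried on one destination."""
    ports = defaultdict(set)
    for source, destination, port in flows:
        ports[(source, destination)].add(port)
    widest = defaultdict(int)
    for (source, _destination), port_set in ports.items():
        widest[source] = max(widest[source], len(port_set))
    return dict(widest)


def flag_scanners(flows, max_ports=5):
    """Return sources that tried more than max_ports ports on some destination.

    max_ports is an assumed cutoff chosen for illustration only.
    """
    fanout = widest_port_fanout(flows)
    return sorted(source for source, count in fanout.items() if count > max_ports)


# Toy data: Src8 walks through 20 ports on D1, Src9 uses only ports 80 and 443.
toy_flows = [("Src8", "D1", port) for port in range(20, 40)]
toy_flows += [("Src9", "D1", 80), ("Src9", "D1", 443), ("Src9", "D2", 80)]
print(flag_scanners(toy_flows))
```

A fixed cutoff is crude. A source that scans slowly, or spreads a few ports over many destinations, would need a different measure, such as the total size of its tree.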

In this case, the captured traffic is largely made up of the attacker's operations, with a few normal visitors in between. These strategies explain "what is going on" in the whole capture file.