Retis - Tracing packets in the Linux networking stack & friends

Retis aims to improve visibility into what happens in the Linux networking stack and in different control and/or data paths, some of which can be in userspace. It works either in a single collect-and-display phase or in a collect-then-process fashion.

Event collection

The entry point for most use cases is the collect command, which installs probes and gathers events, either for instant reporting on the console, for later processing by writing them to a file, or both. In addition to collect-level options, retis collect has the concept of collectors. Collectors can be enabled individually and act on different parts of the networking stack to retrieve specific information.

Currently supported collectors are listed below. By default, Retis tries to load every collector whose requirements are met (e.g. the ovs collector needs the OpenVSwitch kernel module to be loaded). Collectors can also be selected explicitly, in which case an error is returned if their prerequisites are not met. If no specific option is used, Retis outputs the events to the console.

$ retis collect
00:42:00 [INFO] Collector(s) started: skb-tracking, skb, skb-drop, ovs, nft, ct
00:42:01 [INFO] 5 probe(s) loaded
...
$ retis collect -c skb,skb-drop
00:42:00 [INFO] 4 probe(s) loaded
...
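
The default-versus-explicit selection rule described above can be pictured with a short Python sketch. This is only an illustration and not Retis' implementation: the function name select_collectors and the prerequisites_met mapping are hypothetical.

```python
def select_collectors(prerequisites_met, requested=None):
    """Return the list of collectors to start.

    prerequisites_met: hypothetical mapping of collector name -> bool,
        telling whether that collector's requirements are satisfied.
    requested: collectors explicitly selected by the user, or None.
    """
    if requested is None:
        # Default mode: skip collectors whose prerequisites are missing.
        return [name for name, ok in prerequisites_met.items() if ok]

    # Explicit mode: a missing prerequisite is an error.
    for name in requested:
        if not prerequisites_met.get(name, False):
            raise RuntimeError(f"collector '{name}': prerequisites not met")
    return list(requested)


state = {"skb": True, "skb-drop": True, "ovs": False}
print(select_collectors(state))            # default: ovs is skipped
print(select_collectors(state, ["skb"]))   # explicit selection
```

In this model, explicitly requesting ovs while its prerequisites are not met raises an error instead of silently skipping it.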

In order to allow post-processing, events need to be stored in a file. This is done using the -o option (the file name defaults to retis.data). To also output the events to the console in parallel, add --print.

$ retis collect -c skb,skb-drop,skb-tracking -o
00:42:00 [INFO] 4 probe(s) loaded
...
$ retis collect -c skb,skb-drop,skb-tracking -o --print
00:42:00 [INFO] 4 probe(s) loaded
...

Collectors

Collectors are responsible for filling events and target specific areas or data types. Some, but not all, install specific probes to build their events. Currently supported collectors are:

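| Collector    | Data collected     | Installs probes |
|--------------|--------------------|-----------------|
| skb          | Packet information | No              |
| skb-drop     | Drop reason        | Yes (1)         |
| skb-tracking | Packet tracking id | No[^1]          |
| ovs          | OpenVSwitch data   | Yes (many)      |
| nft          | Nftables context   | Yes (1)         |
| ct           | Conntrack info     | No              |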

See retis collect --help for a description of each collector and its command line arguments.

[^1]: Probes for tracking packets are always installed by the core.

Post-processing

Events stored in a file can be formatted and displayed to the console using the simple print command (the events filename used for the input defaults to retis.data).

$ retis print
...

Events can also be post-processed. Retis traces packets across the networking stack, so the same packet can be seen multiple times (e.g. in the IP stack, TCP stack, OvS stack and netfilter stack; sometimes multiple times in each subsystem, depending on which probes were loaded). The sort command uses information reported by the skb-tracking and ovs collectors to identify unique packets and to group and reorder the events, so that a given packet can be followed through the stack.

$ retis collect --allow-system-changes -p kprobe:ip_local_deliver \
        --nft-verdicts drop -f 'udp port 8080' -o --print
...
$ retis sort

3316376152002 [swapper/2] 0 [k] ip_local_deliver #304276b119fffff9847c36ba800 (skb 18446630032886128640) n 0
  if 2 (eth0) rxif 2 172.16.42.1.40532 > 172.16.42.2.8080 ttl 64 tos 0x0 id 14042 off 0 [DF] len 32 proto UDP (17) len 4
  + 3316376220653 [swapper/2] 0 [k] __nft_trace_packet #304276b119fffff9847c36ba800 (skb 18446630032886128640) n 1
    if 2 (eth0) rxif 2 172.16.42.1.40532 > 172.16.42.2.8080 ttl 64 tos 0x0 id 14042 off 0 [DF] len 32 proto UDP (17) len 4
    table firewalld (1) chain filter_IN_FedoraServer (202) handle 215 drop
  + 3316376224687 [swapper/2] 0 [tp] skb:kfree_skb #304276b119fffff9847c36ba800 (skb 18446630032886128640) n 2 drop (NETFILTER_DROP)
    if 2 (eth0) rxif 2 172.16.42.1.40532 > 172.16.42.2.8080 ttl 64 tos 0x0 id 14042 off 0 [DF] len 32 proto UDP (17) len 4
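
The grouping done by sort can be pictured with a minimal Python sketch. It is an illustration only: the event dictionaries and their field names (timestamp, tracking_id, probe) are hypothetical and do not reflect the format of retis.data, and the sketch ignores the additional information provided by the ovs collector.

```python
from collections import defaultdict


def sort_events(events):
    """Group events by tracking id, then order them by timestamp."""
    series = defaultdict(list)
    for event in events:
        series[event["tracking_id"]].append(event)

    # Within a series, events follow the packet's path through the stack.
    for group in series.values():
        group.sort(key=lambda e: e["timestamp"])

    # Series themselves are ordered by the time their first event was seen.
    return sorted(series.values(), key=lambda g: g[0]["timestamp"])


events = [
    {"timestamp": 30, "tracking_id": "a", "probe": "skb:kfree_skb"},
    {"timestamp": 20, "tracking_id": "b", "probe": "ip_local_deliver"},
    {"timestamp": 10, "tracking_id": "a", "probe": "ip_local_deliver"},
]
for group in sort_events(events):
    print([e["probe"] for e in group])
```

Events sharing a tracking id end up in the same series, which mirrors the indented "+" lines in the output above.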

Another post-processing command, pcap, can be used to generate pcap-ng files from a set of stored Retis events. For this to work the collection has to be done using (at least) the pcap profile. For now pcap-ng files can be generated by filtering the events using a single probe.

$ retis -p pcap,generic collect -o
$ retis pcap --probe tp:net:netif_receive_skb | tcpdump -nnr -
$ retis pcap --probe tp:net:net_dev_start_xmit -o retis.pcap
$ wireshark retis.pcap

Some post-processing commands (e.g. print, sort) can generate long output. When the output is larger than the current terminal, a pager is used automatically. By default less is used, but another pager can be chosen by setting the PAGER environment variable, and paging can be disabled by setting NOPAGER.

$ PAGER=more retis sort
$ NOPAGER=1 retis sort
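
The pager selection described above can be sketched as follows. This is a hypothetical model, not Retis' code: the function name choose_pager is made up, and the sketch assumes that NOPAGER takes precedence over PAGER, that any value of NOPAGER disables paging, and that "larger than the terminal" means more lines than terminal rows.

```python
def choose_pager(env, output_lines, terminal_rows):
    """Return the pager command to use, or None to print directly."""
    if "NOPAGER" in env:
        # Assumption: setting NOPAGER to any value disables paging.
        return None
    if output_lines <= terminal_rows:
        # The output fits in the terminal, no pager needed.
        return None
    # Use the requested pager, falling back to less.
    return env.get("PAGER", "less")


print(choose_pager({}, 100, 24))
print(choose_pager({"PAGER": "more"}, 100, 24))
print(choose_pager({"NOPAGER": "1"}, 100, 24))
```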

Profiles and customization

Retis has the concept of profiles, which are predefined sets of CLI arguments (e.g. collectors for the collect command). Profiles are meant to improve the user experience by providing a comprehensive and consistent Retis configuration targeting predefined topics.

$ retis -p generic collect
...

Available profiles can be listed using the profile command.

$ retis profile list
...

Profiles can be extended with CLI arguments, and CLI arguments can also be used without a profile. One example is adding probes while collecting events.

$ retis -p dropmon collect -p tp:skb:consume_skb
...
$ retis collect -p tp:skb:kfree_skb -p kprobe:ovs_ct_clear
...

New profiles can be written and used if stored in /etc/retis/profiles or $HOME/.config/profiles. An example profile with inline comments is provided with the project. If a profile is generic enough, consider contributing it!

Filtering

Tracing packets can generate a lot of events, some of which are not interesting. Retis implements filtering logic to only report packets that match the filter or that are being tracked (see Tracking). Retis has two ways of filtering, and both can coexist: one is based on the packet content, the other on metadata.

Packet filtering uses the pcap-filter syntax. See man pcap-filter for an overview of the syntax. Upon execution, the filter is compiled into cBPF and then translated into an eBPF program, which is in turn consumed by the probes.

$ retis collect -f 'tcp port 443'
...

Packet filtering can be of two types: L2 and L3. Retis automatically detects which types apply and generates L2 and/or L3 filters based on the expression. This allows matching both fully formed packets and packets that do not have a valid L2 header yet (an sk_buff with an invalid mac_header but a valid and set network_header). One advantage of this approach is the ability to match locally generated packets while still allowing matches based on L2 criteria.

For example, the following filter:

$ retis collect -f 'arp or tcp port 443'
L2+L3 packet filter(s) loaded
...

internally generates two filters. For probes where only the network_header is valid and set in the sk_buff, the filter matches packets with TCP source or destination port 443. For an sk_buff with a valid mac_header, both ARP packets and TCP packets using port 443 are matched. Note that some limitations exist, as a direct consequence of libpcap capabilities. For example, filters like:

$ retis collect -f 'ether broadcast or tcp port 443'
L2 packet filter(s) loaded
...

will only generate L2 filters; that is, packets will be matched only if mac_header is set. To find out why an L3 filter was skipped, use the --log-level debug option, e.g.:

$ retis --log-level debug collect -f 'ether broadcast or tcp port 443'
...
DEBUG Skipping L3 filter generation (Could not compile the filter: libpcap error: not a broadcast link).
INFO  L2 packet filter(s) loaded
...
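
To picture how the two filters generated for 'arp or tcp port 443' behave, here is a minimal Python model. It is a sketch under simplifying assumptions: the dictionary-based packet representation and all function names are hypothetical, whereas the real filters are eBPF programs operating on raw packet data.

```python
def tcp_port_443(pkt):
    """True for TCP packets whose source or destination port is 443."""
    return pkt.get("l4") == "tcp" and 443 in (pkt.get("sport"), pkt.get("dport"))


def l2_filter(pkt):
    # mac_header is valid: the whole expression can be evaluated.
    return pkt.get("ethertype") == "arp" or tcp_port_443(pkt)


def l3_filter(pkt):
    # Only network_header is valid: there is no L2 header to inspect,
    # so only the 'tcp port 443' part of the expression can match.
    return tcp_port_443(pkt)


def matches(pkt, mac_header_valid):
    """Pick the filter according to which headers are set in the sk_buff."""
    return l2_filter(pkt) if mac_header_valid else l3_filter(pkt)


locally_generated = {"l4": "tcp", "sport": 40532, "dport": 443}
print(matches(locally_generated, mac_header_valid=False))
print(matches({"ethertype": "arp"}, mac_header_valid=True))
```

With an expression such as 'ether broadcast or tcp port 443', only the equivalent of l2_filter would exist, so packets seen without a valid mac_header would never match.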

Metadata filtering, instead, allows writing filters that match packets based on their metadata. Metadata filters can match against any subfield of the sk_buff and of its nested data structures. Metadata filtering also automatically follows struct pointers, so structures pointed to by an sk_buff field can be accessed indirectly.

$ retis collect -m 'sk_buff.dev.nd_net.net.ns.inum == 4026531840'
...

The comparison operators are:

  1. "==" for equal to
  2. "!=" for not equal to
  3. "<" and "<=" for less than and less than or equal to
  4. ">" and ">=" for greater than and greater than or equal to

At the moment, only number and string comparisons are supported. The right-hand side (rhs) of a numeric match must be expressed as a literal, in either base 10 or base 16, the latter starting with the 0x prefix. All comparison operators support numbers (both signed and unsigned). Bitfields (both signed and unsigned) are supported as well and are treated as regular numbers.

For strings, only the equal to operator (==) is supported, and the string must be enclosed in quotes.

$ retis collect -m 'sk_buff.dev.name == "eth0"'
...

The example above shows how strings can be matched and how they are required to be quoted.

Metadata filtering is BTF-based, so in theory it is not limited to sk_buff: it could support any filter of the form struct_type_name.field1.field2.field3, subject to the above constraints. For the time being, however, only struct sk_buff is supported. This implies that the sk_buff keyword MUST always be present and MUST always appear first.
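
The syntax rules above can be summarized with a small Python validator. It is a hedged sketch and not Retis' parser: the function name parse_meta_filter is hypothetical, only double quotes are accepted for strings (an assumption based on the example above), and sk_buff.mark is used purely as an illustrative field.

```python
import re

# field path, comparison operator, right-hand side
_PATTERN = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$")


def parse_meta_filter(expr):
    """Split a metadata filter into (fields, operator, value)."""
    m = _PATTERN.match(expr)
    if m is None:
        raise ValueError(f"malformed filter: {expr!r}")
    path, op, rhs = m.groups()

    fields = path.split(".")
    # sk_buff must be present, come first, and be followed by a field.
    if fields[0] != "sk_buff" or len(fields) < 2 or "" in fields:
        raise ValueError("filter must be of the form sk_buff.field[.field...]")

    # Quoted rhs: string comparison, only == is allowed.
    if len(rhs) >= 2 and rhs[0] == '"' and rhs[-1] == '"':
        if op != "==":
            raise ValueError("strings only support the == operator")
        return fields, op, rhs[1:-1]

    # Otherwise a numeric literal in base 10, or base 16 with a 0x prefix.
    base = 16 if rhs.lower().startswith(("0x", "-0x")) else 10
    return fields, op, int(rhs, base)


print(parse_meta_filter('sk_buff.dev.name == "eth0"'))
print(parse_meta_filter("sk_buff.mark >= 0x10"))
```

A filter that does not start with sk_buff, or that uses an operator other than == with a string, raises ValueError in this sketch.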

It is possible to combine packet and meta filtering, and doing so is just a matter of specifying their respective options and filters.

$ retis collect -f 'tcp port 443' -m 'sk_buff.dev.name == "eth0"'
...

The two filters are combined with a logical AND: a packet must match both of them for events to be generated.

Metadata filtering has some known limitations; in particular, only one field at a time can be matched.

Tracking

Retis does its best to track packets in the networking stack, and does it in different ways. Note that tracking packets is not a built-in feature of the Linux kernel and doing so is complex and cannot be 100% foolproof (but of course bugs should be reported and fixed).

  1. A Retis core built-in feature generates unique identifiers by tracking the data part of socket buffers. The socket buffer is also included in the identifier so we can track clones and friends. This core skb tracking logic is used by the filtering part so that Retis can keep tracking packets after they have been modified (e.g. by NAT). Full details on the implementation can be found in the sources.

  2. A collector, skb-tracking, retrieves the core tracking information (unique identifier and socket buffer address) and reports it in the event. Without enabling this collector, skb tracking information won't be reported and can't be used at post-processing time.

  3. The ovs collector tracks packets in upcalls so we can follow a packet being sent to the OpenVSwitch user-space daemon, even if it is re-injected later on.