Filtering
Retis offers two distinct methods for filtering packets, both of which can operate simultaneously:
- packet-based filtering which filters packets based on their content (headers).
- metadata-based filtering which filters packets based on their associated metadata.
These filtering mechanisms ensure that only relevant packets are reported, so reducing the volume of uninteresting events and improving the efficiency of packet tracing.
Packet
Packet filtering uses a pcap-filter syntax. See man pcap-filter for an
overview on the syntax. Upon execution the pcap-filter gets compiled in cBPF
and subsequently translated into an eBPF program which in turn gets consumed by
the probes.
$ retis collect -f 'tcp port 443'
...
Packet filtering can be of two types: L2 and L3. Retis automatically detects and
generates L2/L3 filters based on the expression. This allows to match both packets
fully formed and packets not having a valid L2 header yet (sk_buff having invalid
mac_header but valid and set network_header). One advantage of this approach is
the ability to match locally generated packets while still allowing matches based
on L2 criteria.
For example, the following filter:
$ retis collect -f 'arp or tcp port 443'
L2+L3 packet filter(s) loaded
...
Internally generates two filters. For probes where only the network_header
is valid and set in the sk_buff the filter would match packets with tcp
source or destination port 443. For sk_buff with valid mac_header both arp
and tcp packets would be matched.
Please note that some limitations exist and they are a direct consequence of
libpcap capabilities.
For example filters like:
$ retis collect -f 'ether broadcast or tcp port 443'
L2 packet filter(s) loaded
...
Will only generate L2 filters, that is, packets will be matched only if
mac_header is set.
For further information about the reason an L3 filter gets skipped, please
use the --log-level debug option, i.e.:
$ retis --log-level debug collect -f 'ether broadcast or tcp port 443'
...
DEBUG Skipping L3 filter generation (Could not compile the filter: libpcap error: not a broadcast link).
INFO L2 packet filter(s) loaded
...
Metadata
Metadata filtering instead allows to write filters that match packets based
on their metadata.
These filters can match against any subfield of the sk_buff and subsequent
inner data structures.
Meta filtering also automatically follows struct pointers, so indirect access to
structures pointed by an sk_buff field is possible.
A filter expression is represented by this pest grammar below:
program = _{ SOI ~ expr ~ EOI }
expr = { primary ~ (infix ~ primary)* }
infix = { and | or }
and = { "and" | "&&" }
or = { "or" | "||" }
primary = { term | "(" ~ expr ~ ")" }
term = { lhs ~ (op ~ rhs)? }
op = { "==" | "!=" | ">=" | "<=" | ">" | "<" }
lhs = { "sk_buff" ~ ("." ~ ident?)+ }
ident = { uident ~ ident_modifiers? }
ident_modifiers = { ":" ~ mask ~ (":" ~ uident)? }
mask = { not? ~ (hex | bin | dec) }
hex = @{ "0x" ~ ('0'..'9' | 'a'..'f' | 'A'..'F')+ }
bin = @{ "0b" ~ ("0" | "1")+ }
dec = @{ ASCII_DIGIT+ }
not = @{ "~" }
uident = @{ so_ident ~ re_ident }
so_ident = _{ ('a'..'z' | 'A'..'Z' | "_") }
re_ident = _{ (so_ident | "_")* }
rhs = { num | string }
num = @{ hex | bin | ext_dec }
ext_dec = @{ (neg)? ~ ASCII_DIGIT+ }
neg = @{ "-" }
string = @{ PUSH("\"" | "'") ~ any_string+ ~ POP }
any_string = { (!PEEK ~ ANY)+ }
WHITESPACE = _{ " " | "\t" }
EOF = _{ EOI | ";" }
An example of a simple filter that respect a previous definition is:
$ retis collect -m 'sk_buff.dev.nd_net.net.ns.inum == 4026531840'
...
The relational operators are:
==for equal to!=for not equal to<and<=for less than and less than or equal to>and>=for greater than and greater than or equal to- if
opandrhsare omitted interm, a not equal to zero numeric comparison is assumed
The boolean operators are:
&&with its alternate syntaxandresulting true if both left and right conditions aretrue,falseotherwise (conjunction)||with its alternate syntaxorresulting true if at least one between left and right conditions aretrue,falseotherwise (disjunction)
Boolean operators have the same precedence, meaning in absence of explicit syntax, they are considered left-associative. In order to define different associativity in the expression, parentheses are available. Parentheses MUST be well-formed and can arbitrarily change the assiociativity of the operators and so the way the expression is evaluated.
For example, the following expression:
$ retis collect -m 'sk_buff.pkt_type == 0x0 || sk_buff.mark == 0x100 && sk_buff.cloned'
logically results in:
0: if sk_buff.pkt_type == 0 goto 4
1: goto 2
2: if sk_buff.mark == 256 goto 4
3: goto 7
4: if sk_buff.cloned != 0 goto 6
5: goto 7
6: true
7: false
which is semantically equivalent to:
$ retis collect -m '(sk_buff.pkt_type == 0x0 || sk_buff.mark == 0x100) && sk_buff.cloned'
conversely, the following:
$ retis collect -m 'sk_buff.pkt_type == 0x0 || (sk_buff.mark == 0x100 && sk_buff.cloned)'
results in:
0: if sk_buff.pkt_type == 0 goto 6
1: goto 2
2: if sk_buff.mark == 256 goto 4
3: goto 7
4: if sk_buff.cloned != 0 goto 6
5: goto 7
6: true
7: false
There's no limit to the use of paretheses and boolean operators, but a short-circuit evaluation strategy is applied, so it's highly suggested to define the expression accordingly in order to avoid unnecessary runtime overhead.
At the moment, only numeric and string comparisons are supported.
The right-hand side (rhs) of numeric matches must be expressed as
literal and can be represented in either base 10, base 16 if
starting with 0x prefix, or base 2 if starting with 0b prefix.
An example of equivalent numeric expressions with different bases are:
sk_buff.mark == 16
sk_buff.mark == 0x10
sk_buff.mark == 0b10000
All the relational operators support numbers (both unsigned in any base
and signed in base 10). The usage of negative numbers is only allowed against
signed members.
Bitfields are supported as well (both signed and unsigned) and they
are treated as regular numbers.
For numeric comparisons, an additional bitwise AND operation can be
performed by specifying a mask.
A mask can be expressed as a hexadecimal number (e.g. 0xdeaf), a
binary number (e.g. 0b01010101), and a regular decimal number and
MUST strictly be positive.
The filtering engine allows you to specify masks up to u64::MAX
with any target. While this approach is safe, ensuring consistency is
the user's responsibility.
The following example demonstrates this approach:
$ retis collect -m 'sk_buff._nfct:0x7 == 0x2'
...
which is equivalent to the following:
(sk_buff->_nfct & NFCT_INFOMASK) == IP_CT_NEW
For strings only the operators equal to and not equal to are supported,
furthermore, the string (rhs) must be enclosed between quotes.
$ retis collect -m 'sk_buff.dev.name == "eth0"'
...
The example above shows how strings can be matched and how they are required to be quoted.
Another useful feature meta filtering expose is the ability to follow pointers embedded in members with a different defined type. For example, the filter below:
$ retis collect -m sk_buff._nfct:~0x7:nf_conn.mark
...
is equivalent to the following:
((struct nf_conn *)(skb->_nfct & NFCT_PTRMASK))->mark != 0
Please, notice in the previous example the operator ~ which is
semantically equivalent to a bitwise not.
Bitwise not is only applicable to masks (in any base).
Metadata filtering, being a BTF-based way of filtering, is theoretically
not limited to sk_buff, so from a generic point of view it can support
all filters under the form struct_type_name.field1.field2.field3 with
the above constraints, but for the time being only struct sk_buff is
supported.
This implies that the sk_buff keyword MUST always be present and MUST
always appear first in each expression.
It is possible to combine packet and meta filtering, and doing so is just a matter of specifying their respective options and filters.
$ retis collect -f 'tcp port 443' -m 'sk_buff.dev.name == "eth0"'
...
The above options will be concatenated, meaning that both filters must match in order to have a match and generate events for packets.