libbpf-tools: add tcpdrop to trace TCP packet drops #5329

Amaindex · 2025-06-12T07:45:49Z

Added tcpdrop tool, consisting of tcpdrop.bpf.c and tcpdrop.c, to trace TCP kernel-dropped packets using eBPF. Supports IPv4/IPv6 filtering and network namespace filtering, with output including timestamp, PID, IP addresses, ports, TCP state, and drop reason. Based on tcptop(8) from BCC.

Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>

Amaindex · 2025-06-18T05:53:44Z

Hi @chenhengqi , we’ve got a C version of tcpdrop in this PR (#5329), sticking close to the Python version’s features and options. Could you take a peek when you’ve got a sec? Would love your thoughts :)

libbpf-tools/tcpdrop.c

libbpf-tools/Makefile

libbpf-tools/tcpdrop.bpf.c

libbpf-tools/tcpdrop.c

Amaindex · 2025-07-01T08:00:38Z

Hi @ekyooo and @chenhengqi ,

Thanks for the great feedback! I've made the following updates based on your suggestions:

Switched to ksyms__load and ksyms__map_addr for symbol resolution in tcpdrop.c.
Updated tcpdrop.bpf.c and tcpdrop.c to follow Linux kernel coding style.
Improved IPv6 address handling with __u32 saddr_v6[4] and in6_u.u6_addr32 in both files.
Removed bpf_printk debug statements from tcpdrop.bpf.c.
Added /tcpdrop to .gitignore.
Moved event struct to tcpdrop.h to avoid duplication.

Please take a look and let me know if there's anything else I can tweak!
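
To illustrate the IPv6 item in the list above, here is a minimal userspace sketch of how an address carried as four 32-bit words could be turned into text. The struct `addr_event` and the helper `format_saddr_v6` are hypothetical names for illustration only; the real event layout lives in tcpdrop.h, and the sketch uses `uint32_t` in place of `__u32` so it builds without kernel headers. It assumes the words are stored in network byte order, exactly as copied from `in6_u.u6_addr32`.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down event layout (assumption, not the PR's struct). */
struct addr_event {
	uint32_t saddr_v6[4]; /* four words in network byte order */
};

/*
 * Copy the four words into a struct in6_addr and format them.
 * Returns buf on success, NULL on failure (as inet_ntop does).
 */
static const char *format_saddr_v6(const struct addr_event *e,
				   char *buf, socklen_t len)
{
	struct in6_addr addr;

	memcpy(&addr, e->saddr_v6, sizeof(addr));
	return inet_ntop(AF_INET6, &addr, buf, len);
}
```

A caller would pass a buffer of at least INET6_ADDRSTRLEN bytes. Copying whole words rather than individual bytes keeps the BPF side to four assignments.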

Amaindex · 2025-07-01T08:01:09Z

Hi @chenhengqi ,

Regarding your suggestion to copy reason enums from the kernel for tcpdrop, we previously used this approach in tcpdrop.py. However, recent experience shows these enums vary across kernel versions and distros, and they're easy to verify. So, I think dynamic loading via parse_reason_enum is more robust. It might be good to update tcpdrop.py to match this approach for consistency. What do you think, or is there another way to handle this?

- Use ksyms__load and ksyms__map_addr for kernel symbol resolution.
- Follow Linux kernel coding style in tcpdrop.bpf.c and tcpdrop.c.
- Optimize IPv6 address handling with __u32 arrays and in6_u.u6_addr32.
- Remove bpf_printk debug statements from tcpdrop.bpf.c.
- Add /tcpdrop to .gitignore to exclude the binary.
- Define event struct in tcpdrop.h to prevent duplicate definitions.
- Check drop reason with bpf_core_field_exists in tcpdrop.bpf.c.

Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>

libbpf-tools/.gitignore

libbpf-tools/tcpdrop.h

libbpf-tools/tcpdrop.bpf.c

chenhengqi · 2025-07-02T12:28:19Z

Do you have an example of these enums varying across kernel versions and distros?
We have enum skb_drop_reason in vmlinux.h.

Amaindex · 2025-07-03T06:12:39Z

Take NETFILTER_DROP as an example. In kernel v5.15.186, as you can see in include/linux/skbuff.h, the skb_drop_reason enum lists NETFILTER_DROP as the 7th value (index 6):

enum skb_drop_reason {
    SKB_DROP_REASON_NOT_SPECIFIED,  /* 0 */
    SKB_DROP_REASON_NO_SOCKET,      /* 1 */
    SKB_DROP_REASON_PKT_TOO_SMALL,  /* 2 */
    SKB_DROP_REASON_TCP_CSUM,       /* 3 */
    SKB_DROP_REASON_SOCKET_FILTER,  /* 4 */
    SKB_DROP_REASON_UDP_CSUM,       /* 5 */
    SKB_DROP_REASON_NETFILTER_DROP, /* 6 */
    ...
};

This is reflected in the tracepoint format for /sys/kernel/debug/tracing/events/skb/kfree_skb/format, where NETFILTER_DROP is mapped to index 6 in the __print_symbolic output.

Now, fast forward to kernel v6.15.4, and things shift in include/net/dropreason-core.h. The skb_drop_reason enum has new entries, and NETFILTER_DROP moves to index 12:

enum skb_drop_reason {
    SKB_NOT_DROPPED_YET,             /* 0 */
    SKB_CONSUMED,                    /* 1 */
    SKB_DROP_REASON_NOT_SPECIFIED,   /* 2 */
    SKB_DROP_REASON_NO_SOCKET,       /* 3 */
    SKB_DROP_REASON_SOCKET_CLOSE,    /* 4 */
    SKB_DROP_REASON_SOCKET_FILTER,   /* 5 */
    SKB_DROP_REASON_SOCKET_RCVBUFF,  /* 6 */
    SKB_DROP_REASON_UNIX_DISCONNECT, /* 7 */
    SKB_DROP_REASON_UNIX_SKIP_OOB,   /* 8 */
    SKB_DROP_REASON_PKT_TOO_SMALL,   /* 9 */
    SKB_DROP_REASON_TCP_CSUM,        /* 10 */
    SKB_DROP_REASON_UDP_CSUM,        /* 11 */
    SKB_DROP_REASON_NETFILTER_DROP,  /* 12 */
    ...
};

The tracepoint format in v6.15.4 confirms this, with NETFILTER_DROP now at index 12 in the __print_symbolic output. This isn’t just a case of appending new values at the end—new entries like SKB_CONSUMED, SOCKET_CLOSE, SOCKET_RCVBUFF, etc., are inserted in the middle, shuffling the indices around.

Considering the skb_drop_reason index changes across kernel versions, parse_reason_enum for dynamic loading feels more adaptable than hardcoding the enums.
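
As a rough illustration of the dynamic approach, the sketch below pulls value/name pairs out of a tracepoint "print fmt" string at run time instead of hardcoding the enum. It is not the PR's parse_reason_enum: the names `parse_reason_pairs`, `reason_name`, and `reason_names` are hypothetical, and the sketch assumes the format text contains `__print_symbolic(...)` followed by entries shaped like `{ 6, "NETFILTER_DROP" }`. Reading the format file from disk is left out so the parsing step stands alone.

```c
#include <stdio.h>
#include <string.h>

#define MAX_REASONS     128
#define REASON_NAME_LEN 64

/* reason_names[v] holds the name for value v, or "" when unknown. */
static char reason_names[MAX_REASONS][REASON_NAME_LEN];

/*
 * Scan fmt for { <number>, "<NAME>" } pairs after __print_symbolic
 * and record them. Returns the number of pairs stored.
 */
static int parse_reason_pairs(const char *fmt)
{
	const char *p = strstr(fmt, "__print_symbolic");
	int count = 0;

	if (!p)
		return 0;

	while ((p = strchr(p, '{')) != NULL) {
		char name[REASON_NAME_LEN];
		int value;

		if (sscanf(p, "{ %d , \"%63[^\"]\"", &value, name) == 2 &&
		    value >= 0 && value < MAX_REASONS) {
			snprintf(reason_names[value], REASON_NAME_LEN,
				 "%s", name);
			count++;
		}
		p++; /* step past this brace and look for the next one */
	}
	return count;
}

/* Look up a drop reason by value, falling back to "UNKNOWN". */
static const char *reason_name(int value)
{
	if (value < 0 || value >= MAX_REASONS || !reason_names[value][0])
		return "UNKNOWN";
	return reason_names[value];
}
```

Because the table is built from whatever the running kernel reports, the same binary maps value 6 or value 12 to NETFILTER_DROP depending on where it runs, and values absent from the format text fall back to "UNKNOWN".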

chenhengqi · 2025-07-03T08:20:07Z

Sounds reasonable. I am OK with this approach.

libbpf-tools/tcpdrop.c

Remove print_drop_reasons function and replace its call with a warning message in main when parse_reason_enum fails.

Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>

Amaindex · 2025-07-15T05:20:23Z

Hi @chenhengqi ,
I’ve removed print_drop_reasons and added a warning for parse failures in tcpdrop.c as you suggested, and reordered headers in tcpdrop.bpf.c to avoid compilation issues. Let me know if it looks good to go!

chenhengqi · 2025-07-16T02:00:23Z

Some comments are not resolved, please check.

…cpdrop

Move ipv4_only, ipv6_only, and netns_id to rodata section for better memory management. Optimize tcpdrop.bpf.c by declaring variables upfront and reordering operations for clarity. Update event struct to place stack_id correctly. Fix missing newlines at file ends.

Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>

Amaindex · 2025-07-16T08:42:13Z

Hi @chenhengqi ,
My apologies, I just saw these comments and have pushed the corresponding fixes.
Thank you for the detailed feedback. I learned a lot from your suggestions, and the patch is much better for it.

chenhengqi · 2025-07-21T11:37:19Z

libbpf-tools/tcpdrop.bpf.c

+		event->drop_reason = -1;
+	}
+
+	if (bpf_ringbuf_query(&events, BPF_RB_AVAIL_DATA) >= 511) {


What's the purpose of bpf_ringbuf_query here?

chenhengqi · 2025-07-21T11:40:00Z

libbpf-tools/tcpdrop.bpf.c

+	protocol = args->protocol;
+	if (protocol != ETH_P_IP && protocol != ETH_P_IPV6) {
+		bpf_ringbuf_discard(event, 0);
+		return 0;
+	}
+	if (ipv4_only && protocol != ETH_P_IP) {
+		bpf_ringbuf_discard(event, 0);
+		return 0;
+	}
+	if (ipv6_only && protocol != ETH_P_IPV6) {
+		bpf_ringbuf_discard(event, 0);
+		return 0;
+	}


Check these before bpf_ringbuf_reserve() so that we don't have to use many bpf_ringbuf_discard() in each branch.

Defer the BPF ring buffer event allocation in tcpdrop.bpf.c until all preliminary checks are passed, reducing unnecessary discards and improving performance. This ensures the event is only reserved when the skb meets all processing conditions, minimizing resource waste.

Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>

Amaindex · 2025-07-22T06:20:45Z

Hi @chenhengqi,
Thanks for the feedback. I’ve delayed the event allocation to cut down on unnecessary discards. Also, I removed the ringbuffer capacity check, which I overlooked from an earlier version, as it’s no longer needed.
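
The reordering discussed in this thread can be sketched as a pure predicate that runs before any ring buffer reservation. The helper name `protocol_wanted` is hypothetical and the code is plain C so it can be exercised outside the kernel; it mirrors the three checks from the reviewed hunk but is not the PR's final code. The EtherType values are defined locally in case linux/if_ether.h is not included.

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef ETH_P_IP
#define ETH_P_IP   0x0800 /* IPv4 EtherType */
#endif
#ifndef ETH_P_IPV6
#define ETH_P_IPV6 0x86DD /* IPv6 EtherType */
#endif

/* Returns true when an skb with this protocol passes the filters. */
static bool protocol_wanted(uint16_t protocol, bool ipv4_only, bool ipv6_only)
{
	if (protocol != ETH_P_IP && protocol != ETH_P_IPV6)
		return false;
	if (ipv4_only && protocol != ETH_P_IP)
		return false;
	if (ipv6_only && protocol != ETH_P_IPV6)
		return false;
	return true;
}

/*
 * In the BPF program the order would then be roughly (sketch only):
 *
 *	if (!protocol_wanted(args->protocol, ipv4_only, ipv6_only))
 *		return 0;
 *	event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
 *	if (!event)
 *		return 0;
 *	... fill in the event ...
 *	bpf_ringbuf_submit(event, 0);
 */
```

With the checks ahead of bpf_ringbuf_reserve(), an early return needs no matching bpf_ringbuf_discard(), since nothing has been reserved yet.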

Amaindex requested review from drzaeus77, goldshtn, yonghong-song, 4ast, brendangregg and davemarchevsky as code owners June 12, 2025 07:45

ekyooo reviewed Jun 20, 2025

chenhengqi requested changes Jun 20, 2025

Amaindex force-pushed the master branch from 395f9b5 to 6af9933 Compare July 1, 2025 10:52

chenhengqi reviewed Jul 2, 2025

chenhengqi reviewed Jul 3, 2025

libbpf-tools: remove print_drop_reasons and add warn on parse failure

451edb5

chenhengqi reviewed Jul 21, 2025
