Description
I've been having some problems with the IEEE802154 stack and the recent Zephyr release (v3.2.0), which basically manifested as IPv6 packets being dropped. I finally managed to pin it down as a regression due to recent L2 endianness changes (see the details below).
Note: I'm using TI CC1352-based boards with the subGHz radio, though this bug is platform-agnostic.
Intro: my application communicates using link-local IPv6 addresses (to benefit from IPHC) and the occasional broadcast destination (though this is not important). Before upgrading Zephyr to 3.2.0, everything worked perfectly. After that, packets were not received anymore and, by enabling net logging, I would see that they were dropped, mainly because of UDP checksum mismatch (for broadcast packets). So I started investigating...
Echo client / server samples still worked fine with the default configuration, so I enabled packet hex dumping in both my application and the echo samples and started comparing them to see what went wrong. My first observation was that my application used link-local addresses (FE80:), while the sample used global addresses (2001:), thus mine was using 6lo header compression and the sample was not.
After this, I managed to reproduce it by altering the echo samples' configuration:
- I modified the echo_server sample to print the received L2 packet and enabled extra logging (CONFIG_NET_IPV6_LOG_LEVEL_DBG=y), then noted its IPv6 link-local address (via the shell);
- Create overlay-6lo-bug.conf containing CONFIG_NET_CONFIG_PEER_IPV6_ADDR="fe80::<server board's EUI64>" (we could also use a multicast address);
- west build --pristine -b cc1352r1_launchxl samples/net/sockets/echo_client -- -DOVERLAY_CONFIG="<optional: overlay for ieee802154 subg> overlay-6lo-bug.conf" && west flash;
- See errors / pkt dump on echo_server's console, no replies being sent...
Log / packet capture:
Note: the actual IPv6 address of the server is fe80::7014:a61c:4b:1200 (yeah, please ignore the fact that the TI OUI, 00:12:4b, is at the end of the MAC, this is a separate endianness bug in the ieee802154_cc13xx_cc26xx_subg driver - which has existed since its dawn - I will report it later).
[00:03:33.699,890] <dbg> net_ipv6: net_ipv6_input: (rx_q[0]): IPv6 packet len 80 received from fe80::212:4b00:1ca1:ace5 to ff02::1:ff4b:1200
[00:03:33.700,042] <inf> ieee802154_sniffer: Received (64 bytes):
41 d8 16 cd ab ff ff 00 12 4b 00 1c a1 ac e5 7b |A....... .K.....{
39 3a 02 01 ff 4b 12 00 87 00 0d a4 00 00 00 00 |9:...K.. ........
fe 80 00 00 00 00 00 00 70 14 a6 1c 00 4b 12 00 |........ p....K..
01 02 e5 ac a1 1c 00 4b 12 00 00 00 00 00 00 00 |.......K ........
As we can see, the IEEE frame header's source address is "00 12 4b 00 1c a1 ac e5" (in little endian, as per IEEE 802.15.4 specification).
After the IPHC decompression, the IPv6 address obtained is "fe80::212:4b00:1ca1:ace5" - which uses the source MAC address in little endian, which is wrong (L3 should receive it in big endian).
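As a rough illustration of why the byte order matters here, the sketch below derives a link-local address from an 8-byte extended address the way stateless IPHC decompression is generally understood to do it (copy the EUI-64 into the interface identifier and invert the universal/local bit, per RFC 4291 Appendix A). The helper name link_local_from_eui64 is hypothetical and is not a Zephyr API. Fed the bytes in the order they appear in the frame header (00 12 4b 00 1c a1 ac e5), it yields fe80::212:4b00:1ca1:ace5, the address seen in the log; fed the reversed order (e5 ac a1 1c 00 4b 12 00, which is also what the link-layer address option at the end of the dump carries), it yields fe80::e7ac:a11c:4b:1200.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Zephyr API): build fe80::/64 + interface
 * identifier from an EUI-64. The input is assumed to be in big-endian
 * (network) byte order; the universal/local bit of the first byte is
 * inverted as described in RFC 4291 Appendix A.
 */
static void link_local_from_eui64(const uint8_t eui64_be[8], uint8_t addr[16])
{
	memset(addr, 0, 16);
	addr[0] = 0xfe;
	addr[1] = 0x80;
	memcpy(&addr[8], eui64_be, 8);
	addr[8] ^= 0x02; /* invert the universal/local bit */
}
```

Passing the little-endian bytes to such a routine does not fail loudly; it silently produces a different, valid-looking address, which would be consistent with the dropped packets described above.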
The destination address is also mistakenly converted to multicast (I have no idea why it results in this, but it probably has the same cause).
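One possible reading of the multicast destination, offered as an assumption rather than a confirmed diagnosis: ff02::1:ff4b:1200 has the form of the solicited-node multicast address (RFC 4291, section 2.7.1) for the server address fe80::7014:a61c:4b:1200, and the payload bytes 87 00 in the dump look like an ICMPv6 Neighbor Solicitation (type 135). If so, the destination may be what the client actually sent rather than a conversion error. The sketch below shows how that address is formed; solicited_node_mcast is a hypothetical helper name, not a Zephyr function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Zephyr API): form the solicited-node
 * multicast address ff02::1:ffXX:XXXX by appending the low 24 bits of
 * a unicast address to the fixed 104-bit prefix.
 */
static void solicited_node_mcast(const uint8_t addr[16], uint8_t out[16])
{
	static const uint8_t prefix[13] = {
		0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01, 0xff
	};

	memcpy(out, prefix, sizeof(prefix));
	memcpy(&out[13], &addr[13], 3); /* low 24 bits of the target */
}
```

For the server address above, the last three bytes are 4b 12 00, which gives ff02::1:ff4b:1200.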
Code Analysis & Workaround
I added debug logging to print the memory contents of the L2 SRC addresses at each step:
- ieee802154_cc13xx_cc26xx_subg_rx_done: packet capture, the LL addresses are in little endian;
- ieee802154_recv: L2 addresses are stored inside the net_pkt structure, which are automatically converted by set_pkt_ll_addr to be in big endian;
- ieee802154_6lo_decode_pkt: addresses are swapped again before being passed on to the IPHC net_6lo_uncompress routine, which builds the wrong L3 addresses.
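The three steps above can be modelled in a few lines. This is only a toy model of the receive path, not the Zephyr code: reverse_bytes stands in for an in-place byte reversal (the assumed behaviour of sys_mem_swap), and rx_path_model with its swap_in_6lo flag is a hypothetical name introduced for illustration. It shows that two reversals cancel out, so the address handed to IPHC ends up in the same little-endian order as on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an in-place byte reversal (assumed to match what
 * sys_mem_swap does; this is not the Zephyr implementation).
 */
static void reverse_bytes(uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len / 2; i++) {
		uint8_t tmp = buf[i];

		buf[i] = buf[len - 1 - i];
		buf[len - 1 - i] = tmp;
	}
}

/* Hypothetical model of the receive path for one extended address.
 * addr starts in wire (little-endian) order.
 */
static void rx_path_model(uint8_t addr[8], int swap_in_6lo)
{
	/* Step 2: the L2 receive code stores the address in big endian. */
	reverse_bytes(addr, 8);

	/* Step 3: the 6lo decode step swaps again if the old code remains. */
	if (swap_in_6lo) {
		reverse_bytes(addr, 8);
	}
}
```

With swap_in_6lo set, the output equals the input; with it cleared, the output is the byte-reversed (big-endian) address that the upper layers expect.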
I've identified the commit bff6a5 as the primary cause for this regression.
I think it should have also removed the sys_mem_swap lines from ieee802154_6lo.c:
/* Upper IP stack expects the link layer address to be in
* big endian format so we must swap it here.
*/
if (net_pkt_lladdr_src(pkt)->addr &&
net_pkt_lladdr_src(pkt)->len == IEEE802154_EXT_ADDR_LENGTH) {
sys_mem_swap(net_pkt_lladdr_src(pkt)->addr, net_pkt_lladdr_src(pkt)->len);
}
if (net_pkt_lladdr_dst(pkt)->addr &&
net_pkt_lladdr_dst(pkt)->len == IEEE802154_EXT_ADDR_LENGTH) {
sys_mem_swap(net_pkt_lladdr_dst(pkt)->addr, net_pkt_lladdr_dst(pkt)->len);
}
If I comment out these lines, both the echo samples and my application work as intended!
I don't know if there are any more endianness bugs... I can come up with a PR removing those lines and fixing the bug, if you need it.
For future work, some new test cases covering the interaction between L2 and L3 / IPHC should be added. Although I have zero experience with unit testing for embedded / C and my schedule is kind of full this time of the year, I'll try to take care of it in the not-so-distant future (hopefully, next month).