LWM2M bootstrap with dtls fails when using HL7800 modem #81672

Open
jbr-ia opened this issue Nov 20, 2024 · 7 comments
Labels: area: LWM2M, area: Modem, bug, priority: medium


@jbr-ia
Contributor

jbr-ia commented Nov 20, 2024

Describe the bug
Using LwM2M with DTLS and pre-shared keys fails when using the HL7800 modem driver.

I am trying to use LwM2M with DTLS and pre-shared keys on an HL7800 modem, but the DTLS handshake does not succeed. This appears to have been the case since commit 7bef3fd.
When I revert that commit, the DTLS handshake can succeed, but DNS no longer works (the commit fixes DNS problems), so I have to specify the IP address of the LwM2M bootstrap server instead of its DNS name.

To Reproduce

Steps to reproduce the behavior:

  1. If needed, change the prj.conf of the lwm2m_client sample to use the desired LwM2M server and pre-shared key.
  2. Build with: west build -b pinnacle_100_dvk samples/net/lwm2m_client/ -- -DOVERLAY_CONFIG="overlay-lwm2m-1.1.conf;overlay-bootstrap.conf;overlay-dtls.conf"
  3. Flash with: west flash
  4. Observe that the DTLS handshake does not finish.

Expected behavior
A successful DTLS connection (with pre-shared keys) to the bootstrap server.

Logs and console output
In Zephyr, lwm2m_engine reports: Cannot connect UDP (-11)

[00:00:34.067,382] <dbg> modem_hl7800: on_cmd_sockcreate: look up new socket by creation id
[00:00:34.067,901] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:00:34.076,599] <dbg> modem_hl7800: hl7800_rx: UNHANDLED RX
                                       2b 4b 43 4e 58 5f 49 4e  44 3a 20 31 2c 31 2c 30 |+KCNX_IN D: 1,1,0
[00:00:35.101,043] <dbg> modem_hl7800: hl7800_rx: HANDLE +KUDP_IND:  (len:3)
[00:00:35.101,074] <dbg> modem_hl7800: on_cmd_sock_ind: +KUDP_IND ID: 1
[00:00:35.101,196] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:0/0/2/0, level 3, buf:0x20011913, buflen:1
[00:00:35.103,454] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KUDPSND=1,"xxx.xxx.xx.xx",xxxx,81]
[00:00:35.120,086] <dbg> modem_hl7800: hl7800_rx: HANDLE CONNECT (len:0)
[00:00:35.127,563] <dbg> modem_hl7800: send_data: Sent 81 bytes
[00:00:35.136,291] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:00:36.136,688] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KUDPSND=1,"xxx.xxx.xx.xx",xxxx,81]
[00:00:36.153,015] <dbg> modem_hl7800: hl7800_rx: HANDLE CONNECT (len:0)
[00:00:36.160,552] <dbg> modem_hl7800: send_data: Sent 81 bytes
[00:00:36.169,311] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:00:38.102,172] <err> net_lwm2m_engine: Cannot connect UDP (-11)
[00:00:38.102,844] <dbg> modem_hl7800: send_at_cmd: OUT: [AT+KUDPCLOSE=1]
[00:00:38.111,297] <dbg> modem_hl7800: hl7800_rx: HANDLE OK (len:0)
[00:00:38.111,389] <err> net_lwm2m_rd_client: Cannot init LWM2M engine (-11)
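
The hex dump in the UNHANDLED RX line above is easier to read once it is decoded; the ASCII column in the log is split by a gap after the eighth byte. The following is a minimal sketch of a host-side decoder, not part of the driver. `hex_dump_to_ascii` is a hypothetical helper name, and the input is assumed to be space-separated hex bytes as printed by the logger.

```c
#include <stddef.h>
#include <stdlib.h>

/* Decode a space-separated hex dump ("2b 4b 43 ...") into a NUL-terminated
 * ASCII string. Bytes outside the printable ASCII range become '.'.
 * Returns the number of bytes decoded.
 * hex_dump_to_ascii is a hypothetical helper, not a Zephyr API. */
size_t hex_dump_to_ascii(const char *hex, char *out, size_t out_size)
{
	size_t n = 0;

	/* Always leave room for the terminating NUL. */
	while (n + 1 < out_size) {
		char *end;
		unsigned long byte = strtoul(hex, &end, 16);

		if (end == hex) {
			break; /* no more hex digits in the input */
		}
		out[n++] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
		hex = end;
	}
	if (out_size > 0) {
		out[n] = '\0';
	}
	return n;
}
```

Fed the 16 bytes from the log, this yields the string `+KCNX_IND: 1,1,0`, which is the unsolicited response the driver logged as unhandled.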

In Wireshark, I see repeated attempts to start a handshake:
[image]
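
The timing of those attempts can also be read from the Zephyr log. The sketch below converts the `[hh:mm:ss.mmm,uuu]` log timestamps to microseconds so intervals can be computed; `log_ts_to_us` is a hypothetical helper written for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Parse a Zephyr log timestamp of the form "[hh:mm:ss.mmm,uuu]" into
 * microseconds. Returns 0 on success, -1 if the string is malformed.
 * log_ts_to_us is a hypothetical helper, not a Zephyr API. */
int log_ts_to_us(const char *ts, int64_t *us)
{
	unsigned int h, m, s, ms, frac;

	if (sscanf(ts, "[%u:%u:%u.%u,%u]", &h, &m, &s, &ms, &frac) != 5) {
		return -1;
	}
	*us = ((((int64_t)h * 60 + m) * 60 + s) * 1000 + ms) * 1000 + frac;
	return 0;
}
```

Applied to the log above, the two identical 81-byte `AT+KUDPSND` commands are about 1.03 s apart, and `Cannot connect UDP (-11)` is reported about 3.0 s after the first send. Which timeout setting (if any) that interval corresponds to is not established in this thread.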

Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Toolchain: zephyr-sdk-0.16.8
  • Commit SHA or Version used: 6843240 (also seen with other versions after 7bef3fd)

Additional context

@jbr-ia jbr-ia added the bug The issue is a bug, or the PR is fixing a bug label Nov 20, 2024
@fabiobaltieri fabiobaltieri added the priority: medium Medium impact/importance bug label Nov 26, 2024
@rerickson1
Member

@jbr-ia I will take a look at this when I can. In the meantime, check these Kconfig options; these are the values we use:

# LwM2M
# The latency in NB-IoT can be 10 seconds.
# These values need to be greater than 10 seconds to account for
# the latency of the connection after the cell tower.
CONFIG_NET_SOCKETS_CONNECT_TIMEOUT=13000
CONFIG_NET_SOCKETS_DTLS_TIMEOUT=15000
CONFIG_COAP_INIT_ACK_TIMEOUT_MS=15000
CONFIG_LWM2M_SECONDS_TO_UPDATE_EARLY=20

@jbr-ia
Contributor Author

jbr-ia commented Nov 28, 2024

@rerickson1 Thanks for the reply. I tested with these values, but unfortunately it still does not work for me.

For this test I tried without bootstrapping, so I compiled with: west build -b pinnacle_100_dvk samples/net/lwm2m_client/ -- -DOVERLAY_CONFIG="overlay-lwm2m-1.1.conf;overlay-dtls.conf"
and added the following to prj.conf:

CONFIG_NET_SOCKETS_CONNECT_TIMEOUT=13000
CONFIG_NET_SOCKETS_DTLS_TIMEOUT=15000
CONFIG_COAP_INIT_ACK_TIMEOUT_MS=15000
CONFIG_LWM2M_SECONDS_TO_UPDATE_EARLY=20

#CONFIG_LWM2M_APP_SERVER="coaps://23.97.187.154:5684"
CONFIG_DNS_RESOLVER=y
CONFIG_DNS_SERVER_IP_ADDRESSES=y
CONFIG_DNS_SERVER1="8.8.8.8"
CONFIG_LWM2M_APP_SERVER="coaps://leshan.eclipseprojects.io:5684"

It still failed to connect.

When I change hl7800.c from this:

zephyr/drivers/modem/hl7800.c

Lines 1606 to 1607 in 7271000

net_pkt_set_remote_address(pkt, &sock->dst, sizeof(struct sockaddr_in));
pkt->remote.sa_family = AF_INET;

to this:

	if ((((char*)sock->dst.data)[2]==8) && (((char*)sock->dst.data)[3]==8) && (((char*)sock->dst.data)[4]==8) && (((char*)sock->dst.data)[5]==8 )){
		net_pkt_set_remote_address(pkt, &sock->dst, sizeof(struct sockaddr_in));
		pkt->remote.sa_family = AF_INET;
	}

then it works for me with DTLS and DNS. However, this only works because I use 8.8.8.8 as the DNS server, and the IP address happens to sit at this offset in sock->dst.data; if the DNS server ever changes, it will no longer work. So this is not a real solution (it is for testing only).
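
The reason bytes 2 to 5 hold the address is the IPv4 socket address layout: the 2-byte port follows the family field, and the 4-byte address follows the port. The sketch below shows the same test written against the typed `sockaddr_in` fields rather than raw byte offsets. It uses the POSIX host headers for illustration, where the generic byte array is called `sa_data` instead of Zephyr's `data`; the assumption that Zephyr's layout matches is based only on the offsets observed above. `sockaddr_is_ipv4` is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Returns true when `sa` is an IPv4 socket address equal to the
 * dotted-quad string `ip`, regardless of the port.
 * sockaddr_is_ipv4 is a hypothetical helper, not a Zephyr API. */
bool sockaddr_is_ipv4(const struct sockaddr *sa, const char *ip)
{
	struct sockaddr_in sin;
	struct in_addr want;

	if (sa->sa_family != AF_INET || inet_pton(AF_INET, ip, &want) != 1) {
		return false;
	}
	/* Copy instead of casting to avoid alignment and aliasing problems. */
	memcpy(&sin, sa, sizeof(sin));
	return sin.sin_addr.s_addr == want.s_addr;
}
```

This still hardcodes which peer counts as the DNS server, so it only makes the test above more readable; it does not address why setting the remote address disturbs the DTLS handshake.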

@rerickson1
Member

@jbr-ia can you provide the debug logs from boot until the issue happens?

@jbr-ia
Contributor Author

jbr-ia commented Nov 29, 2024

@rerickson1 I attached the log of the failing connection (pinnacle_100_dvk_dtls_failed_to_connect.txt). This is with Zephyr revision 6843240, with the following added to prj.conf:

CONFIG_NET_SOCKETS_CONNECT_TIMEOUT=13000
CONFIG_NET_SOCKETS_DTLS_TIMEOUT=15000
CONFIG_COAP_INIT_ACK_TIMEOUT_MS=15000
CONFIG_LWM2M_SECONDS_TO_UPDATE_EARLY=20

#CONFIG_LWM2M_APP_SERVER="coaps://23.97.187.154:5684"
CONFIG_DNS_RESOLVER=y
CONFIG_DNS_SERVER_IP_ADDRESSES=y
CONFIG_DNS_SERVER1="8.8.8.8"
CONFIG_LWM2M_APP_SERVER="coaps://leshan.eclipseprojects.io:5684"
CONFIG_MODEM_LOG_LEVEL_DBG=y

Compiled with: west build -b pinnacle_100_dvk samples/net/lwm2m_client/ -- -DOVERLAY_CONFIG="overlay-lwm2m-1.1.conf;overlay-dtls.conf"

For comparison, I also attached a log of a successful connection (pinnacle_100_dvk_dtls_successfull_connect.txt). For that case, I added the if statement with the hardcoded DNS entry from my previous comment.

pinnacle_100_dvk_dtls_failed_to_connect.txt
pinnacle_100_dvk_dtls_successfull_connect.txt

@rerickson1
Member

@jbr-ia I have made some progress. I see TLS issues and don't think there is anything wrong with the HL7800 driver.

Using the latest Zephyr main branch, apply this diff:
p100_1.patch

Build command:
west build -b pinnacle_100_dvk -p auto samples/net/lwm2m_client/ -- -DOVERLAY_CONFIG="overlay-dtls.conf;boards/pinnacle_100_dvk.conf"

Logs and pcap:
p100_log_1.log
p100_log_1.pcapng.zip

For some reason Mbed TLS is not processing the Hello Verify Request received from the server. I am not sure why at this time.
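
To confirm on the device side that a given datagram really is the Hello Verify Request before it reaches Mbed TLS, the received bytes can be checked against the DTLS record and handshake headers from RFC 6347. The following is a minimal debugging sketch under that assumption; `dtls_is_hello_verify_request` and the macro names are hypothetical, and the function does not validate lengths or the cookie.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DTLS record header: type(1) version(2) epoch(2) sequence(6) length(2). */
#define DTLS_RECORD_HDR_LEN  13
#define DTLS_CT_HANDSHAKE    22 /* record content type: handshake */
#define DTLS_HS_HELLO_VERIFY 3  /* handshake type: HelloVerifyRequest */

/* Returns true when `buf` starts with an unencrypted DTLS handshake record
 * whose first handshake message is a HelloVerifyRequest.
 * Hypothetical helper for debugging, not an Mbed TLS or Zephyr API. */
bool dtls_is_hello_verify_request(const uint8_t *buf, size_t len)
{
	if (buf == NULL || len < DTLS_RECORD_HDR_LEN + 1) {
		return false;
	}
	/* DTLS versions are encoded as 0xFE 0xFF (1.0) or 0xFE 0xFD (1.2). */
	if (buf[0] != DTLS_CT_HANDSHAKE || buf[1] != 0xFE) {
		return false;
	}
	/* Epoch 0: the HelloVerifyRequest is sent before any keys exist. */
	if (buf[3] != 0 || buf[4] != 0) {
		return false;
	}
	/* First byte after the record header is the handshake message type. */
	return buf[DTLS_RECORD_HDR_LEN] == DTLS_HS_HELLO_VERIFY;
}
```

Calling this on the payload in the driver's receive path and logging the result, together with the remote address attached to the packet, would show whether the datagram arrives intact and which peer address the socket layer sees for it.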

@jbr-ia
Contributor Author

jbr-ia commented Dec 19, 2024

Thanks for looking into it. I also see that Mbed TLS does not seem to process the response from the server correctly. When I remove the following lines from hl7800.c:

net_pkt_set_remote_address(pkt, &sock->dst, sizeof(struct sockaddr_in));
pkt->remote.sa_family = AF_INET;

then Mbed TLS seems to accept the response (but DNS no longer works). That is why I assumed it had something to do with the modem driver. But it could also be something in Mbed TLS or another part of the network stack.

@rerickson1
Member

Thanks for the reminder on that, I'll give that a try too.
