-
Notifications
You must be signed in to change notification settings - Fork 7.4k
net: latmon: latency monitor service and sample #85154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This sample uses networking APIs but does not seem to be a networking sample but testing gpio latency. Perhaps this should be located somewhere under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not believe this qualifies as a sample. From what I see reading the code, its an application used to calculate the latency of GPIO propagation between two systems, using a timer. For testing embedded systems, latencies are usually measured using digital analyzers to isolate the device under test, to avoid the propagated delays this application is seemingly introducing.
Samples are meant to show how to use zephyr, focusing on a specific feature like RTIO. I do not think anyone seeing this code will learn anything new about zephyr.
Why not keep this application downstream with the project it is designed for? I do not see what the zephyr project gains from adding and maintaining this application.
As I tried to explain in the README.rst, this program is designed for real-time Linux systems, assisting in the evaluation of their interrupt latencies. The client code, latmus, runs on the system under test (SUT) and has been a part of the Xenomai project for over 20 years, spanning multiple iterations. However, since it can run with minimal modifications on various systems, the goal is to extract it into a standalone project or potentially integrate it into the Linux kernel tree - maybe under linux/tools. The ultimate objective is to develop an affordable, open-source, hardware-backed solution for measuring interrupt latency in a clear and reliable manner. It also offers an objective method for comparing PREEMPT_RT with Xenomai: Benchmark Scenarios. I understand your concern but I was hoping we could have a discussion about this. In my opinion, given its simplicity, the code serves as a real-world example of pinctrl, GPIO, and networking usage. Additionally, it contributes to the Linux community by supporting discussions on real-time performance data. If it would be helpful, I’d be happy to be added as a maintainer for this sample code. |
@jukkar @bjarki-andreasen should I add it to samples/boards instead? It seems to focus more on demonstrating functionality and hardware features. I'd like to unblock this discussion |
Could you address the main alternative to adding this as a sample, which is to maintain it as a normal application within the project it is a part of? There are an effectively never ending number of applications built to run on zephyr, projects like https://github.com/jakkra/ZSWatch and https://github.com/zmkfirmware/zmk are rather popular ones. Should we add those applications to samples as well? One could argue they demonstrate how to use many parts of zephyr as well, but that does not make them samples... In short, samples demo and document a feature, applications solve a problem. The code in this PR is part of a larger project, solving a problem. |
@bjarki-andreasen I believe I have provided some sort of justification to what you are asking; I have provided links and data backing my request and offered my support as a maintainer. You are pointing me to some neat repositories though quite frankly I fail to see their industry relevance beyond their own individuality - the problem they are trying to solve doesn't extend beyond their single use case. I also doubt they will be active in 20 years from now. I have pointed you to a project/framework that is a quarter of a century old and supporting many industrial applications (and companies). The latmus tool - used to measure performance deviation - will help a wide range of products and markets. So with that in mind, I am still not sure why this Zephyr code can not be integrated as sample code. Is as generic as any, but it will help many. Now, please, could you help me understand the criteria to merge code to sample/boards? what does an application need to do be be accepted there? |
I do not have big issues with this sample, we have other samples that do somewhat similar things like zperf in networking. If this PR is accepted, then I suggest it is not placed under |
Yes, "some sort of justification" has been provided, I do not find said justification sufficient.
@jukkar Just like this sample does not demo any specific networking feature, its not demoing any specific zephyr feature either. Its not a sample... The "sample" in this PR is an endpoint/client for a completely independent project, it has no utility for zephyr. @kartben What do you think? Is this a sample in your opinion? Should it be part of the zephyr codebase somewhere in any case? |
If we look into samples/boards/expressif/ethernet or the rpi_pico sample it does seem relevant to me (nothing of interest from the zephyr side, just hardware validation AFAICS). But you are clearly better informed - I stopped contributing zephyr code over 8 years ago so I cant say I am up to date. Anyway, sure please let me know if I should drop the PR. |
Have you considered contributing latmon as a network protocol, similar to how protocols like ptp are implemented? Then the sample could demo the usage of said protocol, which would be a zephyr feature :) The protocol being part of the sample is what I believe makes this an application, rather than demoing a zephyr feature. something like:
in The sample would then be a demo of how to use the protocol. |
ah, that is a neat idea. ok let me have a look and thanks a lot for pointing it out! |
keep alive. |
@bjarki-andreasen I still have to address issues from the original code so please do not dive into a formal review yet - could you however comment on the partitioning as a network service? is this what you had in mind when you suggested it? |
Yes, this looks very much like what I envisioned :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a latency monitor service and sample application to support the EVL/Xenomai4 gpio benchmarking tool. It implements hardware configuration, latency measurement with cycle counting, and visual LED indicators, and it introduces a new YAML configuration and header file for the Latmon API.
Reviewed Changes
Copilot reviewed 4 out of 14 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
samples/net/latmon/src/main.c | Implements the latency monitor service and LED blinking thread. |
samples/net/latmon/sample.yaml | Provides configuration for building and running the sample. |
include/zephyr/net/latmon.h | Introduces the Latmon API for client integration. |
Files not reviewed (10)
- doc/connectivity/networking/api/latmon.rst: Language not supported
- doc/connectivity/networking/api/protocols.rst: Language not supported
- samples/net/latmon/CMakeLists.txt: Language not supported
- samples/net/latmon/README.rst: Language not supported
- samples/net/latmon/boards/frdm_k64f.overlay: Language not supported
- samples/net/latmon/prj.conf: Language not supported
- subsys/net/lib/CMakeLists.txt: Language not supported
- subsys/net/lib/Kconfig: Language not supported
- subsys/net/lib/latmon/CMakeLists.txt: Language not supported
- subsys/net/lib/latmon/Kconfig: Language not supported
Comments suppressed due to low confidence (1)
samples/net/latmon/src/main.c:173
- The parameter 'ip_buf' is declared in acquire_dhcp_address but is unused. Consider removing it to improve code clarity.
static struct in_addr *acquire_dhcp_address(struct net_mgmt_event_callback *cb, char *ip_buf)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some documentation comments, will prob have more when I'm in front of a computer. Thanks for this contribution!
Overview | ||
******** | ||
|
||
Provides the functionality required for network-based latency monitoring, including socket | ||
management, client-server communication, and data exchange with the Latmus service running | ||
on the System Under Test (SUT). | ||
|
||
Contents | ||
******** | ||
|
||
1. Overview | ||
2. Features | ||
3. Key Components | ||
4. API Reference | ||
5. Workflow | ||
6. Example Usage | ||
|
||
1. Overview | ||
*********** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overview section is being repeated. No need to include a manual "local table of contents" as you've already used the directive to have one automatically created for you. Please don't manually number headings either.
3. Key Components | ||
***************** | ||
|
||
3.1 **Socket Management** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use proper sub-headings, and again no Manuel numbering please.
Edit: s/Manuel/manual/ :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
3. Key Components | ||
***************** | ||
|
||
3.1 **Socket Management** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use proper sub heading, and again no manual numbering please.
4. Workflow | ||
*********** | ||
|
||
1. **Socket Creation**: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See your PR preview here, if that helps
Here, you're missing a blank line before starting unordered list. Needs to be fixed everywhere of course
1. **Socket Creation**: | |
1. **Socket Creation**: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
samples/net/latmon/README.rst
Outdated
Contributions | ||
------------- | ||
|
||
Contributions are welcome. Please submit a pull request or open an issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop - we would basically need this in every page otherwise :)
samples/net/latmon/README.rst
Outdated
Run Latmon and Latmus as follows: | ||
|
||
On the host and to flash the Zephyr board:: | ||
|
||
$ west flash | ||
|
||
On the SUT, latmus **MUST** track the falling edge of the signal:: | ||
|
||
$ latmus -I gpiochip2,23,falling-edge -O gpiochip2,21 -z -g"histogram" <ip address|"broadcast"> | ||
|
||
On completion, generate a latency histogram on the host (a PNG file) using gnuplot:: | ||
|
||
$ gnuplot plot_data.gp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Careful to not indent unnecessarily as it will cause rendering issues
@ldts also, as a general note: "Unless they applied the reviewer’s recommendation exactly, authors must not resolve and hide comments, they must let the initial reviewer do it. The Zephyr project does not require all comments to be resolved before merge" |
yeah I was wondering about that when I was cleaning up (btw on every resolve I did take the action suggested but thanks for pointing it out). still, cant github be configured to not offer users the |
f89be81
to
1a98a2e
Compare
subsys/net/lib/latmon/latmon.c
Outdated
data->histogram[delta_us]++; | ||
} | ||
|
||
return ++data->current_samples < conf->max_samples ? -1 : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more clearer if we return -EAGAIN
instead of plain -1, it kind of documents itself
subsys/net/lib/latmon/latmon.c
Outdated
ssize_t ret = 0; | ||
int cell = 0; | ||
|
||
if (data->current_samples) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While technically correct, we are trying to avoid these kind of checks for non boolean variables because of a MISRA rule. So here it would be preferred if (data->current_samples != 0) {
I am seeing similar issues in other parts of this file, could you go through all the if() statements and tweak them accordingly.
subsys/net/lib/latmon/latmon.c
Outdated
delta_us -= conf->period; | ||
} | ||
|
||
if (conf->cells) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this one, this should be changed to if (conf->cells != 0) {
LOG_INF("\tmonitoring stopped"); | ||
} | ||
|
||
static int broadcast_ip_address(struct in_addr *ip_addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a chance that IPv6 is supported in the future, then it would make sense to design the user APIs so that we do not need to change the API signature to support IPv6. Not really an issue for this specific function as IPv6 does not support broadcasting.
subsys/net/lib/latmon/latmon.c
Outdated
} | ||
|
||
/* Get a socket to listen to Latmus requests */ | ||
int net_latmon_get_socket(struct sockaddr_in *connection_addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But here if we have struct sockaddr *connection_addr
, then we do not need to tweak the API to support IPv6.
Then the memcpy() below could look like this
if (connection_addr != NULL) {
memcpy(&addr, (struct sockaddr_in *)connection_addr, sizeof(addr));
}
} | ||
|
||
/* Waits for connection from Latmus */ | ||
int net_latmon_connect(int socket, struct in_addr *ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here if we change this to net_latmon_connect(int socket, struct sockaddr *ip)
, then there is no need to change the API to support IPv6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but Xenomai's Latmus move to IPv6 is really low in the list of priorities (unless someone is willing to make the update of course!)
samples/net/latmon/src/main.c
Outdated
return ret; | ||
} | ||
|
||
static struct in_addr *get_dhcp_address(void) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the bound IP address is sent with the BOUND event, so there should be no need to traverse the IP address list.
samples/net/latmon/src/main.c
Outdated
net_mgmt_init_event_callback(cb, dhcp_handler, | ||
NET_EVENT_IPV4_DHCP_BOUND); | ||
net_mgmt_add_event_callback(cb); | ||
net_dhcpv4_start(iface); | ||
k_sem_take(&dhcp_done, K_FOREVER); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating a semaphore and a handler, you could do here
struct net_if_dhcpv4 *dhcpv4;
size_t info_len;
ret = net_mgmt_event_wait(NET_EVENT_IPV4_DHCP_BOUND, NULL, NULL, &dhcpv4, &info_len, timeout);
which would wait until the desired event is returned. The bound IPv4 address is then found in the dhcpv4 struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified it slightly to avoid a memcpy.
8eece4d
to
7aa3b50
Compare
Compliance checks: MISRA (via sonarqubecloud) appears to require an else clause for every if-else construct, even in cases—like this one—where the else is functionally unnecessary. |
the SonarCloud Code analysis output is bogus as ip is validated before calling strlen |
@jukkar anything else on these PRs? |
can we merge this please? |
|
hi @jukkar I did discuss with Philippe and really there is no intention to support ipv4 so I'd rather follow the option of marking it as experimental. |
as part of the re-review I'll update the strlen to strnlen (even though not needed in this case) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice work! Added a few new notes :)
samples/net/latmon/prj.conf
Outdated
# General config | ||
CONFIG_LOG=y | ||
CONFIG_GPIO=y | ||
CONFIG_REBOOT=y | ||
CONFIG_TICKLESS_KERNEL=y | ||
CONFIG_CONSOLE_SUBSYS=y | ||
CONFIG_CONSOLE_GETLINE=y | ||
CONFIG_NEWLIB_LIBC=y | ||
CONFIG_INIT_STACKS=y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe only CONFIG_GPIO=y
and CONFIG_LOG=y
need to be explicitly enabled. other options are platform specific, and should have reasonable defaults (hello_world sure does not select CONFIG_TICKLESS_KERNEL=y or CONFIG_REBOOT=y for example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also zephyr is moving to picolibc :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all right, testing. thanks!
samples/net/latmon/src/main.c
Outdated
|
||
static K_SEM_DEFINE(ack_event, 0, 1); | ||
|
||
bool dhcp_done; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shared value should at least be volatile, though optimally be exchanged for an atomic value :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, lets make it atomic then.
samples/net/latmon/src/main.c
Outdated
return ret; | ||
} | ||
|
||
#if defined(LATMON_LOOPBACK_DBG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a leftover? if its still needed it should be a kconfig :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well this was really intentional - i figured anyone wanting to calibrate would probably be ok just doing it manually in the source. But I guess you are right! I'll update
Latmon service to interface with Xenomai's Latmus. Link: https://evlproject.org/core/benchmarks/#latmus-gpio-response-time Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com>
Supports the EVL/Xenomai4 benchmarking tool as described in the documentation. The benchmarking tool is accessed via the latmon service. This code uses the J2 socket on FRDMk64-F: PIN_20: tx pulse to SUT PIN_18: rx ack from SUT The client code running on the SUT shall monitor the falling edge of PIN_20. Example usage from the latmus client running Xenomai4: $ latmus -I gpiochip2,23,falling-edge \ -O gpiochip2,21 -g"histogram" \ -z broadcast Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com>
|
Supports the EVL/Xenomai4 gpio benchmarking tool as described in the Xenomai Latmus documentation.
The sample code for the Latmon service uses the FRDM_K64F J2 socket as follows:
The client code running on the SUT shall monitor the falling edge of PIN_20.
Example usage from the latmus client running Xenomai4:
$ latmus -I gpiochip2,23,falling-edge -O gpiochip2,21 ....
Running snapshot:
A sample histogram plot:

Running Sample:
