BSD Sockets API: Offloading Support #4821

Merged 3 commits on Sep 11, 2018

GAnthony
Collaborator

@GAnthony GAnthony commented Nov 8, 2017

Problem Statement:

  • See GitHub issue #3706, "BSD Sockets API: Offloading support" (formerly Jira ZEP-2271):

    "Users of devices which provide socket and TCP/IP offload engines
    would benefit in memory and power efficiency by enabling full
    offload of the Zephyr BSD socket APIs to a dedicated co-processor.

    The TI CC3220SF SoC, part of the CC32XX SimpleLink SoC family of WiFi
    enabled devices, will be used as the initial socket offload implementation."

  • See Figure 1-2, http://www.ti.com/lit/ug/swru455c/swru455c.pdf,
    for the SimpleLink network co-processor architecture.

  • In summary, it would be more efficient, in the case of vendors providing
    a complete BSD socket offload solution, to hook into the networking
    stack at the BSD socket API level rather than at the net_context() level
    (per the current Zephyr NET_OFFLOAD design);
    otherwise, for applications/protocols using the socket API:

    1. there is unnecessary overhead mapping Zephyr BSD sockets <-> net_context <->
      offloaded BSD socket APIs, converting between the synchronous socket
      and asynchronous callback-based net_context APIs;
    2. there is extra code and data brought in by instantiating the Zephyr
      networking stack (creation of a rx queue and thread, rx/tx pools)
      which is in principle already handled by the offload co-processor;
    3. to implement the mapping in 1), Zephyr network packets are required
      on the host MCU creating an extra buffer copy between the offload engine
      and the application's socket data buffers.

Proposed Solution:

Validation:

  1. http_sensor_demo: Demoed at Linaro Connect SFO17, a micropython script
    calling Zephyr socket APIs (via usocket), offloaded to a SimpleLink WiFi
    "socket provider", sending HTTP POSTs to dweet.io with onboard temperature
    sensor readings.
    https://github.com/GAnthony/micropython/blob/http_sensor_demo/ports/zephyr/scripts/http_sensor_demo.py

  2. samples/net/sockets/echo: works unmodified given the
    prj_cc3220sf_launchxl.conf in this RFC.

Caveats:

  • Only one socket offload provider allowed in the system at a time;
  • As this bypasses the net_app/net_context APIs, existing protocols built on
    net_app/net_context will not benefit from this socket offload;
    Note: There is now a plan to update all protocols to sockets: Zephyr Networking Stack Architectural Updates #7591.
  • Though the SimpleLink provider also offloads DNS and DHCP, there is
    currently no way to offload that from Zephyr with this solution;
  • This offload method is at a level which would bypass existing Zephyr IP
    routing between network interfaces;
  • WiFi Provisioning: this socket provider starts in Station mode by default,
    and the AP security key is passed in on the build command line.
    In the future, it could implement the various provisioning options
    supported by the chip, and be configured via Kconfig;

    A default provisioning using Fast Connect policy has been implemented.
  • The SimpleLink socket API supports select(), but is not yet exported by this
    provider, as Zephyr does not yet support select().

    poll() has been implemented over select().

jukkar
jukkar previously requested changes Nov 9, 2017
Member

@jukkar jukkar left a comment

As this bypasses the net_app/net_context APIs, existing protocols built on
net_app/net_context will not benefit from this socket offload;

This is the most problematic issue with this solution. All the protocols we have built on top of the native IP stack would be rendered useless from the user's point of view. This also fragments the IP stack and leads to applications that need to implement things themselves instead of using the networking APIs provided by the native stack.

We already have an offloading API that runs under net_context; if that is used, then all the current protocols and also the BSD socket API will be offloaded. The net_context API is modelled after the BSD socket API (i.e., it provides similar functions), so mapping BSD sockets over the net_context API is quite straightforward.

Your time would be better spent if you could send patches that enhance/fix the current offloading solution instead of this proposal. IMHO we do not need this BSD socket offloading at all.

pfalcon
pfalcon previously requested changes Nov 9, 2017
Contributor

@pfalcon pfalcon left a comment

Some technically oriented comments.

* It is assumed that these offload functions (except for init()) follow the
* POSIX socket API standard for arguments, return values and setting of errno.
*/
struct socket_offload {
Contributor

Why does this structure go into a separate header? Why can't we have a single socket_offload.h?

Collaborator Author

"Principle of minimum inclusion". The idea is to only include what's needed. The socket provider only needs the definitions in socket_offload_ops.h, but not the definitions in socket_offload.h.

@@ -0,0 +1,35 @@
/*
Contributor

I'm almost sure all these files should go into subsys/net/lib/sockets/.

Collaborator Author

Yes. This patch was quite removed from the networking stack, I didn't want to presume. But, yes, ideally.

const void *optval, socklen_t optlen);
int (*getsockopt)(int sock, int level, int optname, void *optval,
socklen_t *optlen);
ssize_t (*recv)(int sock, void *buf, size_t max_len, int flags);
Contributor

In POSIX, recv(...) == recvfrom(..., NULL, NULL). I think we should rely on that property, and have only recvfrom/sendto virtual methods.

Collaborator Author

Sounds logical. I imagine a similar simplification can be done for send(): i.e., send() == sendto(..., NULL, 0).
Then, some simplification also for zsock_ and net_context_ APIs as well?
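
For illustration only (this is not code from the PR): the simplification discussed here could look roughly like the sketch below. `struct offload_ops_sketch`, `offload_recv()`, `offload_send()` and the `fake_*` provider are hypothetical names, and the snippet targets a POSIX host build rather than Zephyr. The provider fills in only `recvfrom`/`sendto`, and `recv`/`send` become thin wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical, trimmed-down ops table: only the "from/to" variants. */
struct offload_ops_sketch {
	ssize_t (*recvfrom)(int sock, void *buf, size_t max_len, int flags,
			    struct sockaddr *src_addr, socklen_t *addrlen);
	ssize_t (*sendto)(int sock, const void *buf, size_t len, int flags,
			  const struct sockaddr *dest_addr, socklen_t addrlen);
};

/* recv() is recvfrom() with no source address requested. */
static inline ssize_t offload_recv(const struct offload_ops_sketch *ops,
				   int sock, void *buf, size_t max_len,
				   int flags)
{
	assert(ops != NULL && ops->recvfrom != NULL);
	return ops->recvfrom(sock, buf, max_len, flags, NULL, NULL);
}

/* send() is sendto() with no destination address. */
static inline ssize_t offload_send(const struct offload_ops_sketch *ops,
				   int sock, const void *buf, size_t len,
				   int flags)
{
	assert(ops != NULL && ops->sendto != NULL);
	return ops->sendto(sock, buf, len, flags, NULL, 0);
}

/* Fake loopback provider, used only to exercise the wrappers. */
static char fake_wire[32];
static size_t fake_len;

static ssize_t fake_sendto(int sock, const void *buf, size_t len, int flags,
			   const struct sockaddr *dest_addr, socklen_t addrlen)
{
	(void)sock; (void)flags; (void)dest_addr; (void)addrlen;
	if (len > sizeof(fake_wire)) {
		len = sizeof(fake_wire);
	}
	memcpy(fake_wire, buf, len);
	fake_len = len;
	return (ssize_t)len;
}

static ssize_t fake_recvfrom(int sock, void *buf, size_t max_len, int flags,
			     struct sockaddr *src_addr, socklen_t *addrlen)
{
	size_t n = fake_len < max_len ? fake_len : max_len;

	(void)sock; (void)flags; (void)src_addr; (void)addrlen;
	memcpy(buf, fake_wire, n);
	return (ssize_t)n;
}

static const struct offload_ops_sketch fake_ops = {
	.recvfrom = fake_recvfrom,
	.sendto = fake_sendto,
};
```

With this shape, a provider implements two data-path entry points instead of four; whether the same reduction is worthwhile for the zsock_ and net_context_ layers is the open question raised above.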

@@ -0,0 +1,31 @@
# Kconfig.simplelink - SimpleLink Socket Offload Provider configuration
Contributor

I'd say this should be treated as "socket offload driver", and thus live somewhere in drivers/

Collaborator Author

I used "providers" to distinguish this from a real "driver", which is closer to the hardware, has DTS settings, responds to interrupts, has a device binding, etc. And, also to keep some tie back to MyNewt, which was the inspiration. But, I could put under drivers if that is deemed more appropriate.


/* Excerpted from SimpleLink's socket.h:
* "Unsupported: these are only placeholders to not break BSD code.
* Remove once Zephyr has POSIX socket options defined."
Contributor

I guess sentence about Zephyr is not part of the quote from SimpleLink's socket.h.

Collaborator Author

Yup, thanks.

#ifdef NET_SSID
#define SSID_NAME STRINGIFY(NET_SSID)
#else
#error "Pass CFLAGS="-DNET_SSID="<SSID_Name>" -DNET_PASS="<password>"" to make"
Contributor

Shouldn't these be Kconfig options?

Collaborator Author

Perhaps, but the result would be to have one's passwords end up in source prj.conf files, which could be (I have accidentally in the past) pushed online in commits. At least this method keeps the password in the binary, which is not usually pushed into online repos. The goal is to do this all via WiFi provisioning, storing keys in secure flash, but that method is still TBD. In the meantime, I'm trying to avoid putting passwords into source files.
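
As a rough, stand-alone illustration of the mechanism (not the patch's actual file): a value passed as `-DNET_SSID=...` on the compiler command line can be turned into a string literal with a two-level stringify macro. Zephyr already provides `STRINGIFY`; it is redefined here only so the snippet compiles on its own. The fallback values, `SECURITY_KEY`, `wifi_ssid()` and `wifi_key()` are assumptions made for this sketch; the patch uses `#error` when the define is missing.

```c
#include <assert.h>
#include <string.h>

/* Two-level expansion so the macro's value, not its name, gets quoted. */
#define STRINGIFY_INNER(x) #x
#define STRINGIFY(x) STRINGIFY_INNER(x)

/* Sketch-only fallbacks so this compiles without -D flags.
 * The patch emits #error here instead.
 */
#ifndef NET_SSID
#define NET_SSID example_ssid
#endif
#ifndef NET_PASS
#define NET_PASS example_pass
#endif

#define SSID_NAME STRINGIFY(NET_SSID)
#define SECURITY_KEY STRINGIFY(NET_PASS) /* hypothetical name */

static const char *wifi_ssid(void)
{
	assert(sizeof(SSID_NAME) > 1); /* an empty SSID is a build mistake */
	return SSID_NAME;
}

static const char *wifi_key(void)
{
	return SECURITY_KEY;
}
```

The credentials still end up in the binary, as noted above; this only keeps them out of tracked source and configuration files.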

@@ -53,6 +53,9 @@ struct zsock_addrinfo {
char _ai_canonname[DNS_MAX_NAME_SIZE + 1];
};

/* Note: placing zsock_ symbols in <net/zsock.h>, and making zsock
* the default socket provider would greatly simplify this header.
Contributor

To remind: Zephyr's socket API functions have the prefix "zsock_". Then there's a separate aliasing layer for raw POSIX names. Someone offloading the Zephyr socket API would naturally do that at the level of the "zsock_" functions. I don't insist though (I can imagine that doing it this way leads to a shorter patch), just a notice that it may need to be redone later.

Collaborator Author

Yes, but I was trying to avoid having a test in every zsock_ API for socket_offload, as is done in net_context, in order to simplify the code. I also assumed any 3rd party applications ported to sockets would want to use the POSIX names. But, I see your point, offloading from under zsock_ would catch more use cases.
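
To make the trade-off concrete, here is a minimal host-side sketch of the "test in every zsock_ API" variant, assuming a single registered provider. All names ending in `_sketch`, the stub return values, and the choice to reject a second registration with `-EALREADY` are assumptions for illustration, not what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical one-entry ops table; a real one would carry every call. */
struct offload_ops_sketch {
	int (*socket)(int family, int type, int proto);
};

/* NULL means "no provider registered, use the native stack". */
static const struct offload_ops_sketch *registered_ops;

/* Only one provider at a time: a second registration is refused
 * (the error code is an assumption of this sketch).
 */
int offload_register_sketch(const struct offload_ops_sketch *ops)
{
	assert(ops != NULL);
	if (registered_ops != NULL) {
		return -EALREADY;
	}
	registered_ops = ops;
	return 0;
}

/* Stand-in for the native implementation; returns a recognisable value. */
static int native_socket_sketch(int family, int type, int proto)
{
	(void)family; (void)type; (void)proto;
	return 100;
}

/* zsock_-level entry point: one test per call, but every caller of the
 * zsock_ function (not just users of the POSIX aliases) gets offloaded.
 */
int zsock_socket_sketch(int family, int type, int proto)
{
	if (registered_ops != NULL) {
		return registered_ops->socket(family, type, proto);
	}
	return native_socket_sketch(family, type, proto);
}

/* Stand-in provider. */
static int provider_socket(int family, int type, int proto)
{
	(void)family; (void)type; (void)proto;
	return 200;
}

static const struct offload_ops_sketch provider_ops = {
	.socket = provider_socket,
};
```

Revectoring only the POSIX-named aliases avoids the per-call branch, at the cost of missing code that calls the zsock_ functions directly.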

@pfalcon
Contributor

pfalcon commented Nov 9, 2017

We already have an offloading API that runs under net_context; if that is used, then all the current protocols and also the BSD socket API will be offloaded. The net_context API is modelled after the BSD socket API (i.e., it provides similar functions), so mapping BSD sockets over the net_context API is quite straightforward.

My personal IMHO is that we don't need even that offloading API, and instead should "motivate" vendors to provide L2 integration. But the reality is different, where there's a need to motivate vendors to use Zephyr at all (or they'll stick with other RTOSes).

Existing offloading API offloads everything except buffer management. And the motivation for socket-level offload was to offload that too, literally, if there's a dedicated engine with its own memory to handle networking data, then there's no need to spend application CPU's memory and cycles on managing the buffers.

The expectation that it would be a part of the socket API was there right from the beginning, and came naturally even to external observers; e.g. it was one of the first comments from a MyNewt developer, with a proposal to standardize on the same socket offload vtable (i.e. MyNewt follows the same design, except that what we call "offload" is for them just a driver interface, perhaps a TCP/IP stack "driver").

So, my motivation with this patch is simple: this was a subtask of the BSD Sockets API "epic" from the beginning, and it's a small, pretty clean patch which addresses requirements of some of the stakeholders (e.g. TI's), i.e. allows easier adoption of Zephyr.

@tbursztyka
Collaborator

allows easier adoption of Zephyr.

And bring 2 different models of doing network in Zephyr, more maintenance, no compatibility/portability with native stack, etc etc... How's that going to work on the long run once people will adopt Zephyr? I can foresee this already: it will become the next FreeRTOS...

So no.

If there are issues with current native stack and the way to optimize the offloading in it, let's do so. @jukkar comment summed things up pretty well imo.

@pfalcon
Contributor

pfalcon commented Nov 9, 2017

This also fragments the IP stack

There's definitely a "risk" of that, but the "problem" is not in this patch, but in the BSD Sockets API. It's just that some potential Zephyr users don't seem to value all the careful design of the Zephyr IP stack and innovative solutions like data fragment based packet management; they just want plain old sockets, because that's what they have used all the time, and for which they have existing solutions.

It's definitely a question where that leads us, but if Zephyr strives to be a general-purpose OS, it shouldn't be a surprise if people want to use it as such, including relying on well-known APIs.

@pfalcon
Contributor

pfalcon commented Nov 9, 2017

And bring 2 different models of doing network in Zephyr, more maintenance, no compatibility/portability with native stack

Even with existing NET_OFFLOAD, compatibility/portability is heavily compromised, but e.g. you don't care that much ;-) : #4711 (comment)

So, why be surprised if some people care even less about the native IP stack (but apparently still value other parts of Zephyr)?

etc etc... How's that going to work on the long run once people will adopt Zephyr? I can foresee this already: it will become the next FreeRTOS...

Fairly speaking, that wouldn't be too bad, and even that would require quite a long run, with attention to users at each mile... (Of course doing it "right" would be even better, e.g. if it were my project, I'd reject NET_OFFLOAD, lol).

@tbursztyka
Collaborator

Even with existing NET_OFFLOAD, compatibility/portability is heavily compromised, but e.g. you don't
care that much ;-) : #4711 (comment)

Where did you see that I don't care? Again assuming things about what people think. wth?

And where do you see net_offload breaking things? Any app done on top of native stack, will run the exact same way on top of a net_offload device: it's the same apis!

@pfalcon
Contributor

pfalcon commented Nov 9, 2017

Where did you see that I don't care? Again assuming things about what people think. wth?

Correction: you don't find that to be a big issue. That was an illustration that there's a pluralism of opinions, some think that drivers should integrate at the proper L2 layer, some think that can be lifted and integration can be above that, but still tied into Zephyr's buffer management scheme, and some think that buffer management can be "offloaded" too. So, "assume" is not the right word here, but we definitely should try to understand what other parties think, and what are patterns in that thinking, to understand the big picture.

And where do you see net_offload breaking things? Any app done on top of native stack, will run the exact same way on top of a net_offload device: it's the same apis!

What I mean is that net_offload already bypasses a big part of the Zephyr networking stack. Let me give an example: it's possible to have a very robust Zephyr networking stack (which it currently is not), and still have a Zephyr application (product) that is security-vulnerable, because it is using a 3rd-party network context offloading feature, with a vulnerability in that 3rd-party code.

it's the same apis!

Yes, it's the same APIs, taking Zephyr's adhoc API as a baseline. As mentioned above, there're parties who apparently take BSD Sockets API as the baseline.

You and Jukka are absolutely right that it raises concerns about the protocol implementations done on the native API. But nothing in this patch has very specific conflicts and contradictions. When someone submits e.g. a libcoap port to be included in Zephyr tree, that would be the right time to raise big concerns of unneeded duplication. (Otherwise, it would be hard to preclude people from doing such a port, reusing existing code is the whole idea of the standard APIs.)

@tbursztyka
Collaborator

Correction: you don't find that to be a big issue. That was an illustration that there's a pluralism of opinions, some think that drivers should integrate at the proper L2 layer, some think that can be lifted and integration can be above that, but still tied into Zephyr's buffer management scheme, and some think that buffer management can be "offloaded" too. So, "assume" is not the right word here, but we definitely should try to understand what other parties think, and what are patterns in that thinking, to understand the big picture.

In the case of winc1500, if you use the Ethernet L2, then you lose the other features it can offload, afaik. The point of net_offload is to be able to use the offloaded features as much as possible. If it can do dhcpv4, why use ethernet mode, which would need to bring ethernet l2 and our dhcpv4 client into the build? And you still need the wifi management interface, either way.

About the buffer management, unless you can transparently offload that, fine. But until then there will be this kind of copies. Actually, good luck to make this possible, because you still need to be able to route net_pkt throughout different bearers etc... (and I am not putting the user API in the picture here, just talking about how the core would have to manage this. It's not going to be easy).

All these problems, we've known them from day 1. This current PR looks like a very narrow look at how network offloading could be done. I fully understand that stakeholders may try to push for their framework when it's about their own unique comms chip, but within Zephyr design, it won't fit all the other use-cases. It is not a proper integration.

NET_OFFLOAD is far from perfect, but from the start, the way offloaded chips work makes it hard to get it perfect (offloading high-level APIs...)

@pfalcon
Contributor

pfalcon commented Nov 9, 2017

@tbursztyka : So, to clarify re: winc1500, I'm of course not saying that you could do it differently, what you do is taking somebody's stale PR and try to salvage it, which is already pretty great. But it could be done differently in general, we're just experimenting with doing it via NET_OFFLOAD. But Gil did just the same, he experimented with another way to offload, and well, it didn't turn out too bad (as long as we accept offload at all).

Anyway, I tried to do some "advocacy", to not leave Gil one on one with this, but will pass the word to him now. My point however is that this feature was always part of BSD Sockets work: #3706 , and now #3369 can't be closed until it's "done". And well, no concerns were raised about sockets offloading before, so it would be sad if it's done for nothing.

@GAnthony
Collaborator Author

GAnthony commented Nov 9, 2017

Part of the reason for submitting this PR, if nothing else, was to highlight the dilemma faced by a vendor who provides a complete offloading solution and is considering using Zephyr.

Far from being a perfect solution, it is attempting to resolve the issues stated in the problem statement:

  1. Avoiding the overhead mapping Zephyr BSD sockets <-> net_context <->
    offloaded BSD socket APIs (extra code)
  2. Instantiation of the rx queue and thread, and rx/tx pools (extra data).
  3. Extra buffer copies between the offload engine's buffers and the application's socket data buffers (lower performance).

Issue 2) may perhaps be ameliorated by judicious #ifdef CONFIG_NET_OFFLOAD sprinkled throughout the net_core code, if that's acceptable, but I thought that would be rather intrusive.

I didn't see a way to avoid issues 1) and 3) using NET_OFFLOAD.

So, that leaves a vendor evaluating Zephyr, who spent engineering effort to offload as much WiFi functionality as possible from the MCU, and (naturally) targeting a standard BSD sockets API, with a few choices:

a) Do full integration with the IP stack, at the L2 level.
At this level, I'd expect IP routing, and everything else built on net_app to work. But this does not give full hardware entitlement to the vendor's offload chip.

b) Tap off at the NET_OFFLOAD hook: This gives better hardware entitlement, but leaves issues 1) and 3) above unresolved. This is mainly due to the mismatch in BSD socket API/linear buffer vs net_app/net_pkt buffers.

c) Tap off at the BSD socket level: This gives fuller hardware entitlement, and resolves the issues 1) and 3). But, until/unless all protocols are written to the standard BSD socket APIs, the platform does not benefit from existing apps/protocols written to net_app/net_context.

d) Don't attempt to integrate with the data plane of the Zephyr IP stack. The SimpleLink SDK is in fact designed to replace the local IP stack, as applications can #include <sys/socket.h> and just call the standard BSD socket APIs directly. Just use Zephyr for sensor/actuator drivers and threading.

So, bottom line, are there any ways to resolve issues 1) and 3) with NET_OFFLOAD, while still providing the vendor full use of their offload hardware?

@nashif
Member

nashif commented Nov 11, 2017

There are too many topics of discussion here, the main point however is that when you introduce an offload api into Zephyr, it should not be fixed to work at one level only, as we have seen, there are different levels:

  • L2
  • Buffer
  • Socket

There are use cases for all three and there is no good or bad offloading; it all depends on the HW being used and how much someone is willing to invest in making their existing networking application work with Zephyr.

So the first question to answer: do we want to support offloading on all levels? So basically what @tbursztyka is doing with WINC, what @GAnthony is doing with SimpleLink and what @pfalcon mentioned re offloading on the L2 layer with WINC. I would say yes, we need to look into a way of supporting all 3, if possible.

The main issue here is what happens if you have any of the 3 and how do you support existing protocols and middleware in Zephyr with any of the three cases above. The first 2 cases are probably covered right now due to the fact that protocols are implemented at buffer level, however, this PR expects offloading on the socket level and we do not have much support for sockets on any of the protocols, because they are implemented using native APIs.

To make this PR useful, you will need to implement everything on top of the IP stack using sockets duplicating existing code (or replacing it if you want to go all the way), so, a few questions:

  1. Is the socket API ready for this?
  2. What are the penalties that come with that, for example HTTP using sockets vs using native?
  3. Can we make everything use sockets? Do we actually need to make everything use sockets?
  4. Does offloading on socket level still require services that would go to native IP stack? DHCP? DNS? How is this handled?
  5. Who is going to do all of that?

I personally have no issues having things like HTTP and MQTT, etc. using sockets instead of native stack to allow offloading on all levels, but is this the end of it? I do not think so, the devil is in the details. See questions above.

We wanted to dedicate next week's TSC meeting to this topic, so can someone please take the AI to structure all of this and present the problem we are trying to solve and a proposal for solving in the TSC next week?

@nashif
Member

nashif commented Nov 11, 2017

recheck

@jukkar
Member

jukkar commented Nov 16, 2017

In yesterday's TSC meeting, someone suggested that in order to support both socket offloading and current application protocol APIs (net-app), we could add BSD socket support into net-app. That would cause some extra memory usage and copying but might be acceptable. The socket support in net-app should be optional so that if one does not have socket offloading enabled, then existing native APIs would be used.
This would have the benefit that applications could either use BSD socket APIs directly, or if for example TLS needs to be supported, then net-app could be used too and that would still support socket offloading.

@erwango
Member

erwango commented Nov 16, 2017

The socket support in net-app should be optional so that if one does not have socket offloading enabled, then existing native APIs would be used.
This would have the benefit that applications could either use BSD socket APIs directly, or if for example TLS needs to be supported, then net-app could be used too and that would still support socket offloading.

So, if socket offloading is used, both net_app and BSD socket API would be available?
No restriction at all?
@jukkar , can you clarify? Thanks

@nashif
Member

nashif commented Nov 16, 2017

The socket support in net-app should be optional so that if one does not have socket offloading enabled, then existing native APIs would be used.

why do you make this dependent on offloading, one could still use sockets without offloading

@jukkar
Member

jukkar commented Nov 16, 2017

why do you make this dependent on offloading, one could still use sockets without offloading

Sure, I did not say anything contrary. Socket API is available for applications as is. What I meant that if there is no socket offloading, then net-app would use net-context API directly instead of using socket API.

@nashif nashif removed the RFC Request For Comments: want input from the community label Nov 18, 2017
@GAnthony
Collaborator Author

Rebased, updated per @jukkar comments.
Still need to address poll/select API.

Gil Pitney added 3 commits August 20, 2018 16:12
This patch enables BSD socket offload to a dedicated
TCP/IP offload engine.

This provides a simpler, more direct mechanism than going
through NET_OFFLOAD (zsock -> net_context -> socket conversions)
for those devices which provide complete TCP/IP offload at the
BSD socket level, and whose use cases do not require
IP routing between multiple network interfaces.

To use, configure CONFIG_NET_SOCKETS_OFFLOAD=y, and register
socket_offload_ops with this module.

Fixes zephyrproject-rtos#3706

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>

If the SimpleLink WiFi driver is configured, and socket offload
enabled, this revectors the Zephyr BSD socket APIs to the SimpleLink
WiFi host driver BSD socket APIs, providing a
direct offload of the TCP/IP stack to the CC3220SF network
coprocessor.

Fixes zephyrproject-rtos#3706

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>

Add a prj conf file for the TI cc3220sf_launchxl board
to enable socket offload to the simplelink WiFi driver.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
@GAnthony
Collaborator Author

Rebased, and also addressed request for poll() by @rlubos.

  • implemented poll() API over SimpleLink's select() API. Chose not to export select() at this time, until/unless Zephyr supports it;
  • Validated with socket echo_client sample, both UDP and TCP;
  • Also, added logic to handle the MSG_DONTWAIT recv() flag.
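
As a generic illustration of the poll()-over-select() idea (a host-side sketch using the standard POSIX `select()`, not the SimpleLink API and not the code in this PR), the translation could look like this. `poll_over_select_sketch()` is a hypothetical name, and only `POLLIN`/`POLLOUT` are handled.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h> /* pipe()/write(), for trying the helper out */

/* Hypothetical helper: emulate poll() on top of select().
 * Simplification: descriptors outside [0, FD_SETSIZE) are skipped.
 */
int poll_over_select_sketch(struct pollfd *fds, int nfds, int timeout_ms)
{
	fd_set rfds, wfds;
	struct timeval tv, *ptv = NULL;
	int max_fd = -1, ready = 0, ret, i;

	assert(fds != NULL || nfds == 0);

	FD_ZERO(&rfds);
	FD_ZERO(&wfds);

	/* Both sets are rebuilt from the pollfd array on every call. */
	for (i = 0; i < nfds; i++) {
		fds[i].revents = 0;
		if (fds[i].fd < 0 || fds[i].fd >= FD_SETSIZE) {
			continue;
		}
		if (fds[i].events & POLLIN) {
			FD_SET(fds[i].fd, &rfds);
		}
		if (fds[i].events & POLLOUT) {
			FD_SET(fds[i].fd, &wfds);
		}
		if (fds[i].fd > max_fd) {
			max_fd = fds[i].fd;
		}
	}

	/* A negative timeout means "block forever" (NULL timeval). */
	if (timeout_ms >= 0) {
		tv.tv_sec = timeout_ms / 1000;
		tv.tv_usec = (timeout_ms % 1000) * 1000;
		ptv = &tv;
	}

	ret = select(max_fd + 1, &rfds, &wfds, NULL, ptv);
	if (ret <= 0) {
		return ret; /* 0: timeout, -1: error with errno set */
	}

	/* Translate the select() result back into revents. */
	for (i = 0; i < nfds; i++) {
		if (fds[i].fd < 0 || fds[i].fd >= FD_SETSIZE) {
			continue;
		}
		if (FD_ISSET(fds[i].fd, &rfds)) {
			fds[i].revents |= POLLIN;
		}
		if (FD_ISSET(fds[i].fd, &wfds)) {
			fds[i].revents |= POLLOUT;
		}
		if (fds[i].revents != 0) {
			ready++;
		}
	}
	return ready;
}
```

A quick way to try it on a host is to create a pipe, poll the read end with a zero timeout (nothing ready), write one byte, and poll again (one descriptor ready with `POLLIN`).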

@GAnthony GAnthony changed the title [RFC] BSD Sockets API: Offloading Support BSD Sockets API: Offloading Support Aug 20, 2018
@GAnthony
Collaborator Author

recheck

Contributor

@rlubos rlubos left a comment

Thank you for the update. Good to see that there's also an example demonstrating socket offload. 👍

@pfalcon
Contributor

pfalcon commented Aug 27, 2018

Sorry folks, I wasn't able to follow thru recent changes/discussion here, but regarding poll/select stuff: there's no other way but to move to epoll(), as poll() doesn't scale (and select() doesn't scale at all, though depends on what to call "scale" of course). So, please keep that in mind. I just posted a mail titled "[RFC] Thinking about extended poll support in Zephyr", feel free to skim thru it and raise any concerns you see even on the entrance to it.

@GAnthony
Collaborator Author

@pfalcon, thanks for the note on epoll:

from: https://en.wikipedia.org/wiki/Epoll

It is meant to replace the older POSIX select(2) and poll(2) system calls, to achieve better performance in more demanding applications, where the number of watched file descriptors is large ...

I wonder if our target IoT devices will have such a large number of simultaneous open sockets, which would require such an optimization?

@pfalcon
Contributor

pfalcon commented Aug 29, 2018

I wonder if our target IoT devices will have such a large number of simultaneous open sockets, which would require such an optimization?

Well, I guess it would be better to discuss this matter in the thread on the devel mailing list, the message which I posted there is at https://lists.zephyrproject.org/g/devel/topic/rfc_thinking_about_extended/25004178 .

Answering your question, everything is relative. For example, I think that "our target IoT devices" don't require such complications as user/kernel mode split, and yet it's in Zephyr. Nor do I think that they require PTP, 802.1Qav protocols, and yet they're there too.

Regarding epoll, my motivation is:

  1. epoll was in the original stakeholders' docs for implementing BSD Sockets API. It's not implemented, so I still consider it on my plate.
  2. In the mail above, I give 2 examples, when poll()-like interface requires marshalling pollfd-like structure back and forth. That's not good imho.

And how much is "a large number of simultaneous open sockets"? I don't think that 1000 connections is a large number, even for IoT. But that's already quite a large number of poll descriptors to shove back and forth on each call.

@pfalcon
Contributor

pfalcon commented Aug 29, 2018

Note that everyone should like epoll, because it's a natural optimization for translating between different polling mechanisms. For example, if your underlying mechanism is select(), then to emulate poll() you need, on each poll() call, to create a new fdset and set all the bits again. With epoll, your internal representation is cached: you create the fdset once in epoll_create(), patch bits in epoll_ctl(), and translate back just a little bit of it in epoll_wait().
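
A minimal host-side sketch of that caching idea, assuming POSIX `select()` as the underlying mechanism. The `cached_poll_*` names are hypothetical and are not a real epoll API; only read-readiness is tracked.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h> /* pipe()/write(), for trying the helper out */

/* Hypothetical epoll-like object backed by select(): the interest set
 * is kept between calls instead of being rebuilt each time.
 */
struct cached_poll_sketch {
	fd_set interest; /* built once, then patched incrementally */
	int max_fd;
};

/* Roughly the role of epoll_create(). */
void cached_poll_init(struct cached_poll_sketch *cp)
{
	assert(cp != NULL);
	FD_ZERO(&cp->interest);
	cp->max_fd = -1;
}

/* Roughly the role of epoll_ctl(EPOLL_CTL_ADD): patch one bit. */
int cached_poll_add(struct cached_poll_sketch *cp, int fd)
{
	if (fd < 0 || fd >= FD_SETSIZE) {
		return -1;
	}
	FD_SET(fd, &cp->interest);
	if (fd > cp->max_fd) {
		cp->max_fd = fd;
	}
	return 0;
}

/* Roughly the role of epoll_ctl(EPOLL_CTL_DEL): clear one bit.
 * max_fd is left as an upper bound, which select() tolerates.
 */
void cached_poll_del(struct cached_poll_sketch *cp, int fd)
{
	if (fd >= 0 && fd < FD_SETSIZE) {
		FD_CLR(fd, &cp->interest);
	}
}

/* Roughly the role of epoll_wait(): copy the cached set, call select(),
 * and report only the descriptors that are ready.
 */
int cached_poll_wait(struct cached_poll_sketch *cp, int *ready_fds,
		     int max_ready, int timeout_ms)
{
	fd_set rfds = cp->interest; /* select() overwrites its argument */
	struct timeval tv, *ptv = NULL;
	int n = 0, ret, fd;

	if (timeout_ms >= 0) {
		tv.tv_sec = timeout_ms / 1000;
		tv.tv_usec = (timeout_ms % 1000) * 1000;
		ptv = &tv;
	}

	ret = select(cp->max_fd + 1, &rfds, NULL, NULL, ptv);
	if (ret <= 0) {
		return ret; /* 0: timeout, -1: error with errno set */
	}

	for (fd = 0; fd <= cp->max_fd && n < max_ready; fd++) {
		if (FD_ISSET(fd, &rfds)) {
			ready_fds[n++] = fd;
		}
	}
	return n;
}
```

Compared with emulating poll() on each call, the per-wait work here is one structure copy plus a scan for set bits; the caller no longer passes the full descriptor list back and forth.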

Member

@jukkar jukkar left a comment

LGTM. The epoll() support can be added later so we can start with this atm.

@nashif nashif added this to the v1.14.0 milestone Sep 6, 2018
GAnthony pushed a commit to GAnthony/zephyr that referenced this pull request Sep 8, 2018
The current SimpleLink WiFi driver has only the wifi_mgmt
(control plane) operations implemented.

This commit adds the NET_OFFLOAD (data plane) operations.

This was validated on CC3220SF using the sockets/echo sample.

This is the more "integrated", though much more complex and
resource intensive alternative, to direct socket offload:
PR zephyrproject-rtos#4821.

TODO:
- Validate DNS, UDP, IPv6
- Add our own net buf pool, and size per SimpleLink MTU.
- Zephyr init_net_app() needs to comprehend SimpleLink FastConnect,
  where a wifi connection automatically occurs well *before*
  reaching init_net_app()'s registration for network events (race);
  So, for now set CONFIG_NET_CONFIG_SETTINGS=n to disable init_net_app.
- Handle non-blocking sockets
- Net offload drivers need to set all the context and pkt fields normally
  done by ip/tcp.c, as expected by upper layers that peer into the
  net_context struct.  eg: pkt->iface, context->state, etc....
  This is a general issue, not specific to this driver.

Fixes: zephyrproject-rtos#3403

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
@jukkar
Member

jukkar commented Sep 11, 2018

This PR is still marked as RFC, so I am wondering: is it ready to be merged?

@GAnthony GAnthony removed the RFC Request For Comments: want input from the community label Sep 11, 2018
@GAnthony
Collaborator Author

GAnthony commented Sep 11, 2018

This PR is still marked as RFC, so I am wondering: is it ready to be merged?

@jukkar, I just removed the RFC status.

@jukkar jukkar merged commit 844e6f5 into zephyrproject-rtos:master Sep 11, 2018
@xiaodongxusc

Hi,

I'm currently evaluating BSD sockets offloading on the TI cc3220sf_launchxl. I'm working on the Zephyr master branch, where all 3 commits for this PR have been merged. However I got a build error
simplelink_sockets.c:117:25: error: ‘SlInAddr_t {aka struct SlInAddr_t}’ has no member named ‘S_un’
sl_addr_in->sin_addr.S_un.S_addr;

It looks like sl_socket.h has not been updated to add S_un to

typedef struct SlSockAddrIn_t
{
    _u16 sin_family;      /* Internet Protocol (AF_INET). */
    _u16 sin_port;        /* Address port (16 bits). */
    SlInAddr_t sin_addr;  /* Internet address (32 bits). */
    _i8 sin_zero[8];      /* Not used. */
}SlSockAddrIn_t;

My question is

  1. What's the current status of this BSD socket offloading feature?
  2. When do you think this feature will be completed&tested on master branch?

Thanks,
Xiaodong

@GAnthony
Collaborator Author

However I got a build error
simplelink_sockets.c:117:25: error: ‘SlInAddr_t {aka struct SlInAddr_t}’ has no member named ‘S_un’
sl_addr_in->sin_addr.S_un.S_addr;

@xiaodongxusc, thanks for evaluating.
Please make sure to update to the latest master, and try to build the sockets/echo sample.
This build error should have been fixed last week by commit
be64964 "driver: wifi: simplelink: Fix socket offload after s_addr cleanup".

@GAnthony
Collaborator Author

What's the current status of this BSD socket offloading feature?

As of last week, the feature worked for the in-tree samples/net/sockets/echo app.
Recently, however, migration to the new Zephyr shell broke the samples/net/wifi app, which is a prerequisite for SimpleLink samples to provision the SSID/password for the WiFi Access Point. This is in the process of being debugged: #10617

Note that WiFi testing is not yet part of the Zephyr CI loop, so functional breakage may occur in master. However, build testing has at least been added recently, so the issue you just hit should not happen again.

When do you think this feature will be completed&tested on master branch?

Here is the list of TODOs:

  1. Support offline provisioning of certificates (specifying cert/key filenames) for the new TLS setsockopt() feature.
  2. Support non-blocking sockets via fcntl() API - that will be needed by some networking apps and protocols.
  3. Add sync point for applications to wait until a device connects to an AP, before starting communications.
  4. Update to latest SimpleLink SDK - the one in Zephyr is quite old.

Of course, there might be other features that would need to be added depending on use cases which might arise. For example, more options for WiFi provisioning, or new offload hooks depending on needs of the new socket-based networking protocols.

@oberstet

oberstet commented Dec 9, 2020

Regarding the problem statement of this issue:

"Users of devices which provide socket and TCP/IP offload engines would benefit in memory and power efficiency by enabling full offload of the Zephyr BSD socket APIs to a dedicated co-processor."

It doesn't mention the price one has to pay for this "feature":

  • adding a big security attack surface which is in a closed blob (firmware on network coprocessor) and unproven (it will have bugs)
  • under exclusive 3rd party control (TI), no way to inspect, "blind trust" model
  • I am now running a proprietary niche stack. fantastic.
  • not being able to use L3+ features of the Zephyr stack (eg cross interface IP routing)
  • this problem only becomes worse with offloading not only TCP/UDP, but also TLS ..
  • etc .. should I continue?

For me, above is a no go. As a Zephyr user, I am looking for the exact opposite: remove everything possible outside of the OS. I want to use all of Zephyr. I don't need a 2nd protocol stack. I need an RF frontend with MAC with (one) MCU core in one chip, and an L2 MAC level driver for Zephyr. Can the TI chip be used like this at all? If not, what's a recommended chip with Zephyr?

Rgd "offloading": the only offloading the RF/MAC frontend should do is L2 packet filtering and possibly VLAN stuff. The frontend should NOT have any kind of firmware of its own. Is there a (single) chip like this?

I should note that I am far from an embedded engineer (so there might be some fundamental flaws in my thinking), more like cloud etc and coming to this field with some strong opinions;)

Anyways, I would be interested in your opinions/perspectives/corrections! thanks

btw: Zephyr rocks! so thanks for Zephyr in the first place;)

@jukkar
Member

jukkar commented Dec 15, 2020

I agree with you here: the offloading stack is quite evil, as it complicates the native stack design. The offloaded stacks were supported in order to get more boards to run Zephyr. I am hoping that in the future different vendors will support their hw using the native IP stack instead.
I think the original use case for these offloaded boards was to support Arduino, which typically does not provide a host stack.

@nandojve
Member

nandojve commented Dec 15, 2020

Hi @oberstet ,

I find this topic interesting. A few questions:

I am now running a proprietary niche stack. fantastic.

What is/are your radio(s)?

What radio that runs on Linux (or any other open source stack) and meets your criteria would you like to see in Zephyr?

Just curious: could something like an HCI for WiFi be a solution?

@oberstet

thanks for the feedback, much appreciated!

@jukkar :

.. in order to get more boards to run zephyr.

ok, got it. yes, "broader HW support" is definitely good! unless, IMO:

  • it requires a big design/impl. burden in the core project? eg it should not add a price that all users (of all HW) have to pay (eg by introducing additional internal layers)
  • it confuses users: what HW should I use? what are the downsides/upsides? what is actually recommended? (see below)

I am hoping that in the future different vendors would support their hw using native IP stack instead.

yes, that would be great!

fwiw, it took Linux years to gain gravity and nudge vendors to go OSS, and not only that, but mainline. some vendors (eg NXP/Freescale or Realtek) are actively following an upstreaming/mainline approach. and for me, this is one of the most important criteria to select HW.

however, I'm afraid, some vendors might follow a different path for (perceived) business reasons. as in: vendor lock in comes in nicely (for the vendor) with an offloaded stack. they want to drag in as much of "protocol stack" as possible.

anyways, IMO, Zephyr should help users not only by providing the broadest HW support possible, but crucially also with recommendations or clear advantages/disadvantages tables. eg the costs I listed above for going with an offloaded network stack chip (single or external "modem chip").

but that's a political Q of course: does Zephyr want to encourage users to select HW that allows them to follow a "good approach" (hosted stack, L2 MAC interface) rather than HW that is supported by Zephyr, but technically and security wise inferior?

IOW: as a user coming to Zephyr I might want to know which chips are recommended in a category (eg single chip incl. wifi). In my eyes, anything involving a proprietary blob and offloaded stack cannot be in this list.


@nandojve

In this context, I am looking for a single chip Arm Cortex-M33 microcontroller (256kB+ RAM, 1MB+ flash) with Wifi, where the Wifi part only contains the RF frontend and L2 MAC. And supported by Zephyr of course ;) Does anyone know one?

On Linux (not talking about here, but ..) I would choose Linux mac80211 (https://wireless.wiki.kernel.org/en/developers/Documentation/mac80211) supported HW, eg Atheros ath10k.

For me, I want to minimize or eliminate blobs as far as possible. In a pragmatic way.

fwiw, a non-pragmatic, but even more extreme approach: SDR and OSS

https://github.com/open-sdr/openwifi
https://github.com/open-sdr/openwifi-hw

in this case, even the HW is OSS=) note: this is not a practical approach sadly .. eg certification etc .. but

@oberstet

one more note rgd L2/MAC level interfacing from Zephyr for wireless L2 technologies and regulatory compliance - here is what Linux does for 802.11 where portions of L2 code run in Linux kernel .. and hence compliance depends on that code: see the "Statement" here: https://wireless.wiki.kernel.org/en/developers/regulatory/statement - Also see:

@jukkar
Member

jukkar commented Dec 15, 2020

Having mac80211 supported in Zephyr would be very nice indeed. We are still quite far from it atm.
