
[WIP][RFC] net: mqtt: Port MQTT library to BSD sockets #5854

Closed

Conversation

GAnthony
Collaborator

@GAnthony GAnthony commented Jan 27, 2018

Goals are to:
- provide an example of a non-trivial Zephyr IP protocol working over
standard BSD sockets.
- enhance portability: since the mqtt code is based on a standard API, it
can have value beyond the Zephyr project.
- port mqtt with minimal change to the existing Zephyr mqtt API.

Design Notes:
- the async_socket library is introduced to adapt the synchronous
BSD socket APIs to the asynchronous mqtt library.
- the async_socket library can be expanded in the future and used
for porting other Zephyr asynchronous IP protocols.

Validated:
- samples/net/mqtt_publisher on qemu_x86 (non-secure sockets only).

Todo:
- Port TLS support
- Fix tests/net/lib/mqtt_* test cases and validate.
- Determine costs in code/data size.
- Integrate with the new ease-of-use API that will replace net_app; remove net_utils.

@codecov-io

codecov-io commented Jan 27, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@416614e).
The diff coverage is n/a.


@@            Coverage Diff            @@
##             master    #5854   +/-   ##
=========================================
  Coverage          ?   65.09%           
=========================================
  Files             ?      418           
  Lines             ?    39211           
  Branches          ?     6603           
=========================================
  Hits              ?    25526           
  Misses            ?    10617           
  Partials          ?     3068

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 416614e...19dc3ce.

@pfalcon
Contributor

pfalcon commented Jan 30, 2018

Todo:

  • Port TLS support

As a quick notice, let's keep TLS matters outside of this PR, as a separate task, which requires its own consideration and planning. I opened #5900 for that.

@mike-scott
Contributor

I like the idea of this PR, however this will probably cause existing users of the MQTT library some headache (from the looks of it):

  • Net app settings in user applications such as: CONFIG_NET_APP_PEER_ADDR and CONFIG_NET_APP_MY_ADDR need to be addressed in some way?
  • Typical code for when CONFIG_NET_CONTEXT_NET_PKT_POOL=y involves a call to net_app_set_net_pkt_pool() and this would also be going away.

I also can't recommend @pfalcon's suggestion of putting off TLS changes into a 2nd PR. That would certainly break existing MQTT/TLS users, as well as the TLS path in samples/net/mqtt_publisher.

@pfalcon
Contributor

pfalcon commented Jan 30, 2018

That would certainly break existing MQTT/TLS users

Oh, I can only say that this PR doesn't break enough stuff ;-). We should take the rewrite-in-sockets as a chance to fix (possibly) and optimize (definitely) the implementations, and that will certainly require changing APIs and "API contracts".

Generally, that's one of the high-level problems I see with this PR: it (apparently) tries to be a drop-in replacement for the existing implementation, but it doesn't have to be. The old API stays for a transition period, deprecated; the new API requires porting to.

@pfalcon
Contributor

pfalcon commented Jan 30, 2018

to fix (possibly)

Why doubt? One can say "all Zephyr's TCP-based protocol implementations are broken" and not eat too many hats afterwards (except maybe for your IRC sample, @mike-scott? ;-) ).

https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/lib/mqtt/mqtt.c#L714 : The size of an MQTT message can be up to 256MB, so I'm not sure what it linearizes there. Otherwise, the usual bouquet: the assumption that an entire protocol message fits into an IP packet, that there's a single protocol message in an IP packet, that a protocol message starts at the beginning of an IP packet, and that a protocol message ends at the end of an IP packet. None of that is true for TCP.

So, for a bit of difference, please let me use the following approach: as long as the implementation is broken, it's not important what API it has; it will likely need to be adjusted in the process of fixing anyway, and then we can adjust it even more to reap additional benefits.
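
To make the framing point concrete, here is a minimal sketch (not from this PR) of reading one complete MQTT message from a TCP stream without any per-packet assumptions; recv_full() is a hypothetical helper that loops on recv() until it has the requested bytes:

```c
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: block until exactly len bytes arrive. */
static int recv_full(int sock, uint8_t *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t r = recv(sock, buf + got, len - got, 0);

		if (r <= 0) {
			return -1; /* error, or peer closed mid-message */
		}
		got += r;
	}
	return 0;
}

/* Read one whole MQTT message; returns its total length, or -1. */
static int mqtt_read_message(int sock, uint8_t *buf, size_t buf_size)
{
	uint32_t rem_len = 0;
	int shift = 0, hdr = 1;

	if (recv_full(sock, &buf[0], 1) < 0) { /* control byte */
		return -1;
	}

	/* "Remaining Length" is 1-4 bytes of 7 bits each (hence up to
	 * ~256MB), and may itself be split across TCP segments.
	 */
	do {
		if (hdr > 4 || recv_full(sock, &buf[hdr], 1) < 0) {
			return -1;
		}
		rem_len |= (uint32_t)(buf[hdr] & 0x7F) << shift;
		shift += 7;
	} while (buf[hdr++] & 0x80);

	if (hdr + rem_len > buf_size) {
		return -1; /* message doesn't fit the linear buffer */
	}

	/* The payload may span many IP packets, or share one with the
	 * next message; recv_full() doesn't care either way.
	 */
	return recv_full(sock, &buf[hdr], rem_len) < 0 ? -1 : (int)(hdr + rem_len);
}
```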

@pfalcon
Contributor

pfalcon commented Jan 30, 2018

@GAnthony : Thanks for this PR! I (im)patiently waited for it, but it's pretty big, so review takes time. For now, I'd like to concentrate on "net: async_socket: Asynchronous socket library". I'll add specific inline comments, but a generic one is: why was it done this way? I don't see it stated explicitly, but I imagine it's implied that the idea was to provide a drop-in replacement for the existing library and its API.

That's an honorable goal, but I'm afraid it shows the sockets API not from its best angle. We know that sockets themselves have additional overhead on top of net_context, but async_socket adds even more overhead, and this time it's ad hoc to the usage of sockets, specifically: a thread, and an extra "control" socket.

Did you consider a different approach (at the cost of changing the API, of course)? Let me outline the ideas. Threads are surely a useful concept, but recently there's been a backlash against using them in networking contexts, based on the following observations: a) threads lead to synchronization problems; b) threads are resource-hungry (e.g. stack space); c) from b), threads scale poorly.

A known alternative is a ("threadless" or "intra-thread") event loop. An application either starts it once and all future control-flow dispatch happens from it, or it executes one iteration of it regularly. Of course, your implementation is such an event loop, running in a separate thread, except: a) it's not a generic event loop, but an "async socket"; b) it's limited to running in a separate thread (a generic event loop could run in the same or a separate thread). I think that if we add such a thing to Zephyr, we should strive to make it generic.

Another question is how much "asynchrony" you intend to be there. Writing truly async network code is complicated. Arguably, the original API is truly asynchronous - except that it's broken, per the above. So, I again would suggest following the idea that if we implement the MQTT protocol properly, that's already a big improvement over the original, and we can skip true asynchrony for now. This is how your patch implements it too, but then I'd suggest calling this API "callback-based", not "asynchronous".

/*Socket to callback map, for use by async_recv() and async_sock_server() */
struct rcv_callbacks {
/* Our socket id */
int sock;
Contributor

This implementation requires O(n) searches. A better implementation is to have an array of struct pollfd's, and parallel to it an array of callback data (so at index i, there's a particular fd in the 1st array and its callback data in the 2nd array).
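
A minimal sketch of that layout, with illustrative names (not the PR's): the two arrays share indices, so dispatching after poll() is a linear walk with no lookup:

```c
#include <poll.h>

#define MAX_ASYNC_SOCKS 8

typedef void (*recv_cb_t)(int sock, void *user_data);

/* Index i in fds and in cb_data refers to the same socket. */
static struct pollfd fds[MAX_ASYNC_SOCKS];
static struct {
	recv_cb_t cb;
	void *user_data;
} cb_data[MAX_ASYNC_SOCKS];
static int nfds;

static void dispatch(void)
{
	for (int i = 0; i < nfds; i++) {
		if (fds[i].revents & POLLIN) {
			cb_data[i].cb(fds[i].fd, cb_data[i].user_data);
		}
	}
}
```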

static struct k_thread async_sock_task_data;

/* Helper function, to restart poll() loop: */
static int async_server_restart(void)
Contributor

If there weren't a thread, such a complication wouldn't be needed: polling and application callbacks would be serialized at runtime, i.e. first the event loop's poll() runs, then the callbacks are served (which may update the poll structures), then the event loop gets control again, makes the needed pass over the (updated) poll structures, and calls poll() again. This won't fit the existing MQTT API (which is the API of the broken implementation, to remind again), but it's much more efficient. (And also potentially less buggy.)
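
A sketch of that serialization, reusing the parallel-array layout from the previous comment (again with hypothetical names): because poll() and the callbacks run in the same thread, callbacks can safely add or remove entries between iterations, and no control socket is needed:

```c
#include <poll.h>

extern struct pollfd fds[];
extern int nfds;
extern void dispatch(void); /* runs callbacks; they may modify fds/nfds */

/* One event-loop iteration: poll, then serve callbacks. An app can call
 * this from its own main loop, or spin on it forever in a trivial main().
 */
int event_loop_run_once(int timeout_ms)
{
	int ret = poll(fds, nfds, timeout_ms);

	if (ret > 0) {
		dispatch(); /* fds/nfds updates take effect on the next poll() */
	}
	return ret;
}
```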

* In Linux, we could use eventfd() to add an fd to the poll()
* list of fds, to enable breaking out of the poll() when
* a new fd is added/removed.
* But, Zephyr's poll() API only supports sockets for now, and Zephyr
Contributor

Right, but Zephyr has k_poll(), which can do that. Yep, it would be a Zephyr-specific implementation, but that's similar to how eventfd() is a Linux-specific implementation (it's not POSIX).

But per the above, I don't think we should even go there; we just shouldn't incur an extra thread and the related synchronization overheads in the first place.
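
For reference, a sketch of the k_poll() mechanism mentioned above; only the wakeup path is shown. (k_poll_signal_raise() is the newer name of this call; Zephyr releases of this era exposed it as k_poll_signal(). Treat the exact names as an assumption.)

```c
#include <kernel.h>

static struct k_poll_signal wakeup_sig;
static struct k_poll_event events[1];

void poller_init(void)
{
	k_poll_signal_init(&wakeup_sig);
	k_poll_event_init(&events[0], K_POLL_TYPE_SIGNAL,
			  K_POLL_MODE_NOTIFY_ONLY, &wakeup_sig);
}

void poller_wait(void)
{
	k_poll(events, 1, K_FOREVER);

	/* ...rebuild the set of watched sockets here... */

	/* reset the signal for the next round */
	events[0].signal->signaled = 0;
	events[0].state = K_POLL_STATE_NOT_READY;
}

/* Called from any thread to break the poller out of k_poll(). */
void poller_wakeup(void)
{
	k_poll_signal_raise(&wakeup_sig, 0);
}
```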

NET_ASSERT(ufds[0].fd == signal_sock);

/* Just get and discard data: */
rcv_len = recvfrom(ufds[0].fd, server_rcv_buf,
Contributor

This is another way this "async socket" differs from a generic event loop: such an event loop doesn't necessarily read data for an app - why would it, when the app can do that itself? (It depends; for truly async, non-blocking sockets, having reads/writes handled may be a big convenience.)

@pfalcon
Contributor

pfalcon commented Jan 30, 2018

So, that's it for starters. I'd say this PR already gives great food for thought, and I tried to explore that above, perhaps in an unorthodox way. So, IMHO, we should decide what the goals are at the top level: preserve the existing API (which makes for quite an ad-hoc solution), or develop a really correct MQTT library and reap the benefits of the socket conversion (at the expense of diverging from the existing API and spending more time on development - well, doing a much deeper conversion, possibly a half-rewrite).

Thanks for the PR!

@GAnthony
Collaborator Author

GAnthony commented Jan 31, 2018

A few comments to help understand the genesis of this PR:

Since stakeholders have been asking for BSD socket support, this WIP PR was started as an experiment to explore the costs of using sockets for Zephyr IP protocols and apps.

I took mqtt as a representative example: it uses DTLS/TLS, is asynchronous, and is non-trivial, yet its code size is small enough to make some progress.

The goal was to get the existing mqtt functional with sockets, and to not otherwise add code to enhance mqtt, as that would not allow a good comparison of rom/ram usage between net_app/net_context and sockets.

The costs of converting from net_app to sockets include:
1a. Work to convert the IP protocols to use a new synchronous API.
1b. Work to convert fragmented net_pkt buffers to socket's linear application buffers.
2. Providing capabilities provided by net_app, which are missing in sockets.
3. Increased (rom/ram) size when using sockets vs net_context.

To reduce the costs of 1a) (IP protocol conversion), there are a few options:
a) if net_app could be built on sockets, then all the IP protocols should just recompile, and we'd be done. I started down this path, but it proved too monumental a task (well, at least for me).
b) rewrite all protocols to call socket APIs directly. However, the problem there is of course, BSD sockets are synchronous, whereas net_app/net_context is asynchronous. This would require lots of changes to the protocols, which use asynchronous net_app callbacks to drive their state machines.
c) create an adaptation layer (async_socket) over the synchronous BSD sockets API to emulate the asynchronous behavior of net_app, basically adding callbacks to sockets. This was the option taken, to make some progress, and to demonstrate that only a few changes need to be made to the core protocol code (a hypothetical sketch of such a layer follows this list).
Note: This approach is similar to that taken by paho mqtt for its async API (though paho mqtt C code is much more complex, and includes both send and receive threads).
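
To illustrate option c), here is a hypothetical sketch of what such an adaptation layer looks like from a protocol's point of view (names are illustrative, not necessarily the PR's actual signatures):

```c
#include <stddef.h>

/* Called from the async_socket server thread when data has arrived. */
typedef void (*async_recv_cb_t)(int sock, void *buf, size_t len,
				void *user_data);

/* Register interest: the server adds sock to its poll() set and invokes
 * cb whenever the socket becomes readable, net_app-callback style.
 */
int async_recv(int sock, void *buf, size_t buf_len,
	       async_recv_cb_t cb, void *user_data);

/* A protocol like mqtt keeps its callback-driven state machine, e.g.:
 *
 *	async_recv(ctx->sock, ctx->rx_buf, sizeof(ctx->rx_buf),
 *		   mqtt_rx_handler, ctx);
 */
```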

For 1b), moving from net_pkt to linear bufs was easy in the case of Zephyr mqtt - it was already linearizing mqtt packets for use.

For 2), this is the big WIP part. Some needed net_app capabilities were placed in net_utils as a "parking lot", until a decision is made on what type of new API would replace net_app to add those features. Perhaps the async_socket layer's server can provide an idea of what that could look like for the sync to async mapping part. Note also that net_app_init is still being used implicitly in this PR.

For 3), ROM/RAM costs. Comparing [rom|ram] reports on qemu_x86 for master vs this PR:

  • RAM: +12K, where about +10K is just due to adding a 1K stack (!?).
    I still need to understand why the stack data is so high (stack alignment to page size on x86?)
  • ROM: +7K: ~+2K for the sockets + async_sock layers, ~+3.5K due to newlib vs minimal libc.
    Newlib seems to be needed by sockets requiring fcntl; otherwise, mqtt on master uses the minimal libc.
  • Bottom line: ~2K of code for sockets + the async adaptation layer. The rest is due to newlib and others, and of course there's room for optimization.

Overall, the idea was to amortize the server thread cost over all protocols, so each did not have to implement the equivalent of its own server thread providing net_app-style callbacks, and to avoid a complete rewrite of the Zephyr IP protocols.

@GAnthony
Collaborator Author

GAnthony commented Jan 31, 2018

I like the idea of this PR, however this will probably cause existing users of the MQTT library some headache (from the looks of it):

Net app settings in user applications such as: CONFIG_NET_APP_PEER_ADDR and CONFIG_NET_APP_MY_ADDR need to be addressed in some way?
Typical code for when CONFIG_NET_CONTEXT_NET_PKT_POOL=y involves a call to net_app_set_net_pkt_pool() and this would also be going away.

@mike-scott: Thanks for the review.

Indeed, there is so much functionality in net_app that is still needed. Also, the network buffer management is undergoing an overhaul, so perhaps even net_app_set_net_pkt_pool() would evolve as a result. There is a suggestion from the networking team to create a new TLS/DTLS ease-of-use API, which would replace parts of net_app. So, until that is done, I only placed what was needed from net_app for the mqtt_publisher sample in the net_utils.c "parking lot."

@nashif
Member

nashif commented Jan 31, 2018

I still need to understand why stack data is so high (stack alignment to page size on x86?)

@GAnthony try qemu_x86_nommu maybe?

@GAnthony
Collaborator Author

As a quick notice, let's keep TLS matters outside of this PR, as a separate task, which requires its own consideration and planning. I opened #5900 for that.

I'd agree it will be a (big) separate task, but I would like to keep it on the todo list here, as it's a big part of the port.

@GAnthony
Collaborator Author

I still need to understand why stack data is so high (stack alignment to page size on x86?)

@GAnthony try qemu_x86_nommu maybe?

@nashif: Yup, that was it. Now the stack size is as expected. I suspected the MMU, but didn't know the option to disable it. Thanks!

@GAnthony
Collaborator Author

@pfalcon: thanks for the thoughtful review. I'll try to answer the general comments first:

Generally, that's one of the high-level problems I see with this PR: it (apparently) tries to be a drop-in replacement for the existing implementation,

Yes, that was the goal - to enable a cost analysis of porting to sockets. I wanted to keep this exercise separate from other mqtt enhancements.

Threads are surely a useful concept, but recently there's been a backlash against using them in networking contexts, based on the following observations: a) threads lead to synchronization problems; b) threads are resource-hungry (e.g. stack space); c) from b), threads scale poorly.

Yes, I was uneasy about introducing another server thread, and thought about workqueues, but those didn't look appropriate as I needed to block in a poll(). Also, since server threads are used throughout the networking stack - in bluetooth hci drivers, 802.15.4 drivers, and the net core itself - there didn't seem to be too much backlash ;-).

A known alternative is a ("threadless" or "intra-thread") event loop. An application either starts it once and all future control-flow dispatch happens from it, or it executes one iteration of it regularly. Of course, your implementation is such an event loop, running in a separate thread, except: a) it's not a generic event loop, but an "async socket"; b) it's limited to running in a separate thread (a generic event loop could run in the same or a separate thread). I think that if we add such a thing to Zephyr, we should strive to make it generic.

I'm not sure I follow. Can you point me to an example of a generic "threadless event loop"? The async_socket server thread will block in a poll() until a socket becomes available to read. This is better from a power management perspective than periodic execution of an event loop.
The async_sock server is meant to be used by multiple clients, so it should scale. It also means that each protocol/app does not need its own separate server thread to handle socket receive events.

[...] but then I'd suggest calling this API "callback-based", not "asynchronous".

That's a fair comment. async_socket is not as asynchronous as net_context, since, for example, the send callback is fired immediately even though the network may not yet have sent the underlying packet. But it's named to distinguish itself from the completely synchronous socket API.

So, IMHO, we should decide what the goals are at the top level: preserve the existing API (which makes for quite an ad-hoc solution), or develop a really correct MQTT library and reap the benefits of the socket conversion

Even if redeveloping the mqtt library from scratch, one would still need to handle asynchronous requests from the peer, so it would be interesting to see how else that might be done using sockets.

default n
select NET_SOCKETS
select NET_SOCKETS_POSIX_NAMES
select NEWLIB_LIBC # Needed to find <sys/fcntl.h>
Member

Hmm, this looks bad; I mean, we should have networking code that also supports the minimal libc. Isn't it possible to have this without newlib?

Collaborator Author

It seems to me only the fcntl.h include path is needed in the build for sockets, so I don't think the full newlib is really needed here. Maybe @pfalcon can comment.

OTOH, I see there was a general goal of making newlib the default library?
#3102. I wonder if that is no longer the case.

Contributor

Maybe @pfalcon can comment.

I'll have a look. I agree we should support sockets even with minimal libc, given that newlib brings in 3.5K of code.

OTOH, I see there was a general goal of making newlib the default library?
#3102. I wonder if that is no longer the case.

I added a comment there, #3102 (comment), summarizing some arguments against that.

@jukkar
Member

jukkar commented Jan 31, 2018

@GAnthony thanks for the patches; they are good for making progress with socket support.

The costs of converting from net_app to sockets include:
1a. Work to convert the IP protocols to use a new synchronous API.
1b. Work to convert fragmented net_pkt buffers to socket's linear application buffers.
2. Providing capabilities provided by net_app, which are missing in sockets.
3. Increased (rom/ram) size when using sockets vs net_context.

To reduce the costs of 1a) (IP protocol conversion), there are a few options:
a) if net_app could be built on sockets, then all the IP protocols should just recompile, and we'd be done. I started down this path, but it proved too monumental a task (well, at least for me).

I think this is probably not going to happen.

b) rewrite all protocols to call socket APIs directly. However, the problem there is of course, BSD sockets are synchronous, whereas net_app/net_context is asynchronous. This would require lots of changes to the protocols, which use asynchronous net_app callbacks to drive their state machines.

Yes, and we definitely need some kind of async API for that purpose.

c) create an adaptation layer (async_socket) over the synchronous BSD sockets API to emulate the asynchronous behavior of net_app, basically adding callbacks to sockets. This was the option taken, to make some progress, and to demonstrate that only a few changes need to be made to the core protocol code.
Note: This approach is similar to that taken by paho mqtt for its async API (though the paho mqtt C code is much more complex, and includes both send and receive threads).

An async API is much needed, and if it needs to use a thread for that, then so be it. Threads use a lot of resources, but it looks like we cannot avoid that. We also definitely need some kind of TLS/DTLS convenience library that runs on top of sockets; otherwise each application will do it differently, and probably wrong.

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

I took mqtt as a representative example: it uses DTLS/TLS, is asynchronous, and is non-trivial, yet its code size is small enough to make some progress.

Right, that's great. I thought that maybe it would have been nice to have an interim discussion of your progress, but actually the aim was to see how people would use sockets in Zephyr, and in this regard this is a "clean-room" example, which is only better. (But then I don't "agree" with various parts, which is fine - we're exploring the space.)

The goal was to get the existing mqtt functional with sockets, and to not otherwise add code to enhance mqtt, as that would not allow a good comparison of rom/ram usage between net_app/net_context and sockets.

Very reasonable approach. But it contains a paradox: by going with the reasonable and fair approach of not changing the external interface, the sockets API is actually put in an unfair position, where it doesn't show its benefits and only shows its overheads. Using the sockets API requires a different paradigm, which would have a ripple effect on the implementation and API (but would allow us to affect qualitative matters, not just quantitative ones).

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

However, the problem there is of course, BSD sockets are synchronous, whereas net_app/net_context is asynchronous.

I'd like to extend a proposal to avoid the terms "synchronous" and "asynchronous", as they're multi-faceted (they mean different things in different contexts) and thus vague. They're also subject to an "inversion paradox", where a thing being synchronous (from one side) entails [the need for] being asynchronous (from another side). Again, different sides and different kinds of synchronous-ness and asynchronous-ness are meant in such contexts, but that can lead to confusion. An example is JavaScript - it's a single-threaded language, so its execution is inherently synchronous. And yet we know that it's a well-known example of the asynchronous programming paradigm.

@GAnthony , you seem to agree with this note in the comment I quote below, but I wanted to post a dedicated treatment still.

It's better to use other, less ambiguous terms, and try to "decompose" what sync/async means into separate "axes":

  1. Being "push-style" vs "pull-style" (re: how data is fed into app), aka "callback-based" vs "not callback-based".
  2. Blocking vs non-blocking.

Your async_socket from this PR is thus blocking and callback-based.
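
For contrast, a minimal sketch of the opposite quadrant with plain POSIX sockets - pull-style and non-blocking - where the app asks for data when it wants it and is never suspended (O_NONBLOCK would normally be set once at socket setup; it is inlined here only to keep the sketch self-contained):

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Pull-style, non-blocking read: returns >0 bytes read, 0 if nothing has
 * arrived yet (pull again later), -1 on error or peer close.
 */
int try_read(int sock, void *buf, size_t len)
{
	fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);

	ssize_t r = recv(sock, buf, len, 0);

	if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
		return 0;	/* no data yet; the app keeps control */
	}
	return (r <= 0) ? -1 : (int)r;
}
```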

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

RAM: +12K: where about +10K is just due to adding a 1K stack (!?)

Well, I tell that they tell that threads are bad, bad ;-). (They apparently tell that based on a bunch of examples similar to the above, where involving threads leads to unobvious results which require digging to understand.)

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

Yes, I was uneasy about introducing another server thread, and thought about workqueues, but that didn't look appropriate as I needed to block in a poll(). Also, since server threads are used throughout the networking stack: in bluetooth hci drivers, 802.15.4 drivers, and the net core itself, there didn't seem to be too much backlash ;-).

To start with, nobody says that there's no need/use for threads. What people say is that threads are overrated, and in many places (from an application-level programmer's PoV) one can do better without threads than with them. Zephyr is not representative here, as it's just a single piece of software (whereas people talk about CS/IT in general), and it's an OS. Still, it shows even in Zephyr: above I mentioned that, IMHO, net_app's usage of threads for TLS handling is superfluous. Then, cases like #5857 and #729 can be directly attributed to concurrency issues introduced by threads.

All in all, to proceed, we'd need to be on the same line re: "People tell that threads are bad" (I purposely formulate it like that, to avoid bringing in too much of a personal element, though it should be obvious that I've bought into that idea ;-) ). This issue is generic across CS/IT, but each language has its own lore. For example, as mentioned above, there are no threads in JavaScript, period. Python's canonical reference is https://glyph.twistedmatrix.com/2014/02/unyielding.html . That's what I can recommend, and as the article says, beyond the literal examples, it isn't specific to Python at all. Beyond that, a Google search for "threads are evil" gives much more discussion: https://www.google.com/search?q=threads+are+evil .

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

I'm not sure I follow. Can you point me to an example of a generic "threadless event loop"?

Sure, I'd need to do some homework to give the "best" example(s), but let's go "bottom-up" for now - we know that JavaScript is single-threaded, so its implementations won't run callbacks from another thread. Let's look at the JavaScript implementation closest to us, Zephyr.js. Here's its main loop: https://github.com/intel/zephyr.js/blob/master/src/main.c#L358, and here's where it calls its callbacks from it: https://github.com/intel/zephyr.js/blob/master/src/main.c#L368 .

The async_socket server thread will block in a poll() until a socket becomes available to read. This is better from a power management perspective than periodic execution of an event loop.

How do you know? If it's a quadrocopter flight application (which also wants to serve HTTP while flying), then no, it's not better to block in poll(); the app needs to perform realtime multi-sensor, multi-motor control. Only the application can know what's better for it, and it's best if its developer is presented with a repertoire of choices.

The async_sock server is meant to be used by multiple clients, so it should scale. It also means that each protocol/app does not need its own separate server thread to handle socket receive events.

Sure, there's no contradiction in our treatments. You're saying "There's no need for multiple threads to do that; one is enough." I'm saying "Not only is there no need for multiple threads; sometimes even 1 is too much. Moreover, sometimes an external event loop isn't flexible enough either; an app should be able to set up its own per its needs."

The bottom line: your solution is definitely in the right direction, and already pretty good. But it's too rigid and inflexible per a more general treatment. And it wouldn't take too much effort to make it more flexible, per the known best practices. The question is whether we do it now or leave it for later. The latter choice has the drawback that we make too many and too-slow iterations of elaborating Zephyr into a top-notch system (which slows adoption, etc.)

@pfalcon
Contributor

pfalcon commented Jan 31, 2018

Note: This approach is similar to that taken by paho mqtt for its async API (though paho mqtt C code is much more complex, and includes both send and receive threads).

Depends on which paho mqtt - they have a whole bunch of them. Indeed, in https://github.com/eclipse/paho.mqtt.c they have -lpthread (I didn't check how they use it). But for our use case a good comparison is https://github.com/eclipse/paho.mqtt.embedded-c , and here's what they have:

https://github.com/eclipse/paho.mqtt.embedded-c/blob/master/MQTTClient-C/samples/linux/stdoutsub.c#L241

That's "Moreover, sometimes external event loop isn't flexible enough either, an app should be able to set up its own per its need." case from my comment above (as indeed, the lowest-level and most flexible solution).

@GAnthony
Collaborator Author

Thanks for the links, @pfalcon, I'll need to study those JavaScript examples. It would indeed be nice if a thread were not required.

@jukkar
Member

jukkar commented Feb 1, 2018

net_app's usage of threads for TLS handling is superfluous

As a side note, I think the TLS thread in net_app is very bad, but I could not figure out another way to overcome the huge stack requirement that mbedtls has. That is why the TLS/DTLS handling was placed in a separate thread.

@pfalcon
Contributor

pfalcon commented Feb 1, 2018

but I could not figure out another way to overcome the huge stack requirement that mbedtls has. That is why the TLS/DTLS handling was placed in a separate thread.

Well, that's a good hint. I'd say that having 2 stacks is still more overhead than one, but yes, if switching from a non-TLS to a TLS config, besides all the other options, also requires adjusting the stack size considerably, that's of course an inconvenience. Well, switching to the sockets API is definitely not a panacea; the idea is to try to resolve some of the baggage of issues we've accumulated, but there will of course be trade-offs, head-scratching, and possibly I-eat-my-hat situations too ;-). We'll need to go through them and try to find a solution, or otherwise document them as known issues, I guess.

@pfalcon
Contributor

pfalcon commented Feb 15, 2018

@GAnthony : Can you provide details on how you tested this, so the testing can be reproduced?

Also, did you have a chance to do anything to this patch in regard to #5985 (which now supports TLS, and thus can start to be tested with more code)?

@GAnthony
Collaborator Author

Can you provide details on how you tested this, so the testing can be reproduced?

I used BOARD=qemu_x86, and otherwise followed the README.rst in samples/net/mqtt_publisher/.

Also, did you have a chance to do anything to this patch in regard to #5985

No, I'm currently working on a wifi shell to test the wifi offload driver.

@pfalcon
Contributor

pfalcon commented Mar 2, 2018

I've ported this PR to the proposed "zstream" API, to add TLS support: #5985 (comment) . The changes are included in that PR, as they depend on it.

Contributor

@dbkinder dbkinder left a comment

small tweak


$ sudo ./mqtt_publisher

Note that sudo is necessary as the underlying mqtt library is using a
Contributor

delete the stray "a" in "using a an async_sock"

Collaborator Author

Done.

Building/Running On Linux
==========================

This sample can also be built and run on Linux. The setup of the mosquitto
Contributor

For clarity, I'd add "natively" to the phrase: "built and run natively on Linux"

Collaborator Author

Done: thanks for the review.

Gil Pitney added 8 commits June 11, 2018 11:15
An API for adapting synchronous BSD socket APIs to applications
requiring asynchronous callbacks.

Created to ease migration of asynchronous Zephyr IP protocols from
net_app/net_context to BSD sockets.

Initially, this supports client side TCP sockets, to enable porting
the mqtt publisher sample to BSD sockets.

Todo:
- add UDP async sendto/recvfrom
- add server support: accept/listen
- handle nonblocking send socket failures (EWOULDBLOCK, EAGAIN, etc).
- ensure IPV6 is supported.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
Goals are to:
- provide an example of a non-trivial Zephyr IP protocol working over
  Zephyr BSD sockets.
- enhance portability: since the mqtt code is based on a standard API,
  it can have value beyond the Zephyr project.
- port mqtt with minimal change to the existing Zephyr mqtt API.

Design Notes:
- the async_socket library is introduced to adapt the synchronous
  BSD socket APIs to the asynchronous mqtt library.
- the async_socket library can be expanded in the future and used
  for porting other Zephyr asynchronous IP protocols.

Validated:
- samples/net/mqtt_publisher on qemu_x86 (non-secure sockets only).

Todo:
- Port TLS support
- Fix tests/net/lib/mqtt_* test cases and validate.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
In a first step toward making the mqtt library more portable
to POSIX, this patch moves from zephyr/types.h back to
stdint.h types.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
To compile on Linux requires:
* Using pthread.h, instead of Zephyr's kernel.h or posix/pthread.h
* Using malloc instead of k_mem_slab
* Using assert.h instead of Zephyr NET_ASSERT macros.
* Using sys/socket.h (and friends) instead of Zephyr's net/socket.h.
* Ifdef'ing out Zephyr debug macros, like NET_ERR, NET_DBG.
* Including the required CONFIG_* defines from running Kconfig

Any differences between POSIX/BSD and Zephyr's version of POSIX/BSD
were #ifdef'd with the __ZEPHYR__ build macro.

This was tested:
* On Zephyr using the mqtt_publisher sample on qemu_x86_nommu.
* On Linux: Using a later patch with Makefile.posix

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
Make the sample more portable to a POSIX OS.

This was validated on Linux: see the README for instructions.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
This is to enable rom/ram reports on qemu_x86_nommu, so that stack sizes
can be reported without MMU-induced inflation.

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
Fix test sense of return value from mqtt_connect().

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>
Remove redundant mqtt_init() call, and fix sense of test for
mqtt_init().

Signed-off-by: Gil Pitney <gil.pitney@linaro.org>

#include "net_utils.h"

static int get_port_number(const char *peer_addr_str,
Member

There is actually a function called net_ipaddr_parse() that parses an IPv[4|6] address + port and returns that info to the caller. Perhaps it could be used here too, so we could avoid this util function.
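
A sketch of that suggestion, assuming net_ipaddr_parse()'s signature from Zephyr's net/net_ip.h (string, string length, and a struct sockaddr to fill):

```c
#include <errno.h>
#include <string.h>
#include <net/net_ip.h>

static int parse_peer(const char *peer_addr_str, struct sockaddr *addr)
{
	/* accepts e.g. "192.0.2.2:1883" or "[2001:db8::2]:1883" */
	if (!net_ipaddr_parse(peer_addr_str, strlen(peer_addr_str), addr)) {
		return -EINVAL;
	}
	return 0;
}
```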

@GAnthony
Collaborator Author

Closing this PR; it is being replaced by #10125.

@GAnthony GAnthony closed this Sep 21, 2018