REP 2009: Topic Message Type Negotiation Feature #325
I'm curious as to the main advantage of dynamically negotiating for every pub-sub pair over a node which advertises multiple topics with different types and checks for existing subscribers. The multi-topic approach is how this is typically done in ROS 1 and has some advantages: the node's ROS API is static, there are no changes to the node graph which depend on runtime factors, and implementations appear less complex (no callbacks at runtime with priority lists, possible renegotiation, etc). Having a static ROS API is advantageous when reasoning about a system. Type negotiation at runtime seems like it would complicate this. Could the "real" motivation be added to the REP perhaps? The motivation section as-is seems to want to generalise from an actual implementation constraint or requirement, but it doesn't make it explicit (or my understanding is lacking). Is the approach described by the REP trying to reduce resource usage (ie: CPU and memory) by "just publishing what subscribers need"?
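For readers unfamiliar with the static pattern mentioned above, here is a minimal sketch in plain Python. It does not use any ROS API: `Topic`, `MultiTopicCamera`, the topic names and the byte-prefix "encoders" are all hypothetical stand-ins, chosen only to show a node that advertises every format up front and skips the work for topics nobody subscribes to.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Topic:
    """Hypothetical stand-in for one statically advertised topic."""
    name: str
    encode: Callable[[bytes], bytes]  # converts a raw frame to this topic's format
    subscribers: List[Callable[[bytes], None]] = field(default_factory=list)


class MultiTopicCamera:
    """Advertises all formats up front; only encodes for topics with subscribers."""

    def __init__(self) -> None:
        # The set of topics is fixed at construction time (a static API).
        self.topics: Dict[str, Topic] = {
            "image/yuv420": Topic("image/yuv420", lambda raw: b"YUV:" + raw),
            "image/argb8888": Topic("image/argb8888", lambda raw: b"ARGB:" + raw),
        }

    def publish(self, raw: bytes) -> List[str]:
        """Publish on every topic that has a subscriber; return those topic names."""
        published = []
        for topic in self.topics.values():
            if not topic.subscribers:  # analogous to checking the subscriber count
                continue
            message = topic.encode(raw)
            for callback in topic.subscribers:
                callback(message)
            published.append(topic.name)
        return published


camera = MultiTopicCamera()
received: List[bytes] = []
camera.topics["image/yuv420"].subscribers.append(received.append)
print(camera.publish(b"frame0"))  # only the subscribed topic is served
```

In this sketch the choice of format is made entirely by which topic a subscriber connects to, so no negotiation step is involved.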
I believe that we want to have a known network running with some nodes active and some nodes idle. Which nodes are active and which nodes are idle may change while the network is running - this is where the dynamic or runtime part comes in. In the case that a subscriber becomes active and connects to a publisher, the publisher should renegotiate which type it is publishing. I believe that I understand the advantages of the static network, although I'm not sure of how the implementation in the paragraph above differs from the fully dynamic case. It seems like if we want the above behavior, we may be solving the fully dynamic case, but I may have a misunderstanding.
It seems to me that the user may have more reasons than resource usage for having preferences between the supported message types. For example, perhaps the code for one type of supported message has a certain quality rating and is thus preferred over other supported types. I'm sure there are other examples too, like a race condition occasionally occurs with one message type, or the user is more confident about one code path. In any of these cases, the main thing that I believe is important is that there are preferences for which message type is published, not that those preferences are likely to be motivated by resource usage.
I was not present at this discussion, perhaps you are confusing me with someone else. This may be lack of coffee, but re-reading the motivation section, it's still unclear to me what this makes possible. My main confusion was around the part which implied that instead of a static (edit: re-reading, this may not actually be the case. The enhancement here would negotiate between nodes, who then set up a regular pub-sub connection) For me personally it would help if it could be made clear why a node couldn't always expose a set of topics with different types and subscribers would subscribe to the topics with the type they are interested in. That would seem to allow for the same flexibility on the publisher's side (ie: only publish what is requested), and would not need more infrastructure for runtime type-negotiation. It would not help avoiding unnecessary work for pub-sub connections of idle nodes (ie: the subscription would still count, but the node is not interested in any new messages). That could be solved by shutting down subscribers in idle nodes, but from the updated motivation section I get the impression that may be undesirable (so the network should remain static, but the message types carried by that network should be allowed to change)?
If a consuming node supports multiple types, why not publish the 'correct' type directly? Is the conversion done automatically? Edit: or is the main point to take away from node authors/users the responsibility of figuring out which specific topic to subscribe to (out of a set), and instead allow nodes to figure that out for themselves, given a few predicates and two sets of produced and consumed message types? A partly built-in coordination level task, so to say?
You are right. I updated my above comment once I realized.
You're correct.
That's correct, too.
I agree, except rather than topics, it is probably for different message types. It is a design decision as to if different message types are sent on different topics or the same topic. I'll be discussing this in the rationale section of the REP, but I haven't gotten there yet.
I'm not sure that I understand exactly. I'm not sure how it would be for one node to shut down the subscriber in another node. That seems somewhat heavy-handed to me. It seems that, especially when combined with lifecycle nodes, which we may very well be building on for this implementation, state management would get even more complex and thus harder to use. Also, this could be done in a centralized approach, too, but that is another design decision that we're working on. The approach that you suggest seems to accomplish the same end. Even so, I believe that you may want to do negotiation at the publisher, because you would want the publisher to know what type it should publish. This is complicated if there are two nodes with subscribers that want different types from the same publisher and sending only one message is preferable for, say, performance reasons.
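To make the two-subscriber case concrete, here is a minimal sketch of one possible selection rule, assuming each endpoint expresses its preferences as numeric weights. The `negotiate` function, the weight scheme and the type names are illustrative assumptions, not the algorithm in the REP: it picks the single type that the publisher and every subscriber support and that has the highest summed weight.

```python
from typing import Dict, List, Optional


def negotiate(publisher: Dict[str, float],
              subscribers: List[Dict[str, float]]) -> Optional[str]:
    """Pick one supported type accepted by the publisher and all subscribers.

    Each dict maps a supported-type name to a weight (higher means more
    preferred). Returns None when there is no type everyone supports.
    """
    common = set(publisher)
    for prefs in subscribers:
        common &= set(prefs)
    if not common:
        return None

    def score(name: str) -> float:
        return publisher[name] + sum(prefs[name] for prefs in subscribers)

    # Sorting first makes ties resolve deterministically (alphabetical order).
    return max(sorted(common), key=score)


pub = {"yuv420": 1.0, "argb8888": 0.5}
sub_a = {"yuv420": 0.2, "argb8888": 1.0}  # would rather have ARGB8888
sub_b = {"yuv420": 1.0}                   # only understands YUV420

print(negotiate(pub, [sub_a]))         # sub_a alone
print(negotiate(pub, [sub_a, sub_b]))  # sub_b joins, so the choice is re-evaluated
```

With only `sub_a` connected the summed weights favour `argb8888`; once `sub_b` joins, `yuv420` is the only common type, which is the kind of renegotiation the comment above describes.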
I've left a partial review here. I didn't look into the "Nodes that delay revealing their preferences" very closely, but I will in another pass.
rep-2009.rst
The primary reason for this change is to allow nodes to publish different types of messages that better allow the system to optimize its behavior.
For example, a subscriber may be more efficient with one image format (say, YUV420) than another (say, ARGB8888).
I'd say it is not only that the subscriber may be more efficient, but the publisher may be more efficient as well.
As my initial comments were mostly motivated by curiosity and as I'm not really a stake-holder here I don't really have any additional input, except maybe an observation. Other frameworks I've worked with in the past have explicitly separated the responsibility for configuration of connections between components from the runtime (or computational) aspects of those components. This was done specifically to allow designers of systems to be in complete control over such configuration aspects. Besides reasoning about the architecture of systems, such separation of concerns also facilitates runtime reconfiguration of component connections by an external, coordinating, component (or complete (sub)system). In ROS 1, this was never really supported. The approach described in this REP doesn't seem to improve this situation in ROS 2 (runtime renegotiation of component connections with potentially varying types, based on predicates hidden inside nodes with no envisioned interface for external configuration). Tearing down, negotiating and setting up new connections between components (ie: nodes in ROS 2) sounds like a coordination / configuration phase activity to me. I'm not sure about making parts of that the responsibility of publishers and subscribers, as they don't sound like the kind of entities you'd give that responsibility to. (I realise we currently don't have such separation of concerns in ROS 2 either. I'm just not sure that going in what appears to be the opposite direction is a good idea. But again: I'm not a stakeholder here, and it looks like this is something needed for a project you're working on, so this is just an observation / shower thought)
This is a good point. It may be good to think about how to do external configuration.
Good point. We're still debating on a centralized or decentralized (node level) approach.
Thanks for the thoughts. Hopefully this will be a useful feature. I think it will be especially useful for hardware accelerators so that their packages can pick data types that, say, optimize performance and resource usage, without the user having to think about it - even with changing system demands. It may also give people some flexibility in using different datatypes with larger frameworks, like, say, MoveIt or Nav2, if they choose to support it. Regardless, I'm happy to hear your thoughts going forward, and thanks for the feedback so far.
I left a bunch of questions and thoughts about clarification inline. But overall, looking good!
rep-2009.rst
- The relative priority of the supported message type.

The message type is important because multiple supported message types may be sent with the same ROS 2 message type.
For example, a publisher may publish image data in the format of YUV420 or ARGB8888, both of which could be sent in a ``sensor_msgs::msg::Image``.
I think this deserves a bit more explanation for future readers. That is, make sure the user knows that if the publisher wants to publish the same ROS message type, with different metadata, the user may have to "wrap" that ROS Message type in another structure so it is now a "custom" type.
I'm not sure I understand.
This part doesn't rely on type adaptation. You could imagine passing the same ROS message type on two different channels, both of which are in YUV format, but say a different size image, and you choose depending on available resources. In this case, maybe you want to handle them differently and thus they could both register as supported types. I don't believe the user has to define any additional meta data than the supported type and have a code path for the message once it's received.
So I guess this is some confusion on my part on what the "types" are. As far as I understand, they can't just be ROS message types (like sensor_msgs/msg/Image), since publishing a YUV and an RGB message would use exactly the same ROS message type. There has to be some kind of meta-type that defines a ROS message type and particular metadata that would be embedded in that message.
So maybe I'll change the comment to a question: how is the user expected to tell the negotiation system that there are two different "types", but they both happen to use the same ROS message type?
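One way to picture the "meta-type" being asked about is a small record that pairs the ROS message type with a format tag and a priority. This is only a sketch of the idea under discussion: the `SupportedType` class and its `ros_type`, `format_name` and `weight` fields are hypothetical names, not the interface defined by the REP.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SupportedType:
    """Hypothetical 'supported type': a ROS message type plus a format tag."""
    ros_type: str        # e.g. "sensor_msgs/msg/Image"
    format_name: str     # assumed metadata tag, e.g. "yuv420"
    weight: float = 1.0  # relative priority; higher means more preferred

    @property
    def key(self) -> Tuple[str, str]:
        """Identity used for matching: the same ROS type can appear more than once."""
        return (self.ros_type, self.format_name)


yuv = SupportedType("sensor_msgs/msg/Image", "yuv420", weight=1.0)
argb = SupportedType("sensor_msgs/msg/Image", "argb8888", weight=0.5)

# Both entries share one ROS message type but remain distinct supported types.
print(yuv.ros_type == argb.ros_type, yuv.key != argb.key)
```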
I believe that I've explained this better now. See c88fc7e.
Edit 2: @audrow: would it be correct to say this proposal could be summarised as a dynamic version of what image_transport made possible? So use a base topic name (such as But then more generic and usable for things other than image message types. Edit: I missed this sentence during my initial rereading of the REP:
this is a crucial sentence, as it implies there will be additional infrastructure introduced used by nodes to communicate their preferences. After negotiation, a single It's also confusing the REP talks about "publishers creating publishers" and "subscribers creating subscribers". Perhaps that could be described differently?
Quick comment then: the proposed system in this REP reminds me of HTTP content negotiation. For HTTP, it makes sense to have something like that, as without it, types are either implicit or basically unknown. The basic connection (ie: TCP socket) is already setup when this negotiation happens, but that's fine, as such a connection is essentially just a tunnel between client and server over which some bytes are shipped. They don't have any "type" to begin with. For ROS topics, in a "strongly typed" message passing system, this seems like an unexpected addition (unneeded even?), as even before setting up any connections, I can already infer the type of data I'll receive from a topics definition (as in:
Not an expert, but would this sort of functionality complicate certification efforts? They typically don't like / actively avoid runtime changes to systems. (we don't certify ROS systems right now, perhaps partially due to complex runtime behaviour)
something about capabilities
Thinking about it some more, this kind of dynamic connection (re)negotiation also seems related to the capabilities system that saw a prototype in ROS 1. At some level of abstraction, a node
One more change from my side. Otherwise looking pretty good to me.
rep-2009.rst
Defining Negotiating Publishers and Subscribers
-----------------------------------------------

Negotiating publishers and subscribers both require a list of supported message types and a service topic that will be used to negotiate the selected message types.
I'd avoid using the term "service topic" here, as those terms are very overloaded in ROS. Maybe "base topic"?
Fixed in c88fc7e.
It seems the word "type" is overloaded here. It seems that the ROS message type is meant to be the type of the "physical" representation, whereas the REP is using type as a broader, semantic, thing. I would be quite concerned if we ended up relaxing the "physical" type, allowing it to change at runtime. I hope that this is not in scope. This would make programming ROS in statically typed languages (including C++) not very pleasant, and we would lose most advantages. However, even for "semantic types" (an image might be seen as a blob of bytes, regardless of encoding, for instance; same for natural language character strings) it would be beneficial to keep them distinct in types. The classic solution to this problem, to keep both flexibility and types, is to use subtyping or interfaces, or a combination of those. If one allowed type hierarchies (like class hierarchies) for message types, then one could still define channels over abstract types and publishers could publish (perhaps after a negotiation) a concrete subtype of the message. The client API could even use overloading/dynamic dispatch or an abstract interface to handle the message regardless of which concrete variant is being sent. This could of course become a major discussion in its own right, but I think a rather elegant solution could be attained. I guess for some abstract tasks one could even hide the concrete message types from programmers this way.
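The subtyping idea can be sketched in a few lines of Python. Nothing here exists in ROS 2: `ImageMessage`, `Yuv420Image`, `Argb8888Image` and `handle` are hypothetical names, and `functools.singledispatch` merely stands in for whatever overloading or dynamic dispatch a real client API might offer.

```python
from abc import ABC
from dataclasses import dataclass
from functools import singledispatch


class ImageMessage(ABC):
    """Abstract message type that a channel would be declared over."""


@dataclass
class Yuv420Image(ImageMessage):
    data: bytes


@dataclass
class Argb8888Image(ImageMessage):
    data: bytes


@singledispatch
def handle(message: ImageMessage) -> str:
    """Fallback for message variants the subscriber has no handler for."""
    raise TypeError(f"unsupported message variant: {type(message).__name__}")


@handle.register
def _(message: Yuv420Image) -> str:
    return f"yuv420 frame, {len(message.data)} bytes"


@handle.register
def _(message: Argb8888Image) -> str:
    return f"argb8888 frame, {len(message.data)} bytes"


# The subscriber callback is written against the abstract type only;
# the concrete variant that arrives selects the handler.
print(handle(Yuv420Image(b"\x00" * 6)))
print(handle(Argb8888Image(b"\x00" * 16)))
```

The channel stays statically typed over `ImageMessage`, while the concrete subtype sent on it could be whatever a negotiation settled on.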
I think so. I didn't know about the details of image transport before writing this, but after a little reading it seems similar in spirit. I think your summary is accurate.
I've tried to clear up this terminology by better sticking to the terms defined in the terminology section. Hopefully it is better now.
Probably, although I'm not very knowledgeable about the certification process. As far as I understand, the general certification process involves taking a subset of the system and certifying that. So maybe in certifying ROS 2 some organization could choose to take or leave this. It likely will be an additional package to
Cool note. It does seem very related. I'll look at maybe including this and image transport in the REP as related work. Thanks for pointing these out!
I've gone through the document and tightened things up. I hope now there is less ambiguity. Thanks for pointing this out.
This would be an opt-in feature, probably similar to lifecycle nodes, in that you have to explicitly use them, so nothing has to change. Also, the type is not really changing. It is just changing which publishers and subscribers are active at one time (through negotiation). Note that publishers and subscribers with different types should use different topic names, although, I believe this is not currently enforced in ROS 2.
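The point that negotiation only changes which publishers and subscribers are active can be sketched as follows. The `topic_for` naming scheme and the `NegotiatedEndpoints` class are assumptions made for illustration: every endpoint has a fixed type and its own topic name, and the negotiation result merely flips which one is in use.

```python
from typing import Dict, Iterable


def topic_for(base_topic: str, supported_type: str) -> str:
    """Hypothetical naming scheme: one topic name per supported type."""
    return f"{base_topic}/{supported_type}"


class NegotiatedEndpoints:
    """Tracks which statically typed endpoint is currently active."""

    def __init__(self, base_topic: str, supported_types: Iterable[str]) -> None:
        # All endpoints exist up front; none is active before negotiation.
        self.active: Dict[str, bool] = {
            topic_for(base_topic, name): False for name in supported_types
        }

    def apply(self, selected_topic: str) -> None:
        """Activate the negotiated endpoint and deactivate all others."""
        if selected_topic not in self.active:
            raise KeyError(selected_topic)
        for name in self.active:
            self.active[name] = (name == selected_topic)


endpoints = NegotiatedEndpoints("camera/image", ["yuv420", "argb8888"])
endpoints.apply(topic_for("camera/image", "yuv420"))
print(endpoints.active)
```

No endpoint ever changes its message type in this sketch; a renegotiation would just call `apply` with a different topic name.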
That seems like a cool idea.
This is probably possible with type adaptation: https://ros.org/reps/rep-2007.html.
I think this is a key point. I am much less concerned suddenly :)
Signed-off-by: Audrow Nash <audrow@hey.com>
Signed-off-by: Audrow Nash <audrow@hey.com> Co-authored-by: Chris Lalancette <clalancette@openrobotics.org>
49a26e1
to
042b33c
Compare
After doing the implementation of the type negotiation, several things were discovered. This rewrite of the document takes all of those new findings into account. Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette, I can't approve since I opened the PR, but the changes look good to me overall and match with my understanding of the implementation. I have two minor comments.
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
This is a REP to describe a type negotiation feature that is part of a collaboration with Nvidia.