Skip to content

Conversation

@kleunen
Copy link
Contributor

@kleunen kleunen commented Oct 14, 2020

No description provided.

@codecov
Copy link

codecov bot commented Oct 14, 2020

Codecov Report

Merging #684 into new_wc will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           new_wc     #684   +/-   ##
=======================================
  Coverage   82.70%   82.70%           
=======================================
  Files          46       46           
  Lines        7082     7082           
=======================================
  Hits         5857     5857           
  Misses       1225     1225           

@redboltz
Copy link
Owner

I noticed that mqtt_valid_topic and mqtt_valid_subscription have prefix mqtt_. It isn't required.
Could you remove this?

It seems that those functions are not called. Of course the branch new_wc doesn't contain test_broker.hpp updates. So I've also checked #558. Who calls them?

Is function name and parameter name topic and subscription appropriate?
I use the name topic_filter for wildcard permitted place. For example, topic filter could be topic1, a/+/b, and a/#. In this case, topic_filter is appropriate name.
Note: The current master branch might violate this rule somewhere, but it should be fixed.

I'm not sure the difference between mqtt_valid_topic and mqtt_valid_subscription. The master branch only has mqtt_valid_topic_filter and it is called from compare_topic_filter(). compare_topic_filter() is called from linear search algorithm. I guess that your approach is very different from compare_topic_filter().

@redboltz
Copy link
Owner

After you update #684, if needed, then I will merge it to new_wc. Then move on the next step. The next step means updating test_broker.hpp to replace compare_topic_filter() with trie version. I'm not sure how to gather matched subscriptions using trie.

For example, a client subscribe a/+/c, a/b/c, and a/#. When a publisher (client) publish to a/b/c, then the broker needs to deliver the publish message to a/+/c, a/b/c, and a/# subscriptions.

@kleunen
Copy link
Contributor Author

kleunen commented Oct 14, 2020

After you update #684, if needed, then I will merge it to new_wc. Then move on the next step. The next step means updating test_broker.hpp to replace compare_topic_filter() with trie version. I'm not sure how to gather matched subscriptions using trie.

For example, a client subscribe a/+/c, a/b/c, and a/#. When a publisher (client) publish to a/b/c, then the broker needs to deliver the publish message to a/+/c, a/b/c, and a/# subscriptions.

yes, that is possible with the trie.

@kleunen
Copy link
Contributor Author

kleunen commented Oct 14, 2020

compare_topic_filter

compare_topic_filter() is called from linear search algorithm. I guess that your approach is very different from compare_topic_filter().

yes, compare_topic_filter is the linear search approach, it is not needed anymore if you use the trie approach

@kleunen
Copy link
Contributor Author

kleunen commented Oct 14, 2020

I'm not sure the difference between mqtt_valid_topic and mqtt_valid_subscription.

mqtt_valid_topic should be used when a message is published to a topic. The published topic should be a valid topic (not empty, does not contain \0, and does not contain wildcards).

mqtt_valid_subscription checks whether a subscription is valid. A subscription is allowed to contain wildcards. You named this validate_topic_filter with argument topic_filter, subscription = topic filter

@kleunen
Copy link
Contributor Author

kleunen commented Oct 14, 2020

mqtt_valid_subscription = validate_topic_filter and
mqtt_valid_topic = validate_topic_name

you can replace them. mqtt_valid_topic is not used by the trie, the broker should handle this.

@redboltz
Copy link
Owner

redboltz commented Oct 14, 2020

mqtt_valid_subscription = validate_topic_filter and
mqtt_valid_topic = validate_topic_name

you can replace them. mqtt_valid_topic is not used by the trie, the broker should handle this.

Hmm... maybe I understand. Let me clarify.

This is the current master's validate_topic_filter() definition.

// TODO: Technically this function is simply wrong, since it's treating the
// topic pattern as if it were an ASCII sequence.
// To make this function correct per the standard, it would be necessary
// to conduct the search for the wildcard characters using a proper
// UTF-8 API to avoid problems of interpreting parts of multi-byte characters
// as if they were individual ASCII characters
MQTT_STRING_VIEW_CONSTEXPR
bool validate_topic_filter(MQTT_NS::string_view topic_filter) {
/*
* Confirm the topic pattern is valid before registering it.
* Use rules from http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718106
*/
// All Topic Names and Topic Filters MUST be at least one character long
// Topic Names and Topic Filters are UTF-8 Encoded Strings; they MUST NOT encode to more than 65,535 bytes
if (topic_filter.empty() || (topic_filter.size() > std::numeric_limits<std::uint16_t>::max())) {
return false;
}
for (MQTT_NS::string_view::size_type idx = topic_filter.find_first_of(MQTT_NS::string_view("\0+#", 3));
MQTT_NS::string_view::npos != idx;
idx = topic_filter.find_first_of(MQTT_NS::string_view("\0+#", 3), idx+1)) {
BOOST_ASSERT(
('\0' == topic_filter[idx])
|| ('+' == topic_filter[idx])
|| ('#' == topic_filter[idx])
);
if ('\0' == topic_filter[idx]) {
// Topic Names and Topic Filters MUST NOT include the null character (Unicode U+0000)
return false;
}
else if ('+' == topic_filter[idx]) {
/*
* Either must be the first character,
* or be preceeded by a topic seperator.
*/
if ((0 != idx) && ('/' != topic_filter[idx-1])) {
return false;
}
/*
* Either must be the last character,
* or be followed by a topic seperator.
*/
if ((topic_filter.size()-1 != idx) && ('/' != topic_filter[idx+1])) {
return false;
}
}
// multilevel wildcard
else if ('#' == topic_filter[idx]) {
/*
* Must be absolute last character.
* Must only be one multi level wild card.
*/
if (idx != topic_filter.size()-1) {
return false;
}
/*
* If not the first character, then the
* immediately preceeding character must
* be a topic level separator.
*/
if ((0 != idx) && ('/' != topic_filter[idx-1])) {
return false;
}
}
else {
return false;
}
}
return true;
}

And this the current master's validate_topic_name() definition.

MQTT_STRING_VIEW_CONSTEXPR
bool validate_topic_name(MQTT_NS::string_view topic_name) {
/*
* Confirm the topic name is valid
* Use rules from https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901247
*/
// All Topic Names and Topic Filters MUST be at least one character long
// Topic Names and Topic Filters are UTF-8 Encoded Strings; they MUST NOT encode to more than 65,535 bytes
// The wildcard characters can be used in Topic Filters, but MUST NOT be used within a Topic Name
// Topic Names and Topic Filters MUST NOT include the null character (Unicode U+0000)
return
! topic_name.empty()
&& (topic_name.size() <= std::numeric_limits<std::uint16_t>::max())
&& (MQTT_NS::string_view::npos == topic_name.find_first_of(MQTT_NS::string_view("\0+#", 3)));
}

They are the same as mqtt_valid_subscription and mqtt_valid_topic on #684. It that right?
If it is right, then we can remove mqtt_valid_subscription and mqtt_valid_topic from path tokenizer.hpp.

That means we don't need #684, simply just use the current master's one.

So, I will remove the functions from new_wc branch. Is that OK?

I also want to update the name of mqtt_path_tokenizer to topic_filter_tokenizer.

Then I will let you know.
When I would update #676 and merge to the master, could you implement test_broker.hpp using trie way and create the new PR for that ?

@redboltz
Copy link
Owner

redboltz commented Oct 15, 2020

So, I will remove the functions from new_wc branch. Is that OK?
I also want to update the name of mqtt_path_tokenizer to topic_filter_tokenizer.

I prepared #685 it is a replacement of #676 .

@kleunen ,

My plan is

  1. merge Added basic part of new wildcard mechanism. #685
  2. close Added basic part of new wildcard mechanism. #676
  3. close Subscription map updated checking of subscription path #684

And then, I will ask you to update test_broker.hpp.

It this OK ?

@kleunen
Copy link
Contributor Author

kleunen commented Oct 15, 2020

It this OK ?

yes, that sounds OK. I am not sure what you did with the validate filter/topic function, you removed my functions and are using your and @jonesmz updated function ?

I think there was also a comment in the validate function of @jonesmz that UTF-8 matching was required. I am not sure if anything needs to be done to get UTF-8 matching working, the topics are validated on individual byte level. Maybe some tests with UTF-8 characters need in topics and topic_filters should be added.

@redboltz
Copy link
Owner

yes, that sounds OK. I am not sure what you did with the validate filter/topic function, you removed my functions and are using your and @jonesmz updated function ?

Ok I go forward the process.
I've already merged @jonesmz updated function to the master. We can continue to use them.

For UTF8 checking, we can use https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/utf8encoded_strings.hpp#L47 .
But it is no cheap, so check is only work if MQTT_USE_STR_CHECK is defined.

See https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/utf8encoded_strings.hpp#L50

@redboltz redboltz closed this Oct 15, 2020
@kleunen
Copy link
Contributor Author

kleunen commented Oct 15, 2020

For UTF8 checking, we can use https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/utf8encoded_strings.hpp#L47 .
But it is no cheap, so check is only work if MQTT_USE_STR_CHECK is defined.

I do think that is required, MQTT specification states strings are UTF-8 encoded, so validation of network input should be done, offcourse ... The performance impact is minimal i would say, not memory allocation in inner loop, just blasting away these instructions is very fast on modern CPU. No locking, no memory allocation performance, these are usually performance killers.

@redboltz
Copy link
Owner

Yes. I think that simple insert validate_contents() function.
If MQTT_USE_STR_CHECK is not defined, then it always returns valid.
User can simply define MQTT_USE_STR_CHECK to UTF8 check.

Some of user uses the broker on closed environment. It might be in the factory. So I mean ob premise case and all of clients are under control. In this case, UTF8 check is redundant. So I provide the option.

@kleunen
Copy link
Contributor Author

kleunen commented Oct 15, 2020

Yes. I think that simple insert validate_contents() function.
If MQTT_USE_STR_CHECK is not defined, then it always returns valid.
User can simply define MQTT_USE_STR_CHECK to UTF8 check.

Some of user uses the broker on closed environment. It might be in the factory. So I mean ob premise case and all of clients are under control. In this case, UTF8 check is redundant. So I provide the option.

Yes, but even if all the clients are under control, if you open a TCP/SSL port, someone else beside your clients might connect. But ok, the library focuses now on MQTT client. If you have a MQTT server, i would always recommend enabling this check, and maybe some additional input validation as well ?

@redboltz
Copy link
Owner

Length checking and invalid bit checking are already implemented and always enabled.

I agree recommending define MQTT_USE_STR_CHECK.

NOTE: mqtt_cpp uses positive meaning flags. e.g.) MQTT_USE_TLS. And mqtt_cpp is a header only library so user need to define them. See https://github.com/redboltz/mqtt_cpp/wiki/Config


This is off topic comment.

I use mqtt_cpp for my proprietary broker. It has authorization and authentication. Client is required subscribe/publish permission for each topic.
And it supports out of UTF8 string like C++ std::string. That means binary data.
For example, user can use binary data in User Property
https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901054
I personally think that UTF8 limitation is too much. If I want to use binary data as User Property, I need to do something like base64 encode. MQTT is compared with HTTP. MQTT is more binary oriented.

Anyway, it is my broker story.
mqtt_cpp should provide UTF8 validating functionality and users use it by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants