Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run test_security on CycloneDDS as well #408

Merged
merged 2 commits into from
Aug 28, 2020

Conversation

mikaelarguedas
Copy link
Member

@SidFaber FYI this branch can be used for testing ros2/rmw_cyclonedds#123.

As of ros2/rmw_cyclonedds@bca0852 there are still some cases failing.

	  8 - test_secure_publisher_subscriber__Empty__rmw_cyclonedds_cpp__secure_comm_1 (Timeout)
	  9 - test_secure_publisher_subscriber__Empty__rmw_cyclonedds_cpp__secure_comm_2 (Timeout)
	 19 - test_secure_publisher_subscriber__UnboundedSequences__rmw_cyclonedds_cpp__secure_comm_1 (Timeout)
	 20 - test_secure_publisher_subscriber__UnboundedSequences__rmw_cyclonedds_cpp__secure_comm_2 (Timeout)

@mikaelarguedas
Copy link
Member Author

This is still blocked on the security plugins to land on the default branch of cyclonedds.

@eboasson do you know if there a specific ticket or thread to subscribe to to get notified when the security plugins get merged?

@eboasson
Copy link

eboasson commented May 6, 2020

@mikaelarguedas sorry this fell by the wayside ... I seem to be really bad at closing tickets promptly, but eclipse-cyclonedds/cyclonedds#132 should be closed when it gets merged in.

I'm working my way through ticking the boxes for doing the 0.6.0 release based on current master, and once that's done the security support code will be merged into master and the process for 0.7.0 started. With the usual caveats about estimates, next week seems likely.

@mikaelarguedas
Copy link
Member Author

That's awesome thanks

@mikaelarguedas
Copy link
Member Author

Tried running security_tests on the latest cyclonedds code but the tests failed with

[test_publisher-1]   'Security was requested but the Cyclone DDS being used does not have security support enabled. Recompile Cyclone DDS with the '-DENABLE_SECURITY=ON' CMake option, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:905, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:273'
  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Using eclipse-cyclonedds/cyclonedds#548 the test gets passed that error but still fails the tests with another error:

[test_publisher-1]   'failed to create topic, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:1775, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:273'
[test_publisher-1] 
[test_publisher-1] with this new error message:
[test_publisher-1] 
[test_publisher-1]   'rcl node's rmw handle is invalid, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:425'
  • Linux Build Status

@SidFaber @eboasson FYI

@mikaelarguedas
Copy link
Member Author

cc @kyrofa (not sure if editing existing comments with new mentions sends notifications)

@kyrofa
Copy link
Member

kyrofa commented Jun 17, 2020

fails the tests with another error:

[test_publisher-1]   'failed to create topic, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:1775, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:273'
[test_publisher-1] 
[test_publisher-1] with this new error message:
[test_publisher-1] 
[test_publisher-1]   'rcl node's rmw handle is invalid, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:425'

When I see that error I typically just rebuild my workspace 🙈 . It's the most unclear thing I've ever seen. On a slightly more helpful note, looks like it's coming from here. Would need some more insight into why create_topic would be returning < 0 though.

Signed-off-by: Mikael Arguedas <mikael.arguedas@gmail.com>
@mikaelarguedas
Copy link
Member Author

As discussed in last Security WG, this has been rebased and retested.

  • Communication with CycloneDDS using security works locally using the nightly archive (for both topics and services). This can be tested by following the SROS2 Tutorial
  • It still fails on CI using this branch with the error mentioned above:
[test_publisher-1] [INFO] [1596002443.442937024] [rcl]: Found security directory: /home/jenkins-agent/workspace/ci_linux/ws/build/test_security/test/test_security_files/enclaves/publisher
[test_subscriber-2] [INFO] [1596002443.442941982] [rcl]: Found security directory: /home/jenkins-agent/workspace/ci_linux/ws/build/test_security/test/test_security_files/enclaves/subscriber
[test_publisher-1] 
[test_publisher-1] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[test_publisher-1] This error state is being overwritten:
[test_publisher-1] 
[test_publisher-1]   'failed to create topic, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:1849, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:261'
[test_publisher-1] 
[test_publisher-1] with this new error message:
[test_publisher-1] 
[test_publisher-1]   'rcl node's rmw handle is invalid, at /home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rcl/rcl/src/rcl/node.c:413'
[test_publisher-1] 
[test_publisher-1] rcutils_reset_error() should be called after error handling to avoid this.

We can see it selects the correct enclave for both the publisher and the subscriber but fails at topic creation time. It fails when creating the DDS topic for the publisher here. Not clear why though

cc @SidFaber @kyrofa

@eboasson
Copy link

eboasson commented Jul 29, 2020

When I see that error I typically just rebuild my workspace 🙈 . It's the most unclear thing I've ever seen.

So true ...

The domanid issue I ran into was just that. A a forced rebuild of test_security does re-generate the permissions document, but not the governance file ...

Somewhere in the test output (only in the log file!):

1596025945.8806360 [test_publisher-1] 1596025945.875017 [108] test_secur: It is not allowed to create participant: Could not find domain 108 in governance(code: 141)

And indeed the generated /home/erik/ros2_ws/build/test_security/test/test_security_files/enclaves/subscriber/governance.p7s file does not specify domain id 108:

<dds xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.omg.org/spec/DDS-SECURITY/20170901/omg_shared_ca_governance.xsd">
    <domain_access_rules>
        <domain_rule>
            <domains>
              <id>0</id>
            </domains>
            <allow_unauthenticated_participants>false</allow_unauthenticated_participants>
            <enable_join_access_control>true</enable_join_access_control>
            <discovery_protection_kind>ENCRYPT</discovery_protection_kind>
            <liveliness_protection_kind>ENCRYPT</liveliness_protection_kind>
            <rtps_protection_kind>SIGN</rtps_protection_kind>
            <topic_access_rules>
                <topic_rule>
                    <topic_expression>*</topic_expression>
                    <enable_discovery_protection>true</enable_discovery_protection>
                    <enable_liveliness_protection>true</enable_liveliness_protection>
                    <enable_read_access_control>true</enable_read_access_control>
                    <enable_write_access_control>true</enable_write_access_control>
                    <metadata_protection_kind>ENCRYPT</metadata_protection_kind>
                    <data_protection_kind>ENCRYPT</data_protection_kind>
                </topic_rule>
            </topic_access_rules>
        </domain_rule>
    </domain_access_rules>
</dds>

the permissions.p7s file does:

...
      <allow_rule>
        <domains>
          <id>108</id>
        </domains>
        <publish>
...

So my interpretation is that the generated governance file is simply incorrect. If you don't set ROS_DOMAIN_ID (or you set it to 0), everything matches and it works fine.

@eboasson
Copy link

Would need some more insight into why create_topic would be returning < 0 though.

I have found it: the generated permissions document doesn't include the "ros_discovery_topic", therefore the access control plugin rejects creating that topic. It reports it in the trace, just not on stderr, and that really can use some refinement.

After a CI run with a few hacks to get a lot more output:

23: MIME-Version: 1.0
23: Content-Type: multipart/signed; protocol="application/x-pkcs7-signature"; micalg="sha-256"; boundary="----2E6986B5482D0195A208E28AA1F90031"
23: 
23: This is an S/MIME signed message
23: 
23: ------2E6986B5482D0195A208E28AA1F90031
23: Content-Type: text/plain
23: 
23: <dds xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.omg.org/spec/DDS-SECURITY/20170901/omg_shared_ca_permissions.xsd">
23:   <permissions>
23:     <grant name="/publisher">
23:       <subject_name>CN=/publisher</subject_name>
23:       <validity>
23:         <not_before>2020-07-30T07:09:38</not_before>
23:         <not_after>2030-07-29T07:09:38</not_after>
23:       </validity>
23:       <allow_rule>
23:         <domains>
23:           <id>108</id>
23:         </domains>
23:         <publish>
23:           <topics>
23:             <topic>rq/*/_action/cancel_goalRequest</topic>
23:             <topic>rq/*/_action/get_resultRequest</topic>
23:             <topic>rq/*/_action/send_goalRequest</topic>
23:             <topic>rq/*Request</topic>
23:             <topic>rr/*/_action/cancel_goalReply</topic>
23:             <topic>rr/*/_action/get_resultReply</topic>
23:             <topic>rr/*/_action/send_goalReply</topic>
23:             <topic>rt/*/_action/feedback</topic>
23:             <topic>rt/*/_action/status</topic>
23:             <topic>rr/*Reply</topic>
23:             <topic>rt/*</topic>
23:           </topics>
23:         </publish>
23:         <subscribe>
23:           <topics>
23:             <topic>rq/*/_action/cancel_goalRequest</topic>
23:             <topic>rq/*/_action/get_resultRequest</topic>
23:             <topic>rq/*/_action/send_goalRequest</topic>
23:             <topic>rq/*Request</topic>
23:             <topic>rr/*/_action/cancel_goalReply</topic>
23:             <topic>rr/*/_action/get_resultReply</topic>
23:             <topic>rr/*/_action/send_goalReply</topic>
23:             <topic>rt/*/_action/feedback</topic>
23:             <topic>rt/*/_action/status</topic>
23:             <topic>rr/*Reply</topic>
23:             <topic>rt/*</topic>
23:           </topics>
23:         </subscribe>
23:       </allow_rule>
23:       <default>DENY</default>
23:     </grant>
23:   </permissions>
23: </dds>
...
23: 1596179450.340630 [108] test_secur: Local topic permission denied: ros_discovery_info denied by default rule(code: 145)

My local build has the allow "ros_discovery_topic" rule in the permissions document. The difference with the CI run appears to be that the CI run excluded Fast-RTPS.

@mikaelarguedas
Copy link
Member Author

Thanks @eboasson for looking into this into details!
The domain ID issue was surprising and I havent been able to reproduce it locally.

The ros_discovery_info is promising because indeed sros2 is not aware that rmw_cyclonedds_cpp needs these permissions. I opened ros2/sros2#231 and it fixes the tests locally 👍.

Out of curiosity, how did you get this logging info ? did you have to configure extra logging or was it in the generated test file as is ?

@eboasson
Copy link

I hacked it ... you can get tons of details from Cyclone by enabling tracing, but for now you have to do that with an environment variable (CYCLONEDDS_URI) and I couldn't quickly find a way to set that in a CI build. However, enabling its traces only gets you the message and as it still didn't make (much) sense to me, I also did a hack in the code to simply print the permissions document ...

Like this:

diff --git a/rmw_cyclonedds_cpp/src/rmw_node.cpp b/rmw_cyclonedds_cpp/src/rmw_node.cpp
index a02e417..841a96c 100644
--- a/rmw_cyclonedds_cpp/src/rmw_node.cpp
+++ b/rmw_cyclonedds_cpp/src/rmw_node.cpp
@@ -760,13 +760,14 @@ static bool check_create_domain(dds_domainid_t did, rmw_localhost_only_t localho
        possible, too, but I think it is clearer to spell it out completely).  Empty
        configuration fragments are ignored, so it is safe to unconditionally append a
        comma. */
-    std::string config =
+    std::string config0 =
       localhost_only ?
       "<CycloneDDS><Domain><General><NetworkInterfaceAddress>localhost</NetworkInterfaceAddress>"
       "</General></Domain></CycloneDDS>,"
       :
       "";
-
+    std::string config = config0 + "<Tr><V>finest</><Out>stderr</></>,";
+    
     /* Emulate default behaviour of Cyclone of reading CYCLONEDDS_URI */
     const char * get_env_error;
     const char * config_from_env;

and

diff --git a/src/security/builtin_plugins/access_control/src/access_control.c b/src/security/builtin_plugins/access_control/src/access_control.c
index b342f8ec..496413d8 100644
--- a/src/security/builtin_plugins/access_control/src/access_control.c
+++ b/src/security/builtin_plugins/access_control/src/access_control.c
@@ -2039,6 +2039,8 @@ read_document_from_file(
     DDS_Security_Exception_set(ex, DDS_ACCESS_CONTROL_PLUGIN_CONTEXT, DDS_SECURITY_ERR_INVALID_FILE_PATH_CODE, 0, DDS_SECURITY_ERR_INVALID_FILE_PATH_MESSAGE, (filename ? filename : "NULL"));
     return false;
   }
+
+  fprintf(stderr, "%s\n", *doc);
   return true;
   DDSRT_WARNING_MSVC_ON(4996);
 }

@eboasson
Copy link

On a related note: I have planned the Cyclone 0.7.0 release (so with the support for security) for the 6th and will propose a PR for Foxy to switch to the new release. Hopefully it will be accepted!

@mikaelarguedas
Copy link
Member Author

you can get tons of details from Cyclone by enabling tracing, but for now you have to do that with an environment variable (CYCLONEDDS_URI) and I couldn't quickly find a way to set that in a CI build

Great noting that somewhere for future debugging sessions

I have planned the Cyclone 0.7.0 release

That's awesome, I'll work on getting a matching release of sros2 so security features can be used in environments that have only rmw_cyclonedds_cpp installed and not the other rmw implementations

@mikaelarguedas
Copy link
Member Author

CycloneDDS only (in debug mode)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

All RMWs in release mode:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@mikaelarguedas mikaelarguedas marked this pull request as ready for review July 31, 2020 15:17
Signed-off-by: Mikael Arguedas <mikael.arguedas@gmail.com>
Copy link
Member

@ruffsl ruffsl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Though I wonder what improved the test time compared to past test history on Jenkins.

@mikaelarguedas
Copy link
Member Author

Though I wonder what improved the test time compared to past test history on Jenkins.

Can you clarify which test times you are referring to and comparing to ?
Are you talking about individual test durations or the overall CI job duration ?

@ruffsl
Copy link
Member

ruffsl commented Aug 1, 2020

I was just a perusing through some of the security connection tests, and saw a few ~7 sec tests plummet to sub second, though the drop was from several cycles all. Perhaps it's just outliers from cpu load on the VM worker.

https://ci.ros2.org/job/ci_linux/11730/testReport/test_security/TestSecurePublisherSubscriber/test_subscriber_terminates_in_a_finite_amount_of_time_11/history/

https://ci.ros2.org/job/ci_linux/11730/testReport/test_security/TestSecurePublisherSubscriber/test_subscriber_terminates_in_a_finite_amount_of_time_14/history/

@mikaelarguedas
Copy link
Member Author

I believe this is representative of how long it takes to spin up, connect, tear down a node with connext (slow) vs fastrtps and cyclone (fast).
The issue of these "post-test" tests test_subscriber_terminates_in_a_finite_amount_of_time is that it's not clear if they apply to the same test / same RMW or not.
For example on this page you can see the time for all security tests of the various rmws:
The Connext tests take around 7sec while Fast-RTPS and Cyclone are < 1sec

@mikaelarguedas
Copy link
Member Author

@mjcarroll or other maintainers, do you think we can get this one merged ?

@mikaelarguedas
Copy link
Member Author

@nuclearsandwich Do you think we can get this one reviewed/merged as well ?

@mjcarroll mjcarroll merged commit 2a0a629 into ros2:master Aug 28, 2020
@mikaelarguedas mikaelarguedas deleted the security-cyclone branch August 28, 2020 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants