-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a crash in test_message_cache #635
Conversation
Analyzing the code I can say that this only occurs in the test itself due to the fact that the close() is not called as it should before the change of callback. I am not sure about the change - I will take some time to analyze it. If it works properly, I am for it since it takes away the requirement of close(). If it affects performance considerably, I would opt to change the test instead and think about how we can guarantee that the close was called. Note that the actual code of caching in rosbag2 was tested to the extreme in our benchmarks (including plenty of splitting and huge volumes of data and/or topics) and we had no crashes. Which doesn't necessarily make them impossible, but certainly much less likely. |
The cache consumer requires that you call close() before calling change_consumer_callback(). The test forgot the close() call, so was running into a race where changing out the callback was not atomic. Remember to call close() in the test, which should fix some occasional failures in CI. Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
6e137ce
to
c77539f
Compare
This looks good to me, thank you Chris! @mjeronimo could you take a look and merge this? It fixes a non-deterministic test crash. |
@ros-pull-request-builder retest this please |
The Rpr job is failing, but that doesn't seem unique to this PR; other PRs seem to have the same problem. Probably something to be looked into, but I don't think it should hold this one up. I'm going to run CI on this just to ensure the other platforms are happy as well: |
Fix a crash in test_message_cache.
The cache consumer requires that you call close()
before calling change_consumer_callback(). The test
forgot the close() call, so was running into a race
where changing out the callback was not atomic.
Remember to call close() in the test, which should fix
some occasional failures in CI.
Signed-off-by: Chris Lalancette clalancette@openrobotics.org
This should fix the occasional crashes we are seeing, like https://ci.ros2.org/view/nightly/job/nightly_osx_repeated/2240/testReport/junit/(root)/projectroot/test_message_cache/ .
While this change makes sense to me, and passes the unit tests, I am hesitant just because the previous code was so weird. I'm not sure if I'm missing something. @adamdbrw , @Karsten1987 , @mjeronimo , @emersonknapp review here is welcome.