You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
SequentialWriter write function checks for the splitting condition and executes split_bagfile(). This is potentially a costly operation since it involves disk operation and a cache flush (under the hood, we wait for the cache to flush and to join the thread). The write() function is mutex-locked. While splitting is processed, some messages could get lost due to subscriber queues size limits.
In a way, this is an expected behavior, since one would expect that the new bag has to be ready and open before we begin writing. However, since there is a cache, we could use it to capture traffic during the splitting. This would prevent the message loss that is likely to occur during splitting.
The fix might not be straightforward and would involve either a non-blocking cache flush (need to be careful about concurrency logic) or a change in the SequentialWriter locking logic (with a more granular approach).
Expected Behavior
When bag splitting is in progress, we do not lose messages as long as there is space in the cache.
Actual Behavior
Message loss occurs when splitting, especially higher throughput and with high-rate, low queue size publishers.
To Reproduce
It is best to use the rosbag2_performance_benchmarking package (the README.md explains how). Configure .yaml files for your system so that the throughput is close enough to hardware limits of hard drive. Set the max_bag_size parameter so that the splitting will occur.
Additional Context
When changing split_bagfile() to non-blocking, make sure the MessageCache push() implementation is adjusted so that it allows pushing in the flushing_ state.
The text was updated successfully, but these errors were encountered:
adamdbrw
changed the title
Bag splitting blocks the writer and can cause message loss
Bag splitting blocks the writer which can cause message loss
Feb 8, 2021
Description
SequentialWriter write function checks for the splitting condition and executes split_bagfile(). This is potentially a costly operation since it involves disk operation and a cache flush (under the hood, we wait for the cache to flush and to join the thread). The write() function is mutex-locked. While splitting is processed, some messages could get lost due to subscriber queues size limits.
In a way, this is an expected behavior, since one would expect that the new bag has to be ready and open before we begin writing. However, since there is a cache, we could use it to capture traffic during the splitting. This would prevent the message loss that is likely to occur during splitting.
The fix might not be straightforward and would involve either a non-blocking cache flush (need to be careful about concurrency logic) or a change in the SequentialWriter locking logic (with a more granular approach).
Expected Behavior
When bag splitting is in progress, we do not lose messages as long as there is space in the cache.
Actual Behavior
Message loss occurs when splitting, especially higher throughput and with high-rate, low queue size publishers.
To Reproduce
It is best to use the rosbag2_performance_benchmarking package (the README.md explains how). Configure .yaml files for your system so that the throughput is close enough to hardware limits of hard drive. Set the
max_bag_size
parameter so that the splitting will occur.Additional Context
When changing split_bagfile() to non-blocking, make sure the MessageCache push() implementation is adjusted so that it allows pushing in the flushing_ state.
The text was updated successfully, but these errors were encountered: