-
Notifications
You must be signed in to change notification settings - Fork 129
Update OSITrace to be used in osi-validation #762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
c0aeaca
47b65f5
2c71922
a4dc739
74ab68b
d719fd6
4776445
fe6311a
3526f3e
0784ebb
bd1b4e6
4869ca4
6990c8d
fbf7bfc
d34f0d9
18f5ccc
36938ee
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,6 +32,7 @@ def __init__(self, path=None, type_name="SensorView", cache_messages=False): | |
self.read_complete = False | ||
self.message_cache = {} if cache_messages else None | ||
self._header_length = 4 | ||
self.timestep_count = 0 | ||
if path: | ||
self.from_file(path, type_name, cache_messages) | ||
|
||
|
@@ -49,13 +50,16 @@ def from_file(self, path, type_name="SensorView", cache_messages=False): | |
self.message_offsets = [0] | ||
self.message_cache = {} if cache_messages else None | ||
|
||
self.timestep_count = len(self.retrieve_offsets()) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This turns a simple open operation into a very expensive operation, as it must read through the whole file to extract all the offsets. Sometimes this is needed, but in most instances you want to actually read and process a file sequentially, either step-by-step, or you need all messages anyway, in which case you also do not want to retrieve the offsets but rather read the messages at the same time. So that is a big no-no. |
||
self.type = self.map_message_type(type_name) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a duplication of line 40. |
||
|
||
def retrieve_offsets(self, limit=None): | ||
"""Retrieve the offsets of the messages from the file.""" | ||
if not self.read_complete: | ||
self.current_index = len(self.message_offsets) - 1 | ||
self.file.seek(self.message_offsets[-1], 0) | ||
while ( | ||
not self.read_complete and not limit or len(self.message_offsets) <= limit | ||
while not self.read_complete and ( | ||
not limit or len(self.message_offsets) <= limit | ||
Comment on lines
-57
to
+62
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The bugfix is important, It just makes me regret that we added black reformatting checks that this is convoluted by the arbitrary re-indentations needed. We probably should increase the line length to avoid stupidity like this. |
||
): | ||
self.retrieve_message(skip=True) | ||
return self.message_offsets | ||
|
@@ -156,6 +160,21 @@ def get_messages_in_index_range(self, begin, end): | |
yield message | ||
current += 1 | ||
|
||
def cache_messages_in_index_range(self, begin, end): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This whole method is a Bad Idea(tm): A cache is something that stores things that have already been accessed. In any case, the iterator returned from get_messages_in_index_range will already place all read messages into the cache, so the implementation is also needlessly redundant. If one just wants to have all messages available in memory, then this really is not cacheing, it is just buffer reading and should be handled as such; i.e. if I want all messages anyway, then the simple one-liner suffices:
And then access the messages by index via
In Python 3.12 and later itertools.batched could be used directly. In summary I would say osi-validation code that handles messages is currently a mess and should be rewritten to more clearly do the batch-processing it needs without multiple reads/re-reads of the file and deep interaction with internal mechanisms of the reader. |
||
""" | ||
Put all messages from index begin to index end in the cache. Then the | ||
method ``get_message_by_index`` can access to it in a faster way. | ||
|
||
Using this method again clear the last cache and replace it with a new | ||
one. | ||
""" | ||
self.message_cache = { | ||
index + begin: message | ||
for index, message in enumerate( | ||
self.get_messages_in_index_range(begin, end) | ||
) | ||
} | ||
|
||
def close(self): | ||
if self.file: | ||
self.file.close() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant to the length of message_offsets, and only valid once a file has been fully read in, so is not an apropriate API to the outside.