-
Notifications
You must be signed in to change notification settings - Fork 0
Description
I was looking at using this library for converting some python objects that use the struct module into C++ definitions
Unfortunately I found that this repo only handles a small subset of valid struct format strings
struct allows spaces in format strings, as well as integers preceding an identifier.
"I 128s 2h" is a valid format for the struct module, but not for this repo
Instead it fails on the first space, since it isn't listed as a valid character, rather than skipping it
The same for any integer.
I've made my own implementation to handle this, rather than a fork, since I also have to handle my custom objects
(The binary data is represented in nested structures, so my definitions also have to take that into account)
One common special case to look out for is strings:
With struct both "5c" & "5s" represent the equivalent of char[5] in C
However, the objects returned are different:
>>> import struct
>>> struct.unpack("5c", b"\x00\x01\x02\x03\x04")
(0, 1, 2, 3, 4)
>>> struct.unpack("5s", b"words")
b"words"This is an important restriction, in writing my own function to standardise the format strings,
I had to include a case for strings, to ensure I recognised it as the single object struct would expect
Unfortunately my solution is a little clunky, but I'll share it here if you think you'd find it useful:
def simplify(_format):
"""Converts a format string for the struct module into a list of type chars"""
_format = _format.replace(" ", "")
if _format[-1].isnumeric():
raise RuntimeError("_format is invalid (ends in a number)")
out, i = list(), 0
while i < len(_format):
char = _format[i]
if char.isnumeric(): # find the whole number
j = 1
while _format[i:i+j].isnumeric():
j += 1
j -= 1
count = int(_format[i:i + j])
i += j
f = _format[i]
# NOTE: f is only assumed to be valid, validity is not checked here
if f == "s": # string
out.append(f"{count}s")
else:
out.extend([f] * count) # "2I" -> ["I", "I"]
elif char.isalpha(): # lazily collect type
out.append(_format[i])
elif char == "?": # bool is not an alphabetical character, but is still valid
out.append("?")
else:
raise RuntimeError(f"Invalid character '{char}' in _format")
i += 1
return outThis could probably be tidier but in my testing it seems to work with the format strings I'm using
(no pointer types, padding, size_t or 0 length types)
Using this with the dictionary lookups in struct_parse should handle both spaces and types preceeded by counts
Strings are passed through as "128s" etc., so a special case would need to be added
Presumably noting the type as string and also the size
I hope this is useful in expanding this repo to support all valid format strings the struct module accepts