
saving log events between py2/py3 #68

Open
warner opened this issue Jan 5, 2020 · 0 comments
warner commented Jan 5, 2020

As mentioned in #48 (comment), when a py2-based client emits a log event, the receiver (flogtool tail, log-gatherer, incident-gatherer) gets an event dictionary that uses bytes for both the keys and the values. If the receiver is running py3, json.dumps() will fail: the py3 json module is pickier about key types than the py2 one was, and rejects bytes keys (it wants text, i.e. str under py3, and otherwise only accepts int, float, bool, and None).
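A minimal reproduction of the failure, run under py3. The event contents here are made up for illustration and are not a real event captured from the wire:

```python
import json

# Hypothetical event, shaped like what a py3 receiver might get from a py2 sender.
event = {b"message": b"hello", b"level": 20}

try:
    json.dumps(event)
    error = None
except TypeError as e:
    # py3's json refuses bytes keys (and bytes values) outright
    error = e

print(error)

# The same event with text keys and values serializes fine.
ok = json.dumps({"message": "hello", "level": 20})
```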

The json module has an override (pass cls= with a subclass that implements JSONEncoder.default) for handling non-serializable objects, but this doesn't appear to enable the serialization of bytestring keys. The hook is never consulted for dictionary keys, nor for any type the encoder already knows how to serialize (dicts included).
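To illustrate the limitation, here is a small sketch with a hypothetical encoder subclass (`BytesAwareEncoder` is a name made up for this example). Its `default()` hook handles bytes values, but the encoder rejects a bytes key without ever calling the hook:

```python
import json

class BytesAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder: decodes bytes *values* via the default() hook."""
    def default(self, o):
        if isinstance(o, bytes):
            return o.decode("utf-8")
        return super().default(o)

# A bytes value reaches default() and gets converted.
value_ok = json.dumps({"message": b"hello"}, cls=BytesAwareEncoder)

# A bytes key never reaches default(); the encoder raises TypeError directly.
try:
    json.dumps({b"message": "hello"}, cls=BytesAwareEncoder)
    key_failed = False
except TypeError:
    key_failed = True
```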

So to fix this, I think we'd need a recursive rewriter that takes the dictionary, walks through all collections inside it (dicts, but also lists), and returns a new dict with text keys.
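A rough sketch of what such a rewriter could look like. The name `text_keys` and the choice of UTF-8 are assumptions, not anything that exists in the codebase. It only touches keys and leaves values alone:

```python
def text_keys(obj, encoding="utf-8"):
    """Return a copy of obj in which every bytes dict key is decoded to text.

    Hypothetical helper: walks dicts and lists (tuples are returned as lists,
    which is what JSON would turn them into anyway). Non-bytes keys and all
    leaf values are passed through unchanged.
    """
    if isinstance(obj, dict):
        return {
            (k.decode(encoding) if isinstance(k, bytes) else k): text_keys(v, encoding)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [text_keys(item, encoding) for item in obj]
    return obj

rewritten = text_keys({b"num": 3, b"args": [{b"nested": 1}]})
```

Building a new dict rather than mutating in place keeps the original event intact for any other consumer that still holds a reference to it.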

For the sake of rendering, it might also be nice to replace bytes values with text equivalents, as most of the values in log events are boring ASCII strings too.

The wrinkle is that application code can provide additional arguments (yay structured logging), and their values are not necessarily boring ASCII. They could contain nested dictionaries, with arbitrary keys. It's probably fair to insist that log events be serializable, even though part of the intended benefit of structured logging was to let the application author record whatever data would be useful in future debugging, without needing to think about how it should be rendered into text.

(I think we were originally using our own Banana serialization for log events, which was more flexible but more complicated, and managed to introduce dependencies: the log-viewer had to be able to import the log-emitter's classes, eww.)

So if a log event arrives over the wire with bytes keys, we should be ok rewriting them to be strings, and just deny things like hash tables with binary keys. If the values are bytes too, we convert them too, perhaps lossily.
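One way that policy might be expressed, as a sketch under assumptions: `to_jsonable` is a hypothetical name, UTF-8 is assumed as the wire encoding, and raising ValueError for undecodable keys is just one possible way to "deny" them. Values are decoded with `errors="replace"`, which is the lossy part:

```python
import json

def to_jsonable(obj):
    """Hypothetical normalizer for incoming log events.

    - bytes keys must decode as UTF-8, otherwise the event is rejected
    - bytes values are decoded lossily (undecodable bytes become U+FFFD)
    """
    if isinstance(obj, bytes):
        return obj.decode("utf-8", errors="replace")
    if isinstance(obj, dict):
        out = {}
        for k, v in obj.items():
            if isinstance(k, bytes):
                try:
                    k = k.decode("utf-8")
                except UnicodeDecodeError:
                    raise ValueError("log event has a non-text key: %r" % (k,))
            out[k] = to_jsonable(v)
        return out
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item) for item in obj]
    return obj

# Made-up event: one ASCII-ish value, one genuinely binary value.
event = {b"message": b"caf\xc3\xa9", b"extra": {b"blob": b"\xff\x00"}}
clean = to_jsonable(event)
serialized = json.dumps(clean)
```

The asymmetry is deliberate in this sketch: a mangled value still leaves a readable event, while a silently mangled key could collide with another key or hide which field was meant.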
