Regression - RepresenterError on dict subclass #142
Comments
Here's a more direct example of what's newly failing:
|
Here's the previous behavior, observed on Python 3.6 and PyYAML 3.12:
|
This might be because it seems to have switched to safe-dumping by default: https://github.com/yaml/pyyaml/blob/master/lib3/yaml/dumper.py#L27 |
So... does that mean the next release will be backward-incompatible? Can you recommend a workaround that would retain the 3.12 behavior? |
I'm not a maintainer. Landed here from the pmxbot issue. :) I'd guess you can try to get the "DangerousDumper" (or similarly named) class and if it's not present, fall back to the existing Dumper class when doing the dump. That'll work now and in the future even if the next release is backwards incompatible. IMO, the current behaviour on master is the correct behaviour. By default PyYAML should be using the "safe" dumper and loader. |
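The fallback suggested above can be sketched in a couple of lines. The attribute name `DangerousDumper` is only the guess from the comment, not a confirmed PyYAML name; `getattr` simply falls back to `yaml.Dumper` when no such attribute exists.

```python
import yaml

# "DangerousDumper" is a guessed name taken from the comment above; if this
# PyYAML build has no such attribute, getattr falls back to yaml.Dumper.
FullDumper = getattr(yaml, "DangerousDumper", yaml.Dumper)

# A plain dict is used here so the output needs no Python-specific tags.
text = yaml.dump({"name": "demo"}, Dumper=FullDumper)
print(text)
```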
I tried using the dangerous dumper, but found that creates a YAML file that requires a dangerous loader. What I really want is to instruct YAML to serialize a dict subclass exactly as it would a dict. |
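One way to get that effect, as a minimal sketch rather than an official recommendation: register a multi representer for `dict` on a private `SafeDumper` subclass, so that any dict subclass is emitted as a plain mapping. The names `SubclassFriendlyDumper` and `Config` are hypothetical; `add_multi_representer` and `represent_dict` are existing PyYAML calls.

```python
import yaml


class SubclassFriendlyDumper(yaml.SafeDumper):
    """Hypothetical SafeDumper subclass, so the global SafeDumper is untouched."""


# A multi representer applies to the given type and all of its subclasses.
SubclassFriendlyDumper.add_multi_representer(
    dict, yaml.SafeDumper.represent_dict
)


class Config(dict):
    """Hypothetical dict subclass standing in for the failing type."""


text = yaml.dump(Config(name="demo"), Dumper=SubclassFriendlyDumper)
loaded = yaml.safe_load(text)  # no Python-specific tag, so safe_load works
print(text)
```

Because the output carries no Python-specific tag, it can be read back with the safe loader, at the cost of losing the subclass type on the round trip.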
I also ran into this issue trying to dump an instance of a dict subclass. I found an ugly workaround on StackOverflow that you must run before dumping:

    yaml.add_representer(
        Dict_Subclass,
        lambda dumper, data: dumper.represent_mapping('tag:yaml.org,2002:map', data.items())
    )

Limitations:
|
This issue doesn't seem to be limited to dict, but applies to all types that are subclasses of representable types, right? Consider for instance:

    import json
    import yaml

    class StringSubclass(str):
        ...

    class IntSubclass(int):
        ...

Both of these fail:

    print(yaml.safe_dump(StringSubclass("foo")))
    print(yaml.safe_dump(IntSubclass(42)))

This is unfortunate, because it is inconsistent with the behavior of json:

    print(json.dumps(StringSubclass("foo")))
    print(json.dumps(IntSubclass(42)))

The difference in behavior means that (untagged "safe") yaml serialization cannot simply be used in places where plain json serialization works fine. Imho safe dumping should mirror the behavior of json here and allow serialization if the base type is serializable anyway. Otherwise, the necessary work-around on user side when switching from json to yaml can get pretty ugly. |
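To illustrate the kind of user-side workaround meant here, a rough sketch: a hypothetical helper, `to_builtin`, that rebuilds the data from plain built-in types before it is handed to a safe dumper. The helper name and the set of handled types are assumptions for illustration, not part of PyYAML.

```python
def to_builtin(obj):
    """Recursively replace subclasses of basic types with the base types."""
    if isinstance(obj, dict):
        return {to_builtin(k): to_builtin(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [to_builtin(item) for item in obj]
    if isinstance(obj, bool):
        return obj  # bool is an int subclass; keep it as a bool
    for base in (int, float, str):
        if isinstance(obj, base):
            return base(obj)  # e.g. IntSubclass(42) -> int 42
    return obj  # anything else is passed through unchanged


class StringSubclass(str):
    ...


class IntSubclass(int):
    ...


class DictSubclass(dict):
    ...


data = DictSubclass(name=StringSubclass("foo"), count=IntSubclass(42), ok=True)
plain = to_builtin(data)
print(plain)
```

The result contains only exact built-in types, so it could then be passed to `yaml.safe_dump`; the obvious cost is an extra copy of the whole structure on every dump.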
Yep, I've gotten nailed by this myself plenty of times. In addition to the stuff going on in native Python, IIRC the Cython-libyaml This all probably gets a little easier to manage once we merge something like #700, since it's really mostly still just about which types get registered as representers rather than wildly different behavior that would necessitate actual new Dumper subclasses. |
At least for the If we consider the main use case of the non-safe dumper to be perfect roundtrips, this dumper should probably stay as it is? I did a quick experiment: Replacing the following occurrences of pyyaml/lib/yaml/representer.py Lines 236 to 267 in 957ae4d
The downside would be slightly worse performance, because the lookup would then always have to go into the second branch, with the iteration here: pyyaml/lib/yaml/representer.py, lines 46 to 53 in 957ae4d
To get back to the performance as it is now, we could add a multi representer to the normal representers as well (i.e., adding it to both dicts in line 75) so that the logic here would again use branch line 48 in the majority of cases.
Oh I wasn't aware of that other implementation, so I'm not sure how the change suggested above would relate to that. If it could work like I suggested, may I open a PR? |
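The two-branch lookup discussed in this comment can be modelled in plain Python. This is a toy sketch, not PyYAML's code: `exact_representers`, `multi_representers`, and `represent` are illustrative names, and the returned tuples stand in for YAML nodes.

```python
# Toy model of the lookup; names here are illustrative, not PyYAML's API.
exact_representers = {}   # consulted only for the exact type of the value
multi_representers = {}   # consulted for every class in the value's MRO


def represent(data):
    mro = type(data).__mro__
    # Branch 1: a single dictionary lookup on the exact type.
    if mro[0] in exact_representers:
        return exact_representers[mro[0]](data)
    # Branch 2: walk the MRO until some base class has a multi representer.
    for base in mro:
        if base in multi_representers:
            return multi_representers[base](data)
    raise TypeError(f"cannot represent an object: {data!r}")


def represent_dict(data):
    return ("map", dict(data))


class Config(dict):
    """Hypothetical dict subclass."""


# Registering dict in both tables keeps plain dicts on the fast path,
# while subclasses are caught by the MRO walk.
exact_representers[dict] = represent_dict
multi_representers[dict] = represent_dict

plain = represent({"a": 1})    # resolved in branch 1
sub = represent(Config(a=1))   # resolved in branch 2
```

In this model a plain dict never reaches the loop, which is the performance point made above, while `Config` is resolved through its `dict` base class.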
Correct, but it's really a little more nuanced than that, especially in light of what's going on with #700 and YAML 1.2 support, since the entire concept of "safe" and "not safe" gains a few more dimensions. The problem is making sure that we stack the representers in the right order so that anything that subclasses them and/or adds further multi-representers doesn't get hidden by the fallback. #700 makes that a little more visible and explicit, but still probably not quite enough to prevent folks from being bitten by an overly "greedy" default fallback that masks something they added explicitly. Feel free to play around with some ideas and impls on a draft PR if you like, but we're probably not going to actually merge anything that changes this behavior until after the YAML 1.2 support is nailed down, since it potentially introduces some new variables to consider. Just want to set expectations 😉 |
Due to #126, I'm running pyyaml at master. When I try to run code that passes on released versions of pyyaml, it fails in master:
As the object is a subclass of dict, it would previously be treated as a dict for the purpose of serialization. Is this an intentional change?