-
-
Notifications
You must be signed in to change notification settings - Fork 805
bypass folding during dump() for js/function string literals #215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…g issues when some of the lines exceed the hard-coded max line length of 80
|
Could you add test? |
|
The dumper must know nothing about javascript literals. JS function type as any other non-core type is an optional plugin. I don't like idea of adding such hacks. This must be solved via type metadata. |
|
Also, I don't understand how folding may cause eval issues. Dumber must serialize data in the way that will be reconstructed without differences, no matter of scalar style. If it doesn't, then it's general serialization bug of folding scalars, no relation to js/function type. |
|
In my view the max line length of 80 characters being hard-coded in the dumper code without any ability for customizing that value is the real problem. But that will require more extensive changes. This simply bypasses that logic for serializing js/function strings which was already serialized properly by the Type but arbitrarily "folded" by the dumper logic to some X column width. I will provide the test scenario as requested but the issue basically happens when a serialized output from Function.toString() contains lines that exceed 80 character length constraint and the folder then arbitrarily wraps that function line to the next. Sometimes it gets lucky but other times it does strange things. Simplest example is: var some_long_string = "this is a long string that will eventually exceed the fixed 80 character length"; The fold logic will cut this line into two. JavaScript does not have multi-line string continuation so it will become invalid JavaScript. Not all scalars can be folded in same ways without it likely breaking syntax. What if you had another YAML scalar as a string in there? Anyhow I don't see how dumper can properly deal with scalar data without introspecting the type it represents. |
|
It worth checking dumped types to allow folding for strings only, because current method is not safe. For example, it can fail on long RegExp. |
|
Anyway, it worth to make test to demonstrate problem and disable it via |
|
Again. Dumper has nothing to do with types. It works with sequences of characters, and it always serializes those without any damage. Yes, for case of js/funciton there could be unpleasant styling damage, but it doesn't affect the correctness of YAML document at all. I agree that some specific types may not want to be represented using some scalar styles, but this must be specified via type metadata, not hardcoded in the dumper. Also, line length for folded scalars should be configurable. If not, then it seems I overlooked that when I was reviewing the pull request... |
|
@dervus https://github.com/nodeca/js-yaml/blob/master/lib/js-yaml/dumper.js#L259 it's hardcoded, because nobody needed to change it. I postponed api change until someone can prove that it's really needed. |
|
Hm. I see. Probably I was disagree back then. :D I personally prefer 120 line length limit for example. |
I doubt you work with yaml files :) |
|
@dervus - I agree this particular hack to explicitly exclude custom type js/function would not be the ideal resolution but it IS a custom-tag inherently supported by js-YAML currently and the one most affected by this folding logic. And most minimalistic change. Ideally, the Type definition can have a flag to either opt-in/out to allow wrap during dump. But that will require all custom-types in present and future to define that. I would go for Type to explicitly opt-in to be wrappable rather than otherwise. But the dump routine (particularly writeScalar) only has state/object/level etc and the object already contains the type.represent so only other variable at hand to use was state.tag Another option would be to allow dump API to accept custom serialization function. |
|
Representer function doesn't control how data will be serialized further. It just converts an arbitrary JS object into a YAML primitive: sequence, mapping, or scalar. |
|
@dervus closing as rejected? |
|
Whether or not this PR itself is rejected the underlying bug should be addressed. We need a way to have JavaScript functions bypass the folding logic during dump. I'd be happy to submit a different PR but would like some guidance on how we can make it work without this type of hackish way. Peter Lee
|
|
I'd create a new ticket to cleanup. Rejecting PR does not means rejecting everything - it's a technical process of keeping tracker clean, nothing else. AFAIK, you need to dump FNs as @dervus could you help? |
|
I already describe the solution twice. Since solution proposed here is unacceptable — yes, it's closed as rejected. |
|
If we create a new Type option to explicitly specify NOT to apply styling during dump, then the assumption is that by default the folding styling will be applied to other custom Types. I don't think that's the right assumption - we should expect that the proper string serialization output would be produced by the custom Type "represent" function. I agree more with Vitaly on this point that the style/folding is only safe with "string" and that's the only type that such folding/style should be applied. If we DO add a new meta option to Type, then it should be to state that the new custom Type is OK to apply folding/style, not the other way. Peter Lee
|
No. Represent function is not about styling.
For what reason? Styling does not affect data correctness. Types way be different. Doing any assumptions which styles suites types better would be wrong. For example there could be markup language custom types which prefer folded style while being non-string objects. For that reason we will never make any assumptions. Why? Because dumper must not care about presenting you data pretty. It only care about presenting it correct. Not correct in terms of any custom types, but it terms of YAML. Because we're writing YAML document, not JavaScript, not Python, not Brainfuck, not whatever else. |
|
Let me ask another way. Why are we applying styling in "dump" for YAML to begin with? What is the reason for folding into some X size column? For readability? But readability should NOT affect the ability to parse it back to the correct form. If the dump "styling" affects ability to chain output of dump into parse and NOT get the original back, then it IS affecting data correctness. As you say, yes, it produces "correct" YAML output but nevertheless it still breaks js-yaml parse so if anything I would say don't fold data for style reasons at all because it creates an unnecessary problem. If users want dump to apply "style" on the output, they should pass that in as an option request when calling dump and the routine should not assume to apply style as a default. As you've stated, we should NOT assume that users calling dump wants any styling to begin with. Peter Lee
|
|
I've created #217. It seems enougth to restrict folding to strings only. No need to overcomplicate things. |
|
@puzrin, thanks for getting this issue resolved. This will allow me to unbundle js-yaml dep inside my module and get back to life. :-) |
Rewrote the scalar dumper with an emphasis on rigor, based on a careful reading of the YAML 1.2 spec and testing against LibYAML. Corrections: Block style chomping accounts for added the extra newline added by subsequent elements or at the end. Fixes nodeca#238. Top-level block scalars are indented. Fixes dumper side of nodeca#221. Folded style should be reliable now. The old dumper was not aware of the subtle rules for more-indented lines. More-indented lines no longer cause extra newlines to be added. (The old dumper also had an off-by-one error that dropped the last character from a long first word on a line.) Fixes nodeca#215, nodeca#222. Removes nodeca#217 workaround. Literal and folded styles can represent arbitrary "printable" strings, including leading/trailing whitespace. NB. For the 'construct-string-types' dumper test, the strings are dumped correctly. The test fails because a loader bug loses a newline when using the keep "+" chomping indicator. Additional: Added test suite covering several former bugs, new edge cases, and new expected behaviors. All previous applicable tests are included. Styles are chosen in a well-defined order. Plain and single are preferred for single lines under the width limit. Folded is preferred when a line is longer than the width limit (if enabled) and does not begin with a space. Literal is used for all other multi-line "printable" strings. Double-quoted is only used when a string is unrepresentable otherwise. Plain style is relaxed to include more characters and strings. Progress on nodeca#183. As indent increases, line width decreases down to min(lineWidth, 40). Setting lineWidth to -1 disables wrapping. Previously there was unexpected behavior past 40 indent: narrow lineWidth suddenly jumped to 40, and -1 (no limit) suddenly enforced a limit of 40.
fixes an issue where javascript string literals get folded and causing eval issues when some of the lines exceed the hard-coded max line length of 80