Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41887: omit leading whitespaces on ast.literal_eval #22469

Merged
merged 5 commits into from
Oct 4, 2020

Conversation

isidentical
Copy link
Member

@isidentical isidentical commented Sep 30, 2020

@pablogsal
Copy link
Member

I am curious if there is a performance impact due to the need to create a substring. This may be non trivial for large springs...

@gvanrossum
Copy link
Member

Only if there's whitespace, and even then I doubt that you can even measure the time -- surely the parser takes much more time looking at the characters one at a time than copying a string, even if the input is just a single long string literal (note that this would construct another copy for sure).

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually discovered something a little disturbing. Some forms of whitespace cause syntax errors -- trailing for ast.literal_eval(), and both leading and trailing for eval()`. Example:

>>> eval("\n 12+12")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    12+12
IndentationError: unexpected indent
>>> ast.literal_eval("123\n ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 2
    
SyntaxError: unexpected EOF while parsing
>>> 

@isidentical
Copy link
Member Author

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, another comment

@gvanrossum
Copy link
Member

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

No, but I would not want to mislead in the docs. Can we make literal_eval use the same algorithm as eval?

@iritkatriel
Copy link
Member

Can we make literal_eval use the same algorithm as eval?

Re differences between eval and literal_eval, that came up recently here as well: https://bugs.python.org/issue17490 .

The documentation says that literal_eval only understands "strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None." But the reality is that it doesn't understand all python numbers.

@gvanrossum
Copy link
Member

gvanrossum commented Oct 1, 2020

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

@iritkatriel
Copy link
Member

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

No, I can't. I was referring to an expression. Point taken about this being off topic here.

@gvanrossum
Copy link
Member

Hm... There's a slight problem with the way this currently works. It strips leading spaces and tabs, but no other whitespace characters. That's the same as eval(), but it's not what's usually considered "whitespace" -- that refers to a larger category of characters, including newline, FF, and a host of weird Unicode things.

I propose to just change the docs to say "strips leading spaces and tabs" everywhere.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, I'll apply myself.

@gvanrossum
Copy link
Member

Will merge when the tests pass.

@gvanrossum gvanrossum merged commit e799aa8 into python:master Oct 4, 2020
@bedevere-bot
Copy link

@gvanrossum: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 4, 2020
Also document that eval() does this (the same way).
(cherry picked from commit e799aa8)

Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
@bedevere-bot
Copy link

GH-22533 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Oct 4, 2020
@miss-islington
Copy link
Contributor

Sorry, @isidentical and @gvanrossum, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker e799aa8b92c195735f379940acd9925961ad04ec 3.8

xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
Also document that eval() does this (the same way).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants