
Regression in 3.12 beta in json.dump deeply nested dict #107263

Closed
hauntsaninja opened this issue Jul 25, 2023 · 18 comments
Labels: 3.12 (bugs and security fixes) · 3.13 (bugs and security fixes) · release-blocker · type-bug (an unexpected behavior, bug, or error)

Comments

@hauntsaninja (Contributor) commented Jul 25, 2023

This was reported here: https://discuss.python.org/t/has-sys-setrecursionlimit-behaviour-changed-in-python-3-12b/30205

The following program works fine on 3.11, but fails with RecursionError on 3.12:

d = {}
for x in range(1_000):
    d = {'k': d}

import json, sys
sys.setrecursionlimit(100_000_000)
foo = json.dumps(d)

I confirmed this bisects to #96510.


hauntsaninja added the type-bug label on Jul 25, 2023
@hauntsaninja (Contributor, Author)

cc @markshannon

AlexWaygood added the 3.12 and 3.13 labels on Jul 25, 2023
@sunmy2019 (Member)

Related #105003

@hauntsaninja (Contributor, Author) commented Jul 26, 2023

Also cc @Yhg1s out of caution, since we're close to RC.

At a minimum, I'm guessing we need a What's New entry.

@hauntsaninja (Contributor, Author) commented Jul 26, 2023

The relevant limit is the C stack depth, so depending on your build, you don't even need to touch sys.setrecursionlimit() to see a regression. The following works on 3.11, but fails on 3.12:

d = {}
for x in range(810):
    d = {'k': d}

import json
foo = json.dumps(d)
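
For illustration (a probe of my own, not part of the original report), one can bisect for the deepest nesting a given build will encode; this assumes failure is monotonic in depth:

import json

def max_json_depth(hi=10_000):
    # Binary-search for the deepest nesting json.dumps still encodes cleanly.
    lo = 1
    while lo < hi:
        mid = (lo + hi) // 2
        d = {}
        for _ in range(mid):
            d = {'k': d}
        try:
            json.dumps(d)
            lo = mid + 1
        except RecursionError:
            hi = mid
    return lo - 1

print(max_json_depth())  # noticeably lower on 3.12.0b than on 3.11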

@sunmy2019 (Member)

At a minimum, I'm guessing we need a What's New entry.

I am in favor of adding a new API exposing this. Nothing else would fundamentally fix this.

@markshannon markshannon self-assigned this Jul 26, 2023
@markshannon (Member)

If we change the original example to:

d = {}
for x in range(100_000):
    d = {'k': d}

import json, sys
sys.setrecursionlimit(100_000_000)
foo = json.dumps(d)

It segfaults on 3.11.
I would argue that not segfaulting is not a regression, but an improvement.

Here is what we can do to lessen the impact of this change:

  • Leave things as they are.
  • Tweak the C recursion limit so that _PyEval_EvalFrameDefault() consumes more units than other functions, so that other C code can recurse at least 1000 deep. I'd up the limit to ~2200, making _PyEval_EvalFrameDefault() consume 3 units and everything else 2. (A rough model is sketched at the end of this comment.)
  • Allow the C recursion limit to be changed, which I don't think is a good idea for safety reasons.

Whatever we do, I'll add a section to the "what's new".
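
A rough Python model of the second option above (an illustration only; the real counter lives in CPython's C code, per thread, not in Python):

C_RECURSION_LIMIT = 2200  # the hypothetical budget from the second option

class CStackBudget:
    # Model only: calls draw differently sized "units" from one shared budget.
    def __init__(self, limit=C_RECURSION_LIMIT):
        self.remaining = limit

    def enter(self, units):
        self.remaining -= units
        if self.remaining < 0:
            self.remaining += units
            raise RecursionError("C stack budget exhausted")

    def leave(self, units):
        self.remaining += units

budget = CStackBudget()
budget.enter(3)  # a _PyEval_EvalFrameDefault() call would cost 3 units
budget.enter(2)  # any other recursing C function would cost 2 units
budget.leave(2)
budget.leave(3)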

@markshannon (Member)

Something else we could do is to change the C recursion limit from 800 to 1500 on optimized builds, where the C frames will be smaller.

A few tests will need changing to handle the differing recursion limits, but that's not a bad thing as it should make the tests more general.

@sunmy2019 (Member)

Allow the C recursion limit to be changed, which I don't think is a good idea for safety reasons.

Can you elaborate on that?

It segfaults on 3.11.

People can easily increase their stack size on Unix-like systems with ulimit.

Since the OS provides this ability, why would Python forbid it to users with a real need?

Other virtual machines, such as V8 (the JavaScript engine), can increase their stack size on Unix-like systems (though not on Windows).

Increasing the stack size/recursion limit is a real need for some users, and this limit will be a problem without an easy workaround. It breaks users' code with no easy migration path.
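
For context, a sketch using threading.stack_size(), which lets pure Python request a larger C stack for new threads; note this is not a fix on 3.12, where the C recursion limit is a fixed count independent of the actual stack size:

import sys, threading, json

def deep_dump():
    sys.setrecursionlimit(100_000)
    d = {}
    for _ in range(10_000):
        d = {'k': d}
    try:
        print(len(json.dumps(d)))   # succeeds on 3.11 with this much stack
    except RecursionError as exc:
        print("still fails on 3.12:", exc)

threading.stack_size(64 * 1024 * 1024)  # applies to threads created afterwards
t = threading.Thread(target=deep_dump)
t.start()
t.join()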

@lelit (Contributor) commented Jul 26, 2023

Here is what we can do to lessen the impact of this change:

* Leave things as they are.

I'd find that surprising; at the very least, it would require changing the sys.setrecursionlimit() docs to say that it's now a no-op. Consider:

Python 3.12.0b3 (main, Jun 19 2023, 18:56:29) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def data(n):
...   d = {}
...   for x in range(n):
...     d = {'k': d}
...   return d
... 
>>> import json
>>> foo = json.dumps(data(798))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nix/store/arb...-python3-3.12.0b3/lib/python3.12/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/arb...-python3-3.12.0b3/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/arb...-python3-3.12.0b3/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while encoding a JSON object
>>> foo = json.dumps(data(797))
>>> import sys
>>> sys.getrecursionlimit()
1000
>>> sys.setrecursionlimit(500)
>>> foo = json.dumps(data(797))
>>> 

@gpshead (Member) commented Jul 26, 2023

Alternative take: The json module shouldn't be using recursion. (Which is out of scope as an rc1 release blocker resolution)
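
To illustrate the idea, a minimal sketch of an explicit-stack encoder for dicts, lists, and scalars (my own sketch, not the json module's actual implementation):

import json

def dumps_iterative(obj):
    # Encode nested containers with an explicit stack instead of recursion.
    # 'lit' entries are literal output fragments; 'obj' entries still need encoding.
    out = []
    stack = [('obj', obj)]
    while stack:
        kind, item = stack.pop()
        if kind == 'lit':
            out.append(item)
        elif isinstance(item, dict):
            if not item:
                out.append('{}')
                continue
            stack.append(('lit', '}'))
            entries = list(item.items())
            for i in range(len(entries) - 1, -1, -1):
                key, value = entries[i]
                stack.append(('obj', value))
                prefix = '{' if i == 0 else ', '
                stack.append(('lit', prefix + json.dumps(key) + ': '))
        elif isinstance(item, list):
            if not item:
                out.append('[]')
                continue
            stack.append(('lit', ']'))
            for i in range(len(item) - 1, -1, -1):
                stack.append(('obj', item[i]))
                stack.append(('lit', '[' if i == 0 else ', '))
        else:
            out.append(json.dumps(item))  # strings, numbers, bools, None
    return ''.join(out)

# Handles nesting far beyond any recursion limit:
d = {}
for _ in range(100_000):
    d = {'k': d}
assert dumps_iterative(d) == '{"k": ' * 100_000 + '{}' + '}' * 100_000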

@gpshead (Member) commented Jul 26, 2023

People can easily increase their stack size on Unix-like systems

Don't presume this. System, policy, and container limits can constrain the maximum ulimit. And people dealing with untrusted data, as is common when encoding or decoding JSON, generally do not have test cases to determine what limits they require to avoid crashing. People reasonably expect that Python will never crash (segfault) the process due to a stack overflow.

Another lack of control: threads are frequently spawned by extension modules or embedding languages (C/C++/Rust), where the C stack size is configured by that code at thread creation time and can be a lot smaller than the default main-thread ulimit stack size, for a variety of valid reasons. This can be pretty far removed from the control of the Python application user.

@sunmy2019 (Member)

And people dealing with untrusted data, as is common when encoding or decoding JSON, generally do not have test cases to determine what limits they require to avoid crashing. People reasonably expect that Python will never crash (segfault) the process due to a stack overflow.

I am not asking to remove this limit. I think it should be made configurable, at least for professional users. Only people with professional knowledge would change it, which means they know what they are doing and will expect a segfault if they set it incorrectly.

Deep recursion is not an error if you have a large stack. Conversely, if you have a very small stack, even the current limit can still segfault.

Currently, the only workaround is editing the CPython source and recompiling. That's too much even for professional users.

@gpshead (Member) commented Jul 27, 2023

I am not asking to remove this limit. I think it should be made configurable at least for professional users.

Yep, understood. I do think that's a stack / recursion limit related feature request on its own that we should track and decide in its own issue. (feel free to file one)

@sunmy2019 (Member)

a stack / recursion limit related feature request on its own

I tend to think it's a bug fix.

If that's treated as a feature request and accepted, we could run such programs on 3.10, 3.11, and 3.13, but not on 3.12, which would be very weird.

markshannon added a commit that referenced this issue Aug 4, 2023
GH-107263: Increase C stack limit for most functions, except `_PyEval_EvalFrameDefault()` (GH-107535)

* Set C recursion limit to 1500, set cost of eval loop to 2 frames, and compiler multiply to 2.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 4, 2023
GH-107263: Increase C stack limit for most functions, except `_PyEval_EvalFrameDefault()` (pythonGH-107535)

* Set C recursion limit to 1500, set cost of eval loop to 2 frames, and compiler multiply to 2.
(cherry picked from commit fa45958)

Co-authored-by: Mark Shannon <mark@hotpy.org>
Yhg1s pushed a commit that referenced this issue Aug 4, 2023
GH-107263: Increase C stack limit for most functions, except `_PyEval_EvalFrameDefault()` (GH-107535) (#107618)

* Set C recursion limit to 1500, set cost of eval loop to 2 frames, and compiler multiply to 2.
(cherry picked from commit fa45958)

Co-authored-by: Mark Shannon <mark@hotpy.org>
@Eclips4 (Member) commented Aug 11, 2023

Is there anything left to do?

@Yhg1s (Member) commented Sep 5, 2023

Assuming this is all fixed now.

@dimpase (Contributor) commented Nov 18, 2023

This has broken the basic ability of functools.cache to speed up recursive computations; see #112215. Take the classic Fibonacci sequence:

# fib.py
import sys
sys.setrecursionlimit(2000)

from functools import cache

@cache
def fib(n):
    if n < 1: return 0
    if n == 1: return 1
    return fib(n - 1) + fib(n - 2)

print(fib(500))

Now:

$ time python3.11 <fib.py 
139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125

real	0m0.125s
user	0m0.092s
sys	0m0.034s

is quite quick (removing @cache produces a HUGE slowdown). But, as is easy to find out (I did it in #112215), #96510 introduced the regression:

$ time python3.12 <fib.py 
Traceback (most recent call last):
  File "<stdin>", line 12, in <module>
  File "<stdin>", line 10, in fib
  File "<stdin>", line 10, in fib
  File "<stdin>", line 10, in fib
  [Previous line repeated 496 more times]
RecursionError: maximum recursion depth exceeded

real	0m0.133s
user	0m0.099s
sys	0m0.034s

By the way, increasing ulimit -s (from 8 MB to 64 MB, which is more than enough here) has no effect.
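
A workaround sketch for this particular pattern (assuming the call sites are under your control): warm the cache bottom-up so that no single call recurses deeply:

from functools import cache

@cache
def fib(n):
    if n < 1: return 0
    if n == 1: return 1
    return fib(n - 1) + fib(n - 2)

# Filling the cache from the bottom keeps every call shallow, so this runs
# even under 3.12's fixed C recursion limit:
for i in range(501):
    fib(i)
print(fib(500))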

@dimpase (Contributor) commented Nov 18, 2023

Assuming this is all fixed now.

You still need to fix functools.cache and friends to remain useful for anything mildly serious; see my comment above.
