Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-125997: ensure that time.sleep(0) is not delayed on non-Windows platforms #128274

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Dec 26, 2024

This is a suggestion for fixing the issue when the timeout is 0. For other timeouts, this does not change the current behaviour (I'm tempted removing the do-while loop as we should never have an issue, but I'm not familar with the monotonic C implementation so I left it as is).

On non-Windows platforms, this reverts the usage of `clock_nanosleep`
and `nanosleep` introduced by 85a4748 and 7834ff2 respectively,
falling back to a `select(2)` alternative instead.
Modules/timemodule.c Show resolved Hide resolved
Modules/timemodule.c Outdated Show resolved Hide resolved
@picnixz picnixz force-pushed the perf/time/zero-125997 branch from ff959e1 to 80de853 Compare December 27, 2024 11:05
@rruuaanng
Copy link
Contributor

@picnixz Unfortunately, we should seriously review :(. It seems that an assertion failed somewhere.

Assertion failed: (errno == 0), function pysleep, file timemodule.c, line 2220.
Fatal Python error: Aborted

Thread 0x0000000173bbb000 (most recent call first):
  File "/Users/admin/actions-runner/_work/cpython/cpython/Lib/subprocess.py", line 1918 in _execute_child
...

@picnixz
Copy link
Contributor Author

picnixz commented Dec 27, 2024

Ah I think I know what happens. When we raise an OSError from errno we don't clear the errno. So if you chain calls, you need to just ignore them. I'm removing the assertion cc @ZeroIntensity

Due to how `OSError` are raised from `errno`, we do not clear `errno`
afterwards. If we catch `OSError`, then we still have an errno set,
and if we call `time.sleep()` just after, we may have `errno != 0`
(but we know we handled it so it's fine).
@rruuaanng
Copy link
Contributor

If pytime is running on multiple cores, I recommend using _LIBC_REENTRANT, which will make your errno independent in each thread.

raw defined

#  if !defined _LIBC || defined _LIBC_REENTRANT
/* When using threads, errno is a per-thread value.  */
#   define errno (*__errno_location ())
#  endif
# endif /* !__ASSEMBLER__ */
#endif /* _ERRNO_H */

@picnixz
Copy link
Contributor Author

picnixz commented Dec 27, 2024

This wouldn't solve the issue (also because _LIBC_REENTRANT is a private macro and something you supply when configuring libc). By the way, errno is thread-safe (see https://linux.die.net/man/3/errno). The issue is simply that we can raise an OSError from errno, which doesn't clear errno, and do something else afterwards. I will keep the current behaviour as it was done beforehand.

@ZeroIntensity
Copy link
Member

TIL! Should we assert !PyErr_Occurred() instead?

@picnixz
Copy link
Contributor Author

picnixz commented Dec 27, 2024

TIL! Should we assert !PyErr_Occurred() instead?

Not sure. It's not really an issue and the function is internal and only for time.sleep(). It's only used once (it's just that time.sleep() calls this function). Though it wouldn't hurt (if this assertion fails, then there is a separate issue)

@ZeroIntensity
Copy link
Member

I think we should. There's no real cost to adding it, and it would be helpful for debugging in the rare case that something goes wrong.

@picnixz picnixz added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Dec 28, 2024
Copy link
Contributor

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linking #28111 for other reviewers

I'd vote we can get rid of the do-while, I think it would only matter if monotonic is not monotonic

I'd also vote we don't backport. This is basically a performance optimisation (which we don't typically backport) for a regression that has only been reported once in two years and that has an easy workaround.

struct timeval zero = {0, 0};
Py_BEGIN_ALLOW_THREADS
ret = select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &zero);
err = errno;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we can get rid of err, seems like an artifact of the version of this code that has a bunch of #ifdef

@picnixz
Copy link
Contributor Author

picnixz commented Dec 28, 2024

I'd vote we can get rid of the do-while, I think it would only matter if monotonic is not monotonic

I wanted to be sure that "PyTime_Monotonic() != -1" means that "monotonic > deadline" but if this is guaranteed, then I'd be happy to simplify the code.

I'd also vote we don't backport. This is basically a performance optimisation (which we don't typically backport) for a regression that has only been reported once in two years and that has an easy workaround.

I'm also fine with this (I just treated it as a regular bug fix but we can treat it as a performance feature). The workaround you're talking about is to use poll.poll(0) directly, right?

@picnixz picnixz removed needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Dec 28, 2024
@hauntsaninja
Copy link
Contributor

Yes, using poll.poll directly or maybe using os.sched_yield (on 3.11.1 and newer)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants