Executing code in thread or process pools: run_in_executor example #85299

Closed
aaraney mannequin opened this issue Jun 26, 2020 · 14 comments
Labels
3.8 (EOL) end of life docs Documentation in the Doc dir

Comments

@aaraney
Mannequin

aaraney mannequin commented Jun 26, 2020

BPO 41127
Nosy @aaraney

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-06-26.15:24:56.727>
labels = ['3.8', 'docs']
title = 'Executing code in thread or process pools: run_in_executor example'
updated_at = <Date 2020-06-26.15:24:56.727>
user = 'https://github.com/aaraney'

bugs.python.org fields:

activity = <Date 2020-06-26.15:24:56.727>
actor = 'aaraney'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2020-06-26.15:24:56.727>
creator = 'aaraney'
dependencies = []
files = []
hgrepos = []
issue_num = 41127
keywords = []
message_count = 1.0
messages = ['372428']
nosy_count = 2.0
nosy_names = ['docs@python', 'aaraney']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue41127'
versions = ['Python 3.8']

@aaraney
Mannequin Author

aaraney mannequin commented Jun 26, 2020

I found an issue with the concurrent.futures.ProcessPoolExecutor() example (#3) in the asyncio event loop documentation. The call to asyncio.run(main()) should be guarded by `if __name__ == "__main__":`; as it stands, a RuntimeError is raised.

https://docs.python.org/3/library/asyncio-eventloop.html#executing-code-in-thread-or-process-pools

@aaraney aaraney mannequin added the 3.8 (EOL) end of life label Jun 26, 2020
@aaraney aaraney mannequin assigned docspython Jun 26, 2020
@aaraney aaraney mannequin added docs Documentation in the Doc dir 3.8 (EOL) end of life labels Jun 26, 2020
@aaraney aaraney mannequin added the docs Documentation in the Doc dir label Jun 26, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@slateny
Contributor

slateny commented Apr 22, 2022

Not getting errors on my end for python 3.8 and 3.10, could you reconfirm the error?

@AlexWaygood
Member

@slateny FYI I believe, due to some oddities of the github migration, some "mannequin" users won't necessarily receive notifications on issues unless you actually @ them, even if they show up as having previously participated in the thread. (You can safely assume that non-mannequin users are still subscribed to all the issues they were "nosy" to on BPO.)

@slateny
Contributor

slateny commented Apr 22, 2022

Not getting errors on my end for python 3.8 and 3.10, could you reconfirm the error?

@aaraney ^ I know it's been a little while, but if you have time could you take another look?

@aaraney

aaraney commented Apr 22, 2022

Certainly, thanks for pinging me, @slateny. Admittedly, the original bug report is not very descriptive, so I'll try to clarify the issue.

In the Python 3.8 asyncio event loop documentation, the provided executor example (see below) raises a RuntimeError if you copy and paste it into a file and run it. It appears the documentation for 3.9 and 3.10 also has this issue. This is because the asyncio.run(main()) statement at the bottom of the example is not guarded by if __name__ == "__main__":.

import asyncio
import concurrent.futures

def blocking_io():
    # File operations (such as logging) can block the
    # event loop: run them in a thread pool.
    with open('/dev/urandom', 'rb') as f:
        return f.read(100)

def cpu_bound():
    # CPU-bound operations will block the event loop:
    # in general it is preferable to run them in a
    # process pool.
    return sum(i * i for i in range(10 ** 7))

async def main():
    loop = asyncio.get_running_loop()

    ## Options:

    # 1. Run in the default loop's executor:
    result = await loop.run_in_executor(
        None, blocking_io)
    print('default thread pool', result)

    # 2. Run in a custom thread pool:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, blocking_io)
        print('custom thread pool', result)

    # 3. Run in a custom process pool:
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, cpu_bound)
        print('custom process pool', result)

asyncio.run(main())

The example should be updated to the following:

import asyncio
import concurrent.futures

def blocking_io():
    # File operations (such as logging) can block the
    # event loop: run them in a thread pool.
    with open('/dev/urandom', 'rb') as f:
        return f.read(100)

def cpu_bound():
    # CPU-bound operations will block the event loop:
    # in general it is preferable to run them in a
    # process pool.
    return sum(i * i for i in range(10 ** 7))

async def main():
    loop = asyncio.get_running_loop()

    ## Options:

    # 1. Run in the default loop's executor:
    result = await loop.run_in_executor(
        None, blocking_io)
    print('default thread pool', result)

    # 2. Run in a custom thread pool:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, blocking_io)
        print('custom thread pool', result)

    # 3. Run in a custom process pool:
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, cpu_bound)
        print('custom process pool', result)

if __name__ == "__main__":
    asyncio.run(main())

@slateny
Contributor

slateny commented Apr 23, 2022

Thanks for the clarification - I tried the code and it seems to run for me without error. Are you running this from another file or is this the only file?

@aaraney

aaraney commented Apr 25, 2022

I'm running it as a standalone script, so not from IDLE.

@slateny
Contributor

slateny commented Apr 26, 2022

Hmm interesting, I ran it on ubuntu 20 w/ python 3.8 and 3.10 and no errors for me - maybe a version/platform thing?

@aaraney

aaraney commented Apr 26, 2022

Yeah, that is pretty strange. I'm AFK, but from memory I reproduced it on 3.8.10 sourced from miniconda on OSX. I'll see if I can reproduce it in a container tomorrow and put together a repository and share it here if I can.

@aaraney

aaraney commented Apr 26, 2022

So, I was not able to reproduce the issue in a Docker container. I used the following base images in an attempt to be more thorough: python:3.8-slim-buster, python:3.8.10-slim-buster, and continuumio/miniconda3:4.11.0. However, I am beginning to think this is a platform issue, not a Python version issue. More specifically, it has something to do with the spawn start method being used rather than fork. This default was changed in 3.8 for OSX.

Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.

Instead of droning on about why I think that is the case, I threw together a repo that uses GitHub Actions and a macOS 11 runner. Below I've included the GitHub Actions YAML for convenience. I was able to reproduce the issue using GitHub Actions. Please find the failing job here.

name: Run Unit Tests

on: push

jobs:
  cpython_85299:

    runs-on: macos-11

    steps:
    - uses: actions/checkout@v2
    - name: Set up Python 3.8
      uses: actions/setup-python@v2
      with:
        python-version: '3.8'
    - name: Reproduce Runtime Error
      run: |
        python3 main.py
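
The platform difference described above can be checked directly. The following is a minimal sketch, not part of the original report: `needs_main_guard` is a hypothetical helper name, and it assumes, following the multiprocessing programming guidelines, that the entry-point guard matters for the spawn and forkserver start methods but not for fork.

```python
import multiprocessing
import sys


def needs_main_guard(start_method):
    """Return True for start methods that need the entry point guarded.

    Hypothetical helper: assumes only "spawn" and "forkserver" require
    the main module to be safely importable by child processes.
    """
    return start_method in ("spawn", "forkserver")


if __name__ == "__main__":
    # Report the default start method for this interpreter and platform.
    method = multiprocessing.get_start_method()
    print(f"platform={sys.platform} start_method={method} "
          f"guard_needed={needs_main_guard(method)}")
```

Running this on the macOS runner and in the Linux containers should make it visible whether the two environments differ in their default start method.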

@slateny
Contributor

slateny commented Apr 30, 2022

Hmm, if it's a platform issue then do you think that the docs still need updating? If the file's the only one being run, then if __name__ == "__main__": seems equivalent to leaving it out
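
Whether the guard is equivalent to leaving it out depends on the start method. As a rough illustration, and as an assumption about CPython's implementation rather than documented API: under spawn, each child re-runs the main script under the module name `__mp_main__`, so guarded code is skipped there while unguarded top-level code runs again. The sketch below simulates that re-import with `runpy.run_path` instead of starting a real process; `SCRIPT` and `entry_point_runs` are hypothetical names.

```python
import os
import runpy
import tempfile

# A tiny stand-in for a script whose entry point is guarded.
SCRIPT = '''
ran_entry_point = False
if __name__ == "__main__":
    ran_entry_point = True
'''


def entry_point_runs(run_name):
    """Run SCRIPT under the given module name and report if the guard fired."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_main.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # run_path returns the globals of the executed script.
        namespace = runpy.run_path(path, run_name=run_name)
    return namespace["ran_entry_point"]


if __name__ == "__main__":
    print("as __main__:   ", entry_point_runs("__main__"))
    print("as __mp_main__:", entry_point_runs("__mp_main__"))
```

When the script is the only file and the start method is fork, the child never re-imports it, which may be why the guard looks redundant on Linux.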

@aaraney

aaraney commented May 2, 2022

Yeah, I think that is a fair point, @slateny. I've not seen it stated in the Python docs that examples are runnable as their own script, but I think that is kind of implicit to an example snippet. I would prefer that if __name__ == "__main__": is included where examples are expected to be run in a standalone context.

@slateny
Contributor

slateny commented May 14, 2022

The third example uses ProcessPoolExecutor, defined here, which then seems to get the start method, which as you mentioned uses spawn since 3.8. A bit down below, there is a note about protecting the program's entry point via if __name__ == '__main__' if using the spawn method.

Taking all that into account, I think the doc change should be instead to add a note/link to the warning for macOS, advising the entrypoint guard as needed, instead of adding it directly into the example as there might be confusion on why the guard's there when it may not be necessary.
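
For readers on Linux, the macOS behavior can be approximated by passing an explicit spawn context to the executor through its `mp_context` parameter. This is a sketch under that assumption, trimmed to the third option of the documentation example; the `n` parameter on `cpu_bound` and the `max_workers=1` setting are additions for brevity. Removing the guard at the bottom is expected to bring back the RuntimeError discussed in this thread.

```python
import asyncio
import concurrent.futures
import multiprocessing


def cpu_bound(n=10 ** 6):
    # CPU-bound work; must live at module level so that a spawned
    # child can find it when it re-imports the main module.
    return sum(i * i for i in range(n))


async def main():
    loop = asyncio.get_running_loop()

    # Force the spawn start method regardless of the platform default.
    spawn_ctx = multiprocessing.get_context("spawn")
    with concurrent.futures.ProcessPoolExecutor(
            max_workers=1, mp_context=spawn_ctx) as pool:
        result = await loop.run_in_executor(pool, cpu_bound)
        print('custom process pool (spawn)', result)


# Without this guard, a spawned child would run asyncio.run(main())
# again while it is still importing the main module.
if __name__ == "__main__":
    asyncio.run(main())
```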

@gvanrossum
Member

gvanrossum commented Oct 13, 2022

The problem seems to be with how the OP is running the code, not with the example. Nevertheless it is fine to add a __main__ check to the example, since that is customary for all main programs. No notes about platforms though.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 16, 2022
…example (pythonGH-93457)

(cherry picked from commit 79fd6cc)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
miss-islington added a commit that referenced this issue Oct 16, 2022
GH-93457)

(cherry picked from commit 79fd6cc)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
@slateny slateny closed this as completed Oct 16, 2022
carljm added a commit to carljm/cpython that referenced this issue Oct 17, 2022
* main: (31 commits)
  pythongh-95913: Move subinterpreter exper removal to 3.11 WhatsNew (pythonGH-98345)
  pythongh-95914: Add What's New item describing PEP 670 changes (python#98315)
  Remove unused arrange_output_buffer function from zlibmodule.c. (pythonGH-98358)
  pythongh-98174: Handle EPROTOTYPE under macOS in test_sendfile_fallback_close_peer_in_the_middle_of_receiving (python#98316)
  pythonGH-98327: Reduce scope of catch_warnings() in _make_subprocess_transport (python#98333)
  pythongh-93691: Compiler's code-gen passes location around instead of holding it on the global compiler state (pythonGH-98001)
  pythongh-97669: Create Tools/build/ directory (python#97963)
  pythongh-95534: Improve gzip reading speed by 10% (python#97664)
  pythongh-95913: Forward-port int/str security change to 3.11 What's New in main (python#98344)
  pythonGH-91415: Mention alphabetical sort ordering in the Sorting HOWTO (pythonGH-98336)
  pythongh-97930: Merge with importlib_resources 5.9 (pythonGH-97929)
  pythongh-85525: Remove extra row in doc (python#98337)
  pythongh-85299: Add note warning about entry point guard for asyncio example (python#93457)
  pythongh-97527: IDLE - fix buggy macosx patch (python#98313)
  pythongh-98307: Add docstring and documentation for SysLogHandler.createSocket (pythonGH-98319)
  pythongh-94808: Cover `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule` (python#98158)
  pythonGH-94597: Deprecate child watcher getters and setters (python#98215)
  pythongh-98254: Include stdlib module names in error messages for NameErrors (python#98255)
  Improve speed. Reduce auxiliary memory to 16.6% of the main array. (pythonGH-98294)
  [doc] Update logging cookbook with an example of custom handling of levels. (pythonGH-98290)
  ...
pablogsal pushed a commit that referenced this issue Oct 22, 2022
GH-93457)

(cherry picked from commit 79fd6cc)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>