
[Fixed] [PBKAC] Requests on dropped connections are always fulfilled #2783

Closed · asfer opened this issue Mar 1, 2018 · 8 comments

@asfer commented Mar 1, 2018

Long story short

Requests to the web.Application are answered even if the clients have already dropped their connections before the web handlers are started.

This matters because, under load, resources are spent serving closed connections instead of the ones that are still open.

Expected behaviour

I would expect web handlers not to be executed if their connections have already been dropped.

Note that I am not talking about connections that are dropped while the web handlers are running, as discussed in #2098.

Actual behaviour

Web handlers are started for already dropped connections.

Steps to reproduce

  1. Run a load test with a simple server that simulates some synchronous work like
wrk -d1m -c100 -t10 http://127.0.0.1:3001
Running 1m test @ http://127.0.0.1:3001
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.08s   573.01ms   1.84s    66.67%
    Req/Sec     0.82      0.77     3.00     59.39%
  197 requests in 1.00m, 33.86KB read
  Socket errors: connect 0, read 0, write 0, timeout 191
Requests/sec:      3.28
Transfer/sec:     577.13B

against this server:

import logging
import time
from aiohttp import web
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s')
i = 0
async def index(_):
    global i; i += 1; print(i)
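    # blocking sleep to simulate synchronous work in the handler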
    time.sleep(0.3)
    return web.json_response({'success': True})

app = web.Application()
app.router.add_get('/', index)
web.run_app(app, host='0.0.0.0', port=3001)

and observe that most of the requests time out (191 of 197) and that the server continues to work past the end of the load test, fulfilling a total of about 400 requests.

  2. Alternatively, you can increase the sleep time (say, 30 seconds), open a telnet session and send GET / HTTP/1.1, and while this request is waiting, open another telnet session with a similar request and close it before the first request completes. You will see that the second request is still fulfilled and the web handler is executed twice.

Your environment

aiohttp version: 1fd798b (HEAD -> master)
on macOS 10.13.3 with Python 3.6.4 from Homebrew

@asvetlov (Member) commented Mar 1, 2018

Your test snippet is incorrect because of the blocking time.sleep() usage.
Replace it with await asyncio.sleep(); otherwise aiohttp has no chance to be informed about the connection closing.
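
For reference, a minimal sketch of that suggestion applied to the repro handler (illustrative; only the sleep changes, the rest of the snippet above stays the same):

import asyncio
from aiohttp import web

async def index(_):
    # Non-blocking sleep: the event loop stays free to service other sockets
    # and to notice dropped connections while this handler waits.
    await asyncio.sleep(0.3)
    return web.json_response({'success': True})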

@pfreixes (Contributor) commented Mar 1, 2018

IMHO the req/sec reached is pretty well aligned with the CPU cost of the handler; it matches almost 100%.

@asfer (Author) commented Mar 1, 2018

If I replace the time.sleep with some other work that takes about the same time, like:

time python -c 'import numpy as np; np.dot(np.random.rand(1500,1500), np.random.rand(1500,1500))'

real	0m0.328s
user	0m0.584s
sys	0m0.070s

used in this server:

import logging
import numpy as np
from aiohttp import web
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s')
i = 0

async def index(_):
    global i; i += 1; print(i)
    np.dot(np.random.rand(1500, 1500), np.random.rand(1500, 1500))
    return web.json_response({'success': True})

app = web.Application()
app.router.add_get('/', index)
web.run_app(app, host='0.0.0.0', port=3001)

the load test now gives:

$ wrk -d1m -c100 -t10 http://127.0.0.1:3001
Running 1m test @ http://127.0.0.1:3001
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.04s   570.61ms   1.95s    57.14%
    Req/Sec     1.93      1.37    10.00     81.55%
  412 requests in 1.00m, 70.81KB read
  Socket errors: connect 0, read 0, write 0, timeout 398
Requests/sec:      6.86
Transfer/sec:      1.18KB

I still get 510 requests fulfilled. If the number of connections/threads in the load test is increased, the number of requests fulfilled after the load test ends keeps growing, and the server keeps going for quite a while...

@mbatchkarov

Let me add some more details (I'm working on the same task). The time.sleep is there to simulate the real workload: our app fetches a URL asynchronously, then processes the results synchronously. That second part may take up to several seconds. The processing is inherently CPU-bound and cannot be replaced with an async alternative.

Running the load test for 1 minute with time.sleep(2), I expect to get ~20 successful requests, and the rest should fail. At the end of the minute the test is done, but the server continues to do work (watch i being printed). The service is unavailable because we are still doing work for closed requests.

@asvetlov you mentioned informing aiohttp about connections closing. Is there a way to do that?

@asvetlov (Member) commented Mar 1, 2018

You should not make blocking calls in asyncio coroutines; the rule is simple, obvious, and has no exceptions.
If you do, your code is buggy and all findings are invalid.
Use run_in_executor instead.
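
A related sketch (illustrative; the crunch() function and worker count are made up for the example): for pure-Python CPU-bound work that holds the GIL, passing a ProcessPoolExecutor to run_in_executor avoids competing with the event-loop thread, whereas run_in_executor(None, ...) uses the default thread pool:

import asyncio
from concurrent.futures import ProcessPoolExecutor

from aiohttp import web

def crunch(n):
    # Hypothetical CPU-bound work; must be a module-level (picklable) function
    # so it can be shipped to a worker process.
    return sum(i * i for i in range(n))

executor = ProcessPoolExecutor(max_workers=2)

async def index(request):
    loop = asyncio.get_event_loop()
    # Run the blocking work in a worker process; the event loop keeps serving sockets.
    result = await loop.run_in_executor(executor, crunch, 10_000_000)
    return web.json_response({'result': result})

if __name__ == '__main__':
    app = web.Application()
    app.router.add_get('/', index)
    web.run_app(app, host='0.0.0.0', port=3001)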

@pfreixes (Contributor) commented Mar 1, 2018

The callbacks that are pending to signal that a connection has been closed will only run after the handler finishes, at least in this example, so you cannot work around the issue with an explicit connection check. It would be nice to verify this empirically, but that is what my gut feeling says.

That said, in other cases where you have real I/O within your handler, that should give the connection-close callback a chance to cancel the task that is being executed [1].

[1] https://github.com/aio-libs/aiohttp/blob/master/aiohttp/web_protocol.py#L201
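
To illustrate that second point with a hedged sketch (assuming the aiohttp version under discussion cancels the handler task from its connection-lost callback, as the linked web_protocol.py code suggests): a handler that actually awaits gives that cancellation somewhere to land.

import asyncio
from aiohttp import web

async def index(request):
    try:
        # A real await point yields to the event loop, so the connection-lost
        # callback can run and, in this scenario, cancel the handler task.
        await asyncio.sleep(30)
    except asyncio.CancelledError:
        print('client disconnected before the response was ready')
        raise  # let the cancellation propagate
    return web.json_response({'success': True})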

@asfer (Author) commented Mar 1, 2018

@asvetlov Thank you so much for pointing me in the right direction.
@pfreixes The problem was indeed the synchronous work, which was preventing the connections from being closed.

Thanks a lot for the prompt responses!

For posterity:

import logging
import numpy as np
from aiohttp import web
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s')
i = 0

def multiply():
    return np.dot(np.random.rand(1500, 1500), np.random.rand(1500, 1500))

async def index(request):
    global i; i += 1; print(i)
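    # offload the blocking multiplication to the default executor (a thread pool),
    # so the event loop stays free to handle connection events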
    matrix = await request.loop.run_in_executor(None, multiply)
    print(matrix.shape)
    return web.json_response({'success': True})

app = web.Application()
app.router.add_get('/', index)
web.run_app(app, host='0.0.0.0', port=3001)

produces

$ wrk -d1m -c1000 -t20 http://127.0.0.1:3001
Running 1m test @ http://127.0.0.1:3001
  20 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.83s     0.00us   1.83s   100.00%
    Req/Sec     0.77      1.47     9.00     91.79%
  280 requests in 1.00m, 48.12KB read
  Socket errors: connect 0, read 1109, write 0, timeout 279
Requests/sec:      4.66
Transfer/sec:     819.92B

and the server stops doing work immediately after the load test stops.

asfer closed this as completed Mar 1, 2018
asfer changed the title from "Requests on dropped connections are always fulfilled" to "[Fixed] [PBKAC] Requests on dropped connections are always fulfilled" Mar 1, 2018
lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel there are important points made in this discussion, please include those excerpts in that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

lock bot added the outdated label Oct 28, 2019
lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019