Use native bytearray truncation #120

tomchristie · 2020-11-24T14:26:02Z

Switching from .compress() to using native buffer truncation.
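To illustrate the change, here is a minimal sketch and not h11's actual code: the class names `OffsetBuffer` and `TruncatingBuffer` and the `feed`/`take` methods are hypothetical. The first class tracks a start offset and reclaims space through an explicit `compress()` call; the second deletes consumed bytes from the front of the `bytearray` directly.

```python
class OffsetBuffer:
    """Hypothetical 'before' shape: consume by advancing an offset."""

    def __init__(self):
        self._data = bytearray()
        self._start = 0

    def feed(self, data):
        self._data += data

    def take(self, count):
        out = self._data[self._start:self._start + count]
        self._start += len(out)  # consumed bytes stay allocated for now
        return bytes(out)

    def compress(self):
        # Callers must remember to call this to drop the consumed prefix.
        del self._data[:self._start]
        self._start = 0


class TruncatingBuffer:
    """Hypothetical 'after' shape: consume by truncating the bytearray."""

    def __init__(self):
        self._data = bytearray()

    def feed(self, data):
        self._data += data

    def take(self, count):
        out = self._data[:count]
        del self._data[:count]  # no offset bookkeeping, no compress() step
        return bytes(out)
```

Both classes hand back the same bytes; the second simply has less state to keep consistent.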

I figure despite some related work on #115 it's worth looking at this PR in isolation.

We probably want this to be blocked on #116.

Benchmarking on Python 3.7...

Before:

$ PYTHONPATH=. venv/bin/python bench/benchmarks/benchmarks.py
6901.9 requests/sec
7055.0 requests/sec
7084.2 requests/sec
7070.6 requests/sec
7107.9 requests/sec
7075.8 requests/sec
7079.4 requests/sec

$ PYTHONPATH=. venv/bin/python bench/benchmarks/benchmarks.py
6972.9 requests/sec
6996.9 requests/sec
6947.3 requests/sec
6999.9 requests/sec
7040.6 requests/sec
7030.0 requests/sec
6997.6 requests/sec

After:

$ PYTHONPATH=. venv/bin/python bench/benchmarks/benchmarks.py
7054.6 requests/sec
7133.3 requests/sec
7148.3 requests/sec
7116.4 requests/sec
7121.0 requests/sec
7128.2 requests/sec
7156.5 requests/sec

$ PYTHONPATH=. venv/bin/python bench/benchmarks/benchmarks.py
7018.9 requests/sec
7074.7 requests/sec
7075.0 requests/sec
7070.6 requests/sec
7028.2 requests/sec
7118.3 requests/sec
7131.2 requests/sec
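To make the comparison easier to read, the figures above can be averaged with a short script. This is only a sketch over the fourteen samples per side quoted in this comment; it says nothing about statistical significance, and the difference works out to roughly 1%, which is of the same order as the spread within a single run.

```python
from statistics import mean

# Samples copied from the "Before" runs above (requests/sec).
before = [
    6901.9, 7055.0, 7084.2, 7070.6, 7107.9, 7075.8, 7079.4,
    6972.9, 6996.9, 6947.3, 6999.9, 7040.6, 7030.0, 6997.6,
]
# Samples copied from the "After" runs above (requests/sec).
after = [
    7054.6, 7133.3, 7148.3, 7116.4, 7121.0, 7128.2, 7156.5,
    7018.9, 7074.7, 7075.0, 7070.6, 7028.2, 7118.3, 7131.2,
]

before_mean = mean(before)
after_mean = mean(after)
change_pct = (after_mean - before_mean) / before_mean * 100

print(f"before: {before_mean:.1f} requests/sec")
print(f"after:  {after_mean:.1f} requests/sec")
print(f"change: {change_pct:+.2f}%")
```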

tomchristie · 2020-11-24T14:26:54Z

We also ought to follow up on the action in the comment "make sure PyPy also has the optimization", but I'm not sure where to start with that.

bluetech · 2020-11-24T14:47:09Z

h11/_receivebuffer.py

-# bytearray is amortized O(n), thanks to some excellent work by Antoine
-# Martin:
-#
-#     https://bugs.python.org/issue19087


I think it'd be helpful to keep the paragraph above.

tomchristie replied: Fair point, yes. I've addressed it as an inline comment.

bluetech · 2020-11-24T14:48:07Z

h11/_receivebuffer.py

        if not out:
            return None
-        self._start += len(out)
+        self._data[:count] = b""


I think a del would be a little clearer but maybe it's just me. (Also below).

tomchristie replied: Agreed, yup!
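For reference, the two spellings discussed in this thread do the same thing to a `bytearray`. A small standalone sketch follows; the sample bytes and `count` value are made up for illustration.

```python
data = b"GET / HTTP/1.1\r\nHost: example\r\n"
count = 16  # length of the request line including its CRLF

# Spelling from the diff: assign an empty bytes object to the prefix slice.
via_assignment = bytearray(data)
via_assignment[:count] = b""

# Spelling suggested in review: delete the prefix slice.
via_del = bytearray(data)
del via_del[:count]

# Both leave only the bytes after the first `count` bytes.
assert via_assignment == via_del
```

The `del` form states the intent (drop the prefix) without needing an empty replacement value.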

bluetech · 2020-11-24T19:06:16Z

> We also ought to follow up on the action in the comment "make sure PyPy also has the optimization", but I'm not sure where to start with that.

With this code:

size = 50 * 2**20
b = bytearray(size)
for i in range(size):
    del b[0]

Results:

python3.8:           4.257s
pypy3 version 7.3.2: 0.326s
pypy2 version 7.3.2: 0.326s
python2.7:           DNF

So it's either that pypy has this optimization, or it's doing some magic JIT handling of this specific code. I tried some tricks to foil any such optimization and it seems legit.

If we're willing to have a timing-based unit test for this, a size = 3 * 2**20 with a timeout of 5s would do the trick IMO, even with a super fast CPU running the quadratic case or a very slow CPU running the linear case. But it will probably be flaky anyway...
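A timing-based test along those lines might look like the sketch below. The function names are hypothetical, the size and 5-second budget are the ones proposed above, and, as noted, a test like this may still be flaky on heavily loaded machines.

```python
import time


def front_deletion_seconds(size):
    """Time deleting every byte from the front of a bytearray, one at a time."""
    buf = bytearray(size)
    start = time.perf_counter()
    for _ in range(size):
        del buf[0]
    return time.perf_counter() - start


def test_front_deletion_is_not_quadratic():
    # With amortized O(1) front deletion this loop is linear in `size`;
    # a quadratic implementation should blow well past the budget.
    elapsed = front_deletion_seconds(3 * 2**20)
    assert elapsed < 5.0, f"front deletion took {elapsed:.2f}s"
```

Measuring with `time.perf_counter()` keeps the check independent of wall-clock adjustments, though it cannot remove noise from a busy CI host.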

njsmith · 2020-12-23T15:41:11Z

I think this has been superseded by #115

pgjones · 2020-12-26T17:46:57Z

Agree with #115 superseding this. (Please reopen if not).

tomchristie added 2 commits November 24, 2020 14:18

Use native bytearray truncation

beb1064

Drop comment that is no longer relevant

6519c21

This was referenced Nov 24, 2020

Added ability to use LF, not only CRLF delimiter for response Headers and Body #115

Merged

Drop support for Python 2 #116

Closed

bluetech reviewed Nov 24, 2020

tomchristie added 2 commits November 24, 2020 15:04

Address review comments

4a50931

Fix up core contributor name in comment

c0f7b32

bluetech reviewed Nov 24, 2020

Clean up comment

1c04101

bluetech reviewed Nov 24, 2020

Update h11/_receivebuffer.py

50e195f

Co-authored-by: Ran Benita <ran@unusedvar.com>

bluetech approved these changes Dec 8, 2020

pgjones closed this Dec 26, 2020
