@mdumandag mdumandag commented Feb 19, 2019

This is a partial fix for the issue described in #149.
Previously, the buffer was an immutable bytes object. Each time the reactor read data from the socket, it appended the received data to this buffer, which meant copying the existing buffer contents together with the new data into a newly created object. I changed the buffer to a bytearray, which is mutable in Python, so this repeated re-creation is eliminated.
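To illustrate the difference, here is a minimal sketch of the two append strategies. The function names are hypothetical and are not taken from the client code; the sketch only shows the general bytes versus bytearray behaviour in Python 3.

```python
def append_immutable(buffer: bytes, chunk: bytes) -> bytes:
    # bytes objects cannot change, so concatenation allocates a new object
    # and copies both the old contents and the new chunk into it.
    return buffer + chunk


def append_mutable(buffer: bytearray, chunk: bytes) -> None:
    # bytearray grows in place; the already buffered data is not
    # rebuilt into a fresh object on every read.
    buffer.extend(chunk)


# Hypothetical chunks standing in for data read from the socket.
chunks = [b"ab", b"cd", b"ef"]

old_buffer = b""
new_buffer = bytearray()
for chunk in chunks:
    old_buffer = append_immutable(old_buffer, chunk)
    append_mutable(new_buffer, chunk)

# Both strategies end up with the same contents.
print(old_buffer, bytes(new_buffer))
```

With many reads appended to a large buffer, the first strategy copies the accumulated data again on each read, while the second only copies the newly received chunk (apart from occasional internal resizing).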

Overall, the Python client improved from ~50 seconds to ~35 seconds for the string test and from ~7 seconds to ~1 second for the bytes test on my machine.

These are the results of the tests described in #149, run on my machine with the old code.

Bytes test, java client

Map Size = 10000
2019-02-19T14:36:48.760Z
Entries Iterated = 10000
2019-02-19T14:36:48.913Z
10849511800752

Bytes test, python client

--------------------------------------

USING SYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:39:57.537118 start
2019-02-19 17:40:03.826406 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:40:03.826461 start
2019-02-19 17:40:11.092503 stop
list length: 10000

--------------------------------------

USING ASYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:40:11.094390 start
2019-02-19 17:40:17.379446 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:40:17.379487 start
2019-02-19 17:40:23.662076 stop
list length: 10000
process done...

String test, java client

Map Size = 10000
2019-02-19T14:41:56.660Z
Entries Iterated = 10000
2019-02-19T14:41:56.853Z
3815515015000

String test, python client


USING SYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:42:46.135877 start
2019-02-19 17:43:38.686844 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:43:38.686897 start
2019-02-19 17:44:31.023816 stop
list length: 10000

--------------------------------------

USING ASYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:44:31.025240 start
2019-02-19 17:45:22.264135 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:45:22.264172 start
2019-02-19 17:46:13.941635 stop
list length: 10000
process done...

And here are the results for the Python client with the new code:

String test, python client


USING SYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:50:45.371581 start
2019-02-19 17:51:19.353318 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:51:19.353372 start
2019-02-19 17:51:53.071307 stop
list length: 10000

--------------------------------------

USING ASYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:51:53.073200 start
2019-02-19 17:52:26.789717 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:52:26.789801 start
2019-02-19 17:53:00.492137 stop
list length: 10000
process done...

Bytes test, python client

--------------------------------------

USING SYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:53:52.663747 start
2019-02-19 17:53:53.528287 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:53:53.528344 start
2019-02-19 17:53:54.447894 stop
list length: 10000

--------------------------------------

USING ASYNC...
map size: 10000
WITH ITERATOR...
2019-02-19 17:53:54.449627 start
2019-02-19 17:53:55.401066 stop
list length: 10000

WITHOUT ITERATOR...
2019-02-19 17:53:55.401108 start
2019-02-19 17:53:56.491391 stop
list length: 10000
process done...
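The approximate figures quoted at the top can be checked against the start/stop timestamps in the logs above. The snippet below is a small sketch that does this for the "USING SYNC, WITH ITERATOR" runs only; the helper name `elapsed_seconds` and the run labels are made up for this example.

```python
from datetime import datetime

# Timestamp format used by the Python client logs above.
FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, stop: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(stop, FORMAT) - datetime.strptime(start, FORMAT)
    return delta.total_seconds()


# Start/stop pairs copied from the sync, with-iterator runs in the logs.
runs = {
    "string, old code": ("2019-02-19 17:42:46.135877", "2019-02-19 17:43:38.686844"),
    "string, new code": ("2019-02-19 17:50:45.371581", "2019-02-19 17:51:19.353318"),
    "bytes, old code": ("2019-02-19 17:39:57.537118", "2019-02-19 17:40:03.826406"),
    "bytes, new code": ("2019-02-19 17:53:52.663747", "2019-02-19 17:53:53.528287"),
}

for label, (start, stop) in runs.items():
    print(f"{label}: {elapsed_seconds(start, stop):.2f} s")
```

For these runs the durations come out at about 52.6 s versus 34.0 s for strings and about 6.3 s versus 0.9 s for bytes.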

@mdumandag mdumandag added this to the 3.11 milestone Feb 19, 2019
@gokhanoner

@mdumandag, the string case is still considerably slow. Do you think a similar approach could be used for strings as well?

@mdumandag
Contributor Author

@gokhanoner This fix also applies to the string case. With it, reading a big chunk of data from the connection is now a lot faster, but that is not the bottleneck for strings. For the string case, the issue most probably comes from the bitwise operations we do. Python is not good at CPU-bound math operations because of its dynamically typed nature, yet we have to loop over the byte data and perform these operations to read the string. So the string case requires a separate solution.
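As a rough illustration of the kind of per-byte work described here, the sketch below decodes standard UTF-8 with a hand-written loop of shifts and masks. It is an assumption-laden example, not the client's actual decoding routine: the function name `decode_utf8_loop` is hypothetical, and the client's wire format may differ from plain UTF-8.

```python
def decode_utf8_loop(data: bytes) -> str:
    """Decode UTF-8 one code point at a time using bitwise operations."""
    chars = []
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:  # 1-byte sequence: 0xxxxxxx
            code = b0
            i += 1
        elif b0 >> 5 == 0b110:  # 2-byte sequence: 110xxxxx 10xxxxxx
            code = ((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)
            i += 2
        elif b0 >> 4 == 0b1110:  # 3-byte sequence
            code = (((b0 & 0x0F) << 12)
                    | ((data[i + 1] & 0x3F) << 6)
                    | (data[i + 2] & 0x3F))
            i += 3
        elif b0 >> 3 == 0b11110:  # 4-byte sequence
            code = (((b0 & 0x07) << 18)
                    | ((data[i + 1] & 0x3F) << 12)
                    | ((data[i + 2] & 0x3F) << 6)
                    | (data[i + 3] & 0x3F))
            i += 4
        else:
            raise ValueError("invalid UTF-8 lead byte at offset %d" % i)
        chars.append(chr(code))
    return "".join(chars)


text = "héllo wörld €"
encoded = text.encode("utf-8")

# The interpreted loop and the built-in decoder produce the same string,
# but the loop executes several Python-level operations per byte.
print(decode_utf8_loop(encoded) == encoded.decode("utf-8"))
```

Every comparison, shift, and mask in the loop runs as interpreted Python, which is why this style of decoding stays slow regardless of how quickly the bytes arrive in the buffer.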

@mdumandag mdumandag merged commit c057ac6 into hazelcast:master Mar 5, 2019
@mdumandag mdumandag deleted the reactorBuffer branch March 5, 2019 09:48