added new implementation using bytearray and memoryview #11

Skazu · 2024-07-10T13:54:34Z

Hey there,
I've added a new implementation that uses a bytearray and a memoryview to work on a fixed, preallocated memory buffer.

As in the PyPy implementation, the file is distributed across all CPUs via multiprocessing. I create n chunks (where n is the number of CPUs available), adjust each chunk so that it starts and ends on a whole line, and spawn the processes.
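A rough sketch of that chunking step, for illustration only. The function name `chunk_boundaries` and its parameters are hypothetical and not taken from the submitted file; it assumes the input is a plain file with `\n` line endings.

```python
import os


def chunk_boundaries(path, n_chunks):
    """Return (start, end) byte offsets; every chunk ends right after a newline.

    Hypothetical helper: splits the file into at most n_chunks pieces of
    roughly equal size, then pushes each end forward to the next line end.
    """
    size = os.path.getsize(path)
    # Ceiling division, so that at most n_chunks pieces are produced.
    approx = max(1, -(-size // n_chunks))
    bounds = []
    start = 0
    with open(path, "rb") as f:
        while start < size:
            end = min(size, start + approx)
            f.seek(end)
            f.readline()  # consume the rest of the line that 'end' landed in
            end = f.tell()
            bounds.append((start, end))
            start = end
    return bounds
```

Each `(start, end)` pair could then be handed to one worker process, for example through `multiprocessing.Pool.starmap`.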

From there I've changed a lot. I allocate a buffer of configurable size (1024 * 128 bytes in this version) and read the file directly into it via the readinto1(buffer) method. After that I operate on this fixed buffer as much as possible, searching for \n and ; to split the lines. When there is no \n left in the buffer, I read the next part of the file, and repeat until I reach the end.
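A minimal sketch of this read loop, not the code from the PR. The name `iter_records` and its parameters are hypothetical. It uses `readinto` rather than `readinto1`, assumes every line ends with `\n` and fits into the buffer, and copies each field into a `bytes` object for readability, so unlike the description above it does create some garbage.

```python
def iter_records(path, start, end, buf_size=1024 * 128):
    """Yield (station, value) byte pairs from bytes [start, end) of path.

    Hypothetical helper: one fixed bytearray is refilled in place; an
    incomplete trailing line is moved to the front before the next read.
    """
    buf = bytearray(buf_size)
    view = memoryview(buf)
    leftover = 0              # bytes of an unfinished line kept at the front
    remaining = end - start   # bytes of this chunk still to be read
    with open(path, "rb") as f:
        f.seek(start)
        while remaining > 0:
            want = min(remaining, buf_size - leftover)
            n = f.readinto(view[leftover:leftover + want])
            if not n:
                break  # end of file, or a line longer than the buffer
            remaining -= n
            filled = leftover + n
            pos = 0
            while True:
                nl = buf.find(b"\n", pos, filled)
                if nl == -1:
                    break  # no complete line left in the buffer
                sep = buf.find(b";", pos, nl)
                yield bytes(view[pos:sep]), bytes(view[sep + 1:nl])
                pos = nl + 1
            # Move the unfinished tail to the front; same-size slice
            # assignment does not resize the bytearray.
            leftover = filled - pos
            buf[:leftover] = buf[pos:filled]
```

A worker would call this with the offsets of its own chunk and aggregate min, max, sum and count per station from the yielded pairs.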

On my machine this is even faster than the current PyPy solution, and it has a fixed-size memory footprint. I don't need to disable garbage collection, because no garbage is created. (Even with the GC disabled the memory footprint doesn't rise, whereas the PyPy version uses all available RAM until my system freezes.)

I don't know how fast my code is on your reference machine, but I'm really curious to find out. Maybe you can try it out?

ifnesi · 2024-07-12T18:36:23Z

Hi @Skazu, thank you very much. On my machine your implementation came in 3rd place:
| pypy3 | calculateAveragePypyInputBuffer.py | 145.58 | 5.08 | 670% | 22.475 |
For some reason I am unable to commit the changes to the README.md file; I am trying to understand why.

added new implementation using bytearray and memoryview

16e1828

Skazu force-pushed the main branch from 47602d6 to 16e1828 Compare July 10, 2024 13:58

ifnesi merged commit 27440b3 into ifnesi:main Jul 12, 2024
