added new implementation using bytearray and memoryview #11

Merged (1 commit) on Jul 12, 2024


@Skazu Skazu commented Jul 10, 2024

Hey there,
I've added a new implementation that uses a bytearray and a memoryview to work on a fixed, preallocated memory buffer.

As in the PyPy implementation, the file is distributed across all CPUs via multiprocessing. I create n chunks (where n is the number of available CPUs), adjust each chunk so that it starts and ends on a whole line, and spawn the processes.
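The chunking step described above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the function name `chunk_boundaries` and its parameters are hypothetical, and it assumes every line in the file ends with `\n`.

```python
import os


def chunk_boundaries(path, n_chunks):
    """Return a list of (start, end) byte offsets covering the whole file.

    Hypothetical helper: every chunk ends right after a newline, so no
    line is split between two workers. At most n_chunks pairs are returned.
    """
    size = os.path.getsize(path)
    # Rounding up guarantees that n_chunks steps are enough to cover the file.
    approx = size // n_chunks + 1
    bounds = []
    with open(path, "rb") as f:
        start = 0
        while start < size:
            # Jump close to the desired chunk end ...
            f.seek(min(start + approx, size))
            # ... then move forward to the end of the line we landed in.
            f.readline()
            end = min(f.tell(), size)
            bounds.append((start, end))
            start = end
    return bounds
```

Each `(start, end)` pair could then be handed to one worker process, for example through `multiprocessing.Pool`, so that every process reads only its own byte range.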

From there I've changed a lot. I allocate a buffer of configurable size (1024 * 128 bytes in this version) and read the file directly into it via the readinto1(buffer) method. After that I operate on this fixed buffer as much as possible, searching for \n and ; to split the lines. When there is no \n left in the buffer, I read the next part of the file, and repeat until I reach the end.
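A minimal sketch of this read loop, assuming lines of the form `name;value\n` and a chunk that ends on a newline. The name `iter_records` and its parameters are hypothetical, and the details almost certainly differ from the PR. In particular, this sketch yields small `bytes` objects and copies the leftover tail for readability, so it does create some garbage, unlike what the description above says about the real implementation.

```python
def iter_records(f, start, end, buf_size=1024 * 128):
    """Yield (name, value) byte strings for each line in f[start:end].

    f must be a binary file object that supports readinto1(), such as
    the object returned by open(path, "rb").
    """
    buf = bytearray(buf_size)   # allocated once, reused for every read
    view = memoryview(buf)      # lets readinto1() fill a slice without copying
    f.seek(start)
    remaining = end - start     # bytes of this chunk not yet read
    filled = 0                  # number of valid bytes currently in buf
    while remaining > 0:
        want = min(buf_size - filled, remaining)
        n = f.readinto1(view[filled:filled + want])
        if not n:
            break               # unexpected end of file
        filled += n
        remaining -= n

        pos = 0
        while True:
            nl = buf.find(b"\n", pos, filled)
            if nl == -1:
                break           # no complete line left in the buffer
            sep = buf.find(b";", pos, nl)  # assumes a well-formed line
            yield bytes(view[pos:sep]), bytes(view[sep + 1:nl])
            pos = nl + 1

        if pos == 0 and filled == buf_size:
            raise ValueError("line longer than buffer")
        # Move the incomplete last line to the front and read more after it.
        tail = filled - pos
        buf[:tail] = buf[pos:filled]
        filled = tail
```

A caller would typically convert the value with `float(value)` and update per-station statistics in a dictionary.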

On my machine this is even faster than the current PyPy solution, and it also has a fixed-size memory footprint. I don't need to disable garbage collection, because no garbage is created. (Even with GC disabled the memory footprint doesn't grow, whereas the PyPy version uses all available RAM until my system freezes.)

I don't know how fast my code is on your reference machine, but I'm really curious to find out. Maybe you can try it out?

@ifnesi (Owner) commented Jul 12, 2024

Hi @Skazu, thank you very much. On my machine your implementation came in 3rd place:
| pypy3 | calculateAveragePypyInputBuffer.py | 145.58 | 5.08 | 670% | 22.475 |
For some reason I am unable to commit the changes to the README.md file; I am trying to understand why.

3 participants