Struct codec perf #614

Open
benjeffery opened this issue May 13, 2020 · 5 comments
Labels
enhancement New feature or request Performance This issue addresses performance, either runtime or memory Python API Issue is about the Python API
Milestone

Comments

@benjeffery
Member

No work has been done on this yet. There are some easy wins, such as grouping together calls to pack and unpack where possible. Maybe after #511.
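As a rough illustration of what grouping calls could look like, the sketch below compares one `struct.pack` call per field with a single call on a precompiled `struct.Struct`. The record layout and the function names are hypothetical and are not taken from the tskit codec; with the `<` (no padding) byte order, both paths produce the same bytes.

```python
import struct

# Hypothetical record layout: a 32-bit int, a 64-bit float, an 8-bit flag.
FIELD_FORMATS = ["<i", "<d", "<B"]
# The same fields expressed as one little-endian format, compiled once.
GROUPED = struct.Struct("<idB")


def pack_per_field(values):
    """One struct.pack call per field, joined afterwards."""
    return b"".join(struct.pack(fmt, v) for fmt, v in zip(FIELD_FORMATS, values))


def pack_grouped(values):
    """A single pack call covering all fields."""
    return GROUPED.pack(*values)


def unpack_grouped(buffer):
    """A single unpack call covering all fields."""
    return GROUPED.unpack(buffer)


record = (7, 0.25, 1)
encoded = pack_grouped(record)
```

The grouped path makes one call into the `struct` module per record instead of one per field, which is where the saving would be expected to come from; actual gains would need to be measured.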

@benjeffery benjeffery added the enhancement New feature or request label May 13, 2020
@benjeffery benjeffery mentioned this issue May 13, 2020
21 tasks
@benjeffery
Member Author

Use struct.pack_into and struct.iter_unpack
Use a separate code path for schemas that have known formats (i.e. no arrays).
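The two ideas above might be sketched as follows for a schema whose format is fully known up front (no arrays). The row layout and helper names are assumptions made for illustration only; `pack_into` and `iter_unpack` are the standard library `struct` methods.

```python
import struct

# Hypothetical fixed-size schema with no arrays: (id: int32, fitness: float64).
# Compiling the format once avoids re-parsing it for every row.
ROW = struct.Struct("<id")


def encode_rows(rows):
    """Write every row into one preallocated buffer with pack_into."""
    buffer = bytearray(ROW.size * len(rows))
    for i, row in enumerate(rows):
        ROW.pack_into(buffer, i * ROW.size, *row)
    return bytes(buffer)


def decode_rows(buffer):
    """Decode all rows in a single pass with iter_unpack."""
    return list(ROW.iter_unpack(buffer))


rows = [(0, 1.0), (1, 0.5), (2, 0.25)]
encoded = encode_rows(rows)
decoded = decode_rows(encoded)
```

`iter_unpack` requires the buffer length to be a multiple of the record size, so this path only applies when every row has the same fixed length.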

@benjeffery benjeffery added Performance This issue addresses performance, either runtime or memory Python API Issue is about the Python API labels Sep 29, 2020
@molpopgen
Member

I'm feeling some pain from this one. Consider the following output, generated from running a fwdpy11 sim:

Start sim at = 17:44:11
Burn in done at = 18:20:54
Start adaptation to new environment at = 18:20:54
Done at = 18:21:00
Done dumping native file format at at = 18:21:00 starting tskit export...
Done dumping to tskit at = 18:36:21

The simulation is done and written to the fwdpy11 native format in about 35 minutes. There is metadata for 9e5 individuals, and encoding it makes writing to a trees file take over 15 minutes.

When creating the tskit.TableCollection, I am using add_row (as opposed to set_columns), and the metadata schema is here.
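For context on the add_row versus set_columns distinction: a column-wise call takes all rows at once, typically as one flat byte buffer plus an offset array marking where each row starts and ends. The sketch below builds that pair from already-encoded rows using only the standard library. The function name is hypothetical, and how the result would be passed to tskit (for example as `metadata` and `metadata_offset` arguments) is an assumption to check against the tskit documentation.

```python
from itertools import accumulate


def flatten_rows(encoded_rows):
    """Join per-row byte strings into one buffer plus row offsets.

    Row i occupies buffer[offsets[i]:offsets[i + 1]].
    """
    offsets = [0, *accumulate(len(row) for row in encoded_rows)]
    return b"".join(encoded_rows), offsets


# Three rows of already-encoded metadata, one of them empty.
buffer, offsets = flatten_rows([b"ab", b"", b"cde"])
```

This only changes how rows are handed over; the per-row encoding cost discussed in this issue would still apply.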

@jeromekelleher
Member

Thanks for the info @molpopgen - can you remind us about this in a couple of weeks when Ben is back please?

@molpopgen
Member

Reminder!

@benjeffery benjeffery added this to the Python 0.3.7 milestone Jun 8, 2021
@benjeffery
Member Author

Thanks @molpopgen, I've pencilled this in for the next release, but depending on complexity it may slip.
