Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use SmallArray# instead of Array# #15

Open
treeowl opened this issue Oct 17, 2020 · 4 comments
Open

Use SmallArray# instead of Array# #15

treeowl opened this issue Oct 17, 2020 · 4 comments

Comments

@treeowl
Copy link
Contributor

treeowl commented Oct 17, 2020

The arrays are fairly small, and, just as importantly, they're used in a "mostly-pure" fashion. So performance will almost certainly be better with SmallArray# than Array#.

@travitch
Copy link
Owner

Good idea - I think SmallArray# was introduced after this code was started, but I never re-evaluated. I suspect this would help some of the unfortunate GC behavior I've observed.

@treeowl
Copy link
Contributor Author

treeowl commented Oct 17, 2020

What unfortunate behavior have you observed? SmallArray is much more about trying to make the mutator fast; Array tries to avoid certain pathological GC perf problems that just probably don't show up here.

@travitch
Copy link
Owner

It has been difficult to get a detailed breakdown of where the GC spends its time, but my benchmarks have just spent way more time than I expected collecting garbage. I think it was a combination of small nursery sizes (in older GHC versions) and each modification re-allocating multiple array chunks (to rebuild the spine of the tree).

The documentation for SmallArray indicates that it is more efficient during GC as long as your card table would have only had a single entry (i.e., < 128), which should apply in this case. I'm not sure how much that will help, of course.

@treeowl
Copy link
Contributor Author

treeowl commented Oct 17, 2020

Oh, interesting. I'd never read that. But I can't imagine an unnecessary card table check would explain particularly bad GC performance. Probably something else going on. Of course, performance building really big vectors is basically guaranteed to be shit regardless, just because it violates the generational hypothesis. We can only do what we can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants