Skip to content

Iterator groupby #84

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 24, 2016
Merged

Iterator groupby #84

merged 5 commits into from
Dec 24, 2016

Conversation

hdgarrood
Copy link
Contributor

I recommend reviewing these commits one at a time, as they are each independent of each other. The runSTArray' commit is more or less what I was describing in #78.

@hdgarrood
Copy link
Contributor Author

Benchmark results:

groupby

I was honestly expecting a slightly more drastic difference, although it's perhaps worth noting that the implementation labelled "current" is what's in master right now, not what's in the most recently released version. However, the version of groupBy in master at the moment uses span in its implementation, which had a very nice perf boost in #77, so I expect the difference that people will actually see between groupBy in their code now and the next time they upgrade will be quite a bit larger than it looks in this graph.

. (forall h. Eff (st :: ST h | r) (f (STArray h)))
-> Eff r (f Array)
runSTArray' =
unsafeCoerce
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it's safe to just coerce here, without copying the arrays. For example, what about something like:

runSTArray do 
  xs <- initSomeSTArray
  setTimeout 1000 (modifySomehow xs)
  pure xs

The return value is supposed to be a regular array, but it will get modified in place one second after the function returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. I guess the current runSTArray function has the same issue too? What if we restrict the row type to a closed row?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think restricting to a closed row should solve the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks. In that case I think I will publish a minor release before doing that, since this will be a breaking change.

For instances where you want to return more than just one STArray from
an STArray computation. For example, a pair of STArrays (with
potentially different types), or an STArray of STArrays, or even an
STArray of nonempty STArrays. The possibilities are endless!

Also use unsafe-coerce instead of FFI.
Implement `groupBy` using `Data.Array.ST.Iterator`; this should make it
faster.
\x y -> odd x && odd y is not an equivalence relation! It doesn't
satisfy reflexivity.
@hdgarrood
Copy link
Contributor Author

I decided to change runSTArray to just return an Array a instead of returning a Pure (Array a), since now we know it's going to be a Pure so we might as well use runPure on it.

@paf31 paf31 merged commit 0f9351e into master Dec 24, 2016
@paf31 paf31 deleted the iterator-groupby branch December 24, 2016 03:40
@paf31
Copy link
Contributor

paf31 commented Dec 24, 2016

Thanks again, and sorry for the delay!

@paf31
Copy link
Contributor

paf31 commented Dec 24, 2016

Hmm, I just remembered this is a breaking change. What shall we do here? Revert, or make a major release?

@paf31 paf31 restored the iterator-groupby branch December 24, 2016 04:48
This was referenced Dec 24, 2016
@hdgarrood
Copy link
Contributor Author

No problem! Up to you whether you think it's worth merging now and making a major release. We could also maybe make a branch off of master from before any breaking changes were merged in case we ever want to do a minor or patch release before the next major one?

@paf31
Copy link
Contributor

paf31 commented Dec 24, 2016

I think we have a few libraries which need major releases:

  • arrays
  • foldable-traversable
  • strings

Any others?

@hdgarrood
Copy link
Contributor Author

I'm not aware of any off the top of my head

@hdgarrood hdgarrood deleted the iterator-groupby branch January 20, 2017 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants