
Optimize sliding #31


Merged
merged 4 commits into codereport:master on Jan 15, 2025

Conversation

@walking-octopus (Contributor) commented Jan 15, 2025

Performance

  • Time complexity: from O(N² / step) to O((N / step) * (size + step))
  • Space complexity: unchanged at O(N / step + N)

* If my analysis is correct

Changes

`length` may be quite fast, even on large inputs, but it is still O(N), which makes the total time complexity quadratic. I tested this change in my Markov chain implementation, and it made a meaningful performance difference on large corpora without any apparent regressions.
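
For illustration, here is a minimal sketch of the idea, with my own hypothetical helper names (`take-window` and `drop-at-most` are not the PR's actual code, the default step of 1 is an assumption, and edge-case behaviour such as partial final windows may differ from the repo's `sliding`): each window is built by walking at most `size` elements and the cursor advances by at most `step`, so `length` is never called on the remaining input.

```racket
#lang racket

;; Next window of `size` elements starting at xs, or #f if too few remain.
;; Walks at most `size` elements instead of measuring the whole list.
(define (take-window xs size)
  (cond [(zero? size) '()]
        [(null? xs) #f]
        [else (let ([rest (take-window (cdr xs) (sub1 size))])
                (and rest (cons (car xs) rest)))]))

;; Drop up to `n` elements; unlike list-tail, never errors on a short list.
(define (drop-at-most xs n)
  (if (or (zero? n) (null? xs))
      xs
      (drop-at-most (cdr xs) (sub1 n))))

;; Each iteration costs O(size + step), and there are O(N / step) windows.
(define (sliding lst size [step 1])
  (let loop ([xs lst])
    (define window (take-window xs size))
    (if window
        (cons window (loop (drop-at-most xs step)))
        '())))

;; (sliding '(1 2 3 4 5) 2)   ; => '((1 2) (2 3) (3 4) (4 5))
;; (sliding '(1 2 3 4 5) 2 3) ; => '((1 2) (4 5))
```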

While accumulating in reverse order may have led to better space complexity thanks to TCO, the `reverse` in the base case or an expensive `append` negates any time savings, so the results are inconsistent; if order doesn't matter, it can cut the time in half. In my defense, the previous version didn't use TCO despite having a subprocedure called `tail-call`, because the self-call was wrapped in `cons`.
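
For comparison, a hypothetical tail-recursive variant (reusing `take-window` and `drop-at-most` from the sketch above) would look roughly like this; the self-call is now in tail position, but the windows accumulate newest-first, which is exactly where the `reverse`-or-`append` trade-off comes from.

```racket
;; Hypothetical accumulator variant: constant stack space, but the windows
;; come out newest-first, so a final reverse (an extra O(N / step) pass) is
;; needed whenever the order matters.
(define (sliding/acc lst size [step 1])
  (let loop ([xs lst] [acc '()])
    (define window (take-window xs size))
    (if window
        (loop (drop-at-most xs step) (cons window acc))
        (reverse acc))))  ; drop this reverse if order is irrelevant
```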

I'm sure we could still squeeze out more performance, but probably only a modest asymptotic or even constant-factor improvement, and it would make the logic less clear. Any input is welcome.

Time Complexity: from O(N² / step) to O(N)
Space Complexity: from O(N / step) to O(N / step)
@walking-octopus changed the title from "Optimize sliding performance" to "Optimize sliding" on Jan 15, 2025
@filiplajszczak (Contributor)

LGTM, only it would be better to keep the names of the arguments `lst` and `size`.

@walking-octopus (Contributor, Author) commented Jan 15, 2025

> LGTM, only it would be better to keep the names of the arguments `lst` and `size`.

Done. This was practically copied from my project, which was using the Clojure signature `(partition n step coll)`; I only found the similar function in algorithms after searching to see whether anyone had implemented it any faster. Thanks for the feedback.
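
For reference, the two signatures put the collection at opposite ends. Assuming the repo's `sliding` keeps the list-first order and the `lst`/`size` names, the following hypothetical call corresponds to Clojure's `(partition 2 1 '(1 2 3 4 5))`:

```racket
;; Hypothetical call, assuming the list-first argument order.
(sliding '(1 2 3 4 5) 2 1) ; => '((1 2) (2 3) (3 4) (4 5))
;; Clojure equivalent: (partition 2 1 '(1 2 3 4 5))
```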

@codereport (Owner) left a comment


Awesome 🙏 🔥

@codereport codereport merged commit 51b1e63 into codereport:master Jan 15, 2025
@walking-octopus (Contributor, Author) commented Jan 15, 2025

Thanks. I hope I haven't introduced any non-obvious regressions. I'm hoping there's still some way to make this function faster.

@codereport (Owner)

@walking-octopus 🤦 My bad. Your original message made me think that maybe you hadn't checked the unit tests... which, of course, are failing now. I am going to revert the changes. I was a bit too trusting on this one. I don't actively work in Racket these days, so I am happy to accept any and all PRs. I will check that tests are passing next time.

codereport added a commit that referenced this pull request Jan 15, 2025
@codereport (Owner)

@walking-octopus, happy to accept the changes in a new PR if they don't break the tests. `raco test main.rkt` will run the unit tests so you can check whether they break.
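
For example, `raco test` runs the `test` submodule of `main.rkt`, so a check of roughly this shape would be picked up; this is a hypothetical example, not copied from the repo's suite, and the expected value assumes the step-1 semantics sketched earlier in the thread.

```racket
;; Hypothetical test, not from the repo's suite; raco test runs the
;; `test` submodule of main.rkt.
(module+ test
  (require rackunit)
  (check-equal? (sliding '(1 2 3 4 5) 2 1)
                '((1 2) (2 3) (3 4) (4 5))))
```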
