Try to optimise OPAL #291
Removing function calls from tight loops saves 30-40 cycles per function call on RP2040.
Reduce OPAL sample rendering time by sacrificing 200 bytes of RAM for the const sine and exp LUTs.
Very nice!
Wouldn't inlining the `Sample` and `Output` methods in the current code achieve the same results?
Do you have a sense of how much of this gain is due to the method change vs putting that data in memory? (BTW, wouldn't you achieve not being in flash by not defining it `const`?)
Unfortunately no, as the linker complains if you try to call an inline function, in this case
Good question! I just reran the test with the force-into-RAM macro removed from the LUTs and I get:
So moving the LUT data into RAM is what gives approximately half of the performance improvement.
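For readers unfamiliar with the trick, here is a minimal sketch of keeping a `const` lookup table out of flash. It assumes a GCC-style toolchain and a linker script that copies `.time_critical.*` sections into RAM at startup, as the default RP2040 SDK script is understood to do. The `OPAL_LUT_IN_RAM` flag, the `OPAL_RAM_DATA` macro, the table name and the table values are all hypothetical and are not taken from this PR. On a host build the macro expands to nothing, so the table is placed wherever the compiler normally puts read-only data.

```cpp
#include <cstdint>

// Hypothetical build flag: define OPAL_LUT_IN_RAM for the device build.
// On any other build the macro is empty and placement is left to the compiler.
#if defined(OPAL_LUT_IN_RAM)
#define OPAL_RAM_DATA __attribute__((section(".time_critical.opal_lut")))
#else
#define OPAL_RAM_DATA
#endif

// Illustrative 8-entry table. The values are placeholders, not OPAL's real LUT.
static const uint16_t OPAL_RAM_DATA kExpTable[8] = {
    1018, 1013, 1007, 1002, 996, 991, 986, 980};

// Table lookup with a wrap-around index, small enough to inline at call sites.
static inline uint16_t ExpLookup(unsigned index) {
  return kExpTable[index & 7u];
}
```

Dropping `const`, as suggested in the review, would probably also move the table into RAM on this platform, since initialised non-const data is normally copied there at startup, but it gives up the compiler's protection against accidental writes.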
Oops! Sorry, I see what you mean now: make them inline, then call them from the new `SampleBuffer` function and have the compiler do the work instead of manually inlining the code.
As part of trying to improve performance for #283, this PR does some basic optimisation of the OPAL audio rendering code. The main optimisation is removing function calls from the tight loop by adding a new method that generates samples for a whole buffer at once, and then removing further function calls by inlining several small functions. At the cost of 200 bytes of RAM, the constant LUTs are also forced to stay in RAM at all times.
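The shape of the change can be sketched with a toy class. This is not the OPAL code: `ToySynth`, its sawtooth `Output()` and its members are hypothetical stand-ins, and only the name `SampleBuffer` comes from the discussion in this PR. The point is that the caller makes one call per buffer instead of one per sample, and that the small helper is defined in the class body, which makes it implicitly inline and visible to the compiler at every call site.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for an OPAL-style synth (hypothetical, for illustration only).
class ToySynth {
public:
  // Per-sample entry point: the caller pays one function call per sample.
  void Sample(int16_t *left, int16_t *right) {
    int16_t v = Output();
    *left = v;
    *right = v;
  }

  // Whole-buffer entry point: one call per buffer, with the loop inside so
  // the compiler can inline Output() into the loop body.
  void SampleBuffer(int16_t *interleaved, size_t frames) {
    for (size_t i = 0; i < frames; ++i) {
      int16_t v = Output();
      interleaved[2 * i] = v;      // left channel
      interleaved[2 * i + 1] = v;  // right channel
    }
  }

private:
  // Defined in the class body, so it is implicitly inline.
  // Produces a simple sawtooth in the range [-2048, 2047].
  int16_t Output() {
    phase_ = static_cast<uint16_t>(phase_ + step_);
    return static_cast<int16_t>((phase_ >> 4) - 2048);
  }

  uint16_t phase_ = 0;
  uint16_t step_ = 256;
};
```

Both entry points produce the same samples; only the number of calls made from the caller's loop differs. Whether the compiler actually inlines `Output()` depends on the optimisation level, so the generated code is worth checking on the target.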
For a release build, this takes the rendering time for one `Render()` method call, at default bpm and default OPAL instrument settings, from an average of around 3.8ms to 2.8ms, which is a decent if not stellar improvement.