Inline TimePos and TimeSig functions to improve performance
#7549
Inline the functions within TimePos and TimeSig to improve performance. I discovered this performance issue while profiling a project that had ~36K+ notes in it, running at 999 BPM.

To figure out why this project uses so much CPU, I tweaked the code inside InstrumentTrack::play a bit (this function was of interest since it is the one generating the NotePlayHandles) and removed the two while loops. CPU usage was of course 0% at that point. What I found interesting was that adding back the loop that iterates over all the notes, and doing some small computation with the Note::pos function, took around 20-30% of CPU on its own, while doing the same with other functions like Note::getVolume kept the CPU at 0%. With some profiling I eventually realized that the functions in TimePos have a lot of function call overhead, and inlining the TimePos::operator int() function made the CPU meter reach 0% again.

These are the changes I made when investigating the performance inside the InstrumentTrack::play function:

I've attached a screenshot of the CPU meter after removing most of the note audio generation code in InstrumentTrack::play and instead doing only a small amount of meaningless computation (master branch, Release build):

After inlining the functions inside TimePos:

In addition, this seems to correlate strongly with the number of cache misses during execution. After inlining these functions, the number of cache misses dropped significantly, most likely because there are fewer instructions to fetch and deal with. I discovered this using perf, which is how I realized that something odd was happening with the TimePos::operator int() function.

perf report results from perf record -e cache-misses -p $(pidof lmms) -- sleep 5s before inlining (you will most likely see cache misses coming from a lot of other places in normal instances):

Same perf report after inlining:
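
For readers unfamiliar with the technique, here is a minimal sketch of what inlining a conversion operator can look like. It is not the actual LMMS code: Ticks, DemoNote, and sumPositions are hypothetical names standing in for TimePos, Note, and the test loop described above. The point is only that a member function defined in the class body is implicitly inline, so the compiler can see its definition at every call site.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tick-position type; not the real LMMS TimePos.
class Ticks
{
public:
    explicit Ticks(std::int32_t ticks = 0) : m_ticks(ticks) {}

    // Defined in the class body, so it is implicitly inline: the compiler
    // sees the definition at every call site and may replace the call
    // with a plain member load.
    operator int() const { return m_ticks; }

private:
    std::int32_t m_ticks;
};

// Hypothetical note type exposing its position, loosely mirroring Note::pos.
struct DemoNote
{
    Ticks position;
    const Ticks& pos() const { return position; }
};

// Small computation over all notes, similar in spirit to the test loop
// described above: every iteration goes through the conversion operator.
inline long long sumPositions(const std::vector<DemoNote>& notes)
{
    long long total = 0;
    for (const auto& note : notes)
    {
        total += static_cast<int>(note.pos());
    }
    return total;
}
```

If the operator were instead only declared in the header and defined in a separate .cpp file, callers in other translation units would generally have to make a real function call unless link-time optimization is enabled, which is the kind of overhead this PR removes.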