Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sample caching #7058

Open
wants to merge 16 commits into
base: master
Choose a base branch
from
Open

Add sample caching #7058

wants to merge 16 commits into from

Conversation

messmerd
Copy link
Member

This PR is my effort at a sample cache.

I originally made it to test some eviction strategy ideas that differed from that of #6703, but then I polished it up and now it's here for feedback.

Unfortunately I wasn't able to implement postponed eviction like I hoped, but I'm pretty satisfied with the immediate eviction that I settled on and the ease-of-use of the sample cache.

Features:

  • A SampleBuffer cache is built into SampleLoader and the choice is made when calling the sample loader whether to use sample caching or not. By default caching is on.
  • Caching is supported for both audio files and base64 data, though I haven't tested the base64 sample caching
  • Cached samples are evicted from the cache immediately when no longer used. I'm using a custom deleter in std::shared_ptr<const SampleBuffer> to do this.
  • Cached samples that come from an audio file source are also evicted when their file on disk is changed or deleted.
  • Some minor refactoring was done, particularly to the SampleBuffer class to enforce its creation through either the provided create() methods or through the SampleLoader

There was a discussion on Discord about the nuances of how samples should be identified, hashed, invalidated, what should happen upon filesystem changes, how different samples with the same name on disk should be treated, project files, project bundles, etc.

This PR does not address any of that. It just keeps what is hopefully the same exact behavior as what we have currently, only now with massive improvements to memory usage along with faster project load times when working with projects that use a lot of samples. The work of improving LMMS's behavior for all the different edge cases we can imagine could still happen in the future, but I don't think it has to happen here.

@sakertooth
Copy link
Contributor

Thank you @messmerd. I closed my own implementation in favor of this one, so its not just an alternative. We can go with this. You probably did it better than me by just glancing at the code 🤣, though I'll take a deeper look soon. Could rename the PR if you want now.

@sakertooth sakertooth self-requested a review January 11, 2024 13:35
include/SampleBuffer.h Show resolved Hide resolved
include/SampleBuffer.h Outdated Show resolved Hide resolved
include/SampleLoader.h Outdated Show resolved Hide resolved
include/SampleLoader.h Outdated Show resolved Hide resolved
include/SampleBuffer.h Outdated Show resolved Hide resolved
src/core/SampleLoader.cpp Outdated Show resolved Hide resolved
src/core/SampleLoader.cpp Outdated Show resolved Hide resolved
src/core/SampleLoader.cpp Outdated Show resolved Hide resolved
src/core/SampleLoader.cpp Outdated Show resolved Hide resolved
src/core/SampleLoader.cpp Outdated Show resolved Hide resolved
@messmerd messmerd changed the title Alternative sample cache implementation Sample cache Jan 11, 2024
@messmerd
Copy link
Member Author

@sakertooth Thanks for the review! I'll address it later today.

@sakertooth sakertooth changed the title Sample cache Add sample caching Jan 11, 2024
@sakertooth
Copy link
Contributor

I'm thinking of starting this up again on my end given all the differences, but honestly, I don't know. Its just because I dont like the idea of sacrifing good design for performance (not until we have measurements), and I want there to be some tests, but the current design seems hard to do that.

Identifying actual audio data in the cache with UUIDs and retrieving resources that way makes a lot of sense. It gives us the benefit that it works for any kind of data, and the dependency on the file system is greatly reduced. Its testable because I can cache any kind of audio data I want, it doesn't have to be a file exactly.

@Rossmaxx Rossmaxx mentioned this pull request Jun 17, 2024
9 tasks
@sakertooth
Copy link
Contributor

My thoughts, hopefully everything I say below is easy to understand:

Not sure if I mentioned this already, but my philosophy with SampleBuffer was that it is simply a buffer of audio data, nothing more. It doesn't serve much more purpose than that, and I was really pleased with the simplicity of that class after its refactor. I feel like the changes made to here break that philosophy because its responsible for knowing more than it should if that makes sense. The idea of a SampleBuffer having a Source is fine, but I don't find it ideal for SampleBuffer to be aware of such details. The details of where a buffer came from (i.e., either an audio file or Base64) only concerns the newly added caching feature. In my opinion, these concerns should be somewhere else, outside of SampleBuffer.

Speaking of coupling, it might be simpler for the cache to be in its own class, outside SampleLoader. I say this because the cache most likely will need to have its own interface, allowing other clients (like GUI code) to work with the cache directly (e.g., gathering what is inside the cache to display the current entries to the user, it doesn't really worry about loading any samples here).

The eviction ideas I was pondering about was a misunderstanding on my end. We call this feature a "cache", and when you think of cache, naturally eviction comes to mind. However, while we can continue to call it cache, it seems best to implement the feature as centralized storage, where any client can ask for a sample by its file path and it will return the cached buffer (or load the buffer in, either because it was changed on the filesystem or does not exist in the cache yet). And as a centralized storage, there is little reason to worry about the size of the cache.

Base64 data works as both its identification (Source) and the data itself, but in a readable format. Meshing it in with the caching feature that works closely with the filesystem in particular (where the identification of things comes from the file path, separate from its data) might complicate things for little benefit (like it did for me when I reviewed this a while back).

@messmerd
Copy link
Member Author

messmerd commented Jul 5, 2024

@sakertooth I agree with your points about decoupling Source from SampleBuffer and the idea for a centralized storage. A few weeks ago, I was actually doing some preliminary work planning a centralized, UUID-based storage for audio data, and I'd like to implement it eventually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants