Currently, every time a shader is used with a different workgroup size and set of buffers, a new kp::Algorithm must be created.
This introduces a lot of unnecessary overhead, as identical shader modules and pipelines are created every time a user wants to use a different set of buffers with the underlying SPIR-V.
Existing machine learning libraries like PyTorch allocate and deallocate buffers in an unpredictable manner, so an attempt to integrate Kompute would result in extremely slow performance and possible memory overuse from all the duplicate shader modules and pipelines.
There is a lot that can be done to remedy this problem, but a good first step is to implement separate caches for vk::ShaderModule and vk::Pipeline objects. This should remove a large part of the overhead, since identical modules and pipelines would be created once and then reused.
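
To make the idea concrete, here is a rough sketch of what a shader module cache could look like. It is not Kompute's actual implementation: `ShaderModuleCache`, `SpirvHash`, and `getOrCreate` are hypothetical names, and the cache is templated on a generic `Handle` type so the sketch compiles without Vulkan. In a real integration the factory would presumably wrap the call that creates a vk::ShaderModule from the SPIR-V words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Kompute): FNV-1a-style hash applied
// to the SPIR-V words, so the word vector can be used as a map key.
struct SpirvHash {
    std::size_t operator()(const std::vector<std::uint32_t>& spirv) const {
        std::uint64_t h = 1469598103934665603ull;
        for (std::uint32_t word : spirv) {
            h ^= word;
            h *= 1099511628211ull;
        }
        return static_cast<std::size_t>(h);
    }
};

// Hypothetical cache: Handle stands in for vk::ShaderModule here.
// The factory is only invoked on a cache miss.
template <typename Handle>
class ShaderModuleCache {
public:
    using Factory =
        std::function<std::shared_ptr<Handle>(const std::vector<std::uint32_t>&)>;

    explicit ShaderModuleCache(Factory factory) : mFactory(std::move(factory)) {
        assert(mFactory && "factory must be callable");
    }

    // Returns the cached handle for identical SPIR-V, creating it if needed.
    std::shared_ptr<Handle> getOrCreate(const std::vector<std::uint32_t>& spirv) {
        auto it = mEntries.find(spirv);
        if (it != mEntries.end()) {
            return it->second;
        }
        std::shared_ptr<Handle> handle = mFactory(spirv);
        mEntries.emplace(spirv, handle);
        return handle;
    }

    std::size_t size() const { return mEntries.size(); }

private:
    Factory mFactory;
    // Keys are compared by full content, so a hash collision cannot
    // return the wrong module.
    std::unordered_map<std::vector<std::uint32_t>, std::shared_ptr<Handle>, SpirvHash>
        mEntries;
};
```

With something like this, two kp::Algorithm instances built from the same SPIR-V would share one module instead of each creating their own. A pipeline cache could follow the same pattern with a wider key (for example the shader module plus workgroup size), though the exact key is an open design question. This sketch is also not thread-safe and never evicts entries, both of which would need to be addressed in a real implementation.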