We are currently rolling our own custom Redis cache implementation to speed up our application, but cached-prisma looks like a promising package that we could use in the future to abstract away some of our redundant code. Great find, good work so far.
If you are interested and our design goals align, we could contribute to your repository; if you say it is out of scope of your vision, we will just fork it. Our main focus is being able to fine-tune the cache for different models with partially long lifetimes, not just having a set-and-forget middleware.
Here are a few ideas and suggestions in no particular order:
Client extension:
Instead of creating a Prisma singleton, the library could be written as a custom client extension. The `$allModels` parameter (https://www.prisma.io/docs/orm/prisma-client/client-extensions/model) allows intercepting calls and injecting your own logic. This way we could use it in conjunction with our other client extensions for logging or patching calls.
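To make the idea concrete, the read-through logic could live in a small helper that an `$allModels` query extension delegates to. This is a minimal sketch, not the library's current API; `CacheLike` and `readThrough` are illustrative names.

```typescript
// Illustrative cache shape; in practice this would be the library's provider.
interface CacheLike {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Wraps a query callback (as a Prisma client extension would provide it) so
// repeated calls are served from the cache without hitting the database.
async function readThrough<T>(
  cache: CacheLike,
  key: string,
  query: () => Promise<T>,
): Promise<T> {
  const hit = cache.get(key);
  if (hit !== undefined) return JSON.parse(hit) as T;
  const result = await query();
  cache.set(key, JSON.stringify(result));
  return result;
}

// Wiring it up would look roughly like this (untested sketch):
//
// const prisma = new PrismaClient().$extends({
//   query: {
//     $allModels: {
//       async findUnique({ model, operation, args, query }) {
//         const key = `${model}:${operation}:${JSON.stringify(args)}`;
//         return readThrough(cache, key, () => query(args));
//       },
//     },
//   },
// });
```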
Peer dependencies:
Instead of defining the different cache providers as dependencies, they should be moved to peer dependencies. While this means one more line during installation, it avoids polluting other projects with unused clients.
Model specific settings
We have a few models that absolutely need to be cached, and other models that should be excluded from caching. The lifetime of the cache also vastly differs between different models. Some are fine for a few seconds, others are static and can be pretty much saved indefinitely until they are manually invalidated.
It would be great if the priority or lifetime (depending on what the provider offers) of models could be defined.
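A per-model policy table could express this. The names here (`ModelCachePolicy`, `policyFor`) and the example models are assumptions for illustration, not part of the package today:

```typescript
// A model is either excluded from caching, or cached with an optional TTL;
// an undefined TTL means "keep until manually invalidated".
type ModelCachePolicy =
  | { cache: false }
  | { cache: true; ttlSeconds?: number };

const policies: Record<string, ModelCachePolicy> = {
  Session: { cache: true, ttlSeconds: 5 }, // changes often, short lifetime
  Country: { cache: true },                // static until manual invalidation
  AuditLog: { cache: false },              // must never be served stale
};

const defaultPolicy: ModelCachePolicy = { cache: true, ttlSeconds: 60 };

function policyFor(model: string): ModelCachePolicy {
  return policies[model] ?? defaultPolicy;
}
```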
Cache key factory
For some specific instances, requests to the database can differ slightly but the returned values should still be served from the cache, e.g. when a timestamp is passed as a query parameter. This would result in a cache miss even though we do not need to hit the database. Combined with the model-specific settings, a custom cache key function could allow for this behavior.
This is also especially useful if the prisma context is nested.
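A sketch of what such a key factory could look like, dropping a volatile parameter (here a hypothetical `newerThan` timestamp filter) so near-identical queries share one cache entry. All names are illustrative:

```typescript
// A key factory maps a query to its cache key.
type KeyFactory = (model: string, operation: string, args: unknown) => string;

const defaultKey: KeyFactory = (model, operation, args) =>
  `${model}:${operation}:${JSON.stringify(args)}`;

// Example factory for a model where `where.newerThan` should not affect the
// key: the timestamp is stripped before the args are serialized.
const eventKey: KeyFactory = (model, operation, args) => {
  const { where, ...rest } = (args ?? {}) as { where?: Record<string, unknown> };
  const { newerThan, ...stableWhere } = where ?? {};
  return defaultKey(model, operation, { ...rest, where: stableWhere });
};
```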
Cache invalidation
Our application might invalidate data outside of our Node application's scope, e.g. via Postgres triggers or webhook notifications. It would be nice to expose a method to invalidate the cache of specific models or entries.
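Such a method could be called from a Postgres NOTIFY listener or a webhook handler. A minimal sketch with an in-memory Map standing in for the provider's store; the `invalidate` signature is an assumption:

```typescript
// Cache keys are assumed to be prefixed with the model name, e.g. "User:42".
class InvalidatingCache {
  private store = new Map<string, string>();

  set(key: string, value: string): void {
    this.store.set(key, value);
  }

  get(key: string): string | undefined {
    return this.store.get(key);
  }

  // Drop a single entry, or every entry of a model when no key is given.
  invalidate(model: string, key?: string): void {
    if (key !== undefined) {
      this.store.delete(`${model}:${key}`);
      return;
    }
    for (const k of this.store.keys()) {
      if (k.startsWith(`${model}:`)) this.store.delete(k);
    }
  }
}
```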
Eager saving & defer
During reading, the cache write is awaited; if the cache fails to write or read the data, the retrieval function will throw even though the data could still be served from Prisma itself. Maybe an onCacheError callback could be supplied to the constructor to report issues and still let the call succeed.
https://github.com/JoelLefkowitz/cached-prisma/blob/master/src/clients/Prisma.ts#L56
https://github.com/JoelLefkowitz/cached-prisma/blob/master/src/clients/Prisma.ts#L49
In performance-critical paths we could get away with not awaiting the cache write. This results in a double-retrieval race condition in case the same data is requested in quick succession.
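Both ideas combined could look like the following sketch: cache failures are reported through the suggested onCacheError callback instead of being thrown, and the cache write is fired without awaiting it. The interface and function names are assumptions:

```typescript
// Illustrative async cache shape.
interface AsyncCache {
  read(key: string): Promise<string | null>;
  write(key: string, value: string): Promise<void>;
}

async function cachedFetch<T>(
  cache: AsyncCache,
  key: string,
  fetch: () => Promise<T>, // the actual Prisma call
  onCacheError: (e: unknown) => void,
): Promise<T> {
  try {
    const hit = await cache.read(key);
    if (hit !== null) return JSON.parse(hit) as T;
  } catch (e) {
    onCacheError(e); // cache read failed: fall through to the database
  }
  const value = await fetch();
  // Fire-and-forget write: the caller is not blocked on cache latency, at the
  // cost of the double-retrieval race under quick successive requests.
  void cache.write(key, JSON.stringify(value)).catch(onCacheError);
  return value;
}
```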
Should flush really clear everything?
We are using our cache on a model level, and as traffic increases, read and write operations will happen often. If each impure function flushes the entire cache, the lifetime will never be reached and long-living caches are not useful.
https://github.com/JoelLefkowitz/cached-prisma/blob/master/src/caches/maps/LruCache.ts#L20-L22
https://github.com/JoelLefkowitz/cached-prisma/blob/master/src/caches/providers/Redis.ts#L32-L34
LFU and LRU lose their content after a single impure write, which can have a disastrous effect on performance if we have a bad write/read ratio. We could keep track of the dirty state of individual models and just skip those upon retrieval. This would still result in values getting evicted too early if the ring buffers run out of space with invalid values, but it's better than just getting rid of everything.
Is ioredis only flushing keys with their defined prefix? I am not aware of its specific implementation.
Should we not be purging the exact cache key (in combination with the cacheKeyFactory mentioned earlier)? Calling `flush` after each impure call might also be overkill; we could get away with marking the provider dirty after a write and only flushing it at a later point in time. This again depends on read/write performance.
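One way to sketch the dirty-state idea without flushing anything: keep a version counter per model, bump it on impure writes, and embed it in every cache key. Stale entries become unreachable and age out of the LRU naturally. All names are illustrative:

```typescript
class VersionedKeys {
  private versions = new Map<string, number>();

  // Called after create/update/delete on a model instead of a global flush.
  markDirty(model: string): void {
    this.versions.set(model, (this.versions.get(model) ?? 0) + 1);
  }

  // The model version is part of the key, so entries written before the last
  // impure call are simply never read again.
  key(model: string, query: string): string {
    return `${model}:v${this.versions.get(model) ?? 0}:${query}`;
  }
}
```

The trade-off is that stale entries linger until evicted, so this suits bounded stores like LRU/LFU better than an unbounded one.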
Documentation:
The documentation for the different cache providers is not detailed enough without looking into the source code. It would be nice if certain parameters like lifetime (what unit?) were explained. The cache providers have different characteristics; it would be great to know what they are and which features they support.
Additionally, it would be good to have a small example of how to implement a custom caching provider.
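For instance, something like the following, assuming the provider interface is roughly read/write/flush as in the built-in providers (the actual interface in src/caches should be checked before relying on this shape):

```typescript
// Minimal in-memory provider sketch; `null` signals a cache miss.
class MapCache {
  private store = new Map<string, string>();

  async read(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }

  async write(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }

  async flush(): Promise<void> {
    this.store.clear();
  }
}
```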
Nested queries
Prisma queries are nested and composed. We could resolve parts of those queries from the cache and only fetch the relations that are not cached yet from the database. This will be really tricky to implement and is certainly just a nice-to-have.
Support for typed SQL (nice to have).
There is a bit more, but these are the main points that came to mind.