
Introduce a CacheStrategy for multi-instance data caching #3043

Closed
michaelbromley opened this issue Sep 5, 2024 · 2 comments
Labels: design 📐 This issue deals with high-level design of a feature · type: feature ✨ · @vendure/core

michaelbromley commented Sep 5, 2024

The Problem

There are several places where Vendure core uses caching to significantly improve performance:

  • We have the TtlCache class which is used in several places to cache calculated values in memory for a given time
  • We have the SelfRefreshingCache type which is used to cache all zones, channels and tax rates in memory because they are accessed extremely frequently

The issue with those solutions is that they are in-memory-only, and therefore local to the specific server/worker instance.

Being in-memory has two major downsides:

  1. Duplication: each instance of the server & worker maintains its own cache. On a typical deployment with 2x server & 2x worker processes, this means 4x the work needs to be done to fill up 4 caches. Load-balancing means that they will be filled unevenly, which is inefficient.
  2. Invalidation: we cannot implement an invalidation method which works across multiple instances. E.g. the TaxRate cache: we would like to keep it cached forever until a TaxRate is changed/added/removed. We could naively set up an EventBus subscriber to clear the cache when we detect a change. However, the change event will only be published on one instance. Any other instances will not know about that event, and their cache is now stale.

This second point is why we use TTLs with relatively short durations, which again leads to more work being done against the database.

It is also the reason we do not use caching in other scenarios that could radically improve performance: we cannot reliably invalidate a cache that is not shared by all instances.

Example

This issue was originally motivated by an investigation I am conducting into the performance of the order-related mutations. Using a prototype of this caching approach, I was able to speed up my benchmark by ~2.5x and cut the p(95) response time from 6.98s to 3s.

Proposed Solution

I propose introducing a shared caching mechanism into the core: CacheStrategy. This will be strategy-based, allowing you to decide whether to store the cache in:

  • the database
  • Redis
  • some other key-value storage system

The CacheStrategy would replace all existing caching mechanisms mentioned above, and would unlock the opportunity to make huge performance gains in currently slow areas like:

  • Checking Promotions on large orders
  • Evaluating collection filters

Because the cache is shared, it means as soon as one instance has cached a value, it will be available to all instances.

Design

At the most basic level, the CacheStrategy will implement the typical cache methods: get(), add(), delete().

It should also support key eviction via TTL, configurable per key.

It should be able to store JSON-like data, i.e. any serializable JS data structure, just like we already support with the job queue.
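
As a quick illustration of that constraint (the roundTrip helper here is purely illustrative, not part of the proposal): values must survive a JSON round trip unchanged, so class instances such as Date degrade to plain strings and would not be restored as their original type.

```typescript
// Illustrative only: shows the JSON-serializable constraint a shared cache imposes.
// Anything stored must survive a JSON round trip unchanged.
const roundTrip = <T>(value: T): any => JSON.parse(JSON.stringify(value));

const plain = roundTrip({ sku: 'A1', qty: 2, tags: ['sale'] }); // structure preserved
const date = roundTrip(new Date(0)); // becomes an ISO string, no longer a Date
```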

Here's a sketch of how it would look:

```ts
export interface CacheStrategy extends InjectableStrategy {
  get<T>(key: string): Promise<T | undefined>;

  add<T>(key: string, value: JsonCompatible<T>, options: { ttl?: number }): Promise<void>;

  delete(key: string): Promise<boolean>;

  // We could also include convenience methods to replicate the
  // functionality of the SelfRefreshingCache interface.
}
```
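
To make the shape concrete, here is a minimal in-memory implementation of that interface. The local stubs for InjectableStrategy and JsonCompatible, the optional options argument, and a get() that resolves to undefined on a miss are all assumptions of this sketch, not settled API:

```typescript
// Sketch only: local stand-ins for the real Vendure types so the example is self-contained.
interface InjectableStrategy {}
type JsonCompatible<T> = T;

interface CacheStrategy extends InjectableStrategy {
  get<T>(key: string): Promise<T | undefined>;
  add<T>(key: string, value: JsonCompatible<T>, options?: { ttl?: number }): Promise<void>;
  delete(key: string): Promise<boolean>;
}

// Minimal in-memory strategy: stores each value with an optional absolute expiry
// timestamp (ttl in ms) and evicts lazily on read.
class InMemoryCacheStrategy implements CacheStrategy {
  private store = new Map<string, { value: unknown; expiresAt?: number }>();

  async get<T>(key: string): Promise<T | undefined> {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt !== undefined && entry.expiresAt <= Date.now()) {
      this.store.delete(key); // TTL elapsed: evict and report a miss
      return undefined;
    }
    return entry.value as T;
  }

  async add<T>(key: string, value: JsonCompatible<T>, options?: { ttl?: number }): Promise<void> {
    this.store.set(key, {
      value,
      expiresAt: options?.ttl !== undefined ? Date.now() + options.ttl : undefined,
    });
  }

  async delete(key: string): Promise<boolean> {
    return this.store.delete(key);
  }
}
```

The shared SQL or Redis variants would implement the same interface, differing only in where the entries live.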

Backward Compatibility

The implementation of CacheStrategy needs to be done in a backward-compatible way, so no changes are needed by the user when upgrading.

  • The createSelfRefreshingCache() function and TtlCache class would be deprecated, and internally their usage would be replaced with CacheStrategy
  • By default we will use an InMemoryCacheStrategy which duplicates the current behaviour.
  • We will also ship an SqlCacheStrategy which stores the cache in a key-value table in the database, using a JSON sql type for the value.
  • We will provide a reference Redis-based strategy for those who want to use Redis.
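
To illustrate how the deprecated createSelfRefreshingCache() / TtlCache call sites could map onto the new strategy, here is a hypothetical get-or-compute helper. The helper name getOrCompute and the interface stub are assumptions of this sketch, not a proposed core API:

```typescript
// Stub of the proposed CacheStrategy shape, so this example is self-contained.
interface CacheStrategy {
  get<T>(key: string): Promise<T | undefined>;
  add<T>(key: string, value: T, options?: { ttl?: number }): Promise<void>;
  delete(key: string): Promise<boolean>;
}

// Hypothetical helper: return the cached value if present, otherwise compute,
// cache with the given TTL, and return it.
async function getOrCompute<T>(
  cache: CacheStrategy,
  key: string,
  ttl: number,
  compute: () => Promise<T>,
): Promise<T> {
  const cached = await cache.get<T>(key);
  if (cached !== undefined) return cached;
  const value = await compute();
  await cache.add(key, value, { ttl });
  return value;
}
```

With a shared backing store (SQL, Redis), compute() runs roughly once per TTL window across all instances, rather than once per instance as with the current in-memory caches.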

Summary

This proposal has the following benefits:

  • Consolidates existing caching mechanisms into a single concept
  • Makes it work across horizontally-scaled deployments
  • Enables correct cache invalidation and therefore much more efficient & correct caching
  • Is configurable following our well-used strategy pattern
  • Is 100% backward-compatible
@michaelbromley michaelbromley changed the title Proposal: Introduce a SystemCacheStrategy for multi-instance data caching Proposal: Introduce a CacheStrategy for multi-instance data caching Sep 6, 2024
@dlhck
Collaborator

dlhck commented Sep 6, 2024

We should add support for cache tags. In many scenarios you want to delete cache items for a certain namespace, e.g. delete all cached values for a product or a zone. Pimcore has a neat implementation that we can take some inspiration from: https://pimcore.com/docs/platform/Pimcore/Development_Tools_and_Details/Cache/#overview-of-functionalities

I would also recommend that we take a look at the caching architecture of Symfony as it is a really sophisticated one: https://symfony.com/doc/current/components/cache.html#generic-caching-psr-6

@michaelbromley michaelbromley pinned this issue Sep 10, 2024
michaelbromley added a commit that referenced this issue Sep 10, 2024
Relates to #3043

BREAKING CHANGE: If you are using the `FacetValueChecker` utility class, you should
update your code to get it via the `Injector` rather than directly instantiating it.

Existing code _will_ still work without changes, but by updating you will see improved
performance due to new caching techniques.

```diff
- facetValueChecker = new FacetValueChecker(injector.get(TransactionalConnection));
+ facetValueChecker = injector.get(FacetValueChecker);
```
michaelbromley added a commit that referenced this issue Sep 10, 2024
Relates to #3043. This plugin implements a simple SQL cache strategy to store cache items
in the main database. The implementation needs further testing and potential
performance optimization.
@dlhck dlhck moved this to 📦 Backlog in Vendure OS Roadmap Sep 23, 2024
@dlhck dlhck moved this from 📦 Backlog to 📅 Planned in Vendure OS Roadmap Sep 24, 2024
@dlhck dlhck added this to the v3.1 milestone Sep 24, 2024
@dlhck dlhck added the design 📐 This issue deals with high-level design of a feature label Sep 24, 2024
@michaelbromley
Member Author

michaelbromley commented Sep 24, 2024

Cache Tags

Tags are a mechanism of grouping cache items in order to make it possible to invalidate all items based on tags.

Prior Art

Symfony

https://symfony.com/doc/current/components/cache/cache_invalidation.html#using-cache-tags

```php
// invalidate all items related to `tag_1` or `tag_3`
$cache->invalidateTags(['tag_1', 'tag_3']);

// if you know the cache key, you can also delete the item directly
$cache->delete('cache_key');
```

In the Symfony (& PSR-6 in general) implementation, cache items are wrapped in a CacheItem class, which also allows tags to be set on the item:

```php
// add one or more tags
$item->tag('tag_1');
$item->tag(['tag_2', 'tag_3']);
```

Laravel

https://laravel.com/docs/11.x/cache

Laravel had a tags implementation, but it was recently removed (at least from the documentation).

It looks like their use of tags was badly designed: you could only invalidate by tags when the array of tags exactly matched (explanation).

Drupal

https://www.drupal.org/docs/drupal-apis/cache-api/cache-tags

Any cache backend should implement CacheBackendInterface, so when you set a cache item with the ::set() method, provide the third and fourth arguments, e.g.:

```php
$cache_backend->set(
  $cid, $data, Cache::PERMANENT, ['node:5', 'user:7']
);
```

This stores a cache item with ID $cid permanently (i.e., stored indefinitely), but makes it susceptible to invalidation through either the node:5 or user:7 cache tags.

Redis-tag-cache

A package from Max Stoiber that implements a very simple (single-file) Redis cache with tags. We can use it as inspiration for our Redis version.

https://www.npmjs.com/package/redis-tag-cache

This implements the solution given in this SO answer: keep a separate set of keys for each tag, then read it back with SMEMBERS at invalidation time: https://stackoverflow.com/a/40649819/772859

Implementation

The consensus design is that any cache item can be tagged with one or more string tags; invalidating a tag then invalidates every entry that carries it.

We need to have 3 concrete implementations:

  • in-memory (default)
  • database
  • redis

The common structure for tags is a separate data structure that maps each tag to the set of keys carrying that tag.

For the in-memory store, this can be a Map<string, Set<string>> - a map with the tag as the key, and a set of corresponding cache keys as the value.
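
A sketch of that in-memory tag index alongside the main key/value store, with a corresponding invalidateTags() operation (class and method names here are assumptions, not a settled API):

```typescript
// Sketch: tag -> set of cache keys, maintained alongside the key -> value store.
class InMemoryTagIndex {
  private cache = new Map<string, unknown>();
  private tagIndex = new Map<string, Set<string>>();

  set(key: string, value: unknown, tags: string[] = []): void {
    this.cache.set(key, value);
    for (const tag of tags) {
      let keys = this.tagIndex.get(tag);
      if (!keys) {
        keys = new Set<string>();
        this.tagIndex.set(tag, keys);
      }
      keys.add(key);
    }
  }

  get(key: string): unknown {
    return this.cache.get(key);
  }

  // Invalidate every cache entry carrying any of the given tags.
  invalidateTags(tags: string[]): void {
    for (const tag of tags) {
      const keys = this.tagIndex.get(tag);
      if (!keys) continue;
      for (const key of keys) {
        this.cache.delete(key);
      }
      this.tagIndex.delete(tag);
    }
  }
}
```

The Redis variant follows the same shape: SADD the cache key into a per-tag set on write, then SMEMBERS + DEL on invalidation, as in the redis-tag-cache package above.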

The redis implementation is similar, and can be seen in the Redis-tag-cache package above.

For the database store, we would need a separate table to store entries associating a tag with a single cache key:

```sql
CREATE TABLE cache_tags (
  id SERIAL PRIMARY KEY,
  tag VARCHAR(255) NOT NULL,        -- Tag name
  cache_key VARCHAR(255) NOT NULL,  -- Corresponding cache key
  FOREIGN KEY (cache_key) REFERENCES cache_items(cache_key) ON DELETE CASCADE
);
```

@dlhck dlhck changed the title Proposal: Introduce a CacheStrategy for multi-instance data caching Introduce a CacheStrategy for multi-instance data caching Sep 27, 2024
michaelbromley added a commit that referenced this issue Oct 29, 2024
michaelbromley added a commit that referenced this issue Oct 29, 2024
michaelbromley added a commit that referenced this issue Oct 30, 2024
Relates to #3043. This commit introduces a new DefaultSessionCacheStrategy, which
delegates to the underlying CacheStrategy. This means that now you only need to
think about the CacheStrategy, and the session cache will use whatever mechanism
is defined there.
michaelbromley added a commit that referenced this issue Oct 31, 2024
@michaelbromley michaelbromley moved this from 📅 Planned to 💯 Ready in Vendure OS Roadmap Oct 31, 2024
@michaelbromley michaelbromley self-assigned this Oct 31, 2024
@michaelbromley michaelbromley unpinned this issue Oct 31, 2024