
[Tiered Caching] Performance improvement for TieredCaching  #13989

@sgup432

Description


Is your feature request related to a problem? Please describe

The current implementation of TieredCaching.computeIfAbsent() performs poorly, as discovered while running performance benchmarks. Before we delve into the root cause, let me elaborate on the current implementation.

Current implementation:

As of now, the TieredCaching implementation has an on-heap and a disk tier, each of which is individually thread-safe. But we need to ensure that the TieredCache as a whole is thread-safe and doesn't allow any key to be cached onto both tiers.

So we implemented a single global read/write lock(link), which ensures that any write operation on the TieredCache is performed by only one thread at a time, while allowing multiple readers to access the cache via the read lock.

Performance issues with the current approach:

  1. Since we have a single read/write lock, the TieredCache effectively becomes a single-segment cache and loses the write-throughput benefits of a segmented cache (unlike the underlying on-heap cache, which has 256 segments). As a result, only one thread can write to the cache at a time, which hurts both write and read throughput.
  2. On a cache miss, we recompute the query under the write lock(link), which makes other threads wait and degrades performance significantly.
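To illustrate issue (2), here is a minimal sketch of the problematic pattern (class and field names are hypothetical; the actual OpenSearch code differs):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical illustration of a tiered cache guarded by one global
// read/write lock. On a miss, the expensive loader runs while the write
// lock is held, so every other reader and writer blocks for the full
// duration of the query recompute.
public class GloballyLockedCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public V computeIfAbsent(K key, Function<K, V> loader) {
        lock.readLock().lock();
        try {
            V v = cache.get(key);
            if (v != null) return v;
        } finally {
            lock.readLock().unlock();
        }
        lock.writeLock().lock();
        try {
            // The recompute happens here, under the write lock;
            // this is the source of issue (2) above.
            return cache.computeIfAbsent(key, loader);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```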

Describe the solution you'd like

To address above performance issues:

1. Firstly, we need to move the query recompute (on a cache miss) outside the write lock in computeIfAbsent.

This can probably be achieved as below (reusing the logic in Cache.java):

  • Create a concurrent hashmap of Map<Key, CompletableFuture>. This will temporarily hold in-flight queries for a short duration.
  • Multiple requests for the same key A land in computeIfAbsent while key A is not yet present in the cache.
  • Only one thread succeeds in putting a future object into the map for key A; the rest reuse the same future.
  • The thread that succeeded in putting the future into the map loads the query. The rest of the threads wait via future.get().
  • Once the query is loaded, that thread puts the item onto the on-heap cache via a put call, and the remaining threads return the value.
  • Finally, remove the key from the map.
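The steps above can be sketched as follows (a minimal, hypothetical version; the real Cache.java handles listeners, eviction, and exceptions more carefully):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the proposed scheme: a ConcurrentHashMap temporarily holds an
// in-flight CompletableFuture per key, so only one thread loads a missing
// value while the rest wait on the same future. No global lock is held
// during the recompute.
public class FutureBackedCache<K, V> {
    private final Map<K, V> onHeap = new ConcurrentHashMap<>();           // stand-in for the on-heap tier
    private final Map<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    public V computeIfAbsent(K key, Function<K, V> loader) throws Exception {
        V cached = onHeap.get(key);
        if (cached != null) {
            return cached;
        }
        CompletableFuture<V> future = new CompletableFuture<>();
        // Only one thread wins the race to install the future for this key.
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, future);
        if (existing != null) {
            // Another thread is already loading; wait for its result.
            return existing.get();
        }
        try {
            V value = loader.apply(key);   // recompute happens outside any global lock
            onHeap.put(key, value);        // put the item onto the on-heap tier
            future.complete(value);
            return value;
        } catch (Exception e) {
            future.completeExceptionally(e);
            throw e;
        } finally {
            inFlight.remove(key);          // clean up the temporary entry
        }
    }
}
```

This gives per-key deduplication of loads: concurrent callers for key A pay for one recompute, while callers for other keys proceed independently.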

2. Avoid using a single read/write lock

Possible solutions:

  • Segmented cache: Instead of using a single read/write lock and a single segment, we can use a segmented tiered cache to improve read/write performance.
  • Lock per key: Since the lock in TieredCache exists to ensure that the same key never lands in both caches, we could create a lock per key and delete it once the request completes.
    • There might be some overhead in creating locks frequently. (Will run performance tests.)
    • One pro of this approach is that we don't really need to move the query recompute out of the write lock, since requests for other keys will not be blocked.
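The lock-per-key idea could be sketched roughly as below (all names hypothetical; this is not the OpenSearch implementation):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of per-key locking: a lock is created on demand for
// each key, serializing tier movements for that key only, and removed
// afterwards so the map does not grow unboundedly.
public class KeyedLockManager<K> {
    private final ConcurrentMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void withLock(K key, Runnable criticalSection) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            criticalSection.run();
        } finally {
            lock.unlock();
            // Best-effort cleanup; a production version would need reference
            // counting so a lock is not removed while another thread is about
            // to acquire the same instance.
            locks.remove(key, lock);
        }
    }
}
```

Operations on different keys take different locks and never contend, which is why the recompute no longer needs to move out of the critical section for correctness of other keys.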

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

Metadata


Labels

Search:Performance, enhancement (Enhancement or improvement to existing feature or request), v2.18.0 (Issues and PRs related to version 2.18.0)


Status

✅ Done
