DOC-4037 Rework client-side caching content #530
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
--- | ||
categories: | ||
- docs | ||
- develop | ||
- stack | ||
- oss | ||
- rs | ||
- rc | ||
- kubernetes | ||
- clients | ||
description: Server-assisted, client-side caching in Redis | ||
linkTitle: Client-side caching | ||
title: Client-side caching introduction | ||
weight: 20 | ||
--- | ||
|
||
*Client-side caching* (CSC) is a technique to reduce network traffic between | ||
a Redis client and the server. This generally improves performance. | ||
See [Client-side caching compatibility with Redis Software and Redis Cloud]({{< relref "operate/rs/references/compatibility/client-side-caching" >}}) | ||
for details of the Redis versions that support CSC. | ||
|
||
By default, an [application server](https://en.wikipedia.org/wiki/Application_server) | ||
(which sits between the user app and the database) contacts the | ||
Redis database server through the client library for every read request. | ||
The diagram below shows the flow of communication from the user app, | ||
through the application server to the database and back again: | ||
|
||
{{< image filename="images/csc/CSCNoCache.drawio.svg" >}} | ||
|
||
When you use CSC, the client library | ||
maintains its own local cache of data items as it retrieves them | ||
from the database. When the same items are needed again, the client | ||
can satisfy the read requests from the cache instead of the database: | ||
|
||
{{< image filename="images/csc/CSCWithCache.drawio.svg" >}} | ||
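
The read path described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any Redis client library is implemented: `FakeServer` and `CachingClient` are hypothetical names, and the "server" is an in-memory dictionary that counts the reads it receives.

```python
class FakeServer:
    """Stands in for the Redis server (hypothetical) and counts reads."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)


class CachingClient:
    """Hypothetical client that keeps a local cache of the values it reads."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, key):
        # Serve repeated reads locally instead of contacting the server.
        if key in self.cache:
            return self.cache[key]
        value = self.server.get(key)
        self.cache[key] = value
        return value


server = FakeServer({"user:1": "Alice"})
client = CachingClient(server)

first = client.get("user:1")   # Cache miss: goes to the server.
second = client.get("user:1")  # Cache hit: no server round trip.
print(first, second, server.reads)
```

The second read returns the same value, but the server's read counter does not increase because the request never leaves the client.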
|
||
Accessing the cache is much faster than communicating with the database over the | ||
network, and it reduces network traffic. CSC also reduces | ||
the load on the database server, so you may be able to run it using fewer hardware | ||
resources. | ||
|
||
As with other forms of [caching](https://en.wikipedia.org/wiki/Cache_(computing)), | ||
CSC works well in the very common use case where a small subset of the data | ||
gets accessed much more frequently than the rest of the data (according | ||
to the [Pareto principle](https://en.wikipedia.org/wiki/Pareto_principle)). | ||
|
||
## Updating the cache when the data changes | ||
|
||
All caching systems must implement a scheme to update data in the cache | ||
when the corresponding data changes in the main database. Redis uses an | ||
approach called *tracking*. | ||
|
||
When CSC is enabled, the Redis server remembers or *tracks* the set of keys | ||
that each client connection has previously read. This includes cases where the client | ||
reads data directly, as with the [`GET`]({{< relref "/commands/get" >}}) | ||
command, and also where the server calculates values from the stored data, | ||
as with [`STRLEN`]({{< relref "/commands/strlen" >}}). When any client | ||
[Review comment: This paragraph seems a bit long. Maybe try to find a way to break it up?]
writes new data to a tracked key, the server sends an invalidation message | ||
to all clients that have accessed that key previously. This message warns | ||
the clients that their cached copies of the data are no longer valid and the clients | ||
will evict the stale data in response. Next time a client reads from | ||
the same key, it will access the database directly and refresh its cache | ||
with the updated data. | ||
|
||
The sequence diagram below shows how two clients might interact as they | ||
access and update the same key: | ||
|
||
{{< image filename="images/csc/CSCSeqDiagram.drawio.svg" >}} | ||
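
The tracking scheme described in this section can be simulated without a real server. The sketch below is an assumption-laden model, not the Redis protocol: `TrackingServer` and `Client` are hypothetical classes, and invalidation messages are modelled as direct method calls rather than network pushes.

```python
from collections import defaultdict


class TrackingServer:
    """Hypothetical model of a server that tracks which clients read each key."""

    def __init__(self):
        self.data = {}
        self.tracked = defaultdict(set)  # key -> clients that have read it

    def read(self, client, key):
        self.tracked[key].add(client)
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value
        # Notify every client that previously read this key, then forget them.
        for client in self.tracked.pop(key, set()):
            client.invalidate(key)


class Client:
    """Hypothetical client with a local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.server.read(self, key)
        return self.cache[key]

    def set(self, key, value):
        self.server.write(key, value)

    def invalidate(self, key):
        # Evict the stale copy; the next read goes back to the server.
        self.cache.pop(key, None)


server = TrackingServer()
client_a, client_b = Client(server), Client(server)

client_a.set("greeting", "hello")
before = client_a.get("greeting")        # Cached by client A, tracked by server.
client_b.set("greeting", "goodbye")      # Server invalidates A's cached copy.
evicted = "greeting" not in client_a.cache
after = client_a.get("greeting")         # A re-reads and refreshes its cache.
print(before, evicted, after)
```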
|
||
## Which commands can cache data? | ||
|
||
All read-only commands (with the `@read` | ||
[ACL category]({{< relref "/operate/oss_and_stack/management/security/acl" >}})) | ||
will use cached data, except for the following: | ||
|
||
- Any commands for | ||
[Review comment: This would be easier to read if you indent the explanations under the bullet point for the command type.]
[probabilistic data types]({{< relref "/develop/data-types/probabilistic" >}}). | ||
These types are designed to be updated frequently, which means that caching | ||
them gives little or no benefit. | ||
- Non-deterministic commands such as [`HGETALL`]({{< relref "/commands/hgetall" >}}), | ||
[`HSCAN`]({{< relref "/commands/hscan" >}}), | ||
and [`ZRANDMEMBER`]({{< relref "/commands/zrandmember" >}}). By design, these commands | ||
[Review comment: I think we can mention HGETALL, being a popular one. @uglide can you validate this?]
give different results each time they are called. | ||
- Search and query commands (with the `FT.*` prefix), such as | ||
[`FT.SEARCH`]({{< baseurl >}}/commands/ft.search). | ||
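
As a rough illustration of the rules above, the following sketch decides whether a command's result would be eligible for caching. It is not taken from any client library. The function name `is_cacheable` is hypothetical, the caller is assumed to know already whether the command is in the `@read` category, and the prefix list for probabilistic commands is an illustrative assumption that may be incomplete.

```python
# Assumed, illustrative prefixes for probabilistic data type commands.
PROBABILISTIC_PREFIXES = ("BF.", "CF.", "CMS.", "TOPK.", "TDIGEST.")

# The non-deterministic commands named in the list above.
NON_DETERMINISTIC = {"HGETALL", "HSCAN", "ZRANDMEMBER"}


def is_cacheable(command, read_only):
    """Return True if a command's response may be served from the cache.

    `read_only` says whether the command is in the `@read` ACL category;
    this sketch does not look that up itself.
    """
    name = command.upper()
    if not read_only:
        return False
    if name.startswith("FT."):  # Search and query commands.
        return False
    if name.startswith(PROBABILISTIC_PREFIXES):
        return False
    return name not in NON_DETERMINISTIC


print(is_cacheable("GET", read_only=True))
print(is_cacheable("FT.SEARCH", read_only=True))
```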
|
||
You can use the [`MONITOR`]({{< relref "/commands/monitor" >}}) command to | ||
check the server's behavior when you are using CSC. Because `MONITOR` only | ||
reports activity from the server, you should find that the first cacheable | ||
access to a key causes a response from the server. However, subsequent | ||
accesses are satisfied by the cache, and so `MONITOR` should report no | ||
server activity if CSC is working correctly. | ||
|
||
## What data gets cached for a command? | ||
|
||
Broadly speaking, the data from the *specific response* to a command invocation | ||
gets cached after it is used for the first time. Subsets of that data | ||
or values calculated from it are retrieved from the server as usual and | ||
then cached separately. For example: | ||
|
||
- The whole string retrieved by [`GET`]({{< relref "/commands/get" >}}) | ||
is added to the cache. Parts of the same string retrieved by | ||
[`SUBSTR`]({{< relref "/commands/substr" >}}) are calculated on the | ||
server the first time and then cached separately from the original | ||
string. | ||
- Using [`GETBIT`]({{< relref "/commands/getbit" >}}) or | ||
[`BITFIELD`]({{< relref "/commands/bitfield" >}}) on a string | ||
caches the returned values separately from the original string. | ||
- For composite data types accessed by keys | ||
([hash]({{< relref "/develop/data-types/hashes" >}}), | ||
[JSON]({{< relref "/develop/data-types/json" >}}), | ||
[set]({{< relref "/develop/data-types/sets" >}}), and | ||
[sorted set]({{< relref "/develop/data-types/sorted-sets" >}})), | ||
the whole object is cached separately from the individual fields. | ||
So the results of `JSON.GET mykey $` and `JSON.GET mykey $.myfield` create | ||
separate entries in the cache. | ||
- Ranges from [lists]({{< relref "/develop/data-types/lists" >}}), | ||
[streams]({{< relref "/develop/data-types/streams" >}}), | ||
and [sorted sets]({{< relref "/develop/data-types/sorted-sets" >}}) | ||
are cached separately from the object they form a part of. Likewise, | ||
subsets returned by [`SINTER`]({{< relref "/commands/sinter" >}}) and | ||
[`SDIFF`]({{< relref "/commands/sdiff" >}}) create separate cache entries. | ||
- For multi-key read commands such as [`MGET`]({{< relref "/commands/mget" >}}), | ||
the ordering of the keys is significant. For example, `MGET name:1 name:2` is | ||
cached separately from `MGET name:2 name:1` because the server returns the | ||
values in the order you specify. | ||
- Boolean or numeric values calculated from data types (for example | ||
[`SISMEMBER`]({{< relref "/commands/sismember" >}}) and | ||
[`LLEN`]({{< relref "/commands/llen" >}})) are cached separately from the | ||
object they refer to. | ||
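
One way to picture the behavior in the list above is a cache whose entries are keyed by the whole command invocation rather than by the Redis key alone. The sketch below assumes that design; the source does not specify the internal layout that client libraries use. `CommandCache` is a hypothetical class and, for brevity, it treats every argument as a key name.

```python
from collections import defaultdict


class CommandCache:
    """Hypothetical cache keyed by (command, *keys), with per-key invalidation."""

    def __init__(self):
        self.entries = {}               # (command, *keys) -> cached response
        self.by_key = defaultdict(set)  # Redis key -> entries that depend on it

    def put(self, command, keys, response):
        entry = (command, *keys)
        self.entries[entry] = response
        for key in keys:
            self.by_key[key].add(entry)

    def get(self, command, keys):
        return self.entries.get((command, *keys))

    def invalidate(self, key):
        # Drop every cached response that was computed from this key.
        for entry in self.by_key.pop(key, set()):
            self.entries.pop(entry, None)


cache = CommandCache()
cache.put("MGET", ["name:1", "name:2"], ["Ann", "Bob"])
cache.put("MGET", ["name:2", "name:1"], ["Bob", "Ann"])  # Separate entry.
cache.put("GET", ["name:1"], "Ann")
cache.put("GET", ["name:2"], "Bob")
entries_before = len(cache.entries)

cache.invalidate("name:1")
entries_after = len(cache.entries)
print(entries_before, entries_after)
```

In this model, invalidating `name:1` removes both `MGET` entries and the `GET name:1` entry, while `GET name:2` stays cached.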
|
||
## Usage recommendations | ||
|
||
Like any caching system, CSC has some limitations: | ||
|
||
- The cache has only a limited amount of memory available. When the limit | ||
is reached, the client must *evict* potentially useful items from the | ||
cache to make room for new ones. | ||
- Cache misses, tracking, and invalidation messages always add a slight | ||
performance penalty. | ||
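
To make the first limitation concrete, here is a small sketch of a size-limited cache that evicts an item when it is full. The eviction policy shown is least recently used (LRU), which is an assumption chosen for illustration; the text above does not say which policy a given client library applies. `LRUCache` is a hypothetical class.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical cache that holds at most `max_items` entries."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # Mark as most recently used.
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.max_items:
            self.items.popitem(last=False)  # Evict the least recently used item.


cache = LRUCache(max_items=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recently used than "b".
cache.put("c", 3)  # Cache is full, so "b" is evicted.
remaining = list(cache.items)
print(remaining)
```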
|
||
Below are some guidelines to help you use CSC efficiently, within these | ||
limitations: | ||
|
||
- **Use a separate connection for data that is not cache-friendly**: | ||
Caching gives the most benefit | ||
for keys that are read frequently and updated infrequently. However, you | ||
may also have data such as counters and scoreboards that receive frequent | ||
updates. In cases like this, the performance overhead of the invalidation | ||
messages can be greater than the savings made by caching. Avoid this problem | ||
by using a separate connection *without* CSC for any data that is | ||
not cache-friendly. | ||
- **Estimate how many items you can cache**: The client libraries let you | ||
specify the maximum number of items you want to hold in the cache. You | ||
can calculate an estimate for this number by dividing the | ||
maximum desired size of the | ||
cache in memory by the average size of the items you want to store | ||
(use the [`MEMORY USAGE`]({{< relref "/commands/memory-usage" >}}) | ||
command to get the memory footprint of a key). For example, if you have | ||
10 MB (10485760 bytes) available for the cache, and the average | ||
size of an item is 80 bytes, you can fit approximately | ||
10485760 / 80 = 131072 items in the cache. Monitor memory usage | ||
on your server with a realistic test load to adjust your estimate | ||
up or down. | ||
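
The estimate in the last guideline is a single division, shown here as a small helper. The function name `estimate_max_items` is hypothetical, and the result is only a starting point because it ignores any per-entry overhead that the client's cache adds.

```python
def estimate_max_items(cache_bytes, average_item_bytes):
    """Estimate how many items fit in a cache of the given size."""
    if average_item_bytes <= 0:
        raise ValueError("average_item_bytes must be positive")
    return cache_bytes // average_item_bytes


cache_bytes = 10 * 1024 * 1024  # 10485760 bytes, as in the example above.
max_items = estimate_max_items(cache_bytes, 80)
print(max_items)
```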
|
||
## Reference | ||
|
||
The Redis server implements extra features for CSC that are not used by | ||
the main Redis clients, but may be useful for custom clients and other | ||
advanced applications. See | ||
[Client-side caching reference]({{< relref "/develop/reference/client-side-caching" >}}) | ||
for a full technical guide to all the options available for CSC. |
Review comment: Do you envision a quick start with the tabbed examples in this file?
Review comment: @mortensi I wanted to check exactly what you want in the tabbed samples. Are you thinking of stuff from the Google doc like:
...or are there other use cases you want to have samples for? I know the Google doc has quite a few examples for Python and Jedis but I'm not sure if we need them in the doc page. Maybe we could have a concrete example at the bottom of some of the entries in this list for extra clarity? I'm not sure that having tabbed samples for these in each language is necessary either, because the samples don't actually do anything useful in themselves, so people wouldn't normally copy/paste them.
However, if we do decide that tabbed samples are a good thing here then I'm confident we can get them ready in good time for the release.