Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redis: use atomic operations everywhere #37758

Merged
merged 3 commits into from
May 19, 2023
Merged

redis: use atomic operations everywhere #37758

merged 3 commits into from
May 19, 2023

Conversation

pulsejet
Copy link
Member

@pulsejet pulsejet commented Apr 16, 2023

This removes a lot of acrobatics in the code and does each operation
atomically using a lua script. This also reduces several round trips
to the server, and the scripts are compiled and cached server-side.

Notably, since all operations work only on a single key (except clear,
which is broken anyway and shouldn't be used), they will continue to
function and be atomic for Redis cluster.

EDIT: I did some simple benchmark with a single client process. Looks like locking is approximately 2x faster (probably even better with multiple clients). This is with Redis on the same host, so gains will be better when latency is higher between the servers.

<?php
        $lock = \OC::$server->get(\OCP\Lock\ILockingProvider::class);

        $t1 = microtime(true);
        for($i = 0; $i < 20000; $i++) {
            $lock->acquireLock('benchmark', \OCP\Lock\ILockingProvider::LOCK_EXCLUSIVE);
            $lock->releaseLock('benchmark', \OCP\Lock\ILockingProvider::LOCK_EXCLUSIVE);
        }
        $t2 = microtime(true);
        $output->writeln("Total: " . ($t2 - $t1) . "s");
Before: 5.590616941452s
After: 2.6423881053925s

On a side note, using DB for locking is a joke:

MySQL: 232.49977707863s

@pulsejet pulsejet added 3. to review Waiting for reviews feature: caching Related to our caching system: scssCacher, jsCombiner... performance 🚀 labels Apr 16, 2023
@szaimen szaimen added this to the Nextcloud 27 milestone Apr 16, 2023
@szaimen szaimen requested review from a team, ArtificialOwl and icewind1991 and removed request for szaimen and a team April 16, 2023 20:39
@@ -54,18 +54,19 @@ public function getCache() {

public function get($key) {
$result = $this->getCache()->get($this->getPrefix() . $key);
if ($result === false && !$this->getCache()->exists($this->getPrefix() . $key)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is useless. Redis does not support boolean types, so getting a false means it did not exist.

@@ -82,6 +83,7 @@ public function remove($key) {
}

public function clear($prefix = '') {
// TODO: this is slow and would fail with Redis cluster
$prefix = $this->getPrefix() . $prefix . '*';
$keys = $this->getCache()->keys($prefix);
$deleted = $this->getCache()->del($keys);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail for redis cluster and a large number of keys. The solution is to replace this with a lua script and run it on all masters.

->exec();
return $result !== false;
}
$this->getCache()->unwatch();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 statements here vs 1 now. Ditto for cad.


return $this->getCache()->eval(
'if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using unlink here is a premature optimization that's probably harmful. In fact, unless the values are huge, unlink internally calls del after checking the size, so we just did those checks for no reason...

This removes a lot of acrobatics in the code and does each operation
atomically using a lua script. This also reduces several round trips
to the server, and the scripts are compiled and cached server-side.

Notably, since all operations work only on a single key (except clear,
which is broken anyway and shouldn't be used), they will continue to
function and be atomic for Redis cluster.

Signed-off-by: Varun Patil <varunpatil@ucla.edu>
@pulsejet pulsejet force-pushed the redis-atomic branch 2 times, most recently from 3671f8d to 39e805f Compare April 17, 2023 03:19
Signed-off-by: Robin Appelman <robin@icewind.nl>
@icewind1991
Copy link
Member

Added a unit test to validate the script hashes to make sure we don't forget to update them if we ever change the scripts

@pulsejet
Copy link
Member Author

Added a unit test to validate the script hashes to make sure we don't forget to update them if we ever change the scripts

Good point, thanks!

@skjnldsv skjnldsv mentioned this pull request May 3, 2023
@szaimen szaimen requested a review from a team May 5, 2023 11:00
@pulsejet
Copy link
Member Author

pulsejet commented May 8, 2023

Bump?

Copy link
Member

@CarlSchwan CarlSchwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty neat. I didn't knew that redis allowed to run lua script

@skjnldsv skjnldsv added this to the Nextcloud 28 milestone May 9, 2023
@pulsejet pulsejet requested a review from CarlSchwan May 9, 2023 21:26
Copy link
Member

@juliushaertl juliushaertl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, otherwise this seems like a nice improvement.

Is there any minimum version requirement for this on the redis side?

@pulsejet
Copy link
Member Author

Minor comments, otherwise this seems like a nice improvement.

Thanks, will fix these tomorrow.

Is there any minimum version requirement for this on the redis side?

Redis 2.6, released ~10 years ago.

Signed-off-by: Varun Patil <varunpatil@ucla.edu>
@pulsejet
Copy link
Member Author

CI failure unrelated

Copy link
Member

@juliushaertl juliushaertl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in for 28 👍

@szaimen szaimen disabled auto-merge May 19, 2023 21:29
@szaimen szaimen merged commit 3d2d1c1 into master May 19, 2023
@szaimen szaimen deleted the redis-atomic branch May 19, 2023 21:29
@ChristophWurst
Copy link
Member

Massive kudos for this patch @pulsejet. We ran into a production instance with incomprehensive file locking issues with a Redis cluster memcache backend. Sporadically unlocked files were still locked inside a single request. Only with this patch we could eliminate the issue.

@AndyScherzinger this is not only a performance improvement but fixes Redis cluster inconsistencies. Would this change be ok to backport?

@pulsejet
Copy link
Member Author

Great! A possible reason I see is that Redis::dec wasn't atomic earlier, maybe that led to a race. But transaction support with cluster is iffy to begin with anyway.

@AndyScherzinger
Copy link
Member

@ChristophWurst yes, backporting would be fine from my pov 👍

@ChristophWurst
Copy link
Member

/backport to stable27

@ChristophWurst
Copy link
Member

/backport to stable26

@solracsf
Copy link
Member

@ChristophWurst any specific reason why not backporting to 25 also?

@ChristophWurst
Copy link
Member

Wrongly got the impression 25 is EOL but it's only 24

@ChristophWurst
Copy link
Member

I think I'll backport the 26 backport because there are conflicts to resolve

@ChristophWurst
Copy link
Member

#38539 (comment)

@kesselb kesselb mentioned this pull request Jun 13, 2023
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review Waiting for reviews feature: caching Related to our caching system: scssCacher, jsCombiner... performance 🚀
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants