Cloud NDB - Get operation ignores transient locking failure in memcache leading to cache inconsistency #652

cloud-ndb v1.8.0 with Memcache global cache

The following sequence of steps can lead to cache inconsistency. It is caused by the reading thread failing to write a lock for transient reasons:

1. Reader gets from memcache and finds nothing
2. Writer writes lock value
3. Reader has transient failure when attempting to lock the key
4. Reader watches key
5. Reader reads from db
6. Writer updates db
7. Writer fails to delete lock from memcache for whatever reason (currently, most likely a connection reset)
8. Reader writes stale value using cas
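
The steps above can be replayed against a toy cache to show why the final cas succeeds. This is only a sketch: `ToyCache`, `LOCK`, and the dict used as the database are hypothetical stand-ins, not the cloud-ndb or memcache API, and the interleaving is written out sequentially rather than with real threads.

```python
class ToyCache:
    """Hypothetical in-memory stand-in for memcache (not the NDB API)."""

    def __init__(self):
        self.data = {}
        self.watched = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def watch(self, key):
        # Remember the value seen now; cas succeeds only if it is unchanged.
        self.watched[key] = self.data.get(key)

    def cas(self, key, value):
        if self.data.get(key) == self.watched.pop(key, object()):
            self.data[key] = value
            return True
        return False


LOCK = "__lock__"  # assumed lock marker, for illustration only
cache, db = ToyCache(), {"k": "old"}

assert cache.get("k") is None   # 1. reader gets from cache, finds nothing
cache.set("k", LOCK)            # 2. writer writes lock value
try:                            # 3. reader's lock attempt fails transiently...
    raise ConnectionResetError("transient")
except ConnectionResetError:
    pass                        #    ...and the failure is swallowed
cache.watch("k")                # 4. reader watches key (it sees the writer's lock)
stale = db["k"]                 # 5. reader reads the old value from the db
db["k"] = "new"                 # 6. writer updates the db
# 7. writer's delete of the lock fails, so the cache entry is unchanged
wrote = cache.cas("k", stale)   # 8. reader's cas succeeds with the stale value

print(wrote, cache.get("k"), db["k"])
```

Because the reader started watching after the writer's lock was already in place, and the lock was never removed, the watched value never changes and the cas has nothing to detect.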

The problem is that exceptions from transient failures in cache operations during reads are swallowed. For most of the calls this is fine, but for the lock call in `_datastore_api.lookup()` specifically, any exception needs to be treated as the key being locked, so that the reader does not attempt to update memcache with a new value.
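
One way to picture the requested behavior is the sketch below. It is not the actual `_datastore_api.lookup()` code: `lookup_sketch`, `try_lock`, `LOCK`, and the plain dicts standing in for memcache and the datastore are all hypothetical names chosen for illustration. The only point is that a failed lock attempt is handled like an already-locked key, so the cache write is skipped.

```python
LOCK = "__lock__"  # assumed lock marker, for illustration only


def lookup_sketch(key, cache, db, try_lock):
    """Read through the cache; only populate it if we hold the lock."""
    cached = cache.get(key)
    if cached is not None and cached != LOCK:
        return cached

    try:
        may_write_cache = try_lock(key)
    except Exception:
        # Treat any failure to lock as "key is locked by someone else".
        may_write_cache = False

    value = db[key]
    if may_write_cache:
        cache[key] = value
    return value


def failing_lock(key):
    # Simulates the transient failure described above.
    raise ConnectionResetError("transient cache failure")


def working_lock(key):
    return True


db = {"k": "value"}

cache_a = {}
result_a = lookup_sketch("k", cache_a, db, failing_lock)

cache_b = {}
result_b = lookup_sketch("k", cache_b, db, working_lock)

print(result_a, cache_a)
print(result_b, cache_b)
```

In both cases the caller still gets the value from the database; the difference is that the cache is left untouched when the lock attempt raised.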

