-
Notifications
You must be signed in to change notification settings - Fork 252
Description
Summary
When using some of the cache backends provided by django_prometheus (notably LocMemCache and FileBasedCache) together with Django's cache-based session engine, login and other session operations can fail with:
django.contrib.sessions.backends.base.UpdateErroranddjango.contrib.sessions.exceptions.SessionInterrupted.
Root cause: their get() implementations return default whenever the cached value is falsy (e.g. an empty dict), which breaks Django's assumption that cache.get() returns default only when the key is missing. Django's cache session backend relies on this assumption to decide whether to raise UpdateError.
This affects at least:
django-commons/django-prometheusversions:v2.3.1,v2.4.0,v2.4.1, andmaster(all have identicalLocMemCache.get()andFileBasedCache.get()implementations).- Django: confirmed with
5.2.8, but the logic in Django's cache session backend is generic.
There are already PRs that address this problem:
Both PRs adjust the cache backends so that get() preserves Django's semantics and does not treat falsy-but-existing values as cache misses. As of this writing, both PRs are still open, and the current releases still exhibit the bug.
Symptoms / search keywords
For anyone finding this via search, some typical error messages and phrases that can be caused by this issue:
django.contrib.sessions.backends.base.UpdateErrordjango.contrib.sessions.exceptions.SessionInterruptedThe request's session was deleted before the request completed. The user may have logged out in a concurrent request, for example.HttpResponseBadRequest status_code=400together withSessionInterruptedin logsclient.force_login(user)failing whenSESSION_ENGINE = "django.contrib.sessions.backends.cache"- pytest / pytest-django tests failing only when using
django_prometheus.cache.backends.locmem.LocMemCacheorFileBasedCacheas the default cache
Example environment
One environment where this reproduces reliably:
- Python: 3.13.3
- Django: 5.2.8
- django-prometheus: 2.3.1 (also tested with 2.4.1)
- django-redis: 6.0.0
- Test runner: pytest + pytest-django + pytest-xdist (parallel), but the bug already appears in a single process.
Key settings:
CACHES = {
"default": {
"BACKEND": "django_prometheus.cache.backends.locmem.LocMemCache",
},
}
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"Minimal reproduction 1: pure cache behavior
This shows the behavioral difference between Django's LocMemCache and django_prometheus's LocMemCache when storing a falsy value (e.g. {}).
from django.core.cache.backends.locmem import LocMemCache as DjangoLocMemCache
from django_prometheus.cache.backends.locmem import LocMemCache as PrometheusLocMemCache
def demo():
params = {}
django_cache = DjangoLocMemCache("default", params)
prometheus_cache = PrometheusLocMemCache("default", params)
django_cache.set("k", {})
prometheus_cache.set("k", {})
print("Django LocMemCache:", django_cache.get("k")) # -> {}
print("Prometheus LocMemCache:", prometheus_cache.get("k")) # -> None (unexpected)
if __name__ == "__main__":
demo()Observed result:
- Django's
LocMemCache.get("k")returns the stored{}. - Prometheus
LocMemCache.get("k")returnsNone, even though the key exists and was set to{}.
This comes directly from the implementation in django_prometheus/cache/backends/locmem.py:
class LocMemCache(locmem.LocMemCache):
def get(self, key, default=None, version=None):
django_cache_get_total.labels(backend="locmem").inc()
cached = super().get(key, default=None, version=version)
if cached is not None:
django_cache_hits_total.labels(backend="locmem").inc()
else:
django_cache_misses_total.labels(backend="locmem").inc()
return cached or default # <-- problematicFor any falsy but valid value (e.g. 0, "", {}, []), cached or default will yield default, typically None, even though the key exists.
The same pattern exists in the file-based cache backend:
class FileBasedCache(filebased.FileBasedCache):
def get(self, key, default=None, version=None):
django_cache_get_total.labels(backend="filebased").inc()
cached = super().get(key, default=None, version=version)
if cached is not None:
django_cache_hits_total.labels(backend="filebased").inc()
else:
django_cache_misses_total.labels(backend="filebased").inc()
return cached or default # <-- same issueAn analogous reproduction for FileBasedCache can be done by replacing LocMemCache with FileBasedCache in the snippet above.
Minimal reproduction 2: session backend failure
Given the settings above (LocMemCache as default cache, cache-based sessions), this simple test fails:
from django.contrib.auth import get_user_model
from django.test import Client, TestCase
class SessionBugTest(TestCase):
def test_force_login_with_cache_session(self):
User = get_user_model()
user = User.objects.create_user(username="u", password="p")
client = Client()
# This call ends up raising django.contrib.sessions.backends.base.UpdateError
client.force_login(user)Observed trace (shortened):
File ".../django/test/client.py", line 883, in _login
request.session.save()
File ".../django/contrib/sessions/backends/cache.py", line 89, in save
raise UpdateError
django.contrib.sessions.backends.base.UpdateError
In a real Django app that uses SessionMiddleware, this can surface as:
django.contrib.sessions.exceptions.SessionInterrupted:
The request's session was deleted before the request completed.
and may result in 400 Bad Request responses instead of the expected status code.
Why this breaks Django's assumptions
Django's cache-based session backend (django.contrib.sessions.backends.cache.SessionStore) relies on cache.get() returning default only when the key does not exist:
# django/contrib/sessions/backends/cache.py
from django.contrib.sessions.backends.base import UpdateError
def save(self, must_create=False):
if self.session_key is None:
return self.create()
if must_create:
func = self._cache.add
elif self._cache.get(self.cache_key) is not None:
func = self._cache.set
else:
raise UpdateErrorWith Django's own LocMemCache, this is safe:
get(key)returns the stored value (regardless of truthiness) if the key exists and is not expired.get(key)returnsdefaultonly when the key is missing/expired.
With django_prometheus.cache.backends.locmem.LocMemCache, the semantics change:
- If a key exists but its value is falsy (e.g.
{}used internally by the session backend),get()returnsdefault(typicallyNone). SessionStore.save()then interprets this as "the session entry doesn't exist anymore" and raisesUpdateError.
This is why:
Client().force_login(user)starts failing, and- Requests with a session can raise
SessionInterruptedinSessionMiddleware.
Proposed fix (and relation to existing PRs)
The core requirement is:
Preserve Django's cache semantics:
cache.get(key, default)must returndefaultonly when the key does not exist.
The metrics code should not change the observable behavior of get().
Option A (minimal code change, aligned with existing PRs)
Change LocMemCache.get() to pass through default to the parent, and base the hit/miss distinction on whether the returned value is the default object:
from django.core.cache.backends import locmem
from django_prometheus.cache.metrics import (
django_cache_get_total,
django_cache_hits_total,
django_cache_misses_total,
)
class LocMemCache(locmem.LocMemCache):
def get(self, key, default=None, version=None):
django_cache_get_total.labels(backend="locmem").inc()
cached = super().get(key, default=default, version=version)
if cached is default:
# We received the default back -> treat as miss.
django_cache_misses_total.labels(backend="locmem").inc()
else:
# Any other value -> treat as hit.
django_cache_hits_total.labels(backend="locmem").inc()
return cachedTrade-offs:
- Behavior for callers is now identical to Django's
LocMemCache. - Metrics will treat "stored value exactly equal to
default" as a miss rather than a hit. In practice, using the samedefaultas a stored value is rare; fordefault=Nonethis means storedNonewill be counted as a miss in metrics, but the cache behavior remains correct.
Both PRs #284 and #444 implement essentially this strategy for:
LocMemCacheFileBasedCacheRedisCache/NativeRedisCache(the Redis backends already had the correct semantics regarding falsy values, but the PRs make the hit/miss accounting consistent and clean up logging).
Option B (replicate Django's LocMemCache logic)
Alternatively, LocMemCache.get() could inline the logic from Django's LocMemCache.get() (using _cache, _expire_info, _lock, _has_expired) and insert metrics around it. That would keep metrics perfectly accurate but would duplicate logic from Django.
Workaround (for confirmation)
As a local workaround to confirm the diagnosis, one can add a custom cache backend in a project that wraps Django's cache backends and keeps metrics, but restores the expected get() semantics (as in the snippets above). With such a wrapper, the previously failing session-related tests (e.g. using Client.force_login() with cache-based sessions) pass again, and a full parallel test run completes successfully. This strongly suggests that the problem is specifically the return cached or default pattern in the upstream get() implementations.