Synapse gets stuck after seeing psycopg2.errors.ReadOnlySqlTransaction: cannot execute nextval() in a read-only transaction
#16490
Comments
Related to #11473
As titled
The issue sounds like a duplicate of #11473, as Dirk says.
This is new to me. I don't think Synapse has any logic for opening read-only transactions. (Searching for

For that specific issue, I think the best we could do to help here is to explicitly create connections requesting a writeable session. There is some connection string syntax for this, which we could presumably invoke via psycopg2. It might be the case that raising an error at connection-creation time means the application is more likely to show signs of failure. But that still wouldn't be communicated via the healthcheck endpoint.

Being candid though, this seems like an unusual way to run Postgres (and a niche failure mode); I don't think we're going to prioritise supporting it.
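As a rough illustration of the "connection string syntax" mentioned above: libpq (which psycopg2 wraps) accepts a `target_session_attrs=read-write` connection parameter, which makes it reject a server whose session is read-only. The sketch below only builds the keyword/value connection string using the standard library; `build_conninfo`, the host name, database name, and user are hypothetical placeholders, not anything Synapse ships.

```python
def build_conninfo(**params: str) -> str:
    """Build a libpq keyword/value connection string (hypothetical helper)."""
    parts = []
    for key, value in params.items():
        # libpq allows any value to be single-quoted; backslashes and
        # single quotes inside the value must be backslash-escaped.
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        parts.append(f"{key}='{escaped}'")
    return " ".join(parts)


conninfo = build_conninfo(
    host="synapse-db",      # hypothetical Kubernetes service name
    dbname="synapse",       # hypothetical database name
    user="synapse_user",    # hypothetical role
    # Ask libpq to accept only a session that permits writes.
    target_session_attrs="read-write",
)
print(conninfo)
```

With psycopg2 installed, a string like this could presumably be passed as `psycopg2.connect(conninfo)`. Connecting to a standby would then fail at connection time instead of on the first `INSERT`, though that does nothing for connections that were opened before a failover.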
psycopg2.errors.ReadOnlySqlTransaction: cannot execute nextval() in a read-only transaction
Also: logs, please.
A random excerpt (this happens on basically any INSERT; I got 8h of this log :))
To give some context on what I believe happened:
I didn't verify whether this is what actually happened, but this is my theory based on how Patroni/postgres-operator does failover in Kubernetes. There is currently no bouncer in between; I am relying purely on Kubernetes services for failover routing. Not sure if it adds much context, but it's a kubeadm-based Kubernetes 1.28 cluster with Cilium as the network stack and postgres-operator for the database (it uses Patroni for failover).

Also, I guess that Patroni puts the standby into recovery mode? I'm not sure exactly how PG/Patroni does this, but from what I read online, read-only/standby mode is reflected as recovery mode to consumers. Please correct me if that's wrong.
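One way to test the theory above is to ask the server directly: PostgreSQL's `pg_is_in_recovery()` returns true on a standby. The sketch below assumes a DB-API connection (for example one from psycopg2); the function name `is_in_recovery` is a hypothetical helper, not part of Synapse.

```python
def is_in_recovery(conn) -> bool:
    """Return True if the server behind `conn` reports it is in recovery.

    `conn` is assumed to be any DB-API 2.0 connection to PostgreSQL.
    """
    cur = conn.cursor()
    try:
        # pg_is_in_recovery() is true on a standby, false on a primary.
        cur.execute("SELECT pg_is_in_recovery()")
        row = cur.fetchone()
    finally:
        cur.close()
    return bool(row[0])
```

Running this on one of the long-lived connections that keeps failing would show whether that connection is still attached to a node that has since become a standby.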
synapse/synapse/rest/health.py
Lines 29 to 31 in 166ffc0
Hi :)
It seems like the health endpoint is a little too simple. For example, with Patroni (a PG cluster), Synapse sometimes gets stuck in a read-only transaction state:
psycopg2.errors.ReadOnlySqlTransaction: cannot execute nextval() in a read-only transaction
Synapse never recovers from this without a restart, even when the database has already been healthy again for hours. Meanwhile, Kubernetes, which could easily have restarted Synapse by now, thinks the server is healthy while all API endpoints fail.

Would it be possible to include fatal errors from the database in the evaluation of the health endpoint? Or are there other suggestions on how to solve this for someone hosting a Synapse server?
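To make the request concrete, here is a minimal sketch of what "including fatal database errors in the health evaluation" could look like. It is not how Synapse's health endpoint works; `DbHealthTracker`, its methods, the response bodies, and the 60-second threshold are all hypothetical. The idea is that database code reports successes and fatal failures, and the health handler returns 503 once failures have persisted long enough for a Kubernetes liveness probe to act on.

```python
import time
from typing import Optional, Tuple


class DbHealthTracker:
    """Tracks how long the database has been failing (hypothetical helper)."""

    def __init__(self, unhealthy_after_seconds: float = 60.0) -> None:
        self._threshold = unhealthy_after_seconds
        self._first_failure: Optional[float] = None

    def record_success(self) -> None:
        # Any successful transaction clears the failure streak.
        self._first_failure = None

    def record_failure(self, now: Optional[float] = None) -> None:
        # Remember only the start of the current failure streak.
        if self._first_failure is None:
            self._first_failure = time.monotonic() if now is None else now

    def health_response(self, now: Optional[float] = None) -> Tuple[int, bytes]:
        """Return an (HTTP status, body) pair for a health endpoint."""
        if self._first_failure is None:
            return 200, b"OK"
        current = time.monotonic() if now is None else now
        if current - self._first_failure >= self._threshold:
            return 503, b"database unavailable"
        # Tolerate short blips such as a failover in progress.
        return 200, b"OK"
```

The grace period is a design choice: restarting on the first failed transaction would cause restarts during every brief failover, while a threshold only triggers when the process is stuck, as described above.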