Description
I'm using lib/pq v1.1.1, Go 1.12.5, Linux 4.19.45-1-lts.
I have a db handle on which I run one or more operations. I then restart the database server and wait until startup has finished. The next query fails if all previous TCP connections from the operating system to the database were closed in the meantime (`ss -na | grep 5432` shows nothing).
The operation fails with: `write tcp [::1]:45676->[::1]:5432: write: broken pipe`.
If another query is done after the failed one, it succeeds.
I expect the `db.Exec()` query to succeed after the PostgreSQL restart has finished, with the sql package or the pq driver retrying and reconnecting transparently in the background if needed.
If the PostgreSQL server restart happens quickly, while there are still TCP connections in `FIN-WAIT-2` or another state, the db operations after the restart succeed.
How to reproduce:
- run `docker run -p 5432:5432 postgres:latest`
- in another terminal, run https://gist.github.com/fho/777ccc77971612f3659cbdf5cef27ede and pass `postgresql://postgres@localhost?sslmode=disable` as the command-line argument
- press `q` to do a SQL query
- shut down the postgres server by pressing `ctrl + c` in its terminal
- check `watch -n 0.5 "sh -c 'ss -na | grep 5432'"` and wait until the TCP connections have vanished
- start postgres again: run `docker run -p 5432:5432 postgres:latest`
- press `q` in the terminal that runs the Go program to trigger another query => it fails
My first idea for a fix was to return `ErrBadConn` on broken pipe errors, but as discussed in #422, this has the issue that operations might be redone.
The mysql driver seems to solve it by having a custom error type to indicate retryable connection errors.
If a `Write()` on the TCP socket fails before a whole SQL statement has been sent, it is safe to retry the operation.
The caller of `conn.send()` could decide this and set the error to `ErrBadConn`.