Description
Hello,
I'm working on setting up a health check within our system, and have noticed that our Postgres implementation doesn't seem to handle connection timeouts, dropped network, or a db failover event very gracefully (at least in the way it works when using a multi-az deployment in Amazon RDS). The issue is very similar to one that was fixed in the mysql package a while back -- mysqljs/mysql#821.
We effectively have a class that looks similar to this:
const Promise = require('bluebird');
const pg = require('pg');
class PostgresSQLInterface extends Service {
constructor(dbConnectionOptions, logger) {
const defaults = {
host: 'localhost',
port: '5432',
user: '',
password: null,
max: 10, // Poolsize
min: 10,
keepAlive: true,
idleTimeoutMillis: 10000
};
this.state = {
config: _.merge({}, defaults, dbConnectionOptions)
};
this.createDatabasePool();
}
createDatabasePool() {
this.state.pool = new pg.Pool(this.state.config);
this.state.pool.on('connect', () => {
this.state.healthy = true;
console.log('HEALTHY');
});
this.state.pool.on('error', () => {
this.state.healthy = false;
console.log('UNHEALTHY');
});
}
*getDatabaseConnection() {
try {
const connection = yield this.state.pool.connect();
return connection;
}
catch (error) {
throw new Errors.DatabaseError("Could not connect to Postgres", {
originalError: error
});
}
}
};
In a normal environment, when our application connects, I see it spit out "HEALTHY" x the number of connections it made, as expected. However, there are a few issues:
1.) If the connection is severed (I turn off my wireless, kill my VPN, or trigger a reboot on the RDS instance), no error events are raised even though the documentation for pg-pool states an error is "Emitted whenever an idle client in the pool encounters an error. This is common when your PostgreSQL server shuts down, reboots, or a network partition otherwise causes it to become unavailable while your pool has connected clients." My expectation would be to see a string of "UNHEALTHY" statements.
2.) Similarly, if I initiate a failover in RDS, no error events are raised but I am unable to run queries. This may be due to how AWS handles the DNS entries around multi-az deployments, but I cannot test this while issue #1 remains unresolved.
3.) Calling getDatabaseConnection when no replies are received from postgres (sudo iptables -A INPUT -p tcp --source-port 5432 -j DROP, sudo iptables -F afterwards to restore) hangs on grabbing a connection, as there is no timeout setting.
Am I missing a setting in my configuration, or is this a situation that just hasn't come up yet for someone else? If it's the latter, I'd be more than happy to create a fix for the issue, but want to make sure I'm not overlooking something first.
Node version: 6.2.x
PG version: 6.0.2
Postgres version: 9.5.x
Thanks!