Skip to content

"timeout exceeded when trying to connect" spike after upgrading to 8.2.1 #2262

Open
@mriedem

Description

@mriedem

Yesterday we upgraded from pg@8.0.3 (with pg-pool@3.1.1) to pg@8.2.1 (with pg-pool@3.2.1), specifically from our package-lock.json:

    "pg": {
      "version": "8.2.1",
      "resolved": "https://registry.npmjs.org/pg/-/pg-8.2.1.tgz",
      "integrity": "sha512-DKzffhpkWRr9jx7vKxA+ur79KG+SKw+PdjMb1IRhMiKI9zqYUGczwFprqy+5Veh/DCcFs1Y6V8lRLN5I1DlleQ==",
      "requires": {
        "buffer-writer": "2.0.0",
        "packet-reader": "1.0.0",
        "pg-connection-string": "^2.2.3",
        "pg-pool": "^3.2.1",
        "pg-protocol": "^1.2.4",
        "pg-types": "^2.1.0",
        "pgpass": "1.x",
        "semver": "4.3.2"
      },
      "dependencies": {
        "pg-connection-string": {
          "version": "2.2.3",
          "resolved": "https://registry.npmjs.org/pg-connection-string/-/pg-connection-string-2.2.3.tgz",
          "integrity": "sha512-I/KCSQGmOrZx6sMHXkOs2MjddrYcqpza3Dtsy0AjIgBr/bZiPJRK9WhABXN1Uy1UDazRbi9gZEzO2sAhL5EqiQ=="
        },
        "semver": {
          "version": "4.3.2",
          "resolved": "https://registry.npmjs.org/semver/-/semver-4.3.2.tgz",
          "integrity": "sha1-x6BxWKgL7dBSNVt3DYLWZA+AO+c="
        }
      }
    }

We started seeing a spike in timeout exceeded when trying to connect errors with this stacktrace:

Error: timeout exceeded when trying to connect
    at Timeout._onTimeout (/usr/src/app/node_modules/pg-pool/index.js:188:27)
    at listOnTimeout (internal/timers.js:549:17)
    at processTimers (internal/timers.js:492:7)

This is a pretty basic express app with a postgres 12 backend running on node 12.

We report metrics on the connection pool max/total/idle/waiting count values and there is an obvious spike in the wait count from the time the 8.2.1 upgrade was deployed (around 9am CT yesterday) and then the drop when we reverted that change (about 6am CT today):

image

That corresponds with our API request/response/error rates (again, just a simple express app over a pg db):

image

We're not sure how to debug this. These are the relevant values we're using related to the Pool config:

  • connectionTimeoutMillis = 60000
  • idleTimeoutMillis = 60000
  • max = 150

We have a staging environment where this showed up as well but we didn't have an alert setup for it (we do now). So if there is something we can do to help debug this and provide information back we can probably do that in our staging environment.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions