Skip to content

Connections not being returned to pool on transaction failure #445

@AlexLaviolette

Description

@AlexLaviolette

Currently on play 2.6 with play-slick 3.0.1, I’ve been running into an issue during load testing where the app is consuming tons of db connections from the pool and not returning them. I have narrowed it down to the use of transactionally, but have still not been able to pin point the exact cause.

As proof of this, when I increase the db connection pool to 340 (our max_connections is 350), I can hit up to 28k rpm on one server. But with a pool of only 100, it almost instantly crashes under the same load with the connections timing out waiting for a connection from the pool. I can confirm that the database transactions are happening rapidly. When I enable hikari connection leak detection, I can see that some connections are taking > 60s to be returned, yet the database is showing nothing over 5 seconds. From everything I’ve read, we should be able to configure this to run with a much smaller pool. When I remove the use of transactionally, the behaviour of the pool returns to normal.

Here are some logs:
with pool of 100 and transactionally enabled

com.zaxxer.hikari.pool.HikariPool - db - Timeout failure stats (total=100, active=100, idle=0, waiting=33)

with pool of 340 and transactionally enabled

com.zaxxer.hikari.pool.HikariPool - db - After cleanup stats (total=340, active=244, idle=96, waiting=0)

Some errors with pool of 100 and transactionally enabled:

java.sql.SQLTransientConnectionException: db - Connection is not available, request timed out after 10000ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:548)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:145)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:83)
at slick.jdbc.hikaricp.HikariCPJdbcDataSource.createConnection(HikariCPJdbcDataSource.scala:14)
at slick.jdbc.JdbcBackend$BaseSession.(JdbcBackend.scala:438)
at slick.jdbc.JdbcBackend$DatabaseDef.createSession(JdbcBackend.scala:46)
at slick.jdbc.JdbcBackend$DatabaseDef.createSession(JdbcBackend.scala:37)
at slick.basic.BasicBackend$DatabaseDef.acquireSession(BasicBackend.scala:216)
at slick.basic.BasicBackend$DatabaseDef.acquireSession$(BasicBackend.scala:215)

NewRelic is also showing these requests spending the majority of their time waiting on com.zaxxer.hikari.HikariDataSource.getConnection()

Here is an example of the code that is causing the issue:

def _insertIfNotExists(partner: Partner, accountId: String): DBIO[Account] = {
    accounts
      .filter(_.account_id === accountId)
      .filter(_.partner === partner)
      .result.headOption.flatMap {
      case Some(account) => DBIO.successful(account) // no-op
      case _ => {
        accounts.returning(accounts.map(_.id)).into((account, id) => account.copy(id = Some(id))) += Account(None, partner, accountId)
      }
    }
  }

  def createWithDeviceIfNotExists(partner: Partner, accountId: String, deviceId: String): Future[Device] = {
    db.run(
      _insertIfNotExists(partner, accountId).flatMap { account =>
        devices._insertIfNotExists(deviceId, account.id.get) // this looks exactly like the _insertIfNotExists above
      }.transactionally
    )
  }

I am unable to reproduce this at low load or locally by manually causing the queries to fail. But something about using transactionally here is causing connections to not be returned to the pool for a very long time. There are reports that Previously reported leaked connection com.mysql.jdbc.JDBC4Connection@795acb1e was returned to the pool (unleaked) but often after several minutes.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions