Skip to content

Occasional "Connection reset" and 500 Internal Server Error #190

@earshinov

Description

@earshinov

Hi!

We occasionally have "Connection reset" errors in Convergence server logs and 500 Internal Server Error in response to a REST API call.

GET domains/convergence/default/models/ffreport-3605: 500 Server Error
b'{"body":{"error_code":"internal_server_error"},"ok":false}'

Convergence server logs:

INFO: Caught Network I/O errors on odb.pl-grt-orientdb:2424/7048553260737367023, trying an automatic reconnection... (error: Connection reset)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:186) ~[?:?]
    at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
    at java.io.BufferedInputStream.read(BufferedInputStream.java:271) ~[?:?]
    at java.io.DataInputStream.readByte(DataInputStream.java:270) ~[?:?]
    at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinary.readByte(OChannelBinary.java:75) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:176) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:160) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:1891) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.lambda$networkOperationRetryTimeout$2(OStorageRemote.java:248) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:307) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    ... 42 more
16:34:35 ERROR RealtimeModelActor - Unexpected error getting model
com.orientechnologies.common.io.OIOException: Connection reset
    at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:351) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.networkOperationRetryTimeout(OStorageRemote.java:228) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.networkOperationNoRetry(OStorageRemote.java:261) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:947) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentRemote.indexQuery(ODatabaseDocumentRemote.java:397) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.index.OIndexRemoteOneValue.get(OIndexRemoteOneValue.java:48) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.index.OIndexRemoteOneValue.get(OIndexRemoteOneValue.java:39) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.index.OIndexAbstractDelegate.get(OIndexAbstractDelegate.java:59) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:231) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:38) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
    at com.convergencelabs.convergence.server.backend.datastore.OrientDBUtil$.$anonfun$findDocumentFromSingleValueIndex$1(OrientDBUtil.scala:189) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at scala.util.Try$.apply(Try.scala:210) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at com.convergencelabs.convergence.server.backend.datastore.OrientDBUtil$.findDocumentFromSingleValueIndex(OrientDBUtil.scala:185) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore$.com$convergencelabs$convergence$server$backend$datastore$domain$model$ModelStore$$findModelDocument(ModelStore.scala:333) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore.$anonfun$getModel$1(ModelStore.scala:198) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.$anonfun$withDatabase$3(PooledDatabaseProvider.scala:58) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.util.TryWithResource.tryWithResolvedResource(TryWithResource.scala:50) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.util.TryWithResource.com$convergencelabs$convergence$server$util$TryWithResource$$tryWithResource(TryWithResource.scala:36) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.util.TryWithResource$.apply(TryWithResource.scala:78) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.$anonfun$withDatabase$1(PooledDatabaseProvider.scala:56) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at scala.util.Success.flatMap(Try.scala:258) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.withDatabase(PooledDatabaseProvider.scala:55) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.datastore.AbstractDatabasePersistence.withDb(AbstractDatabasePersistence.scala:32) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore.getModel(ModelStore.scala:196) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$3(RealtimeModelActor.scala:325) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$3$adapted(RealtimeModelActor.scala:323) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at scala.util.Success.flatMap(Try.scala:258) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$1(RealtimeModelActor.scala:323) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:671) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:430) ~[org.scala-lang.scala-library-2.13.2.jar:?]
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:56) ~[com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
    at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:93) ~[com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) [org.scala-lang.scala-library-2.13.2.jar:?]
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94) [org.scala-lang.scala-library-2.13.2.jar:?]
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:93) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:48) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) [?:?]
    at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) [?:?]
    at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) [?:?]
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) [?:?]
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) [?:?]

More details:

  • We observe no other connectivity issues with Convergence server or OrientDB.
  • There are no errors or warnings in OrientDB logs.
  • We have everything deployed to a Kubernetes cluster. Our installation has 1 instance of convergence-seed, 3 instances of convergence-server, and 1 instance of OrientDB.

Maybe you could give us some hints for what might go wrong and tips on what Convergence server settings to tweak. Since GET should be a safe method, should we implement retries on the side of our application that makes REST API calls?

Versions

  • Convergence Version: convergencelabs/convergence-server:1.0.0-rc.7
  • OS: Running in Docker

Describe the Bug

REST API client sometimes gets 500 Internal Server Error from Convergence.

Expected Behavior

If Convergence server has some kind of an OrientDB connection pool which might contain stale connections, it should reconnect when hitting connection reset.

Alternatively, if retries are considered a caller's duty, Convergence server should probably return 503 instead of 500 to signify that the client should make a retry.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions