
Kong under load keyspace configuration #543

Closed · opened Sep 15, 2015 · 5 comments

@rafael (Contributor) commented Sep 15, 2015

Hi all -

I'm trying to do some small load testing on Kong (0.4.2), but the moment I trigger some concurrent connections I start getting errors. I'm doing all of this testing inside a Kubernetes cluster on gcloud. The cluster consists of 4 n1-highcpu-4 instances (16 vCPUs, 14.4 GB memory in total). Inside the Kubernetes cluster I'm running a Cassandra (2.1.7) cluster with 4 containers/nodes, and also 4 Kong nodes. I'm not putting any limit on how many resources the Docker containers can take from the cluster.

Once I start doing something like this:

ab -n 10000 -c 20 http://104.197.45.83/users

This doesn't seem like a crazy load, but I start seeing sporadic errors on Kong like the following:

2015/09/15 03:27:12 [error] 61#0: *60379 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from 10.176.0.13: timeout, client: 10.240.232.213, server: _, request: "GET /users HTTP/1.0", host: "104.197.45.83"
2015/09/15 03:27:12 [error] 61#0: *60375 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from 10.176.0.13: timeout, client: 10.240.199.154, server: _, request: "GET /users HTTP/1.0", host: "104.197.45.83"
2015/09/15 03:27:12 [error] 61#0: *60350 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from 10.176.0.13: timeout, client: 10.240.116.150, server: _, request: "GET /users HTTP/1.0", host: "104.197.45.83"
2015/09/15 03:27:17 [error] 61#0: *62613 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from 10.176.3.13: timeout, client: 10.240.232.213, server: _, request: "GET /users HTTP/1.0", host: "104.197.45.83"

and on the Cassandra side:

java.io.IOException: Error while read(...): Connection reset by peer
    at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]

I'm kind of puzzled because, as far as I can tell, neither the containers nor the cluster seem to be under heavy stress. I also ran the cassandra-stress tool against the Cassandra cluster directly and couldn't find any issues.

Do you guys have any idea of what might be going on here?

rafael changed the title "Kong under load issue - question" to "Kong under load issue" Sep 15, 2015
@rafael (Contributor, Author) commented Sep 16, 2015

So my issue ended up being the replication factor, which was set to 1 in the keyspace. Even though I had a bigger cluster, all the requests were being routed to the same node, and eventually it started timing out. Once I updated the replication factor, I stopped getting errors.
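
For reference, a minimal sketch of that update in cqlsh, assuming Kong's default keyspace name of kong (the replication factor of 2 here is illustrative):

-- Raise the replication factor on the existing keyspace (CQL 3 syntax).
ALTER KEYSPACE kong
  WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 2};

After raising the replication factor, running nodetool repair on each node streams the existing data to its new replicas.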

I'm going to see if I can put together a pull request to make the replication factor a parameter in kong.yml.

@thibaultcha (Member)

Since we already discussed this on Gitter, I'm going to sum it up here for future reference:

Even if it is not explicit, I did not consider the replication factor of 1 in the migration a problem, because one can manually create a keyspace with any configuration and then run the migrations on it; the already-created keyspace would not be overridden.
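
For example, a minimal sketch of creating the keyspace by hand in cqlsh before running any migration (kong is the default keyspace name; the replication factor is illustrative):

-- Create the keyspace up front so the migrations reuse it as-is.
CREATE KEYSPACE IF NOT EXISTS kong
  WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 2};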

That said, I've been wanting to add those options to the kong.yml file, but lacked the time/prioritisation (since it is already possible, as just explained). Ideally, if you wish to implement it, the file should provide options for keyspace creation as described here. Something like:

cassandra:
  contact_points:
    ...
  keyspace: ...
  #
  # Keyspace options. Set those before running Kong or any migration.
  # See http://docs.datastax.com/en/cql/3.1/cql/cql_reference/create_keyspace_r.html
  #
  # Replica placement strategy class for the keyspace.
  strategy_class: SimpleStrategy
  # Required if class is SimpleStrategy; otherwise, not used. 
  # The number of replicas of data on multiple nodes.
  replication_factor: 1
  # Required if class is NetworkTopologyStrategy. Each entry maps a data
  # center name to the number of replicas of data in that data center.
  data_centers:
    dc1: 2
    dc2: 4
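
With strategy_class set to NetworkTopologyStrategy, those options would presumably translate into CQL along these lines (a sketch, not the exact statement the migrations would issue):

-- Two replicas in dc1, four in dc2, mirroring the data_centers map above.
CREATE KEYSPACE kong
  WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 4};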

At least that was my original idea. If you don't get to it, I'll probably implement this.

@cagdast commented Sep 16, 2015

I faced the same problem with a two-node Cassandra cluster. The issue was solved after changing the replication factor from 1 to 2.
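
As a quick sanity check on Cassandra 2.1 (the version used in this thread), the keyspace's current replication settings can be read from the system tables; on Cassandra 3.0+ the equivalent table is system_schema.keyspaces:

-- Shows the strategy class and options for every keyspace.
SELECT keyspace_name, strategy_class, strategy_options
FROM system.schema_keyspaces;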

thibaultcha changed the title "Kong under load issue" to "Kong under load keyspace configuration" Sep 21, 2015
@thibaultcha (Member)

Related to #350

thibaultcha added a commit that referenced this issue Sep 30, 2015
Possibility to configure the replication strategy used by the created
keyspace and its options.

Implements #543 and #350
thibaultcha self-assigned this Sep 30, 2015
thibaultcha added this to the 0.6.0 milestone Oct 15, 2015
@thibaultcha (Member)

Implemented with #634.
