[request] circuit breakers #154

Closed
sonicaghi opened this issue Apr 22, 2015 · 18 comments
Labels
idea/new plugin [legacy] those issues belong to Kong Nation, since GitHub issues are reserved for bug reports.

Comments

@sonicaghi
Member

Circuit breakers prevent thundering herds, and improve resiliency against intermittent errors. Every client-side endpoint should be wrapped in a circuit breaker.

@sonicaghi sonicaghi added the idea/new plugin [legacy] those issues belong to Kong Nation, since GitHub issues are reserved for bug reports. label Apr 22, 2015
@thibaultcha thibaultcha changed the title Circuit breakers plugin [request] circuit breakers plugin Apr 29, 2015
@sonicaghi sonicaghi changed the title [request] circuit breakers plugin [request] circuit breakers Sep 18, 2015
@sonicaghi
Member Author

@shashiranjan84
Contributor

Some of Hystrix's features could be ported to Kong: https://github.com/Netflix/Hystrix/wiki

@t1tcrucible

Hi, I'm also in favour of developing this plugin for Kong. Does anyone know if that's already under way?
We are using Hystrix on the service side, but Java is not what we need on Kong; something like a Lua module would fit better.

@sonicaghi
Member Author

@t1tcrucible as far as I know no one is developing this yet. Yes, all Netflix tools are mainly Java. Would you like to PR?

@t1tcrucible

Yes, it would be nice; we are figuring out whether we can do this in Lua or with the C API.

@subnetmarco
Member

This would be very easy to implement: the plugin would ask the user what threshold of 5xx errors the system should tolerate before shutting down the service. The plugin should also provide an API endpoint to re-enable the circuit.

A later iteration on the first version could also take timeouts or response times into account.
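
A minimal sketch of the behaviour described above, written in Python for brevity rather than as Kong plugin code. The class name `SimpleBreaker`, the `threshold` parameter, and the `reset` method are all hypothetical; `reset` stands in for the proposed API call that re-enables the circuit.

```python
class SimpleBreaker:
    """Counts 5xx responses and blocks the service once a threshold is exceeded.

    Hypothetical illustration only; none of these names belong to a Kong API.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # number of 5xx errors tolerated
        self.errors = 0
        self.closed = True  # closed circuit = traffic is allowed through

    def record(self, status):
        # Only server errors (500-599) count towards the threshold.
        if 500 <= status <= 599:
            self.errors += 1
            if self.errors > self.threshold:
                self.closed = False

    def allow(self):
        return self.closed

    def reset(self):
        # Stand-in for the manual "re-enable" API call.
        self.errors = 0
        self.closed = True
```

Once tripped, this breaker stays open until `reset` is called, which matches the manual re-enable step in the proposal; timeouts and response times are deliberately left out.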

@ahmadnassri ahmadnassri added the BC label May 13, 2016
@jmdacruz

jmdacruz commented Jun 6, 2016

A plugin for this functionality would be great (instead of people developing their own ad hoc solution)

@subnetmarco
Member

subnetmarco commented Jul 12, 2016

I was thinking about having a Circuit Breaker plugin with the following configuration template:

{
    "config": {
        "statuses": [
            500,
            501,
            502
        ],
        "minute": 20
    }
}

This would block the API if more than 20 responses with status 500, 501, or 502 are returned within a minute. We could support second, minute, hour, day, and month timespans.
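
One way to read this configuration is as a fixed one-minute window counter. The Python sketch below assumes that interpretation; the class `WindowedBreaker` and the explicit `now` argument (seconds, passed in by the caller) are hypothetical and only the `statuses` and `minute` keys come from the template above.

```python
class WindowedBreaker:
    """Opens the circuit when too many listed statuses occur in one minute.

    Hypothetical sketch; uses fixed windows, not a sliding window.
    """

    def __init__(self, config):
        self.statuses = set(config["statuses"])
        self.limit = config["minute"]
        self.window = None  # index of the current one-minute window
        self.count = 0
        self.closed = True

    def record(self, status, now):
        window = int(now // 60)
        if window != self.window:
            # A new minute has started, so the counter starts over.
            self.window = window
            self.count = 0
        if status in self.statuses:
            self.count += 1
            if self.count > self.limit:
                self.closed = False


config = {"statuses": [500, 501, 502], "minute": 20}
breaker = WindowedBreaker(config)
```

A fixed window is the simplest choice but lets a burst straddling two minutes slip through; a sliding window would be stricter at the cost of more state.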

To enable the circuit again, the plugin would provide an API endpoint like:

curl -X POST -d "closed=true" http://127.0.0.1:8001/apis/{api}/circuit

To get the status of a circuit:

curl -X GET http://127.0.0.1:8001/apis/{api}/circuit
{
    "closed": true
}

Thoughts?

@jmdacruz

jmdacruz commented Jul 14, 2016

Sounds like a nice first step. I think it would be interesting to explore how the plugin could re-enable the route automatically:

  • First and foremost, automatically re-enabling a route should be configurable (e.g., auto=true, auto=false)
  • Alternatives for detecting that a route is "healthy":
    • Querying "OPTIONS" on the API
    • Querying an endpoint provided by configuration (e.g., health=http://upstream.somewhere/api/health_check) using "GET". This endpoint is expected to be safe and idempotent (that is, it has no side effects). If it returns 200, re-enable the circuit.
    • Others?

@jmdacruz

jmdacruz commented Jul 14, 2016

Expanding a little bit on the above, you could have the plugin expose the following configuration:

{
    "config": {
        "statuses": [
            500,
            501,
            502
        ],
        "minute": 20,
        "health": {
            "endpoint": "http://upstream.somewhere/api/health_check",
            "method": "GET",
            "expected": "200",
            "wait_after_close": 60,
            "period": 10
        }
    }
}

Here the health section is optional; if it is set, it means that you want to check health automatically and re-enable the circuit. endpoint is the URL to query, method is the HTTP method to use, and expected is the expected HTTP status code. wait_after_close is the number of seconds to wait after the API is blocked before starting to query the health endpoint, and period is the number of seconds to wait between queries. The health check task could run on an nginx timer.
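
The timing rules for `wait_after_close` and `period` can be sketched as follows. This is a Python illustration under assumptions, not plugin code: the class `HealthRecovery`, its `tripped` and `tick` methods, and the injected `probe` callable are hypothetical. A real probe would send the configured `method` to `endpoint` and return the status code, and `tick` would be driven by a timer.

```python
class HealthRecovery:
    """Decides when to probe the health endpoint and when to re-enable the API."""

    def __init__(self, health, probe):
        self.expected = int(health["expected"])
        self.wait = health["wait_after_close"]
        self.period = health["period"]
        self.probe = probe  # callable returning an HTTP status code
        self.next_probe = None  # None while the API is serving traffic

    def tripped(self, now):
        # Called when the API is blocked; schedules the first probe.
        self.next_probe = now + self.wait

    def tick(self, now):
        # Returns True once the API may be re-enabled.
        if self.next_probe is None or now < self.next_probe:
            return False
        if self.probe() == self.expected:
            self.next_probe = None
            return True
        # Unhealthy: try again after `period` seconds.
        self.next_probe = now + self.period
        return False
```

Passing the probe in as a callable keeps the scheduling logic free of network code, so it can be exercised with canned responses.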

@alexraju91

Seems like the enterprise edition supports Circuit Breaker out of the box.

@luin

luin commented Nov 9, 2017

@alexforever86 Could you point me to the documentation for the circuit breaker feature of the enterprise edition? I couldn't find it.

@hbagdi
Member

hbagdi commented Nov 9, 2017

This is a very useful plugin to have.

This would be similar to rate-limiting plugins with support for different datastores for storing counters.

@sonicaghi @thibaultcha @thefosk Would you accept a PR for a simple implementation as discussed in the comments (minus the health check part)?

@thibaultcha
Member

@hbagdi Hi,

This feature is already in the works internally. It will be available for testing when it is ready, and we'll be happy to receive some feedback!

@hbagdi
Member

hbagdi commented Nov 9, 2017

@thibaultcha Sounds good. Thanks!

@xunchangguo

Good job!

@hishamhm
Contributor

hishamhm commented Apr 4, 2018

Health checks and circuit breakers are available since 0.12! 🎉

@hishamhm hishamhm closed this as completed Apr 4, 2018
@anuj147

anuj147 commented Jun 23, 2021

This out-of-the-box circuit breaker handles service failures, but the following problems remain:

  1. Circuit breaking at the route level is not present. It is entirely possible that one route of the service is misbehaving while the others aren’t.
  2. Other Kong plugins that make internal HTTP calls can fail too and need to be wrapped in a circuit breaker.

Solution:
We have recently open-sourced a plugin (kong-circuit-breaker) that solves this very issue. It wraps all proxy calls with a circuit breaker and opens the circuit for only the routes which are failing and not on the entire upstream service. We have been using it in production at Dream11 for the past 6 months and it is battle-tested to support heavy throughput. Do check it out.
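
The idea of tripping only the failing route can be illustrated with one failure counter per route. The Python sketch below is a hypothetical simplification and is not how kong-circuit-breaker or lua-circuit-breaker is implemented; `RouteBreakers`, `threshold`, and the route identifiers are made up for the example, and recovery (half-open probing) is omitted.

```python
from collections import defaultdict


class RouteBreakers:
    """Tracks consecutive failures per route, so one bad route does not block others."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.failures = defaultdict(int)  # route id -> consecutive failures

    def record(self, route, ok):
        if ok:
            self.failures[route] = 0
        else:
            self.failures[route] += 1

    def allow(self, route):
        # A route is blocked once it reaches the failure threshold.
        return self.failures[route] < self.threshold
```

Keying the state by route rather than by upstream service is what keeps healthy routes of the same service available.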

Detailed blog post link: https://blog.dream11engineering.com/break-circuits-save-kong-3680d88a0639
GitHub link for the Lua library: https://github.com/dream11/lua-circuit-breaker
GitHub link for the plugin: https://github.com/dream11/kong-circuit-breaker
