"failed to release lock" error during active healthchecks #9221

Closed
@codemug

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

2.8.1

Current Behavior

I'm running Kong in declarative mode to expose several services, and I've configured the following active healthcheck on all of them:

active:
  type: http
  httpPath: "/health"
  healthy:
    httpStatuses:
      - 200
      - 302
    interval: 10
    successes: 2
  timeout: 15
  unhealthy:
    httpStatuses:
      - 429
      - 404
      - 500
      - 501
      - 502
      - 503
      - 504
      - 505
    httpFailures: 8
    interval: 10
    tcpFailures: 8
    timeouts: 8

Every once in a while, an upstream target becomes unhealthy for no apparent reason. Even after the actual upstream service is available again, the status of the upstream target never returns to healthy.

During one of these incidents, I noticed the following entries in the Kong proxy log:

2022/08/09 12:04:18 [error] 1118#0: *136780853 [lua] healthcheck.lua:1235: log(): [healthcheck] (0dc6f45b-8f8d-40d2-a504-473544ee190b:my-service) failed to release lock 'lua-resty-healthcheck:0dc6f45b-8f8d-40d2-a504-473544ee190b:my-service:target_list_lock': unlocked, context: ngx.timer
2022/08/09 12:04:18 [error] 1118#0: *136780854 [lua] healthcheck.lua:1235: log(): [healthcheck] (0dc6f45b-8f8d-40d2-a504-473544ee190b:my-service) failed to release lock 'lua-resty-healthcheck:0dc6f45b-8f8d-40d2-a504-473544ee190b:my-service:target_list_lock': unlocked, context: ngx.timer
2022/08/09 12:04:18 [warn] 1115#0: *136781978 [lua] healthcheckers.lua:98: callback(): [healthchecks] balancer 0dc6f45b-8f8d-40d2-a504-473544ee190b:my-service reported health status changed to UNHEALTHY, context: ngx.timer

What kind of lock is this?

I've also checked Kong's resource usage. It is allowed to use 2 CPUs and 4 GB of memory. CPU usage stays at around 1% and memory usage never exceeds 1 GB.

Expected Behavior

The upstream target should be set to healthy by the healthchecker once the upstream service is available.
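
To make the expectation concrete, here is a rough sketch of how I read the thresholds in the healthcheck configuration above. It is a simplified model with consecutive counters, not the actual lua-resty-healthcheck code; the names `TargetState` and `record_probe` are made up for illustration, and the assumption that a success resets the failure counter (and vice versa) is mine.

```python
from dataclasses import dataclass

# Values copied from the active healthcheck configuration in this issue.
HEALTHY_STATUSES = {200, 302}
UNHEALTHY_STATUSES = {429, 404, 500, 501, 502, 503, 504, 505}
SUCCESSES_TO_RECOVER = 2      # healthy.successes
HTTP_FAILURES_TO_FAIL = 8     # unhealthy.httpFailures
INTERVAL_SECONDS = 10         # healthy.interval / unhealthy.interval


@dataclass
class TargetState:
    """Hypothetical per-target state: health flag plus consecutive counters."""
    healthy: bool = True
    successes: int = 0
    failures: int = 0


def record_probe(state: TargetState, status: int) -> bool:
    """Apply one probe result and return the resulting health flag.

    Assumption: a success resets the failure counter and a failure
    resets the success counter, so only consecutive results count.
    """
    if status in HEALTHY_STATUSES:
        state.failures = 0
        state.successes += 1
        if not state.healthy and state.successes >= SUCCESSES_TO_RECOVER:
            state.healthy = True
    elif status in UNHEALTHY_STATUSES:
        state.successes = 0
        state.failures += 1
        if state.healthy and state.failures >= HTTP_FAILURES_TO_FAIL:
            state.healthy = False
    # Statuses in neither list leave the state unchanged in this model.
    return state.healthy


if __name__ == "__main__":
    state = TargetState()
    for _ in range(HTTP_FAILURES_TO_FAIL):
        record_probe(state, 503)
    print("after failures:", state.healthy)
    for _ in range(SUCCESSES_TO_RECOVER):
        record_probe(state, 200)
    print("after successes:", state.healthy)
```

Under this reading, once the service answers `/health` with 200 or 302 again, two consecutive probes at a 10-second interval (roughly 20 seconds) should be enough to flip the target back to healthy. That is not what I observe.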

Steps To Reproduce

This occurs sporadically. I have tried to reproduce it several times, but without success.

Anything else?

No response
