Every connection to Kestrel suddenly timing out, part of the application seems "frozen" #79446

Closed

Description

Hello,

The issue appears to occur at random on several Ubuntu 18.04 servers running our Kestrel-based .NET application.
No package updates that might explain this behavior were installed on these servers recently.

Kestrel stops responding to requests - all of them time out.

$ curl -vvv -k https://server.com/alive
Trying 1.2.3.4...
TCP_NODELAY set
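
For monitoring the hang, a probe with a hard time limit may be more practical than the interactive call above. The following is a minimal sketch, not something we run in production: the function names `probe` and `classify_curl_exit` and the 5-second default are assumptions made for illustration. It relies on curl's documented exit codes (28 for an operation timeout, 7 for a failed connection).

```shell
#!/usr/bin/env bash

# Map a curl exit code to a short label.
classify_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    7)  echo "connect-failed" ;;
    28) echo "timeout" ;;
    *)  echo "other($1)" ;;
  esac
}

# Probe a URL with a hard time limit so a hung server cannot block the caller.
# -s silences progress output, -k skips certificate verification (as in the
# command above), --max-time bounds the whole request in seconds.
probe() {
  local url="$1" limit="${2:-5}" rc=0
  curl -sk -o /dev/null --max-time "$limit" "$url" || rc=$?
  classify_curl_exit "$rc"
}

# Example call (hypothetical limit): probe https://server.com/alive 5
```

A result of `timeout` from such a probe would match the symptom described here, while `connect-failed` would point to the listener being gone altogether.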

The machine is not under high load at the time of the incident. CPU, memory and IO usage are at normal or even low levels.

We captured .NET counters in the hope of finding signs of thread starvation, but we did not see anything out of the ordinary:

    Status: Running

[System.Runtime]
    % Time in GC since last GC (%)                                         0
    Allocation Rate (B / 1 sec)                                      529,872
    CPU Usage (%)                                                          3
    Exception Count (Count / 1 sec)                                        0
    GC Committed Bytes (MB)                                            1,511
    GC Fragmentation (%)                                                  66.514
    GC Heap Size (MB)                                                    527
    Gen 0 GC Count (Count / 1 sec)                                         0
    Gen 0 Size (B)                                                65,608,328
    Gen 1 GC Count (Count / 1 sec)                                         0
    Gen 1 Size (B)                                                16,600,328
    Gen 2 GC Count (Count / 1 sec)                                         0
    Gen 2 Size (B)                                                    1.0128e+09
    IL Bytes Jitted (B)                                            9,668,963
    LOH Size (B)                                                      3.2404e+08
    Monitor Lock Contention Count (Count / 1 sec)                         25
    Number of Active Timers                                              526
    Number of Assemblies Loaded                                          580
    Number of Methods Jitted                                         114,669
    POH (Pinned Object Heap) Size (B)                              2,648,776
    ThreadPool Completed Work Item Count (Count / 1 sec)                 701
    ThreadPool Queue Length                                                0
    ThreadPool Thread Count                                               20
    Time spent in JIT (ms / 1 sec)                                         0
    Working Set (MB)                                                   4,785

I have attached stack traces taken with dotnet-stack.

Now the surprising part: any of the following actions "unblocks" it:

  • attaching strace: sudo strace -T -t -f -p $PID

  • making a minidump with dotnet-dump

  • restarting the service (not surprising)

After any one of the above actions, Kestrel responds to requests again.
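
Since attaching strace or taking a dump appears to change the behavior, it may help to record the process state passively before doing either. The sketch below only reads the per-thread `status` files under `/proc` and counts threads by their one-letter state code; it does not attach to the process. The function name `thread_state_summary` and its optional second argument (an alternative proc root, useful for trying the function offline) are assumptions made for illustration.

```shell
#!/usr/bin/env bash

# Summarise thread states for a process by reading /proc, without attaching.
# Usage: thread_state_summary PID [PROC_ROOT]
# PROC_ROOT defaults to /proc; it is a hypothetical parameter for offline use.
thread_state_summary() {
  local pid="$1" root="${2:-/proc}" f
  for f in "$root/$pid"/task/*/status; do
    [ -r "$f" ] || continue
    # The state line looks like "State:<TAB>S (sleeping)"; keep the letter only.
    awk '/^State:/ { print $2; exit }' "$f"
  done | sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Example call: thread_state_summary "$PID"
```

The output is one line per state, such as `S=19`, which can be compared between a healthy and a frozen instance.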

We would be grateful for any advice on where to look further for the root cause, or for any additional diagnostic tips.

Reproduction Steps

Don't know yet

Expected behavior

Kestrel responds to requests.

Actual behavior

Requests time out.

Regression?

Not sure.

Known Workarounds

  • attaching strace: sudo strace -T -t -f -p $PID

  • making a minidump with dotnet-dump

  • restarting the service (not surprising)

Configuration

Which version of .NET is the code running on?

.NET 6.0.11

OS: Ubuntu 18.04
Architecture: x64
Config-specific: don't know

Other information

stacks.txt

Labels: area-System.Net, needs-further-triage