Too many open files #5243

Closed

Description

Bug Report

name = "github.com/Azure/azure-sdk-for-go"
packages = [
  "profiles/latest/compute/mgmt/compute",
  "profiles/latest/keyvault/keyvault",
  "profiles/latest/keyvault/mgmt/keyvault",
  "profiles/latest/network/mgmt/network",
  "profiles/preview/keyvault/mgmt/keyvault",
  "profiles/preview/preview/monitor/mgmt/insights",
  "profiles/preview/storage/mgmt/storage",
  "services/compute/mgmt/2019-03-01/compute",
  "services/keyvault/2016-10-01/keyvault",
  "services/keyvault/mgmt/2018-02-14/keyvault",
  "services/network/mgmt/2018-12-01/network",
  "services/preview/monitor/mgmt/2018-03-01/insights",
  "services/preview/monitor/mgmt/2019-03-01/insights",
  "services/preview/sql/mgmt/2017-03-01-preview/sql",
  "services/preview/subscription/mgmt/2018-03-01-preview/subscription",
  "services/resources/mgmt/2018-02-01/resources",
  "services/storage/mgmt/2018-11-01/storage",
  "version",
]
pruneopts = "UT"
revision = "b11e4e7e4bf27a4eb96af9db49de9c7509f5bb01"
version = "v27.3.0"
  • Output of dep status (see the github.com/Azure/azure-sdk-for-go row):

    PROJECT                                             CONSTRAINT         VERSION        REVISION  LATEST    PKGS USED
    bitbucket.org/cloudcoreo/go-services-sdk            branch master      branch master  2826709   f598495   3
    bitbucket.org/cloudcoreo/rosetta-sdk                branch master      branch master  e5d1a53   e5d1a53   4
    contrib.go.opencensus.io/exporter/ocagent           v0.4.12            v0.4.12        dcb33c7   v0.4.12   1
    github.com/Azure/azure-sdk-for-go                   ^27.3.0            v27.3.0        b11e4e7   v27.3.0   18
    github.com/Azure/go-autorest                        ^11.9.0            v11.9.0        562d376   v11.9.0   10
    github.com/aws/aws-sdk-go                           v1.19.11           v1.19.11       56c1def   v1.19.11  30
    github.com/beorn7/perks                             v1.0.0             v1.0.0         4b2b341   v1.0.0    1
    github.com/census-instrumentation/opencensus-proto  v0.2.0             v0.2.0         a105b96   v0.2.0    6
    github.com/dgrijalva/jwt-go                         v3.2.0             v3.2.0         06ea103   v3.2.0    1
    github.com/dimchansky/utfbom                        v1.1.0             v1.1.0         d2133a1   v1.1.0    1
    github.com/go-kit/kit                               v0.8.0             v0.8.0         12210fb   v0.8.0    2
    github.com/go-logfmt/logfmt                         v0.4.0             v0.4.0         07c9b44   v0.4.0    1
    github.com/golang/protobuf                          branch master      branch master  e91709a   6c65a55   12
    github.com/grpc-ecosystem/grpc-gateway              v1.8.5             v1.8.5         20f268a   v1.8.5    3
    github.com/hashicorp/golang-lru                     v0.5.1             v0.5.1         7087cb7   v0.5.1    1
    github.com/jmespath/go-jmespath                     *                                 c2b33e8             1
    github.com/kr/logfmt                                branch master      branch master  b84e30a   b84e30a   1
    github.com/matttproud/golang_protobuf_extensions    v1.0.1             v1.0.1         c12348c   v1.0.1    1
    github.com/mitchellh/go-homedir                     v1.1.0             v1.1.0         af06845   v1.1.0    1
    github.com/prometheus/client_golang                 v0.9.2             v0.9.2         505eaef   v0.9.2    4
    github.com/prometheus/client_model                  branch master      branch master  fd36f42   fd36f42   1
    github.com/prometheus/common                        v0.3.0             v0.3.0         a82f4c1   v0.3.0    3
    github.com/prometheus/procfs                        branch master      branch master  e22ddce   8f55e60   1
    github.com/sanity-io/litter                         v1.1.0             v1.1.0         ae543b7   v1.1.0    1
    github.com/satori/go.uuid                           v1.2.0             v1.2.0         f58768c   v1.2.0    1
    go.opencensus.io                                    v0.19.3            v0.19.3        43463a8   v0.19.3   18
    golang.org/x/crypto                                 branch master      branch master  88737f5   4def268   2
    golang.org/x/net                                    branch master      branch master  4a65cf9   da137c7   7
    golang.org/x/sync                                   branch master      branch master  1122301   1122301   1
    golang.org/x/sys                                    branch master      branch master  3fd5a36   fae7ac5   1
    golang.org/x/text                                   v0.3.0 (override)  v0.3.0         f21a4df   v0.3.0    14
    google.golang.org/api                               v0.4.0             v0.4.0         067bed6   v0.4.0    1
    google.golang.org/genproto                          branch master      branch master  d1146b9   3bdd9d9   4
    google.golang.org/grpc                              v1.20.0            v1.20.0        236199d   v1.20.0   32

  • Go version: go version go1.12.7 darwin/amd64

  • What happened?

    Our long-running service calls multiple endpoints in the packages listed above, and over the past few days we have consistently observed it exhausting its file descriptors (the limit is set by ulimit). Once the descriptors are exhausted, every further I/O request fails with the error "too many open files".

    As for the architecture of our service: we create a fixed number of goroutines (on the order of tens) and use them to make concurrent requests to Azure services. From pprof, we have observed numerous additional goroutines (on the order of hundreds) stuck for long periods with the stacks net/http.(*persistConn).readLoop and net/http.(*persistConn).writeLoop. Since each of these goroutines belongs to a connection that holds an open fd, the number of open fds correlates with the number of hung goroutines. We have eliminated all our other network dependencies and suspect a leak in the Azure Go SDK that keeps these goroutines around. The issue we are observing is fairly similar to one resolved in the Azure SDK for the storage service [link].
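
    The mechanism behind that correlation can be sketched with plain net/http. This is a minimal illustration under stated assumptions, not SDK code and not code from the service: it assumes a local httptest server, and the helper names getWithFreshTransport and measure are hypothetical. Each http.Transport keeps its idle keep-alive connections open, and every such connection owns one readLoop and one writeLoop goroutine plus one fd until the connection is closed. Whether the SDK or the calling code actually creates transports this way is exactly what remains to be determined.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"net/http/httptest"
	"runtime"
	"time"
)

// getWithFreshTransport (hypothetical helper) performs one GET through a
// brand-new http.Transport and returns that transport. The body is drained and
// closed, so the connection is parked in the transport's idle pool.
func getWithFreshTransport(url string) (*http.Transport, error) {
	tr := &http.Transport{}
	resp, err := (&http.Client{Transport: tr}).Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	_, err = ioutil.ReadAll(resp.Body)
	return tr, err
}

// measure returns the goroutine count before n requests, after the n requests
// (idle connections still open), and after the idle connections were closed.
func measure(n int) (before, during, after int, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	before = runtime.NumGoroutine()
	transports := make([]*http.Transport, 0, n)
	for i := 0; i < n; i++ {
		tr, gerr := getWithFreshTransport(srv.URL)
		if gerr != nil {
			return 0, 0, 0, gerr
		}
		transports = append(transports, tr)
	}
	during = runtime.NumGoroutine()

	// Closing idle connections lets the readLoop/writeLoop goroutines exit.
	for _, tr := range transports {
		tr.CloseIdleConnections()
	}
	// Give the goroutines a moment to wind down; stop waiting after 2 seconds.
	deadline := time.Now().Add(2 * time.Second)
	for runtime.NumGoroutine() > before && time.Now().Before(deadline) {
		time.Sleep(10 * time.Millisecond)
	}
	after = runtime.NumGoroutine()
	return before, during, after, nil
}

func main() {
	before, during, after, err := measure(20)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("goroutines: before=%d with idle conns=%d after CloseIdleConnections=%d\n",
		before, during, after)
}
```

    In this sketch the count includes the test server's own per-connection goroutines, so the growth is larger than two per request. Sharing one transport across requests, or calling CloseIdleConnections on transports that are being discarded, is the usual way to keep the count bounded.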

    Finally, do you have any logging or diagnostic recommendations that could help us narrow down the issue? It is possible that we are misusing the client APIs, but if we can find the root cause, we can make the right corrections.
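
    One possible way to collect such diagnostics in-process is sketched below. It is not an SDK facility or an official recommendation, and it rests on assumptions: the helper names countOpenFDs and countGoroutinesWithFrame are hypothetical, and /dev/fd is assumed to be available (it is on macOS and on typical Linux systems). It counts the process's open descriptors and the goroutines whose stack contains a given frame, using only the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime/pprof"
	"strings"
)

// countOpenFDs (hypothetical helper) lists /dev/fd to count this process's
// open file descriptors. The result includes the descriptor that is opened
// temporarily to read the directory itself.
func countOpenFDs() (int, error) {
	dir, err := os.Open("/dev/fd")
	if err != nil {
		return 0, err
	}
	defer dir.Close()
	names, err := dir.Readdirnames(-1)
	if err != nil {
		return 0, err
	}
	return len(names), nil
}

// countGoroutinesWithFrame (hypothetical helper) dumps all goroutine stacks
// and counts the goroutines whose stack text contains the given substring.
func countGoroutinesWithFrame(frame string) (int, error) {
	var buf bytes.Buffer
	// debug=2 writes one stanza per goroutine, separated by blank lines.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return 0, err
	}
	n := 0
	for _, stanza := range strings.Split(buf.String(), "\n\n") {
		if strings.Contains(stanza, frame) {
			n++
		}
	}
	return n, nil
}

func main() {
	fds, err := countOpenFDs()
	if err != nil {
		fmt.Println("could not count fds:", err)
	}
	readers, err := countGoroutinesWithFrame("net/http.(*persistConn).readLoop")
	if err != nil {
		fmt.Println("could not dump goroutines:", err)
	}
	fmt.Printf("open fds=%d persistConn.readLoop goroutines=%d\n", fds, readers)
}
```

    Logging these two numbers periodically, for example before and after each batch of Azure calls, would show whether they grow together and which call pattern precedes the growth.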

Labels

Mgmt: This issue is related to a management-plane library.
bug: This issue requires a change to an existing behavior in the product in order to be resolved.
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
