Too many open files #5243

Closed
ajayshekar opened this issue Jul 12, 2019 · 14 comments
Assignees
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved. customer-reported Issues that are reported by GitHub users external to the Azure organization. Mgmt This issue is related to a management-plane library.

Comments

@ajayshekar

ajayshekar commented Jul 12, 2019

Bug Report

name = "github.com/Azure/azure-sdk-for-go"
packages = [
  "profiles/latest/compute/mgmt/compute",
  "profiles/latest/keyvault/keyvault",
  "profiles/latest/keyvault/mgmt/keyvault",
  "profiles/latest/network/mgmt/network",
  "profiles/preview/keyvault/mgmt/keyvault",
  "profiles/preview/preview/monitor/mgmt/insights",
  "profiles/preview/storage/mgmt/storage",
  "services/compute/mgmt/2019-03-01/compute",
  "services/keyvault/2016-10-01/keyvault",
  "services/keyvault/mgmt/2018-02-14/keyvault",
  "services/network/mgmt/2018-12-01/network",
  "services/preview/monitor/mgmt/2018-03-01/insights",
  "services/preview/monitor/mgmt/2019-03-01/insights",
  "services/preview/sql/mgmt/2017-03-01-preview/sql",
  "services/preview/subscription/mgmt/2018-03-01-preview/subscription",
  "services/resources/mgmt/2018-02-01/resources",
  "services/storage/mgmt/2018-11-01/storage",
  "version",
]
pruneopts = "UT"
revision = "b11e4e7e4bf27a4eb96af9db49de9c7509f5bb01"
version = "v27.3.0"
  • Column output by dep status for "github.com/Azure/azure-sdk-for-go":

    PROJECT                                             CONSTRAINT         VERSION        REVISION  LATEST    PKGS USED
    bitbucket.org/cloudcoreo/go-services-sdk            branch master      branch master  2826709   f598495   3
    bitbucket.org/cloudcoreo/rosetta-sdk                branch master      branch master  e5d1a53   e5d1a53   4
    contrib.go.opencensus.io/exporter/ocagent           v0.4.12            v0.4.12        dcb33c7   v0.4.12   1
    github.com/Azure/azure-sdk-for-go                   ^27.3.0            v27.3.0        b11e4e7   v27.3.0   18
    github.com/Azure/go-autorest                        ^11.9.0            v11.9.0        562d376   v11.9.0   10
    github.com/aws/aws-sdk-go                           v1.19.11           v1.19.11       56c1def   v1.19.11  30
    github.com/beorn7/perks                             v1.0.0             v1.0.0         4b2b341   v1.0.0    1
    github.com/census-instrumentation/opencensus-proto  v0.2.0             v0.2.0         a105b96   v0.2.0    6
    github.com/dgrijalva/jwt-go                         v3.2.0             v3.2.0         06ea103   v3.2.0    1
    github.com/dimchansky/utfbom                        v1.1.0             v1.1.0         d2133a1   v1.1.0    1
    github.com/go-kit/kit                               v0.8.0             v0.8.0         12210fb   v0.8.0    2
    github.com/go-logfmt/logfmt                         v0.4.0             v0.4.0         07c9b44   v0.4.0    1
    github.com/golang/protobuf                          branch master      branch master  e91709a   6c65a55   12
    github.com/grpc-ecosystem/grpc-gateway              v1.8.5             v1.8.5         20f268a   v1.8.5    3
    github.com/hashicorp/golang-lru                     v0.5.1             v0.5.1         7087cb7   v0.5.1    1
    github.com/jmespath/go-jmespath                     *                                 c2b33e8             1
    github.com/kr/logfmt                                branch master      branch master  b84e30a   b84e30a   1
    github.com/matttproud/golang_protobuf_extensions    v1.0.1             v1.0.1         c12348c   v1.0.1    1
    github.com/mitchellh/go-homedir                     v1.1.0             v1.1.0         af06845   v1.1.0    1
    github.com/prometheus/client_golang                 v0.9.2             v0.9.2         505eaef   v0.9.2    4
    github.com/prometheus/client_model                  branch master      branch master  fd36f42   fd36f42   1
    github.com/prometheus/common                        v0.3.0             v0.3.0         a82f4c1   v0.3.0    3
    github.com/prometheus/procfs                        branch master      branch master  e22ddce   8f55e60   1
    github.com/sanity-io/litter                         v1.1.0             v1.1.0         ae543b7   v1.1.0    1
    github.com/satori/go.uuid                           v1.2.0             v1.2.0         f58768c   v1.2.0    1
    go.opencensus.io                                    v0.19.3            v0.19.3        43463a8   v0.19.3   18
    golang.org/x/crypto                                 branch master      branch master  88737f5   4def268   2
    golang.org/x/net                                    branch master      branch master  4a65cf9   da137c7   7
    golang.org/x/sync                                   branch master      branch master  1122301   1122301   1
    golang.org/x/sys                                    branch master      branch master  3fd5a36   fae7ac5   1
    golang.org/x/text                                   v0.3.0 (override)  v0.3.0         f21a4df   v0.3.0    14
    google.golang.org/api                               v0.4.0             v0.4.0         067bed6   v0.4.0    1
    google.golang.org/genproto                          branch master      branch master  d1146b9   3bdd9d9   4
    google.golang.org/grpc                              v1.20.0            v1.20.0        236199d   v1.20.0   32

  • GO version: go version go1.12.7 darwin/amd64

  • What happened?

    Our long-running service hits multiple endpoints in the packages listed above, and over the past few days we have consistently observed it exhausting its file descriptors (the limit is controlled by ulimit). Once the descriptors are exhausted, every further I/O request fails with the error too many open files.

    Regarding the architecture of our service: we create a fixed number of goroutines (on the order of tens) and use them to make concurrent requests to Azure services. From pprof, we have observed numerous additional goroutines (on the order of hundreds) stuck for long periods with the stacks net/http.(*persistConn).readLoop and net/http.(*persistConn).writeLoop. Since each of these holds an open fd, the number of open fds correlates with the number of hung goroutines. We have eliminated all our other network dependencies and suspect a leak in the Azure Go SDK that keeps these goroutines around. The issue we are observing is fairly similar to one resolved in the Azure SDK for the storage service [link].

    Finally, do you have any logging or diagnostic recommendations that could help us narrow down the issue? It is possible that we are misusing the client APIs, but if we can find the root cause we can make the right corrections.

@kurtzeborn kurtzeborn added customer-reported Issues that are reported by GitHub users external to the Azure organization. Mgmt This issue is related to a management-plane library. and removed triage labels Jul 15, 2019
@kurtzeborn
Member

Thanks for opening this issue! CC'ing @jhendrixMSFT who investigated the related issue.

@jhendrixMSFT jhendrixMSFT self-assigned this Jul 15, 2019
@jhendrixMSFT
Member

@ajayshekar looking through the code, there are a few places where we don't properly drain the HTTP response body. I will have a fix today. Would you be able to test it out to ensure it resolves the issue?

@ajayshekar
Author

@kurtzeborn @jhendrixMSFT Thank you for the response! @jhendrixMSFT yes, I'll be able to test this out and confirm whether the fix worked.

@ajayshekar
Author

ajayshekar commented Jul 16, 2019

@jhendrixMSFT Just out of curiosity, what services were the bugs in?

@jhendrixMSFT
Member

The bugs are in go-autorest, so all services are affected.

@jhendrixMSFT
Member

@ajayshekar I've posted Azure/go-autorest#432 with the fix. Please test it out and let me know if you still observe any leaks. I will wait for your ack before merging.

@ajayshekar
Author

@jhendrixMSFT We only see this issue under the load we have in our production environment. As a result, it is going to take a couple of days for me to confirm whether the fix works. Apologies for the delay! I'll post an update on this in a couple of days :)

@jhendrixMSFT
Member

No worries, please take your time and thanks for testing it out.

@jhendrixMSFT
Member

Hello @ajayshekar, were you able to verify this resolves your issue?

@ajayshekar
Author

Hey @jhendrixMSFT, sorry it's taking longer than expected to verify the fix. I hope to get to it within the next week.

@jhendrixMSFT
Member

No worries, thanks for the update.

@ajayshekar
Author

@jhendrixMSFT I was working on rolling this out and ran into an issue. We are using azure-sdk-go@27.3.0, which has a dependency on go-autorest@11.9.0. How do you recommend tackling this?

@jhendrixMSFT
Member

@ajayshekar good point, it wouldn't be compatible.
I've created a PR for a back-port against v11, see Azure/go-autorest#460

@lilyjma lilyjma added the bug This issue requires a change to an existing behavior in the product in order to be resolved. label Jun 16, 2020
@jhendrixMSFT
Member

Closing as this has been fixed in the latest version of the SDK and go-autorest.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023