Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Goroutine leak due to stuck send on channel on timeouts #204

Merged
merged 2 commits into from
Nov 11, 2024

Conversation

sfinke0
Copy link
Contributor

@sfinke0 sfinke0 commented Nov 6, 2024

Hello everyone,

we're using scrapligo to gather information from devices in an API. However we noticed an increasing number of Goroutines due to some devices where the input is running into timeouts.

With pprof we saw the following goroutines hanging:

# DEBUG=1

goroutine profile: total 55
16 @ 0x4731ae 0x407c6d 0x4078d7 0x8d633c 0x47b2c1
#       0x8d633b        [github.com/scrapli/scrapligo/channel](http://github.com/scrapli/scrapligo/channel).(*Channel).SendInputB.func1+0x4fb  /src/vendor/[github.com/scrapli/scrapligo/channel/sendinput.go:80](http://github.com/scrapli/scrapligo/channel/sendinput.go:80)
# DEBUG=2

goroutine 35769 [chan send, 407 minutes]:
[github.com/scrapli/scrapligo/channel](http://github.com/scrapli/scrapligo/channel).(*Channel).SendInputB.func1()
        /src/vendor/[github.com/scrapli/scrapligo/channel/sendinput.go:80](http://github.com/scrapli/scrapligo/channel/sendinput.go:80) +0x4fc
created by [github.com/scrapli/scrapligo/channel](http://github.com/scrapli/scrapligo/channel).(*Channel).SendInputB in goroutine 33303
        /src/vendor/[github.com/scrapli/scrapligo/channel/sendinput.go:35](http://github.com/scrapli/scrapligo/channel/sendinput.go:35) +0x2f4
 
goroutine 101718 [chan send, 48 minutes]:
[github.com/scrapli/scrapligo/channel](http://github.com/scrapli/scrapligo/channel).(*Channel).SendInputB.func1()
        /src/vendor/[github.com/scrapli/scrapligo/channel/sendinput.go:80](http://github.com/scrapli/scrapligo/channel/sendinput.go:80) +0x4fc
created by [github.com/scrapli/scrapligo/channel](http://github.com/scrapli/scrapligo/channel).(*Channel).SendInputB in goroutine 99320
        /src/vendor/[github.com/scrapli/scrapligo/channel/sendinput.go:35](http://github.com/scrapli/scrapligo/channel/sendinput.go:35) +0x2f4

To fix the issue I decided to ditch the timer for a context with timeout. This makes it possible to completely get rid of the select and make sure that the channel is always read. I also added an additional return statement, so that in rare cases the goroutine does not try to write into the channel twice.

Please let me know if you have any suggestions for improvement.

Thanks!

Sebastian

@carlmontanari
Copy link
Contributor

hey @sfinke0 thanks for this! on a quick look this seems great to me -- ill probably take a closer look this weekend -- I think there are probably a few other places that this could also apply that would maybe be good to update (probably send interactive, maybe also send with callbacks).

cc @netixx for viz too since they did a bunch of other fixes around some leaky goroutines earlier this year!

@carlmontanari
Copy link
Contributor

👋 hey again @sfinke0 -- I just pushed a commit to do how you set this up on get prompt, sendinteractive and netconf caps reading.

for context/for future us to remember what the heck was going on here -- this used to be setup with the timeout handler due to blocking reads in channel. but that went away with the channel read loop that runs in the background of things. so your fix is a much better way to handle things now! I left the netconf npc and send with callbacks stuff alone for now though those should probably also be updated at some point.

if this still looks good to you let me know and ill smack merge! thanks a bunch for the help!

@sfinke0
Copy link
Contributor Author

sfinke0 commented Nov 11, 2024

Hi @carlmontanari,

thanks for the improvements, I really like the use of errors.Is because this preserves the original error message. LGTM for merge 😊

Thanks again!

@carlmontanari carlmontanari merged commit 7ea22c0 into scrapli:main Nov 11, 2024
7 checks passed
@sfinke0 sfinke0 deleted the fix/goroutine-leak branch November 11, 2024 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants