p2p/discover: UDP listener port not released when macOS firewall is enabled #18443

Open
ryanberckmans opened this issue Jan 14, 2019 · 6 comments

System information

Geth
Version: 1.8.20-stable
Git Commit: 24d727b6d6e2c0cde222fa12155c4a6db5caaf2e
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.11.2
Operating System: darwin (OSX 10.13.6)
GOPATH=/Users/me/go
GOROOT=/Users/travis/.gimme/versions/go1.11.2.darwin.amd64

Expected behaviour

The discovery UDP listener should close its socket on shutdown/interrupt in all cases.

Actual behaviour

In certain code paths, the discovery UDP listener is not closed on shutdown/interrupt, which prevents geth from restarting until the port is manually released or the system is restarted.

I hit one of these code paths but don't have a specific repro.

Invocation that produced dangling UDP listener (light node):

geth --syncmode=light --cache=512 --rpc --ws --wsorigins=127.0.0.1,http://127.0.0.1:8080,https://127.0.0.1:8443 --datadir=redact

Listener initialization which became dangling:

  [14:57:00.418] [info] GETH NODE: INFO [01-14|14:57:00.418] UDP listener up                          net=enode:/redact@[::]:30303

Interrupt which failed to close UDP listener:

  [14:57:30.059] [info] GETH NODE: INFO [01-14|14:57:30.004] Got interrupt, shutting down...
  INFO [01-14|14:57:30.004] WebSocket endpoint closed                url=ws://127.0.0.1:8546
  INFO [01-14|14:57:30.005] HTTP endpoint closed                     url=http://127.0.0.1:8545
  INFO [01-14|14:57:30.005] IPC endpoint closed                      url="/Users/me/Library/Application Support/augur/geth/geth.ipc"
  INFO [01-14|14:57:30.005] Blockchain manager stopped
  INFO [01-14|14:57:30.005] Stopping light Ethereum protocol
  INFO [01-14|14:57:30.007] Light Ethereum protocol stopped
  INFO [01-14|14:57:30.008] Transaction pool stopped

Fatal when attempting to restart geth:

Fatal: Error starting protocol stack: listen udp [::]:30303: bind: address already in use

netstat output showing the port was not released:

$ netstat -anv | grep -E "30303|pid"
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)     rhiwat shiwat    pid   epid
udp46  58303      0  *.30303                *.*                                196724   9216  45852      0

Confirm that pid 45852 no longer exists (i.e. the port stays bound after the process has exited; this is not a process that survived the kill or became a zombie):

$ ps -e | grep 45852
// empty
ryanberckmans added a commit to AugurProject/augur-app that referenced this issue Jan 14, 2019
Fixes AugurProject/augur#261

I was also able to cause geth to quit without properly cleaning up the
UDP listener socket used for p2p discovery. For this I opened a new
issue ethereum/go-ethereum#18443.

stale bot commented Jan 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


chrisfranko commented Feb 4, 2020

I'm having the same issue.

System

Geth
Version: 1.9.10-stable
Git Commit: 58cf5686eab9019cc01e202e846a6bbc70a3301d
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.13.7
Operating System: darwin (OSX 10.15.2)

I built with

make geth

Ran geth with

build/bin/geth console

Typed exit to close geth

exit

Waited a few seconds, then restarted the node and got

Fatal: Error starting protocol stack: listen udp [::]:30303: bind: address already in use

The process doesn't appear anywhere. I can get geth to start again either by changing the --port flag or by restarting my machine.

Contributor

holiman commented Apr 16, 2020

First reporter:

Operating System: darwin (OSX 10.13.6)

Second reporter:

Operating System: darwin (OSX 10.15.2)

Might be something with OSX?

Contributor

renaynay commented Apr 16, 2020

I could not reproduce the error with either of the situations documented on this issue.

My system info:

    Geth
    Version: 1.9.14-unstable
    Git Commit: 3bf1054a13f2ed2ba8c0c7c44279bbca6e4e7cbb
    Git Commit Date: 20200416
    Architecture: amd64
    Protocol Versions: [65 64 63]
    Go Version: go1.14.1
    Operating System: darwin (OSX 10.15.4)
    GOPATH=
    GOROOT=/usr/local/Cellar/go/1.14.1/libexec

Contributor

fjl commented Apr 16, 2020

This happens when the macOS firewall is enabled. We cannot fix this issue, but we could work around it by using a random, OS-assigned port by default.
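
As a rough illustration of the workaround, asking for port 0 lets the OS choose a free UDP port, so a stuck socket on a fixed port such as 30303 would no longer block startup. This is a sketch only, not the geth change itself. The function name `listenAnyUDP` is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// listenAnyUDP (hypothetical name) binds a UDP socket on host with port 0,
// which asks the OS to assign a free port, and returns the port it chose.
func listenAnyUDP(host string) (net.PacketConn, int, error) {
	conn, err := net.ListenPacket("udp", net.JoinHostPort(host, "0"))
	if err != nil {
		return nil, 0, err
	}
	// For "udp" networks the local address is a *net.UDPAddr.
	port := conn.LocalAddr().(*net.UDPAddr).Port
	return conn, port, nil
}

func main() {
	conn, port, err := listenAnyUDP("127.0.0.1")
	if err != nil {
		fmt.Fprintln(os.Stderr, "listen failed:", err)
		os.Exit(1)
	}
	defer conn.Close()
	fmt.Println("UDP listener up on OS-assigned port", port)
}
```

The trade-off is that the port changes between runs, so the chosen port has to be read back from the socket and advertised to peers rather than assumed.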

fjl reopened this on Apr 16, 2020
fjl changed the title from "Close discovery UDP listener in all cases" to "p2p/discover: UDP listener port not released when macOS firewall is enabled" on Apr 16, 2020
@capcasady

I have worked around this in a way that is admittedly nuts: remove the ethernet cable, wait for the sockets to drain, then shut down. This has never failed for me, although I may just have been lucky. This is a decades-old bug in OSX. My theory is that an apparently closed UDP socket that still has data waiting to be read is not always cleaned up when the firewall is in use.
I never found a way to free the socket, but I imagine a source code guru could use a debugger to clear the part of the network stack that holds that data, maybe without a crash.
