Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log the ENR pubkey and full lock hash in app-start #2778

Closed
5 tasks
OisinKyne opened this issue Jan 10, 2024 · 1 comment
Closed
5 tasks

Log the ENR pubkey and full lock hash in app-start #2778

OisinKyne opened this issue Jan 10, 2024 · 1 comment
Assignees
Labels
protocol Protocol Team tickets V1

Comments

@OisinKyne
Copy link
Contributor

🎯 Problem to be solved

Small ticket, might not be important but logging it so we have something to come back to.

A user has given me a log output where they fail to connect to relays on boot with the error:

{"log":"14:23:26.811 INFO cmd        Parsed config                            {\"beacon-node-endpoints\": \"[http://localhost:6002]\", \"builder-api\": \"false\", \"feature-set\": \"stable\", \"feature-set-disable\": \"[]\", \"feature-set-enable\": \"[]\", \"help\": \"false\", \"jaeger-address\": \"\", \"jaeger-service\": \"charon\", \"lock-file\": \".charon/cluster-lock.json\", \"log-color\": \"auto\", \"log-format\": \"console\", \"log-level\": \"info\", \"loki-addresses\": \"[]\", \"loki-service\": \"charon\", \"manifest-file\": \".charon/cluster-manifest.pb\", \"monitoring-address\": \"127.0.0.1:3620\", \"no-verify\": \"false\", \"p2p-allowlist\": \"\", \"p2p-denylist\": \"\", \"p2p-disable-reuseport\": \"false\", \"p2p-external-hostname\": \"\", \"p2p-external-ip\": \"\", \"p2p-relays\": \"[https://charon-relay.dsrvlabs.dev,https//relay-2.prod-relay.721.land,https://relay-1.obol.figment.io,https://0.relay.obol.tech]\", \"p2p-tcp-address\": \"[0.0.0.0:3610]\", \"private-key-file\": \".charon/charon-enr-private-key\", \"private-key-file-lock\": \"false\", \"simnet-beacon-mock\": \"false\", \"simnet-beacon-mock-fuzz\": \"false\", \"simnet-slot-duration\": \"1s\", \"simnet-validator-keys-dir\": \".charon/validator_keys\", \"simnet-validator-mock\": \"false\", \"synthetic-block-proposals\": \"false\", \"validator-api-address\": \"127.0.0.1:3600\"}\n","stream":"stderr","time":"2024-01-05T14:23:26.811586093Z"}

{"log":"14:23:26.811 INFO app-start  Charon starting                          {\"version\": \"v0.18.0\", \"git_commit_hash\": \"67725e4\", \"git_commit_time\": \"2023-11-10T15:18:42Z\"}\n","stream":"stderr","time":"2024-01-05T14:23:26.811751901Z"}
{"log":"14:23:26.836 ERRO app-start  Failed resolving relay addresses from URL: invalid relay url {\"url\": \"https//relay-2.prod-relay.721.land\"}\n","stream":"stderr","time":"2024-01-05T14:23:26.836619272Z"}
{"log":"\u0009p2p/bootnode.go:130 .queryRelayAddrs\n","stream":"stderr","time":"2024-01-05T14:23:26.836632156Z"}
{"log":"\u0009p2p/bootnode.go:83 .resolveRelay\n","stream":"stderr","time":"2024-01-05T14:23:26.836635903Z"}
{"log":"14:23:26.896 INFO app-start  Resolved new relay                       {\"peer\": \"unusual-plant\", \"url\": \"https://0.relay.obol.tech\", \"addrs\": \"[/ip4/34.91.191.228/tcp/3610]\"}\n","stream":"stderr","time":"2024-01-05T14:23:26.896576236Z"}
{"log":"14:23:27.119 INFO app-start  Resolved new relay                       {\"peer\": \"ample-shapes\", \"url\": \"https://relay-1.obol.figment.io\", \"addrs\": \"[/dns/relay-1.obol.figment.io/tcp/3610 /ip4/158.69.126.60/tcp/3610]\"}\n","stream":"stderr","time":"2024-01-05T14:23:27.119486491Z"}
{"log":"14:23:27.809 INFO app-start  Resolved new relay                       {\"peer\": \"real-floor\", \"url\": \"https://charon-relay.dsrvlabs.dev\", \"addrs\": \"[/ip4/34.64.121.83/tcp/3610]\"}\n","stream":"stderr","time":"2024-01-05T14:23:27.80976074Z"}
{"log":"14:23:27.843 INFO app-start  Lock file loaded                         {\"peer_name\": \"favorable-council\", \"peer_index\": 6, \"cluster_hash\": \"54c7588\", \"cluster_name\": \"**redacted**\", \"peers\": 10}\n","stream":"stderr","time":"2024-01-05T14:23:27.843819782Z"}
{"log":"14:23:32.856 WARN relay      Reserve relay circuit: reservation error: status: CONNECTION_FAILED reason: failed to open stream err: failed to dial: failed to dial 16Uiu2HAm5T3YGYX8mzw6quAJjsXtgCCfmxtp4cRtkveH5LqeQ1sz:\n","stream":"stderr","time":"2024-01-05T14:23:32.856672616Z"}
{"log":"  * [/ip4/34.64.121.83/tcp/3610] dial tcp4 0.0.0.0:3610-\u003e34.64.121.83:3610: i/o timeout {\"relay_peer\": \"real-floor\"}\n","stream":"stderr","time":"2024-01-05T14:23:32.856696591Z"}
{"log":"\u0009p2p/relay.go:46 .func2\n","stream":"stderr","time":"2024-01-05T14:23:32.856700017Z"}
{"log":"\u0009app/lifecycle/hook.go:48 .Call\n","stream":"stderr","time":"2024-01-05T14:23:32.856702201Z"}
{"log":"\u0009app/lifecycle/hook.go:131 .startHook\n","stream":"stderr","time":"2024-01-05T14:23:32.856704235Z"}
{"log":"\u0009app/lifecycle/hook.go:154 .func1\n","stream":"stderr","time":"2024-01-05T14:23:32.856706579Z"}
{"log":"14:23:32.856 WARN relay      Reserve relay circuit: reservation error: status: CONNECTION_FAILED reason: failed to open stream err: failed to dial: failed to dial 16Uiu2HAkwtMU2GN3Ahsfmhp2rkPphorftAfy4SoJxdKCkwrhRehM:\n","stream":"stderr","time":"2024-01-05T14:23:32.856727598Z"}
{"log":"  * [/ip4/34.91.191.228/tcp/3610] dial tcp4 0.0.0.0:3610-\u003e34.91.191.228:3610: i/o timeout {\"relay_peer\": \"unusual-plant\"}\n","stream":"stderr","time":"2024-01-05T14:23:32.856730223Z"}
{"log":"\u0009p2p/relay.go:46 .func2\n","stream":"stderr","time":"2024-01-05T14:23:32.856732387Z"}
{"log":"\u0009app/lifecycle/hook.go:48 .Call\n","stream":"stderr","time":"2024-01-05T14:23:32.856734261Z"}
{"log":"\u0009app/lifecycle/hook.go:131 .startHook\n","stream":"stderr","time":"2024-01-05T14:23:32.856736044Z"}
{"log":"\u0009app/lifecycle/hook.go:154 .func1\n","stream":"stderr","time":"2024-01-05T14:23:32.856737978Z"}
{"log":"14:23:32.857 WARN relay      Reserve relay circuit: reservation error: status: CONNECTION_FAILED reason: failed to open stream err: failed to dial: failed to dial 16Uiu2HAmPKHeK1E21iPJSJNfMKonm3wEFZEYh6dhtBU4NHmegDs8:\n","stream":"stderr","time":"2024-01-05T14:23:32.857544614Z"}
{"log":"  * [/ip4/158.69.126.60/tcp/3610] dial tcp4 0.0.0.0:3610-\u003e158.69.126.60:3610: i/o timeout\n","stream":"stderr","time":"2024-01-05T14:23:32.857563399Z"}
{"log":"  * [/ip4/66.70.178.154/tcp/3610] dial tcp4 0.0.0.0:3610-\u003e66.70.178.154:3610: i/o timeout {\"relay_peer\": \"ample-shapes\"}\n","stream":"stderr","time":"2024-01-05T14:23:32.857566885Z"}
{"log":"\u0009p2p/relay.go:46 .func2\n","stream":"stderr","time":"2024-01-05T14:23:32.85756959Z"}
{"log":"\u0009app/lifecycle/hook.go:48 .Call\n","stream":"stderr","time":"2024-01-05T14:23:32.857571985Z"}
{"log":"\u0009app/lifecycle/hook.go:131 .startHook\n","stream":"stderr","time":"2024-01-05T14:23:32.857574059Z"}
{"log":"\u0009app/lifecycle/hook.go:154 .func1\n","stream":"stderr","time":"2024-01-05T14:23:32.857576503Z"}

It would help if the logs contained the full lock hash (cluster_hash is probably the lock hash but its only the short code) and enr pubkey for this node, to help confirm they are running with the appropriate node private key, and have the lock hash we expect them to have setup. (The fear here is this cluster had a false start, cleared down their keys and went again, but that this operator is running an old lock hash. Unless we are supremely unlucky with a shortcode collision, this is not the case, so the proposed change my not necessarily fix the issue/make much of a difference in this instance).

🛠️ Proposed solution

Alongside cluster_hash log the full lock_hash.
Some appropriate time in app-start, log the enr pubkey and peer name.

(This might already happen in normal operation, but because the relays are failing the line isn't getting logged? In that case it might be as easy to close this ticket, or just log the full lock hash)

  • Approved design doc: link
  • Core team consensus on the proposed solution

Describe the solution to be implemented

🧪 Tests

  • Tested by new automated unit/integration/smoke tests
  • Manually tested on core team/canary/test clusters
  • Manually tested on local compose simnet
@github-actions github-actions bot added the protocol Protocol Team tickets label Jan 10, 2024
@boulder225 boulder225 added the V1 label Mar 1, 2024
@gsora
Copy link
Collaborator

gsora commented Mar 14, 2024

Hey team! Please add your planning poker estimate with Zenhub @pinebit @KaloyanTanev

@pinebit pinebit self-assigned this Mar 18, 2024
obol-bulldozer bot pushed a commit that referenced this issue Mar 19, 2024
Added logging of full lock_hash and node's enr when charon starts.

category: refactor
ticket: #2778
@pinebit pinebit closed this as completed Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
protocol Protocol Team tickets V1
Projects
None yet
Development

No branches or pull requests

4 participants