Skip to content

peer: call /peer/verify after starting sing-box; fix doubled /v1#466

Open
myleshorton wants to merge 5 commits into
fisk/peer-localbackendfrom
fisk/peer-call-verify
Open

peer: call /peer/verify after starting sing-box; fix doubled /v1#466
myleshorton wants to merge 5 commits into
fisk/peer-localbackendfrom
fisk/peer-call-verify

Conversation

@myleshorton
Copy link
Copy Markdown
Contributor

@myleshorton myleshorton commented May 6, 2026

Summary

Pairs with lantern-cloud PR getlantern/lantern-cloud#2696 (split live-verify out of /peer/register into an explicit /peer/verify endpoint). Surfaced through end-to-end testing on 2026-05-07; PR has grown to absorb several follow-up fixes that came out of that exercise.

Original three changes

  1. Drop the leading /v1 from peer endpoint paths in peer/api.go. baseURL already ends with /api/v1 (from common.GetBaseURL), so /v1/peer/register was hitting /api/v1/v1/peer/register on prod and 404'ing. Every other radiance API caller appends without /v1 (config/fetcher.go, issue/issue.go); peer/api.go was the odd one out.

  2. Add API.Verify(ctx, routeID) wrapping POST /peer/verify.

  3. Call API.Verify in peer.Client.Start after box.Start succeeds. The server's verifier dials back through the peer's external port using the just-built creds, so the inbound has to be listening before verify runs. Splitting verify out of register resolves the chicken-and-egg where register-time verify could never see a peer that didn't yet have its cert.

Follow-up fixes (added 2026-05-07 from end-to-end testing)

  1. Forward common headers (notably X-Lantern-Config-Client-IP) on every peer endpoint. peer/api.go was building requests with bare http.NewRequestWithContext and skipping the standard common.NewRequestWithHeaders set. lantern-cloud's util.ClientIPWithAddr (cmd/api/util/header.go:155-184) prefers X-Lantern-Config-Client-IP over X-Forwarded-For/RemoteAddr when resolving clientIP; without the header, the server-side verify-dial could target a different IP than radiance has detected as the user's public IP. Adds TestAPI_ForwardsCommonHeaders regression test.

  2. RADIANCE_PEER_EXTERNAL_PORT env-var manual override for routers without UPnP/NAT-PMP/PCP (notably eero). When set, substitutes a manualPortForwarder for the UPnP discovery path. Invalid values fall back to UPnP with a warning rather than silently disabling peer share. Pairs with operator-side router config.

  3. Operator-visibility logs in backend/radiance.go. applyPeerShare now logs the underlying Start error at Error level (and a paired success at Info), and the once-per-daemon "Detected public IP" goes from Debug to Info with the resolved IP added to the structured fields. Both make peer-share triage from local daemon logs much faster.

Stacked on PR #460

This PR's base is fisk/peer-localbackend (PR #460), which introduces the peer package itself. Mergeable in order: #460 → this.

The lantern-cloud counterpart (#2696) can land independently — once it deploys, registration starts returning 200 + ServerConfig without verifying. Until this PR lands and ships in lantern, peers won't trigger /peer/verify and will sit with callback_verified=NULL forever (catalog filter hides them, so no client-visible breakage).

Test plan

  • go test ./peer/ ./backend/ ./common/... — pass
  • End-to-end peer-share toggle hits /peer/register/peer/verify against prod (verified via SigNoz traces 2026-05-07)
  • CI

🤖 Generated with Claude Code

Two changes that pair with the lantern-cloud /peer/verify split:

  1. peer/api.go: drop the leading /v1 from peer endpoint paths.
     baseURL already ends with /api/v1 (from common.GetBaseURL), so
     /v1/peer/register was hitting /api/v1/v1/peer/register on prod
     and 404'ing. Every other radiance API caller appends without
     /v1 (config/fetcher.go, issue/issue.go); peer/api.go was the
     odd one out. Updated NewAPI's docstring to spell out the
     convention.

  2. peer/peer.go: after box.Start succeeds, call API.Verify(routeID).
     The server's verifier dials back through the peer's external
     port using the just-built creds, so the inbound has to be
     listening before verify runs. Splitting verify out of register
     resolves the chicken-and-egg where register-time verify could
     never see a peer that didn't yet have its cert. Verify failure
     here is fatal — the server has already deprecated the row, so
     the deferred cleanup tears the rest of the session down.

  3. peer/api.go: new API.Verify(ctx, routeID) wrapping POST
     /peer/verify.

Tests: stubServer's mux handles the new /peer/verify route plus
verifyCount / verifyDeviceID / verifyStatus knobs. Existing tests
exercise the new step transparently because they use the default
verifyStatus=200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 6, 2026 20:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the peer “Share My Connection” flow to align with the lantern-cloud change that splits verification into an explicit /peer/verify step, and fixes incorrect peer API endpoint path construction that could produce doubled /v1 in production.

Changes:

  • Remove the redundant /v1 prefix from peer API endpoint paths so they append correctly to common.GetBaseURL().
  • Add API.Verify(ctx, routeID) for POST /peer/verify.
  • Call API.Verify from peer.Client.Start after sing-box successfully starts listening.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
peer/peer.go Triggers server-side verification after sing-box startup to avoid the previous register-time chicken-and-egg.
peer/peer_test.go Extends the stub server with a /peer/verify endpoint and related counters/status fields.
peer/api.go Fixes endpoint paths (remove /v1 prefix) and adds an API method for /peer/verify.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread peer/api.go
Comment on lines +51 to +54
// version prefix (matches common.GetBaseURL() which returns ".../api/v1");
// per-endpoint paths are appended without re-adding /v1, mirroring every
// other radiance caller of common.GetBaseURL (config/fetcher.go,
// issue/issue.go, etc.).
Comment thread peer/peer_test.go
Comment on lines 170 to 172
mux := http.NewServeMux()
mux.HandleFunc("/v1/peer/register", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/peer/register", func(w http.ResponseWriter, r *http.Request) {
s.registerCount.Add(1)
Comment thread peer/peer_test.go
Comment on lines +183 to +191
mux.HandleFunc("/peer/verify", func(w http.ResponseWriter, r *http.Request) {
s.verifyCount.Add(1)
s.verifyDeviceID.Store(r.Header.Get("X-Lantern-Device-Id"))
if s.verifyStatus != http.StatusOK {
http.Error(w, "verify failed", s.verifyStatus)
return
}
w.WriteHeader(http.StatusOK)
})
Adam Fisk and others added 4 commits May 6, 2026 14:42
defaultBuildBoxService used to call libbox.NewServiceWithContext with
the caller's bare ctx, which has no lantern-box protocol registries
plumbed in. The samizdat inbound type ServerConfig sends back from
/peer/register isn't a built-in sing-box protocol, so libbox's JSON
decoder couldn't resolve inbounds[0].type="samizdat" and returned
"missing inbound fields registry in context". The integration tests
stub BuildBoxService entirely, so this layer was never exercised in
CI — only surfaced live during the eero end-to-end test.

Two pieces:

  1. Use box.BaseContext() (from getlantern/lantern-box) when calling
     libbox.NewServiceWithContext. That ctx has the InboundOptionsRegistry
     populated with samizdat / reflex / etc. so the decode succeeds.
     Coexists with the user's VPN tunnel (vpn/tunnel.go) — libbox.Setup
     is process-global, the ctx registries are per-box.

  2. TestDefaultBuildBoxService_DecodesSamizdatInbound walks the actual
     decode path with a minimal samizdat-inbound JSON. Verified to fail
     with the exact production error message under the pre-fix code,
     pass under the fix. Cuts the diagnostic loop from a 5-minute
     rebuild+redeploy+toggle cycle to a 0.5s test failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…every peer endpoint

peer/api.go was building requests with bare http.NewRequestWithContext,
skipping the X-Lantern-Config-Client-IP / X-Lantern-User-Id / version
header set that /config-new sends via common.NewRequestWithHeaders.

That mattered for /peer/register specifically: the server's
util.ClientIPWithAddr (lantern-cloud cmd/api/util/header.go:155-184)
prefers X-Lantern-Config-Client-IP over X-Forwarded-For and RemoteAddr
when resolving clientIP. With the header missing, the server fell back
to whatever its X-Forwarded-For chain produced — potentially a
different IP than the radiance-detected publicIP, leading the verifier
to dial back to an address the peer's listener wasn't bound to.

Switching to common.NewRequestWithHeaders makes peer endpoints
consistent with /config-new's header set:

  - X-Lantern-Config-Client-IP (the key one for verify-dial targeting)
  - X-Lantern-App-Version, X-Lantern-Version, X-Lantern-Platform,
    X-Lantern-App, X-Lantern-User-Id, X-Lantern-Time-Zone, X-Lantern-Rand

DeviceIDHeader is set by NewRequestWithHeaders from settings; we
explicitly re-set it to a.deviceID afterward for parity with the
prior behavior in case the two ever diverge.

Adds TestAPI_ForwardsCommonHeaders which hits all four peer endpoints
against a stub server and asserts each carries the expected headers
(uses common.SetPublicIP / Cleanup to avoid leaking into other tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uters

UPnP / NAT-PMP / PCP discovery is silent or absent on a meaningful
chunk of consumer routers — eero in particular ignores all three.
For peers behind such routers, peer.Client.Start currently fails at
the MapPort step with portforward.ErrNoPortForwarding even though
the operator has perfectly valid manual port-forward rules in their
router admin UI.

Add an env-var escape hatch: when RADIANCE_PEER_EXTERNAL_PORT is set
to a 1..65535 value, NewClient's default forwarder substitutes a
manualPortForwarder that:

  - Returns the manual port unchanged for both internal and external
    sides of the Mapping (operator is responsible for matching the
    sing-box bind to the same port).
  - Returns "" from ExternalIP, letting peer_handler's "external_ip
    empty -> use observed" fall-through resolve the IP server-side.
  - Is a no-op for UnmapPort and StartRenewal (nothing to release;
    the manual rule is operator-managed).

Invalid values (non-numeric, <1, >65535) log a warning and fall back
to the default UPnP path so a typo doesn't silently disable peer
share entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… log to Info

Two operator-visibility tweaks that helped during peer-share testing
and are worth keeping:

1. applyPeerShare(true) now logs the underlying Start error at Error
   level (and a paired success log at Info). Without this, a peer
   share toggle that fails server-side (4xx, UPnP miss, samizdat
   verify timeout) only surfaces via the IPC HTTP response — a
   layer the daemon log never sees, making post-hoc triage from
   the user's local logs much harder than necessary.

2. The "Detected public IP" log goes from Debug to Info, with the
   resolved IP added to the structured fields. publicIP is fetched
   exactly once per daemon lifetime; emitting it at Info gives
   operators a single line to compare against what lantern-cloud
   observed for that same client (visible in SigNoz traces) without
   needing to flip the global log level.

No behavior change beyond the log lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Comment thread peer/peer.go
Comment on lines +478 to 480
func defaultBuildBoxService(_ context.Context, options string) (boxService, error) {
bs, err := libbox.NewServiceWithContext(box.BaseContext(), options, nil)
if err != nil {
Comment thread peer/api.go
Comment on lines +52 to +55
// version prefix (matches common.GetBaseURL() which returns ".../api/v1");
// per-endpoint paths are appended without re-adding /v1, mirroring every
// other radiance caller of common.GetBaseURL (config/fetcher.go,
// issue/issue.go, etc.).
Comment thread peer/peer.go
Comment on lines +271 to +273
// is fatal — the server has already deprecated the row, so the
// deferred cleanup tears the rest of the session down.
if err := c.cfg.API.Verify(ctx, regResp.RouteID); err != nil {
Comment thread peer/peer_test.go
Comment on lines +184 to +192
mux.HandleFunc("/peer/verify", func(w http.ResponseWriter, r *http.Request) {
s.verifyCount.Add(1)
s.verifyDeviceID.Store(r.Header.Get("X-Lantern-Device-Id"))
if s.verifyStatus != http.StatusOK {
http.Error(w, "verify failed", s.verifyStatus)
return
}
w.WriteHeader(http.StatusOK)
})
Comment thread backend/radiance.go
} else {
common.SetPublicIP(result.IP.String())
slog.Debug("Detected public IP", "confidence", result.Confidence, "sources", result.Sources)
slog.Info("Detected public IP", "ip", result.IP.String(), "confidence", result.Confidence, "sources", result.Sources)
Comment thread backend/radiance.go
Comment on lines +502 to 506
// Surface the underlying Start error so operators can see it
// in the daemon log (UPnP failure, registration 4xx, etc.)
// rather than only via the IPC HTTP response.
slog.Error("peer share start failed", "err", err)
if rbErr := settings.Patch(settings.Settings{settings.PeerShareEnabledKey: false}); rbErr != nil {
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants