
feat(bfd): gRPC GetBfdSessions + rustbgpctl bfd (ADR-0067 PR3 — operator surface)#252

Merged
lance0 merged 4 commits into
mainfrom
feat/bfd-surface
May 24, 2026

Conversation


@lance0 lance0 commented May 24, 2026

Operator inspection surface for single-hop BFD (ADR-0067), building on the
merged actor (PR2, #251). Observability before the RFC 5882 BGP coupling slice.

In this PR

  • proto: BfdService.GetBfdSessions, BfdSession (peer, state,
    diagnostic, strict) + BfdSessionState enum, with an optional peer-address
    filter. Also lands the event contract so the proto is stable now and the
    emission wiring (PR3b) won't churn it: EVENT_CATEGORY_BFD,
    BGP_EVENT_TYPE_BFD_SESSION_{UP,DOWN,STATE_CHANGED}, BfdSessionEvent,
    BgpEvent.bfd oneof.
  • crates/api: bfd_service.rs (snapshot-provider, mirrors rib_service's
    FIB snapshot; parses the peer filter to IpAddr and matches canonical
    forms), registered on both TCP + UDS listeners; ADR-0064 authz tier
    (SensitiveRead) + method-inventory JSON + count tests updated.
  • daemon: the BfdService snapshot is wired from the actor's status
    watch channel (the receiver retained in PR2), with BFD state/diagnostic →
    proto conversions.
  • CLI: rustbgpctl bfd [list | show <peer>], JSON + table output.
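As a rough illustration of the snapshot-provider wiring described in the list above, the sketch below lets the API layer hold a closure that returns the current session list on every call. All names here (`BfdSessionView`, `SnapshotFn`, `provider_from`, `list_sessions`) are hypothetical, and a `std::sync::RwLock` stands in for the actor's status watch channel, so this shows the shape of the pattern under those assumptions rather than the code in bfd_service.rs.

```rust
use std::net::IpAddr;
use std::sync::{Arc, RwLock};

/// Hypothetical, simplified view of one BFD session (not the generated proto type).
#[derive(Clone, Debug, PartialEq)]
pub struct BfdSessionView {
    pub peer_address: IpAddr,
    pub state: &'static str,
    pub diagnostic: &'static str,
    pub strict: bool,
}

/// A snapshot provider is a closure the API layer calls once per request.
pub type SnapshotFn = Arc<dyn Fn() -> Vec<BfdSessionView> + Send + Sync>;

/// Build a provider over shared state. The `RwLock` is an assumption standing
/// in for the actor's status watch channel: the daemon side writes, the API
/// side only ever reads a cloned snapshot.
pub fn provider_from(shared: Arc<RwLock<Vec<BfdSessionView>>>) -> SnapshotFn {
    Arc::new(move || shared.read().expect("status lock poisoned").clone())
}

/// Hypothetical handler body: every call takes a fresh snapshot.
pub fn list_sessions(snapshot: &SnapshotFn) -> Vec<BfdSessionView> {
    snapshot()
}
```

Because the closure is evaluated per request, a session published after the provider was built still shows up in the next `list_sessions` call, which is the property a read-only inspection RPC needs.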

Scope note — inspection only; events do not stream yet

The plan's PR3 groups surface (gRPC + CLI) with event emission (actor →
WatchEvents). This PR is the surface half: a complete, mergeable,
read-only inspection unit. BFD events do not yet stream: WatchEvents
explicitly rejects EVENT_CATEGORY_BFD and the BFD event types with a
"not yet available" error rather than handing back an empty stream. The event
proto contract is landed here so the emission half (PR3b) is purely wiring:
the actor emits an internal BfdRuntimeEvent → bridge → EventService BFD
stream, and the filter reject flips to accept. PR3b changes the merged actor's
spawn signature and adds a stream to event_service, so it is a cleaner
separate review.
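The reject-instead-of-empty-stream behaviour can be sketched as follows. The enum, the error type, and the function name are hypothetical stand-ins (the real code uses the generated proto enum and a gRPC status); only the two category names and the error text come from this PR.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the generated proto enum; only two variants shown.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum EventCategory {
    Evpn,
    Bfd,
}

/// Hypothetical stand-in for an InvalidArgument gRPC status.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidArgument(pub &'static str);

/// Accept categories that already have a stream behind them and fail fast on
/// BFD, so a client never subscribes to a stream that cannot deliver events.
pub fn parse_categories(
    requested: &[EventCategory],
) -> Result<HashSet<EventCategory>, InvalidArgument> {
    let mut parsed = HashSet::new();
    for &category in requested {
        match category {
            EventCategory::Evpn => {
                parsed.insert(category);
            }
            // Contract is defined, emission is not wired yet: reject.
            EventCategory::Bfd => {
                return Err(InvalidArgument(
                    "BFD event streaming is not yet available",
                ));
            }
        }
    }
    Ok(parsed)
}
```

Under this shape, the follow-up that wires emission would only need to move the `Bfd` arm into the accepting branch.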

Testing

  • cargo fmt / clippy --workspace -D warnings / test --workspace green.
  • bfd_service unit tests (list / peer-filter / invalid-input / IPv6 canonical
    match / empty), ADR-0064 authz matrix coverage + tier-count + inventory tests.

Next: PR3b (BFD event emission into WatchEvents), then PR4 (RFC 5882 coupling)
and PR5 (M51 interop).

Operator inspection surface for BFD sessions — observability before the RFC
5882 BGP coupling slice.

- proto: BfdService.GetBfdSessions, BfdSession + BfdSessionState enum, with an
  optional peer-address filter. Also lands the event contract for the next
  step (EVENT_CATEGORY_BFD, BGP_EVENT_TYPE_BFD_SESSION_{UP,DOWN,STATE_CHANGED},
  BfdSessionEvent, BgpEvent.bfd oneof) so the proto is stable now.
- crates/api: bfd_service.rs (snapshot-provider, mirrors rib_service's FIB
  snapshot), registered on both TCP + UDS listeners; ADR-0064 authz tier
  (SensitiveRead) + method-inventory/count tests updated; event_service
  filters accept the BFD category/types.
- daemon: BfdService snapshot wired from the actor's status watch (the
  receiver kept from PR2), with BFD state/diagnostic → proto conversions.
- CLI: `rustbgpctl bfd [list|show <peer>]`, JSON + table output.

Tests: bfd_service unit tests (list/filter/empty), authz matrix coverage +
tier counts. Next on this branch: actor-emitted BFD events into the unified
WatchEvents stream.
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:33 — with GitHub Actions Inactive
@lance0 lance0 requested a review from Copilot May 24, 2026 17:33
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:36 — with GitHub Actions Inactive
… next

Per review: make the ADR-0067 staged plan and CHANGELOG explicit that the
operator surface slice ships read-only inspection (GetBfdSessions + rustbgpctl
bfd) and the BFD event proto contract, but does NOT yet stream live BFD events
over WatchEvents — actor event emission is a separate follow-up. Splits the
ADR's step 3 into 3a (inspection, shipped) and 3b (emission, next).

Copilot AI left a comment


Pull request overview

Adds an operator-facing inspection surface for single-hop BFD (ADR-0067) by exposing session state over gRPC and surfacing it in rustbgpctl, while also landing the (future) BFD event proto contract and expanding EventService filter support.

Changes:

  • Introduce BfdService.GetBfdSessions and related BFD proto types (sessions, states, event contract).
  • Wire the daemon’s BFD status watch channel into the gRPC API server as a live snapshot source.
  • Add rustbgpctl bfd (list|show <peer>) with table/JSON output, and update authz + method inventory.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/main.rs | Wires BFD actor status into API snapshot provider; adds BFD state/diagnostic conversions. |
| proto/rustbgpd.proto | Adds BFD service/messages/enums and BFD event contract fields/types. |
| docs/grpc-method-inventory.json | Updates method inventory counts and adds BfdService.GetBfdSessions. |
| crates/cli/src/main.rs | Adds rustbgpctl bfd command routing and subcommands. |
| crates/cli/src/commands/mod.rs | Exposes the new bfd command module. |
| crates/cli/src/commands/bfd.rs | Implements BFD list/show output (table + JSON) via gRPC. |
| crates/api/src/server.rs | Registers BfdService on TCP + UDS listeners; threads snapshot through serve config. |
| crates/api/src/lib.rs | Exports the new bfd_service module. |
| crates/api/src/event_service/filters.rs | Extends EventService filter parsing to accept BFD category/event types. |
| crates/api/src/event_service/convert.rs | Extends stream-lag category string mapping to include BFD. |
| crates/api/src/bfd_service.rs | Implements BfdService backed by a daemon-provided snapshot closure + unit tests. |
| crates/api/src/authz.rs | Adds authz tier entry for BfdService.GetBfdSessions and updates method counts/tests. |


Comment on lines +43 to +47
let filter = request.into_inner().peer_address;
let mut sessions = (self.snapshot)();
if !filter.is_empty() {
    sessions.retain(|s| s.peer_address == filter);
}
Comment thread proto/rustbgpd.proto Outdated
Comment on lines +608 to +611
// stable snake_case name: "none", "control_detection_time_expired",
// "echo_function_failed", "neighbor_signaled_session_down",
// "forwarding_plane_reset", "path_down", "concatenated_path_down",
// "administratively_down", "reverse_concatenated_path_down".
Comment thread crates/api/src/event_service/filters.rs Outdated
Comment on lines 391 to 394
| proto::EventCategory::Evpn
| proto::EventCategory::Bfd => {
    parsed.insert(category as i32);
}
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:39 — with GitHub Actions Inactive
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:42 — with GitHub Actions Inactive
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:43 — with GitHub Actions Inactive
…s, doc reserved diag

Three findings on the surface PR:
- GetBfdSessions parses peer_address to IpAddr (invalid_argument on bad input)
  and compares canonical forms, so equivalent IPv6 textual representations
  match — mirrors NeighborService/RibService filter handling.
- WatchEvents now *rejects* EVENT_CATEGORY_BFD and the BFD event types with
  "BFD event streaming is not yet available" instead of accepting a filter that
  yields an empty/immediately-closed stream. The proto contract stays defined;
  PR3b flips these to accept + wires the actual stream.
- proto BfdSession.diagnostic documents "reserved" as a possible value (the
  daemon maps unassigned RFC 5880 diagnostic codes to it).

Tests: invalid-peer rejection + IPv6 canonical-match in bfd_service.
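The diagnostic naming, including the "reserved" fallback from the third finding, could look roughly like the mapping below. The function name and the `u8` input are assumptions; the string names are the ones listed in the proto comment, and codes 0 through 8 are the values assigned in RFC 5880 section 4.1, with 9 through 31 reserved for future use.

```rust
/// Map an RFC 5880 diagnostic code to a stable snake_case name.
/// `diagnostic_name` is a hypothetical helper, not the daemon's actual function.
pub fn diagnostic_name(code: u8) -> &'static str {
    match code {
        0 => "none",
        1 => "control_detection_time_expired",
        2 => "echo_function_failed",
        3 => "neighbor_signaled_session_down",
        4 => "forwarding_plane_reset",
        5 => "path_down",
        6 => "concatenated_path_down",
        7 => "administratively_down",
        8 => "reverse_concatenated_path_down",
        // Codes the RFC leaves unassigned collapse to one documented value,
        // so clients never see an undocumented string.
        _ => "reserved",
    }
}
```

Documenting "reserved" in the proto matters because a peer can legally send a code this table does not name, and the operator surface should still render something predictable.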
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:46 — with GitHub Actions Inactive
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:49 — with GitHub Actions Inactive

Copilot AI left a comment


Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Comment thread crates/api/src/bfd_service.rs Outdated
// peer addresses are already `IpAddr::to_string()` (canonical).
let wanted = filter
    .parse::<std::net::IpAddr>()
    .map_err(|_| Status::invalid_argument(format!("invalid peer_address {filter:?}")))?
Comment on lines +394 to +401
// The BFD event proto contract exists, but the actor does not yet
// emit into WatchEvents (ADR-0067 step 3b). Reject the filter rather
// than hand back an empty/immediately-closed stream.
proto::EventCategory::Bfd => {
    return Err(Status::invalid_argument(
        "BFD event streaming is not yet available",
    ));
}
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:50 — with GitHub Actions Inactive
Use "invalid peer_address: {e}" to match RibService/NeighborService so
clients get consistent InvalidArgument text across services.
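A minimal sketch of the peer filter with that error text, assuming stored peer addresses are already canonical `IpAddr::to_string()` output as the review snippet notes. The function names are hypothetical and a plain `String` stands in for the gRPC status; the point is that the filter is parsed first, so equivalent IPv6 spellings compare equal and bad input is rejected instead of silently matching nothing.

```rust
use std::net::IpAddr;

/// Parse the optional filter. An empty string means "no filter".
/// The `String` error stands in for an InvalidArgument status.
pub fn parse_peer_filter(filter: &str) -> Result<Option<IpAddr>, String> {
    if filter.is_empty() {
        return Ok(None);
    }
    filter
        .parse::<IpAddr>()
        .map(Some)
        .map_err(|e| format!("invalid peer_address: {e}"))
}

/// Keep only peers whose canonical text form equals the canonical form of
/// the filter. `stored` is assumed to hold `IpAddr::to_string()` output.
pub fn filter_peers(stored: &[String], filter: &str) -> Result<Vec<String>, String> {
    let wanted = parse_peer_filter(filter)?.map(|ip| ip.to_string());
    Ok(stored
        .iter()
        .filter(|s| wanted.as_ref().map_or(true, |w| *s == w))
        .cloned()
        .collect())
}
```

With this shape, a request for `2001:0DB8:0:0:0:0:0:1` matches a session stored as `2001:db8::1`, which a raw string comparison would miss.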
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:53 — with GitHub Actions Inactive
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:56 — with GitHub Actions Inactive
@lance0 lance0 temporarily deployed to kernel-dataplane-auto May 24, 2026 17:57 — with GitHub Actions Inactive
@lance0 lance0 merged commit 91dc36a into main May 24, 2026
37 checks passed
@lance0 lance0 deleted the feat/bfd-surface branch May 24, 2026 17:59