@doguhanniltextra commented Sep 2, 2025

This PR implements a new status command in the SPIKE CLI.

The command displays basic status and statistics of SPIKE Nexus, including:

  • Whether the root key is initialized
  • Number of secrets currently stored
  • Health status of SPIKE Nexus

This addresses task #139. It provides a simple foundation for checking the system
status. Detailed health checks can be added in future iterations.

@v0lkan left a comment

Thanks for tackling this issue 🙏.

As I've mentioned in the comments, creating the status API in SPIKE Nexus is the better option, as it is closer to the source of truth.

…monitoring

- Implements StatusResponse with health, keepers, root key, backing store status
- Adds proper Go documentation for all types and functions
- Uses state.RootKeyZero() for correct root key availability check
- Includes FIPS mode reporting and system uptime metrics
- Addresses all review feedback from PR spiffe#139
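
For illustration, a status payload along the lines of the list above might look like the sketch below. The field names, JSON tags, and the `KeeperStatus` and `encode` helpers are all hypothetical; the actual types in this PR (later moved to the Go SDK) may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// KeeperStatus is a hypothetical per-Keeper entry.
type KeeperStatus struct {
	ID      string `json:"id"`
	Healthy bool   `json:"healthy"`
}

// StatusResponse is a hypothetical payload shape. Field names and JSON tags
// are assumptions for illustration only, not the types used in the PR.
type StatusResponse struct {
	Health              string         `json:"health"`
	Keepers             []KeeperStatus `json:"keepers,omitempty"`
	RootKeyAvailable    bool           `json:"root_key_available"`
	BackingStoreHealthy bool           `json:"backing_store_healthy"`
	FIPSMode            bool           `json:"fips_mode"`
	UptimeSeconds       int64          `json:"uptime_seconds"`
	// SecretsCount is a pointer so that "unknown" (nil) is distinct from zero.
	SecretsCount *int `json:"secrets_count,omitempty"`
}

// encode renders the response as compact JSON.
func encode(s StatusResponse) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Pretend the process started 90 seconds ago.
	started := time.Now().Add(-90 * time.Second)

	out, err := encode(StatusResponse{
		Health:              "OK",
		RootKeyAvailable:    true,
		BackingStoreHealthy: true,
		UptimeSeconds:       int64(time.Since(started).Seconds()),
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Using a pointer for the secrets count keeps "not applicable" separate from "zero secrets", which matters for backends that have no real store.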

@v0lkan left a comment

We are getting there bit by bit, but we still need changes.

- Renamed API URL from NexusOperatorStatus to NexusHealthStatus
- Moved status.go from ./operator to ./health
- Updated route mapping to use health.RouteGetStatus
- Status endpoint no longer implies operator-only access
- Use sync.WaitGroup to run keeper, root key, backing store checks concurrently.
- Add context with timeout to prevent endpoint from blocking indefinitely.
- Aggregate results into StatusResponse once all checks finish or timeout occurs.
- isBackingStoreHealthy() → backingStoreHealthy()
- isRootKeyAvailable() → rootKeyAvailable()
- isFIPSMode() → fipsMode()

Renamed predicate functions for better readability and consistency
with Go coding standards. All usages within health package updated
accordingly. No behavioral changes introduced.
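
The concurrent checks mentioned above (sync.WaitGroup plus a context timeout) could be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the `check` type, `runChecks`, `demo`, and the probe names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// check is a hypothetical health probe. It should return promptly once ctx
// is cancelled.
type check func(ctx context.Context) bool

// runChecks runs all probes concurrently and returns once every probe has
// finished or the timeout elapses, whichever comes first. Probes that have
// not reported by then keep their default value of false (unhealthy).
func runChecks(parent context.Context, timeout time.Duration, checks map[string]check) map[string]bool {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	var mu sync.Mutex
	results := make(map[string]bool, len(checks))
	for name := range checks {
		results[name] = false
	}

	var wg sync.WaitGroup
	for name, c := range checks {
		wg.Add(1)
		go func(name string, c check) {
			defer wg.Done()
			ok := c(ctx)
			mu.Lock()
			results[name] = ok
			mu.Unlock()
		}(name, c)
	}

	// Closing done after wg.Wait lets us select between "all finished" and
	// "timed out", so a probe that ignores ctx cannot block the endpoint.
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()

	select {
	case <-done:
	case <-ctx.Done():
	}

	// Return a copy so late writers cannot race with the caller.
	mu.Lock()
	defer mu.Unlock()
	snapshot := make(map[string]bool, len(results))
	for k, v := range results {
		snapshot[k] = v
	}
	return snapshot
}

// demo runs two fast probes and one that only returns on cancellation.
func demo() map[string]bool {
	return runChecks(context.Background(), 100*time.Millisecond, map[string]check{
		"root_key":      func(context.Context) bool { return true },
		"backing_store": func(context.Context) bool { return true },
		"keepers": func(ctx context.Context) bool {
			select {
			case <-ctx.Done():
				return false
			case <-time.After(5 * time.Second):
				return true
			}
		},
	})
}

func main() {
	fmt.Println(demo())
}
```

Defaulting every result to false means a timeout is reported as unhealthy rather than silently omitted, which seems like the safer reading of "aggregate results once all checks finish or timeout occurs".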
- Updated determineOverallHealth to return "OK" in in-memory mode, even though no root key is available there (by design).
- Root key availability is only enforced for Lite and SQLite modes.
- Renamed the root key failure return value to "ROOT_KEY_UNAVAILABLE" for clarity.
- Added determineOverallHealth() to check the backing store and root key independently.
- Memory mode is considered healthy even if the root key is unavailable.
- Lite mode requires a root key but no backing store.
- backingStoreHealthy() handles panics; rootKeyAvailable() reflects the backend type.
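
The rules listed above can be written down as a small decision function. The sketch below is an assumption-laden illustration: the `backend` type, the `safeProbe` helper, the "BACKING_STORE_UNAVAILABLE" string, and the order in which failures are reported are all hypothetical; only "OK" and "ROOT_KEY_UNAVAILABLE" come from the commit messages.

```go
package main

import "fmt"

// backend is a hypothetical enumeration of the store types named above.
type backend string

const (
	memory backend = "memory"
	lite   backend = "lite"
	sqlite backend = "sqlite"
)

// determineOverallHealth mirrors the rules from the commit messages:
// memory mode is always OK, lite needs only the root key, and sqlite needs
// both the backing store and the root key.
func determineOverallHealth(b backend, rootKeyOK, storeOK bool) string {
	switch b {
	case memory:
		return "OK" // no root key by design
	case lite:
		if !rootKeyOK {
			return "ROOT_KEY_UNAVAILABLE"
		}
		return "OK" // no backing store to check
	default:
		if !storeOK {
			// Hypothetical status string; the PR does not name it.
			return "BACKING_STORE_UNAVAILABLE"
		}
		if !rootKeyOK {
			return "ROOT_KEY_UNAVAILABLE"
		}
		return "OK"
	}
}

// safeProbe runs probe and reports unhealthy if it panics, which is one way
// a check like backingStoreHealthy() could "handle panics".
func safeProbe(probe func() bool) (healthy bool) {
	defer func() {
		if r := recover(); r != nil {
			healthy = false
		}
	}()
	return probe()
}

func main() {
	fmt.Println(determineOverallHealth(memory, false, false))
	fmt.Println(determineOverallHealth(lite, false, true))
	fmt.Println(determineOverallHealth(sqlite, true, safeProbe(func() bool {
		panic("store driver crashed")
	})))
}
```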
Implement secure communication between SPIKE Nexus and SPIKE Keeper
instances using SPIFFE X.509 certificates for mutual TLS authentication.

Files modified:
- status.go: Updated getKeeperStatus() and added callKeeperHealth()
- health.go: Keeper-side health endpoint handler (RouteHealth)
- route.go: Route mapping for /v1/store/health endpoint

Key changes in status.go:
- Replace non-existent GetTLSCertificate() with GetX509SVID()
- Add proper X.509 SVID to tls.Certificate conversion
- Support full certificate chains (leaf + intermediate CAs)
- Implement certificate array iteration for multi-cert scenarios
- Add error handling for SPIFFE source failures
- Test connectivity to multiple Keeper instances (ports 8443, 8543, 8643)

  Authentication flow:
  1. Nexus extracts X.509 SVID from workload API source
  2. Converts SVID certificates to TLS format with private key
  3. Establishes mTLS connection to Keeper /v1/store/health endpoint
  4. Keeper validates peer SPIFFE ID via PeerCanTalkToKeeper()
  5. Returns health status based on MemLocked, FIPS, and ValidSPIFFEID
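
Step 2 of the flow above (turning an SVID's certificate chain and private key into a tls.Certificate) can be sketched with the standard library alone. To stay self-contained, the example generates a self-signed certificate with a SPIFFE URI SAN instead of fetching a real SVID from the Workload API; `selfSignedSVID`, `toTLSCertificate`, and the `spiffe://example.org/nexus` ID are hypothetical. In real code the chain and key would presumably come from the SVID returned by GetX509SVID().

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"math/big"
	"net/url"
	"time"
)

// toTLSCertificate copies every certificate in the chain (leaf first, then
// any intermediates) into the DER slice that crypto/tls expects.
func toTLSCertificate(chain []*x509.Certificate, key crypto.PrivateKey) (tls.Certificate, error) {
	if len(chain) == 0 {
		return tls.Certificate{}, errors.New("empty certificate chain")
	}
	raw := make([][]byte, 0, len(chain))
	for _, c := range chain {
		raw = append(raw, c.Raw)
	}
	return tls.Certificate{Certificate: raw, PrivateKey: key, Leaf: chain[0]}, nil
}

// selfSignedSVID is a stand-in for a real SVID: a one-element chain whose
// leaf carries the given SPIFFE ID as a URI SAN.
func selfSignedSVID(spiffeID string) ([]*x509.Certificate, crypto.Signer, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	id, err := url.Parse(spiffeID)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		URIs:         []*url.URL{id},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return []*x509.Certificate{cert}, key, nil
}

func main() {
	chain, key, err := selfSignedSVID("spiffe://example.org/nexus")
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	cert, err := toTLSCertificate(chain, key)
	if err != nil {
		fmt.Println("convert:", err)
		return
	}
	// The resulting certificate can be used as the client side of mTLS.
	cfg := &tls.Config{Certificates: []tls.Certificate{cert}, MinVersion: tls.VersionTLS12}
	fmt.Println(len(cfg.Certificates[0].Certificate), cert.Leaf.URIs[0])
}
```

A real client would also need to verify the Keeper's certificate against the trust bundle and check its SPIFFE ID; that part is omitted here.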
…first

- Added a health check using backingStoreHealthy() before counting secrets.
- Avoids the expensive ListKeys() operation if the store is unhealthy.
- Ensures getSecretsCount returns nil when the backing store is not healthy.
- Ensures getSecretsCount returns nil when there is no backing store (i.e., in "lite" or "memory" modes) to prevent unnecessary operations.
- Only checks backingStoreHealthy() and lists keys if a real backing store exists.
- All structs moved to go-sdk
- Keeper count removed
- Comment lines added
- Refactored methods based on backend store type
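
The getSecretsCount guard described in the commits above might look something like this sketch. The `store` interface, its method signatures, and `fakeStore` are hypothetical stand-ins; only the nil-when-unavailable behavior and the idea of skipping ListKeys() are taken from the commit messages.

```go
package main

import "fmt"

// store is a hypothetical view of a real backing store.
type store interface {
	Healthy() bool
	ListKeys() []string
}

// getSecretsCount returns nil when there is no real backing store (for
// example in "lite" or "memory" mode) or when the store is unhealthy, so
// the potentially expensive ListKeys call is skipped in both cases.
func getSecretsCount(s store) *int {
	if s == nil {
		return nil
	}
	if !s.Healthy() {
		return nil
	}
	n := len(s.ListKeys())
	return &n
}

// fakeStore records how often ListKeys is called.
type fakeStore struct {
	healthy   bool
	keys      []string
	listCalls int
}

func (f *fakeStore) Healthy() bool { return f.healthy }

func (f *fakeStore) ListKeys() []string {
	f.listCalls++
	return f.keys
}

func main() {
	healthy := &fakeStore{healthy: true, keys: []string{"db/password", "api/token"}}
	if n := getSecretsCount(healthy); n != nil {
		fmt.Println("secrets:", *n)
	}

	broken := &fakeStore{healthy: false, keys: []string{"db/password"}}
	fmt.Println("count is nil:", getSecretsCount(broken) == nil, "list calls:", broken.listCalls)
}
```
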

// HandleRequest pattern
request := net.HandleRequest[
reqres.HealthReadRequest, reqres.HealthReadResponse](

@doguhanniltextra can you add these to the SPIKE Go SDK?

I think it's better to cut a release there, reference it here, and then continue the review.

Thanks a lot 🙏 .

Please let me know when you create the SDK PR, so I can cut a minor release.
