Search before asking
- I had searched in the issues and found no similar feature requirement.
Description
1. Context & Motivation
Current State: BanyanDB currently relies on etcd as a hard dependency for cluster coordination, metadata storage, and node discovery (Meta Nodes). Operators must therefore maintain a separate etcd cluster, manage leases for health checks, and handle complex certificate management for secure communication.
Goal: Transform BanyanDB into a "Zero-Dependency" architecture by replacing the etcd-based registry with a decentralized DNS-based Node Discovery mechanism. This simplifies deployment on Kubernetes (StatefulSets) and static environments (VMs/Edge).
2. Technical Design Specification
2.1 Core Abstraction: NodeRegistry
We will introduce a modular NodeRegistry interface to decouple the discovery logic from the specific implementation (a sketch follows the flow comparison below).
- Old Flow: Liaison -> Watch etcd Key -> Update gRPC Connection.
- New Flow: Liaison -> Poll NodeRegistry -> Update gRPC Connection.
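A hypothetical shape for this abstraction in Go; the method set and the Node fields shown here are illustrative assumptions, not the final API:

```go
// A hypothetical shape for the NodeRegistry abstraction; method names
// and the Node fields are illustrative, not the final API.
package discovery

import "context"

// Node is a discovered peer; the real struct lives in BanyanDB's
// metadata packages.
type Node struct {
	Name    string
	Address string // host:port of the node's gRPC endpoint
}

// NodeRegistry decouples discovery from its backing mechanism
// (DNS, a static file, or the legacy etcd watcher).
type NodeRegistry interface {
	// Nodes returns the current snapshot of known peers.
	Nodes(ctx context.Context) ([]Node, error)
	// Close stops any background pollers or watchers.
	Close() error
}
```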
2.2 Discovery Mechanism (DNS)
The primary implementation will be the DNS Registry, operating in a "Pull-based" model.
- Query Strategy:
  - Primary: Query SRV Records (RFC 2782) to discover target hostnames and dynamic ports (critical for K8s Headless Services).
  - Fallback (Static Registry): To support environments without DNS or for emergency overrides, load a fixed list of peers from a local file (topology.yml). Hot reloading of this file should be supported.
- Polling & Caching:
  - Implement a Custom gRPC Resolver (Go) that polls DNS at a configurable interval (default: 30s). During startup, the interval should be 5 seconds so that topology changes are reflected quickly; two flags should be added to configure the two intervals (a combined sketch follows this list).
  - Two-Layer Caching: Respect the DNS TTL (infrastructure layer) and maintain an internal snapshot (application layer).
- Resilience (Serve Stale):
  - If the DNS server returns a failure (e.g., SERVFAIL, timeout), the resolver MUST NOT flush the current address list.
  - It must log a warning and return the stale (last known good) list of addresses to ensure partition tolerance.
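Below is a condensed sketch combining the three behaviors above (SRV lookup, dual-interval polling, serve stale) as a custom gRPC resolver. The type name, the interval-promotion logic, and the defaults are illustrative assumptions, and the static topology.yml fallback is omitted for brevity:

```go
// A condensed sketch of the resolver described above; type names and
// the interval-promotion logic are illustrative assumptions.
// The static topology.yml fallback is omitted for brevity.
package discovery

import (
	"fmt"
	"log"
	"net"
	"time"

	"google.golang.org/grpc/resolver"
)

type dnsResolver struct {
	cc        resolver.ClientConn
	service   string             // SRV service label, e.g. "grpc"
	domain    string             // e.g. a K8s headless service FQDN
	bootstrap time.Duration      // startup polling interval (default: 5s)
	steady    time.Duration      // steady-state polling interval (default: 30s)
	last      []resolver.Address // last known good list, for serve-stale
	done      chan struct{}
}

// lookup issues the primary SRV query (RFC 2782) and maps the answers
// to host:port addresses.
func (r *dnsResolver) lookup() ([]resolver.Address, error) {
	_, srvs, err := net.LookupSRV(r.service, "tcp", r.domain)
	if err != nil {
		return nil, err
	}
	addrs := make([]resolver.Address, 0, len(srvs))
	for _, s := range srvs {
		addrs = append(addrs, resolver.Address{Addr: fmt.Sprintf("%s:%d", s.Target, s.Port)})
	}
	return addrs, nil
}

// watch polls DNS, starting at the fast bootstrap interval and relaxing
// to the steady interval after the topology has been resolved once.
func (r *dnsResolver) watch() {
	interval := r.bootstrap
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		addrs, err := r.lookup()
		if err != nil {
			// Serve stale: never flush the current list on SERVFAIL/timeout.
			log.Printf("WARN: DNS lookup failed, keeping %d stale addresses: %v", len(r.last), err)
		} else {
			r.last = addrs
			r.cc.UpdateState(resolver.State{Addresses: addrs})
			if interval == r.bootstrap {
				interval = r.steady
				ticker.Reset(interval)
			}
		}
		select {
		case <-ticker.C:
		case <-r.done:
			return
		}
	}
}

// ResolveNow and Close satisfy the resolver.Resolver interface.
func (r *dnsResolver) ResolveNow(resolver.ResolveNowOptions) {}
func (r *dnsResolver) Close()                               { close(r.done) }
```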
2.3 Peer Discovery
- Liaison Node Discovery: Liaison nodes will discover the Data nodes.
- Data Node Mesh: Data nodes will discover their peers by resolving the same DNS name under which they themselves are published.
- Lifecycle: Hot nodes discover Warm/Cold nodes.
2.4 Two-Phase Discovery
Instead of reading the full Node struct from etcd before connecting, the Liaison/Data node will first connect via DNS and then query the node directly for its details.
A new gRPC service will be added to return the Node details (a sketch follows).
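As a rough illustration, the second phase could be served as follows; the request and response types stand in for protobuf messages that have not been specified yet:

```go
// Hypothetical server-side handler for the second discovery phase; the
// request and response types stand in for protobuf messages yet to be
// defined.
package discovery

import "context"

type GetNodeRequest struct{}
type GetNodeResponse struct{ Node Node }

// nodeService answers "who are you?" queries from peers that reached
// this node through a DNS-resolved address, replacing the full Node
// struct that was previously read from etcd.
type nodeService struct {
	self Node
}

func (s *nodeService) GetNode(ctx context.Context, _ *GetNodeRequest) (*GetNodeResponse, error) {
	return &GetNodeResponse{Node: s.self}, nil
}
```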
2.5 Troubleshooting DNS Discovery
In the absence of etcdctl, operators need new troubleshooting tools.
State gRPC service: bydbctl/UI -> calls (Liaison/Data).GetClusterState() -> returns the internal node list derived from DNS. In the future, the service will expose more internal state than DNS alone provides.
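A sketch of how GetClusterState() might be served from the NodeRegistry sketched earlier; the message types are again placeholders:

```go
// Sketch of the troubleshooting endpoint backed by the NodeRegistry
// sketched earlier; the message types are placeholders.
package discovery

import "context"

type GetClusterStateRequest struct{}
type GetClusterStateResponse struct{ Nodes []Node }

type stateService struct {
	registry NodeRegistry
}

// GetClusterState exposes the internal node list derived from DNS so
// that bydbctl or the UI can inspect it without etcdctl.
func (s *stateService) GetClusterState(ctx context.Context, _ *GetClusterStateRequest) (*GetClusterStateResponse, error) {
	nodes, err := s.registry.Nodes(ctx)
	if err != nil {
		return nil, err
	}
	return &GetClusterStateResponse{Nodes: nodes}, nil
}
```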
Metrics: New metrics are required:
- discovery_dns_lookup_duration_seconds
- discovery_dns_lookup_failures_total
- discovery_cluster_size (Gauge)
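For illustration, the three metrics could be registered as below with prometheus/client_golang; BanyanDB's own observability layer may wire them differently:

```go
// Illustrative registration of the three new metrics with
// prometheus/client_golang; BanyanDB's own observability layer may
// wire these differently.
package discovery

import "github.com/prometheus/client_golang/prometheus"

var (
	lookupDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "discovery_dns_lookup_duration_seconds",
		Help: "Latency of DNS lookups performed by the node registry.",
	})
	lookupFailures = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "discovery_dns_lookup_failures_total",
		Help: "Failed DNS lookups (the stale address list was served).",
	})
	clusterSize = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "discovery_cluster_size",
		Help: "Number of peers currently known to the registry.",
	})
)

func init() {
	prometheus.MustRegister(lookupDuration, lookupFailures, clusterSize)
}
```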
3. Task List
- Implement DNSNodeRegistry with net.LookupSRV and net.LookupHost.
- Implement StaticNodeRegistry for fallback/file-based discovery.
- Update Helm Charts.
- Create an E2E test suite for startup.
- Update documentation (concept and operational documents).
Use case
No response
Related issues
No response
Are you willing to submit a pull request to implement this on your own?
- Yes I am willing to submit a pull request on my own!
Code of Conduct
- I agree to follow this project's Code of Conduct