| 1 | +# FoundationDB Kubernetes Operator - Claude Development Guide |
| 2 | + |
| 3 | +## Project Overview |
| 4 | + |
| 5 | +The FoundationDB Kubernetes Operator is a Kubernetes controller that manages FoundationDB clusters. It automates deployment, scaling, backup, restore, and maintenance operations for FoundationDB distributed database clusters. |
| 6 | + |
| 7 | +### Architecture Components |
| 8 | + |
| 9 | +- **API Types** (`api/v1beta2/`): Custom Resource Definitions (CRDs) for FoundationDBCluster, FoundationDBBackup, FoundationDBRestore |
| 10 | +- **Controllers** (`controllers/`): Reconciliation logic for cluster lifecycle management |
| 11 | +- **Internal Packages** (`internal/`): Core business logic for coordination, replacements, maintenance, etc. |
| 12 | +- **Library Packages** (`pkg/`): Reusable components such as admin clients, pod managers, and status checks |
| 13 | +- **E2E Tests** (`e2e/`): Comprehensive end-to-end testing with chaos engineering |
| 14 | + |
| 15 | +### Key Patterns |
| 16 | + |
| 17 | +- **Controller-Runtime**: Built on `sigs.k8s.io/controller-runtime` framework |
| 18 | +- **Reconciliation Loops**: Event-driven state reconciliation |
| 19 | +- **Mock-Based Testing**: Extensive mocking for unit tests |
| 20 | +- **Chaos Engineering**: Production-like failure testing with chaos-mesh |
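The reconciliation-loop pattern listed above can be illustrated with a small, standard-library-only sketch. It is not the operator's code: `ClusterState`, `reconcile`, and the process counts are hypothetical names chosen for illustration, and the real controllers work against Kubernetes objects through controller-runtime.

```go
package main

import "fmt"

// ClusterState is a hypothetical, simplified stand-in for a cluster's
// desired or observed state. The real operator tracks far more than a count.
type ClusterState struct {
	StorageProcesses int
}

// reconcile compares the desired state with the observed state, moves the
// observed state one step closer, and reports whether another pass is needed.
func reconcile(desired ClusterState, observed *ClusterState) (requeue bool) {
	switch {
	case observed.StorageProcesses < desired.StorageProcesses:
		observed.StorageProcesses++ // add one process per pass
	case observed.StorageProcesses > desired.StorageProcesses:
		observed.StorageProcesses-- // remove one process per pass
	}
	return observed.StorageProcesses != desired.StorageProcesses
}

func main() {
	desired := ClusterState{StorageProcesses: 3}
	observed := ClusterState{StorageProcesses: 1}

	// Keep reconciling until the observed state matches the desired state.
	passes := 1
	for reconcile(desired, &observed) {
		passes++
	}
	fmt.Printf("converged to %d storage processes after %d passes\n",
		observed.StorageProcesses, passes)
}
```

Each pass makes a small, repeatable change and then reports whether it should run again, so an interrupted run can simply be retried.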
| 21 | + |
| 22 | +## Development Environment Setup |
| 23 | + |
| 24 | +### Prerequisites |
| 25 | + |
| 26 | +```bash |
| 27 | +# Go 1.24+ required |
| 28 | +go version |
| 29 | + |
| 30 | +# Install dependencies |
| 31 | +make deps |
| 32 | + |
| 33 | +# Install FoundationDB client package |
| 34 | +# For macOS: Download from https://github.com/apple/foundationdb/releases |
| 35 | +# For arm64 Mac: Make sure to install the arm64 package |
| 36 | +``` |
| 37 | + |
| 38 | +### Local Development |
| 39 | + |
| 40 | +```bash |
| 41 | +# Clone repository |
| 42 | +git clone https://github.com/FoundationDB/fdb-kubernetes-operator |
| 43 | +cd fdb-kubernetes-operator |
| 44 | + |
| 45 | +# Set up test certificates |
| 46 | +config/test-certs/generate_secrets.bash |
| 47 | + |
| 48 | +# Build and deploy operator (requires local K8s cluster) |
| 49 | +make rebuild-operator |
| 50 | + |
| 51 | +# Create test cluster |
| 52 | +kubectl apply -k ./config/tests/base |
| 53 | +``` |
| 54 | + |
| 55 | +## Build System & Tooling |
| 56 | + |
| 57 | +### Primary Make Commands |
| 58 | + |
| 59 | +| Command | Purpose | |
| 60 | +|---------|---------| |
| 61 | +| `make all` | Complete build pipeline: deps, generate, fmt, vet, test, build | |
| 62 | +| `make test` | Run unit tests (Ginkgo with race detection if TEST_RACE_CONDITIONS=1) | |
| 63 | +| `make lint` | Run golangci-lint with project rules | |
| 64 | +| `make fmt` | Format code using golines + goimports + golangci-lint --fix | |
| 65 | +| `make vet` | Run go vet static analysis | |
| 66 | +| `make generate` | Generate deepcopy methods and CRDs | |
| 67 | +| `make manifests` | Generate CRD YAML files | |
| 68 | +| `make container-build` | Build Docker image | |
| 69 | +| `make deploy` | Deploy operator to Kubernetes cluster | |
| 70 | +| `make rebuild-operator` | Build, push (if remote), deploy, and bounce operator | |
| 71 | + |
| 72 | +### Environment Variables |
| 73 | + |
| 74 | +- `IMG`: Operator image name (default: `fdb-kubernetes-operator:latest`) |
| 75 | +- `SIDECAR_IMG`: Sidecar image name |
| 76 | +- `REMOTE_BUILD`: Set to 1 for remote builds (enables image push) |
| 77 | +- `BUILD_PLATFORM`: Override build platform (e.g., `linux/amd64`) |
| 78 | +- `TEST_RACE_CONDITIONS`: Set to 1 to enable race detection in tests |
| 79 | +- `SKIP_TEST`: Set to 1 to skip tests in build |
| 80 | + |
| 81 | +## Testing Framework |
| 82 | + |
| 83 | +### Unit Testing (Ginkgo v2 + Gomega) |
| 84 | + |
| 85 | +```go |
| 86 | +// Example test structure from controllers/suite_test.go |
| 87 | +var _ = Describe("ControllerName", func() { |
| 88 | + BeforeEach(func() { |
| 89 | + // Setup test environment |
| 90 | + k8sClient = mockclient.NewMockClient() |
| 91 | + }) |
| 92 | + |
| 93 | + It("should reconcile successfully", func() { |
| 94 | + // Test implementation |
| 95 | + Expect(result).To(BeNil()) |
| 96 | + }) |
| 97 | +}) |
| 98 | +``` |
| 99 | + |
| 100 | +### E2E Testing |
| 101 | + |
| 102 | +- **Location**: `e2e/` directory with test packages |
| 103 | +- **Framework**: Ginkgo + Gomega + chaos-mesh for failure injection |
| 104 | +- **Types**: Upgrades, HA failures, stress testing, maintenance mode |
| 105 | +- **Run**: `make test` with e2e labels |
| 106 | + |
| 107 | +### Mock Objects |
| 108 | + |
| 109 | +- **Kubernetes Client**: `mockclient.MockClient` |
| 110 | +- **FDB Admin Client**: `mock.DatabaseClientProvider` |
| 111 | +- **Pod Client**: `mockpodclient.NewMockFdbPodClient` |
| 112 | + |
| 113 | +## Code Standards & Conventions |
| 114 | + |
| 115 | +### Linting & Formatting |
| 116 | + |
| 117 | +**golangci-lint Configuration** (`.golangci.yml`): |
| 118 | +- **Enabled Linters**: errcheck, govet, staticcheck, revive, misspell, ineffassign, unused |
| 119 | +- **Formatters**: gofmt with golines (120 char limit) |
| 120 | +- **Dependency Guard**: Restricted import paths for clean architecture |
| 121 | + |
| 122 | +**Formatting Tools**: |
| 123 | +- `golines`: Line length formatting with gofmt base |
| 124 | +- `goimports`: Import organization |
| 125 | +- `golangci-lint run --fix`: Auto-fix issues |
| 126 | + |
| 127 | +### Package Organization |
| 128 | + |
| 129 | +``` |
| 130 | +├── api/v1beta2/ # CRD types and API definitions |
| 131 | +├── controllers/ # Controller reconciliation logic |
| 132 | +├── internal/ # Internal business logic packages |
| 133 | +├── pkg/ # Reusable library packages |
| 134 | +├── e2e/ # End-to-end tests |
| 135 | +├── kubectl-fdb/ # kubectl plugin |
| 136 | +├── fdbclient/ # FDB client utilities |
| 137 | +└── setup/ # Operator setup and configuration |
| 138 | +``` |
| 139 | + |
| 140 | +### Naming Conventions |
| 141 | + |
| 142 | +- **Types**: PascalCase (e.g., `FoundationDBCluster`) |
| 143 | +- **Functions**: PascalCase for exported, camelCase for internal |
| 144 | +- **Constants**: PascalCase or UPPER_SNAKE_CASE for public constants |
| 145 | +- **Files**: snake_case.go |
| 146 | +- **Test Files**: `*_test.go`, with a corresponding `suite_test.go` |
| 147 | + |
| 148 | +### Error Handling |
| 149 | + |
| 150 | +```go |
| 151 | +// Preferred error handling pattern |
| 152 | +err := someOperation() |
| 153 | +if err != nil { |
| 154 | + return reconcile.Result{}, fmt.Errorf("failed to perform operation: %w", err) |
| 155 | +} |
| 156 | + |
| 157 | +// Use structured errors from internal/error_helper.go |
| 158 | +return internal.ReconcileResult{ |
| 159 | + Message: "Operation failed", |
| 160 | + Err: err, |
| 161 | +} |
| 162 | +``` |
| 163 | + |
| 164 | +## Development Workflow |
| 165 | + |
| 166 | +### 1. Understanding Existing Patterns |
| 167 | + |
| 168 | +Before implementing new features: |
| 169 | + |
| 170 | +```bash |
| 171 | +# Find similar implementations |
| 172 | +grep -r "similar_pattern" controllers/ |
| 173 | +grep -r "ProcessGroup" api/v1beta2/ |
| 174 | + |
| 175 | +# Study existing controller structure |
| 176 | +ls controllers/*_controller.go |
| 177 | +``` |
| 178 | + |
| 179 | +### 2. Controller Development |
| 180 | + |
| 181 | +**Controller Structure**: |
| 182 | +```go |
| 183 | +type FoundationDBClusterReconciler struct { |
| 184 | + client.Client |
| 185 | + Log logr.Logger |
| 186 | + Recorder record.EventRecorder |
| 187 | + DatabaseClientProvider DatabaseClientProvider |
| 188 | + PodLifecycleManager podmanager.PodLifecycleManager |
| 189 | +} |
| 190 | + |
| 191 | +func (r *FoundationDBClusterReconciler) Reconcile(ctx context.Context, request ctrl.Request) (ctrl.Result, error) { |
| 192 | + // Implementation |
| 193 | +} |
| 194 | +``` |
| 195 | + |
| 196 | +### 3. API Changes |
| 197 | + |
| 198 | +```bash |
| 199 | +# After modifying API types in api/v1beta2/ |
| 200 | +make clean all # Clean, then rerun the full pipeline, which regenerates deepcopy methods and CRDs |
| 201 | +``` |
| 202 | + |
| 203 | +### 4. Testing Strategy |
| 204 | + |
| 205 | +1. **Write unit tests first** using Ginkgo/Gomega |
| 206 | +2. **Use mock clients** for Kubernetes operations |
| 207 | +3. **Add e2e tests** for complex scenarios |
| 208 | +4. **Run race detection**: `TEST_RACE_CONDITIONS=1 make test` |
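The idea behind step 2 can be sketched without any Kubernetes dependency: the code under test depends on a narrow interface, and the test supplies a mock that records calls. Everything here (`PodDeleter`, `mockPodDeleter`, `removePods`) is a hypothetical illustration and not the project's actual mock client API.

```go
package main

import (
	"errors"
	"fmt"
)

// PodDeleter is a hypothetical narrow interface covering the single
// operation the code under test needs.
type PodDeleter interface {
	DeletePod(name string) error
}

// mockPodDeleter records deletions in memory instead of calling a cluster.
type mockPodDeleter struct {
	deleted []string // names successfully "deleted"
	failOn  string   // pod name that triggers a simulated failure
}

func (m *mockPodDeleter) DeletePod(name string) error {
	if name == m.failOn {
		return errors.New("simulated delete failure")
	}
	m.deleted = append(m.deleted, name)
	return nil
}

// removePods is the code under test: it deletes pods in order and stops
// at the first failure.
func removePods(client PodDeleter, names []string) error {
	for _, name := range names {
		if err := client.DeletePod(name); err != nil {
			return fmt.Errorf("could not delete pod %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	mock := &mockPodDeleter{failOn: "storage-2"}
	err := removePods(mock, []string{"storage-1", "storage-2", "storage-3"})
	fmt.Println(mock.deleted, err)
}
```

Because the mock keeps its state in memory, a test can assert both on the returned error and on exactly which calls were made before the failure.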
| 209 | + |
| 210 | +### 5. Common Development Tasks |
| 211 | + |
| 212 | +**Adding a new CRD field**: |
| 213 | +1. Modify struct in `api/v1beta2/*_types.go` |
| 214 | +2. Add validation tags and documentation |
| 215 | +3. Run `make generate manifests` |
| 216 | +4. Add tests in `api/v1beta2/*_test.go` |
| 217 | + |
| 218 | +**Adding a new controller reconciler**: |
| 219 | +1. Create new file in `controllers/` |
| 220 | +2. Implement reconciliation logic |
| 221 | +3. Add to `main.go` and controller setup |
| 222 | +4. Write comprehensive tests |
| 223 | + |
| 224 | +## Key Libraries & Dependencies |
| 225 | + |
| 226 | +### Core Dependencies |
| 227 | + |
| 228 | +- **controller-runtime** (`sigs.k8s.io/controller-runtime`): Kubernetes operator framework |
| 229 | +- **client-go** (`k8s.io/client-go`): Kubernetes API client |
| 230 | +- **FoundationDB Bindings** (`github.com/apple/foundationdb/bindings/go`): FDB Go client |
| 231 | +- **logr** (`github.com/go-logr/logr`): Structured logging interface |
| 232 | + |
| 233 | +### Testing Dependencies |
| 234 | + |
| 235 | +- **Ginkgo v2** (`github.com/onsi/ginkgo/v2`): BDD testing framework |
| 236 | +- **Gomega** (`github.com/onsi/gomega`): Assertion library |
| 237 | +- **chaos-mesh**: Chaos engineering for e2e tests |
| 238 | + |
| 239 | +### Development Tools |
| 240 | + |
| 241 | +- **controller-gen**: CRD and deepcopy generation |
| 242 | +- **kustomize**: Kubernetes configuration management |
| 243 | +- **golangci-lint**: Go linting |
| 244 | +- **golines + goimports**: Code formatting |
| 245 | +- **goreleaser**: Binary releases |
| 246 | + |
| 247 | +## Debugging & Troubleshooting |
| 248 | + |
| 249 | +### Local Debugging |
| 250 | + |
| 251 | +```bash |
| 252 | +# View operator logs |
| 253 | +kubectl logs -f -l app=fdb-kubernetes-operator-controller-manager --container=manager |
| 254 | + |
| 255 | +# Check cluster status |
| 256 | +kubectl get foundationdbcluster test-cluster -o yaml |
| 257 | + |
| 258 | +# Access FDB CLI |
| 259 | +kubectl fdb exec -it test-cluster -- fdbcli |
| 260 | +``` |
| 261 | + |
| 262 | +### Common Issues |
| 263 | + |
| 264 | +1. **Build Failures**: Ensure FoundationDB client is installed for your platform |
| 265 | +2. **Test Failures**: Check the mock setup, and rerun with `TEST_RACE_CONDITIONS=1 make test` to surface race conditions |
| 266 | +3. **CRD Issues**: Regenerate with `make manifests` after API changes |
| 267 | +4. **Image Issues**: Verify BUILD_PLATFORM matches your cluster architecture |
| 268 | + |
| 269 | +## Contributing Guidelines |
| 270 | + |
| 271 | +### Before Opening PRs |
| 272 | + |
| 273 | +1. **Run full test suite**: `make all` |
| 274 | +2. **Check formatting**: `make fmt` |
| 275 | +3. **Update documentation** if adding new features |
| 276 | +4. **Add/update tests** for new functionality |
| 277 | +5. **Follow existing patterns** in similar controllers |
| 278 | + |
| 279 | +### Commit Messages |
| 280 | + |
| 281 | +Follow the Conventional Commits format: |
| 282 | +``` |
| 283 | +feat(controller): add new reconciliation step for process replacement |
| 284 | +fix(api): correct validation for database configuration |
| 285 | +test(e2e): add chaos testing for network partitions |
| 286 | +``` |
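As a rough illustration of the `type(scope): description` shape used in the examples above, the following sketch checks a commit subject with a regular expression. The set of accepted types and the scope character set are assumptions made for this example; the project may accept others, and nothing here is an official project check.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitPattern matches "type(scope): description".
// Assumption: the list of types and the lowercase scope are illustrative only.
var commitPattern = regexp.MustCompile(
	`^(feat|fix|test|docs|refactor|chore)\([a-z0-9-]+\): \S.*$`,
)

// isConventional reports whether a commit subject line matches the pattern.
func isConventional(subject string) bool {
	return commitPattern.MatchString(subject)
}

func main() {
	subjects := []string{
		"feat(controller): add new reconciliation step for process replacement",
		"fix(api): correct validation for database configuration",
		"Updated some files",
	}
	for _, subject := range subjects {
		fmt.Printf("%v\t%s\n", isConventional(subject), subject)
	}
}
```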
| 287 | + |
| 288 | +### Pull Request Process |
| 289 | + |
| 290 | +1. Create feature branch from `main` |
| 291 | +2. Implement changes following this guide |
| 292 | +3. Ensure all tests pass |
| 293 | +4. Update documentation if needed |
| 294 | +5. Reference any related GitHub issues |
| 295 | + |
| 296 | +## Additional Resources |
| 297 | + |
| 298 | +- [FoundationDB Documentation](https://apple.github.io/foundationdb/) |
| 299 | +- [Kubebuilder Book](https://book.kubebuilder.io/) |
| 300 | +- [Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) |
| 301 | +- [Community Forums](https://forums.foundationdb.org) |