-
Notifications
You must be signed in to change notification settings - Fork 6
HYPERFLEET-485 - test: add test cases for adapter, API validation, async processing, concurrent processing, and nodepool #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,7 +3,10 @@ | |
| ## Table of Contents | ||
|
|
||
| 1. [Adapter framework can detect and report failures to cluster API endpoints](#test-title-adapter-framework-can-detect-and-report-failures-to-cluster-api-endpoints) | ||
| 2. [Adapter framework can detect and handle resource timeouts](#test-title-adapter-framework-can-detect-and-handle-resource-timeouts) | ||
| 2. [Adapter Job failure correctly reports Health=False](#test-title-adapter-job-failure-correctly-reports-healthfalse) | ||
| 3. [Adapter framework can detect and handle resource timeouts](#test-title-adapter-framework-can-detect-and-handle-resource-timeouts) | ||
| 4. [Adapter crash recovery and message redelivery](#test-title-adapter-crash-recovery-and-message-redelivery) | ||
| 5. [API handles incomplete adapter status reports gracefully](#test-title-api-handles-incomplete-adapter-status-reports-gracefully) | ||
|
|
||
| --- | ||
|
|
||
|
|
@@ -80,6 +83,120 @@ curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ | |
| --- | ||
|
|
||
|
|
||
| ## Test Title: Adapter Job failure correctly reports Health=False | ||
|
|
||
| ### Description | ||
|
|
||
| This test validates that when an adapter Job runs but exits with a non-zero exit code (simulated via `SIMULATE_RESULT=failure`), the adapter framework correctly detects the job failure and reports `Health=False` back to the HyperFleet API. This is distinct from invalid K8s resource errors — this tests the scenario where the Job is created and scheduled successfully but the workload itself fails. | ||
|
|
||
| **Note:** There are two distinct adapter failure modes that should be differentiated: | ||
| - **Job execution failure** (this test): Job is created successfully (`Applied=True`) but the workload fails (`Health=False`). The `data` field should contain error details (exit code, error message). | ||
| - **Job creation failure** (covered by test 1 "Adapter framework can detect and report failures"): The Job or K8s resource cannot be created at all (`Applied=False`, `Health=False`). | ||
|
|
||
| Both failure modes should be verified for Cluster and NodePool resource types, as the adapter framework logic applies to both. | ||
|
|
||
| --- | ||
|
|
||
| | **Field** | **Value** | | ||
| |-----------|-----------| | ||
| | **Pos/Neg** | Negative | | ||
| | **Priority** | Tier2 | | ||
| | **Status** | Draft | | ||
| | **Automation** | Not Automated | | ||
| | **Version** | MVP | | ||
| | **Created** | 2026-02-11 | | ||
| | **Updated** | 2026-02-11 | | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ### Preconditions | ||
| 1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources | ||
| 2. HyperFleet API, Sentinel, and Adapter services are deployed and running successfully | ||
| 3. Example-adapter supports `SIMULATE_RESULT=failure` environment variable | ||
|
|
||
| --- | ||
|
|
||
| ### Test Steps | ||
|
|
||
| #### Step 1: Record original adapter configuration and set failure mode | ||
| **Action:** | ||
| - Record current `SIMULATE_RESULT` value for restoration | ||
| - Set adapter to failure mode: | ||
| ```bash | ||
| kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=failure | ||
| kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet --timeout=60s | ||
| ``` | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We may like there is a failure-model adapters rather than updating the origin adapter's configuration. Every update/rollback will have the risk to make the environment become dirty.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. Modifying the existing adapter's environment variable has the risk of leaving the environment dirty if cleanup fails. I'll update the test to deploy a separate failure-adapter Helm release with SIMULATE_RESULT=failure pre-configured, and delete it after the test completes. |
||
|
|
||
| **Expected Result:** | ||
| - Adapter deployment updated and new pod is Running | ||
|
|
||
| #### Step 2: Create a cluster to trigger adapter processing | ||
| **Action:** | ||
| ```bash | ||
| curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "kind": "Cluster", | ||
| "name": "failure-test", | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is better to have a random string. And also I'd like the cluster should have a label with the test run id.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. I'll update it. |
||
| "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} | ||
| }' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - API returns successful response with cluster ID | ||
|
|
||
| #### Step 3: Wait for job execution and verify job failure | ||
| **Action:** | ||
| - Wait for adapter to process the event and job to run | ||
| - Check job status: | ||
| ```bash | ||
| kubectl get jobs -n ${CLUSTER_ID} -o json | jq '.items[0].status.conditions[]? | select(.type=="Failed")' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Job exists in the cluster namespace | ||
| - Job status shows `Failed=True` | ||
| - Job exit code is non-zero | ||
|
|
||
| #### Step 4: Verify adapter status report | ||
| **Action:** | ||
| ```bash | ||
| curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ | ||
| | jq '.items[0].conditions' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - `Applied` condition has `status: "True"` (Job was created successfully before failing) | ||
| - `Available` condition has `status: "False"` (work not completed; currently returns empty string — see Issue #25) | ||
| - `Health` condition has `status: "False"` with reason indicating job failure | ||
| - `data` field contains error details: | ||
| - Exit code (non-zero) | ||
| - Error message describing the failure | ||
|
|
||
| #### Step 5: Verify top-level resource status reflects failure | ||
| **Action:** | ||
| ```bash | ||
| curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} | jq '.status.conditions' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - `Ready` condition remains `status: "False"` (cluster did not reach ready state) | ||
| - `Available` condition remains `status: "False"` | ||
| - Cluster does not transition to Ready while adapter reports failure | ||
|
|
||
| #### Step 6: Restore adapter to normal mode | ||
| **Action:** | ||
| ```bash | ||
| kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=success | ||
| kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet --timeout=60s | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Adapter restored to normal operation | ||
|
|
||
| --- | ||
|
|
||
| ## Test Title: Adapter framework can detect and handle resource timeouts | ||
|
|
||
| ### Description | ||
|
|
@@ -172,3 +289,217 @@ curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ | |
| ``` | ||
|
|
||
| --- | ||
|
|
||
|
|
||
| ## Test Title: Adapter crash recovery and message redelivery | ||
|
|
||
| ### Description | ||
|
|
||
| This test validates that when an adapter crashes during event processing, the message broker correctly redelivers the message after the adapter recovers. This ensures that no events are lost due to adapter failures and the system maintains eventual consistency. | ||
|
|
||
| **Note:** This test requires SIMULATE_RESULT=crash mode in the example-adapter, which may not be implemented. | ||
|
|
||
| --- | ||
|
|
||
| | **Field** | **Value** | | ||
| |-----------|-----------| | ||
| | **Pos/Neg** | Negative | | ||
| | **Priority** | Tier2 | | ||
| | **Status** | Draft | | ||
| | **Automation** | Not Automated | | ||
| | **Version** | MVP | | ||
| | **Created** | 2026-02-11 | | ||
| | **Updated** | 2026-02-11 | | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ### Preconditions | ||
| 1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) | ||
| 2. HyperFleet API, Sentinel, and Adapter services are deployed | ||
| 3. Google Pub/Sub is configured with appropriate acknowledgment deadline and retry policy | ||
|
|
||
| --- | ||
|
|
||
| ### Test Steps | ||
|
|
||
| #### Step 1: Configure adapter to crash on event receipt | ||
| **Action:** | ||
| - Set adapter environment variable to crash mode: | ||
| ```bash | ||
| kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=crash | ||
| kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Adapter deployment updated | ||
| - New adapter pod starts | ||
|
|
||
| #### Step 2: Create a cluster to trigger event | ||
| **Action:** | ||
| ```bash | ||
| curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "kind": "Cluster", | ||
| "name": "crash-test", | ||
| "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} | ||
| }' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - API returns successful response with cluster ID | ||
| - Sentinel publishes event to broker | ||
|
|
||
| #### Step 3: Verify adapter crashes on event receipt | ||
| **Action:** | ||
| - Monitor adapter pod status: | ||
| ```bash | ||
| kubectl get pods -n hyperfleet -l app.kubernetes.io/instance=example-adapter -w | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Adapter pod crashes (CrashLoopBackOff or Error state) | ||
| - Pod restarts automatically | ||
|
|
||
| #### Step 4: Restore adapter to normal mode | ||
| **Action:** | ||
| ```bash | ||
| kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=success | ||
| kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Adapter deployment updated | ||
| - New adapter pod starts and remains Running | ||
|
|
||
| #### Step 5: Verify message redelivery and processing | ||
| **Action:** | ||
| - Wait for adapter to process redelivered message | ||
| - Check cluster status: | ||
| ```bash | ||
| curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Adapter eventually processes the cluster event | ||
| - Status is reported (may show Applied=True, Available progressing) | ||
| - No event is lost | ||
|
|
||
| --- | ||
|
|
||
| ### Notes | ||
|
|
||
| - Message redelivery behavior depends on Google Pub/Sub configuration: | ||
| - Uses acknowledgment deadline and retry policy | ||
| - If adapter crashes before acknowledging message, Pub/Sub will redeliver after the acknowledgment deadline expires | ||
| - Sentinel may also republish events during its polling cycle if generation > observed_generation | ||
|
|
||
| --- | ||
|
|
||
| ## Test Title: API handles incomplete adapter status reports gracefully | ||
|
|
||
| ### Description | ||
|
|
||
| This test validates that the HyperFleet API can gracefully handle adapter status reports that are missing expected fields. When an adapter reports status with incomplete or missing condition fields (e.g., missing `reason`, `message`, or `observed_generation`), the API should accept the report without crashing and store what is available, rather than rejecting the entire status update. | ||
|
|
||
| --- | ||
|
|
||
| | **Field** | **Value** | | ||
| |-----------|-----------| | ||
| | **Pos/Neg** | Negative | | ||
| | **Priority** | Tier2 | | ||
| | **Status** | Draft | | ||
| | **Automation** | Not Automated | | ||
| | **Version** | MVP | | ||
| | **Created** | 2026-02-11 | | ||
| | **Updated** | 2026-02-11 | | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ### Preconditions | ||
| 1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources | ||
| 2. HyperFleet API is deployed and running successfully | ||
| 3. A cluster resource has been created and its cluster_id is available | ||
|
|
||
| --- | ||
|
|
||
| ### Test Steps | ||
|
|
||
| #### Step 1: Submit a status report with missing optional fields | ||
| **Action:** | ||
| - Send a status report with minimal fields (missing `reason`, `message`): | ||
| ```bash | ||
| curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "adapter": "test-incomplete-adapter", | ||
| "conditions": [ | ||
| { | ||
| "type": "Applied", | ||
| "status": "True" | ||
| } | ||
| ] | ||
| }' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - API accepts the status report (HTTP 200/201) | ||
| - API does not return an error or crash | ||
| - Status is stored with available fields; missing fields default to empty or null | ||
|
|
||
| #### Step 2: Submit a status report with missing observed_generation | ||
| **Action:** | ||
| - Send a status report without `observed_generation`: | ||
| ```bash | ||
| curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "adapter": "test-no-generation-adapter", | ||
| "conditions": [ | ||
| { | ||
| "type": "Applied", | ||
| "status": "True", | ||
| "reason": "ResourceApplied", | ||
| "message": "Resource applied successfully" | ||
| } | ||
| ] | ||
| }' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - API accepts the status report | ||
| - `observed_generation` defaults to 0 or null | ||
| - Cluster conditions are not corrupted | ||
|
|
||
| #### Step 3: Submit a status report with empty conditions array | ||
| **Action:** | ||
| - Send a status report with an empty conditions list: | ||
| ```bash | ||
| curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "adapter": "test-empty-conditions-adapter", | ||
| "conditions": [] | ||
| }' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - API either accepts the report with empty conditions, or returns a clear validation error (HTTP 400) | ||
| - API does not crash or return HTTP 500 | ||
|
|
||
| #### Step 4: Verify cluster state is not corrupted | ||
| **Action:** | ||
| - Retrieve the cluster and its statuses: | ||
| ```bash | ||
| curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} | jq '.conditions' | ||
| curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.' | ||
| ``` | ||
|
|
||
| **Expected Result:** | ||
| - Cluster conditions remain consistent and valid | ||
| - Previously reported statuses from other adapters are not affected | ||
| - Incomplete status entries are stored but do not interfere with cluster Ready/Available evaluation | ||
|
|
||
| --- | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be the basic case to validate the workflow with the adapter status is
false, it is not on the adapter.