Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Preflight Validate API support #4329

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

hemarina
Copy link
Contributor

@hemarina hemarina commented Sep 17, 2024

Adds support for Preflight Validate API.

TODO:

  • Tests

Questions

What is Preflight?

Preflight API validates whether the specified template is syntactically correct and will be accepted by Azure Resource Manager. More information on preflight wiki.

Why do we implement Validate API?

There’re two preflight API: Validate API and Deployment API. We implemented deployment API. However, deployment API creates a scenario where specific resource fails at preflight API and stops its deployment where some resources are already created. For example, resources are created at an unsupported location. This will cause more dev-test loop for cleaning resources and re-deploy. Adding Validate API support before provisioning to azure solves this situation.

What are Preflight Validate API limitations?

  1. Around 60% applicable resource types are enabled with preflight. Check preflight coverage dashboard.
  2. Preflight is not able to validate nested resources with parameters referencing runtime functions.

Example Test Error Message

Standard deployments

image

Deployment stack

image

hemarina and others added 6 commits August 14, 2024 16:04
To assign the process to the job object after we create it, we need
the `windows.Handle` that backs the process we started with `os/exec`.
This was not exported and so we used some gnarly casting to cast the
`os.Process` to a structure we authored that had the same layout of
`os.Process` so we could grab the handle property.

This is unsafe if the layout of the `os.Process` changes, which it did
in Go 1.23.

Move to code that doesn't depend on the internals of `os.Process` by
using the exported Pid to call `windows.OpenProcess` to get a handle and
then pass that along to `windows.AssignProcessToJobObject`.
@hemarina hemarina self-assigned this Sep 17, 2024
@hemarina hemarina marked this pull request as ready for review September 20, 2024 21:57
Comment on lines 571 to 577
err = p.validatePreflight(
ctx,
bicepDeploymentData.Target,
bicepDeploymentData.CompiledBicep.RawArmTemplate,
bicepDeploymentData.CompiledBicep.Parameters,
deploymentTags,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is great that the validate API exists but I am concerned that this may cause considerable time delay during deployments.

Do we have an idea of the amount of extra time the validations will take especially against more complex & nested deployments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hemarina and I had spoke awhile back about this while she was implementing -- I think this is pretty lightweight (<3s) from what she showed me, but it'd be great to test this out and gather the numbers with a few different templates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is lightweight as Wei said but I'll run some tests to confirm the exact number or ask manual test team to run all todo templates.

Copy link
Contributor

@wbreza wbreza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expectation for templates with nested deployments? For example, I used our Todo AKS template, specified an invalid K8s version parameter and expected that this would fail at our own preflight check call.

Instead the preflight from azd passes, but then we get a preflight failure during ARM deployment.

Preflight validation check for resource(s) for container service aks-oc3hr2l7nqlee in resource group rg-wabrez-todo-aks-swed failed. Message: Web application routing requires at least Kubernetes 1.22. Details: 

 (Code: WebAppRoutingUnsupportedK8SVersion)

Failing Deployment Example

If it is not going to catch preflight errors on nested deployments I question whether it is valuable to add at this point.

Comment on lines +710 to +711
var rawResponse *http.Response
ctxWithResp := runtime.WithCaptureResponse(ctx, &rawResponse)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this? If we get an HttpError we should be able to case the error to ARM http error to parse the error message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a workaround for a related bug here: Azure/azure-sdk-for-go#23350


_, err = validate.PollUntilDone(ctx, nil)
if err != nil {
deploymentError := createDeploymentError(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use this same createDeploymentError() func in the deployment stack use cases as well.

Comment on lines +740 to +749
type PreflightErrorResponse struct {
Error struct {
Code string `json:"code"`
Message string `json:"message"`
Details []struct {
Code string `json:"code"`
Message string `json:"message"`
} `json:"details"`
} `json:"error"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this basically captures the same ARM errors that we use in the AzureDeploymentError object that we already have. Can we reuse that one instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Spike] Preflight Checks API integration
4 participants