Skip to content

Conversation

@earaghbidikashani
Copy link
Contributor

@earaghbidikashani earaghbidikashani commented Oct 27, 2025

This PR moves template validation and default application from the controller to admission webhooks, providing early validation and consistent template enforcement at admission time.

Core Implementation

Created TemplateValidator (internal/webhook/v1alpha1/template_validator.go)
• Validates workspace template constraints during admission
• Applies template defaults to workspace specs
• Handles both create and update operations with differential validation

Validation Features

Template Constraint Validation:
• Image allowlist enforcement
• Resource bounds checking (CPU, memory, GPU)
• Storage size limits
• Template reference validation

Comprehensive Default Application:
• Default image, resources, and storage from templates
• Container configuration, node selectors, affinity, tolerations
• Ownership type defaults
• Template tracking labels

Webhook Integration

Updated Workspace Webhooks:
• Integrated template validator into existing webhook infrastructure
• Maintains existing ownership and immutability validations
• Proper error handling and user feedback

Testing Updates

E2E Test Modernization:
• Updated tests to expect webhook rejection instead of controller validation
• Added dedicated webhook validation test suite
• Tests for template defaults application and multi-violation scenarios

Behavior Changes

• Invalid workspaces are rejected at admission time
• Template defaults applied immediately during creation
• Fast validation feedback to users
• No invalid resources stored in etcd

Testing Results

The webhook validation is working correctly, as demonstrated by these test results:

Error from server (Forbidden): error when creating "/tmp/test-workspace-invalid.yaml": admission webhook "vworkspace-v1alpha1.kb.io" denied the request: workspace violates template constraints: 2 violations: CPU request exceeds template maximum; Memory request exceeds template maximum
Error from server (Forbidden): error when creating "/tmp/test-workspace-invalid-image.yaml": admission webhook "vworkspace-v1alpha1.kb.io" denied the request: workspace violates template constraints: Image is not in the template's allowed list
Error from server (Forbidden): error when creating "/tmp/test-workspace-invalid-storage.yaml": admission webhook "vworkspace-v1alpha1.kb.io" denied the request: workspace violates template constraints: Storage size exceeds template maximum
Error from server (Forbidden): error when creating "STDIN": admission webhook "vworkspace-v1alpha1.kb.io" denied the request: workspace violates template constraints: 3 violations: Image is not in the template's allowed list; CPU request exceeds template maximum; Memory request exceeds template maximum

Opened #87 to follow up with removing the operator validation - need to confirm on env vars

@earaghbidikashani earaghbidikashani changed the title Main2 feat: Template Validation and Defaults in Admission Webhooks Oct 27, 2025
@earaghbidikashani earaghbidikashani changed the title feat: Template Validation and Defaults in Admission Webhooks feat: template Validation and Defaults in Admission Webhooks Oct 27, 2025
Copy link
Contributor

@andrii-i andrii-i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @earaghbidikashani. Thank you for your work. Please find comments and suggestions below. P0s are a must to fix to merge this IMO.

P0:

  • GPU limits validation
  • Move inline YAMLs to files (maintainability)
  • Multiple Template get calls
  • Fix ApplyTemplateDefaults re PrimaryStorage

P1:

  • GPU test coverage
  • Improve validation error messages

}

// validateResourceBounds checks if resources are within bounds
func (tv *TemplateValidator) validateResourceBounds(resources corev1.ResourceRequirements, bounds *workspacev1alpha1.ResourceBounds) []controller.TemplateViolation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing GPU bounds validation / this function does not check GPU bounds. We also need to add E2E test for it or track the need in the issue.

Nit: for extra cleanliness separate validation logic for CPU, GPU etc can be broken out into separate helper functions but it's also always a bit of a tradeoff to this so up to you

Copy link
Contributor Author

@earaghbidikashani earaghbidikashani Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I brought this logic over from template resolver here: https://github.com/jupyter-infra/jupyter-k8s/blob/main/internal/controller/template_resolver.go#L227

You already created an issue tracking this change, the purpose of this PR is to bring the existing validation logic from operator to the webhooks. We'll remove the operator validation logic entirely (I didn't do it in this PR because env var is only in resolved template and not in workspace itself, we need to add env var in workspace and remove the dependency to resolved template entirely)

Copy link
Contributor

@andrii-i andrii-i Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver seems to have some GPU bounds enforcement

if bounds.GPU != nil && resources.Requests != nil {

joshuatowner
joshuatowner previously approved these changes Oct 27, 2025
}

// Only validate storage if it changed
if !storageEqual(oldWorkspace.Spec.Storage, newWorkspace.Spec.Storage) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we scope this specifically to storage size? this helps with the equality check confusion below as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about mount path? should it be immutable then?

Copy link
Contributor

@JGuinegagne JGuinegagne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please:

  • break down validator.go into several modules (at very least one for bounds, one for images, one for all other attributes)
  • move label names to constants.go
  • add unit test coverage form validator methods
  • update existing webhook unit tests

// Validate templateRef immutability
oldTemplateRef := fetchTemplateRef(oldWorkspace)
newTemplateRef := fetchTemplateRef(newWorkspace)
if oldTemplateRef != "" && oldTemplateRef != newTemplateRef && !isAdmin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: do you think we should allow even an admin to change the template? as opposed to deleting the Workspace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is following the CEL validation rule in the workspace - making the template reference immutable.

Are you suggesting that template should be immutable and template reference be mutable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants