@dcolina commented Dec 12, 2025

Summary

Improves error messages in the deployment-guard workflow for better traceability, and fixes a repository validation bug that affected images with registry prefixes.

Changes

1. Error Message Improvements (Related to dotCMS/deutschebank-infrastructure#339)

All validation error messages now follow the same rules:

  • Include a clear "REASON:" prefix explaining why validation failed
  • State the validation rules explicitly
  • Omit "how to fix" suggestions (they just state the rules)

Before:

❌ BLOCKED: Modified files are not in the allowlist
Please ensure you're only modifying allowed files.

After:

❌ BLOCKED: File allowlist validation failed

REASON: One or more modified files are not in the allowlist

Validation Rule: Only files matching the following pattern can be modified:
  - Pattern: kubernetes/dotcms/**/statefulset.ya?ml
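
One way to keep the BLOCKED / REASON / Validation Rule layout consistent across checks is a small helper. This is only a sketch: the function name `fail_validation`, its argument order, and writing to stderr are assumptions for illustration, not the workflow's actual code.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print a structured validation error and return non-zero.
# Arguments: title, reason, rule text, then zero or more detail lines.
fail_validation() {
  local title="$1" reason="$2" rule="$3"
  shift 3
  {
    printf '❌ BLOCKED: %s\n\n' "$title"
    printf 'REASON: %s\n\n' "$reason"
    printf 'Validation Rule: %s\n' "$rule"
    local detail
    for detail in "$@"; do
      printf '  - %s\n' "$detail"
    done
  } >&2
  return 1
}
```

A caller would pass the rule details as trailing arguments, for example `fail_validation 'File allowlist validation failed' 'One or more modified files are not in the allowlist' 'Only files matching the following pattern can be modified:' 'Pattern: kubernetes/dotcms/**/statefulset.ya?ml'`.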

2. Bug Fix: Repository Validation with Registry Prefix

Fixed a bug where images with registry prefixes (e.g., mirror.gcr.io/dotcms/dotcms:25.12.11) failed repository validation even when the base repository (dotcms/dotcms) was in the allowlist.

The Problem:

# Actual image in statefulset
image: mirror.gcr.io/dotcms/dotcms:25.12.11

# Allowed repositories in caller workflow
allowed_image_repositories: 'dotcms/dotcms'

# Previous behavior: ❌ FAIL - exact match required
# mirror.gcr.io/dotcms/dotcms != dotcms/dotcms

The Solution:

  • Extract base repository name from full image path
  • Compare both full repository AND base repository with allowlist
  • Handles: mirror.gcr.io/dotcms/dotcms, gcr.io/project/dotcms/dotcms, dotcms/dotcms
  • All resolve to base: dotcms/dotcms for comparison

Added Debug Logging:

   Full image: mirror.gcr.io/dotcms/dotcms:25.12.11
   Repository: mirror.gcr.io/dotcms/dotcms
   Tag: 25.12.11
   Comparing 'dotcms/dotcms' with allowed 'dotcms/dotcms'
   ✓ Match found
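
A minimal sketch of the base-repository comparison described above. It assumes the base repository is the last two path segments of the image path and that the allowlist is a comma-separated string; the function names (`repository_of`, `base_repository`, `is_repository_allowed`) are hypothetical and not taken from the workflow.

```bash
#!/usr/bin/env bash

# Drop the tag, if any. Only a ':' in the final path segment is treated as a
# tag separator, so a registry port such as localhost:5000 is left alone.
repository_of() {
  local image="$1"
  local last="${image##*/}"
  if [[ "$last" == *:* ]]; then
    printf '%s\n' "${image%:*}"
  else
    printf '%s\n' "$image"
  fi
}

# Keep the last two path segments (namespace/name), assumed to be the base repository.
base_repository() {
  local repo="$1"
  local name="${repo##*/}"
  local rest="${repo%/*}"
  if [[ "$rest" == "$repo" ]]; then
    printf '%s\n' "$repo"            # no '/' at all, nothing to strip
    return
  fi
  printf '%s/%s\n' "${rest##*/}" "$name"
}

# Succeed if either the full repository or its base repository is in the allowlist.
is_repository_allowed() {
  local repo="$1" allowed_csv="$2"
  local base allowed
  local -a allowed_list
  base="$(base_repository "$repo")"
  IFS=',' read -r -a allowed_list <<< "$allowed_csv"
  for allowed in "${allowed_list[@]}"; do
    allowed="${allowed// /}"         # tolerate spaces around commas
    if [[ "$repo" == "$allowed" || "$base" == "$allowed" ]]; then
      return 0
    fi
  done
  return 1
}

repo="$(repository_of 'mirror.gcr.io/dotcms/dotcms:25.12.11')"
if is_repository_allowed "$repo" 'dotcms/dotcms'; then
  echo "✓ Match found"
fi
```

Note that matching on the base repository means any registry prefix is accepted as long as the last two segments are allowlisted, which is the behavior this PR describes for mirror registries.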

3. Improved All Validation Errors

  • File Allowlist: Clear reason and pattern display
  • Image-Only Changes: Explicit list of allowed vs not-allowed changes
  • Image Format: Shows expected format when invalid
  • Repository: Shows reason for failure and allowed list
  • Version Pattern: Explains evergreen requirements and shows pattern
  • Registry Existence: Clear reason when image not found
  • Final Summary: Lists all validation rules including downgrade prevention

Testing

Tested with Deutsche Bank infrastructure during Phase 1 testing:

  • ✅ Test 1.1: PUBLIC member bypass works
  • ✅ Test 1.2: File allowlist blocks correctly
  • ✅ Test 1.3: Image-only check detects multiple changes
  • ✅ Test 1.4: LTS version rejected
  • ✅ Test 1.5: Latest tag rejected
  • Identified repository validation bug with mirror.gcr.io prefix → Fixed

Benefits

  1. Traceability: All errors clearly state WHY validation failed
  2. Compliance: Error messages suitable for audit purposes
  3. User Experience: Users understand validation rules without needing documentation
  4. Registry Flexibility: Supports images from mirror registries without configuration changes

Breaking Changes

None - this is backward compatible. Existing workflows will continue to work and benefit from improved error messages.

Related Issues

  • dotCMS/deutschebank-infrastructure#339 - Deployment guard implementation and testing

## Changes

### Error Message Improvements (Issue #339)
- Add clear "REASON:" prefix to all validation failures for traceability
- Remove "how to fix" suggestions, replace with validation rules reminder
- Improve error messages to state WHAT failed and WHY

### Bug Fix: Repository Validation with Registry Prefix
- Fix repository validation to extract base repository name
- Handle images with registry prefix (e.g., mirror.gcr.io/dotcms/dotcms)
- Compare both full repository and base repository with allowlist
- Add debug logging to show comparison process

### Specific Error Message Changes
1. **File Allowlist**: "BLOCKED: File allowlist validation failed"
   - States which files are not in allowlist
   - Shows pattern requirements

2. **Image-Only**: "BLOCKED: Image-only validation failed"
   - Clarifies only image attribute can be modified
   - Lists what changes are not allowed

3. **Image Validation**: "BLOCKED: Image validation failed"
   - Lists all validation rules (repository, version, existence, downgrades)
   - Explains evergreen version requirements
   - Removes remediation suggestions

### Examples
Before: "❌ Repository not allowed: mirror.gcr.io/dotcms/dotcms"
After: Extracts "dotcms/dotcms" and matches against allowlist ✅

Related to dotCMS/deutschebank-infrastructure#339

@dcolina requested review from a team as code owners December 12, 2025 12:42

## Anti-Downgrade Validation
- Compare old vs new image versions to prevent downgrades
- Extract version from tag (YY.MM.DD) and compare numerically
- Allow same version with different suffixes (e.g., 25.12.11 → 25.12.11-2)
- Clear error message when downgrade detected
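
The comparison could look roughly like the sketch below. It assumes tags have already passed the version-pattern check and start with YY.MM.DD; the function names (`extract_version`, `version_key`, `is_downgrade`) are hypothetical.

```bash
#!/usr/bin/env bash

# Print the leading YY.MM.DD part of a tag; fail if the tag does not start with one.
extract_version() {
  local tag="$1"
  if [[ "$tag" =~ ^([0-9]{2}\.[0-9]{2}\.[0-9]{2}) ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Turn YY.MM.DD into one integer (YYMMDD) so versions compare numerically.
version_key() {
  local yy mm dd
  IFS='.' read -r yy mm dd <<< "$1"
  # 10# forces base 10 so that "08" and "09" are not rejected as invalid octal.
  printf '%d\n' "$(( 10#$yy * 10000 + 10#$mm * 100 + 10#$dd ))"
}

# Return 0 when new_tag is older than old_tag, 1 when it is the same or newer,
# and 2 when either tag cannot be parsed.
is_downgrade() {
  local old_tag="$1" new_tag="$2"
  local old_version new_version
  old_version="$(extract_version "$old_tag")" || return 2
  new_version="$(extract_version "$new_tag")" || return 2
  (( $(version_key "$new_version") < $(version_key "$old_version") ))
}
```

Because only the YY.MM.DD prefix is compared, 25.12.11 → 25.12.11-2 counts as the same version and is allowed. A caller should treat exit status 2 as a failure rather than as "not a downgrade".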

## Canonical Registry Check
- Verify image existence in Docker Hub instead of mirror registries
- Extract base repository (dotcms/dotcms) from full image path
- Use canonical image without registry prefix for docker manifest inspect
- Assumes mirror registries have same images as Docker Hub
- Avoids authentication issues with private mirror registries
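
As a rough sketch of the canonical check: strip the registry prefix, keep the tag, and inspect the resulting image, which Docker resolves against Docker Hub when no registry is given. The function names (`canonical_image`, `image_exists`) are hypothetical, and treating the last two path segments as the base repository is an assumption.

```bash
#!/usr/bin/env bash

# Build the canonical reference (namespace/name[:tag]) from a full image path.
canonical_image() {
  local image="$1"
  local last="${image##*/}"
  local tag="" repo="$image"
  if [[ "$last" == *:* ]]; then
    tag="${last##*:}"                # text after the ':' in the final segment
    repo="${image%:*}"
  fi
  local name="${repo##*/}"
  local rest="${repo%/*}"
  local base="$repo"
  if [[ "$rest" != "$repo" ]]; then
    base="${rest##*/}/${name}"       # last two path segments
  fi
  printf '%s\n' "${base}${tag:+:$tag}"
}

# Succeed if the canonical image has a manifest in the default registry.
# Requires the docker CLI and network access.
image_exists() {
  docker manifest inspect "$(canonical_image "$1")" > /dev/null 2>&1
}
```

With this, `image_exists mirror.gcr.io/dotcms/dotcms:25.12.11` would inspect `dotcms/dotcms:25.12.11` and never contact the mirror.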

## Implementation Details
- Pass old-images output from validate-image-only-changed job
- Match old and new images by index
- Version comparison handles optional suffixes (-1, -2, etc)
- Validation order:
  1. Format
  2. Repository allowlist
  3. Version pattern
  4. Anti-downgrade ⭐ NEW
  5. Registry existence (canonical) ⭐ IMPROVED

Related to dotCMS/deutschebank-infrastructure#339

spbolton previously approved these changes Dec 12, 2025

- Use block redirect for multiple outputs (SC2129)
- Replace array expansion with mapfile (SC2207)
- Remove unused NEW_IMAGES_ARRAY variable (SC2034)

@dcolina force-pushed the issue-339-improve-deployment-guard-error-messages branch from f6c5176 to fb1c43e December 12, 2025 13:38

- Fix multiple redirects in early exit paths
- Use block redirect for all consecutive GITHUB_OUTPUT writes
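
For readers unfamiliar with these ShellCheck findings, the sketch below illustrates the two patterns: `mapfile` instead of unquoted array expansion (SC2207), and a single block redirect instead of repeated `>>` (SC2129). The variable names and output keys are hypothetical, not the workflow's real ones.

```bash
#!/usr/bin/env bash

# GITHUB_OUTPUT is set by the Actions runner; fall back to a temp file locally.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Hypothetical input: one image per line.
new_images_raw=$'mirror.gcr.io/dotcms/dotcms:25.12.11\ndotcms/dotcms:25.12.11'

# SC2207: read lines into an array with mapfile rather than array=( $(...) ),
# which would word-split and glob-expand the command output.
mapfile -t new_images <<< "$new_images_raw"

# SC2129: group consecutive writes and redirect the whole block once.
{
  echo "image-count=${#new_images[@]}"
  echo "first-image=${new_images[0]}"
} >> "$GITHUB_OUTPUT"
```

The block form opens the output file once and keeps related writes visually grouped, which also makes early-exit paths easier to audit.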

spbolton previously approved these changes Dec 12, 2025

- Update anti-downgrade validation to handle tags with commit hash (_hash suffix)
- Support formats: YY.MM.DD, YY.MM.DD-N, YY.MM.DD_hash, YY.MM.DD-N_hash
- Two-step extraction: first remove hash, then remove rebuild number
- Update all error messages to reflect accepted formats
- Update final validation summary message

This allows tags such as 25.12.08-2_872913a, which include the git commit
hash, for better traceability and immutability.
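
The two-step extraction can be sketched with plain parameter expansion. The function names (`base_version`, `is_accepted_tag`) are hypothetical, and the assumption that the hash is lowercase hexadecimal is mine, based only on the example tag.

```bash
#!/usr/bin/env bash

# Reduce YY.MM.DD, YY.MM.DD-N, YY.MM.DD_hash, or YY.MM.DD-N_hash to YY.MM.DD.
base_version() {
  local tag="$1"
  local without_hash="${tag%%_*}"       # step 1: drop the _hash suffix
  local version="${without_hash%%-*}"   # step 2: drop the -N rebuild number
  printf '%s\n' "$version"
}

# Check a tag against the four accepted formats (hash assumed to be hex).
is_accepted_tag() {
  [[ "$1" =~ ^[0-9]{2}\.[0-9]{2}\.[0-9]{2}(-[0-9]+)?(_[0-9a-f]+)?$ ]]
}
```

Removing the hash first matters in principle: it guarantees that the second step only ever sees the optional rebuild number, regardless of what characters the hash contains.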

@dcolina merged commit 3ae86f1 into main Dec 12, 2025
3 checks passed
@dcolina deleted the issue-339-improve-deployment-guard-error-messages branch December 12, 2025 14:56