Skip to content

Conversation

@sfreudenthaler
Copy link
Member

Changes relative path reference from remote call to local call to prevent startup_failure in consumer workflows.

🤖 Generated with Claude Code

Changes relative path reference from remote call to local call to prevent
startup_failure in consumer workflows.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sfreudenthaler sfreudenthaler requested review from a team as code owners July 2, 2025 23:54
@sfreudenthaler sfreudenthaler merged commit d7466dd into main Jul 2, 2025
3 checks passed
@sfreudenthaler sfreudenthaler deleted the fix-circular-dependency branch July 2, 2025 23:59
dcolina added a commit that referenced this pull request Dec 16, 2025
BREAKING CHANGES:
- verify_image_existence now defaults to true (was false in v1.x)

✨ Added:
- Robust version comparison with rebuild number and hash support
- Improved registry validation with private registry fallback
- Enhanced error reporting with accumulated failures
- Comprehensive CHANGELOG documenting all fixes

🔧 Changed:
- Complete replacement of temporary files with bash arrays
- Added strict error handling (set -euo pipefail) to all scripts
- Better error messages for all validation failures

🐛 Fixed:
- Bug #1: Rebuild downgrade detection (25.12.08-2 → 25.12.08 now blocked)
- Bug #2: Temporary file race conditions and persistence issues
- Bug #3: Image existence validation for private registries
- Bug #4: Silent failures in validation loops
- Bug #5: Version pattern validation edge cases

🔒 Security:
- Strict error handling prevents silent failures
- Eliminated temporary file security concerns
- Better input validation before processing

📝 Details:
This is a complete architectural refactor addressing all known bugs in
the deployment guard workflow. The main improvements are:

1. Version Comparison: Now properly handles:
   - Base version (YY.MM.DD)
   - Rebuild numbers (-N suffix)
   - Commit hashes (_hash suffix)
   - All combinations of the above

2. State Management: Replaced all temporary files with bash arrays:
   - Before: /tmp/validation_failed.txt, /tmp/new_images.txt
   - After: VALIDATION_FAILED, FAILED_IMAGES, NEW_IMAGES arrays
   - Eliminates race conditions and cleanup issues

3. Registry Validation: Now supports both public and private registries
   - Tries Docker Hub first
   - Falls back to full image path for private registries

See CHANGELOG.md for complete migration guide and bug details.
dcolina added a commit that referenced this pull request Dec 16, 2025
#19)

## 🎯 Overview

Complete architectural refactor of the Deployment Guard workflow to
v2.0.0, eliminating all fragile temporary file-based state management
and fixing critical validation bugs.

## 🚨 Breaking Changes

- **`verify_image_existence` now defaults to `true`** (was `false` in
v1.x)
  - This was always the intended behavior but was disabled due to bugs
- If you want to disable, explicitly set to `false` in your workflow
configuration

## ✨ What's New

### 1. Robust Version Comparison
Complete rewrite of anti-downgrade logic with proper handling of:
- Base version comparison (YY.MM.DD format)
- Rebuild number comparison (e.g., `-2` in `25.12.08-2`)
- Hash comparison (e.g., `_abc123` in `25.12.08_abc123`)
- Full support for all combinations: `25.12.08`, `25.12.08-2`,
`25.12.08_abc`, `25.12.08-2_abc`

### 2. Improved Registry Validation
- Tries Docker Hub first for canonical images
- Falls back to full image path for private registries
- Better error messages indicating which registry was checked
- Handles mirror registries gracefully

### 3. Enhanced Error Reporting
- State variables accumulate ALL validation failures before exiting
- Detailed failure reasons for each failed image/file
- Clear indication of which validation step failed and why

## 🐛 Bugs Fixed

### Bug #1: Rebuild Downgrade Not Detected (Critical)
**Issue**: v1.x allowed downgrade from `25.12.08-2` to `25.12.08`
**Root Cause**: Only compared base version (YY.MM.DD), ignored rebuild
numbers
**Fix**: Now extracts and compares rebuild numbers when base version is
the same

**Examples**:
- ❌ v1.x: `25.12.08-2` → `25.12.08` = ✅ Allowed (BUG)
- ✅ v2.0.0: `25.12.08-2` → `25.12.08` = ❌ Blocked (CORRECT)
- ✅ v2.0.0: `25.12.08` → `25.12.08-2` = ✅ Allowed (upgrade)
- ✅ v2.0.0: `25.12.08-2_abc` → `25.12.08-2_xyz` = ✅ Allowed (same
version, different hash)

### Bug #2: Temporary File Race Conditions
**Issue**: Race conditions with `/tmp/validation_failed.txt` file
**Root Cause**: Multiple writes to same file, manual cleanup required
**Fix**: Eliminated ALL temporary files, using in-memory bash arrays

### Bug #3: Image Existence Check Failures
**Issue**: Validation failed for valid private registry images
**Root Cause**: Only checked Docker Hub canonical image
**Fix**: Now tries Docker Hub first, then falls back to full image path

### Bug #4: Silent Failures in Validation Loops
**Issue**: Validation could continue after failures
**Root Cause**: Lack of strict error handling
**Fix**: Added `set -euo pipefail` to all bash scripts

### Bug #5: Version Pattern Validation Edge Cases
**Issue**: Malformed tags could pass validation
**Root Cause**: Regex didn't enforce proper boundaries
**Fix**: Improved regex validation with proper format checks

## 🔧 Technical Changes

### State Management Architecture
**Before (v1.x)**: Used temporary files
```bash
echo "false" > /tmp/validation_failed.txt
echo "$image" >> /tmp/new_images.txt
[ -f /tmp/validation_failed.txt ] && exit 1
```

**After (v2.0.0)**: Uses bash arrays
```bash
VALIDATION_FAILED=false
FAILED_IMAGES=()
FAILED_IMAGES+=("$image: reason")
if [ "$VALIDATION_FAILED" = "true" ]; then
  printf '   - %s\n' "${FAILED_IMAGES[@]}"
  exit 1
fi
```

### Error Handling
All bash scripts now use strict mode:
```bash
set -euo pipefail
```

## 📝 Documentation

Added comprehensive CHANGELOG.md with:
- Complete bug details and technical explanations
- Migration guide from v1.x to v2.0.0
- Version support matrix
- Testing recommendations

## 🧪 Testing

Recommended testing approach:
```yaml
uses: dotCMS/ai-workflows/.github/workflows/deployment-guard.yml@v2.0.0
with:
  testing_force_non_bypass: true  # Force validation even for org members
  verify_image_existence: true    # Now enabled by default
```

## 📊 Test Cases Covered

| Scenario | v1.x | v2.0.0 |
|----------|------|--------|
| `25.12.08-2` → `25.12.08` | ✅ (bug) | ❌ (correct) |
| `25.12.08` → `25.12.08-2` | ✅ | ✅ |
| `25.12.08-2` → `25.12.08-3` | ✅ | ✅ |
| `25.12.08_abc` → `25.12.08_xyz` | ✅ | ✅ |
| `25.12.07` → `25.12.08` | ✅ | ✅ |
| `25.12.08` → `25.12.07` | ❌ | ❌ |
| Private registry image | ❌ (bug) | ✅ (fixed) |

## 🔄 Migration Path

1. **Week 1**: Deploy v2.0.0 to staging/dev
2. **Week 2**: Monitor and validate test cases
3. **Week 3**: Deploy to production
4. **Week 4**: Deprecate v1.x

## 📚 Related Issues

Fixes bugs reported in Deutsche Bank infrastructure validation.

## ✅ Checklist

- [x] All temporary files eliminated
- [x] Rebuild downgrade detection fixed
- [x] Private registry support added
- [x] Strict error handling implemented
- [x] Comprehensive CHANGELOG created
- [x] All validation cases tested
- [x] Breaking changes documented

---
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant