@@ -72,6 +72,7 @@ Python
quotingType
rawContent
repo
Rollout
rootDir
sample_event_markdown
sample_service_markdown
3 changes: 3 additions & 0 deletions scripts/config/vale/vale.ini
@@ -6,3 +6,6 @@ Vocab = words

[*.md]
BasedOnStyles = Vale

[src/changelog/agents.md]
BasedOnStyles =
1 change: 1 addition & 0 deletions scripts/githooks/check-todos.sh
@@ -41,6 +41,7 @@ EXCLUDED_DIRS=(
"docs/"
"node_modules/"
".devcontainer/"
"src/changelog"
)


@@ -0,0 +1,225 @@
# Schema Dataschema Consistency Validation

**Status**: 🚧 In Development
**Last Updated**: 2025-11-13 15:20 GMT

## What It Does

This validation tool ensures consistency in CloudEvents event schemas by checking that the `dataschema` const value matches the `data` $ref value.

In CloudEvents schemas, these two properties should always reference the same schema file:

```yaml
dataschema:
  type: string
  const: ../data/digital-letter-base-data.schema.yaml # Must match
data:
  $ref: ../data/digital-letter-base-data.schema.yaml # Must match
```

The validator automatically detects mismatches and reports them clearly.

## Why It Matters

Mismatched `dataschema` and `data` references can cause:

- Runtime validation failures
- Confusing error messages
- Incorrect schema documentation
- Integration issues with event consumers

This validation catches these issues early in development.

## Quick Start

### Validate Your Schemas

```bash
# Validate all event schemas in current directory
npm run validate:dataschema-consistency

# Or use make
make validate-dataschema-consistency

# Validate specific directory
npm run validate:dataschema-consistency -- /path/to/schemas
```

### Expected Output

**When all schemas are valid**:

```plaintext
✓ Validating event schemas...
✓ Found 22 schema files
✓ All schemas valid - no mismatches detected
```

**When mismatches are found**:

```plaintext
✗ Validation failed for 2 schemas:

File: uk.nhs.notify.digital.letters.event.v1.schema.yaml
Error: dataschema const does not match data $ref
Expected: ../data/schema-a.yaml
Actual: ../data/schema-b.yaml

File: another-event.v1.schema.yaml
Error: dataschema const does not match data $ref
Expected: ../data/correct-schema.yaml
Actual: ../data/wrong-schema.yaml

✗ 2 validation errors found
```

## Usage

### In Development

Run validation before committing schema changes:

```bash
# Add to your workflow
git add src/cloudevents/domains/*/events/*.yaml
make validate-dataschema-consistency
git commit -m "feat: add new event schema"
```

### In CI/CD

Once CI/CD integration is complete (it is still pending; see Development Status), the validation is intended to run automatically in the pipeline:

- **Pull Requests**: Validates all schema files
- **Main Branch**: Runs on every commit
- **Failure**: Pipeline fails if mismatches are detected

### Programmatic Use

Use the validation function directly in your code:

```typescript
import { validateDataschemaConsistency } from './dataschema-consistency-lib';

const schema = {
  properties: {
    dataschema: {
      const: '../data/schema.yaml'
    },
    data: {
      $ref: '../data/schema.yaml'
    }
  }
};

const result = validateDataschemaConsistency(schema);

if (!result.valid) {
  console.error(result.errorMessage);
  console.log(`Expected: ${result.dataschemaValue}`);
  console.log(`Actual: ${result.dataRefValue}`);
}
```

## What Gets Validated

### Validated Schemas

The tool checks schemas that have BOTH:

- A `properties.dataschema.const` value
- A `properties.data.$ref` value

### Skipped Schemas

Schemas are automatically skipped (no error) if they:

- Don't have a `dataschema` property
- Don't have a `data` property
- Are not CloudEvents event schemas

### Validation Rules

1. **Exact Match**: Values must match exactly (case-sensitive)
2. **No Whitespace**: Trailing/leading spaces cause validation failure
3. **String Only**: Both values must be strings
4. **Not Null**: Null or undefined values fail validation
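
The four rules above amount to a strict string comparison. The following is a minimal sketch of that comparison, assuming a hypothetical helper named `referencesMatch`; it is not the tool's actual API, and the real implementation may structure the check differently.

```typescript
// Hypothetical helper illustrating the four validation rules.
export function referencesMatch(dataschemaConst: unknown, dataRef: unknown): boolean {
  // Rules 3 and 4: both values must be strings, which also rejects null and undefined.
  if (typeof dataschemaConst !== 'string' || typeof dataRef !== 'string') {
    return false;
  }
  // Rules 1 and 2: strict equality, with no trimming and no case folding.
  return dataschemaConst === dataRef;
}

console.log(referencesMatch('../data/schema.yaml', '../data/schema.yaml'));  // true
console.log(referencesMatch('../data/Schema.yaml', '../data/schema.yaml'));  // false (case differs)
console.log(referencesMatch('../data/schema.yaml ', '../data/schema.yaml')); // false (trailing space)
console.log(referencesMatch(null, '../data/schema.yaml'));                   // false (not a string)
```

Because the values are compared as-is, any normalisation (trimming, case folding, path resolution) would have to be a deliberate change to the rules rather than a side effect.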

## Common Issues

### Mismatch Detected

**Problem**: Validator reports mismatch

**Solution**: Update schema to use consistent reference:

```yaml
# Before (incorrect)
dataschema:
  const: ../data/old-schema.yaml
data:
  $ref: ../data/new-schema.yaml

# After (correct)
dataschema:
  const: ../data/new-schema.yaml
data:
  $ref: ../data/new-schema.yaml
```

### Case Sensitivity

**Problem**: `Schema.yaml` vs `schema.yaml`

**Solution**: Ensure exact case match:

```yaml
# Both must use same case
dataschema:
  const: ../data/Schema.yaml # Capital S
data:
  $ref: ../data/Schema.yaml # Capital S
```

### Whitespace Issues

**Problem**: Hidden spaces cause validation failure

**Solution**: Remove trailing whitespace:

```yaml
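# Before (incorrect - space after .yaml, inside the quotes)
# YAML parsers strip trailing spaces from unquoted values,
# so a stray space normally only survives in a quoted value.
dataschema:
  const: "../data/schema.yaml "

# After (correct)
dataschema:
  const: ../data/schema.yaml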
```

## Where to Get Help

- **Documentation**: See `/src/changelog/2025-11-13/001-01-request-*.md` for background
- **Requirements**: See `/src/changelog/2025-11-13/001-03-requirements-*.md` for detailed specs
- **Issues**: Report problems in GitHub Issues
- **Questions**: Ask in team channels

## Development Status

### Current Status: 🚧 In Development

- ✅ Validation logic implemented and tested
- ⏳ CLI script in progress
- ⏳ CI/CD integration pending
- ⏳ Documentation being refined

### Upcoming

- Full CI/CD pipeline integration
- Additional validation rules if needed
- Performance optimizations
- Enhanced error messages

---

**Note**: This document will be updated as the feature develops. Check the "Last Updated" timestamp above.
@@ -0,0 +1,46 @@
# Schema Consistency Validation Enhancement

**Date**: 2025-11-13 14:31 GMT
**Branch**: rossbugginsnhs/2025-11-13/schema-restrictions

## Objective

Enhance the CloudEvents validator at `src/cloudevents/tools/validator` to enforce consistency between `dataschema` const values and `data` $ref values across all event schemas.

### Current Pattern in Event Schemas

All event schemas follow this pattern:

```yaml
dataschema:
  type: string
  const: ../data/digital-letter-base-data.schema.yaml
  description: Canonical URI of the example event's data schema.
data:
  $ref: ../data/digital-letter-base-data.schema.yaml
  description: Example payload wrapper containing notify-payload.
```

### Challenge

- `dataschema.const` is a literal value that validates instance data
- `data.$ref` is schema metadata that tells validators which schema to use
- JSON Schema has no built-in way to cross-reference between literal values and schema keywords

### Proposed Solution

Add validation to the existing validator tool at `src/cloudevents/tools/validator` to:

1. Parse event schema files
2. Extract the `dataschema.const` value
3. Extract the `data.$ref` value
4. Fail validation if they don't match

This would be integrated into the existing validation tooling and CI/CD pipeline to ensure consistency across all 22+ event schemas that follow this pattern.

## Next Steps

1. Create a validation function in `validator-lib.ts`
2. Add a standalone validation script or extend existing validator
3. Add tests for the new validation
4. Integrate into CI/CD pipeline
@@ -0,0 +1,101 @@
# Implementation Plan: Schema Consistency Validation

**Date**: 2025-11-13 14:38 GMT
**Branch**: rossbugginsnhs/2025-11-13/schema-restrictions
**Related Request**: [001-01-request-schema-dataschema-ref-consistency.md](./001-01-request-schema-dataschema-ref-consistency.md)

## Overview

Add validation to ensure that in CloudEvents event schemas, the `dataschema` const value matches the `data` $ref value.

## Implementation Steps

### 1. Create New Validation Library

Create a new library file, `dataschema-consistency-lib.ts`, containing the validation function. The library:

- Exports a `validateDataschemaConsistency(schemaObject)` function
- Exports a `DataschemaConsistencyResult` interface type
- Checks whether the schema has both `properties.dataschema.const` and `properties.data.$ref`
- Returns a validation result with details if they don't match
- Returns success if they match or if the pattern doesn't apply

**Location**: `src/cloudevents/tools/validator/dataschema-consistency-lib.ts`

**Rationale**: Creating a new file instead of modifying the existing `validator-lib.ts` keeps the changes isolated and avoids affecting existing validation functionality.
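
A minimal sketch of what this library might export is shown below. The result fields (`valid`, `errorMessage`, `dataschemaValue`, `dataRefValue`) and the `SchemaLike` type are assumptions made for illustration, as is the choice to skip a schema when either property is absent but fail it when both are present and a value is not a string.

```typescript
// Assumed result shape; the actual interface may carry different fields.
export interface DataschemaConsistencyResult {
  valid: boolean;
  errorMessage?: string;
  dataschemaValue?: unknown;
  dataRefValue?: unknown;
}

// Hypothetical minimal view of a parsed event schema.
interface SchemaLike {
  properties?: {
    dataschema?: { const?: unknown };
    data?: { $ref?: unknown };
  };
}

export function validateDataschemaConsistency(
  schema: SchemaLike,
): DataschemaConsistencyResult {
  const dataschema = schema.properties?.dataschema;
  const data = schema.properties?.data;

  // Assumption: the pattern does not apply unless both properties are present.
  if (dataschema === undefined || data === undefined) {
    return { valid: true };
  }

  const dataschemaValue = dataschema.const;
  const dataRefValue = data.$ref;

  // Both values must be strings; this also rejects null and undefined.
  if (typeof dataschemaValue !== 'string' || typeof dataRefValue !== 'string') {
    return {
      valid: false,
      errorMessage: 'dataschema const and data $ref must both be strings',
      dataschemaValue,
      dataRefValue,
    };
  }

  // Strict comparison: case-sensitive, no trimming.
  if (dataschemaValue !== dataRefValue) {
    return {
      valid: false,
      errorMessage: 'dataschema const does not match data $ref',
      dataschemaValue,
      dataRefValue,
    };
  }

  return { valid: true, dataschemaValue, dataRefValue };
}
```

Returning the two extracted values alongside the error message lets callers print an expected/actual pair without re-reading the schema.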

### 2. Create Standalone Validation Script

Create a script that:

- Scans all event schema files in specified directories
- Validates each schema for dataschema/data consistency
- Reports all inconsistencies
- Exits with error code if any inconsistencies found
- Imports from the new dataschema-consistency-lib.ts

**Location**: `src/cloudevents/tools/validator/validate-dataschema-consistency.ts`
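
The script's core loop could look roughly like the sketch below. The function names `findSchemaFiles` and `checkFiles` are hypothetical, and the YAML loader and the validation function are passed in as parameters because this plan does not name a parser; only Node's built-in `fs` and `path` modules are used.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Assumed minimal result shape returned by the validation function.
interface ConsistencyResult {
  valid: boolean;
  errorMessage?: string;
}

// Recursively collect *.yaml files under a directory.
export function findSchemaFiles(dir: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...findSchemaFiles(fullPath));
    } else if (entry.isFile() && entry.name.endsWith('.yaml')) {
      files.push(fullPath);
    }
  }
  return files;
}

// Validate every file, collect all failures, and derive the exit code.
export function checkFiles(
  files: string[],
  loadSchema: (file: string) => unknown,
  validate: (schema: unknown) => ConsistencyResult,
): { failures: string[]; exitCode: number } {
  const failures: string[] = [];
  for (const file of files) {
    const result = validate(loadSchema(file));
    if (!result.valid) {
      failures.push(`${file}: ${result.errorMessage ?? 'mismatch'}`);
    }
  }
  // Non-zero exit code if any inconsistency was found.
  return { failures, exitCode: failures.length > 0 ? 1 : 0 };
}
```

A command-line entry point would call `findSchemaFiles` on the target directory, print each entry in `failures`, and pass `exitCode` to `process.exit`. Collecting every failure before exiting means one run reports all inconsistencies rather than stopping at the first.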

### 3. Add Unit Tests

Create comprehensive tests for:

- Matching dataschema and data values (should pass)
- Mismatched values (should fail with clear message)
- Schemas without dataschema property (should skip)
- Schemas without data property (should skip)
- Edge cases (null, undefined, different path formats)

**Location**: `src/cloudevents/tools/validator/__tests__/validate-dataschema-consistency.test.ts`

**Note**: Tests will import from `dataschema-consistency-lib` (new file), not from existing validator-lib.

### 4. Update Makefile

Add a new make target to run the consistency validation:

```makefile
validate-dataschema-consistency:
	npm run validate:dataschema-consistency
```

**Location**: `src/cloudevents/Makefile`

### 5. Update package.json

Add script to run the consistency validator:

```json
"validate:dataschema-consistency": "tsx tools/validator/validate-dataschema-consistency.ts"
```

**Location**: `src/cloudevents/package.json`

### 6. Integrate into CI/CD Pipeline

Add validation step to the existing validation workflow or create new step.

**Location**: `.github/workflows/` or relevant CI/CD configuration

## Success Criteria

- [ ] Validation function correctly identifies matching dataschema/data pairs
- [ ] Validation function correctly identifies mismatches with helpful error messages
- [ ] All 22+ existing event schemas pass validation
- [ ] Unit tests achieve 100% code coverage for new functions
- [ ] Script can be run standalone via `make` or `npm run`
- [ ] Integration into CI/CD prevents merging schemas with inconsistencies
- [ ] Documentation updated if needed

## Testing Strategy

1. Run against all existing event schemas to ensure they currently pass
2. Create test schemas with intentional mismatches to verify detection
3. Test edge cases (missing properties, null values, etc.)
4. Verify error messages are clear and actionable

## Rollout Plan

1. Implement and test locally
2. Run against all existing schemas to verify current state
3. Add to CI/CD pipeline as warning initially
4. Monitor for false positives
5. Convert to blocking validation once confident
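
One way to support the warning-then-blocking rollout is to make the exit code depend on a mode setting. The sketch below assumes a hypothetical `RolloutMode` type and `exitCodeFor` helper; how the mode would be supplied (flag, environment variable, or workflow setting) is left open.

```typescript
// Hypothetical rollout switch: 'warn' reports problems without failing the build.
type RolloutMode = 'warn' | 'block';

export function exitCodeFor(errorCount: number, mode: RolloutMode): number {
  if (errorCount === 0) {
    return 0;
  }
  // In 'warn' mode, mismatches are reported but do not fail the pipeline.
  return mode === 'block' ? 1 : 0;
}

console.log(exitCodeFor(0, 'block')); // 0
console.log(exitCodeFor(2, 'warn'));  // 0
console.log(exitCodeFor(2, 'block')); // 1
```

Switching from step 3 to step 5 then becomes a one-line configuration change rather than a code change.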