Conversation

@toolate28
Owner

@toolate28 toolate28 commented Jan 18, 2026

This pull request introduces a comprehensive prompt generation toolkit for AWI, inspired by DSPy, and adds robust testing, documentation, and validation improvements. The main focus is on integrating a modular prompt generation pipeline, enhancing security and correctness in scripts, and providing thorough unit tests for edge cases and optimizations.

Prompt Generation Toolkit Integration:

  • Added a new section to interface/awi-spec.md documenting the AWI prompt toolkit, including its architecture, components (ChainOfThought, Predict, AwiPromptGen), optimization techniques (COPRO, SIMBA), and usage examples. Reference to the implementation in experiments/awi_prompt_gen.py is also provided.
  • Updated requirements-ml.txt to mention the optional dspy-ai dependency for prompt optimization, clarifying its use for the new AWI prompt toolkit.

Testing and Quality Assurance:

  • Added a comprehensive unit test suite in experiments/test_awi_prompt_gen.py covering:
    • Integration between scaffolder and refiner
    • Metadata propagation
    • Handling of empty/malformed intents and histories
    • YAML injection prevention
    • Permission level inference
    • Coherence example validation for COPRO optimization
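To illustrate the kind of property a YAML injection test checks, here is a minimal sketch. It is not the sanitizer in experiments/awi_prompt_gen.py, which this conversation does not show. It swaps in a simpler technique: emit the untrusted value as a JSON string, which YAML parsers generally accept as a double-quoted scalar. The names `quote_yaml_scalar` and `render_intent` are hypothetical.

```python
import json


def quote_yaml_scalar(value: str) -> str:
    """Return `value` as a double-quoted scalar (hypothetical helper).

    json.dumps escapes quotes, backslashes, and control characters such as
    newlines, so the result always stays on a single line.
    """
    return json.dumps(value)


def render_intent(intent: str) -> str:
    """Render a one-key YAML document holding the untrusted intent text."""
    return f"intent: {quote_yaml_scalar(intent)}\n"


# An attacker tries to smuggle in a second key on a new line.
malicious = "read logs\nlevel: 4"
rendered = render_intent(malicious)
print(rendered, end="")
```

Because the newline is escaped inside the quoted scalar, the rendered document still contains exactly one key. A real test would also parse the output with the YAML library the project uses and compare the round-tripped value.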

Security and Validation Enhancements:

  • Improved input validation in the awi_request function within ops/scripts/spiralsafe by ensuring level and ttl are non-negative integers, preventing malformed requests and potential script errors.

Maintenance and Cleanup:

  • Removed outdated quantum cognition session history data from media/output/quantum_cognition/session_history.json, likely as part of a cleanup or data refresh.

This pull request introduces a comprehensive suite of improvements focused on prompt generation for AI-human collaboration, validation enhancements, and documentation updates. The most significant changes are the addition of a unit test module for the AWI prompt generation system, new documentation for the prompt toolkit architecture and usage, stricter input validation in the spiralsafe script, and optional support for DSPy-style prompt optimization.

Prompt Generation Toolkit and Testing

  • Added a full unit test suite for the AwiPromptGen module in experiments/test_awi_prompt_gen.py, covering integration, metadata propagation, edge case handling, YAML injection prevention, permission inference, and coherence example validation.
  • Documented the AWI prompt toolkit architecture and usage in interface/awi-spec.md, including the DSPy-inspired pipeline, optimization techniques (COPRO, SIMBA), and a reference implementation.
  • Updated requirements-ml.txt to optionally support DSPy-style prompt optimization by adding a commented-out dependency for dspy-ai.

Validation Improvements

  • Enhanced input validation in the awi_request function of ops/scripts/spiralsafe to ensure level and ttl are non-negative integers without leading zeros (except for 0).
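The rule described above (non-negative integers, no leading zeros except for 0 itself) can be sketched as a single pattern check. The real check lives in the ops/scripts/spiralsafe shell script and is not reproduced in this conversation; the Python below only illustrates the rule, and `is_canonical_uint` is a hypothetical name.

```python
import re

# "0", or a non-zero digit followed by any digits. ASCII digits only,
# so Unicode digit characters are rejected as well.
_CANONICAL_UINT = re.compile(r"0|[1-9][0-9]*")


def is_canonical_uint(value: str) -> bool:
    """True if `value` is a non-negative integer without leading zeros."""
    return _CANONICAL_UINT.fullmatch(value) is not None


for candidate in ["0", "3", "3600", "007", "-1", "", "3.5"]:
    print(f"{candidate!r}: {is_canonical_uint(candidate)}")
```

Using `fullmatch` rather than `match` with `$` also rejects a value with a trailing newline, which matters when the input comes from a command line or a file.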

Data Cleanup

  • Removed obsolete quantum cognition session history data from media/output/quantum_cognition/session_history.json to keep the repository clean and relevant.

name: Pull request template
about: Use this template for PRs that add CI, bump templates, or agent-facing files
title: ''
labels: ''
assignees: ''

Summary

Describe the change in 1-2 sentences.

ATOM Tag

ATOM: ATOM-TYPE-YYYYMMDD-NNN-description

(Generate with: ./scripts/atom-track.sh TYPE "description" "file")

Why

Why this change is needed and what it enables.

What changed

  • Files added
  • Scripts
  • CI workflows

Verification / Testing

  • scripts/validate-bump.sh passes locally (if bump.md changed)
  • scripts/validate-branch-name.sh tested on example branches (if applicable)
  • bash scripts/verify-environment.sh prints ENV OK
  • scripts/test-scripts.sh passes (if scripts changed)
  • All shell scripts pass shellcheck
  • ATOM tag created and logged

Claude Interaction

You can interact with Claude in this PR by:

  • @mentioning Claude in comments for questions or reviews
  • Adding labels: claude:review, claude:help, claude:analyze
  • Requesting reviews: Claude will provide automated feedback
  • Ask questions: Claude can explain code, suggest improvements, or identify issues

Example commands:

  • @claude please review this PR for ATOM compliance
  • @claude explain the changes in scripts/atom-track.sh
  • @claude check for security issues
  • @claude suggest improvements

Notes

Copilot AI and others added 14 commits January 17, 2026 16:12
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
…paths, input validation

Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
…on, path traversal protection

Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Copilot AI review requested due to automatic review settings January 18, 2026 14:24
@toolate28 toolate28 added the enhancement label Jan 18, 2026
@vercel
Contributor

vercel bot commented Jan 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| h.and.s | Ready | Preview, Comment | Jan 19, 2026 4:54pm |

Contributor

Copilot AI left a comment

Pull request overview

This pull request introduces a comprehensive AWI (Authorization-With-Intent) prompt generation toolkit inspired by DSPy patterns, along with validation improvements and documentation updates. The changes implement a production-ready prompt optimization system for AI-human collaboration with security hardening and extensive test coverage.

Changes:

  • Added a DSPy-inspired prompt generation module (experiments/awi_prompt_gen.py) implementing ChainOfThought scaffolding, Predict refinement, COPRO optimization, and SIMBA annealing techniques
  • Created comprehensive unit tests for the prompt generation module covering integration, edge cases, security (YAML injection prevention), and permission inference
  • Enhanced input validation in the spiralsafe AWI request function to prevent leading-zero integers and ensure non-negative values
  • Documented the prompt toolkit architecture, optimization techniques, and usage examples in interface/awi-spec.md
  • Added optional DSPy dependency to requirements-ml.txt (commented out)
  • Cleaned up obsolete quantum cognition session data and updated .gitignore to exclude generated output files

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| experiments/awi_prompt_gen.py | New 715-line module implementing DSPy-style prompt generation with security features (YAML sanitization, path validation), COPRO/SIMBA optimization, and AWI protocol integration |
| experiments/test_awi_prompt_gen.py | Comprehensive test suite covering scaffolder-refiner integration, metadata propagation, edge cases, YAML injection prevention, and permission inference |
| interface/awi-spec.md | Added documentation section for prompt toolkit with architecture diagram, component descriptions, optimization techniques, and usage examples |
| ops/scripts/spiralsafe | Enhanced awi_request function validation to reject leading-zero integers and ensure non-negative level/TTL values |
| requirements-ml.txt | Added commented optional dependency for dspy-ai==2.4.0 with clear usage instructions |
| media/output/quantum_cognition/session_history.json | Removed 306 lines of obsolete session history data |
| .gitignore | Added pattern to exclude media/output/ directory for generated files |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated no new comments.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Contributor

Copilot AI commented Jan 19, 2026

@toolate28 I've opened a new pull request, #168, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits January 19, 2026 15:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: toolated <toolated@toolated.online>
Contributor

Copilot AI commented Jan 19, 2026

@toolate28 I've opened a new pull request, #169, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 7 commits January 19, 2026 15:58
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
…sertion

Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
…ot/sub-pr-156

Signed-off-by: toolated <toolated@pm.me>
…ot/sub-pr-156-again

Signed-off-by: toolated <toolated@pm.me>
…#169)

## Summary

Refactored `test_awi_prompt_gen.py` from custom test runner to standard
pytest conventions, removing incompatible patterns that prevented pytest
discovery and execution.

## ATOM Tag

**ATOM:** `ATOM-REFACTOR-20260119-001-pytest-compatible-tests`

## Why

The test class used custom `__init__`, `run_all()`, and `test()` methods
that pytest cannot recognize. Pytest expects standard test classes
without custom `__init__` and standard `assert` statements instead of
custom assertion methods.

## What changed

**Before:**
```python
class TestAwiPromptGen:
    def __init__(self):
        self.passed = 0
        self.failed = 0
    
    def test(self, name: str, condition: bool, error_msg: str = ""):
        if condition:
            print(f"  ✅ PASS: {name}")
            self.passed += 1
        else:
            print(f"  ❌ FAIL: {name}")
            self.failed += 1
    
    def run_all(self):
        # Custom test orchestration (body elided in the original message)
        ...
```

**After:**
```python
class TestAwiPromptGen:
    def test_scaffolder_refiner_integration(self):
        gen = AwiPromptGen()
        result = gen(user_intent="...", history=[...])
        
        assert isinstance(result, Prediction), f"Expected Prediction, got {type(result)}"
        assert len(result.content) > 0, "Content should not be empty"
```

### Changes
- Removed custom test infrastructure (`__init__`, `run_all()`, `test()`
method)
- Converted all 7 test methods to use standard `assert` statements
- Added `if __name__ == "__main__"` block for standalone execution via
pytest
- Cleaned up line continuations to follow PEP 8
- Moved pytest import to module level for clarity

## Verification / Testing

- [x] All 7 tests pass with `pytest experiments/test_awi_prompt_gen.py
-v`
- [x] Standalone execution works: `python
experiments/test_awi_prompt_gen.py`
- [x] Tests discovered and executed by pytest's test discovery
- [x] Code review feedback addressed

## Notes

- Tests remain organized in `TestAwiPromptGen` class for structure
- All original test logic preserved, only execution mechanism changed
- Compatible with existing pytest configuration in
`requirements-dev.txt`

## Checklist

- [x] ATOM tag created and referenced
- [x] Tests passing
- [x] Documentation updated (inline comments clarifying assertions)
- [x] No secrets committed
- [x] Follows existing patterns (matches other test files in repo)
- [x] Ready for Claude review

…ot/sub-pr-156

Signed-off-by: toolated <toolated@pm.me>
## Summary

The `TestAwiPromptGen` class was not discoverable by pytest because it
defined a custom `__init__` and relied on a custom test runner instead
of pytest conventions.

## ATOM Tag

**ATOM:** `ATOM-REFACTOR-20260119-001-pytest-test-discovery`

## Why

With default settings, pytest collects tests from:
1. Standalone `test_*` functions,
2. `test_*` methods on `Test*`-prefixed classes that define no
`__init__`, and
3. Classes inheriting from `unittest.TestCase`

The custom class defined `__init__` and relied on a `run_all()` method,
so pytest skipped it, making tests invisible to CI and manual `pytest`
runs.
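As a reference point for pytest's default collection behaviour, the following minimal sketch contrasts a class pytest can collect with one it skips. The class and function names are hypothetical, and `has_custom_init` is only a rough approximation of the check pytest performs internally, not its actual implementation.

```python
class TestPlainClass:
    """Collectable: the name starts with `Test` and there is no __init__."""

    def test_addition(self):
        assert 1 + 1 == 2


class TestWithInit:
    """Skipped by pytest's collector (with a warning): it defines __init__."""

    def __init__(self):
        self.passed = 0

    def test_never_collected(self):
        assert True


def test_standalone_function():
    """Collectable: a module-level function whose name starts with `test_`."""
    assert "awi".upper() == "AWI"


def has_custom_init(cls: type) -> bool:
    """Approximate the collector's rule: does the class override __init__?"""
    return cls.__init__ is not object.__init__


print(has_custom_init(TestPlainClass))
print(has_custom_init(TestWithInit))
```

Running `pytest --collect-only` on a file like this should list the plain class and the standalone function, and warn about the class with a constructor.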

## What changed

- Converted class-based tests to 7 standalone `test_*` functions
- Replaced custom `self.test()` assertions with standard `assert`
statements
- Removed test runner infrastructure (`__init__`, `run_all()`, pass/fail
counters)
- Reduced file from 299 to 161 lines while preserving all test logic

**Before:**
```python
class TestAwiPromptGen:
    def test(self, name: str, condition: bool, error_msg: str = ""):
        if condition:
            print(f"  ✅ PASS: {name}")
            self.passed += 1
        else:
            print(f"  ❌ FAIL: {name}")
            self.failed += 1
    
    def test_scaffolder_refiner_integration(self):
        gen = AwiPromptGen()
        result = gen(user_intent="...", history=[...])
        self.test("Result is Prediction", isinstance(result, Prediction))
```

**After:**
```python
def test_scaffolder_refiner_integration():
    """Test that scaffolder and refiner work together correctly."""
    gen = AwiPromptGen()
    result = gen(user_intent="...", history=[...])
    assert isinstance(result, Prediction), f"Expected Prediction, got {type(result)}"
```

## Verification / Testing

- [x] All 7 tests discovered by `pytest --collect-only`
- [x] All 7 tests pass with `pytest -v`
- [x] Follows pattern in `ops/integrations/*/tests/test_*.py`
- [x] ATOM tag created and logged

## Notes

- Addresses feedback from PR #156:
#156 (comment)
- Maintains identical test coverage and assertions
- No behavior changes, pure refactoring for pytest compatibility

## Checklist

- [x] ATOM tag created and referenced
- [x] Tests passing
- [ ] Documentation updated (test file is self-documenting)
- [x] No secrets committed
- [x] Follows existing patterns
- [x] Ready for Claude review

@toolate28 toolate28 changed the title from "Copilot/refine prompt toolkit collaboration" to "ATOM-TYPE-20260120-AWI-DSpy-Toolkit" Jan 20, 2026
@toolate28 toolate28 self-assigned this Jan 20, 2026
@toolate28 toolate28 requested a review from Copilot February 5, 2026 15:39
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

@@ -0,0 +1,163 @@
#!/usr/bin/env python3

Copilot AI Feb 5, 2026

The PR title doesn't follow the expected ATOM tag format. According to the custom coding guidelines, ATOM tags should follow the pattern ATOM-TYPE-YYYYMMDD-NNN-description where:

  • TYPE should be one of: INIT, FEATURE, FIX, DOC, REFACTOR, TEST, DECISION, RELEASE, TASK
  • YYYYMMDD is the date
  • NNN is a three-digit sequence number

The current title "ATOM-TYPE-20260120-AWI-DSpy-Toolkit" is missing:

  1. A specific type (it literally says "TYPE" instead of FEATURE, TEST, etc.)
  2. A three-digit sequence number before the description

Expected format would be something like: "ATOM-FEATURE-20260120-001-awi-dspy-toolkit" or "ATOM-TEST-20260120-001-awi-dspy-toolkit"
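The format rules listed in this comment can be checked mechanically. The sketch below is an assumption-laden illustration, not the project's scripts/atom-track.sh logic: the type list comes from the comment above, while the lowercase, hyphen-separated description pattern and the name `is_valid_atom_tag` are assumptions.

```python
import re
from datetime import datetime

# Allowed TYPE values, as listed in the review comment.
ATOM_TYPES = ("INIT", "FEATURE", "FIX", "DOC", "REFACTOR",
              "TEST", "DECISION", "RELEASE", "TASK")

# ATOM-TYPE-YYYYMMDD-NNN-description; the description shape is an assumption.
ATOM_TAG = re.compile(
    r"ATOM-(?P<type>" + "|".join(ATOM_TYPES) + r")"
    r"-(?P<date>[0-9]{8})"
    r"-(?P<seq>[0-9]{3})"
    r"-(?P<desc>[a-z0-9]+(?:-[a-z0-9]+)*)"
)


def is_valid_atom_tag(tag: str) -> bool:
    """Check the tag's shape and that YYYYMMDD is a real calendar date."""
    match = ATOM_TAG.fullmatch(tag)
    if match is None:
        return False
    try:
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False
    return True


print(is_valid_atom_tag("ATOM-TYPE-20260120-AWI-DSpy-Toolkit"))
print(is_valid_atom_tag("ATOM-FEATURE-20260120-001-awi-dspy-toolkit"))
```

The current PR title fails on the literal `TYPE` placeholder before the missing sequence number is even reached, which matches the two problems the comment lists.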

log_error "Level must be a non-negative integer"
exit 1
fi


Copilot AI Feb 5, 2026

The validation ensures level is a non-negative integer, but according to the AWI specification (interface/awi-spec.md lines 52-60), permission levels are defined as 0-4. The validation should also check that level is within this valid range to prevent invalid requests to the API.

Suggested change
# Enforce AWI permission level range 0-4 (inclusive)
if (( level < 0 || level > 4 )); then
    log_error "Level must be between 0 and 4 (inclusive)"
    exit 1
fi

- AWI protocol: Permission scaffolding integration
- ATOM system: Session tracking and verification

ATOM-FEATURE-20260117-001-awi-prompt-gen

Copilot AI Feb 5, 2026

There's a date discrepancy between the PR title and the ATOM tags in the code files. The PR title uses "20260120" (January 20, 2026) while the ATOM tags in both test_awi_prompt_gen.py (line 10) and awi_prompt_gen.py (line 44) use "20260117" (January 17, 2026). These dates should be consistent to maintain proper ATOM trail tracking.

Comment on lines +161 to +163
for ex in examples.get("negative", []):
    assert ex.coherence_score < COHERENCE_HIGH_THRESHOLD, \
        f"Negative example should have low coherence: expected < {COHERENCE_HIGH_THRESHOLD}, got {ex.coherence_score}"

Copilot AI Feb 5, 2026

In the test_coherence_examples function, the negative examples loop checks that coherence_score is below the threshold, but it doesn't verify that is_positive is False (unlike the positive examples which check both properties). For consistency and completeness, negative examples should also verify that is_positive is False.
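One way the suggested check could look is sketched below. The `CoherenceExample` dataclass, the threshold value of 0.7, and the sample data are stand-ins invented for illustration; only the attribute names `coherence_score` and `is_positive` and the constant name `COHERENCE_HIGH_THRESHOLD` come from the review.

```python
from dataclasses import dataclass

COHERENCE_HIGH_THRESHOLD = 0.7  # assumed value; the real constant is not shown


@dataclass
class CoherenceExample:
    """Hypothetical stand-in for the module's example type."""
    text: str
    coherence_score: float
    is_positive: bool


def check_negative_examples(examples: dict) -> None:
    """Assert both properties for every negative example."""
    for ex in examples.get("negative", []):
        assert ex.coherence_score < COHERENCE_HIGH_THRESHOLD, (
            f"Negative example should have low coherence: "
            f"expected < {COHERENCE_HIGH_THRESHOLD}, got {ex.coherence_score}"
        )
        assert ex.is_positive is False, (
            f"Negative example should not be flagged positive: {ex.text!r}"
        )


examples = {
    "negative": [
        CoherenceExample("do stuff", 0.2, False),
        CoherenceExample("???", 0.05, False),
    ]
}
check_negative_examples(examples)
```

With the second assertion in place, an example that has a low score but is mislabeled as positive fails the test instead of passing silently.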

Comment on lines +16 to +17
import pytest


Copilot AI Feb 5, 2026

Import of 'pytest' is not used.

Suggested change
import pytest

Labels

enhancement New feature or request

2 participants