Claude/progress semantic check 011 cuqy3 ex6fy ru xb6 w gj nze#14
Open
calvingiles wants to merge 3 commits into main
This commit activates the semantic test-adherence feature specification by:

- Moving spec from specs/future/ to specs/ (activating for implementation)
- Changing Status from "Provisional" to "Draft"
- Adding LiteLLM as the integration library (REQ-021)
- Adding Groq provider support for free-tier CI/CD usage (REQ-022)
- Adding requirement for default model configurations pinned in releases (REQ-023)
- Expanding provider support: groq, anthropic, openai, ollama, vertex_ai, bedrock (REQ-036)
- Setting groq as default provider (REQ-037)
- Updating environment variables for all providers (REQ-042)
- Fixing SPEC ID conflict: spec-coverage-linter changed from SPEC-003 to SPEC-004

The specification now contains 67 functional requirements ready for implementation. This aligns with the technical design document in technical-notes/llm-provider-selection.md.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit implements the core MVP for the semantic test-adherence checker defined in SPEC-003, enabling AI/LLM-powered validation of test-requirement alignment.

## What's Added:

### Core Modules:

- **semantic_test_result.py**: Result dataclasses for semantic analysis
  - SemanticAnalysisResult: Individual test-requirement pair analysis
  - SemanticTestAdherenceResult: Overall validation results with reporting
- **llm_provider.py**: LLM provider abstraction layer
  - LLMProvider: Abstract base class for LLM providers
  - LiteLLMProvider: Implementation using LiteLLM library
  - Support for 6 providers: Groq (default), Anthropic, OpenAI, Ollama, Vertex AI, Bedrock
  - Default models pinned per provider (as per REQ-023)
  - Retry logic with exponential backoff (3 retries, as per REQ-025)
  - JSON response parsing with error handling
- **semantic_test_analyzer.py**: Core semantic analyzer
  - Requirement discovery from spec files with full text extraction
  - Test discovery with source code and docstring extraction
  - Per-test-requirement pair semantic analysis using LLM
  - Confidence scoring and threshold-based validation
  - Provisional spec exclusion (consistent with check-coverage)

### CLI Integration:

- Added `check-semantic-test-adherence` command to cli.py
- Command-line arguments: --llm-provider, --llm-model, --threshold, --specs-dir, --tests-dir
- Comprehensive help text with examples and LLM provider configuration

### Dependencies:

- Added litellm package for unified LLM provider access
- Updated pyproject.toml and uv.lock

### Exports:

- Updated __init__.py to export new classes for library usage

## Implementation Notes:

- Follows existing project patterns (reuses logic from spec_coverage_linter)
- Defaults to Groq provider for free-tier CI/CD usage (REQ-037)
- All code passes ruff linting and formatting checks
- Existing test suite passes (131 tests)

## What's Not Yet Implemented (Future Work):

- Comprehensive test suite for semantic analyzer (TEST-001 to TEST-010)
- Configuration file support in config.py (REQ-031)
- Batching optimization for multiple tests (REQ-030)
- Caching support (REQ-044-045)
- Alternative output formats: JSON, Markdown (REQ-052-054)
- Concurrent request pooling (REQ-043)

## Requirements Addressed:

Core functionality implements:

- REQ-001 to REQ-020: Requirement/test discovery and semantic analysis
- REQ-021 to REQ-029: LLM integration with LiteLLM
- REQ-031 to REQ-037: Basic configuration (CLI args)
- REQ-046 to REQ-048: Reporting
- REQ-055 to REQ-057: Exit codes
- REQ-058 to REQ-062: Error handling
- REQ-063 to REQ-067: Integration with existing tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
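For reviewers, the retry behaviour described for llm_provider.py (3 retries with exponential backoff, then JSON parsing with error handling) might look roughly like the sketch below. It is not the code from this PR: `call_with_backoff`, `base_delay`, and the injectable `sleep` are hypothetical names, and both the 1-second base delay and the choice to retry on malformed JSON are assumptions that the commit message does not confirm.

```python
import json
import time
from typing import Callable


def call_with_backoff(
    request: Callable[[], str],
    max_retries: int = 3,          # matches the "3 retries" in the commit message
    base_delay: float = 1.0,       # assumed value; not stated in the PR
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call request() and parse its JSON reply, retrying with doubling delays."""
    last_error = None
    # One initial attempt plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        try:
            # Assumption: a malformed JSON reply is retried like a transport error.
            return json.loads(request())
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                # Delays grow as base_delay * 1, 2, 4, ...
                sleep(base_delay * 2**attempt)
    raise RuntimeError(f"request failed after {max_retries} retries") from last_error
```

Passing `sleep` as a parameter is only a testing convenience here, so the backoff schedule can be checked without waiting.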
This commit enhances the CI pipeline with two new validation steps:

1. **Unique Spec IDs Validation**:
   - Validates that all SPEC IDs and requirement IDs are unique
   - Prevents duplicate identifiers across the codebase
   - Required check (must pass for CI to succeed)
2. **Semantic Test-Adherence Validation (Optional)**:
   - Validates that tests semantically test their linked requirements using AI/LLM
   - Uses Groq provider with GROQ_API_KEY from GitHub secrets
   - Set as optional with `continue-on-error: true` since API key may not be configured
   - Provides early feedback when API key is available

Both checks are added to the lint job for comprehensive validation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
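The unique-ID validation in step 1 can be pictured with a small sketch. It assumes the IDs declared by each spec file have already been extracted into a mapping; `find_duplicate_ids` and the file names are hypothetical and are not taken from this repository, and whether requirement IDs must be unique globally or only within a spec is not stated in the commit message (the sketch checks globally).

```python
from collections import defaultdict


def find_duplicate_ids(declared: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each ID declared more than once to the sorted files that declare it."""
    seen = defaultdict(list)
    for path, ids in declared.items():
        for identifier in ids:
            seen[identifier].append(path)
    # An ID is a duplicate if it was declared in more than one place,
    # including twice within the same file.
    return {i: sorted(paths) for i, paths in seen.items() if len(paths) > 1}


# Hypothetical input: two spec files that both claim SPEC-003.
declared = {
    "specs/first.md": ["SPEC-003", "REQ-001"],
    "specs/second.md": ["SPEC-003", "REQ-002"],
}
duplicates = find_duplicate_ids(declared)
```

A CI step built on such a check would exit non-zero whenever `duplicates` is non-empty, which fits the "required check" behaviour described above.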
No description provided.