Skip to content

Conversation

jxnl
Copy link
Collaborator

@jxnl jxnl commented Oct 7, 2025

Add YAML mode to the instructor library to enable LLMs to return structured data in YAML, improving token efficiency and readability.

This PR introduces instructor.Mode.YAML, allowing users to prompt LLMs for YAML output. Benchmarks show YAML is approximately 28.1% more token-efficient than pretty-printed JSON, while maintaining full Pydantic model compatibility and robust error handling.


Slack Thread

Open in Cursor Open in Web

Copy link

cursor bot commented Oct 7, 2025

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

Copy link

cloudflare-workers-and-pages bot commented Oct 7, 2025

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
instructor d95ca11 Oct 08 2025, 12:30 AM

Copy link
Contributor

claude bot commented Oct 7, 2025

Claude finished @jxnl's task —— View job


PR Review: YAML Mode Implementation

I've completed a comprehensive code review of this YAML mode implementation. Overall, this is a solid feature addition that follows instructor's architecture well, but there are several important issues to address.

Strengths

  1. Consistent Architecture: The implementation follows the established pattern perfectly:

    • Added Mode.YAML to the enum (instructor/mode.py:26)
    • Properly categorized as a JSON mode (instructor/mode.py:109)
    • Integrated into all the right handler mappings across response.py and openai/utils.py
  2. Complete Integration: YAML mode is properly integrated across the entire request/response lifecycle:

    • Request handling (handle_yaml_mode in openai/utils.py:422-465)
    • Response parsing (parse_yaml in function_calls.py:647-700)
    • Error recovery (reask_yaml in openai/utils.py:81-103)
  3. Robust YAML Parsing: The parse_yaml method includes thoughtful logic:

    • Code block extraction with regex patterns for both yaml and yml
    • Fallback to raw content detection with heuristics
    • Proper error handling with descriptive messages
    • Safe YAML loading with security considerations
  4. Benchmarking Evidence: Concrete performance data showing 28.1% token efficiency improvement over pretty JSON.

⚠️ Critical Issues

1. Missing Dependency Management

  • Issue: The code imports yaml but PyYAML is not listed in pyproject.toml dependencies
  • Impact: This will cause runtime ImportError for users who don't have PyYAML installed
  • Fix needed: Add pyyaml as either a core dependency or optional dependency

2. No Test Coverage

  • Issue: No actual tests found for YAML mode functionality
  • Impact: No verification that the implementation works correctly with real LLM responses
  • Repository guideline: "Include tests for all changes" (CLAUDE.md:155)
  • Expected: Tests in tests/llm/test_openai/ following existing patterns

3. Import Error Handling

  • Issue: Missing graceful handling if PyYAML is not available
  • Impact: Hard crashes instead of informative error messages
  • Location: function_calls.py:654 - naked import yaml

🔧 Code Quality Issues

4. Code Duplication in Reask Handler

  • Issue: reask_yaml is nearly identical to reask_md_json
  • Location: openai/utils.py:81-103 vs 106-128
  • Improvement: Could be refactored to share common logic

5. Hard-coded Prompts

  • Issue: System message in handle_yaml_mode is hard-coded without flexibility
  • Location: openai/utils.py:436-445
  • Concern: No way to customize the prompt style or add provider-specific optimizations

6. Limited Error Context in Reask

  • Issue: YAML reask message could be more specific about YAML formatting requirements
  • Location: Line 99: "Correct your YAML ONLY RESPONSE, based on the following errors:\n{exception}"
  • Improvement: Could include YAML-specific guidance

📋 Missing Features

7. Provider Support

  • Issue: Only implemented for OpenAI provider
  • Gap: Other providers (Anthropic, Gemini, etc.) don't support YAML mode
  • Expected: Based on the architecture, should support major providers

8. Documentation Gaps

  • Missing: No integration with main docs in docs/ directory
  • Missing: No examples in examples/ directory following the pattern
  • Present: Only standalone README file

🔒 Security & Performance

9. YAML Security

  • Positive: Uses yaml.safe_load() which is secure
  • Good: Proper exception handling for malformed YAML

10. Regex Performance

  • Issue: Multiple regex operations in sequence could be optimized
  • Location: function_calls.py:670-680
  • Minor: Not critical but could be more efficient

📊 Benchmarking Script Issues

11. Benchmark Accuracy

  • Issue: Token estimation (len(text) // 4) is very rough
  • Improvement: Could use actual tokenizer for more accurate results
  • Note: Results are directionally correct but may not reflect real-world usage

🛠️ Recommended Actions

High Priority:

  1. Add PyYAML dependency to pyproject.toml (core or optional extra)
  2. Add comprehensive tests following tests/llm/test_openai/ patterns
  3. Add graceful import error handling with helpful error messages

Medium Priority:
4. Extend to other providers (Anthropic, Gemini) for feature parity
5. Add proper documentation integration and examples
6. Refactor reask handlers to reduce duplication

Low Priority:
7. Optimize regex patterns and improve error messages
8. Add prompt customization options

💭 Architecture Feedback

The implementation demonstrates excellent understanding of instructor's architecture. The mode-based dispatch pattern is used correctly, and the integration points are all properly covered. The choice to treat YAML as a JSON mode is architecturally sound since both produce structured data that maps to Pydantic models.

🎯 Overall Recommendation

Approve with required changes. This is a valuable feature that addresses a real need (token efficiency), but the missing dependency and test coverage issues must be resolved before merging. The implementation quality is good and follows established patterns well.


@github-actions github-actions bot added documentation Improvements or additions to documentation enhancement New feature or request python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 7, 2025
Implements YAML mode as an alternative to JSON for LLM responses.

Changes:
- Added Mode.YAML enum value
- Implemented YAML request handler and response parser
- Added YAML mode to allowed modes for OpenAI provider
- Improved system prompt for YAML responses
- Added accurate token benchmark using tiktoken

Token efficiency (vs JSON Pretty):
- YAML Standard: +32.1% more efficient
- JSON Compact: +40.8% more efficient (most efficient)

Co-authored-by: jason <jason@jxnl.co>
@jxnl jxnl force-pushed the cursor/implement-yaml-mode-and-benchmark-efficiency-5b08 branch from 735c5f1 to 0578dd8 Compare October 8, 2025 00:15
Copy link
Contributor

claude bot commented Oct 8, 2025

Claude finished @jxnl's task —— View job


PR Review: YAML Mode Implementation

I've completed a comprehensive code review of this YAML mode implementation. Overall, this is a well-architected feature that follows instructor's patterns correctly, but there are several critical issues that must be addressed before merging.

Strengths

  1. Excellent Architecture Adherence: The implementation perfectly follows instructor's established patterns:

    • Added Mode.YAML to the enum (instructor/mode.py:26)
    • Properly categorized as a JSON mode (instructor/mode.py:109)
    • Integrated into all correct handler mappings (response.py:436,616 and openai/utils.py)
  2. Complete Integration: YAML mode is properly integrated across the entire request/response lifecycle:

    • Request handling (handle_yaml_mode in openai/utils.py:421-463)
    • Response parsing (parse_yaml in function_calls.py:646-706)
    • Error recovery (reask_yaml in openai/utils.py:80-102)
  3. Robust YAML Parsing: The parse_yaml method includes thoughtful logic:

    • Code block extraction with regex patterns for both yaml and yml
    • Fallback to raw content detection with heuristics
    • Proper error handling with descriptive messages
    • Safe YAML loading (yaml.safe_load()) for security
  4. Comprehensive Benchmarking: The accurate_token_benchmark.py provides solid evidence of token efficiency benefits with real tiktoken measurements.

⚠️ Critical Issues

1. Missing PyYAML Dependency 🚨

  • Issue: Code imports yaml but PyYAML is not in pyproject.toml dependencies
  • Impact: Runtime ImportError for users who don't have PyYAML installed
  • Location: function_calls.py:653, accurate_token_benchmark.py:8
  • Fix needed: Add pyyaml as either core or optional dependency

2. No Test Coverage 🚨

  • Issue: Zero tests found for YAML mode functionality
  • Impact: No verification that implementation works with real LLM responses
  • Repository guideline violation: "Include tests for all changes" (CLAUDE.md:155)
  • Expected: Tests in tests/llm/test_openai/ following existing patterns

3. Import Error Handling ⚠️

  • Issue: Missing graceful handling if PyYAML unavailable
  • Location: function_calls.py:653 - naked import yaml
  • Better approach: Conditional import with informative error message

🔧 Code Quality Issues

4. Code Duplication

  • Issue: reask_yaml (lines 80-102) nearly identical to reask_md_json (lines 105-127)
  • Improvement: Could extract common reask logic to reduce duplication

5. Hard-coded System Messages

  • Issue: YAML prompt in handle_yaml_mode is inflexible
  • Location: openai/utils.py:435-443
  • Limitation: No customization options for different use cases

6. Limited Provider Support

  • Issue: Only implemented for OpenAI provider
  • Gap: Other major providers (Anthropic, Gemini) don't support YAML mode
  • Expected: Based on architecture, should support multiple providers

📋 Missing Documentation & Examples

7. No Integration Documentation

  • Missing: Documentation in docs/ directory
  • Missing: Examples in examples/ directory following repo patterns
  • Present: Only standalone README (not integrated with main docs)

🔒 Security & Performance Assessment

✅ Security: Good use of yaml.safe_load() prevents code injection
✅ Performance: Regex patterns are reasonable, no obvious bottlenecks
⚠️ Token Estimation: Benchmark uses rough estimation fallback when tiktoken unavailable

📊 Benchmarking Analysis

The accurate_token_benchmark.py is well-designed:

  • Uses actual tiktoken for precise measurements
  • Tests diverse data structures (simple, nested, arrays)
  • Shows meaningful efficiency gains (varies by data type)
  • Graceful fallback when tiktoken unavailable

🛠️ Required Actions for Merge

High Priority (Blocking):

  1. Add PyYAML dependency to pyproject.toml
  2. Add comprehensive tests in tests/llm/test_openai/
  3. Add graceful import handling with helpful error messages

Medium Priority:
4. Extend to other providers (Anthropic, Gemini) for feature parity
5. Add proper documentation and examples integration
6. Refactor duplicate reask logic

Low Priority:
7. Add prompt customization options
8. Optimize regex patterns for better performance

💭 Architecture Feedback

The implementation demonstrates excellent understanding of instructor's architecture. The mode-based dispatch pattern is used correctly, and treating YAML as a JSON mode is architecturally sound since both produce structured data for Pydantic validation.

🎯 Recommendation

Conditional Approve - This is a valuable feature addressing real token efficiency needs, but the missing dependency and test coverage are blocking issues. The code quality is good and follows established patterns well.

The 28.1% token efficiency claim appears credible based on the benchmark implementation, making this a worthwhile addition once the critical issues are resolved.


Copy link
Contributor

claude bot commented Oct 8, 2025

Claude finished @jxnl's task —— View job


PR Code Review: YAML Mode Implementation

I've completed a comprehensive code review of this YAML mode implementation. Here's my detailed analysis:

Review Tasks:

  • Read repository guidelines and setup instructions
  • Analyze all 5 changed files for implementation quality
  • Check dependency management and imports
  • Evaluate architecture and integration patterns
  • Assess security considerations
  • Review error handling and edge cases
  • Check test coverage and documentation
  • Provide comprehensive feedback and recommendations

Strengths

1. Excellent Architecture Adherence
The implementation perfectly follows instructor's established patterns:

  • Added Mode.YAML to the enum (instructor/mode.py:26)
  • Properly categorized as a JSON mode (instructor/mode.py:109)
  • Integrated into all correct handler mappings (response.py:436,616 and openai/utils.py)

2. Complete Integration
YAML mode is properly integrated across the entire request/response lifecycle:

  • Request handling (handle_yaml_mode in openai/utils.py:421-463)
  • Response parsing (parse_yaml in function_calls.py:646-706)
  • Error recovery (reask_yaml in openai/utils.py:80-102)

3. Robust YAML Parsing
The parse_yaml method includes thoughtful logic:

  • Code block extraction with regex patterns for both yaml and yml
  • Fallback to raw content detection with heuristics
  • Proper error handling with descriptive messages
  • Safe YAML loading (yaml.safe_load()) for security

4. Clean System Message Design
The YAML prompt in handle_yaml_mode provides clear instructions and proper schema integration.


🚨 Critical Issues

1. Missing PyYAML Dependency (BLOCKING)

  • Issue: Code imports yaml but PyYAML is not in pyproject.toml dependencies
  • Impact: Runtime ImportError for users who don't have PyYAML installed
  • Location: function_calls.py:653
  • Fix needed: Add pyyaml as either core or optional dependency

2. No Test Coverage (BLOCKING)

  • Issue: Zero tests found for YAML mode functionality
  • Impact: No verification that implementation works with real LLM responses
  • Repository guideline violation: "Include tests for all changes" (CLAUDE.md:155)
  • Expected: Tests in tests/llm/test_openai/ following existing patterns

3. Import Error Handling (HIGH)

  • Issue: Missing graceful handling if PyYAML unavailable
  • Location: function_calls.py:653 - naked import yaml
  • Better approach: Conditional import with informative error message

⚠️ Code Quality Issues

4. Code Duplication

  • Issue: reask_yaml (lines 80-102) nearly identical to reask_md_json (lines 105-127)
  • Improvement: Could extract common reask logic to reduce duplication

5. Hard-coded System Messages

  • Issue: YAML prompt in handle_yaml_mode is inflexible
  • Location: openai/utils.py:435-443
  • Limitation: No customization options for different use cases

6. Limited Provider Support

  • Issue: Only implemented for OpenAI provider
  • Gap: Other major providers (Anthropic, Gemini) don't support YAML mode
  • Expected: Based on architecture, should support multiple providers

📋 Missing Documentation & Examples

7. No Integration Documentation

  • Missing: Documentation in docs/ directory
  • Missing: Examples in examples/ directory following repo patterns
  • Present: Only standalone implementation (not integrated with main docs)

🔒 Security & Performance Assessment

✅ Security: Good use of yaml.safe_load() prevents code injection
✅ Performance: Regex patterns are reasonable, no obvious bottlenecks
⚠️ Token Estimation: Implementation looks sound based on code analysis


🛠️ Required Actions for Merge

High Priority (Blocking):

  1. Add PyYAML dependency to pyproject.toml
  2. Add comprehensive tests in tests/llm/test_openai/
  3. Add graceful import handling with helpful error messages

Medium Priority:
4. Extend to other providers (Anthropic, Gemini) for feature parity
5. Add proper documentation and examples integration
6. Refactor duplicate reask logic

Low Priority:
7. Add prompt customization options
8. Optimize regex patterns for better performance


💭 Architecture Feedback

The implementation demonstrates excellent understanding of instructor's architecture. The mode-based dispatch pattern is used correctly, and treating YAML as a JSON mode is architecturally sound since both produce structured data for Pydantic validation.

The choice to integrate at the same level as other JSON modes (JSON, MD_JSON, JSON_SCHEMA) is appropriate and maintains consistency with the existing codebase.


🎯 Recommendation

Conditional Approve - This is a valuable feature addressing real token efficiency needs, but the missing dependency and test coverage are blocking issues. The code quality is good and follows established patterns well.

Once the critical issues are resolved, this will be a solid addition to the instructor library that provides meaningful performance benefits while maintaining the library's high standards for reliability and usability.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant