fix: Handle markdown-wrapped JSON in LLM schema generation (#1663) #1720
Closed
YuriNachos wants to merge 1 commit into unclecode:main from
Fixes unclecode#1663

Claude Sonnet and other LLMs sometimes return valid JSON wrapped in markdown code blocks (\`\`\`json...\`\`\`), causing JSONDecodeError in JsonCssExtractionStrategy.generate_schema(). Added pre-processing to strip markdown code block markers before JSON parsing, handling both \`\`\`json and \`\`\` formats.

Co-Authored-By: Claude <noreply@anthropic.com>
cc5ffd3 to 8d619f5
Owner
Thanks for the fix! This issue has already been addressed on the develop branch; there's now a ...
unclecode added a commit that referenced this pull request Feb 1, 2026
- PR #1714: Replace tf-playwright-stealth with playwright-stealth
- PR #1721: Respect <base> tag in html2text for relative links
- PR #1719: Include GoogleSearchCrawler script.js in package data
- PR #1717: Allow local embeddings by removing OpenAI fallback
- Fix: Extract <base href> from raw HTML before head gets stripped
- Close duplicates: #1703, #1698, #1697, #1710, #1720
- Update CONTRIBUTORS.md and PR-TODOLIST.md
Summary
Fixes
Fixes #1663
Details
Claude Sonnet and other LLMs sometimes return valid JSON wrapped in markdown
code blocks, even when JSON mode is enabled. This causes
`json.loads()` to fail with:
```
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
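The failure is easy to reproduce outside the library. The snippet below is a minimal sketch: the payload is a made-up LLM reply, not real output from the issue, and the fence marker is built at runtime only so the example does not contain a literal fence.

```python
import json

# Three backticks, built at runtime so this example contains no literal fence.
FENCE = "`" * 3

# Hypothetical LLM reply: valid JSON, but wrapped in a markdown code block.
wrapped = f'{FENCE}json\n{{"name": "Products", "fields": []}}\n{FENCE}'

try:
    json.loads(wrapped)
except json.JSONDecodeError as exc:
    # The parser rejects the very first character (a backtick).
    error_message = str(exc)
    print(error_message)  # Expecting value: line 1 column 1 (char 0)
```

The error points at character 0 because the opening fence, not the JSON body, is the first thing the parser sees.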
The fix pre-processes the LLM response to remove common markdown patterns (both the \`\`\`json and bare \`\`\` fence formats) before attempting JSON parsing in
`JsonCssExtractionStrategy.generate_schema()`.
Test plan
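The PR's diff is not reproduced here, so the following is only a hedged sketch of the kind of pre-processing described under Details, together with the cases a test plan would likely cover. The helper names `strip_markdown_fence` and `parse_llm_json` are hypothetical and are not taken from the crawl4ai codebase.

```python
import json

# Three backticks, built at runtime so this example contains no literal fence.
FENCE = "`" * 3


def strip_markdown_fence(text: str) -> str:
    """Return text without a surrounding markdown code fence, if one is present."""
    s = text.strip()
    if s.startswith(FENCE):
        # Drop the opening fence line, including an optional language tag such as "json".
        first_newline = s.find("\n")
        s = s[first_newline + 1:] if first_newline != -1 else s[len(FENCE):]
        # Drop the closing fence if the reply ends with one.
        if s.rstrip().endswith(FENCE):
            s = s.rstrip()[: -len(FENCE)]
    return s.strip()


def parse_llm_json(text: str):
    """Parse JSON from an LLM reply that may or may not be fenced."""
    return json.loads(strip_markdown_fence(text))


# Fenced with a language tag, fenced without one, and plain JSON all parse.
tagged = parse_llm_json(FENCE + 'json\n{"a": 1}\n' + FENCE)
bare = parse_llm_json(FENCE + "\n[1, 2]\n" + FENCE)
plain = parse_llm_json('  {"a": 1}  ')
```

Unfenced input passes through unchanged apart from whitespace trimming, so the helper is safe to call on every response rather than only on those from models known to add fences.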
🤖 Generated with Claude Code