feat: Add Unstract Adapter Extension Skill and Azure AI Foundry Adapter #1721

base: main
Conversation
* feat: add azure ai foundry adapter
* feat: update parameters in various adapters
Summary by CodeRabbit

Walkthrough

Adds optional LLM/embedding parameters, a new Azure AI Foundry LLM adapter and schema, reasoning/json_mode/dimensions support in parameter validation and JSON schemas, multiple adapter generator templates and scaffolding scripts, plus documentation and tooling for adapter lifecycle and model/schema management.

Changes

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
for more information, see https://pre-commit.ci
Actionable comments posted: 3
🧹 Nitpick comments (13)
.claude/skills/unstract-adapter-extension/assets/templates/llm_schema.json.template (1)

34-40: Consider using "integer" type instead of "number" with multipleOf: 1.

For integer fields like `max_tokens`, using `"type": "integer"` is more idiomatic than `"type": "number"` with `"multipleOf": 1`. This applies to `max_tokens` (line 35), `max_retries` (line 42), and `timeout` (line 50) as well.

🔎 Proposed refactor

```diff
 "max_tokens": {
-  "type": "number",
+  "type": "integer",
   "minimum": 0,
-  "multipleOf": 1,
   "title": "Maximum Output Tokens",
   "description": "Maximum number of output tokens to limit LLM replies."
 },
```

Apply the same pattern to the `max_retries` and `timeout` fields.

.claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py (3)
309-342: Consider more specific exception handling for image processing.

Line 339 catches a broad `Exception`. While the function handles this gracefully with a print statement, consider catching specific PIL exceptions (e.g., `PIL.UnidentifiedImageError`, `OSError`) for better error diagnosis.

🔎 Suggested improvement

```diff
 except ImportError:
     if image_data[:8] == b"\x89PNG\r\n\x1a\n":
         output_path.write_bytes(image_data)
         return True
     print("  Note: Install Pillow for better image processing: pip install Pillow")
     return False
-except Exception as e:
+except (OSError, ValueError) as e:
     print(f"  Error processing image: {e}")
     return False
```

355-415: LGTM with minor suggestion: Local logo processing mirrors download logic.

The function appropriately handles SVG conversion, raster resizing, and missing dependencies. Same suggestion as `download_and_process_logo`: consider more specific exception handling at line 413.
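The effect of the narrowed handler can be shown in isolation. This is a sketch, not the script's actual function; `process_image` here is a hypothetical stand-in:

```python
def process_image(data: bytes) -> bool:
    """Hypothetical stand-in for the logo-processing step."""
    try:
        if not data:
            raise ValueError("empty image data")
        # ... real processing would happen here ...
        return True
    except (OSError, ValueError) as e:
        # Expected failure modes are reported and swallowed...
        print(f"  Error processing image: {e}")
        return False
    # ...while anything else (say, a TypeError from a bug) still propagates.

print(process_image(b""))         # False
print(process_image(b"\x89PNG"))  # True
```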
478-490: Consider using copy.deepcopy for schema template cloning.

Line 478 uses `json.loads(json.dumps(EMBEDDING_SCHEMA_TEMPLATE))` to clone the template dict. While this ensures JSON serializability, `copy.deepcopy()` would be more idiomatic and efficient.

🔎 Suggested refactor

Add import:

```python
import copy
```

Then replace:

```diff
-schema = json.loads(json.dumps(EMBEDDING_SCHEMA_TEMPLATE))
+schema = copy.deepcopy(EMBEDDING_SCHEMA_TEMPLATE)
```

unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/openai.json (1)

33-39: Consider using "integer" type for dimensions field.

Similar to the template file, this field uses `"type": "number"` with `"multipleOf": 1` to represent integers. Using `"type": "integer"` would be more idiomatic.

🔎 Proposed refactor

```diff
 "dimensions": {
-  "type": "number",
+  "type": "integer",
   "minimum": 1,
-  "multipleOf": 1,
   "title": "Dimensions",
   "description": "Output embedding dimensions. Only supported by text-embedding-3-* models. Leave empty for default dimensions (1536 for small, 3072 for large)."
 },
```

.claude/skills/unstract-adapter-extension/assets/templates/embedding_schema.json.template (1)
35-42: Consider using "integer" type for numeric fields.

Both `embed_batch_size` and `timeout` use `"type": "number"` with `"multipleOf": 1`. Consider using `"type": "integer"` for clarity, consistent with the suggestion for the LLM template.

🔎 Proposed refactor

```diff
 "embed_batch_size": {
-  "type": "number",
+  "type": "integer",
   "minimum": 1,
-  "multipleOf": 1,
   "title": "Embed Batch Size",
   "default": 10,
   "description": "Number of texts to embed per batch."
 },
 "timeout": {
-  "type": "number",
+  "type": "integer",
   "minimum": 0,
-  "multipleOf": 1,
   "title": "Timeout",
   "default": 240,
   "description": "Request timeout in seconds."
 }
```

unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/azure.json (1)
23-23: LGTM: Azure embedding schema updated with dimensions support.

The model description update and new dimensions property align well with the OpenAI schema changes. The note about text-embedding-3-* model support is helpful.

Same optional refactor suggestion applies: consider using `"type": "integer"` instead of `"type": "number"` with `"multipleOf": 1` for the dimensions field.

Also applies to: 50-56
.claude/skills/unstract-adapter-extension/references/provider_capabilities.md (1)
45-90: Optional: Add language specifiers to fenced code blocks.

The parameter specification blocks (lines 45-90) lack language specifiers in their fenced code blocks. While this doesn't affect functionality, adding a specifier like `text` or `yaml` would satisfy linting tools.

Example:

````diff
 ### OpenAI LLM
-```
+```text
 api_key, api_base, model, max_tokens, temperature, top_p
 enable_reasoning, reasoning_effort (o1/o3 models)
 ```
````

.claude/skills/unstract-adapter-extension/references/adapter_patterns.md (1)
7-25: Consider adding a language identifier to the code block.

The fenced code block for the architecture overview lacks a language specification. While this is a text diagram, adding a language identifier improves markdown rendering consistency.

Suggested fix

````diff
-```
+```text
 unstract/sdk1/src/unstract/sdk1/adapters/
 ├── base1.py  # Parameter classes & base adapter
````

.claude/skills/unstract-adapter-extension/scripts/manage_models.py (2)
40-43: Consider adding error handling for malformed JSON.

`load_schema` doesn't handle `json.JSONDecodeError`. If a schema file is corrupted, the script will crash with an unhelpful traceback.

🔎 Proposed fix

```diff
 def load_schema(schema_path: Path) -> dict:
     """Load and parse a JSON schema file."""
-    with open(schema_path) as f:
-        return json.load(f)
+    try:
+        with open(schema_path) as f:
+            return json.load(f)
+    except json.JSONDecodeError as e:
+        print(f"Error: Invalid JSON in {schema_path}: {e}")
+        sys.exit(1)
```
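A runnable sketch of the hardened loader suggested above (the function body mirrors the proposed fix; the temp files are only for demonstration):

```python
import json
import sys
import tempfile
from pathlib import Path

def load_schema(schema_path: Path) -> dict:
    """Load and parse a JSON schema file, exiting cleanly on bad JSON."""
    try:
        with open(schema_path) as f:
            return json.load(f)
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in {schema_path}: {e}")
        sys.exit(1)

with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text('{"properties": {}}')
    print(load_schema(good))  # {'properties': {}}

    bad = Path(tmp) / "bad.json"
    bad.write_text("{not json")
    try:
        load_schema(bad)
    except SystemExit as e:
        print(f"exited with {e.code}")  # exited with 1
```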
115-123: Setting default to a model not in enum could cause UI validation issues.

`set_default_model` allows setting a default that may not exist in the enum list. If the schema has an enum constraint, setting a default outside that enum could cause validation errors in the UI.

🔎 Proposed improvement

```diff
 def set_default_model(schema: dict, model: str) -> dict:
     """Set the default model."""
     if "properties" not in schema:
         schema["properties"] = {}
     if "model" not in schema["properties"]:
         schema["properties"]["model"] = {"type": "string", "title": "Model"}
+    # Warn if setting default to a value not in enum
+    existing_enum = schema["properties"]["model"].get("enum", [])
+    if existing_enum and model not in existing_enum:
+        print(f"Warning: '{model}' is not in the enum list. Adding it.")
+        existing_enum.append(model)
+        schema["properties"]["model"]["enum"] = existing_enum
     schema["properties"]["model"]["default"] = model
     return schema
```

.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py (2)
345-347: Avoid catching bare `Exception` - narrows exception handling.

Catching all exceptions hides specific errors and makes debugging harder. Consider catching specific exceptions like `OSError` or `PIL.UnidentifiedImageError`.

🔎 Proposed fix

```diff
-except Exception as e:
-    print(f"  Error processing image: {e}")
-    return False
+except (OSError, ValueError) as e:
+    print(f"  Error processing image: {e}")
+    return False
```

464-466: Reconsider default parameter class choice.

Defaulting to `AnyscaleLLMParameters` when no parameter class is specified seems arbitrary. Consider defaulting to a more generic base class or requiring explicit specification.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
`frontend/public/icons/adapter-icons/AzureAIFoundry.png` is excluded by `!**/*.png`
📒 Files selected for processing (22)
- .claude/skills/unstract-adapter-extension/SKILL.md
- .claude/skills/unstract-adapter-extension/assets/templates/embedding_adapter.py.template
- .claude/skills/unstract-adapter-extension/assets/templates/embedding_parameters.py.template
- .claude/skills/unstract-adapter-extension/assets/templates/embedding_schema.json.template
- .claude/skills/unstract-adapter-extension/assets/templates/llm_adapter.py.template
- .claude/skills/unstract-adapter-extension/assets/templates/llm_parameters.py.template
- .claude/skills/unstract-adapter-extension/assets/templates/llm_schema.json.template
- .claude/skills/unstract-adapter-extension/references/adapter_patterns.md
- .claude/skills/unstract-adapter-extension/references/json_schema_guide.md
- .claude/skills/unstract-adapter-extension/references/provider_capabilities.md
- .claude/skills/unstract-adapter-extension/scripts/check_adapter_updates.py
- .claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py
- .claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py
- .claude/skills/unstract-adapter-extension/scripts/manage_models.py
- unstract/sdk1/src/unstract/sdk1/adapters/base1.py
- unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/azure.json
- unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/openai.json
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/azure_ai_foundry.py
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/azure_ai_foundry.json
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/mistral.json
- unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json
🧰 Additional context used
🧬 Code graph analysis (3)
unstract/sdk1/src/unstract/sdk1/adapters/llm1/azure_ai_foundry.py (2)
unstract/sdk1/src/unstract/sdk1/adapters/base1.py (8)
- AzureAIFoundryLLMParameters (691-715)
- BaseAdapter (76-117)
- get_id (81-82)
- get_name (86-87)
- get_description (91-92)
- get_provider (96-97)
- get_icon (101-102)
- get_adapter_type (116-117)

unstract/sdk1/src/unstract/sdk1/constants.py (1)

- AdapterTypes (16-22)
.claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py (1)
.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py (7)
- to_class_name (148-166)
- to_icon_name (169-174)
- fetch_url (177-187)
- search_potential_logo_sources (190-246)
- download_and_process_logo (249-347)
- copy_logo (350-420)
- main (542-649)
.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py (1)
.claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py (1)
- to_class_name (147-163)
🪛 LanguageTool
.claude/skills/unstract-adapter-extension/SKILL.md
[grammar] ~6-~6: Ensure spelling is correct
Context: ...and self-hosted models (Ollama). --- # Unstract Adapter Extension Skill This skill pro...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
.claude/skills/unstract-adapter-extension/references/provider_capabilities.md
45-45: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
57-57: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
63-63: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
70-70: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
76-76: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
82-82: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
88-88: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
138-138: Bare URL used
(MD034, no-bare-urls)
139-139: Bare URL used
(MD034, no-bare-urls)
140-140: Bare URL used
(MD034, no-bare-urls)
141-141: Bare URL used
(MD034, no-bare-urls)
142-142: Bare URL used
(MD034, no-bare-urls)
143-143: Bare URL used
(MD034, no-bare-urls)
144-144: Bare URL used
(MD034, no-bare-urls)
145-145: Bare URL used
(MD034, no-bare-urls)
.claude/skills/unstract-adapter-extension/references/adapter_patterns.md
24-24: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
42-42: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
301-301: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.14.10)
.claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py
177-177: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
178-178: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
209-209: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
210-210: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
229-229: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
230-230: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
280-280: subprocess call: check for execution of untrusted input
(S603)
281-293: Starting a process with a partial executable path
(S607)
300-300: Consider moving this statement to an else block
(TRY300)
331-331: Consider moving this statement to an else block
(TRY300)
339-339: Do not catch blind exception: Exception
(BLE001)
360-360: subprocess call: check for execution of untrusted input
(S603)
361-373: Starting a process with a partial executable path
(S607)
380-380: Consider moving this statement to an else block
(TRY300)
406-406: Consider moving this statement to an else block
(TRY300)
413-413: Do not catch blind exception: Exception
(BLE001)
.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py
183-183: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
184-184: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
215-215: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
216-216: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
235-235: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
236-236: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
286-286: subprocess call: check for execution of untrusted input
(S603)
287-299: Starting a process with a partial executable path
(S607)
306-306: Consider moving this statement to an else block
(TRY300)
337-337: Consider moving this statement to an else block
(TRY300)
345-345: Do not catch blind exception: Exception
(BLE001)
366-366: subprocess call: check for execution of untrusted input
(S603)
367-379: Starting a process with a partial executable path
(S607)
386-386: Consider moving this statement to an else block
(TRY300)
412-412: Consider moving this statement to an else block
(TRY300)
419-419: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (30)
.claude/skills/unstract-adapter-extension/scripts/init_embedding_adapter.py (6)
147-169: LGTM: String conversion utilities are well-implemented.

The special-case handling for known providers and the fallback logic are appropriate.

171-182: LGTM: URL fetching is properly secured.

The function includes appropriate error handling, timeout protection, and a User-Agent header. The static analysis warning (S310) is a false positive for this legitimate logo-fetching use case.

184-241: LGTM: Logo search functionality is well-designed.

The separation of search from download, use of HEAD requests, and appropriate error handling are all good practices.

269-308: LGTM: SVG conversion with ImageMagick is properly secured.

The subprocess call uses a list of arguments (not shell=True), preventing command injection. Input paths are controlled, and missing dependency handling is appropriate.

510-522: Excellent design: Auto-logo search doesn't auto-download.

The decision to search for logos but require explicit user confirmation (via `--logo-url`) before downloading is a good security and user-experience practice. This prevents unintended downloads of incorrect logos.

532-643: LGTM: CLI interface is well-designed.

The argument parsing, dry-run support, error reporting, and next-steps guidance provide a great user experience. Exit code handling is correct.
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/openai.json (1)
18-19: LGTM: Default model updated to newer version.

Updating the default from `text-embedding-ada-002` to `text-embedding-3-small` aligns with current OpenAI recommendations.

unstract/sdk1/src/unstract/sdk1/adapters/llm1/azure_ai_foundry.py (1)
1-40: LGTM: Azure AI Foundry adapter correctly implements all required methods.

The adapter class properly:

- Inherits from both `AzureAIFoundryLLMParameters` and `BaseAdapter`
- Implements all required static methods with appropriate decorators
- Uses correct ID format (`provider|uuid`)
- Returns proper metadata structure
- Follows naming conventions
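For readers unfamiliar with the pattern, a stripped-down sketch of what such an adapter class looks like. All names, the provider string, and the UUID here are placeholders, not the actual Azure AI Foundry values, and the real class also inherits the provider's parameter class and `BaseAdapter`:

```python
class ExampleLLMAdapter:
    """Sketch of the adapter contract described above (illustrative only)."""

    @staticmethod
    def get_id() -> str:
        # Format: "provider|uuid" (placeholder UUID)
        return "exampleprovider|00000000-0000-0000-0000-000000000000"

    @staticmethod
    def get_name() -> str:
        return "Example Provider"

    @staticmethod
    def get_description() -> str:
        return "Example Provider LLM adapter."

    @staticmethod
    def get_provider() -> str:
        return "exampleprovider"

    @staticmethod
    def get_icon() -> str:
        return "/icons/adapter-icons/ExampleProvider.png"

# The registry can split the ID into provider and UUID parts:
provider, _, uuid_part = ExampleLLMAdapter.get_id().partition("|")
print(provider)  # exampleprovider
```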
.claude/skills/unstract-adapter-extension/references/provider_capabilities.md (1)
1-145: LGTM: Comprehensive provider capabilities reference.

The documentation provides a well-organized reference for provider features, parameters, authentication methods, and model prefixes. The tables are clear and the feature definitions are helpful.
.claude/skills/unstract-adapter-extension/SKILL.md (1)
1-353: LGTM: Comprehensive and well-structured skill documentation.

This documentation provides excellent coverage of:
- Supported operations with clear command examples
- Detailed workflows for each operation type
- Validation checklists to prevent common errors
- Maintenance procedures for keeping adapters current
- Troubleshooting guidance
The inclusion of practical tips (like the GitHub raw URL guidance at lines 76-78) and the emphasis on logo verification (auto-search doesn't auto-download) demonstrate thoughtful design.
.claude/skills/unstract-adapter-extension/assets/templates/llm_schema.json.template (1)
29-29: Use standard JSON Schema format "uri" instead of non-standard "url".

The JSON Schema specification defines standard formats for resource identifiers as "uuid", "uri", "uri-reference", "iri", and "iri-reference". "url" is not a recognized format. Use "uri" for API base URLs to ensure schema compliance.

```diff
-"format": "url",
+"format": "uri",
```

unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/azure_ai_foundry.json (1)
1-59: LGTM!

The JSON schema is well-structured and follows the established patterns from other adapter schemas in the codebase. The required fields (`adapter_name`, `api_key`, `api_base`) are appropriate for Azure AI Foundry, and the use of `format: "password"` for the API key and `format: "uri"` for the endpoint URL provides proper UI hints. The integer-like fields correctly use `"type": "number"` with `"multipleOf": 1` for consistency with other adapter schemas.

.claude/skills/unstract-adapter-extension/references/json_schema_guide.md (1)
1-549: LGTM!

This is a comprehensive and well-organized reference guide for creating adapter JSON schemas. The documentation covers all essential field types, provides clear examples for conditional fields using `allOf`/`if`/`then`, and includes practical complete examples for various adapter types (simple LLM, cloud provider, Azure-style, self-hosted, embedding). The best practices section offers valuable guidance for schema authors.

.claude/skills/unstract-adapter-extension/references/adapter_patterns.md (1)
36-562: LGTM!

The adapter patterns reference is comprehensive and well-structured. It covers the complete LLM adapter implementation with parameter classes, adapter classes, and JSON schemas. The patterns for reasoning, thinking, conditional fields, and optional credentials are clearly documented with practical code examples. The "Common Mistakes" section is particularly valuable for developers new to adapter development.
.claude/skills/unstract-adapter-extension/assets/templates/llm_parameters.py.template (1)
1-61: LGTM!

The LLM parameters template is well-designed and follows the documented patterns. The `validate_model` method correctly implements idempotent prefix handling to avoid double-prefixing. The template provides clear placeholder documentation and appropriate comments for customization points. The structure aligns with the patterns documented in adapter_patterns.md.

.claude/skills/unstract-adapter-extension/assets/templates/embedding_parameters.py.template (1)
1-57: LGTM!

The embedding parameters template correctly differentiates from the LLM template by not applying a model prefix (as noted in the docstring, embedding models typically don't need prefixes). The `embed_batch_size` field with a default of 10 is appropriate for embedding adapters. The template structure is consistent with the LLM parameters template and follows established patterns.

unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json (1)
22-48: LGTM!

The schema updates appropriately support multiple AWS authentication methods. The updated descriptions for `aws_access_key_id` and `aws_secret_access_key` clarify that these can be left empty when using AWS Profile or IAM role authentication. The new `aws_profile_name` field enables AWS SSO authentication, and the `model_id` field for Application Inference Profile ARN supports cost tracking. This aligns with the patterns documented in the "Optional Credentials Pattern" section of the adapter patterns reference.

.claude/skills/unstract-adapter-extension/assets/templates/embedding_adapter.py.template (1)
1-50: LGTM!

The embedding adapter template is well-structured and follows the documented patterns. The inheritance order (`${PARAM_CLASS}` first, then `BaseAdapter`) is correct as noted in the adapter patterns documentation. All required static methods are present with appropriate return types, and the metadata structure matches the expected format for adapter registration.

Also applies to: 55-59
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/mistral.json (1)
16-99: LGTM!

The schema correctly implements support for Magistral Small 2506 and Magistral Medium models, which are Mistral's first reasoning models released in two variants: a 24B parameter open-source version and a more powerful enterprise version. The conditional `allOf` block properly gates the `reasoning_effort` field, showing it only when `enable_reasoning` is true, an appropriate pattern for these models' reasoning capabilities. The updated model default of `mistral-large-latest` and inclusion of the Magistral model examples are correct.

unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json (1)
28-64: LGTM!

The new properties (`max_tokens`, `temperature`, `json_mode`) and the `context_window` title fix are well-structured, properly validated, and align with LiteLLM's Ollama parameters. The `json_mode` boolean provides a user-friendly abstraction that the `OllamaLLMParameters.validate()` method correctly maps to `response_format`.

.claude/skills/unstract-adapter-extension/scripts/manage_models.py (1)
158-164: Potential IndexError when enum list is empty.

If `model_prop["enum"]` is an empty list `[]`, accessing `model_prop["enum"][0]` on line 161 will raise an `IndexError`. The ternary condition `if model_prop["enum"]` handles this, but the logic order should be verified. Actually, reviewing again: the ternary handles this correctly. The issue is that if `"default"` doesn't exist AND `enum` is empty, then `default` becomes `""`, which is safe.

.claude/skills/unstract-adapter-extension/scripts/check_adapter_updates.py (3)
373-378: list_adapters correctly filters Python files.

The logic uses `py_file.name.startswith("__")`, which checks the filename string, not the Path object directly. This is correct behavior.

49-51: The LITELLM_FEATURES dict requires regular maintenance to reflect current model availability. While "o4-mini" and "gpt-5" are established, released models (not speculative), the hardcoded list should be reviewed for outdated or superseded entries like legacy model identifiers to prevent false positives in adapter update suggestions.

65-66: These model names follow Anthropic's standard versioning format. Anthropic recommends using specific model versions like `claude-sonnet-4-5-20250929` in production applications, where the final component is the snapshot date (YYYYMMDD). Claude Sonnet 4.5 is available via the API using the model string `claude-sonnet-4-5-20250929`, released on September 29, 2025. Similarly, Claude Opus 4.1 was released on August 5, 2025, matching the snapshot date. The format is consistent with earlier models like `claude-3-5-sonnet-20241022`.

Likely an incorrect or invalid review comment.
unstract/sdk1/src/unstract/sdk1/adapters/base1.py (4)
666-679: LGTM - clean json_mode to response_format conversion.

The `json_mode` boolean is correctly converted to the `{"type": "json_object"}` format expected by LiteLLM. Using `pop` on the copied metadata ensures the original isn't mutated and `json_mode` doesn't leak into the final output.
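A minimal sketch of that conversion as a standalone function; in the SDK the equivalent logic lives inside the adapter's validate step, so the function name and dict shapes here are assumptions for illustration:

```python
def apply_json_mode(metadata: dict) -> dict:
    """Pop json_mode from a copy and map it to LiteLLM's response_format."""
    out = dict(metadata)  # shallow copy so the caller's dict isn't mutated
    if out.pop("json_mode", False):
        out["response_format"] = {"type": "json_object"}
    return out

original = {"model": "ollama/llama3", "json_mode": True}
converted = apply_json_mode(original)
print(converted)  # {'model': 'ollama/llama3', 'response_format': {'type': 'json_object'}}
print(original)   # unchanged: {'model': 'ollama/llama3', 'json_mode': True}
```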
608-650: LGTM - consistent reasoning pattern with OpenAI/Azure.

The Mistral reasoning implementation follows the same pattern established for the OpenAI and Azure adapters, maintaining codebase consistency.

444-448: Good additions for AWS SSO and cost tracking.

`aws_profile_name` enables SSO authentication workflows, and `model_id` supports Application Inference Profile for cost allocation. Both have safe `None` defaults.
691-716: LGTM - `azure_ai/` prefix is correct for LiteLLM.

The `AzureAIFoundryLLMParameters` class correctly uses the `azure_ai/` prefix format that LiteLLM expects for Azure AI Foundry models. The `validate_model` method properly handles both prefixed and unprefixed model names, adding the prefix only when necessary.

.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py (2)
286-302: Subprocess command with partial path relies on `magick` being in PATH.

The script uses `magick` without an absolute path, which is fine for developer tooling but could fail silently in some environments. The error handling on lines 307-311 addresses this adequately.

148-166: Consistent with init_embedding_adapter.py - good for maintainability.

The `to_class_name` function matches the implementation in the embedding adapter script, ensuring consistent class naming across both adapter types.
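The idempotent prefixing praised in the base1.py comments can be sketched as follows. The prefix string comes from the review; the function body is an illustration, not the SDK source:

```python
AZURE_AI_PREFIX = "azure_ai/"

def validate_model(model: str) -> str:
    """Add the LiteLLM azure_ai/ prefix only when it's missing."""
    if model.startswith(AZURE_AI_PREFIX):
        return model  # already prefixed: return unchanged, never doubled
    return AZURE_AI_PREFIX + model

print(validate_model("my-deployment"))           # azure_ai/my-deployment
print(validate_model("azure_ai/my-deployment"))  # azure_ai/my-deployment
```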
```python
@staticmethod
def get_icon() -> str:
    """Return icon path."""
    return "/icons/adapter-icons/${PROVIDER_NAME}.png"
```
Icon path may contain spaces causing loading issues.
The get_icon() method uses ${PROVIDER_NAME} which is the display name (e.g., "Azure AI Foundry"). However, icon filenames typically use PascalCase without spaces (e.g., AzureAIFoundry.png). Using the display name directly could result in paths with spaces that may fail to load.
Suggested fix
Consider using ${CLASS_NAME} instead, which is already in PascalCase:
```diff
 @staticmethod
 def get_icon() -> str:
     """Return icon path."""
-    return "/icons/adapter-icons/${PROVIDER_NAME}.png"
+    return "/icons/adapter-icons/${CLASS_NAME}.png"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
@staticmethod
def get_icon() -> str:
    """Return icon path."""
    return "/icons/adapter-icons/${CLASS_NAME}.png"
```
🤖 Prompt for AI Agents
In
.claude/skills/unstract-adapter-extension/assets/templates/embedding_adapter.py.template
around lines 51 to 54, the get_icon() return uses ${PROVIDER_NAME} which may
contain spaces (display name) and break icon loading; update the template to use
${CLASS_NAME} (PascalCase filename) for the icon path or otherwise
sanitize/remove spaces from the provider name so the returned path is a valid
filename (e.g., "/icons/adapter-icons/${CLASS_NAME}.png").
```python
@staticmethod
def get_icon() -> str:
    """Return icon path."""
    return "/icons/adapter-icons/${PROVIDER_NAME}.png"
```
Icon path may have spaces causing mismatch with actual icon files.
The template uses ${PROVIDER_NAME} directly in the icon path, but the init scripts use to_icon_name() which removes spaces from the display name. If PROVIDER_NAME is "New Provider", the template outputs /icons/adapter-icons/New Provider.png, while the actual icon is saved as NewProvider.png.
Consider using a separate placeholder like ${ICON_NAME} that corresponds to the space-removed version, or update the template to match the init script's naming logic.
🤖 Prompt for AI Agents
In
.claude/skills/unstract-adapter-extension/assets/templates/llm_adapter.py.template
around lines 51 to 54, the icon path uses ${PROVIDER_NAME} which can include
spaces and mismatch the init script's to_icon_name() (which removes spaces);
change the template to use a space-free placeholder (e.g., ${ICON_NAME}) or
apply the same transformation as the init script (remove spaces from
PROVIDER_NAME) so the generated path matches the actual saved icon filename
(e.g., /icons/adapter-icons/${ICON_NAME}.png).
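A sketch of the space-stripping transformation both icon comments ask for. A `to_icon_name` helper exists in the init scripts; this one-line body is an assumption about its behavior, not the actual source:

```python
def to_icon_name(provider_name: str) -> str:
    """Strip spaces from a display name to get the icon filename stem (assumed behavior)."""
    return provider_name.replace(" ", "")

print(to_icon_name("Azure AI Foundry") + ".png")  # AzureAIFoundry.png
```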
```python
def fetch_url(url: str, timeout: int = 10) -> bytes | None:
    """Fetch content from URL with error handling."""
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        }
        request = Request(url, headers=headers)
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except (URLError, HTTPError, TimeoutError):
        return None
```
Consider validating URL scheme to prevent file:// access.
The static analyzer flags urlopen for accepting arbitrary URL schemes. If a user passes --logo-url "file:///etc/passwd", it could read local files. Consider restricting to http:// and https:// schemes.
🔎 Proposed fix

```diff
 def fetch_url(url: str, timeout: int = 10) -> bytes | None:
     """Fetch content from URL with error handling."""
+    from urllib.parse import urlparse
+
+    parsed = urlparse(url)
+    if parsed.scheme not in ("http", "https"):
+        print(f"  Error: Only http/https URLs are supported, got: {parsed.scheme}")
+        return None
+
     try:
         headers = {
             "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
         }
```
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def fetch_url(url: str, timeout: int = 10) -> bytes | None:
    """Fetch content from URL with error handling."""
    from urllib.parse import urlparse

    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        print(f"  Error: Only http/https URLs are supported, got: {parsed.scheme}")
        return None

    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        }
        request = Request(url, headers=headers)
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except (URLError, HTTPError, TimeoutError):
        return None
```
🧰 Tools
🪛 Ruff (0.14.10)
183-183: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
184-184: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
🤖 Prompt for AI Agents
.claude/skills/unstract-adapter-extension/scripts/init_llm_adapter.py around
lines 177 to 187: the fetch_url function currently passes any URL to urlopen
which allows unsafe schemes like file://; parse the URL (e.g., with
urllib.parse.urlparse) and validate that the scheme is either "http" or "https"
before creating the Request and calling urlopen, returning None or raising a
controlled error for disallowed schemes; perform this check immediately after
receiving the url argument and only proceed to build the Request and call
urlopen when the scheme is allowed.
* remove claude artifacts from gitignore
Actionable comments posted: 2
♻️ Duplicate comments (3)
.claude/skills/adapter-ops/assets/templates/llm_adapter.py.template (1)
51-54: Icon path placeholder mismatch persists. The past review comment about icon path spacing is still valid. The template uses `${PROVIDER_NAME}` in the icon path (line 54), which may contain spaces (e.g., "New Provider"), while the init scripts save icons with spaces removed (e.g., "NewProvider.png"). This creates a mismatch between the generated adapter code and the actual icon filename. Consider updating the init scripts to use a separate `${ICON_NAME}` placeholder that matches the `to_icon_name()` transformation, or update the template to remove spaces from the provider name in the icon path.

.claude/skills/adapter-ops/assets/templates/embedding_adapter.py.template (1)
51-54: Icon path uses display name which may contain spaces. This issue was previously identified. The `${PROVIDER_NAME}` placeholder (e.g., "Azure AI Foundry") will produce paths with spaces that may fail to load. Consider using `${CLASS_NAME}` instead.

.claude/skills/adapter-ops/scripts/init_llm_adapter.py (1)
177-187: URL scheme validation still missing. This issue was previously flagged. The `urlopen` call accepts arbitrary URL schemes including `file://`, which could read local files if a malicious URL is provided via `--logo-url`.
🧹 Nitpick comments (3)
.claude/skills/adapter-ops/assets/templates/llm_schema.json.template (1)
29-29: Consider using "uri" format for consistency. Line 29 uses `"format": "url"` while the embedding template uses `"format": "uri"`. The JSON Schema specification recognizes "uri" as a standard format validator. Consider updating to "uri" for consistency across templates, though "url" may also be supported by some validators.

🔎 Proposed fix

```diff
 "api_base": {
     "type": "string",
-    "format": "url",
+    "format": "uri",
     "title": "API Base",
     "default": "",
     "description": "API endpoint URL (optional, uses default if not specified)."
 },
```

.claude/skills/adapter-ops/scripts/init_llm_adapter.py (2)
148-420: Consider extracting shared utilities to reduce duplication. The utility functions (`to_class_name`, `to_icon_name`, `fetch_url`, `search_potential_logo_sources`, `download_and_process_logo`, `copy_logo`) are duplicated verbatim in `init_embedding_adapter.py`. Extracting them to a shared module (e.g., `adapter_utils.py`) would improve maintainability.

Example structure

```python
# .claude/skills/adapter-ops/scripts/adapter_utils.py
def to_class_name(provider: str) -> str: ...
def to_icon_name(display_name: str) -> str: ...
def fetch_url(url: str, timeout: int = 10) -> bytes | None: ...
# ... other shared functions
```

Then import in both scripts:

```python
from adapter_utils import to_class_name, to_icon_name, fetch_url, ...
```
24-32: Path resolution relies on fixed directory structure. The `REPO_ROOT` calculation assumes this script is always at `.claude/skills/adapter-ops/scripts/`. If the skill directory is renamed or moved, paths will break silently.

Consider adding a validation check:

```python
if not (REPO_ROOT / "unstract").exists():
    print(f"Warning: Repository root detection may be incorrect: {REPO_ROOT}")
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (15)
- .claude/skills/adapter-ops/SKILL.md
- .claude/skills/adapter-ops/assets/templates/embedding_adapter.py.template
- .claude/skills/adapter-ops/assets/templates/embedding_parameters.py.template
- .claude/skills/adapter-ops/assets/templates/embedding_schema.json.template
- .claude/skills/adapter-ops/assets/templates/llm_adapter.py.template
- .claude/skills/adapter-ops/assets/templates/llm_parameters.py.template
- .claude/skills/adapter-ops/assets/templates/llm_schema.json.template
- .claude/skills/adapter-ops/references/adapter_patterns.md
- .claude/skills/adapter-ops/references/json_schema_guide.md
- .claude/skills/adapter-ops/references/provider_capabilities.md
- .claude/skills/adapter-ops/scripts/check_adapter_updates.py
- .claude/skills/adapter-ops/scripts/init_embedding_adapter.py
- .claude/skills/adapter-ops/scripts/init_llm_adapter.py
- .claude/skills/adapter-ops/scripts/manage_models.py
- .gitignore
💤 Files with no reviewable changes (1)
- .gitignore
✅ Files skipped from review due to trivial changes (2)
- .claude/skills/adapter-ops/references/json_schema_guide.md
- .claude/skills/adapter-ops/references/adapter_patterns.md
🧰 Additional context used
🧬 Code graph analysis (1)
.claude/skills/adapter-ops/scripts/manage_models.py (2)
.claude/skills/adapter-ops/scripts/init_embedding_adapter.py (1)
main(532-639)
.claude/skills/adapter-ops/scripts/init_llm_adapter.py (1)
main(542-649)
🪛 LanguageTool
.claude/skills/adapter-ops/SKILL.md
[grammar] ~6-~6: Ensure spelling is correct
Context: ...and self-hosted models (Ollama). --- # Unstract Adapter Extension Skill This skill pro...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
.claude/skills/adapter-ops/references/provider_capabilities.md
45-45: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
57-57: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
63-63: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
70-70: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
76-76: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
82-82: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
88-88: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
138-138: Bare URL used
(MD034, no-bare-urls)
139-139: Bare URL used
(MD034, no-bare-urls)
140-140: Bare URL used
(MD034, no-bare-urls)
141-141: Bare URL used
(MD034, no-bare-urls)
142-142: Bare URL used
(MD034, no-bare-urls)
143-143: Bare URL used
(MD034, no-bare-urls)
144-144: Bare URL used
(MD034, no-bare-urls)
145-145: Bare URL used
(MD034, no-bare-urls)
.claude/skills/adapter-ops/SKILL.md
7-7: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.10)
.claude/skills/adapter-ops/scripts/init_embedding_adapter.py
177-177: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
178-178: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
209-209: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
210-210: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
229-229: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
230-230: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
280-280: subprocess call: check for execution of untrusted input
(S603)
281-293: Starting a process with a partial executable path
(S607)
300-300: Consider moving this statement to an else block
(TRY300)
331-331: Consider moving this statement to an else block
(TRY300)
339-339: Do not catch blind exception: Exception
(BLE001)
360-360: subprocess call: check for execution of untrusted input
(S603)
361-373: Starting a process with a partial executable path
(S607)
380-380: Consider moving this statement to an else block
(TRY300)
406-406: Consider moving this statement to an else block
(TRY300)
413-413: Do not catch blind exception: Exception
(BLE001)
.claude/skills/adapter-ops/scripts/init_llm_adapter.py
183-183: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
184-184: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
215-215: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
216-216: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
235-235: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
236-236: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
286-286: subprocess call: check for execution of untrusted input
(S603)
287-299: Starting a process with a partial executable path
(S607)
306-306: Consider moving this statement to an else block
(TRY300)
337-337: Consider moving this statement to an else block
(TRY300)
345-345: Do not catch blind exception: Exception
(BLE001)
366-366: subprocess call: check for execution of untrusted input
(S603)
367-379: Starting a process with a partial executable path
(S607)
386-386: Consider moving this statement to an else block
(TRY300)
412-412: Consider moving this statement to an else block
(TRY300)
419-419: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (27)
.claude/skills/adapter-ops/SKILL.md (1)
1-350: Excellent comprehensive documentation!This skill documentation is well-structured and provides clear, actionable guidance for adapter management. The workflows are detailed with concrete examples, and the inclusion of troubleshooting, validation checklists, and maintenance procedures makes this a valuable reference.
.claude/skills/adapter-ops/references/provider_capabilities.md (1)
1-145: Well-organized provider capability reference.This reference document provides a clear, tabulated overview of LiteLLM provider capabilities. The feature matrices, parameter listings, and authentication methods are comprehensive and align well with the adapter scaffolding system introduced in this PR.
.claude/skills/adapter-ops/scripts/manage_models.py (9)
27-37: Path resolution looks correct.The path navigation and schema path construction are appropriate for the repository structure.
40-50: Clean JSON I/O functions.The load and save functions are simple and correct. The trailing newline in save_schema is good for git-friendly diffs.
53-61: LGTM: Clear model information extraction.The function safely extracts model configuration with appropriate defaults.
64-89: Robust enum addition logic.The function correctly handles missing structures, avoids duplicates, and sets a sensible default.
92-112: Well-handled model removal with cleanup.The function correctly handles default updates and removes the enum property when empty.
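As a sketch of the add/default behavior the two comments above describe (the function name and exact structure here are assumptions, not the script's actual code):

```python
def add_enum_model(schema: dict, model: str) -> dict:
    """Add a model to the schema's enum, creating missing structure and skipping duplicates."""
    field = schema.setdefault("properties", {}).setdefault("model", {"type": "string"})
    enum = field.setdefault("enum", [])
    if model not in enum:
        enum.append(model)
    field.setdefault("default", enum[0])  # first entry becomes the default
    return schema

schema: dict = {}
add_enum_model(schema, "gpt-4o")
add_enum_model(schema, "gpt-4o")  # duplicate is ignored
```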
115-134: Simple and correct field updates.Both functions properly initialize the schema structure before updating fields.
137-151: `convert_to_enum` provides guidance rather than automatic conversion. This function intentionally directs users to use the `--action add-enum` command rather than automatically converting to an enum. This is reasonable since the script can't know which models should be in the enum list. The function name could be clearer about this behavior (e.g., `guide_to_enum`), but the current implementation is acceptable.
154-166: Correct enum-to-freetext conversion.The function properly preserves the default value (using the first enum value as fallback if no explicit default exists) and removes the enum constraint.
169-285: Well-structured CLI with proper action dispatch.The main function provides a clean CLI interface with appropriate error checking, action dispatch, and dry-run support. The diff-based update approach ensures changes are only written when necessary.
.claude/skills/adapter-ops/scripts/check_adapter_updates.py (6)
26-239: Comprehensive LiteLLM features catalog.The LITELLM_FEATURES dictionary provides a well-structured reference for known parameters and capabilities across providers. As noted in the maintenance workflow documentation, this catalog should be updated periodically when LiteLLM adds new features.
242-253: Flexible schema loading with pattern matching.The function correctly handles multiple filename patterns (with and without underscores) to accommodate different provider naming conventions.
256-269: Correctly extracts properties from JSON Schema. The function handles both standard properties and conditional properties defined in `allOf` blocks, which is appropriate for schemas with feature-gated fields.
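The extraction pattern being praised can be sketched like this (an assumed shape; the script's actual function may differ):

```python
def extract_properties(schema: dict) -> set:
    """Collect property names from top-level 'properties' plus conditional allOf/then blocks."""
    names = set(schema.get("properties", {}))
    for clause in schema.get("allOf", []):
        names |= set(clause.get("then", {}).get("properties", {}))
    return names
```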
272-364: Thorough adapter analysis with alias handling. The `analyze_adapter` function provides comprehensive checking for missing parameters, reasoning/thinking support, and outdated models. The alias filtering (lines 310-323) is particularly valuable for preventing false positives when parameters have alternative names (e.g., `api_base` vs `base_url`).
367-423: Clean adapter discovery and reporting.The list_adapters function correctly identifies Python files, and print_report provides a well-formatted, color-coded summary with actionable information.
426-469: Effective CLI with CI/CD-friendly exit codes.The main function provides flexible filtering options (adapter type, specific provider) and returns exit code 1 when updates are needed, which is useful for automated checks.
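A CI job can key off that exit code; a minimal shell pattern (the real invocation would be the script from this PR, e.g. `python .claude/skills/adapter-ops/scripts/check_adapter_updates.py`; a stand-in function is used here so the snippet is self-contained):

```shell
# Stand-in for the real checker, which returns exit code 1 when updates are needed.
check_adapters() { return 1; }

if check_adapters; then
  echo "adapters up to date"
else
  echo "adapter updates needed"
fi
```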
.claude/skills/adapter-ops/assets/templates/embedding_schema.json.template (1)
1-52: Well-structured embedding schema template. The template provides appropriate defaults and validation constraints. Good use of `format: "password"` for the API key and `format: "uri"` for the API base URL to guide UI rendering.
1-58: Solid LLM schema template with appropriate defaults.The template provides sensible defaults (15-minute timeout, 5 max retries) and proper validation constraints for LLM configuration parameters.
.claude/skills/adapter-ops/assets/templates/llm_adapter.py.template (1)
1-59: Clean adapter template structure.The template provides a well-organized adapter class skeleton with all required static methods. The placeholder system is clear and documented.
.claude/skills/adapter-ops/assets/templates/embedding_parameters.py.template (1)
1-57: Well-structured embedding parameter template.The template provides a clean parameter class with proper validation hooks. The comments about field mapping (lines 40-42) and the note that embedding models typically don't need prefixes (line 49) are helpful guidance for developers using this template.
.claude/skills/adapter-ops/assets/templates/llm_parameters.py.template (1)
1-61: LGTM! Template follows established patterns. The template correctly implements:
- Idempotent model prefixing in `validate_model`
- Pydantic validation flow via `model_dump()`
- Clear placeholder documentation
.claude/skills/adapter-ops/assets/templates/embedding_adapter.py.template (1)
1-50: Template structure looks good.The adapter template correctly implements the required static methods for adapter registration and metadata. The inheritance pattern and method signatures align with existing adapters.
Also applies to: 55-59
.claude/skills/adapter-ops/scripts/init_llm_adapter.py (2)
460-467: Verify that `AnyscaleLLMParameters` is the intended default. When neither `--add-param-class` nor `--use-param-class` is specified, the adapter defaults to `AnyscaleLLMParameters`. Confirm this is intentional, or consider using a more generic base class like `BaseChatCompletionParameters`.
34-145: Implementation looks solid.The script provides good developer ergonomics:
- Clear dry-run mode
- Structured result with errors/warnings separation
- Helpful next steps guidance
- Safe handling of existing files
Also applies to: 423-539, 542-649
.claude/skills/adapter-ops/scripts/init_embedding_adapter.py (2)
454-459: Default parameter class is appropriate. Using `OpenAIEmbeddingParameters` as the default is reasonable since OpenAI is a common embedding provider and likely has the most generic parameter set.
34-124: Embedding adapter scaffolding implementation looks correct. The script properly handles:
- Embedding-specific schema fields (`embed_batch_size`, `api_base`)
- Correct target directories (`embedding1/` vs `llm1/`)
- Appropriate default parameter class
Also applies to: 417-529, 532-643
```python
@staticmethod
def validate_model(adapter_metadata: dict[str, "Any"]) -> str:
    model = adapter_metadata.get("model", "")
    return model
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# First, find the file
find . -name "init_embedding_adapter.py" -type f 2>/dev/null | head -20
```

Repository: Zipstack/unstract
Length of output: 123

🏁 Script executed:

```bash
cat -n ./.claude/skills/adapter-ops/scripts/init_embedding_adapter.py | head -200
```

Repository: Zipstack/unstract
Length of output: 7865

🏁 Script executed:

```bash
grep -n "LLM_ADAPTER_TEMPLATE\|validate_model" ./.claude/skills/adapter-ops/scripts/init_embedding_adapter.py | head -30
```

Repository: Zipstack/unstract
Length of output: 218

🏁 Script executed:

```bash
find . -name "*init*llm*adapter*" -o -name "*init*completion*" 2>/dev/null | grep -i script
```

Repository: Zipstack/unstract
Length of output: 117

🏁 Script executed:

```bash
grep -A 10 "def validate_model" ./.claude/skills/adapter-ops/scripts/init_llm_adapter.py
```

Repository: Zipstack/unstract
Length of output: 385

🏁 Script executed:

```bash
cat -n ./.claude/skills/adapter-ops/scripts/init_llm_adapter.py | sed -n '126,145p'
```

Repository: Zipstack/unstract
Length of output: 914
validate_model should add provider prefix to model name.
The embedding adapter returns the model unchanged, but LiteLLM embedding models require provider prefixes (e.g., openai/text-embedding-ada-002) just like completion models. Update validate_model to conditionally prepend the provider prefix, matching the LLM adapter's implementation.
🤖 Prompt for AI Agents
In .claude/skills/adapter-ops/scripts/init_embedding_adapter.py around lines
139-143, validate_model currently returns the raw model string; update it to
fetch the provider from adapter_metadata (e.g., adapter_metadata.get("provider",
"")) and, if a provider is present and the model does not already include a
provider prefix (no "/" in model), prepend the provider followed by "/" to the
model before returning it; otherwise return the model unchanged — mirror the
conditional prefixing logic used by the LLM adapter so embedding models become
e.g. "openai/text-embedding-ada-002".
```python
def fetch_url(url: str, timeout: int = 10) -> bytes | None:
    """Fetch content from URL with error handling."""
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        }
        request = Request(url, headers=headers)
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except (URLError, HTTPError, TimeoutError):
        return None
```
URL scheme validation missing (same as LLM script).
The fetch_url function and search_potential_logo_sources use urlopen without validating URL schemes. Consider restricting to http:// and https:// to prevent file:// access.
🔎 Proposed fix
```diff
 def fetch_url(url: str, timeout: int = 10) -> bytes | None:
     """Fetch content from URL with error handling."""
+    from urllib.parse import urlparse
+
+    parsed = urlparse(url)
+    if parsed.scheme not in ("http", "https"):
+        print(f"  Error: Only http/https URLs are supported, got: {parsed.scheme}")
+        return None
+
     try:
         headers = {
```
🧰 Tools
🪛 Ruff (0.14.10)
177-177: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
178-178: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
🤖 Prompt for AI Agents
In .claude/skills/adapter-ops/scripts/init_embedding_adapter.py around lines 171
to 181, fetch_url currently calls urlopen without validating the URL scheme
which allows unsafe schemes like file://; update the function to parse and
validate the URL scheme (allow only 'http' and 'https') before creating the
Request and calling urlopen, returning None immediately for invalid or missing
schemes; use urllib.parse.urlparse to check scheme, and keep existing error
handling otherwise.
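The scheme check itself is small enough to show standalone (a sketch using only the standard library):

```python
from urllib.parse import urlparse

def is_fetchable(url: str) -> bool:
    """Allow only http/https URLs, rejecting file:// and custom schemes."""
    return urlparse(url).scheme in ("http", "https")
```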
Actionable comments posted: 1
🧹 Nitpick comments (2)
.claude/skills/adapter-ops/scripts/check_adapter_updates.py (2)
242-254: Add error handling for JSON parsing. If a schema file exists but contains malformed JSON, `json.load(f)` will raise a `JSONDecodeError`, causing the script to crash. Consider catching and handling this error to provide a clearer error message.

🔎 Proposed fix

```diff
 def load_json_schema(adapter_type: str, provider: str) -> dict | None:
     """Load JSON schema for an adapter."""
     schema_dir = SDK1_ADAPTERS / f"{adapter_type}1" / "static"
     # Try common filename patterns
     for filename in [f"{provider}.json", f"{provider.replace('_', '')}.json"]:
         schema_path = schema_dir / filename
         if schema_path.exists():
-            with open(schema_path) as f:
-                return json.load(f)
+            try:
+                with open(schema_path) as f:
+                    return json.load(f)
+            except json.JSONDecodeError as e:
+                print(f"Warning: Failed to parse {schema_path}: {e}", file=sys.stderr)
+                return None
     return None
```
305-307: Add clarifying comment for "adapter_name" exclusion. The exclusion of `"adapter_name"` from the missing parameters check is not explained. A brief comment would improve code clarity.

🔎 Proposed fix

```diff
 # Find missing parameters
 known_params = set(features.get("known_params", []))
+# adapter_name is always present in schemas but not a LiteLLM param
 missing = (
-    known_params - current_props - {"adapter_name"}
-)  # adapter_name is always present
+    known_params - current_props - {"adapter_name"}
+)
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
.claude/skills/adapter-ops/scripts/check_adapter_updates.py
🧰 Additional context used
🧬 Code graph analysis (1)
.claude/skills/adapter-ops/scripts/check_adapter_updates.py (2)
.claude/skills/adapter-ops/scripts/init_embedding_adapter.py (1)
main(532-639)
.claude/skills/adapter-ops/scripts/init_llm_adapter.py (1)
main(542-649)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (6)
.claude/skills/adapter-ops/scripts/check_adapter_updates.py (6)
1-24: LGTM!The imports and path resolution are clean and consistent with the related adapter initialization scripts.
256-270: LGTM! The function correctly extracts properties from both direct `properties` and conditional `allOf`/`then`/`properties` sections of JSON schemas.
310-324: LGTM!The alias filtering logic correctly handles common parameter name variations across providers, preventing false positives in the missing parameters report.
367-380: LGTM!The function correctly discovers adapters by scanning the filesystem and filtering out Python internal files.
382-424: LGTM!The report formatting is clear and user-friendly, with good visual organization using emoji indicators and grouped output.
426-469: LGTM!The CLI interface is well-designed with appropriate options for filtering and output formatting. The exit code behavior (returning 1 if updates are needed) enables integration with CI/CD pipelines.
| "vertex_ai": { | ||
| "known_params": [ | ||
| "json_credentials", | ||
| "vertex_credentials", | ||
| "project", | ||
| "vertex_project", | ||
| "model", | ||
| "max_tokens", | ||
| "max_retries", | ||
| "timeout", | ||
| "temperature", | ||
| "safety_settings", | ||
| "enable_thinking", | ||
| "budget_tokens", | ||
| "thinking", | ||
| "reasoning_effort", | ||
| "tools", | ||
| "googleSearch", | ||
| ], | ||
| "thinking_models": ["gemini-2.5-flash-preview", "gemini-2.5-pro"], | ||
| "docs_url": "https://docs.litellm.ai/docs/providers/vertex", | ||
| }, |
Fix naming inconsistency for Vertex AI provider.
The provider is named "vertex_ai" in the LLM section (line 108) but "vertexai" in the embedding section (line 214). This inconsistency could cause lookup failures in analyze_adapter when the function retrieves features using LITELLM_FEATURES.get(adapter_type, {}).get(provider, {}) (line 296).
🔎 Proposed fix
Standardize on one naming convention. Based on the typical provider naming pattern, use "vertex_ai":
- "vertexai": {
+ "vertex_ai": {
"known_params": [Also applies to: 214-227
🤖 Prompt for AI Agents
.claude/skills/adapter-ops/scripts/check_adapter_updates.py around lines 108-129
and 214-227: there is a naming mismatch where the LLM section uses "vertex_ai"
but the embedding section uses "vertexai", causing lookup failures in
analyze_adapter; change the embedding section provider name from "vertexai" to
"vertex_ai" (and ensure any other occurrences in this file use the same
"vertex_ai" key) so LITELLM_FEATURES lookups are consistent.
chandrasekharan-zipstack
left a comment
LGTM - check the code rabbit comments in case they're valid
```
.prompting/

# Claude
CLAUDE.md
```
@hari-kuriakose we should commit a common CLAUDE.md that all of us can use. This might cause our local versions to be tracked in git
gaya3-zipstack
left a comment
looks good
Test Results

Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
adapter-ops)

Why
How
Skill Structure:
- SKILL.md: Comprehensive documentation with workflows and quick reference
- scripts/init_llm_adapter.py: Automated LLM adapter creation with logo support
- scripts/init_embedding_adapter.py: Automated embedding adapter creation
- scripts/manage_models.py: Add/remove models from existing adapters
- scripts/check_adapter_updates.py: Compare adapters against LiteLLM features
- templates/: Jinja2 templates for adapter files
- references/: Pattern documentation and JSON schema guide

Azure AI Foundry Adapter:
- llm1/azure_ai_foundry.py
- llm1/static/azure_ai_foundry.json
- AzureAIFoundryLLMParameters in base1.py
- frontend/public/icons/adapter-icons/AzureAIFoundry.png

Existing Adapter Updates:
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No breaking changes expected:
.claude/skills/directory)Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A (backend changes only)
Checklist
I have read and understood the Contribution Guidelines.