UN-2563 [FEAT] Unstract autogen client #1434
base: main
Conversation
* Add sdk1 to unstract common ns
* Update backend deps
* Fix dev essentials container names
* Update sample compose override
* Refactor provider args validation
* [feat] add file storage, tool, util, platform, prompt modules in sdk1
* [feat] integrate sdk1 in backend under feature flag
* feat: add x2txt adapter to sdk1
* feat: add adapter and tools helpers to sdk1
* feat: add llm no-op adapter in sdk1
* feat: add utils in sdk1
Signed-off-by: Deepak <89829542+Deepak-Kesavan@users.noreply.github.com>
Summary by CodeRabbit
Walkthrough

Introduces a new Python package "unstract-autogen-client" with project config, documentation, core client implementation, exceptions, helpers, utilities, and tests. The UnstractAutoGenClient wraps an Unstract LLM adapter to offer ChatCompletionClient-compatible sync/async calls, streaming, usage tracking, token counting, and structured error handling. Adds a simple agent wrapper and convenience processing functions.
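For orientation, a hedged usage sketch of the surface this walkthrough describes; the class, helper, and method names come from this PR, while the constructor signature, adapter wiring, and the get_sdk1_llm_adapter helper are assumptions:

```python
"""Sketch only: assumes a configured unstract-sdk1 LLM adapter is available."""
import asyncio

from unstract.autogen_client import UnstractAutoGenClient, create_simple_autogen_agent


def get_sdk1_llm_adapter():
    """Hypothetical helper: return a configured unstract-sdk1 LLM adapter here."""
    raise NotImplementedError("wire up an unstract-sdk1 LLM adapter")


async def main() -> None:
    llm_adapter = get_sdk1_llm_adapter()

    # ChatCompletionClient-compatible async call.
    client = UnstractAutoGenClient(llm_adapter=llm_adapter)
    result = await client.create([{"role": "user", "content": "Summarize this document."}])
    print(result.content, result.usage)

    # Or drive multi-turn use through the simplified agent wrapper.
    agent = create_simple_autogen_agent(llm_adapter=llm_adapter)
    print(await agent.process_message("Summarize this document."))


asyncio.run(main())
```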
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor App
    participant Client as UnstractAutoGenClient
    participant Adapter as Unstract LLM Adapter
    App->>Client: create(messages, options)
    activate Client
    Client->>Client: normalize_messages / build params
    Client->>Adapter: complete(params)
    Adapter-->>Client: response
    Client->>Client: extract_content / usage / finish_reason
    Client-->>App: CreateResult (content, usage, finish_reason)
    deactivate Client
```

```mermaid
sequenceDiagram
    autonumber
    actor App
    participant Agent as SimpleAutoGenAgent
    participant Client as UnstractAutoGenClient
    participant Adapter as Unstract LLM Adapter
    App->>Agent: process_message(message, context?)
    Agent->>Agent: compose history + system + user
    Agent->>Client: create(messages)
    Client->>Adapter: complete(params)
    Adapter-->>Client: response
    Client-->>Agent: content + usage + finish_reason
    Agent->>Agent: update/truncate history
    Agent-->>App: result
```

```mermaid
sequenceDiagram
    autonumber
    actor App
    participant Client as UnstractAutoGenClient
    participant Adapter as Unstract LLM Adapter
    App->>Client: create_stream(messages)
    activate Client
    Client->>Adapter: complete_stream(params)
    loop stream chunks
        Adapter-->>Client: token/partial content
        Client-->>App: chunk
    end
    Client-->>App: final aggregated CreateResult
    deactivate Client
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
feat: unified embedding interface powered by litellm
- Move audit.py and usage_handler.py from utils/ to parent directory
- Remove re-exports from utils/__init__.py to eliminate circular dependencies
- Update imports across codebase to be explicit:
  * Direct imports from utils.common for Utils and log_elapsed
  * Direct import from audit module for Audit class
  * Direct import from utils.tool for ToolUtils
- Fix imports in backend services, prompt-service, and tools

Files moved:
- unstract/sdk1/utils/audit.py → unstract/sdk1/audit.py
- unstract/sdk1/utils/usage_handler.py → unstract/sdk1/usage_handler.py
- Rename embedding/ to embedding1/ for all adapter types
- Rename llm/ to llm1/ directory structure
- Add new base1.py classes for adapter foundations
- Update backend adapter processor to support new structure
- Update import paths and references in embedding.py and llm.py
feat: bump litellm version
chore: generate uv lock
* Add v1 embedding and llm adapters
* Add unified litellm powered embedding and llm interfaces
* Fix internal v1 imports
for more information, see https://pre-commit.ci
…LLMs
- Update litellm dependency from v1.74.7 to v1.76.0 in pyproject.toml
- Add new uv.lock file for sdk1 package dependency management
- Update backend and prompt-service uv.lock files for dependency consistency
- Fixed Vertex AI LLMs connector not working
- Modified params to include thinking capabilities for Bedrock LLM
- Update OpenAI adapter configuration schema (previously it was the schema for OpenAI embedding)
- Split BaseParameters into BaseChatCompletionParameters and BaseEmbeddingParameters
- Rename OpenAIParameters to OpenAILLMParameters for clarity
- Add OpenAIEmbeddingParameters for embedding-specific configurations
- Update all embedding adapters to use new parameter classes (*provider*EmbeddingParameters)
- Remove batch_embed_size from bedrock.json configuration file since Bedrock doesn't support it
- Improve embedding.py with better parameter handling and litellm.drop_params
- Add fallback for platform API key when tool is not available
- Fix default parameter handling in EmbeddingCompat class
…lugin
- Fix LLM response structure compatibility by returning LLMResponseCompat objects
- Merge usage_kwargs into platform_kwargs for proper audit/usage tracking
- Add adapter_instance_id to platform_kwargs similar to original SDK
- Update test_connection method to use LLMResponseCompat.text attribute
- Ensure usage data (run_id, llm_usage_reason, adapter_instance_id) is properly passed to audit system

This resolves TypeError and AttributeError issues when using SDK1 with plugins that expect unstract-sdk compatible response structures and usage tracking.
…index)
- Added get_model_name method in sdk1 llm.py
- Moved model_name sanitization from cost_calculation.py to audit.py (sdk1)
Signed-off-by: Hari John Kuriakose <hari@zipstack.com>
…unstract-autogen-client
Force-pushed from ebb1b8a to 138886d
…act-autogen-client

# Conflicts:
# backend/prompt_studio/prompt_studio_core_v2/serializers.py
# backend/pyproject.toml
# backend/uv.lock
# backend/workflow_manager/endpoint_v2/source.py
# platform-service/uv.lock
# prompt-service/src/unstract/prompt_service/controllers/answer_prompt.py
# prompt-service/src/unstract/prompt_service/helpers/usage.py
# prompt-service/src/unstract/prompt_service/services/answer_prompt.py
# prompt-service/src/unstract/prompt_service/services/extraction.py
# prompt-service/uv.lock
# tools/classifier/src/helper.py
# unstract/sdk1/pyproject.toml
# unstract/sdk1/src/unstract/sdk1/adapters/adapterkit.py
# unstract/sdk1/src/unstract/sdk1/adapters/base.py
# unstract/sdk1/src/unstract/sdk1/adapters/base1.py
# unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py
# unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/openai.json
# unstract/sdk1/src/unstract/sdk1/adapters/ocr/google_document_ai/src/google_document_ai.py
# unstract/sdk1/src/unstract/sdk1/adapters/ocr/ocr_adapter.py
# unstract/sdk1/src/unstract/sdk1/adapters/registry.py
# unstract/sdk1/src/unstract/sdk1/adapters/vectordb/milvus/src/milvus.py
# unstract/sdk1/src/unstract/sdk1/adapters/vectordb/no_op/src/no_op_custom_vectordb.py
# unstract/sdk1/src/unstract/sdk1/adapters/vectordb/no_op/src/no_op_vectordb.py
# unstract/sdk1/src/unstract/sdk1/adapters/x2text/helper.py
# unstract/sdk1/src/unstract/sdk1/adapters/x2text/llama_parse/src/llama_parse.py
# unstract/sdk1/src/unstract/sdk1/adapters/x2text/llm_whisperer_v2/src/helper.py
# unstract/sdk1/src/unstract/sdk1/cache.py
# unstract/sdk1/src/unstract/sdk1/embedding.py
# unstract/sdk1/src/unstract/sdk1/file_storage/env_helper.py
# unstract/sdk1/src/unstract/sdk1/file_storage/helper.py
# unstract/sdk1/src/unstract/sdk1/llm.py
# unstract/sdk1/src/unstract/sdk1/tool/cache.py
# unstract/sdk1/src/unstract/sdk1/tool/validator.py
# unstract/sdk1/src/unstract/sdk1/utils/common.py
# unstract/sdk1/src/unstract/sdk1/utils/metrics_mixin.py
# unstract/sdk1/src/unstract/sdk1/vector_db.py
# unstract/sdk1/uv.lock
for more information, see https://pre-commit.ci
Actionable comments posted: 13
🧹 Nitpick comments (9)
unstract/autogen-client/README.md (1)
174-176: Format URLs as Markdown links.

The bare URLs trigger markdown linting warnings. Format them as proper Markdown links for better presentation.
Apply this diff:
```diff
-- GitHub Issues: https://github.com/Zipstack/unstract/issues
-- Documentation: https://docs.unstract.com
-- Email: support@unstract.com
+- [GitHub Issues](https://github.com/Zipstack/unstract/issues)
+- [Documentation](https://docs.unstract.com)
+- Email: [support@unstract.com](mailto:support@unstract.com)
```

unstract/autogen-client/src/unstract/autogen_client/utils.py (1)
78-92: Token estimation is overly simplistic.

The word-based splitting doesn't account for:
- Sub-word tokens common in modern tokenizers
- Special characters and punctuation
- Language-specific tokenization rules
This could lead to significant under/over-estimation for token limits and usage tracking.
Consider using tiktoken for OpenAI models or providing a configurable tokenizer (note: total_content must be computed before the try block so the fallback also works):

```python
def estimate_token_count(
    messages: list[dict[str, str]], encoding_name: str = "cl100k_base"
) -> int:
    """Estimate token count using tiktoken, falling back to word counting."""
    total_content = " ".join(
        msg.get("content", "")
        for msg in messages
        if isinstance(msg.get("content"), str)
    )
    try:
        import tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(total_content))
    except ImportError:
        # Fallback to word-based estimation
        return max(1, len(total_content.split()))
```

unstract/autogen-client/pyproject.toml (1)
48-48: Consider updating pytest version constraint.

The constraint pytest>=7.0.0 is broad and allows any 7.x or 8.x version. As of September 2025, pytest 8.4.2 is the current stable release with important bug fixes. Consider tightening the constraint for CI reproducibility.

Based on learnings

```diff
-    "pytest>=7.0.0",
+    "pytest>=8.4.0,<9.0.0",
```

This ensures you get the latest pytest 8.x fixes while avoiding unexpected breaking changes from pytest 9.x.
unstract/autogen-client/tests/test_utils.py (1)
117-124: Misleading test name.

The test name says "no message" but the test only deletes the message attribute (line 121) rather than testing a scenario where choices[0] has no message attribute from the start. The current test may not properly validate the error handling.

Consider using a more explicit mock setup:

```diff
 def test_extract_content_no_message(self) -> None:
     """Test content extraction with no message."""
     response = Mock()
-    response.choices = [Mock()]
-    del response.choices[0].message
+    response.choices = [Mock(spec=[])]  # Mock with no attributes
     content = extract_content(response)
     assert content == ""
```

unstract/autogen-client/tests/test_client.py (2)
64-81: Simplify the mock setup for a missing completion method.

The test correctly validates the error case, but the delattr pattern with a hasattr guard is verbose.

Apply this diff for a cleaner approach:

```diff
-    # Test that adapter must have completion method
-    mock_adapter = Mock()
-    delattr(mock_adapter, "completion") if hasattr(
-        mock_adapter, "completion"
-    ) else None
+    # Test that adapter must have completion method
+    mock_adapter = Mock(spec=[])  # Empty spec, no attributes
     with pytest.raises(
         UnstractConfigurationError,
         match="llm_adapter must have a 'completion' method",
```
128-152: Remove unused variable assignment.

The test assigns result on line 140 but never uses it. The test only verifies the adapter's call arguments.

Apply this diff:

```diff
-    result = await client.create(messages)
+    await client.create(messages)
```

unstract/autogen-client/src/unstract/autogen_client/client.py (2)
199-200: Add exception chaining for better debugging.

Preserve the original stack trace by using raise ... from e to help with debugging.

Apply this diff:

```diff
 except Exception as e:
-    raise self._handle_exception(e)
+    raise self._handle_exception(e) from e
```
292-293: Add exception chaining for better debugging.

Same as in the create method, preserve the original stack trace.

Apply this diff:

```diff
 except Exception as e:
-    raise self._handle_exception(e)
+    raise self._handle_exception(e) from e
```

unstract/autogen-client/src/unstract/autogen_client/helper.py (1)
115-117: Preserve exception chain for better debugging.

When re-raising exceptions, use raise ... from e to preserve the original exception chain, which aids debugging by showing the full stack trace.

Apply this diff:

```diff
 except Exception as e:
     logger.error(f"Error processing message: {str(e)}")
-    raise UnstractCompletionError(f"Failed to process message: {str(e)}")
+    raise UnstractCompletionError(f"Failed to process message: {str(e)}") from e
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (9)
- unstract/autogen-client/README.md (1 hunks)
- unstract/autogen-client/pyproject.toml (1 hunks)
- unstract/autogen-client/src/unstract/autogen_client/__init__.py (1 hunks)
- unstract/autogen-client/src/unstract/autogen_client/client.py (1 hunks)
- unstract/autogen-client/src/unstract/autogen_client/exceptions.py (1 hunks)
- unstract/autogen-client/src/unstract/autogen_client/helper.py (1 hunks)
- unstract/autogen-client/src/unstract/autogen_client/utils.py (1 hunks)
- unstract/autogen-client/tests/test_client.py (1 hunks)
- unstract/autogen-client/tests/test_utils.py (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
unstract/autogen-client/README.md
174-174: Bare URL used
(MD034, no-bare-urls)
175-175: Bare URL used
(MD034, no-bare-urls)
176-176: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.13.1)
unstract/autogen-client/tests/test_client.py
140-140: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
unstract/autogen-client/src/unstract/autogen_client/client.py
200-200: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
293-293: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
unstract/autogen-client/src/unstract/autogen_client/helper.py
117-117: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (26)
unstract/autogen-client/src/unstract/autogen_client/exceptions.py (2)
4-21: LGTM! Well-designed exception hierarchy.

The base exception class properly captures both the error message and original exception, and the __str__ method provides clear error context. The design follows Python exception best practices.
23-50: LGTM! Specific exception types cover key error scenarios.

The five specific exception types (Configuration, Completion, Connection, Timeout, Validation) provide granular error handling and align with the error handling examples in the README.
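For readers without the diff view, a minimal sketch of the hierarchy these comments describe; the subclass names appear elsewhere in this review, while the base class name and method bodies are assumptions:

```python
class UnstractAutoGenError(Exception):
    """Assumed base class: keeps the message and the original exception."""

    def __init__(self, message: str, original_exception: Exception | None = None) -> None:
        super().__init__(message)
        self.message = message
        self.original_exception = original_exception

    def __str__(self) -> str:
        # Surface the wrapped cause alongside the message for clearer error context.
        if self.original_exception is not None:
            return f"{self.message} (caused by {self.original_exception!r})"
        return self.message


class UnstractConfigurationError(UnstractAutoGenError): ...
class UnstractCompletionError(UnstractAutoGenError): ...
class UnstractConnectionError(UnstractAutoGenError): ...
class UnstractTimeoutError(UnstractAutoGenError): ...
class UnstractValidationError(UnstractAutoGenError): ...
```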
unstract/autogen-client/pyproject.toml (1)
69-69: LGTM! Local development setup for sdk1.

The local editable path dependency for unstract-sdk1 is appropriate for monorepo development and aligns with the PR objective that states this depends on PR #1482 (Add SDK v1 into unstract package namespace).
1-31: LGTM! Clean public API definition.

The package exports are well-organized, with a clear version string, comprehensive imports, and a matching __all__ declaration. The API surface exposes the core client, exception hierarchy, and helper utilities.

unstract/autogen-client/tests/test_client.py (7)
22-50: LGTM! Well-structured test fixtures.

The fixtures provide clean mocks for the adapter, response, and client. Disabling retries in the test client simplifies test scenarios.
83-103: LGTM! Model info and capabilities tests are thorough.

The tests correctly verify the model_info property and capabilities() method, confirming the expected structure and values.
105-126: LGTM! Comprehensive create completion test.

The test properly verifies the response structure, usage tracking, and adapter call with normalized messages.
154-176: LGTM! Error handling tests are well-structured.

Both tests correctly verify exception handling for adapter failures and closed client state.
178-211: LGTM! Token counting tests cover key scenarios.

The tests properly verify token estimation, unknown max handling, and remaining token calculation with known limits.
213-243: LGTM! Usage tracking tests validate accumulation logic.

The tests correctly verify that total_usage() accumulates across requests while actual_usage() reflects only the last request.
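To make those accumulation semantics concrete, here is a minimal sketch of a tracker with the behavior these tests assert; the RequestUsage fields come from autogen-core, while the class and its internals are illustrative, not the client's actual code:

```python
from autogen_core.models import RequestUsage


class UsageTrackerSketch:
    """Illustrative tracker: total accumulates, actual reflects only the last call."""

    def __init__(self) -> None:
        self._total = RequestUsage(prompt_tokens=0, completion_tokens=0)
        self._last = RequestUsage(prompt_tokens=0, completion_tokens=0)

    def record(self, usage: RequestUsage) -> None:
        self._last = usage
        self._total = RequestUsage(
            prompt_tokens=self._total.prompt_tokens + usage.prompt_tokens,
            completion_tokens=self._total.completion_tokens + usage.completion_tokens,
        )

    def total_usage(self) -> RequestUsage:
        # Accumulates across all requests.
        return self._total

    def actual_usage(self) -> RequestUsage:
        # Reflects only the most recent request.
        return self._last
```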
245-294: LGTM! Streaming test validates chunk aggregation and final result.

The test properly mocks streaming responses, verifies that chunks are yielded as strings, and confirms the final CreateResult contains aggregated content and usage data.

unstract/autogen-client/src/unstract/autogen_client/client.py (8)
91-107: Initialization logic is sound.

The instance variable setup, default model info assignment, and usage tracking initialization are well-structured.
109-127: LGTM! Model info and capabilities are correctly implemented.

The property and method provide appropriate access to model information and capability flags.
129-197: LGTM! Create completion logic is well-structured.

The method properly handles message normalization, retry logic, response extraction, and usage tracking.
264-268: Verify if a second call is necessary for usage data.

The code makes a second non-streaming call to retrieve usage information. If the streaming response already includes usage data in the final chunk, this extra call could be avoided.
Please verify whether the Unstract adapter's streaming responses include usage data in the final chunk. If they do, extract it directly instead of making a second call.
295-340: LGTM! Token counting and usage tracking methods are robust.

The methods handle edge cases well, including a fallback for token counting and proper handling of unknown max tokens.
342-346: LGTM! Clean resource cleanup.

The close method properly sets the closed state and logs the event.
444-472: LGTM! Retry logic with exponential backoff is well-implemented.

The method properly implements retry logic with exponential backoff and appropriate logging.
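For reference, a generic sketch of the retry-with-exponential-backoff shape being praised here; the delay constants, logging, and broad except clause are assumptions, not the file's exact implementation:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def retry_with_backoff(func, max_retries: int = 3, base_delay: float = 1.0):
    """Retry an async callable, doubling the delay after each failed attempt."""
    last_exc: Exception | None = None
    for attempt in range(max_retries + 1):
        try:
            return await func()
        except Exception as exc:  # real code would narrow this to retryable errors
            last_exc = exc
            if attempt == max_retries:
                break
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            logger.warning(
                "Attempt %d failed (%s); retrying in %.1fs", attempt + 1, exc, delay
            )
            await asyncio.sleep(delay)
    assert last_exc is not None  # loop ran at least once
    raise last_exc
```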
474-509: LGTM! Comprehensive error mapping and handling.

The method maps common error patterns to specific exception types, providing clear error classification.
unstract/autogen-client/src/unstract/autogen_client/helper.py (7)
1-15: LGTM! Module structure is clean.

The imports are minimal and appropriate, and the module docstring clearly explains the purpose of providing AutoGen functionality without requiring AutoGen dependencies.
18-53: LGTM! Well-structured initialization.

The constructor properly initializes all components with sensible defaults, and the parameters are clearly documented.
119-142: Synchronous wrapper pattern is acceptable.

The pattern of creating a new event loop for each synchronous call is safe and ensures proper cleanup. Note that this will fail if called from within an existing async context, but that's acceptable for a synchronous API.
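A minimal sketch of the new-event-loop-per-call pattern being described; the agent and method names are borrowed from this PR, while the loop handling shown is standard asyncio rather than the file's exact code:

```python
import asyncio


def process_message_sync(agent, message: str) -> str:
    """Run the async API from synchronous code by owning a fresh event loop."""
    loop = asyncio.new_event_loop()
    try:
        asyncio.set_event_loop(loop)
        # Raises RuntimeError if called from inside an already-running loop,
        # which matches the limitation noted in the comment above.
        return loop.run_until_complete(agent.process_message(message))
    finally:
        # Always close the loop so repeated sync calls don't leak resources.
        asyncio.set_event_loop(None)
        loop.close()
```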
144-150: LGTM! Clean resource management.

Both methods properly manage their respective resources: clearing history and delegating cleanup to the underlying client.
153-168: LGTM! Straightforward factory function.

The factory properly forwards all arguments to the constructor and provides a clean API for creating agent instances.
171-224: LGTM! Proper resource management.

The function correctly uses try/finally to ensure the client is closed, and the logic for building messages is consistent with the agent implementation.
227-262: LGTM! Consistent synchronous wrapper.

The synchronous wrapper follows the same safe pattern as process_message, ensuring proper event loop lifecycle management.
```toml
dependencies = [
    "autogen-core>=0.4.0",
    "pydantic>=2.0.0",
    "typing-extensions>=4.0.0",
    "unstract-sdk1",
]
```
🧩 Analysis chain
Verify autogen-core compatibility.
The dependency specifies autogen-core>=0.4.0 without an upper bound. Given that this is a client implementation of AutoGen's interface, breaking changes in future AutoGen releases could affect compatibility.
🏁 Script executed:
```shell
#!/bin/bash
# Check latest autogen-core version and recent changes
pip index versions autogen-core 2>/dev/null | head -20

# Check if there are any known breaking changes or deprecations
gh api graphql -f query='
{
  repository(owner: "microsoft", name: "autogen") {
    releases(first: 5, orderBy: {field: CREATED_AT, direction: DESC}) {
      nodes {
        name
        tagName
        description
        createdAt
      }
    }
  }
}'
```

Length of output: 13795
Add an upper bound to autogen-core dependency
In unstract/autogen-client/pyproject.toml, change

```toml
"autogen-core>=0.4.0",
```

to

```toml
"autogen-core>=0.4.0,<1.0.0",
```

to guard against breaking changes in future major releases.
🤖 Prompt for AI Agents
In unstract/autogen-client/pyproject.toml around lines 36 to 41, the
autogen-core dependency is unbounded above; update the dependency spec from
"autogen-core>=0.4.0" to include an upper bound to prevent accidental breaks
from future major releases (for example "autogen-core>=0.4.0,<1.0.0"), and save
the file so the new constrained version range is used by tooling.
```markdown
## Requirements

- Python 3.8+
```
Inconsistent Python version requirement.
The README states Python 3.8+ is required, but the pyproject.toml (line 11) specifies requires-python = ">=3.10". Update the README to match the actual requirement.
Apply this diff:
```diff
-- Python 3.8+
+- Python 3.10+
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
- Python 3.10+
```
🤖 Prompt for AI Agents
In unstract/autogen-client/README.md around line 159, the README currently
states "Python 3.8+" which conflicts with pyproject.toml's requires-python =
">=3.10"; update the README text to "Python 3.10+" (or equivalent wording like
"Python >=3.10") so the documented requirement matches pyproject.toml; ensure no
other README references to Python version remain inconsistent.
```python
if not hasattr(llm_adapter, "complete"):
    raise UnstractConfigurationError(
        "llm_adapter must have a 'complete' method"
    )
```
Critical: Inconsistent method name check.
Line 86 checks for "complete" but the error message on line 88 says "completion". The test fixtures in test_client.py (line 26) mock a completion method, which will fail this validation check.
Verify the correct method name for the Unstract LLM adapter. If it's completion, apply this diff:
```diff
-if not hasattr(llm_adapter, "complete"):
+if not hasattr(llm_adapter, "completion"):
     raise UnstractConfigurationError(
         "llm_adapter must have a 'completion' method"
     )
```

If it's complete, update the error message and test mocks accordingly.
📝 Committable suggestion
```suggestion
if not hasattr(llm_adapter, "completion"):
    raise UnstractConfigurationError(
        "llm_adapter must have a 'completion' method"
    )
```
🤖 Prompt for AI Agents
In unstract/autogen-client/src/unstract/autogen_client/client.py around lines 86
to 89, the validation checks for an llm_adapter method named "complete" but the
error message and test fixtures reference "completion"; align them by choosing
the correct adapter API: if the adapter should expose "completion", change the
hasattr check to look for "completion" and update the raised
UnstractConfigurationError message to mention "completion" (and ensure test
mocks use "completion"); if the adapter should expose "complete", update the
error text to say "'complete' method" and change any tests/mocks that reference
"completion" to "complete" so runtime validation and tests are consistent.
```python
def _call():
    # Extract prompt from messages for SDK1 complete method
    messages = params.get("messages", [])
    prompt = ""

    # Find the last user message as the prompt
    for msg in reversed(messages):
        if isinstance(msg, dict):
            role = msg.get("role")
            content = msg.get("content", "")
        elif hasattr(msg, "role") and hasattr(msg, "content"):
            role = msg.role
            content = msg.content
        else:
            continue

        if role == "user":
            prompt = content
            break

    # If no user message found, use default prompt
    if not prompt:
        prompt = "Which model are you?"

    # Create parameters for SDK1 complete method (remove messages, add prompt)
    complete_params = {k: v for k, v in params.items() if k != "messages"}
    complete_params["prompt"] = prompt

    # Use the Unstract LLM adapter's complete method
    return self.llm_adapter.complete(**complete_params)
```
Major: Prompt extraction discards conversation context.
The code only extracts the last user message as the prompt (lines 364-376), discarding system messages and prior conversation history. This breaks multi-turn conversations and system instructions.
Additionally, the fallback prompt "Which model are you?" (line 380) seems arbitrary and could produce confusing results.
If the Unstract adapter's complete method only accepts a single prompt string (not a message list), document this limitation clearly in the docstring and consider raising an error or warning when multiple messages are provided. If it does accept messages, pass the full normalized message list instead of extracting a single prompt.
Apply this diff if the adapter supports message lists:
```diff
-    # Extract prompt from messages for SDK1 complete method
-    messages = params.get("messages", [])
-    prompt = ""
-
-    # Find the last user message as the prompt
-    for msg in reversed(messages):
-        if isinstance(msg, dict):
-            role = msg.get("role")
-            content = msg.get("content", "")
-        elif hasattr(msg, "role") and hasattr(msg, "content"):
-            role = msg.role
-            content = msg.content
-        else:
-            continue
-
-        if role == "user":
-            prompt = content
-            break
-
-    # If no user message found, use default prompt
-    if not prompt:
-        prompt = "Which model are you?"
-
-    # Create parameters for SDK1 complete method (remove messages, add prompt)
-    complete_params = {k: v for k, v in params.items() if k != "messages"}
-    complete_params["prompt"] = prompt
-
-    # Use the Unstract LLM adapter's complete method
-    return self.llm_adapter.complete(**complete_params)
+    # Use the Unstract LLM adapter's complete method with full message context
+    return self.llm_adapter.completion(**params)
```
+ return self.llm_adapter.completion(**params)Committable suggestion skipped: line range outside the PR's diff.
```python
def _stream_call():
    # Extract prompt from messages for SDK1 complete method
    messages = params.get("messages", [])
    prompt = ""

    # Find the last user message as the prompt
    for msg in reversed(messages):
        if isinstance(msg, dict):
            role = msg.get("role")
            content = msg.get("content", "")
        elif hasattr(msg, "role") and hasattr(msg, "content"):
            role = msg.role
            content = msg.content
        else:
            continue

        if role == "user":
            prompt = content
            break

    # If no user message found, use default prompt
    if not prompt:
        prompt = "Which model are you?"

    # Create parameters for SDK1 complete method (remove messages, add prompt)
    complete_params = {k: v for k, v in params.items() if k != "messages"}
    complete_params["prompt"] = prompt

    # Use the Unstract LLM adapter's complete method with streaming
    return self.llm_adapter.complete(**complete_params)
```
🛠️ Refactor suggestion | 🟠 Major
Duplicate prompt extraction logic and same context loss issue.
This code duplicates the prompt extraction logic from _execute_completion (lines 358-387), with the same issue of discarding conversation context.
Extract the prompt preparation logic into a shared private method to eliminate duplication:
```python
def _prepare_completion_params(self, params: dict[str, Any]) -> dict[str, Any]:
    """Prepare completion parameters by extracting prompt from messages."""
    # Extract prompt from messages for SDK1 complete method
    messages = params.get("messages", [])
    prompt = ""
    # Find the last user message as the prompt
    for msg in reversed(messages):
        if isinstance(msg, dict):
            role = msg.get("role")
            content = msg.get("content", "")
        elif hasattr(msg, "role") and hasattr(msg, "content"):
            role = msg.role
            content = msg.content
        else:
            continue
        if role == "user":
            prompt = content
            break
    # If no user message found, use default prompt
    if not prompt:
        prompt = "Which model are you?"
    # Create parameters for SDK1 complete method
    complete_params = {k: v for k, v in params.items() if k != "messages"}
    complete_params["prompt"] = prompt
    return complete_params
```

Then call this from both methods:
```python
def _call():
    complete_params = self._prepare_completion_params(params)
    return self.llm_adapter.complete(**complete_params)
```

🤖 Prompt for AI Agents
In unstract/autogen-client/src/unstract/autogen_client/client.py around lines
358-387 and 401-430, the prompt extraction logic is duplicated and discards
conversation context; extract that logic into a private helper (e.g. def
_prepare_completion_params(self, params: dict[str, Any]) -> dict[str, Any]:)
that performs the messages -> prompt extraction, applies the default prompt when
none found, and returns the complete_params dict (params without "messages" plus
"prompt"); then replace the duplicated blocks in both _execute_completion and
_stream_call with a call to self._prepare_completion_params(params) and pass the
result to self.llm_adapter.complete(**complete_params). Ensure the new helper
lives on the class and is covered by both call sites.
```python
def extract_usage(response: Any) -> RequestUsage:
    """Extract usage information from adapter response.

    Args:
        response: Response from Unstract LLM adapter

    Returns:
        RequestUsage object
    """
    try:
        if isinstance(response, dict) and "usage" in response:
            usage = response["usage"]
            return RequestUsage(
                prompt_tokens=usage.get("prompt_tokens", 0),
                completion_tokens=usage.get("completion_tokens", 0),
            )
    except Exception as e:
        logger.warning(f"Failed to extract usage: {str(e)}")

    return RequestUsage(prompt_tokens=0, completion_tokens=0)
```
🛠️ Refactor suggestion | 🟠 Major
Missing extraction from object attributes.

The function only extracts usage when the response is a dict with a top-level "usage" key. However, extract_content (lines 106-111) shows that responses can also be objects accessed via attributes. The usage extraction should handle both object attributes and dict access consistently.
Add support for extracting usage from object attributes:
```diff
 def extract_usage(response: Any) -> RequestUsage:
     """Extract usage information from adapter response.

     Args:
         response: Response from Unstract LLM adapter

     Returns:
         RequestUsage object
     """
     try:
         if isinstance(response, dict) and "usage" in response:
             usage = response["usage"]
             return RequestUsage(
                 prompt_tokens=usage.get("prompt_tokens", 0),
                 completion_tokens=usage.get("completion_tokens", 0),
             )
+        elif hasattr(response, "usage"):
+            usage = response.usage
+            return RequestUsage(
+                prompt_tokens=getattr(usage, "prompt_tokens", 0),
+                completion_tokens=getattr(usage, "completion_tokens", 0),
+            )
     except Exception as e:
         logger.warning(f"Failed to extract usage: {str(e)}")

     return RequestUsage(prompt_tokens=0, completion_tokens=0)
```

📝 Committable suggestion
```suggestion
def extract_usage(response: Any) -> RequestUsage:
    """Extract usage information from adapter response.

    Args:
        response: Response from Unstract LLM adapter

    Returns:
        RequestUsage object
    """
    try:
        if isinstance(response, dict) and "usage" in response:
            usage = response["usage"]
            return RequestUsage(
                prompt_tokens=usage.get("prompt_tokens", 0),
                completion_tokens=usage.get("completion_tokens", 0),
            )
        elif hasattr(response, "usage"):
            usage = response.usage
            return RequestUsage(
                prompt_tokens=getattr(usage, "prompt_tokens", 0),
                completion_tokens=getattr(usage, "completion_tokens", 0),
            )
    except Exception as e:
        logger.warning(f"Failed to extract usage: {str(e)}")

    return RequestUsage(prompt_tokens=0, completion_tokens=0)
```
🤖 Prompt for AI Agents
In unstract/autogen-client/src/unstract/autogen_client/utils.py around lines
125-144, the extract_usage function only reads usage when response is a dict
with a top-level "usage" key; update it to handle both objects with a usage
attribute and dicts with either a top-level "usage" or nested usage (e.g.,
response.get("response", {}).get("usage")). Implement a priority: if
hasattr(response, "usage") use that, else if isinstance(response, dict) check
"usage" then check nested keys like "response" -> "usage"; when extracting, use
.get("prompt_tokens", 0) and .get("completion_tokens", 0) to build and return a
RequestUsage, and keep the current exception handling/return fallback.
```python
def extract_finish_reason(response: Any) -> str:
    """Extract and normalize finish reason from adapter response.

    Args:
        response: Response from Unstract LLM adapter

    Returns:
        Normalized finish reason (defaults to 'stop')
    """
    try:
        if hasattr(response, "choices") and response.choices:
            choice = response.choices[0]
            raw_reason = getattr(choice, "finish_reason", None)
            return normalize_finish_reason(raw_reason)
    except Exception as e:
        logger.warning(f"Failed to extract finish reason: {e}")

    return normalize_finish_reason(None)  # This will return 'stop'
```
Inconsistent return behavior for missing finish reason.
The function returns normalize_finish_reason(None) at line 164, which will return "stop" based on the current implementation. However, the test at line 183 expects None. This creates the same inconsistency as in normalize_finish_reason.
Ensure the behavior matches the test expectations. If the test expects None, update line 164:
```diff
-    return normalize_finish_reason(None)  # This will return 'stop'
+    return None
```

And update the return type:

```diff
-def extract_finish_reason(response: Any) -> str:
+def extract_finish_reason(response: Any) -> str | None:
```

📝 Committable suggestion
```suggestion
def extract_finish_reason(response: Any) -> str | None:
    """Extract and normalize finish reason from adapter response.

    Args:
        response: Response from Unstract LLM adapter

    Returns:
        Normalized finish reason, or None if no finish reason is present
    """
    try:
        if hasattr(response, "choices") and response.choices:
            choice = response.choices[0]
            raw_reason = getattr(choice, "finish_reason", None)
            return normalize_finish_reason(raw_reason)
    except Exception as e:
        logger.warning(f"Failed to extract finish reason: {e}")

    return None
```
🤖 Prompt for AI Agents
In unstract/autogen-client/src/unstract/autogen_client/utils.py around lines 147
to 164, extract_finish_reason currently falls back to
normalize_finish_reason(None) (which returns "stop") when no finish reason is
present, but tests expect None; change the final return to return None instead
of normalize_finish_reason(None) and update the function signature/type hint to
reflect Optional[str] (and adjust any callers or docstring if necessary) so
missing finish reasons return None consistently with tests.
```python
def test_normalize_finish_reason(self) -> None:
    """Test finish reason normalization."""
    assert normalize_finish_reason(None) is None
    assert normalize_finish_reason("stop") == "stop"
    assert normalize_finish_reason("stop_sequence") == "stop"
    assert normalize_finish_reason("length") == "length"
    assert normalize_finish_reason("function_calls") == "function_calls"
    assert normalize_finish_reason("content_filter") == "content_filter"
    assert normalize_finish_reason("unknown_reason") == "unknown"
```
Test expectation conflicts with implementation.
Line 29 expects normalize_finish_reason(None) to return None, but the implementation at utils.py line 36 returns "stop". This test will fail.
Update the test to match the intended behavior, or update the implementation. If the function should return None for None input:
```diff
-    assert normalize_finish_reason(None) is None
+    assert normalize_finish_reason(None) == "stop"
```

Or if the implementation should change, see the comment on utils.py lines 26-39.
📝 Committable suggestion
```suggestion
def test_normalize_finish_reason(self) -> None:
    """Test finish reason normalization."""
    assert normalize_finish_reason(None) == "stop"
    assert normalize_finish_reason("stop") == "stop"
    assert normalize_finish_reason("stop_sequence") == "stop"
    assert normalize_finish_reason("length") == "length"
    assert normalize_finish_reason("function_calls") == "function_calls"
    assert normalize_finish_reason("content_filter") == "content_filter"
    assert normalize_finish_reason("unknown_reason") == "unknown"
```
🤖 Prompt for AI Agents
In unstract/autogen-client/tests/test_utils.py around lines 27 to 36, the first
assertion expects normalize_finish_reason(None) to be None but the
implementation in utils.py returns "stop"; update the test to match the
implementation by changing the first assertion to assert
normalize_finish_reason(None) == "stop" so the test reflects the intended
mapping of None -> "stop".
```python
def test_extract_usage_success(self) -> None:
    """Test successful usage extraction."""
    response = Mock()
    response.usage = Mock()
    response.usage.prompt_tokens = 10
    response.usage.completion_tokens = 5

    usage = extract_usage(response)
    assert isinstance(usage, RequestUsage)
    assert usage.prompt_tokens == 10
    assert usage.completion_tokens == 5

def test_extract_usage_no_usage(self) -> None:
    """Test usage extraction with no usage."""
    response = Mock()
    del response.usage

    usage = extract_usage(response)
    assert isinstance(usage, RequestUsage)
    assert usage.prompt_tokens == 0
    assert usage.completion_tokens == 0

def test_extract_usage_missing_fields(self) -> None:
    """Test usage extraction with missing fields."""
    response = Mock()
    response.usage = Mock()
    # Missing prompt_tokens and completion_tokens

    usage = extract_usage(response)
    assert isinstance(usage, RequestUsage)
    assert usage.prompt_tokens == 0
    assert usage.completion_tokens == 0
```
🛠️ Refactor suggestion | 🟠 Major
Missing test coverage for object attribute extraction.
The tests only verify extraction from dict responses (lines 129-131) but don't test extraction from objects with usage attributes, which is the pattern shown in test_extract_content_success (lines 91-94).
Add a test for object attribute extraction:
```python
def test_extract_usage_from_object(self) -> None:
    """Test usage extraction from object attributes."""
    response = Mock()
    response.usage = Mock()
    response.usage.prompt_tokens = 15
    response.usage.completion_tokens = 8

    usage = extract_usage(response)
    assert isinstance(usage, RequestUsage)
    assert usage.prompt_tokens == 15
    assert usage.completion_tokens == 8
```

This would catch the missing functionality noted in the review of utils.py lines 125-144.
🤖 Prompt for AI Agents
In unstract/autogen-client/tests/test_utils.py around lines 126 to 157, tests
lack coverage for extracting usage from objects with a usage attribute; add a
new test function that constructs a Mock response with
response.usage.prompt_tokens and response.usage.completion_tokens set (e.g., 15
and 8), calls extract_usage(response), and asserts the returned RequestUsage has
those values, mirroring the dict-style test and the pattern used in
test_extract_content_success.
```python
def test_extract_finish_reason_no_choices(self) -> None:
    """Test finish reason extraction with no choices."""
    response = Mock()
    response.choices = []

    reason = extract_finish_reason(response)
    assert reason is None
```
Test expectation conflicts with implementation.
Line 183 expects extract_finish_reason to return None when there are no choices, but the implementation at utils.py line 164 calls normalize_finish_reason(None), which returns "stop". This test will fail.
Update the test:
```diff
-    assert reason is None
+    assert reason == "stop"
```

Or update the implementation to return None (see the comment on utils.py lines 147-164).
📝 Committable suggestion
```suggestion
def test_extract_finish_reason_no_choices(self) -> None:
    """Test finish reason extraction with no choices."""
    response = Mock()
    response.choices = []

    reason = extract_finish_reason(response)
    assert reason == "stop"
```
🤖 Prompt for AI Agents
In unstract/autogen-client/tests/test_utils.py around lines 177 to 183, the test
expects extract_finish_reason(response) to return None when response.choices ==
[], but the current implementation in utils.py (lines ~147-164) calls
normalize_finish_reason(None) which returns "stop", causing the test to fail;
either update the test to expect "stop" or (preferred) change the
implementation: modify extract_finish_reason to return None immediately when
choices is falsy or empty (do not call normalize_finish_reason on a missing
choice), and add a unit test verifying behavior for both empty choices and
choices with None/explicit reasons.
What
Why
How
- UnstractAutoGenClient class that extends AutoGen's ChatCompletionClient interface
- SimpleAutoGenAgent for simplified agent interactions without requiring AutoGen dependencies
- Convenience functions: create_simple_autogen_agent, process_with_autogen, process_with_autogen_async

Package Structure
Key Features
Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No - This is a new standalone package that doesn't modify any existing functionality. It only adds a new client implementation that bridges AutoGen and Unstract adapters. The package is completely isolated in the unstract/autogen-client/ directory.

Database Migrations
None
Env Config
None
Relevant Docs
unstract/autogen-client/README.md

Related Issues or PRs
Dependencies
- autogen-core>=0.4.0.dev12 - Microsoft AutoGen framework
- unstract-sdk1 - Unstract SDK for LLM adapters (uses sdk1.llm module)

Notes on Testing
- unstract/autogen-client/tests/test_client.py: Tests for UnstractAutoGenClient
- test_utils.py: Tests for utility functions

Run with: cd unstract/autogen-client && pytest tests/

Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.