Fix component tool parameters #9342

Merged
merged 29 commits into from
May 15, 2025
Conversation

sjrl
Contributor

@sjrl sjrl commented May 5, 2025

Related Issues

Proposed Changes:

Properties Schema Generation

  • Refactor the property JSON Schema generation, which is based on the signature of a Component's run method. Adopted the approach used in from_function.py, where we use Pydantic's model_json_schema. This adds full support for basic types.
  • New utility methods are used to accomplish this, namely the ones found in parameters_schema_utils.py. The only semi-interesting thing we do is specially handle the conversion of our dataclasses into a Pydantic model. We do this so we can use the docstring descriptions of the parameters. By default Pydantic ignores docstrings.
    • We had to handle the ChatMessage object specially because converting this dataclass into a Pydantic model fails: Pydantic strictly enforces that field names can't have leading underscores. In this case we rename the fields of the converted ChatMessage Pydantic model to leave out the underscores. This should be fine for JSON Schema generation since ChatMessage.from_dict does support loading from non-underscored field names.
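
The approach described above can be sketched roughly as follows. This is a minimal illustration assuming Pydantic v2, not the code in parameters_schema_utils.py: the helper names dataclass_to_model and schema_from_signature, the Message dataclass (a stand-in for ChatMessage), and the example run function are all hypothetical. Docstring parsing is left out; descriptions are passed in as a plain dict instead.

```python
import inspect
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional

from pydantic import Field, create_model


def dataclass_to_model(cls):
    """Convert a dataclass into a Pydantic model (hypothetical helper).

    Leading underscores are stripped from field names because Pydantic
    rejects field names that start with an underscore.
    Simplification: every field is treated as required.
    """
    definitions = {f.name.lstrip("_"): (f.type, ...) for f in fields(cls)}
    return create_model(cls.__name__, **definitions)


def schema_from_signature(func, descriptions=None):
    """Build a JSON Schema from the parameters of `func` (hypothetical helper)."""
    descriptions = descriptions or {}
    definitions = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            annotation = Any
        # Dataclasses are converted first so their fields appear in the schema.
        if isinstance(annotation, type) and is_dataclass(annotation):
            annotation = dataclass_to_model(annotation)
        # `...` marks the field as required when the parameter has no default.
        default = ... if param.default is inspect.Parameter.empty else param.default
        definitions[name] = (annotation, Field(default, description=descriptions.get(name)))
    return create_model(func.__name__, **definitions).model_json_schema()


@dataclass
class Message:  # hypothetical stand-in for a dataclass with underscored fields
    _role: str
    _text: str


def run(message: Message, top_k: Optional[int] = None):
    """Example run method signature (hypothetical)."""


schema = schema_from_signature(run, {"message": "The message to process."})
print(schema["required"])
print(sorted(schema["$defs"]["Message"]["properties"]))
```

Going through create_model keeps the type-to-schema mapping entirely in Pydantic, so only the dataclass conversion and the parameter descriptions need custom handling.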

Remaining TODOs

  • Check if the changes above (e.g. oneOf, null, additionalProperties) are generally supported by LLM providers
    • OpenAIChatGenerator --> integration test
    • AzureOpenAIChatGenerator --> integration test
    • HuggingFaceAPIChatGenerator --> integration test
  • Once merged we will need to check if core-integration tests are affected in the case of serialization/deserialization of Tools and ToolSets

How did you test it?

  • Added unit tests
  • Updated existing unit tests
  • Updated integration tests to use a tool with a more complicated signature

Notes for the reviewer

As a heads-up, most of the line changes (~500) are for tests.

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels May 5, 2025
@coveralls
Collaborator

coveralls commented May 5, 2025

Pull Request Test Coverage Report for Build 15039406888

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 13 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.05%) to 90.465%

Files with Coverage Reduction                     New Missed Lines   %
tools/serde_utils.py                              1                  97.22%
components/generators/chat/hugging_face_api.py    2                  95.7%
tools/tool.py                                     4                  92.41%
tools/component_tool.py                           6                  93.41%

Totals
Change from base Build 14907254108: 0.05%
Covered Lines: 10930
Relevant Lines: 12082

💛 - Coveralls

@sjrl sjrl marked this pull request as ready for review May 14, 2025 10:20
@sjrl sjrl requested a review from a team as a code owner May 14, 2025 10:20
@sjrl sjrl requested review from dfokina and vblagoje and removed request for a team May 14, 2025 10:20
@anakin87 anakin87 self-requested a review May 14, 2025 10:26
@sjrl
Contributor Author

sjrl commented May 14, 2025

The current integration error results from huggingface/text-generation-inference#2876: it looks like HuggingFace doesn't support the $defs field in the parameters schema, whereas OpenAI does.

I'd suggest opening a separate issue to resolve this, since it is already an existing bug in main when using the tool decorator on a function whose signature includes a Pydantic object or dataclass.

Let me know if you agree and I can remove that test for now and open an issue.
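
For context, a minimal sketch of where $defs comes from, assuming Pydantic v2. The Address and Person models are hypothetical and only illustrate that a nested model in a signature is emitted as a shared definition rather than inlined:

```python
from pydantic import BaseModel


class Address(BaseModel):  # hypothetical nested model
    city: str


class Person(BaseModel):  # hypothetical top-level parameters model
    name: str
    address: Address


schema = Person.model_json_schema()

# The nested model is placed under the top-level "$defs" key and
# referenced from the "address" property instead of being inlined.
print(list(schema["$defs"]))
print(schema["properties"]["address"])
```

A provider that does not resolve these references would see the address property without its actual structure, which is consistent with the failure described above.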

Member

@anakin87 anakin87 left a comment


I left some comments.
I would appreciate it if @vblagoje could review too, since he originally worked on this.

@anakin87
Member

Current integration error is resulting from this issue huggingface/text-generation-inference#2876 ...
Let me know if you agree and I can remove that test for now and open an issue.

I agree.

(When I originally worked on tools support for HF API, I found several inconsistencies (deepset-ai/haystack-experimental#120 (comment)). I hope that most of them are fixed now, but I am not sure.)

@sjrl
Contributor Author

sjrl commented May 14, 2025

@anakin87 do you think it's worth updating create_tool_from_function to also use _resolve_type? Otherwise, if we have a function that uses our ChatMessage dataclass in its signature, users will get an error message. The only downside I see right now is that _resolve_type would introduce the docparser dep, which is used by the _dataclass_to_pydantic_model function.

@anakin87
Member

@anakin87 do you think it's worth updating create_tool_from_function to also use _resolve_type? Otherwise, if we have a function that uses our ChatMessage dataclass in its signature, users will get an error message. The only downside I see right now is that _resolve_type would introduce the docparser dep, which is used by the _dataclass_to_pydantic_model function.

Since we expect basic python types

The function is expected to have basic python input types (str, int, float, bool, list, dict, tuple).

I would leave it unchanged for now

Member

@vblagoje vblagoje left a comment


Very very nice!

@anakin87 anakin87 self-requested a review May 15, 2025 07:31
@sjrl sjrl enabled auto-merge (squash) May 15, 2025 07:46
@sjrl sjrl merged commit 9ae76e1 into main May 15, 2025
22 checks passed
@sjrl sjrl deleted the fix-component-tool-parameters branch May 15, 2025 07:51
Development

Successfully merging this pull request may close these issues.

The auto construction of Tool parameters in ComponentTool does not work for more complex types
4 participants