Skip to content

[memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol#832

Merged
ashwinb merged 7 commits intomainfrom
agents_memory_tool
Jan 22, 2025
Merged

[memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol#832
ashwinb merged 7 commits intomainfrom
agents_memory_tool

Conversation

@ashwinb
Copy link
Contributor

@ashwinb ashwinb commented Jan 21, 2025

See #827 for the broader design.

Third part:

  • we need to make tool_runtime.rag_tool.query_context() and tool_runtime.rag_tool.insert_documents() methods work smoothly with complete type safety. To that end, we introduce a sub-resource path tool-runtime/rag-tool/ and make changes to the resolver to make things work.
  • the PR updates the agents implementation to directly call these typed APIs for memory accesses rather than going through the complex, untyped "invoke_tool" API. the code looks much nicer and simpler (expectedly.)
  • there are a number of hacks in the server resolver implementation still, we will live with some and fix some

Note that we must make sure the client SDKs are able to handle this subresource complexity also. Stainless has support for subresources, so this should be possible but beware.

Test Plan

Our RAG test is sad (doesn't actually test for actual RAG output) but I verified that the implementation works. I will work on fixing the RAG test afterwards.

pytest -s -v tests/agents/test_agents.py -k "rag and together" --safety-shield=meta-llama/Llama-Guard-3-8B

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 21, 2025
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been move from provider configuration to call-time parameter which is the correct abstraction level

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update this in all other tool runtimes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not have the toolgroup namespacing for tool names like "rag_tool.query_context". are we adding toolgroup namespacing here? if so, should we add the same for other toolgroups?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that's a fair point. I will update this / revert it.

@ashwinb ashwinb force-pushed the faiss_vector_io branch 2 times, most recently from 2bf253c to 9282794 Compare January 22, 2025 18:01
Base automatically changed from faiss_vector_io to main January 22, 2025 18:02
@ashwinb ashwinb force-pushed the agents_memory_tool branch from 45fb353 to 5297aef Compare January 22, 2025 18:02
@ashwinb ashwinb merged commit 1a74904 into main Jan 22, 2025
2 checks passed
@ashwinb ashwinb deleted the agents_memory_tool branch January 22, 2025 18:04
ashwinb added a commit to llamastack/llama-stack-client-python that referenced this pull request Jan 22, 2025
See See llamastack/llama-stack#827 for the
broader design.

See llamastack/llama-stack#832 for the main
corresponding Llama Stack PR.

## Test Plan

(running client-sdk tests)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants