[memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol by ashwinb · Pull Request #832 · llamastack/llama-stack

ashwinb · 2025-01-21T20:20:33Z

See #827 for the broader design.

Third part:

We need the tool_runtime.rag_tool.query_context() and tool_runtime.rag_tool.insert_documents() methods to work smoothly with complete type safety. To that end, we introduce a sub-resource path, tool-runtime/rag-tool/, and change the resolver to support it.
The PR updates the agents implementation to call these typed APIs directly for memory accesses rather than going through the complex, untyped "invoke_tool" API. The code looks much nicer and simpler, as expected.
There are still a number of hacks in the server resolver implementation; we will live with some and fix others.
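
To illustrate the shape described in the points above, here is a minimal sketch of a typed sub-protocol hanging off a parent tool runtime. It is not the code from this PR: the method names query_context and insert_documents and the rag_tool attribute come from the description, but every signature, the RAGDocument and RAGQueryResult types, the vector_db_id parameters, and the in-memory classes are assumptions made up for the example.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol


@dataclass
class RAGDocument:
    # Hypothetical document shape, for illustration only.
    document_id: str
    content: str


@dataclass
class RAGQueryResult:
    # Hypothetical result shape: the retrieved text snippets.
    content: List[str] = field(default_factory=list)


class RAGToolRuntime(Protocol):
    """Typed sub-protocol (assumed signatures), conceptually served under tool-runtime/rag-tool/."""

    async def insert_documents(self, documents: List[RAGDocument], vector_db_id: str) -> None: ...

    async def query_context(self, content: str, vector_db_ids: List[str]) -> RAGQueryResult: ...


class ToolRuntime(Protocol):
    """Parent protocol: exposes the typed sub-resource next to the generic, untyped call."""

    rag_tool: RAGToolRuntime

    async def invoke_tool(self, tool_name: str, kwargs: Dict[str, Any]) -> Any: ...


class InMemoryRAGTool:
    """Toy implementation: substring matching stands in for real vector search."""

    def __init__(self) -> None:
        self._stores: Dict[str, List[RAGDocument]] = {}

    async def insert_documents(self, documents: List[RAGDocument], vector_db_id: str) -> None:
        self._stores.setdefault(vector_db_id, []).extend(documents)

    async def query_context(self, content: str, vector_db_ids: List[str]) -> RAGQueryResult:
        hits = [
            doc.content
            for db_id in vector_db_ids
            for doc in self._stores.get(db_id, [])
            if content.lower() in doc.content.lower()
        ]
        return RAGQueryResult(content=hits)


class InMemoryToolRuntime:
    """Toy parent runtime that satisfies ToolRuntime structurally."""

    def __init__(self) -> None:
        self.rag_tool: RAGToolRuntime = InMemoryRAGTool()

    async def invoke_tool(self, tool_name: str, kwargs: Dict[str, Any]) -> Any:
        raise NotImplementedError("generic path is not needed for this sketch")


async def demo() -> RAGQueryResult:
    runtime: ToolRuntime = InMemoryToolRuntime()
    # Callers use the typed methods directly instead of invoke_tool("...", {...}).
    await runtime.rag_tool.insert_documents(
        [RAGDocument("doc-1", "Llama Stack exposes a tool runtime API.")],
        vector_db_id="db",
    )
    return await runtime.rag_tool.query_context("tool runtime", vector_db_ids=["db"])
```

Running asyncio.run(demo()) returns a RAGQueryResult whose content holds the single inserted sentence. The point of the design is that a type checker can verify the arguments and return type at the call site, which a string-keyed invoke_tool call cannot offer.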

Note that we must also make sure the client SDKs can handle this subresource complexity. Stainless supports subresources, so this should be possible, but beware.

Test Plan

Our RAG test is sad (doesn't actually test for actual RAG output) but I verified that the implementation works. I will work on fixing the RAG test afterwards.

pytest -s -v tests/agents/test_agents.py -k "rag and together" --safety-shield=meta-llama/Llama-Guard-3-8B

ashwinb · 2025-01-21T20:21:20Z

llama_stack/apis/tools/rag_tool.py

This has been moved from provider configuration to a call-time parameter, which is the correct abstraction level.
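
The comment does not say which setting moved, so the following is only a generic sketch of the pattern: query behavior is passed with each call instead of being fixed when the provider is constructed. QueryConfig, its fields, and the RAGTool class are hypothetical names invented for this example, not types from the PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryConfig:
    # Hypothetical per-call settings; the field names are illustrative only.
    max_chunks: int = 3
    separator: str = "\n"


class RAGTool:
    """The provider keeps only deployment-level state; query behavior arrives per call."""

    def __init__(self, chunks: List[str]) -> None:
        self._chunks = chunks

    def query_context(self, query: str, query_config: Optional[QueryConfig] = None) -> str:
        # Fall back to defaults when the caller passes no config.
        cfg = query_config or QueryConfig()
        hits = [c for c in self._chunks if query.lower() in c.lower()]
        return cfg.separator.join(hits[: cfg.max_chunks])


tool = RAGTool(["alpha one", "alpha two", "beta", "alpha three"])

# Two callers share one provider instance but ask for different behavior.
default_answer = tool.query_context("alpha")
custom_answer = tool.query_context("alpha", QueryConfig(max_chunks=2, separator=" | "))
```

With the setting on the call, two agents sharing the same provider can retrieve differently without redeploying or reconfiguring it.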

ashwinb · 2025-01-21T20:21:44Z

llama_stack/apis/tools/tools.py

will update this in all other tool runtimes

dineshyv · 2025-01-21T23:28:02Z

llama_stack/providers/inline/tool_runtime/memory/memory.py

We do not have toolgroup namespacing for tool names like "rag_tool.query_context". Are we adding toolgroup namespacing here? If so, should we add the same for other toolgroups?

Yeah, that's a fair point. I will update this / revert it.

llama_stack/providers/utils/memory/vector_store.py

llama_stack/providers/inline/agents/meta_reference/agent_instance.py

See llamastack/llama-stack#827 for the broader design. See llamastack/llama-stack#832 for the main corresponding Llama Stack PR. ## Test Plan (running client-sdk tests)

ashwinb requested review from dineshyv, dltn, hardikjshah, raghotham, sixianyi0721, vladimirivic and yanxi0830 January 21, 2025 20:20

facebook-github-bot added the CLA Signed label Jan 21, 2025

dineshyv reviewed Jan 21, 2025

dineshyv reviewed Jan 22, 2025

ashwinb mentioned this pull request Jan 22, 2025

[memory refactor] Introduce tool_runtime.rag_tool as a subresource llamastack/llama-stack-client-python#93

Merged

raghotham approved these changes Jan 22, 2025

ashwinb force-pushed the faiss_vector_io branch 2 times, most recently from 2bf253c to 9282794 Compare January 22, 2025 18:01

Base automatically changed from faiss_vector_io to main January 22, 2025 18:02

ashwinb added 7 commits January 22, 2025 10:02

Introduce RAGToolRuntime as a specialized sub-protocol

2f76de1

RAG Agent test passes

a1433c0

add a test for rag via curl; this can be generalized

68f2550

slight rename

0399820

bug fix, generate openapi spec

460dc8a

update openapi generator

89f51a8

reuse some variables

5297aef

ashwinb force-pushed the agents_memory_tool branch from 45fb353 to 5297aef Compare January 22, 2025 18:02

ashwinb merged commit 1a74904 into main Jan 22, 2025
2 checks passed

ashwinb deleted the agents_memory_tool branch January 22, 2025 18:04
