feat: Web Search V2 with context extraction + multi-model adjustment #154
Conversation
…ed query handling
Pull Request Overview
This PR introduces a significantly enhanced web search capability with topic-based multi-query support, content extraction from web pages, and improved LLM integration for better search query generation and result processing.
- Adds topic analysis agent to detect relevant information in user prompts and determine web search necessity
- Implements multi-query web search with parallel execution across different topics
- Enhances result quality by extracting main content from web pages using trafilatura
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| tests/unit/nilai_api/test_web_search.py | Updates test to reflect new message merging behavior (2 messages instead of 3) |
| packages/nilai-common/src/nilai_common/api_model.py | Adds new data models for topic analysis, result content extraction, and message merging utilities |
| packages/nilai-common/src/nilai_common/__init__.py | Exports new model classes for external use |
| nilai-api/src/nilai_api/handlers/web_search.py | Major refactor implementing topic-based multi-search with content extraction and improved query generation |
| nilai-api/pyproject.toml | Adds trafilatura dependency for web content extraction |
```python
    return results


def _sanitize_query(query: str) -> str:
```
Regexes are hard to debug. This function needs a docstring describing what it does.
```python
async def perform_web_search_async(query: str) -> WebSearchContext:
    """Perform an asynchronous web search using the Brave Search API.
```

```python
def _strip_code_fences(text: str) -> str:
```
Same here. This would benefit greatly from a good docstring.
| """Handle web search enhancement for a conversation. | ||
| Extracts the most recent user message, generates an optimized search query | ||
| using an LLM, and enhances the conversation with web search results. | ||
| Args: | ||
| req_messages: List of conversation messages to process | ||
| model_name: Name of the LLM model to use for query generation | ||
| client: LLM client instance for making API calls | ||
| Returns: | ||
| WebSearchEnhancedMessages with web search context added, or original | ||
| messages if no user query is found or search fails | ||
| """ |
Why remove the docstrings?
```python
    sections: List[str] = []
    all_sources: List[Source] = []

    for idx, (tq, ctx) in enumerate(zip(topic_queries, contexts), start=1):
```
These variables should be named topic_query and context; short names lose their meaning in longer code.
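Applied to the loop above, the rename would read as follows (the data here is stand-in, since the real `topic_queries` and `contexts` come from the PR's search pipeline):

```python
# Stand-in data for illustration; in the PR these come from the search pipeline.
topic_queries = ["rust borrow checker", "python asyncio"]
contexts = ["ctx-a", "ctx-b"]

sections = []
for idx, (topic_query, context) in enumerate(zip(topic_queries, contexts), start=1):
    # Descriptive names make the pairing of query and result explicit.
    sections.append(f"[{idx}] {topic_query}: {context}")
```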
```python
# Common source-type identifier for recording the original query used in web search
WEB_SEARCH_QUERY_SOURCE = "web_search_query"
```
This constant belongs with the web_search handler.
```python
def _format_search_results(results: List[SearchResult]) -> str:
    lines = [
```
Please make sure every function that lacks a docstring gets one.
```python
    @staticmethod
    def merge_system_content(
        messages: List[Message], system_content: str
    ) -> List[Message]:
        """Prepend or merge a system message with the given content.

        - If the first message is a system message, append the new content after a blank line.
        - Otherwise, create a new system message and prepend it.
        """
        if not messages:
            return [MessageAdapter.new_message(role="system", content=system_content)]

        first = MessageAdapter(raw=messages[0])
        if first.role == "system":
            existing = first.extract_text() or ""
            merged = (
                (existing + "\n\n" + system_content) if existing else system_content
            )
            new_first = MessageAdapter.new_message(role="system", content=merged)
            return [new_first] + [
                MessageAdapter(raw=m).to_openai_param() for m in messages[1:]
            ]

        return [MessageAdapter.new_message(role="system", content=system_content)] + [
            MessageAdapter(raw=m).to_openai_param() for m in messages
        ]
```
This is a static method that copies the list of messages to apply a change. Copying the list is not very productive here; the change should be applied to the messages in place, and the method should live on ChatCompletionRequest. As written, I don't think this makes total sense.
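One way to read this suggestion is as a mutating method on the request object. The `ChatCompletionRequest` shape and plain-dict message format below are assumptions for illustration, not the project's real classes:

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical sketch of an in-place merge on the request itself.
# The real ChatCompletionRequest and Message types in nilai_common differ.
@dataclass
class ChatCompletionRequest:
    messages: List[Dict[str, str]] = field(default_factory=list)

    def merge_system_content(self, system_content: str) -> None:
        """Merge system_content into this request's messages in place."""
        if self.messages and self.messages[0].get("role") == "system":
            existing = self.messages[0].get("content") or ""
            self.messages[0]["content"] = (
                f"{existing}\n\n{system_content}" if existing else system_content
            )
        else:
            self.messages.insert(0, {"role": "system", "content": system_content})
```

This avoids rebuilding the whole list and keeps the mutation next to the data it changes.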
```diff
 async def generate_search_query_from_llm(
-    user_message: str, model_name: str, client
+    user_message: str, model_name: str, client, *, topic: str | None = None
```
`client` has no type annotation in any of the function signatures.
```python
    logger.info("Analyze web search topics start model=%s", model_name)
    try:
        response = await client.chat.completions.create(**req)
```
When we do chat completions we use QueryLogManager.log_query to track completion-token usage; you should track the usage of this call somehow and pass it to the response object we then use. The new PR introduces the ability to track the number of web searches, and we would also need to track the number of queries made.
```diff
     if not content:
-        logger.error("LLM returned empty search query")
-        raise RuntimeError("LLM returned an empty search query")
+        logger.warning("LLM returned empty search query; falling back to user input")
+        content = _sanitize_query(user_message)
```
I think this comes from the fact that GPT OSS sometimes fails to answer. I would include `reasoning: low` in the system prompt to avoid this as much as possible.
jcabrero left a comment
In the interest of time, it's better for you to have the ability to merge. I approve the PR, but please make sure to apply all the changes.
…ystem content management
… type annotations
This pull request significantly enhances the web search capabilities by introducing topic-based multi-query web search, improving result enrichment with main page content extraction, and refactoring message handling for better integration with LLM-driven workflows.
- New agent that detects key information in the user prompt and determines whether a web search is necessary
- Launches X web searches and compiles them to obtain the final result