-
Notifications
You must be signed in to change notification settings - Fork 10
feat: integrate Brave Search API #140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
836fb6c to
e1f9180
Compare
jcabrero
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we're in a bit of a rush here today. The PR in itself seems to be working in the general functionality, though there are some errors where the rate limits are not the expected.
The changes to the unit tests are not essential now.
If it is tremendously urgent, we merge it, but you should come back to it and address all the changes.
| brave_rate_limiter = AsyncLimiter(WEB_SEARCH_API_RPS, 1) | ||
|
|
||
| def perform_web_search_sync(query: str) -> WebSearchContext: | ||
| """Synchronously query Brave and build a contextual prompt. | ||
| _http_client: httpx.AsyncClient | None = None | ||
| _client_lock = asyncio.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the maximum concurrency limit for Brave API is WEB_SEARCH_API_RPS, then you should consider that there will be a single brave_rate_limiter per gunicorn worker. This means that, if there are 20 gunicorn workers and each has a WEB_SEARCH_API_RPS of 30, there may be up to 20 * 30 = 600 requests per second, and 20 happening concurrently from the different processes.
I would consider adding this to the request or the app state in FastAPI so that it is shared among the different processes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now manage this with a Redis bucket. It also performs a check directly at the chat.completion level rather than via a web search query as this is easier to manage. We ensure that the number of queries with web search enabled does not exceed max_concurrent_requests//count so that we don't exceed the 20 queries/s allowed by the Brave API.
|
|
||
| def perform_web_search_sync(query: str) -> WebSearchContext: | ||
| """Synchronously query Brave and build a contextual prompt. | ||
| _http_client: httpx.AsyncClient | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _http_client: httpx.AsyncClient | None = None | |
| _http_client: Optional[httpx.AsyncClient] = None |
| # Brave Search API configuration | ||
| BRAVE_SEARCH_API: str | None = os.getenv("BRAVE_SEARCH_API") | ||
|
|
||
| # Web Search API configuration | ||
| WEB_SEARCH_API_PATH: Dict[str, str] = { | ||
| "web": "https://api.search.brave.com/res/v1/web/search", | ||
| } | ||
| WEB_SEARCH_COUNT: int = 3 | ||
| WEB_SEARCH_LANG: str = "en" | ||
| WEB_SEARCH_COUNTRY: str = "us" | ||
| WEB_SEARCH_TIMEOUT: float = 20.0 | ||
| WEB_SEARCH_API_MAX_CONCURRENT_REQUESTS: int = 20 | ||
| WEB_SEARCH_API_RPS: int = 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'd love this to be a single type called
WebSearchSettingswith each field being one member. - The
WEB_SEARCH_API_PATH, it's not great to haveDict[str, str]because it is just one element and you don't seem to use more than one.
| # Brave Search API configuration | |
| BRAVE_SEARCH_API: str | None = os.getenv("BRAVE_SEARCH_API") | |
| # Web Search API configuration | |
| WEB_SEARCH_API_PATH: Dict[str, str] = { | |
| "web": "https://api.search.brave.com/res/v1/web/search", | |
| } | |
| WEB_SEARCH_COUNT: int = 3 | |
| WEB_SEARCH_LANG: str = "en" | |
| WEB_SEARCH_COUNTRY: str = "us" | |
| WEB_SEARCH_TIMEOUT: float = 20.0 | |
| WEB_SEARCH_API_MAX_CONCURRENT_REQUESTS: int = 20 | |
| WEB_SEARCH_API_RPS: int = 20 | |
| # Brave Search API configuration | |
| BRAVE_SEARCH_API: Optional[str] = os.getenv("BRAVE_SEARCH_API") | |
| # Web Search API configuration | |
| WEB_SEARCH_API_PATH: Dict[str, str] = { | |
| "web": "https://api.search.brave.com/res/v1/web/search", | |
| } | |
| WEB_SEARCH_COUNT: int = 3 | |
| WEB_SEARCH_LANG: str = "en" | |
| WEB_SEARCH_COUNTRY: str = "us" | |
| WEB_SEARCH_TIMEOUT: float = 20.0 | |
| WEB_SEARCH_API_MAX_CONCURRENT_REQUESTS: int = 20 | |
| WEB_SEARCH_API_RPS: int = 20 |
| raise HTTPException( | ||
| status_code=status.HTTP_400_BAD_REQUEST, | ||
| detail="Web search requested with an empty query", | ||
| status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be 503 SERVICE NOT AVAILABLE, because it is not an internal server error but rather that the websearch service is not available.
| params = { | ||
| "q": q, | ||
| "summary": 1, | ||
| "count": WEB_SEARCH_COUNT, | ||
| "country": WEB_SEARCH_COUNTRY, | ||
| "lang": WEB_SEARCH_LANG, | ||
| } | ||
| headers = { | ||
| "X-Subscription-Token": BRAVE_SEARCH_API, | ||
| "Api-Version": "2023-10-11", | ||
| "Accept": "application/json", | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: With q being the only thing that you need to produce at runtime (because the rest you know beforehand), you could decide to cache params and headers as a global variable. Though not tremendous in speedup.
| async def _get_http_client() -> httpx.AsyncClient: | ||
| """Get or create a shared HTTP client instance to avoid creating it on every request. | ||
| Returns: | ||
| An AsyncClient configured with timeouts and connection limits | ||
| """ | ||
| if not query or not query.strip(): | ||
| global _http_client | ||
| async with _client_lock: | ||
| if _http_client is None: | ||
| _http_client = httpx.AsyncClient( | ||
| timeout=WEB_SEARCH_TIMEOUT, | ||
| limits=httpx.Limits( | ||
| max_connections=WEB_SEARCH_API_MAX_CONCURRENT_REQUESTS | ||
| ), | ||
| ) | ||
| return _http_client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This is not bad, but instead of doing global, which can be considered not great practice, I would instead do something like @lru_cache(max_size=1) which permits there to exist a single http_client. Then you simplify the behaviour of the function just creating one function.
tests/e2e/test_openai.py
Outdated
| else: | ||
| print("All retries failed.") | ||
| raise last_exception | ||
| pytest.skip("web_search could not be validated due to environment") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it ever get here? Otherwise, what is the purpose? I can't really make full sense of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Brave API key was not set in the CI pipeline and the test was breaking. Now that it's integrated, I've removed it
| assert "[1]" in ctx.prompt | ||
| assert "[2]" in ctx.prompt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is testing for the substring "[1]" and "[2]" being in the prompt? Do not really understand.
| @pytest.mark.asyncio | ||
| async def test_perform_web_search_async_success(): | ||
| """Test successful web search with mock response""" | ||
| mock_data = { | ||
| "web": { | ||
| "results": [ | ||
| { | ||
| "title": "Latest AI Developments", | ||
| "description": "OpenAI announces GPT-5 with improved capabilities and better performance across various tasks.", | ||
| "url": "https://example.com/ai1", | ||
| }, | ||
| { | ||
| "title": "AI Breakthrough in Robotics", | ||
| "description": "New neural network architecture improves robot learning efficiency by 40% in recent studies.", | ||
| "url": "https://example.com/ai2", | ||
| }, | ||
| ] | ||
| } | ||
| } | ||
|
|
||
| with ( | ||
| patch("nilai_api.handlers.web_search.BRAVE_SEARCH_API", "test-key"), | ||
| patch( | ||
| "nilai_api.handlers.web_search._make_brave_api_request", | ||
| return_value=mock_data, | ||
| ), | ||
| ): | ||
| ctx = await perform_web_search_async("AI developments") | ||
|
|
||
| assert ctx.sources is not None | ||
| assert len(ctx.sources) > 0 | ||
| assert "GPT-5" in ctx.prompt | ||
| assert "40%" in ctx.prompt | ||
| assert ctx.sources[0].source == "https://example.com/ai1" | ||
| assert ctx.sources[1].source == "https://example.com/ai2" | ||
| assert "[1]" in ctx.prompt | ||
| assert "[2]" in ctx.prompt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not fully sure this test does much as you're not even keeping the format of what websearch would return as it would return an api_modelresult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would check all or most of these changes again. Some of the tests because you mocking values that do not really have to do with the SearchResult API model, I am not even sure they make sense because it's like I do a test where I say: "I mock sum function always returns the string '3' and I check if it returns 3. By itself, it is checking pretty much nothing. What you should try to mock is the Brave API calls that you make, and make sure that returns something that you mock. In that way, you make sure that the functionality you plug is doing what it should do.
35a32bd to
83b6d46
Compare
jcabrero
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one last comment, that you may decide to keep. The AsyncLimiter is still in web_search.py. Can it conflict in any way with the Redis rate limiter? Is it really required?
In any case, I think the PR looks much much better now. Great job! 💪
Refactors the API’s web search integration by migrating from DuckDuckGo to Brave Search and adding async support
aiolimiterfor rate limiting; updates config and dependencies.httpx; robust error handling; parses results into newSearchResultmodel.SearchResultfor structured output; updates public API exports.