Add azure provider #3526

base: master
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Thanks for the contribution. 🎉 Can you please move the provider into a `contrib` folder?
Hi @jkppr ! I made the change, moving the provider into the dedicated contrib folder. About the CLA, I'm a little blocked by our company process. I'll try to fix it quickly and will send a message when it's sorted 👌 Have a great day!
Thanks for making the changes. Is the code ready for review? If yes, can you mark the PR as ready? Then I'll give it a quick review. If all is good and the CLA is sorted, we can merge it.
Force-pushed from c71c72e to 2d2f2d1
I've applied the Black formatting requested by your GitHub Actions for azureai.py. Do you need a final rebase to make just one clean commit?
Please take a look at the linter issues.
Once the CLA is sorted we can review the code, and when all is approved, we will merge in the latest changes from the master branch and squash & merge.
Force-pushed from d32b219 to ce4518d
- Move azureai provider into contrib folder
- Format azureai.py to be compliant with Black formatting
- Fix pylint feedback
Force-pushed from ce4518d to 0df7977
Finally... I just hadn't linked my personal profile with my work email address. Apologies!
/Gemini summary
Summary of Changes

This pull request significantly enhances Timesketch's large language model capabilities by adding a new provider for Azure AI. This integration allows Timesketch to leverage Azure OpenAI services, expanding the range of available LLM backends for features like summarization and other text-based analyses. The change is implemented modularly, making it easy to extend with future providers.

Highlights

Changelog
/Gemini review |
Code Review

This pull request introduces a new Azure AI provider for LLM functionality. While the initiative is great, the current implementation has several issues that need to be addressed. My review focuses on improving code quality, robustness, and adherence to the project's style guide. The feedback includes fixing incorrect type hints, addressing potential runtime errors like `UnboundLocalError` and unsafe dictionary access, ensuring consistency with the base provider class, and improving documentation. Addressing these points will make the new provider more reliable and maintainable.
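To illustrate the two runtime risks named above, here is a minimal sketch (hypothetical parsing helpers, not the PR's actual code) showing how an `UnboundLocalError` and unsafe dictionary access can arise when parsing an API response, and one defensive alternative:

```python
def parse_response_unsafe(response: dict) -> str:
    # Pitfall 1: 'text' is only bound inside the if-branch, so the return
    # below raises UnboundLocalError when 'choices' is an empty list.
    if response["choices"]:  # Pitfall 2: KeyError if 'choices' is absent.
        text = response["choices"][0]["message"]["content"]
    return text


def parse_response_safe(response: dict) -> str:
    # Initialise before branching and use .get() with defaults so neither
    # a missing key nor an empty list can crash the provider.
    text = ""
    choices = response.get("choices") or []
    if choices:
        text = choices[0].get("message", {}).get("content", "")
    return text
```

The safe variant degrades to an empty string on malformed payloads; a real provider would likely log or raise a descriptive error instead.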
```python
"""Azure Ai Foundry LLM provider."""

import json
from typing import Optional, Any
```
```python
            "endpoint, api_key, and model are required for AzureAI provider"
        )

    def generate(self, prompt: str, response_schema: Optional[dict] = None) -> dict:
```
The return type hint for this function is `dict`, but it can also return a `str` on line 54. This is inconsistent and can lead to type errors. The return type should be `Union[dict, str]`.

Additionally, the base class `LLMProvider.generate` is type-hinted to return `str`. Overriding it with a different return type violates the Liskov Substitution Principle, which could cause issues in code that uses the provider through the common interface.

This method is also missing a docstring. Please add a Google-style docstring as per the style guide.1
```diff
-    def generate(self, prompt: str, response_schema: Optional[dict] = None) -> dict:
+    def generate(self, prompt: str, response_schema: Optional[dict] = None) -> Union[dict, str]:
```
Style Guide References

Footnotes

1. The style guide requires docstrings for functions. ↩
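The `Union[dict, str]` signature Gemini suggests could look like the following sketch. This is a standalone illustration of the control flow, not the PR's actual method: the model reply is faked so the branch on `response_schema` can be shown end to end.

```python
import json
from typing import Optional, Union


def generate(prompt: str, response_schema: Optional[dict] = None) -> Union[dict, str]:
    """Generate text, returning a dict only when a response schema is requested.

    Args:
        prompt: The text prompt to send to the model.
        response_schema: Optional JSON schema the answer should conform to.

    Returns:
        A dict parsed from the model's JSON output when response_schema is
        set, otherwise the raw text answer as a str.
    """
    raw_text = '{"summary": "example"}'  # stand-in for the model's reply
    if response_schema:
        # Structured mode: the model was asked for JSON, so parse it.
        return json.loads(raw_text)
    # Plain mode: return the text as-is, matching the base class contract.
    return raw_text
```

Keeping the plain-text path as the default preserves compatibility with callers typed against the base class.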
@JeremyTar Is there a specific reason you changed the return type to dict instead of string?
If I'm not mistaken, this would make the azureai provider incompatible with existing features like Event summary or query generation? (Maybe you tested it and it works?)
Would it be an option to return only the LLM text answer from the dict, like it's done in the aistudio provider for example?
I'm a little confused about this point 😅 We mainly use the summarization feature, and we don't have the opportunity to use the other providers in our context.

I based the Azure provider on the other providers. Despite initial tests returning a string via `response_data = response.json()["choices"][0]["message"]["content"]`, I encountered JSON type errors.

Currently, the provider works for us; to do this, we add the key expected by the `response_schema`.

I'd appreciate your opinions, because I can't understand how the Ollama provider works, for example. I probably misunderstood the use of `response_schema` or of `json.loads()`?
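One way to reconcile the two behaviours, sketched under the assumption that the API payload has the `choices[0].message.content` shape quoted above (the helper name is hypothetical, not Timesketch code): extract only the LLM text, and apply `json.loads` to that text only when a schema was requested.

```python
import json
from typing import Optional, Union


def extract_answer(
    api_response: dict, response_schema: Optional[dict] = None
) -> Union[dict, str]:
    # Pull only the LLM text out of the full API payload, as suggested
    # for the aistudio provider above.
    content = api_response["choices"][0]["message"]["content"]
    if response_schema is None:
        # Plain-text features (e.g. summarization) get the raw string.
        return content
    # When a schema was requested, the model should have emitted JSON text;
    # json.loads turns that text into a dict. It raises json.JSONDecodeError
    # (a ValueError) if the model returned prose instead of JSON.
    return json.loads(content)
```

This keeps the full API envelope out of the return value, so callers never see Azure-specific keys.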
```python
class AzureAI(interface.LLMProvider):
    """AzureAI class for interacting with the Azure OpenAI API via Timesketch."""
```
The class docstring should follow the Google style guide as specified in the project's style guide.1 It should be more descriptive, explaining what the class does and what configuration it expects.

For example:

```python
"""AzureAI provider for Timesketch.

This provider uses the Azure OpenAI API to generate text.
It requires the endpoint, api_key, and model to be configured.
"""
```

Style Guide References

Footnotes

1. The style guide requires using Google-style docstrings. ↩
```python
    NAME = "azureai"

    def __init__(self, config: dict, **kwargs: Any):
```
This `__init__` method is missing a docstring. According to the style guide, all functions should have a docstring explaining their purpose, arguments, and any exceptions they might raise.1

Please add a Google-style docstring, for example:

```python
"""Initializes the AzureAI provider.

Args:
    config (dict): A dictionary of provider-specific configuration options.
    **kwargs (Any): Additional arguments passed to the base class.

Raises:
    ValueError: If required configuration keys (endpoint, api_key, model)
        are missing.
"""
```

Style Guide References

Footnotes

1. The style guide requires docstrings for functions, following the Google style. ↩
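Putting the docstring and the validation from the diff hunk together, a constructor along these lines would satisfy the review comment. This is an illustrative skeleton (the class name and attribute layout are assumptions, not the PR's code); only the required-keys error message is taken from the diff above.

```python
from typing import Any


class AzureAISketch:
    """Illustrative provider skeleton showing config validation."""

    def __init__(self, config: dict, **kwargs: Any):
        """Initializes the provider.

        Args:
            config (dict): Provider-specific configuration options.
            **kwargs (Any): Additional arguments (ignored in this sketch).

        Raises:
            ValueError: If endpoint, api_key, or model is missing.
        """
        # Collect every missing or empty required key before raising, so the
        # error reports the full requirement rather than failing one at a time.
        missing = [k for k in ("endpoint", "api_key", "model") if not config.get(k)]
        if missing:
            raise ValueError(
                "endpoint, api_key, and model are required for AzureAI provider"
            )
        self.config = config
```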
Please have a look at the unresolved comments by Gemini.
I'll leave the second round of review to @itsmvd
```python
                f"Error JSON parsing text: {formatted_data}: {error}"
            ) from error

        return response_data
```
Is this `response_data` by default a string?
If yes, let's add the Union type hint as recommended by the Gemini code review, to ensure we can return a dict in case a `response_schema` is used and a str by default.
I've taken note of all the comments and will start working on them tomorrow.
This PR adds a simple new provider to use LLM functionality in Timesketch, adding a new file at:

/timesketch/timesketch/lib/llm/providers/contrib/azureai.py

Currently, we use it in our context with the summary feature; we need to improve it with other LLM features and pass all the tests required for a good PR.
Checks

Closing issues

closes #3522

- Add azureai provider for LLM features

Nice day!