-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add code for AstraDB #197
Add code for AstraDB #197
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you have any questions or if there is any clarification needed.
Thanks for contributing.
mirascope/astra/types.py
Outdated
application_token: str = "your token here" | ||
collection_name: str = "your collection name here" | ||
|
||
class AstraParams(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to extend BaseVectorStoreParams, this will make it easier for us to integrate with tools such as weave
and logfire
easier.
mirascope/astra/vectorstores.py
Outdated
limit=kwargs.get('limit', 10), # Example of additional parameter | ||
fields=["text", "source"] | ||
) | ||
return AstraQueryResult.convert(results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove convert and construct the pydantic model here.
return AstraQueryResult(documents=..., embeddings=...)
mirascope/astra/vectorstores.py
Outdated
Each document is expected to have text, embeddings, and a source. | ||
""" | ||
if not text: | ||
logging.error("No text provided for addition.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can raise ValueError here instead of logging, user can then try except as necessary and handle it themselves with logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but also the arguments requires text, so maybe unnecessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is not necessary since the typing requires text. If not provided, this will throw an error anyway that the user needs to resolve rather than log.
pyproject.toml
Outdated
@@ -28,6 +28,7 @@ mistralai = { version = ">=0.1.6,<1.0.0", optional = true } | |||
groq = { version = ">=0.4.2,<1.0.0", optional = true } | |||
cohere = { version = "^5.2.5", optional = true } | |||
pinecone-client = { version = "^3.2.2", optional = true } | |||
astrapy = "^1.1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add astra as an optional dependency
poetry add --optional astrapy
Then update [tool.poetry.extras]
with
astrapy = ["astrapy"]
and update all
to include astrapy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is missing tests. The tests should be located in a tests/astra
directory.
Please note that all code should be tested with 100% coverage. Ideally we also test additional flows beyond just coverage, but this is not required for a PR (and we will add additional tests should bugs be discovered in the future).
To check coverage, the easiest way is to run the following command from the root directory:
poetry run pytest tests --cov=./ --cov-report-html
This will produce a directory named htmlcov
inside of which there will be an index.html
file that is interactive and shows you all uncovered lines by file.
I recommend looking in the tests/
directory for a reference of how to write tests. For this PR, I would specifically look at the chroma tests and the pinecone tests.
If you have any questions, let us know! We're here to help.
mirascope/astra/vectorstores.py
Outdated
Each document is expected to have text, embeddings, and a source. | ||
""" | ||
if not text: | ||
logging.error("No text provided for addition.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is not necessary since the typing requires text. If not provided, this will throw an error anyway that the user needs to resolve rather than log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add docstrings to the module and all public classes and functions. They should follow the Google docstring style
mirascope/astra/vectorstores.py
Outdated
"""A vector store for AstraDB. | ||
|
||
Example: | ||
from mirascope.openai import OpenAIEmbedder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please wrap the example in a python code block, e.g.
Example:
```python
{code here}
```
@@ -31,6 +31,7 @@ pinecone-client = { version = "^3.2.2", optional = true } | |||
logfire = { version = ">=0.26.0,<1.0.0", optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please include astrapy
with the correct version and optional = true
like we do for other optional dependencies.
Allows use of AstraDB, similar to other databases already included.
Test plan: Use the example code in mirascope/astra/vectorstore.py