Commit bd23226

Merge main

2 parents f14dae8 + b280672

30 files changed: +863 -183 lines

CONTRIBUTING.md (+1 -1)

````diff
@@ -16,7 +16,7 @@ Then install the package through poetry:
 Note - You may need to install poetry. See [here](https://python-poetry.org/docs/#installing-with-the-official-installer)

 ```bash
-poetry install --with test
+poetry install --with dev
 ```

 ## Testing
````
(+34)

````diff
@@ -0,0 +1,34 @@
+---
+sidebar_position: 9
+---
+
+# dsp.Mistral
+
+### Usage
+
+```python
+lm = dsp.Mistral(model='mistral-medium-latest', api_key="your-mistralai-api-key")
+```
+
+### Constructor
+
+The constructor initializes the base class `LM` and verifies the `api_key`, either provided directly or defined through the `MISTRAL_API_KEY` environment variable.
+
+```python
+class Mistral(LM):
+    def __init__(
+        self,
+        model: str = "mistral-medium-latest",
+        api_key: Optional[str] = None,
+        **kwargs,
+    ):
+```
+
+**Parameters:**
+- `model` (_str_): Mistral AI pretrained model name. Defaults to `mistral-medium-latest`.
+- `api_key` (_Optional[str]_, _optional_): API key for authenticating with Mistral AI. Defaults to None.
+- `**kwargs`: Additional language model arguments to pass to the API provider.
+
+### Methods
+
+Refer to [`dspy.Mistral`](#) documentation.
````
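The `api_key` fallback described in the constructor notes above can be sketched in plain Python. Here `resolve_mistral_api_key` is a hypothetical helper for illustration only; the real check lives inside the `Mistral` constructor.

```python
import os
from typing import Optional

def resolve_mistral_api_key(api_key: Optional[str] = None) -> str:
    """Return an explicit key if given, else fall back to MISTRAL_API_KEY.

    Mirrors the fallback the `Mistral` constructor performs; raises if
    neither source provides a key.
    """
    key = api_key or os.environ.get("MISTRAL_API_KEY")
    if not key:
        raise ValueError("No API key provided and MISTRAL_API_KEY is not set.")
    return key
```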

docs/api/retrieval_model_clients/AzureCognitiveSearch.md (+7 -2)

````diff
@@ -6,7 +6,7 @@ sidebar_position: 3

 ### Constructor

-The constructor initializes an instance of the `AzureCognitiveSearch` class and sets up parameters for sending queries and retreiving results with the Azure Cognitive Search server.
+The constructor initializes an instance of the `AzureCognitiveSearch` class and sets up parameters for sending queries and retrieving results with the Azure Cognitive Search server.

 ```python
 class AzureCognitiveSearch:
@@ -21,6 +21,7 @@ class AzureCognitiveSearch:
 ```

 **Parameters:**
+
 - `search_service_name` (_str_): Name of Azure Cognitive Search server.
 - `search_api_key` (_str_): API Authentication token for accessing Azure Cognitive Search server.
 - `search_index_name` (_str_): Name of search index in the Azure Cognitive Search server.
@@ -31,4 +32,8 @@ class AzureCognitiveSearch:

 Refer to [ColBERTv2](/api/retrieval_model_clients/ColBERTv2) documentation. Keep in mind there is no `simplify` flag for AzureCognitiveSearch.

-AzureCognitiveSearch supports sending queries and processing the received results, mapping content and scores to a correct format for the Azure Cognitive Search server.
+AzureCognitiveSearch supports sending queries and processing the received results, mapping content and scores to a correct format for the Azure Cognitive Search server.
+
+### Deprecation Notice
+
+This module is scheduled for removal in future releases. Please use the `AzureAISearchRM` class from `dspy.retrieve.azureaisearch_rm` instead. For more information, refer to the updated documentation (docs/docs/deep-dive/retrieval_models_clients/Azure.mdx).
````

docs/api/retrieval_model_clients/ChromadbRM.md (+2 -2)

````diff
@@ -41,7 +41,7 @@ Search the chromadb collection for the top `k` passages matching the given query
 ChromadbRM have the flexibility from a variety of embedding functions as outlined in the [chromadb embeddings documentation](https://docs.trychroma.com/embeddings). While different options are available, this example demonstrates how to utilize OpenAI embeddings specifically.

 ```python
-from dspy.retrieve.chroma_rm import ChromadbRM
+from dspy.retrieve.chromadb_rm import ChromadbRM
 import os
 import openai
 from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
@@ -62,4 +62,4 @@ results = retriever_model("Explore the significance of quantum computing", k=5)

 for result in results:
     print("Document:", result.long_text, "\n")
-```
+```
````
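As background for what the `k` argument in the ChromadbRM example controls, top-k retrieval by embedding similarity can be sketched generically. This is an illustrative stand-in only, not ChromadbRM's implementation, which delegates the search to chromadb's index.

```python
import math

def top_k_passages(query_vec, passage_vecs, passages, k=5):
    """Rank passages by cosine similarity to the query embedding and
    return the k best. Purely illustrative; real retrievers use an index."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm

    # Pair each passage with its vector, sort by similarity, keep the top k.
    scored = sorted(
        zip(passage_vecs, passages),
        key=lambda vp: cosine(query_vec, vp[0]),
        reverse=True,
    )
    return [p for _, p in scored[:k]]
```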

docs/docs/building-blocks/1-language_models.md (+7 -6)

````diff
@@ -18,10 +18,6 @@ For example, to use OpenAI language models, you can do it as follows.
 gpt3_turbo = dspy.OpenAI(model='gpt-3.5-turbo-1106', max_tokens=300)
 dspy.configure(lm=gpt3_turbo)
 ```
-**Output:**
-```text
-['Hello! How can I assist you today?']
-```

 ## Directly calling the LM.

@@ -31,11 +27,16 @@ You can simply call the LM with a string to give it a raw prompt, i.e. a string.
 gpt3_turbo("hello! this is a raw prompt to GPT-3.5")
 ```

+**Output:**
+```text
+['Hello! How can I assist you today?']
+```
+
 This is almost never the recommended way to interact with LMs in DSPy, but it is allowed.

 ## Using the LM with DSPy signatures.

-You can also use the LM via DSPy [signatures] and [modules], which we discuss in more depth in the remaining guides.
+You can also use the LM via DSPy [`signature` (input/output spec)](https://dspy-docs.vercel.app/docs/building-blocks/signatures) and [`modules`](https://dspy-docs.vercel.app/docs/building-blocks/modules), which we discuss in more depth in the remaining guides.

 ```python
 # Define a module (ChainOfThought) and assign it a signature (return an answer, given a question).
@@ -172,4 +173,4 @@ model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1'
 model_path = 'dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so'

 llama = dspy.ChatModuleClient(model=model, model_path=model_path)
-```
+```
````
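The relocated **Output** block in this hunk reflects that calling a DSPy LM client directly returns a list of completion strings. A toy stand-in makes that shape concrete; `EchoLM` is hypothetical and for illustration only, not a DSPy class.

```python
class EchoLM:
    """Toy LM client: called with a raw prompt, it returns a list of
    completion strings, mirroring the ['Hello! ...'] shape shown above."""
    def __init__(self, n: int = 1):
        self.n = n  # number of completions to return per call

    def __call__(self, prompt: str):
        return [f"echo: {prompt}"] * self.n
```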

docs/docs/building-blocks/2-signatures.md (+2 -2)

````diff
@@ -6,7 +6,7 @@ sidebar_position: 2

 When we assign tasks to LMs in DSPy, we specify the behavior we need as a Signature.

-**A signature is a declarative specification of input/output behavior of a DSPy module.** Signatures allow you tell the LM _what_ it needs to do, rather than specify _how_ we should ask the LM to do it.
+**A signature is a declarative specification of input/output behavior of a DSPy module.** Signatures allow you to tell the LM _what_ it needs to do, rather than specify _how_ we should ask the LM to do it.


 You're probably familiar with function signatures, which specify the input and output arguments and their types. DSPy signatures are similar, but the differences are that:
@@ -157,4 +157,4 @@ Prediction(

 While signatures are convenient for prototyping with structured inputs/outputs, that's not the main reason to use them!

-You should compose multiple signatures into bigger [DSPy modules] and [compile] these modules into optimized prompts and finetunes.
+You should compose multiple signatures into bigger [DSPy modules](https://dspy-docs.vercel.app/docs/building-blocks/modules) and [compile these modules into optimized prompts](https://dspy-docs.vercel.app/docs/building-blocks/optimizers#what-does-a-dspy-optimizer-tune-how-does-it-tune-them) and finetunes.
````
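Short-form signatures such as `"question -> answer"` can be split mechanically into input and output field names. The sketch below is a simplified illustration, not DSPy's actual parser, which also handles types, multiple fields, and instructions.

```python
def parse_signature(sig: str):
    """Split a short-form signature like "in1, in2 -> out1" into
    (input_fields, output_fields). Simplified sketch for illustration."""
    left, right = sig.split("->")
    fields = lambda side: [f.strip() for f in side.split(",") if f.strip()]
    return fields(left), fields(right)
```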

docs/docs/building-blocks/4-data.md (+1 -1)

````diff
@@ -78,7 +78,7 @@ input_key_only = article_summary.inputs()
 non_input_key_only = article_summary.labels()

 print("Example object with Input fields only:", input_key_only)
-print("Example object with Non-Input fields only:", non_input_key_only))
+print("Example object with Non-Input fields only:", non_input_key_only)
 ```

 **Output**
````
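The `inputs()`/`labels()` split used in this snippet partitions an Example's fields by its declared input keys. A minimal sketch of the idea follows; `MiniExample` is hypothetical, and the real `dspy.Example` returns Example objects rather than plain dicts.

```python
class MiniExample:
    """Simplified sketch of dspy.Example's inputs()/labels() split."""
    def __init__(self, **fields):
        self._fields = fields
        self._input_keys = set()

    def with_inputs(self, *keys):
        # Declare which fields count as inputs; the rest become labels.
        self._input_keys = set(keys)
        return self

    def inputs(self):
        return {k: v for k, v in self._fields.items() if k in self._input_keys}

    def labels(self):
        return {k: v for k, v in self._fields.items() if k not in self._input_keys}
```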

docs/docs/building-blocks/solving_your_task.md (+1 -1)

````diff
@@ -8,7 +8,7 @@ Using DSPy well for solving a new task is just doing good machine learning with

 What this means is that it's an iterative process. You make some initial choices, which will be sub-optimal, and then you refine them incrementally.

-As we discuss below, you will define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers [(`modules`)](https://dspy-docs.vercel.app/docs/building-blocks/modules) to use, giving each layer a [`signature` (input/output spec)](https://dspy-docs.vercel.app/docs/building-blocks/signatures), and then calling your modules freely in your Python code. Lastly, you use a DSPy [`optimizer`]https://dspy-docs.vercel.app/docs/building-blocks/optimizers) to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.
+As we discuss below, you will define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers [(`modules`)](https://dspy-docs.vercel.app/docs/building-blocks/modules) to use, giving each layer a [`signature` (input/output spec)](https://dspy-docs.vercel.app/docs/building-blocks/signatures), and then calling your modules freely in your Python code. Lastly, you use a DSPy [`optimizer`](https://dspy-docs.vercel.app/docs/building-blocks/optimizers) to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.


 ## 1) Define your task.
````

docs/docs/deep-dive/data-handling/examples.mdx (+1 -1)

````diff
@@ -68,7 +68,7 @@ input_key_only = article_summary.inputs()
 non_input_key_only = article_summary.labels()

 print("Example object with Input fields only:", input_key_only)
-print("Example object with Non-Input fields only:", non_input_key_only))
+print("Example object with Non-Input fields only:", non_input_key_only)
 ```

 **Output**
````

docs/docs/deep-dive/retrieval_models_clients/Azure.mdx (+87 -37)

````diff
@@ -4,80 +4,130 @@ sidebar_position: 2

 import AuthorDetails from '@site/src/components/AuthorDetails';

-# AzureCognitiveSearch
+# AzureAISearch
+
+A retrieval module that utilizes Azure AI Search to retrieve top passages for a given query.

 ## Prerequisites

 ```bash
-pip install azure-core
+pip install azure-search-documents
 ```

-## Setting up the Azure Client
-
-The constructor initializes an instance of the `AzureCognitiveSearch` class and sets up parameters for sending queries and retrieving results with the Azure Cognitive Search server.
-
-- `search_service_name` (_str_): Name of Azure Cognitive Search server.
-- `search_api_key` (_str_): API Authentication token for accessing Azure Cognitive Search server.
-- `search_index_name` (_str_): Name of search index in the Azure Cognitive Search server.
-- `field_text` (_str_): Field name that maps to DSP "content" field.
-- `field_score` (_str_): Field name that maps to DSP "score" field.
+## Setting up the AzureAISearchRM Client
+
+The constructor initializes an instance of the `AzureAISearchRM` class and sets up parameters for sending queries and retrieving results with the Azure AI Search server.
+
+- `search_service_name` (str): The name of the Azure AI Search service.
+- `search_api_key` (str): The API key for accessing the Azure AI Search service.
+- `search_index_name` (str): The name of the search index in the Azure AI Search service.
+- `field_text` (str): The name of the field containing text content in the search index. This field will be mapped to the "content" field in the dsp framework.
+- `k` (int, optional): The default number of top passages to retrieve. Defaults to 3.
+- `semantic_ranker` (bool, optional): Whether to use semantic ranking. Defaults to False.
+- `filter` (str, optional): Additional filter query. Defaults to None.
+- `query_language` (str, optional): The language of the query. Defaults to "en-Us".
+- `query_speller` (str, optional): The speller mode. Defaults to "lexicon".
+- `use_semantic_captions` (bool, optional): Whether to use semantic captions. Defaults to False.
+- `query_type` (Optional[QueryType], optional): The type of query. Defaults to QueryType.FULL.
+- `semantic_configuration_name` (str, optional): The name of the semantic configuration. Defaults to None.
+
+Available query types:
+
+- `SIMPLE`: Uses the simple query syntax for searches. Search text is interpreted using a simple query language that allows for symbols such as +, * and "". Queries are evaluated across all searchable fields by default, unless the searchFields parameter is specified.
+- `FULL`: Uses the full Lucene query syntax for searches. Search text is interpreted using the Lucene query language, which allows field-specific and weighted searches, as well as other advanced features.
+- `SEMANTIC`: Best suited for queries expressed in natural language as opposed to keywords. Improves precision of search results by re-ranking the top search results using a ranking model trained on the Web corpus.
+
+More details: https://learn.microsoft.com/en-us/azure/search/search-query-overview
+
+Example of the AzureAISearchRM constructor:

 ```python
-class AzureCognitiveSearch:
-    def __init__(
-        self,
-        search_service_name: str,
-        search_api_key: str,
-        search_index_name: str,
-        field_text: str,
-        field_score: str, # required field to map with "score" field in dsp framework
-    ):
+AzureAISearchRM(
+    search_service_name: str,
+    search_api_key: str,
+    search_index_name: str,
+    field_text: str,
+    k: int = 3,
+    semantic_ranker: bool = False,
+    filter: str = None,
+    query_language: str = "en-Us",
+    query_speller: str = "lexicon",
+    use_semantic_captions: bool = False,
+    query_type: Optional[QueryType] = QueryType.FULL,
+    semantic_configuration_name: str = None
+)
 ```

 ## Under the Hood

-### `__call__(self, query: str, k: int = 10) -> Union[list[str], list[dotdict]]:`
+### `forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None) -> dspy.Prediction`

 **Parameters:**
-- `query` (_str_): Search query string used for retrieval sent to Azure Cognitive Search service.
-- `k` (_int_, _optional_): Number of passages to retrieve. Defaults to 10.
+
+- `query_or_queries` (Union[str, List[str]]): The query or queries to search for.
+- `k` (_Optional[int]_, _optional_): The number of results to retrieve. If not specified, defaults to the value set during initialization.

 **Returns:**
-- `Union[list[str], list[dotdict]]`: list of top-k search results

-Internally, the method handles the specifics of preparing the request query to the Azure Cognitive Search service and corresponding payload to obtain the response.
+- `dspy.Prediction`: Contains the retrieved passages, each represented as a `dotdict` with a `long_text` attribute.

-The method sends a query and number of desired passages (k) to Azure Cognitive Search using `azure_search_request`. This function communicates with Azure and processes the search results as a list of dictionaries.
+Internally, the method handles the specifics of preparing the request query to the Azure AI Search service and corresponding payload to obtain the response.

-This is then converted to `dotdict` objects that internally map the retrieved content and scores, listed by descending order of relevance.
+The function handles the retrieval of the top-k passages based on the provided query.

-## Sending Retrieval Requests via Azure Client
-1) _**Recommended**_ Configure default RM using `dspy.configure`.
+## Sending Retrieval Requests via AzureAISearchRM Client
+
+1. _**Recommended**_ Configure default RM using `dspy.configure`.

 This allows you to define programs in DSPy and have DSPy internally conduct retrieval using `dsp.retrieve` on the query on the configured RM.

 ```python
 import dspy
-import dsp
+from dspy.retrieve.azureaisearch_rm import AzureAISearchRM
+
+azure_search = AzureAISearchRM(
+    "search_service_name",
+    "search_api_key",
+    "search_index_name",
+    "field_text",
+    k=3
+)

-dspy.settings.configure(rm= TODO)
-retrieval_response = dsp.retrieve("When was the first FIFA World Cup held?", k=5)
+dspy.settings.configure(rm=azure_search)
+retrieve = dspy.Retrieve(k=3)
+retrieval_response = retrieve("What is Thermodynamics").passages

 for result in retrieval_response:
     print("Text:", result, "\n")
 ```

+2. Generate responses using the client directly.

-2) Generate responses using the client directly.
 ```python
-import dspy
+from dspy.retrieve.azureaisearch_rm import AzureAISearchRM

-retrieval_response = TODO('When was the first FIFA World Cup held?', k=5)
+azure_search = AzureAISearchRM(
+    "search_service_name",
+    "search_api_key",
+    "search_index_name",
+    "field_text",
+    k=3
+)

+retrieval_response = azure_search("What is Thermodynamics", k=3)
 for result in retrieval_response:
-    print("Text:", result['text'], "\n")
+    print("Text:", result.long_text, "\n")
 ```

 ***

-<AuthorDetails name="Arnav Singhvi"/>
+<AuthorDetails name="Prajapati Harishkumar Kishorkumar"/>
````
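The rewritten Azure.mdx describes `forward` returning passages as `dotdict`s exposing a `long_text` attribute. The mapping from raw search hits to that shape can be sketched as follows; the `"content"` field name and the `hits_to_passages` helper are assumptions for illustration, not the module's actual code.

```python
class dotdict(dict):
    """Minimal dict with attribute access, like the dotdict DSPy uses."""
    __getattr__ = dict.get

def hits_to_passages(hits):
    """Wrap each raw hit's text field so callers can read `.long_text`."""
    return [dotdict(long_text=hit["content"]) for hit in hits]
```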

docs/docs/quick-start/installation.mdx (+5)

````diff
@@ -53,6 +53,11 @@ import TabItem from '@theme/TabItem';
 pip install dspy-ai[mongodb]
 ```
 </TabItem>
+<TabItem value="weaviate" label="Weaviate">
+```text
+pip install dspy-ai[weaviate]
+```
+</TabItem>

 </Tabs>
````
dsp/modules/__init__.py (+1)

````diff
@@ -10,6 +10,7 @@
 from .gpt3 import *
 from .hf import HFModel
 from .hf_client import Anyscale, HFClientTGI, Together
+from .mistral import *
 from .ollama import *
 from .pyserini import *
 from .sbert import *
````

dsp/modules/azure_openai.py (+1 -9)

````diff
@@ -1,14 +1,6 @@
-import logging
-
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(message)s",
-    handlers=[logging.FileHandler("azure_openai_usage.log")],
-)
-
 import functools
 import json
+import logging
 from typing import Any, Literal, Optional, cast

 import backoff
````
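The hunk above removes an import-time `logging.basicConfig(...)` call, which would have attached a file handler to the root logger for every program that merely imports the module. The library-friendly pattern it moves toward can be sketched as below; the logger name and helper are illustrative, not the module's exact code.

```python
import logging

# Take a module-level logger; leave handler and level decisions to the
# application. Never call logging.basicConfig at import time in a library.
logger = logging.getLogger("dsp.modules.azure_openai_example")
logger.addHandler(logging.NullHandler())  # silences "no handler" warnings

def log_usage(message: str) -> None:
    """Emit a usage record; it goes nowhere unless the app configures logging."""
    logger.info(message)
```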
