
evaluate() fails with *APIConnectionError(Connection error.)* when Azure (AzureOpenAIEmbeddings, AzureChatOpenAI) LangChain models are used #2038

Open
@rajivchandrasekar

Description

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
When Azure models (AzureOpenAIEmbeddings, AzureChatOpenAI) are used, evaluate() fails with:
Error message: Exception raised in Job[1]: APIConnectionError(Connection error.)

Ragas version: 0.2.15
Python version: 3.11.0

Code to Reproduce

import httpx
from dotenv import load_dotenv
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings

from ragas import EvaluationDataset, evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.metrics import Faithfulness

load_dotenv()

# Azure deployments; SSL verification is disabled on the sync httpx client.
embeddings = AzureOpenAIEmbeddings(
    model="MYMODELNAME",
    http_client=httpx.Client(verify=False),
)
evaluation_llm = AzureChatOpenAI(
    deployment_name="MYMODELNAME",
    http_client=httpx.Client(verify=False),
    temperature=0.1,
)

dataset = [
    {
        "user_input": "When was the first super bowl?",
        "retrieved_contexts": [
            "The First AFL-NFL World Championship Game was an American football game "
            "played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles."
        ],
        "response": "The first superbowl was held on Jan 15, 1967",
    }
]

evaluation_dataset = EvaluationDataset.from_list(dataset)
result = evaluate(
    llm=evaluation_llm,
    embeddings=LangchainEmbeddingsWrapper(embeddings=embeddings),
    dataset=evaluation_dataset,
    metrics=[Faithfulness(llm=evaluation_llm)],
)

print(result)
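
Note: verify=False above is applied only to the synchronous httpx client. ragas' evaluate() drives the models through LangChain's async code path, so the async httpx client that langchain-openai builds by default still performs full certificate verification; behind a corporate proxy or with self-signed certificates that alone can surface as APIConnectionError. A minimal variant worth trying, assuming langchain-openai's http_async_client parameter and explicit ragas wrappers:

import httpx
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper

# Disable certificate verification on BOTH the sync and async httpx clients,
# since ragas issues its metric prompts asynchronously.
embeddings = AzureOpenAIEmbeddings(
    model="MYMODELNAME",
    http_client=httpx.Client(verify=False),
    http_async_client=httpx.AsyncClient(verify=False),
)
evaluation_llm = AzureChatOpenAI(
    deployment_name="MYMODELNAME",
    http_client=httpx.Client(verify=False),
    http_async_client=httpx.AsyncClient(verify=False),
    temperature=0.1,
)

# Wrapping explicitly rules out ragas' automatic wrapping as a variable.
wrapped_llm = LangchainLLMWrapper(evaluation_llm)
wrapped_embeddings = LangchainEmbeddingsWrapper(embeddings)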

Error trace

Evaluating:   0%|                                                                                                                 | 0/2 [00:00<?, ?it/s]Exception raised in Job[1]: APIConnectionError(Connection error.)
Evaluating:  50%|████████████████████████████████████████████████████                                                    | 1/2 [02:36<02:36, 156.27s/it]
{'faithfulness': nan}
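
To isolate whether the failure comes from ragas or from the Azure connection itself, the models can be exercised directly over the same async path; a minimal check, assuming the embeddings and evaluation_llm objects defined above:

import asyncio

async def check_connection() -> None:
    # ainvoke/aembed_query follow the same async route ragas' executor uses.
    reply = await evaluation_llm.ainvoke("ping")
    print("chat ok:", reply.content[:50])
    vector = await embeddings.aembed_query("ping")
    print("embeddings ok, dim =", len(vector))

asyncio.run(check_connection())

If this raises the same APIConnectionError, the problem is in the Azure/httpx setup rather than in evaluate().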

Expected behavior

Scores for all the metrics passed to evaluate() should be returned instead of nan.



Labels

bug (Something isn't working), module-metrics (this is part of the metrics module)
