Non-ASCII characters in faithfulness metric #1022

Closed
@mckbrchill

Description

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

When I work with Cyrillic texts, the candidate sentences generated in the faithfulness metric are passed through json.dumps with the default ensure_ascii=True in the _create_nli_prompt method, so statements_str contains \uXXXX escape sequences instead of the original characters, and that escaped string is then passed to the LLM again.

    def _create_nli_prompt(self, row: t.Dict, statements: t.List[str]) -> PromptValue:
        assert self.llm is not None, "llm must be set to compute score"

        contexts = row["contexts"]
        # check if the statements are support in the contexts
        contexts_str: str = "\n".join(contexts)
        statements_str: str = json.dumps(statements)
        prompt_value = self.nli_statements_message.format(
            context=contexts_str, statements=statements_str
        )
        return prompt_value
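
A minimal reproduction of the escaping, using only the standard-library json module. The sample statements are made up for illustration and do not come from Ragas; the second call shows what I would expect if ensure_ascii=False were passed, which is a suggestion rather than the library's current behavior.

```python
import json

# Hypothetical statements, standing in for what the metric generates.
statements = [
    "Москва является столицей России.",
    "Paris is the capital of France.",
]

# Default behavior (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(statements)

# With ensure_ascii=False the Cyrillic characters are kept as-is.
readable = json.dumps(statements, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings are valid JSON and decode to the same list;
# only the text that ends up in the prompt differs.
assert json.loads(escaped) == json.loads(readable) == statements
```

Both forms round-trip through json.loads, so the difference only matters for what the LLM actually reads in the prompt.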

Ragas version: 0.1.7
Python version: 3.10.8

Expected behavior
I expect the candidate sentences to keep their Cyrillic characters, unescaped, when they are passed to the LLM again.

Metadata

Labels: answered, bug
