
Differences in LLM responses #2075

Open

Description

@Sdegraauw

I will use context recall as an example, since my team has been running into parsing errors with it.

It seems that local models (here Gemma3 4b) react differently to the current instruction prompt than expected.

[screenshot: Gemma3 4b response to the current instruction prompt]

It seems the code is looking for an attribute called "attributed":

[screenshot: parsing code expecting an "attributed" attribute]
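For reference, a minimal sketch of the response shape the parser apparently expects; the field names besides "attributed" are reconstructed from the screenshots, not taken from the ragas source:

```python
from pydantic import BaseModel

# Assumed output schema, reconstructed from the parsing error above;
# the actual ragas model may have more fields or different types.
class ContextRecallClassification(BaseModel):
    reason: str      # justification for the classification
    attributed: int  # 1 if the sentence is supported by the context, else 0
```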

I have two suggestions to improve this. The first is to include the expected output structure in the prompt:

[screenshot: suggested prompt with the output structure included]
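A rough sketch of what that could look like; `build_prompt` and the exact wording of the structure hint are hypothetical, not the ragas implementation:

```python
import json

# Spell out the exact JSON structure the parser expects, so smaller models
# like Gemma3 4b don't improvise their own field names.
EXPECTED_STRUCTURE = [{"attributed": "1 or 0", "reason": "short justification"}]

def build_prompt(context: str, answer: str) -> str:
    return (
        "Given a context, and an answer, analyse each sentence in the answer "
        "and classify if the sentence can be attributed to the given context or not.\n"
        f"Context: {context}\n"
        f"Answer: {answer}\n"
        "Return only JSON matching exactly this structure:\n"
        f"{json.dumps(EXPECTED_STRUCTURE, indent=2)}"
    )
```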

The second is to use multiple steps. This one is based on some recent experiments (I don't know of any specific research, but it is similar to CoT prompting); see the sketch after the screenshots below:

Prompt 1: “Given a context, and an answer, analyse each sentence in the answer and classify if the sentence can be attributed to the given context or not.”
Prompt 2: “Given an analysis, translate the classifications into ‘Yes’ (1) or ‘No’ (0) as a binary classification, and format it as a JSON with an ‘attributed’ and a ‘reason’ attribute.”

[screenshot: output of prompt 1]

[screenshot: output of prompt 2]
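A minimal sketch of the two-step idea using the prompts above; `llm` is any text-in/text-out callable (e.g. a wrapper around Ollama), not the actual ragas abstraction:

```python
import json
from typing import Callable

ANALYSE_PROMPT = (
    "Given a context, and an answer, analyse each sentence in the answer "
    "and classify if the sentence can be attributed to the given context or not.\n"
    "Context: {context}\nAnswer: {answer}"
)

FORMAT_PROMPT = (
    "Given an analysis, translate the classifications into 'Yes' (1) or 'No' (0) "
    "as a binary classification, and format it as a JSON with an 'attributed' "
    "and a 'reason' attribute.\nAnalysis: {analysis}"
)

def two_step_classify(llm: Callable[[str], str], context: str, answer: str):
    # Step 1: free-form analysis, where the model can reason in its own words.
    analysis = llm(ANALYSE_PROMPT.format(context=context, answer=answer))
    # Step 2: convert the analysis into the strict JSON the parser needs.
    raw = llm(FORMAT_PROMPT.format(analysis=analysis))
    return json.loads(raw)  # a retry/repair step may still be needed here
```

Splitting the reasoning from the formatting means the second call only has to do a mechanical translation, which small local models tend to handle more reliably.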

Metadata

Labels: bug (Something isn't working), module-metrics (this is part of metrics module)
