
Differences in LLM responses #2075

Open

Description

@Sdegraauw

I will use context recall as an example, since my team has been running into parsing errors with it.

It seems that local models (here Gemma3 4b) react differently to the current instruction prompt than expected.

[screenshot: Gemma3 4b response to the current instruction prompt]

It seems the code is looking for an attribute called "attributed":

[screenshot: parsing code expecting an "attributed" attribute]
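For reference, a minimal sketch of the response shape the parser apparently expects; the field names besides "attributed" are reconstructed from the screenshots, not taken from the ragas source:

```python
from pydantic import BaseModel

# Assumed output schema, reconstructed from the parsing error above;
# the actual ragas model may have more fields or different types.
class ContextRecallClassification(BaseModel):
    reason: str      # justification for the classification
    attributed: int  # 1 if the sentence is supported by the context, else 0
```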

I have two suggestions to improve this. The first is to include the expected output structure in the prompt:

[screenshot: suggested prompt with the output structure included]
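A rough sketch of what that could look like; `build_prompt` and the exact wording of the structure hint are hypothetical, not the ragas implementation:

```python
import json

# Spell out the exact JSON structure the parser expects, so smaller models
# like Gemma3 4b don't improvise their own field names.
EXPECTED_STRUCTURE = [{"attributed": "1 or 0", "reason": "short justification"}]

def build_prompt(context: str, answer: str) -> str:
    return (
        "Given a context, and an answer, analyse each sentence in the answer "
        "and classify if the sentence can be attributed to the given context or not.\n"
        f"Context: {context}\n"
        f"Answer: {answer}\n"
        "Return only JSON matching exactly this structure:\n"
        f"{json.dumps(EXPECTED_STRUCTURE, indent=2)}"
    )
```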

The second is to use multiple steps. This one is based on some recent experiments (I don't know of any specific research, but it is similar to CoT prompting); see the sketch after the screenshots below:

Prompt 1: “Given a context, and an answer, analyse each sentence in the answer and classify if the sentence can be attributed to the given context or not.”
Prompt 2: “Given an analysis, translate the classifications into ‘Yes’ (1) or ‘No’ (0) as a binary classification, and format it as a JSON with an ‘attributed’ and a ‘reason’ attribute.”

[screenshot: output of prompt 1]

[screenshot: output of prompt 2]
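A minimal sketch of the two-step idea using the prompts above; `llm` is any text-in/text-out callable (e.g. a wrapper around Ollama), not the actual ragas abstraction:

```python
import json
from typing import Callable

ANALYSE_PROMPT = (
    "Given a context, and an answer, analyse each sentence in the answer "
    "and classify if the sentence can be attributed to the given context or not.\n"
    "Context: {context}\nAnswer: {answer}"
)

FORMAT_PROMPT = (
    "Given an analysis, translate the classifications into 'Yes' (1) or 'No' (0) "
    "as a binary classification, and format it as a JSON with an 'attributed' "
    "and a 'reason' attribute.\nAnalysis: {analysis}"
)

def two_step_classify(llm: Callable[[str], str], context: str, answer: str):
    # Step 1: free-form analysis, where the model can reason in its own words.
    analysis = llm(ANALYSE_PROMPT.format(context=context, answer=answer))
    # Step 2: convert the analysis into the strict JSON the parser needs.
    raw = llm(FORMAT_PROMPT.format(analysis=analysis))
    return json.loads(raw)  # a retry/repair step may still be needed here
```

Splitting the reasoning from the formatting means the second call only has to do a mechanical translation, which small local models tend to handle more reliably.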

Metadata

Labels: bug (Something isn't working), module-metrics (this is part of metrics module)
