@maraisr
Member

@maraisr maraisr commented Jun 7, 2025

Much like our web prompt tooling, the models CLI should support templated variables in string evaluators.

Before

Running test case 1/1...
  ✗ FAILED
    Model Response: Goodbye! Take care and see you next time! 🌍👋
    ✗ string evaluator (score: 0.00)
      Expected to contain: '{{expected}}' ⬅️⬅️⬅️⬅️
    ✓ similarity check (score: 0.25)
      LLM evaluation matched choice: '2'

After

Running test case 1/1...
  ✗ FAILED
    Model Response: Hello there! How can I assist you today? 😊
    ✓ string evaluator (score: 1.00)
      Expected to contain: 'hello' ⬅️⬅️⬅️⬅️
    ✗ similarity check (score: 0.00)
      LLM evaluation matched choice: '1'

Copilot AI review requested due to automatic review settings June 7, 2025 06:32
@maraisr maraisr requested a review from a team as a code owner June 7, 2025 06:32
Contributor

Copilot AI left a comment

Pull Request Overview

Support templated variables in string evaluators for the CLI, allowing test cases to inject dynamic values into string checks.

  • Introduces string fields in example prompts and switches evaluator contains to use {{string}}
  • Updates runStringEvaluator signature to accept a testCase map and applies templateString in each comparison
  • Adjusts tests to pass an empty testCase into runStringEvaluator
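The change summarized above can be illustrated with a minimal sketch. The PR's actual `templateString` implementation is not shown in this conversation, so `substituteVars` and `containsTemplated` below are hypothetical stand-ins, and two behaviors are assumptions: a missing variable is treated as an error, and the `contains` comparison is case-insensitive (suggested by the "After" output, where 'hello' matches "Hello there!").

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholder matches {{name}}, allowing whitespace inside the braces.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}`)

// substituteVars is a hypothetical stand-in for the PR's templateString.
// Assumption: a placeholder with no matching key is reported as an error.
func substituteVars(tmpl string, vars map[string]interface{}) (string, error) {
	var missing []string
	out := placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		v, ok := vars[name]
		if !ok {
			missing = append(missing, name)
			return m
		}
		return fmt.Sprint(v)
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("undefined template variables: %s", strings.Join(missing, ", "))
	}
	return out, nil
}

// containsTemplated templates the expected value from the test case and
// then checks whether the response contains it (case-insensitive, assumed).
func containsTemplated(response, contains string, testCase map[string]interface{}) (bool, error) {
	want, err := substituteVars(contains, testCase)
	if err != nil {
		return false, fmt.Errorf("failed to template message content: %w", err)
	}
	return strings.Contains(strings.ToLower(response), strings.ToLower(want)), nil
}

func main() {
	testCase := map[string]interface{}{"string": "hello"}
	ok, err := containsTemplated("Hello there! How can I assist you today?", "{{string}}", testCase)
	fmt.Println(ok, err) // true <nil>
}
```

With an empty test case, the same call returns an error instead of silently comparing against the literal `{{string}}`.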

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File                        Description
examples/sample_prompt.yml  Added string keys under testData and updated evaluator contains
cmd/eval/eval_test.go       Changed runStringEvaluator calls to include the new testCase param
cmd/eval/eval.go            Refactored runStringEvaluator to template all string criteria

Comments suppressed due to low confidence (2)

examples/sample_prompt.yml:9

  • [nitpick] The key string in testData is very generic and may be confused with the evaluator’s string block. Consider renaming it to something more descriptive like templateVar or placeholder.
    string: hello

cmd/eval/eval_test.go:132

  • There are no tests covering the new templating functionality or error paths when a placeholder is missing. Consider adding tests that include {{string}} substitutions and missing-key scenarios to validate templateString behavior.
result, err := handler.runStringEvaluator("test", tt.evaluator, map[string]interface{}{}, tt.response)
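
The cases this comment asks for could look roughly like the table below. This is only a sketch: `substituteVars` is a simplified, hypothetical stand-in for `templateString` (whose real signature and missing-key behavior are not shown here), and the checks are written as a plain function so the example runs on its own. In the repository they would more naturally be `t.Errorf` calls in `eval_test.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteVars is a simplified, hypothetical stand-in for templateString.
// Assumption: any placeholder left unresolved is an error.
func substituteVars(tmpl string, vars map[string]interface{}) (string, error) {
	out := tmpl
	for k, v := range vars {
		out = strings.ReplaceAll(out, "{{"+k+"}}", fmt.Sprint(v))
	}
	if strings.Contains(out, "{{") {
		return "", fmt.Errorf("unresolved placeholder in %q", tmpl)
	}
	return out, nil
}

type templateCase struct {
	name    string
	tmpl    string
	vars    map[string]interface{}
	want    string
	wantErr bool
}

// checkCases runs the table and returns a description of every failure.
func checkCases() []string {
	cases := []templateCase{
		{name: "substitutes variable", tmpl: "{{string}}", vars: map[string]interface{}{"string": "hello"}, want: "hello"},
		{name: "plain text unchanged", tmpl: "world", vars: map[string]interface{}{}, want: "world"},
		{name: "missing key", tmpl: "{{string}}", vars: map[string]interface{}{}, wantErr: true},
	}
	var failures []string
	for _, c := range cases {
		got, err := substituteVars(c.tmpl, c.vars)
		if (err != nil) != c.wantErr {
			failures = append(failures, fmt.Sprintf("%s: unexpected error state: %v", c.name, err))
			continue
		}
		if got != c.want {
			failures = append(failures, fmt.Sprintf("%s: got %q, want %q", c.name, got, c.want))
		}
	}
	return failures
}

func main() {
	if failures := checkCases(); len(failures) > 0 {
		fmt.Println(strings.Join(failures, "\n"))
		return
	}
	fmt.Println("all cases passed")
}
```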

@maraisr maraisr requested a review from sgoedecke June 7, 2025 06:37
Comment on lines +376 to +378
if err != nil {
	return EvaluationResult{}, fmt.Errorf("failed to template message content: %w", err)
}
Member Author

Yeah, I do agree with Copilot that this looks a little gnarly. But with the way Go wants to handle errors, we kinda need this everywhere.

That is, unless we create a helper method here that, should there be an error, just defaults back to the provided string. But open to suggestions/opinions.
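
A rough sketch of what such a helper could look like, for discussion only. `orOriginal`, `renderFunc`, and `demoRender` are hypothetical names; `demoRender` is a toy renderer that only knows the `string` variable and is not the PR's `templateString`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// renderFunc is the assumed shape of a templating function that can fail.
type renderFunc func(tmpl string, vars map[string]interface{}) (string, error)

// orOriginal wraps a renderFunc so that callers get the untouched input
// string back whenever templating fails, instead of an error to handle.
func orOriginal(render renderFunc) func(string, map[string]interface{}) string {
	return func(tmpl string, vars map[string]interface{}) string {
		out, err := render(tmpl, vars)
		if err != nil {
			return tmpl
		}
		return out
	}
}

// demoRender is a toy renderer used only to exercise the wrapper.
func demoRender(tmpl string, vars map[string]interface{}) (string, error) {
	v, ok := vars["string"]
	if !ok {
		return "", errors.New("missing variable: string")
	}
	return strings.ReplaceAll(tmpl, "{{string}}", fmt.Sprint(v)), nil
}

func main() {
	render := orOriginal(demoRender)
	fmt.Println(render("{{string}}", map[string]interface{}{"string": "hello"})) // hello
	fmt.Println(render("{{string}}", map[string]interface{}{}))                  // {{string}}
}
```

The trade-off is that a swallowed error puts the evaluator back in the "Before" situation from the description: the check runs against the literal placeholder and fails without saying why.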

Collaborator

@sgoedecke sgoedecke left a comment

Thank you for this, nice catch!

@maraisr maraisr merged commit f5988c3 into main Jun 10, 2025
5 checks passed
@maraisr maraisr deleted the mr/support-variables-in-string branch June 10, 2025 05:38
  - name: string evaluator
    string:
-     contains: world
+     contains: '{{string}}'
Member

Do we support that in the web UI?

4 participants