Commit 540aa0a

book: Complete ch6 extraction strategy v2 section
Add comprehensive explanation of extraction strategy v2 in book/02-dafny-and-inspect/06-ch6.md to address the 'celebratory message' gotcha shown in fig-celebratory.

The section covers:
- Problem statement: LLM celebrates success without code block
- V1 buggy approach: naive last-code-block extraction from final completion
- V2 solution: backtracking through message history to find most recent code
- Usage: how to configure extraction_strategy parameter
- General pattern: when backtracking is useful in tool-calling loops

Uses literalinclude directives to pull code directly from:
- evals/src/evals/dafnybench/inspect_ai/utils.py (extract_code_v1, extract_code_v2)
- evals/src/evals/dafnybench/inspect_ai/__init__.py (task configuration)

This ensures code examples stay in sync with implementation and matches the documentation pattern used throughout chapter 5.

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent a9c0712 commit 540aa0a

4 files changed

+11
-78
lines changed

book/02-dafny-and-inspect/05-ch5.md

Lines changed: 2 additions & 0 deletions
@@ -32,6 +32,8 @@ In the parlance, some non-LLM process that forms a sensor-actuator pair for an L
3232
:linenos:
3333
```
3434

35+
One important thing to notice, which we'll discuss more in the next chapter, is the use of the `raise` keyword. Here, an exception does not mean that something unexpected happened: an error is what occurs _when things go as planned_, since we do not expect the agent to win on the first try. This may offend some error-handling purists. Inspect's tool call abstraction simply requires tools to be in an error state whenever TODO .
36+
3537
`{:verify false}` is an escape hatch in Dafny, so we'd consider it cheating if the LLM used it (line 28). We execute `dafny` on a single file, using `with tempfile.NamedTemporaryFile` so that the file is cleaned up afterwards. Finally, we read off the exit code to see if the LLM was successful. By convention, exit code 0 is happiness, no error message, and nonzero exit code is unhappiness (though if you're the kind of person who's happy when the world gives you helpful pointers about how to fix your mistakes, you might not view this as unhappiness)[^1].
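
The temp-file-and-exit-code pattern described here can be sketched roughly as follows. This is a minimal sketch, not the chapter's actual tool: `run_checker` and its parameters are hypothetical names, the checker command is passed in as an argument so the helper is not tied to any one tool, and it assumes a POSIX-like system where an open temporary file can be read by another process. It merges stdout and stderr, in the spirit of the footnote.

```python
import subprocess
import sys
import tempfile


def run_checker(command: list[str], source: str, suffix: str) -> tuple[bool, str]:
    """Write `source` to a temp file, run `command` on it, return (ok, output).

    Hypothetical helper for illustration only; `command` is the argv prefix
    of whatever checker you want to run on the file.
    """
    # The context manager deletes the file on exit, even if the run raises.
    with tempfile.NamedTemporaryFile("w", suffix=suffix) as handle:
        handle.write(source)
        handle.flush()  # make sure the checker sees the full contents
        result = subprocess.run(
            [*command, handle.name],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge streams: we don't care which one the tool uses
            text=True,
        )
    # Exit code 0 means success by convention; anything else is a failure.
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    # Stand-in checker: the Python interpreter itself, so the sketch runs anywhere.
    ok, output = run_checker([sys.executable], "import sys; sys.exit('no proof')", ".py")
    print(ok, output.strip())
```

With Dafny installed, the call would presumably look something like `run_checker(["dafny", "verify"], code, ".dfy")`, though the exact subcommand depends on your Dafny version.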
3638

3739
[^1]: Notice that we don't necessarily know whether the error message will go to stdout or stderr. Lots of tools aren't intuitive about this: they'll put error messages in stdout and not use stderr for anything. It sometimes feels nice to make your subprocess handler agnostic and flexible; even if no _one_ tool needs it, it might help with code reuse later.

book/02-dafny-and-inspect/06-ch6.html

Lines changed: 0 additions & 65 deletions
This file was deleted.
Lines changed: 7 additions & 13 deletions
@@ -1,4 +1,4 @@
1-
# A gotcha
1+
# The touchdown dance
22

33
When you're plumbing, you sometimes run into silly things
44

@@ -11,9 +11,9 @@ A language model is so excited that it solved the task that it passes its celebr
1111

1212
## Extraction Strategy v2
1313

14-
### [ ] TODO: audit this, as an LLM wrote it
14+
After the agent successfully verifies code, it sometimes generates a celebratory message like "Perfect! The code now verifies successfully!" without a code block. Our initial extraction logic (`extract_code_v1`) simply grabbed the last code block from the final completion, which worked great until there was no code block in that final message.
1515

16-
The figure above shows a real problem we encountered: after the agent successfully verifies code, it sometimes generates a celebratory message like "Perfect! The code now verifies successfully!" without a code block. Our initial extraction logic (`extract_code_v1`) simply grabbed the last code block from the final completion—which worked great until there was no code block in that final message.
16+
### TODO: what we really want this chapter to be about is extracting code from responses more gracefully, and how to link that to your intuitions about error handling.
1717

1818
The naive approach looked like this:
1919

@@ -25,7 +25,7 @@ The naive approach looked like this:
2525
:end-before: def extract_code_v2
2626
```
2727

28-
When the agent outputs "Great! It worked!" without code, `extract_code_v1` fails to find any code blocks. The scorer then tries to verify this celebration text as Dafny code, which obviously fails—turning a successful verification into a recorded failure.
28+
When the agent outputs "Great! It worked!" without code, `extract_code_v1` fails to find any code blocks. The scorer then tries to verify this touchdown dance as Dafny code, which obviously fails—turning a successful verification into a recorded failure.
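
To make the failure concrete, here is a minimal sketch of a last-code-block extractor. It is not the repository's `extract_code_v1`: the name `extract_last_code_block`, the regex, and the fall-back-to-raw-text behavior are all assumptions, chosen only to reproduce the symptom described above.

```python
import re

# Build the fence marker programmatically so this example contains no literal fence.
FENCE_MARK = "`" * 3
# Matches a fenced block with an optional language tag and captures its body.
FENCE = re.compile(FENCE_MARK + r"[\w-]*\n(.*?)" + FENCE_MARK, re.DOTALL)


def extract_last_code_block(completion: str) -> str:
    """Return the body of the last fenced block, else the raw text (assumed fallback)."""
    blocks = FENCE.findall(completion)
    if blocks:
        return blocks[-1].strip()
    # No fence found: the whole message is handed on as if it were code.
    return completion.strip()


with_code = f"Here is the fix:\n{FENCE_MARK}dafny\nmethod M() ensures true {{}}\n{FENCE_MARK}"
celebration = "Perfect! The code now verifies successfully!"

print(extract_last_code_block(with_code))    # the Dafny method
print(extract_last_code_block(celebration))  # the celebration itself, which is not Dafny
```

Under these assumptions, the second call returns prose, and a scorer that feeds it to the verifier records a failure even though the agent's earlier code verified.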
2929

3030
The fix is **backtracking through message history**. Instead of only looking at the final completion, we walk backwards through all assistant messages until we find one containing code:
3131

@@ -37,19 +37,13 @@ The fix is **backtracking through message history**. Instead of only looking at
3737
:end-before: def extract_code(
3838
```
3939

40-
This handles the celebration problem elegantly: if the current message has no code, we skip it and check the previous message. We eventually find the most recent code the agent actually generated.
40+
This handles the celebration problem: if the current message has no code, we skip it and check the previous message. We eventually find the most recent code the agent actually generated.
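
The backward walk can be sketched as below. This is a minimal sketch under stated assumptions, not the repository's `extract_code_v2`: the `Message` dataclass, its `role` and `text` fields, and the function name `extract_latest_code` are hypothetical stand-ins for whatever message objects your framework provides.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Build the fence marker programmatically so this example contains no literal fence.
FENCE_MARK = "`" * 3
FENCE = re.compile(FENCE_MARK + r"[\w-]*\n(.*?)" + FENCE_MARK, re.DOTALL)


@dataclass
class Message:
    """Hypothetical chat message: who spoke and what they said."""
    role: str
    text: str


def extract_latest_code(messages: list[Message]) -> Optional[str]:
    """Walk the history newest-first and return the most recent fenced code body."""
    for message in reversed(messages):
        if message.role != "assistant":
            continue  # only the agent's own messages can contain its code
        blocks = FENCE.findall(message.text)
        if blocks:
            return blocks[-1].strip()  # last block of the newest message that has one
    return None  # no assistant message contained code


history = [
    Message("user", "Add annotations so this verifies."),
    Message("assistant", f"Try this:\n{FENCE_MARK}dafny\nmethod M() ensures true {{}}\n{FENCE_MARK}"),
    Message("user", "Verification succeeded."),
    Message("assistant", "Perfect! The code now verifies successfully!"),
]
print(extract_latest_code(history))
```

Returning `None` when nothing is found is one possible design choice: it lets the caller tell "the agent never produced code" apart from "the agent produced code that fails to verify".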
4141

4242
You can switch between strategies using the `extraction_strategy` parameter in the task definition:
4343

44-
```{literalinclude} ../../evals/src/evals/dafnybench/inspect_ai/__init__.py
45-
:language: python
46-
:caption: Configuring extraction strategy in task definition
47-
:linenos:
48-
:start-at: def dafnybench_task(
49-
:end-at: extraction_strategy: ExtractionStrategy
5044
```
51-
52-
We keep v1 available for pedagogical purposes—it demonstrates a real bug you might encounter when building verification agents. The unified interface lives in `utils.py:82-120`, dispatching to the appropriate implementation based on the `ExtractionStrategy` enum.
45+
uv run agent dafnybench inspect --extraction-strategy v2
46+
```
5347

5448
This pattern—searching backwards through conversation history—is useful whenever you have tool-calling loops where the final message might not contain the information you need. The verifier acts as a sensor providing feedback, and the agent's response to good news ("it worked!") can omit the actual artifact you're evaluating.
5549

book/02-dafny-and-inspect/07-ch7.md

Lines changed: 2 additions & 0 deletions
@@ -4,3 +4,5 @@ it'd be great to show a red team that cheats really hard by editing the code in
44

55
Then we have to refactor a little to the _actual_ task, which is just insertions of annotations.
66

7+
TODO: write the code that does this, version it, so both versions can exist in the repo at the same time.
8+
TODO: write the chapter
