Resumable osmosis eval with Cache Management #77

### Problem or Use Case

`osmosis eval` runs are often long (large datasets, multiple runs per row, external model calls) and can be interrupted by network issues, process restarts, machine shutdowns, or quota/auth failures.
When that happens, users may need to restart from scratch, which causes:

- Repeated API/computation cost for already-completed runs
- Longer experiment iteration cycles
- Higher risk of conflicting state when duplicate evals run concurrently
- No clear built-in workflow to inspect, filter, and clean old cache entries

### Proposed Solution

Add first-class resumable execution and cache lifecycle management for `osmosis eval`:
- Persist eval progress/results to disk with a stable task ID derived from config + source/data fingerprints
- Auto-resume when re-running the same command after interruption
- Add `--fresh` to force a clean rerun and `--retry-failed` to rerun only failed runs
- Add `osmosis eval cache` subcommands for cache inspection and cleanup (`dir`, `ls`, `rm`)
- Use file locking + atomic writes to ensure consistency and prevent concurrent corruption
- Detect dataset changes during/after runs and warn or fail with actionable guidance
- Support `--log-samples` and structured output directories for better debugging/auditing
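
To make the first few bullets concrete, here is a minimal Python sketch of how a stable task ID, atomic state writes, and resume filtering could fit together. It is an illustration under stated assumptions, not the SDK's implementation: the function names (`task_id`, `atomic_write_json`, `pending_runs`), the `{"runs": {run_id: status}}` state layout, and the `"ok"`/`"failed"` status strings are all hypothetical. It also assumes that failed runs are skipped on a plain resume and only rerun with `--retry-failed`. File locking is omitted for brevity.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def task_id(config: dict, dataset_path: Path) -> str:
    """Derive a stable ID from the eval config and the dataset contents (hypothetical scheme)."""
    h = hashlib.sha256()
    # sort_keys makes the serialization independent of dict insertion order.
    h.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    h.update(b"\0")
    # Fingerprint the dataset by content, so editing the file changes the ID.
    h.update(hashlib.sha256(dataset_path.read_bytes()).digest())
    return h.hexdigest()[:16]


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write to a temp file in the same directory, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def pending_runs(cache_file: Path, all_run_ids: list,
                 retry_failed: bool = False, fresh: bool = False) -> list:
    """Return the run IDs that still need to execute, in their original order."""
    if fresh or not cache_file.exists():
        return list(all_run_ids)
    recorded = json.loads(cache_file.read_text(encoding="utf-8"))["runs"]
    skip = {
        run_id
        for run_id, status in recorded.items()
        # Assumption: failed runs count as finished unless retry_failed is set.
        if status == "ok" or (status == "failed" and not retry_failed)
    }
    return [run_id for run_id in all_run_ids if run_id not in skip]
```

In this sketch, re-running the same command would recompute `task_id`, find the matching state file, and execute only what `pending_runs` returns. Writing the temp file in the same directory as the target keeps the final rename on one filesystem, which is what allows the swap to be atomic.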

### Alternatives Considered

_No response_

### SDK Component

None

### Additional Context

_No response_
