Decouple first prompt evaluation and block size in CAPO #66

In CAPO, the block size for racing is controlled by a single parameter. The (sensible) default is 30, and we should not go below it, so that the statistical test also holds during the first evaluation of a prompt. However, in very expensive settings (e.g., expensive or time-consuming reward computation), we might want to add fewer than 30 evaluations per racing iteration.

My suggestion is to decouple the number of initial evaluations from the block size. For example, we could have two parameters:
- `block_size`: how many evaluations are added in each racing iteration
- `init_block_evals`: how many blocks are used for the first evaluation of a prompt

In my setting, we could set `block_size=5` and `init_block_evals=6` ($6 \times 5 = 30$). The statistical test would still always be valid, since every prompt starts with 30 evaluations, and we could increase evaluations in finer steps of 5 during racing.
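To make the difference concrete, here is a minimal sketch of the evaluation schedule under both settings. It assumes each racing iteration adds exactly one block per surviving prompt; `cumulative_evals` is a hypothetical helper for illustration only, not part of the existing CAPO code.

```python
def cumulative_evals(block_size: int, init_block_evals: int, n_steps: int) -> list[int]:
    """Cumulative evaluations of one prompt after each racing step.

    Hypothetical helper: step 0 is the first evaluation of the prompt,
    which uses `init_block_evals` blocks; every later step adds one block.
    """
    if block_size < 1 or init_block_evals < 1:
        raise ValueError("block_size and init_block_evals must be >= 1")
    first = block_size * init_block_evals
    return [first + step * block_size for step in range(n_steps)]


# Current behaviour: one parameter, so the first evaluation is a single block of 30.
coupled = cumulative_evals(block_size=30, init_block_evals=1, n_steps=4)

# Proposed behaviour: same 30 initial evaluations, but increments of 5 afterwards.
decoupled = cumulative_evals(block_size=5, init_block_evals=6, n_steps=4)

print(coupled)    # [30, 60, 90, 120]
print(decoupled)  # [30, 35, 40, 45]
```

Both schedules start at 30 evaluations, so the first statistical test sees the same sample size, but the decoupled one spends far fewer evaluations per additional racing iteration.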
