Questions about Table 1 in the paper on arxiv #2

Open
@FZaKK

Thank you for your work and for the open-source code (it is reproducible). I have a small question about Table 1 in the arXiv paper. In the paper, $D^{bd}_{val}$ and $D^{cl}_{val}$ are used to evaluate attack performance. Are these two subsets (with 1000 test samples) sampled from openo1_sft_filter.json? I could not find this information in the paper (I may have overlooked it). I would be very grateful for a response. 🤔
