Support OMTG Benchmark by insomniaaac · Pull Request #1427 · open-compass/VLMEvalKit

insomniaaac · 2026-02-04T07:34:16Z

Description

This PR adds support for the One-to-Many Temporal Grounding (OMTG) benchmark, as proposed in the paper Towards One-to-Many Temporal Grounding.

Unlike traditional temporal grounding tasks that assume a one-to-one mapping, OMTG requires the model to localize all disjoint video segments corresponding to a query.

Key Changes

New Benchmark Support: Added OMTGBench dataset class.
New Metrics: Implemented rigorous metrics for multi-instance retrieval:
- C-Acc (Count Accuracy): Evaluates event cardinality perception.
- EtF1 (Effective Temporal F1): The primary metric that penalizes incomplete retrieval.
- tF1 (Temporal F1-Score).
Evaluation Pipeline: Integrated the OMTG evaluation logic into the existing framework.
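To make the multi-segment setting concrete, here is a minimal sketch of how a count-accuracy score and a segment-level temporal F1 could be computed. It is not the PR's implementation. The function names, the greedy matching, and the 0.5 IoU threshold are illustrative assumptions, and EtF1 is left out because its exact form is defined in the paper.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def segment_f1(pred: List[Segment], gt: List[Segment], iou_thr: float = 0.5) -> float:
    """F1 over segments for one query (assumed definition, greedy one-to-one matching)."""
    if not pred or not gt:
        return 0.0
    unmatched = list(gt)
    true_pos = 0
    for p in pred:
        # Pick the best still-unmatched ground-truth segment for this prediction.
        best = max(unmatched, key=lambda g: temporal_iou(p, g), default=None)
        if best is not None and temporal_iou(p, best) >= iou_thr:
            unmatched.remove(best)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gt)
    return 2 * precision * recall / (precision + recall)


def count_accuracy(preds: List[List[Segment]], gts: List[List[Segment]]) -> float:
    """Fraction of queries whose predicted segment count equals the true count (assumed definition)."""
    hits = sum(len(p) == len(g) for p, g in zip(preds, gts))
    return hits / len(gts)
```

Under these assumptions, a prediction that finds two of three ground-truth segments exactly has perfect precision but reduced recall, which is the incomplete-retrieval case the description says the benchmark is meant to penalize.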

How to Use

Users can evaluate models on the OMTG benchmark using the following command:

python run.py --data OMTGBench --model Qwen3-VL-4B-Instruct --verbose

FangXinyu-0913

Consider adding a quick config in vlmeval/dataset/video_dataset_config.py to make the benchmark easier to use.
Please report OMTG Bench performance for representative models (using VLMEvalKit, the official repo, and paper results). Include environment details (transformers, torch, vllm/sglang, flash-attention, python) and specific configs (like nframe) used for these runs.
Please fix the lint failure: https://github.com/open-compass/VLMEvalKit/actions/runs/21662694744/job/62492361066?pr=1427
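One way to gather the requested environment details is a small script like the one below. It is a sketch that uses only the standard library. The distribution names in the list (in particular flash-attn) are assumptions about how the packages were installed and may need adjusting.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Distribution names as typically installed with pip (assumed; adjust as needed).
PACKAGES = ["transformers", "torch", "vllm", "sglang", "flash-attn"]


def collect_versions(packages=PACKAGES):
    """Return a dict mapping 'python' and each package name to its installed version."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}: {ver}")
```

The printed lines can be pasted into the PR alongside the run-specific settings such as nframe.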

support omtg benchmark

4c219c0

FangXinyu-0913 reviewed Feb 4, 2026

insomniaaac wants to merge 1 commit into open-compass:main from insomniaaac:main


2 participants
