[AAAI 2025] EditBoard: Towards a Comprehensive Evaluation Benchmark for Text-Based Video Editing Models
This is the official repository of the paper "EditBoard: Towards a Comprehensive Evaluation Benchmark for Text-Based Video Editing Models" (AAAI 2025) [Paper].
conda create -n EditBoard python==3.9
conda activate EditBoard
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118 # or another version with CUDA <= 12.1
pip install -r requirements.txt

For any given video, you need to segment it into frames and save all the frames into a directory named after the video. All frames must be resized to 512x512 pixels. To simplify this process, we provide a preprocessing script, preprocess.py, which supports MP4 and GIF video formats.
The command to run the script is:
python preprocess.py --input_path <path_to_your_videos> --output_path <path_to_save_frames>

- --input_path: The path to the directory containing your videos.
- --output_path: The path where the resulting frame directories will be saved.
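For orientation, the core of this step can be sketched as below. This is only an illustration of the expected output format; the helper name extract_frames and the use of OpenCV are assumptions, not the actual contents of preprocess.py.

```python
# Minimal sketch of the preprocessing step: extract frames from one video,
# resize each frame to 512x512, and write them into a directory named after the video.
# Illustration only; use the provided preprocess.py for actual runs.
import os
import cv2  # opencv-python

def extract_frames(video_path: str, output_root: str, size: int = 512) -> None:
    name = os.path.splitext(os.path.basename(video_path))[0]
    out_dir = os.path.join(output_root, name)
    os.makedirs(out_dir, exist_ok=True)

    cap = cv2.VideoCapture(video_path)  # reads MP4; GIFs may need imageio instead
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.resize(frame, (size, size))
        cv2.imwrite(os.path.join(out_dir, f"frame_{idx:05d}.png"), frame)
        idx += 1
    cap.release()
```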
Each frame folder will contain all frames from the corresponding video, e.g.:
dataset/
├── bear/
│   ├── frame_00000.png
│   ├── frame_00001.png
│   ├── frame_00002.png
│   └── ...
├── bear_white/
│   ├── frame_00000.png
│   ├── frame_00001.png
│   └── ...
└── bear_mask/
    ├── frame_00000.png
    ├── frame_00001.png
    └── ...
It is crucial that the corresponding original video, edited video, and semantic_mask folders contain the same number of image frames.
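A quick way to sanity-check this, sketched with the sample folder names (just an illustration, not part of the provided scripts):

```python
# Sanity check: paired frame folders must contain the same number of frames.
import os

folders = ["./sample/bear", "./sample/bear_white", "./sample/bear_mask"]
counts = {d: len([f for f in os.listdir(d) if f.endswith(".png")]) for d in folders}
assert len(set(counts.values())) == 1, f"Frame counts differ: {counts}"
```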
We have implemented all nine evaluation dimensions used in our paper:
["ff_alpha", "ff_beta", "semantic_score", "success_rate", "clip_similarity", 'subject_consistency', 'background_consistency', 'aesthetic_quality', 'imaging_quality']
We offer two forms of commands for evaluation:
- Normal Command – evaluate one pair of videos at a time.
- Script Command – evaluate multiple pairs in batch mode using a CSV or Excel file.
The final evaluation results will be saved in {output_path}/{result_name}_eval_results.json.
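Once a run finishes, the results file can be read back like any JSON file. A minimal sketch (the path mirrors the --output_path and --result_name values used above; the exact key layout depends on the dimensions evaluated, so inspect the file for its structure):

```python
# Load and print the saved evaluation results.
import json

with open("./output/result_eval_results.json") as f:
    results = json.load(f)
print(json.dumps(results, indent=2))
```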
Normal Command: the following is a full example that evaluates all nine dimensions on a single pair of videos.
python -W ignore evaluate.py \
--output_path './output/' \
--result_name "result" \
--dimension "ff_alpha" "ff_beta" "semantic_score" "success_rate" "clip_similarity" 'subject_consistency' 'background_consistency' 'aesthetic_quality' 'imaging_quality' \
--original_video_path './sample/bear' \
--edited_video_path './sample/bear_white' \
--semantic_mask_path './sample/bear_mask' \
--source_prompt 'a brown bear walks on rocks' \
--target_prompt 'a white bear walks on rocks'

Script Command: this command evaluates multiple pairs in batch mode using a CSV or Excel file. The --dimension and --script arguments are mandatory.
python -W ignore evaluate.py \
--output_path './output/' \
--result_name "result" \
--dimension "ff_alpha" "ff_beta" "semantic_score" "success_rate" "clip_similarity" 'subject_consistency' 'background_consistency' 'aesthetic_quality' 'imaging_quality' \
--script './sample/script.csv'

The script file passed via --script must be a .csv or .xlsx file with the following header and format:
| original_video_path | edited_video_path | semantic_mask_path | source_prompt | target_prompt |
|---|---|---|---|---|
| ./sample/bear | ./sample/bear_autumn | ./sample/bear_mask | a brown bear walks on rocks | a brown bear walks on rocks in the autumn |
An example script file is available at /EditBoard/sample/script.csv.
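If you prefer to generate the script file programmatically, a small sketch using pandas (an assumption of convenience; any tool that writes these five columns works) could look like:

```python
# Build a batch script file with the five expected columns shown in the table above.
import pandas as pd

rows = [
    {
        "original_video_path": "./sample/bear",
        "edited_video_path": "./sample/bear_autumn",
        "semantic_mask_path": "./sample/bear_mask",
        "source_prompt": "a brown bear walks on rocks",
        "target_prompt": "a brown bear walks on rocks in the autumn",
    },
]
pd.DataFrame(rows).to_csv("./sample/my_script.csv", index=False)  # or .to_excel(...)
```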
Different dimensions require different input fields. Please ensure all necessary arguments are provided when running evaluation.
| Dimension | Required Inputs |
|---|---|
| ff_alpha, ff_beta | original_video_path, edited_video_path |
| semantic_score | original_video_path, edited_video_path, semantic_mask_path |
| success_rate, clip_similarity | edited_video_path, source_prompt, target_prompt |
| subject_consistency, background_consistency, aesthetic_quality, imaging_quality | edited_video_path |
Example Commands for Each Dimension
- ff_alpha, ff_beta

  python -W ignore evaluate.py \
    --dimension "ff_alpha" "ff_beta" \
    --original_video_path './sample/bear' \
    --edited_video_path './sample/bear_white'

- semantic_score

  python -W ignore evaluate.py \
    --dimension "semantic_score" \
    --original_video_path './sample/bear' \
    --edited_video_path './sample/bear_white' \
    --semantic_mask_path './sample/bear_mask'

- success_rate, clip_similarity

  python -W ignore evaluate.py \
    --dimension "success_rate" "clip_similarity" \
    --edited_video_path './sample/bear_white' \
    --source_prompt 'a brown bear walks on rocks' \
    --target_prompt 'a white bear walks on rocks'

- subject_consistency, background_consistency, aesthetic_quality, imaging_quality

  python -W ignore evaluate.py \
    --dimension "subject_consistency" "background_consistency" "aesthetic_quality" "imaging_quality" \
    --edited_video_path './sample/bear_white'
This project wouldn't be possible without the following open-source repositories: CLIP and VBench.
If you find this repo useful for your research, please consider citing our work:
@inproceedings{chen2025editboard,
  title={Editboard: Towards a comprehensive evaluation benchmark for text-based video editing models},
  author={Chen, Yupeng and Chen, Penglin and Zhang, Xiaoyu and Huang, Yixian and Xie, Qian},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={15},
  pages={15975--15983},
  year={2025}
}