Skip to content

Latest commit

 

History

History
55 lines (42 loc) · 1.7 KB

README.md

File metadata and controls

55 lines (42 loc) · 1.7 KB

AskVideos-VideoCLIP

Pre-trained & Fine-tuned Checkpoints

Checkpoint Link
AskVideos-VideoCLIP-v0.1 link

Introduction

  • AskVideos-VideoCLIP is a language-grounded video embedding model.
  • 16 frames are sampled from each video clip to generate a video embedding.
  • The model is trained on 2M clips from WebVid and 1M clips from the AskYoutube dataset.
  • The model is trained with contrastive and captioning loss to ground the video embeddings to text.

Usage

Environment Preparation

First, install ffmpeg.

apt update
apt install ffmpeg

Then, create a conda environment:

conda create -n askvideosclip python=3.9 
conda activate askvideosclip

Then, install the requiremnts:

pip3 install -U pip
pip3 install -r requirements.txt

How to Run Demo Locally

python video_clip.py

The demo is also available to run on colab.

Term of Use

AskVideos code and models are distributed under the Apache 2.0 license.

Acknowledgement

This model is built off of the Video-LLaMA Video-Qformer model.