Atlas3DSS/AskVideos-VideoCLIP

 
 

AskVideos-VideoCLIP

Pre-trained & Fine-tuned Checkpoints

Checkpoint Link
AskVideos-VideoCLIP-v0.1 link

Introduction

  • AskVideos-VideoCLIP is a language-grounded video embedding model.
  • 16 frames are sampled from each video clip to generate a video embedding.
  • The model is trained on 2M clips from WebVid and 1M clips from the AskYoutube dataset.
  • The model is trained with contrastive and captioning losses to ground the video embeddings in text.
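As a rough illustration of two of the points above, the sketch below picks 16 frame indices from a clip and computes a symmetric contrastive loss over paired video and text embeddings. It is not taken from this repository: the function names `sample_frame_indices` and `contrastive_loss`, the evenly spaced sampling strategy, and the temperature value of 0.07 are all assumptions made for illustration, and the captioning loss is not covered.

```python
import math


def sample_frame_indices(total_frames, num_samples=16):
    """Pick num_samples frame indices spread evenly across a clip.

    Assumption: the centre of each of num_samples equal segments is used.
    The README only states that 16 frames are sampled per clip.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return [
        min(total_frames - 1, int((i + 0.5) * total_frames / num_samples))
        for i in range(num_samples)
    ]


def _cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def _diagonal_cross_entropy(rows):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(rows):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(rows)


def contrastive_loss(video_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss for matched (video, text) pairs.

    video_embs[i] and text_embs[i] are treated as a positive pair; every
    other combination in the batch is treated as a negative.
    The temperature is a hypothetical default, not the model's value.
    """
    logits = [
        [_cosine(v, t) / temperature for t in text_embs] for v in video_embs
    ]
    transposed = [list(col) for col in zip(*logits)]
    video_to_text = _diagonal_cross_entropy(logits)
    text_to_video = _diagonal_cross_entropy(transposed)
    return 0.5 * (video_to_text + text_to_video)


if __name__ == "__main__":
    print(sample_frame_indices(160))
    videos = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(videos, [[1.0, 0.0], [0.0, 1.0]]))  # matched pairs
    print(contrastive_loss(videos, [[0.0, 1.0], [1.0, 0.0]]))  # swapped pairs
```

The loss is small when each video embedding is closest to its own caption and large when the pairs are swapped, which is the sense in which the embeddings are grounded in text.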

Usage

Environment Preparation

First, install ffmpeg.

apt update
apt install ffmpeg

Then, create a conda environment:

conda create -n askvideosclip python=3.9 
conda activate askvideosclip

Then, install the requirements:

pip3 install -U pip
pip3 install -r requirements.txt
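Before running the demo, it can help to confirm that the steps above took effect. The following is a minimal sketch, not part of this repository: `check_environment` is a hypothetical helper that only checks whether an `ffmpeg` executable is on the `PATH` and whether the interpreter is at least Python 3.9. It does not verify the packages listed in requirements.txt.

```python
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(str(n) for n in sys.version_info[:2])
        needed = ".".join(str(n) for n in min_python)
        problems.append(f"Python {needed}+ expected, found {found}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("Environment looks ready.")
```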

How to Run Demo Locally

python video_clip.py

The demo can also be run on Colab.

Terms of Use

AskVideos code and models are distributed under the Apache 2.0 license.

Acknowledgement

This model builds on the Video-LLaMA Video-Qformer model.
