Stars
[CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation (TPAMI 2020)
Displays the CVPR accepted papers in a way that is easy to parse :)
cvpr2024/cvpr2023/cvpr2022/cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 collection of papers, code, commentary, and livestreams, compiled by the 极市 (Jishi) team
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Python bindings for FFmpeg - with complex filtering support
Temporal Relational Modeling with Self-Supervision for Action Segmentation
Inflated I3D network with Inception backbone, weights transferred from TensorFlow
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Code for I3D Feature Extraction
Replaces MS-TCN with ASFormer in ASRF
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
[BMVC 2021] Official PyTorch implementation of "Few Shot Temporal Action Localization using Query Adaptive Transformers"
[TIP 2022] End-to-end Temporal Action Detection with Transformer
Code for "Alleviating Over-segmentation Errors by Detecting Action Boundaries", accepted at WACV 2021
Convolutional neural network model for video classification trained on the Kinetics dataset.
Official repo for the BMVC 2021 paper "ASFormer: Transformer for Action Segmentation"
An open-source toolbox for action understanding based on PyTorch
The Munich Open-Source Large-Scale Multimedia Feature Extractor
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
A toolkit for making real world machine learning and data analysis applications in C++
NLP text classification with BERT and PyTorch on the IMDB dataset
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
A technical report on convolution arithmetic in the context of deep learning