[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Updated Jan 4, 2025 - Python
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
🚀 Cross attention map tools for huggingface/diffusers
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
1-shot image segmentation using Stable Diffusion
Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.
[NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition"
The official repository of "Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models".
Implementation of the paper "Enhanced Photovoltaic Power Forecasting: An iTransformer and LSTM-Based Model Integrating Temporal and Covariate Interactions"
A lightweight PyTorch implementation of the Transformer-XL architecture proposed by Dai et al. (2019)
[ITSC-2023] HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
TGRS: Code for "Unsupervised Hybrid Network of Transformer and CNN for Blind Hyperspectral and RGB Image Fusion"
SOVL System (Self-Organizing Virtual Lifeform): A complex, purpose-agnostic autonomous agent with continuous, asynchronous learning capabilities via a dynamic scaffolded LLM and a frozen base LLM
Detect Deepfaked Faces Using Multiple Deep Learning Models
This model synthesises high-fidelity fashion videos from single images featuring spontaneous and believable movements.
Segment-Like-Me: 1-shot image segmentation using Stable Diffusion
PyTorch implementation of CL-ViT and FF-ViT models
IEEE ICME: "Cross-Attention is not always needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition"
3D Human-Object Interaction in Video: A New Approach to Object Tracking via Cross-Modal Attention