Stars
This is a repository for listing papers on scene graph generation and application.
A utility library for programmatically generating captions and QAs from scene graphs
Unofficial implementation of YOLO-World + EfficientSAM for ComfyUI
(ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.
An instruction data generation system for multimodal language models.
This is an automatic full segmentation tool based on Segment-Anything-2 and Segment-Anything-1. Our tool performs automatic full segmentation of the video, enabling the tracking of each object and …
[ECCV'24] Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
[NeurIPS 2024] A task generation and model evaluation system for multimodal language models.
This repository provides a unified interface for generating images from text prompts using various state-of-the-art models including DALL-E, Stable Diffusion, and many others. It's designed to simp…