Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,517 603 Updated Mar 7, 2025

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch

Python 317 8 Updated Jan 12, 2025

Next-Token Prediction is All You Need

Python 2,021 78 Updated Oct 24, 2024

Nightly release of ControlNet 1.1

Python 4,923 389 Updated Aug 8, 2024

Firefly: a training tool for large models, supporting training of Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,231 559 Updated Oct 24, 2024

Official implementation of ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining (AAAI 2024)

Python 47 2 Updated Jul 4, 2024

Large-scale text-video dataset. 10 million captioned short videos.

Python 626 39 Updated Aug 14, 2024

Generative Models by Stability AI

Python 25,467 2,828 Updated Sep 4, 2024

Official PyTorch implementation of the CVPR 2022 paper: "Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator"

Python 88 9 Updated Sep 17, 2022

A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super resolution, text editing, handwritten ge…

232 8 Updated Dec 19, 2024

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Python 4,567 294 Updated Mar 7, 2025

[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Python 344 31 Updated Mar 14, 2024

This repository is the implementation of "Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context".

Python 86 8 Updated Feb 21, 2023

Search image collections by multiple color palettes or by image color similarity.

Python 237 36 Updated Jan 9, 2016

Implementation of DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing

Python 227 14 Updated Jul 19, 2023

ICLR 2024 (Spotlight)

Python 749 20 Updated Mar 2, 2024

Official implementation of "Composer: Creative and Controllable Image Synthesis with Composable Conditions"

1,553 48 Updated Dec 26, 2023

[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

C++ 1,433 287 Updated Aug 30, 2024

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

C++ 1,648 190 Updated Dec 27, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,872 2,604 Updated Mar 4, 2025

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Python 2,422 179 Updated Jan 23, 2025

Text-To-Image Generation with Chinese Characters

Python 128 14 Updated Jul 20, 2023
Python 7,759 507 Updated Apr 14, 2024

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,604 2,926 Updated Sep 2, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 38,064 4,652 Updated Mar 1, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 27,939 5,745 Updated Mar 10, 2025

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

Python 2,066 623 Updated Aug 9, 2023

Let us control diffusion models!

Python 31,669 2,836 Updated Feb 25, 2024

A latent text-to-image diffusion model

Jupyter Notebook 69,854 10,355 Updated Jun 18, 2024