Merged
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/bug_report.yaml
Original file line number Diff line number Diff line change
@@ -30,14 +30,14 @@ body:
If you have code snippets, error messages, or stack traces, please provide them here as well.
Please format your code correctly using code tags. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Do not use screenshots, as they are difficult to read and (more importantly) do not allow others to copy and paste your code.

请提供能重现您遇到的问题的代码示例,最好是最小复现单元。
如果您有代码片段、错误信息、堆栈跟踪,也请在此提供。
请使用代码标签正确格式化您的代码。请参见 https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
请勿使用截图,因为截图难以阅读,而且(更重要的是)不允许他人复制粘贴您的代码。
placeholder: |
Steps to reproduce the behavior/复现Bug的步骤:

1.
2.
3.
@@ -48,4 +48,4 @@ body:
required: true
attributes:
label: Expected behavior / 期待表现
description: "A clear and concise description of what you would expect to happen. /简单描述您期望发生的事情。"
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/feature-request.yaml
@@ -29,6 +29,6 @@ body:
attributes:
label: Your contribution / 您的贡献
description: |

Your PR link or any other link you can help with.
您的PR链接或者其他您能提供帮助的链接。
28 changes: 28 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,28 @@
# Contribution Guide

We welcome your contributions to this repository. To ensure elegant code style and better code quality, we have prepared the following contribution guidelines.

## What We Accept

+ This PR fixes a typo or improves the documentation (if this is the case, you may skip the other checks).
+ This PR fixes a specific issue — please reference the issue number in the PR description. Make sure your code strictly follows the coding standards below.
+ This PR introduces a new feature — please clearly explain the necessity and implementation of the feature. Make sure your code strictly follows the coding standards below.

## Code Style Guide

Good code style is an art. We have prepared a `pyproject.toml` and a `pre-commit` hook to enforce consistent code formatting across the project. You can clean up your code following the steps below:

1. Install the required dependencies:
```shell
pip install ruff pre-commit
```
2. Then, run the following command:
```shell
pre-commit run --all-files
```
If your code complies with the standards, you should not see any errors.

## Naming Conventions

- Please use **English** for naming; do not use Pinyin or other languages. All comments should also be in English.
- Follow **PEP8** naming conventions strictly, and use underscores to separate words. Avoid meaningless names such as `a`, `b`, `c`.
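As a quick illustration of these conventions, a hypothetical helper might look like the following; the function and variable names are invented for this example only:

```python
def count_valid_samples(sample_labels, ignore_label=-1):
    """Count labels that are not the ignore marker.

    Descriptive snake_case names follow PEP8; avoid opaque
    names such as `a`, `b`, `c`.
    """
    return sum(1 for label in sample_labels if label != ignore_label)


# Three of the four labels are valid.
print(count_valid_samples([0, 1, -1, 2]))
```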
34 changes: 0 additions & 34 deletions .github/PULL_REQUEST_TEMPLATE/pr_template.md

This file was deleted.

27 changes: 27 additions & 0 deletions .github/workflows/python-lint.yml
@@ -0,0 +1,27 @@
name: Python Linting

on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.10'
cache: 'pip'

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pre-commit

- name: Run pre-commit
run: pre-commit run --all-files
2 changes: 1 addition & 1 deletion .gitignore
@@ -8,4 +8,4 @@ logs/
.idea
output*
test*
img
19 changes: 19 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,19 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.4.5
hooks:
- id: ruff
args: [--fix, --respect-gitignore, --config=pyproject.toml]
- id: ruff-format
args: [--config=pyproject.toml]

- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-toml
- id: check-case-conflict
- id: check-merge-conflict
- id: debug-statements
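The `trailing-whitespace` and `end-of-file-fixer` hooks above normalize whitespace automatically. A rough sketch of the transformation they apply (a simplification for illustration, not the hooks' actual code):

```python
def normalize_text(text: str) -> str:
    # Strip trailing spaces/tabs from every line (like trailing-whitespace).
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop trailing blank lines and end the file with exactly
    # one newline (like end-of-file-fixer).
    return "\n".join(lines).rstrip("\n") + "\n"


print(repr(normalize_text("output*  \ntest*\t\n\n")))
```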
49 changes: 18 additions & 31 deletions README.md
@@ -8,8 +8,8 @@
</div>

<p align="center">
<a href="https://huggingface.co/spaces/THUDM-HF-SPACE/CogView4" target="_blank"> 🤗 HuggingFace Space</a>
<a href="https://modelscope.cn/studios/ZhipuAI/CogView4" target="_blank"> 🤖ModelScope Space</a>
<a href="https://zhipuaishengchan.datasink.sensorsdata.cn/t/4z" target="_blank"> 🛠️ZhipuAI MaaS(Faster)</a>
<br>
<a href="resources/WECHAT.md" target="_blank"> 👋 WeChat Community</a> <a href="https://arxiv.org/abs/2403.05121" target="_blank">📚 CogView3 Paper</a>
@@ -19,7 +19,8 @@

## Project Updates

- 🔥🔥 ```2025/03/04```: We've adapted and open-sourced the [diffusers](https://github.com/huggingface/diffusers) version
- 🔥🔥 ```2025/03/24```: We are launching [CogKit](https://github.com/THUDM/CogKit), a powerful toolkit for fine-tuning and inference of the **CogView4** and **CogVideoX** series, allowing you to fully explore our multimodal generation models.
- ```2025/03/04```: We've adapted and open-sourced the [diffusers](https://github.com/huggingface/diffusers) version
of the **CogView-4** model, which has 6B parameters and supports native Chinese input and Chinese text-to-image generation.
You can try it [online](https://huggingface.co/spaces/THUDM-HF-SPACE/CogView4).
- ```2024/10/13```: We've adapted and open-sourced the [diffusers](https://github.com/huggingface/diffusers) version of
@@ -31,9 +32,9 @@

## Project Plan

- [X] Diffusers workflow adaptation
- [ ] Cog series fine-tuning kits (coming soon)
- [ ] ControlNet models and training code
- [X] Diffusers workflow adaptation
- [X] Cog series fine-tuning kits (available via [CogKit](https://github.com/THUDM/CogKit))
- [ ] ControlNet models and training code

## Community Contributions

@@ -160,7 +161,7 @@ python prompt_optimize.py --api_key "Zhipu AI API Key" --prompt {your prompt} --

### Inference Model

Run the model with `BF16` precision:
Run the model `CogView4-6B` with `BF16` precision:

```python
from diffusers import CogView4Pipeline
@@ -185,37 +186,23 @@ image = pipe(

image.save("cogview4.png")
```

For more inference code, please check:

1. To load the `text encoder` with `BNB int4`, with fully annotated inference code, see [here](inference/cli_demo_cogview4.py).
2. To load the `text encoder & transformer` with `TorchAO int8 or int4`, with fully annotated inference code, see [here](inference/cli_demo_cogview4_int8.py).
3. To set up a `gradio` GUI demo, see [here](inference/gradio_web_demo.py).
## Installation

```shell
git clone https://github.com/THUDM/CogView4
cd CogView4
git clone https://huggingface.co/THUDM/CogView4-6B
pip install -r inference/requirements.txt
```
## Quickstart

12G VRAM:
```shell
MODE=1 python inference/gradio_web_demo.py
```

24G VRAM, 32G RAM:
```shell
MODE=2 python inference/gradio_web_demo.py
```

24G VRAM, 64G RAM:
```shell
MODE=3 python inference/gradio_web_demo.py
```

48G VRAM, 64G RAM:
```shell
MODE=4 python inference/gradio_web_demo.py
```


## Fine-tuning

This repository does not contain fine-tuning code, but you can fine-tune the model, with both LoRA and SFT supported, using the following approaches:

1. [CogKit](https://github.com/THUDM/CogKit), our officially maintained system-level fine-tuning framework that supports CogView4 and CogVideoX.
2. [finetrainers](https://github.com/a-r-r-o-w/finetrainers), a low-memory solution that enables fine-tuning on a single RTX 4090.
3. If you want to train ControlNet models directly, you can refer to the [training code](https://github.com/huggingface/diffusers/tree/main/examples/cogview4-control) and train your own models.

## License

69 changes: 27 additions & 42 deletions README_ja.md
@@ -8,8 +8,8 @@

</div>
<p align="center">
<a href="https://huggingface.co/spaces/THUDM-HF-SPACE/CogView4" target="_blank"> 🤗 HuggingFace Space</a>
<a href="https://modelscope.cn/studios/ZhipuAI/CogView4" target="_blank"> 🤖ModelScope Space</a>
<a href="https://zhipuaishengchan.datasink.sensorsdata.cn/t/4z" target="_blank"> 🛠️ZhipuAI MaaS(Faster)</a>
<br>
<a href="resources/WECHAT.md" target="_blank"> 👋 WeChat Community</a> <a href="https://arxiv.org/abs/2403.05121" target="_blank">📚 CogView3 Paper</a>
@@ -20,7 +20,9 @@

## プロジェクトの更新

- 🔥🔥 ```2025/03/04```: [diffusers](https://github.com/huggingface/diffusers) バージョンの **CogView-4**
- 🔥🔥 ```2025/03/24```: [CogView4-6B-Control](https://huggingface.co/THUDM/CogView4-6B-Control) モデルをリリースしました![トレーニングコード](https://github.com/huggingface/diffusers/tree/main/examples/cogview4-control) を使用して、自身でトレーニングすることも可能です。
さらに、**CogView4** および **CogVideoX** シリーズのファインチューニングと推論を簡単に行えるツールキット [CogKit](https://github.com/THUDM/CogKit) も公開しました。私たちのマルチモーダル生成モデルを存分に活用してください!
- ```2025/03/04```: [diffusers](https://github.com/huggingface/diffusers) バージョンの **CogView-4**
モデルを適応し、オープンソース化しました。このモデルは6Bのパラメータを持ち、ネイティブの中国語入力と中国語のテキストから画像生成をサポートしています。オンラインで試すことができます [こちら](https://huggingface.co/spaces/THUDM-HF-SPACE/CogView4)。
- ```2024/10/13```: [diffusers](https://github.com/huggingface/diffusers) バージョンの **CogView-3Plus-3B**
モデルを適応し、オープンソース化しました。オンラインで試すことができます [こちら](https://huggingface.co/spaces/THUDM-HF-SPACE/CogView3-Plus-3B-Space)。
@@ -31,7 +33,7 @@
## プロジェクト計画

- [X] Diffusers ワークフローの適応
- [ ] Cogシリーズのファインチューニングスイート (近日公開)
- [X] Cogシリーズのファインチューニングスイート ([CogKit](https://github.com/THUDM/CogKit) で公開済み)
- [ ] ControlNetモデルとトレーニングコード

## コミュニティの取り組み
@@ -85,12 +87,12 @@

DITモデルは `BF16` 精度と `batchsize=4` でテストされ、結果は以下の表に示されています:

| 解像度 | enable_model_cpu_offload OFF | enable_model_cpu_offload ON | enable_model_cpu_offload ON </br> Text Encoder 4bit |
|-------------|------------------------------|-----------------------------|-----------------------------------------------------|
| 512 * 512 | 33GB | 20GB | 13G |
| 1280 * 720 | 35GB | 20GB | 13G |
| 1024 * 1024 | 35GB | 20GB | 13G |
| 1920 * 1280 | 39GB | 20GB | 14G |

さらに、プロセスが強制終了されないようにするために、少なくとも`32GB`のRAMを持つデバイスを推奨します。

@@ -157,7 +159,7 @@ python prompt_optimize.py --api_key "Zhipu AI API Key" --prompt {your prompt} --

### 推論モデル

`BF16` 精度でモデルを実行します
`BF16` の精度で `CogView4-6B` モデルを実行する

```python
from diffusers import CogView4Pipeline
@@ -182,37 +184,20 @@ image = pipe(

image.save("cogview4.png")
```
For more inference code, please check:

1. To load the `text encoder` with `BNB int4`, with fully annotated inference code, see [here](inference/cli_demo_cogview4.py).
2. To load the `text encoder & transformer` with `TorchAO int8 or int4`, with fully annotated inference code, see [here](inference/cli_demo_cogview4_int8.py).
3. To set up a `gradio` GUI demo, see [here](inference/gradio_web_demo.py).
## Installation

```shell
git clone https://github.com/THUDM/CogView4
cd CogView4
git clone https://huggingface.co/THUDM/CogView4-6B
pip install -r inference/requirements.txt
```
## Quickstart

12G VRAM:
```shell
MODE=1 python inference/gradio_web_demo.py
```

24G VRAM, 32G RAM:
```shell
MODE=2 python inference/gradio_web_demo.py
```

24G VRAM, 64G RAM:
```shell
MODE=3 python inference/gradio_web_demo.py
```

48G VRAM, 64G RAM:
```shell
MODE=4 python inference/gradio_web_demo.py
```

より詳しい推論コードについては、以下をご確認ください:

1. `BNB int4` を使用して `text encoder` をロードし、完全な推論コードの注釈を確認するには、[こちら](inference/cli_demo_cogview4.py) をご覧ください。
2. `TorchAO int8 または int4` を使用して `text encoder & transformer` をロードし、完全な推論コードの注釈を確認するには、[こちら](inference/cli_demo_cogview4_int8.py) をご覧ください。
3. `gradio` GUI デモをセットアップするには、[こちら](inference/gradio_web_demo.py) をご覧ください。

## ファインチューニング(微調整)

このリポジトリにはファインチューニング用のコードは含まれていませんが、LoRA および SFT を含む以下の方法でファインチューニングが可能です:

1. [CogKit](https://github.com/THUDM/CogKit):CogView4 および CogVideoX のファインチューニングをサポートする、公式で保守されているシステムレベルのファインチューニングフレームワークです。
2. [finetrainers](https://github.com/a-r-r-o-w/finetrainers):低メモリ環境向けのソリューションで、RTX 4090 でのファインチューニングが可能です。
3. ControlNet モデルを直接訓練したい場合は、[トレーニングコード](https://github.com/huggingface/diffusers/tree/main/examples/cogview4-control) を参考にして自前で訓練することができます。

## ライセンス
