
The content in the repository does not match the description in the paper #221

@X-ShareAiLab

The paper describes using VM, VLM, and LLM for dictionary initialization, but I could not find the corresponding code in the repository:

No pipeline uses YOLOv9/CLIP/LLM as an "Initializer". CLIP/LLM-related components exist under the ultralytics/ directory, but they are not integrated with the Re-parameterization Dictionary (RD).
No script or entry point iterates through the training set and exports intermediate features, although the paper describes iterating through the dataset.
No k-means/centroid logic related to RD exists. k-means appears only in ultralytics/ultralytics/engine/exporter.py, where it is used for quantization and is unrelated to RD.
No initialization logic writes centroids into the RD module's weights. DConv uses standard random initialization and is trained end-to-end.
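
For anyone who wants to reproduce the search behind the findings above, here is a minimal sketch using only the Python standard library. The function names `scan_repo` and `files_linking_clustering_to_rd` and the exact regular expressions are my own illustration, not part of the repository; a text match is only a hint and still needs manual inspection.

```python
import re
from pathlib import Path

# Illustrative patterns only: clustering-related terms and the RD module name.
PATTERNS = {
    "kmeans": re.compile(r"k[-_]?means", re.IGNORECASE),
    "centroid": re.compile(r"centroid", re.IGNORECASE),
    "DConv": re.compile(r"\bDConv\b"),
}


def scan_repo(root: str) -> dict[str, set[str]]:
    """Map each Python file under root to the set of pattern names it matches."""
    hits: dict[str, set[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        matched = {name for name, rx in PATTERNS.items() if rx.search(text)}
        if matched:
            hits[str(path.relative_to(root))] = matched
    return hits


def files_linking_clustering_to_rd(hits: dict[str, set[str]]) -> list[str]:
    """Files that mention DConv together with k-means or centroids."""
    return [
        name
        for name, found in hits.items()
        if "DConv" in found and found & {"kmeans", "centroid"}
    ]
```

Calling `files_linking_clustering_to_rd(scan_repo("."))` from the repository root lists the files where clustering terms and `DConv` occur together. An empty list is consistent with the findings above, but it does not prove that no such logic exists elsewhere (for example in non-Python files).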

Existing RD Implementation Location:

Core Module: YOLO/yolo/model/module.py - class DConv(in_channels=512, alpha=0.8, atoms=512): Implements the "dictionary space" mapping, atom modeling, and decoding with a three-stage convolution (1×1, 5×5 depthwise, 1×1). There is no interface for loading an external dictionary or for offline initialization.

class RepNCSPELAND(..., atoms: 512, rd_args={}): Appends DConv after RepNCSPELAN and merely passes through the atoms and rd_args parameters (such as in_channels and alpha).
Network Builder: YOLO/yolo/model/yolo.py builds the network layers dynamically; no "pre-trained dictionary" weights are injected.
Configuration Example: YOLO/yolo/config/model/rd-9c.yaml (enables RD and adjusts its capacity via atoms).

Additional Notes

The k-means hits within the project come from export quantization (ultralytics/.../exporter.py) and are unrelated to RD dictionary initialization.
CLIP/text encoders and others exist within the ultralytics/ ecosystem (for YOLO-World, YOLOE, etc.) but do not form a closed loop with RD for "encoder → clustering → initialization".
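
To make the missing "encoder → clustering → initialization" loop concrete, below is a minimal PyTorch sketch of what I expected to find. Everything in it is an assumption for illustration: the random tensor stands in for encoder features collected over the training set, `kmeans` and `init_conv_from_centroids` are hypothetical helpers, and writing the centroids into a 1×1 convolution is my guess at how a dictionary could be injected, not the authors' method.

```python
import torch
import torch.nn as nn


def kmeans(features: torch.Tensor, k: int, iters: int = 10, seed: int = 0) -> torch.Tensor:
    """Plain Lloyd's k-means on an (N, C) feature matrix; returns (k, C) centroids."""
    gen = torch.Generator().manual_seed(seed)
    # Start from k distinct samples chosen at random.
    start = torch.randperm(features.shape[0], generator=gen)[:k]
    centroids = features[start].clone()
    for _ in range(iters):
        # Assign every feature vector to its nearest centroid.
        assign = torch.cdist(features, centroids).argmin(dim=1)
        for j in range(k):
            members = features[assign == j]
            if members.numel() > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(dim=0)
    return centroids


@torch.no_grad()
def init_conv_from_centroids(conv: nn.Conv2d, centroids: torch.Tensor) -> None:
    """Copy (atoms, C) centroids into a 1x1 convolution's (atoms, C, 1, 1) weight."""
    atoms, channels = centroids.shape
    assert conv.weight.shape == (atoms, channels, 1, 1)
    conv.weight.copy_(centroids.view(atoms, channels, 1, 1))


# Stand-in for encoder features gathered over the training set: (N, C).
features = torch.randn(256, 16)
atoms = 8
dictionary = kmeans(features, k=atoms)

# Hypothetical stand-in for the first 1x1 convolution of an RD-style block.
proj = nn.Conv2d(16, atoms, kernel_size=1, bias=False)
init_conv_from_centroids(proj, dictionary)
```

In the repository I could not find any step equivalent to `init_conv_from_centroids`, and that is the gap this issue is about.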

I would appreciate an explanation from the authors. I ran into this while migrating the project: the repository appears to be missing the code for Figure 2 (page 4) and Section 3.2, "DICTIONARY INITIALIZATION" (page 5), of the paper.

