How to quantize both the vision encoder and the LLM together? #1998

@XULU42

I am trying to quantize the dots_ocr model, whose submodules are ['model','vision_tower','lm_head'], as shown below:

[Image: the model's submodules]

Tracing through the code, I found that this code parses the modules to be quantized, but it only returns one module. In my experiments, I can quantize either vision_tower or model on its own and load the result with vLLM, but I can't quantize both at once because of the mechanism above. Please help me figure out how to solve this, thanks ^_^
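To make the limitation concrete, here is a minimal sketch in plain Python. It is not the library's actual code: `toy_model`, `resolve`, `first_match`, `all_matches`, and the attribute names `layers` and `blocks` are all hypothetical stand-ins chosen for illustration. It only contrasts a lookup that stops at the first matching container with one that collects every matching container, which is roughly what quantizing both parts would seem to need.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the model: plain objects instead of real modules.
# The inner attribute names ("layers", "blocks") are assumptions, not dots_ocr's.
toy_model = SimpleNamespace(
    model=SimpleNamespace(layers=["llm_layer_0", "llm_layer_1"]),
    vision_tower=SimpleNamespace(blocks=["vit_block_0", "vit_block_1"]),
    lm_head="lm_head",
)


def resolve(root, dotted_path):
    """Follow a dotted attribute path such as 'model.layers' starting at root."""
    obj = root
    for name in dotted_path.split("."):
        obj = getattr(obj, name)
    return obj


def first_match(root, paths):
    """Mimic the behaviour described above: return only the first container found."""
    for path in paths:
        try:
            return path, resolve(root, path)
        except AttributeError:
            continue
    raise ValueError("no path matched")


def all_matches(root, paths):
    """Collect every container that resolves, so both parts can be processed."""
    found = {}
    for path in paths:
        try:
            found[path] = resolve(root, path)
        except AttributeError:
            continue
    return found


paths = ["model.layers", "vision_tower.blocks"]
print(first_match(toy_model, paths))   # only the language-model layers
print(all_matches(toy_model, paths))   # language-model layers and vision blocks
```

With a single-result lookup, whichever path comes first wins and the other container is never visited, which matches what I see when only one of vision_tower or model ends up quantized.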
