[Bug]: AutoAWQ marlin methods error #7517

Open
@MichoChan

Description

Your current environment

vllm 0.5.4

🐛 Describe the bug

AutoAWQ Marlin checkpoints must be quantized with no zero point, but vLLM has:

def query_marlin_supported_quant_types(has_zp: bool,
                                       min_capability: Optional[int] = None):
    if min_capability is None:
        major, minor = current_platform.get_device_capability()
        min_capability = major * 10 + minor

    if min_capability < 80:
        return []

    if has_zp:
        # AWQ style, unsigned + runtime zero-point
        return [scalar_types.uint4, scalar_types.uint8]
    else:
        # GPTQ style, unsigned + symmetric bias
        # TODO: once fp8_marlin is merged into "gptq_marlin" we should be able
        #  to add `scalar_types.float8_e4m3fn` here
        return [scalar_types.uint4b8, scalar_types.uint8b128]

This causes an error.
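The mismatch can be sketched without vLLM or a GPU. The snippet below is only an illustration: `supported_quant_types` and `is_supported` are hypothetical names, the scalar types are replaced by plain strings, and the device capability is passed in rather than read from `current_platform`. It assumes the failure comes from an AWQ-style type (`uint4`) being checked against the `has_zp=False` branch, which the report does not confirm.

```python
from typing import List

# Hypothetical string stand-ins for vLLM's scalar_types, so the sketch
# runs without vLLM installed.
UINT4, UINT8, UINT4B8, UINT8B128 = "uint4", "uint8", "uint4b8", "uint8b128"


def supported_quant_types(has_zp: bool, capability: int = 80) -> List[str]:
    """Same branching as the quoted function, with the device capability
    supplied by the caller instead of queried from the platform."""
    if capability < 80:
        return []
    if has_zp:
        # AWQ style, unsigned + runtime zero-point
        return [UINT4, UINT8]
    # GPTQ style, unsigned + symmetric bias
    return [UINT4B8, UINT8B128]


def is_supported(quant_type: str, has_zp: bool, capability: int = 80) -> bool:
    """Hypothetical helper: is quant_type accepted for this zero-point mode?"""
    return quant_type in supported_quant_types(has_zp, capability)


# Without zero points, only the GPTQ-style types are returned.
print(supported_quant_types(has_zp=False))  # ['uint4b8', 'uint8b128']
# An AWQ-style uint4 type is therefore rejected when has_zp is False.
print(is_supported(UINT4, has_zp=False))    # False
```

If that assumption holds, a checkpoint exported without zero points never matches the AWQ-style branch, which would be consistent with the error described.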

Metadata

Labels: bug, stale
