[QAT] Linear layer's weight quantization granularity can only be per_group #2189

Open
@Cypher-Bruce

Although fake_quantizer implements a forward method for all three kinds of granularity, only per_group can be used for linear layer weights in QAT. This is because when FakeQuantizedLinear's __init__ is called, it first tries to read the group size from the weight config:

        # initialize weight fake quantizer
        if weight_config is not None:
            group_size = weight_config.group_size
            if group_size is not None and in_features % group_size != 0:
                raise ValueError(
                    "in_features (%s) %% group_size (%s) must be == 0"
                    % (in_features, group_size)
                )
            self.weight_fake_quantizer = FakeQuantizer(weight_config)
        else:
            self.weight_fake_quantizer = None

If the weight config uses any granularity other than per_group, reading group_size raises an exception:

    def group_size(self) -> int:
        """
        If this is per group granularity, return the group size.
        Otherwise, throw an error.
        """
        if isinstance(self.granularity, PerGroup):
            return self.granularity.group_size
        else:
            raise ValueError(
                "`group_size` is undefined for %s granularity" % self.granularity
            )

An easy fix is to check the granularity type before reading the group size:

        # initialize weight fake quantizer
        if weight_config is not None:
            if isinstance(weight_config.granularity, PerGroup):
                group_size = weight_config.group_size
                if group_size is not None and in_features % group_size != 0:
                    raise ValueError(
                        "in_features (%s) %% group_size (%s) must be == 0"
                        % (in_features, group_size)
                    )
            self.weight_fake_quantizer = FakeQuantizer(weight_config)
        else:
            self.weight_fake_quantizer = None
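
To illustrate the difference, here is a minimal sketch that runs without the library. `PerGroup`, `PerAxis`, and `StubWeightConfig` are hypothetical stand-ins written for this example, not the real classes; the stub copies only the `group_size` behaviour quoted above, and it assumes `group_size` is exposed as a property, as the attribute-style access in `__init__` suggests. `check_unguarded` and `check_guarded` are likewise illustrative names for the current check and the proposed one.

```python
from dataclasses import dataclass


@dataclass
class PerGroup:
    """Hypothetical stand-in for the per-group granularity."""
    group_size: int


@dataclass
class PerAxis:
    """Hypothetical stand-in for any granularity that is not per-group."""
    axis: int


class StubWeightConfig:
    """Hypothetical config that mimics only the group_size accessor."""

    def __init__(self, granularity):
        self.granularity = granularity

    @property
    def group_size(self) -> int:
        if isinstance(self.granularity, PerGroup):
            return self.granularity.group_size
        raise ValueError(
            "`group_size` is undefined for %s granularity" % self.granularity
        )


def _validate(in_features: int, group_size) -> None:
    # Same divisibility check as in the snippets above.
    if group_size is not None and in_features % group_size != 0:
        raise ValueError(
            "in_features (%s) %% group_size (%s) must be == 0"
            % (in_features, group_size)
        )


def check_unguarded(weight_config, in_features: int) -> None:
    """Current behaviour: reads group_size for every granularity."""
    _validate(in_features, weight_config.group_size)


def check_guarded(weight_config, in_features: int) -> None:
    """Proposed behaviour: reads group_size only for per-group configs."""
    if isinstance(weight_config.granularity, PerGroup):
        _validate(in_features, weight_config.group_size)


per_axis = StubWeightConfig(PerAxis(axis=0))

try:
    check_unguarded(per_axis, in_features=64)
except ValueError as err:
    print("unguarded check failed:", err)

# The guarded check skips the group_size lookup, so no exception is raised.
check_guarded(per_axis, in_features=64)
```

With the guard in place, the divisibility check still applies to per-group configs, so an incompatible group size is rejected as before.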
