Given the change in output shape/behavior in pytorch/pytorch#139611 + #1278:

Question: What is the recommended way of migrating to the new CPU implementations of

- _weight_int4pack_mm_for_cpu
- _convert_weight_to_int4pack_for_cpu

while maintaining the previous behavior?

Specifically, the existing code calls _convert_weight_to_int4pack as follows:
```python
q, s, z = Q4_0.unpack(t)
scales_and_zeros = pack_scales_and_zeros(s, z)
# Pack two int4 values into each byte: [N, K] int -> [N, K/2] uint8
q_uint8 = (q[::, ::2] << 4 | q[::, 1::2]).to(torch.uint8)
weight_int4pack = torch.ops.aten._convert_weight_to_int4pack(
    q_uint8, inner_k_tiles
)
c = torch.ops.aten._weight_int4pack_mm(
    input,
    weight_int4pack,
    groupsize,
    scales_and_zeros,
)
```
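For context, my understanding (an assumption worth double-checking against the PRs) is that the old op returns an int32 tensor in a 4D tile layout [N/8, K/(inner_k_tiles*16), 32, inner_k_tiles/2], while the new CPU op returns a plain uint8 tensor of shape [N, K/2]. A quick shape check under that assumption, with N, K, and inner_k_tiles inferred from the error message further below:

```python
# Hypothetical shape bookkeeping -- the 4D tile layout is an assumption,
# and N/K/inner_k_tiles are inferred from the size-mismatch error below.
N, K, inner_k_tiles = 2048, 2048, 8

old_shape = (N // 8, K // (inner_k_tiles * 16), 32, inner_k_tiles // 2)
new_shape = (N, K // 2)

print(old_shape)  # (256, 16, 32, 4) -- the "current model" shape below
print(new_shape)  # (2048, 1024)     -- the checkpoint shape below
```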
Tested: With no code changes

The following error is encountered:

```
Could not run 'aten::_convert_weight_to_int4pack' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend
```
Tested: Naive (just add *_for_cpu)

A size mismatch is encountered (expected, since the signatures differ):

```
size mismatch for model.layers.0.attention.wq.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([256, 16, 32, 4]).
```
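If those layouts are right, the two shapes in the mismatch are just the two packing formats ([2048, 1024] being the new uint8 [N, K/2] output, [256, 16, 32, 4] the old 4D tile layout), and the remaining question is what the *_for_cpu ops expect as input. Is the intended migration something like the sketch below? The key assumption, that _convert_weight_to_int4pack_for_cpu takes the unpacked int32 [N, K] weight so the manual nibble-packing step is dropped, is a guess rather than something confirmed:

```python
# Hypothetical migration sketch -- input dtypes/shapes are assumptions.
q, s, z = Q4_0.unpack(t)
scales_and_zeros = pack_scales_and_zeros(s, z)

# Assumed: the CPU variant takes the unpacked int32 weight [N, K]
# (nibble values 0..15), so the uint8 nibble-packing step goes away.
weight_int4pack = torch.ops.aten._convert_weight_to_int4pack_for_cpu(
    q.to(torch.int32), inner_k_tiles
)  # assumed to return uint8 [N, K/2] rather than the old 4D tile layout

c = torch.ops.aten._weight_int4pack_mm_for_cpu(
    input,
    weight_int4pack,
    groupsize,
    scales_and_zeros,
)
```

If that is the case, checkpoints saved in the old packed layout would presumably also need to be re-converted rather than loaded directly.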
cc @yanbing-j @jerryzh168, who worked on these changes.