Hi guys, first of all, incredible work👍
Just a quick design question about the LLM.int8 activation quantization path.
### Context
In `MatMul8bitLt.forward`, activations `A` are always cast to FP16 before quantization:

```python
# bitsandbytes/autograd/_functions.py :: MatMul8bitLt.forward
CA, SCA, outlier_cols = F.int8_vectorwise_quant(A.to(torch.float16), threshold=state.threshold)
```
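For reference, here is a minimal dtype-generic sketch (my own illustration, not the library's code) of what row-wise absmax int8 quantization computes. The real kernel additionally splits out columns whose absmax exceeds `threshold` as outliers, which this sketch omits; the point is only that the core math has no inherent FP16 dependency:

```python
import torch

def int8_vectorwise_quant_ref(A: torch.Tensor):
    """Hypothetical reference: row-wise absmax quantization to int8.

    Works for fp16 and bf16 alike; computes in fp32 for the reduction.
    Omits the LLM.int8 outlier-column split controlled by `threshold`.
    """
    absmax = A.abs().amax(dim=-1, keepdim=True).float()  # per-row absmax
    scale = absmax.clamp(min=1e-12) / 127.0              # avoid div-by-zero
    CA = torch.round(A.float() / scale).clamp(-127, 127).to(torch.int8)
    return CA, scale.squeeze(-1)

A = torch.tensor([[1.0, -2.0, 4.0]], dtype=torch.bfloat16)
CA, scale = int8_vectorwise_quant_ref(A)
# CA -> [[32, -64, 127]]  (4.0 maps to 127; the rest scale by 127/4)
```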
and the CUDA kernel implementation currently hard-requires FP16:

```python
# bitsandbytes/backends/cuda/ops.py
@register_kernel("bitsandbytes::int8_vectorwise_quant", "cuda")
def _(A, threshold=0.0):
    torch._check(A.dtype == torch.float16, ...)
    lib.cint8_vector_quant(get_ptr(A), ...)
```
On the native side, the exported ABI and launcher are also half-only:

- `csrc/pythonInterface.cpp`: `cint8_vector_quant(half* A, ...)`
- `csrc/ops.cu`: `int8VectorQuant(half* A, ...)`
- `csrc/kernels.cu`: kernel instantiations only for `half`
So BF16 inputs incur an extra `bf16 -> fp16` cast (plus a warning), even though the rest of the pipeline tries to preserve the output dtype (`int8_scaled_mm(..., dtype=A.dtype)`).
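One practical consequence of the forced cast (a minimal illustration of dtype ranges, not code from the library): BF16 shares FP32's 8-bit exponent, while FP16's largest finite value is about 65504, so BF16 activations with very large outlier magnitudes saturate to `inf` on the way into quantization:

```python
import torch

# BF16 has FP32's exponent range; FP16 tops out near 65504.
a = torch.tensor([3.0, -70000.0], dtype=torch.bfloat16)

# The cast applied before quantization overflows the large element to -inf.
print(a.to(torch.float16))  # the -70000.0 entry becomes -inf
```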
### Question
Was the FP16-only design for `int8_vectorwise_quant` / LLM.int8 activation quantization intentional (e.g. for kernel simplicity, CUB reduction constraints, or arch compatibility), or is it mainly an unimplemented gap?
I'm asking because many LLM inference/training frameworks now default to BF16 activations, while this path currently forces an FP16 cast just for quantization.