Hello,
What is the best method to quantize a BERT model in int4 using ipex?
For example, the int8 dynamic quantization flow from the IPEX docs is:

import torch
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

qconfig = ipex.quantization.default_dynamic_qconfig
prepared_model = prepare(model, qconfig, example_inputs=data)
converted_model = convert(prepared_model)
with torch.no_grad():
    traced_model = torch.jit.trace(converted_model, data, check_trace=False, strict=False)
    traced_model = torch.jit.freeze(traced_model)
traced_model.save("int8_quantized_model.pt")
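For context on what I mean by int4: weight-only 4-bit quantization, i.e. each weight stored in a [-8, 7] range with a per-row scale, two values packed per byte. A minimal plain-PyTorch sketch of that idea (illustrative helper names, not an IPEX API):

```python
import torch

def quantize_int4(weight):
    # Symmetric per-row quantization to the 4-bit range [-8, 7].
    scale = weight.abs().amax(dim=1, keepdim=True) / 7.0
    q = torch.clamp(torch.round(weight / scale), -8, 7).to(torch.int8)
    return q, scale

def pack_int4(q):
    # Pack two 4-bit values into each uint8 byte (offset-binary encoding).
    u = (q.to(torch.int16) + 8).to(torch.uint8)  # shift [-8, 7] -> [0, 15]
    return u[:, ::2] | (u[:, 1::2] << 4)

def unpack_int4(packed, shape):
    # Recover the signed 4-bit values and restore the original layout.
    lo = (packed & 0xF).to(torch.int16) - 8
    hi = (packed >> 4).to(torch.int16) - 8
    return torch.stack((lo, hi), dim=-1).reshape(shape).to(torch.int8)

w = torch.randn(4, 8)
q, scale = quantize_int4(w)
packed = pack_int4(q)  # half the bytes of the int8 representation
w_hat = unpack_int4(packed, q.shape).to(torch.float32) * scale
max_err = (w - w_hat).abs().max().item()
```

The packing halves storage relative to int8; the open question is which IPEX entry point produces this for a BERT model.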
How should this be done for int4 (4-bit)?
Thank you,
Hank