support added for HF SmolLM3-3B by AshutoshSinghIntel · Pull Request #1715 · huggingface/optimum-intel

AshutoshSinghIntel · 2026-05-04T09:55:55Z

What does this PR do?

OpenVINO export:

optimum-cli export openvino -m HuggingFaceTB/SmolLM3-3B ./SmolLM3-3B --task text-generation-with-past
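
For reference, the same export can be sketched from Python instead of the CLI. This is a minimal sketch, assuming `optimum-intel` is installed with OpenVINO support and that `export=True` converts the checkpoint while loading. The `export_smollm3` helper name and the output directory are illustrative and not part of this PR.

```python
def export_smollm3(model_id="HuggingFaceTB/SmolLM3-3B", output_dir="./SmolLM3-3B"):
    """Convert a Hugging Face checkpoint to OpenVINO IR and save it with its tokenizer.

    Hypothetical helper for illustration; calling it downloads the model.
    """
    # Imported lazily so the module itself loads without optimum-intel installed.
    from transformers import AutoTokenizer
    from optimum.intel.openvino import OVModelForCausalLM

    # export=True asks optimum-intel to convert the PyTorch checkpoint on load.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Save the converted model and tokenizer side by side for later inference.
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return output_dir
```

The returned directory could then be passed to the inference script through `--model`.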

Inference Script:

import argparse
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_id = "HuggingFaceTB/SmolLM3-3B"

def main():
    parser = argparse.ArgumentParser(description="SmolLM3-3B inference with OpenVINO")
    parser.add_argument("--model", type=str, default=model_id, help="Path to exported OV model or HF model ID")
    parser.add_argument("--max-new-tokens", type=int, default=100)
    parser.add_argument("--device", type=str, default="CPU")
    args = parser.parse_args()

    model = OVModelForCausalLM.from_pretrained(args.model, device=args.device)
    tokenizer = AutoTokenizer.from_pretrained(args.model)

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ]

    inputs = tokenizer.apply_chat_template(messages, tokenize=True, return_dict=True, return_tensors="pt", add_generation_prompt=True)
    output = model.generate(**inputs, max_new_tokens=args.max_new_tokens)
    print(tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()

Fixes # CVS-183437

Before submitting

[N/A] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2026-05-04T10:02:33Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

rkazants

Please provide a proper PR description with code snippets for export and inference. See #1688 for reference.

The rest looks good to me.

Copilot

Pull request overview

Adds OpenVINO export support for the HuggingFace Transformers smollm3 architecture (SmolLM3-3B), wiring it into the OpenVINO TasksManager configs, test matrices, and the supported-models documentation.

Changes:

Register a new SmolLM3OpenVINOConfig (Llama-based) for multiple tasks with a minimum transformers version of 4.53.0.
Extend OpenVINO exporter and GenAI/decoder/CLI test coverage to include smollm3, and add a tiny internal test model id.
Update OpenVINO supported-architectures documentation to list SmolLM3, and remove smollm3 from the “ONNX supported but untested” warning set.
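
The version gate mentioned in the first item can be illustrated with a small standalone check. This is only a sketch of the idea and not how optimum-intel implements it: `parse_release` and `is_supported` are hypothetical helpers, and pre-release suffixes such as `.dev0` are ignored for simplicity.

```python
import re

# Minimum transformers version quoted in the review summary.
MIN_TRANSFORMERS_VERSION = "4.53.0"


def parse_release(version):
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_supported(installed, minimum=MIN_TRANSFORMERS_VERSION):
    """Return True when the installed release is at least the minimum release."""
    installed_release = parse_release(installed)
    minimum_release = parse_release(minimum)
    # Pad the shorter tuple with zeros so "4.53" compares equal to "4.53.0".
    width = max(len(installed_release), len(minimum_release))
    installed_release += (0,) * (width - len(installed_release))
    minimum_release += (0,) * (width - len(minimum_release))
    return installed_release >= minimum_release


if __name__ == "__main__":
    for candidate in ("4.52.4", "4.53.0", "4.100.1"):
        print(candidate, is_supported(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering `4.100.1` below `4.53.0`.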

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `optimum/exporters/openvino/model_configs.py` | Registers `smollm3` in the OpenVINO TasksManager with an OpenVINO config class and version gate. |
| `optimum/exporters/openvino/utils.py` | Removes `smollm3` from `ONNX_SUPPORTED_ARCHITECTURES` so it’s no longer treated as “untested / export at your own risk”. |
| `tests/openvino/utils_tests.py` | Adds an internal tiny test model mapping for `smollm3`. |
| `tests/openvino/test_genai.py` | Includes `smollm3` in the GenAI LLM pipeline supported-architecture matrix for `transformers>=4.53.0`. |
| `tests/openvino/test_exporters_cli.py` | Adds CLI export test coverage and expected tokenizer artifact counts for `smollm3`. |
| `tests/openvino/test_export.py` | Adds `smollm3` to the export integration test architecture mapping. |
| `tests/openvino/test_decoder.py` | Adds `smollm3` to the decoder integration supported-architecture matrix for `transformers>=4.53.0`. |
| `docs/source/openvino/models.mdx` | Documents SmolLM3 as a supported architecture. |

regisss · 2026-05-05T08:05:27Z

LGTM.
I agree with @rkazants' comment; let's stay consistent with previous PRs, please.

AshutoshSinghIntel · 2026-05-05T09:25:50Z

Thanks @rkazants and @regisss, I have updated the PR description.

AshutoshSinghIntel · 2026-05-05T09:28:26Z

@IlyasMoutawwakil Could you please give me access to https://huggingface.co/optimum-intel-internal-testing so I can upload tiny-random-smollm3? I ran the tests locally and they pass.

rkazants · 2026-05-05T11:52:11Z

@popovaan, please take a look:

check that the real model works
ask for WWB metrics

AshutoshSinghIntel · 2026-05-07T12:57:14Z

The WWB similarity score is 1.0 (using CPU, fp16, and the default number of samples, which was 27).

Similarity evaluation:  96%|#########6| 26/27 [00:05<00:00,  5.61it/s]
Similarity evaluation: 100%|##########| 27/27 [00:05<00:00,  5.67it/s]
Similarity evaluation: 100%|##########| 27/27 [00:05<00:00,  4.74it/s]
INFO:whowhatbench.wwb:Metrics for model: SmolLM3-3B
INFO:whowhatbench.wwb:   similarity
0         1.0

AshutoshSinghIntel · 2026-05-07T12:59:15Z

Below is the script for creating the tiny model (I do not have access to publish it yet):

from transformers import (
    AutoTokenizer,
    SmolLM3Config,
    SmolLM3ForCausalLM,
)

def create_tiny_random_smollm3():

    config = SmolLM3Config(
        vocab_size=128256,
        hidden_size=32,
        intermediate_size=64,
        num_hidden_layers=2,
        num_attention_heads=4,
        num_key_value_heads=2,
        max_position_embeddings=256,
        hidden_act="silu",
        rms_norm_eps=1e-6,
        tie_word_embeddings=True,
        use_cache=True,
        attention_bias=False,
        mlp_bias=False,
        use_sliding_window=False,
        pad_token_id=128004,
        bos_token_id=128000,
        eos_token_id=128012,
    )

    model = SmolLM3ForCausalLM(config)
    print(f"Model parameters: {sum(p.numel() for p in model.parameters()):,}")

    # Use SmolLM3-3B tokenizer
    tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

    output_dir = "./tiny-random-smollm3"
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    print(f"Saved to {output_dir}")

if __name__ == "__main__":
    create_tiny_random_smollm3()
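
As a rough sanity check on the parameter count the script prints, the size can be estimated by hand from the config values. This is a back-of-the-envelope sketch that assumes a Llama-style layer layout (q/k/v/o projections, a gate/up/down MLP, two RMSNorm weights per layer, one final norm), no biases, and tied embeddings. The `estimate_tiny_smollm3_params` helper is hypothetical, and the real model may differ if its layout does.

```python
def estimate_tiny_smollm3_params(
    vocab_size=128256,
    hidden_size=32,
    intermediate_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
):
    """Estimate the parameter count under an assumed Llama-style layout."""
    head_dim = hidden_size // num_attention_heads
    kv_dim = num_key_value_heads * head_dim

    # Token embedding, shared with the LM head because tie_word_embeddings=True.
    embedding = vocab_size * hidden_size
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # gate_proj, up_proj and down_proj each hold hidden x intermediate weights.
    mlp = 3 * hidden_size * intermediate_size
    # Input and post-attention RMSNorm weights.
    norms = 2 * hidden_size

    per_layer = attention + mlp + norms
    final_norm = hidden_size
    return embedding + num_hidden_layers * per_layer + final_norm


if __name__ == "__main__":
    # 4,122,784 under the assumptions above; almost all of it is the embedding.
    print(f"{estimate_tiny_smollm3_params():,}")
```

The estimate makes clear that the 128256-entry vocabulary dominates the model size, so shrinking the layers further would barely change the checkpoint.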

popovaan · 2026-05-07T13:02:37Z

Please add quantization tests.

popovaan · 2026-05-07T13:05:08Z

There was a warning during conversion of the model "The OpenVINO export of smollm3 models is not officially supported by optimum-intel, export at your own risks.", was it fixed?

AshutoshSinghIntel · 2026-05-07T13:51:31Z

> There was a warning during conversion of the model "The OpenVINO export of smollm3 models is not officially supported by optimum-intel, export at your own risks.", was it fixed?

With the dedicated config added by this PR, the warning does not appear.
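
One way to picture this, going by the review summary's note that `smollm3` was removed from `ONNX_SUPPORTED_ARCHITECTURES`, is a simple membership check. The sketch below is illustrative only: the set contents, the `untested_export_warning` helper, and the control flow are assumptions, not the actual optimum-intel code.

```python
# Message text copied from the warning quoted in this thread.
WARNING_TEMPLATE = (
    "The OpenVINO export of {model_type} models is not officially supported "
    "by optimum-intel, export at your own risks."
)


def untested_export_warning(model_type, untested_architectures):
    """Return the warning text for an untested architecture, else None."""
    if model_type in untested_architectures:
        return WARNING_TEMPLATE.format(model_type=model_type)
    return None


# Hypothetical contents of the "supported by ONNX but untested" set.
before_pr = {"smollm3", "hypothetical_arch"}
after_pr = before_pr - {"smollm3"}

if __name__ == "__main__":
    print(untested_export_warning("smollm3", before_pr))  # warning text
    print(untested_export_warning("smollm3", after_pr))   # None
```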

AshutoshSinghIntel · 2026-05-07T16:10:34Z

> Please add quantization tests.

Added, kindly check.

popovaan · 2026-05-12T08:20:02Z

@rkazants @echarlaix @regisss please review this PR.

regisss

LGTM

AshutoshSinghIntel · 2026-05-26T10:20:31Z

Hi @regisss and @rkazants, could you please re-review? I updated the code to handle int8 count expectations that differ by task within a single model.

For example, for SmolLM3-3B:
text-generation-with-past: 30
feature-extraction: 30
text-classification: 32
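
A task-keyed expectation like this can be sketched as a lookup that accepts either a single count or a per-task mapping. The table layout and the `expected_int8_count` helper are hypothetical and only illustrate the idea. The `smollm3` numbers are the ones listed above, and `other_arch` is a made-up entry.

```python
# Hypothetical expectation table: a plain int applies to every task,
# while a dict gives a separate count per task.
EXPECTED_INT8 = {
    "smollm3": {
        "text-generation-with-past": 30,
        "feature-extraction": 30,
        "text-classification": 32,
    },
    "other_arch": 12,  # made-up architecture with a task-independent count
}


def expected_int8_count(model_type, task):
    """Look up the expected number of int8 nodes for a model type and task."""
    entry = EXPECTED_INT8[model_type]
    if isinstance(entry, dict):
        # A KeyError here means no expectation was recorded for this task.
        return entry[task]
    return entry


if __name__ == "__main__":
    print(expected_int8_count("smollm3", "text-classification"))
    print(expected_int8_count("other_arch", "feature-extraction"))
```

Keeping the plain-integer form means existing entries would not need to change when only some architectures have task-dependent counts.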

AshutoshSinghIntel added 2 commits May 4, 2026 15:11

support added for HF SmolLM3-3B

2c8c7a6

added smollm3 model to docs

f50a848

AshutoshSinghIntel marked this pull request as ready for review May 4, 2026 14:50

rkazants requested review from Copilot, echarlaix and popovaan May 5, 2026 05:03

Copilot started reviewing on behalf of rkazants May 5, 2026 05:03

rkazants reviewed May 5, 2026

rkazants requested a review from regisss May 5, 2026 05:04

Copilot AI reviewed May 5, 2026

popovaan reviewed May 7, 2026

Comment thread optimum/exporters/openvino/utils.py

Comment thread optimum/exporters/openvino/model_configs.py

added tests for quantization, feature-extraction and text-classification

fdb3e0b

popovaan approved these changes May 12, 2026

rkazants approved these changes May 12, 2026

regisss approved these changes May 12, 2026

AshutoshSinghIntel added 3 commits May 15, 2026 14:01

Merge branch 'main' into support-SmolLM3-3B

db672e0

added missing smollm3 entry in EXPECTED_NUM_SDPA

77da88f

handle task based int8 count

1e9fd30

AshutoshSinghIntel added 3 commits May 20, 2026 14:35

missing import entry

242a39b

Merge branch 'main' into support-SmolLM3-3B

9d64c8e

black tool file reformat

5a6e33e

AshutoshSinghIntel requested review from popovaan, regisss and rkazants May 20, 2026 14:02

popovaan approved these changes May 26, 2026


6 participants
