Add test for Phi-3-vision-128k-instruct #1850

Open · wants to merge 17 commits into main

Conversation

@kshitij12345 (Collaborator) commented Mar 7, 2025:

Adds a test for Phi-3-vision-128k-instruct

The test takes around 4GB of device memory while running.

The relaxed tolerances held up over 50 repeats of the test: pytest thunder/tests/test_networks.py -k phi3_vi --count 50

@kshitij12345 kshitij12345 marked this pull request as ready for review March 7, 2025 13:07
@IvanYashchuk (Collaborator) left a comment:

If possible, the test should be skipped when there isn't enough memory on the GPU to run it. Another alternative is to modify the config to improve test duration and memory consumption.

from thunder.dynamo import thunderfx

cfg = Phi3Config(**phi3_vision_cfg)
cfg.num_hidden_layers = 2
Collaborator:

What is the memory requirement for 1 layer?

cfg.num_hidden_layers = 2

with torch.device("cuda"):
model = AutoModelForCausalLM.from_config(cfg, trust_remote_code=False, torch_dtype=torch.bfloat16)
Collaborator:

Changing vocab_size from 32064 to a smaller number should decrease the memory requirements of this test.
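For a rough sense of scale, here is a sketch (not part of the PR) of how much the vocabulary costs and what a smaller config might look like. In bf16 the token embedding and the untied lm_head each take vocab_size * hidden_size * 2 bytes, i.e. 32064 * 3072 * 2 ≈ 188 MiB apiece. The reduced values below are illustrative only; phi3_vision_cfg is the dict defined in the test file.

import torch
from transformers import AutoModelForCausalLM
from transformers.models.phi3 import Phi3Config

cfg = Phi3Config(**phi3_vision_cfg)  # the config dict shown further down in this diff
cfg.num_hidden_layers = 2
cfg.vocab_size = 2048  # illustrative value; input_ids used by the test must stay below this

with torch.device("cuda"):
    model = AutoModelForCausalLM.from_config(cfg, trust_remote_code=False, torch_dtype=torch.bfloat16)

# Parameter memory after the reduction (activations and gradients come on top of this).
param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"parameter memory: {param_bytes / 2**30:.2f} GiB")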


@requiresCUDA
def test_hf_phi3_vision():
# This test takes around 4045406208 bytes (~4GB) of memory.
Collaborator:

Is there a decorator to skip the test based on the memory requirements? There are NVIDIA internal CI jobs on hardware with a limited amount of memory that could potentially fail.

Collaborator:

That would be nice +1 here
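For reference, a minimal sketch of what such a skip decorator could look like, assuming torch.cuda.mem_get_info for the free-memory query (the requiresDeviceMemory helper that eventually appears in this diff may be implemented differently):

import functools

import pytest
import torch


def requires_device_memory(required_memory_bytes: int):
    # Hypothetical decorator: skip the test when the current CUDA device has
    # less free memory than the test is known to need.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            free_bytes, _total_bytes = torch.cuda.mem_get_info()
            if free_bytes < required_memory_bytes:
                pytest.skip(
                    f"test needs {required_memory_bytes} bytes of free device memory, "
                    f"only {free_bytes} are available"
                )
            return fn(*args, **kwargs)

        return wrapper

    return decorator

It would be stacked on top of the CUDA requirement, e.g. @requires_device_memory(required_memory_bytes=int(3.6 * 1024**3)) above the test function.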

@@ -558,6 +558,189 @@ def test_hf_llama():
assert len(get_fusion_symbols(thunder.last_traces(jm)[-1])) == 6


# We need to copy config here as the AutoModel doesn't work with `trust_remote_code=True`
Collaborator:

What is the error?

Collaborator (Author):

The error is: ValueError: Loading microsoft/Phi-3-vision-128k-instruct requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error.

Script

import torch
from transformers import AutoModelForCausalLM, AutoConfig
from transformers.models.phi3 import Phi3Config

model_id = "microsoft/Phi-3-vision-128k-instruct"

# Initialize the pre-trained model
cfg = AutoConfig.from_pretrained(model_id, trust_remote_code=False)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto")
cfg.num_hidden_layers = 2

from thunder.dynamo import thunderfx
from thunder.dynamo.report import get_thunder_fxgraph_reports, fx_report, ThunderCompileSpecification

with torch.device("cuda"):
    model = AutoModelForCausalLM.from_config(cfg, trust_remote_code=False, torch_dtype=torch.bfloat16)
    print(model)

"original_max_position_embeddings": 4096,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"long_factor": [
Collaborator:

The length of this array of values is given by hidden_size / (2 * num_attention_heads), so reducing the size of the model will also reduce the line count here. Also, I think these are just numbers that you can set programmatically, since we don't care about correctness in this test.
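A sketch of generating these factors programmatically (the values are arbitrary placeholders, since the test doesn't check numerical correctness; only the list length has to match the head dimension):

# Each factor list needs one entry per rotary dimension pair:
# (hidden_size / num_attention_heads) / 2 = (3072 / 32) / 2 = 48 entries.
hidden_size = 3072
num_attention_heads = 32
num_rope_factors = hidden_size // num_attention_heads // 2  # 48

rope_scaling = {
    "long_factor": [1.0] * num_rope_factors,   # arbitrary placeholder values
    "short_factor": [1.0] * num_rope_factors,
    "type": "su",
}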

# for `Phi-3-vision-128k-instruct`.
phi3_vision_cfg = {
"_name_or_path": "Phi-3-vision-128k-instruct",
"architectures": ["Phi3VForCausalLM"],
Collaborator:

Interesting that this is not a model included in the transformers library, so one would think that with trust_remote_code=False this model wouldn't be loaded at all, but that doesn't seem to be the case 🤔

64.81001281738281,
64.81001281738281,
],
"short_factor": [
Collaborator:

Same comment as for long_factor.

"transformers_version": "4.38.1",
"use_cache": True,
"vocab_size": 32064,
"_attn_implementation": "sdpa",
@riccardofelluga (Collaborator) commented Mar 7, 2025:

What happens if _attn_implementation is not set?


loss_grad = torch.randn_like(expected.loss)
actual_grads = torch.autograd.grad(actual.loss, model.parameters(), grad_outputs=loss_grad)
expected_grads = torch.autograd.grad(expected.loss, model.parameters(), grad_outputs=loss_grad)
torch.testing.assert_close(actual_grads, expected_grads, rtol=1e-2, atol=1e-2)
Collaborator:

Asking for information now that I see the custom tolerances: what order of magnitude is the mismatch you are getting?
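Not part of the PR, but a quick way to answer this when picking tolerances is a small helper like the one below (hypothetical; actual_grads and expected_grads are the tuples computed in the test above):

def max_mismatch(actual_grads, expected_grads):
    # Worst absolute and relative differences across all gradients.
    worst_abs = worst_rel = 0.0
    for actual, expected in zip(actual_grads, expected_grads):
        diff = (actual - expected).abs()
        worst_abs = max(worst_abs, diff.max().item())
        worst_rel = max(worst_rel, (diff / expected.abs().clamp_min(1e-8)).max().item())
    return worst_abs, worst_rel

print(max_mismatch(actual_grads, expected_grads))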

@kshitij12345 (Collaborator, Author):

> Interesting that this is not a model included in the transformers library, so one would think that with trust_remote_code=False this model wouldn't be loaded at all, but that doesn't seem to be the case 🤔

Good catch. On checking the type of the model, I see it is Phi3ForCausalLM even though the config specifies the modeling_phi3_v.Phi3VForCausalLM architecture.

With trust_remote_code enabled, it seems to hit an error with thunderfx; I will investigate and file a relevant issue. Turning the PR back to draft until it is ready again.

Thanks @riccardofelluga @IvanYashchuk

Script to check model

import torch
from transformers import AutoModelForCausalLM
from transformers.models.phi3 import Phi3Config

model_id = "microsoft/Phi-3-vision-128k-instruct"

phi3_vision_cfg = {
  "_name_or_path": "Phi-3-vision-128k-instruct",
  "architectures": [
    "Phi3VForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "configuration_phi3_v.Phi3VConfig",
    "AutoModelForCausalLM": "modeling_phi3_v.Phi3VForCausalLM"
  },
  "bos_token_id": 1,
  "embd_layer": {
    "embedding_cls": "image",
    "hd_transform_order": "sub_glb",
    "projection_cls": "mlp",
    "use_hd_transform": True,
    "with_learnable_separator": True
  },
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 3072,
  "img_processor": {
    "image_dim_out": 1024,
    "model_name": "openai/clip-vit-large-patch14-336",
    "name": "clip_vision_model",
    "num_img_tokens": 144
  },
  "initializer_range": 0.02,
  "intermediate_size": 8192,
  "max_position_embeddings": 131072,
  "model_type": "phi3_v",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "original_max_position_embeddings": 4096,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "long_factor": [
      1.0299999713897705,
      1.0499999523162842,
      1.0499999523162842,
      1.0799999237060547,
      1.2299998998641968,
      1.2299998998641968,
      1.2999999523162842,
      1.4499999284744263,
      1.5999999046325684,
      1.6499998569488525,
      1.8999998569488525,
      2.859999895095825,
      3.68999981880188,
      5.419999599456787,
      5.489999771118164,
      5.489999771118164,
      9.09000015258789,
      11.579999923706055,
      15.65999984741211,
      15.769999504089355,
      15.789999961853027,
      18.360000610351562,
      21.989999771118164,
      23.079999923706055,
      30.009998321533203,
      32.35000228881836,
      32.590003967285156,
      35.56000518798828,
      39.95000457763672,
      53.840003967285156,
      56.20000457763672,
      57.95000457763672,
      59.29000473022461,
      59.77000427246094,
      59.920005798339844,
      61.190006256103516,
      61.96000671386719,
      62.50000762939453,
      63.3700065612793,
      63.48000717163086,
      63.48000717163086,
      63.66000747680664,
      63.850006103515625,
      64.08000946044922,
      64.760009765625,
      64.80001068115234,
      64.81001281738281,
      64.81001281738281
    ],
    "short_factor": [
      1.05,
      1.05,
      1.05,
      1.1,
      1.1,
      1.1,
      1.2500000000000002,
      1.2500000000000002,
      1.4000000000000004,
      1.4500000000000004,
      1.5500000000000005,
      1.8500000000000008,
      1.9000000000000008,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.000000000000001,
      2.1000000000000005,
      2.1000000000000005,
      2.2,
      2.3499999999999996,
      2.3499999999999996,
      2.3499999999999996,
      2.3499999999999996,
      2.3999999999999995,
      2.3999999999999995,
      2.6499999999999986,
      2.6999999999999984,
      2.8999999999999977,
      2.9499999999999975,
      3.049999999999997,
      3.049999999999997,
      3.049999999999997
    ],
    "type": "su"
  },
  "rope_theta": 10000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": False,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.1",
  "use_cache": True,
  "vocab_size": 32064,
  "_attn_implementation": "sdpa"
}

# Initialize the pre-trained model
cfg = Phi3Config(**phi3_vision_cfg)

# cfg = AutoConfig.from_pretrained(model_id, trust_remote_code=False)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto")
cfg.num_hidden_layers = 2

from thunder.dynamo import thunderfx
from thunder.dynamo.report import get_thunder_fxgraph_reports, fx_report, ThunderCompileSpecification

with torch.device("cuda"):
    model = AutoModelForCausalLM.from_config(cfg, trust_remote_code=False, torch_dtype=torch.bfloat16)
    print(model)

# eager - 3596534784
@requiresCUDA
@requiresDeviceMemory(required_memory_bytes=int(3.6 * 1024 * 1024 * 1024))
@pytest.mark.parametrize("attn_implementation", [None, "eager"])
Collaborator (Author):

We don't add sdpa here because:

ValueError: Phi3VForCausalLM does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet. Please request the support for this architecture: https://github.com/huggingface/transformers/issues/28005. If you believe this error is a bug, please open an issue in Transformers GitHub repository and load your model with the argument `attn_implementation="eager"` meanwhile. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="eager")`
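For context, one way the parametrized value might be threaded into model construction (a sketch only; it assumes from_config forwards attn_implementation, and cfg / attn_implementation are the names from the test):

# Hypothetical wiring of the parametrized attn_implementation.
kwargs = {"trust_remote_code": False, "torch_dtype": torch.bfloat16}
if attn_implementation is not None:
    kwargs["attn_implementation"] = attn_implementation  # "eager" works; "sdpa" raises the ValueError above

with torch.device("cuda"):
    model = AutoModelForCausalLM.from_config(cfg, **kwargs)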

@kshitij12345 kshitij12345 marked this pull request as ready for review June 2, 2025 10:06
@t-vi (Collaborator) commented Jun 2, 2025:

> The test takes around 4GB of device memory while running.

Would there be a chance to cut the model down even more, similar to what we do with the other models?

I'm quite wary of this, and we have been seeing OOMs lately.
We used to have all tests take well below 1GB.

@kshitij12345 (Collaborator, Author):

> Would there be a chance to cut the model down even more, similar to what we do with the other models?
>
> I'm quite wary of this, and we have been seeing OOMs lately.
> We used to have all tests take well below 1GB.

Updated.
