
torch.compile error when running the HuggingFace torchao example #1705

Closed
@vkuzo

Description


When I run the code snippet from https://huggingface.co/docs/transformers/main/en/quantization/torchao, I see a torch.compile error.

torch version: 2.6.0
torchao version: 0.8.0
transformers version: 4.48.3

Note that I see the same error when running this example even if I remove the `quantization_config` argument to `AutoModelForCausalLM.from_pretrained`.

Repro script: https://gist.github.com/vkuzo/4dc367370246dc8e47b77f8f1bd92901
Error: https://gist.github.com/vkuzo/814a452a024c27a2f5b2e3fb7276dec3
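
For reference, the docs example looks roughly like the following. This is a sketch reconstructed from the linked docs page, not copied verbatim from the gist; the model name and generation arguments are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

model_name = "meta-llama/Meta-Llama-3-8B"  # assumed; the gist may use a different model

# int4 weight-only quantization via torchao, as shown in the HF docs
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",  # accelerate places the model; see the traceback below
    quantization_config=quantization_config,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("What are we having for dinner?", return_tensors="pt").to(model.device)

# cache_implementation="static" makes generate() run the forward under
# torch.compile, which is where the error below is raised
output = model.generate(**inputs, max_new_tokens=10, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```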

  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/_dynamo/variables/builder.py", line 2408, in handle_traced_output
    wrap_fx_proxy_cls(                                                                                          
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/_dynamo/variables/builder.py", line 2230, in wrap_fx_proxy_cls
    return handle_traced_output(                        
           ^^^^^^^^^^^^^^^^^^^^^                                                                                
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/_dynamo/variables/builder.py", line 2517, in handle_traced_output
    unimplemented(                                                                                              
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/_dynamo/exc.py", line 317, in unimplemented
    raise Unsupported(msg, case_name=case_name)                                                                                                                                                                                  
torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor device call_function <built-in function getitem>
                                                                                                                
from user code:                                                                                                 
   File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)                                                               
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 834, in forward            
    outputs = self.model(                                                                                                                                                                                                        
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 592, in forward                                                                                    
    layer_outputs = decoder_layer(                                                                              
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)                
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)                                         
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/hooks.py", line 364, in pre_forward                                                                                                        
    return send_to_device(args, self.execution_device), send_to_device(                                                                                                                                                          
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/utils/operations.py", line 183, in send_to_device                                                                                          
    {                                                                                                                                                                                                                            
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/utils/operations.py", line 184, in <dictcomp>
    k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)       
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/accelerate/utils/operations.py", line 155, in send_to_device
    return tensor.to(device, non_blocking=non_blocking)                                                         
  File "/home/vasiliy/.conda/envs/pytorch_2_6/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1302, in to
    device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(                                                                                                                                                     
                                                                                                                
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information      
                                                                                                                
                                                                                                                
You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
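
For what it's worth, the user-code frames above show dynamo failing inside accelerate's device-alignment hook (`module._hf_hook.pre_forward` -> `send_to_device` -> `tensor.to(device)`), not inside torchao, which is consistent with the error persisting without `quantization_config`. One possible workaround is to load the model onto a single device without a `device_map`, so accelerate never installs those hooks. A minimal sketch (untested; assumes the model fits on one GPU, and the model name is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

# Loading without device_map and moving the model manually means accelerate
# never wraps forward() with the device-alignment hook seen in the traceback.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # placeholder; the gist's model may differ
    torch_dtype=torch.bfloat16,
).to("cuda")

model.forward = torch.compile(model.forward)
```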
