Invalid output and errors using model = ipex.optimize(model): split master weight unsupported, Conv BatchNorm folding failed, Linear BatchNorm folding failed

Hi, I'm trying to run inference with a pretrained OFA (OFA-huge) model according to these instructions:

https://github.com/OFA-Sys/OFA/blob/feature/add_transformers/transformers.md

This runs fine on both CPU and CUDA, but on XPU the output is gibberish. I also get several warnings, which go away when `model = ipex.optimize(model)` is commented out. However, even without that call, when essentially the only change from the CPU/CUDA version is the `.to('xpu')` part, the model still outputs gibberish.

Warnings from model = ipex.optimize(model):
```
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/torchvision/transforms/transforms.py:329: UserWarning: Argument 'interpolation' of type int is deprecated since 0.13 and will be removed in 0.15. Please use InterpolationMode enum.
  warnings.warn(
./OFA-huge
<super: <class 'OFATokenizer'>, <OFATokenizer object>>
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/workspace/pytorch/aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
  warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so                     make inplace to be true
  warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout                     is automatically chosen to use
  warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:486: UserWarning: Conv BatchNorm folding failed during the optimize process.
  warnings.warn("Conv BatchNorm folding failed during the optimize process.")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:491: UserWarning: Linear BatchNorm folding failed during the optimize process.
  warnings.warn("Linear BatchNorm folding failed during the optimize process.")
```
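To separate the IPEX warnings from the unrelated torchvision and `torch.meshgrid` deprecation notices in the log above, the warnings could be recorded and filtered by the file that raised them. The following is a minimal sketch using only the standard `warnings` module. `collect_warnings` and `from_package` are hypothetical helper names, and `emit_sample_warnings` is only a stand-in for the real `ipex.optimize(model)` call so that the snippet runs without an XPU.

```python
import warnings


def collect_warnings(fn, *args, **kwargs):
    """Run fn and return (result, list of recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = fn(*args, **kwargs)
    return result, list(caught)


def from_package(caught, package_name):
    """Keep only warnings whose source file path contains package_name."""
    return [w for w in caught if package_name in w.filename]


def emit_sample_warnings():
    # Stand-in for ipex.optimize(model): replays two warnings from the log
    # above, attributed to the files and line numbers shown there.
    warnings.warn_explicit(
        "Conv BatchNorm folding failed during the optimize process.",
        UserWarning,
        "site-packages/intel_extension_for_pytorch/frontend.py",
        486,
    )
    warnings.warn_explicit(
        "torch.meshgrid: in an upcoming release, it will be required to "
        "pass the indexing argument.",
        UserWarning,
        "site-packages/torch/functional.py",
        504,
    )
    return "optimized"


result, caught = collect_warnings(emit_sample_warnings)
ipex_only = from_package(caught, "intel_extension_for_pytorch")
for w in ipex_only:
    print(f"{w.filename}:{w.lineno}: {w.message}")
```

In the real script, `collect_warnings(ipex.optimize, model)` would take the place of the stand-in call.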

Output with XPU (gibberish):
`[' this is the ch ch chaval all the is is the word for the band that is']`

Output with CPU/CUDA (correct):
`[' a black and white photo of a wolf walking through the woods at night.']`

I'm running Ubuntu 22.04 with intel_extension_for_pytorch 1.13.10+xpu. The code is below:
```
import warnings
from PIL import Image
from torchvision import transforms
from transformers import OFATokenizer, OFAModel
import intel_extension_for_pytorch as ipex

chkpt_dir = "./OFA-huge"
path_to_image = "image.jpg"
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
resolution = 256
patch_resize_transform = transforms.Compose([
        lambda image: image.convert("RGB"),
        transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])


tokenizer = OFATokenizer.from_pretrained(chkpt_dir)

txt = " what does the image describe?"
inputs = tokenizer([txt], return_tensors="pt").input_ids
img = Image.open(path_to_image)
patch_img = patch_resize_transform(img).unsqueeze(0)

model = OFAModel.from_pretrained(chkpt_dir, use_cache=False)
model = model.to("xpu")
patch_img = patch_img.to("xpu")
inputs = inputs.to("xpu")
model = ipex.optimize(model)

gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)

print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```
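One way to narrow this down might be to compare the generated token IDs from a CPU run against those from the XPU run and find where they first diverge. The sketch below assumes the IDs have already been converted to plain Python lists, for example with something like `gen[0].tolist()` on each device. `first_divergence` is a hypothetical helper and the ID lists are made up for illustration.

```python
def first_divergence(reference_ids, candidate_ids):
    """Return the index of the first differing token, or None if identical."""
    for i, (ref, cand) in enumerate(zip(reference_ids, candidate_ids)):
        if ref != cand:
            return i
    # One sequence is a strict prefix of the other: they diverge where
    # the shorter one ends.
    if len(reference_ids) != len(candidate_ids):
        return min(len(reference_ids), len(candidate_ids))
    return None


# Made-up token IDs standing in for the CPU and XPU generations.
cpu_ids = [0, 10, 11, 12, 13, 2]
xpu_ids = [0, 10, 99, 99, 13, 2]

print(first_divergence(cpu_ids, xpu_ids))  # → 2
```

A divergence at the very first generated token would suggest looking at the forward pass itself rather than at the generation settings, though that is only a heuristic.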

Image:
![image](https://user-images.githubusercontent.com/59679879/219906312-a2a1be8a-a478-4e6e-9632-af05242b3cfa.jpg)

Thanks!




