Description
Hi, I'm trying to run inference with a pretrained OFA (OFA-huge) model following these instructions:
https://github.com/OFA-Sys/OFA/blob/feature/add_transformers/transformers.md
This runs fine on both CPU and CUDA, but on XPU the output is gibberish. I also get several warnings, which go away when model = ipex.optimize(model)
is commented out. Even then, with the .to('xpu') calls being essentially the only difference from the CPU/CUDA version,
the model still outputs gibberish.
Warnings from model = ipex.optimize(model):
warnings.warn(
./OFA-huge
<super: <class 'OFATokenizer'>, <OFATokenizer object>>
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/workspace/pytorch/aten/src/ATen/native/TensorShape.cpp:3190.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so make inplace to be true
warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout is automatically chosen to use
warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:486: UserWarning: Conv BatchNorm folding failed during the optimize process.
warnings.warn("Conv BatchNorm folding failed during the optimize process.")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:491: UserWarning: Linear BatchNorm folding failed during the optimize process.
warnings.warn("Linear BatchNorm folding failed during the optimize process.")
[' this is the ch ch chaval all the is is the word for the band that is']
^ gibberish output
With CPU/CUDA:
[' a black and white photo of a wolf walking through the woods at night.']
^ correct output
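To make it easier to tell which of the warnings above come from ipex.optimize (frontend.py) and which from plain PyTorch (functional.py), here is a minimal sketch that records warnings around a single call and groups them by the file that raised them. The helper name run_and_group_warnings is my own and not part of ipex or torch; it only uses the standard warnings module.

```python
import warnings
from collections import defaultdict


def run_and_group_warnings(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, {source_file: [messages]}).

    Hypothetical helper for triage only; it does not change what fn does.
    """
    grouped = defaultdict(list)
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones that would normally be
        # shown only once per call site.
        warnings.simplefilter("always")
        result = fn(*args, **kwargs)
    for w in caught:
        grouped[w.filename].append(str(w.message))
    return result, dict(grouped)
```

In the script below this would be used as model, by_file = run_and_group_warnings(ipex.optimize, model), and by_file could then be printed to see which warnings belong to the optimize step alone.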
I'm running Ubuntu 22.04 with intel_extension_for_pytorch 1.13.10+xpu. The code is below:
import warnings
from PIL import Image
from torchvision import transforms
from transformers import OFATokenizer, OFAModel
import intel_extension_for_pytorch as ipex
chkpt_dir = "./OFA-huge"
path_to_image = "image.jpg"
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
resolution = 256
patch_resize_transform = transforms.Compose([
    lambda image: image.convert("RGB"),
    transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)
])
tokenizer = OFATokenizer.from_pretrained(chkpt_dir)
txt = " what does the image describe?"
inputs = tokenizer([txt], return_tensors="pt").input_ids
img = Image.open(path_to_image)
patch_img = patch_resize_transform(img).unsqueeze(0)
model = OFAModel.from_pretrained(chkpt_dir, use_cache=False)
model = model.to("xpu")
patch_img = patch_img.to("xpu")
inputs = inputs.to("xpu")
model = ipex.optimize(model)
gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
print(tokenizer.batch_decode(gen, skip_special_tokens=True))
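In case it helps narrow this down, here is a rough sketch of how the point where the XPU results drift away from the CPU results could be located: record the output of every leaf module during one forward pass on each device, then report the first module whose outputs differ. The names leaf_outputs and first_divergence and the atol default are my own assumptions, not OFA or ipex APIs, and I have only reasoned about this on small CPU models.

```python
import torch
from torch import nn


def leaf_outputs(model, *args, **kwargs):
    """Run one forward pass and return {leaf_module_name: first tensor output},
    in execution order, copied to CPU as float32."""
    device = next(model.parameters()).device
    move = lambda v: v.to(device) if isinstance(v, torch.Tensor) else v
    outputs, hooks = {}, []
    for name, module in model.named_modules():
        if next(module.children(), None) is not None:
            continue  # skip containers, hook leaf modules only

        def hook(_module, _inputs, out, name=name):
            # Keep only the first tensor output seen for each module.
            if isinstance(out, torch.Tensor) and name not in outputs:
                outputs[name] = out.detach().float().cpu()

        hooks.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            model(*[move(a) for a in args],
                  **{k: move(v) for k, v in kwargs.items()})
    finally:
        for h in hooks:
            h.remove()
    return outputs


def first_divergence(reference, candidate, atol=1e-3):
    """Return (module_name, max_abs_diff) for the first recorded module whose
    outputs differ by more than atol (or contain NaN), else None."""
    for name, ref_out in reference.items():
        other = candidate.get(name)
        if other is None or other.shape != ref_out.shape or ref_out.numel() == 0:
            continue
        diff = (ref_out - other).abs().max().item()
        if not diff <= atol:  # written this way so NaN also counts as a mismatch
            return name, diff
    return None
```

The idea would be to load the model twice (one copy left on CPU, one moved to "xpu"), call leaf_outputs on each with whatever positional and keyword tensors the model's forward expects, and pass the two results to first_divergence. Holding two copies of OFA-huge needs a lot of memory, so a smaller checkpoint may be more practical for this check.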
Thanks!