Description
🐛 Describe the bug
Need help understanding this failure. When loading a .pte into llama_runner via pybind, I get the following error:
[program.cpp:136] InternalConsistency verification requested but not available
forward: MethodMeta(name='forward', num_inputs=2, input_tensor_meta=['TensorInfo(sizes=[1, 1], dtype=Long, is_memory_planned=True, nbytes=8)', 'TensorInfo(sizes=[1], dtype=Long, is_memory_planned=True, nbytes=8)'], num_outputs=1, output_tensor_meta=['TensorInfo(sizes=[1, 1, 200064], dtype=Float, is_memory_planned=True, nbytes=800256)'])
input_ids size: torch.Size([1, 1])
input_ids: tensor([[32]])
cache_position size: torch.Size([1])
cache_position: tensor([0])
[tensor_util.h:735] Tensors do not match: numel=(2048, 1), dim=(1, 2)
[tensor_util.h:744] size(0): (2048, 1)
[op_le.cpp:29] Check failed (tensors_have_same_shape(a, b)):
[method.cpp:1311] KernelCall failed at instruction 0:16 in operator aten::le.Tensor_out: 0x12
[method.cpp:1317] arg 0 with type id 1
[method.cpp:1317] arg 1 with type id 1
[method.cpp:1317] arg 2 with type id 1
[method.cpp:1317] arg 3 with type id 1
...
RuntimeError: method->execute() failed with error 0x12
The stack trace above contains the debug log. From my check, the input shapes match what forward()
expects, so I'm not sure why I'm running into this error.
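For reference, the log reports `numel=(2048, 1), dim=(1, 2)`, which suggests the two operands are a 1-D tensor of 2048 elements and a 2-D `[1, 1]` tensor. A minimal eager-mode sketch (shapes inferred from the log, not from the exported graph) shows that plain PyTorch would simply broadcast this comparison, whereas the portable `aten::le.Tensor_out` kernel fails its `tensors_have_same_shape` check:

```python
import torch

# Hypothetical operands reconstructed from the error log:
# a 1-D tensor with 2048 elements (e.g. a cache-length arange)
# and a [1, 1] tensor (same shape as input_ids).
a = torch.arange(2048)
b = torch.zeros(1, 1)

# Eager PyTorch broadcasts [2048] against [1, 1] without complaint.
result = torch.le(a, b)
print(result.shape)  # torch.Size([1, 2048])
```

If the exported graph really contains an `le` between these shapes (e.g. from a causal-mask comparison), the mismatch would only surface at runtime in the portable kernel, not during export.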
I was hitting this same error with executorch==0.5.0 in optimum-executorch. See huggingface/optimum-executorch#14 for Qwen2.5; I verified the fix on the 0.6.0a dev branch several weeks ago. Is this a new regression?
cc: @larryliu0820 @manuelcandales. Is this a bug in the kernel? Do you have any pointers for debugging this issue?
Versions
Package Version Editable project location
accelerate 1.4.0
accelerator 2024.9.13
aiohappyeyeballs 2.4.6
aiohttp 3.11.13
aiosignal 1.3.2
attrs 25.1.0
audioread 3.0.1
black 24.4.2
bottle 0.12.25
certifi 2025.1.31
cffi 1.17.1
charset-normalizer 3.4.1
clang-format 18.1.3
click 8.1.8
cmake 3.31.4
cmakelint 1.4.1
datasets 3.3.2
decorator 5.2.1
dill 0.3.8
execnet 2.1.1
executorch 0.6.0a0+01a22b6
expecttest 0.3.0
filelock 3.17.0
flake8 6.1.0
flake8-breakpoint 1.1.0
flake8-bugbear 24.4.26
flake8-comprehensions 3.14.0
flake8-plugin-utils 1.3.3
flake8-pyi 23.5.0
flatbuffers 25.2.10
frozenlist 1.5.0
fsspec 2024.12.0
huggingface-hub 0.29.1
hypothesis 6.126.0
idna 3.10
iniconfig 2.0.0
Jinja2 3.1.5
joblib 1.4.2
lazy_loader 0.4
libcst 1.1.0
librosa 0.11.0
lintrunner 0.12.7
lintrunner-adapters 0.12.4
llvmlite 0.44.0
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mccabe 0.7.0
mdurl 0.1.2
moreorless 0.4.0
mpmath 1.3.0
msgpack 1.1.0
multidict 6.1.0
multiprocess 0.70.16
mypy 1.14.1
mypy-extensions 1.0.0
networkx 3.4.2
numba 0.61.0
numpy 2.2.4
optimum 1.24.0
optimum-executorch 0.0.0.dev0
packaging 24.2
pandas 2.2.3
parameterized 0.9.0
pathspec 0.12.1
pillow 11.1.0
pip 25.0
platformdirs 4.3.6
pluggy 1.5.0
pooch 1.8.2
propcache 0.3.0
psutil 7.0.0
pyarrow 19.0.1
pycodestyle 2.11.1
pycparser 2.22
pyflakes 3.1.0
Pygments 2.19.1
pytest 8.3.4
pytest-rerunfailures 15.0
pytest-xdist 3.6.1
python-dateutil 2.9.0.post0
pytz 2025.1
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruamel.yaml 0.18.10
ruamel.yaml.clib 0.2.12
ruff 0.9.9
safetensors 0.5.2
scikit-learn 1.6.1
scipy 1.15.2
sentencepiece 0.2.0
setproctitle 1.3.5
setuptools 75.8.0
six 1.17.0
sortedcontainers 2.4.0
soundfile 0.13.1
soxr 0.5.0.post1
stdlibs 2024.12.3
sympy 1.13.3
tabulate 0.9.0
threadpoolctl 3.6.0
timeout-decorator 0.5.0
timm 1.0.7
tokenizers 0.21.0
toml 0.10.2
tomli 2.2.1
tomlkit 0.13.2
torch 2.7.0.dev20250311
torchao 0.10.0+git7d879462
torchaudio 2.6.0.dev20250311
TorchFix 0.6.0
torchsr 1.0.4
torchvision 0.22.0.dev20250311
tqdm 4.67.1
trailrunner 1.4.0
transformers 4.50.0.dev0 /Users/guangyang/transformers
typing_extensions 4.12.2
typing-inspect 0.9.0
tzdata 2025.1
ufmt 2.8.0
urllib3 2.3.0
usort 1.0.8.post1
waitress 3.0.2
wheel 0.45.1
xxhash 3.5.0
yarl 1.18.3
zstd 1.5.6.1