
KernelCall failed at instruction 0:16 in operator aten::le.Tensor_out: 0x12 #9433

@guangy10

Description

🐛 Describe the bug

I need help understanding this failure. When loading a .pte into llama_runner via pybind, I get the following error:

[program.cpp:136] InternalConsistency verification requested but not available

forward: MethodMeta(name='forward', num_inputs=2, input_tensor_meta=['TensorInfo(sizes=[1, 1], dtype=Long, is_memory_planned=True, nbytes=8)', 'TensorInfo(sizes=[1], dtype=Long, is_memory_planned=True, nbytes=8)'], num_outputs=1, output_tensor_meta=['TensorInfo(sizes=[1, 1, 200064], dtype=Float, is_memory_planned=True, nbytes=800256)'])

input_ids size: torch.Size([1, 1])
input_ids: tensor([[32]])

cache_position size: torch.Size([1])
cache_position: tensor([0])

[tensor_util.h:735] Tensors do not match: numel=(2048,  1), dim=(1, 2)
[tensor_util.h:744]     size(0): (2048, 1)
[op_le.cpp:29] Check failed (tensors_have_same_shape(a, b)):
[method.cpp:1311] KernelCall failed at instruction 0:16 in operator aten::le.Tensor_out: 0x12
[method.cpp:1317] arg 0 with type id 1
[method.cpp:1317] arg 1 with type id 1
[method.cpp:1317] arg 2 with type id 1
[method.cpp:1317] arg 3 with type id 1
...
RuntimeError: method->execute() failed with error 0x12

The stack trace above contains the debug log. As far as I can tell, the input shapes match what forward() expects, so I'm not sure why I run into this error.
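As a sanity check, here is a minimal sketch of the comparison I did by hand (the expected sizes are copied from the MethodMeta output above; the check itself is illustrative Python, not ExecuTorch API):

```python
# Compare the shapes passed to forward() against the sizes declared
# in MethodMeta's input_tensor_meta (both taken from the log above).
expected_sizes = [(1, 1), (1,)]  # TensorInfo(sizes=[1, 1]), TensorInfo(sizes=[1])
actual_sizes = [(1, 1), (1,)]    # input_ids.shape, cache_position.shape

for i, (expected, actual) in enumerate(zip(expected_sizes, actual_sizes)):
    assert expected == actual, f"input {i}: expected {expected}, got {actual}"
print("all input shapes match MethodMeta")
```

This passes, which is why I believe the mismatch happens somewhere inside the graph rather than at the call boundary.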

I was hitting this same error with executorch==0.5.0 in optimum-executorch (see huggingface/optimum-executorch#14 for Qwen2.5), but I verified the fix on the 0.6.0a dev branch several weeks ago. Is this a new regression?

cc: @larryliu0820 @manuelcandales. Is this a bug in the kernel? Do you have any pointers for debugging this issue?
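For what it's worth, my reading of the tensor_util.h lines above (a guess from the log alone, not from the kernel source): the failing check compares a 1-D tensor with 2048 elements against a 2-D tensor with a single element, and both operands live inside the graph rather than among the user-supplied inputs. A toy Python stand-in for that check:

```python
def tensors_have_same_shape(a_sizes, b_sizes):
    # Toy stand-in for the check reported by op_le.cpp in the log:
    # both operands of aten::le.Tensor_out must have identical shapes.
    return tuple(a_sizes) == tuple(b_sizes)

# Shapes reconstructed from "numel=(2048, 1), dim=(1, 2)" in the log;
# what the 2048-element tensor actually is inside the graph is my guess
# (possibly related to the max cache length).
a_sizes = (2048,)
b_sizes = (1, 1)
print(tensors_have_same_shape(a_sizes, b_sizes))  # -> False, hence error 0x12
```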

Versions

Package Version Editable project location


accelerate 1.4.0
accelerator 2024.9.13
aiohappyeyeballs 2.4.6
aiohttp 3.11.13
aiosignal 1.3.2
attrs 25.1.0
audioread 3.0.1
black 24.4.2
bottle 0.12.25
certifi 2025.1.31
cffi 1.17.1
charset-normalizer 3.4.1
clang-format 18.1.3
click 8.1.8
cmake 3.31.4
cmakelint 1.4.1
datasets 3.3.2
decorator 5.2.1
dill 0.3.8
execnet 2.1.1
executorch 0.6.0a0+01a22b6
expecttest 0.3.0
filelock 3.17.0
flake8 6.1.0
flake8-breakpoint 1.1.0
flake8-bugbear 24.4.26
flake8-comprehensions 3.14.0
flake8-plugin-utils 1.3.3
flake8-pyi 23.5.0
flatbuffers 25.2.10
frozenlist 1.5.0
fsspec 2024.12.0
huggingface-hub 0.29.1
hypothesis 6.126.0
idna 3.10
iniconfig 2.0.0
Jinja2 3.1.5
joblib 1.4.2
lazy_loader 0.4
libcst 1.1.0
librosa 0.11.0
lintrunner 0.12.7
lintrunner-adapters 0.12.4
llvmlite 0.44.0
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mccabe 0.7.0
mdurl 0.1.2
moreorless 0.4.0
mpmath 1.3.0
msgpack 1.1.0
multidict 6.1.0
multiprocess 0.70.16
mypy 1.14.1
mypy-extensions 1.0.0
networkx 3.4.2
numba 0.61.0
numpy 2.2.4
optimum 1.24.0
optimum-executorch 0.0.0.dev0
packaging 24.2
pandas 2.2.3
parameterized 0.9.0
pathspec 0.12.1
pillow 11.1.0
pip 25.0
platformdirs 4.3.6
pluggy 1.5.0
pooch 1.8.2
propcache 0.3.0
psutil 7.0.0
pyarrow 19.0.1
pycodestyle 2.11.1
pycparser 2.22
pyflakes 3.1.0
Pygments 2.19.1
pytest 8.3.4
pytest-rerunfailures 15.0
pytest-xdist 3.6.1
python-dateutil 2.9.0.post0
pytz 2025.1
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruamel.yaml 0.18.10
ruamel.yaml.clib 0.2.12
ruff 0.9.9
safetensors 0.5.2
scikit-learn 1.6.1
scipy 1.15.2
sentencepiece 0.2.0
setproctitle 1.3.5
setuptools 75.8.0
six 1.17.0
sortedcontainers 2.4.0
soundfile 0.13.1
soxr 0.5.0.post1
stdlibs 2024.12.3
sympy 1.13.3
tabulate 0.9.0
threadpoolctl 3.6.0
timeout-decorator 0.5.0
timm 1.0.7
tokenizers 0.21.0
toml 0.10.2
tomli 2.2.1
tomlkit 0.13.2
torch 2.7.0.dev20250311
torchao 0.10.0+git7d879462
torchaudio 2.6.0.dev20250311
TorchFix 0.6.0
torchsr 1.0.4
torchvision 0.22.0.dev20250311
tqdm 4.67.1
trailrunner 1.4.0
transformers 4.50.0.dev0 /Users/guangyang/transformers
typing_extensions 4.12.2
typing-inspect 0.9.0
tzdata 2025.1
ufmt 2.8.0
urllib3 2.3.0
usort 1.0.8.post1
waitress 3.0.2
wheel 0.45.1
xxhash 3.5.0
yarl 1.18.3
zstd 1.5.6.1


Labels

module: kernels (Issues related to kernel libraries and utilities, and code under kernels/)

Status

Backlog