Closed
Description
Exporting with the export_llama script fails for both the Llama and stories models.
Error for Llama model: Could not import fairseq2 modules....RuntimeError: Trying to create tensor with negative dimension -1: [-1, 4096]
Error for stories model: Could not import fairseq2 modules....RuntimeError: mmap can only be used with files saved with `torch.save(./stories/stories110M.pt, _use_new_zipfile_serialization=True), please torch.save your checkpoint with this option in order to use mmap.
Steps to run for Llama model
Follow the steps from LLM manual
Download the Meta version of the Llama weights
Run the export_llama script
python -m examples.models.llama2.export_llama --checkpoint $MODEL_PATH/consolidated.00.pth --params $MODEL_PATH/params.json -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32
Error details for llama2 model export
Could not import fairseq2 modules.
INFO:root:Loading model with checkpoint=/Users/gchauhan/dev/llama-fast/checkpoints/meta-llama/Llama-2-7b/consolidated.00.pth, params=/Users/gchauhan/dev/llama-fast/checkpoints/meta-llama/Llama-2-7b/params.json, use_kv_cache=True, weight_type=WeightType.LLAMA
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main() # pragma: no cover
    ^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 408, in export_llama
    return _export_llama(modelname, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 529, in _export_llama
    builder_exported_to_edge = _prepare_for_llama_export(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 486, in _prepare_for_llama_export
    load_llama_model(
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/builder.py", line 83, in load_llama_model
    model, example_inputs, _ = EagerModelFactory.create_model(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/model.py", line 139, in __init__
    self.model_ = Transformer(model_args)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/et/lib/python3.11/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 418, in __init__
    self.tok_embeddings = nn.Embedding(params.vocab_size, params.dim)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/et/lib/python3.11/site-packages/torch/nn/modules/sparse.py", line 143, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/et/lib/python3.11/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 4096]
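The traceback above shows nn.Embedding being built from params.vocab_size and params.dim, and the failing shape is [-1, 4096], so the -1 most likely comes straight from vocab_size in params.json. That is an assumption on my side, not something confirmed in this report. A minimal sketch for checking it is below; the helper name find_invalid_dims and the sample file contents are hypothetical and are not part of ExecuTorch.

```python
import json
import os
import tempfile


def find_invalid_dims(params_path, keys=("vocab_size", "dim")):
    """Return {key: value} for each listed key that is present in the
    JSON file and holds an integer that is zero or negative."""
    with open(params_path) as f:
        params = json.load(f)
    bad = {}
    for key in keys:
        value = params.get(key)
        if isinstance(value, int) and value <= 0:
            bad[key] = value
    return bad


# Demo with made-up contents (an assumption, not the reporter's actual file).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "params.json")
with open(demo_path, "w") as f:
    json.dump({"dim": 4096, "vocab_size": -1}, f)

print(find_invalid_dims(demo_path))  # prints {'vocab_size': -1}
```

If the real params.json reports a non-positive vocab_size, setting it to the tokenizer's vocabulary size before exporting would be the thing to try.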
Steps for Stories model
Download the model from the links specified
Run
python -m examples.models.llama2.export_llama -c ./stories/stories110M.pt -p ./stories/params.json
Error details for Stories model export
Could not import fairseq2 modules.
INFO:root:Loading model with checkpoint=./stories/stories110M.pt, params=./stories/params.json, use_kv_cache=False, weight_type=WeightType.LLAMA
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main() # pragma: no cover
    ^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 408, in export_llama
    return _export_llama(modelname, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 529, in _export_llama
    builder_exported_to_edge = _prepare_for_llama_export(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/export_llama_lib.py", line 486, in _prepare_for_llama_export
    load_llama_model(
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/builder.py", line 83, in load_llama_model
    model, example_inputs, _ = EagerModelFactory.create_model(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gchauhan/dev/executorch/examples/models/llama2/model.py", line 75, in __init__
    checkpoint = torch.load(checkpoint_path, map_location=device, mmap=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/et/lib/python3.11/site-packages/torch/serialization.py", line 1032, in load
    raise RuntimeError("mmap can only be used with files saved with "
RuntimeError: mmap can only be used with files saved with `torch.save(./stories/stories110M.pt, _use_new_zipfile_serialization=True), please torch.save your checkpoint with this option in order to use mmap.
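The message above suggests stories110M.pt was written in the legacy (non-zip) serialization format, which torch.load cannot memory-map. A rough way to confirm this without loading the model is to test whether the file is a zip archive, since the zip-based container has been the torch.save default since PyTorch 1.6. The sketch below is only illustrative: is_zip_checkpoint and resave_for_mmap are hypothetical helper names, and the re-save step assumes the checkpoint loads cleanly without mmap.

```python
import zipfile


def is_zip_checkpoint(path):
    """Return True if the file is a zip archive, i.e. the container format
    that torch.load(..., mmap=True) expects."""
    return zipfile.is_zipfile(path)


def resave_for_mmap(src, dst):
    """Load a legacy checkpoint without mmap and write it back out.

    torch is imported lazily so the check above works without it installed.
    """
    import torch

    checkpoint = torch.load(src, map_location="cpu")  # no mmap here
    torch.save(checkpoint, dst)  # zip-based format by default
```

For example, is_zip_checkpoint("./stories/stories110M.pt") returning False would be consistent with the error, and passing the re-saved file to -c might then work. Falling back to a non-mmap load inside model.py when the check fails would be another option.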
Environment
python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.4.0.dev20240324
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 14.4.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.29.0
Libc version: N/A
Python version: 3.11.8 (main, Feb 26 2024, 15:36:12) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-14.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M1 Pro
Versions of relevant libraries:
[pip3] executorch==0.1.0
[pip3] numpy==1.26.4
[pip3] torch==2.4.0.dev20240324
[pip3] torchao==0.1
[pip3] torchaudio==2.2.0.dev20240324
[pip3] torchsr==1.0.4
[pip3] torchvision==0.19.0.dev20240324
[conda] executorch 0.1.0 pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.4.0.dev20240324 pypi_0 pypi
[conda] torchao 0.1 pypi_0 pypi
[conda] torchaudio 2.2.0.dev20240324 pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchvision 0.19.0.dev20240324 pypi_0 pypi