How to Reproduce?
(If you developed your own code, please provide a short script that reproduces the error. For existing examples, please provide link.)
Can be reproduced in SageMaker.
Steps to reproduce
(Paste the commands you ran that produced the error.)
Output logs of model deployment process
What have you tried to solve it?
Tried changing the instance size and type.
Verified that the .bin files are in place at the correct S3 path.
Description
(A clear and concise description of what the bug is.)
I am seeing the following error during model conversion when attempting to deploy a fine-tuned v1.4_llama3 LLM with TensorRT-LLM.
LLM Inference Container:
763104351884.dkr.ecr.region.amazonaws.com/djl-inference:0.29.0-tensorrtllm0.11.0-cu124
The .bin files exist in the S3 path, but the conversion process cannot find them.
Please note that this works fine with vLLM, but not with TensorRT-LLM:
VLLM properties:
engine=Python
option.model_id=s3_path
option.tensor_parallel_degree=1
option.trust_remote_code=true
option.rolling_batch=vllm
option.entryPoint=djl_python.huggingface
option.max_model_len=16384
option.max_rolling_batch_size=16
option.enable_streaming=false
TRTLLM properties:
engine=Python
option.model_id=s3_path
option.tensor_parallel_degree=1
option.trust_remote_code=true
option.rolling_batch=trtllm
option.entryPoint=djl_python.huggingface
option.max_model_len=16384
option.max_rolling_batch_size=128
option.enable_streaming=false
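To make the difference between the two configurations easier to see, here is a minimal sketch that parses both property sets and prints only the keys whose values differ. The helper names `parse_properties` and `diff_properties` are hypothetical and are not part of DJL Serving; the property text is copied from the two listings above.

```python
def parse_properties(text):
    """Parse simple key=value lines into a dict (blank lines and # comments skipped)."""
    props = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def diff_properties(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


VLLM = """
engine=Python
option.model_id=s3_path
option.tensor_parallel_degree=1
option.trust_remote_code=true
option.rolling_batch=vllm
option.entryPoint=djl_python.huggingface
option.max_model_len=16384
option.max_rolling_batch_size=16
option.enable_streaming=false
"""

TRTLLM = """
engine=Python
option.model_id=s3_path
option.tensor_parallel_degree=1
option.trust_remote_code=true
option.rolling_batch=trtllm
option.entryPoint=djl_python.huggingface
option.max_model_len=16384
option.max_rolling_batch_size=128
option.enable_streaming=false
"""

differences = diff_properties(parse_properties(VLLM), parse_properties(TRTLLM))
for key, (vllm_value, trtllm_value) in differences.items():
    print(f"{key}: vllm={vllm_value} trtllm={trtllm_value}")
```

Only `option.rolling_batch` and `option.max_rolling_batch_size` differ between the two listings, so the model location and every other option are identical in the working and failing cases.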
Expected Behavior
(what's the expected behavior?)
The model conversion process should succeed, just as it does with the vLLM configuration.
Error Message
(Paste the complete error message, including stack trace.)
[INFO ] LmiUtils - convert_py: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/.djl.ai/download/cffe5246b14faa11e217a6f21535dff1719c39ba/pytorch_model-00001-of-00004.bin'
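For anyone triaging a similar FileNotFoundError, the sketch below is one way to check whether every weight shard named in a checkpoint's index is actually present in a directory. It assumes the checkpoint is sharded in the Hugging Face style, with a `pytorch_model.bin.index.json` file whose `weight_map` maps tensor names to shard file names; whether the conversion step in this container relies on that index is an assumption, and `find_missing_shards` is a hypothetical helper, not part of DJL Serving.

```python
import json
from pathlib import Path


def find_missing_shards(model_dir, index_name="pytorch_model.bin.index.json"):
    """Return the shard file names listed in the index that are absent from model_dir."""
    model_dir = Path(model_dir)
    index_path = model_dir / index_name
    if not index_path.is_file():
        raise FileNotFoundError(f"No shard index found at {index_path}")

    with index_path.open() as f:
        # Assumption: Hugging Face style index, {"weight_map": {tensor_name: shard_file}}
        weight_map = json.load(f)["weight_map"]

    expected = sorted(set(weight_map.values()))
    return [name for name in expected if not (model_dir / name).is_file()]
```

Running this against a local copy of the S3 prefix, or against the download directory from the error message if the container is still reachable, would show whether the shards are missing at the source or only after the download step.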