Description
I followed the installation documentation to install lmdeploy, then ran the following command:
$ lmdeploy serve api_server /data/models/deepseek-vl2 --tp 4 --server-port 8001 --dtype float16 --log-level INFO --api-keys sk-r1OH8cTiMe --session-len 12288 --backend pytorch
Here is the output of lmdeploy check_env:
sys.platform: linux
Python: 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3: Tesla V100S-PCIE-32GB
CUDA_HOME: /usr/local/cuda-12.2
NVCC: Cuda compilation tools, release 12.2, V12.2.91
GCC: gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
PyTorch: 2.0.1
PyTorch compiling details: PyTorch built with:
- GCC 9.3
- C++ Version: 201703
- Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.8
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.7
- Magma 2.6.1
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2
LMDeploy: 0.7.1+
transformers: 4.47.1
gradio: Not Found
fastapi: 0.115.11
pydantic: 2.10.6
triton: 3.1.0
NVIDIA Topology:
GPU0 GPU1 GPU2 GPU3 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X PHB PHB PHB 0-31 0-1 N/A
GPU1 PHB X PHB PHB 0-31 0-1 N/A
GPU2 PHB PHB X PHB 0-31 0-1 N/A
GPU3 PHB PHB PHB X 0-31 0-1 N/A
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
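Note that check_env reports transformers 4.47.1, while the warning in the log below says this LMDeploy build expects transformers in the range [4.33.0 ~ 4.46.1]. A quick sanity check of the installed version against that range (a minimal sketch; the range is taken from LMDeploy's own warning, not from its documentation):

```python
# Compare the installed transformers version against the range reported
# by lmdeploy's warning below: [4.33.0 ~ 4.46.1].
import transformers
from packaging import version  # packaging ships as a transformers dependency

v = version.parse(transformers.__version__)
lo, hi = version.parse("4.33.0"), version.parse("4.46.1")
status = "inside" if lo <= v <= hi else "outside"
print(f"transformers {transformers.__version__} is {status} the supported range")
```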
The full log:
$ lmdeploy serve api_server /data/models/deepseek-vl2 --tp 4 --server-port 8001 --dtype float16 --log-level INFO --api-keys sk-r1OH8cTiMe --session-len 12288 --chat-template /data/models/chat_template.json --backend pytorch
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
You are using a model of type deepseek_vl_v2 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
You are using a model of type deepseek_vl_v2 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
2025-03-20 14:58:23,085 - lmdeploy - INFO - builder.py:64 - matching vision model: DeepSeek2VisionModel
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Python version is above 3.10, patching the collections module.
Some kwargs in processor config are unused and will not have any effect: sft_format, ignore_id, candidate_resolutions, image_std, mask_prompt, patch_size, downsample_ratio, image_token, pad_token, image_mean, normalize, add_special_token.
2025-03-20 14:58:24,589 - lmdeploy - INFO - async_engine.py:259 - input backend=pytorch, backend_config=PytorchEngineConfig(dtype='float16', tp=4, session_len=12288, max_batch_size=128, cache_max_entry_count=0.8, prefill_interval=16, block_size=64, num_cpu_blocks=0, num_gpu_blocks=0, adapters=None, max_prefill_token_num=8192, thread_safe=False, enable_prefix_caching=False, device_type='cuda', eager_mode=False, custom_module_map=None, download_dir=None, revision=None, quant_policy=0, distributed_executor_backend=None)
2025-03-20 14:58:24,590 - lmdeploy - INFO - async_engine.py:260 - input chat_template_config=ChatTemplateConfig(model_name='deepseek-vl2', system='<|im_start|>system\n', meta_instruction='You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.', eosys=None, user=None, eoh=None, assistant=None, eoa=None, tool=None, eotool=None, separator=None, capability=None, stop_words=None)
2025-03-20 14:58:24,594 - lmdeploy - INFO - async_engine.py:269 - updated chat_template_config=ChatTemplateConfig(model_name='deepseek-vl2', system='<|im_start|>system\n', meta_instruction='You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.', eosys=None, user=None, eoh=None, assistant=None, eoa=None, tool=None, eotool=None, separator=None, capability=None, stop_words=None)
2025-03-20 14:58:26,161 - lmdeploy - WARNING - transformers.py:22 - LMDeploy requires transformers version: [4.33.0 ~ 4.46.1], but found version: 4.47.1
2025-03-20 14:58:26,427 - lmdeploy - INFO - __init__.py:82 - Build executor.
2025-03-20 14:58:26,433 - lmdeploy - INFO - ray_executor.py:199 - Init ray cluster.
2025-03-20 14:58:28,513 INFO worker.py:1841 -- Started a local Ray instance.
2025-03-20 14:58:29,385 - lmdeploy - INFO - dist_utils.py:28 - MASTER_ADDR=10.129.11.2, MASTER_PORT=29501
2025-03-20 14:58:29,386 - lmdeploy - INFO - ray_executor.py:219 - Init ray workers.
(pid=22622) /data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
(pid=22622) warnings.warn(_BETA_TRANSFORMS_WARNING)
(pid=22622) /data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
(pid=22622) warnings.warn(_BETA_TRANSFORMS_WARNING)
(RayWorkerWrapper pid=22622) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(RayWorkerWrapper pid=22622) You are using a model of type deepseek_vl_v2 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
(RayWorkerWrapper pid=22622) You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Traceback (most recent call last):
File "/data/software/miniconda3/envs/lmdeploy/bin/lmdeploy", line 33, in
sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
File "/data/installation/lmdeploy/lmdeploy/cli/entrypoint.py", line 39, in run
args.run(args)
File "/data/installation/lmdeploy/lmdeploy/cli/serve.py", line 322, in api_server
run_api_server(args.model_path,
File "/data/installation/lmdeploy/lmdeploy/serve/openai/api_server.py", line 1106, in serve
VariableInterface.async_engine = pipeline_class(model_path=model_path,
File "/data/installation/lmdeploy/lmdeploy/serve/vl_async_engine.py", line 32, in init
super().init(model_path, backend=backend, backend_config=backend_config, **kwargs)
File "/data/installation/lmdeploy/lmdeploy/serve/async_engine.py", line 279, in init
self._build_pytorch(model_path=model_path, backend_config=backend_config, **kwargs)
File "/data/installation/lmdeploy/lmdeploy/serve/async_engine.py", line 341, in _build_pytorch
self.engine = Engine(model_path=model_path, tokenizer=self.tokenizer, engine_config=backend_config)
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 138, in init
self.executor = build_executor(model_path,
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/init.py", line 110, in build_executor
return RayExecutor(
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 220, in init
self.workers = self._init_workers_ray(placement_group, worker_kwargs)
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 382, in _init_workers_ray
workers = self._sort_workers(driver_ip, workers)
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 329, in _sort_workers
worker_ips = ray.get([worker.get_node_ip.remote() for worker in workers])
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/ray/_private/worker.py", line 2771, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/ray/_private/worker.py", line 921, in get_objects
raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::RayWorkerWrapper.__init__() (pid=22622, ip=10.129.11.2, actor_id=9bca9cf45acb976e93e201f801000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7fdf079c67a0>)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 740, in __getitem__
raise KeyError(key)
KeyError: 'deepseek_vl_v2'
During handling of the above exception, another exception occurred:
ray::RayWorkerWrapper.__init__() (pid=22622, ip=10.129.11.2, actor_id=9bca9cf45acb976e93e201f801000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7fdf079c67a0>)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 132, in init
model_config = ModelConfig.from_pretrained(model_path, trust_remote_code=True,dtype=dtype, tp=tp)
File "/data/installation/lmdeploy/lmdeploy/pytorch/config.py", line 134, in from_pretrained
hf_config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=trust_remote_code)
File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1040, in from_pretrained
raise ValueError(
ValueError: The checkpoint you are trying to load has model type `deepseek_vl_v2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
(RayWorkerWrapper pid=22622) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RayWorkerWrapper.__init__() (pid=22622, ip=10.129.11.2, actor_id=9bca9cf45acb976e93e201f801000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7fdf079c67a0>)
(RayWorkerWrapper pid=22622) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 740, in __getitem__
(RayWorkerWrapper pid=22622) raise KeyError(key)
(RayWorkerWrapper pid=22622) KeyError: 'deepseek_vl_v2'
(RayWorkerWrapper pid=22622)
(RayWorkerWrapper pid=22622) During handling of the above exception, another exception occurred:
(RayWorkerWrapper pid=22622)
(RayWorkerWrapper pid=22622) ray::RayWorkerWrapper.__init__() (pid=22622, ip=10.129.11.2, actor_id=9bca9cf45acb976e93e201f801000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7fdf079c67a0>)
(RayWorkerWrapper pid=22622) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 451, in result
(RayWorkerWrapper pid=22622) return self.__get_result()
(RayWorkerWrapper pid=22622) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
(RayWorkerWrapper pid=22622) raise self._exception
(RayWorkerWrapper pid=22622) File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 132, in init
(RayWorkerWrapper pid=22622) model_config = ModelConfig.from_pretrained(model_path, trust_remote_code=True,dtype=dtype, tp=tp)
(RayWorkerWrapper pid=22622) File "/data/installation/lmdeploy/lmdeploy/pytorch/config.py", line 134, in from_pretrained
(RayWorkerWrapper pid=22622) hf_config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=trust_remote_code)
(RayWorkerWrapper pid=22622) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1040, in from_pretrained
(RayWorkerWrapper pid=22622) raise ValueError(
(RayWorkerWrapper pid=22622) ValueError: The checkpoint you are trying to load has model type `deepseek_vl_v2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
(RayWorkerWrapper pid=22632)
(RayWorkerWrapper pid=22632)
(RayWorkerWrapper pid=22627)
(RayWorkerWrapper pid=22627)
(RayWorkerWrapper pid=22624)
(RayWorkerWrapper pid=22624)
(pid=22627) /data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning(). [repeated 6x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(pid=22627) warnings.warn(_BETA_TRANSFORMS_WARNING) [repeated 6x across cluster]
(RayWorkerWrapper pid=22627) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored. [repeated 3x across cluster]
(RayWorkerWrapper pid=22627) You are using a model of type deepseek_vl_v2 to instantiate a model of type . This is not supported for all configurations of models and can yield errors. [repeated 3x across cluster]
(RayWorkerWrapper pid=22627) You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message. [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RayWorkerWrapper.__init__() (pid=22624, ip=10.129.11.2, actor_id=adab495e3f7329f36cd8329b01000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7f5f16e16890>) [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 740, in __getitem__ [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) raise KeyError(key) [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) KeyError: 'deepseek_vl_v2' [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) During handling of the above exception, another exception occurred: [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) ray::RayWorkerWrapper.__init__() (pid=22624, ip=10.129.11.2, actor_id=adab495e3f7329f36cd8329b01000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x7f5f16e16890>) [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 451, in result [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) return self.__get_result() [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) raise self._exception [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/installation/lmdeploy/lmdeploy/pytorch/engine/executor/ray_executor.py", line 132, in init [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) model_config = ModelConfig.from_pretrained(model_path, trust_remote_code=True,dtype=dtype, tp=tp) [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/installation/lmdeploy/lmdeploy/pytorch/config.py", line 134, in from_pretrained [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) hf_config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=trust_remote_code) [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) File "/data/software/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1040, in from_pretrained [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) raise ValueError( [repeated 3x across cluster]
(RayWorkerWrapper pid=22624) ValueError: The checkpoint you are trying to load has model type `deepseek_vl_v2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. [repeated 3x across cluster]
I also tried transformers==4.38.2 and transformers==4.45, but that did not help.
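Since the traceback shows the Ray worker dying inside transformers' own AutoConfig.from_pretrained, the failure can likely be reproduced outside LMDeploy. Below is a minimal diagnostic sketch; the model path is the one from the command above, and the auto_map check reflects my assumption about how trust_remote_code resolves a model_type that transformers does not ship natively (as far as I know, no transformers release in the supported range includes a native deepseek_vl_v2):

```python
import json
from transformers import AutoConfig

model_path = "/data/models/deepseek-vl2"  # path from the failing command

# For a model_type transformers does not know natively, trust_remote_code only
# helps if the checkpoint's config.json has an "auto_map" entry pointing at
# custom code shipped alongside the weights. Check whether one is present.
with open(f"{model_path}/config.json") as f:
    cfg = json.load(f)
print("model_type:", cfg.get("model_type"))
print("auto_map:", cfg.get("auto_map"))

# This is the exact call that raises in the worker
# (ray_executor.py:132 -> config.py:134 in the traceback above).
config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
print(type(config))
```

If config.json carries no auto_map entry, `trust_remote_code=True` has nothing to load, which would also explain why switching transformers versions does not change the outcome.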