Description
I strictly followed the installation docs (https://lightllm-cn.readthedocs.io/en/latest/getting_started/installation.html#installation), and my GPU is an A800.

Error log:
```
python -m lightllm.server.api_server --model_dir ~/autodl-pub/models/llama-7b/
INFO 12-24 20:14:05 [cache_tensor_manager.py:17] USE_GPU_TENSOR_CACHE is On
ERROR 12-24 20:14:05 [_custom_ops.py:51] vllm or lightllm_kernel is not installed, you can't use custom ops
INFO 12-24 20:14:05 [communication_op.py:41] vllm or lightllm_kernel is not installed, you can't use custom allreduce
/root/autodl-tmp/lightllm/lightllm/server/api_server.py:356: DeprecationWarning:
on_event is deprecated, use lifespan event handlers instead.
Read more about it in the
[FastAPI docs for Lifespan Events](https://fastapi.tiangolo.com/advanced/events/).
@app.on_event("shutdown")
/root/autodl-tmp/lightllm/lightllm/server/api_server.py:375: DeprecationWarning:
on_event is deprecated, use lifespan event handlers instead.
Read more about it in the
[FastAPI docs for Lifespan Events](https://fastapi.tiangolo.com/advanced/events/).
@app.on_event("startup")
WARNING 12-24 20:14:06 [tokenizer.py:66] load fast tokenizer fail: Descriptors cannot not be created directly.
WARNING 12-24 20:14:06 [tokenizer.py:66] If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
WARNING 12-24 20:14:06 [tokenizer.py:66] If you cannot immediately regenerate your protos, some other possible workarounds are:
WARNING 12-24 20:14:06 [tokenizer.py:66] 1. Downgrade the protobuf package to 3.20.x or lower.
WARNING 12-24 20:14:06 [tokenizer.py:66] 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
WARNING 12-24 20:14:06 [tokenizer.py:66]
WARNING 12-24 20:14:06 [tokenizer.py:66] More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
Traceback (most recent call last):
File "/root/autodl-tmp/lightllm/lightllm/server/tokenizer.py", line 62, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, trust_remote_code=trust_remote_code, *args, **kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 907, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2208, in from_pretrained
return cls._from_pretrained(
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2442, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 171, in init
self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 203, in get_spm_processor
model_pb2 = import_protobuf(f"The new behaviour of {self.__class__.__name__} (with `self.legacy = False`)")
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/convert_slow_tokenizer.py", line 38, in import_protobuf
from sentencepiece import sentencepiece_model_pb2
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/sentencepiece/sentencepiece_model_pb2.py", line 34, in
_descriptor.EnumValueDescriptor(
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 796, in new
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
- Downgrade the protobuf package to 3.20.x or lower.
- Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/envs/lightllm/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/envs/lightllm/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/autodl-tmp/lightllm/lightllm/server/api_server.py", line 394, in
init_tokenizer(args) # for openai api
File "/root/autodl-tmp/lightllm/lightllm/server/build_prompt.py", line 8, in init_tokenizer
tokenizer = get_tokenizer(args.model_dir, args.tokenizer_mode, trust_remote_code=args.trust_remote_code)
File "/root/autodl-tmp/lightllm/lightllm/server/tokenizer.py", line 68, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, trust_remote_code=trust_remote_code, *args, **kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 907, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2208, in from_pretrained
return cls._from_pretrained(
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2442, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 171, in init
self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 203, in get_spm_processor
model_pb2 = import_protobuf(f"The new behaviour of {self.__class__.__name__} (with `self.legacy = False`)")
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/transformers/convert_slow_tokenizer.py", line 38, in import_protobuf
from sentencepiece import sentencepiece_model_pb2
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/sentencepiece/sentencepiece_model_pb2.py", line 16, in
DESCRIPTOR = _descriptor.FileDescriptor(
File "/root/miniconda3/envs/lightllm/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 1066, in new
return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool: duplicate file name sentencepiece_model.proto
```
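
Based on the workarounds listed in the error message itself, I understand the fix would look something like the sketch below. The exact version pin is my assumption; the message only says "3.20.x or lower":

```
# Workaround 1 from the error message: downgrade protobuf to a 3.20.x release.
# The specific pin below is an assumption; any 3.20.x or lower version
# should satisfy the generated-code descriptor check.
pip install "protobuf==3.20.3"

# Workaround 2 from the error message: force pure-Python protobuf parsing
# (noted as much slower, but avoids regenerating the _pb2.py files), then relaunch.
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
python -m lightllm.server.api_server --model_dir ~/autodl-pub/models/llama-7b/
```

Of the two, the protobuf downgrade is the first option the message recommends; the environment-variable route is explicitly described as much slower since it uses pure-Python parsing. Is one of these the recommended fix for LightLLM, or should the installation docs pin a compatible protobuf version?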