assertion error #563
Comments
Hi @TomiLikesToCode, thanks for the report. llama.cpp is indeed unavailable at the moment. I suspect it has something to do with this update; we are working on it.
So I can't use it at this time?
Yes, we will fix this problem in a future update. In the meantime you can try the other install methods. If you have any questions, please let me know.
What do you mean by other install methods? Can I just download from releases and run that? I hate Docker, I don't wanna use that.
The problem right now is that deploying models with llama.cpp is not supported. You can try other models, for example with this configuration:
More details can be found at Installation From Source.
I'm way too late to reply, but separately from the GGML/GGUF incompatibility issue, the magic number implies that your model file is actually a webpage (4f44213c, read back as little-endian bytes, is "<!DO", the start of "<!DOCTYPE"). Re-downloading the model file might help.
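To see why the magic number points at a webpage: the loader prints the file's first four bytes as a 32-bit integer, so on a little-endian machine (an assumption that holds for typical x86 Windows PCs) the hex value has to be byte-reversed to recover what is actually on disk. A minimal Python sketch; the helper names `magic_to_bytes` and `looks_like_html` are made up for illustration and are not part of llama.cpp or DB-GPT:

```python
import struct


def magic_to_bytes(magic_hex: str) -> bytes:
    """Turn the hex value from the 'invalid magic number' message back into
    the first four bytes of the file (assumes a little-endian read)."""
    return struct.pack("<I", int(magic_hex, 16))


def looks_like_html(header: bytes) -> bool:
    """Rough heuristic: does the start of the file look like an HTML page?"""
    return header.lstrip().lower().startswith((b"<!do", b"<htm"))


header = magic_to_bytes("4f44213c")
print(header)                   # b'<!DO'
print(looks_like_html(header))  # True
```

A valid GGUF file would instead start with the bytes `GGUF`, so a header of `<!DO` suggests the download saved an HTML page (for example an error or login page) rather than the model weights.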
Search before asking
Operating system information
Windows
Python version information
3.10
DB-GPT version
main
Related scenes
Installation Information
Installation From Source
Docker Installation
Docker Compose Installation
Cluster Installation
AutoDL Image
Other
Device information
gpu count 1
cpu count 1
Models information
orca_mini_v3_7b.ggmlv3.q8_0
What happened
D:\AI\DB-GPT>python pilot/server/dbgpt_server.py
2023-09-07 20:10:00 | INFO | numexpr.utils | NumExpr defaulting to 8 threads.
2023-09-07 20:10:06 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
2023-09-07 20:10:09 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
Add file db, db_name: sqlite_default_sqlite, db_type: sqlite, db_path: data/default_sqlite.db
add db connect info error2!Constraint Error: Duplicate key "db_name: sqlite_default_sqlite" violates unique constraint. If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (docs - sql - indexes).
d:\ai\db-gpt\pilot
Model Unified Deployment Mode!
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin worker_type: None worker_class: None host: 0.0.0.0 port: 8000 limit_model_concurrency: 5 standalone: False register: True worker_register_host: None controller_addr: None send_heartbeat: True heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Worker params:
=========================== ModelWorkerParameters ===========================
model_name: orca_mini_v3_7b model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin worker_type: None worker_class: None host: 0.0.0.0 port: 8000 limit_model_concurrency: 5 standalone: False register: True worker_register_host: None controller_addr: None send_heartbeat: True heartbeat_interval: 20
======================================================================
2023-09-07 20:10:10 | INFO | model_worker | Not register current to controller, register: False, controller_addr: None
Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
2023-09-07 20:10:10 | INFO | LOGGER | Found llm model adapter with model path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.model.adapter.LlamaCppAdapater object at 0x0000029C8D61B400>
Get model chat adapter with model path d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin, <pilot.server.chat_adapter.LlamaCppChatAdapter object at 0x0000029C8D6689A0>
2023-09-07 20:10:10 | INFO | model_worker | Init empty instances list for orca_mini_v3_7b@llm
2023-09-07 20:10:10 | INFO | model_worker | [DefaultModelWorker] Parameters of device is None, use cuda
2023-09-07 20:10:10 | INFO | model_worker | Begin start all worker, apply_req: None
2023-09-07 20:10:10 | INFO | model_worker | Apply to all workers: [WorkerRunData(worker_key='orca_mini_v3_7b@llm', worker=<pilot.model.worker.default_worker.DefaultModelWorker object at 0x0000029C8D5C7CA0>, worker_params=ModelWorkerParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', worker_type='llm', worker_class=None, host='0.0.0.0', port=8000, limit_model_concurrency=5, standalone=False, register=False, worker_register_host=None, controller_addr=None, send_heartbeat=True, heartbeat_interval=20), model_params=LlamaCppModelParameters(model_name='orca_mini_v3_7b', model_path='d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', device='cuda', model_type='llama.cpp', prompt_template=None, max_context_size=4096, num_gpus=None, max_gpu_memory=None, cpu_offloading=False, load_8bit=True, load_4bit=False, quant_type='nf4', use_double_quant=True, compute_dtype=None, trust_remote_code=True, verbose=False, seed=-1, n_threads=None, n_batch=512, n_gpu_layers=1000000000, n_gqa=None, rms_norm_eps=5e-06, cache_capacity=None, prefer_cpu=False), stop_event=<asyncio.locks.Event object at 0x0000029C8D6690F0 [unset]>, semaphore=<asyncio.locks.Semaphore object at 0x0000029C8D669FF0 [unlocked, value:5]>, command_args=[], _heartbeat_future=None, _last_heartbeat=None)]
2023-09-07 20:10:10 | INFO | model_worker | Begin load model, model params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin device: cuda model_type: llama.cpp prompt_template: None max_context_size: 4096 num_gpus: None max_gpu_memory: None cpu_offloading: False load_8bit: True load_4bit: False quant_type: nf4 use_double_quant: True compute_dtype: None trust_remote_code: True verbose: False seed: -1 n_threads: None n_batch: 512 n_gpu_layers: 1000000000 n_gqa: None rms_norm_eps: 5e-06 cache_capacity: None prefer_cpu: False
======================================================================
model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin device: cuda model_type: llama.cpp prompt_template: None max_context_size: 4096 num_gpus: None max_gpu_memory: None cpu_offloading: False load_8bit: True load_4bit: False quant_type: nf4 use_double_quant: True compute_dtype: None trust_remote_code: True verbose: False seed: -1 n_threads: None n_batch: 512 n_gpu_layers: 1000000000 n_gqa: None rms_norm_eps: 5e-06 cache_capacity: None prefer_cpu: False
======================================================================
2023-09-07 20:10:10 | INFO | LOGGER | model_params:
=========================== LlamaCppModelParameters ===========================
model_name: orca_mini_v3_7b model_path: d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin device: cuda model_type: llama.cpp prompt_template: None max_context_size: 4096 num_gpus: None max_gpu_memory: None cpu_offloading: False load_8bit: True load_4bit: False quant_type: nf4 use_double_quant: True compute_dtype: None trust_remote_code: True verbose: False seed: -1 n_threads: None n_batch: 512 n_gpu_layers: 1000000000 n_gqa: None rms_norm_eps: 5e-06 cache_capacity: None prefer_cpu: False
======================================================================
[(0, 'name', '', 0, None, 0), (1, 'seq', '', 0, None, 0)]
[(0, 'order_id', 'INTEGER', 0, None, 1), (1, 'user_id', 'INTEGER', 0, None, 0), (2, 'product_id', 'INTEGER', 0, None, 0), (3, 'quantity', 'INTEGER', 0, None, 0), (4, 'order_date', 'DATE', 0, None, 0)]
[(0, 'product_id', 'INTEGER', 0, None, 1), (1, 'product_name', 'VARCHAR(100)', 0, None, 0), (2, 'product_price', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'student_name', 'VARCHAR(100)', 0, None, 0), (2, 'major', 'VARCHAR(100)', 0, None, 0), (3, 'year_of_enrollment', 'INTEGER', 0, None, 0), (4, 'student_age', 'INTEGER', 0, None, 0)]
[(0, 'case_id', 'INTEGER', 0, None, 1), (1, 'scenario_name', 'VARCHAR(100)', 0, None, 0), (2, 'scenario_description', 'TEXT', 0, None, 0), (3, 'test_question', 'VARCHAR(500)', 0, None, 0), (4, 'expected_sql', 'TEXT', 0, None, 0), (5, 'correct_output', 'TEXT', 0, None, 0)]
[(0, 'user_id', 'INTEGER', 0, None, 1), (1, 'user_name', 'VARCHAR(100)', 0, None, 0), (2, 'user_email', 'VARCHAR(100)', 0, None, 0), (3, 'registration_date', 'DATE', 0, None, 0), (4, 'user_country', 'VARCHAR(100)', 0, None, 0)]
[(0, 'course_id', 'INTEGER', 0, None, 1), (1, 'course_name', 'VARCHAR(100)', 0, None, 0), (2, 'credit', 'REAL', 0, None, 0)]
[(0, 'student_id', 'INTEGER', 0, None, 1), (1, 'course_id', 'INTEGER', 0, None, 2), (2, 'score', 'INTEGER', 0, None, 0), (3, 'semester', 'VARCHAR(50)', 0, None, 0)]
2023-09-07 20:10:10 | INFO | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: d:\ai\db-gpt\models\bge-large-en
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Llama.cpp use cpu
2023-09-07 20:10:10 | INFO | LOGGER | Llama.cpp use cpu
Cache capacity is 0 bytes
2023-09-07 20:10:10 | INFO | LOGGER | Cache capacity is 0 bytes
Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
2023-09-07 20:10:10 | INFO | LOGGER | Load LLama model with params: {'model_path': 'd:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin', 'n_ctx': 4096, 'seed': -1, 'n_threads': None, 'n_batch': 512, 'use_mmap': True, 'use_mlock': False, 'low_vram': False, 'n_gpu_layers': 1000000000, 'n_gqa': None, 'logits_all': True, 'rms_norm_eps': 5e-06}
gguf_init_from_file: invalid magic number 4f44213c
error loading model: llama_model_loader: failed to load model from d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "D:\AI\DB-GPT\pilot\server\dbgpt_server.py", line 115, in <module>
    initialize_worker_manager_in_client(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 657, in initialize_worker_manager_in_client
    loop.run_until_complete(
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 341, in _start_all_worker
    await self._apply_worker(apply_req, _start_worker)
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 313, in _apply_worker
    return await asyncio.gather(
  File "d:\ai\db-gpt\pilot\model\worker\manager.py", line 324, in _start_worker
    worker_run_data.worker.start(
  File "d:\ai\db-gpt\pilot\model\worker\default_worker.py", line 78, in start
    self.model, self.tokenizer = self.ml.loader_with_params(model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 121, in loader_with_params
    return llamacpp_loader(llm_adapter, model_params)
  File "d:\ai\db-gpt\pilot\model\loader.py", line 347, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_path, model_params)
  File "d:\ai\db-gpt\pilot\model\llm\llama_cpp\llama_cpp.py", line 85, in from_pretrained
    result.model = Llama(**params)
  File "C:\Users\Tomi\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_cpp\llama.py", line 323, in __init__
    assert self.model is not None
AssertionError
2023-09-07 20:10:14 | INFO | sentence_transformers.SentenceTransformer | Use pytorch device: cuda
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
2023-09-07 20:10:14 | INFO | clickhouse_connect.driver.ctypes | Successfully imported ClickHouse Connect C data optimizations
2023-09-07 20:10:14 | INFO | clickhouse_connect.json_impl | Using orjson library for writing JSON byte strings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | init db profile success...
2023-09-07 20:10:14 | INFO | chromadb | Running Chroma using direct local API.
2023-09-07 20:10:14 | WARNING | chromadb | Using embedded DuckDB with persistence: data will be stored in: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 0 embeddings
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | loaded in 1 collections
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | collection with name langchain already exists, returning existing collection
2023-09-07 20:10:14 | INFO | db_summary | db summary embedding success
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_summary.vectordb
2023-09-07 20:10:14 | INFO | chromadb.db.duckdb | Persisting DB to disk, putting it in the save folder: d:\ai\db-gpt\pilot\data\sqlite_default_sqlite_profile.vectordb
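The bare AssertionError at the end of the traceback hides the cause that the loader reported a few lines earlier (the invalid magic number). A rough sketch of a check that could be run on the model file before starting the server, assuming only that a GGUF file begins with the four bytes `GGUF`; `check_model_header` is a hypothetical helper, not an existing DB-GPT or llama-cpp-python function:

```python
GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF model file


def check_model_header(path: str) -> None:
    """Raise ValueError with a readable message if the file does not start
    with the GGUF magic bytes. Returns None when the header looks fine."""
    with open(path, "rb") as f:
        header = f.read(16)

    if header.startswith(GGUF_MAGIC):
        return

    if header.lstrip().startswith(b"<"):
        # Markup at the start of the file: most likely a saved web page.
        raise ValueError(
            f"{path} starts with {header!r}; this looks like an HTML page, "
            "not a model file. Try downloading it again."
        )

    raise ValueError(
        f"{path} starts with {header!r} instead of b'GGUF'; it may be an "
        "older GGML-format file or an incomplete download."
    )
```

Calling `check_model_header(r"d:\ai\db-gpt\models\orca_mini_v3_7b.ggmlv3.q8_0.bin")` before launching `dbgpt_server.py` would be one way to tell a broken download apart from a format mismatch.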
What you expected to happen
The LLM should load and run as normal.
How to reproduce
Follow the "Installation From Source" guide and download the same model (orca_mini_v3_7b.ggmlv3.q8_0).
Additional context
myenv.txt
Are you willing to submit PR?