Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
I'm trying to create a completion using a GGUF model
Current Behavior
from llama_cpp import Llama
model = Llama(model_path="./tests/models/openchat-3.5-1210.Q3_K_S.gguf", n_ctx=128, n_batch=128)
questions_and_answers = [
    ("What's the capital of France?", "Paris"),
    ("What is the capital of Canada?", "Ottawa"),
    ("What is the capital of Ghana?", "Accra"),
]
for i, (question, answer) in enumerate(questions_and_answers):
    prompt = f"GPT4 Correct User: Answer in a single word. {question} <|end_of_turn|>\n GPT4 Correct Assistant:"
    result = model.create_completion(prompt=prompt)
    print(i)
    print(result)
Running the script above on 0.2.58 crashes with "Segmentation fault (core dumped)".
The same script works on 0.2.57.
Environment and Context
Failure Information (for bugs)
Segmentation fault (core dumped)
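A bare "Segmentation fault (core dumped)" gives no hint about which Python call was running when the process died. One way to narrow it down, sketched here under assumptions, is to run the reproduction script in a child interpreter with the standard library's faulthandler enabled and check whether it was killed by a signal. The helper name `run_and_report` and the file name `repro.py` are hypothetical and not part of llama_cpp; the negative return code convention applies to POSIX systems.

```python
import signal
import subprocess
import sys


def run_and_report(args):
    """Run a Python child process with faulthandler enabled and describe how it ended.

    `args` is the list of arguments passed to the interpreter, for example
    ["repro.py"] (hypothetical file holding the script from this report).
    Returns (status, stderr); on a crash, stderr holds the traceback that
    faulthandler dumps.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", *args],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        status = f"killed by {signal.Signals(-proc.returncode).name}"
    else:
        status = f"exited with code {proc.returncode}"
    return status, proc.stderr


if __name__ == "__main__":
    status, stderr = run_and_report(["-c", "print('ok')"])
    print(status)
    print(stderr)
```

Replacing the `-c` arguments with `["repro.py"]` would run the failing script. If it segfaults, the captured stderr should show the Python frame that was active at the time, which may help tell whether the crash happens while loading the model or inside `create_completion`.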