When using `outlines.models.llamacpp` and making repeated calls to an instance of `outlines.generate.choice`, only the first call returns a result. This can be worked around by re-instantiating the generator for every call, but that is not an ideal solution.
The model I use in the example code is taken directly from the Cookbook CoT example, but the issue arose with multiple other models I tried earlier.
The example code produces the following output when I run it:

```
result: clothing
result:
result:
```
I am running this on both an M2 and an M3 MacBook.
Steps/code to reproduce the bug:
```python
import llama_cpp
from outlines import generate, models
from textwrap import dedent

llama_tokenizer = llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
    "NousResearch/Hermes-2-Pro-Llama-3-8B"
)
tokenizer = llama_tokenizer.hf_tokenizer

model = models.llamacpp(
    "NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
    "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_tokenizer,
    n_gpu_layers=-1,
    flash_attn=True,
    n_ctx=8192,
    verbose=False,
)

complaint_data = [
    {
        "message": "Hi, my name is Olivia Brown. I recently ordered a knife set from your wellness range, and it arrived earlier this week. Unfortunately, my satisfaction with the product has been less than ideal. My order was A123456",
        "order_number": "A12-3456",
        "department": "kitchen",
    },
    {
        "message": "Hi, my name is John Smith. I recently ordered a dress for an upcoming event, which was alleged to meet my expectations both in fit and style. However, upon arrival, it became apparent that the fabric was of subpar quality, leading to a less than satisfactory appearance. The order number is A12-3456",
        "order_number": "A12-3456",
        "department": "clothing",
    },
    {
        "message": "Hi, my name is Sarah Johnson. I recently ordered the ultimate ChefMaster 8 Drawer Cooktop. However, upon delivery, I discovered that one of the burners is malfunctioning. My order was A458739",
        "order_number": "A45-8739",
        "department": "kitchen",
    },
]

departments = ["clothing", "electronics", "kitchen", "automotive"]


def create_prompt(complaint):
    prompt_messages = [
        {
            "role": "system",
            "content": "You are an agent designed to help label complaints.",
        },
        {
            "role": "user",
            "content": dedent("""
                I'm going to provide you with a consumer complaint to analyze.
                The complaint is going to be regarding a product from one of our
                departments. Here is the list of departments:
                - "clothing"
                - "electronics"
                - "kitchen"
                - "automotive"
                Please reply with *only* the name of the department.
                """),
        },
        {
            "role": "assistant",
            "content": "I understand and will only answer with the department name",
        },
        {
            "role": "user",
            "content": f"Great! Here is the complaint: {complaint['message']}",
        },
    ]
    prompt = tokenizer.apply_chat_template(prompt_messages, tokenize=False)
    return prompt


if __name__ == "__main__":
    generator_struct = generate.choice(model, departments)
    for complaint in complaint_data:
        prompt = create_prompt(complaint)
        result = generator_struct(prompt)
        print(f"result: {result}")
```
This issue arose while putting together an Outlines workshop for ODSC. I had originally hoped to use llama_cpp for the workshop, but this (and another soon-to-be-posted bug) were blockers; I ended up using transformers instead.
I had the same issue in a different application, but I figured it was mostly inexperience. I believe I ended up recreating the generator each time, which is a temporary workaround for anyone who stumbles on this issue (see the sketch below).
Note that this will be slow and, I think, requires rebuilding the FSM each time.
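For anyone who needs it in the meantime, here is a minimal sketch of that workaround, assuming the `model`, `departments`, `complaint_data`, and `create_prompt` names from the repro script above:

```python
# Workaround sketch: create a fresh choice generator on every call
# instead of reusing one instance. This avoids the empty results, but
# it repeats the generator setup (including, I believe, the FSM build)
# on every iteration, so expect it to be noticeably slower.
for complaint in complaint_data:
    generator_struct = generate.choice(model, departments)  # fresh instance per call
    prompt = create_prompt(complaint)
    result = generator_struct(prompt)
    print(f"result: {result}")
```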