Meta-Llama-3-8B-Instruct-GGUF - Streaming response by sentences #12633
Unanswered. eric-patton-bam asked this question in Q&A.
Replies: 0
I'm incredibly new to running local models, and this may be a basic question, but I can't find a solution for it. I tried writing a shell script around llama-cli that listens to the response coming back from a prompt and does something (generates some audio) each time it finds the end of a sentence. I can't get it to work this way, though: llama-cli goes into an interactive mode and won't do anything until I press Ctrl+C.
Is there any documentation for how to accomplish something like this? I'm used to doing this kind of thing in C#, but I want to run this on Linux on an Orange Pi 5 and squeeze out as much performance as I can, so I'm going with llama.cpp and trying to learn how to use it.
Here's what my last attempt looked like (I removed some of it that isn't relevant, as it is getting a bit long):
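Roughly, what I'm aiming for is something like the sketch below: read the model's output one character at a time and emit a chunk whenever a sentence terminator (`.`, `!`, `?`) appears. This is only a sketch, not my working script. `speak` is a placeholder for my audio step, and the llama-cli flags in the usage comment (`-no-cnv` to disable conversation mode, `--no-display-prompt`) may differ between llama.cpp versions:

```shell
#!/usr/bin/env bash
# Sketch: split a streaming token output into sentences.
# Reads stdin one character at a time, buffers it, and prints the
# buffer as one line whenever a sentence terminator is seen.

split_sentences() {
    local buf="" ch
    while IFS= read -r -n1 ch; do
        buf+="$ch"
        case "$ch" in
            [.!?])
                # Sentence ended: emit it (minus a leading space) and reset.
                printf '%s\n' "${buf# }"
                buf=""
                ;;
        esac
    done
    # Flush any trailing text that never hit a terminator.
    [ -n "$buf" ] && printf '%s\n' "${buf# }"
}

# Intended usage (untested; flags vary by llama.cpp version, and
# stdbuf from coreutils avoids pipe block-buffering delaying tokens):
#   stdbuf -o0 ./llama-cli -m model.gguf -p "Tell me a story." \
#       -no-cnv --no-display-prompt 2>/dev/null |
#   split_sentences |
#   while IFS= read -r sentence; do
#       speak "$sentence"   # hypothetical TTS/audio command
#   done
```

The idea is that the sentence splitter is a plain stdin filter, so it can be tested with `echo`/`printf` before wiring the actual model output into it.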