
Use conversation template for api proxy, fix eventsource format #2383

Open · wants to merge 6 commits into base: master
Conversation

@zeyugao commented Jul 25, 2023

This PR adds a --chat-prompt-model parameter that enables the use of a conversation template registered in fastchat/conversation.py. As model prompt templates, like Llama 2's, become more intricate, handling them exclusively with flags such as --chat-prompt and --user-name becomes less manageable. The community-maintained conversation templates offer a more user-friendly solution.
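For context, a minimal sketch of what such a template provides, assuming fschat is installed (the template name "llama-2" is one of those registered in fastchat/conversation.py; the messages are illustrative):

```python
from fastchat.conversation import get_conv_template

# Look up the community-maintained template by name,
# roughly what --chat-prompt-model would do.
conv = get_conv_template("llama-2")

# roles[0] is the user role, roles[1] the assistant role for this template.
conv.append_message(conv.roles[0], "Hello, who are you?")
conv.append_message(conv.roles[1], None)  # None marks the slot the model completes

# Renders the full prompt, including the [INST] ... [/INST] markup
# that is unwieldy to express via --chat-prompt and --user-name alone.
prompt = conv.get_prompt()
```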

Customized system messages depend on lm-sys/FastChat#2069, which has not yet been merged; the current fschat release should still run without exceptions, however.

Furthermore, there is an issue with how data is framed in the event-source (SSE) format: each data payload must be terminated by two \n characters rather than one, i.e. followed by a blank line containing only a single \n, which is what the OpenAI API does.
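For illustration, a minimal sketch of the framing (the payload content is hypothetical; the point is the terminating blank line):

```python
import json

def sse_chunk(payload: dict) -> str:
    # Server-sent events are delimited by a blank line, so every
    # "data:" message must end in "\n\n". With only a single "\n",
    # clients never see the event boundary and stall.
    return f"data: {json.dumps(payload)}\n\n"

# e.g. streaming deltas the way the OpenAI API does:
#   sse_chunk({"choices": [{"delta": {"content": "Hi"}}]})
# followed, at the end of the stream, by:
#   "data: [DONE]\n\n"
```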

Make fschat and flask-cors optional
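A sketch of the usual pattern for that (the module names are the real ones; the fallback handling around them is illustrative):

```python
# Degrade gracefully when the optional dependencies are absent.
try:
    from fastchat.conversation import get_conv_template
except ImportError:
    get_conv_template = None  # --chat-prompt-model then cannot be used

try:
    from flask_cors import CORS
except ImportError:
    CORS = None  # cross-origin requests stay disabled
```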
@Azeirah (Contributor) commented Jul 26, 2023

Thank you! The PHPStorm plugin I was using, codeGPT, didn't work with the main branch's api_like_OAI.py.

With yours it works smoothly! With the new Llama-2-based wizard-13b I finally have a usable local-only assistant that integrates seamlessly into my existing workflows.

:D

@zeyugao (Author) commented Aug 2, 2023

The PR has been merged upstream. Due to a GitHub limitation (https://github.com/orgs/community/discussions/5634), it seems that I cannot enable "allow edits by maintainers".

@thomasbergersen commented Aug 5, 2023

Thank you! The result generated by llama-cpp-python is missing some keywords.

@vmajor commented Aug 7, 2023

I am observing this error with a 70B Llama 2 model when attempting to run the guidance tutorial notebook (https://github.com/microsoft/guidance/blob/main/notebooks/tutorial.ipynb) after dropping in openai.api_base = "http://127.0.0.1:8081/":

```
[2023-08-07 09:31:43,210] ERROR in app: Exception on /completions [POST]
Traceback (most recent call last):
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/vmajor/anaconda3/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vmajor/llama.cpp/examples/server/api_like_OAI_fixed.py", line 225, in completion
    postData = make_postData(body, chat=False, stream=stream)
  File "/home/vmajor/llama.cpp/examples/server/api_like_OAI_fixed.py", line 106, in make_postData
    if(is_present(body, "stop")): postData["stop"] += body["stop"]
TypeError: 'NoneType' object is not iterable
127.0.0.1 - - [07/Aug/2023 09:31:43] "POST //completions HTTP/1.1" 500 -
```
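The traceback points at postData["stop"] being None when the += runs. A hedged sketch of a guard (is_present is assumed to mirror the helper in api_like_OAI.py; the surrounding shape of make_postData is inferred from the traceback, not copied from it):

```python
def is_present(body, key):
    # assumed to match the helper used in api_like_OAI.py
    return key in body and body[key] is not None

def make_postData(body, chat=False, stream=False):
    postData = {}
    # ... other fields elided ...
    # Start from a list so the concatenation never hits None.
    stop = postData.get("stop") or []
    if is_present(body, "stop"):
        stop += body["stop"]
    postData["stop"] = stop
    return postData
```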
