OPENAI_API_BASE Support #243
Comments
Here's a workaround that seems to work for now. It's totally inappropriate for future growth, so I'm not creating a PR for it, but as they say, a stupid idea that works isn't stupid. Make the change in /llms/openai.py, then in the web UI set the OpenAI API Key to http://YOUR.HOST.IP:PORT/v1.
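The patch itself didn't survive the copy above; as a hypothetical sketch of what the hack amounts to (class and parameter names are illustrative, not the actual SuperAGI source), the idea is to treat whatever is typed into the "OpenAI API Key" field as the base URL of an OpenAI-compatible server:

```python
import openai

class OpenAi:
    def __init__(self, api_key, model="gpt-3.5-turbo", temperature=0.6):
        # Repurpose the "key" field: point the pre-1.0 openai client at the local endpoint.
        openai.api_base = api_key      # e.g. "http://YOUR.HOST.IP:PORT/v1"
        openai.api_key = "not-needed"  # most local drop-in servers ignore the key
        self.model = model
        self.temperature = temperature
```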
I tried this and it doesn't work; it just keeps "thinking" for an extremely long time with no response. Message me on Discord: Kita#7214
I'm still having trouble running the project, but I thought a simple solution would be:
Then one could simply export an environment variable to set the URL base path; if the variable isn't set, the default URL is used.
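A minimal sketch of that idea, assuming the pre-1.0 openai Python client and an OPENAI_API_BASE environment variable:

```python
import os
import openai

# Use the endpoint from the environment, or fall back to the official API.
openai.api_base = os.environ.get("OPENAI_API_BASE", "https://api.openai.com/v1")
```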
I added a static method to the AgentExecutor class (superagi/jobs/agent_executor.py, around line 91).
Then I updated the OpenAI class's initialization function to take a base_url parameter (superagi/llms/openai.py, around line 11).
Then, when calling the executor agent, just pass in that parameter.
Finally, add the corresponding setting in the config.yaml file (around line 5).
I'm still trying to get the project running, but that's an improvement, because now one can set an option in config.yaml called OPENAI_API_BASE_URL to change the target URL of the OpenAI API.
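The code for those edits didn't survive the copy above; the following is a rough, hypothetical reconstruction of the approach being described (the method name get_model_api_base and the use of SuperAGI's get_config helper are assumptions, not the poster's exact patch):

```python
# superagi/jobs/agent_executor.py (sketch; method name is illustrative)
from superagi.config.config import get_config  # assumed SuperAGI config helper

class AgentExecutor:
    @staticmethod
    def get_model_api_base():
        # Read OPENAI_API_BASE_URL from config.yaml, falling back to the official API.
        return get_config("OPENAI_API_BASE_URL") or "https://api.openai.com/v1"

# superagi/llms/openai.py (sketch)
import openai

class OpenAi:
    def __init__(self, api_key, base_url="https://api.openai.com/v1", model="gpt-3.5-turbo"):
        openai.api_key = api_key
        openai.api_base = base_url  # every request now goes to the configured endpoint
        self.model = model

# The executor would then construct the LLM with
#   OpenAi(api_key=..., base_url=AgentExecutor.get_model_api_base())
# and config.yaml would gain a line such as:
#   OPENAI_API_BASE_URL: "http://localhost:5001/v1"
```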
Okay, I got it to work with text-generation-webui. The solution is a bit hacky at the moment, but the agent is using a local GGML model that is being executed across multiple GPUs. The above solution definitely works. To get it to work with TGWUI, though, I had to make the OpenAI API extension run on my computer's LAN interface, since getting the Docker image to reach port 5001 on the host machine's loopback interface was hard. To make text-generation-webui listen on the LAN interface I edited the extensions/openai/script.py file and added the following below the import statements, around line 17:
This creates a variable containing the IP address of the host machine's primary network interface. On or around line 762 you will find the line that starts the API server, which I changed to bind to that address (a rough sketch of both edits is below).
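The added code didn't survive the copy; here is a hypothetical sketch of the kind of change being described (the helper name is mine, and the exact server-start line varies between TGWUI versions):

```python
# extensions/openai/script.py (sketch) -- discover the primary LAN interface's IP
# by opening a UDP socket toward a public address; no packets are actually sent.
import socket

def get_lan_ip():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]
    finally:
        s.close()

LAN_IP = get_lan_ip()

# Further down, the extension's HTTP server would be started on LAN_IP instead of
# the loopback address, e.g.:
#   server = ThreadingHTTPServer((LAN_IP, 5001), Handler)
```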
Please note that this is a quick and dirty solution for using local LLMs. A much better solution would be to add llama-cpp-python functionality to the app and to create a settings interface for use with llama-cpp-python. I guess I can work on that next. For now, though, this is a quick way to get SuperAGI to use local LLMs.
Okay, using the text-generation-webui setup seems to be running into errors parsing JSON from SuperAGI. Going to need to do some more investigating here. That error, though, is off topic for this issue. As it stands, using local LLMs can be done by editing the aforementioned files in the SuperAGI project.
Okay, I created a PR that merges Text Generation Web UI for managing locally hosted language models. The PR creates a Docker image for TGWUI and adds settings for it in the configuration file. Local LLMs are a go!
Here's another option: the FastChat folks published this today: https://lmsys.org/blog/2023-06-09-api-server/
I'd recommend we stick with the name
Absolutely, I'll include that change in my next commit.
Maybe it would be worth opening a separate, smaller PR than #289 so people can use this base URL change sooner? I'm happy to do that. I just applied @sirajperson's patches from #243 (comment) locally and they work great!
@alexkreidler On my fork I have begun to implement locally run LLMs. The fork is currently under development and is not ready to be merged yet. It would be great if you could create a separate PR. Thanks for the help!
Please consider the following:
Add an import statement at line 6 of agent_executor.py.
Allow the get_agent_api() method to validate the supplied URL or return the default OpenAI API base.
Finally, modify the function name where it is called in the execute_next_action function, on or around line 160.
The relevant lines in the OpenAI class in openai.py can remain the same.
This will make the use of a custom base URL more robust.
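A rough sketch of that validation idea, assuming SuperAGI's get_config helper and an OPENAI_API_BASE_URL config key (names are illustrative):

```python
from urllib.parse import urlparse

from superagi.config.config import get_config  # assumed SuperAGI config helper

DEFAULT_OPENAI_API_BASE = "https://api.openai.com/v1"

def get_agent_api():
    """Return the configured API base if it is a well-formed http(s) URL, else the default."""
    base = get_config("OPENAI_API_BASE_URL")
    if base:
        parsed = urlparse(base)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return base
    return DEFAULT_OPENAI_API_BASE
```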
@sirajperson Hello Jonathan! I hope you're doing well. No matter what I do, whether I build the image by cloning their repo or copy "text-generation-webui" and try building the image locally, I always get the same error in the SuperAGI console: "(host='super__tgwui', port=5001): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7f4de27160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))". If I run "Test-NetConnection 127.0.0.1 -p 5001" in PowerShell, it returns true, so the port is open. I have even uninstalled Docker and installed it again, but I'm still facing this... Do you have any idea what I might be doing wrong? Thanks a lot!
@eriksonssilva Put your models in SuperAGI/tgui/config/models/; they will be copied to the container to be used in TGWUI. The line "(host='super__tgwui', port=5001)" tells SuperAGI to use the Docker bridge and hostname resolution. To get GPU support you will need to follow the Docker instructions for setting up the target machine to use the Docker image. Those instructions can be found here
@sirajperson Thanks for the quick answer! I now get the warning "Loaded default instruction-following template for model." and in SuperAGI the answers are always "vague". Also, each command takes A LOT of time... As a test I've set the goal: This might not be 100% related to SuperAGI; could you (or perhaps anyone) give me a hint?
@eriksonssilva Bravo Erik, that's definitely progress. What model are you using? Also, are you using llama-cpp to offload layers to your GPU? I can tell you that some of the LLaMA models are just not that great yet. I have been working on getting MPT-30B as the brain behind the drop-in API endpoint because its instruct capabilities are starting to really deliver the quality responses that would make it usable.
@sirajperson If I try using MPT-30B I think my computer will stand up and walk out of the room!
@alexkreidler Yeah, those models don't have a very high perplexity score. If you are able to use GPTQ models you should try MPT-30B GPTQ. What I've been doing, because my two old M40s can't run GPTQ models, is renting cheaper GPU instances at runpod.io and running them there. But please try out MPT-30B and share your results. Also be aware that MPT-30B has special message termination characters; those will have to be configured in the constraints section of the agent.
@alexkreidler This also happened on the 20th, so it may be possible to use this to run inference on MPT GGML-based models from the GPU: https://postgresml.org/blog/announcing-gptq-and-ggml-quantized-llm-support-for-huggingface-transformers
The discussion on this issue has drifted from being able to use a different endpoint for the API to how to get a local LLM working for task agents. Please refer to #542 for discussions on task agent functionality. As it stands, the use of a different API endpoint is working correctly. Since one can point the task agent at any API endpoint, whether the selected endpoint will actually work with the task agent is beyond the scope of this issue. I'm hoping this issue will be closed soon, since the alternative endpoint improvement is working great.
@TransformerOptimus I was wondering if you could close this issue, since the OPENAI_API_BASE implementation is working without fault.
Awesome, @sirajperson. Closing this.
Can we also support configuring this in the web GUI? Restarting the app whenever you want to change OPENAI_API_BASE is very time consuming.
Please add better documentation for this in the help.
Add support for the OPENAI_API_BASE endpoint environment variable.
Ideally, add an input for "OpenAI API Endpoint" in the GUI TopBar / Settings under "OpenAI API Key".
This is important right now because it will allow us to point to any OpenAI-API-compatible drop-in.
Use-case examples:
ChatGPT-to-API, for those of us who don't have GPT-4 API access or want to use the Plus membership instead of paying per token for 3.5-Turbo.
llama-cpp-python provides a drop-in OpenAI-compatible API endpoint.
Oobabooga provides an OpenAI-compatible API endpoint plugin.
Realizing that in the future this project will likely have direct support for all sorts of local models and various APIs, this will enable a lot of flexible testing until then.
Related feature request: each agent should have its own OPENAI_API_KEY and OPENAI_API_BASE. This may already be baked into the plans for enabling various LLMs, since each will have various settings. But here's a currently useful use case (sketched below): Agent 1 points at localhost:PORT1 for Gorilla with key="model-name"; Agent 2 points at localhost:PORT2 for StarCoder with key="other-model"; Agent 3 points at api.openai.com for paid inference.
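Purely as an illustration of that per-agent idea (ports, keys, and names are placeholders, not anything SuperAGI exposes today):

```python
# Hypothetical per-agent endpoint/key pairs; all values are placeholders.
AGENT_ENDPOINTS = {
    "agent_1": {"api_base": "http://localhost:8001/v1", "api_key": "gorilla"},    # Gorilla
    "agent_2": {"api_base": "http://localhost:8002/v1", "api_key": "starcoder"},  # StarCoder
    "agent_3": {"api_base": "https://api.openai.com/v1", "api_key": "sk-..."},    # paid inference
}

def client_settings_for(agent_name: str) -> dict:
    """Look up which endpoint and key a given agent should use."""
    return AGENT_ENDPOINTS[agent_name]
```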