
OPENAI_API_BASE Support #243

Closed
kenfink opened this issue Jun 7, 2023 · 25 comments

Comments

@kenfink

kenfink commented Jun 7, 2023

Add support for the OPENAI_API_BASE endpoint environment variable.
Ideally, add an "OpenAI API Endpoint" input in the GUI TopBar / Settings, below the "OpenAI API Key" field.

This is important right now because it will allow us to point at any OpenAI-API-compatible drop-in.
Use-case examples:
ChatGPT-to-API, for those of us who don't have GPT-4 API access or want to use a Plus membership instead of per-token costs for 3.5-Turbo.
llama-cpp-python provides a drop-in OpenAI-compatible API endpoint.
Oobabooga provides an OpenAI-compatible API endpoint plugin.

Realizing that this project will likely have direct support for all sorts of local models and various APIs in the future, this will enable a lot of flexible testing until then.

Related Feature request: Each agent should have its own OPENAI_API_KEY and OPENAI_API_BASE. This may already be baked into the plans for enabling various LLMs, since each will have various settings. But here's a currently useful use case: Agent 1 points at localhost:PORT1 for Gorilla, key="model-name", Agent 2 points at localhost:PORT2 for StarCoder, key="other-model", Agent 3 points at api.openai.com for paid Inference.
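For illustration, a minimal sketch of the requested behavior, assuming the openai Python client in use at the time (where openai.api_base is a module-level setting) and the same variable names LangChain already uses; this is not actual SuperAGI code:

    import os
    import openai

    # Honor OPENAI_API_BASE if it is set; otherwise fall back to the official endpoint.
    openai.api_key = os.getenv("OPENAI_API_KEY")
    openai.api_base = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")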

@kenfink
Author

kenfink commented Jun 7, 2023

Here's a workaround that seems to work for now. It's totally inappropriate for future growth, so I'm not creating a PR for it. But as they say, a stupid idea that works isn't stupid.

In /llms/openai.py, line 23 is:
openai.api_key = api_key
Add line 24 beneath it:
openai.api_base = api_key

Then, in the web UI, set the OpenAI API Key to http://YOUR.HOST.IP:PORT/v1
Be sure to leave off the slash at the end of the endpoint; the backend server adds it back in.
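For clarity, lines 23-24 of that file would then read as below; the hack just repurposes the "OpenAI API Key" field in the UI to carry the base URL:

    openai.api_key = api_key
    openai.api_base = api_key  # hack: whatever is typed into the "OpenAI API Key" field becomes the base URL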

@arkkita

arkkita commented Jun 8, 2023

I tried this and it doesn't work; it just keeps "thinking" for an extremely long time with no response. Msg me on Discord: Kita#7214

@sirajperson
Contributor

sirajperson commented Jun 8, 2023

I'm still having trouble running the project, but I thought a simple solution would be:

    def __init__(self, api_key, image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
                 frequency_penalty=0,
                 presence_penalty=0, number_of_results=1):
        # requires `import os` at the top of superagi/llms/openai.py
        openai.api_base = os.getenv("OPENAI_API_BASE", default="https://api.openai.com/v1")

Then one could simply export an environment variable to set the URL base path; if the variable isn't set, the default URL is used.
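A quick check of that fallback behavior (the values below are hypothetical and are set in-process just for demonstration):

    import os

    # Nothing set: the official endpoint is returned.
    os.environ.pop("OPENAI_API_BASE", None)
    print(os.getenv("OPENAI_API_BASE", default="https://api.openai.com/v1"))
    # -> https://api.openai.com/v1

    # Variable exported: the client is redirected to the local endpoint.
    os.environ["OPENAI_API_BASE"] = "http://localhost:5001/v1"
    print(os.getenv("OPENAI_API_BASE", default="https://api.openai.com/v1"))
    # -> http://localhost:5001/v1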

@sirajperson
Contributor

sirajperson commented Jun 8, 2023

I added a static method to the AgentExecutor class (superagi/jobs/agent_executor.py, line 91):

    @staticmethod
    def get_model_api_base_url():
        base_url = get_config("OPENAI_API_BASE_URL")
        # shell_url = os.getenv("OPENAI_API_BASE_URL")
        return base_url

Then I updated the OpenAI class's initialization function to accept an api_base parameter (superagi/llms/openai.py, line 11):

    def __init__(self, api_key, api_base="https://api.openai.com/v1", image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
                 frequency_penalty=0,
                 presence_penalty=0, number_of_results=1):
        openai.api_base = api_base

Then, when spawning the agent, just pass in the parameter (agent_executor.py, around line 151):

    spawned_agent = SuperAgi(ai_name=parsed_config["name"], ai_role=parsed_config["description"],
                             llm=OpenAi(api_base=AgentExecutor.get_model_api_base_url(), model=parsed_config["model"], api_key=model_api_key), tools=tools, memory=memory,
                             agent_config=parsed_config)

Finally, in the config.yaml file, line 5:

    OPENAI_API_BASE_URL: https://api.openai.com/v1

I'm still trying to run the project, but that's an improvement, because one can now set OPENAI_API_BASE_URL in config.yaml to change the target URL of the OpenAI API.
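If you also want the shell environment honored (the commented-out line above hints at that), here is a hedged sketch of how the static method could fall back, assuming get_config returns None when the key is missing:

    @staticmethod
    def get_model_api_base_url():
        # Prefer config.yaml, then the shell environment, then the official endpoint.
        base_url = get_config("OPENAI_API_BASE_URL") or os.getenv("OPENAI_API_BASE_URL")
        return base_url or "https://api.openai.com/v1"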

@sirajperson
Contributor

sirajperson commented Jun 9, 2023

Okay, I got it to work with text-generation-webui. The solution is a bit hacky at the moment, but the agent is using a local GGML model running across multiple GPUs. The above solution definitely works. To get it working with TGWUI, though, I had to make the OpenAI API extension listen on my computer's LAN interface, because getting the docker image to reach port 5001 on the host machine's loopback interface was hard. To make text-generation-webui listen on the LAN interface, I edited the extensions/openai/script.py file as follows.

Below the import statements, at line 17, I added:

    import socket  # if not already imported at the top of the file

    ipsocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    ipsocket.connect(("8.8.8.8", 80))
    localip = ipsocket.getsockname()[0]

This creates a variable containing the IP address of the host machine's primary network interface. On or around line 762 you will find the following line:

    server_addr = ('0.0.0.0' if shared.args.listen else '127.0.0.1', params['port'])

Change it to use the localip variable instead of the static string '127.0.0.1':

    server_addr = ('0.0.0.0' if shared.args.listen else localip, params['port'])

Please note that this is a quick-and-dirty way to use local LLMs. A much better solution would be to add llama-cpp-python functionality to the app and create a settings interface for use with llama-cpp-python. I guess I can work on that next. For now, though, this is a quick way to get SuperAGI to use local LLMs.
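As a quick sanity check before pointing SuperAGI at it, you can confirm the relocated endpoint is reachable; the address and port below are hypothetical, and this assumes the extension exposes the standard /v1/models route:

    import requests

    # Substitute the LAN address and port that TGWUI prints on startup.
    resp = requests.get("http://192.168.1.50:5001/v1/models", timeout=5)
    print(resp.status_code)
    print(resp.json())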

@sirajperson
Contributor

Okay, using text-generation-webui seems to run into errors parsing JSON from SuperAGI. I'm going to need to do some more investigating here, but that error is off-topic for this issue. As it stands, local LLMs can be used by editing the aforementioned files in the SuperAGI project.


@sirajperson
Contributor

Okay, I created a PR that merges Text Generation Web UI in to manage locally hosted language models. The PR creates a docker image for TGWUI and adds settings for it to the configuration file. Local LLMs are a go!

@kenfink
Author

kenfink commented Jun 9, 2023

Here's another option: The Fastchat folks published this today https://lmsys.org/blog/2023-06-09-api-server/

@alexkreidler
Contributor

I'd recommend we stick with the name OPENAI_API_BASE rather than OPENAI_API_BASE_URL, because the former is the standard used by LangChain.

@sirajperson
Contributor

Absolutely, I'll post that change on my next commit.

@alexkreidler
Contributor

Maybe it would be worth opening a separate smaller PR than #289 so people can use this Base URL change sooner? I'm happy to do that. I just applied @sirajperson 's patches from #243 (comment) locally and they work great!

@sirajperson
Contributor

sirajperson commented Jun 10, 2023

@alexkreidler On my fork I have begun to implement locally run LLMs. The fork is currently under development and is not ready to be merged yet. It would be great if you could create a separate PR. Thanks for the help!

@sirajperson
Contributor

sirajperson commented Jun 10, 2023

Please consider the following:
add Django to the end of the requirements.txt file:

Django==4.2.2

Add an import statement to line 6 of agent_executor.py:

from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

Have the get_agent_api_base() method validate the supplied URL, or return the default OpenAI API base:

    @staticmethod
    def get_agent_api_base():
        base_url = get_config("OPENAI_API_BASE")
        # shell_url = os.getenv("OPENAI_API_BASE")

        # Note: verify_exists was removed from Django long ago; a plain URLValidator() works in 4.2.
        url_validator = URLValidator()
        try:
            url_validator(base_url)
        except ValidationError:
            return "https://api.openai.com/v1"

        return base_url

Finally, update the call in the execute_next_action function (on or around line 160) to use the new method name:

        spawned_agent = SuperAgi(ai_name=parsed_config["name"], ai_role=parsed_config["description"],
                                 llm=OpenAi(api_base=AgentExecutor.get_agent_api_base(), model=parsed_config["model"], api_key=model_api_key), tools=tools, memory=memory,
                                 agent_config=parsed_config)

The following lines in the OpenAI class in openai.py can remain the same:

    def __init__(self, api_key, api_base="https://api.openai.com/v1", image_model=None, model="gpt-4", temperature=0.6, max_tokens=4032, top_p=1,
                 frequency_penalty=0,
                 presence_penalty=0, number_of_results=1):
        openai.api_base = api_base

This will make the use of a custom base URL more robust.
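If pulling in Django just for URL validation feels heavy, here is a lighter-weight sketch of the same idea using only the standard library; this is an alternative to the patch above, not part of it, and it assumes get_config returns None when the key is absent:

    from urllib.parse import urlparse  # at the top of agent_executor.py

    @staticmethod
    def get_agent_api_base():
        base_url = get_config("OPENAI_API_BASE")
        try:
            parsed = urlparse(base_url or "")
            # Accept only http(s) URLs that actually name a host.
            if parsed.scheme in ("http", "https") and parsed.netloc:
                return base_url
        except ValueError:
            pass
        return "https://api.openai.com/v1"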

@eriksonssilva

eriksonssilva commented Jun 27, 2023

@sirajperson Hello Jonathan! I hope you're doing well.
I'm sorry for using this issue instead of creating a new one, but I have searched high and low and can't find a solution...
I have tried using SuperAGI + Oobabooga on the backend with the dockerized version by Atinoda, as you pointed out here

However, no matter what I do, whether I build the image by cloning their repo or copy the text-generation-webui folder and try building the image locally, I always get the same error in the SuperAGI PowerShell window:

"(host='super__tgwui', port=5001): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7f4de27160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))"

If I run the command "Test-NetConnection 127.0.0.1 -p 5001" in PowerShell, it returns true, so the port is open.

I have even uninstalled Docker and installed it again, but I'm still facing this... Do you have any idea what I might be doing wrong?

Thanks a lot!

@sirajperson
Contributor

@eriksonssilva
If you place the models that you would like to use under:

SuperAGI/tgui/config/models/

They will be copied into the container for tgui to use.
Presently, only GPTQ models with a context length greater than 4096 tokens are working.

The host='super__tgwui', port=5001 part of the error message shows that SuperAGI is using the docker bridge network and hostname resolution. To get GPU support you will need to follow the docker instructions for setting up the target machine to use the docker image. Those instructions can be found here
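A quick way to see whether bridge hostname resolution is the problem (run from inside the SuperAGI container; the hostname is the one from the error message above):

    import socket

    # If the docker bridge network is missing, this fails with a
    # name-resolution error like the one quoted above.
    print(socket.gethostbyname("super__tgwui"))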

@eriksonssilva

@sirajperson Thanks for the quick answer!
So I've been messing around here and I KINDA made it "work"...
I am using Oobabooga, but without Docker...
Basically, I used the openai extension and, after a lot of trial and error, using my IPv4 address instead of localhost or 127.0.0.1 made the error stop.
However, now something weirder happens... When I start the SuperAGI agent, it just repeats the same thing over and over...
I have used the models available at the link you mentioned, and I can chat with the model through the webui without issues.
On the other hand, SuperAGI does not go anywhere.
The output shows things like:
"Exception: When loading characters/instruction-following/None.yaml: FileNotFoundError(2, 'No such file or directory')"

"Warning: Loaded default instruction-following template for model.
Warning: Ignoring max_new_tokens (3250), too large for the remaining context. Remaining tokens: 1168
Warning: Set max_new_tokens = 1168"

And in SuperAGI the answers are always "vague".

Also, each command takes A LOT of time...

As a test, I set the goal "List 10 mind-boggling movies" and the instructions "Use google to find the movies.".

This might not be 100% related to SuperAGI, but could you (or perhaps anyone?) give me a hint?

@sirajperson
Contributor

sirajperson commented Jun 28, 2023

@eriksonssilva Bravo, Erik. That's definitely progress. What model are you using? Also, are you using llama-cpp to offload layers to your GPU? I can tell you that some of the llama models are just not that great yet. I have been working on getting MPT-30B to be the brain behind the drop-in API endpoint, because its instruct capabilities are starting to deliver the kind of quality responses that would make it usable.

@eriksonssilva

@sirajperson If I try using MPT-30B I think my computer will stand up and walk out of the room!
"Nah, dude. You're expecting too much from me" lol
I have tried llama-7b-4bit and TheBloke_open-llama-7b-open-instruct-GPTQ, but both produce similar results.
I can only use llama.cpp with llama-7b-4bit; for some reason the other one doesn't let me use it...
It's funny that not even GPT-3.5-Turbo gives me satisfying results (for more complex tasks), but I must admit that each time I refresh the "usage" page and see the cost increasing, I start sweating! haha

@sirajperson
Contributor

@alexkreidler Yeah, those models don't have a very high perplexity score. If you are able to use GPTQ models, you should try MPT-30B GPTQ. What I've been doing, because my two old M40s can't run GPTQ models, is renting cheaper GPU instances at runpod.io and running them there. But please try out MPT-30B and share your results. Also, be aware that MPT-30B has special message-termination characters; those will have to be configured in the constraints section of the agent.

@sirajperson
Contributor

@alexkreidler This also happened on the 20th, so it may be possible to use this to run inference on MPT GGML-based models from the GPU: https://postgresml.org/blog/announcing-gptq-and-ggml-quantized-llm-support-for-huggingface-transformers

@sirajperson
Contributor

sirajperson commented Jun 29, 2023

The discussion on this issue is drifting from being able to use a different endpoint for the API to how to get a local LLM working for task agents. Please refer to #542 for discussions on task-agent functionality.

As it stands, the use of a different API endpoint is working correctly. Since one can point the task agent at any API endpoint, whether or not the selected endpoint works well with the task agent is beyond the scope of this issue. I'm hoping this issue will be closed soon, since the alternative-endpoint improvement is working great.

@sirajperson
Contributor

@TransformerOptimus I was wondering if you could close this issue, since the OPENAI_API_BASE implementation is working without fault.

@neelayan7
Collaborator

Awesome @sirajperson . Closing this.

@5c4lar

5c4lar commented Sep 4, 2023

Can we also support configuring this in the web GUI? Restarting the app whenever you want to change OPENAI_API_BASE is very time-consuming.

@jmikedupont2

Please add better documentation for this in the help.
