Added local llm functionality by incorporating text-generation-webui #289
Conversation
I'm all about this!! I think we all wanted AutoGPT to run locally to push the limits without breaking the bank! Stoked to see what we can run locally. And yeah, optimization of GPU presets, CLBlast, llama.cpp, etc. will be next, to auto-optimize local LLMs. Great job though! 👏
@sirajperson This is awesome. Tried running docker-compose on the new changes (on a MacBook, 16 GB RAM). Getting the following error: `0 32.90 Ignoring bitsandbytes: markers 'platform_system == "Windows"' don't match your environment`
Thanks for the reply. As for the build, are you able to build the docker image on the main branch?
Removed device-specific launch arguments. Settings must be done after installation.
Removed additional default settings for GPU usage. These settings may need to be configured via the configuration.yaml file.
Additional generalized default device settings.
Okay, it looks like the error you are getting occurs while installing the requirements for text-generation-webui. On line 25 of tgwui_requirements.txt, try commenting out the last line. To make this work on your local machine there are a couple of installation steps you may have to take on a Mac. I'm not sure what kind of video card you have, or if you are using a laptop, but you should be able to remove that last line from the requirements file to get it installed.

Also, I have removed the following items from the launch arguments so that TGWUI doesn't automatically target devices with NVIDIA GPUs. For now, configuration of the docker-compose.yaml needs to be done manually. I will create a build.sh script tonight that generates the docker-compose.yaml file with build options based on the target installation environment. Until then I have commented out GPU offloading and GPU configuration. This will make the model's API responses much slower, but will greatly increase the number of devices that the containers can run on without having to modify the docker-compose.yaml file. Also, llama.cpp GPU offloading doesn't presently support removing offloaded layers from system RAM; instead, it currently makes a copy in vRAM and executes the layers there. From what I understand, this is being addressed.

Please re-clone the repository and try running from scratch. You may need to remove the containers that you already tried to build; please refer to the docker container management docs for information on how to remove containers. If these are the only containers you are using on your system, you can run 'docker system prune' to remove containers that aren't running and to wipe the previous build cache. Don't run prune if you have other docker images that you would like to keep, or it will delete them.
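A sketch of the cleanup described above (the fork URL is a placeholder; `docker system prune` is the only command named in the comment, the rest is ordinary docker-compose usage):

```bash
# Stop and remove the containers from the previous attempt.
docker-compose down

# WARNING: this also removes any other stopped containers, dangling images,
# and the build cache on your machine -- skip it if you need to keep them.
docker system prune

# Re-clone and rebuild from scratch (replace the placeholder with the fork's URL).
git clone <fork-url> SuperAGI && cd SuperAGI
docker-compose up --build
```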
@TransformerOptimus
Searx is a free internet metasearch engine which aggregates results from more than 70 search services. I added a simple beautifulsoup scraper that allows many of the instances on https://searx.space/ to be used without an API key. Uses the bs4, httpx, and pydantic packages which are already in the `requirements.txt`.
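For reference, a minimal sketch of what such a scraper might look like (this is illustrative, not the PR's actual code: the instance URL, CSS selectors, and function name are assumptions, and the pydantic validation used by the real tool is omitted):

```python
# Minimal sketch of scraping a public Searx/SearxNG instance with httpx + BeautifulSoup.
# Result markup varies between instances, so the selectors below are assumptions.
import httpx
from bs4 import BeautifulSoup


def searx_search(query: str, instance: str = "https://searx.be") -> list[dict]:
    response = httpx.get(
        f"{instance}/search",
        params={"q": query},
        headers={"User-Agent": "Mozilla/5.0"},  # many instances reject default clients
        timeout=10,
    )
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    results = []
    for item in soup.select(".result"):  # selector is an assumption
        link = item.select_one("a[href]")
        if link is None:
            continue
        results.append({"title": link.get_text(strip=True), "url": link.get("href")})
    return results
```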
There are multiple components in a prompt.
We can give a certain percentage of weight to each component.
Adds support for searx search
superagi/helper/json_cleaner.py
Outdated
@@ -51,7 +51,7 @@ def extract_json_section(cls, input_str: str = ""):

     @classmethod
     def remove_escape_sequences(cls, string):
-        return string.encode('utf-8').decode('unicode_escape').encode('raw_unicode_escape').decode('utf-8')
+        return string.encode('utf-8').decode('unicode_escape')
Please revert this to the old change. This is required for non-English encodings.
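For context, a small sketch (illustrative, not part of the PR) of why the extra round-trip matters for non-English text:

```python
# decode('unicode_escape') interprets the UTF-8 bytes as Latin-1, so non-English
# text comes out as mojibake; re-encoding with 'raw_unicode_escape' and decoding
# as UTF-8 restores the original characters while still unescaping sequences.
s = '이름'  # Korean text containing no escape sequences

broken = s.encode('utf-8').decode('unicode_escape')
restored = broken.encode('raw_unicode_escape').decode('utf-8')

print(broken)         # mojibake (Latin-1 view of the UTF-8 bytes)
print(restored == s)  # True
```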
Updated reversion on line 54
fix misspelled word SERACH to SEARCH
Update config_template.yaml
@sirajperson Hey, can you let me know what to do after localhost:7860? I was able to set it up locally, but how do I choose and test with different models?
Can we keep the new docker-compose with the local llm separate from the current docker-compose file (something like docker-compose.local_llm.yaml)? We don't know how many devs want to run the local model directly by default. We can add a section in the readme for the local model.
If a local LLM URL is set in the config.yaml file, it uses the tgwui container, which can be managed from 127.0.0.1:7860. TGWUI is configured to run models in CPU-only mode by default, but it can be configured to run in other modes via the build option settings in docker-compose.yaml.
@TransformerOptimus Sure, it would be nice to be able to specify the use of local LLMs as a build arg. If we hand docker-compose something like --build-arg use_local_llm=true, then the compose executes the tgwui build.
In my last commit everything seems to be basically working. I've had 4 successful runs in the past 24 hours. I'll go ahead and separate the docker-compose files. Let me know if the last commit is working well on Mac. I'm on Linux.
@luciferlinx101 As of the last commit (Jun 10), to use local LLMs follow these steps:

Edit the config.yaml file and modify the relevant lines to match the model you plan on using. For llama-based models I have successfully been using 500 and 2048 respectively.

Run docker compose. Note that if you want to use more advanced features like loading models into GPUs, you will need to do additional configuration in the docker-compose.yaml file. I have tried to leave comments in the current file as basic instructions. For more information on specific text-generation-webui builds, I recommend reviewing the instructions in the text-generation-webui-docker GitHub repo: https://github.com/Atinoda/text-generation-webui-docker

After you have successfully built the containers, point your browser to 127.0.0.1:7860 and click on the Models tab. In the field "Download custom model or LoRA", enter the Hugging Face model identifier you would like to use, such as TheBloke/Vicuna-13B-CoT-GGML, then click Download. In the selection drop-down menu, select the model that you just downloaded and wait for it to load. Finally, point your browser to 127.0.0.1:3000 to begin using the agent. Cheers!

Please be aware that my fork is a development branch undergoing a PR and will be changing more soon. In other words, it isn't stable.
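A sketch of the kind of config.yaml edits being described (the keys and values below are assumptions pieced together from this thread, not a verbatim excerpt; the exact keys the 500/2048 token limits map to are not preserved in the comment above):

```yaml
# Point the openai client at the TGWUI container instead of api.openai.com
# (container name and port are taken from later comments in this thread).
# OPENAI_API_BASE: https://api.openai.com/v1
OPENAI_API_BASE: "http://super__tgwui:5001/v1"

# Use a model name whose token limit is defined in the config's token-limit map,
# e.g. the "llama": 2048 entry quoted elsewhere in this thread.
MODEL_NAME: "llama"
```

Then run `docker-compose up --build` and continue with the model download steps above.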
On Mac it is still failing. Getting this error. It seems to run fine on Ubuntu.
Ran the code change on https://lmsys.org/about/ and was able to get
Sounds good
@TransformerOptimus I'm modifying the default build to remove llama-cuda.
Changed build to default instead of cublas
Okay, I'll do a fresh clone and see if it works. I might just pull the MacBook off the shelf and spin it up to debug, although my Mac is an x86 from '18, so I'm not sure if I can reproduce. Let me know if commenting out the build line resolved the build issue.
Fix typo in agent_executor.py
Removed the installation video (temporarily)
…dme_removed_vid Update README.MD
renamed: docker-compose.yaml.bak -> local-llm
new file: local-llm-gpu
modified: superagi/helper/json_cleaner.py
renamed: DockerfileTGWUI -> tgwui/DockerfileTGWUI
deleted: tgwui/config/place-your-models-here.txt
deleted: tgwui/tgwui_requirements.txt
@TransformerOptimus There are three builds:

Default build: no local LLMs or OPENAI_API_BASE redirect.

Build with local LLM support: this mode uses memory caching by default. It isn't very fast, but it works on a much greater number of host machines than the GPU mode.

And finally, the more advanced GPU build: this build may require that additional packages be installed on the host machine. I would suggest that anyone trying to build a GPU install of the container read the Text Generation Web UI docs (TGWUI Docs). They are very informative and well written.

Please note that Text Generation Web UI is a large project with lots of functionality. It's worth taking time to get to know it in order to use local LLMs efficiently.
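Based on the compose files named in the commit above, the three builds would presumably be invoked along these lines (the `-f local-llm-gpu` form appears verbatim later in this thread; the other two commands are assumptions):

```bash
# Default build: OpenAI API only, no local LLM / OPENAI_API_BASE redirect.
docker-compose up --build

# Local LLM build (CPU, memory caching; slower but runs on most hosts).
docker-compose -f local-llm up --build

# Advanced GPU build; may require extra packages (drivers, CUDA) on the host.
docker-compose -f local-llm-gpu up --build
```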
@luciferlinx101 I spent quite a bit of time looking for what was causing a hang with the agent execution. Please look over the README.md file in the OpenAI plugin root folder of TGWUI. The openai plugin is currently under development and is not yet fully implemented. This PR is for integration of TGWUI as a model management tool. The issue might be more easily resolved by creating an issue on the project's issue tracker about not being able to sequence API calls to the OpenAI API plugin. The readme can be found here.
@sirajperson Sure, I will test and let you know if it works properly.
Merging it to dev. We will merge dev -> main along with the other changes tomorrow. I was able to get it running on Windows and in an Ubuntu VM.
@sirajperson I was also able to get to the tool at http://localhost:7860/ and download the model, but we want a proper readme after this step that can be used to run this local model as part of SuperAGI, such that people see in the frontend that the agent actually runs on this local model, not OpenAI. This will help all users once we merge dev to main. The minimum requirement for the README is to showcase setup and use with at least one particular model, end to end, with SuperAGI.
@luciferlinx101
I'll go ahead and get to the bottom of that bug. Commenting out the line changes the Dockerfile that is present in the TGWUI docker GitHub project. I would like to avoid changing files and try to keep it the same; that way, instead of having additional folders, files, and configurations down the line, we can just do a git pull for the repo, so long as it's maintained. For now, I will go ahead and comment it out until the issue is investigated further.
This is great! I am familiar with Oobabooga and developed a little extension for it. A few observations:
Hey, any updates on the stepwise readme showing an end-to-end use case with SuperAGI?
@luciferlinx101
Adding encouragement for @sirajperson: more of us are watching for the steps to successfully set this up.
No problem! Let me know if it is done.
I'm running this for Ooba:
malicor@DESKTOP-I087DO5:/mnt/d/ai/oobabooga_WSL/text-generation-webui$ python3 server.py --wbits 4 --groupsize 128 --model_type llama --model WizardLM-7B-uncensored-GPTQ --api --extensions long_term_memory, EdgeGPT --no-stream
I can use Ooba via http://localhost:7860/ and if I ask it "how many legs does a spider have", it gives me the correct answer.
I started SuperAGI with:
D:\AI\SuperAGI>docker-compose -f local-llm-gpu up --build
It did a lot of installing and ended up with:
superagi-gui-1 | - ready started server on 0.0.0.0:3000, url: http://localhost:3000
In the config.yaml file I set this:
# For locally hosted LLMs comment out the next line and uncomment the one after
# to configure a local llm point your browser to 127.0.0.1:7860 and click on the model tab in text generation web ui.
#OPENAI_API_BASE: https://api.openai.com/v1
"gpt-3.5-turbo-0301": 4032, "gpt-4-0314": 8092, "gpt-3.5-turbo": 4032, "gpt-4": 8092, "llama": 2048, "mpt-7b-storywriter": 45000
MODEL_NAME: "gpt-3.5-turbo-0301"
When I now start http://localhost:3000/, I get the purple SuperAGI screen saying "Initializing SuperAGI", but it's stuck there. Any idea what I'm doing wrong?
It would be fantastic to have a tutorial that goes over how to set up open-source LLMs with SuperAGI. I would love to do test runs on my own hardware to see if my configuration looks good, then take it out to the more powerful LLMs after I have confirmed that I am close to where I need to be with my instructions. Also, we could save some work by pointing to the oobabooga installation instructions for their part... but hooking into SuperAGI seems to be the part I am missing here.
Yes. The IP that you're using is for the docker image. Since you aren't running docker to use TGWUI, you are going to need to change the line OPENAI_API_BASE: "http://super__tgwui:5001/v1" to point at http://[your host machine's LAN address]:5001/v1. Then you are going to want to either create a port forward from your LAN interface to your loopback interface, or just run TGWUI on your LAN interface too. You can do that by setting the listen address to 0.0.0.0.
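A sketch of what that might look like, based on the server command quoted earlier in this thread (the --listen and --extensions flags are standard text-generation-webui options, but verify them against your version; the openai extension's default port of 5001 is an assumption consistent with the URL above):

```bash
# Same command as before, with two changes: --listen binds to 0.0.0.0 so the
# SuperAGI containers can reach it over the LAN, and the openai extension
# exposes an OpenAI-compatible API (on port 5001 by default).
python3 server.py --wbits 4 --groupsize 128 --model_type llama \
    --model WizardLM-7B-uncensored-GPTQ \
    --listen --extensions openai
```

Then set OPENAI_API_BASE in config.yaml to `http://<your LAN address>:5001/v1`.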
Tried everything; it says it cannot connect to OpenAI. I have OPENAI_API_BASE: "http://localhost:5000/api". Connecting with SillyTavern to the API of oobabooga works, but I cannot make it work with SuperAGI. Any thoughts? Thanks in advance. Error I receive: superagi-celery-1 | 2023-07-04 15:15:09 UTC - Super AGI - INFO - [/app/superagi/llms/openai.py:79] - Exception:
@DiamondGlassDrill Port 5000 is the Text Generation Web UI API, which differs from the OpenAI API endpoints. You will need to enable the openai extension to work with local LLM models. It can get tricky because not every model is compatible with instruct. You would also need to set up the correct template for prompting.
Be sure to check the LAN IP address of your computer; if it is 192.168.1.100, for example, then in the config.yaml file you would set the OpenAI base setting as such:
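(A sketch of that setting; the example LAN address comes from the comment above, and port 5001 is the openai extension's default mentioned earlier in the thread.)

```yaml
OPENAI_API_BASE: "http://192.168.1.100:5001/v1"
```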
There's a wide array of configurable settings for local LLMs. To manage the model, you can navigate to localhost:7860 and click on the Models tab.
@sirajperson A lot of people have been asking for a guide on how to use SuperAGI with local LLMs. Can you please help us out with a README.md? I think @luciferlinx101 also mentioned this a few weeks back.
I tried to make this work, at least to be able to access the minimal UI, but it doesn't seem to work. I used the main SuperAGI branch and I ran it with
I want to try nous-hermes 13B GPTQ SuperHOT with exllama_hf; I had a good experience trying it directly in Oobabooga's UI.
How about adding Petals support to take the load off local resources? https://github.com/petals-infra/chat.petals.dev
Is there any progress on adding support for locally running instances of tgwui? I would love to use the version I already have installed instead of using the dockerized version.
In this PR I have integrated text-generation-webui (TGWUI) as a means of managing locally hosted LLMs.
In this PR the changes are as follows:
Created a setting for OPENAI_API_BASE_URL: this allows one to set the URL that the openai library is pointed to.
Created a docker image of Text Generation Web UI that includes multi-GPU offloading of GGMLs.
Configured SuperAGI to use the TGWUI docker image by default.
With this PR one can run the docker-compose up --build command, then navigate to localhost:7860 to download models from huggingface.co for use with SuperAGI.
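As an illustration of what the OPENAI_API_BASE_URL setting enables (a sketch using the pre-1.0 openai Python package; the endpoint, model name, and message are illustrative, not SuperAGI's actual wiring):

```python
# Sketch: point the openai client at an OpenAI-compatible local endpoint
# (e.g. text-generation-webui's openai extension) instead of api.openai.com.
import openai

openai.api_base = "http://super__tgwui:5001/v1"  # value taken from config.yaml
openai.api_key = "sk-dummy"  # local endpoints generally ignore the key

response = openai.ChatCompletion.create(
    model="llama",  # model name as configured in config.yaml (illustrative)
    messages=[{"role": "user", "content": "Hello from a locally hosted LLM"}],
)
print(response["choices"][0]["message"]["content"])
```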