feat: Add compatibility for Openrouter Endpoints #69

Open · wants to merge 23 commits into base: main
Commits (23):
28850c7
feat: starting web app with uv and without an OpenAI API key
akshaypardhanani Aug 4, 2025
92e473a
feat: added openai adapter class that can be used with the app
akshaypardhanani Aug 7, 2025
f64cd1c
feat: key validates in front end
akshaypardhanani Aug 7, 2025
f9e9f2a
feat: embeddings get computed through self hosted model now
akshaypardhanani Aug 9, 2025
a9899fc
feat: getting a response now, need to parse it
akshaypardhanani Aug 9, 2025
2d59bff
feat: increasing max allowed tokens getting a response from the model…
akshaypardhanani Aug 9, 2025
9306598
feat: tested with a notebook and a streamlit app
akshaypardhanani Aug 9, 2025
26adb7d
Merge pull request #1 from akshaypardhanani/feat/add-support-for-open…
akshaypardhanani Aug 9, 2025
2a7c91d
feat: added maus haus demo notebook with open router
akshaypardhanani Aug 10, 2025
6bdcf8e
chore: update notebook
akshaypardhanani Aug 10, 2025
9c2b79a
feat: add EPM demo
akshaypardhanani Aug 10, 2025
21f5973
feat: added MABe demo with open router
akshaypardhanani Aug 10, 2025
1e350d0
Merge pull request #2 from akshaypardhanani/feat/add-openrouter-demo-…
akshaypardhanani Aug 10, 2025
2a49a40
fix: Add streamlit drawable canvas locally so that the streamlit app …
akshaypardhanani Aug 12, 2025
84a32da
fix: switched to qwen coder since thudm is not in the free tier anymore
akshaypardhanani Aug 12, 2025
b55b3c4
chore: update readme
akshaypardhanani Aug 12, 2025
f1f59ac
Merge pull request #3 from akshaypardhanani/fix/streamlit-examples-no…
akshaypardhanani Aug 12, 2025
d0b740d
chore: fix paths in readme
akshaypardhanani Aug 12, 2025
f48c699
chore: fixing tests
akshaypardhanani Aug 12, 2025
24de5a0
chore: update test project creation
akshaypardhanani Aug 12, 2025
c96c1a4
fix: parsing vlm output when it contains multiple json objects
akshaypardhanani Aug 13, 2025
7f6ac93
fix: change model for superanimal since kimi activates too few parame…
akshaypardhanani Aug 13, 2025
bad757a
Merge pull request #4 from akshaypardhanani/fix/superanimal-test
akshaypardhanani Aug 13, 2025
8 changes: 8 additions & 0 deletions .gitignore
@@ -46,3 +46,11 @@ MANIFEST
*.pyc
__pycache__/
*~

digest.txt
uv.lock
.vscode/launch.json
amadeusgpt/modules_embedding.pickle
temp_answer.json

logs/
7 changes: 7 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,7 @@
{
"python.testing.pytestArgs": [
"tests"
],
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true
}
2 changes: 1 addition & 1 deletion Makefile
@@ -1,4 +1,4 @@
export streamlit_app=True
app:

streamlit run amadeusgpt/app.py --server.fileWatcherType none --server.maxUploadSize 1000
uv run --with streamlit streamlit run amadeusgpt/app.py --server.fileWatcherType none --server.maxUploadSize 1000
73 changes: 62 additions & 11 deletions README.md
@@ -32,32 +32,83 @@ In our original work (NeurIPS 2023) we used GPT3.5 and GPT4 as part of our agent

## Get started: install AmadeusGPT🎻

### [1] You will need an openAI key:
### [1] You will need an API key (OpenAI or OpenRouter):

**Why OpenAI API Key is needed** AmadeusGPT relies on API calls of OpenAI (we will add more LLM options in the future) for language understanding and code writing. Sign up for a [openAI API key](https://platform.openai.com/account/api-keys) [here](https://platform.openai.com/account/api-keys).
**Why an API Key is needed** AmadeusGPT relies on API calls to language models for understanding natural language and generating code. You can use either OpenAI's models directly or access a wider variety of models through OpenRouter.

Then, you can add this into your environment by passing the following in the terminal after you launched your conda env:
#### Option A: OpenAI API Key
Sign up for an [OpenAI API key](https://platform.openai.com/account/api-keys) to use GPT-4, GPT-4o, and other OpenAI models.

```bash
export OPENAI_API_KEY='your API key'
#### Option B: OpenRouter API Key
Sign up for an [OpenRouter API key](https://openrouter.ai/keys) to access a wide variety of models from different providers. OpenRouter offers:
- **Pricing flexibility**: Choose from free models or pay-per-use options. See [OpenRouter pricing](https://openrouter.ai/pricing) for model costs.
- **Rate limits**: Check [OpenRouter rate limits](https://openrouter.ai/docs/limits) for usage restrictions.
- **Model variety**: Access models from OpenAI, Anthropic, Google, Meta, and more.

#### Setting up your API key:

**Option 1: .env file (recommended)**
Create a `.env` file in the repository root and add:
```
OPENAI_API_KEY=your_openai_api_key
# OR
OPENROUTER_API_KEY=your_openrouter_api_key
```

Or inside a python script or Jupyter Notebook, add this if you did not pass at the terminal stage:
For Jupyter Notebooks, use the .env file approach and load it:
```python
from dotenv import load_dotenv
load_dotenv() # This loads the .env file automatically
```

**Option 2: Environment variables**
```bash
# For OpenAI
export OPENAI_API_KEY='your_openai_api_key'

# For OpenRouter
export OPENROUTER_API_KEY='your_openrouter_api_key'
```

**Option 3: Python script inline**
```python
import os
os.environ["OPENAI_API_KEY"] = 'your api key'
# For OpenAI
os.environ["OPENAI_API_KEY"] = 'your_openai_api_key'

# For OpenRouter
os.environ["OPENROUTER_API_KEY"] = 'your_openrouter_api_key'
```

#### Configuring models:
OpenRouter models can be specified in the configuration files located at `amadeusgpt/configs/<example_type>.yaml` under the `llm_info` section:
```yaml
llm_info:
  gpt_model: "qwen/qwen3-coder:free" # Example OpenRouter model
  max_tokens: 20000
```

### [2] Set up a conda environment:
Supported models and their pricing are defined in the [`LLM` class](amadeusgpt/analysis_objects/llm.py#L23-L27). To add a new model, update the `prices` dictionary with the model name and its input/output costs per token.
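For illustration, a minimal sketch of registering an extra model; the model name and USD-per-token rates below are placeholders, not real OpenRouter prices:

```python
from amadeusgpt.analysis_objects.llm import LLM

# Hypothetical entry; substitute a real model name and its published rates
LLM.prices["example-provider/example-model"] = {
    "input": 0.5 / 10**6,   # placeholder input cost per token
    "output": 1.5 / 10**6,  # placeholder output cost per token
}
```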

### [2] Set up your Python environment:

Conda is an easy-to-use Python interface that supports launching [Jupyter Notebooks](https://jupyter.org/). If you are completely new to this, we recommend checking out the [docs here for getting conda installed](https://deeplabcut.github.io/DeepLabCut/docs/beginner-guides/beginners-guide.html#beginner-user-guide). Otherwise, proceed to use one of [our supplied conda files](https://github.com/AdaptiveMotorControlLab/AmadeusGPT/tree/main/conda). As you will see we have minimal dependencies to get started, and [here is a simple step-by-step guide](https://deeplabcut.github.io/DeepLabCut/docs/installation.html#step-2-build-an-env-using-our-conda-file) you can reference for setting it up (or see [BONUS](README.md#bonus---customized-your-conda-env) below). Here is the quick start command:
[uv](https://github.com/astral-sh/uv) is a fast Python package manager by Astral. Install uv first:

```bash
conda env create -f amadeusGPT.yml
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or on Windows: powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```
To note, some modules AmadeusGPT can use benefit from GPU support, therefore we recommend also having an NVIDIA GPU available and installing CUDA.

Then install AmadeusGPT:
```bash
# Clone the repository and install
git clone https://github.com/AdaptiveMotorControlLab/AmadeusGPT.git
cd AmadeusGPT
uv sync
```

**Note:** Some modules AmadeusGPT can use benefit from GPU support, so we recommend having an NVIDIA GPU available and installing CUDA.


### [3] 🪄 That's it! Now you have AmadeusGPT installed!
1 change: 1 addition & 0 deletions amadeusgpt/__init__.py
@@ -4,6 +4,7 @@
SOURCE CODE: https://github.com/AdaptiveMotorControlLab/AmadeusGPT
Apache-2.0 license
"""
import streamlit as st

from matplotlib import pyplot as plt

54 changes: 38 additions & 16 deletions amadeusgpt/analysis_objects/llm.py
@@ -10,9 +10,12 @@
import numpy as np
import openai
from openai import OpenAI
from pydantic import ValidationError

from amadeusgpt.programs.sandbox import Sandbox
from amadeusgpt.system_prompts.visual_llm import VlmInferenceOutput
from amadeusgpt.utils import AmadeusLogger, QA_Message, create_qa_message
from amadeusgpt.utils.openai_adapter import OpenAIAdapter

from .base import AnalysisObject

@@ -22,6 +25,7 @@ class LLM(AnalysisObject):
prices = {
"gpt-4o": {"input": 5 / 10**6, "output": 15 / 10**6},
"gpt-4o-mini": {"input": 0.15 / 10**6, "output": 0.6 / 10**6},
"qwen/qwen3-coder:free": {"input": 0, "output": 0},
}
total_cost = 0

@@ -65,17 +69,17 @@ def connect_gpt_oai_1(self, messages, **kwargs):
This is routed to openai > 1.0 interfaces
"""

if self.config.get("use_streamlit", False):
if "OPENAI_API_KEY" in os.environ:
openai.api_key = os.environ["OPENAI_API_KEY"]
else:
openai.api_key = os.environ["OPENAI_API_KEY"]
# if self.config.get("use_streamlit", False):
# if "OPENAI_API_KEY" in os.environ:
# openai.api_key = os.environ["OPENAI_API_KEY"]
# else:
# openai.api_key = os.environ["OPENAI_API_KEY"]
response = None
# gpt_model is default to be the cls.gpt_model, which can be easily set
gpt_model = self.gpt_model
# gpt_model = self.gpt_model
# in streamlit app, "gpt_model" is set by the text box

client = OpenAI()
client = OpenAIAdapter().get_client()

if self.config.get("use_streamlit", False):
if "gpt_model" in st.session_state:
@@ -89,7 +93,7 @@ def connect_gpt_oai_1(self, messages, **kwargs):

# the usage was recorded from the last run. However, since we have many LLMs that
# share the call of this function, we will need to store usage and retrieve them from the database class
num_retries = 3
num_retries = 1
for _ in range(num_retries):
try:
json_data = {
@@ -259,19 +263,37 @@ def speak(self, sandbox: Sandbox, image: np.ndarray):
multi_image_content=multi_image_content,
in_place=True,
)
response = self.connect_gpt(self.context_window, max_tokens=2000)
response = self.connect_gpt(self.context_window, max_tokens=20000)
text = response.choices[0].message.content.strip()

print("description of the image frame provided")
print(text)

thinking_pattern = r'<think>.*?</think>'
output_text = re.sub(thinking_pattern, '', text, flags=re.DOTALL)

print(f"output text after removing thinking: {output_text}")

pattern = r"```json(.*?)```"
if len(re.findall(pattern, text, re.DOTALL)) == 0:
raise ValueError("can't parse the json string correctly", text)
if len(re.findall(pattern, output_text, re.DOTALL)) == 0:
raise ValueError("can't parse the json string correctly", output_text)
else:
json_string = re.findall(pattern, text, re.DOTALL)[0]
json_obj = json.loads(json_string)
return json_obj
results = []
for response_json in re.findall(pattern, output_text, re.DOTALL):
try:
json_obj = json.loads(response_json)
VlmInferenceOutput.model_validate(json_obj)
results.append(json_obj)
except ValidationError as val_err:
print(f"Couldn't validate the json string correctly for {response_json}", val_err)
except Exception as e:
print(f"Couldn't parse the json string correctly for {response_json}", e)
raise e
if len(results) == 0:
raise ValueError("can't parse the json string correctly", output_text)
elif len(results) > 1:
print("WARNING!! Found multiple json strings. Returning only the first", results)
return results[0]


class CodeGenerationLLM(LLM):
@@ -319,7 +341,7 @@ def speak(

self.update_history("user", query)

response = self.connect_gpt(self.context_window, max_tokens=2000)
response = self.connect_gpt(self.context_window, max_tokens=20000)
text = response.choices[0].message.content.strip()
# need to keep the memory of the answers from LLM
self.update_history("assistant", text)
@@ -374,7 +396,7 @@ def speak(self, qa_message):
Can you correct the code? Make sure you only write one function which is the updated function.
"""
self.update_history("user", query)
response = self.connect_gpt(self.context_window, max_tokens=4096)
response = self.connect_gpt(self.context_window, max_tokens=20000)
text = response.choices[0].message.content.strip()
print(text)
pattern = r"```python(.*?)```"
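For context on the `speak` changes above, here is a standalone sketch of the `<think>`-stripping and fenced-JSON extraction the new parsing performs; the raw reply string is fabricated for illustration:

```python
import json
import re

# Fabricated VLM reply: a reasoning block followed by a fenced JSON answer
raw = ("<think>model reasoning tokens...</think>\n"
       "```json\n"
       '{"description": "two mice in an arena", "individuals": 2,'
       ' "species": "topview_mouse", "background_objects": []}\n'
       "```")

# Some OpenRouter-hosted models emit <think>...</think> blocks; drop them first
text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)

# Collect every fenced ```json ... ``` block and parse each candidate
candidates = [json.loads(block) for block in re.findall(r"```json(.*?)```", text, re.DOTALL)]
print(candidates[0]["species"])  # -> topview_mouse
```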
10 changes: 7 additions & 3 deletions amadeusgpt/app.py
@@ -1,9 +1,7 @@
import os
import traceback

import streamlit as st

from amadeusgpt import app_utils
from amadeusgpt import app_utils, st
from amadeusgpt.utils import validate_openai_api_key

# Set page configuration
@@ -21,6 +19,10 @@ def main():
if "exist_valid_openai_api_key" not in st.session_state:
if "OPENAI_API_KEY" in os.environ:
st.session_state["exist_valid_openai_api_key"] = True
st.session_state["OPENAI_API_KEY"] = os.environ["OPENAI_API_KEY"]
elif "OPENROUTER_API_KEY" in os.environ:
st.session_state["exist_valid_openai_api_key"] = True
st.session_state["OPENROUTER_API_KEY"] = os.environ["OPENROUTER_API_KEY"]
else:
st.session_state["exist_valid_openai_api_key"] = False

@@ -30,6 +32,8 @@ def valid_api_key():
print("inside valid api key function")
if "OPENAI_API_KEY" in os.environ:
api_token = os.environ["OPENAI_API_KEY"]
elif "OPENROUTER_API_KEY" in os.environ:
api_token = os.environ["OPENROUTER_API_KEY"]
else:
api_token = st.session_state["openAI_token"]
check_valid = validate_openai_api_key(api_token)
2 changes: 2 additions & 0 deletions amadeusgpt/configs/EPM_template.yaml
@@ -9,6 +9,8 @@ keypoint_info:
object_info:
  load_objects_from_disk: false
llm_info:
  gpt_model: "qwen/qwen3-coder:free"
  max_tokens: 20000
  keep_last_n_messages: 2
video_info:
  scene_frame_number: 100
2 changes: 2 additions & 0 deletions amadeusgpt/configs/Horse_template.yaml
@@ -7,6 +7,8 @@ keypoint_info:
  nose: "nose"
  neck: "neck"
llm_info:
  gpt_model: "qwen/qwen3-coder:free"
  max_tokens: 20000
  keep_last_n_messages: 2
object_info:
  load_objects_from_disk: false
2 changes: 2 additions & 0 deletions amadeusgpt/configs/MABe_template.yaml
@@ -7,6 +7,8 @@ keypoint_info:
  nose: "nose"
  neck: "neck"
llm_info:
  gpt_model: "qwen/qwen3-coder:free"
  max_tokens: 20000
  keep_last_n_messages: 2
object_info:
  load_objects_from_disk: false
2 changes: 2 additions & 0 deletions amadeusgpt/configs/MausHaus_template.yaml
@@ -7,6 +7,8 @@ keypoint_info:
  nose: "nose"
  neck: "neck"
llm_info:
  gpt_model: "qwen/qwen3-coder:free"
  max_tokens: 20000
  keep_last_n_messages: 2
object_info:
  load_objects_from_disk: false
12 changes: 7 additions & 5 deletions amadeusgpt/integration_module_hub.py
@@ -1,17 +1,19 @@
import os
import pickle

from openai import OpenAI
from amadeusgpt.utils.openai_adapter import OpenAIAdapter
from sklearn.metrics.pairwise import cosine_similarity

from amadeusgpt import st
from amadeusgpt.utils.api_key_util import get_api_key
from amadeusgpt.programs.api_registry import INTEGRATION_API_REGISTRY

client = OpenAI()


class IntegrationModuleHub:
def __init__(self):
self.amadeus_root = os.path.dirname(os.path.realpath(__file__))
self.client = OpenAIAdapter(
api_key=get_api_key(st.session_state)).get_client()

def save_embeddings(self):
result = {}
@@ -20,7 +22,7 @@ def save_embeddings(self):
docstring = module_info["description"]
text = docstring.replace("\n", " ")
embedding = (
client.embeddings.create(input=[text], model=model).data[0].embedding
self.client.embeddings.create(input=[text], model=model).data[0].embedding
)
result[module_name] = embedding
if len(result) > 0:
@@ -35,7 +37,7 @@ def match_module(self, query):
model = "text-embedding-3-small"

query_embedding = (
client.embeddings.create(input=[query], model=model).data[0].embedding
self.client.embeddings.create(input=[query], model=model).data[0].embedding
)

if not os.path.exists(
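To make the retrieval step concrete, a toy sketch of the embedding-based matching that `match_module` performs; the module names and vectors are random placeholders standing in for stored `text-embedding-3-small` outputs (1536-dimensional):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder embeddings standing in for pickled module docstring vectors
module_embeddings = {
    "gait_analysis": np.random.rand(1, 1536),
    "object_tracking": np.random.rand(1, 1536),
}
query_embedding = np.random.rand(1, 1536)  # would come from client.embeddings.create(...)

# Pick the module whose docstring embedding is closest to the query
best_module = max(
    module_embeddings,
    key=lambda name: cosine_similarity(query_embedding, module_embeddings[name])[0, 0],
)
print(best_module)
```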
9 changes: 9 additions & 0 deletions amadeusgpt/system_prompts/visual_llm.py
@@ -1,3 +1,12 @@
from pydantic import BaseModel
from typing import List, Literal

class VlmInferenceOutput(BaseModel):
description: str
individuals: int
species: Literal["topview_mouse", "sideview_quadruped", "others"]
background_objects: List[str]

def _get_system_prompt():
system_prompt = """
Describe what you see in the image and fill in the following json string:
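A quick usage sketch of the new `VlmInferenceOutput` schema; both payloads below are fabricated:

```python
from pydantic import ValidationError

from amadeusgpt.system_prompts.visual_llm import VlmInferenceOutput

# A well-formed payload validates cleanly
VlmInferenceOutput.model_validate({
    "description": "a single mouse viewed from above",
    "individuals": 1,
    "species": "topview_mouse",
    "background_objects": ["arena wall"],
})

# A malformed payload (unknown species literal, missing fields) raises
try:
    VlmInferenceOutput.model_validate({"description": "?", "species": "bird"})
except ValidationError as err:
    print(err)
```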
10 changes: 3 additions & 7 deletions amadeusgpt/utils/__init__.py
@@ -9,6 +9,7 @@
from amadeusgpt.analysis_objects.event import Event
from amadeusgpt.logger import AmadeusLogger
from IPython.display import Markdown, Video, display, HTML
from amadeusgpt.utils.openai_adapter import OpenAIAdapter

def filter_kwargs_for_function(func, kwargs):
sig = inspect.signature(func)
@@ -36,13 +37,8 @@ def parse_error_message_from_python():
return traceback_str

def validate_openai_api_key(key):
import openai
openai.api_key = key
try:
openai.models.list()
return True
except openai.AuthenticationError:
return False
client = OpenAIAdapter(key)
return client.validate()

def flatten_tuple(t):
"""
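The `OpenAIAdapter` class itself does not appear in this diff view; below is a minimal sketch of the interface the changes above rely on (`get_client()` and `validate()`), assuming OpenRouter's OpenAI-compatible endpoint. The key-resolution and routing logic here is an assumption for illustration, not necessarily the PR's actual implementation:

```python
import os

from openai import OpenAI

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"  # OpenRouter's OpenAI-compatible endpoint


class OpenAIAdapter:
    """Hypothetical sketch of the adapter interface used in this PR."""

    def __init__(self, api_key: str | None = None):
        # Assumption: fall back to whichever key is present in the environment
        self.api_key = (
            api_key
            or os.environ.get("OPENAI_API_KEY")
            or os.environ.get("OPENROUTER_API_KEY")
        )
        # Assumption: route to OpenRouter only when no OpenAI key is available
        use_openrouter = (
            "OPENAI_API_KEY" not in os.environ and "OPENROUTER_API_KEY" in os.environ
        )
        self.base_url = OPENROUTER_BASE_URL if use_openrouter else None

    def get_client(self) -> OpenAI:
        # base_url=None makes the client default to api.openai.com
        return OpenAI(api_key=self.api_key, base_url=self.base_url)

    def validate(self) -> bool:
        try:
            self.get_client().models.list()
            return True
        except Exception:
            return False
```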