
Commit d1f2343

Author: GAIA Framework Bot
Initial commit of repo (0 parents)

File tree

18 files changed: +1598 −0 lines

.gitattributes

Lines changed: 2 additions & 0 deletions
```
# Auto detect text files and perform LF normalization
* text=auto
```

.github/workflows/build.yml

Lines changed: 59 additions & 0 deletions
```yaml
name: CI

on:
  push:
    branches: [ main, develop, feature/* ]

jobs:
  build-ts-examples:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Use Node.js 18.x
        uses: actions/setup-node@v3
        with:
          node-version: 18.x

      - name: Install dependencies
        run: |
          cd examples/typescript
          yarn install --frozen-lockfile

      - name: Build
        run: |
          cd examples/typescript
          yarn build

  docker-build-environments:
    runs-on: ubuntu-latest
    needs: build-ts-examples
    strategy:
      matrix:
        environment: [jupyter]
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          fetch-depth: 2

      - id: filter
        uses: dorny/paths-filter@v2
        with:
          base: ${{ github.ref }}
          list-files: shell
          filters: |
            jupyter:
              - 'environments/jupyter/**'

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      # Only build the image if there's a change in the folder
      - name: Build environment Docker image
        if: steps.filter.outputs[matrix.environment] == 'true'
        uses: docker/build-push-action@v2
        with:
          context: ./environments/${{ matrix.environment }}
          push: false
```

.gitignore

Lines changed: 127 additions & 0 deletions
```
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*
.pnpm-debug.log*

# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage
*.lcov

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# Snowpack dependency directory (https://snowpack.dev/)
web_modules/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional stylelint cache
.stylelintcache

# Microbundle cache
.rpt2_cache/
.rts2_cache_cjs/
.rts2_cache_es/
.rts2_cache_umd/

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variable files
.env
.env.development.local
.env.test.local
.env.production.local
.env.local

# parcel-bundler cache (https://parceljs.org/)
.cache
.parcel-cache

# Next.js build output
.next
out

# Nuxt.js build / generate output
.nuxt
dist

# Gatsby files
.cache/
# Comment in the public line in if your project uses Gatsby and not Next.js
# https://nextjs.org/blog/next-9-1#public-directory-support
# public

# vuepress build output
.vuepress/dist

# vuepress v2.x temp and cache directory
.temp
.cache

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# TernJS port file
.tern-port

# Stores VSCode versions used for testing VSCode extensions
.vscode-test

# yarn v2
.yarn/cache
.yarn/unplugged
.yarn/build-state.yml
.yarn/install-state.gz
.pnp.*
```

LICENSE

Lines changed: 21 additions & 0 deletions
```
MIT License

Copyright (c) 2024 GAIA Framework

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

README.md

Lines changed: 169 additions & 0 deletions
# GAIA Framework - Code Interpreter Tool

An open-source version of the flow used to create a basic Python code-interpreter tool. It is used similarly within the currently unreleased `gaia-framework` and can be run and tested locally using Docker.

This tool aims to provide basic functionality similar to the `code interpreter` tool that ChatGPT uses to execute Python code. In this example, it runs as a non-stateful Jupyter notebook environment that can execute Python code (including with internet access) and persist files to local, blob, or other storage locations depending on the setup. It can easily be extended to support stateful execution as needed, and is left as a non-stateful tool here for simplicity.

**Models tested**:

- `GPT` models that support tool execution, though it works best with `gpt-4` models

- `claude-3` models that support tool execution using the new Tools Beta, though some additional prompting in the system instructions is required to get Claude to perform similarly to GPT

**Examples using the tool**:

- Currently only an example with `GPT` using the [OpenAI Node API](https://www.npmjs.com/package/openai)

- See the [Running Examples](#running-examples) section

**How the tool works**:

An LLM such as GPT or Claude decides to call the code-interpreter tool and passes either generated or user-provided code as an argument to the tool. The tool then executes it as a script in a short-lived Docker container, spun up only to execute the script and removed afterwards.

The executed script is written in Python and saved to a temp directory, which is mounted to the container on startup and executed in a Jupyter notebook environment like below:

```typescript
const imageName = 'jupyter-runtime';
const executionPath = `/app/${notebookName}.ipynb`;
const outputPath = `/app/${notebookName}_output.ipynb`;
const dockerCommand = [
  "docker run --rm",
  `-v "${tmpDir}:/app"`,
  `-v "${outputDir}:/mnt/data"`, // Used to save and persist user files & output
  imageName,
  `/bin/bash -c "xvfb-run -a jupyter nbconvert --to notebook --execute ${executionPath} --output ${outputPath} && cat ${outputPath}"`
].join(" ");
```
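
Before that command runs, the script has to be saved as a notebook file in the temp directory. A minimal sketch of that wrapping step, following the standard nbformat v4 schema (the `toNotebook` helper name and exact metadata are assumptions, not the repo's actual code):

```typescript
// Hypothetical helper: wrap a python script in a minimal one-cell .ipynb
// document so `jupyter nbconvert --execute` can run it.
function toNotebook(pythonCode: string): string {
  const notebook = {
    cells: [
      {
        cell_type: "code",
        execution_count: null,
        metadata: {},
        outputs: [],
        // nbformat stores cell source as a list of lines, newlines kept
        source: pythonCode.split(/(?<=\n)/),
      },
    ],
    metadata: { kernelspec: { name: "python3", display_name: "Python 3" } },
    nbformat: 4,
    nbformat_minor: 5,
  };
  return JSON.stringify(notebook, null, 2);
}
```

The resulting JSON string would then be written to `${tmpDir}/${notebookName}.ipynb` before invoking the docker command above.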

- A Jupyter Notebook environment is used to execute the script because it provides a convenient and automated way to run Python code, retrieve outputs, and persist files using an LLM

- Because .ipynb files are JSON and their outputs are structured after execution, we can take advantage of that and use the notebook as an execution environment, parsing its output to understand the results and drive post-execution processing

- This allows use cases such as accessing persisted user files, image data, or other output data, similar to ChatGPT's code interpreter, and returning the results back to the LLM

- Depending on the `output_type` of a notebook cell, we can access paths to persisted files such as `/mnt/data/test.txt`, or raw base64 image data directly, which can then be retrieved and uploaded to blob storage or other options

- Docker must already be running on the machine where the code executes

- Depending on the system/architecture where the tool runs, options like Azure Container Instances (ACI) or others can be used in place of Docker

- One approach would be an orchestration service/tool that determines whether to use Docker, ACI, or some other provider depending on a parameter or on local vs. production environment

- Any files persisted to `/mnt/data/` are shared with the output directory. Since this example is a non-stateful Jupyter environment, that output directory is used to retrieve and upload persisted items to external storage (or they can be accessed from the mounted output directory itself, if accessible)

- If the tool is changed to run in a stateful environment, when and how files persisted to `/mnt/data/` are uploaded to external storage can be adjusted as preferred

- **Security Considerations**:

  When implementing this tool in your own projects, consider the following security measures:

  - **Input Sanitization:** Ensure all user inputs are sanitized to prevent injection attacks

  - **Execution Environment:** Execute code within a secure, isolated sandbox environment

  - **Resource Limits:** Set strict limits on CPU, memory, and execution time to avoid system strain

  - **Feature Restrictions:** Disable unnecessary features to minimize potential attack surfaces

  - **Error Handling:** Configure error handling to avoid revealing sensitive information

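The `output_type`-based parsing described above can be sketched roughly as follows. The interfaces and the `collectOutputs` helper are illustrative, not the repo's actual code, and cover only the standard nbformat output shapes:

```typescript
// Minimal shapes for the parts of an executed .ipynb document we read here.
interface NotebookOutput {
  output_type: string;
  text?: string[];                              // stream output lines
  data?: { [mime: string]: string | string[] }; // rich output payloads
}

interface NotebookCell {
  cell_type: string;
  outputs?: NotebookOutput[];
}

// Collect printed text and base64 PNG payloads from every code cell.
function collectOutputs(notebook: { cells: NotebookCell[] }): {
  text: string[];
  images: string[];
} {
  const text: string[] = [];
  const images: string[] = [];
  for (const cell of notebook.cells) {
    if (cell.cell_type !== "code" || !cell.outputs) continue;
    for (const out of cell.outputs) {
      if (out.output_type === "stream" && out.text) {
        text.push(out.text.join(""));
      } else if (
        (out.output_type === "display_data" || out.output_type === "execute_result") &&
        out.data?.["image/png"]
      ) {
        const png = out.data["image/png"];
        images.push(Array.isArray(png) ? png.join("") : png);
      }
    }
  }
  return { text, images };
}
```

From here, the collected base64 images or referenced `/mnt/data/` paths can be uploaded to whatever storage the setup uses.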
## Example flow

1. User prompts the LLM to create a script and execute it

2. LLM decides to call the tool

3. LLM executes the tool with the code as the input, and a result is created based on the notebook response

4. (Optional) Before sending results to the LLM, parse the notebook output and upload any persisted files to local or external storage

   - Alternatively, an LLM can be prompted in its system instructions or user message to call another tool that uploads files persisted in the environment to local/external storage

   - This could be done using the mounted output folder or, if stateful, by accessing what's in `/mnt/data/` directly

   - In this example, we parse the notebook output after execution and return the local output filePath

   - If we were to upload to blob storage instead of returning the output filePath, we'd return the blob URL generated by the upload

5. LLM receives and processes the results, then returns a response to the user, or tries to fix errors in the tool call (if any) with up to 3 retries

   - The *runTools(...)* method from the [OpenAI Node API](https://www.npmjs.com/package/openai) simplifies tool calling and feeding in errors to fix for us in the `example_openai.ts` code, along with system instructions for GPT

   - At this time, a custom feedback loop is needed when attempting this flow with Claude

6. User receives the response, including the link to the path where files were persisted (if any) or the results of the execution in general

   - If files were generated during the execution but weren't uploaded to local/external storage in an intermediary step, the response will include the inaccessible file path within the environment, such as `/mnt/data/test.txt`, instead of a publicly accessible URL like `someblobstorageurl.com/path/to/file/test.txt`

See [Example Run Outputs](#example-run-outputs)

## Running Examples

### Build the Image

For the tool to execute properly, the docker image must be built first:

1. Start docker

2. Navigate to the `environments/jupyter` folder in a terminal

3. Run the following command:

```bash
docker build -t jupyter-runtime .
```

### Run Typescript Example

1. Navigate to `examples/typescript` in a terminal

2. Create a `.env` file based on the `.env.example` and add your value for `OPENAI_API_KEY`

3. Run the following commands sequentially:

```bash
yarn install
yarn build
yarn start
```

4. Enter a prompt that would make GPT choose the tool
   - e.g., `execute a python script to add two numbers together and show the result`

## Example Run Outputs

View [output examples](docs/output_examples.md) to see example run outputs using the tool with `.runTools(...)` from the [OpenAI Node API](https://www.npmjs.com/package/openai), which simplifies usage and tool-error handling

- If there's an error in the tool call, returning a string of the error back as the tool response can enable GPT to try to fix errors on its own

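One way to apply that error-as-string pattern, sketched with a hypothetical `runInterpreter` callback standing in for the real tool body:

```typescript
// Wrap the tool body so failures come back to the model as readable text
// instead of an unhandled exception. `runInterpreter` is illustrative.
async function safeToolCall(
  runInterpreter: (code: string) => Promise<string>,
  code: string
): Promise<string> {
  try {
    return await runInterpreter(code);
  } catch (err) {
    // Returning the error text lets GPT see what went wrong and attempt a
    // corrected tool call on its next turn.
    return `Error executing code: ${err instanceof Error ? err.message : String(err)}`;
  }
}
```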
## Ethical Use Guidelines

This open-source tool is provided with the intent to foster innovation and aid in development, particularly in educational, research, and development contexts. Users are urged to utilize the tool responsibly and ethically. Here are some guidelines to consider:

- **Responsible Usage**: Ensure that the use of this tool does not harm individuals or groups. This includes avoiding the processing or analysis of data in ways that infringe on privacy or propagate bias.

- **Prohibited Uses**: Do not use this tool for:
  - Illegal activities
  - Creating or spreading malware
  - Conducting surveillance or gathering sensitive data without consent
  - Activities that could cause harm, such as cyberbullying or online harassment

- **Transparency**: Users should be transparent about how scripts are generated and used, particularly when the outputs are shared publicly or used in decision-making processes.

- **Data Privacy**: Be mindful of data privacy laws and regulations. Ensure that any data used with this tool complies with relevant legal standards, such as GDPR in Europe, CCPA in California, etc.

- **Intellectual Property**: Respect the intellectual property rights of others. Ensure that all content processed by or generated with this tool does not violate copyrights or other intellectual property laws.

- **Quality Control**: Regularly review and test the code executed by this tool to ensure its accuracy and reliability, especially when used in critical or production environments.

## Reporting Issues

If you encounter any issues or bugs while using this tool, please report them via [GitHub Issues](https://github.com/gaia-framework-ai/code-interpreter-tool/issues).

## License

This project is licensed under the MIT license; see the [LICENSE](LICENSE) file included with the project.

## Contributions

Coming soon
