Local Language Models with GPU Support

Open Interpreter can be used with local language models; however, these can be rather taxing on your computer's resources. If you have an NVIDIA GPU, you may benefit from offloading some of the work to your GPU.

Windows

  1. Install the latest NVIDIA CUDA Toolkit for your version of Windows. The newest version known to work is CUDA Toolkit 12.2.2, and the oldest known to work is 11.7.1. Other versions may work, but not all have been tested.

    For Installer Type, choose exe (network).

    During install, choose Custom (Advanced).

    The only required components are:

    • CUDA
      • Runtime
      • Development
    • Driver components
      • Display Driver

    You may choose to install additional components if you like.
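
    As an optional sanity check that the toolkit itself installed correctly, the bundled nvcc compiler should report its version. This assumes the installer added the CUDA bin directory to your PATH, which it does by default.

    nvcc --version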

  2. Once the CUDA Toolkit has finished installing, open a Command Prompt or PowerShell window and run the command for your shell. This ensures that the CUDA_PATH environment variable is set.

    # Command Prompt
    echo %CUDA_PATH%
    
    # PowerShell
    $env:CUDA_PATH
    

    If you don't get back something like this:

    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2
    

    Restart your computer, then repeat this step.
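
    If the variable is still empty even after a restart, you can set it manually. The path below assumes the default install location for CUDA Toolkit 12.2; adjust the version directory to match your installation, then open a new terminal window so the change takes effect.

    # Command Prompt
    setx CUDA_PATH "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2"

    # PowerShell
    [Environment]::SetEnvironmentVariable('CUDA_PATH', 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2', 'User')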

  3. Once you have verified that the CUDA_PATH environment variable is set, run the corresponding commands for your shell. This will reinstall the llama-cpp-python package with NVIDIA GPU support.

    # Command Prompt
    set FORCE_CMAKE=1 && set CMAKE_ARGS=-DLLAMA_CUBLAS=on
    pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir -vv
    
    # PowerShell
    $env:FORCE_CMAKE=1; $env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'
    pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir -vv
    

    The command should complete with no errors. If you receive an error, ask for help on the Discord server.
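
    Note that this command compiles llama.cpp from source, so a working C++ toolchain is required. If the build fails almost immediately, check that CMake and the Microsoft Visual C++ Build Tools are installed; CMake should print a version string if it is available on your PATH.

    cmake --version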

  4. Once llama-cpp-python has been reinstalled, you can quickly check whether GPU support has been installed and set up correctly by running the following command.

    python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
    

    If you see something similar to this, then you are ready to use your GPU with Open Interpreter.

    ggml_init_cublas: found 1 CUDA devices:
      Device 0: NVIDIA GeForce RTX 3080, compute capability 8.6
    True
    

    If you instead see this, then ask for help on the Discord server.

    False
    
  5. Finally, run the following command to use Open Interpreter with a local, GPU-accelerated language model.

    interpreter --local
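
    While a model is responding, you can confirm that layers were actually offloaded by watching GPU memory usage from a second terminal; with the -l 1 flag, nvidia-smi refreshes every second.

    nvidia-smi -l 1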
    

Windows Subsystem for Linux 2 (WSL2)

  1. Ensure that you have the latest NVIDIA Display Driver installed on your host Windows OS.

  2. Get the latest NVIDIA CUDA Toolkit for WSL2 and run the provided steps in a WSL terminal.

    To get the correct steps, choose the following options; the commands the page generates should look roughly like the sketch after this list.

    • Operating System: Linux
    • Architecture: x86_64
    • Distribution: WSL-Ubuntu
    • Version: 2.0
    • Installer Type: deb (network)
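
    As a rough sketch only (assuming the options above; prefer the exact commands shown on the NVIDIA page, since the keyring file name and the toolkit package version change between releases), the generated steps look something like this:

    wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    sudo apt-get update
    sudo apt-get -y install cuda-toolkit-12-2
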
  3. If installed correctly, the following command will display information about your NVIDIA GPU, including the driver version and CUDA version.

    nvidia-smi
    
  4. Next, verify the path where the CUDA Toolkit was installed by running the following command.

    ls /usr/local/cuda/bin/nvcc
    

    If it returns the following error, ask for help on the Discord server.

    ls: cannot access '/usr/local/cuda/bin/nvcc': No such file or directory
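
    Before asking, it is worth checking whether the toolkit was installed under a versioned directory without the usual /usr/local/cuda symlink; the following lists any CUDA directories that are present.

    ls -d /usr/local/cuda*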
    
  5. Ensure that you have the required build dependencies by running the following commands.

    sudo apt update
    sudo apt install build-essential cmake python3 python3-pip python-is-python3
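
    Before rebuilding, you can confirm that the compiler, CMake, and nvcc are all reachable; each of the following should print a version string.

    gcc --version
    cmake --version
    /usr/local/cuda/bin/nvcc --version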
    
  6. Next, reinstall the llama-cpp-python package with NVIDIA GPU support by running the following command.

    CUDA_PATH=/usr/local/cuda FORCE_CMAKE=1 CMAKE_ARGS='-DLLAMA_CUBLAS=on' \
    pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir -vv
    

    The command should complete with no errors. If you receive an error, ask for help on the Discord server.

  7. Once llama-cpp-python has been reinstalled, you can quickly check whether GPU support has been installed and set up correctly by running the following command.

    python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
    

    If you see something similar to this, then you are ready to use your GPU with Open Interpreter.

    ggml_init_cublas: found 1 CUDA devices:
      Device 0: NVIDIA GeForce RTX 3080, compute capability 8.6
    True
    

    If you instead see this, then ask for help on the Discord server.

    False
    
  8. Finally, run the following command to use Open Interpreter with a local, GPU-accelerated language model.

    interpreter --local