---
title: Running locally
description: Use local tools so that your code never leaves your machine
---
import InstallSage from '/snippets/install-sage.mdx';
To run any open-source LLM directly on your machine, use Ollama:
- Go to [ollama.com](https://ollama.com) and download the appropriate binary for your machine.
- Open a new terminal window.
- Pull the desired model, for instance:

  ```bash
  ollama pull llama3.1
  ```
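If you want to confirm the model responds before going further, you can query Ollama's local HTTP API directly. This is a minimal sketch, assuming the default Ollama port (11434) and the `llama3.1` model pulled above; the prompt is just a placeholder.

```python
import requests

# Ollama serves a local HTTP API on port 11434 by default.
# Send a single prompt to the model pulled above and print the reply.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",  # the model pulled with `ollama pull`
        "prompt": "Say hello in one sentence.",
        "stream": False,      # return the full response at once
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```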
If your codebase is small (fewer than 100 files), you can skip this section.

For larger codebases, the default LLM-based retrieval becomes too expensive. Instead, we need vector-based retrieval, which requires chunking the codebase, embedding the chunks, and storing them in a vector store. To achieve this locally, we use Marqo.
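To make the chunk-embed-store pipeline concrete, here is a rough sketch of the chunking step. It is illustrative only: the character-based splitting, chunk size, and overlap are placeholder choices, not what `sage-index` actually does under the hood.

```python
from pathlib import Path

def chunk_file(path: Path, chunk_size: int = 1000, overlap: int = 200):
    """Split one source file into overlapping character chunks.

    Illustrative only -- chunk size, overlap, and character-based splitting
    are placeholder choices, not sage's actual chunking strategy.
    """
    text = path.read_text(errors="ignore")
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        yield {"file": str(path), "text": text[start:start + chunk_size]}

# Example: chunk every Python file in a local checkout of a repository.
chunks = [
    chunk
    for path in Path(".").rglob("*.py")
    for chunk in chunk_file(path)
]
print(f"Produced {len(chunks)} chunks")
```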
First, make sure you have Docker installed. Then run:
```bash
docker rm -f marqo
docker pull marqoai/marqo:latest
docker run --name marqo -it -p 8882:8882 marqoai/marqo:latest
```
This starts a persistent Marqo container and keeps its logs open in your terminal. On a fresh install, it should take 2-3 minutes for Marqo to become ready.
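Once the container is up, you can sanity-check the vector store with Marqo's Python client (`pip install marqo`). This is just an illustrative round trip; the index name and document below are placeholders and are unrelated to the index that `sage-index` creates.

```python
import marqo

# Connect to the local Marqo instance started above.
mq = marqo.Client(url="http://localhost:8882")

# Create a throwaway index, add one document, and run a semantic search.
index_name = "sanity-check"
mq.create_index(index_name)
mq.index(index_name).add_documents(
    [{"title": "hello", "text": "Marqo is running locally."}],
    tensor_fields=["text"],  # fields to embed as vectors
)
results = mq.index(index_name).search("is the vector store up?")
print(results["hits"][0]["text"])

# Clean up the throwaway index.
mq.delete_index(index_name)
```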
To chunk, embed, and store your codebase in the Marqo vector store, run the following command (replacing `huggingface/transformers` with your desired GitHub repository):

```bash
sage-index huggingface/transformers --mode=local
```
You are now ready to chat with your codebase. Run the following command, making sure to replace `huggingface/transformers` with your desired GitHub repository:

```bash
sage-chat huggingface/transformers --mode=local
```
You should now have a Gradio app running on localhost. Happy (local) chatting!
You can customize your chat experience by passing any of the following flags to the `sage-chat` command: