This repository contains a FastAPI backend and Streamlit app for PrivateGPT, an application built by imartinez. The PrivateGPT App provides an interface to privateGPT, with options to embed and retrieve documents using a language model and an embeddings-based retrieval system. All data remains local.
Easiest way to deploy:
Deploy Full App on Railway
Deploy Backend on Railway
A Railway Developer plan is needed so that the app has enough memory to run.
- Python 3.11 or later
- Minimum 16GB of memory
- Create a Python virtual environment using your preferred method.
- Copy the environment variables from example.env to a new file named .env, then modify the values in .env to match your desired configuration. The variables to set are:
  - PERSIST_DIRECTORY: The directory where the app will persist data.
  - MODEL_TYPE: The type of the language model to use (e.g., "GPT4All", "LlamaCpp").
  - MODEL_PATH: The path to the language model file.
  - EMBEDDINGS_MODEL_NAME: The name of the embeddings model to use.
  - MODEL_N_CTX: The maximum context size, in tokens, for the language model.
  - API_BASE_URL: The base URL of the FastAPI app, usually served on port 8000.
- Install the required dependencies by running the following command:
  pip install -r requirements.txt
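Before starting the backend, it can help to confirm that the .env file defines every variable listed above. The following is a minimal sketch using only the standard library. The helper names `parse_env` and `missing_vars` are hypothetical and not part of this repository, and the parser assumes simple KEY=VALUE lines only.

```python
# Variable names taken from the configuration list above.
REQUIRED_VARS = [
    "PERSIST_DIRECTORY",
    "MODEL_TYPE",
    "MODEL_PATH",
    "EMBEDDINGS_MODEL_NAME",
    "MODEL_N_CTX",
    "API_BASE_URL",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_vars(env):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A typical use would be `missing_vars(parse_env(open(".env").read()))`, which returns an empty list when every variable is set.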
To run the FastAPI backend, execute the following command:
gunicorn app:app -k uvicorn.workers.UvicornWorker --timeout 1500
This command starts the backend server and automatically handles the necessary downloads for the language model and the embedding models. The --timeout 1500 option allows enough time for the models to finish downloading.
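Because the first start can take a while, a script that depends on the backend may want to wait until it responds. The sketch below polls the root route, which this README describes as a readiness check. It assumes the backend listens on http://localhost:8000 and only starts answering once startup has finished. The function names and the default retry values are illustrative, not part of the repository.

```python
import time
import urllib.error
import urllib.request


def backend_is_up(base_url="http://localhost:8000"):
    """Return True if GET / on the backend answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_backend(probe=backend_is_up, attempts=60, delay=5.0):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the probe as a parameter keeps the retry loop independent of the HTTP call, so it can be exercised without a running server.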
Set API_BASE_URL to the URL of your FastAPI backend.
To run the Streamlit app, use the following command:
streamlit run streamlit_app.py --server.address localhost
This command launches the Streamlit app and connects it to the backend server running at localhost.
- Embedding documents is a quick process, but retrieval may take a long time due to the language model generation step. Optimization efforts are required to improve retrieval performance.
- The FastAPI backend can be used with any front-end framework of your choice. Feel free to integrate it with your preferred user interface.
- Community contributions are welcome! We encourage you to contribute to make this app more robust and enhance its capabilities.
The supported extensions for documents are:
- .csv: CSV
- .docx: Word Document
- .enex: EverNote
- .eml: Email
- .epub: EPub
- .html: HTML File
- .md: Markdown
- .msg: Outlook Message
- .odt: Open Document Text
- .pdf: Portable Document Format (PDF)
- .pptx: PowerPoint Document
- .txt: Text file (UTF-8)
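A client can check file extensions before uploading, so that unsupported files are caught early. This is a minimal sketch based on the list above. The function name `split_by_support` is hypothetical, and treating extensions as case-insensitive is an assumption about how the backend behaves.

```python
from pathlib import Path

# Extensions copied from the supported list above.
SUPPORTED_EXTENSIONS = {
    ".csv", ".docx", ".enex", ".eml", ".epub", ".html",
    ".md", ".msg", ".odt", ".pdf", ".pptx", ".txt",
}


def split_by_support(paths):
    """Split paths into (supported, unsupported) lists by file extension."""
    supported, unsupported = [], []
    for path in map(Path, paths):
        # Assumption: compare in lowercase so "REPORT.PDF" counts as a PDF.
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported
```

For example, `split_by_support(["notes.md", "photo.png"])` places notes.md in the first list and photo.png in the second.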
The following examples show how to call the API routes:
- Endpoint: GET /
- Description: Get a simple greeting message to verify that the APIs are ready.
- Example Usage:

curl -X GET http://localhost:8000/

import requests

response = requests.get("http://localhost:8000/")
print(response.json())
- Endpoint: POST /embed
- Description: Embed files by uploading them to the server.
- Example Usage:

curl -X POST -F "files=@file1.txt" -F "files=@file2.txt" -F "collection_name=my_collection" http://localhost:8000/embed

import requests

# Open the files in binary mode; the with block closes them after the upload.
with open("file1.txt", "rb") as f1, open("file2.txt", "rb") as f2:
    files = [("files", f1), ("files", f2)]
    data = {"collection_name": "my_collection"}
    response = requests.post("http://localhost:8000/embed", files=files, data=data)
print(response.json())
- Endpoint: POST /retrieve
- Description: Retrieve documents based on a query.
- Example Usage:

curl -X POST -H "Content-Type: application/json" -d '{"query": "sample query", "collection_name": "my_collection"}' http://localhost:8000/retrieve

import requests

data = {"query": "sample query", "collection_name": "my_collection"}
response = requests.post("http://localhost:8000/retrieve", json=data)
print(response.json())
Please note that the actual URL (http://localhost:8000/) and the request payloads should be adjusted based on your specific setup and requirements.
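One way to avoid hard-coding the URL in every call is to build it from the API_BASE_URL environment variable. The sketch below is illustrative only. The helper name `api_url` is hypothetical, and the fallback to http://localhost:8000 is an assumption based on the port mentioned in this README.

```python
import os


def api_url(path, base_url=None):
    """Join the backend base URL and a route path with exactly one slash."""
    # Assumption: fall back to the local default when API_BASE_URL is unset.
    base = base_url or os.environ.get("API_BASE_URL", "http://localhost:8000")
    return base.rstrip("/") + "/" + path.lstrip("/")
```

With this helper, a call such as `requests.post(api_url("/retrieve"), json=data)` follows whatever backend address is configured. Since retrieval can be slow, passing a generous `timeout` to `requests` may also be worthwhile.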