A basic RAG chat sample app with a ChatGPT-style UI.
- everything runs locally
- Ollama (llama3.1) for the local LLM
- Azure AI Search Emulator for local search
- LiteLLM as an adapter for various LLMs
- minimal sample code
- backend is Python FastAPI
- frontend is plain HTML (instead of a React stack)
- two types of chat: simple response and streamed response like ChatGPT
- Chainlit Python low-code UI
- start AzureSearchEmulator (see setup below)
cd AzureSearchEmulator
docker compose up -d
docker compose logs -f
- setup Ollama local LLM (see setup below)
- start dev server
poetry install
./start_devserver.sh
- Simple Chat: http://127.0.0.1:8000/static/index.html
- Stream Chat: http://127.0.0.1:8000/static/chat-stream.html
- Chainlit UI: http://127.0.0.1:8000/chainlit/
- FastAPI Doc: http://127.0.0.1:8000/docs
# open with VSCode
poetry shell
export REQUESTS_CA_BUNDLE=~/.aspnet/https/certificate.pem
code .
- local docker server
docker build -t rag-chat-app .
docker run --rm --env-file=.env.docker -p 8000:8000 rag-chat-app
curl -v -c cookies.txt -X POST "http://127.0.0.1:8000/chat-stream" \
-H "Content-Type: application/json" -d '{"input":"hello"}'
curl -v -b cookies.txt -X GET \
"http://127.0.0.1:8000/chat-history?session_id=521b158d-9daa-4a70-b419-1074cef0c768"- system role message: setting context and guiding the model
- system role message: sets the context and guides the model
- adding a placeholder for the assistant can be a good practice
- it is common to include a placeholder message with an empty content string,
  especially when you want to clearly indicate that the assistant's response is expected next
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather today?"},
    {"role": "assistant", "content": "It's sunny and warm."},
    {"role": "user", "content": "What about tomorrow?"},  # user's last input
    {"role": "assistant", "content": ""},  # placeholder for the next assistant response
]
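For context, this is roughly how such a messages list is consumed through LiteLLM's Ollama provider (a sketch; the model name is just an example):

```python
import litellm

# send the messages above to the local Ollama model through LiteLLM
resp = litellm.completion(
    model="ollama/llama3.1",            # any model you pulled with `ollama run` / `ollama pull`
    messages=messages,                  # depending on the provider, you may drop the empty placeholder first
    api_base="http://localhost:11434",  # Ollama's default endpoint
)
print(resp.choices[0].message.content)
```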
- FastAPI: https://github.com/fastapi/fastapi
pyenv install 3.12.5
pyenv local 3.12.5
poetry init -n
poetry config virtualenvs.in-project true --local
# notebook cells for dev convenience
poetry add -G dev ipykernel
# Notes:
# fastapi includes httpx (cf. poetry show fastapi)
# pydantic-settings includes python-dotenv (cf. poetry show pydantic-settings)
poetry add fastapi[standard] pydantic-settings
poetry add litellm azure-search-documents
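Since pydantic-settings handles configuration (and already bundles python-dotenv), a settings class could look roughly like this; the field names and .env file are illustrative, not the repo's actual config:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # illustrative fields -- the repo's actual setting names may differ
    llm_model: str = "ollama/llama3.1"
    ollama_api_base: str = "http://localhost:11434"
    search_endpoint: str = "https://localhost:5081"

    # values come from the environment and, if present, a .env file
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")


settings = Settings()
```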
- linter, formatter, type checker (using VSCode extensions)
- linter: flake8
- formatter: black, isort
- type checker: Pyright (included in Pylance) instead of mypy
"python.analysis.typeCheckingMode": "basic"in.vscode/settings.json- https://blog.yhiraki.com/nodes/type-checking-with-pyright/
-
pytest
poetry add pytest pytest-cov pytest-asyncio pytest-mock --group dev
# to debug in case of ModuleNotFoundError (cf. pyproject.toml ini_options)
poetry run pytest ./tests --collect-only
# run tests
poetry run pytest -vv ./tests
# coverage report
poetry run pytest --cov=src --cov-report=html ./tests
open htmlcov/index.html
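For reference, a minimal test against the FastAPI app might look like the sketch below; the import path is an assumption about this repo's layout, so adjust it to the actual module:

```python
# tests/test_chat.py -- a sketch; adjust the import to the actual app module
from fastapi.testclient import TestClient

from src.main import app  # assumption: the FastAPI app object lives in src/main.py

client = TestClient(app)


def test_chat_stream_returns_text():
    resp = client.post("/chat-stream", json={"input": "hello"})
    assert resp.status_code == 200
    assert len(resp.text) > 0
```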
- Ollama: https://github.com/ollama/ollama
- used with LiteLLM-Ollama: https://docs.litellm.ai/docs/providers/ollama
brew install ollama
ollama run llama3.1 # or phi3 or something you prefer
# >>> Hello!
# Hello there! How can I help you today?
# >>> /bye
- drawdown.js: markdown to HTML converter
curl -OL https://raw.githubusercontent.com/adamvleggett/drawdown/refs/heads/master/drawdown.js
- Azure Search Emulator: https://github.com/feature23/AzureSearchEmulator
- dotnet-sdk: https://formulae.brew.sh/cask/dotnet-sdk
- setup local https: https://qiita.com/j_kitayama_hoge000/items/26cd7a5ef0b2fac53fce
dotnet dev-certs https --check
# maybe you will need this
dotnet dev-certs https --trust
# create new certs (path can be different)
dotnet dev-certs https -ep ~/.aspnet/https/aspnetapp.pfx -p password
- clone AzureSearchEmulator and edit docker-compose.yml
- update ASPNETCORE_Kestrel__Certificates__Default__Path and volumes as follows:
services:
  web:
    build: .
    ports:
      - 5080:80
      - 5081:443
    environment:
      - ASPNETCORE_URLS=https://+;http://+
      - ASPNETCORE_HTTPS_PORT=5081
      - ASPNETCORE_Kestrel__Certificates__Default__Password=password
      - ASPNETCORE_Kestrel__Certificates__Default__Path=/https/aspnetapp.pfx
    volumes:
      - indexes:/app/indexes
      - ~/.aspnet/https:/https:ro
volumes:
  indexes:
- start server and try from curl
docker compose up -d
docker compose logs -f
# you should see some json from curl output
curl https://localhost:5081/
- convert pfx to pem for Python (macOS users)
cd ~/.aspnet/https
openssl pkcs12 -in aspnetapp.pfx -out certificate.pem -nodes
# Python requires this environment variable
export REQUESTS_CA_BUNDLE=~/.aspnet/https/certificate.pem
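Quick sanity check from Python (a sketch): with REQUESTS_CA_BUNDLE exported as above, requests should accept the emulator's self-signed certificate.

```python
import requests

# the emulator's root endpoint returns some JSON, as the curl example above shows
resp = requests.get("https://localhost:5081/")
print(resp.status_code, resp.json())
```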
- you may want to try from Postman or Insomnia for debugging in case of trouble
- (another option?): https://github.com/tomasloksa/azure-search-emulator
- Note: collection/ComplexField are not implemented in AzureSearchEmulator
- make sure to install the Jupyter extension in VSCode
- open and run src/notes/azure-ai-search-notes.py with VSCode (a standalone query sketch follows below)
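A minimal query against the local emulator with azure-search-documents might look like this sketch (not the repo's code; the index name and key are placeholders):

```python
import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# the SDK's default sync transport uses requests, so the converted PEM from the setup above applies
os.environ.setdefault("REQUESTS_CA_BUNDLE", os.path.expanduser("~/.aspnet/https/certificate.pem"))

client = SearchClient(
    endpoint="https://localhost:5081",
    index_name="my-index",                       # placeholder: use your index name
    credential=AzureKeyCredential("local-key"),  # placeholder: the emulator is not a real Azure service
)

for doc in client.search(search_text="hello"):
    print(doc)
```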
- Chainlit: https://github.com/Chainlit/chainlit - one of the popular Python UI libraries
- installing chainlit naively fails with "version solving failed."
- FastAPI needs starlette >=0.37.2,<0.39.0 (poetry show fastapi)
- Chainlit needs starlette >=0.37.2,<0.38.0
- Chainlit needs fastapi >=0.110.1,<0.113 (poetry show chainlit)
# need to remove the current fastapi
# because chainlit (1.2.0) depends on fastapi (>=0.110.1,<0.113)
poetry remove fastapi
poetry show starlette
# Package starlette not found
# this would fail with "version solving failed":
# poetry add "fastapi[standard]" chainlit
# install with required version
poetry add "fastapi[standard]"@^0.112.0 chainlit
# this ensures all dependencies are resolved properly
poetry update
# verify the versions of fastapi and its dependencies
poetry show fastapi
# verify the versions of installed packages
poetry show
poetry show --outdated
poetry update
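Once the version conflict is resolved, a minimal Chainlit handler backed by the local Ollama model could look like this (a sketch, not the repo's actual code; the file name and model are assumptions):

```python
# cl_app.py -- sketch of a Chainlit handler that calls the local Ollama model via LiteLLM
import chainlit as cl
import litellm


@cl.on_message
async def on_message(message: cl.Message):
    resp = await litellm.acompletion(
        model="ollama/llama3.1",
        messages=[{"role": "user", "content": message.content}],
        api_base="http://localhost:11434",
    )
    await cl.Message(content=resp.choices[0].message.content).send()
```

Chainlit can then be mounted onto the existing FastAPI app, which is how a URL like http://127.0.0.1:8000/chainlit/ gets served; per Chainlit's FastAPI integration docs, this is roughly:

```python
# in the FastAPI entry point (sketch)
from chainlit.utils import mount_chainlit

mount_chainlit(app=app, target="cl_app.py", path="/chainlit")
```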