StatGPT Backend

This repository contains the code for the StatGPT backend, which implements the APIs and main logic of the StatGPT application.

More information about StatGPT and its architecture can be found in the documentation repository.

Technological stack

The application is written in Python 3.11 and uses the following main technologies:

| Technology | Purpose |
|---|---|
| AI DIAL SDK | SDK for building applications on top of the AI DIAL platform |
| FastAPI | Web framework for API development |
| SQLAlchemy | ORM for database operations |
| LangChain | LLM application framework |
| Pydantic | Data validation and settings |
| sdmx1 | SDMX data handling and provider connections |

Project structure

src/admin_portal - backend of the administrator portal, which allows the user to add and update data.
src/common - common code used by the admin_portal and statgpt applications.
src/statgpt - main application, which generates responses using LLMs based on data prepared by admin_portal.
tests - unit and integration tests.
docker - Dockerfiles for building Docker images.

Environment variables

The applications are configured using environment variables. The environment variables are described in the following files:

Common environment variables - used in both applications
Admin Backend environment variables
Chat Backend environment variables

Local Setup

Pre-requisites

1. Install Make

MacOS - should be already installed
Windows
Windows, using Chocolatey
Make sure that make is in the PATH (run which make).

2. Install Python 3.11

Direct installation:

MacOS, using Homebrew - brew install python@3.11
Windows or MacOS, using official repository
Windows, using Chocolatey
Make sure that python3 or python3.11 is in the PATH and works properly (run python3.11 --version).

Alternative: use pyenv:

pyenv lets you manage multiple Python versions on the same machine.

Run the following from the repository root folder:

pyenv install 3.11
pyenv local 3.11  # use Python 3.11 for the current project
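
As a complement to running python3.11 --version, the interpreter can report its own version. The following is a minimal sketch, not part of the repository; the function name is_required_python is hypothetical.

```python
import sys

# The README states that the application targets Python 3.11.
REQUIRED = (3, 11)


def is_required_python(version_info=sys.version_info) -> bool:
    """Return True when the major and minor version match REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_required_python():
        print(f"Python {found} matches the required 3.11 series.")
    else:
        print(f"Python {found} found, but 3.11 is expected.")
```

Running this with the interpreter that pyenv selected confirms that pyenv local 3.11 took effect in the current folder.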

3. Install Poetry

Recommended way - system-wide, independent of any particular python venv:

MacOS - recommended way to install poetry is to use pipx
Windows - recommended way to install poetry is to use official installer
Make sure that poetry is in the PATH and works properly (run poetry --version).

Alternative - venv-specific (using pip):

make sure the correct python venv is activated
make install_poetry

4. Install Docker Engine and Docker Compose suitable for your OS

Since Docker Desktop requires a paid license for commercial use, you can use one of the following alternatives:

Docker Engine and Docker Compose on Linux
Rancher Desktop on Windows or MacOS
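
The PATH checks described in the steps above can be done in one pass. This is a minimal sketch using only the standard library; the tool list and the function name missing_tools are illustrative assumptions, not part of the repository.

```python
import shutil

# Executables the prerequisites section asks for (assumed command names).
REQUIRED_TOOLS = ("make", "poetry", "docker")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which) -> list:
    """Return the tools that cannot be resolved on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

The `which` parameter exists only to make the function easy to test; in normal use the default shutil.which is enough.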

Setup

1. Clone the repository

2. Create venv (python virtual environment)

Create python virtual environment, using poetry:

make init_venv

If you see the following error: Skipping virtualenv creation, as specified in config file., it means venv was not created because poetry is configured not to create a new virtual environment. You can fix this:

- Either update the poetry config:
  - poetry config --local virtualenvs.create true (local config)
  - or poetry config virtualenvs.create true (global config)
- Or create the venv manually: python -m venv .venv

3. Activate venv

For Mac / Linux:

source .venv/bin/activate

For Windows:

.venv/Scripts/Activate
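
To confirm that activation worked, you can ask the interpreter whether it is running inside a virtual environment. The sketch below relies on the standard behavior that sys.prefix differs from sys.base_prefix inside a venv; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when the interpreter prefix differs from the base installation."""
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment is active: {sys.prefix}")
    else:
        print("No virtual environment is active; activate .venv first.")
```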

4. Install required python packages

The following will install basic and dev dependencies:

make install_dev

5. Create `.env` file in the root of the project

You can copy the template file and fill values for secrets manually:

cp .env.template .env

The Environment variables section provides links to pages with detailed information about environment variables.
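
After copying the template, it is easy to leave a value unfilled. The following sketch, which is not part of the repository, compares .env against .env.template and lists keys that are missing or empty. It assumes simple KEY=VALUE lines; the helper names are hypothetical, and some keys may be intentionally empty, so treat the output as a hint only.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def unfilled_keys(template_text: str, env_text: str) -> list:
    """Return template keys that are absent or empty in the .env text."""
    env = parse_env(env_text)
    return [key for key in parse_env(template_text) if not env.get(key)]


if __name__ == "__main__":
    template, env_file = Path(".env.template"), Path(".env")
    if template.exists() and env_file.exists():
        for key in unfilled_keys(template.read_text(), env_file.read_text()):
            print(f"Not set in .env: {key}")
    else:
        print("Expected both .env.template and .env in the current folder.")
```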

6. Create `dial_conf/core/config.json` file by running python script

make generate_dial_config
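
A quick sanity check of the generated file can catch a failed or partial generation early. This sketch only verifies that the file is well-formed JSON with an object at the top level, which is an assumption about the file's shape; it does not validate any DIAL-specific fields. The function name parse_dial_config is hypothetical.

```python
import json
from pathlib import Path

# Path taken from the step title above.
CONFIG_PATH = Path("dial_conf/core/config.json")


def parse_dial_config(text: str) -> dict:
    """Parse the config text and require a JSON object at the top level."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        config = parse_dial_config(CONFIG_PATH.read_text(encoding="utf-8"))
        print(f"{CONFIG_PATH}: valid JSON, top-level keys: {sorted(config)}")
    else:
        print(f"{CONFIG_PATH} not found; run make generate_dial_config first.")
```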

Run StatGPT locally

Run DIAL using Docker Compose:
```
docker compose up -d
```
Apply alembic migrations:
- locally:
```
make db_migrate
```
- or using Docker:
  1. Set ADMIN_MODE=ALEMBIC_UPGRADE in the .env file
  2. Run admin_portal from docker-compose.yml
Run Admin backend (if you want to initialize or update data):
```
make run_admin
```

Utils for Development

1. Format the code

make format

2. Run linters

make lint

3. Pre-Commit Hooks

To automatically apply black and isort on each commit, enable pre-commit hooks:

make install_pre_commit_hooks

This command will set up the git hook scripts.

4. Create a new `alembic` migration:

(!) It is critical to note that autogenerate is not intended to be perfect. It is always necessary to manually review and correct the candidate migrations that autogenerate produces.

(!) After creating a new migration, it is necessary to update the ALEMBIC_TARGET_VERSION in the src/common/config/version.py file to the new version.

make db_autogenerate MESSAGE="Your message"

or:

alembic -c src/alembic.ini revision --autogenerate -m "Your message"
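
To decide which value ALEMBIC_TARGET_VERSION should take, it helps to recall how Alembic chains migrations: each revision names its parent in down_revision, and the head is the revision that no other revision points to. The sketch below illustrates that idea on a plain dictionary; the revision identifiers are made up, and find_heads is a hypothetical helper rather than an Alembic API.

```python
from typing import Dict, List, Optional


def find_heads(down_revisions: Dict[str, Optional[str]]) -> List[str]:
    """Return revisions that are not the parent (down_revision) of any other."""
    parents = {parent for parent in down_revisions.values() if parent is not None}
    return sorted(rev for rev in down_revisions if rev not in parents)


if __name__ == "__main__":
    # Illustrative history only: a1 is the first migration, c3 the newest.
    history = {"a1": None, "b2": "a1", "c3": "b2"}
    print(find_heads(history))
```

In a real checkout, `alembic -c src/alembic.ini heads` reports the current head revision directly. More than one head means the history has branched and should be reviewed before updating the target version.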

5. Undo last `alembic` migration

make db_downgrade

6. Localizations

To update localization files, run:

make extract_messages  # Extract messages to be translated
make update_messages   # Update existing translation files with new messages

Check the *.po files for new messages and provide translations.

Then compile translations:

make compile_messages
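
When checking the *.po files by hand, untranslated entries are those with an empty msgstr. The following sketch lists them using a deliberately simplified reading of the PO format: it handles multi-line strings and plural msgstr[n] forms, but ignores fuzzy flags and escape sequences. The function names are hypothetical and not part of the repository.

```python
def _unquote(token: str) -> str:
    """Return the text between the surrounding double quotes, or '' if unquoted."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return ""


def untranslated_msgids(po_text: str) -> list:
    """Return msgids whose msgstr (all plural forms included) is empty."""
    entries = []
    field = None  # which field a continuation line belongs to
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            entries.append({"msgid": _unquote(line[len("msgid "):]), "msgstr": ""})
            field = "msgid"
        elif line.startswith(("msgstr ", "msgstr[")) and entries:
            value = line.split(" ", 1)[1] if " " in line else ""
            entries[-1]["msgstr"] += _unquote(value)
            field = "msgstr"
        elif line.startswith('"') and entries and field:
            entries[-1][field] += _unquote(line)  # continuation of a long string
        else:
            field = None  # comments, blank lines, msgid_plural, and so on
    # The header entry has an empty msgid and is skipped here.
    return [e["msgid"] for e in entries if e["msgid"] and not e["msgstr"]]


if __name__ == "__main__":
    sample = 'msgid "Hello"\nmsgstr "Bonjour"\n\nmsgid "Bye"\nmsgstr ""\n'
    print(untranslated_msgids(sample))
```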

Run Tests

Integration tests require a running test database and Elasticsearch instance. Both are defined in the docker-compose.yml file. Their Docker containers have no volumes for storing data, so they always start fresh after docker compose down.
To run integration tests, uncomment the vectordb-test and elasticsearch-test containers in the docker-compose.yml file. You might also need to comment out the elasticsearch container if your machine doesn't have enough resources.
To run end-to-end tests, first run StatGPT locally. This step is not required for other tests.
Run tests:
- all tests except for end-to-end (unit and integration):
```
make test
```
- only unit tests:
```
make test_unit
```
- only integration tests:
```
make test_integration
```
- just end-to-end tests:
```
make test_e2e
```
