Upcycling RAG Prototype

A FastAPI backend prototype built for a bachelor's thesis. It compares retrieval-augmented generation (RAG) against a no-retrieval baseline for generating DIY upcycling solutions.

The system retrieves domain-specific upcycling factsheets via vector similarity search and injects them as context into the Claude prompt. The research question: does factsheet-based RAG improve solution quality, and how does input specificity affect this?

Architecture

User Input
    │
    ├─── no-rag ──► System Prompt + User Input ──► Claude ──► Response
    │
    └─── rag ─────► Embed Input ──► pgvector Search ──► Top-5 Factsheets
                          └──► System Prompt + Factsheets + User Input ──► Claude ──► Response
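The two paths in the diagram differ only in whether retrieved factsheets are placed into the prompt. A minimal sketch of that assembly step follows. The helper name `build_messages`, the placeholder system prompt, and the layout of the context block are assumptions for illustration; the actual prompt lives in app/services/rag_service.py and is not reproduced here.

```python
from typing import Optional, Sequence

# Placeholder text only; the real system prompt is defined in the repository.
SYSTEM_PROMPT = "You suggest DIY upcycling solutions."


def build_messages(user_input: str,
                   factsheets: Optional[Sequence[str]] = None) -> dict:
    """Assemble a request for either mode.

    factsheets=None (or empty) models the no-rag path;
    a non-empty list models the rag path.
    """
    system = SYSTEM_PROMPT
    if factsheets:
        # Number the factsheets so the model can tell them apart.
        context = "\n\n".join(
            f"[Factsheet {i}]\n{text}"
            for i, text in enumerate(factsheets, start=1)
        )
        system = f"{SYSTEM_PROMPT}\n\nRelevant factsheets:\n{context}"
    return {
        "system": system,
        "messages": [{"role": "user", "content": user_input}],
    }
```

Keeping the user message identical in both modes means any difference in output can be attributed to the added context.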

Stack:

FastAPI — REST API with OpenAI-compatible /v1/chat/completions endpoint
PostgreSQL + pgvector — vector database for factsheet storage and cosine similarity search
Ollama (nomic-embed-text-v2-moe) — local embedding model (768 dimensions)
Claude API (Anthropic) — LLM for response generation
LibreChat — optional chat UI (connects to the backend as a custom API)
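To illustrate what "cosine similarity search" means here, the sketch below ranks stored vectors against a query vector in plain Python. In the running system this ranking is done inside PostgreSQL by pgvector; the function names and the toy two-dimensional vectors are assumptions for illustration (the real embeddings have 768 dimensions).

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          store: Dict[str, Sequence[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in store.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

The default `k=5` mirrors the Top-5 retrieval shown in the architecture diagram.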

Prerequisites

Docker
An Anthropic API key

Setup

1. Clone the repository

git clone https://github.com/robin-mommsen/thesis-prototype
cd thesis-prototype

2. Configure environment variables

cp .env.example .env

Edit .env and fill in your values:

POSTGRES_USER=raguser
POSTGRES_PASSWORD=your_secure_password
POSTGRES_DB=ragdb

CLAUDE_API_KEY=your_anthropic_api_key_here
CLAUDE_MODEL=claude-sonnet-4-6
RAG_FACTSHEET_LIMIT=5

LIBRECHAT_JWT_SECRET=your_random_secret
LIBRECHAT_JWT_REFRESH_SECRET=your_random_refresh_secret
LIBRECHAT_CREDS_KEY=your_64_char_hex_string
LIBRECHAT_CREDS_IV=your_32_char_hex_string

The LibreChat secrets are required — the container will not start without them. Generate them with Python:

# LIBRECHAT_JWT_SECRET and LIBRECHAT_JWT_REFRESH_SECRET (any random string, min. 32 chars; run twice, once per variable)
python -c "import secrets; print(secrets.token_hex(32))"

# LIBRECHAT_CREDS_KEY (exactly 64 hex chars)
python -c "import secrets; print(secrets.token_hex(32))"

# LIBRECHAT_CREDS_IV (exactly 32 hex chars)
python -c "import secrets; print(secrets.token_hex(16))"

Run the first command twice, so that the two JWT secrets differ, and the other two once each. Paste each output into the corresponding variable in .env.
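As an alternative to running the one-liners separately, the four values can be produced in one go. This is a small convenience sketch, not part of the repository; the function name `generate_librechat_secrets` is made up for illustration.

```python
import secrets


def generate_librechat_secrets() -> dict:
    """Return fresh values for the four LibreChat variables in .env."""
    return {
        # 32 random bytes rendered as 64 hex characters each
        "LIBRECHAT_JWT_SECRET": secrets.token_hex(32),
        "LIBRECHAT_JWT_REFRESH_SECRET": secrets.token_hex(32),
        "LIBRECHAT_CREDS_KEY": secrets.token_hex(32),
        # 16 random bytes rendered as 32 hex characters
        "LIBRECHAT_CREDS_IV": secrets.token_hex(16),
    }


if __name__ == "__main__":
    # Print in KEY=value form so the lines can be pasted into .env.
    for name, value in generate_librechat_secrets().items():
        print(f"{name}={value}")
```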

3. Start the system

docker compose up --build

On first start, Docker will:

Start PostgreSQL with the pgvector extension
Run the DB initialization SQL scripts (db/)
Start Ollama and pull the embedding model (nomic-embed-text-v2-moe:latest) — this may take a few minutes
Start the FastAPI backend, which seeds the factsheet database on startup
Start MongoDB and LibreChat (optional UI)

Once all services are healthy, the API is available at http://localhost:8080.
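Because the first start can take a few minutes, a script that depends on the API may want to wait for it. The sketch below polls the /v1/models endpoint until it answers. The helper names, the attempt count, and the interval are assumptions; whether the backend exposes a dedicated health endpoint is not stated here, so /v1/models is used as a stand-in.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

MODELS_URL = "http://localhost:8080/v1/models"


def api_is_up(url: str = MODELS_URL) -> bool:
    """True if the endpoint answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe: Callable[[], bool] = api_is_up,
                     attempts: int = 60,
                     interval: float = 5.0) -> bool:
    """Call probe up to `attempts` times, sleeping `interval` seconds after each miss."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(interval)
    return False
```

Passing the probe as a parameter keeps the waiting logic testable without a running stack.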

API Usage

Generate an upcycling idea

RAG mode (retrieves relevant factsheets):

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rag",
    "messages": [{"role": "user", "content": "Ich habe zwei alte Europaletten. Was kann ich daraus bauen?"}]
  }'

Baseline mode (no retrieval):

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "no-rag",
    "messages": [{"role": "user", "content": "Ich habe zwei alte Europaletten. Was kann ich daraus bauen?"}]
  }'

The model field selects the mode: "rag" or "no-rag".

Streaming is also supported:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "rag", "stream": true, "messages": [{"role": "user", "content": "alte Holzbretter"}]}'
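The same requests can be sent from Python. The sketch below uses only the standard library. It assumes the responses follow the usual OpenAI chat-completions shape (`choices[0].message.content` for normal responses, `data:` lines with a `delta` and a final `[DONE]` when streaming). The endpoint is described as OpenAI-compatible, but the response body is not shown in this README, so treat those field names as assumptions.

```python
import json
import urllib.request
from typing import Iterable, Iterator

API_URL = "http://localhost:8080/v1/chat/completions"


def build_request(mode: str, prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build the POST request; mode must be 'rag' or 'no-rag'."""
    if mode not in ("rag", "no-rag"):
        raise ValueError("mode must be 'rag' or 'no-rag'")
    payload = {
        "model": mode,
        "stream": stream,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from server-sent-event lines (assumed OpenAI format)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0].get("delta", {})
        if delta.get("content"):
            yield delta["content"]


def ask(mode: str, prompt: str) -> str:
    """Send a non-streaming request and return the generated text."""
    with urllib.request.urlopen(build_request(mode, prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

For streaming, open the request with `urllib.request.urlopen(build_request("rag", prompt, stream=True))` and pass the response object to `iter_stream_text`, since iterating over it yields raw lines.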

Factsheet endpoints

# List all factsheets
GET http://localhost:8080/factsheets

# Get a single factsheet
GET http://localhost:8080/factsheet/{id}

# Add a new factsheet
POST http://localhost:8080/factsheet

# Update a factsheet
PUT http://localhost:8080/factsheet/{id}

# Delete a factsheet
DELETE http://localhost:8080/factsheet/{id}
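The five routes above can be wrapped in one small request builder. This is a sketch with a hypothetical helper name, `factsheet_request`. The JSON schema expected by POST and PUT is not documented in this README, so the `body` argument is passed through unchanged and its fields are left to the caller.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080"

# Action name -> (HTTP method, path), taken from the endpoint list above.
ROUTES = {
    "list": ("GET", "/factsheets"),
    "get": ("GET", "/factsheet/{id}"),
    "create": ("POST", "/factsheet"),
    "update": ("PUT", "/factsheet/{id}"),
    "delete": ("DELETE", "/factsheet/{id}"),
}


def factsheet_request(action: str,
                      factsheet_id: Optional[str] = None,
                      body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) the request for one factsheet operation."""
    method, path = ROUTES[action]
    if "{id}" in path:
        if factsheet_id is None:
            raise ValueError(f"'{action}' requires a factsheet id")
        path = path.replace("{id}", str(factsheet_id))
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)
```

Send a built request with `urllib.request.urlopen(factsheet_request("list"))`.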

Available models

GET http://localhost:8080/v1/models

Returns rag and no-rag as available model IDs.

LibreChat UI (optional)

LibreChat is included as a browser-based chat frontend. After docker compose up, it is available at http://localhost:3080.

To connect it to the RAG backend:

Open LibreChat in your browser and create a local account
The backend is pre-configured via librechat.yaml as a custom API endpoint
Select rag or no-rag as the model in the UI

Running the Research Experiment

The experiment runs all 24 test prompts (experiments/test_inputs.json) against both modes and saves results to experiments/results/.

Requirements

Install Python dependencies (outside Docker):

pip install -r requirements.txt

1. Run the experiment

Make sure the Docker stack is running, then:

python experiments/run_experiment.py

This generates a timestamped JSON file in experiments/results/, e.g. experiment_20240115_120000.json.

Failed generations are automatically retried (up to 10 attempts per prompt/mode).
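The retry behaviour can be pictured as a bounded loop around a single generation call. The sketch below is not the code in run_experiment.py; the function name, the optional delay, and the choice to catch every exception are assumptions for illustration.

```python
import time
from typing import Callable, Optional

MAX_ATTEMPTS = 10  # matches the limit stated above


def generate_with_retry(generate: Callable[[], str],
                        max_attempts: int = MAX_ATTEMPTS,
                        delay_seconds: float = 0.0) -> str:
    """Call generate() until it succeeds or max_attempts is reached."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return generate()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            # Pause between attempts, but not after the final one.
            if attempt < max_attempts and delay_seconds > 0:
                time.sleep(delay_seconds)
    raise RuntimeError(
        f"generation failed after {max_attempts} attempts"
    ) from last_error
```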

2. Generate rater scoring sheets

python experiments/evaluate.py experiments/results/experiment_<timestamp>.json

This produces:

experiment_<timestamp>.xlsx — master file with the anonymization mapping (stays with the researcher)
experiment_<timestamp>_rater_1.xlsx to _rater_4.xlsx — anonymized scoring sheets for human raters, each in a different randomized order
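The idea behind the master file and the rater sheets can be sketched as follows: each response gets a neutral ID, the ID-to-condition mapping stays with the researcher, and each rater sees the IDs in an independently shuffled order. The ID format, the function name, and the fixed seed are assumptions for illustration and may differ from what evaluate.py does.

```python
import random
from typing import Dict, List, Tuple

Response = Tuple[str, str]  # (prompt id, mode)


def anonymize(responses: List[Response],
              n_raters: int = 4,
              seed: int = 0) -> Tuple[Dict[str, Response], List[List[str]]]:
    """Return (master mapping, one shuffled list of anonymous IDs per rater)."""
    rng = random.Random(seed)

    # Shuffle before numbering so the ID itself reveals nothing about the mode.
    shuffled = responses[:]
    rng.shuffle(shuffled)
    mapping = {f"R{i:03d}": key for i, key in enumerate(shuffled, start=1)}

    # Each rater gets the same IDs in an independently shuffled order.
    orders = []
    for _ in range(n_raters):
        order = list(mapping)
        rng.shuffle(order)
        orders.append(order)
    return mapping, orders
```

With 24 prompts and two modes, the mapping holds 48 entries and each rater sheet lists all 48 IDs.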

Raters score each response across 4 criteria:

Task fit
Material integration
Feasibility
Creativity

3. Aggregate rater scores

After all raters have filled in their sheets:

python experiments/aggregate_rater_scores.py \
  --master experiment_<timestamp>.xlsx \
  experiment_<timestamp>_rater_1.xlsx \
  experiment_<timestamp>_rater_2.xlsx \
  experiment_<timestamp>_rater_3.xlsx \
  experiment_<timestamp>_rater_4.xlsx

This produces aggregated_rater_scores.xlsx with per-response means across raters, condition means (RAG vs. baseline), Krippendorff's alpha per criterion, and Wilcoxon signed-rank test results comparing RAG against baseline per criterion.
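Two of the aggregation steps can be shown in plain Python: averaging each response across raters, and computing the Wilcoxon signed-rank statistic for paired RAG and baseline scores. This is a teaching sketch with made-up function names, not the code in aggregate_rater_scores.py. It returns only the test statistic; in practice a library routine such as `scipy.stats.wilcoxon` also provides the p-value. Krippendorff's alpha is not covered here.

```python
from statistics import mean
from typing import Dict, Sequence


def per_response_means(scores_by_rater: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average each response's score across all raters."""
    ids = scores_by_rater[0].keys()
    return {rid: mean(r[rid] for r in scores_by_rater) for rid in ids}


def wilcoxon_statistic(rag: Sequence[float], baseline: Sequence[float]) -> float:
    """Return min(W+, W-) for paired samples.

    Zero differences are dropped; tied absolute differences
    share the average of the ranks they occupy.
    """
    diffs = [a - b for a, b in zip(rag, baseline) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(w_plus, w_minus)
```

The pairing is by prompt: the i-th RAG score and the i-th baseline score must belong to the same test prompt and criterion.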

Environment Variables Reference

| Variable | Description | Default |
|---|---|---|
| `POSTGRES_USER` | PostgreSQL username | none |
| `POSTGRES_PASSWORD` | PostgreSQL password | none |
| `POSTGRES_DB` | PostgreSQL database name | none |
| `CLAUDE_API_KEY` | Anthropic API key | none |
| `CLAUDE_MODEL` | Claude model ID | `claude-sonnet-4-6` |
| `RAG_FACTSHEET_LIMIT` | Number of factsheets retrieved per query | `5` |
| `LIBRECHAT_JWT_SECRET` | LibreChat JWT signing secret | none |
| `LIBRECHAT_JWT_REFRESH_SECRET` | LibreChat JWT refresh secret | none |
| `LIBRECHAT_CREDS_KEY` | LibreChat credentials encryption key (64 hex chars) | none |
| `LIBRECHAT_CREDS_IV` | LibreChat credentials encryption IV (32 hex chars) | none |
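A sketch of how a backend might read these variables and apply the two defaults is shown below. It is not the contents of app/config/config.py; the function name `load_settings` and the choice of which variables count as required are assumptions. The LibreChat variables are left out on the assumption that they are read by the LibreChat container rather than by the backend.

```python
import os
from typing import Mapping

# Variables without a default in the table above (assumed to be mandatory).
REQUIRED = ("POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB", "CLAUDE_API_KEY")

# Defaults as listed in the table above.
DEFAULTS = {"CLAUDE_MODEL": "claude-sonnet-4-6", "RAG_FACTSHEET_LIMIT": "5"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from the environment, failing early on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)

    # The limit is used as a count, so convert it from string to int.
    settings["RAG_FACTSHEET_LIMIT"] = int(settings["RAG_FACTSHEET_LIMIT"])
    return settings
```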

Reproducibility Note

The embedding model (nomic-embed-text-v2-moe:latest) has no versioned tags on the Ollama registry. To verify you are using the same model version as the original experiment, check the model hash after first startup:

docker exec upcycling-rag-ollama ollama list

The model hash used for the thesis experiment: ff9c2f10ef5e
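The comparison can also be automated. The sketch below parses the text printed by `ollama list` and checks the ID of the embedding model against the hash above. It assumes the usual column layout of that command (a header line, then name and ID as the first two whitespace-separated columns); the function names are made up for illustration.

```python
import subprocess
from typing import Dict

EXPECTED_ID = "ff9c2f10ef5e"
MODEL_NAME = "nomic-embed-text-v2-moe:latest"


def parse_ollama_list(output: str) -> Dict[str, str]:
    """Map model name -> ID from `ollama list` text, skipping the header line."""
    models = {}
    for line in output.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) >= 2:
            models[parts[0]] = parts[1]
    return models


def model_matches(output: str) -> bool:
    """True if the embedding model is present with the expected ID."""
    return parse_ollama_list(output).get(MODEL_NAME) == EXPECTED_ID


def check_running_container() -> bool:
    """Run the docker command from this section and compare the hash."""
    result = subprocess.run(
        ["docker", "exec", "upcycling-rag-ollama", "ollama", "list"],
        capture_output=True, text=True, check=True,
    )
    return model_matches(result.stdout)
```

Call `check_running_container()` once the stack is up; a `False` result means the pulled model differs from the one used in the thesis experiment or has not been pulled yet.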

Project Structure

.
├── app/
│   ├── main.py                  # FastAPI app, startup
│   ├── routers/
│   │   ├── chat.py              # /v1/chat/completions endpoint
│   │   ├── factsheet.py         # Factsheet CRUD
│   │   └── models.py            # /v1/models
│   ├── services/
│   │   ├── rag_service.py       # Core RAG and baseline logic, system prompt
│   │   ├── claude_service.py    # Anthropic API wrapper
│   │   ├── embedding_service.py # Ollama embeddings
│   │   └── data_initializer.py  # Seeds factsheets on startup
│   ├── models/
│   │   ├── upcycling_factsheet.py
│   │   └── query_log.py
│   └── config/
│       ├── config.py
│       └── database.py
├── db/                          # PostgreSQL init scripts
├── experiments/                 # Experiment runner and evaluation scripts
│   ├── run_experiment.py
│   ├── evaluate.py
│   ├── aggregate_rater_scores.py
│   └── test_inputs.json         # 24 prompts (8 vague, 8 medium, 8 concrete)
├── docker-compose.yml
├── Dockerfile
└── .env.example
