Security: helenkwok/offlineaid-pack-builder

Security Policy

Supported Versions

offlineaid-pack-builder is an early-stage project (0.x). Only the latest commit on main receives security fixes.

Reporting a Vulnerability

If you discover a security vulnerability, please report it privately via one of:

  • GitHub Security Advisory (preferred — gives you a private channel with the maintainer)
  • Email the maintainer (contact via GitHub profile)

Please do not open a public issue or pull request for security reports. The maintainer aims to acknowledge reports within 7 days and to disclose the issue publicly only after a fix is available.

Scope

In scope:

  • packager.py — pack-builder CLI and archive contract
  • agent.py — Gemma 4 compiler agent (PydanticAI + Deep Agents)
  • extract.py — OCR extraction pipeline
  • translate.py — translation pipeline
  • scrape.py — Scrapy-based public-data extraction
  • Makefile targets
  • .oapack.zip archive format and SHA-256 sidecar contract

Out of scope:

  • Third-party dependencies (report upstream — see pyproject.toml for the dependency tree)
  • Hugging Face models pulled at pipeline time (report to model maintainers)
  • On-device behavior in consumer apps that import .oapack.zip packs

Security Posture

  • Zero runtime dependencies in the core packager.py — only the Python standard library (sqlite3, json, csv, struct, zipfile, hashlib, argparse). Optional extras (agent, ocr, scrape, upload) are isolated and only required for the pipeline stages that use them.
  • No network at query time. Built .oapack.zip packs are self-contained SQLite databases consumed offline by the OfflineAid app.
  • Archive contract is cryptographically validated. Every .oapack.zip carries a SHA-256 sidecar, and the archive is checked both structurally and against that digest at build time (_validate_archive_contract in packager.py) and at consume time (packager.py verify).
  • Scrapy spiders are polite by default. ROBOTSTXT_OBEY=True, DOWNLOAD_DELAY=1.5 (seconds), CONCURRENT_REQUESTS_PER_DOMAIN=2. Every per-spider override is documented, with its rationale in a code comment.
  • Paths are validated against traversal. scrape.py and translate.py use _validate_path() to reject any path outside the working directory or the system temp directory.
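
As an illustration of the SHA-256 sidecar check described above, the following is a minimal sketch using only the standard library. It is not the project's _validate_archive_contract or verify code. The function names (sha256_of, sidecar_matches) are hypothetical, and the sidecar layout is an assumption: a text file whose first whitespace-separated token is the hex digest, as produced by sha256sum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in fixed-size chunks so large archives are not loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sidecar_matches(archive: Path, sidecar: Path) -> bool:
    """Return True if the archive digest equals the digest in the sidecar.

    Assumed sidecar layout (hypothetical): the first whitespace-separated
    token is the hex digest; anything after it (e.g. a filename) is ignored.
    """
    tokens = sidecar.read_text(encoding="utf-8").split()
    if not tokens:
        return False  # An empty sidecar can never match.
    return sha256_of(archive) == tokens[0].lower()
```

A caller would pass the pack and its sidecar, for example sidecar_matches(Path("demo.oapack.zip"), Path("demo.oapack.zip.sha256")), where both file names are placeholders. A matching digest shows only that the archive has not changed since the sidecar was written; it does not authenticate who produced the pack.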

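The path check described above can be sketched as follows. This is an assumption-laden illustration, not the project's _validate_path(): the function name validate_path, its allowed_roots parameter, and the ValueError it raises are all hypothetical. It defaults to the two roots the policy names (the working directory and the system temp directory) and requires Python 3.9+ for Path.is_relative_to.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def validate_path(candidate: str, allowed_roots: Optional[Iterable[Path]] = None) -> Path:
    """Resolve candidate and reject it unless it lies under an allowed root.

    Hypothetical helper: by default the allowed roots are the current
    working directory and the system temp directory.
    """
    if allowed_roots is None:
        allowed_roots = (Path.cwd(), Path(tempfile.gettempdir()))
    # resolve() collapses ".." segments and follows symlinks, so the check
    # runs against the real location rather than the spelling of the path.
    resolved = Path(candidate).resolve()
    roots = [Path(root).resolve() for root in allowed_roots]
    if not any(resolved.is_relative_to(root) for root in roots):
        raise ValueError(f"path is outside the allowed directories: {candidate}")
    return resolved
```

Resolving before comparing is the key design choice in this sketch: a plain string-prefix check would accept a path such as work/../secret.txt, while the resolved form exposes that it sits outside the allowed root.
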
What we will not consider a vulnerability

  • Provenance of the bundled examples/accc-scams/*.pdf files: these are public documents from ACCC Scamwatch.
  • The USER_AGENT string in scrape.py identifying this tool.
  • Build-time logging to stdout (no PII expected; build logs are local).
