
Compendium Scribe

Supports Python 3.12+

Compendium Scribe is a Click-driven command line tool and library that uses OpenAI's deep research models to assemble a highly structured XML compendium for any topic. The workflow combines optional prompt refinement (powered by gpt-4.1), an o3-deep-research call with web search tooling, and deterministic post-processing to turn the model output into a dependable knowledge asset.


Features

  • 🔍 Deep research pipeline — orchestrates prompt planning, background execution, and tool-call capture with o3-deep-research.
  • 🧱 Rich data model — includes sections, insights, and citations for cross-format rendering.
  • 🧾 Structured XML output — produces a schema-friendly document ready for downstream conversion (HTML, Markdown, PDF pipelines, etc.).
  • ⚙️ Configurable CLI — control background execution, tool call limits, and output paths.
  • 🧪 Testable architecture — research orchestration is decoupled from the OpenAI client, making it simple to stub in tests.

Quick Start

1. Install

pdm install --dev

Ensure PDM_HOME points to a writable location when developing within a sandboxed environment.

2. Configure credentials

Create a .env file (untracked) with your OpenAI credentials:

OPENAI_API_KEY=sk-...

Deep research requires an OpenAI account with the browsing tooling enabled. If additional tooling introduces new environment keys, document them in the repo as you add them.
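
If you call the library from your own Python scripts rather than the CLI, a minimal sketch like the following loads the same key before research runs (this assumes the separate python-dotenv package is installed; the CLI may already load .env on its own):

import os

from dotenv import load_dotenv  # third-party helper, not part of compendiumscribe

# Load OPENAI_API_KEY from the untracked .env file into the process environment.
load_dotenv()

if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY must be set before building a compendium")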

3. Generate a compendium

pdm run create-compendium "Lithium-ion battery recycling"

Options:

  • --output PATH — where to write the XML file (defaults to <slug>_<timestamp>.xml).
  • --no-background — force synchronous execution (useful for short or restricted queries).
  • --max-tool-calls N — cap the total number of tool calls for cost control.
  • --export-format FORMAT — emit Markdown (md), HTML (html), or PDF (pdf) alongside the base XML output; repeat the flag to produce multiple formats.

Example output file name: lithium-ion-battery-recycling_20250107_143233.xml.
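
For example, a cost-capped run that also emits Markdown and HTML alongside the XML (values are illustrative):

pdm run create-compendium "Lithium-ion battery recycling" \
  --output recycling_compendium.xml \
  --max-tool-calls 50 \
  --export-format md \
  --export-format html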


Library Usage

from compendiumscribe import build_compendium, ResearchConfig, DeepResearchError

try:
    compendium = build_compendium(
        "Emerging pathogen surveillance",
        config=ResearchConfig(background=False, max_tool_calls=30),
    )
except DeepResearchError as exc:
    # Handle or log deep research failures
    raise

xml_payload = compendium.to_xml_string()

# Alternate exports
markdown_doc = compendium.to_markdown()
html_doc = compendium.to_html()
pdf_bytes = compendium.to_pdf_bytes()

The returned Compendium object contains structured sections, insights, citations, and open questions.
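
A minimal sketch for persisting the exports above to disk (file names are illustrative):

from pathlib import Path

# Write the base XML plus the alternate renderings side by side.
Path("pathogen_surveillance.xml").write_text(xml_payload, encoding="utf-8")
Path("pathogen_surveillance.md").write_text(markdown_doc, encoding="utf-8")
Path("pathogen_surveillance.html").write_text(html_doc, encoding="utf-8")
Path("pathogen_surveillance.pdf").write_bytes(pdf_bytes)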


Data Model Overview

Compendium Scribe produces XML shaped like:

<compendium topic="Lithium-ion Battery Recycling" generated_at="2025-01-07T14:32:33+00:00">
  <overview><![CDATA[Comprehensive synthesis of the state of lithium-ion recycling...]]></overview>
  <methodology>
    <step><![CDATA[Surveyed peer-reviewed literature from 2022–2025]]></step>
    <step><![CDATA[Corroborated industrial capacity data with regulatory filings]]></step>
  </methodology>
  <sections>
    <section id="S01">
      <title><![CDATA[Technology Landscape]]></title>
      <summary><![CDATA[Dominant recycling modalities and throughput metrics...]]></summary>
      <key_terms>
        <term><![CDATA[hydrometallurgy]]></term>
        <term><![CDATA[direct recycling]]></term>
      </key_terms>
      <guiding_questions>
        <question><![CDATA[Which processes yield the highest cobalt recovery rates?]]></question>
      </guiding_questions>
      <insights>
        <insight>
          <title><![CDATA[Hydrometallurgy remains the throughput leader]]></title>
          <evidence><![CDATA[EPRI 2024 data shows >95% cobalt recovery in commercial plants.]]></evidence>
          <implications><![CDATA[Capital efficiency favors hydrometallurgy for near-term scaling.]]></implications>
          <citations>
            <ref>C1</ref>
          </citations>
        </insight>
      </insights>
    </section>
  </sections>
  <citations>
    <citation id="C1">
      <title><![CDATA[EPRI Lithium-ion Recycling Benchmarking 2024]]></title>
      <url><![CDATA[https://example.com/epri-li-benchmark]]></url>
      <publisher><![CDATA[EPRI]]></publisher>
      <published_at><![CDATA[2024-09-01]]></published_at>
      <summary><![CDATA[Performance metrics for recycling modalities across 12 facilities.]]></summary>
    </citation>
  </citations>
  <open_questions>
    <question><![CDATA[How will policy incentives shape regional plant siting post-2025?]]></question>
  </open_questions>
</compendium>

This format is intentionally verbose to support downstream transformation; tool traces from the deep research run are not retained in the compendium output.
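
Because the output is plain XML, downstream tooling can consume it with the standard library alone. A minimal sketch (element names follow the example above) that lists section titles and citation URLs:

import xml.etree.ElementTree as ET

tree = ET.parse("lithium-ion-battery-recycling_20250107_143233.xml")
root = tree.getroot()

# Section titles are CDATA-wrapped, which ElementTree exposes as ordinary text.
for section in root.findall("./sections/section"):
    print(section.get("id"), section.findtext("title"))

# Citations are keyed by id so insights can point to them via <ref> elements.
for citation in root.findall("./citations/citation"):
    print(citation.get("id"), citation.findtext("url"))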


Testing & Quality

  • pdm run pytest — executes the unit suite. Tests stub the OpenAI client, so they run offline.
  • pdm run flake8 src tests — linting.
  • pdm build — produce distributable artifacts.

If pdm fails to write log files in restricted environments, set PDM_HOME to a writable directory (for example, export PDM_HOME=.pdm_home).


Contributing

  1. Fork and clone the repository.
  2. Run pdm install --dev.
  3. Make changes following the style guide and update/add tests.
  4. Run pdm run pytest and pdm run flake8 src tests.
  5. Raise a pull request with:
    • A concise description of the change.
    • Verification commands executed locally.
    • Representative XML samples if the user-facing structure changes.

License

MIT © B.T. Franklin and contributors.
