StructaMed CLI converts unstructured clinical notes into deterministic, standardized formats (SOAP, H&P, Discharge Summary) for documentation quality, interoperability, and data consistency. It uses rules and configuration only (no ML summarization) and ships with synthetic demo data for exams and coursework.
Clinical notes arrive in wildly inconsistent shapes. As a student working with synthetic notes, I kept reformatting the same data to evaluate documentation quality and export to JSON/CSV. The manual cleanup was slow and error-prone, so I built a deterministic CLI that standardizes text with transparent rules and configurable mappings.
- Deterministic parsing into SOAP, H&P, and Discharge Summary structures
- Markdown, JSON, and CSV exports (wide or long)
- Bundle-aware parsing for multi-note files with warnings
- Interactive review mode to confirm sections, rename headings, and control heuristics
- Batch processing with per-file failure tracking and summary report
- Configurable heading aliases and section ordering via TOML
cargo build --release
./target/release/clinote --help# Convert a note
clinote parse --input notes/sample.txt --format soap --out output.json --out-format json
# Validate a note (strict)
clinote validate notes/sample.txt --template soap --strict
# Preview sections and line counts
clinote preview notes/sample.txt --template soapclinote parse --input notes/sample.txt --format soap \
--out output.json --out-format json --bundle autoclinote batch --input-dir notes --glob "*.txt" \
--format hp --out-dir outputs --out-format csvclinote sample --out-dir samples --n 6 --bundles 2clinote validate --config clinote.toml- Strict mode (
--strict) treats missing required sections as errors. - Non-strict mode treats missing required sections as warnings.
- Exit codes:
0when no errors,2when errors exist.
Example:
clinote validate notes/sample.txt --template hp --strict --json
clinote preview notes/sample.txt --template hpRun a sweep over many notes to validate quality at scale.
clinote selftest --fixtures tests/fixtures --template soap --strict
clinote selftest --fixtures "tests/fixtures/*.txt" --template hp --json
clinote selftest --fixtures tests/fixtures --out selftest_outputsBefore (input)
CC - chest pain
HPI: started after exercise
PMH: HTN, asthma
Assessment: likely MSK strain
Plan: NSAIDs, follow-up
After (JSON)
{
"format": "hp",
"sections": [
{"name": "Chief Complaint", "content": "chest pain"},
{"name": "HPI", "content": "started after exercise"},
{"name": "PMH", "content": "HTN, asthma"},
{"name": "Assessment", "content": "likely MSK strain"},
{"name": "Plan", "content": "NSAIDs, follow-up"}
]
}Bundled files are tricky because delimiters can be ambiguous and formats can be mixed. Clinote mitigates this by:
- Splitting only on explicit delimiters or repeated timestamps in auto mode.
- Warning when bundle mode is forced but no clear split is found.
- Allowing interactive review to remove or rename sections.
- Capturing warnings in JSON output and batch reports.
Use clinote sample to generate synthetic notes plus gold JSON outputs in a folder. Bundle files are also generated if --bundles is provided.
- Build the static website in
docs/(already provided). - In GitHub repo settings, enable Pages from branch
mainand folder/docs. - Confirm your Pages URL works:
https://<your-username>.github.io/<repo-name>/.
Trigger a GitHub Release by tagging and pushing:
git tag v0.1.0
git push --tagsArtifacts will appear under GitHub Releases. The GitHub Pages binary remains separate for the exam requirement.
Place compiled binaries in docs/downloads/ so the download page can serve them. Required file:
docs/downloads/clinote-aarch64-unknown-linux-musl.tar.gz
Optional:
docs/downloads/clinote-x86_64-unknown-linux-musl.tar.gz
- Set the GitHub repo About website to:
https://<your-username>.github.io/<repo-name>/ - Description suggestion: "Deterministic CLI to structure synthetic clinical notes into SOAP/H&P/Discharge formats (Rust)."
- Topics to add:
rust,cli,health-informatics,clinical-notes,data-quality,serialization,deterministic,education
Target users: informatics students, clinical documentation QA teams, research labs, health data engineers.
Pricing tiers:
- Free/Student: full CLI with deterministic parsing.
- Pro Lab: batch templates + config review + priority support.
- Enterprise: on-prem pilots, compliance guidance, and training.
Acquisition channels: GitHub, university courses, informatics conferences, lab partnerships.
StructaMed CLI is an educational tool using synthetic data only. It is not a medical device and does not guarantee legal or regulatory compliance.
MIT
Reproduce the working CLI in any environment:
Preview the website locally:
python3 -m http.server --directory docs 8000
# open http://localhost:8000Note: demo_outputs/ is generated locally by demo and is intentionally ignored by git.
cargo test --all
cargo run -- --help
cargo run -- demo
# strict fixtures must be clean (exit 0)
cargo run -- selftest --fixtures tests/fixtures/clean/soap --template soap --strict
cargo run -- selftest --fixtures tests/fixtures/clean/hp --template hp --strict
cargo run -- selftest --fixtures tests/fixtures/clean/discharge --template discharge --strictIf you see bash: EOF: command not found, your heredoc was not closed.
Correct pattern:
cat > file.txt <<'EOF'
hello
EOF