feat: complete v0.1 — MCP server, HTTP API, audio/video, graph, ingest by ojspace · Pull Request #2 · ojspace/md-anything

ojspace · 2026-03-13T16:16:32Z

Summary

MCP server (src/mcp.ts): Claude/Cursor integration via bunx md-anything-mcp
HTTP REST API (src/server.ts): Hono server with /doctor, /convert, /ingest endpoints
Audio/video providers: OpenAI Whisper-based transcription with graceful fallbacks
Knowledge graph extraction (src/core/graph.ts): local heuristic entity/relation extraction — no cloud APIs
PHO-139 runtime capabilities refactor: binary detection pre-computed once, injected into providers
Ingest improvements: --graph and --index flags fully wired; generates _index.md table
npm publish readiness: files, exports, build, prepublishOnly in package.json
README: motivation story, personal usage guide, source manifests docs
Live regression tests: YouTube + URL (skipped unless LIVE_TESTS=1)

Test plan

bun run test:required — 36 pass, 0 fail
bun run lint — type check clean
Smoke test: bun run src/cli.ts doctor
Smoke test: bun run src/cli.ts convert <local-pdf>
Smoke test: bun run src/cli.ts ingest <folder> --index --graph -o /tmp/out
Live tests: bun run test:live (requires network)
MCP test: add .mcp.json to a project and verify tools appear in Claude

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- HTTP API server for convert/ingest and a stdio MCP server exposing convert, ingest, and doctor tools
- Audio and video transcription support; improved PDF, HTML, EPUB handling and graph extraction from documents
- CLI: dedicated convert subcommand, ingest output directory and index generation
Documentation
- Comprehensive README and local runbook with examples and troubleshooting
Tests / CI
- New live integration tests, fixtures, and GitHub Actions workflow

…t improvements Core additions: - src/mcp.ts: MCP stdio server exposing convert/ingest/doctor tools - src/server.ts: Hono HTTP REST API (GET /doctor, POST /convert, POST /ingest) - src/providers/audio.ts: Whisper-based audio transcription with graceful fallback - src/providers/video.ts: ffmpeg + Whisper video transcription with graceful fallback - src/core/graph.ts: local heuristic entity/relation extraction (no cloud APIs) Refactors: - PHO-139: runtime capabilities injected into providers (image/pdf/mobi) instead of re-detected per call - src/core/route-input.ts: routes audio/video kinds; threads runtime.capabilities to all providers - src/core/ingest.ts: graph extraction, _index.md generation, audio/video support - src/cli.ts: --graph and --index flags wired up; real subcommand routing Tests & config: - Live regression tests for YouTube and URL (skipped unless LIVE_TESTS=1) - .mcp.json for MCP host integration - package.json: files/exports/build/prepublishOnly for npm publish readiness - README: motivation story, personal usage guide, source manifests docs, --graph/--index docs - CLAUDE.md: local runbook for future sessions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-03-13T16:17:09Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8fac319a-ae18-4373-934d-6b0ce1476dac

📥 Commits

Reviewing files that changed from the base of the PR and between d78ec37 and 68cd340.

⛔ Files ignored due to path filters (1)

bun.lock is excluded by !**/*.lock

📒 Files selected for processing (14)

.github/workflows/ci.yml
README.md
package.json
src/core/route-input.ts
src/core/support-levels.ts
src/mcp.ts
src/providers/epub.ts
src/providers/html.ts
src/providers/pdf.ts
src/server.ts
src/utils/html-to-markdown.ts
src/utils/split-sections.ts
tests/integration/convert-pdf.test.ts
tests/unit/support-levels.test.ts

📝 Walkthrough

Walkthrough

Adds MCP stdio server and an HTTP API, new providers (audio, video, enhanced pdf/image/mobi/epub/html), heuristic graph extraction, CLI/ingest changes (convert command, disk output, index), HTML→Markdown utilities, docs, CI, and several integration tests and fixtures.

Changes

Cohort / File(s)	Summary
MCP & HTTP Server `.mcp.json`, `src/mcp.ts`, `src/server.ts`	Adds an MCP stdio server exposing convert/ingest/doctor tools and a Hono-based HTTP API with endpoints for convert, ingest, doctor, and health.
CLI & Ingest `src/cli.ts`, `src/core/ingest.ts`	Adds `convert` subcommand, writes ingest outputs to disk when requested, integrates graph extraction, and generates `_index.md` when indexing is enabled.
Providers — new & extended `src/providers/audio.ts`, `src/providers/video.ts`, `src/providers/pdf.ts`, `src/providers/image.ts`, `src/providers/mobi.ts`, `src/providers/epub.ts`, `src/providers/html.ts`	Adds audio/video transcription modules, upgrades PDF extraction (unpdf + pdftotext fallback), moves OCR/ebook checks to caller flags, converts HTML/EPUB via HTML→Markdown pipeline, and expands metadata returned by providers.
Graph & Content utilities `src/core/graph.ts`, `src/utils/html-to-markdown.ts`, `src/utils/split-sections.ts`	Introduces heuristic graph extraction (entities/relations) and utilities to convert HTML→Markdown and split Markdown into sections.
Routing & Capability Passing `src/core/route-input.ts`	Adds audio/video input kinds and passes runtime capability flags (tesseract, pdftotext, ebook-convert, whisper) into provider converters.
Server/Dev tooling & packaging `package.json`, `.github/workflows/ci.yml`	Expands package metadata, adds `md-anything-mcp` and `md-anything-server` bins, adds scripts (server, mcp, build, test:live), new deps (`@modelcontextprotocol/sdk`, `hono`, `unpdf`), and CI workflow including optional tools setup.
Docs & Runbook `README.md`, `CLAUDE.md`	Adds comprehensive README and local runbook with quick-start, CLI examples, MCP usage, troubleshooting, and smoke checks.
Tests & Fixtures `tests/fixtures/sample.html`, `tests/fixtures/sample.json`, `tests/integration/...`, `tests/unit/support-levels.test.ts`	Adds HTML/JSON fixtures, live URL and YouTube integration tests (conditional on LIVE_TESTS), and updates support-level tests for PDF support change.
Support-level tweak `src/core/support-levels.ts`	Changes PDF support level from "best-effort" to "strong", updating related tests.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client/API Caller
  participant HTTP as HTTP Server (Hono)
  participant MCP as MCP Server (stdio)
  participant Runtime as Runtime Providers
  participant FS as Filesystem
  participant Ext as External Tools (unpdf, ffmpeg, whisper, tesseract, pdftotext)

  rect rgba(200,230,255,0.5)
    Client->>HTTP: POST /convert or POST /ingest
    HTTP->>Runtime: validate & call convert/ingest
    Runtime->>FS: read input (file or download URL)
    Runtime->>Ext: call tool(s) as needed (unpdf/ffmpeg/whisper/ocr)
    Ext-->>Runtime: return transcripts/text/metadata
    Runtime-->>HTTP: return Markdown/summary/json
    HTTP-->>Client: 200 + payload
  end

  rect rgba(230,255,200,0.5)
    Client->>MCP: MCP tool request (convert/ingest/doctor) via stdio
    MCP->>Runtime: dispatch to same convert/ingest/doctor handlers
    Runtime->>Ext: external tool calls (as above)
    Runtime-->>MCP: tool result
    MCP-->>Client: tool response
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

feat: complete v0.1 — MCP server, HTTP API, audio/video, graph, ingest #2 — Appears to contain the same feature set (MCP server, HTTP server, providers, CLI/ingest changes) and is directly related at the code level.

Poem

🐰 Hops in the meadow with a tiny tap-tap,
I stitch transcripts, graphs, and a server-map,
From HTML to Markdown the carrots align,
MCP hums softly — everything's fine,
A rabbit's patch built one small hop at a time. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main feature additions of the v0.1 release, including MCP server, HTTP API, audio/video support, graph extraction, and ingest improvements.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/v0.1-complete

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Tip

CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.

OpenGrep is compatible with Semgrep configurations. Add an opengrep.yml or semgrep.yml configuration file to your project to enable OpenGrep analysis.

coderabbitai

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/providers/mobi.ts (1)

50-53: ⚠️ Potential issue | 🟠 Major

Potential command injection via filePath.

The filePath is interpolated directly into a shell command string. If the path contains shell metacharacters (e.g., backticks, $(), semicolons), arbitrary commands could be executed.

Consider using execFileSync with an arguments array instead of execSync with string interpolation to avoid shell interpretation:

🔒 Proposed fix using execFileSync

-import { execSync } from "node:child_process";
+import { execFileSync } from "node:child_process";

...

-      execSync(`ebook-convert "${filePath}" "${outPath}" 2>/dev/null`, {
-        timeout: 60000,
-      });
+      execFileSync("ebook-convert", [filePath, outPath], {
+        timeout: 60000,
+        stdio: ["ignore", "ignore", "ignore"],
+      });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/providers/mobi.ts` around lines 50 - 53, The call to execSync
interpolates filePath into a shell string which allows command injection;
replace this with execFileSync (or spawnSync) and pass "ebook-convert" as the
executable and [filePath, outPath] as the args array, preserve the timeout
option and ensure shell is false, and remove any shell redirection (2>/dev/null)
so stderr can be captured via options.stdio or try/catch; update the code
locations referencing execSync, filePath and outPath in src/providers/mobi.ts to
use execFileSync/ spawnSync and argument arrays instead of string interpolation.

🧹 Nitpick comments (8)

tests/integration/url-live.test.ts (1)
40-48: Minor: Redundant kind assertion duplicated across tests.

The assertion expect(result.kind).toBe("url") appears in both this test (line 46) and the first test (line 25). Consider consolidating this into a single test or removing the duplicate to avoid redundant network calls during the live test run.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/url-live.test.ts` around lines 40 - 48, The test duplicates
the `expect(result.kind).toBe("url")` assertion; update tests to avoid redundant
network calls by removing the duplicate `expect(result.kind).toBe("url")` from
this "should report source_type as url" test (or consolidate it into the first
test that also calls convertToMarkdown), keeping the check once where
convertToMarkdown is already invoked; locate the assertions around the
convertToMarkdown call that uses TARGET_URL, DEFAULT_CONFIG.options and runtime,
and either delete the redundant `result.kind` assertion here or move any
additional expectations (e.g., usefulness_score) into the single test that
retains the kind assertion (ensuring convertToMarkdown is still awaited and
other expectations remain).
src/server.ts (1)
128-135: Side effect: console.log executes on module import.

The console.log statement at module level will execute whenever this module is imported, even if the server isn't being started (e.g., during testing). Consider moving startup logging inside a conditional or using Bun's serve callback:
♻️ Proposed fix to avoid import side effects
 const port = parseInt(process.env.PORT ?? "3000", 10);

+// Only log when running as main module
+if (import.meta.main) {
+  console.log(`md-anything API running on http://localhost:${port}`);
+}
+
 export default {
   port,
   fetch: app.fetch,
 };
-
-console.log(`md-anything API running on http://localhost:${port}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server.ts` around lines 128 - 135, The console.log at module scope causes
a startup side-effect on import; move that logging into the actual server-start
path so it only runs when the server is started. Specifically, remove the
module-level console.log and add the log inside the code that invokes the server
(e.g., where you call Bun.serve or the function that starts the server),
referencing the same port value (port) and app.fetch context; alternatively wrap
the log in a conditional that checks a runtime flag (e.g., if (require.main ===
module) or an environment variable) so console.log is not executed during
imports/tests.
src/core/route-input.ts (1)
43-46: Audio and video routing looks correct, but note the ffmpeg detection inconsistency.

The capability injection for hasWhisper is consistent with the pattern used for other providers. However, per src/providers/video.ts (context snippet 2), the video provider also performs its own hasFfmpeg() check internally using which ffmpeg. This creates an inconsistency: whisper availability is pre-detected and injected, while ffmpeg availability is checked at conversion time.

Consider adding hasFfmpeg to RuntimeCapabilities in src/core/runtime.ts and passing it to convertVideo() for consistency, especially if you need to mock or test capability combinations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/route-input.ts` around lines 43 - 46, The route handling injects
caps.hasWhisper into convertAudio/convertVideo but ffmpeg availability is still
checked inside src/providers/video.ts; add a hasFfmpeg boolean to the
RuntimeCapabilities type in src/core/runtime.ts, populate it where runtime
capabilities are built (using the same detection used now), and pass
caps.hasFfmpeg into convertVideo from src/core/route-input.ts (alongside
caps.hasWhisper) so convertVideo(received) uses the injected flag (or remove the
internal which ffmpeg check in convertVideo and rely on the injected hasFfmpeg)
to keep capability detection consistent and testable.
README.md (2)
332-346: Add language specifier to fenced code block.

Per markdownlint, fenced code blocks should have a language specified. For the directory structure, use txt or text:
📝 Proposed fix
-```
+```txt
 src/
   cli.ts          # CLI entry point
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 332 - 346, The fenced code block that lists the
project directory (the block starting with "src/" and the directory tree lines)
is missing a language specifier; update that block header from ``` to ```txt (or
```text) so markdownlint passes and the directory listing is treated as plain
text.
98-103: Add language specifier to fenced code block.

Per markdownlint, fenced code blocks should have a language specified. Since this is a plain text file format, use txt or text:
📝 Proposed fix
-```
+```txt
 # sources.txt — one source per line, # for comments
 https://www.youtube.com/watch?v=EqhKw0Oro_k
 https://hono.dev/docs/getting-started/basic
 https://example.com/some-article
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 98 - 103, The fenced code block showing the example
"sources.txt — one source per line, # for comments" in README.md lacks a
language specifier; update that opening fence to include a plain-text language
(e.g., change "" to "txt" or "```text") so the block is marked as txt,
ensuring markdownlint compliance for the block containing the three example
URLs.
</details>

</blockquote></details>
<details>
<summary>src/mcp.ts (1)</summary><blockquote>

`154-158`: **Consider a more specific error type for unknown tools.**

The generic `Error` thrown for unknown tools is functional but could be improved with a more specific error that MCP clients can handle programmatically. This is a minor enhancement.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@src/mcp.ts` around lines 154 - 158, Replace the generic Error thrown for
unknown tools with a dedicated error class so callers can handle it
programmatically: add and export a new UnknownToolError (or ToolNotFoundError)
class in the module and throw new UnknownToolError(name) instead of throw new
Error(`Unknown tool: ${name}`) in the code that currently throws the generic
error; ensure the error includes the tool name/message and preserves the
original stack by extending Error so MCP clients can use instanceof
UnknownToolError to detect this condition.
```

</details>

</blockquote></details>
<details>
<summary>src/core/ingest.ts (2)</summary><blockquote>

`51-64`: **Escape markdown table cells in generated index.**

At Line 60, raw `doc.title` and `doc.source` are inserted into table cells. Values containing `|` or newlines will corrupt table rendering.



<details>
<summary>💡 Suggested fix (escape table cell content)</summary>

```diff
 function buildIndexMarkdown(docs: IngestDoc[], folderPath: string): string {
+  const esc = (v: string) => v.replace(/\|/g, "\\|").replace(/\r?\n/g, " ");
   const lines = [
@@
   for (const doc of docs) {
-    lines.push(`| [${doc.title}](${doc.fileName}) | ${doc.sourceType} | ${doc.source} |`);
+    lines.push(`| [${esc(doc.title)}](${doc.fileName}) | ${esc(doc.sourceType)} | ${esc(doc.source)} |`);
   }
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@src/core/ingest.ts` around lines 51 - 64, buildIndexMarkdown inserts raw
doc.title and doc.source into markdown table cells which breaks rendering if
values contain '|' or newlines; create and use a small helper (e.g.
escapeMarkdownCell) inside buildIndexMarkdown to replace pipe characters with
'\|' and convert newlines to a safe inline representation (e.g. space or '<br>')
and pass escaped values (escapeMarkdownCell(doc.title),
escapeMarkdownCell(doc.source)) when building the table rows so table cells
remain valid.
```

</details>

---

`110-124`: **Extract duplicated graph-mapping logic into a helper.**

The same graph extraction/mapping block is repeated for manifest sources and regular files. Consolidating this reduces drift risk.




Also applies to: 145-159

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @CLAUDE.md:

Around line 58-61: The troubleshooting note describing the CLI's convert
failure is out of date: update the documentation so it reflects the current
behavior where running the CLI with no input returns "Missing input for convert
command." and exit code 1 (as produced by the convert command handler), and
remove or replace the old reference to the parser fallback "Input: convert"
which no longer applies; ensure the note mentions the exact error string
("Missing input for convert command.") and the correct remediation (provide
input or use the appropriate flags) and delete the obsolete "Input: convert"
troubleshooting hint.

In @src/cli.ts:

Around line 70-79: The current write loop uses doc.fileName directly and can
silently overwrite files; update the loop that writes files (the block using
outputDir, result.docs, formatMarkdown, and writeFile) to detect filename
collisions and generate unique names before writing (e.g., maintain a Set of
used names and, when a collision on doc.fileName occurs, append a deterministic
suffix like "-1", "-2", etc., to the basename until unique), then compute
outPath from that unique name and call writeFile; ensure you preserve file
extensions and keep the uniqueness logic scoped to this write routine so
recursive ingests or same basenames do not clobber earlier outputs.

In @src/core/graph.ts:

Around line 77-83: The current code in src/core/graph.ts builds relations by
linking every person to the first three places and concepts (the nested loops
over persons, places.slice(0,3) and concepts.slice(0,3)), which creates spurious
assertions; remove or disable those loops so the function only emits entities
and leaves the relations array empty (or only adds relations when explicit local
co‑occurrence/provenance data is available), and add a clear TODO comment in the
same function (where relations is built) pointing to future logic that will
infer relations from true local context.

In @src/core/ingest.ts:

Around line 167-177: The current block that always pushes an index document
using fileName "_index.md" (inside the options.index branch using
buildIndexMarkdown, docs, basename(folderPath), folderPath) can collide with
existing documents; change the logic to reserve a unique index filename by
computing a candidate (e.g. "_index.md") and, before pushing, loop/check
docs.some(d => d.fileName === candidate) and if it exists append a numeric
suffix (e.g. "_index-1.md", "_index-2.md", ...) until you find a name that is
not present, then use that unique candidate as the fileName when pushing the
index document.

In @src/mcp.ts:

Around line 85-104: In the convert branch, calling resolve(input) before
passing to convertToMarkdown mangles URL strings; change the code to only
resolve filesystem paths and leave URL inputs untouched by detecting URLs (e.g.,
via new URL(...) or checking http(s) prefix) and passing the original input
string to convertToMarkdown when it's a URL; update the convert branch around
the input handling (the input variable used in the convertToMarkdown call) so
URL inputs bypass resolve while non-URL paths continue to be resolved using
resolve(input).

In @src/providers/audio.ts:

Around line 41-64: Update the audio provider to check for ffmpeg availability
like the video provider: call a hasFfmpeg() (or ffmpegAvailable) check before
invoking runWhisper and combine it with hasWhisper (i.e., if (!hasWhisper ||
!ffmpegAvailable)). If either is missing, return the same fallback object but
adjust the user-facing content and metadata to indicate which tool(s) are
missing (mention whisper and/or ffmpeg), set metadata flags like
whisper_available and ffmpeg_available accordingly, and avoid calling runWhisper
when ffmpeg is not available; keep using runWhisper(transcript) only when both
checks pass.

Around line 12-14: The current execSync call that builds a shell string with
the user-controlled filePath should be replaced with child_process.execFile
(async) to avoid shell injection and event-loop blocking: import execFile from
'child_process' (or require it), call execFile('whisper', ['--output_dir',
tmpDir, '--output_format', 'txt', filePath], { timeout: 300000 }, (err, stdout,
stderr) => { handle errors, log stderr, and proceed when complete }), and remove
shell interpolation; ensure you pass filePath and tmpDir as separate arguments
in the array (no shell), handle the callback errors, and keep the same timeout
behavior so the Whisper process is spawned without a shell and does not block
the Node event loop.

In @src/providers/image.ts:

Line 8: The current use of execSync with a shell-interpolated filePath
(variable filePath) in providers/image.ts creates a command-injection and blocks
the event loop; replace this with a non-blocking child_process.execFile (or
spawn) call that passes tesseract and arguments as an array (e.g.,
execFile('tesseract', [filePath, 'stdout'])) so no shell interpolation occurs,
remove the shell redirection 2>/dev/null and instead handle/ignore stderr in
the callback or promise, make the call asynchronous (promisify execFile or use
spawn with stdout collection) and add proper error handling around the call
where result is currently used.

In @src/providers/video.ts:

Around line 22-24: The ffmpeg invocation using execSync currently uses an
invalid stream selector "-map a"; update the command built in the execSync call
(the call where you construct the ffmpeg string for extracting audio) to use an
explicit input and stream specifier such as "-map 0:a:0" and add "-vn" to
disable video processing (e.g., replace "-map a" with "-map 0:a:0 -vn") so
ffmpeg extracts the first audio stream reliably.

Around line 22-28: The synchronous, shell-interpolated calls to execSync using
backticks with filePath/audioPath/tmpDir must be replaced by async execFile
calls that pass the binary and its arguments as an array to avoid shell parsing
and command injection; change the surrounding function to async,
import/promisify execFile (or use child_process/promises.execFile), call
execFile("ffmpeg", ["-i", filePath, "-q:a", "0", "-map", "a", audioPath, "-y"])
and execFile("whisper", [audioPath, "--output_dir", tmpDir, "--output_format",
"txt"]) instead of the backticked commands, handle stdout/stderr via the
returned promise (or options.stdio) instead of using shell redirections like
2>/dev/null, and enforce timeouts via execFile's timeout option or an
AbortController so the event loop is not blocked and inputs from
ConvertRequest.input are never shell-interpolated.

In @src/server.ts:

Around line 60-91: The code is incorrectly calling resolve(input) before
passing to convertToMarkdown which mangles URLs; change the call in the POST
/convert handler to pass the raw input string (the variable input from the
request body) directly to convertToMarkdown (i.e., remove resolve(input) usage)
so convertToMarkdown/detectInputKind can correctly detect URLs vs file paths;
keep the options object ({ ...DEFAULT_CONFIG.options, frontmatter }) and runtime
parameter unchanged.

Outside diff comments:
In @src/providers/mobi.ts:

Around line 50-53: The call to execSync interpolates filePath into a shell
string which allows command injection; replace this with execFileSync (or
spawnSync) and pass "ebook-convert" as the executable and [filePath, outPath] as
the args array, preserve the timeout option and ensure shell is false, and
remove any shell redirection (2>/dev/null) so stderr can be captured via
options.stdio or try/catch; update the code locations referencing execSync,
filePath and outPath in src/providers/mobi.ts to use execFileSync/ spawnSync and
argument arrays instead of string interpolation.

Nitpick comments:
In @README.md:

Around line 332-346: The fenced code block that lists the project directory
(the block starting with "src/" and the directory tree lines) is missing a
language specifier; update that block header from totxt (or ```text) so
markdownlint passes and the directory listing is treated as plain text.

Around line 98-103: The fenced code block showing the example "sources.txt —
one source per line, # for comments" in README.md lacks a language specifier;
update that opening fence to include a plain-text language (e.g., change "" to "txt" or "```text") so the block is marked as txt, ensuring markdownlint
compliance for the block containing the three example URLs.

In @src/core/ingest.ts:

Around line 51-64: buildIndexMarkdown inserts raw doc.title and doc.source
into markdown table cells which breaks rendering if values contain '|' or
newlines; create and use a small helper (e.g. escapeMarkdownCell) inside
buildIndexMarkdown to replace pipe characters with '|' and convert newlines to
a safe inline representation (e.g. space or '
') and pass escaped values
(escapeMarkdownCell(doc.title), escapeMarkdownCell(doc.source)) when building
the table rows so table cells remain valid.

In @src/core/route-input.ts:

Around line 43-46: The route handling injects caps.hasWhisper into
convertAudio/convertVideo but ffmpeg availability is still checked inside
src/providers/video.ts; add a hasFfmpeg boolean to the RuntimeCapabilities type
in src/core/runtime.ts, populate it where runtime capabilities are built (using
the same detection used now), and pass caps.hasFfmpeg into convertVideo from
src/core/route-input.ts (alongside caps.hasWhisper) so convertVideo(received)
uses the injected flag (or remove the internal which ffmpeg check in
convertVideo and rely on the injected hasFfmpeg) to keep capability detection
consistent and testable.

In @src/mcp.ts:

Around line 154-158: Replace the generic Error thrown for unknown tools with a
dedicated error class so callers can handle it programmatically: add and export
a new UnknownToolError (or ToolNotFoundError) class in the module and throw new
UnknownToolError(name) instead of throw new Error(Unknown tool: ${name}) in
the code that currently throws the generic error; ensure the error includes the
tool name/message and preserves the original stack by extending Error so MCP
clients can use instanceof UnknownToolError to detect this condition.

In @src/server.ts:

Around line 128-135: The console.log at module scope causes a startup
side-effect on import; move that logging into the actual server-start path so it
only runs when the server is started. Specifically, remove the module-level
console.log and add the log inside the code that invokes the server (e.g., where
you call Bun.serve or the function that starts the server), referencing the same
port value (port) and app.fetch context; alternatively wrap the log in a
conditional that checks a runtime flag (e.g., if (require.main === module) or an
environment variable) so console.log is not executed during imports/tests.

In @tests/integration/url-live.test.ts:

Around line 40-48: The test duplicates the expect(result.kind).toBe("url")
assertion; update tests to avoid redundant network calls by removing the
duplicate expect(result.kind).toBe("url") from this "should report source_type
as url" test (or consolidate it into the first test that also calls
convertToMarkdown), keeping the check once where convertToMarkdown is already
invoked; locate the assertions around the convertToMarkdown call that uses
TARGET_URL, DEFAULT_CONFIG.options and runtime, and either delete the redundant
result.kind assertion here or move any additional expectations (e.g.,
usefulness_score) into the single test that retains the kind assertion (ensuring
convertToMarkdown is still awaited and other expectations remain).
</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `2b83351e-45c2-487a-a9d9-4034010b8ac0`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 5fae2670a1413b874461b87e397a4ef5b821887f and d78ec37ee8aab1f7610f60032f5cf1131789fef5.

</details>

<details>
<summary>⛔ Files ignored due to path filters (1)</summary>

* `bun.lock` is excluded by `!**/*.lock`

</details>

<details>
<summary>📒 Files selected for processing (19)</summary>

* `.mcp.json`
* `CLAUDE.md`
* `README.md`
* `package.json`
* `src/cli.ts`
* `src/core/graph.ts`
* `src/core/ingest.ts`
* `src/core/route-input.ts`
* `src/mcp.ts`
* `src/providers/audio.ts`
* `src/providers/image.ts`
* `src/providers/mobi.ts`
* `src/providers/pdf.ts`
* `src/providers/video.ts`
* `src/server.ts`
* `tests/fixtures/sample.html`
* `tests/fixtures/sample.json`
* `tests/integration/url-live.test.ts`
* `tests/integration/youtube-live.test.ts`

</details>

</details>

coderabbitai · 2026-03-13T16:31:57Z

CLAUDE.md

+If you run `bun run src/cli.ts convert` with no input, CLI returns:
+
+- `Missing input for convert command.`
+- Exit code `1`


⚠️ Potential issue | 🟡 Minor

The troubleshooting note contradicts the documented convert behavior.

Lines 58-61 already say bun run src/cli.ts convert with no input returns Missing input for convert command., but Line 100 still documents the older Input: convert failure mode. If that parser fallback is gone now, this note will send users toward the wrong fix.

Also applies to: 100-100

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@CLAUDE.md` around lines 58 - 61, The troubleshooting note describing the CLI's convert failure is out of date: update the documentation so it reflects the current behavior where running the CLI with no input returns "Missing input for convert command." and exit code 1 (as produced by the `convert` command handler), and remove or replace the old reference to the parser fallback "Input: convert" which no longer applies; ensure the note mentions the exact error string ("Missing input for convert command.") and the correct remediation (provide input or use the appropriate flags) and delete the obsolete "Input: convert" troubleshooting hint.

coderabbitai · 2026-03-13T16:31:57Z

src/cli.ts

+  if (outputDir) {
+    await mkdir(outputDir, { recursive: true });
+    for (const doc of result.docs) {
+      const md = formatMarkdown(
+        { ...doc, metadata: doc.metadata },
+        { frontmatter: values.frontmatter },
+      );
+      const outPath = join(outputDir, doc.fileName);
+      await writeFile(outPath, md, "utf-8");
+    }


⚠️ Potential issue | 🟠 Major

Prevent silent overwrite when writing ingest outputs.

At Line 77-78, files are written using doc.fileName without collision handling. Duplicate names (common with recursive ingest or same basename across folders) will overwrite earlier outputs and lose data.

💡 Suggested fix (deduplicate output filenames)

-import { resolve } from "node:path"; +import { resolve, parse } from "node:path"; @@ if (outputDir) { await mkdir(outputDir, { recursive: true }); + const nameCounts = new Map<string, number>(); for (const doc of result.docs) { const md = formatMarkdown( { ...doc, metadata: doc.metadata }, { frontmatter: values.frontmatter }, ); - const outPath = join(outputDir, doc.fileName); + const seen = nameCounts.get(doc.fileName) ?? 0; + nameCounts.set(doc.fileName, seen + 1); + const parsed = parse(doc.fileName); + const finalName = + seen === 0 + ? doc.fileName + : `${parsed.name}-${seen + 1}${parsed.ext || ".md"}`; + const outPath = join(outputDir, finalName); await writeFile(outPath, md, "utf-8"); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (outputDir) {

await mkdir(outputDir, { recursive: true });

for (const doc of result.docs) {

const md = formatMarkdown(

{ ...doc, metadata: doc.metadata },

{ frontmatter: values.frontmatter },

);

const outPath = join(outputDir, doc.fileName);

await writeFile(outPath, md, "utf-8");

}

if (outputDir) {

await mkdir(outputDir, { recursive: true });

const nameCounts = new Map<string, number>();

for (const doc of result.docs) {

const md = formatMarkdown(

{ ...doc, metadata: doc.metadata },

{ frontmatter: values.frontmatter },

);

const seen = nameCounts.get(doc.fileName) ?? 0;

nameCounts.set(doc.fileName, seen + 1);

const parsed = parse(doc.fileName);

const finalName =

seen === 0

? doc.fileName

: `${parsed.name}-${seen + 1}${parsed.ext || ".md"}`;

const outPath = join(outputDir, finalName);

await writeFile(outPath, md, "utf-8");

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli.ts` around lines 70 - 79, The current write loop uses doc.fileName directly and can silently overwrite files; update the loop that writes files (the block using outputDir, result.docs, formatMarkdown, and writeFile) to detect filename collisions and generate unique names before writing (e.g., maintain a Set of used names and, when a collision on doc.fileName occurs, append a deterministic suffix like "-1", "-2", etc., to the basename until unique), then compute outPath from that unique name and call writeFile; ensure you preserve file extensions and keep the uniqueness logic scoped to this write routine so recursive ingests or same basenames do not clobber earlier outputs.

coderabbitai · 2026-03-13T16:31:58Z

src/core/graph.ts

+  for (const person of persons) {
+    for (const place of places.slice(0, 3)) {
+      relations.push({ from: person.id, to: place.id, type: "located-in" });
+    }
+    for (const concept of concepts.slice(0, 3)) {
+      relations.push({ from: person.id, to: concept.id, type: "related-to" });
+    }


⚠️ Potential issue | 🟠 Major

Don’t infer relations from document-wide cartesian products.

Lines 77-83 link every detected person to the first three places and concepts anywhere in the document. That turns unrelated mentions into asserted facts, so the graph becomes misleading as soon as a note contains multiple topics or people. If you don’t have local co-occurrence/provenance yet, it’s safer to emit entities only and leave relations empty.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/core/graph.ts` around lines 77 - 83, The current code in src/core/graph.ts builds relations by linking every person to the first three places and concepts (the nested loops over persons, places.slice(0,3) and concepts.slice(0,3)), which creates spurious assertions; remove or disable those loops so the function only emits entities and leaves the relations array empty (or only adds relations when explicit local co‑occurrence/provenance data is available), and add a clear TODO comment in the same function (where relations is built) pointing to future logic that will infer relations from true local context.

coderabbitai · 2026-03-13T16:31:58Z

src/core/ingest.ts

+  if (options.index && docs.length > 0) {
+    const indexMarkdown = buildIndexMarkdown(docs, folderPath);
+    docs.push({
+      fileName: "_index.md",
+      title: `Index: ${basename(folderPath)}`,
+      source: folderPath,
+      sourceType: "text",
+      sections: [{ content: indexMarkdown }],
+      metadata: { extraction: "index", extraction_status: "ok" },
+      graph: { entities: [], relations: [], relatedNotes: [] },
+    });


⚠️ Potential issue | 🟠 Major

Avoid _index.md filename collision with ingested documents.

Line 170 always emits _index.md. If the source set already includes a document that resolves to _index.md, this causes ambiguous outputs and potential overwrite downstream.

💡 Suggested fix (reserve a unique index filename)

if (options.index && docs.length > 0) { const indexMarkdown = buildIndexMarkdown(docs, folderPath); + const taken = new Set(docs.map((d) => d.fileName)); + const indexFileName = taken.has("_index.md") ? "_ingest_index.md" : "_index.md"; docs.push({ - fileName: "_index.md", + fileName: indexFileName, title: `Index: ${basename(folderPath)}`, source: folderPath, sourceType: "text", sections: [{ content: indexMarkdown }], metadata: { extraction: "index", extraction_status: "ok" }, graph: { entities: [], relations: [], relatedNotes: [] }, }); - outputFiles.push("_index.md"); + outputFiles.push(indexFileName); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/core/ingest.ts` around lines 167 - 177, The current block that always pushes an index document using fileName "_index.md" (inside the options.index branch using buildIndexMarkdown, docs, basename(folderPath), folderPath) can collide with existing documents; change the logic to reserve a unique index filename by computing a candidate (e.g. "_index.md") and, before pushing, loop/check docs.some(d => d.fileName === candidate) and if it exists append a numeric suffix (e.g. "_index-1.md", "_index-2.md", ...) until you find a name that is not present, then use that unique candidate as the fileName when pushing the index document.

src/mcp.ts

coderabbitai · 2026-03-13T16:31:58Z

src/providers/audio.ts

+  if (!hasWhisper) {
+    return {
+      title: name,
+      source: filePath,
+      sourceType: "audio",
+      sections: [
+        {
+          heading: "Audio File",
+          content: `*File: ${name}*\n\nAudio transcription requires OpenAI Whisper.\n\nInstall with: \`pip install openai-whisper\``,
+        },
+      ],
+      metadata: {
+        extraction: "unavailable",
+        extraction_status: "weak",
+        file_name: name,
+        file_size_bytes: fileSize,
+        format: ext,
+        whisper_available: false,
+        low_confidence_output: true,
+      },
+    };
+  }
+
+  const transcript = await runWhisper(filePath);


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n src/providers/audio.ts | head -100

Repository: ojspace/md-anything

Length of output: 3590

🏁 Script executed:

fd -t f "video" src/providers/ | head -10

Repository: ojspace/md-anything

Length of output: 85

🏁 Script executed:

rg -A 5 -B 5 "ffmpeg" src/providers/

Repository: ojspace/md-anything

Length of output: 3849

Add ffmpeg availability check alongside hasWhisper in audio provider.

The audio provider only gates on hasWhisper, but OpenAI Whisper requires ffmpeg as an external runtime dependency. When ffmpeg is missing but Whisper is installed, runWhisper silently fails (caught at line 23) and returns null, resulting in a misleading "whisper-empty" message that suggests silent audio rather than a missing dependency. Implement a hasFfmpeg() check mirroring the video provider's approach, then combine the checks: if (!hasWhisper || !ffmpegAvailable) with messaging that specifies which tools are missing.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/providers/audio.ts` around lines 41 - 64, Update the audio provider to check for ffmpeg availability like the video provider: call a hasFfmpeg() (or ffmpegAvailable) check before invoking runWhisper and combine it with hasWhisper (i.e., if (!hasWhisper || !ffmpegAvailable)). If either is missing, return the same fallback object but adjust the user-facing content and metadata to indicate which tool(s) are missing (mention whisper and/or ffmpeg), set metadata flags like whisper_available and ffmpeg_available accordingly, and avoid calling runWhisper when ffmpeg is not available; keep using runWhisper(transcript) only when both checks pass.

coderabbitai · 2026-03-13T16:31:58Z

src/providers/image.ts

  try {
-    const { execSync: exec } = await import("node:child_process");
-    const result = exec(`tesseract "${filePath}" stdout 2>/dev/null`, {
+    const result = execSync(`tesseract "${filePath}" stdout 2>/dev/null`, {


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and surrounding context cat -n src/providers/image.ts | head -30

Repository: ojspace/md-anything

Length of output: 1106

🏁 Script executed:

# Also check if the file exists and what the actual content is around line 8 wc -l src/providers/image.ts

Repository: ojspace/md-anything

Length of output: 88

🏁 Script executed:

# Look for all uses of execSync and execFile in the repository to understand the pattern rg -n "execSync|execFile" --type ts --type js

Repository: ojspace/md-anything

Length of output: 1432

🏁 Script executed:

# Check if there's any input validation or sanitization for filePath before this call rg -n "filePath" src/providers/image.ts -A 3 -B 3

Repository: ojspace/md-anything

Length of output: 1294

Use execFile() or properly escape the filePath argument instead of shell-interpolating.

Line 8 invokes tesseract via execSync() with a shell string. This creates a command injection vulnerability: a filePath containing shell metacharacters (quotes, backticks, semicolons) can escape and execute arbitrary commands. Additionally, execSync() blocks the entire Node.js event loop; with the new HTTP/MCP server entrypoints, a slow OCR execution will freeze the process and deny service to other requests. Use execFile() with argument arrays instead, or properly escape the path using a library like shell-escape.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/providers/image.ts` at line 8, The current use of execSync with a shell-interpolated filePath (variable filePath) in providers/image.ts creates a command-injection and blocks the event loop; replace this with a non-blocking child_process.execFile (or spawn) call that passes tesseract and arguments as an array (e.g., execFile('tesseract', [filePath, 'stdout'])) so no shell interpolation occurs, remove the shell redirection `2>/dev/null` and instead handle/ignore stderr in the callback or promise, make the call asynchronous (promisify execFile or use spawn with stdout collection) and add proper error handling around the call where result is currently used.

coderabbitai · 2026-03-13T16:31:58Z

src/providers/video.ts

+    execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, {
+      timeout: 120000,
+    });


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

ffmpeg -map syntax valid input selector

💡 Result:

-map takes either an input-stream selector or a filtergraph output label:

1) Input selector form

-map [-]input_file_id[:stream_specifier][?][,sync_file_id[:stream_specifier]]

input_file_id = 0-based input number (-i order).

[:stream_specifier] is optional; if omitted, it means “all streams from that input” (e.g. -map 0).

Prefix - makes it a negative map (removes matches from previously-added maps).

Trailing ? makes it optional (ignored if it matches nothing; still fails if the input file index is invalid).

Optional ,sync_file_id:stream_specifier sets the PTS sync reference.
[1]

2) Filtergraph label form

-map [linklabel]

Maps a labeled output from -filter_complex (e.g. -map "[outv]").
[1]

Valid stream_specifier forms (the part after :)

Common ones (can be combined like a:2, v, etc.):

By stream index: 0:1 (stream #1 in input #0)

By type: 0:v, 0:a, 0:s, 0:d, 0:t (video/audio/subtitle/data/attachment)

By type + Nth of that type: 0:a:2 (3rd audio stream)

By metadata tag: 0:m:language:eng

By disposition: 0:disp:forced (and you can combine dispositions with +)

Usable streams only: 0:u
[1][2]

Examples

Map everything from first input: -map 0

Map all video + 3rd audio: -map 0:v -map 0:a:2

Map all except 2nd audio: -map 0 -map -0:a:1

Map audio if present (don’t fail if none): -map 0:a?
[1]

Sources: FFmpeg manual (-map syntax + examples) and FFmpeg “Stream specifiers” documentation. [1][2]

Use an explicit ffmpeg stream selector here.

The -map a syntax is invalid. FFmpeg's -map requires an input file ID (0-based index) followed by an optional stream specifier, e.g., -map 0:a or -map 0:a:0. Using just -map a lacks the required input file ID and will fail. Use -map 0:a:0 to extract the first audio stream, optionally with -vn to disable video processing.

Suggested fix

- execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, { + execSync(`ffmpeg -i "${filePath}" -vn -q:a 0 -map 0:a:0 "${audioPath}" -y 2>/dev/null`, {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, {

timeout: 120000,

});

execSync(`ffmpeg -i "${filePath}" -vn -q:a 0 -map 0:a:0 "${audioPath}" -y 2>/dev/null`, {

timeout: 120000,

});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/providers/video.ts` around lines 22 - 24, The ffmpeg invocation using execSync currently uses an invalid stream selector "-map a"; update the command built in the execSync call (the call where you construct the ffmpeg string for extracting audio) to use an explicit input and stream specifier such as "-map 0:a:0" and add "-vn" to disable video processing (e.g., replace "-map a" with "-map 0:a:0 -vn") so ffmpeg extracts the first audio stream reliably.

coderabbitai · 2026-03-13T16:31:58Z

src/providers/video.ts

+    execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, {
+      timeout: 120000,
+    });
+
+    execSync(`whisper "${audioPath}" --output_dir "${tmpDir}" --output_format txt 2>/dev/null`, {
+      timeout: 300000,
+    });


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n src/providers/video.ts | head -50

Repository: ojspace/md-anything

Length of output: 2010

🏁 Script executed:

rg -n "convertVideo" --type ts --type js

Repository: ojspace/md-anything

Length of output: 348

🏁 Script executed:

cat -n src/core/route-input.ts | head -60

Repository: ojspace/md-anything

Length of output: 2405

🏁 Script executed:

rg -n "routeInput" --type ts --type js -B 2

Repository: ojspace/md-anything

Length of output: 1233

🏁 Script executed:

cat -n src/core/convert.ts

Repository: ojspace/md-anything

Length of output: 1394

🏁 Script executed:

rg -n "normalize|sanitize|validate" src/providers/video.ts

Repository: ojspace/md-anything

Length of output: 45

🌐 Web query:

Node.js execFile execSync event loop blocking behavior

💡 Result:

child_process.execFile() is asynchronous: it starts the child process and returns immediately, so it does not block the Node.js event loop while the child runs. The completion callback is queued when the process exits. [1]

child_process.execSync() (and execFileSync(), spawnSync()) are synchronous: they block the Node.js event loop until the child process exits—no other JS, timers, or I/O callbacks can run in that Node process during the wait. [1][2]

Even though a child runs in a separate OS process, using the *Sync APIs still blocks your Node process because you explicitly wait synchronously for the result. [3]

Rule of thumb: use execFile in servers/throughput-sensitive code; reserve execSync/execFileSync for short CLI scripts or startup-time one-offs. [1]

Sources:
[1] Node.js docs: child_process (sync methods block; spawn() async) (nodejs.org)
[2] Node.js docs (older but explicit): sync methods “WILL block the Node.js event loop” (nodejs.org)
[3] Explanation of *Sync waiting behavior (stackoverflow.com)

Citations:

1: https://nodejs.org/api/child_process.html?utm_source=openai

2: https://nodejs.org/download/release/v4.4.2/docs/api/child_process.html?utm_source=openai

3: https://stackoverflow.com/questions/60498538/how-does-nodejss-child-process-interact-with-the-event-loop?utm_source=openai

🌐 Web query:

Node.js execFile shell injection arguments array protection

💡 Result:

execFile(file, args) is mostly “shell-injection safe” specifically because it does not spawn a shell by default (shell: false) and runs the executable directly. With no shell involved, metacharacters like ;, &&, |, > are not interpreted as operators—they’re just bytes inside a single argument. [1]

It does not “sanitize” your inputs; it just avoids shell parsing. Untrusted args can still be dangerous as argument/option injection (e.g., turning on dangerous flags in the target program), so you still need allowlists / strict validation for what arguments are permitted. [2]

If you enable shell: true, you lose that protection. Node explicitly warns not to pass unsanitized input when the shell option is enabled because shell metacharacters can trigger arbitrary command execution. [3]

Windows gotcha: you generally can’t run .bat / .cmd with execFile; Windows batch files require cmd.exe (i.e., effectively a shell path), which reintroduces shell-injection risk if any part is attacker-controlled. [4]

Sources
[1] Node.js child_process.execFile() docs (no shell by default; spawns file directly) (nodejs.org)
[2] Stack Overflow discussion showing metacharacters in the args array are not executed as shell operators (illustrating “no shell parsing,” not “sanitization”) (stackoverflow.com)
[3] Node.js docs warning about shell option + unsanitized input (nodejs.org)
[4] Node.js docs on Windows .bat/.cmd requiring cmd.exe/shell-style invocation (nodejs.org)

Citations:

1: https://nodejs.org/api/child_process.html

2: https://stackoverflow.com/questions/31833800/does-the-nodejss-child-process-args-array-sanitize-arguments

3: https://nodejs.org/api/child_process.html

4: https://nodejs.org/api/child_process.html

Replace these shell-string execSync() calls with execFile() and async/await to prevent command injection.

Lines 22-28 interpolate filePath directly into backtick-quoted shell strings passed to execSync(). Since filePath originates from user input (ConvertRequest.input), a user can inject arbitrary shell commands—for example, passing "video.mp4; rm -rf /; echo" would execute the malicious portion. Additionally, execSync() blocks the Node.js event loop, stalling all other requests in your Hono/MCP server while transcription runs. Use execFile() with argument arrays and switch to async/await to avoid shell parsing entirely and restore event-loop responsiveness.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/providers/video.ts` around lines 22 - 28, The synchronous, shell-interpolated calls to execSync using backticks with filePath/audioPath/tmpDir must be replaced by async execFile calls that pass the binary and its arguments as an array to avoid shell parsing and command injection; change the surrounding function to async, import/promisify execFile (or use child_process/promises.execFile), call execFile("ffmpeg", ["-i", filePath, "-q:a", "0", "-map", "a", audioPath, "-y"]) and execFile("whisper", [audioPath, "--output_dir", tmpDir, "--output_format", "txt"]) instead of the backticked commands, handle stdout/stderr via the returned promise (or options.stdio) instead of using shell redirections like 2>/dev/null, and enforce timeouts via execFile's timeout option or an AbortController so the event loop is not blocked and inputs from ConvertRequest.input are never shell-interpolated.

coderabbitai · 2026-03-13T16:31:58Z

src/server.ts

+app.post("/convert", async (c) => {
+  let body: { input?: string; frontmatter?: boolean };
+  try {
+    body = await c.req.json();
+  } catch {
+    return c.json({ error: "Invalid JSON body" }, 400);
+  }
+
+  const { input, frontmatter = true } = body;
+
+  if (!input) {
+    return c.json({ error: "Missing required field: input" }, 400);
+  }
+
+  try {
+    const result = await convertToMarkdown(
+      {
+        input: resolve(input),
+        options: { ...DEFAULT_CONFIG.options, frontmatter },
+      },
+      runtime,
+    );
+
+    return c.json({
+      input: result.input,
+      kind: result.kind,
+      markdown: result.markdown,
+      metadata: result.metadata,
+    });
+  } catch (err) {
+    return c.json({ error: String(err) }, 500);
+  }


⚠️ Potential issue | 🔴 Critical

Bug: resolve(input) mangles URL inputs.

The resolve(input) call on line 77 will convert URL strings like https://example.com into filesystem paths (e.g., /current/dir/https:/example.com), breaking URL-based conversions.

The core convertToMarkdown function expects the raw input string, as it internally detects whether the input is a URL or file path via detectInputKind.

🐛 Proposed fix

try { + // Only resolve file paths, not URLs + const resolvedInput = input.startsWith("http://") || input.startsWith("https://") + ? input + : resolve(input); const result = await convertToMarkdown( { - input: resolve(input), + input: resolvedInput, options: { ...DEFAULT_CONFIG.options, frontmatter }, }, runtime, );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/server.ts` around lines 60 - 91, The code is incorrectly calling resolve(input) before passing to convertToMarkdown which mangles URLs; change the call in the POST /convert handler to pass the raw input string (the variable input from the request body) directly to convertToMarkdown (i.e., remove resolve(input) usage) so convertToMarkdown/detectInputKind can correctly detect URLs vs file paths; keep the options object ({ ...DEFAULT_CONFIG.options, frontmatter }) and runtime parameter unchanged.

- Add htmlToMarkdown() utility: tables, headings, lists, code, links - Add splitIntoSections() utility: heading-boundary section splitting - Refactor HTML provider to use shared conversion utilities - Refactor EPUB provider to use htmlToMarkdown per chapter - Rewrite PDF provider: unpdf zero-dep primary, pdftotext fallback, detectStructure() heuristic for raw text (ALL-CAPS, numbered sections) - Upgrade PDF support level: best-effort → strong - Fix MCP URL handling: resolve() now skips http/https inputs - Fix stale doctor output in mcp.ts, server.ts, and README - Add CI workflow (ubuntu + macos, required tests + full suite) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ojspace and others added 2 commits March 13, 2026 20:15

chore: remove CI workflow from staging (needs workflow scope token)

d78ec37

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

ojspace merged commit 5c2d824 into main Mar 18, 2026
3 of 4 checks passed

Conversation

ojspace commented Mar 13, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

1) Input selector form

2) Filtergraph label form

Valid stream_specifier forms (the part after :)

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ojspace commented Mar 13, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 13, 2026 •

edited

Loading

Valid `stream_specifier` forms (the part after `:`)