feat: complete v0.1 — MCP server, HTTP API, audio/video, graph, ingest#2
feat: complete v0.1 — MCP server, HTTP API, audio/video, graph, ingest#2
Conversation
…t improvements Core additions: - src/mcp.ts: MCP stdio server exposing convert/ingest/doctor tools - src/server.ts: Hono HTTP REST API (GET /doctor, POST /convert, POST /ingest) - src/providers/audio.ts: Whisper-based audio transcription with graceful fallback - src/providers/video.ts: ffmpeg + Whisper video transcription with graceful fallback - src/core/graph.ts: local heuristic entity/relation extraction (no cloud APIs) Refactors: - PHO-139: runtime capabilities injected into providers (image/pdf/mobi) instead of re-detected per call - src/core/route-input.ts: routes audio/video kinds; threads runtime.capabilities to all providers - src/core/ingest.ts: graph extraction, _index.md generation, audio/video support - src/cli.ts: --graph and --index flags wired up; real subcommand routing Tests & config: - Live regression tests for YouTube and URL (skipped unless LIVE_TESTS=1) - .mcp.json for MCP host integration - package.json: files/exports/build/prepublishOnly for npm publish readiness - README: motivation story, personal usage guide, source manifests docs, --graph/--index docs - CLAUDE.md: local runbook for future sessions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (14)
📝 WalkthroughWalkthroughAdds MCP stdio server and an HTTP API, new providers (audio, video, enhanced pdf/image/mobi/epub/html), heuristic graph extraction, CLI/ingest changes (convert command, disk output, index), HTML→Markdown utilities, docs, CI, and several integration tests and fixtures. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client/API Caller
participant HTTP as HTTP Server (Hono)
participant MCP as MCP Server (stdio)
participant Runtime as Runtime Providers
participant FS as Filesystem
participant Ext as External Tools (unpdf, ffmpeg, whisper, tesseract, pdftotext)
rect rgba(200,230,255,0.5)
Client->>HTTP: POST /convert or POST /ingest
HTTP->>Runtime: validate & call convert/ingest
Runtime->>FS: read input (file or download URL)
Runtime->>Ext: call tool(s) as needed (unpdf/ffmpeg/whisper/ocr)
Ext-->>Runtime: return transcripts/text/metadata
Runtime-->>HTTP: return Markdown/summary/json
HTTP-->>Client: 200 + payload
end
rect rgba(230,255,200,0.5)
Client->>MCP: MCP tool request (convert/ingest/doctor) via stdio
MCP->>Runtime: dispatch to same convert/ingest/doctor handlers
Runtime->>Ext: external tool calls (as above)
Runtime-->>MCP: tool result
MCP-->>Client: tool response
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.OpenGrep is compatible with Semgrep configurations. Add an |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/providers/mobi.ts (1)
50-53:⚠️ Potential issue | 🟠 MajorPotential command injection via
filePath.The
filePathis interpolated directly into a shell command string. If the path contains shell metacharacters (e.g., backticks,$(), semicolons), arbitrary commands could be executed.Consider using
execFileSyncwith an arguments array instead ofexecSyncwith string interpolation to avoid shell interpretation:🔒 Proposed fix using execFileSync
-import { execSync } from "node:child_process"; +import { execFileSync } from "node:child_process"; ... - execSync(`ebook-convert "${filePath}" "${outPath}" 2>/dev/null`, { - timeout: 60000, - }); + execFileSync("ebook-convert", [filePath, outPath], { + timeout: 60000, + stdio: ["ignore", "ignore", "ignore"], + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/mobi.ts` around lines 50 - 53, The call to execSync interpolates filePath into a shell string which allows command injection; replace this with execFileSync (or spawnSync) and pass "ebook-convert" as the executable and [filePath, outPath] as the args array, preserve the timeout option and ensure shell is false, and remove any shell redirection (2>/dev/null) so stderr can be captured via options.stdio or try/catch; update the code locations referencing execSync, filePath and outPath in src/providers/mobi.ts to use execFileSync/ spawnSync and argument arrays instead of string interpolation.
🧹 Nitpick comments (8)
tests/integration/url-live.test.ts (1)
40-48: Minor: Redundantkindassertion duplicated across tests.The assertion
expect(result.kind).toBe("url")appears in both this test (line 46) and the first test (line 25). Consider consolidating this into a single test or removing the duplicate to avoid redundant network calls during the live test run.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/url-live.test.ts` around lines 40 - 48, The test duplicates the `expect(result.kind).toBe("url")` assertion; update tests to avoid redundant network calls by removing the duplicate `expect(result.kind).toBe("url")` from this "should report source_type as url" test (or consolidate it into the first test that also calls convertToMarkdown), keeping the check once where convertToMarkdown is already invoked; locate the assertions around the convertToMarkdown call that uses TARGET_URL, DEFAULT_CONFIG.options and runtime, and either delete the redundant `result.kind` assertion here or move any additional expectations (e.g., usefulness_score) into the single test that retains the kind assertion (ensuring convertToMarkdown is still awaited and other expectations remain).src/server.ts (1)
128-135: Side effect:console.logexecutes on module import.The
console.logstatement at module level will execute whenever this module is imported, even if the server isn't being started (e.g., during testing). Consider moving startup logging inside a conditional or using Bun's serve callback:♻️ Proposed fix to avoid import side effects
const port = parseInt(process.env.PORT ?? "3000", 10); +// Only log when running as main module +if (import.meta.main) { + console.log(`md-anything API running on http://localhost:${port}`); +} + export default { port, fetch: app.fetch, }; - -console.log(`md-anything API running on http://localhost:${port}`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server.ts` around lines 128 - 135, The console.log at module scope causes a startup side-effect on import; move that logging into the actual server-start path so it only runs when the server is started. Specifically, remove the module-level console.log and add the log inside the code that invokes the server (e.g., where you call Bun.serve or the function that starts the server), referencing the same port value (port) and app.fetch context; alternatively wrap the log in a conditional that checks a runtime flag (e.g., if (require.main === module) or an environment variable) so console.log is not executed during imports/tests.src/core/route-input.ts (1)
43-46: Audio and video routing looks correct, but note the ffmpeg detection inconsistency.The capability injection for
hasWhisperis consistent with the pattern used for other providers. However, persrc/providers/video.ts(context snippet 2), the video provider also performs its ownhasFfmpeg()check internally usingwhich ffmpeg. This creates an inconsistency: whisper availability is pre-detected and injected, while ffmpeg availability is checked at conversion time.Consider adding
hasFfmpegtoRuntimeCapabilitiesinsrc/core/runtime.tsand passing it toconvertVideo()for consistency, especially if you need to mock or test capability combinations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/route-input.ts` around lines 43 - 46, The route handling injects caps.hasWhisper into convertAudio/convertVideo but ffmpeg availability is still checked inside src/providers/video.ts; add a hasFfmpeg boolean to the RuntimeCapabilities type in src/core/runtime.ts, populate it where runtime capabilities are built (using the same detection used now), and pass caps.hasFfmpeg into convertVideo from src/core/route-input.ts (alongside caps.hasWhisper) so convertVideo(received) uses the injected flag (or remove the internal which ffmpeg check in convertVideo and rely on the injected hasFfmpeg) to keep capability detection consistent and testable.README.md (2)
332-346: Add language specifier to fenced code block.Per markdownlint, fenced code blocks should have a language specified. For the directory structure, use
txtortext:📝 Proposed fix
-``` +```txt src/ cli.ts # CLI entry point🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 332 - 346, The fenced code block that lists the project directory (the block starting with "src/" and the directory tree lines) is missing a language specifier; update that block header from ``` to ```txt (or ```text) so markdownlint passes and the directory listing is treated as plain text.
98-103: Add language specifier to fenced code block.Per markdownlint, fenced code blocks should have a language specified. Since this is a plain text file format, use
txtortext:📝 Proposed fix
-``` +```txt # sources.txt — one source per line, # for comments https://www.youtube.com/watch?v=EqhKw0Oro_k https://hono.dev/docs/getting-started/basic https://example.com/some-article</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 98 - 103, The fenced code block showing the example
"sources.txt — one source per line, # for comments" in README.md lacks a
language specifier; update that opening fence to include a plain-text language
(e.g., change "" to "txt" or "```text") so the block is marked as txt,
ensuring markdownlint compliance for the block containing the three example
URLs.</details> </blockquote></details> <details> <summary>src/mcp.ts (1)</summary><blockquote> `154-158`: **Consider a more specific error type for unknown tools.** The generic `Error` thrown for unknown tools is functional but could be improved with a more specific error that MCP clients can handle programmatically. This is a minor enhancement. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@src/mcp.ts` around lines 154 - 158, Replace the generic Error thrown for unknown tools with a dedicated error class so callers can handle it programmatically: add and export a new UnknownToolError (or ToolNotFoundError) class in the module and throw new UnknownToolError(name) instead of throw new Error(`Unknown tool: ${name}`) in the code that currently throws the generic error; ensure the error includes the tool name/message and preserves the original stack by extending Error so MCP clients can use instanceof UnknownToolError to detect this condition. ``` </details> </blockquote></details> <details> <summary>src/core/ingest.ts (2)</summary><blockquote> `51-64`: **Escape markdown table cells in generated index.** At Line 60, raw `doc.title` and `doc.source` are inserted into table cells. Values containing `|` or newlines will corrupt table rendering. <details> <summary>💡 Suggested fix (escape table cell content)</summary> ```diff function buildIndexMarkdown(docs: IngestDoc[], folderPath: string): string { + const esc = (v: string) => v.replace(/\|/g, "\\|").replace(/\r?\n/g, " "); const lines = [ @@ for (const doc of docs) { - lines.push(`| [${doc.title}](${doc.fileName}) | ${doc.sourceType} | ${doc.source} |`); + lines.push(`| [${esc(doc.title)}](${doc.fileName}) | ${esc(doc.sourceType)} | ${esc(doc.source)} |`); } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@src/core/ingest.ts` around lines 51 - 64, buildIndexMarkdown inserts raw doc.title and doc.source into markdown table cells which breaks rendering if values contain '|' or newlines; create and use a small helper (e.g. escapeMarkdownCell) inside buildIndexMarkdown to replace pipe characters with '\|' and convert newlines to a safe inline representation (e.g. space or '<br>') and pass escaped values (escapeMarkdownCell(doc.title), escapeMarkdownCell(doc.source)) when building the table rows so table cells remain valid. ``` </details> --- `110-124`: **Extract duplicated graph-mapping logic into a helper.** The same graph extraction/mapping block is repeated for manifest sources and regular files. Consolidating this reduces drift risk. Also applies to: 145-159 </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@CLAUDE.md:
- Around line 58-61: The troubleshooting note describing the CLI's convert
failure is out of date: update the documentation so it reflects the current
behavior where running the CLI with no input returns "Missing input for convert
command." and exit code 1 (as produced by theconvertcommand handler), and
remove or replace the old reference to the parser fallback "Input: convert"
which no longer applies; ensure the note mentions the exact error string
("Missing input for convert command.") and the correct remediation (provide
input or use the appropriate flags) and delete the obsolete "Input: convert"
troubleshooting hint.In
@src/cli.ts:
- Around line 70-79: The current write loop uses doc.fileName directly and can
silently overwrite files; update the loop that writes files (the block using
outputDir, result.docs, formatMarkdown, and writeFile) to detect filename
collisions and generate unique names before writing (e.g., maintain a Set of
used names and, when a collision on doc.fileName occurs, append a deterministic
suffix like "-1", "-2", etc., to the basename until unique), then compute
outPath from that unique name and call writeFile; ensure you preserve file
extensions and keep the uniqueness logic scoped to this write routine so
recursive ingests or same basenames do not clobber earlier outputs.In
@src/core/graph.ts:
- Around line 77-83: The current code in src/core/graph.ts builds relations by
linking every person to the first three places and concepts (the nested loops
over persons, places.slice(0,3) and concepts.slice(0,3)), which creates spurious
assertions; remove or disable those loops so the function only emits entities
and leaves the relations array empty (or only adds relations when explicit local
co‑occurrence/provenance data is available), and add a clear TODO comment in the
same function (where relations is built) pointing to future logic that will
infer relations from true local context.In
@src/core/ingest.ts:
- Around line 167-177: The current block that always pushes an index document
using fileName "_index.md" (inside the options.index branch using
buildIndexMarkdown, docs, basename(folderPath), folderPath) can collide with
existing documents; change the logic to reserve a unique index filename by
computing a candidate (e.g. "_index.md") and, before pushing, loop/check
docs.some(d => d.fileName === candidate) and if it exists append a numeric
suffix (e.g. "_index-1.md", "_index-2.md", ...) until you find a name that is
not present, then use that unique candidate as the fileName when pushing the
index document.In
@src/mcp.ts:
- Around line 85-104: In the convert branch, calling resolve(input) before
passing to convertToMarkdown mangles URL strings; change the code to only
resolve filesystem paths and leave URL inputs untouched by detecting URLs (e.g.,
via new URL(...) or checking http(s) prefix) and passing the original input
string to convertToMarkdown when it's a URL; update the convert branch around
the input handling (the input variable used in the convertToMarkdown call) so
URL inputs bypass resolve while non-URL paths continue to be resolved using
resolve(input).In
@src/providers/audio.ts:
- Around line 41-64: Update the audio provider to check for ffmpeg availability
like the video provider: call a hasFfmpeg() (or ffmpegAvailable) check before
invoking runWhisper and combine it with hasWhisper (i.e., if (!hasWhisper ||
!ffmpegAvailable)). If either is missing, return the same fallback object but
adjust the user-facing content and metadata to indicate which tool(s) are
missing (mention whisper and/or ffmpeg), set metadata flags like
whisper_available and ffmpeg_available accordingly, and avoid calling runWhisper
when ffmpeg is not available; keep using runWhisper(transcript) only when both
checks pass.- Around line 12-14: The current execSync call that builds a shell string with
the user-controlled filePath should be replaced with child_process.execFile
(async) to avoid shell injection and event-loop blocking: import execFile from
'child_process' (or require it), call execFile('whisper', ['--output_dir',
tmpDir, '--output_format', 'txt', filePath], { timeout: 300000 }, (err, stdout,
stderr) => { handle errors, log stderr, and proceed when complete }), and remove
shell interpolation; ensure you pass filePath and tmpDir as separate arguments
in the array (no shell), handle the callback errors, and keep the same timeout
behavior so the Whisper process is spawned without a shell and does not block
the Node event loop.In
@src/providers/image.ts:
- Line 8: The current use of execSync with a shell-interpolated filePath
(variable filePath) in providers/image.ts creates a command-injection and blocks
the event loop; replace this with a non-blocking child_process.execFile (or
spawn) call that passes tesseract and arguments as an array (e.g.,
execFile('tesseract', [filePath, 'stdout'])) so no shell interpolation occurs,
remove the shell redirection2>/dev/nulland instead handle/ignore stderr in
the callback or promise, make the call asynchronous (promisify execFile or use
spawn with stdout collection) and add proper error handling around the call
where result is currently used.In
@src/providers/video.ts:
- Around line 22-24: The ffmpeg invocation using execSync currently uses an
invalid stream selector "-map a"; update the command built in the execSync call
(the call where you construct the ffmpeg string for extracting audio) to use an
explicit input and stream specifier such as "-map 0:a:0" and add "-vn" to
disable video processing (e.g., replace "-map a" with "-map 0:a:0 -vn") so
ffmpeg extracts the first audio stream reliably.- Around line 22-28: The synchronous, shell-interpolated calls to execSync using
backticks with filePath/audioPath/tmpDir must be replaced by async execFile
calls that pass the binary and its arguments as an array to avoid shell parsing
and command injection; change the surrounding function to async,
import/promisify execFile (or use child_process/promises.execFile), call
execFile("ffmpeg", ["-i", filePath, "-q:a", "0", "-map", "a", audioPath, "-y"])
and execFile("whisper", [audioPath, "--output_dir", tmpDir, "--output_format",
"txt"]) instead of the backticked commands, handle stdout/stderr via the
returned promise (or options.stdio) instead of using shell redirections like
2>/dev/null, and enforce timeouts via execFile's timeout option or an
AbortController so the event loop is not blocked and inputs from
ConvertRequest.input are never shell-interpolated.In
@src/server.ts:
- Around line 60-91: The code is incorrectly calling resolve(input) before
passing to convertToMarkdown which mangles URLs; change the call in the POST
/convert handler to pass the raw input string (the variable input from the
request body) directly to convertToMarkdown (i.e., remove resolve(input) usage)
so convertToMarkdown/detectInputKind can correctly detect URLs vs file paths;
keep the options object ({ ...DEFAULT_CONFIG.options, frontmatter }) and runtime
parameter unchanged.
Outside diff comments:
In@src/providers/mobi.ts:
- Around line 50-53: The call to execSync interpolates filePath into a shell
string which allows command injection; replace this with execFileSync (or
spawnSync) and pass "ebook-convert" as the executable and [filePath, outPath] as
the args array, preserve the timeout option and ensure shell is false, and
remove any shell redirection (2>/dev/null) so stderr can be captured via
options.stdio or try/catch; update the code locations referencing execSync,
filePath and outPath in src/providers/mobi.ts to use execFileSync/ spawnSync and
argument arrays instead of string interpolation.
Nitpick comments:
In@README.md:
- Around line 332-346: The fenced code block that lists the project directory
(the block starting with "src/" and the directory tree lines) is missing a
language specifier; update that block header fromtotxt (or ```text) so
markdownlint passes and the directory listing is treated as plain text.- Around line 98-103: The fenced code block showing the example "sources.txt —
one source per line, # for comments" in README.md lacks a language specifier;
update that opening fence to include a plain-text language (e.g., change "" to "txt" or "```text") so the block is marked as txt, ensuring markdownlint
compliance for the block containing the three example URLs.In
@src/core/ingest.ts:
- Around line 51-64: buildIndexMarkdown inserts raw doc.title and doc.source
into markdown table cells which breaks rendering if values contain '|' or
newlines; create and use a small helper (e.g. escapeMarkdownCell) inside
buildIndexMarkdown to replace pipe characters with '|' and convert newlines to
a safe inline representation (e.g. space or '
') and pass escaped values
(escapeMarkdownCell(doc.title), escapeMarkdownCell(doc.source)) when building
the table rows so table cells remain valid.In
@src/core/route-input.ts:
- Around line 43-46: The route handling injects caps.hasWhisper into
convertAudio/convertVideo but ffmpeg availability is still checked inside
src/providers/video.ts; add a hasFfmpeg boolean to the RuntimeCapabilities type
in src/core/runtime.ts, populate it where runtime capabilities are built (using
the same detection used now), and pass caps.hasFfmpeg into convertVideo from
src/core/route-input.ts (alongside caps.hasWhisper) so convertVideo(received)
uses the injected flag (or remove the internal which ffmpeg check in
convertVideo and rely on the injected hasFfmpeg) to keep capability detection
consistent and testable.In
@src/mcp.ts:
- Around line 154-158: Replace the generic Error thrown for unknown tools with a
dedicated error class so callers can handle it programmatically: add and export
a new UnknownToolError (or ToolNotFoundError) class in the module and throw new
UnknownToolError(name) instead of throw new Error(Unknown tool: ${name}) in
the code that currently throws the generic error; ensure the error includes the
tool name/message and preserves the original stack by extending Error so MCP
clients can use instanceof UnknownToolError to detect this condition.In
@src/server.ts:
- Around line 128-135: The console.log at module scope causes a startup
side-effect on import; move that logging into the actual server-start path so it
only runs when the server is started. Specifically, remove the module-level
console.log and add the log inside the code that invokes the server (e.g., where
you call Bun.serve or the function that starts the server), referencing the same
port value (port) and app.fetch context; alternatively wrap the log in a
conditional that checks a runtime flag (e.g., if (require.main === module) or an
environment variable) so console.log is not executed during imports/tests.In
@tests/integration/url-live.test.ts:
- Around line 40-48: The test duplicates the
expect(result.kind).toBe("url")
assertion; update tests to avoid redundant network calls by removing the
duplicateexpect(result.kind).toBe("url")from this "should report source_type
as url" test (or consolidate it into the first test that also calls
convertToMarkdown), keeping the check once where convertToMarkdown is already
invoked; locate the assertions around the convertToMarkdown call that uses
TARGET_URL, DEFAULT_CONFIG.options and runtime, and either delete the redundant
result.kindassertion here or move any additional expectations (e.g.,
usefulness_score) into the single test that retains the kind assertion (ensuring
convertToMarkdown is still awaited and other expectations remain).</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: defaults **Review profile**: CHILL **Plan**: Pro **Run ID**: `2b83351e-45c2-487a-a9d9-4034010b8ac0` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 5fae2670a1413b874461b87e397a4ef5b821887f and d78ec37ee8aab1f7610f60032f5cf1131789fef5. </details> <details> <summary>⛔ Files ignored due to path filters (1)</summary> * `bun.lock` is excluded by `!**/*.lock` </details> <details> <summary>📒 Files selected for processing (19)</summary> * `.mcp.json` * `CLAUDE.md` * `README.md` * `package.json` * `src/cli.ts` * `src/core/graph.ts` * `src/core/ingest.ts` * `src/core/route-input.ts` * `src/mcp.ts` * `src/providers/audio.ts` * `src/providers/image.ts` * `src/providers/mobi.ts` * `src/providers/pdf.ts` * `src/providers/video.ts` * `src/server.ts` * `tests/fixtures/sample.html` * `tests/fixtures/sample.json` * `tests/integration/url-live.test.ts` * `tests/integration/youtube-live.test.ts` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
| If you run `bun run src/cli.ts convert` with no input, CLI returns: | ||
|
|
||
| - `Missing input for convert command.` | ||
| - Exit code `1` |
There was a problem hiding this comment.
The troubleshooting note contradicts the documented convert behavior.
Lines 58-61 already say bun run src/cli.ts convert with no input returns Missing input for convert command., but Line 100 still documents the older Input: convert failure mode. If that parser fallback is gone now, this note will send users toward the wrong fix.
Also applies to: 100-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CLAUDE.md` around lines 58 - 61, The troubleshooting note describing the
CLI's convert failure is out of date: update the documentation so it reflects
the current behavior where running the CLI with no input returns "Missing input
for convert command." and exit code 1 (as produced by the `convert` command
handler), and remove or replace the old reference to the parser fallback "Input:
convert" which no longer applies; ensure the note mentions the exact error
string ("Missing input for convert command.") and the correct remediation
(provide input or use the appropriate flags) and delete the obsolete "Input:
convert" troubleshooting hint.
| if (outputDir) { | ||
| await mkdir(outputDir, { recursive: true }); | ||
| for (const doc of result.docs) { | ||
| const md = formatMarkdown( | ||
| { ...doc, metadata: doc.metadata }, | ||
| { frontmatter: values.frontmatter }, | ||
| ); | ||
| const outPath = join(outputDir, doc.fileName); | ||
| await writeFile(outPath, md, "utf-8"); | ||
| } |
There was a problem hiding this comment.
Prevent silent overwrite when writing ingest outputs.
At Line 77-78, files are written using doc.fileName without collision handling. Duplicate names (common with recursive ingest or same basename across folders) will overwrite earlier outputs and lose data.
💡 Suggested fix (deduplicate output filenames)
-import { resolve } from "node:path";
+import { resolve, parse } from "node:path";
@@
if (outputDir) {
await mkdir(outputDir, { recursive: true });
+ const nameCounts = new Map<string, number>();
for (const doc of result.docs) {
const md = formatMarkdown(
{ ...doc, metadata: doc.metadata },
{ frontmatter: values.frontmatter },
);
- const outPath = join(outputDir, doc.fileName);
+ const seen = nameCounts.get(doc.fileName) ?? 0;
+ nameCounts.set(doc.fileName, seen + 1);
+ const parsed = parse(doc.fileName);
+ const finalName =
+ seen === 0
+ ? doc.fileName
+ : `${parsed.name}-${seen + 1}${parsed.ext || ".md"}`;
+ const outPath = join(outputDir, finalName);
await writeFile(outPath, md, "utf-8");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (outputDir) { | |
| await mkdir(outputDir, { recursive: true }); | |
| for (const doc of result.docs) { | |
| const md = formatMarkdown( | |
| { ...doc, metadata: doc.metadata }, | |
| { frontmatter: values.frontmatter }, | |
| ); | |
| const outPath = join(outputDir, doc.fileName); | |
| await writeFile(outPath, md, "utf-8"); | |
| } | |
| if (outputDir) { | |
| await mkdir(outputDir, { recursive: true }); | |
| const nameCounts = new Map<string, number>(); | |
| for (const doc of result.docs) { | |
| const md = formatMarkdown( | |
| { ...doc, metadata: doc.metadata }, | |
| { frontmatter: values.frontmatter }, | |
| ); | |
| const seen = nameCounts.get(doc.fileName) ?? 0; | |
| nameCounts.set(doc.fileName, seen + 1); | |
| const parsed = parse(doc.fileName); | |
| const finalName = | |
| seen === 0 | |
| ? doc.fileName | |
| : `${parsed.name}-${seen + 1}${parsed.ext || ".md"}`; | |
| const outPath = join(outputDir, finalName); | |
| await writeFile(outPath, md, "utf-8"); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/cli.ts` around lines 70 - 79, The current write loop uses doc.fileName
directly and can silently overwrite files; update the loop that writes files
(the block using outputDir, result.docs, formatMarkdown, and writeFile) to
detect filename collisions and generate unique names before writing (e.g.,
maintain a Set of used names and, when a collision on doc.fileName occurs,
append a deterministic suffix like "-1", "-2", etc., to the basename until
unique), then compute outPath from that unique name and call writeFile; ensure
you preserve file extensions and keep the uniqueness logic scoped to this write
routine so recursive ingests or same basenames do not clobber earlier outputs.
| for (const person of persons) { | ||
| for (const place of places.slice(0, 3)) { | ||
| relations.push({ from: person.id, to: place.id, type: "located-in" }); | ||
| } | ||
| for (const concept of concepts.slice(0, 3)) { | ||
| relations.push({ from: person.id, to: concept.id, type: "related-to" }); | ||
| } |
There was a problem hiding this comment.
Don’t infer relations from document-wide cartesian products.
Lines 77-83 link every detected person to the first three places and concepts anywhere in the document. That turns unrelated mentions into asserted facts, so the graph becomes misleading as soon as a note contains multiple topics or people. If you don’t have local co-occurrence/provenance yet, it’s safer to emit entities only and leave relations empty.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/graph.ts` around lines 77 - 83, The current code in
src/core/graph.ts builds relations by linking every person to the first three
places and concepts (the nested loops over persons, places.slice(0,3) and
concepts.slice(0,3)), which creates spurious assertions; remove or disable those
loops so the function only emits entities and leaves the relations array empty
(or only adds relations when explicit local co‑occurrence/provenance data is
available), and add a clear TODO comment in the same function (where relations
is built) pointing to future logic that will infer relations from true local
context.
| if (options.index && docs.length > 0) { | ||
| const indexMarkdown = buildIndexMarkdown(docs, folderPath); | ||
| docs.push({ | ||
| fileName: "_index.md", | ||
| title: `Index: ${basename(folderPath)}`, | ||
| source: folderPath, | ||
| sourceType: "text", | ||
| sections: [{ content: indexMarkdown }], | ||
| metadata: { extraction: "index", extraction_status: "ok" }, | ||
| graph: { entities: [], relations: [], relatedNotes: [] }, | ||
| }); |
There was a problem hiding this comment.
Avoid _index.md filename collision with ingested documents.
Line 170 always emits _index.md. If the source set already includes a document that resolves to _index.md, this causes ambiguous outputs and potential overwrite downstream.
💡 Suggested fix (reserve a unique index filename)
if (options.index && docs.length > 0) {
const indexMarkdown = buildIndexMarkdown(docs, folderPath);
+ const taken = new Set(docs.map((d) => d.fileName));
+ const indexFileName = taken.has("_index.md") ? "_ingest_index.md" : "_index.md";
docs.push({
- fileName: "_index.md",
+ fileName: indexFileName,
title: `Index: ${basename(folderPath)}`,
source: folderPath,
sourceType: "text",
sections: [{ content: indexMarkdown }],
metadata: { extraction: "index", extraction_status: "ok" },
graph: { entities: [], relations: [], relatedNotes: [] },
});
- outputFiles.push("_index.md");
+ outputFiles.push(indexFileName);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/ingest.ts` around lines 167 - 177, The current block that always
pushes an index document using fileName "_index.md" (inside the options.index
branch using buildIndexMarkdown, docs, basename(folderPath), folderPath) can
collide with existing documents; change the logic to reserve a unique index
filename by computing a candidate (e.g. "_index.md") and, before pushing,
loop/check docs.some(d => d.fileName === candidate) and if it exists append a
numeric suffix (e.g. "_index-1.md", "_index-2.md", ...) until you find a name
that is not present, then use that unique candidate as the fileName when pushing
the index document.
| if (!hasWhisper) { | ||
| return { | ||
| title: name, | ||
| source: filePath, | ||
| sourceType: "audio", | ||
| sections: [ | ||
| { | ||
| heading: "Audio File", | ||
| content: `*File: ${name}*\n\nAudio transcription requires OpenAI Whisper.\n\nInstall with: \`pip install openai-whisper\``, | ||
| }, | ||
| ], | ||
| metadata: { | ||
| extraction: "unavailable", | ||
| extraction_status: "weak", | ||
| file_name: name, | ||
| file_size_bytes: fileSize, | ||
| format: ext, | ||
| whisper_available: false, | ||
| low_confidence_output: true, | ||
| }, | ||
| }; | ||
| } | ||
|
|
||
| const transcript = await runWhisper(filePath); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/providers/audio.ts | head -100Repository: ojspace/md-anything
Length of output: 3590
🏁 Script executed:
fd -t f "video" src/providers/ | head -10Repository: ojspace/md-anything
Length of output: 85
🏁 Script executed:
rg -A 5 -B 5 "ffmpeg" src/providers/Repository: ojspace/md-anything
Length of output: 3849
Add ffmpeg availability check alongside hasWhisper in audio provider.
The audio provider only gates on hasWhisper, but OpenAI Whisper requires ffmpeg as an external runtime dependency. When ffmpeg is missing but Whisper is installed, runWhisper silently fails (caught at line 23) and returns null, resulting in a misleading "whisper-empty" message that suggests silent audio rather than a missing dependency. Implement a hasFfmpeg() check mirroring the video provider's approach, then combine the checks: if (!hasWhisper || !ffmpegAvailable) with messaging that specifies which tools are missing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/providers/audio.ts` around lines 41 - 64, Update the audio provider to
check for ffmpeg availability like the video provider: call a hasFfmpeg() (or
ffmpegAvailable) check before invoking runWhisper and combine it with hasWhisper
(i.e., if (!hasWhisper || !ffmpegAvailable)). If either is missing, return the
same fallback object but adjust the user-facing content and metadata to indicate
which tool(s) are missing (mention whisper and/or ffmpeg), set metadata flags
like whisper_available and ffmpeg_available accordingly, and avoid calling
runWhisper when ffmpeg is not available; keep using runWhisper(transcript) only
when both checks pass.
| try { | ||
| const { execSync: exec } = await import("node:child_process"); | ||
| const result = exec(`tesseract "${filePath}" stdout 2>/dev/null`, { | ||
| const result = execSync(`tesseract "${filePath}" stdout 2>/dev/null`, { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file and surrounding context
cat -n src/providers/image.ts | head -30Repository: ojspace/md-anything
Length of output: 1106
🏁 Script executed:
# Also check if the file exists and what the actual content is around line 8
wc -l src/providers/image.tsRepository: ojspace/md-anything
Length of output: 88
🏁 Script executed:
# Look for all uses of execSync and execFile in the repository to understand the pattern
rg -n "execSync|execFile" --type ts --type jsRepository: ojspace/md-anything
Length of output: 1432
🏁 Script executed:
# Check if there's any input validation or sanitization for filePath before this call
rg -n "filePath" src/providers/image.ts -A 3 -B 3Repository: ojspace/md-anything
Length of output: 1294
Use execFile() or properly escape the filePath argument instead of shell-interpolating.
Line 8 invokes tesseract via execSync() with a shell string. This creates a command injection vulnerability: a filePath containing shell metacharacters (quotes, backticks, semicolons) can escape and execute arbitrary commands. Additionally, execSync() blocks the entire Node.js event loop; with the new HTTP/MCP server entrypoints, a slow OCR execution will freeze the process and deny service to other requests. Use execFile() with argument arrays instead, or properly escape the path using a library like shell-escape.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/providers/image.ts` at line 8, The current use of execSync with a
shell-interpolated filePath (variable filePath) in providers/image.ts creates a
command-injection and blocks the event loop; replace this with a non-blocking
child_process.execFile (or spawn) call that passes tesseract and arguments as an
array (e.g., execFile('tesseract', [filePath, 'stdout'])) so no shell
interpolation occurs, remove the shell redirection `2>/dev/null` and instead
handle/ignore stderr in the callback or promise, make the call asynchronous
(promisify execFile or use spawn with stdout collection) and add proper error
handling around the call where result is currently used.
| execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, { | ||
| timeout: 120000, | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
ffmpeg -map syntax valid input selector
💡 Result:
-map takes either an input-stream selector or a filtergraph output label:
1) Input selector form
-map [-]input_file_id[:stream_specifier][?][,sync_file_id[:stream_specifier]]
input_file_id= 0-based input number (-iorder).[:stream_specifier]is optional; if omitted, it means “all streams from that input” (e.g.-map 0).- Prefix
-makes it a negative map (removes matches from previously-added maps). - Trailing
?makes it optional (ignored if it matches nothing; still fails if the input file index is invalid). - Optional
,sync_file_id:stream_specifiersets the PTS sync reference.
[1]
2) Filtergraph label form
-map [linklabel]
Maps a labeled output from -filter_complex (e.g. -map "[outv]").
[1]
Valid stream_specifier forms (the part after :)
Common ones (can be combined like a:2, v, etc.):
- By stream index:
0:1(stream#1in input#0) - By type:
0:v,0:a,0:s,0:d,0:t(video/audio/subtitle/data/attachment) - By type + Nth of that type:
0:a:2(3rd audio stream) - By metadata tag:
0:m:language:eng - By disposition:
0:disp:forced(and you can combine dispositions with+) - Usable streams only:
0:u
[1][2]
Examples
- Map everything from first input:
-map 0 - Map all video + 3rd audio:
-map 0:v -map 0:a:2 - Map all except 2nd audio:
-map 0 -map -0:a:1 - Map audio if present (don’t fail if none):
-map 0:a?
[1]
Sources: FFmpeg manual (-map syntax + examples) and FFmpeg “Stream specifiers” documentation. [1][2]
Use an explicit ffmpeg stream selector here.
The -map a syntax is invalid. FFmpeg's -map requires an input file ID (0-based index) followed by an optional stream specifier, e.g., -map 0:a or -map 0:a:0. Using just -map a lacks the required input file ID and will fail. Use -map 0:a:0 to extract the first audio stream, optionally with -vn to disable video processing.
Suggested fix
- execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, {
+ execSync(`ffmpeg -i "${filePath}" -vn -q:a 0 -map 0:a:0 "${audioPath}" -y 2>/dev/null`, {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, { | |
| timeout: 120000, | |
| }); | |
| execSync(`ffmpeg -i "${filePath}" -vn -q:a 0 -map 0:a:0 "${audioPath}" -y 2>/dev/null`, { | |
| timeout: 120000, | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/providers/video.ts` around lines 22 - 24, The ffmpeg invocation using
execSync currently uses an invalid stream selector "-map a"; update the command
built in the execSync call (the call where you construct the ffmpeg string for
extracting audio) to use an explicit input and stream specifier such as "-map
0:a:0" and add "-vn" to disable video processing (e.g., replace "-map a" with
"-map 0:a:0 -vn") so ffmpeg extracts the first audio stream reliably.
| execSync(`ffmpeg -i "${filePath}" -q:a 0 -map a "${audioPath}" -y 2>/dev/null`, { | ||
| timeout: 120000, | ||
| }); | ||
|
|
||
| execSync(`whisper "${audioPath}" --output_dir "${tmpDir}" --output_format txt 2>/dev/null`, { | ||
| timeout: 300000, | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/providers/video.ts | head -50Repository: ojspace/md-anything
Length of output: 2010
🏁 Script executed:
rg -n "convertVideo" --type ts --type jsRepository: ojspace/md-anything
Length of output: 348
🏁 Script executed:
cat -n src/core/route-input.ts | head -60Repository: ojspace/md-anything
Length of output: 2405
🏁 Script executed:
rg -n "routeInput" --type ts --type js -B 2Repository: ojspace/md-anything
Length of output: 1233
🏁 Script executed:
cat -n src/core/convert.tsRepository: ojspace/md-anything
Length of output: 1394
🏁 Script executed:
rg -n "normalize|sanitize|validate" src/providers/video.tsRepository: ojspace/md-anything
Length of output: 45
🌐 Web query:
Node.js execFile execSync event loop blocking behavior
💡 Result:
child_process.execFile()is asynchronous: it starts the child process and returns immediately, so it does not block the Node.js event loop while the child runs. The completion callback is queued when the process exits. [1]child_process.execSync()(andexecFileSync(),spawnSync()) are synchronous: they block the Node.js event loop until the child process exits—no other JS, timers, or I/O callbacks can run in that Node process during the wait. [1][2]- Even though a child runs in a separate OS process, using the
*SyncAPIs still blocks your Node process because you explicitly wait synchronously for the result. [3]
Rule of thumb: use execFile in servers/throughput-sensitive code; reserve execSync/execFileSync for short CLI scripts or startup-time one-offs. [1]
Sources:
[1] Node.js docs: child_process (sync methods block; spawn() async) (nodejs.org)
[2] Node.js docs (older but explicit): sync methods “WILL block the Node.js event loop” (nodejs.org)
[3] Explanation of *Sync waiting behavior (stackoverflow.com)
Citations:
- 1: https://nodejs.org/api/child_process.html?utm_source=openai
- 2: https://nodejs.org/download/release/v4.4.2/docs/api/child_process.html?utm_source=openai
- 3: https://stackoverflow.com/questions/60498538/how-does-nodejss-child-process-interact-with-the-event-loop?utm_source=openai
🌐 Web query:
Node.js execFile shell injection arguments array protection
💡 Result:
execFile(file, args)is mostly “shell-injection safe” specifically because it does not spawn a shell by default (shell: false) and runs the executable directly. With no shell involved, metacharacters like;,&&,|,>are not interpreted as operators—they’re just bytes inside a single argument. [1]- It does not “sanitize” your inputs; it just avoids shell parsing. Untrusted args can still be dangerous as argument/option injection (e.g., turning on dangerous flags in the target program), so you still need allowlists / strict validation for what arguments are permitted. [2]
- If you enable
shell: true, you lose that protection. Node explicitly warns not to pass unsanitized input when theshelloption is enabled because shell metacharacters can trigger arbitrary command execution. [3] - Windows gotcha: you generally can’t run
.bat/.cmdwithexecFile; Windows batch files requirecmd.exe(i.e., effectively a shell path), which reintroduces shell-injection risk if any part is attacker-controlled. [4]
Sources
[1] Node.js child_process.execFile() docs (no shell by default; spawns file directly) (nodejs.org)
[2] Stack Overflow discussion showing metacharacters in the args array are not executed as shell operators (illustrating “no shell parsing,” not “sanitization”) (stackoverflow.com)
[3] Node.js docs warning about shell option + unsanitized input (nodejs.org)
[4] Node.js docs on Windows .bat/.cmd requiring cmd.exe/shell-style invocation (nodejs.org)
Citations:
- 1: https://nodejs.org/api/child_process.html
- 2: https://stackoverflow.com/questions/31833800/does-the-nodejss-child-process-args-array-sanitize-arguments
- 3: https://nodejs.org/api/child_process.html
- 4: https://nodejs.org/api/child_process.html
Replace these shell-string execSync() calls with execFile() and async/await to prevent command injection.
Lines 22-28 interpolate filePath directly into backtick-quoted shell strings passed to execSync(). Since filePath originates from user input (ConvertRequest.input), a user can inject arbitrary shell commands—for example, passing "video.mp4; rm -rf /; echo" would execute the malicious portion. Additionally, execSync() blocks the Node.js event loop, stalling all other requests in your Hono/MCP server while transcription runs. Use execFile() with argument arrays and switch to async/await to avoid shell parsing entirely and restore event-loop responsiveness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/providers/video.ts` around lines 22 - 28, The synchronous,
shell-interpolated calls to execSync using backticks with
filePath/audioPath/tmpDir must be replaced by async execFile calls that pass the
binary and its arguments as an array to avoid shell parsing and command
injection; change the surrounding function to async, import/promisify execFile
(or use child_process/promises.execFile), call execFile("ffmpeg", ["-i",
filePath, "-q:a", "0", "-map", "a", audioPath, "-y"]) and execFile("whisper",
[audioPath, "--output_dir", tmpDir, "--output_format", "txt"]) instead of the
backticked commands, handle stdout/stderr via the returned promise (or
options.stdio) instead of using shell redirections like 2>/dev/null, and enforce
timeouts via execFile's timeout option or an AbortController so the event loop
is not blocked and inputs from ConvertRequest.input are never
shell-interpolated.
| app.post("/convert", async (c) => { | ||
| let body: { input?: string; frontmatter?: boolean }; | ||
| try { | ||
| body = await c.req.json(); | ||
| } catch { | ||
| return c.json({ error: "Invalid JSON body" }, 400); | ||
| } | ||
|
|
||
| const { input, frontmatter = true } = body; | ||
|
|
||
| if (!input) { | ||
| return c.json({ error: "Missing required field: input" }, 400); | ||
| } | ||
|
|
||
| try { | ||
| const result = await convertToMarkdown( | ||
| { | ||
| input: resolve(input), | ||
| options: { ...DEFAULT_CONFIG.options, frontmatter }, | ||
| }, | ||
| runtime, | ||
| ); | ||
|
|
||
| return c.json({ | ||
| input: result.input, | ||
| kind: result.kind, | ||
| markdown: result.markdown, | ||
| metadata: result.metadata, | ||
| }); | ||
| } catch (err) { | ||
| return c.json({ error: String(err) }, 500); | ||
| } |
There was a problem hiding this comment.
Bug: resolve(input) mangles URL inputs.
The resolve(input) call on line 77 will convert URL strings like https://example.com into filesystem paths (e.g., /current/dir/https:/example.com), breaking URL-based conversions.
The core convertToMarkdown function expects the raw input string, as it internally detects whether the input is a URL or file path via detectInputKind.
🐛 Proposed fix
try {
+ // Only resolve file paths, not URLs
+ const resolvedInput = input.startsWith("http://") || input.startsWith("https://")
+ ? input
+ : resolve(input);
const result = await convertToMarkdown(
{
- input: resolve(input),
+ input: resolvedInput,
options: { ...DEFAULT_CONFIG.options, frontmatter },
},
runtime,
);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/server.ts` around lines 60 - 91, The code is incorrectly calling
resolve(input) before passing to convertToMarkdown which mangles URLs; change
the call in the POST /convert handler to pass the raw input string (the variable
input from the request body) directly to convertToMarkdown (i.e., remove
resolve(input) usage) so convertToMarkdown/detectInputKind can correctly detect
URLs vs file paths; keep the options object ({ ...DEFAULT_CONFIG.options,
frontmatter }) and runtime parameter unchanged.
- Add htmlToMarkdown() utility: tables, headings, lists, code, links - Add splitIntoSections() utility: heading-boundary section splitting - Refactor HTML provider to use shared conversion utilities - Refactor EPUB provider to use htmlToMarkdown per chapter - Rewrite PDF provider: unpdf zero-dep primary, pdftotext fallback, detectStructure() heuristic for raw text (ALL-CAPS, numbered sections) - Upgrade PDF support level: best-effort → strong - Fix MCP URL handling: resolve() now skips http/https inputs - Fix stale doctor output in mcp.ts, server.ts, and README - Add CI workflow (ubuntu + macos, required tests + full suite) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
src/mcp.ts): Claude/Cursor integration viabunx md-anything-mcpsrc/server.ts): Hono server with/doctor,/convert,/ingestendpointssrc/core/graph.ts): local heuristic entity/relation extraction — no cloud APIs--graphand--indexflags fully wired; generates_index.mdtablefiles,exports,build,prepublishOnlyinpackage.jsonLIVE_TESTS=1)Test plan
bun run test:required— 36 pass, 0 failbun run lint— type check cleanbun run src/cli.ts doctorbun run src/cli.ts convert <local-pdf>bun run src/cli.ts ingest <folder> --index --graph -o /tmp/outbun run test:live(requires network).mcp.jsonto a project and verify tools appear in Claude🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests / CI