A Python script, inspired by repo2file, that generates an LLM prompt text file from your code. Here's what it does:
- Recursively dump a directory tree.
- Collect and embed file contents into a single text file.
- Automatically skip `.git/`, `.vscode/`, and common media/document formats.
- Honor your `.gitignore` rules.
- Offer a single `--exclude` flag for any extra patterns.
- Optionally customize the output directory.

`python codepromptor.py <root> [options]`

- `<root>` (required): Path to the directory you want to scan.

| Flag | Alias | Description |
|---|---|---|
| `--exclude` | `-e` | Comma-separated glob patterns to exclude (e.g. `readme.md,.gitignore,.css`). |
| `--output-dir` | `-o` | Directory where the resulting text file will be saved. Defaults to a `prompts/` folder next to the script. |
| `--search` | `-s` | String to search for inside file contents. If provided, ONLY files containing at least one match are included in the File Contents section. |
| `--ignore-case` | none | Make the search case-insensitive (default is case-sensitive). |
| `--whole-word` | none | Match only whole words using Python's word boundary (`\b`) logic instead of plain substrings. |
| `--include` | none | Comma-separated glob patterns to include (e.g. `*.ts,*.tsx`). Only files matching at least one pattern are processed. |
| `--exclude-files` | none | Comma-separated relative file paths to exclude (e.g. `frontend/app/src/main.tsx`). |
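
As a rough illustration of how the options in the table could be declared, here is a minimal sketch that assumes the script uses `argparse`. The helper names `split_csv` and `build_parser` are hypothetical, and the real script may parse its arguments differently.

```python
import argparse


def split_csv(value: str) -> list[str]:
    """Split a comma-separated option value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags documented in the table above.
    parser = argparse.ArgumentParser(prog="codepromptor.py")
    parser.add_argument("root", help="directory to scan")
    parser.add_argument("-e", "--exclude", type=split_csv, default=[],
                        help="comma-separated glob patterns to exclude")
    parser.add_argument("-o", "--output-dir", default=None,
                        help="directory for the resulting text file")
    parser.add_argument("-s", "--search", default=None,
                        help="only include files containing this string")
    parser.add_argument("--ignore-case", action="store_true")
    parser.add_argument("--whole-word", action="store_true")
    parser.add_argument("--include", type=split_csv, default=[])
    parser.add_argument("--exclude-files", type=split_csv, default=[])
    return parser


args = build_parser().parse_args(
    ["./my-app", "-e", "readme.md,.gitignore", "--include", "*.ts,*.tsx"]
)
print(args.exclude, args.include, args.whole_word)
```

Splitting the comma-separated values at parse time keeps the rest of such a script working with plain lists of patterns.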

- Always skips the following:
  - `.git/` directory
  - `.vscode/` directory
  - File types: `png`, `jpg`, `jpeg`, `gif`, `bmp`, `svg`, `mp4`, `mp3`, `wav`, `avi`, `mov`, `mkv`, `webp`, `pdf`, `ppt`, `pptx`, `doc`, `docx`, `xls`, `xlsx`, `csv`
- Honors all patterns in `.gitignore` (if present).
- Prints each file as it's added.
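
The built-in skip rules above could be approximated as follows. This is a simplified sketch, not the script's actual code: `is_skipped` is a hypothetical helper, and the `fnmatch` check only stands in for extra exclude patterns. Real `.gitignore` semantics (negation, anchoring, directory-only rules) are richer than plain globbing.

```python
import fnmatch
from pathlib import PurePosixPath

SKIP_DIRS = {".git", ".vscode"}
SKIP_EXTENSIONS = {
    "png", "jpg", "jpeg", "gif", "bmp", "svg", "mp4", "mp3", "wav", "avi",
    "mov", "mkv", "webp", "pdf", "ppt", "pptx", "doc", "docx", "xls", "xlsx",
    "csv",
}


def is_skipped(rel_path: str, extra_patterns: tuple[str, ...] = ()) -> bool:
    """Return True if a POSIX-style relative path should be left out."""
    path = PurePosixPath(rel_path)
    # Anything inside .git/ or .vscode/ is skipped outright.
    if SKIP_DIRS.intersection(path.parts):
        return True
    # Media and document formats are skipped by extension (case-insensitive).
    if path.suffix.lower().lstrip(".") in SKIP_EXTENSIONS:
        return True
    # Simplified glob check against the file name and the full relative path.
    return any(
        fnmatch.fnmatch(path.name, pattern) or fnmatch.fnmatch(rel_path, pattern)
        for pattern in extra_patterns
    )


print(is_skipped(".git/config"))
print(is_skipped("src/index.js"))
print(is_skipped("src/app.test.js", ("*.test.js",)))
```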

- Basic scan (uses defaults, output to `prompts/<folder>.txt`):

  `python codepromptor.py /path/to/my-project`

- Exclude specific files (`README.md`, `.gitignore`, `package-lock.json`):

  `python codepromptor.py ./my-app -e readme.md,.gitignore,package-lock.json`

- Custom output directory:

  `python codepromptor.py ./my-app -o /tmp/dumps`

- Combine exclude + custom output:

  `python codepromptor.py ../repo -e '*.test.js',node_modules/ -o /var/log/dir-dumps`

- Search for a case-sensitive substring (`UserService`):

  `python codepromptor.py ./my-app -s "UserService"`

- Case-insensitive substring search:

  `python codepromptor.py ./my-app -s "userservice" --ignore-case`

- Case-insensitive whole-word search (word boundary match):

  `python codepromptor.py ./my-app -s "UserService" --ignore-case --whole-word`


Example summary output for a search run:

    Done! Output written to .../prompts/my-app.txt
    Search summary: 'UserService' (whole word, ignore case) -> 37 matches in 12 files.

If you omit `-s`/`--search`, the behavior is identical to the original version: all text files that aren't excluded by patterns are included.

- Include only TypeScript sources:

  `python codepromptor.py ./my-app --include "*.ts,*.tsx"`

- Exclude a specific file by relative path:

  `python codepromptor.py ./my-app --exclude-files "frontend/app/src/main.tsx"`

- Combine include + search (search only within TypeScript files):

  `python codepromptor.py ./my-app --include "*.ts,*.tsx" -s "UserService"`

- Combine exclude-files + search (ignore one noisy file):

  `python codepromptor.py ./my-app -s "UserService" --exclude-files "frontend/app/src/main.tsx"`

When you provide `-s`/`--search`:

- The directory tree still lists all entries (except those excluded by patterns) so you retain full structural context.
- The File Contents section ONLY includes files where the search string matched at least once.
- Matching is performed after exclusion filtering (`--exclude`, `.gitignore`, built-in skips).
- If `--include` is supplied, only files matching at least one include pattern are considered for searching or dumping.
- If `--exclude-files` is supplied, those specific relative paths are removed after pattern filtering but before searching.
- `--ignore-case` toggles a case-insensitive search; the default is case-sensitive.
- `--whole-word` wraps the term with `\b` word boundaries. In Python, `\b` treats letters, digits, and underscore as word characters, so `UserService` won't match `UserServiceImpl` in whole-word mode, but will match `UserService` followed by punctuation or whitespace.
- The script reports two numbers at the end:
  - Total number of matches across all included files.
  - Number of distinct files containing at least one match.

  These help you compare against IDE/editor search results.
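
The matching rules and the two summary numbers can be sketched like this. It is an illustration only: `build_search_pattern` and `summarize_matches` are hypothetical names, and it assumes the search term is treated as a literal string and that matches are counted without overlap, which the description above does not spell out.

```python
import re


def build_search_pattern(term: str, ignore_case: bool = False,
                         whole_word: bool = False) -> re.Pattern:
    """Compile a literal search term, optionally wrapped in word boundaries."""
    body = re.escape(term)  # treat the term as plain text, not as a regex
    if whole_word:
        body = rf"\b{body}\b"
    return re.compile(body, re.IGNORECASE if ignore_case else 0)


def summarize_matches(files: dict[str, str], pattern: re.Pattern) -> tuple[int, int]:
    """Return (total matches, number of files with at least one match)."""
    counts = [len(pattern.findall(text)) for text in files.values()]
    return sum(counts), sum(1 for count in counts if count)


# Stand-in file contents; the real script would read these from disk
# after the exclude/include filtering described above.
files = {
    "a.ts": "class UserService {}\nnew UserService();",
    "b.ts": "class UserServiceImpl extends Base {}",
    "c.ts": "const userservice = null; // userservice",
}
plain = build_search_pattern("UserService")
strict = build_search_pattern("UserService", ignore_case=True, whole_word=True)
print(summarize_matches(files, plain))
print(summarize_matches(files, strict))
```

In this sample the plain search counts the substring inside `UserServiceImpl`, while the whole-word search skips it and, with `--ignore-case`, picks up the lowercase occurrences instead.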
Edge notes:
- Binary, media, and large non-text formats are already excluded by the default patterns.
- If a file can't be read (e.g. because of an encoding error), an error line is written for it only when it would otherwise be included, which in practice means non-search runs. In search mode an unreadable file cannot pass the search test, so it is skipped after the read error is logged.
- Whole-word mode treats underscores as part of words (`my_var` counts as one word); plan searches accordingly.

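
The unreadable-file note above might translate into logic along these lines. This is a sketch under assumptions: the helper names, the UTF-8 encoding, the plain substring test, and the exact wording of the error line are all hypothetical.

```python
import sys
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def read_text_or_error(path: Path) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, None) on success or (None, error line) on failure."""
    try:
        return path.read_text(encoding="utf-8"), None
    except (UnicodeDecodeError, OSError) as exc:
        return None, f"[Error reading {path.name}: {type(exc).__name__}]"


def entry_for(path: Path, search_term: Optional[str] = None) -> Optional[str]:
    """Text to embed for this file, or None when it should be left out."""
    text, error = read_text_or_error(path)
    if error is not None:
        if search_term is None:
            return error  # non-search run: keep the error line in the output
        print(error, file=sys.stderr)  # search run: log, then skip the file
        return None
    if search_term is not None and search_term not in text:
        return None
    return text


with tempfile.TemporaryDirectory() as tmp:
    bad = Path(tmp) / "bad.bin"
    bad.write_bytes(b"\xff\xfe not valid UTF-8")
    plain_entry = entry_for(bad)
    search_entry = entry_for(bad, search_term="UserService")

print(plain_entry)
print(search_entry)
```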
The script generates a single `.txt` file named after the scanned folder (e.g. `my-app.txt`), containing:

- Directory Structure

      /
      ├── src/
      │   ├── index.js
      │   └── lib/
      └── README.md

- File Contents

      File: src/index.js
      --------------------------------------------------
      // (file contents...)

      File: docs/overview.md
      --------------------------------------------------
      # Overview
      ...
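
To make the output layout concrete, the two sections could be produced roughly as follows. The function names are hypothetical, and sorting entries by name is an assumption; the script's actual ordering and formatting may differ.

```python
import tempfile
from pathlib import Path


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return tree lines for everything below root (directories end in '/')."""
    entries = sorted(root.iterdir(), key=lambda p: p.name)
    lines = []
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        name = entry.name + ("/" if entry.is_dir() else "")
        lines.append(prefix + ("└── " if is_last else "├── ") + name)
        if entry.is_dir():
            # Children of the last entry get blank padding instead of a rail.
            lines.extend(render_tree(entry, prefix + ("    " if is_last else "│   ")))
    return lines


def render_file(rel_path: str, text: str) -> str:
    """Format one File Contents entry: header, 50-dash rule, then the text."""
    return f"File: {rel_path}\n{'-' * 50}\n{text}\n"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "lib").mkdir(parents=True)
    (root / "src" / "index.js").write_text("console.log('hi');\n", encoding="utf-8")
    (root / "README.md").write_text("# Demo\n", encoding="utf-8")
    tree_lines = ["/"] + render_tree(root)
    entry = render_file(
        "src/index.js", (root / "src" / "index.js").read_text(encoding="utf-8")
    )

print("\n".join(tree_lines))
print(entry)
```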