arzisxam/xFolderCompare

xFolderCompare

A fast, two-folder diff tool that generates a self-contained HTML comparison report. Run it through the GUI or directly from the command line.


Install

pip install customtkinter        # required for the GUI
pip install xxhash tkinterdnd2   # optional — faster hashing + drag-and-drop

On macOS with Homebrew Python you may also need brew install python-tk.

Then either:

  • macOS users: double-click xFolderCompare.app (right-click → Open the first time so Gatekeeper lets it through, since the app is unsigned) or xFolderCompare.command.
  • Windows users: double-click xFolderCompare.bat.
  • CLI / cross-platform: python3 xFolderCompare.py for the GUI, or python3 compare_folders.py <left> <right> for the CLI.

Tested on macOS. The GUI and CLI also run on Windows and Linux; the "open in file manager" buttons use Explorer / xdg-open respectively. On Windows the report's ⌘ buttons reveal files in Explorer. On Linux there's no portable "reveal in file manager" API, so they fall back to opening the parent folder.

Prerequisites

| Requirement | Notes |
|---|---|
| Python 3.10+ | Earlier versions may work but are untested |
| tkinter | Usually bundled with Python. On macOS (Homebrew): brew install python-tk |
| customtkinter | GUI only: pip install customtkinter |
| tkinterdnd2 (optional) | Drag-and-drop folder support: pip install tkinterdnd2 |
| xxhash (optional) | Faster hashing: pip install xxhash. Falls back to hashlib (blake2b) if not installed |

Files

xFolderCompare/
├── xFolderCompare.app       # macOS app bundle (Apple Silicon; unsigned)
├── xFolderCompare.command   # macOS launcher — double-click to open the GUI
├── xFolderCompare.py        # GUI frontend
├── compare_folders.py       # comparison engine (also usable standalone via CLI)
└── state.json               # auto-created on first run — stores your last settings

GUI

Double-click xFolderCompare.command or run:

python3 xFolderCompare.py

Fields

| Field | Description |
|---|---|
| Left / Right | The two folders to compare. Use Browse, paste a path, or drag a folder from Finder onto the field |
| Output | Where to save the HTML report |
| ⌘ button | Reveals that folder in Finder |
| ↺ Recent | Dropdown of the last 5 folder pairs. Click any entry to reload both paths instantly |
| Skip hash | Compare by size and modification time only, skipping content hashing. Faster but less accurate; the report shows a warning banner when enabled |
| Summary view | Show folders only (no per-file rows). Useful for large trees |
| Depth | How many folder levels to render in summary view (default: 3) |

Advanced settings

Click ⚙ Advanced › to expand the panel. Its three settings are laid out across the panel width. Hover over any label for a tooltip explaining the setting.

| Setting | Default | Description |
|---|---|---|
| Large File | 0 (disabled) | Skip hashing files larger than this size in GB (steps of 0.1 GB). Skipped files are shown as Skipped / Failed in the report. Set to 0 to hash all files |
| Diff Threshold | 50,000 | Auto-switch to summary view when total file count exceeds this number |
| Workers | Auto-detected | Parallel hashing threads. Auto-set to cpu_count for SSDs, 1 for HDDs/network on first run |

Themes

Four colour themes are available in the header — Slate (default), Blue, Purple, Red. Each is shown as a coloured dot; the active theme has a bright accent ring. Selection is remembered between sessions.

Settings persistence

All settings (folder paths, output path, theme, advanced values, panel open/closed state) are saved automatically in state.json alongside the scripts. On first run, Workers is auto-detected based on whether your storage is SSD/NVMe or HDD/network.
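As a rough illustration of this kind of persistence, the sketch below loads and saves a JSON settings file next to the scripts. The key names in `DEFAULTS` and the helpers `load_state` / `save_state` are hypothetical; the actual layout of state.json is not documented here.

```python
import json
from pathlib import Path

# Hypothetical keys: the real state.json schema may differ.
DEFAULTS = {"left": "", "right": "", "output": "", "theme": "Slate", "advanced_open": False}

def load_state(path: Path) -> dict:
    """Return saved settings merged over defaults; tolerate a missing or corrupt file."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {**DEFAULTS, **data}

def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file first, then swap it in, so a crash cannot leave a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)
```

Merging over defaults means a settings file written by an older version still loads after new keys are added.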


CLI

compare_folders.py can be run directly without the GUI:

python3 compare_folders.py <left> <right> [options]

Options

| Flag | Default | Description |
|---|---|---|
| -o, --output | comparison_report.html | Output HTML file path |
| --summary | off | Force summary (folder-only) view |
| --depth N | 3 | Folder depth for summary view |
| --threshold N | 50000 | Auto-switch to summary above this total file count |
| --skip-hash | off | Compare by size + mtime only; skip content hashing |
| --max-file-size N | 0 | Skip hashing files larger than N GB (0 = no limit) |
| --workers N | auto | Number of parallel hashing threads |
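For readers scripting around the CLI, the flags above map naturally onto an argparse parser. The following is only a sketch reconstructed from the table; `build_parser` is a hypothetical name, and the real compare_folders.py may define its options differently.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="compare_folders.py")
    p.add_argument("left")
    p.add_argument("right")
    p.add_argument("-o", "--output", default="comparison_report.html")
    p.add_argument("--summary", action="store_true")
    p.add_argument("--depth", type=int, default=3)
    p.add_argument("--threshold", type=int, default=50000)
    p.add_argument("--skip-hash", action="store_true")
    # Size limit in GB; fractional values such as 2.5 are allowed, 0 means no limit.
    p.add_argument("--max-file-size", type=float, default=0.0)
    # None stands in for "auto" in this sketch.
    p.add_argument("--workers", type=int, default=None)
    return p

args = build_parser().parse_args(["/data/left", "/data/right", "--max-file-size", "2.5"])
print(args.output, args.depth, args.max_file_size)
```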

Examples

python3 compare_folders.py ~/Projects/v1 ~/Projects/v2 -o diff.html
python3 compare_folders.py /Volumes/Backup /Volumes/Archive --summary --depth 4
python3 compare_folders.py /data/left /data/right --skip-hash
python3 compare_folders.py /media/left /media/right --max-file-size 2.5

The Report

The HTML report is fully self-contained (no external dependencies) and opens in any browser.

Modes

Full detail: renders every file and folder. Used when the total file count does not exceed the threshold. Rows are expandable/collapsible.

Summary: renders folders only, up to the configured depth. Each folder has a ⬡ Drill down button that re-runs the comparison on that sub-folder pair and opens a new report.

Status colours

| Colour | Meaning |
|---|---|
| White/dim | Identical on both sides |
| Red | Content differs |
| Amber | Same content, different modification time |
| Purple | Skipped / Failed (file too large, or hash read error) |
| Blue | Present on one side only (orphan) |

Toolbar

Filter buttons — each has its own accent colour; the active button is filled.

| Button | Shows |
|---|---|
| All | Every row |
| Differences | Rows where content differs or a file is missing on one side |
| Identical | Rows matching on both sides |
| Diff mtime | Rows with matching content but different modification timestamps |

Search bar — type to filter rows by filename in real time. Matching rows are shown; ancestor folders are automatically expanded to preserve tree context. A live match count appears beside the box.

Utility buttons

| Button | Action |
|---|---|
| Expand All / Collapse All | Toggle all folder rows open or closed |
| Hide / Show Orphans | Toggle visibility of files present on only one side |
| Hide / Show Dotfiles | Toggle visibility of files and folders starting with . |

Sortable columns — click any column header to sort siblings within each folder by that field (name, size, or modified date). Sorting cycles through ascending → descending → natural order. Folders always stay above files within the same parent.

Warning banners

The report shows a banner at the top in the following situations:

| Banner | Condition |
|---|---|
| ✔ Folders are completely identical | Every file matches on both sides |
| ⚠ Hash comparison skipped | --skip-hash was used |
| ⚠ Some files skipped — exceeded N GB limit | At least one file was skipped due to --max-file-size |

Inline file diff

Click any red (content differs) file row to open a side-by-side text diff in a panel at the bottom of the report. Additions are highlighted green, deletions red, and modified lines amber. Drag the top edge of the panel to resize; left and right sides scroll together.

Embedding is automatic but bounded so the report stays a single self-contained file:

| Limit | Default | Why |
|---|---|---|
| Differing text files embedded | first 50 | Keeps report size manageable |
| Per-file size cap | 100 KB / side | Larger files aren't embedded |
| Rendered lines per side | 500 | Diff algorithm is O(n·m); larger inputs are truncated with a notice |
| Binary files | skipped | Detected by NUL-byte sniff in first 8 KB |

If at least one diff is embedded, an info banner at the top of the report says how many files are embedded vs. skipped.
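The size cap and binary check from the limits above could look roughly like the sketch below. The names `looks_binary` and `can_embed` are hypothetical. A plain NUL-byte sniff also flags UTF-16 text as binary, so the tool presumably does something extra for that case; this sketch does not.

```python
from pathlib import Path

SNIFF_BYTES = 8 * 1024        # look at the first 8 KB only
MAX_EMBED_BYTES = 100 * 1024  # per-side size cap

def looks_binary(path: Path) -> bool:
    """Treat a file as binary if a NUL byte appears in its first 8 KB."""
    with path.open("rb") as fh:
        return b"\x00" in fh.read(SNIFF_BYTES)

def can_embed(left: Path, right: Path) -> bool:
    """Both sides must be under the size cap and look like text."""
    for p in (left, right):
        if p.stat().st_size > MAX_EMBED_BYTES or looks_binary(p):
            return False
    return True
```

Checking the size before reading means oversized files are rejected without any file I/O beyond a stat call.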

Finder integration

The ⌘ button on each row opens that file or folder in Finder. This requires the GUI to be running — it hosts a local server on port 7731 that handles open requests. The button is silently inactive if the GUI is not running.

For security, the server only opens paths inside the folders of the most recently launched comparison (resolved by realpath); requests for anything outside are rejected.
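A containment check of this kind can be sketched as follows. `is_inside` is a hypothetical helper and the server's actual logic may differ; the point is that both the candidate and the allowed roots are resolved with realpath before comparison, so symlinks and `..` segments cannot escape.

```python
import os

def is_inside(candidate: str, roots: list[str]) -> bool:
    """True if candidate resolves to one of the roots or to a path beneath one."""
    real = os.path.realpath(candidate)
    for root in roots:
        real_root = os.path.realpath(root)
        try:
            # commonpath compares whole path components, so "/a/left" does not
            # count as a parent of "/a/leftover" the way a string prefix would.
            if os.path.commonpath([real, real_root]) == real_root:
                return True
        except ValueError:
            # Raised on Windows when the paths are on different drives.
            continue
    return False
```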


Hashing

Files are compared by content hash. xxhash (xxh3_128) is used when installed; otherwise blake2b digests truncated to 16 bytes are used. Both are fast and suitable for change detection, but they are not intended for cryptographic verification.
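The optional-dependency fallback can be sketched like this. `new_hasher` and `hash_file` are hypothetical names. The sketch asks blake2b for a 16-byte digest directly via `digest_size=16`, which is an assumption and yields different bytes than truncating a full-length digest, so its output need not match the tool's.

```python
import hashlib

try:
    import xxhash  # optional dependency
except ImportError:
    xxhash = None

def new_hasher():
    """Prefer xxh3_128 when xxhash is installed, else a 16-byte blake2b."""
    if xxhash is not None:
        return xxhash.xxh3_128()
    return hashlib.blake2b(digest_size=16)

def hash_file(path, chunk_size: int = 1 << 20) -> bytes:
    """Stream the file in 1 MiB chunks so large files are never held in memory."""
    h = new_hasher()
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            h.update(chunk)
    return h.digest()
```

Either way the digest is 16 bytes, so the rest of a comparison does not need to know which backend produced it.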

Files with different sizes are classified as mismatches immediately without hashing. Files with matching sizes are hashed; if hashes match, modification times are then compared to detect mtime-only differences.
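The decision order described above (size, then hash, then mtime) can be sketched as below. `classify` and `_digest` are hypothetical; the status names follow those listed for the sample report. Comparing mtimes at whole-second resolution is an assumption of this sketch, and reading the whole file at once is only for brevity.

```python
import hashlib
import os
from pathlib import Path

def _digest(path) -> bytes:
    # Stand-in hash for the sketch; reads the whole file for brevity.
    return hashlib.blake2b(Path(path).read_bytes(), digest_size=16).digest()

def classify(left, right) -> str:
    """Return 'mismatch', 'diff-mtime', or 'matching' for a pair of files."""
    ls, rs = os.stat(left), os.stat(right)
    if ls.st_size != rs.st_size:
        return "mismatch"            # sizes differ: no need to hash
    if _digest(left) != _digest(right):
        return "mismatch"            # same size, different content
    if int(ls.st_mtime) != int(rs.st_mtime):
        return "diff-mtime"          # same content, different timestamp
    return "matching"
```

Checking size first is what makes the common "obviously different" case cheap: it costs two stat calls and no reads.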

On SSD/NVMe storage, multiple files are hashed in parallel. On HDDs or network volumes, hashing runs sequentially to avoid seek thrashing. The Workers setting controls the thread count and can be overridden in the Advanced panel.

Files exceeding the Large File threshold are not read at all and are shown as Skipped / Failed (purple) in the report. A warning banner is shown only if at least one file was actually skipped.


Sample output

samples/sample_report.html is a pre-generated report covering every status type (matching, mismatch, diff-mtime, left-only, right-only, binary mismatch, UTF-16 text). Open it directly in a browser — no Python or server needed for viewing.

samples/App1.png through App4.png are GUI screenshots.

License

MIT — see LICENSE.
