GiantRavens/foldergrunt

foldergrunt

foldergrunt tames folder and file chaos typical in large file collections and archives.

foldergrunt can save you hours of tedious archiving preparation. Given a target folder, it can dedupe all files, append meaningful media info to images and movies, surface hidden files, wipe away OS cruft, and even sort everything into neatly ordered category folders.

foldergrunt is a powerful Python-based folder sorting utility that prepares directories for archiving or long-term storage. It removes operating-system lint, validates MIME types, normalizes permissions, and can optionally scan for malware and duplicates, surface hidden directories, and flatten large disorganized folder chaos and media dumps into organized category folders.

Run foldergrunt with the --dry-run flag first to see what actions it will take before committing any changes.

Features

  • Lint cleanup – removes common system lint such as .DS_Store, Thumbs.db, and __MACOSX folders.
  • MIME validation – checks that file extensions match detected MIME types and reports mismatches.
  • Permission normalization – reapplies sane default modes (755 for directories, 644 for files).
  • Extended attribute handling – strips problematic macOS extended attributes when the required tools are present.
  • Malware scanning – integrates with ClamAV (clamscan) when available.
  • Duplicate detection – supports czkawka_cli, fdupes, and jdupes backends; can remove duplicates with explicit confirmation.
  • Media tagging – appends semantic descriptors (frame size, resolution, bit depth, etc.) based on MediaInfo/ExifTool output.
  • Copy suffix stripping – removes trailing "copy" qualifiers from duplicate filenames that might be left over from previous sorting and merging.
  • Flatten sorting – categorizes files into top-level folders by type (video, audio, images, documents, etc.) while respecting bundles and Git repositories.
  • Empty directory pruning – optionally removes empty folders after processing.
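
As an illustration of the permission normalization feature, here is a minimal sketch of how the 755/644 defaults could be applied with a dry-run switch. The function name `normalize_permissions` and its return shape are assumptions made for this example and are not taken from foldergrunt's source.

```python
import os
import stat
from pathlib import Path

DIR_MODE = 0o755   # rwxr-xr-x, the directory default listed above
FILE_MODE = 0o644  # rw-r--r--, the file default listed above


def normalize_permissions(root: Path, dry_run: bool = True) -> list[tuple[Path, int, int]]:
    """Return (path, old_mode, new_mode) for each entry whose mode differs.

    Hypothetical helper: modes are changed only when dry_run is False.
    """
    changes = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip symlinks; chmod would follow them to the target
            old = stat.S_IMODE(path.stat().st_mode)
            new = DIR_MODE if path.is_dir() else FILE_MODE
            if old != new:
                changes.append((path, old, new))
                if not dry_run:
                    path.chmod(new)
    return changes
```

Calling it with `dry_run=True` only reports what would change, which mirrors the behaviour the --dry-run flag is described as having.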

Quick start

python3 foldergrunt.py /path/to/target

By default, duplicate detection runs in audit mode, normalization steps apply, and a summary is printed. Use --dry-run to see actions without touching the filesystem.

Typical archival run

python3 foldergrunt.py ~/Desktop/sort_me \
  --flatten-sort \
  --summary \
  --verbose

Duplicate removal (destructive)

python3 foldergrunt.py ~/media_dump \
  --remove-dupes --force \
  --dedupe-backend czkawka \
  --dedupe-delete-method AEO

Flatten test run

python3 foldergrunt.py ~/incoming --flatten-sort --dry-run
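
To make the idea of category folders concrete, the sketch below maps a filename to a category name. It is only an approximation: it guesses the type from the file extension with the standard-library `mimetypes` module, and the category names and the `category_for` helper are hypothetical rather than foldergrunt's actual mapping.

```python
import mimetypes

# Hypothetical mapping from the major MIME type to a category folder name.
CATEGORY_BY_MAJOR_TYPE = {
    "video": "video",
    "audio": "audio",
    "image": "images",
    "text": "documents",
}
# A few application/* types that read better as documents (assumed list).
DOCUMENT_MIME_TYPES = {"application/pdf"}


def category_for(filename: str) -> str:
    """Pick a category folder for a file, falling back to 'other'."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "other"
    if mime in DOCUMENT_MIME_TYPES:
        return "documents"
    major = mime.split("/", 1)[0]
    return CATEGORY_BY_MAJOR_TYPE.get(major, "other")
```

A real implementation would also need to skip bundles and Git repositories, as the feature list notes; that part is left out here.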

Installing as an executable

If you prefer to invoke foldergrunt like any other command-line utility:

  1. Ensure the script is executable:

    chmod +x foldergrunt.py
  2. Optionally rename it for convenience:

    mv foldergrunt.py foldergrunt
  3. Place it somewhere on your PATH. Common choices:

    • /usr/local/bin (requires sudo):

      sudo mv foldergrunt /usr/local/bin/
    • A personal bin directory (e.g., ~/bin). Create it if needed and add it to your shell profile (~/.zshrc, ~/.bashrc):

      mkdir -p ~/bin
      mv foldergrunt ~/bin/
      echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc
  4. Verify it runs:

    foldergrunt --help

The script’s shebang (#!/usr/bin/env python3) automatically selects the first python3 on your PATH. If you rely on a virtual environment, consider using an absolute interpreter path (e.g., #!/Users/you/.venv/bin/python) before distributing the executable.

CLI reference

| Flag | Purpose / Notes |
| --- | --- |
| target (positional) | Directory to process (defaults to CWD). |
| --no-recurse | Limit processing to the top-level directory. |
| --skip-dedupe | Skip duplicate detection (dedupe runs by default). |
| --remove-dupes | Request duplicate deletion (operates only with --force). |
| --dedupe (hidden) | Internal override to force dedupe on; rarely needed manually. |
| --dedupe-backend | Choose duplicate finder (czkawka, fdupes, jdupes; default czkawka). |
| --dedupe-min-size | Minimum size (bytes) for duplicate detection (default 1). |
| --dedupe-delete-method | Deletion policy for czkawka_cli when removals are permitted. |
| --skip-scan | Skip malware scanning. |
| --skip-extension-check | Skip MIME/extension validation. |
| --dry-run | Report actions without changing the filesystem. |
| --summary | Emit a summary report at the end of the run. |
| --force | Enable destructive actions (e.g., duplicate deletion). |
| --keep-empty | Preserve empty directories instead of pruning. |
| --keep-names | Skip filename normalization. |
| --names-report-only | Report problematic names without renaming. |
| --skip-media-tagging | Skip appending media descriptors. |
| --media-report-only | Report media descriptors without renaming. |
| --flatten-sort | Organize files into category folders (respects dry-run). |
| --exclude PATTERN | Glob pattern to exclude (repeatable). |
| --quiet | Reduce output to warnings/errors only. |
| --verbose | Increase logging verbosity (use twice for debug). |
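
For readers who want to see how a subset of these flags could be declared, here is a hedged `argparse` sketch. It is a reconstruction from the table above, not foldergrunt's actual parser; the `build_parser` name and the choice of which flags to include are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a few of the documented flags."""
    parser = argparse.ArgumentParser(prog="foldergrunt")
    # Positional target is optional and defaults to the current directory.
    parser.add_argument("target", nargs="?", default=".")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("--remove-dupes", action="store_true")
    parser.add_argument("--flatten-sort", action="store_true")
    parser.add_argument(
        "--dedupe-backend",
        choices=["czkawka", "fdupes", "jdupes"],
        default="czkawka",
    )
    parser.add_argument("--dedupe-min-size", type=int, default=1)
    # Repeatable: each --exclude appends one glob pattern.
    parser.add_argument("--exclude", action="append", default=[], metavar="PATTERN")
    # Counted: --verbose once for more detail, twice for debug.
    parser.add_argument("--verbose", action="count", default=0)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using `action="append"` and `action="count"` is one conventional way to get the "repeatable" and "use twice" behaviour described in the table.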

Processing pipeline

foldergrunt runs the following steps in order, each handled by a separate function:

  1. Inventory – collect files/directories, identify lint, and gather stats.
  2. Require tools – verify binaries for requested steps (ClamAV, MediaInfo, etc.).
  3. Lint removal – delete unwanted files/folders unless --dry-run.
  4. Extended attributes – clear problematic metadata on macOS.
  5. Security scan – run ClamAV if not skipped.
  6. MIME validation – compare extensions against detected MIME types.
  7. Permission normalization – apply default modes.
  8. Duplicate handling – run chosen backend, optionally delete with --remove-dupes --force.
  9. Media tagging – append descriptors or report them.
  10. Copy suffix stripping – rename files ending with "copy" suffixes.
  11. Flatten sorting – lift files to root-level category directories.
  12. Prune empty directories – remove empty folders unless --keep-empty.
  13. Summary – print metrics if --summary.
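
The ordered pipeline above could be structured roughly as follows. This is a simplified sketch with only three of the thirteen steps; `RunContext`, `run_pipeline`, and the step function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunContext:
    """Hypothetical shared state handed to every step."""
    dry_run: bool = True
    keep_empty: bool = False
    summary: bool = False
    log: list[str] = field(default_factory=list)

    def act(self, description: str) -> None:
        # Record the action; a real step would touch the filesystem here
        # only when dry_run is False.
        prefix = "DRY-RUN " if self.dry_run else ""
        self.log.append(prefix + description)


def remove_lint(ctx: RunContext) -> None:
    ctx.act("remove lint")


def prune_empty_dirs(ctx: RunContext) -> None:
    if not ctx.keep_empty:
        ctx.act("prune empty directories")


def print_summary(ctx: RunContext) -> None:
    if ctx.summary:
        ctx.log.append(f"summary: {len(ctx.log)} logged actions")


# Steps run strictly in list order, mirroring the numbered pipeline.
STEPS: list[Callable[[RunContext], None]] = [remove_lint, prune_empty_dirs, print_summary]


def run_pipeline(ctx: RunContext) -> list[str]:
    for step in STEPS:
        step(ctx)
    return ctx.log
```

Keeping the steps in a plain list makes the execution order explicit and lets each step decide for itself whether a flag such as --keep-empty or --summary applies.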

Testing

The project uses pytest. From the repository root:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt  # see requirements document
pytest

Tests rely on monkeypatched subprocess calls, so no external tools need to be installed to execute the suite.
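
A test written in that style might look like the sketch below. The wrapper `clamscan_is_clean` is a hypothetical function invented for this example; it relies on clamscan exiting with status 0 when no infection is found. The `monkeypatch` argument is pytest's built-in fixture.

```python
import subprocess


def clamscan_is_clean(path: str) -> bool:
    """Hypothetical wrapper: True when clamscan exits with status 0."""
    result = subprocess.run(
        ["clamscan", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def _fake_run_factory(returncode: int, calls: list):
    """Build a stand-in for subprocess.run that records its command."""
    def fake_run(cmd, **kwargs):
        calls.append(cmd)
        return subprocess.CompletedProcess(cmd, returncode, stdout="", stderr="")
    return fake_run


def test_clamscan_reports_clean(monkeypatch):
    calls = []
    monkeypatch.setattr(subprocess, "run", _fake_run_factory(0, calls))
    assert clamscan_is_clean("/tmp/example") is True
    assert calls == [["clamscan", "--no-summary", "/tmp/example"]]


def test_clamscan_reports_infected(monkeypatch):
    calls = []
    monkeypatch.setattr(subprocess, "run", _fake_run_factory(1, calls))
    assert clamscan_is_clean("/tmp/example") is False
```

Because `subprocess.run` is replaced before the wrapper is called, the real clamscan binary is never invoked.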

Troubleshooting

  • Nothing happens? Add --verbose (once or twice) to inspect skipped operations.
  • Duplicate deletion skipped? Ensure both --remove-dupes and --force are supplied, and that you are not in --dry-run mode.
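
The duplicate-deletion rule above boils down to a three-way condition. The sketch below spells it out; `deletion_blockers` is a hypothetical helper written for this example, not a function from foldergrunt.

```python
def deletion_blockers(remove_dupes: bool, force: bool, dry_run: bool) -> list[str]:
    """List the reasons duplicate deletion would be skipped (empty means it runs)."""
    blockers = []
    if not remove_dupes:
        blockers.append("--remove-dupes was not supplied")
    if not force:
        blockers.append("--force was not supplied")
    if dry_run:
        blockers.append("--dry-run is active")
    return blockers


if __name__ == "__main__":
    # Example: --remove-dupes given, but --force missing and --dry-run set.
    for reason in deletion_blockers(remove_dupes=True, force=False, dry_run=True):
        print("skipped:", reason)
```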

License

Released under the MIT License.

Third-party tools

foldergrunt orchestrates several optional command-line tools. Install and use them according to their respective licenses:

| Tool | Purpose | Homepage / Source |
| --- | --- | --- |
| clamscan | Malware scanning | https://www.clamav.net/ |
| czkawka_cli | Duplicate finder (default backend) | https://github.com/qarmin/czkawka |
| fdupes | Alternative duplicate finder | https://github.com/adrianlopezroche/fdupes |
| jdupes | Alternative duplicate finder | https://github.com/jbruchon/jdupes |
| mediainfo | Media metadata inspection | https://mediaarea.net/en/MediaInfo |
| exiftool | Extended metadata extraction | https://exiftool.org/ |
| xattr | macOS extended attribute manipulation | https://ss64.com/osx/xattr.html |
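
To check which of these optional tools are installed before a run, a short script along these lines may help. It uses the standard-library `shutil.which`; the `missing_tools` helper and the injectable `which` parameter are assumptions made for the example.

```python
import shutil
from typing import Callable, Iterable, Optional

# Binary names taken from the table above.
OPTIONAL_TOOLS = [
    "clamscan",
    "czkawka_cli",
    "fdupes",
    "jdupes",
    "mediainfo",
    "exiftool",
    "xattr",
]


def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    absent = missing_tools(OPTIONAL_TOOLS)
    print("Missing tools:", ", ".join(absent) if absent else "none")
```

Passing `which` as a parameter keeps the function easy to test without any of the tools installed.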
