scrubexif


🧼 scrubexif is a lightweight, Dockerized EXIF cleaner designed for fast publishing of JPEG photos without leaking sensitive metadata.

It removes most embedded EXIF, IPTC, and XMP data while preserving useful tags like exposure settings. It is ideal for privacy-conscious photographers who still want to share some technical info.

👉 GitHub: per2jensen/scrubexif

📦 Docker Hub: per2jensen/scrubexif



🚀 Quick Start

There are two modes:

✅ Manual mode (default)

Manually scrub one or more .jpg / .jpeg files from the current directory.

Scrub specific files

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION "file1.jpg" "file2.jpeg"

Scrub all JPEGs in current directory

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION

Recursively scrub nested folders

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION --recursive

🤖 Auto mode (--from-input)

Scrubs everything in a predefined input directory and saves output to another. This is useful for batch processing.

You must mount three volumes:

  • /photos/input: input directory (e.g. $PWD/input)
  • /photos/output: scrubbed files are saved here
  • /photos/processed: originals are moved here (or deleted if --delete-original is used)

Example

VERSION=0.5.11; docker run -it --rm \
  -v "$PWD/input:/photos/input" \
  -v "$PWD/output:/photos/output" \
  -v "$PWD/processed:/photos/processed" \
  per2jensen/scrubexif:$VERSION --from-input

Optional flags:

  • --delete-original: Delete originals instead of moving them
  • --on-duplicate {delete|move}: Delete or move a duplicate
  • --dry-run: Show what would be scrubbed, but don't write files
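
For scripted batch runs, the auto-mode invocation can be assembled programmatically. The following is a minimal Python sketch, not part of scrubexif: the helper name auto_mode_command and its parameters are hypothetical, and it only builds the argument list from the mounts and flags documented in this README (including the extra /photos/errors mount that --on-duplicate move needs). The -it flags from the example are left out on the assumption that a scripted run has no TTY.

```python
from pathlib import Path


def auto_mode_command(base: Path, version: str, on_duplicate: str = "delete",
                      dry_run: bool = False, delete_original: bool = False) -> list:
    """Build (but do not run) a docker command for scrubexif auto mode.

    Hypothetical helper: `base` is a host directory assumed to contain the
    input/, output/ and processed/ subdirectories (plus errors/ for "move").
    """
    if on_duplicate not in ("delete", "move"):
        raise ValueError(f"unknown on_duplicate value: {on_duplicate!r}")

    mounts = ["input", "output", "processed"]
    if on_duplicate == "move":
        mounts.append("errors")  # duplicates are moved to /photos/errors

    cmd = ["docker", "run", "--rm"]
    for name in mounts:
        cmd += ["-v", f"{base / name}:/photos/{name}"]
    cmd += [f"per2jensen/scrubexif:{version}", "--from-input",
            "--on-duplicate", on_duplicate]
    if delete_original:
        cmd.append("--delete-original")
    if dry_run:
        cmd.append("--dry-run")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True), or printed with shlex.join(cmd) to get a shell line for review first.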

Duplicate Handling (auto mode)

By default, if a file with the same name already exists in the output folder, it is treated as a duplicate:

  • --on-duplicate delete (default): Skips scrubbing and deletes the original from input.
  • --on-duplicate move: Moves the duplicate file to /photos/errors for inspection.

This ensures output is not overwritten and prevents silently skipping files.

Duplicates are deleted by default because the files are probably not that important: they are mostly used to give viewers a quick glance. Deleting them also conserves disk space.

# Move duplicates to /photos/errors instead of deleting
docker run -v "$PWD/input:/photos/input" \
           -v "$PWD/output:/photos/output" \
           -v "$PWD/processed:/photos/processed" \
           -v "$PWD/errors:/photos/errors" \
           scrubexif:dev --from-input --on-duplicate move

📌 Note the -v "$PWD/errors:/photos/errors" volume specification, which is required for the --on-duplicate move option.
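
The duplicate rules above can be modelled in a few lines. This is an illustrative Python sketch of the documented behaviour, not scrubexif's actual code; handle_duplicate and its return values ("scrub", "deleted", "moved") are hypothetical names chosen for the example.

```python
import shutil
from pathlib import Path


def handle_duplicate(src: Path, output_dir: Path, errors_dir: Path,
                     on_duplicate: str = "delete") -> str:
    """Decide what happens to an input file, based on a name clash in output.

    Returns "scrub" when there is no clash, otherwise "deleted" or "moved".
    """
    if not (output_dir / src.name).exists():
        return "scrub"  # no file of that name in output: scrub as usual

    if on_duplicate == "delete":
        src.unlink()  # default: skip scrubbing and drop the input file
        return "deleted"

    if on_duplicate == "move":
        errors_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(errors_dir / src.name))  # keep for inspection
        return "moved"

    raise ValueError(f"unknown on_duplicate value: {on_duplicate!r}")
```

In both duplicate cases the existing file in the output directory is left untouched, which matches the stated goal of never overwriting output.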


🔧 Options (Manual mode)

The container accepts:

  • Filenames: one or more .jpg or .jpeg file names
  • -r, --recursive: Recursively scrub /photos and all subfolders
  • --dry-run: Show what would be scrubbed, without modifying files

Examples:

Scrub all .jpg files in subdirectories:

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION --recursive

Dry-run (preview only):

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION --dry-run

Mix recursion and dry-run:

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION --recursive --dry-run

📌 Note: In manual mode, files are scrubbed in place, overwriting the originals. Duplicate handling (--on-duplicate move/delete) does not apply here.
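
Because manual mode overwrites files in place, it may be worth copying the originals aside before a run. The following Python sketch is one way to do that on the host; it is not a scrubexif feature, and backup_jpegs, its parameters, and the backup location are all assumptions made for illustration. It mirrors the case-insensitive .jpg/.jpeg matching and the optional recursion described in this README.

```python
import shutil
from pathlib import Path

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def backup_jpegs(src_dir: Path, backup_dir: Path, recursive: bool = False) -> list:
    """Copy JPEG files from src_dir to backup_dir, keeping relative paths.

    backup_dir should live outside src_dir so backups are not picked up
    again on a later recursive run. Returns the list of copied targets.
    """
    pattern = "**/*" if recursive else "*"
    copied = []
    for path in sorted(src_dir.glob(pattern)):
        # Compare the lowercased suffix so .JPG and .JPEG match as well.
        if path.is_file() and path.suffix.lower() in JPEG_SUFFIXES:
            target = backup_dir / path.relative_to(src_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied
```

A typical call would be backup_jpegs(Path.cwd(), Path.home() / "photo-backup", recursive=True) before running the container with --recursive.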


✅ Features

  • Case insensitive, works on .jpg, .JPG, .jpeg & .JPEG
  • Removes most EXIF, IPTC, and XMP metadata
  • Preserves useful photography tags:
    • Title
    • ExposureTime, FNumber, ISO
    • ImageSize, Orientation
    • FocalLength
  • Show tags before & after (see below)
  • Preserves the color profile by default, at the cost of a less complete scrub (see below)
  • A --paranoia option to scrub color profile tags (see below)
  • A --preview option to check tags before/after a scrub (see below)
  • An --on-duplicate option controlling what happens when a file of the same name already exists in /photos/output
  • Based on the most excellent ExifTool inside a minimal Ubuntu base image
  • Docker-friendly for pipelines and automation

🎯 Metadata Preservation Strategy

By default, scrubexif preserves important non-private metadata such as exposure, lens, ISO, and color profile information. This ensures that images look correct in color-managed environments (e.g. Apple Photos, Lightroom, web browsers with ICC support).

For users who require maximum privacy, an optional --paranoia mode is available.

🛡️ --paranoia Mode

When enabled, --paranoia disables color profile preservation and removes fingerprintable metadata like ICC profile hashes (ProfileID). This may degrade color rendering on some devices, but ensures all embedded fingerprint vectors are scrubbed.

| Mode       | ICC Profile  | Color Fidelity | Privacy Level |
|------------|--------------|----------------|---------------|
| (default)  | ✅ Preserved | ✅ High        | ⚠️ Moderate   |
| --paranoia | ❌ Removed   | ❌ May degrade | ✅ Maximum    |

📸 Example

# Safe color-preserving scrub (default)
docker run -v "$PWD:/photos" scrubexif:dev image.jpg

# Maximum scrub, removes the ICC profile
docker run -v "$PWD:/photos" scrubexif:dev image.jpg --paranoia

Note: The ICC profile includes values like ProfileDescription, ColorSpace, and ProfileID. ProfileID is a hash that may vary by device or editing software.

🔍 Inspecting Metadata with --show-tags

The --show-tags option lets you inspect metadata before, after, or both before and after scrubbing. This is useful for:

  • Auditing what data is present in your photos
  • Verifying that scrubbed output removes private metadata
  • Confirming what remains (e.g. lens info, exposure, etc.)

⚠️ Note on --dry-run

If you want to inspect metadata only without modifying any files, you must pass --dry-run.

Without --dry-run, scrubbing is performed as usual.


📌 Usage Examples

# 🔎 See tags BEFORE scrub (scrub still happens)
docker run -v "$PWD:/photos" scrubexif:dev image.jpg --show-tags before

# 🔎 See both BEFORE and AFTER (scrub still happens)
docker run -v "$PWD:/photos" scrubexif:dev image.jpg --show-tags both

# ✅ Just show metadata, DO NOT scrub
docker run -v "$PWD:/photos" scrubexif:dev image.jpg --show-tags before --dry-run

Works in both modes

Manual mode: for individual files or folders.

Auto mode (--from-input): applies to all JPEGs in the input directory.

🛡️ Tip: Combine --dry-run --paranoia --show-tags before to verify the level of metadata removal before committing.

🔍 Preview Mode (--preview)

The --preview option lets you safely simulate the scrubbing process on a single JPEG without modifying the original file.

This mode:

  • Copies the original image to a temporary file
  • Scrubs the copy in memory
  • Shows metadata before and/or after scrubbing
  • Deletes the temp files automatically
  • Never alters the original image

✅ Typical Use

docker run -v "$PWD:/photos" scrubexif:dev test.jpg --preview

🛡️ Tip: Combine --preview --paranoia to verify that the color profile tags, including the ProfileID tag, have been scrubbed.


🧼 What It Cleans

The tool removes:

  • GPS location data
  • Camera serial numbers
  • Software version strings
  • Embedded thumbnails
  • XMP/IPTC descriptive metadata
  • MakerNotes (where safely possible)

It preserves key tags important for photographers and viewers.


Known limitations

🚧 Symlinked input paths are not detected inside the container

If you bind-mount a symbolic link (e.g. -v $(pwd)/symlink:/photos/input), Docker resolves the symlink before passing it to the container. This means:

  • The container sees /photos/input as a normal directory.
  • scrubexif cannot detect it was originally a symlink.
  • For safety, avoid mounting symbolic links to any of the required directories.

🐳 Docker Images

For now I am not using latest, as the images are only development quality.

I am currently going with:

| Tag     | Description                                      | Docker Hub | Example Usage                            |
|---------|--------------------------------------------------|------------|------------------------------------------|
| :0.x.y  | Versioned releases following semantic versioning | ✅ Yes     | docker pull per2jensen/scrubexif:0.5.11  |
| :stable | Latest "good" and trusted version; perhaps :rc   | ✅ Yes     | docker pull per2jensen/scrubexif:stable  |
| :dev    | Development version; may be broken or incomplete | ❌ No      | docker run scrubexif:dev                 |

🔄 The release pipeline automatically updates build-history.json, which contains metadata for each uploaded image.

📥 Pull Images

Versioned image:

VERSION=0.5.11; docker pull per2jensen/scrubexif:$VERSION

Pull the latest stable release (when available):

docker pull per2jensen/scrubexif:stable

✔️ All :0.5.x and :stable images pass the test suite as part of the release pipeline.

:dev → Bleeding-edge development image, only built locally and not pushed to Docker Hub.

🧼 Run to scrub all .jpg and .jpeg files in the current directory

VERSION=0.5.11; docker run -it --rm -v "$PWD:/photos" per2jensen/scrubexif:$VERSION

🛠️ Show version and help

VERSION=0.5.11; docker run --rm per2jensen/scrubexif:$VERSION --version
VERSION=0.5.11; docker run --rm per2jensen/scrubexif:$VERSION --help

🔐 User Privileges and Running as Root

By default, the scrubexif container runs as user ID 1000, not root. This is a best-practice security measure to avoid unintended file permission changes or elevated access.

🧑 Default Behavior

docker run --rm scrubexif:dev

  • Runs the container as UID 1000 by default
  • Ensures safer file operations on mounted volumes
  • Compatible with most host setups

👀 Running as a Custom User

You can specify a different UID (e.g., match your local user) using the --user flag:

docker run --rm --user $(id -u) scrubexif:dev

This ensures created or modified files match your current user permissions.
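
When the container is launched from a script rather than a shell, the $(id -u) substitution is not available. A minimal Python sketch of the same idea follows; run_as_current_user is a hypothetical helper, the mount of the current directory on /photos is an assumption borrowed from the manual-mode examples, and os.getuid() is only available on Unix-like hosts.

```python
import os
from pathlib import Path


def run_as_current_user(*args: str, image: str = "scrubexif:dev") -> list:
    """Build a docker command equivalent to using --user $(id -u)."""
    return [
        "docker", "run", "--rm",
        "--user", str(os.getuid()),      # same value the shell's id -u prints
        "-v", f"{Path.cwd()}:/photos",   # current directory, as in manual mode
        image, *args,
    ]
```

For example, run_as_current_user("image.jpg", "--dry-run") yields a list that can be handed to subprocess.run.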

🚫 Root is Blocked by Default

Running the container as root (UID 0) is explicitly disallowed to prevent unsafe behavior:

docker run --rm --user 0 scrubexif:dev
# ❌ Running as root is not allowed unless ALLOW_ROOT=1 is set.

To override this safeguard, set the following environment variable:

docker run --rm --user 0 -e ALLOW_ROOT=1 scrubexif:dev

⚠️ Use this option only if you know what you're doing. Writing files as root can cause permission issues on the host system.


📌 Recommendations

To ensure smooth and safe operation when using scrubexif, follow these guidelines:

🛡️ Hardening

Use these options when starting a container:

docker run --read-only --security-opt no-new-privileges \
           -v "$PWD/input:/photos/input" \
           -v "$PWD/output:/photos/output" \
           -v "$PWD/processed:/photos/processed" \
           scrubexif:dev --from-input

✅ Use Real Directories for Mounts

Avoid using symbolic links for input, output, or processed paths. Due to Docker's volume resolution behavior, symlinks are flattened and no longer detectable inside the container.

Instead:

docker run -v "$PWD/input:/photos/input" \
           -v "$PWD/output:/photos/output" \
           -v "$PWD/processed:/photos/processed" \
           scrubexif:dev --from-input

✅ Run as a Non-Root User

scrubexif checks that its directories are writable. If a mounted directory is writable only by root and the container runs as a non-root user (recommended), scrubexif detects this and exits cleanly.

Tip: Use --user 1000 or ensure mounted dirs are writable by UID 1000.

✅ Always Pre-Check Mount Paths

Ensure the input, output, and processed directories:

  • Exist on the host
  • Are not files or symlinks
  • Are writable by the container's user

Otherwise, scrubexif will fail fast with a clear error message.
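
The three checks above can also be run on the host before starting the container, which covers the symlink case that scrubexif cannot see from inside. This is a sketch under stated assumptions: check_mount_dir is a hypothetical helper, not part of scrubexif, and os.access tests writability for the user running the script, which matches the container's user only if the UIDs agree (for instance when --user $(id -u) is used).

```python
import os
from pathlib import Path


def check_mount_dir(path: Path) -> list:
    """Return a list of problems with a host directory; empty means OK."""
    problems = []
    if path.is_symlink():
        # Docker resolves symlinks, so the container cannot detect this.
        problems.append("is a symlink")
    if not path.exists():
        problems.append("does not exist")
    elif not path.is_dir():
        problems.append("is not a directory")
    elif not os.access(path, os.W_OK):
        problems.append("is not writable by the current user")
    return problems
```

A wrapper script could loop over input, output, and processed, print any problems, and refuse to call docker run unless every list is empty.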

✅ Keep Metadata You Intend to Preserve Explicit

If privacy is critical, define the EXIF tags to preserve explicitly in your scrub.py rather than relying on the defaults.


🔍 Viewing Metadata

To inspect the metadata of an image before/after scrubbing:

exiftool "image.jpg"

Inside the container (optional):

Note the "/photos" prefix in the filename: the container has your $PWD mounted on /photos.

VERSION=0.5.11; docker run --rm -v "$PWD:/photos" --entrypoint exiftool per2jensen/scrubexif:$VERSION "/photos/image.jpg"

📦 Inspecting the Image Itself

To view embedded labels and metadata:

VERSION=0.5.11; docker inspect per2jensen/scrubexif:$VERSION | jq '.[0].Config.Labels'

You can also check the repository digest:

VERSION=0.5.11; docker image inspect per2jensen/scrubexif:$VERSION --format '{{.RepoDigests}}'
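
Where jq is not installed, the same label lookup can be done in Python, since docker inspect prints a JSON array with one object per image. The sketch below is illustrative only: image_labels and inspect_image are hypothetical helpers, and SAMPLE_INSPECT_OUTPUT is made-up data for the example, not the real labels of any scrubexif image.

```python
import json
import subprocess

# Made-up, heavily trimmed stand-in for real `docker inspect` output.
SAMPLE_INSPECT_OUTPUT = (
    '[{"Id": "sha256:0000", '
    '"Config": {"Labels": {"org.opencontainers.image.title": "scrubexif"}}}]'
)


def image_labels(inspect_json: str) -> dict:
    """Python equivalent of jq '.[0].Config.Labels' (null becomes {})."""
    data = json.loads(inspect_json)
    return data[0]["Config"].get("Labels") or {}


def inspect_image(image: str) -> str:
    """Return the raw JSON from `docker inspect` (requires docker)."""
    result = subprocess.run(["docker", "inspect", image],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With docker available, image_labels(inspect_image("per2jensen/scrubexif:0.5.11")) returns the labels as a dictionary.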

Example Integration

This image is ideal for:

  • Web galleries
  • Dog show photo sharing
  • Social media publishing
  • Backup pipelines before upload
  • Static site generators like Hugo/Jekyll

🔧 Build Locally (Optional)

docker build -t scrubexif .

🧪 Test Image

To verify that a specific scrubexif Docker image functions correctly, the test suite supports containerized testing using any image tag. By default, it uses the local tag scrubexif:dev for testing. You can override this with the SCRUBEXIF_IMAGE environment variable.

🔧 Default behavior

When running pytest, the following fallback is used if no override is set:

IMAGE_TAG = os.getenv("SCRUBEXIF_IMAGE", "scrubexif:dev")

This means that the tests will attempt to run:

docker run ... scrubexif:dev ...

If no such local image exists, the tests will fail.
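
The fallback can be wrapped so that a missing image is reported up front instead of surfacing as a failed docker run in the middle of the suite. This is a sketch, not the project's test code: resolve_image_tag and image_exists_locally are hypothetical helpers, and only the SCRUBEXIF_IMAGE variable and the scrubexif:dev default come from this README.

```python
import os
import subprocess


def resolve_image_tag(env=None) -> str:
    """Return SCRUBEXIF_IMAGE if set, else the local default tag."""
    env = os.environ if env is None else env
    return env.get("SCRUBEXIF_IMAGE", "scrubexif:dev")


def image_exists_locally(tag: str) -> bool:
    """True if `docker image inspect` succeeds for the tag."""
    try:
        result = subprocess.run(["docker", "image", "inspect", tag],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        return False  # the docker CLI itself is not installed
    return result.returncode == 0
```

To test a published image instead of the local build, set the variable when invoking the suite, for example SCRUBEXIF_IMAGE=per2jensen/scrubexif:0.5.11 pytest.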

✍️ License

Licensed under the GNU General Public License v3.0 or later
See the LICENSE file in this repository.


🙌 Related Tools

📸 file-manager-scripts: Nautilus context menu integrations
📸 image-scrubber: Browser-based interactive metadata removal
📸 jpg-exif-scrubber: Python tool that strips all metadata (no preservation)

scrubexif focuses on automated, container-friendly workflows with safe defaults for photographers.


💬 Feedback

Suggestions, issues, or pull requests are always welcome.
Maintained by Per Jensen


🔗 Project Homepage

Source code, issues, and Dockerfile available on GitHub:

👉 https://github.com/per2jensen/scrubexif

📦 Docker Hub: per2jensen/scrubexif
