A simple, standalone Python application that cleans scanned PDF documents by removing background noise and converting pages to high-contrast black-on-white.
- Adjustable Parameters: Fine-tune Denoising Strength (H) and Contrast Threshold (C) using intuitive, integer-based sliders.
- Standalone Executable: Download and run directly on Windows without the need to install Python or external dependencies.
- Clean Output: Produces high-contrast, print-friendly PDF files.
- Download the latest release (e.g., `PDFCleanScan.exe` or the zip folder) from the Releases page.
- Run the executable file (`PDFCleanScan.exe`).
- Click "Browse..." to select your noisy scanned PDF document.
- Adjust the H and C parameters using the sliders based on the quality of your scan.
- Click "Convert" and choose the save location for the new, cleaned PDF.
The processing uses the OpenCV library's Non-Local Means Denoising and Adaptive Thresholding methods. Adjust these two core parameters for optimal results:
| Parameter | Range | Default | Description & Effect |
|---|---|---|---|
| Denoising Strength (H) | 5 - 35 | 10 | Controls the intensity of noise removal. Lower H retains more fine text detail but leaves more background noise. Higher H aggressively removes noise but risks thinning or blurring the original text. |
| Contrast Threshold (C) | 1 - 15 | 5 | A constant subtracted from the mean to fine-tune the threshold. Lower C makes text lines thicker/bolder, restoring subtle details. Higher C thins the text and ensures a whiter background, useful for high-quality scans. |
This tool is packaged with PyInstaller, a Python bundling utility, to produce a convenient standalone executable. Antivirus software (such as Windows Defender or Avast) commonly flags files created by such bundling tools as potentially malicious by mistake. This is known as a false positive.
This program does not contain any malicious code and is safe to use.
If your antivirus software blocks the executable:
- Add an Exclusion: The quickest solution is to temporarily add the executable or its containing folder to your antivirus program's exclusion list.
- Download the Directory Version: If available, download the directory version (usually a ZIP file containing the executable and libraries) instead of the single `.exe` file.
The source code is provided for transparency and review. This tool is built using Python 3 and the following core libraries:
- PyMuPDF (fitz): For PDF rendering and image extraction.
- OpenCV (cv2): For image processing (denoising and thresholding).
- Pillow (PIL): For final PDF assembly.
- Tkinter: For the graphical user interface.
This project is released under the MIT License. For full details, see the LICENSE.txt file.