CropVision is a user-friendly desktop application designed to streamline the process of object detection and cropping using state-of-the-art YOLO models. Built with Python, PyQt6, and the powerful ultralytics library, it provides an interactive interface for selecting images, running detections, visualizing results, and saving precise crops for your datasets or projects.
- Intuitive GUI: A clean, modern interface built with PyQt6, featuring a resizable layout.
- Flexible Model Loading: Load any YOLO model compatible with the
ultralyticslibrary (e.g.,yolo11x.pt, custom-trained models). - Source/Output Flexibility: Easily select source directories (including subfolders) and define where to save your crops.
- Image Browse: View images as thumbnails with an easy-to-use pagination system.
- Interactive Preview:
- View large previews of selected images.
- Visualize detection results with bounding boxes, class labels, and confidence scores drawn directly on the image.
- Detection Control:
- Adjust the confidence threshold with a slider to filter out weak detections.
- Filter detections by a specific class name (with autocompletion based on the loaded model).
- Efficient Cropping:
- Save crops for a single image.
- Batch-process and save crops for the current page or all images in the source folder.
- Image Management: Delete unwanted source images directly from the interface (with confirmation).
- Responsive & Non-Blocking: Long operations (model loading, batch processing) run in background threads, keeping the GUI responsive and providing progress updates.
- Smart Sizing: Attempts to set a reasonable initial window size based on your screen.
Follow these steps to set up and run CropVision on your local machine.
-
Prerequisites:
- Python 3.8+: Ensure you have a compatible Python version installed.
-
Clone the Repository:
git clone https://github.com/JindalChaitanya/CropVision.git # Or download and extract the ZIP cd crop-vision
-
Create and Activate a Virtual Environment (Highly Recommended):
# On macOS/Linux python3 -m venv venv source venv/bin/activate # On Windows python -m venv venv .\venv\Scripts\activate
-
Install Dependencies:
pip install -r requirements.txt
⚠️ Important Note on PyTorch (torch): Therequirements.txtfile liststorch. For GPU support, you might need to install a specific version of PyTorch that matches your CUDA toolkit. Visit the Official PyTorch Website to get the correct installation command for your system and then run it after activating your virtual environment.
- Launch the Application:
python crop_vision/main.py
- Load a Model:
- Enter the name (e.g.,
yolo11x.pt, it will download if not present) or the full path to your.ptmodel file in the "Model" input field. - Click "Load Model". Wait for the confirmation message and status update.
- Enter the name (e.g.,
- Select Folders:
- Click "Source Folder" and choose the directory containing your images. Thumbnails will appear.
- Click "Output Folder" to choose where your cropped images will be saved (defaults to
yolo_crops).
- Browse Images:
- Use the "Previous" and "Next" buttons to navigate through pages of thumbnails.
- Click a thumbnail to display it in the preview pane.
- Run Detection:
- Select an image.
- Adjust the "Confidence Threshold" slider.
- (Optional) Enter a "Class Filter" name.
- Click "Detect Objects". Bounding boxes will appear on the preview.
- Save Crops:
- Current: Click "Save Current Crop(s)" for the image in the preview.
- Page: Click "Save Page Crops" to process all images on the current thumbnail page.
- All: Click "Save All Pages Crops" to process every image found in the source folder. A progress bar will appear for batch operations.
- Delete (Use with Caution!):
- Select an image.
- Click "Delete Selected Image". You will be asked for confirmation before the file is permanently removed.
- Add support for video processing.
- Allow drawing/adjusting bounding boxes manually.
- Implement model training or fine-tuning integration.
- Add more image preview tools (zoom, pan).
- Save detection data (e.g., in JSON or XML format).
- Customizable output filename formats.
Contributions are welcome! If you have suggestions or find bugs, please open an issue or submit a pull request.
This project is currently unlicensed. Consider adding an open-source license like MIT if you plan to share it widely.
This application was created by Chaitanya Jindal. Github | LinkedIn
