This project combines vision transformers and segmentation models (DINO, ViT, SAM, and YOLO+) into a pipeline for identifying and isolating trees in images. Refined post-processing and mask filtering target high-precision segmentation, and performance is measured with IoU, precision, recall, and AP.

Zack4DEV/Sam_Filtered_ViT_Segmentation


🌳 Tree Detection and Segmentation Pipeline (DINO + ViT + SAM + YOLO+)

A deep learning project that combines vision transformers with the Segment Anything Model (SAM) to detect and isolate trees in complex scenes, using shape- and size-based mask filtering and standard evaluation metrics.


📌 Key Components

| Model | Purpose |
|-------|---------|
| DINO | Vision Transformer-based object detection |
| ViT | Backbone feature extraction |
| SAM | Smart segmentation with mask refinement |
| YOLO+ | Real-time object detection |

  • ✅ Refined Segmentation Masks: Removes irrelevant masks (e.g., humans, background) generated by SAM.
  • ✅ Post-processing Isolation: Only tree-like objects are segmented, with minimal overhead.
  • ✅ Mask Filtering: Shape- and size-based filtering improves mask quality.
  • ✅ Evaluation Support: Built-in performance evaluation using:
    • IoU (Intersection over Union)
    • Precision & Recall
    • AP (Average Precision)
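The shape- and size-based mask filtering listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: it assumes each mask is a dict with a boolean `segmentation` array (the format returned by SAM's automatic mask generator), and the function name `filter_masks` and every threshold are hypothetical values chosen for the example.

```python
import numpy as np


def filter_masks(masks, image_area, min_area_frac=0.01, max_area_frac=0.8,
                 min_fill=0.3, max_aspect=4.0):
    """Keep masks whose size and shape pass simple plausibility checks.

    All thresholds are illustrative assumptions, not tuned values.
    """
    kept = []
    for mask in masks:
        seg = np.asarray(mask["segmentation"], dtype=bool)
        area = int(seg.sum())
        if area == 0:
            continue  # skip empty masks

        # Tight bounding box around the mask pixels.
        ys, xs = np.nonzero(seg)
        height = int(ys.max() - ys.min() + 1)
        width = int(xs.max() - xs.min() + 1)

        area_frac = area / image_area          # size relative to the image
        fill = area / (height * width)         # how much of the box is covered
        aspect = max(height, width) / min(height, width)

        # Size check: drop tiny specks and masks covering nearly everything.
        size_ok = min_area_frac <= area_frac <= max_area_frac
        # Shape check: drop very sparse or very elongated regions.
        shape_ok = fill >= min_fill and aspect <= max_aspect

        if size_ok and shape_ok:
            kept.append(mask)
    return kept
```

A class-based check (for example, rejecting masks that overlap a detected person) would be a separate step; this sketch covers only the geometric part.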

🚀 Quick Start

Requirements

```bash
pip install jupyter
```

Inference

```bash
jupyter nbconvert --to script Sam_Filtered_ViT_Segmentation.ipynb
python Sam_Filtered_ViT_Segmentation.py --image Images/Trees/Tree.jpg
```

Options

| Argument | Description |
|----------|-------------|
| `--model` | Choose model: `dino`, `yolo+`, `sam` |
| `--evaluate` | Run evaluation metrics |
| `--refine` | Apply shape/size mask filtering |
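For orientation, options like these are commonly declared with `argparse`. The sketch below is an assumption about how such a command line could be wired, not the script's actual code: the `build_parser` name, the help strings, and the `sam` default are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Tree detection and segmentation (illustrative CLI)"
    )
    parser.add_argument("--image", required=True,
                        help="Path to the input image")
    # The default below is an assumption; the README does not state one.
    parser.add_argument("--model", choices=["dino", "yolo+", "sam"],
                        default="sam", help="Model to run")
    parser.add_argument("--evaluate", action="store_true",
                        help="Run evaluation metrics")
    parser.add_argument("--refine", action="store_true",
                        help="Apply shape/size mask filtering")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["--image", "Images/Trees/Tree.jpg", "--model", "sam", "--refine"]
)
```

Using `choices` makes `argparse` reject unknown model names with a usage message instead of failing later in the pipeline.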

📊 Evaluation

We evaluate performance using:

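  • IoU - Overlap between the predicted mask and the ground-truth mask, divided by their union
  • Precision / Recall - Share of predicted tree regions that are correct, and share of ground-truth tree regions that are found
  • AP - Average precision across confidence thresholds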

Results are printed and logged automatically.
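As a rough illustration of these metrics, the sketch below computes IoU, precision, and recall for a pair of binary masks, plus a simple non-interpolated average precision over ranked detections. It is written under stated assumptions rather than taken from the repository: the function names are hypothetical, the masks are assumed to be same-shape boolean arrays, and the AP variant shown is the plain ranking form, not the interpolated COCO-style one.

```python
import numpy as np


def mask_metrics(pred, truth):
    """Pixel-level IoU, precision, and recall for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.logical_and(pred, truth).sum())    # predicted and true
    fp = int(np.logical_and(pred, ~truth).sum())   # predicted, not true
    fn = int(np.logical_and(~pred, truth).sum())   # true, not predicted
    union = tp + fp + fn
    return {
        # Returning 0.0 for empty denominators is a convention chosen here.
        "iou": tp / union if union else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def average_precision(scores, is_true_positive, num_ground_truth):
    """Non-interpolated AP: mean of precision at each true-positive rank."""
    hits = np.asarray(is_true_positive, dtype=bool)
    if num_ground_truth == 0 or hits.size == 0:
        return 0.0
    # Rank detections from highest to lowest confidence.
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = hits[order]
    cum_tp = np.cumsum(hits)
    precision_at_k = cum_tp / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].sum() / num_ground_truth)


# Toy example: two 2x2 squares that share one column.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 1:3] = True
metrics = mask_metrics(pred, truth)
ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
```

Deciding which detections count as true positives (for example, IoU with a ground-truth mask above some threshold) is left outside this sketch.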


πŸ“ Project Structure

β”œβ”€β”€ Images/ β”‚ β”œβ”€β”€ Trees/ β”‚ β”œβ”€β”€ NotTrees/ β”œβ”€β”€ Labels/ β”‚ β”œβ”€β”€ Trees/ β”‚ β”œβ”€β”€ NotTrees/ β”œβ”€β”€ dataset.yaml β”œβ”€β”€ Sam_Filtered_ViT_Segmentation β”œβ”€β”€ README.md └── LICENSE


🤖 Authors & Contributions

  • 🔬 DINO & ViT Integration - @Zack4DEV
  • 🧠 SAM Post-Processing - @Zack4DEV
  • ⚙ Evaluation Engine - @Zack4DEV

📜 License

MIT License - see LICENSE for details.


🌍 Future Work

  • ✅ Integrate image captioning for detected tree regions
  • ⏳ Multi-class support (e.g., tree species)
  • ⏳ Web UI with Streamlit or Gradio
