PyTorch PlantCLEF: Multi-label Plant Species Classification with DINOv2

(Added Wednesday Apr 9th, 2025)

Jacob A Rose forked this repository from the starter repo originally created by @murilogustineli, in order to build on that work and compete in the 2025 PlantCLEF challenge.

PyTorch webinar on using the DINOv2 model and the Faiss library for multi-label plant species classification in the PlantCLEF @ LifeCLEF & CVPR-FGVC competition on Kaggle. This session will demonstrate how self-supervised Vision Transformers (ViTs) and similarity search techniques can classify plant species efficiently at scale.

This webinar is made possible through the support of the PyTorch Foundation and Intel AI.

Watch the Webinar on YouTube

▶️ Click here to watch on YouTube

Discover how we used DINOv2 and Faiss, together with PyTorch and PyTorch Lightning, for large-scale multi-label plant species classification.

Table of Contents

What You'll Learn
Event Details
Quickstart Guide
Intel Tiber AI Cloud Setup

What You’ll Learn

How to leverage DINOv2 embeddings for multi-label classification using transfer learning.
Efficient feature extraction from a subset of 1.4M+ images using PyTorch Lightning.
Using Faiss for fast nearest neighbor search on high-dimensional embeddings.
Image processing techniques: grid-based tiling and prediction aggregation to handle large datasets.
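The tiling, nearest neighbor search, and aggregation ideas listed above can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the webinar's implementation: the function names (`grid_boxes`, `nearest_species`, `aggregate_tile_predictions`), the 3x3 grid, the brute-force cosine search standing in for a Faiss index, and the "keep the best score per species" aggregation rule are all hypothetical choices made for clarity.

```python
import math


def grid_boxes(width, height, grid_size=3):
    """Split a width x height image into grid_size x grid_size tile boxes.

    Returns (left, upper, right, lower) tuples in row-major order, the box
    convention accepted by PIL's Image.crop.
    """
    boxes = []
    for row in range(grid_size):
        for col in range(grid_size):
            left = col * width // grid_size
            upper = row * height // grid_size
            right = (col + 1) * width // grid_size
            lower = (row + 1) * height // grid_size
            boxes.append((left, upper, right, lower))
    return boxes


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if degenerate)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_species(query, index, k=2):
    """Brute-force k-nearest-neighbor lookup (a stand-in for a Faiss index).

    index is a list of (species_id, embedding) pairs. Returns a dict
    {species_id: similarity} built from the k most similar entries.
    """
    scored = sorted(
        ((cosine_similarity(query, emb), species_id) for species_id, emb in index),
        reverse=True,
    )
    scores = {}
    for similarity, species_id in scored[:k]:
        # Entries are sorted best-first, so the first hit per species wins.
        scores.setdefault(species_id, similarity)
    return scores


def aggregate_tile_predictions(tile_scores, top_k=5):
    """Merge per-tile {species_id: score} dicts into one multi-label result.

    Assumption: a species counts for the whole image if any single tile
    scores it highly, so the maximum score per species is kept.
    """
    best = {}
    for scores in tile_scores:
        for species_id, score in scores.items():
            if species_id not in best or score > best[species_id]:
                best[species_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [species_id for species_id, _ in ranked[:top_k]]
```

In a real pipeline, each box from `grid_boxes` would be cropped from the image and embedded with DINOv2, each tile embedding would be looked up in the index, and the per-tile results would be passed to `aggregate_tile_predictions`. A library such as Faiss replaces the brute-force loop when the index holds a large number of embeddings.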

Event Details

📅 Date: March 27th, 12 PM PST

🎤 Speaker: Murilo Gustineli

📍 Where: Online Webinar

👋 Register today: Registration Page

Quickstart Guide

1. Clone the repository

First, clone the pytorch-plantclef repo:

⚠️ Using HTTPS (Recommended for Intel Tiber AI Cloud):

git clone https://github.com/murilogustineli/pytorch-plantclef.git

Using SSH:

git clone git@github.com:murilogustineli/pytorch-plantclef.git

Navigate to the project directory:

cd pytorch-plantclef

2. Install `uv` (Fast Package Manager)

Install uv as the package manager for the project. Follow the uv installation instructions for macOS, Linux, and Windows.

If running on Intel Tiber AI Cloud, install uv as follows (this also works on macOS and Linux):

curl -LsSf https://astral.sh/uv/install.sh | sh

Add it to PATH:

source $HOME/.local/bin/env

Check uv installation:

uv --version

3. Create a Virtual Environment

Create the virtual environment:

uv venv venv

Activate the virtual environment:

source venv/bin/activate
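To confirm that the activated environment is the one Python is actually using, a small check like the one below may help. It is a sketch, not part of the repository: the helper name `in_virtualenv` is hypothetical, and it relies on the standard behavior that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the environment is active, the interpreter path printed should sit inside the `venv` directory created in this step.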

4. Install Dependencies and Set Up the Project

Install the plantclef package in editable mode, which means changes to the Python files will be immediately available without needing to reinstall the package.

Install all dependencies from requirements.txt to the venv virtual environment:

uv pip install -e .

This command does two things:

Installs all dependencies listed in requirements.txt.
Sets up plantclef as an editable package inside the virtual environment.
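One way to check that the editable install worked is to ask Python where it would import the package from. The snippet below is a minimal sketch; the helper name `package_location` is hypothetical. With an editable install, the reported path typically points into the cloned repository rather than into `site-packages`.

```python
import importlib.util


def package_location(name="plantclef"):
    """Return the path a package would be imported from, or None if absent."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no single origin file; fall back to the
    # first search location if one exists.
    locations = spec.submodule_search_locations
    return list(locations)[0] if locations else None


if __name__ == "__main__":
    location = package_location("plantclef")
    if location is None:
        print("plantclef is not importable; rerun the install step")
    else:
        print("plantclef will be imported from:", location)
```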

[OPTIONAL] Set Up Pre-Commit Hooks for Code Formatting:

To ensure code follows best practices, install the pre-commit hooks:

pre-commit install

This automatically formats and checks your code before every commit.

5. Download Dataset & Fine-Tuned ViT Model

Run the following script to download:

Dataset (data/parquet/dataset_name)
Fine-Tuned DINOv2 Model (model/pretrained_models/model_name)

bash scripts/download_data_model.sh

This script will:

Download the dataset & model from Google Drive.
Extract the .zip files into their respective directories.
Remove the original .zip files to save space.
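The extract-and-clean-up part of those steps can be illustrated with the standard library. This is a sketch of the idea only, not the contents of `scripts/download_data_model.sh`: the function name `extract_and_remove` is hypothetical, and the download from Google Drive is deliberately left out because its details are not given here.

```python
import zipfile
from pathlib import Path


def extract_and_remove(zip_path, dest_dir):
    """Extract zip_path into dest_dir, then delete the archive.

    Returns the list of member names that were extracted.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    # Remove the archive only after extraction succeeded, to save space.
    zip_path.unlink()
    return names
```

Deleting the archive after the `with` block means a failed extraction raises before anything is removed, so the download does not have to be repeated.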

6. Run tests to verify setup

After downloading the data and fine-tuned model, verify that everything works by running the following pytest command:

pytest -vv -s tests/test_model.py

This test ensures that:

The virtual environment is correctly set up.
The DINOv2 model is correctly loaded.
Image embeddings are generated without errors.

If you're running locally, you should be good to go! If you're running on Intel Tiber AI Cloud, follow the setup below.

Intel Tiber AI Cloud Setup

⚠️ The Jupyter and terminal environments on Intel Tiber AI Cloud (ITAC) are NOT synced. Installing packages or setting environment variables in one does not automatically apply to the other.

To ensure proper Intel GPU (xpu) access, follow these steps:

Open the notebook: Open the Jupyter notebook notebooks/setup_itac.ipynb.
Run cells sequentially: Go through the notebook step by step.
Restart the kernel when required: Running the cell exit() restarts the Jupyter kernel so that the installations take effect.
Verify that the Intel GPU (xpu) is being used: At the end of the notebook, check that the PyTorch version and device are correct. The expected output if the Intel GPU is enabled is:
```
PyTorch Version: 2.5.1+cxx11.abi
Using device: xpu
```
If you see Using device: cpu, the setup did not correctly enable the Intel GPU; rerun the setup notebook.
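The same check can be run outside the notebook, for example from the terminal environment. The snippet below is a sketch rather than the code in `notebooks/setup_itac.ipynb`: the helper name `detect_device` is hypothetical, and it assumes a PyTorch build that exposes `torch.xpu` (the `getattr` guard covers builds that do not).

```python
def detect_device():
    """Return "xpu" or "cpu", or None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    # torch.xpu is only present on builds with Intel GPU support.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


if __name__ == "__main__":
    device = detect_device()
    if device is None:
        print("PyTorch is not installed in this environment")
    else:
        import torch

        print(f"PyTorch Version: {torch.__version__}")
        print(f"Using device: {device}")
```

Because the Jupyter and terminal environments are separate, running this in both places shows whether each one sees the Intel GPU.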
