Pollen_viability

🌸 Pollen Viability Detector (YOLOv8)

An automated computer vision pipeline for assessing pollen viability using the Alexander Stain technique. This project leverages YOLOv8 to detect and classify pollen grains as either Viable (Magenta/Purple) or Non-Viable (Green/Shriveled) from high-resolution microscope images.

🔭 Project Overview

This tool automates the tedious process of counting pollen grains on microscope slides. It uses deep learning to replicate the judgment of a biologist following the Alexander Stain protocol.
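As a rough illustration of what the counting step produces, the sketch below turns a list of predicted class labels into a viability percentage. The function name `viability_summary` and the output keys are hypothetical; only the class names `viable` and `non_viable` come from this project's labeling guide.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Union


def viability_summary(labels: Iterable[str]) -> Dict[str, Union[int, Optional[float]]]:
    """Count grains per class and compute the share of viable grains.

    `labels` is assumed to hold one class name per detected grain.
    """
    counts = Counter(labels)
    viable = counts.get("viable", 0)
    non_viable = counts.get("non_viable", 0)
    total = viable + non_viable
    # Avoid dividing by zero when an image contains no detections.
    pct = 100.0 * viable / total if total else None
    return {
        "viable": viable,
        "non_viable": non_viable,
        "total": total,
        "viability_pct": pct,
    }


if __name__ == "__main__":
    print(viability_summary(["viable", "viable", "non_viable", "viable"]))
```

Returning `None` for an empty image keeps "no grains found" distinct from "0% viable".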


🔬 Methodology

The project runs on the CESNET MetaCentrum high-performance computing cluster, utilizing NVIDIA A40/A100 GPUs for rapid training and inference.

Key Strategies used in this model:

  1. Mosaic Disabling: We disable mosaic augmentation for the final 10 epochs of training. This stabilizes the training loss and prevents the model from hallucinating pollen in empty space (ghost detections).
  2. High-Res Tiling: For inference, large microscope scans (>2000px) are sliced into overlapping tiles (1600x1600px). This prevents the “Squish Effect,” ensuring small grains retain their texture and are not lost during resizing.
  3. Border Patrol: To prevent double-counting grains on tile edges, custom logic ignores detections that touch a tile edge and relies on the neighboring, overlapping tile to capture the grain centrally.
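The tiling and edge-filter ideas above can be sketched as follows. This is a minimal illustration, not the code in src/run_detection.py: the function names, the 200px overlap, and the 2px edge margin are assumptions. Only the 1600px tile size comes from this README. Grains that lie entirely inside the overlap band can still be detected by both tiles and would need a separate de-duplication step, which this sketch does not cover.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in tile-local pixels


def tile_origins(length: int, tile: int = 1600, overlap: int = 200) -> List[int]:
    """Start offsets along one axis so overlapping tiles cover `length` pixels."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far border
    return starts


def keep_detection(box: Box, tile_x: int, tile_y: int, tile: int,
                   img_w: int, img_h: int, margin: float = 2.0) -> bool:
    """Return False when a box touches a tile edge that is interior to the image.

    Edges that coincide with the real image border are not penalized,
    because no neighboring tile exists there to re-detect the grain.
    """
    x1, y1, x2, y2 = box
    tw = min(tile, img_w - tile_x)  # actual tile width (smaller for small images)
    th = min(tile, img_h - tile_y)  # actual tile height
    touches_left = x1 <= margin and tile_x > 0
    touches_top = y1 <= margin and tile_y > 0
    touches_right = x2 >= tw - margin and tile_x + tw < img_w
    touches_bottom = y2 >= th - margin and tile_y + th < img_h
    return not (touches_left or touches_top or touches_right or touches_bottom)


if __name__ == "__main__":
    print(tile_origins(4000))
    print(keep_detection((0, 100, 50, 150), 1400, 0, 1600, 4000, 1600))
```

For the neighboring tile to see a rejected grain whole, the overlap has to exceed the largest expected grain diameter in pixels.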

βš™οΈ Workflow Setup

This project has migrated from Google Colab to the CESNET Cluster for better performance and data privacy.

1. Connecting to CESNET

Detailed instructions on how to set up your environment, connect via SSH, and manage files can be found in our connection guide:

👉 CESNET Connection Guide

2. Kubernetes Workflow

For instructions on running detection and training jobs, see: 👉 Kubernetes Workflow Guide

3. Environment Installation

The environment uses a custom Jupyter kernel. Dependencies include ultralytics, opencv-python-headless, and pandas. See the setup_environment.py script in the repo for automated dependency handling.
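A quick way to confirm that the kernel can see these dependencies is sketched below. It is not the contents of setup_environment.py. The helper name `missing_packages` is hypothetical, and the mapping relies on the fact that opencv-python-headless is imported as `cv2`.

```python
import importlib.util
from typing import Dict, List

# Maps pip package name -> importable top-level module name.
REQUIRED: Dict[str, str] = {
    "ultralytics": "ultralytics",
    "opencv-python-headless": "cv2",
    "pandas": "pandas",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        pkg for pkg, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running this inside the Jupyter kernel, not the login shell, checks the environment the notebooks actually use.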


🏷️ Labeling Guide (Roboflow)

We use Roboflow for annotating datasets. If you are contributing to the dataset, please adhere to strict Alexander Stain criteria.

Source: Pollen Viability Staining Guide (Alexander Stain)

1. Class: viable (Target)

Definition: Pollen grains that contain healthy, full cytoplasm and are capable of fertilization.

Example: Viable Example (Add your example image here)

2. Class: non_viable (Target)

Definition: Pollen grains that are aborted, empty, or dead. They lack the cytoplasm required for fertilization.

Example: Non-Viable Example (Add your example image here)

3. Edge Cases & Rules


πŸ‹οΈ Training (CESNET Cluster)

Training is now fully automated using Kubernetes jobs managed by deploy_training.sh.

Workflow:

  1. Script: src/train_model.py handles the entire pipeline.
  2. Data: Automatically syncs datasets and staging areas from S3.
  3. Configuration:
    • Epochs: 300
    • Image Size: 640px
    • Batch Size: 16
    • Augmentations: Full rotation (180°), flips, and Mosaic (1.0).
  4. Process: Merges new staged data, generates synthetic negatives, and trains YOLOv8x.
  5. Backup: Results (weights, logs, visualizations) are automatically uploaded back to S3.
  6. Dynamic Save Directory Handling: The script automatically captures the exact location where YOLOv8 saves training outputs (e.g., runs/detect vs runs/segment), ensuring that S3 backups always find the correct files regardless of YOLO version or task defaults.
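The configuration above maps roughly onto Ultralytics training arguments as sketched below. This is an illustration under assumptions, not the code in src/train_model.py: the helper names, the `data.yaml` path, and the 0.5 flip probabilities are hypothetical, and `close_mosaic=10` reflects the "disable mosaic for the final 10 epochs" strategy described in the Methodology section. Reading `model.trainer.save_dir` after training is one way to capture the actual output directory. Whether the project's script does it this way is not confirmed here.

```python
from typing import Any, Dict


def build_train_kwargs(data_yaml: str) -> Dict[str, Any]:
    """Collect the training settings listed in this README as keyword arguments."""
    return {
        "data": data_yaml,   # dataset YAML path (hypothetical)
        "epochs": 300,
        "imgsz": 640,
        "batch": 16,
        "degrees": 180.0,    # full rotation range
        "fliplr": 0.5,       # assumed horizontal flip probability
        "flipud": 0.5,       # assumed vertical flip probability
        "mosaic": 1.0,
        "close_mosaic": 10,  # turn mosaic off for the last 10 epochs
    }


def train(data_yaml: str, weights: str = "yolov8x.pt"):
    """Train and return the directory where Ultralytics wrote its outputs."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO(weights)
    model.train(**build_train_kwargs(data_yaml))
    # The trainer records the run directory it actually used (e.g. under runs/detect).
    return model.trainer.save_dir


if __name__ == "__main__":
    print(build_train_kwargs("data.yaml"))
```

Keeping the settings in a plain dictionary makes it easy to log them alongside the weights in the S3 backup.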

To Run:

sudo ./deploy_training.sh

🔍 Routine Detection

Detection is handled by src/run_detection.py and deployed via deploy_pollen.sh.

👉 Routine Detection Manual

Quickstart Pipeline:

  1. Upload images via Cyberduck to the detect_images/ folder of the S3 bucket.
  2. Deploy the job:
    sudo ./deploy_pollen.sh
    
  3. Sync results locally (automatic or manual):
    python3 src/sync_results.py
    

License: MIT