Pipeline

This section provides detailed documentation for all tools in the package. Each tool is designed to handle a specific part of the digital pathology analysis pipeline.

Overview

The tools follow a modular design that enables flexible workflow composition. Each tool can be used independently or as part of a complete analysis pipeline.

Pipeline Flow

The typical workflow follows this sequence:

  1. Patch Extraction → Extract patches from whole slide images

  2. Cell Segmentation → Identify and segment individual cells

  3. Graph Creation → Create spatial graphs from segmented cells

  4. Feature Extraction → Compute quantitative features from cells

  5. Dataset Creation → Process multiple slides into training datasets

  6. Feature Visualization → Visualize and analyze extracted features

  7. MIL Training → Train Multiple Instance Learning models

Quick Reference

Basic CLI Commands

# Extract patches from a slide
patch_extraction --output_path ./results --wsi_path ./data/slide.svs --patch_size 1024 --patch_overlap 6.25 --target_mag 20.0

# Segment cells in patches
cell_segmentation --model cellvit --gpu 0 --wsi_path ./data/slide.svs --patched_slide_path ./results/slide

# Create spatial graphs from segmented cells
graph_creation --method delaunay_radius --patched_slide_path ./results/slide --segmentation_model cellvit

# Extract features from segmented cells
feature_extraction --extractor pyradiomics_hed --wsi_path ./data/slide.svs --patched_slide_path ./results/slide --segmentation_model cellvit

# Create a dataset from multiple slides
create_dataset --excel_path ./data/metadata.xlsx --output_path ./results --gpu 0 --segmentation_models cellvit hovernet --extractors morphometrics pyradiomics

# Visualize extracted features
vis_features --dataset ./results
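
The per-slide commands above can be chained into a single script. The following is a minimal sketch, not part of the package: the run and run_slide functions and the DRY_RUN switch are hypothetical helpers, and the script assumes (as the example paths above suggest) that patch_extraction writes its results to <output_path>/<slide name without extension>. All flags and values are copied from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: run the four per-slide steps in order for one slide.
set -eu

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_slide() {
    local wsi="$1" out="$2" gpu="${3:-0}"
    local name patched
    name="$(basename "$wsi")"
    name="${name%.*}"            # strip the file extension
    patched="$out/$name"         # assumed layout: <output_path>/<slide name>

    run patch_extraction --output_path "$out" --wsi_path "$wsi" \
        --patch_size 1024 --patch_overlap 6.25 --target_mag 20.0
    run cell_segmentation --model cellvit --gpu "$gpu" \
        --wsi_path "$wsi" --patched_slide_path "$patched"
    run graph_creation --method delaunay_radius \
        --patched_slide_path "$patched" --segmentation_model cellvit
    run feature_extraction --extractor pyradiomics_hed --wsi_path "$wsi" \
        --patched_slide_path "$patched" --segmentation_model cellvit
}

# Usage: ./run_slide.sh ./data/slide.svs ./results 0
if [ "$#" -ge 2 ]; then
    run_slide "$@"
fi
```

Running the script with DRY_RUN=1 prints the four commands without executing them, which is a cheap way to check the derived paths before starting a long job. Because of set -e, a failing step stops the script so later steps never run on incomplete output.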

Common Parameters

File Paths
  • --wsi_path: Path to whole slide image file

  • --patched_slide_path: Path to directory with processed slide data

  • --output_path: Directory for saving results

  • --excel_path: Excel file with slide metadata and labels

Model Selection
  • --model: Choice of segmentation or MIL model

  • --extractor: Feature extraction method

  • --graph_method: Graph creation method

  • --segmentation_model: Segmentation model used

Computing Resources
  • --gpu: GPU device ID (or -1 for CPU)

Installation and Setup

Ensure you have the package installed with all dependencies. For detailed installation instructions, see Installation.

Tool Categories

Data Processing Tools

These tools handle the initial processing of whole slide images.

Analysis and Visualization Tools

These tools help you understand and analyze your data.

Machine Learning Tools

These tools handle model training and inference.

Performance Considerations

GPU Usage

Most computationally intensive tools support GPU acceleration:

  • Specify the GPU device with --gpu N, where N is the device ID (use -1 to run on the CPU)

  • The tools fall back to the CPU automatically if no GPU is available

Memory Management

For large datasets:

  • Tools automatically manage memory usage

  • Process slides individually to prevent memory overflow

  • Temporary files are cleaned up automatically

  • Progress is saved incrementally
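
Processing slides one at a time can also be driven from a small wrapper script. The sketch below is an illustration under stated assumptions, not the package's own mechanism: pending_slides is a hypothetical helper, slides are assumed to be .svs files in one directory, and a slide is treated as done when <output_path>/<slide name without extension> already exists.

```shell
#!/usr/bin/env bash
# Sketch: extract patches one slide at a time, skipping slides already done.
set -eu

# Print the slides in directory $1 that have no output directory under $2.
# Assumed layout: <output_path>/<slide name without extension>.
pending_slides() {
    local data_dir="$1" out="$2" wsi name
    for wsi in "$data_dir"/*.svs; do
        [ -e "$wsi" ] || continue      # glob matched nothing
        name="$(basename "$wsi")"
        name="${name%.*}"
        [ -d "$out/$name" ] || echo "$wsi"
    done
}

# Usage: ./extract_all.sh ./data ./results
if [ "$#" -eq 2 ]; then
    pending_slides "$1" "$2" | while IFS= read -r wsi; do
        patch_extraction --output_path "$2" --wsi_path "$wsi" \
            --patch_size 1024 --patch_overlap 6.25 --target_mag 20.0
    done
fi
```

Note that the directory check is deliberately crude: a slide whose extraction was interrupted will still have an output directory and will be skipped, so remove partial output before re-running.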

Parallel Processing

Where applicable, tools use parallel processing:

  • Multi-core CPU utilization

  • Batch processing for GPU operations

  • Concurrent file I/O operations

Getting Help

Command-Line Help

Each tool provides built-in help:

# Get help for any tool
patch_extraction --help
cell_segmentation --help
graph_creation --help
feature_extraction --help

Documentation Resources

  • This documentation: Comprehensive guides for each tool

  • API documentation: Detailed technical reference

  • Quick start guide: Step-by-step tutorial

  • GitHub repository: Source code and issue tracking

See Also