glasscut.dataset package
Submodules
glasscut.dataset.generator module
Dataset generation orchestration for multi-slide tiling workflows.
- class glasscut.dataset.generator.DatasetGenerator(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)
Bases: object
Generate a tile dataset from one or more slide files.
- Parameters:
- __init__(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)
Initialize generator from direct parameters.
- Parameters:
dataset_id (str) – Dataset identifier.
output_dir (str | Path) – Output root directory.
tiler (Tiler) – Preconfigured tiler instance used for extraction.
n_workers (int, optional) – Number of workers for batched tile extraction. Default is 4.
batch_size (int, optional) – Number of tiles per extraction batch. Default is 128.
save_thumbnails (bool, optional) – Whether to save slide thumbnail artifacts.
save_masks (bool, optional) – Whether to save tissue mask artifacts.
save_processed_json (bool, optional) – Whether to save processed.json at dataset root.
show_progress (bool, optional) – Whether to display progress bars for slides and tiles.
verbose (bool, optional) – Whether to enable info-level logs.
- Return type:
None
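The keyword-only options above can be collected and checked before the generator is constructed. The following is a minimal sketch, not part of glasscut: `build_generator_kwargs` and `DOCUMENTED_DEFAULTS` are hypothetical names, the defaults are copied from the signature above, and the positive-integer check on `n_workers` and `batch_size` is an assumption rather than documented behavior. Constructing the `Tiler` is outside the scope of this page, so the final call is left as a comment.

```python
from pathlib import Path

# Keyword-only options and defaults, copied from the documented signature.
DOCUMENTED_DEFAULTS = {
    "n_workers": 4,
    "batch_size": 128,
    "save_thumbnails": True,
    "save_masks": True,
    "save_processed_json": True,
    "show_progress": True,
    "verbose": True,
}


def build_generator_kwargs(dataset_id, output_dir, **overrides):
    """Hypothetical helper: assemble keyword arguments for DatasetGenerator."""
    # Reject option names that do not appear in the documented signature.
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected option(s): {sorted(unknown)}")
    options = {**DOCUMENTED_DEFAULTS, **overrides}
    # Assumption: worker and batch counts only make sense as positive integers.
    for name in ("n_workers", "batch_size"):
        if not isinstance(options[name], int) or options[name] < 1:
            raise ValueError(f"{name} must be a positive integer")
    return {"dataset_id": str(dataset_id), "output_dir": Path(output_dir), **options}


kwargs = build_generator_kwargs("demo", "out/demo", n_workers=8, save_masks=False)
# With glasscut installed and a preconfigured Tiler available (not shown here):
# generator = DatasetGenerator(**kwargs, tiler=tiler)
```

Keeping the options in a plain dictionary makes it easy to log or serialize the exact configuration used for a dataset run.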
glasscut.dataset.live module
Live in-memory slide-level dataset utilities.
This module provides a dataset-like interface where each item corresponds to one slide and contains all extracted tiles for that slide, without writing artifacts to disk.
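The access pattern described here, one item per slide with tiles produced on demand and kept in memory, can be illustrated with stand-in classes. This is a sketch under stated assumptions, not glasscut's implementation: `SlideSample`, `LazySlideDataset`, and `fake_extract` are hypothetical names, the sample carries only two fields for brevity, and the toy extractor returns placeholder strings instead of image tiles.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SlideSample:
    """Stand-in for a slide-level sample, reduced to two fields."""
    slide_path: str
    tiles: list


class LazySlideDataset:
    """Stand-in illustrating slide-level, on-demand extraction."""

    def __init__(self, slide_paths: Sequence[str], extract_tiles: Callable[[str], list]):
        self.slide_paths = list(slide_paths)
        self._extract_tiles = extract_tiles  # hypothetical stand-in for a tiler

    def __len__(self) -> int:
        return len(self.slide_paths)

    def __getitem__(self, index: int) -> SlideSample:
        path = self.slide_paths[index]
        # Extraction runs on access; this sketch neither caches nor writes to disk.
        return SlideSample(slide_path=path, tiles=self._extract_tiles(path))


def fake_extract(path: str) -> list:
    """Toy extractor: returns three placeholder tile labels per slide."""
    return [f"{path}#tile{i}" for i in range(3)]


dataset = LazySlideDataset(["a.svs", "b.svs"], fake_extract)
sample = dataset[1]
print(len(dataset), sample.slide_path, len(sample.tiles))  # prints: 2 b.svs 3
```

Because each item holds every tile for its slide, memory use scales with the largest slide rather than with the whole collection.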
- class glasscut.dataset.live.LiveSlideSample(slide_id, slide_name, slide_path, dimensions, mpp, magnifications, tiles)
Bases: object
Container for one in-memory slide sample.
- Parameters:
- class glasscut.dataset.live.LiveSlideDataset(slide_paths, *, tiler, n_workers=4, batch_size=128, use_cucim=True)
Bases: object
Slide-level in-memory dataset.
Each __getitem__ call opens one slide, extracts all tiles in memory using the configured tiler, and returns a LiveSlideSample.
- Parameters:
Module contents
Dataset generation module.
- class glasscut.dataset.DatasetGenerator(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)
Bases: object
Generate a tile dataset from one or more slide files.
- Parameters:
- __init__(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)
Initialize generator from direct parameters.
- Parameters:
dataset_id (str) – Dataset identifier.
output_dir (str | Path) – Output root directory.
tiler (Tiler) – Preconfigured tiler instance used for extraction.
n_workers (int, optional) – Number of workers for batched tile extraction. Default is 4.
batch_size (int, optional) – Number of tiles per extraction batch. Default is 128.
save_thumbnails (bool, optional) – Whether to save slide thumbnail artifacts.
save_masks (bool, optional) – Whether to save tissue mask artifacts.
save_processed_json (bool, optional) – Whether to save processed.json at dataset root.
show_progress (bool, optional) – Whether to display progress bars for slides and tiles.
verbose (bool, optional) – Whether to enable info-level logs.
- Return type:
None
- class glasscut.dataset.LiveSlideDataset(slide_paths, *, tiler, n_workers=4, batch_size=128, use_cucim=True)
Bases: object
Slide-level in-memory dataset.
Each __getitem__ call opens one slide, extracts all tiles in memory using the configured tiler, and returns a LiveSlideSample.
- Parameters: