glasscut package

Submodules

glasscut.exceptions module

exception glasscut.exceptions.GlassCutException[source]

Bases: Exception

Base class for all GlassCut custom exceptions.

exception glasscut.exceptions.SlidePropertyError[source]

Bases: GlassCutException

Raised when a required slide property is not available.

exception glasscut.exceptions.BackendError[source]

Bases: GlassCutException

Raised when there’s an issue with the slide backend.

exception glasscut.exceptions.MagnificationError[source]

Bases: GlassCutException

Raised when requested magnification is not available.

exception glasscut.exceptions.TileSizeOrCoordinatesError[source]

Bases: GlassCutException

Raised when tile size or coordinates are invalid.

exception glasscut.exceptions.FilterCompositionError[source]

Bases: GlassCutException

Raised when a filter composition for the class is not available.

glasscut.tile module

class glasscut.tile.Tile(image, coords, magnification, tissue_detector=None, **kwargs)[source]

Bases: object

Provide Tile object representing a tile generated from a Slide object.

Parameters:
  • image (Image.Image) – Image describing the tile

  • coords (tuple[int, int]) – Coordinates (x, y) of the tile at level 0 (upper-left corner)

  • magnification (int | float) – Magnification at which the tile was extracted

  • tissue_detector (TissueDetector | None, optional) – Strategy for tissue detection. Defaults to OtsuTissueDetector.

  • kwargs (Any)

__init__(image, coords, magnification, tissue_detector=None, **kwargs)[source]

Initialize a Tile.

Parameters:
  • image (Image.Image) – The tile image in RGB format

  • coords (tuple[int, int] | None) – Coordinates (x, y) of the tile at level 0. Can be None for utility tiles.

  • magnification (int | float | None) – Magnification at which the tile was extracted. Can be None for utility tiles.

  • tissue_detector (TissueDetector | None, optional) – Strategy for detecting tissue. If None, uses OtsuTissueDetector.

  • **kwargs (Any) – Optional keyword-only extensions. Supported key: precomputed_tissue_ratio (float), used to skip recomputing tissue ratio.

Return type:

None

set_precomputed_tissue_ratio(tissue_ratio)[source]

Set a precomputed tissue ratio to avoid re-running detection.

Parameters:

tissue_ratio (float)

Return type:

None

has_enough_tissue(tissue_threshold=0.2)[source]

Check if the tile has enough tissue.

This method checks if the proportion of the detected tissue over the total area of the tile is above a specified threshold (by default 20%).

Parameters:

tissue_threshold (float, optional) – Number between 0.0 and 1.0 representing the minimum required proportion of tissue over the total area of the image. Default is 0.2.

Returns:

enough_tissue – Whether the image has enough tissue, i.e. if the proportion of tissue over the total area of the image is more than tissue_threshold.

Return type:

bool
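The check itself reduces to comparing the tile's tissue ratio against the threshold. A minimal NumPy sketch of that logic (an illustration, not GlassCut's internal code):

```python
import numpy as np

def has_enough_tissue(mask: np.ndarray, tissue_threshold: float = 0.2) -> bool:
    """Return True if the fraction of tissue pixels (mask == 1) exceeds the threshold."""
    tissue_ratio = float(mask.sum()) / mask.size
    return tissue_ratio > tissue_threshold

# A 10x10 mask with 30 tissue pixels has a tissue ratio of 0.3
mask = np.zeros((10, 10), dtype=np.uint8)
mask[:3, :] = 1
```

With the default threshold of 0.2 this mask passes; with a threshold of 0.5 it does not.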

save(path)[source]

Save tile at given path.

The format to use is determined from the filename extension (compatible with PIL.Image formats). If no extension is provided, the image will be saved in PNG format.

Parameters:

path (str or pathlib.Path) – Path to which the tile is saved.

Return type:

None

property tissue_mask: ndarray

Binary mask representing the tissue in the tile.

The mask is computed using the configured tissue detector strategy.

Returns:

Binary mask representing the tissue in the tile (dtype: uint8, values 0 or 1)

Return type:

np.ndarray

property tissue_ratio: float

Ratio of the tissue area over the total area of the tile.

Returns:

Ratio of the tissue area over the total area of the tile

Return type:

float

glasscut.utils module

glasscut.utils.lazyproperty(f)[source]

Decorator like @property, but evaluated only on first access.

Like @property, this can only be used to decorate methods having only a self parameter, and is accessed like an attribute on an instance, i.e. trailing parentheses are not used. Unlike @property, the decorated method is only evaluated on first access; the resulting value is cached and that same value returned on second and later access without re-evaluation of the method.

Like @property, this class produces a data descriptor object, which is stored in the __dict__ of the class under the name of the decorated method (‘fget’ nominally). The cached value is stored in the __dict__ of the instance under that same name.

Because it is a data descriptor (as opposed to a non-data descriptor), its __get__() method is executed on each access of the decorated attribute; the __dict__ item of the same name is “shadowed” by the descriptor.

While this may represent a performance improvement over a property, its greater benefit may be its other characteristics. One common use is to construct collaborator objects, removing that “real work” from the constructor, while still only executing once. It also de-couples client code from any sequencing considerations; if it’s accessed from more than one location, it’s assured it will be ready whenever needed.

A lazyproperty is read-only. There is no counterpart to the optional “setter” (or deleter) behavior of an @property. This is critically important to maintaining its immutability and idempotence guarantees. Attempting to assign to a lazyproperty raises AttributeError unconditionally. The parameter names in the methods below correspond to this usage example:

class Obj(object):

    @lazyproperty
    def fget(self):
        return 'some result'

obj = Obj()

Not suitable for wrapping a function (as opposed to a method) because it is not callable.

Parameters:

f (Callable[[...], T])

Return type:

Any
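The caching behavior described above can be sketched with a minimal data descriptor. The class below is a simplified, hypothetical stand-in for the real decorator, showing the two key properties: the value is computed once and cached in the instance __dict__, and assignment raises AttributeError:

```python
class lazyprop:  # hypothetical minimal stand-in, not glasscut.utils.lazyproperty
    """Data descriptor: compute on first access, cache in the instance __dict__."""

    def __init__(self, fget):
        self._fget = fget
        self._name = fget.__name__

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        # Because this is a data descriptor, __get__ runs on every access;
        # the cached __dict__ entry is consulted here rather than by normal lookup.
        if self._name not in obj.__dict__:
            obj.__dict__[self._name] = self._fget(obj)
        return obj.__dict__[self._name]

    def __set__(self, obj, value):
        raise AttributeError("lazyprop is read-only")

class Obj:
    calls = 0

    @lazyprop
    def fget(self):
        Obj.calls += 1
        return "some result"

obj = Obj()
```

Accessing `obj.fget` twice evaluates the method body only once.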

glasscut.utils.np_to_pil(np_img)[source]

Convert a NumPy array to a PIL Image.

Handles the conversion of different numpy array types (bool, float64, uint8, etc.) to a properly formatted PIL Image.

Parameters:

np_img (np.ndarray) – The image represented as a NumPy array.

Returns:

The image represented as PIL Image

Return type:

PIL.Image.Image

Examples

>>> import numpy as np
>>> from glasscut.utils import np_to_pil
>>> float_array = np.random.rand(100, 100, 3)
>>> pil_image = np_to_pil(float_array)
class glasscut.utils.Profiler(enabled=True)[source]

Bases: object

Lightweight phase-based profiler with thread-safe accumulation.

Zero overhead when enabled is False — all methods return immediately.

Example

>>> profiler = Profiler(enabled=True)
>>> for i in range(100):
...     with profiler.phase("compute"):
...         _ = i ** 2
>>> profiler.print_summary()
Parameters:

enabled (bool)

__init__(enabled=True)[source]
Parameters:

enabled (bool)

Return type:

None

enabled
phase(name)[source]
Parameters:

name (str)

Return type:

_PhaseContext

property phases: dict[str, float]
summary(sort=True)[source]
Parameters:

sort (bool)

Return type:

str

print_summary(sort=True)[source]
Parameters:

sort (bool)

Return type:

None

record(name, elapsed)[source]
Parameters:
  • name (str) – Phase name under which the time is accumulated.

  • elapsed (float) – Elapsed time in seconds to add to the phase.

Return type:

None

reset()[source]
Return type:

None

Module contents

GlassCut: Fast and flexible histopathology image tiling library.

A lightweight, extensible library for extracting tiles from whole slide images (WSI) with support for multiple tiling strategies, and parallel processing.

Core Components:
Slide I/O:
  • Slide: WSI reader with backend abstraction

  • Tile: Individual tile image with metadata

Tiling Strategies:
  • Tiler: Abstract base class for tiling strategies

  • GridTiler: Regular grid tiling with optional overlap

Tissue Detection:
  • OtsuTissueDetector: Otsu method tissue detection

Stain Normalization:
  • MacenkoStainNormalizer: Stain color normalization

  • ReinhardtStainNormalizer: Fast color transfer-based normalization

See documentation at: https://github.com/CamiloSinningUN/glasscut

class glasscut.Slide(path, use_cucim=True)[source]

Bases: object

Represents a whole slide image with magnification-based access.

This class provides an interface to access whole slide images. It abstracts away the backend (OpenSlide or cuCim).

Parameters:
  • path (Union[str, pathlib.Path]) – Path to the slide file

  • use_cucim (bool, optional) – Whether to try using cuCim GPU backend. If False or cuCim is not available, falls back to OpenSlide. Default is True.

__init__(path, use_cucim=True)[source]
Parameters:
Return type:

None

__enter__()[source]

Context manager entry.

__exit__(exc_type, exc_val, exc_tb)[source]

Context manager exit.

Parameters:
Return type:

None

close()[source]

Close the slide and free resources.

Return type:

None
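Because __enter__ and __exit__ follow the standard context-manager protocol, close() runs even when the with block raises. The pattern can be sketched generically (a hypothetical Resource class illustrating the protocol, not the Slide implementation):

```python
class Resource:  # hypothetical stand-in illustrating the protocol Slide follows
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        # Return self so `with Resource() as r:` binds the open resource
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()
        return False  # do not suppress exceptions raised in the with block

with Resource() as r:
    pass
```

After the with block, the resource is closed whether or not an exception occurred.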

property name: str

Slide name without extension.

Returns:

Slide filename without extension

Return type:

str

property dimensions: tuple[int, int]

Slide dimensions (width, height) at base magnification.

Returns:

(width, height) in pixels at highest magnification (typically 40x)

Return type:

tuple[int, int]

property magnifications: list[float]

Available magnifications for this slide.

These are calculated from the actual slide’s base magnification (objective power) and the number of pyramid levels.

Returns:

List of magnifications in descending order (e.g., [40.0, 20.0, 10.0, 5.0])

Return type:

list[float]

property mpp: float

Microns per pixel at base magnification.

Returns:

Microns per pixel

Return type:

float

Raises:

SlidePropertyError – If MPP cannot be determined from slide metadata

property properties: dict[str, str]

Slide metadata properties.

Returns:

Dictionary of all slide properties

Return type:

dict

property thumbnail: Image

Get thumbnail of the slide.

The thumbnail size is automatically calculated based on slide dimensions.

Returns:

Thumbnail image in RGB format

Return type:

PIL.Image.Image

extract_tile(coords, tile_size, magnification)[source]

Extract a single tile from the slide at specified magnification.

The requested magnification must be available on this slide. If the exact magnification is not available, a MagnificationError is raised.

Parameters:
  • coords (tuple[int, int]) – Coordinates (x, y) at level 0 (upper-left corner of the tile)

  • tile_size (tuple[int, int]) – Desired tile size (width, height) in pixels

  • magnification (int | float) – Target magnification (e.g., 40, 20, 10, 5)

Returns:

Extracted tile object

Return type:

Tile

Raises:
  • MagnificationError – If the requested magnification is not available on this slide

  • TileSizeOrCoordinatesError – If the tile size or coordinates are invalid for the slide

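
Magnification-based access typically maps a requested magnification to a pyramid level via its ratio to the base magnification. A sketch of that mapping, assuming 2x downsampling per level (an assumption about the general technique, not GlassCut's exact logic):

```python
import math

def magnification_to_level(base_mag: float, target_mag: float) -> int:
    """Map a target magnification to a pyramid level (assumes 2x downsampling per level)."""
    ratio = base_mag / target_mag
    level = math.log2(ratio)
    if not level.is_integer() or level < 0:
        # Mirrors the documented behavior: unavailable magnifications are an error
        raise ValueError(f"magnification {target_mag} not available from base {base_mag}")
    return int(level)
```

For a slide with base magnification 40, a request for 10x resolves to level 2, while 15x is rejected.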
class glasscut.Tile(image, coords, magnification, tissue_detector=None, **kwargs)[source]

Bases: object

Provide Tile object representing a tile generated from a Slide object.

Parameters:
  • image (Image.Image) – Image describing the tile

  • coords (tuple[int, int]) – Coordinates (x, y) of the tile at level 0 (upper-left corner)

  • magnification (int | float) – Magnification at which the tile was extracted

  • tissue_detector (TissueDetector | None, optional) – Strategy for tissue detection. Defaults to OtsuTissueDetector.

  • kwargs (Any)

__init__(image, coords, magnification, tissue_detector=None, **kwargs)[source]

Initialize a Tile.

Parameters:
  • image (Image.Image) – The tile image in RGB format

  • coords (tuple[int, int] | None) – Coordinates (x, y) of the tile at level 0. Can be None for utility tiles.

  • magnification (int | float | None) – Magnification at which the tile was extracted. Can be None for utility tiles.

  • tissue_detector (TissueDetector | None, optional) – Strategy for detecting tissue. If None, uses OtsuTissueDetector.

  • **kwargs (Any) – Optional keyword-only extensions. Supported key: precomputed_tissue_ratio (float), used to skip recomputing tissue ratio.

Return type:

None

set_precomputed_tissue_ratio(tissue_ratio)[source]

Set a precomputed tissue ratio to avoid re-running detection.

Parameters:

tissue_ratio (float)

Return type:

None

has_enough_tissue(tissue_threshold=0.2)[source]

Check if the tile has enough tissue.

This method checks if the proportion of the detected tissue over the total area of the tile is above a specified threshold (by default 20%).

Parameters:

tissue_threshold (float, optional) – Number between 0.0 and 1.0 representing the minimum required proportion of tissue over the total area of the image. Default is 0.2.

Returns:

enough_tissue – Whether the image has enough tissue, i.e. if the proportion of tissue over the total area of the image is more than tissue_threshold.

Return type:

bool

save(path)[source]

Save tile at given path.

The format to use is determined from the filename extension (compatible with PIL.Image formats). If no extension is provided, the image will be saved in PNG format.

Parameters:

path (str or pathlib.Path) – Path to which the tile is saved.

Return type:

None

property tissue_mask: ndarray

Binary mask representing the tissue in the tile.

The mask is computed using the configured tissue detector strategy.

Returns:

Binary mask representing the tissue in the tile (dtype: uint8, values 0 or 1)

Return type:

np.ndarray

property tissue_ratio: float

Ratio of the tissue area over the total area of the tile.

Returns:

Ratio of the tissue area over the total area of the tile

Return type:

float

class glasscut.Tiler[source]

Bases: ABC

Abstract base class for tile extraction strategies.

A Tiler is responsible for determining which tiles to extract from a slide and providing them to the user. Different tiling strategies can be implemented by subclassing this class.

abstractmethod extract(slide, *, n_workers=4, batch_size=128)[source]

Extract tiles from slide.

This is the primary extraction API for all tilers. Implementations can use batching and parallelism internally.

Parameters:
  • slide (Slide) – The slide object to extract tiles from

  • n_workers (int, optional) – Worker hint for internal parallel extraction. Default is 4.

  • batch_size (int, optional) – Internal extraction batch size. Default is 128.

Yields:

Tile – Individual tile objects with image, coordinates, and metadata

Raises:
  • MagnificationError – If the requested magnification is not available on this slide

  • TileSizeOrCoordinatesError – If generated coordinates are invalid for the slide

Example

>>> slide = Slide("slide.svs")
>>> tiler = GridTiler(tile_size=(512, 512), overlap=50)
>>> for tile in tiler.extract(slide):
...     tile.save(f"tile_{tile.coords}.png")

Return type:

Generator[Tile, None, None]

abstractmethod get_tile_boxes(slide)[source]

Get all tile boxes without extracting images.

This method computes which tile regions would be extracted without actually reading images from the slide. Useful for planning, filtering, or batch processing.

Parameters:

slide (Slide) – The slide object

Returns:

List of tile boxes as (x, y, width, height) in level-0 space.

Example

>>> tiler = GridTiler(tile_size=(512, 512))
>>> boxes = tiler.get_tile_boxes(slide)
>>> print(f"Will extract {len(boxes)} tiles")

Return type:

list[tuple[int, int, int, int]]

visualize(slide, scale_factor=32, colors=None, alpha=200, linewidth=1)[source]

Visualize tile grid on a slide thumbnail.

This method creates a thumbnail of the slide and draws the tile grid on top of it. Useful for verifying tiling strategy before processing.

Parameters:
  • slide (Slide) – The slide object to visualize

  • scale_factor (int, optional) – Scale factor for thumbnail downsampling. Default is 32.

  • colors (list[tuple[int, int, int]] | None, optional) – RGB colors for tile rectangles. If None, uses a cycle of colors. Default is None.

  • alpha (int, optional) – Transparency alpha value for rectangles (0-255). Default is 200.

  • linewidth (int, optional) – Width of rectangle lines in pixels. Default is 1.

Returns:

Thumbnail image with tile grid drawn on it

Example

>>> tiler = GridTiler(tile_size=(512, 512))
>>> viz_image = tiler.visualize(slide)
>>> viz_image.show()

Return type:

Image

class glasscut.GridTiler(tile_size=(512, 512), magnification=20, overlap=0, min_tissue_ratio=0.2, tissue_detector=None, transforms=None, show_progress=True, debug=False)[source]

Bases: Tiler

Extract tiles using a regular grid.

Parameters:
  • tile_size (tuple[int, int], optional) – Default tile size as (width, height) in pixels at requested magnification. Default is (512, 512).

  • magnification (int | float, optional) – Magnification used for extraction and coordinate generation. Default is 20.

  • overlap (int, optional) – Overlap between neighboring tiles in pixels at requested magnification. Default is 0.

  • min_tissue_ratio (float, optional) – Minimum tissue ratio in [0.0, 1.0] required for preselection. Default is 0.2.

  • tissue_detector (TissueDetector | None, optional) – Tissue detector used for preselection mask. Defaults to OtsuTissueDetector.

  • show_progress (bool, optional) – Whether to display a loading bar while extracting tiles. Default is True.

  • debug (bool, optional) – When True, record and print per-phase timing breakdown (tissue mask, candidate grid, tile extraction, transforms). Default is False.

  • transforms (list[Callable[[Image], Image]] | None)

__init__(tile_size=(512, 512), magnification=20, overlap=0, min_tissue_ratio=0.2, tissue_detector=None, transforms=None, show_progress=True, debug=False)[source]
Parameters:
Return type:

None

get_tile_boxes(slide)[source]

Return preselected grid boxes as (x, y, width, height).

Parameters:

slide (Slide)

Return type:

list[tuple[int, int, int, int]]
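The grid itself is simple to compute: tiles are placed every tile_size - overlap pixels, and boxes that would run past the slide edge are dropped. A pure-Python sketch of that layout, ignoring the level-0 coordinate scaling and tissue preselection the real tiler performs:

```python
def grid_boxes(slide_w, slide_h, tile_w, tile_h, overlap=0):
    """Return (x, y, w, h) boxes on a regular grid with the given overlap in pixels."""
    step_x, step_y = tile_w - overlap, tile_h - overlap
    boxes = []
    # Only keep boxes that fit entirely inside the slide bounds
    for y in range(0, slide_h - tile_h + 1, step_y):
        for x in range(0, slide_w - tile_w + 1, step_x):
            boxes.append((x, y, tile_w, tile_h))
    return boxes
```

A 1024x512 slide fits two non-overlapping 512x512 tiles; with overlap=256 the step shrinks to 256 and three tiles fit along the width.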

get_tile_candidates(slide)[source]

Return preselected boxes with tissue ratio as (x, y, w, h, ratio).

Parameters:

slide (Slide)

Return type:

list[tuple[int, int, int, int, float]]

extract(slide, *, n_workers=4, batch_size=128)[source]

Yield tiles using batched parallel extraction.

Each worker thread extracts a single tile from the slide and immediately applies the full transform pipeline.

Parameters:
Return type:

Generator[Tile, None, None]
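The batched, parallel yield pattern described above can be sketched generically with a thread pool (a simplification; the real extractor's worker behavior and ordering may differ):

```python
from concurrent.futures import ThreadPoolExecutor

def extract_batched(items, work, n_workers=4, batch_size=128):
    """Yield work(item) for each item, processing one batch at a time in a thread pool."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            # pool.map preserves input order within each batch
            yield from pool.map(work, batch)
```

Results stream out batch by batch, so memory stays bounded by batch_size rather than the total tile count.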

print_profile()[source]
Return type:

None

class glasscut.OtsuTissueDetector[source]

Bases: TissueDetector

Otsu-based tissue detection

This detector applies Otsu thresholding with optional morphological operations. It’s fast, robust, and works well for standard histopathology images.

detect(image)[source]

Detect tissue using Otsu thresholding.

Parameters:

image (Image.Image) – Input RGB image

Returns:

Binary mask (dtype: uint8 with values 0 or 1)

Return type:

np.ndarray
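Otsu's method picks the grayscale threshold that maximizes between-class variance of the histogram; tissue, being darker than the glass background, falls at or below that threshold. A self-contained sketch of the thresholding step (not the detector's actual code, which also applies morphological cleanup):

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = hist.sum()
    sum_all = float(np.dot(np.arange(256), hist))
    w0, sum0 = 0, 0.0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += int(hist[t])           # pixels in class 0 (<= t)
        if w0 == 0:
            continue
        w1 = total - w0              # pixels in class 1 (> t)
        if w1 == 0:
            break
        sum0 += t * int(hist[t])
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Dark pixels (tissue) vs. bright background
gray = np.array([30] * 60 + [220] * 40, dtype=np.uint8)
t = otsu_threshold(gray)
mask = (gray <= t).astype(np.uint8)  # 1 = tissue, 0 = background
```

On this synthetic bimodal input the threshold cleanly separates the 60 dark pixels from the 40 bright ones.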

class glasscut.DatasetGenerator(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)[source]

Bases: object

Generate a tile dataset from one or more slide files.

Parameters:
__init__(dataset_id, output_dir, *, tiler, n_workers=4, batch_size=128, save_thumbnails=True, save_masks=True, save_processed_json=True, show_progress=True, verbose=True)[source]

Initialize generator from direct parameters.

Parameters:
  • dataset_id (str) – Dataset identifier.

  • output_dir (str | Path) – Output root directory.

  • tiler (Tiler) – Preconfigured tiler instance used for extraction.

  • n_workers (int, optional) – Number of workers for batched tile extraction. Default is 4.

  • batch_size (int, optional) – Number of tiles per extraction batch. Default is 128.

  • save_thumbnails (bool, optional) – Whether to save slide thumbnail artifacts.

  • save_masks (bool, optional) – Whether to save tissue mask artifacts.

  • save_processed_json (bool, optional) – Whether to save processed.json at dataset root.

  • show_progress (bool, optional) – Whether to display progress bars for slides and tiles.

  • verbose (bool, optional) – Whether to enable info-level logs.

Return type:

None

process_dataset(slide_paths)[source]

Process all provided slides and persist tiles, artifacts, and metadata.

Parameters:

slide_paths (Sequence[str | Path])

Return type:

DatasetMetadata


class glasscut.LiveSlideDataset(slide_paths, *, tiler, n_workers=4, batch_size=128, use_cucim=True)[source]

Bases: object

Slide-level in-memory dataset.

Each __getitem__ call opens one slide, extracts all tiles in memory using the configured tiler, and returns a LiveSlideSample.

Parameters:
__init__(slide_paths, *, tiler, n_workers=4, batch_size=128, use_cucim=True)[source]
Parameters:
Return type:

None

__len__()[source]

Return number of slides in the live dataset.

Return type:

int

__getitem__(index)[source]

Return all extracted tiles for one slide.

Parameters:

index (int) – Slide index in the dataset.

Return type:

LiveSlideSample

class glasscut.LiveSlideSample(slide_id, slide_name, slide_path, dimensions, mpp, magnifications, tiles)[source]

Bases: object

Container for one in-memory slide sample.

Parameters:
slide_id

Slide identifier in slide_XXX format.

Type:

str

slide_name

Slide basename without extension.

Type:

str

slide_path

Absolute slide path.

Type:

str

dimensions

Level-0 dimensions as (width, height).

Type:

tuple[int, int]

mpp

Microns-per-pixel at level 0.

Type:

float

magnifications

Available magnification values.

Type:

list[float]

tiles

All extracted tiles for this slide, in extraction order.

Type:

list[Tile]

slide_id: str
slide_name: str
slide_path: str
dimensions: tuple[int, int]
mpp: float
magnifications: list[float]
tiles: list[Tile]
__init__(slide_id, slide_name, slide_path, dimensions, mpp, magnifications, tiles)
Parameters:
Return type:

None

class glasscut.MacenkoStainNormalizer[source]

Bases: TransformerStainMatrixMixin, StainNormalizer

Stain normalizer using M. Macenko et al.’s method.

This method performs unsupervised color deconvolution to identify stain vectors, then normalizes stain concentrations to a reference image. It works well for standard H&E stained histopathology images.

The algorithm:

1. Converts image to optical density (OD) space
2. Performs principal component analysis on OD values
3. Identifies stain vectors using angular decomposition
4. Normalizes stain concentrations to match reference
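Step 1, the optical density conversion, follows the Beer-Lambert relation OD = -log((I + 1) / I0), where I0 is the background intensity and the +1 avoids log(0). A NumPy sketch of that step (the exact variant used internally may differ):

```python
import numpy as np

def rgb_to_od(img: np.ndarray, background_intensity: int = 240) -> np.ndarray:
    """Convert RGB intensities to optical density via Beer-Lambert."""
    img = img.astype(np.float64)
    return -np.log10((img + 1.0) / background_intensity)

pixel = np.array([[239, 239, 239]])  # near-background pixel -> OD of 0
od = rgb_to_od(pixel)
```

Background pixels map to OD near zero, while darker (stained) pixels map to larger OD values, which is what makes the subsequent PCA on OD meaningful.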

stain_color_map

Mapping of stain names to their normalized OD vectors.

Type:

dict

Examples

>>> from PIL import Image
>>> from glasscut.stain_normalizers import MacenkoStainNormalizer
>>> normalizer = MacenkoStainNormalizer()
>>> ref_image = Image.open("reference.png")
>>> normalizer.fit(ref_image)
>>> test_image = Image.open("test.png")
>>> normalized_image = normalizer.transform(test_image)
stain_color_map = {'dab': array([0.27, 0.57, 0.78]), 'eosin': array([0.07, 0.99, 0.11]), 'hematoxylin': array([0.65, 0.7 , 0.29]), 'null': array([0., 0., 0.])}
__init__()[source]

Initialize MacenkoStainNormalizer.

stain_matrix(img_rgb, background_intensity=240, **kwargs)[source]

Estimate stain matrix using Macenko’s method.

Parameters:
  • img_rgb (PIL.Image.Image) – Input image in RGB or RGBA format.

  • background_intensity (int, optional) – Background transmitted light intensity. Default is 240.

  • **kwargs – Additional keyword arguments:

    – alpha (int, optional) – Minimum angular percentile. Default is 1.

    – beta (float, optional) – Threshold for OD magnitude filtering. Default is 0.15.

    – stains (list of str, optional) – List of stain names in order. Default is ["hematoxylin", "eosin"].

Returns:

Calculated 3×3 stain matrix with stain vectors as columns.

Return type:

np.ndarray

class glasscut.ReinhardtStainNormalizer[source]

Bases: StainNormalizer

Stain normalizer using E. Reinhard et al.'s color transfer method.

This method normalizes stain appearance by matching the mean and standard deviation of each channel in LAB color space between source and target images. The normalization is performed only on tissue regions.

The algorithm is:

1. Identify tissue using tissue masking
2. Convert to LAB color space
3. Compute per-channel mean and std on tissue
4. Normalize source statistics to match target statistics
5. Convert back to RGB
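Step 4, the statistics matching, shifts and scales each channel so the source mean and standard deviation equal the target's; in LAB this is applied per channel. A NumPy sketch of the core operation on a raw array, ignoring the tissue masking and LAB conversion steps:

```python
import numpy as np

def match_stats(source: np.ndarray, target_mean: float, target_std: float) -> np.ndarray:
    """Shift/scale `source` so its mean and std match the target statistics."""
    src_mean, src_std = source.mean(), source.std()
    # Standardize to zero mean / unit std, then rescale to the target statistics
    return (source - src_mean) / src_std * target_std + target_mean

channel = np.array([10.0, 20.0, 30.0, 40.0])
out = match_stats(channel, target_mean=100.0, target_std=5.0)
```

Because the transform is linear, the output's mean and standard deviation match the targets exactly (up to floating-point error), regardless of the source distribution's shape.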

target_means

Target mean values per LAB channel.

Type:

np.ndarray or None

target_stds

Target standard deviation values per LAB channel.

Type:

np.ndarray or None

Notes

This method is computationally fast and suitable for real-time preview during stain normalization parameter tuning. However, it may not preserve color relationships as well as matrix-based methods for complex stains.

Examples

>>> from PIL import Image
>>> from glasscut.stain_normalizers import ReinhardtStainNormalizer
>>> normalizer = ReinhardtStainNormalizer()
>>> ref_image = Image.open("reference.png")
>>> normalizer.fit(ref_image)
>>> test_image = Image.open("test.png")
>>> normalized_image = normalizer.transform(test_image)
__init__()[source]

Initialize ReinhardtStainNormalizer.

fit(target_image, **kwargs)[source]

Fit stain normalizer using target image.

Parameters:
  • target_image (Image.Image) – Target image for stain normalization. Can be RGB or RGBA.

  • **kwargs – Additional arguments (unused for Reinhardt method).

Return type:

None

transform(image, **kwargs)[source]

Normalize staining of image.

Parameters:
  • image (Image.Image) – Image to normalize. Can be RGB or RGBA.

  • **kwargs – Additional arguments (unused for Reinhardt method).

Returns:

Image with normalized stain.

Return type:

Image.Image

static rgb_to_lab(img_rgb)[source]

Convert RGB image to LAB color space.

Parameters:

img_rgb (Image.Image) – Input image in RGB or RGBA format.

Returns:

Array representation of the image in LAB space.

Return type:

np.ndarray

Raises:

ValueError – If the input image is grayscale.

static lab_to_rgb(img_lab)[source]

Convert LAB image to RGB color space.

Parameters:

img_lab (np.ndarray) – Input image in LAB color space.

Returns:

Image in RGB color space.

Return type:

Image.Image