cellmil.graph.creator

Classes

Creator(method, device)

Base class for graph creation.

DelaunayEdgeCreator(device[, k, radius, ...])

Delaunay + Radius-based edge creator for cell graphs.

DilateEdgeCreator(device[, k, radius, ...])

Dilation-based edge creator for cell graphs; connects cells whose dilated contours intersect.

EdgeCreator(device[, k, radius, ...])

Abstract base class for edge creation between cells.

KNNEdgeCreator(device[, k, radius, ...])

KNN-based edge creator for cell graphs.

RadiusEdgeCreator(device[, k, radius, ...])

Radius-based edge creator for cell graphs.

SimilarityEdgeCreator(device[, ...])

Similarity-based edge creator for cell graphs.

class cellmil.graph.creator.Creator(method: GraphCreatorType, device: str)

Bases: object

Base class for graph creation.

__init__(method: GraphCreatorType, device: str)

create(cells: List[dict[str, Any]]) → Tuple[Tensor, Tensor, Tensor]

Create a graph from cell data.

Parameters:

cells – List of cell dictionaries

Returns:

  • node_features: Tensor of shape [N, F], where N is the number of cells and F is the node feature dimension

  • edge_indices: Tensor of shape [2, E], where E is the number of edges

  • edge_features: Tensor of shape [E, D], where D is the edge feature dimension

Return type:

Tuple[Tensor, Tensor, Tensor]

_extract_node_features(cells: List[dict[str, Any]]) → Tuple[Tensor, Tensor]

Extract node features from cells.

_create_empty_graph() → Tuple[Tensor, Tensor, Tensor]

Create an empty graph when no cells are found.

class cellmil.graph.creator.EdgeCreator(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

Bases: ABC

Abstract base class for edge creation between cells.

This class provides a framework for creating different types of graphs from cell position data. It implements common functionality like batched processing and edge feature calculation, while allowing subclasses to define specific edge creation strategies.

The class supports various graph creation methods:

  • KNN: Connect each cell to its k nearest neighbors

  • Radius: Connect cells within a specified radius

  • Delaunay + Radius: Use Delaunay triangulation with distance filtering

  • Dilate: Dilate the nuclei to approximate cell boundaries

All edge creators produce graphs with:

  • Node features: Cell IDs

  • Edge features: [distance, direction_x, direction_y], where direction is a unit vector

device

Computing device (‘cpu’ or ‘cuda:X’)

Type:

str

batch_size

Number of cells to process per batch for memory efficiency

Type:

int

k

Number of nearest neighbors for KNN method

Type:

int | None

radius

Maximum distance for radius-based connections

Type:

float | None

limit_radius

Maximum distance filter for Delaunay triangulation

Type:

float | None

dilation

Dilation factor of nuclei for edge creation

Type:

int | None

__init__(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

abstract create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges between cells.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • cells – Optional list of cell dictionaries for contour-based methods

Returns:

Tuple of (edge_indices, edge_features), where

  • edge_indices: Tensor of shape (2, num_edges) containing source and target node indices

  • edge_features: Tensor of shape (num_edges, feature_dim) containing edge features

Return type:

Tuple[Tensor, Tensor]

_process_batched(positions: Tensor, edge_computation_fn: Callable[[Tensor], Tuple[Tensor, Tensor]]) → Tuple[Tensor, Tensor]

Process positions in batches and compute edges.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • edge_computation_fn – Function that takes batch_positions and returns (edge_indices, edge_features)

Returns:

Tuple of (edge_indices, edge_features)
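The batching idea can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the library's implementation: `process_batched` and `make_radius_fn` are hypothetical names, positions are plain lists of (x, y) tuples instead of tensors, and the convention that the callback returns batch-local source indices with global target indices is assumed for the sake of the example.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Pair = Tuple[int, int]


def process_batched(
    positions: List[Point],
    edge_fn: Callable[[List[Point]], Tuple[List[Pair], List[float]]],
    batch_size: int,
) -> Tuple[List[Pair], List[float]]:
    """Apply edge_fn to consecutive slices of positions and merge the results.

    Assumption: edge_fn returns source indices local to its batch and
    target indices that are already global, so only sources are shifted.
    """
    pairs: List[Pair] = []
    feats: List[float] = []
    for start in range(0, len(positions), batch_size):
        batch = positions[start:start + batch_size]
        batch_pairs, batch_feats = edge_fn(batch)
        pairs.extend((src + start, dst) for src, dst in batch_pairs)
        feats.extend(batch_feats)
    return pairs, feats


def make_radius_fn(all_positions: List[Point], radius: float):
    """Build a hypothetical edge_fn that links points closer than radius."""
    def edge_fn(batch):
        batch_pairs, batch_feats = [], []
        for i, p in enumerate(batch):
            for j, q in enumerate(all_positions):
                d = math.dist(p, q)
                # d > 0 skips self-pairs (and any exactly coincident points).
                if 0.0 < d <= radius:
                    batch_pairs.append((i, j))
                    batch_feats.append(d)
        return batch_pairs, batch_feats
    return edge_fn
```

In this sketch the result does not depend on the batch size; only the peak size of the intermediate per-batch lists changes, which is the point of batching for memory efficiency.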

_calculate_edge_features(edge_indices: Tensor, positions: Tensor) → Tensor

Calculate edge features based on positions.

Parameters:
  • edge_indices – Tensor of shape (2, num_edges) containing source and target node indices

  • positions – Tensor of shape (N, 2) containing the (x, y) positions of the cells

Returns:

Tensor of shape (num_edges, 3) containing edge features:

  • distance: Euclidean distance between cells

  • direction_x: x-component of unit direction vector (from source to target)

  • direction_y: y-component of unit direction vector (from source to target)

Return type:

Tensor
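The three edge features described above can be computed as in the following sketch. It uses plain Python lists rather than tensors, `calculate_edge_features` is a hypothetical stand-in for the private method, and the zero vector returned for coincident points is an assumption, since the documentation does not say how that case is handled.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def calculate_edge_features(
    edges: List[Tuple[int, int]], positions: List[Point]
) -> List[Tuple[float, float, float]]:
    """Return (distance, direction_x, direction_y) for each (source, target) pair."""
    features = []
    for src, dst in edges:
        dx = positions[dst][0] - positions[src][0]
        dy = positions[dst][1] - positions[src][1]
        dist = math.hypot(dx, dy)
        if dist > 0.0:
            # Unit vector pointing from the source cell to the target cell.
            features.append((dist, dx / dist, dy / dist))
        else:
            # Assumption: coincident points get a zero direction vector.
            features.append((0.0, 0.0, 0.0))
    return features
```

Because the direction points from source to target, reversing an edge keeps the distance and flips the sign of both direction components.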

class cellmil.graph.creator.KNNEdgeCreator(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

Bases: EdgeCreator

KNN-based edge creator for cell graphs.

create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges between cells.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • cells – Optional list of cell dictionaries for contour-based methods

Returns:

Tuple of (edge_indices, edge_features), where

  • edge_indices: Tensor of shape (2, num_edges) containing source and target node indices

  • edge_features: Tensor of shape (num_edges, feature_dim) containing edge features

Return type:

Tuple[Tensor, Tensor]
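For intuition, k-nearest-neighbor edge creation can be sketched in a few lines of pure Python. This is a brute-force illustration, not the class's implementation: `knn_edges` is a hypothetical helper, and the edge direction (from each cell to its neighbors) is an assumption.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def knn_edges(positions: List[Point], k: int) -> List[Tuple[int, int]]:
    """Connect each cell to its k nearest neighbors (brute force, O(N^2 log N))."""
    edges = []
    for i, p in enumerate(positions):
        # Distances from cell i to every other cell, nearest first.
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(positions) if j != i
        )
        edges.extend((i, j) for _, j in neighbors[:k])
    return edges
```

A KNN graph built this way is directed and not necessarily symmetric: a cell can list a neighbor that does not list it back.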

class cellmil.graph.creator.RadiusEdgeCreator(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

Bases: EdgeCreator

Radius-based edge creator for cell graphs.

create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges between cells.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • cells – Optional list of cell dictionaries for contour-based methods

Returns:

Tuple of (edge_indices, edge_features), where

  • edge_indices: Tensor of shape (2, num_edges) containing source and target node indices

  • edge_features: Tensor of shape (num_edges, feature_dim) containing edge features

Return type:

Tuple[Tensor, Tensor]

class cellmil.graph.creator.DelaunayEdgeCreator(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

Bases: EdgeCreator

Delaunay + Radius-based edge creator for cell graphs.

create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges between cells.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • cells – Optional list of cell dictionaries for contour-based methods

Returns:

Tuple of (edge_indices, edge_features), where

  • edge_indices: Tensor of shape (2, num_edges) containing source and target node indices

  • edge_features: Tensor of shape (num_edges, feature_dim) containing edge features

Return type:

Tuple[Tensor, Tensor]
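The distance-filtering step of the Delaunay + Radius method can be illustrated as below. The sketch assumes the triangulation is already available as a list of index triples (for example, the `simplices` array of `scipy.spatial.Delaunay`); `delaunay_edges_within` is a hypothetical helper, and emitting both directions of each surviving edge is an assumption rather than documented behavior.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def delaunay_edges_within(
    positions: List[Point],
    triangles: Sequence[Sequence[int]],
    limit_radius: float,
) -> List[Tuple[int, int]]:
    """Keep triangle sides no longer than limit_radius, in both directions."""
    undirected = set()
    for tri in triangles:
        # Each triangle contributes its three sides; the set removes sides
        # shared between neighboring triangles.
        for a, b in combinations(sorted(tri), 2):
            if math.dist(positions[a], positions[b]) <= limit_radius:
                undirected.add((a, b))
    edges = []
    for a, b in sorted(undirected):
        edges.extend([(a, b), (b, a)])
    return edges
```

The filter removes the long edges that a triangulation otherwise creates across empty tissue regions, where two cells are triangulation neighbors but far apart.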

class cellmil.graph.creator.DilateEdgeCreator(device: str, k: Optional[int] = None, radius: Optional[float] = None, limit_radius: Optional[float] = None, dilation: Optional[int] = None, batch_size: int = 2000000)

Bases: EdgeCreator

create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges between cells whose dilated contours intersect.

Parameters:
  • positions (torch.Tensor) – Cell centroid positions [N, 2]

  • cells (List[dict] | None) – List of cell dictionaries containing contour information

Returns:

  • edge_indices (torch.Tensor): Edge indices [2, E]

  • edge_features (torch.Tensor): Edge features [E, 3] (distance, dx, dy)

Return type:

Tuple[Tensor, Tensor]

class cellmil.graph.creator.SimilarityEdgeCreator(device: str, similarity_threshold: float = 0.5, distance_sigma: float = 200.0, alpha: float = 0.5, combination_method: Literal['additive', 'multiplicative'] = 'additive', distance_metric: Literal['gaussian', 'laplacian', 'inverse', 'inverse_square'] = 'gaussian', feature_metric: Literal['cosine', 'correlation', 'euclidean', 'gaussian'] = 'cosine', feature_sigma: float = 1.0, batch_size: int = 1024, max_gpu_memory_fraction: float = 0.8)

Bases: EdgeCreator

Similarity-based edge creator for cell graphs.

Creates edges based on both spatial distance and morphological feature similarity. Uses correlation filtering to reduce feature redundancy before computing similarity.

__init__(device: str, similarity_threshold: float = 0.5, distance_sigma: float = 200.0, alpha: float = 0.5, combination_method: Literal['additive', 'multiplicative'] = 'additive', distance_metric: Literal['gaussian', 'laplacian', 'inverse', 'inverse_square'] = 'gaussian', feature_metric: Literal['cosine', 'correlation', 'euclidean', 'gaussian'] = 'cosine', feature_sigma: float = 1.0, batch_size: int = 1024, max_gpu_memory_fraction: float = 0.8)

Initialize similarity-based edge creator.

Parameters:
  • device – Computing device (‘cpu’ or ‘cuda:X’)

  • similarity_threshold – If < 1, used as a weight threshold for filtering edges. If >= 1 (integer), used as k, the number of top-weighted edges kept per node (KNN mode)

  • distance_sigma – Gaussian kernel width for distance-based similarity

  • alpha – Weight for similarity vs distance (0=distance only, 1=similarity only)

  • combination_method – Method to combine similarity and distance (‘additive’ or ‘multiplicative’)

  • distance_metric – Metric for distance-based similarity (‘gaussian’, ‘laplacian’, ‘inverse’, ‘inverse_square’)

  • feature_metric – Metric for feature-based similarity (‘cosine’, ‘correlation’, ‘euclidean’, ‘gaussian’)

  • feature_sigma – Gaussian kernel width for feature-based similarity (only used when feature_metric=‘gaussian’)

  • batch_size – Number of cells to process per batch (will be adjusted dynamically based on GPU memory)

  • max_gpu_memory_fraction – Maximum fraction of available GPU memory to use (default 0.8)

_get_available_gpu_memory() → float

Get available GPU memory in bytes.

Returns:

Available memory in bytes, or 0 if not on GPU

_estimate_memory_usage(batch_size: int, n_cells: int, feature_dim: int) → float

Estimate memory usage for processing a batch.

Parameters:
  • batch_size – Number of cells in batch

  • n_cells – Total number of cells

  • feature_dim – Feature dimension

Returns:

Estimated memory usage in bytes

_calculate_safe_batch_size(n_cells: int, feature_dim: int, initial_batch_size: int) → tuple[int, str]

Calculate a safe batch size that won’t exceed GPU memory.

Parameters:
  • n_cells – Total number of cells

  • feature_dim – Feature dimension

  • initial_batch_size – Requested batch size

Returns:

Tuple of (safe_batch_size, device_to_use)
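One way such a batch-size search could work is sketched below. Everything here is an assumption made for illustration: the cost model in `estimate_memory_bytes`, the halving strategy, the CPU fallback, and passing `available_bytes` in as an argument instead of querying the GPU. The actual private methods may differ in all of these respects.

```python
from typing import Tuple


def estimate_memory_bytes(
    batch_size: int, n_cells: int, feature_dim: int, bytes_per_value: int = 4
) -> int:
    """Hypothetical cost model: two [batch, n_cells] matrices plus batch features."""
    pairwise = 2 * batch_size * n_cells   # distance and similarity matrices
    features = batch_size * feature_dim   # features of the current batch
    return (pairwise + features) * bytes_per_value


def calculate_safe_batch_size(
    n_cells: int,
    feature_dim: int,
    initial_batch_size: int,
    available_bytes: float,
    max_fraction: float = 0.8,
    device: str = "cuda:0",
) -> Tuple[int, str]:
    """Halve the batch size until the estimate fits in the memory budget."""
    budget = available_bytes * max_fraction
    batch_size = max(1, min(initial_batch_size, n_cells))
    while batch_size > 1 and estimate_memory_bytes(batch_size, n_cells, feature_dim) > budget:
        batch_size //= 2
    if estimate_memory_bytes(batch_size, n_cells, feature_dim) > budget:
        # Assumption: if even one cell per batch does not fit, fall back to CPU.
        return batch_size, "cpu"
    return batch_size, device
```

Returning the device alongside the batch size mirrors the documented return value (safe_batch_size, device_to_use), which suggests the method may choose a different device when memory is tight.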

_compute_distance_similarity(distances: Tensor) → Tensor

Compute distance-based similarity using the specified metric.

Parameters:

distances – Tensor of pairwise distances

Returns:

Similarity values in range [0, 1]
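The four distance metrics named in the constructor can each be read as a kernel that maps a distance to a similarity in [0, 1]. The sketch below uses the Gaussian form that appears in the weight formula of create_edges; the laplacian, inverse, and inverse_square forms are common textbook definitions and are assumptions here, not confirmed library behavior. `distance_similarity` is a hypothetical scalar helper.

```python
import math


def distance_similarity(
    distance: float, sigma: float = 200.0, metric: str = "gaussian"
) -> float:
    """Map a non-negative distance to a similarity in [0, 1]."""
    if metric == "gaussian":
        return math.exp(-distance ** 2 / (2 * sigma ** 2))
    if metric == "laplacian":
        # Assumed form: exponential decay in the distance itself.
        return math.exp(-distance / sigma)
    if metric == "inverse":
        # Assumed form: 1 at distance 0, decaying like 1/d.
        return 1.0 / (1.0 + distance / sigma)
    if metric == "inverse_square":
        # Assumed form: 1 at distance 0, decaying like 1/d^2.
        return 1.0 / (1.0 + (distance / sigma) ** 2)
    raise ValueError(f"unknown distance metric: {metric!r}")
```

All four kernels return 1 at distance 0 and decrease monotonically, so sigma sets the length scale at which two cells stop counting as spatially close.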

_compute_feature_similarity(batch_features: Tensor, all_features: Tensor) → Tensor

Compute feature-based similarity using the specified metric.

Parameters:
  • batch_features – Tensor of shape [batch_size, feature_dim]

  • all_features – Tensor of shape [n_cells, feature_dim]

Returns:

Similarity matrix of shape [batch_size, n_cells]

create_edges(positions: Tensor, cells: Optional[List[dict[str, Any]]] = None) → Tuple[Tensor, Tensor]

Create edges based on distance and feature similarity.

Parameters:
  • positions – Tensor of shape [N, 2] containing the (x, y) positions of the cells

  • cells – List of cell dictionaries with features for similarity computation

Returns:

  • edge_indices: Tensor of shape [2, E] containing source and target node indices

  • edge_features: Tensor of shape [E, 3] containing [distance, direction_x, direction_y]

If similarity_threshold < 1, edges are filtered by weight threshold. If similarity_threshold >= 1, the top-k edges per node are kept (KNN mode).

Weight = alpha * max(0, cosine_similarity) + (1 - alpha) * exp(-distance^2 / (2*sigma^2))

Return type:

Tuple[Tensor, Tensor]
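The weight formula and the two selection modes can be worked through in a short sketch. It covers only the additive combination of cosine feature similarity with the Gaussian distance kernel, as written in the formula above; `edge_weight` and `select_edges` are hypothetical helpers operating on plain Python sequences, and the inclusive comparison against the threshold is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def edge_weight(pos_i, pos_j, feat_i, feat_j, alpha=0.5, sigma=200.0) -> float:
    """alpha * max(0, cosine) + (1 - alpha) * exp(-d^2 / (2 * sigma^2))."""
    d = math.dist(pos_i, pos_j)
    feature_term = max(0.0, cosine_similarity(feat_i, feat_j))
    distance_term = math.exp(-d ** 2 / (2 * sigma ** 2))
    return alpha * feature_term + (1 - alpha) * distance_term


def select_edges(
    positions, features, similarity_threshold=0.5, alpha=0.5, sigma=200.0
) -> List[Tuple[int, int]]:
    """Threshold mode if similarity_threshold < 1, otherwise top-k per node."""
    n = len(positions)
    edges = []
    for i in range(n):
        weighted = [
            (edge_weight(positions[i], positions[j], features[i], features[j],
                         alpha, sigma), j)
            for j in range(n) if j != i
        ]
        if similarity_threshold < 1:
            # Assumption: edges with weight equal to the threshold are kept.
            kept = [j for w, j in weighted if w >= similarity_threshold]
        else:
            k = int(similarity_threshold)
            kept = [j for _, j in sorted(weighted, reverse=True)[:k]]
        edges.extend((i, j) for j in kept)
    return edges
```

In threshold mode a cell with no sufficiently similar neighbor ends up isolated, whereas in KNN mode every cell keeps k outgoing edges regardless of how small their weights are.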