cellmil.features.extractor.topological

Classes

`ConnectivityExtractor`()	Extract connectivity features from the graph.
`GeometricExtractor`()	Extract geometric features from the graph.
`StructureExtractor`()	Extract structural features from the graph.
`TopologicalExtractor`(extractor_name)

class cellmil.features.extractor.topological.TopologicalExtractor(extractor_name: ExtractorType)

Bases: object

__init__(extractor_name: ExtractorType)

extract_features(cell_id: Tensor, graph: dict[str, torch.Tensor], cells: list[dict[str, Any]]) → dict[str, Any]

class cellmil.features.extractor.topological.ConnectivityExtractor

Bases: object

Extract connectivity features from the graph:

* Degree
* Weighted degree (by distance)
* K-core number
* PageRank
* Eigenvector centrality (approximate)
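The first three metrics in this list can be illustrated with a minimal sketch. It assumes an undirected edge list of `(u, v)` pairs with one distance per edge, and it assumes that "weighted degree" means the sum of incident edge distances. The function names and the plain-Python data layout are hypothetical and are not taken from cellmil, which operates on tensors.

```python
from collections import defaultdict


def degree_and_weighted_degree(edges, distances, num_nodes):
    """Return (degree, weighted_degree) dicts for an undirected graph.

    Assumption: weighted degree is the sum of the distances of incident edges.
    """
    degree = {n: 0 for n in range(num_nodes)}
    weighted = {n: 0.0 for n in range(num_nodes)}
    for (u, v), d in zip(edges, distances):
        for node in (u, v):
            degree[node] += 1
            weighted[node] += d
    return degree, weighted


def k_core_numbers(edges, num_nodes):
    """Core number per node, by repeatedly peeling a minimum-degree node."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    remaining = {n: len(adj[n]) for n in range(num_nodes)}
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=remaining.get)
        k = max(k, remaining[node])  # core numbers never decrease while peeling
        core[node] = k
        del remaining[node]
        for nb in adj[node]:
            if nb in remaining:
                remaining[nb] -= 1
    return core
```

The peeling loop is quadratic in the number of nodes, which is acceptable for an illustration but not for large cell graphs.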

__init__(): Initialize the connectivity extractor.

_ensure_global_metrics_computed(edge_indices: Tensor, edge_features: Tensor, num_nodes: int) → tuple[dict[int, float], dict[int, float], dict[int, float], dict[int, float], dict[int, float]]: Compute global metrics (k-core, pagerank, eigenvector, degree, weighted_degree) if not cached.

clear_cache() → None: Clear the cached global metrics. Useful for testing or memory management.
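The compute-if-not-cached pattern described here can be sketched as follows. This is a hypothetical stand-in class, not the cellmil implementation: the cache key (object identity of the edge list plus the node count) and the single cached metric (degree) are assumptions chosen to keep the example short.

```python
class CachedMetrics:
    """Hypothetical extractor that caches whole-graph metrics between calls."""

    def __init__(self):
        self._cache_key = None
        self._metrics = None
        self.compute_calls = 0  # only here to make the caching observable

    def _ensure_global_metrics_computed(self, edges, num_nodes):
        # Assumed key: identity of the edge container plus the node count.
        key = (id(edges), num_nodes)
        if self._metrics is None or key != self._cache_key:
            self.compute_calls += 1
            degree = {n: 0 for n in range(num_nodes)}
            for u, v in edges:
                degree[u] += 1
                degree[v] += 1
            self._metrics = degree
            self._cache_key = key
        return self._metrics

    def clear_cache(self):
        """Drop cached metrics so the next call recomputes them."""
        self._cache_key = None
        self._metrics = None
```

An identity-based key is cheap, but it does not notice in-place mutation of the graph, and `id()` values can be reused after an object is garbage-collected. Calling `clear_cache()` between graphs avoids both problems.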

extract_features(cell_id: Tensor, graph: dict[str, torch.Tensor], cells: list[dict[str, Any]]) → dict[str, Any]

Extract connectivity features for a specific cell from the graph.

Parameters:

cell_id – Tensor containing the target cell ID
graph – Dictionary containing the keys `edge_indices`, `edge_features`, and `node_features`
cells – List of cell dictionaries (not used in connectivity extraction)

Returns:

Dictionary containing connectivity features

_find_index(cell_id: int, node_features: Tensor) → int | None

Find the node index that corresponds to the given cell_id.

Parameters:

cell_id – The cell ID to find
node_features – Tensor with shape [num_nodes, 1] containing cell IDs in the first column

Returns:

Node index if found, None otherwise
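The lookup contract above (row index on a match, `None` otherwise) can be sketched in plain Python. A nested list stands in for the `[num_nodes, 1]` tensor, and the linear scan is an assumption made for clarity; the function name is hypothetical.

```python
from typing import Optional, Sequence


def find_index(cell_id: int, node_features: Sequence[Sequence[float]]) -> Optional[int]:
    """Return the row whose first column equals cell_id, or None if absent."""
    for idx, row in enumerate(node_features):
        if int(row[0]) == cell_id:
            return idx
    return None
```

Callers need to handle the `None` case explicitly, for example by returning default feature values for cells that are not present in the graph.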

class cellmil.features.extractor.topological.StructureExtractor

Bases: object

Extract structural features from the graph:

* Weighted clustering coefficient (by distance)
* Local efficiency
* Ego-network density

__init__(): Initialize the structure extractor.

_ensure_global_metrics_computed(edge_indices: Tensor, edge_features: Tensor, num_nodes: int) → dict[int, float]: Compute global structural metrics if not cached.

extract_features(cell_id: Tensor, graph: dict[str, torch.Tensor], cells: list[dict[str, Any]]) → dict[str, Any]: Extract structural features for a specific cell from the graph.

_find_index(cell_id: int, node_features: Tensor) → int | None: Find the node index that corresponds to the given cell_id.

_get_neighbors_vectorized(node_idx: int, edge_indices: Tensor) → Tensor: Vectorized neighbor finding using torch operations.

_calculate_local_efficiency(node_idx: int, edge_indices: Tensor, edge_features: Tensor, neighbors: Tensor) → float: Calculate local efficiency using pre-computed neighbors.
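For orientation, local efficiency of a node is commonly defined as the average inverse shortest-path length between its neighbors, measured inside the subgraph induced by those neighbors. The sketch below uses that definition with unweighted hop counts. This is a simplification: the method above also receives `edge_features`, so the actual computation may be distance-weighted. The function name and edge-list input are hypothetical.

```python
from collections import defaultdict, deque


def local_efficiency(node, edges):
    """Unweighted local efficiency of `node` in an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pairs to average over
    total = 0.0
    for src in neighbors:
        # Breadth-first search restricted to the neighbor-induced subgraph.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt in neighbors and nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        # Unreachable neighbors contribute 0 (infinite distance).
        total += sum(1.0 / d for n, d in dist.items() if n != src)
    return total / (k * (k - 1))
```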

_calculate_ego_network_density(node_idx: int, edge_indices: Tensor, neighbors: Tensor) → float: Calculate ego-network density using pre-computed neighbors.
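A minimal sketch of ego-network density follows, assuming the ego network consists of the node itself plus its direct neighbors, and that density is the fraction of possible undirected edges that are present. Whether cellmil includes the ego node in the count is not stated here, so treat that choice as an assumption. The function name is hypothetical.

```python
def ego_network_density(node, edges):
    """Density of the subgraph induced by `node` and its direct neighbors."""
    neighbors = set()
    for u, v in edges:
        if u == node and v != node:
            neighbors.add(v)
        elif v == node and u != node:
            neighbors.add(u)
    members = neighbors | {node}
    n = len(members)
    if n < 2:
        return 0.0  # an isolated node has no possible edges
    # frozenset de-duplicates (u, v) and (v, u) in undirected edge lists.
    ego_edges = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in members and v in members
    }
    return 2.0 * len(ego_edges) / (n * (n - 1))
```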

clear_cache() → None: Clear the cached global metrics.

class cellmil.features.extractor.topological.GeometricExtractor

Bases: object

Extract geometric features from the graph:

* Distance to nearest neighbor
* Distance to nearest neighbor of each type
* Mean distance to neighbors
* Edge length variance
* Anisotropy → Dominant direction of nearest neighbors
* Local density (number of nodes in a radius)
* Spatial entropy of neighbors
* Shape of local convex hull
* Area/perimeter ratio of local neighborhood
* Nucleus size relative to local density
* Anisotropy of neighborhood
* Relative orientation of neighbors
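The simplest distance-based features in this list (nearest-neighbor distance, mean neighbor distance, edge length variance) can be sketched as below. The sketch assumes 2D centroid coordinates, Euclidean distance, and population variance; the function name, the returned keys, and the zero defaults for cells without neighbors are hypothetical.

```python
import math
from statistics import mean, pvariance


def neighbor_distance_stats(center, neighbors):
    """Nearest, mean, and variance of distances from `center` to `neighbors`."""
    dists = [math.dist(center, p) for p in neighbors]
    if not dists:
        # Assumed convention: a cell with no neighbors gets all-zero features.
        return {"nearest": 0.0, "mean": 0.0, "variance": 0.0}
    return {
        "nearest": min(dists),
        "mean": mean(dists),
        "variance": pvariance(dists),  # population variance (divide by n)
    }
```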

__init__(): Initialize the geometric extractor.

_get_cell_mapping(cells: list[dict[str, Any]]) → dict[int, dict[str, Any]]: Return cached mapping cell_id -> cell; rebuild only when cells list identity changes.

_ensure_spatial_index(cells: list[dict[str, Any]]) → None: Build and cache KDTree and arrays from cells when the list identity changes.

_graph_key(edge_indices: Tensor, edge_features: Tensor, node_features: Tensor) → tuple[int, int, int]: Create a lightweight identity key for current graph tensors (no hashing).

extract_features(cell_id: Tensor, graph: dict[str, torch.Tensor], cells: list[dict[str, Any]]) → dict[str, Any]: Extract geometric features for a specific cell from the graph.

_find_index(cell_id: int, node_features: Tensor) → int | None: Find the node index that corresponds to the given cell_id.

_get_neighbour_data(node_idx: int, edge_indices: Tensor, edge_features: Tensor, cells: list[dict[str, Any]], node_features: Tensor) → dict[str, Any]: Get comprehensive neighbor data using vectorized ops (undirected graphs) with simple caching.

clear_cache() → None: Clear all caches maintained by GeometricExtractor.

_calculate_anisotropy(neighbor_data: dict[str, Any]) → dict[str, float]: Calculate anisotropy and dominant direction.
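One common way to quantify anisotropy and a dominant direction is through the eigen-decomposition of the 2x2 second-moment matrix of neighbor offsets. The sketch below takes that approach as an assumption; the formula cellmil actually uses is not documented on this page. The function name, the returned keys, and the normalization `(λ_max - λ_min) / (λ_max + λ_min)` are illustrative choices.

```python
import math


def anisotropy(center, neighbors):
    """Anisotropy in [0, 1] and dominant direction (radians) of neighbor offsets."""
    zero = {"anisotropy": 0.0, "dominant_direction": 0.0}
    if len(neighbors) < 2:
        return zero
    dx = [x - center[0] for x, _ in neighbors]
    dy = [y - center[1] for _, y in neighbors]
    n = len(neighbors)
    # Second moments of the offsets about the center cell (not mean-centered).
    sxx = sum(a * a for a in dx) / n
    syy = sum(b * b for b in dy) / n
    sxy = sum(a * b for a, b in zip(dx, dy)) / n
    # Closed-form eigenvalues of [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    lam_max, lam_min = half_trace + spread, half_trace - spread
    if lam_max <= 0.0:
        return zero  # all neighbors coincide with the center
    # Angle of the principal eigenvector, in (-pi/2, pi/2].
    direction = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return {
        "anisotropy": (lam_max - lam_min) / (lam_max + lam_min),
        "dominant_direction": direction,
    }
```

A value near 0 indicates neighbors spread evenly in all directions; a value near 1 indicates neighbors aligned along a single axis.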

_calculate_local_density(target_cell: dict[str, Any], cells: list[dict[str, Any]], radius: float) → float: Calculate number of cells within radius using cached KDTree (no Python loops).

_calculate_spatial_entropy(neighbor_data: dict[str, Any]) → float: Calculate spatial entropy of neighbor distribution.
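Spatial entropy can be defined in several ways. The sketch below assumes one plausible reading: bin the bearing angle from the target cell to each neighbor into equal angular sectors and take the Shannon entropy of the sector counts, normalized to [0, 1]. The function name, the `num_bins` parameter, and the normalization are hypothetical and may differ from what cellmil computes.

```python
import math


def spatial_entropy(center, neighbors, num_bins=8):
    """Normalized Shannon entropy of neighbor bearings (requires num_bins >= 2)."""
    counts = [0] * num_bins
    for x, y in neighbors:
        # Bearing in [0, 2*pi), mapped to a sector index.
        angle = math.atan2(y - center[1], x - center[0]) % (2.0 * math.pi)
        idx = min(int(angle / (2.0 * math.pi) * num_bins), num_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c)
    return entropy / math.log(num_bins)  # 1.0 means evenly spread over all sectors
```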

_calculate_convex_hull_features(neighbor_data: dict[str, Any], target_cell: dict[str, Any]) → dict[str, float]: Calculate convex hull area and perimeter of neighborhood.
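Convex hull area and perimeter for a small 2D point set can be sketched with Andrew's monotone chain algorithm and the shoelace formula. This is an illustrative stand-in: the page does not say which hull routine cellmil uses, and the function names and returned keys are hypothetical. Points are expected as `(x, y)` tuples.

```python
import math


def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_area_perimeter(points):
    """Area (shoelace formula) and perimeter of the convex hull of `points`."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return {"area": 0.0, "perimeter": 0.0}  # degenerate: fewer than 3 hull vertices
    area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return {"area": abs(area) / 2.0, "perimeter": perimeter}
```

Collinear or too-few points yield zero area and perimeter here, which is one way to avoid a division by zero if an area/perimeter ratio is derived afterwards.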

_calculate_relative_nucleus_size(target_cell: dict[str, Any], local_density: float) → float: Calculate nucleus size relative to local density.

_calculate_mean_orientation(neighbor_data: dict[str, Any]) → float: Calculate mean orientation of neighbors.
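Angles cannot be averaged arithmetically because they wrap around, so a mean orientation is usually computed as a circular mean. The sketch below assumes that "orientation" means the bearing from the target cell to each neighbor, which is an interpretation and not something this page confirms. The function name and the zero fallback are hypothetical.

```python
import math


def mean_orientation(center, neighbors):
    """Circular mean (radians, in (-pi, pi]) of bearings from center to neighbors."""
    sin_sum = 0.0
    cos_sum = 0.0
    for x, y in neighbors:
        angle = math.atan2(y - center[1], x - center[0])
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
    # With no neighbors, or bearings that cancel out, the mean is undefined;
    # returning 0.0 is an assumed convention.
    if not neighbors or math.hypot(sin_sum, cos_sum) < 1e-12:
        return 0.0
    return math.atan2(sin_sum, cos_sum)
```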