cellmil.datamodels.transforms.normalization

Robust scaler transform for feature normalization.

Classes

RobustScalerTransform([apply_log_transform, ...])

Transform that applies robust scaling to features using median and IQR.

class cellmil.datamodels.transforms.normalization.RobustScalerTransform(apply_log_transform: bool = True, quantile_range: Tuple[float, float] = (0.25, 0.75), clip_quantiles: Tuple[float, float] = (0.005, 0.995), constant_threshold: float = 1e-08)

Bases: FittableTransform

Transform that applies robust scaling to features using median and IQR.

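Robust scaling is less sensitive to outliers than standard (mean and standard deviation) scaling because the median and interquartile range are barely affected by extreme values. Formula: (x - median) / IQR, where IQR = Q3 - Q1 and Q1, Q3 are the lower and upper quantiles given by quantile_range.

When apply_log_transform is True (the default), features are first log-transformed to reduce skew and the influence of outliers, then robust scaling is applied.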
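The formula above can be sketched in a few lines of PyTorch. This is an illustration only, not the class's actual implementation: robust_scale is a hypothetical helper, and log1p is an assumed stand-in for the log transform, which this page does not specify.

```python
import torch


def robust_scale(x: torch.Tensor, q_low: float = 0.25, q_high: float = 0.75) -> torch.Tensor:
    """Scale each column of x as (x - median) / IQR after a log transform."""
    # Assumption: log1p is used as the log transform (requires x > -1).
    x = torch.log1p(x)
    median = torch.quantile(x, 0.5, dim=0)
    q1 = torch.quantile(x, q_low, dim=0)
    q3 = torch.quantile(x, q_high, dim=0)
    iqr = q3 - q1  # assumed non-zero here; constant features need special handling
    return (x - median) / iqr


# Five instances, two features; the first column contains one extreme value.
features = torch.tensor(
    [[0.0, 1.0], [1.0, 3.0], [3.0, 7.0], [7.0, 15.0], [1000.0, 31.0]]
)
scaled = robust_scale(features)
```

After scaling, each column has a median of zero and an IQR of one, and the extreme value in the first column does not influence either statistic.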

__init__(apply_log_transform: bool = True, quantile_range: Tuple[float, float] = (0.25, 0.75), clip_quantiles: Tuple[float, float] = (0.005, 0.995), constant_threshold: float = 1e-08)

Initialize the robust scaler transform.

Parameters:
  • apply_log_transform – Whether to apply a log transformation before scaling (default: True)

  • quantile_range – Tuple of (lower_quantile, upper_quantile), each in [0, 1], used to compute the IQR (default: (0.25, 0.75))

  • clip_quantiles – Tuple of (lower_clip, upper_clip) quantiles, each in [0, 1], used for outlier clipping (default: (0.005, 0.995))

  • constant_threshold – IQR threshold below which a feature is considered constant (default: 1e-08)
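One plausible way these parameters interact is sketched below. Everything here is an assumption made for illustration: fit_statistics is a hypothetical helper, and both the order of operations (clip first, then compute the median and IQR) and the choice to replace a too-small IQR with 1 are guesses, not behaviour confirmed by this page.

```python
import torch


def fit_statistics(
    x: torch.Tensor,
    quantile_range=(0.25, 0.75),
    clip_quantiles=(0.005, 0.995),
    constant_threshold: float = 1e-8,
):
    """Return (median, safe_iqr, is_constant) per feature column."""
    # Clip each column to its own [lower_clip, upper_clip] quantile bounds.
    lo = torch.quantile(x, clip_quantiles[0], dim=0)
    hi = torch.quantile(x, clip_quantiles[1], dim=0)
    clipped = torch.maximum(torch.minimum(x, hi), lo)

    median = torch.quantile(clipped, 0.5, dim=0)
    q_lo = torch.quantile(clipped, quantile_range[0], dim=0)
    q_hi = torch.quantile(clipped, quantile_range[1], dim=0)
    iqr = q_hi - q_lo

    # Features whose IQR falls below the threshold are flagged as constant.
    is_constant = iqr < constant_threshold
    # Assumption: use 1 as the divisor for constant features to avoid division by ~0.
    safe_iqr = torch.where(is_constant, torch.ones_like(iqr), iqr)
    return median, safe_iqr, is_constant


# The second column never varies, so its IQR is exactly zero.
x = torch.tensor([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]])
median, safe_iqr, is_constant = fit_statistics(x)
```

In this sketch the boolean mask plays the role that get_constant_features_mask() describes: it marks the features whose IQR was too small to scale by.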

fit(features: Tensor, feature_names: Optional[List[str]] = None) → RobustScalerTransform

Fit the robust scaler on training data.

Parameters:
  • features – Training features tensor of shape (n_instances, n_features)

  • feature_names – Optional list of feature names for logging

Returns:

Self for method chaining

_transform_impl(features: Tensor) → Tensor

Apply the robust scaling to features.

Parameters:

features – Input features tensor

Returns:

Normalized features tensor
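The split between fitting and transforming matters because the statistics come from the training data only and are then reused unchanged on new data. A minimal sketch of that pattern follows, using hypothetical helper functions (fit_stats, apply_scaling) rather than the class itself, and omitting the log transform and clipping for brevity.

```python
import torch


def fit_stats(train: torch.Tensor, quantile_range=(0.25, 0.75)):
    """Compute per-feature median and IQR from training data."""
    median = torch.quantile(train, 0.5, dim=0)
    iqr = torch.quantile(train, quantile_range[1], dim=0) - torch.quantile(
        train, quantile_range[0], dim=0
    )
    return median, iqr


def apply_scaling(x: torch.Tensor, median: torch.Tensor, iqr: torch.Tensor) -> torch.Tensor:
    """Scale x with previously fitted statistics; nothing is re-estimated from x."""
    return (x - median) / iqr


# Shape (n_instances, n_features) = (5, 1).
train = torch.tensor([[0.0], [1.0], [2.0], [3.0], [4.0]])
median, iqr = fit_stats(train)

# New data is scaled with the training median and IQR.
new_data = torch.tensor([[6.0], [2.0]])
normalized = apply_scaling(new_data, median, iqr)
```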

get_config() → Dict[str, Any]

Get the configuration dictionary for this transform.

classmethod from_config(config: Dict[str, Any]) → RobustScalerTransform

Create transform instance from configuration dictionary.
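The get_config / from_config pair suggests a round-trip pattern: serialize the constructor arguments to a plain dictionary, then rebuild an equivalent instance from it. The sketch below shows that pattern with a hypothetical stand-in class, ScalerConfigSketch, which mirrors the constructor arguments listed above. The keys in the real configuration dictionary, and whether it also stores fitted statistics, are not documented here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Tuple


@dataclass
class ScalerConfigSketch:
    """Hypothetical stand-in holding the same constructor arguments."""

    apply_log_transform: bool = True
    quantile_range: Tuple[float, float] = (0.25, 0.75)
    clip_quantiles: Tuple[float, float] = (0.005, 0.995)
    constant_threshold: float = 1e-08

    def get_config(self) -> Dict[str, Any]:
        # Plain dictionary of constructor arguments.
        return asdict(self)

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ScalerConfigSketch":
        # JSON turns tuples into lists, so convert them back on load.
        return cls(
            apply_log_transform=bool(config["apply_log_transform"]),
            quantile_range=tuple(config["quantile_range"]),
            clip_quantiles=tuple(config["clip_quantiles"]),
            constant_threshold=float(config["constant_threshold"]),
        )


original = ScalerConfigSketch(apply_log_transform=False, quantile_range=(0.1, 0.9))
serialized = json.dumps(original.get_config())
restored = ScalerConfigSketch.from_config(json.loads(serialized))
```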

get_scaling_parameters() → Optional[Tuple[Tensor, Tensor]]

Get the scaling parameters (median, IQR) computed from training data.

Returns:

Tuple of (median_values, iqr_values) if fitted, None otherwise

get_constant_features_mask() → Optional[Tensor]

Get a boolean mask indicating which features were considered constant.

Returns:

Boolean tensor of shape (n_features,) where True indicates the feature was considered constant during fitting.