cellmil.datamodels.model
Functions
| convert_numpy_types | Recursively convert numpy types to Python native types for JSON serialization. |
Classes
| ExperimentMetadata | Metadata for the entire k-fold experiment. |
| FoldMetadata | Metadata for a single fold. |
| ModelStorage | Manages storage and retrieval of k-fold cross-validation results. |
- cellmil.datamodels.model.convert_numpy_types(obj: Any) -> Any
Recursively convert numpy types to Python native types for JSON serialization.
- Parameters:
obj – Object to convert
- Returns:
Object with numpy types converted to Python native types
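The following is a minimal sketch of how such a recursive conversion can work; it is not the implementation of convert_numpy_types. The helper name to_native is hypothetical. It relies on the fact that numpy scalars and arrays both expose a tolist() method, so it duck-types on that method instead of importing numpy. The demonstration uses the standard-library array type, which also provides tolist().

```python
import json
from array import array
from typing import Any


def to_native(obj: Any) -> Any:
    """Recursively replace array-like values with plain Python objects.

    Hypothetical stand-in for illustration only; the real
    convert_numpy_types may handle more (or different) cases.
    """
    if isinstance(obj, dict):
        # Keys can only be scalars, so a single tolist() call is enough.
        return {
            (k.tolist() if hasattr(k, "tolist") else k): to_native(v)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple, set)):
        # JSON has no tuple or set type, so all of these become lists.
        return [to_native(v) for v in obj]
    if hasattr(obj, "tolist"):
        # numpy scalars return a Python scalar, arrays return nested lists.
        return to_native(obj.tolist())
    return obj


# array("d", ...) stands in for a numpy array in this sketch.
metrics = {"auc_per_fold": array("d", [0.5, 0.75]), "folds": (0, 1)}
print(json.dumps(to_native(metrics)))
```

Calling json.dumps on the unconverted dictionary raises TypeError, which is the problem this kind of helper addresses.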
- class cellmil.datamodels.model.FoldMetadata(fold_idx: int, train_size: int, val_size: int, best_epoch: int, best_metric_value: float, metric_name: str, is_survival: bool, metrics: dict[str, Any])
Bases: object
Metadata for a single fold.
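To illustrate the shape of a per-fold record, the sketch below defines a stand-in dataclass with the same fields as the signature above and serializes it to JSON. Whether FoldMetadata is itself a dataclass is an assumption, and the name FoldRecord and all values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class FoldRecord:
    """Hypothetical stand-in mirroring the documented FoldMetadata fields."""

    fold_idx: int
    train_size: int
    val_size: int
    best_epoch: int
    best_metric_value: float
    metric_name: str
    is_survival: bool
    metrics: dict[str, Any]


# Example values are made up for illustration.
record = FoldRecord(
    fold_idx=0,
    train_size=80,
    val_size=20,
    best_epoch=12,
    best_metric_value=0.75,
    metric_name="val_auc",
    is_survival=False,
    metrics={"val_auc": 0.75, "val_loss": 0.5},
)

# asdict() yields a plain dict that json can serialize directly.
payload = json.dumps(asdict(record), indent=2)
print(payload)
```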
- class cellmil.datamodels.model.ExperimentMetadata(name: str, k_folds: int, random_state: int, balance_cell_counts: bool, cell_balance_bins: int, is_survival: bool, aggregated_metrics: dict[str, Any], best_fold_idx: int, avg_best_epoch: float, dataset_config: dict[str, Any], model_config: dict[str, Any])
Bases: object
Metadata for the entire k-fold experiment.
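The fields aggregated_metrics, best_fold_idx, and avg_best_epoch summarize the individual folds. This page does not specify how they are computed, so the sketch below shows only one plausible derivation: mean and population standard deviation of the tracked metric, the fold with the best value, and the mean of the best epochs. The function summarize_folds and its higher_is_better flag are hypothetical.

```python
from statistics import mean, pstdev
from typing import Any


def summarize_folds(
    folds: list[dict[str, Any]], metric_name: str, higher_is_better: bool = True
) -> dict[str, Any]:
    """Derive experiment-level summary fields from per-fold records.

    Illustrative only; the library's own aggregation may differ.
    """
    values = [f["best_metric_value"] for f in folds]
    pick = max if higher_is_better else min
    best = pick(folds, key=lambda f: f["best_metric_value"])
    return {
        "aggregated_metrics": {
            f"{metric_name}_mean": mean(values),
            f"{metric_name}_std": pstdev(values),
        },
        "best_fold_idx": best["fold_idx"],
        "avg_best_epoch": float(mean(f["best_epoch"] for f in folds)),
    }


# Made-up per-fold results for a 3-fold run.
folds = [
    {"fold_idx": 0, "best_epoch": 10, "best_metric_value": 0.5},
    {"fold_idx": 1, "best_epoch": 14, "best_metric_value": 1.0},
    {"fold_idx": 2, "best_epoch": 12, "best_metric_value": 0.75},
]
summary = summarize_folds(folds, "val_auc")
print(summary)
```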
- class cellmil.datamodels.model.ModelStorage(output_dir: Union[str, Path], experiment_name: str, load_existing: bool = False)
Bases: object
Manages storage and retrieval of k-fold cross-validation results.
Directory structure:
{output_dir}/
├── experiment_metadata.json
├── fold_0/
│   ├── best_model.ckpt
│   ├── train_indices.json
│   ├── val_indices.json
│   ├── predictions.csv
│   ├── transforms/
│   │   ├── pipeline_config.json
│   │   ├── transform_0_*.json
│   │   └── …
│   ├── label_transforms/
│   │   ├── pipeline.json
│   │   ├── transform_0.json
│   │   └── …
│   └── metadata.json
├── fold_1/
│   └── …
├── …
└── final_model/
    ├── final_model.ckpt
    └── metadata.json
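When inspecting results by hand, the layout above can be addressed with pathlib. The helper fold_paths below is hypothetical and not part of ModelStorage; it only assembles the file names listed in the directory structure and does not touch the filesystem.

```python
from pathlib import Path
from typing import Union


def fold_paths(output_dir: Union[str, Path], fold_idx: int) -> dict[str, Path]:
    """Return the expected artifact paths for one fold (no I/O is performed)."""
    fold_dir = Path(output_dir) / f"fold_{fold_idx}"
    return {
        "checkpoint": fold_dir / "best_model.ckpt",
        "train_indices": fold_dir / "train_indices.json",
        "val_indices": fold_dir / "val_indices.json",
        "predictions": fold_dir / "predictions.csv",
        "transforms": fold_dir / "transforms",
        "label_transforms": fold_dir / "label_transforms",
        "metadata": fold_dir / "metadata.json",
    }


# "runs/my_experiment" is a placeholder directory name.
for name, path in fold_paths("runs/my_experiment", 0).items():
    print(f"{name}: {path}")
```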
- __init__(output_dir: Union[str, Path], experiment_name: str, load_existing: bool = False)
Initialize ModelStorage.
- Parameters:
output_dir – Base directory for storing results
experiment_name – Name of the experiment
load_existing – If True, load from existing directory without versioning
- save_fold_results(fold_idx: int, checkpoint_path: Union[str, Path], train_indices: list[int], val_indices: list[int], predictions: dict[str, Any], metadata: FoldMetadata, transforms: Optional[Any] = None, label_transforms: Optional[Any] = None) -> None
Save all results for a single fold.
- Parameters:
fold_idx – Fold index
checkpoint_path – Path to the best checkpoint for this fold
train_indices – Training indices
val_indices – Validation indices
predictions – Dictionary with ‘y_true’ and ‘y_pred’ arrays
metadata – Fold metadata
transforms – Optional feature transforms
label_transforms – Optional label transforms
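As a rough picture of what ends up on disk for one fold, the sketch below writes the two index files and predictions.csv using only the standard library. It is not the body of save_fold_results: the function write_fold is hypothetical, the CSV column layout is an assumption, and checkpoints, transforms, and metadata are left out.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Any


def write_fold(
    fold_dir: Path,
    train_indices: list[int],
    val_indices: list[int],
    predictions: dict[str, Any],
) -> None:
    """Write index and prediction files for a single fold (illustrative only)."""
    fold_dir.mkdir(parents=True, exist_ok=True)
    (fold_dir / "train_indices.json").write_text(json.dumps(train_indices))
    (fold_dir / "val_indices.json").write_text(json.dumps(val_indices))
    # Assumed layout: one row per validation sample, columns y_true and y_pred.
    with open(fold_dir / "predictions.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["y_true", "y_pred"])
        writer.writerows(zip(predictions["y_true"], predictions["y_pred"]))


# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    fold_dir = Path(tmp) / "fold_0"
    write_fold(fold_dir, [0, 1, 2], [3, 4], {"y_true": [1, 0], "y_pred": [0.9, 0.2]})
    print(sorted(p.name for p in fold_dir.iterdir()))
```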
- save_experiment_metadata(metadata: ExperimentMetadata) -> None
Save overall experiment metadata.
- save_final_model(checkpoint_path: Union[str, Path], avg_epochs: float, final_metrics: dict[str, Any], transforms: Optional[Any] = None, label_transforms: Optional[Any] = None) -> None
Save the final model, trained for the average number of epochs.
- Parameters:
checkpoint_path – Path to final model checkpoint
avg_epochs – Average number of epochs used
final_metrics – Metrics from final model
transforms – Optional feature transforms
label_transforms – Optional label transforms
- get_fold_indices(fold_idx: int) -> tuple[list[int], list[int]]
Get train and validation indices for a specific fold.
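For readers who want to check a split without going through ModelStorage, the sketch below reads the two index files straight from a fold directory and verifies that the splits do not overlap. The function read_fold_indices is hypothetical, and it assumes each JSON file holds a flat list of integers, which this page does not confirm.

```python
import json
import tempfile
from pathlib import Path
from typing import Union


def read_fold_indices(
    experiment_dir: Union[str, Path], fold_idx: int
) -> tuple[list[int], list[int]]:
    """Read (train, val) indices for one fold directly from disk."""
    fold_dir = Path(experiment_dir) / f"fold_{fold_idx}"
    train = json.loads((fold_dir / "train_indices.json").read_text())
    val = json.loads((fold_dir / "val_indices.json").read_text())
    return train, val


# Build a throwaway fold directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    fold_dir = Path(tmp) / "fold_0"
    fold_dir.mkdir()
    (fold_dir / "train_indices.json").write_text(json.dumps([0, 1, 2]))
    (fold_dir / "val_indices.json").write_text(json.dumps([3, 4]))

    train, val = read_fold_indices(tmp, 0)
    # A sample must never appear in both splits of the same fold.
    overlap = set(train) & set(val)
    print(len(train), len(val), overlap)
```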
- classmethod from_directory(experiment_dir: Union[str, Path]) -> ModelStorage
Load an existing experiment from a directory.
- Parameters:
experiment_dir – Path to the experiment directory
- Returns:
ModelStorage instance with loaded metadata
Example
>>> storage = ModelStorage.from_directory("/path/to/experiments/my_experiment")
>>> print(storage.experiment_metadata)
>>> predictions = storage.load_all_predictions()