cellmil.utils.dataset_from_dataset
Functions
create_processed_dataset_files: Create processed dataset files directly compatible with GNNMILDataset.
- cellmil.utils.dataset_from_dataset.create_processed_dataset_files(root: Union[str, Path], label: str, pyg_datasets: List[CellGNNMILDataset], data: DataFrame, split: Literal['train', 'val', 'test'] = 'train', force_reload: bool = False) → str
Create processed dataset files directly compatible with GNNMILDataset.
This function reuses existing processed datasets, assigns them labels from a different label column, and writes the processed files exactly as GNNMILDataset would create them. After running this function, GNNMILDataset can be used normally with the new label.
- Parameters:
root – Root directory where the processed dataset files will be saved
label – New label column name for classification
pyg_datasets – List of existing CellGNNMILDataset instances, given as [train, val, test]
data – DataFrame containing metadata with the new labels
split – Dataset split to create: 'train', 'val', or 'test' (default: 'train')
force_reload – Whether to force reprocessing even if processed files already exist (default: False)
- Returns:
Path to the created processed file, returned as a string
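Because split selects a single split per call, producing files for all three splits presumably takes three calls. The following is a minimal sketch of that pattern, not part of cellmil: build_all_splits is a hypothetical helper, and create_fn stands in for create_processed_dataset_files so the sketch runs without the library installed. Only the keyword names are taken from the signature documented above.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List, Union

# Split names taken from the documented Literal['train', 'val', 'test'].
SPLITS = ("train", "val", "test")


def build_all_splits(
    create_fn: Callable[..., str],
    root: Union[str, Path],
    label: str,
    pyg_datasets: List[Any],
    data: Any,
    force_reload: bool = False,
) -> Dict[str, str]:
    """Hypothetical helper: call create_fn once per split and collect the paths.

    create_fn is assumed to accept the keyword arguments documented for
    create_processed_dataset_files and to return the processed file path.
    """
    paths: Dict[str, str] = {}
    for split in SPLITS:
        paths[split] = create_fn(
            root=root,
            label=label,
            pyg_datasets=pyg_datasets,
            data=data,
            split=split,
            force_reload=force_reload,
        )
    return paths
```

In real use, create_fn would be cellmil.utils.dataset_from_dataset.create_processed_dataset_files, with pyg_datasets and data prepared as described under Parameters. Whether all three splits are actually needed depends on the downstream workflow.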