plismbench.engine.extract.extract_from_png module

Stream PLISM tiles dataset and extract features on-the-fly for a given model.

plismbench.engine.extract.extract_from_png.collate(batch: list[dict[str, str | Image.Image]], transform: Callable[[np.ndarray], torch.Tensor]) → tuple[list[str], list[str], torch.Tensor]

Return slide ids, tile ids and transformed images.

Parameters:

batch (list[dict[str, str | Image.Image]]) – List of length batch_size made of dictionaries. Each dictionary is a single input with keys ‘slide_id’, ‘tile_id’ and ‘png’. The image is a PIL.Image.Image with dtype uint8 (values 0-255).
transform (collections.abc.Callable[[numpy.ndarray], torch.Tensor]) – Transform function taking a numpy.ndarray image as input. Each PIL.Image.Image is converted to an array before this transform is called.

Returns:

output – A tuple made of slide ids, tile ids and transformed input images.

Return type:

tuple[list[str], list[str], torch.Tensor]
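
The behaviour described above can be sketched roughly as follows. This is a hypothetical re-implementation written only from the documented signature, not the plismbench source: the names `collate_sketch` and `to_chw_float` are illustrative, and stacking the transformed tiles with `torch.stack` is an assumption about how the batch tensor is built.

```python
from __future__ import annotations

from collections.abc import Callable

import numpy as np
import torch
from PIL import Image


def collate_sketch(
    batch: list[dict[str, str | Image.Image]],
    transform: Callable[[np.ndarray], torch.Tensor],
) -> tuple[list[str], list[str], torch.Tensor]:
    """Hypothetical stand-in for ``collate`` (assumption, not the real code)."""
    slide_ids = [str(sample["slide_id"]) for sample in batch]
    tile_ids = [str(sample["tile_id"]) for sample in batch]
    # Convert each PIL image to a uint8 array of shape (H, W, C),
    # apply the transform, then stack the results along a new batch axis.
    imgs = torch.stack([transform(np.asarray(sample["png"])) for sample in batch])
    return slide_ids, tile_ids, imgs


def to_chw_float(array: np.ndarray) -> torch.Tensor:
    """Illustrative transform: (H, W, C) uint8 -> (C, H, W) float32 in [0, 1]."""
    # Copy first: arrays obtained from PIL images may be read-only.
    return torch.from_numpy(array.copy()).permute(2, 0, 1).float() / 255.0


# Two synthetic 8x8 red tiles standing in for streamed PLISM samples.
batch = [
    {
        "slide_id": "slide_a",
        "tile_id": f"tile_{i}",
        "png": Image.new("RGB", (8, 8), color=(255, 0, 0)),
    }
    for i in range(2)
]
slide_ids, tile_ids, imgs = collate_sketch(batch, to_chw_float)
print(slide_ids, tile_ids, tuple(imgs.shape))
```

Because the function takes `transform` as a second argument, it would typically be bound with `functools.partial(collate_sketch, transform=...)` before being passed as `collate_fn` to a `torch.utils.data.DataLoader`.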

plismbench.engine.extract.extract_from_png.resume_streaming(export_dir: Path, slide_features: list[numpy.ndarray], current_num_tiles: int, slide_features_export_path: Path, feature_extractor: Extractor, slide_ids: list[str], tile_ids: list[str], imgs: Tensor, reference_slide_id: str) → tuple[list[numpy.ndarray], bool, int]

Resume streaming without re-extracting slides.

plismbench.engine.extract.extract_from_png.run_extract_streaming(feature_extractor_name: str, batch_size: int, device: int, export_dir: Path, overwrite: bool) → None

Run feature extraction with streaming.
