plismbench.utils.metrics module

Aggregation of robustness metrics across different extractors.

plismbench.utils.metrics.get_extractor_results(results_path: Path) → DataFrame

Get robustness results for a given extractor.
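
A minimal usage sketch for this function; the metrics path and extractor folder name below are illustrative assumptions, not part of the library:

    from pathlib import Path

    from plismbench.utils.metrics import get_extractor_results

    # Load the robustness metrics computed for a single extractor
    # (hypothetical directory layout).
    extractor_results = get_extractor_results(Path("metrics/my_extractor"))
    print(extractor_results.head())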

plismbench.utils.metrics.get_results(metrics_root_dir: Path, n_tiles: int = 8139) → DataFrame

Get robustness results for all extractors and a given number of tiles.
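
A sketch of aggregating results across extractors, assuming metrics for each extractor live under a common root directory (the path is illustrative):

    from pathlib import Path

    from plismbench.utils.metrics import get_results

    # Collect robustness results for every extractor found under the root,
    # restricted to the default 8,139-tile setting.
    all_results = get_results(Path("metrics"), n_tiles=8139)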

plismbench.utils.metrics.format_results(metrics_root_dir: Path, agg_type: str = 'median', n_tiles: int = 8139, top_k: list[int] | None = None) → DataFrame

Add float columns with parsed metrics with respect to an aggregation type (“mean” or “median”).
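
A sketch showing how the aggregation type and top-k list might be passed; the chosen values are only examples and assume top-k accuracy metrics are available:

    from pathlib import Path

    from plismbench.utils.metrics import format_results

    # Parse metric strings into float columns using the mean aggregation
    # and keep top-1 and top-5 accuracy columns (values are illustrative).
    formatted = format_results(Path("metrics"), agg_type="mean", top_k=[1, 5])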

plismbench.utils.metrics.rank_results(results: DataFrame, robustness_type: str = 'all', metric_name: str = 'top_1_accuracy_median') → DataFrame

Rank results according to a robustness type and metric name.
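
A sketch of ranking a results table by one metric, reusing the defaults from the signature; the input DataFrame is assumed to come from format_results and the path is illustrative:

    from pathlib import Path

    from plismbench.utils.metrics import format_results, rank_results

    results = format_results(Path("metrics"))  # illustrative path
    # Rank extractors on median top-1 accuracy across all robustness types.
    ranked = rank_results(
        results,
        robustness_type="all",
        metric_name="top_1_accuracy_median",
    )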

plismbench.utils.metrics.get_aggregated_results(results: DataFrame, metric_name: str = 'top_1_accuracy', robustness_type: str = 'all', agg_type: str = 'median', top_k: list[int] | None = None) → DataFrame

Retrieve results from .csv files and rank them by a given metric.
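
A sketch combining retrieval and ranking in one call; the argument values simply echo the defaults shown in the signature, and the metrics path is an assumption:

    from pathlib import Path

    from plismbench.utils.metrics import get_aggregated_results, get_results

    results = get_results(Path("metrics"))  # illustrative path
    aggregated = get_aggregated_results(
        results,
        metric_name="top_1_accuracy",
        robustness_type="all",
        agg_type="median",
    )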

plismbench.utils.metrics.get_leaderboard_results(metrics_root_dir: Path) → DataFrame

Generate leaderboard results.
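
A sketch of producing the leaderboard table from a metrics directory; the path is an illustrative assumption:

    from pathlib import Path

    from plismbench.utils.metrics import get_leaderboard_results

    # Build the leaderboard DataFrame from all metrics found under the root.
    leaderboard = get_leaderboard_results(Path("metrics"))
    print(leaderboard)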