uqlm.calibration.evaluate.evaluate_calibration

uqlm.calibration.evaluate.evaluate_calibration(uq_result, correct_indicators, plot=True, axes=None)

Evaluate the calibration quality of each scorer's scores in uq_result against the provided correctness labels.

Return type:

dict

Parameters:
  • uq_result (UQResult) – The UQResult object to evaluate.

  • correct_indicators (array-like of shape (n_samples,)) – Binary labels indicating correctness (True/False or 1/0).

  • plot (bool, default=True) – Whether to plot the reliability diagram.

  • axes (tuple of matplotlib.axes.Axes, optional) – Tuple of (reliability_ax, distribution_ax) to draw on. If None and plot=True, a new figure is created.

Returns:

metrics – Dictionary containing calibration metrics for each scorer.
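
This page does not list which metrics the dictionary holds. As an illustration of the kind of quantity a calibration evaluation typically reports, the sketch below computes the expected calibration error (ECE) for one set of scores with NumPy. It is an assumption-laden example, not uqlm's implementation: the helper name expected_calibration_error, the n_bins parameter, and the equal-width binning on [0, 1] are all hypothetical choices made here.

```python
import numpy as np


def expected_calibration_error(scores, correct_indicators, n_bins=10):
    """Hypothetical helper: ECE with equal-width bins on [0, 1].

    Not part of uqlm; shown only to illustrate a common calibration metric.
    """
    scores = np.asarray(scores, dtype=float)
    correct = np.asarray(correct_indicators, dtype=float)  # True/False or 1/0
    if scores.shape != correct.shape:
        raise ValueError("scores and correct_indicators must have the same shape")

    # Interior bin edges; np.digitize then maps each score to a bin in 0..n_bins-1.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(scores, edges[1:-1]), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between observed accuracy and mean score in this bin,
        # weighted by the fraction of samples that fall in the bin.
        gap = abs(correct[mask].mean() - scores[mask].mean())
        ece += mask.mean() * gap
    return ece
```

A value near 0 means the scores track the observed accuracy closely; the per-bin accuracy and mean score used here are the same quantities a reliability diagram plots against each other.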