aif360.sklearn.postprocessing.CalibratedEqualizedOdds

class aif360.sklearn.postprocessing.CalibratedEqualizedOdds(prot_attr=None, cost_constraint='weighted', random_state=None)

Calibrated equalized odds post-processor.

Calibrated equalized odds is a post-processing technique that optimizes over calibrated classifier score outputs to find the probabilities with which to change output labels, subject to an equalized odds objective [1].

Note

A Pipeline expects a single estimation step but this class requires an estimator’s predictions as input. See PostProcessingMeta for a workaround.

References

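[1] G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger, “On Fairness and Calibration,” Conference on Neural Information Processing Systems, 2017.

Adapted from: https://github.com/gpleiss/equalized_odds_and_calibration/blob/master/calib_eq_odds.py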

Variables:
  • prot_attr_ (str or list(str)) – Protected attribute(s) used for post-processing.

  • groups_ (array, shape (2,)) – A list of group labels known to the classifier. Note: this algorithm requires a binary division of the data.

  • classes_ (array, shape (num_classes,)) – A list of class labels known to the classifier. Note: this algorithm treats all non-positive outcomes as negative (binary classification only).

  • pos_label_ (scalar) – The label of the positive class.

  • mix_rates_ (array, shape (2,)) – The interpolation parameters: for each group, the probability of randomly returning the group’s base rate instead of the original score. The mix rate of the group for which the cost function is higher is set to 0.

Parameters:
  • prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute, all combinations of values (intersections) are considered. Default is None meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).

  • cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).

  • random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator for sampling from the mix rates.
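The generalized rates named in cost_constraint replace hard 0/1 predictions with scores. As a minimal sketch, assuming the definitions used in [1] (generalized FPR is the mean positive-class score over truly negative samples, generalized FNR is the mean of one minus the score over truly positive samples), they could be computed as below. The functions are written here for illustration only, ignore sample weights, and are not the library's implementation.

```python
def generalized_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over samples whose true label is negative."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def generalized_fnr(y_true, scores, pos_label=1):
    """Mean of (1 - score) over samples whose true label is positive."""
    pos = [1.0 - s for y, s in zip(y_true, scores) if y == pos_label]
    return sum(pos) / len(pos)


# Illustrative data (made up for this sketch): positive-class scores per sample.
y_true = [0, 0, 1, 1]
scores = [0.2, 0.4, 0.6, 0.9]

gfpr = generalized_fpr(y_true, scores)  # average of 0.2 and 0.4
gfnr = generalized_fnr(y_true, scores)  # average of 0.4 and 0.1
```

Both rates fall in [0, 1] and reduce to the ordinary FPR and FNR when every score is exactly 0 or 1.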

Methods
  • fit – Compute the mixing rates required to satisfy the cost constraint.

  • get_metadata_routing – Get metadata routing of this object.

  • get_params – Get parameters for this estimator.

  • predict – Predict class labels for the given scores.

  • predict_proba – Return probability estimates; the returned estimates for all classes are ordered by the label of classes.

  • score – Score the predictions according to the cost constraint specified.

  • set_fit_request – Request metadata passed to the fit method.

  • set_params – Set the parameters of this estimator.

  • set_score_request – Request metadata passed to the score method.

__init__(prot_attr=None, cost_constraint='weighted', random_state=None)
Parameters:
  • prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute, all combinations of values (intersections) are considered. Default is None meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).

  • cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).

  • random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator for sampling from the mix rates.

fit(X, y, labels=None, pos_label=1, sample_weight=None)

Compute the mixing rates required to satisfy the cost constraint.

Parameters:
  • X (array-like) – Probability estimates of the targets as returned by a predict_proba() call or equivalent.

  • y (pandas.Series) – Ground-truth (correct) target values.

  • labels (list, optional) – The ordered set of labels values. Must match the order of columns in X if provided. By default, all labels in y are used in sorted order.

  • pos_label (scalar, optional) – The label of the positive class.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

self
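One way to picture the quantity fit computes is the sketch below. It assumes, following the approach in [1], that randomly replacing a fraction a of a group's scores with that group's base rate moves the group's expected cost linearly from its current cost toward the cost of the trivial base-rate predictor. The helper mix_rates_from_costs and its arguments are hypothetical names for this illustration, not part of the library, and the sketch does not clip the result or handle sample weights.

```python
def mix_rates_from_costs(costs, trivial_costs):
    """Return [rate_group0, rate_group1] that equalizes the two group costs.

    costs[i]         -- cost of the calibrated scores for group i
    trivial_costs[i] -- cost when every score in group i is its base rate
    The group with the higher cost keeps its scores (mix rate 0); the other
    group is mixed toward its trivial predictor until the costs match.
    """
    lo, hi = (0, 1) if costs[0] < costs[1] else (1, 0)
    rates = [0.0, 0.0]
    gap = trivial_costs[lo] - costs[lo]
    if gap != 0:
        # Solve (1 - a) * costs[lo] + a * trivial_costs[lo] == costs[hi] for a.
        rates[lo] = (costs[hi] - costs[lo]) / gap
    return rates


# Illustrative numbers (made up): group 0 currently has the lower cost.
example_costs = [0.2, 0.3]
example_trivial = [0.4, 0.5]
example_rates = mix_rates_from_costs(example_costs, example_trivial)

# Expected cost of group 0 after mixing; it should match group 1's cost.
a = example_rates[0]
mixed_cost = (1 - a) * example_costs[0] + a * example_trivial[0]
```

In this example group 0 is mixed about half of the time, which raises its expected cost to group 1's level, while group 1 is left untouched.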

predict(X)

Predict class labels for the given scores.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

Returns:

numpy.ndarray – Predicted class label per sample.

predict_proba(X)

Return post-processed probability estimates. The returned estimates for all classes are ordered by the label of classes.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

Returns:

numpy.ndarray – Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
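The role of mix_rates_ and random_state at prediction time can be sketched as follows: each sample's positive-class score is replaced by its group's base rate with probability equal to the group's mix rate. This is an illustration under that assumption, using the standard library's random module rather than NumPy; mix_scores and two_column_proba are hypothetical helpers, and the column layout assumes classes ordered as [negative, positive].

```python
import random


def mix_scores(scores, base_rate, mix_rate, rng):
    """With probability mix_rate, replace each score by the group's base rate."""
    return [base_rate if rng.random() < mix_rate else s for s in scores]


def two_column_proba(pos_scores):
    """Expand positive-class scores to [P(negative), P(positive)] rows."""
    return [[1.0 - p, p] for p in pos_scores]


rng = random.Random(0)  # seeded generator, playing the role of random_state
group_scores = [0.1, 0.8, 0.35, 0.6]

# A mix rate of 0 leaves the scores untouched (the higher-cost group).
unchanged = mix_scores(group_scores, base_rate=0.5, mix_rate=0.0, rng=rng)

# A mix rate of 1 replaces every score with the base rate.
replaced = mix_scores(group_scores, base_rate=0.5, mix_rate=1.0, rng=rng)

proba = two_column_proba(unchanged)
```

Because the replacement is random, fixing the seed is what makes repeated calls reproducible for intermediate mix rates.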

score(X, y, sample_weight=None)

Score the predictions according to the cost constraint specified.

Parameters:
  • X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

  • y (array-like) – Ground-truth (correct) target values.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

float – Absolute value of the difference in cost function for the two groups (e.g. generalized_fpr() if self.cost_constraint is ‘fpr’). A value of 0 means both groups incur the same cost.
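As a rough sketch of the returned quantity for the ‘fpr’ constraint, the snippet below computes the absolute gap between the two groups' generalized false positive rates. It assumes the generalized FPR is the mean positive-class score over truly negative samples; mean_score_on_negatives and cost_gap are hypothetical helpers for illustration, sample weights are ignored, and the data is made up.

```python
def mean_score_on_negatives(y_true, scores, pos_label=1):
    """Generalized FPR: mean positive-class score over truly negative samples."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def cost_gap(cost_fn, group_a, group_b):
    """Absolute difference of a cost function evaluated on two groups.

    Each group is a (y_true, scores) pair.
    """
    return abs(cost_fn(*group_a) - cost_fn(*group_b))


# Two groups with their true labels and (post-processed) positive-class scores.
group_a = ([0, 0, 1], [0.2, 0.4, 0.9])  # generalized FPR = 0.3
group_b = ([0, 1, 1], [0.5, 0.7, 0.8])  # generalized FPR = 0.5

gap = cost_gap(mean_score_on_negatives, group_a, group_b)
```

Lower is better here: the gap shrinks toward 0 as the post-processed scores bring the two groups' costs together.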

set_fit_request(*, labels: bool | None | str = '$UNCHANGED$', pos_label: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') -> CalibratedEqualizedOdds

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • labels (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for labels parameter in fit.

  • pos_label (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for pos_label parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> CalibratedEqualizedOdds

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.