aif360.sklearn.postprocessing.CalibratedEqualizedOdds

class aif360.sklearn.postprocessing.CalibratedEqualizedOdds(prot_attr=None, cost_constraint='weighted', random_state=None)

Calibrated equalized odds post-processor.

Calibrated equalized odds is a post-processing technique that optimizes over calibrated classifier score outputs to find the probabilities with which to change output labels, subject to an equalized odds objective [1].

Note

A Pipeline expects a single estimation step, but this class requires an estimator’s predictions as input. See PostProcessingMeta for a workaround.

References

[1] G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger, “On Fairness and Calibration,” Conference on Neural Information Processing Systems, 2017.

Adapted from: https://github.com/gpleiss/equalized_odds_and_calibration/blob/master/calib_eq_odds.py

Variables:
  • prot_attr_ (str or list(str)) – Protected attribute(s) used for post-processing.
  • groups_ (array, shape (2,)) – A list of group labels known to the classifier. Note: this algorithm requires a binary division of the data.
  • classes_ (array, shape (num_classes,)) – A list of class labels known to the classifier. Note: this algorithm treats all non-positive outcomes as negative (binary classification only).
  • pos_label_ (scalar) – The label of the positive class.
  • mix_rates_ (array, shape (2,)) – The interpolation parameters: for each group, the probability of randomly returning the group’s base rate instead of the classifier’s score. The mix rate for the group with the higher cost is set to 0.
Parameters:
  • prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used. Note: this algorithm requires exactly 2 groups (privileged and unprivileged).
  • cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).
  • random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for sampling from the mix rates.
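
The three cost_constraint options can be illustrated with a small pure-Python sketch. It assumes the definitions used in [1]: the generalized false positive rate is the mean positive-class score over truly negative samples, and the generalized false negative rate is the mean of one minus that score over truly positive samples. The helper names (gen_fpr, gen_fnr, group_cost) and the base-rate weighting chosen for 'weighted' are assumptions made for illustration, not the library's internals.

```python
def gen_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over samples whose true label is negative."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def gen_fnr(y_true, scores, pos_label=1):
    """Mean of (1 - positive-class score) over truly positive samples."""
    pos = [1.0 - s for y, s in zip(y_true, scores) if y == pos_label]
    return sum(pos) / len(pos)


def group_cost(y_true, scores, cost_constraint="weighted", pos_label=1):
    """Hypothetical per-group cost for one protected group."""
    fpr = gen_fpr(y_true, scores, pos_label)
    fnr = gen_fnr(y_true, scores, pos_label)
    if cost_constraint == "fpr":
        return fpr
    if cost_constraint == "fnr":
        return fnr
    if cost_constraint == "weighted":
        # Assumption: weight each rate by how often that error can occur,
        # i.e. FPR by the share of negatives and FNR by the base rate.
        base_rate = sum(1 for y in y_true if y == pos_label) / len(y_true)
        return fpr * (1.0 - base_rate) + fnr * base_rate
    raise ValueError("cost_constraint must be 'fpr', 'fnr', or 'weighted'")


# Toy scores for a single group.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4]
for constraint in ("fpr", "fnr", "weighted"):
    print(constraint, group_cost(y_true, scores, constraint))
```

The post-processor compares this cost between the two groups, so the same function would be evaluated once per group.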

Methods

fit Compute the mixing rates required to satisfy the cost constraint.
get_params Get parameters for this estimator.
predict Predict class labels for the given scores.
predict_proba Return probability estimates for each class, ordered by class label.
score Score the predictions according to the cost constraint specified.
set_params Set the parameters of this estimator.
fit(X, y, labels=None, pos_label=1, sample_weight=None)

Compute the mixing rates required to satisfy the cost constraint.

Parameters:
  • X (array-like) – Probability estimates of the targets as returned by a predict_proba() call or equivalent.
  • y (pandas.Series) – Ground-truth (correct) target values.
  • labels (list, optional) – The ordered set of label values. Must match the order of columns in X if provided. By default, all labels in y are used in sorted order.
  • pos_label (scalar, optional) – The label of the positive class.
  • sample_weight (array-like, optional) – Sample weights.
Returns:

self
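
How a mixing rate could follow from the group costs is sketched below. This assumes the interpolation used in the reference implementation linked above: the group with the higher cost keeps its predictions (mix rate 0), and the other group is mixed with its trivial base-rate predictor just enough to raise its cost to match. The function name compute_mix_rates and its arguments are hypothetical; the trivial cost is taken to be the cost of always predicting the group's base rate.

```python
def compute_mix_rates(cost_0, cost_1, trivial_cost_0, trivial_cost_1):
    """Return (mix_rate_0, mix_rate_1) that equalize the two group costs.

    cost_g is the cost of the original classifier on group g, and
    trivial_cost_g is the cost of always predicting group g's base rate.
    """

    def _rate(own_cost, other_cost, own_trivial_cost):
        gap = own_trivial_cost - own_cost
        if gap == 0:
            # The trivial predictor is no worse than the classifier, so
            # mixing cannot change the cost; leave predictions untouched.
            return 0.0
        return (other_cost - own_cost) / gap

    if cost_0 >= cost_1:
        # Group 0 costs more: keep it fixed and degrade group 1.
        return (0.0, _rate(cost_1, cost_0, trivial_cost_1))
    # Group 1 costs more: keep it fixed and degrade group 0.
    return (_rate(cost_0, cost_1, trivial_cost_0), 0.0)


print(compute_mix_rates(0.3, 0.2, 0.5, 0.45))
```

With mix rate m, the mixed cost of the lower-cost group is (1 - m) * cost + m * trivial_cost, which equals the other group's cost when m is chosen as above.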

predict(X)

Predict class labels for the given scores.

Parameters: X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
Returns: numpy.ndarray – Predicted class label per sample.
predict_proba(X)

Return probability estimates for each class. The returned estimates for all classes are ordered by the label of classes.

Parameters: X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
Returns: numpy.ndarray – The probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
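
The role of the mix rates at prediction time can be sketched as follows. This is a minimal illustration that assumes each sample's positive-class score is replaced by its group's base rate with probability equal to that group's mix rate, and that labels are then taken from the larger of the two class probabilities. The names mix_scores and to_labels, and the dictionary-based inputs, are hypothetical and not part of the library.

```python
import random


def mix_scores(scores, groups, mix_rates, base_rates, seed=None):
    """Randomly swap positive-class scores for the group's base rate."""
    rng = random.Random(seed)  # seeded for reproducible sampling
    mixed = []
    for score, group in zip(scores, groups):
        if rng.random() < mix_rates[group]:
            mixed.append(base_rates[group])  # withhold the real prediction
        else:
            mixed.append(score)  # keep the classifier's score
    return mixed


def to_labels(pos_scores, pos_label=1, neg_label=0):
    """Pick the class with the larger probability; ties go to negative."""
    return [pos_label if s > 0.5 else neg_label for s in pos_scores]


scores = [0.9, 0.3, 0.7, 0.2]
groups = ["a", "a", "b", "b"]
mix_rates = {"a": 0.0, "b": 0.4}  # group "a" has the higher cost
base_rates = {"a": 0.5, "b": 0.6}

mixed = mix_scores(scores, groups, mix_rates, base_rates, seed=0)
print(mixed, to_labels(mixed))
```

Because the swap is random, predictions for the mixed group can differ between runs unless a seed is fixed, which is what random_state is for.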
score(X, y, sample_weight=None)

Score the predictions according to the cost constraint specified.

Parameters:
  • X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
  • y (array-like) – Ground-truth (correct) target values.
  • sample_weight (array-like, optional) – Sample weights.
Returns:

float – Absolute value of the difference in cost function for the two groups (e.g. generalized_fpr() if self.cost_constraint is ‘fpr’)
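
As a rough illustration of this return value for the 'fpr' constraint, the sketch below computes a generalized false positive rate per group and returns the absolute difference. It is a stdlib-only approximation that ignores sample weights; gen_fpr and fpr_gap are hypothetical helpers, not the library's functions, and the group labels are passed as a plain list rather than read from a DataFrame index.

```python
def gen_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over samples whose true label is negative."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def fpr_gap(y_true, scores, groups, pos_label=1):
    """Absolute difference in generalized FPR between exactly two groups."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly 2 groups are required")
    costs = []
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        costs.append(
            gen_fpr([y_true[i] for i in idx], [scores[i] for i in idx], pos_label)
        )
    return abs(costs[0] - costs[1])


y_true = [0, 0, 1, 0, 0, 1]
scores = [0.2, 0.4, 0.9, 0.5, 0.7, 0.8]
groups = ["a", "a", "a", "b", "b", "b"]
print(fpr_gap(y_true, scores, groups))
```

A value near 0 means the two groups incur nearly the same cost, so lower is better for this score.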