`aif360.sklearn.postprocessing`.CalibratedEqualizedOdds

class aif360.sklearn.postprocessing.CalibratedEqualizedOdds(prot_attr=None, cost_constraint='weighted', random_state=None)

Calibrated equalized odds post-processor.

Calibrated equalized odds is a post-processing technique that optimizes over calibrated classifier scores to find the probabilities with which to change output labels, subject to an equalized odds objective [1].

Note

A Pipeline expects a single estimation step, but this class requires an estimator’s predictions as input. See PostProcessingMeta for a workaround.

See also

PostProcessingMeta

References

[1]	G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger, “On Fairness and Calibration,” Conference on Neural Information Processing Systems, 2017.

Adapted from: https://github.com/gpleiss/equalized_odds_and_calibration/blob/master/calib_eq_odds.py

Variables:

prot_attr (str or list(str)) – Protected attribute(s) used for post-processing.
groups (array, shape (2,)) – A list of group labels known to the classifier. Note: this algorithm requires a binary division of the data.
classes (array, shape (num_classes,)) – A list of class labels known to the classifier. Note: this algorithm treats all non-positive outcomes as negative (binary classification only).
pos_label (scalar) – The label of the positive class.
mix_rates (array, shape (2,)) – The interpolation parameters: for each group, the probability of randomly returning the group’s base rate instead of the classifier’s score. The mix rate for the group with the higher cost is set to 0.

Parameters:

prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute, all combinations of values (intersections) are considered. Default is None meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).
cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).
random_state (int or numpy.random.RandomState, optional) – Seed of pseudo-random number generator for sampling from the mix rates.
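The three cost constraints can be illustrated with a minimal pure-Python sketch. It assumes the definitions from [1]: the generalized false positive rate is the mean positive-class score over truly negative samples, and the generalized false negative rate is the mean of one minus the score over truly positive samples. The function names below are local stand-ins, not the library's API, and the prevalence-based weighting in `weighted_cost` is an assumption about how the two rates are combined.

```python
def generalized_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over samples whose true label is negative."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def generalized_fnr(y_true, scores, pos_label=1):
    """Mean of (1 - score) over samples whose true label is positive."""
    pos = [1 - s for y, s in zip(y_true, scores) if y == pos_label]
    return sum(pos) / len(pos)


def weighted_cost(y_true, scores, pos_label=1):
    """Combine both rates, weighting each by class prevalence (assumed)."""
    base_rate = sum(y == pos_label for y in y_true) / len(y_true)
    return (generalized_fpr(y_true, scores, pos_label) * (1 - base_rate)
            + generalized_fnr(y_true, scores, pos_label) * base_rate)


# Hypothetical labels and calibrated scores for one group.
y_true = [0, 0, 1, 1]
scores = [0.2, 0.4, 0.6, 0.9]

fpr_cost = generalized_fpr(y_true, scores)   # mean of 0.2 and 0.4
fnr_cost = generalized_fnr(y_true, scores)   # mean of 0.4 and 0.1
both_cost = weighted_cost(y_true, scores)    # prevalence-weighted mix of the two
```

In practice each cost would be evaluated separately for the privileged and unprivileged groups, since the constraint compares the two.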

Methods

`fit`	Compute the mixing rates required to satisfy the cost constraint.
`get_params`	Get parameters for this estimator.
`predict`	Predict class labels for the given scores.
`predict_proba`	Return probability estimates for all classes, ordered by class label.
`score`	Score the predictions according to the cost constraint specified.
`set_params`	Set the parameters of this estimator.

__init__(prot_attr=None, cost_constraint='weighted', random_state=None)

Parameters:

prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute, all combinations of values (intersections) are considered. Default is None meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).
cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).
random_state (int or numpy.random.RandomState, optional) – Seed of pseudo-random number generator for sampling from the mix rates.

fit(X, y, labels=None, pos_label=1, sample_weight=None)

Compute the mixing rates required to satisfy the cost constraint.

Parameters:

X (array-like) – Probability estimates of the targets as returned by a predict_proba() call or equivalent.
y (pandas.Series) – Ground-truth (correct) target values.
labels (list, optional) – The ordered set of label values. Must match the order of columns in X if provided. By default, all labels in y are used in sorted order.
pos_label (scalar, optional) – The label of the positive class.
sample_weight (array-like, optional) – Sample weights.

Returns:

self
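The mixing rates that `fit` computes can be sketched as follows, assuming the scheme in the reference implementation of [1]: the group with the lower cost is interpolated toward its trivial predictor (one that always returns the group's base rate) until the two costs are equal, and the higher-cost group keeps a mix rate of 0. `compute_mix_rates` and its arguments are hypothetical names for illustration, not part of the library.

```python
def compute_mix_rates(cost_a, cost_b, trivial_cost_a, trivial_cost_b):
    """Return (mix_a, mix_b) that equalize the expected cost of two groups.

    cost_*: cost of the classifier's scores on each group.
    trivial_cost_*: cost of always predicting that group's base rate.
    """
    if cost_b > cost_a:
        # Group a is cheaper: degrade it toward its trivial predictor.
        return (cost_b - cost_a) / (trivial_cost_a - cost_a), 0.0
    if cost_a > cost_b:
        # Group b is cheaper: degrade it instead.
        return 0.0, (cost_a - cost_b) / (trivial_cost_b - cost_b)
    return 0.0, 0.0  # costs already match


# Hypothetical generalized FPRs; under the 'fpr' constraint the trivial
# predictor's cost equals the base rate, assumed to be 0.5 for both groups.
mix_a, mix_b = compute_mix_rates(0.2, 0.3, 0.5, 0.5)

# Expected cost of group a after mixing is a convex combination.
mixed_cost_a = (1 - mix_a) * 0.2 + mix_a * 0.5
```

With these toy numbers, group a's expected cost after mixing equals group b's cost of 0.3, which is the equal-cost condition the constraint asks for.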

predict(X)

Predict class labels for the given scores.

Parameters:	X (pandas.DataFrame) – Probability estimates of the targets as returned by a `predict_proba()` call or equivalent. Note: must include protected attributes in the index.
Returns:	numpy.ndarray – Predicted class label per sample.

predict_proba(X)

Return probability estimates for the given scores. The returned estimates for all classes are ordered by class label.

Parameters:	X (pandas.DataFrame) – Probability estimates of the targets as returned by a `predict_proba()` call or equivalent. Note: must include protected attributes in the index.
Returns:	numpy.ndarray – Returns the probability of the sample for each class in the model, where classes are ordered as they are in `self.classes_`.
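How the mix rates turn scores into adjusted probabilities can be sketched in pure Python. This assumes that, for each sample, the positive-class score is replaced by the group's base rate with probability equal to the group's mix rate, and that labels are then taken from the larger of the two class probabilities. `mix_scores` is a hypothetical helper, and the standard-library `random.Random` stands in for the estimator's `random_state`.

```python
import random


def mix_scores(scores, base_rate, mix_rate, seed=None):
    """Replace each score by base_rate with probability mix_rate."""
    rng = random.Random(seed)
    return [base_rate if rng.random() < mix_rate else s for s in scores]


scores = [0.1, 0.35, 0.8, 0.95]   # hypothetical positive-class scores
base_rate = 0.5                   # hypothetical base rate of the group

unchanged = mix_scores(scores, base_rate, mix_rate=0.0, seed=0)
all_trivial = mix_scores(scores, base_rate, mix_rate=1.0, seed=0)
mixed = mix_scores(scores, base_rate, mix_rate=0.3, seed=0)

# Two-column probabilities ordered as (negative, positive).
proba = [(1 - s, s) for s in mixed]
# Label 1 where the positive-class probability is the larger one.
labels = [1 if p_pos > p_neg else 0 for p_neg, p_pos in proba]
```

Fixing the seed makes the random replacement reproducible, which is the role `random_state` plays in the constructor.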

score(X, y, sample_weight=None)

Score the predictions according to the cost constraint specified.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a `predict_proba()` call or equivalent. Note: must include protected attributes in the index.
y (array-like) – Ground-truth (correct) target values.
sample_weight (array-like, optional) – Sample weights.

Returns:	float – Absolute value of the difference in cost function for the two groups (e.g. `generalized_fpr()` if `self.cost_constraint` is ‘fpr’).
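As a rough illustration of the returned value, the sketch below computes the absolute gap in generalized false positive rate between two groups. This is what the score would be under the 'fpr' constraint if the generalized FPR is the mean positive-class score over truly negative samples (the definition assumed here). `cost_gap` and the toy data are hypothetical.

```python
def cost_gap(y_true, scores, groups, pos_label=1):
    """Absolute difference in generalized FPR between exactly two groups."""
    def group_fpr(group):
        # Scores of truly negative samples that belong to this group.
        neg = [s for y, s, g in zip(y_true, scores, groups)
               if g == group and y != pos_label]
        return sum(neg) / len(neg)

    first, second = sorted(set(groups))  # fails unless there are 2 groups
    return abs(group_fpr(first) - group_fpr(second))


# Hypothetical labels, positive-class scores, and group membership.
y_true = [0, 0, 1, 0, 0, 1]
scores = [0.1, 0.3, 0.8, 0.4, 0.6, 0.7]
groups = ["a", "a", "a", "b", "b", "b"]

gap = cost_gap(y_true, scores, groups)
```

A value near 0 means the two groups incur nearly the same cost, so lower is better for this score.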
