aif360.sklearn.postprocessing.CalibratedEqualizedOdds

class aif360.sklearn.postprocessing.CalibratedEqualizedOdds(prot_attr=None, cost_constraint='weighted', random_state=None)
Calibrated equalized odds postprocessor.
Calibrated equalized odds is a postprocessing technique that optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective [1].
Note
A Pipeline expects a single estimation step, but this class requires an estimator’s predictions as input. See PostProcessingMeta for a workaround.
References
[1] G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger, “On Fairness and Calibration,” Conference on Neural Information Processing Systems, 2017. Adapted from: https://github.com/gpleiss/equalized_odds_and_calibration/blob/master/calib_eq_odds.py
Variables:  prot_attr (str or list(str)) – Protected attribute(s) used for post processing.
 groups (array, shape (2,)) – A list of group labels known to the classifier. Note: this algorithm requires a binary division of the data.
 classes (array, shape (num_classes,)) – A list of class labels known to the classifier. Note: this algorithm treats all nonpositive outcomes as negative (binary classification only).
 pos_label (scalar) – The label of the positive class.
 mix_rates (array, shape (2,)) – The interpolation parameters: for each group, the probability of randomly returning the group’s base rate instead of the classifier score. The mix rate of the group with the higher cost is set to 0.
Parameters:  prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the postprocessing. If more than one attribute, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).
 cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).
 random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for sampling from the mix rates.
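The three cost_constraint options can be illustrated with a minimal, library-free sketch. It assumes that the generalized false positive rate is the mean positive-class score over truly negative samples and that the generalized false negative rate is the mean of one minus that score over truly positive samples, as in [1]. Weighting the two terms by the base rate in the 'weighted' case is an assumption made for illustration, and the helper names below are hypothetical, not part of the aif360 API.

```python
def generalized_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over the truly negative samples."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def generalized_fnr(y_true, scores, pos_label=1):
    """Mean of (1 - score) over the truly positive samples."""
    pos = [1.0 - s for y, s in zip(y_true, scores) if y == pos_label]
    return sum(pos) / len(pos)


def group_cost(y_true, scores, cost_constraint="weighted", pos_label=1):
    """Cost of one group's scores under the chosen constraint (illustrative)."""
    if cost_constraint == "fpr":
        return generalized_fpr(y_true, scores, pos_label)
    if cost_constraint == "fnr":
        return generalized_fnr(y_true, scores, pos_label)
    if cost_constraint == "weighted":
        # Assumption: weight each rate by the share of samples it applies to.
        base_rate = sum(y == pos_label for y in y_true) / len(y_true)
        return ((1.0 - base_rate) * generalized_fpr(y_true, scores, pos_label)
                + base_rate * generalized_fnr(y_true, scores, pos_label))
    raise ValueError("cost_constraint must be 'fpr', 'fnr', or 'weighted'")


y_true = [0, 0, 1, 1]
scores = [0.2, 0.4, 0.6, 0.9]  # positive-class probabilities
for constraint in ("fpr", "fnr", "weighted"):
    print(constraint, round(group_cost(y_true, scores, constraint), 3))
```

The postprocessor compares this per-group cost between the two groups, so only the group split and the chosen constraint change what is being equalized.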
Methods
 fit – Compute the mixing rates required to satisfy the cost constraint.
 get_params – Get parameters for this estimator.
 predict – Predict class labels for the given scores.
 predict_proba – The returned estimates for all classes are ordered by the label of classes.
 score – Score the predictions according to the cost constraint specified.
 set_params – Set the parameters of this estimator.

__init__(prot_attr=None, cost_constraint='weighted', random_state=None)
Parameters:  prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the postprocessing. If more than one attribute, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used. Note: This algorithm requires there be exactly 2 groups (privileged and unprivileged).
 cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate (‘fpr’), generalized false negative rate (‘fnr’), or a weighted combination of both (‘weighted’).
 random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for sampling from the mix rates.

fit(X, y, labels=None, pos_label=1, sample_weight=None)
Compute the mixing rates required to satisfy the cost constraint.
Parameters:  X (array-like) – Probability estimates of the targets as returned by a predict_proba() call or equivalent.
 y (pandas.Series) – Ground-truth (correct) target values.
 labels (list, optional) – The ordered set of label values. Must match the order of columns in X if provided. By default, all labels in y are used in sorted order.
 pos_label (scalar, optional) – The label of the positive class.
 sample_weight (array-like, optional) – Sample weights.
Returns: self
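As a rough illustration of what the mixing rates represent, the sketch below follows the interpolation idea from [1]. It assumes that replacing a fraction m of a group's scores with its base rate moves that group's cost linearly from its current cost toward the cost of the trivial base-rate predictor. The function name and the example costs are hypothetical, and this is not the library's source code.

```python
def compute_mix_rates(cost_0, cost_1, trivial_cost_0, trivial_cost_1):
    """Return [mix_rate_0, mix_rate_1] so that both groups end at equal cost.

    cost_i is the cost of group i's classifier scores; trivial_cost_i is the
    cost of always predicting group i's base rate (assumed to be the worse
    of the two).
    """
    if cost_0 == cost_1:
        return [0.0, 0.0]  # already equal, nothing to mix
    if cost_0 > cost_1:
        # Group 0 has the higher cost, so its rate is 0 and group 1 is degraded:
        # solve (1 - m) * cost_1 + m * trivial_cost_1 == cost_0 for m.
        return [0.0, (cost_0 - cost_1) / (trivial_cost_1 - cost_1)]
    # Otherwise group 1 has the higher cost and group 0 is degraded.
    return [(cost_1 - cost_0) / (trivial_cost_0 - cost_0), 0.0]


# Hypothetical costs: group 0 is worse off, so group 1 gets a non-zero rate.
rates = compute_mix_rates(0.30, 0.20, 0.50, 0.40)
mixed_cost_1 = (1 - rates[1]) * 0.20 + rates[1] * 0.40
print(rates, round(mixed_cost_1, 3))
```

Only the better-off group is ever degraded, which is why the group with the higher cost always ends up with a mix rate of 0.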

predict(X)
Predict class labels for the given scores.
Parameters: X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
Returns: numpy.ndarray – Predicted class label per sample.

predict_proba(X)
The returned estimates for all classes are ordered by the label of classes.
Parameters: X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
Returns: numpy.ndarray – Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
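The role of the mix rates and random_state at prediction time can be sketched as follows, under the assumption that each sample's positive-class score is replaced by its group's base rate with probability equal to that group's mix rate. The function, the group labels, and the numbers are hypothetical. The sketch returns only the positive-class score, whereas the documented method returns one column per class.

```python
import random


def mix_scores(scores, groups, mix_rates, base_rates, seed=None):
    """Randomly swap each score for its group's base rate.

    mix_rates and base_rates are dicts keyed by group label.
    """
    rng = random.Random(seed)  # seeded so that results are reproducible
    mixed = []
    for score, group in zip(scores, groups):
        # One uniform draw per sample; a mix rate of 0 never replaces.
        replace = rng.random() < mix_rates[group]
        mixed.append(base_rates[group] if replace else score)
    return mixed


scores = [0.1, 0.9, 0.4, 0.6]
groups = ["a", "b", "a", "b"]
base_rates = {"a": 0.25, "b": 0.75}

# Group "a" is always replaced here, group "b" never is.
print(mix_scores(scores, groups, {"a": 1.0, "b": 0.0}, base_rates, seed=0))
```

Because the replacement is random, fixing the seed is what makes repeated calls return the same output.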

score(X, y, sample_weight=None)
Score the predictions according to the cost constraint specified.
Parameters:  X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.
 y (array-like) – Ground-truth (correct) target values.
 sample_weight (array-like, optional) – Sample weights.
Returns: float – Absolute value of the difference in cost function for the two groups (e.g. generalized_fpr() if self.cost_constraint is ‘fpr’).
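For the ‘fpr’ case, the returned value can be approximated by the sketch below. It assumes that the sample weights enter as a weighted mean of the positive-class scores over the truly negative samples of each group. The function names and the group encoding are hypothetical, and group membership is passed as a plain list rather than read from a DataFrame index.

```python
def weighted_generalized_fpr(y_true, scores, weights, pos_label=1):
    """Weighted mean positive-class score over truly negative samples."""
    num = sum(w * s for y, s, w in zip(y_true, scores, weights) if y != pos_label)
    den = sum(w for y, w in zip(y_true, weights) if y != pos_label)
    return num / den


def fpr_score(y_true, scores, groups, sample_weight=None, pos_label=1):
    """Absolute difference in generalized FPR between exactly two groups."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly 2 groups are required")
    costs = []
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        costs.append(weighted_generalized_fpr(
            [y_true[i] for i in idx],
            [scores[i] for i in idx],
            [sample_weight[i] for i in idx],
            pos_label,
        ))
    return abs(costs[0] - costs[1])


y_true = [0, 0, 1, 0, 0, 1]
scores = [0.2, 0.4, 0.9, 0.5, 0.7, 0.8]
groups = [0, 0, 0, 1, 1, 1]
print(round(fpr_score(y_true, scores, groups), 3))
```

A value near 0 means the two groups incur nearly the same cost under the chosen constraint.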