aif360.sklearn.postprocessing.CalibratedEqualizedOdds

class aif360.sklearn.postprocessing.CalibratedEqualizedOdds(prot_attr=None, cost_constraint='weighted', random_state=None)

Calibrated equalized odds post-processor.

Calibrated equalized odds is a post-processing technique that optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective [1].

Note

A Pipeline expects a single estimation step but this class requires an estimator’s predictions as input. See PostProcessingMeta for a workaround.

References

Variables:

prot_attr_ (str or list(str)) – Protected attribute(s) used for post-processing.

groups_ (array, shape (2,)) – A list of group labels known to the classifier. Note: this algorithm requires a binary division of the data.

classes_ (array, shape (num_classes,)) – A list of class labels known to the classifier. Note: this algorithm treats all non-positive outcomes as negative (binary classification only).

pos_label_ (scalar) – The label of the positive class.

mix_rates_ (array, shape (2,)) – The interpolation parameters: for each group, the probability of randomly returning the group's base rate. The mix rate of the group for which the cost function is higher is set to 0.

Parameters:

prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used. Note: this algorithm requires exactly 2 groups (privileged and unprivileged).

cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate ('fpr'), generalized false negative rate ('fnr'), or a weighted combination of both ('weighted').

random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for sampling from the mix rates.

Methods

fit – Compute the mixing rates required to satisfy the cost constraint.

get_params – Get parameters for this estimator.

predict – Predict class labels for the given scores.

predict_proba – The returned estimates for all classes are ordered by the label of classes.

score – Score the predictions according to the cost constraint specified.

set_params – Set the parameters of this estimator.

__init__(prot_attr=None, cost_constraint='weighted', random_state=None)

Parameters:

prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the post-processing. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used. Note: this algorithm requires exactly 2 groups (privileged and unprivileged).

cost_constraint ('fpr', 'fnr', or 'weighted') – Which equal-cost constraint to satisfy: generalized false positive rate ('fpr'), generalized false negative rate ('fnr'), or a weighted combination of both ('weighted').

random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for sampling from the mix rates.

fit(X, y, labels=None, pos_label=1, sample_weight=None)

Compute the mixing rates required to satisfy the cost constraint.

Parameters:

X (array-like) – Probability estimates of the targets as returned by a predict_proba() call or equivalent.

y (pandas.Series) – Ground-truth (correct) target values.

labels (list, optional) – The ordered set of label values. Must match the order of columns in X if provided. By default, all labels in y are used in sorted order.

pos_label (scalar, optional) – The label of the positive class.

sample_weight (array-like, optional) – Sample weights.

Returns: self
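
How a pair of mixing rates can equalize the cost of two groups is easiest to see with numbers. The sketch below is not the library's implementation. It assumes each group's cost is already known, that a group's trivial predictor returns the group's base rate for every sample, and that mixing is a linear interpolation between the two. The function and argument names (mix_rates, mixed_cost, cost_a, trivial_a, and so on) are hypothetical.

```python
def mix_rates(cost_a, cost_b, trivial_a, trivial_b):
    """Return one interpolation rate per group (hypothetical helper).

    The group with the higher cost keeps its scores (rate 0). The other
    group is mixed toward its trivial predictor until its expected cost
    matches. Assumes the trivial cost differs from the group's own cost.
    """
    if cost_b > cost_a:
        return (cost_b - cost_a) / (trivial_a - cost_a), 0.0
    return 0.0, (cost_a - cost_b) / (trivial_b - cost_b)


def mixed_cost(cost, trivial_cost, rate):
    """Expected cost when a fraction `rate` of scores is replaced at random."""
    return (1.0 - rate) * cost + rate * trivial_cost


# Group a has the lower cost, so only group a receives a non-zero rate.
rate_a, rate_b = mix_rates(cost_a=0.2, cost_b=0.4, trivial_a=0.5, trivial_b=0.25)
cost_a_after = mixed_cost(0.2, 0.5, rate_a)  # matches group b's cost of 0.4
```

With these illustrative numbers, group a is interpolated about two thirds of the way toward its trivial predictor, while the rate of group b, whose cost is higher, stays at 0.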
predict(X)

Predict class labels for the given scores.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

Returns: numpy.ndarray – Predicted class label per sample.

predict_proba(X)

Compute post-processed probability estimates. The returned estimates for all classes are ordered by class label.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

Returns: numpy.ndarray – The probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
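
The interpolation that the mix rates describe can be illustrated for a single group. This is a minimal sketch using only the standard library, not the library's code. It assumes binary classes ordered as [negative, positive], and the helper names mix_scores and to_proba are hypothetical.

```python
import random


def mix_scores(scores, base_rate, mix_rate, seed=None):
    """Replace each score with the group's base rate with probability mix_rate."""
    rng = random.Random(seed)
    return [base_rate if rng.random() < mix_rate else s for s in scores]


def to_proba(pos_scores):
    """Two columns per sample: [P(negative), P(positive)] (assumed class order)."""
    return [[1.0 - p, p] for p in pos_scores]


scores = [0.1, 0.3, 0.8, 0.6]

# A rate of 0 leaves every score untouched; a rate of 1 replaces every score.
unchanged = mix_scores(scores, base_rate=0.5, mix_rate=0.0, seed=0)
replaced = mix_scores(scores, base_rate=0.5, mix_rate=1.0, seed=0)

# Intermediate rates replace a random subset, which is why a seed matters.
partly = mix_scores(scores, base_rate=0.5, mix_rate=0.5, seed=0)
proba = to_proba(partly)
```

Fixing the seed makes the random replacement reproducible, which is the role random_state plays for this class.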
score(X, y, sample_weight=None)

Score the predictions according to the cost constraint specified.

Parameters:

X (pandas.DataFrame) – Probability estimates of the targets as returned by a predict_proba() call or equivalent. Note: must include protected attributes in the index.

y (array-like) – Ground-truth (correct) target values.

sample_weight (array-like, optional) – Sample weights.

Returns: float – Absolute value of the difference in cost function for the two groups (e.g. generalized_fpr() if self.cost_constraint is 'fpr').
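
To make the returned quantity concrete, here is a hedged, pure-Python sketch of a between-group cost gap. It assumes the generalized false positive rate is the mean positive-class score over true negatives and the generalized false negative rate is the mean of one minus that score over true positives. It ignores sample weights and the 'weighted' constraint. The functions are written from scratch for illustration (cost_gap is a hypothetical name) and are not the aif360 functions.

```python
def generalized_fpr(y_true, scores, pos_label=1):
    """Mean positive-class score over samples whose true label is negative."""
    neg = [s for y, s in zip(y_true, scores) if y != pos_label]
    return sum(neg) / len(neg)


def generalized_fnr(y_true, scores, pos_label=1):
    """Mean of (1 - positive-class score) over samples whose true label is positive."""
    pos = [1.0 - s for y, s in zip(y_true, scores) if y == pos_label]
    return sum(pos) / len(pos)


def cost_gap(y_true, scores, groups, cost=generalized_fpr):
    """Absolute difference of `cost` between the two groups found in `groups`."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly 2 groups are required")
    per_group = []
    for g in labels:
        ys = [y for y, grp in zip(y_true, groups) if grp == g]
        ss = [s for s, grp in zip(scores, groups) if grp == g]
        per_group.append(cost(ys, ss))
    return abs(per_group[0] - per_group[1])


y_true = [0, 0, 1, 1, 0, 0, 0, 1]
scores = [0.1, 0.3, 0.8, 0.6, 0.2, 0.4, 0.6, 0.9]
groups = ["a"] * 4 + ["b"] * 4

# Generalized FPR is roughly 0.2 for group a and 0.4 for group b.
fpr_gap = cost_gap(y_true, scores, groups)
# Generalized FNR is roughly 0.3 for group a and 0.1 for group b.
fnr_gap = cost_gap(y_true, scores, groups, cost=generalized_fnr)
```

A gap of 0 would mean both groups incur the same cost under the chosen constraint, so smaller values indicate the constraint is closer to being satisfied.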