aif360.sklearn.inprocessing.AdversarialDebiasing

class aif360.sklearn.inprocessing.AdversarialDebiasing(prot_attr=None, scope_name='classifier', adversary_loss_weight=0.1, num_epochs=50, batch_size=128, classifier_num_hidden_units=200, debias=True, verbose=False, random_state=None)

Debiasing with adversarial learning.

Adversarial debiasing is an in-processing technique that trains a classifier to maximize prediction accuracy while simultaneously reducing an adversary’s ability to determine the protected attribute from the predictions [1]. This encourages a fairer classifier, because the predictions are trained to carry as little group information as possible for the adversary to exploit.
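The interplay between classifier and adversary can be sketched with the gradient combination described in [1]: the classifier's gradient has its component along the adversary's gradient removed, and a weighted adversary gradient is then subtracted. The snippet below is a minimal pure-Python illustration under that assumption, not the library's TensorFlow implementation; the function name and the toy vectors are hypothetical.

```python
import math


def debiased_classifier_gradient(grad_pred, grad_adv, alpha):
    """Combine classifier and adversary gradients (illustrative sketch).

    grad_pred: gradient of the classifier loss w.r.t. the classifier weights
    grad_adv:  gradient of the adversary loss w.r.t. the same weights
    alpha:     adversary loss weight
    """
    norm = math.sqrt(sum(g * g for g in grad_adv))
    if norm == 0.0:
        # No adversary signal: use the plain classifier gradient.
        return list(grad_pred)
    unit_adv = [g / norm for g in grad_adv]
    # Component of the classifier gradient along the adversary gradient.
    dot = sum(p * u for p, u in zip(grad_pred, unit_adv))
    # Remove that component, then push against the adversary.
    return [p - dot * u - alpha * a
            for p, u, a in zip(grad_pred, unit_adv, grad_adv)]


# Hypothetical two-weight example.
update = debiased_classifier_gradient([1.0, 2.0], [0.0, 3.0], alpha=0.1)
```

A gradient-descent step along `update` no longer moves the weights in a direction that helps the adversary, and the `alpha` term actively increases the adversary's loss.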

References

[1] B. H. Zhang, B. Lemoine, and M. Mitchell, “Mitigating Unwanted Biases with Adversarial Learning,” AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018.

Variables:
  • prot_attr_ (str or list(str)) – Protected attribute(s) used for debiasing.

  • groups_ (array, shape (n_groups,)) – A list of group labels known to the classifier.

  • classes_ (array, shape (n_classes,)) – A list of class labels known to the classifier.

  • sess_ (tensorflow.Session) – The TensorFlow Session used for the computations. Note: this can be manually closed to free up resources with self.sess_.close().

  • classifier_logits_ (tensorflow.Tensor) – Tensor containing output logits from the classifier.

  • adversary_logits_ (tensorflow.Tensor) – Tensor containing output logits from the adversary.

Parameters:
  • prot_attr (single label or list-like, optional) – Protected attribute(s) to use in the debiasing process. If more than one attribute, all combinations of values (intersections) are considered. Default is None meaning all protected attributes from the dataset are used.

  • scope_name (str, optional) – TensorFlow “variable_scope” name for the entire model (classifier and adversary).

  • adversary_loss_weight (float or None, optional) – Weight of the adversary loss. If None, use the schedule suggested in [1]: \(\alpha = \sqrt{\text{global\_step}}\), with inverse time decay on the learning rate. Otherwise, use the provided coefficient with exponential learning rate decay.

  • num_epochs (int, optional) – Number of epochs for which to train.

  • batch_size (int, optional) – Size of mini-batch for training.

  • classifier_num_hidden_units (int, optional) – Number of hidden units in the classifier.

  • debias (bool, optional) – If False, learn a classifier without an adversary.

  • verbose (bool, optional) – If True, print losses every 200 steps.

  • random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for shuffling data and seeding weights.
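The two modes of adversary_loss_weight described above can be sketched as follows. This is an illustration only: the helper name is hypothetical, and base_lr, decay_steps, and decay_rate are placeholder values, not constants confirmed by this page. The decay formulas are the standard inverse-time and exponential forms.

```python
import math


def adversary_weight_and_lr(step, adversary_loss_weight=None,
                            base_lr=0.001, decay_steps=1000, decay_rate=0.96):
    """Return (alpha, learning_rate) at a given global step (sketch).

    base_lr, decay_steps and decay_rate are illustrative assumptions.
    """
    if adversary_loss_weight is None:
        # Schedule suggested in the paper: alpha grows as sqrt(global_step),
        # paired with inverse time decay of the learning rate.
        alpha = math.sqrt(step)
        lr = base_lr / (1.0 + decay_rate * step / decay_steps)
    else:
        # Fixed coefficient, paired with exponential learning rate decay.
        alpha = adversary_loss_weight
        lr = base_lr * decay_rate ** (step / decay_steps)
    return alpha, lr


alpha_paper, lr_paper = adversary_weight_and_lr(100)
alpha_fixed, lr_fixed = adversary_weight_and_lr(100, adversary_loss_weight=0.1)
```

With the paper's schedule the adversary's influence increases over training while the step size shrinks; with a fixed weight only the step size changes.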

Methods

  • decision_function – Soft prediction scores.

  • fit – Train the classifier and adversary (if debias == True) with the given training data.

  • get_metadata_routing – Get metadata routing of this object.

  • get_params – Get parameters for this estimator.

  • predict – Predict class labels for the given samples.

  • predict_proba – Probability estimates.

  • score – Return the mean accuracy on the given test data and labels.

  • set_params – Set the parameters of this estimator.

  • set_score_request – Request metadata passed to the score method.

decision_function(X)

Soft prediction scores.

Parameters:

X (pandas.DataFrame) – Test samples.

Returns:

numpy.ndarray – Confidence scores per (sample, class) combination. In the binary case, the confidence score for self.classes_[1], where a score > 0 means this class would be predicted.
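How such scores map to labels can be sketched in plain Python. The helper below is hypothetical and only mirrors the convention stated above (a positive binary score selects classes[1]); for the multiclass case it assumes the highest score wins.

```python
def labels_from_scores(scores, classes):
    """Turn decision scores into class labels (illustrative sketch)."""
    labels = []
    for s in scores:
        if isinstance(s, (list, tuple)):
            # Multiclass: one score per class; assume the highest score wins.
            best = max(range(len(s)), key=lambda i: s[i])
            labels.append(classes[best])
        else:
            # Binary: a score > 0 means classes[1] would be predicted.
            labels.append(classes[1] if s > 0 else classes[0])
    return labels


binary_labels = labels_from_scores([-1.2, 0.3], ["no", "yes"])
multi_labels = labels_from_scores([[0.1, 2.0, -1.0]], ["a", "b", "c"])
```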

fit(X, y)

Train the classifier and adversary (if debias == True) with the given training data.

Parameters:
  • X (pandas.DataFrame) – Training samples.

  • y (array-like) – Training labels.

Returns:

self
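A typical train-and-predict round trip might look like the sketch below. It assumes aif360 and a compatible TensorFlow are installed, and that X_train and X_test are pandas DataFrames carrying a protected attribute named 'sex' in the form aif360.sklearn expects; the wrapper function, the attribute name, and the argument values are hypothetical. Depending on the TensorFlow version, eager execution may need to be disabled before fitting.

```python
def train_debiased(X_train, y_train, X_test, prot_attr="sex"):
    """Fit AdversarialDebiasing and return predictions (usage sketch)."""
    # Imported lazily so this helper can be defined without aif360 installed.
    from aif360.sklearn.inprocessing import AdversarialDebiasing

    clf = AdversarialDebiasing(prot_attr=prot_attr, num_epochs=50,
                               random_state=0)
    clf.fit(X_train, y_train)  # fit returns self
    try:
        return clf.predict(X_test)
    finally:
        # Free TensorFlow resources held by the fitted estimator.
        clf.sess_.close()
```

Closing sess_ after use follows the note in the Variables section; the estimator cannot predict again once its session is closed.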

predict(X)

Predict class labels for the given samples.

Parameters:

X (pandas.DataFrame) – Test samples.

Returns:

numpy.ndarray – Predicted class label per sample.

predict_proba(X)

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

Parameters:

X (pandas.DataFrame) – Test samples.

Returns:

numpy.ndarray – Probability of each sample for each class in the model, where classes are ordered as they are in self.classes_.
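The column ordering can be illustrated with a small sketch. It assumes that, in the binary case, probabilities are obtained by applying a logistic function to the decision score; that mapping is an assumption made for illustration and is not stated on this page. The helper name is hypothetical.

```python
import math


def binary_proba_from_score(score):
    """Map one binary decision score to [P(classes_[0]), P(classes_[1])]."""
    p1 = 1.0 / (1.0 + math.exp(-score))  # assumed logistic link
    return [1.0 - p1, p1]


classes = [0, 1]  # stands in for self.classes_
proba = binary_proba_from_score(0.0)
# Pair each probability with its class label, in classes_ order.
by_class = dict(zip(classes, proba))
```

Reading columns through classes_ rather than by position avoids mistakes when the labels are not 0 and 1.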

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → AdversarialDebiasing

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
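The four request options can be modeled with a toy resolver. This is not scikit-learn's implementation; the function name, the dictionary-based metadata, and the use of ValueError are assumptions made for illustration.

```python
def resolve_sample_weight(request, metadata):
    """Return the keyword arguments a meta-estimator would pass to score."""
    if request is True:
        # Requested: pass it along if the caller provided it.
        if "sample_weight" in metadata:
            return {"sample_weight": metadata["sample_weight"]}
        return {}
    if request is False:
        # Not requested: never passed.
        return {}
    if request is None:
        # Not requested, and providing it is an error.
        if "sample_weight" in metadata:
            raise ValueError("sample_weight was provided but not requested")
        return {}
    # str: the metadata arrives under an alias instead of the original name.
    if request in metadata:
        return {"sample_weight": metadata[request]}
    return {}


passed = resolve_sample_weight(True, {"sample_weight": [1.0, 2.0]})
aliased = resolve_sample_weight("score_weights", {"score_weights": [0.5]})
```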

New in scikit-learn version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.