aif360.algorithms.inprocessing.AdversarialDebiasing

class aif360.algorithms.inprocessing.AdversarialDebiasing(unprivileged_groups, privileged_groups, scope_name, sess, seed=None, adversary_loss_weight=0.1, num_epochs=50, batch_size=128, classifier_num_hidden_units=200, debias=True)

Adversarial debiasing is an in-processing technique that trains a classifier to maximize prediction accuracy while simultaneously reducing an adversary’s ability to determine the protected attribute from the predictions [5]. The aim is a fair classifier: the less the adversary can recover from the predictions, the less group-discrimination information those predictions carry.
References
[5] B. H. Zhang, B. Lemoine, and M. Mitchell, “Mitigating Unwanted Biases with Adversarial Learning,” AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018.

Parameters:
- unprivileged_groups (tuple) – Representation for unprivileged groups.
- privileged_groups (tuple) – Representation for privileged groups.
- scope_name (str) – Scope name for the TensorFlow variables.
- sess (tf.Session) – TensorFlow session.
- seed (int, optional) – Seed to make predict repeatable.
- adversary_loss_weight (float, optional) – Hyperparameter that sets the strength of the adversarial loss.
- num_epochs (int, optional) – Number of training epochs.
- batch_size (int, optional) – Batch size.
- classifier_num_hidden_units (int, optional) – Number of hidden units in the classifier model.
- debias (bool, optional) – Learn a classifier with or without debiasing.
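To make the role of adversary_loss_weight concrete, here is a minimal sketch of how a classifier gradient and an adversary gradient can be combined, following the update rule described in [5]. It is an illustration only: the function name debiased_gradient and the plain-list representation are hypothetical, and it is an assumption that the library combines its gradients in this way.

```python
import math


def debiased_gradient(classifier_grad, adversary_grad, adversary_loss_weight=0.1):
    """Combine gradients for one classifier update, in the style of [5].

    Hypothetical helper, not part of aif360.
    classifier_grad: gradient of the classifier loss w.r.t. the classifier weights
    adversary_grad:  gradient of the adversary loss w.r.t. the same weights
    """
    norm = math.sqrt(sum(g * g for g in adversary_grad))
    if norm == 0.0:
        # The adversary gives no signal, so the classifier gradient is unchanged.
        return list(classifier_grad)

    unit = [g / norm for g in adversary_grad]
    # Component of the classifier gradient that points along the adversary gradient.
    projection = sum(c * u for c, u in zip(classifier_grad, unit))

    # Remove that component, then subtract the weighted adversary gradient
    # so that a descent step also pushes the adversary's loss up.
    return [
        c - projection * u - adversary_loss_weight * a
        for c, u, a in zip(classifier_grad, unit, adversary_grad)
    ]


update = debiased_gradient([1.0, 1.0], [2.0, 0.0], adversary_loss_weight=0.1)
```

A larger adversary_loss_weight makes the last term dominate, trading prediction accuracy for less recoverable group information. A weight of 0 still removes the projected component in this sketch.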
Methods
- fit – Compute the model parameters of the fair classifier using gradient descent.
- fit_predict – Train a model on the input and predict the labels.
- fit_transform – Train a model on the input and transform the dataset accordingly.
- predict – Obtain the predictions for the provided dataset using the fair classifier learned.
- transform – Return a new dataset generated by running this Transformer on the input.
__init__(unprivileged_groups, privileged_groups, scope_name, sess, seed=None, adversary_loss_weight=0.1, num_epochs=50, batch_size=128, classifier_num_hidden_units=200, debias=True)
Parameters: identical to the class parameters listed above.
fit(dataset)
Compute the model parameters of the fair classifier using gradient descent.
Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: AdversarialDebiasing – Returns self.
predict(dataset)
Obtain the predictions for the provided dataset using the fair classifier learned.
Parameters: dataset (BinaryLabelDataset) – Dataset containing labels that need to be transformed.
Returns: dataset (BinaryLabelDataset) – Transformed dataset.