aif360.algorithms.preprocessing

Disparate Impact Remover

class aif360.algorithms.preprocessing.DisparateImpactRemover(repair_level=1.0, sensitive_attribute='')[source]

Disparate impact remover is a preprocessing technique that edits feature values to increase group fairness while preserving rank-ordering within groups [1].

References

[1] M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, “Certifying and removing disparate impact.” ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015.
Parameters:
  • repair_level (float) – Repair amount. 0.0 is no repair while 1.0 is full repair.
  • sensitive_attribute (str) – Single protected attribute with which to do repair.
fit_transform(dataset)[source]

Run a repairer on the non-protected features and return the transformed dataset.

Parameters: dataset (BinaryLabelDataset) – Dataset that needs repair.
Returns: Transformed Dataset.
Return type: dataset (BinaryLabelDataset)

Note

In order to transform test data in the same manner as training data, the distributions of attributes conditioned on the protected attribute must be the same.
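
The rank-preserving repair described above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the function name `repair_feature`, the nearest-rank quantile lookup, and the use of the cross-group median as the repair target are assumptions chosen for illustration, loosely following the idea in [1]. Each value moves toward the median of the values found at the same within-group quantile in every group, and `repair_level` interpolates between the original value (0.0) and the fully repaired one (1.0).

```python
from statistics import median


def repair_feature(values, groups, repair_level=1.0):
    """Hypothetical helper: move each value toward the cross-group median
    of the values that sit at the same within-group quantile."""
    # Sort each group's values once so quantiles can be looked up by rank.
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    sorted_vals = {g: sorted(vs) for g, vs in by_group.items()}

    def quantile_value(sorted_list, q):
        # Nearest-rank lookup of the value at quantile q in [0, 1].
        return sorted_list[round(q * (len(sorted_list) - 1))]

    repaired = []
    for v, g in zip(values, groups):
        own = sorted_vals[g]
        # Within-group quantile: 0 for the smallest value, 1 for the largest.
        q = own.index(v) / (len(own) - 1) if len(own) > 1 else 0.5
        # Repair target: median over all groups at that same quantile.
        target = median(quantile_value(s, q) for s in sorted_vals.values())
        # Partial repair interpolates between the original and the target.
        repaired.append((1 - repair_level) * v + repair_level * target)
    return repaired


values = [1, 2, 3, 11, 12, 13]
groups = ["a", "a", "a", "b", "b", "b"]
full = repair_feature(values, groups, repair_level=1.0)
half = repair_feature(values, groups, repair_level=0.5)
```

With full repair, both groups in this toy example end up with the same set of values, while the ordering of values inside each group is unchanged.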

Learning Fair Representations

class aif360.algorithms.preprocessing.LFR(unprivileged_groups, privileged_groups, k=5, Ax=0.01, Ay=1.0, Az=50.0, print_interval=250, verbose=1, seed=None)[source]

Learning fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes [2].

References

[2] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations.” International Conference on Machine Learning, 2013.

Based on code from https://github.com/zjelveh/learning-fair-representations

Parameters:
  • unprivileged_groups (tuple) – Representation for unprivileged group.
  • privileged_groups (tuple) – Representation for privileged group.
  • k (int, optional) – Number of prototypes.
  • Ax (float, optional) – Input reconstruction quality term weight.
  • Ay (float, optional) – Output prediction error term weight.
  • Az (float, optional) – Fairness constraint term weight.
  • print_interval (int, optional) – Print optimization objective value every print_interval iterations.
  • verbose (int, optional) – If zero, then no output.
  • seed (int, optional) – Seed to make predict repeatable.
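
To make the roles of `k`, `Ax`, `Ay`, and `Az` concrete, the following is a minimal sketch of the weighted objective described in [2], written in plain Python. It is an illustration under assumptions, not the library's code: the function names, the squared Euclidean distance, the clipping constant, and the toy data are all hypothetical. Each sample is softly assigned to `k` prototypes; the three terms measure reconstruction error, label prediction error, and the gap between the groups' average prototype assignments.

```python
import math


def soft_assign(x, prototypes):
    """Softmax over negative squared distances from x to each prototype."""
    scores = [-sum((a - b) ** 2 for a, b in zip(x, v)) for v in prototypes]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lfr_terms(X, y, protected, prototypes, w):
    """Return (L_x, L_y, L_z) for prototypes and per-prototype label scores w."""
    k = len(prototypes)
    M = [soft_assign(x, prototypes) for x in X]
    L_x = 0.0
    L_y = 0.0
    for x, label, m in zip(X, y, M):
        # Reconstruct the input as a mixture of prototypes.
        x_hat = [sum(m[j] * prototypes[j][d] for j in range(k)) for d in range(len(x))]
        L_x += sum((a - b) ** 2 for a, b in zip(x, x_hat))
        # Predict the label as a mixture of prototype scores, clipped for the log.
        y_hat = min(max(sum(m[j] * w[j] for j in range(k)), 1e-6), 1 - 1e-6)
        L_y += -(label * math.log(y_hat) + (1 - label) * math.log(1 - y_hat))

    def mean_assign(flag):
        rows = [m for m, p in zip(M, protected) if p == flag]
        return [sum(r[j] for r in rows) / len(rows) for j in range(k)]

    # Fairness term: the groups should use each prototype equally often.
    L_z = sum(abs(a - b) for a, b in zip(mean_assign(1), mean_assign(0)))
    return L_x, L_y, L_z


def lfr_objective(X, y, protected, prototypes, w, Ax=0.01, Ay=1.0, Az=50.0):
    L_x, L_y, L_z = lfr_terms(X, y, protected, prototypes, w)
    return Ax * L_x + Ay * L_y + Az * L_z


prototypes = [[0.0], [1.0]]  # k = 2 prototypes in a 1-D feature space
w = [0.1, 0.9]
y = [0, 0, 1, 1]
protected = [0, 0, 1, 1]
separated = [[0.0], [0.0], [1.0], [1.0]]  # groups differ in feature values
balanced = [[0.0], [1.0], [0.0], [1.0]]   # groups share feature values
```

In the `balanced` toy data both groups have identical features, so the fairness term is zero and `Az` has no effect; in the `separated` data a larger `Az` penalizes the representation more heavily.
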
fit(dataset, **kwargs)[source]

Compute the transformation parameters that lead to fair representations.

Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: LFR
fit_transform(dataset, seed=None)[source]

Run fit() and transform() sequentially.

transform(dataset, threshold=0.5, **kwargs)[source]

Transform the dataset using learned model parameters.

Parameters:
  • dataset (BinaryLabelDataset) – Dataset containing labels that needs to be transformed.
  • threshold (float, optional) – Threshold used for binary label prediction.
Returns:

Transformed Dataset.

Return type:

dataset (BinaryLabelDataset)
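
As a rough illustration of what the `threshold` parameter controls, the sketch below converts predicted scores into binary labels. The helper name `binarize_scores`, the label values, and the strict `>` comparison are assumptions made for this example; the library may treat scores exactly equal to the threshold differently.

```python
def binarize_scores(scores, threshold=0.5, favorable=1.0, unfavorable=0.0):
    """Hypothetical helper: map each score to a binary label.

    Scores strictly above the threshold get the favorable label
    (an assumption; the boundary convention is not taken from the library).
    """
    return [favorable if s > threshold else unfavorable for s in scores]


scores = [0.2, 0.5, 0.8]
default_labels = binarize_scores(scores)               # threshold 0.5
lenient_labels = binarize_scores(scores, threshold=0.1)
```

Lowering the threshold assigns the favorable label to more samples, which changes the label distribution of the transformed dataset.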

Optimized Preprocessing

class aif360.algorithms.preprocessing.OptimPreproc(optimizer, optim_options, unprivileged_groups=None, privileged_groups=None, verbose=False, seed=None)[source]

Optimized preprocessing is a preprocessing technique that learns a probabilistic transformation that edits the features and labels in the data, subject to group fairness, individual distortion, and data fidelity constraints and objectives [3].

References

[3] F. P. Calmon, D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K. R. Varshney, “Optimized Pre-Processing for Discrimination Prevention.” Conference on Neural Information Processing Systems, 2017.

Based on code available at: https://github.com/fair-preprocessing/nips2017

Parameters:
  • optimizer (class) – Optimizer class.
  • optim_options (dict) – Options for optimization to estimate the transformation.
  • unprivileged_groups (dict) – Representation for unprivileged group.
  • privileged_groups (dict) – Representation for privileged group.
  • verbose (bool, optional) – Verbosity flag for optimization.
  • seed (int, optional) – Seed to make fit and predict repeatable.

Note

This algorithm does not yet use the privileged and unprivileged groups specified during initialization. Instead, it automatically attempts to reduce the statistical parity difference between all possible combinations of groups in the dataset.
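
The quantity named in the note, statistical parity difference, is the gap in favorable-outcome rates between two groups. The sketch below computes it for every pair of groups in a toy dataset. The function names and the data are hypothetical, and this is only an illustration of the metric, not of how the optimizer uses it internally.

```python
from itertools import combinations


def favorable_rate(labels, groups, group, favorable=1):
    """Fraction of samples in `group` that carry the favorable label."""
    members = [y for y, g in zip(labels, groups) if g == group]
    return sum(1 for y in members if y == favorable) / len(members)


def pairwise_parity_differences(labels, groups, favorable=1):
    """Statistical parity difference for every unordered pair of groups."""
    rates = {g: favorable_rate(labels, groups, g, favorable) for g in set(groups)}
    return {(a, b): rates[a] - rates[b] for a, b in combinations(sorted(rates), 2)}


labels = [1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "b", "b", "b"]
differences = pairwise_parity_differences(labels, groups)
```

A difference of zero for every pair would mean that all groups receive the favorable label at the same rate.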

fit(dataset, sep='=')[source]

Compute optimal pre-processing transformation based on distortion constraint.

Parameters:
  • dataset (BinaryLabelDataset) – Dataset containing true labels.
  • sep (str, optional) – Separator for converting one-hot labels to categorical.
Returns:

Returns self.

Return type:

OptimPreproc
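
The `sep` parameter refers to the separator inside one-hot column names. As a minimal sketch, assuming one-hot columns are named `feature<sep>category` (for example `race=White`), the hypothetical helper below collapses such columns back into a single categorical value. The naming convention, the helper, and the sample row are assumptions for illustration and are not taken from the library.

```python
def one_hot_to_categorical(row, sep="="):
    """Hypothetical helper: collapse one-hot columns into categorical features.

    `row` maps column names to values. Columns whose name contains `sep`
    are treated as one-hot indicators named 'feature<sep>category'.
    """
    out = {}
    for col, val in row.items():
        if sep in col:
            feature, category = col.split(sep, 1)
            if val == 1:  # only the active indicator sets the category
                out[feature] = category
        else:
            out[col] = val  # non-one-hot columns pass through unchanged
    return out


row = {"race=White": 1, "race=Black": 0, "age": 30}
categorical = one_hot_to_categorical(row)
```

If the dataset's one-hot columns use a different separator, the same value would need to be passed as `sep`.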

fit_transform(dataset, sep='=', transform_Y=True)[source]

Perform fit() and transform() sequentially.

transform(dataset, sep='=', transform_Y=True)[source]

Transform the dataset to a new dataset based on the estimated transformation.

Parameters:
  • dataset (BinaryLabelDataset) – Dataset containing labels that needs to be transformed.
  • sep (str, optional) – Separator for converting one-hot labels to categorical.
  • transform_Y (bool) – Flag that mandates transformation of Y (labels).

Reweighing

class aif360.algorithms.preprocessing.Reweighing(unprivileged_groups, privileged_groups)[source]

Reweighing is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification [4].

References

[4] F. Kamiran and T. Calders, “Data Preprocessing Techniques for Classification without Discrimination,” Knowledge and Information Systems, 2012.
Parameters:
  • unprivileged_groups (list(dict)) – Representation for unprivileged group.
  • privileged_groups (list(dict)) – Representation for privileged group.
fit(dataset)[source]

Compute the weights for reweighing the dataset.

Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: Reweighing
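
The weight computation can be sketched in a few lines of plain Python. Following the scheme in [4], each (group, label) combination gets the ratio of its expected frequency under independence to its observed frequency. The function name `reweighing_weights` and the toy data are hypothetical, and this sketch makes no claim about how the library stores or applies the weights.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight for each (group, label) pair: expected count / observed count.

    Expected count assumes group membership and label are independent:
    N_group * N_label / N.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


# Privileged group "p": 3 favorable, 1 unfavorable.
# Unprivileged group "u": 1 favorable, 3 unfavorable.
groups = ["p", "p", "p", "p", "u", "u", "u", "u"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)


def weighted_favorable_rate(group):
    total = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group)
    favorable = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group and y == 1)
    return favorable / total
```

In this toy data, under-represented combinations such as ("u", 1) receive weights above 1 and over-represented ones such as ("p", 1) receive weights below 1, so the weighted favorable rate becomes the same for both groups.
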
fit_transform(dataset)

Train a model on the input and transform the dataset accordingly.

Equivalent to calling fit(dataset) followed by transform(dataset).

Parameters: dataset (Dataset) – Input dataset.
Returns: Output dataset. Metadata should reflect the details of this transformation.
Return type: Dataset
transform(dataset)[source]

Transform the dataset to a new dataset based on the estimated transformation.

Parameters: dataset (BinaryLabelDataset) – Dataset that needs to be transformed.
Returns: Transformed dataset.
Return type: dataset (BinaryLabelDataset)