Disparate Impact Remover
Disparate impact remover is a preprocessing technique that edits feature values to increase group fairness while preserving rank-ordering within groups.
M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, “Certifying and removing disparate impact.” ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015.
Run a repairer on the non-protected features and return the transformed dataset.
Parameters: dataset (BinaryLabelDataset) – Dataset that needs repair.
Returns: Transformed Dataset.
Return type: dataset (BinaryLabelDataset)
To transform test data in the same manner as training data, the distributions of attributes conditioned on the protected attribute must be the same in both datasets.
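The requirement above follows from how a rank-preserving repair works: each value is replaced according to its quantile within its own group, so the mapping depends on the group-conditional distributions. The following is a minimal sketch of that idea in plain Python, not the library's implementation. The function name `repair_feature` and the argument `repair_level` are assumptions made for illustration, and the quantile rule (share of the group strictly below the value) is one simple choice among several.

```python
from statistics import median


def repair_feature(values, groups, repair_level=1.0):
    """Move each value toward the across-group median at its within-group quantile.

    repair_level=0.0 leaves the data unchanged; 1.0 applies the full repair.
    """
    # Collect and sort the values of each group.
    sorted_by_group = {}
    for v, g in zip(values, groups):
        sorted_by_group.setdefault(g, []).append(v)
    for vs in sorted_by_group.values():
        vs.sort()

    repaired = []
    for v, g in zip(values, groups):
        own = sorted_by_group[g]
        # Within-group quantile in [0, 1]: share of the group strictly below v.
        q = sum(1 for u in own if u < v) / max(len(own) - 1, 1)
        # Value at that quantile in every group, then the median across groups.
        at_q = [vs[round(q * (len(vs) - 1))] for vs in sorted_by_group.values()]
        target = median(at_q)
        repaired.append((1 - repair_level) * v + repair_level * target)
    return repaired


values = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
groups = ["a", "a", "a", "b", "b", "b"]
print(repair_feature(values, groups))
```

After a full repair both groups share the same feature distribution, while the ordering of examples inside each group is unchanged. Because the targets are computed from the group-wise sorted values, data drawn from different conditional distributions would be mapped differently.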
Learning Fair Representations
LFR(unprivileged_groups, privileged_groups, k=5, Ax=0.01, Ay=1.0, Az=50.0, print_interval=250, verbose=1, seed=None)
Learning fair representations is a preprocessing technique that finds a latent representation that encodes the data well but obfuscates information about protected attributes.
 R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations.” International Conference on Machine Learning, 2013.
Based on code from https://github.com/zjelveh/learning-fair-representations
- unprivileged_groups (tuple) – Representation for unprivileged group.
- privileged_groups (tuple) – Representation for privileged group.
- k (int, optional) – Number of prototypes.
- Ax (float, optional) – Input reconstruction quality term weight.
- Ay (float, optional) – Output prediction error term weight.
- Az (float, optional) – Fairness constraint term weight.
- print_interval (int, optional) – Print optimization objective value every print_interval iterations.
- verbose (int, optional) – If zero, then no output.
- seed (int, optional) – Seed to make predict repeatable.
Compute the transformation parameters that lead to fair representations.
Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: LFR
Run the fit and transform methods sequentially.
transform(dataset, threshold=0.5, **kwargs)
Transform the dataset using learned model parameters.
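To make the roles of k, Ax, Ay, Az and threshold concrete, here is a minimal sketch of the objective described by Zemel et al. in plain Python. It is not the library's implementation: the function names, the prototype list `prototypes`, and the per-prototype label weights `w` are assumptions made for illustration. Each example is softly assigned to k prototypes, and the objective combines a reconstruction term, a prediction term, and a parity term with the three weights. The thresholding step assumes that `threshold` binarizes the predicted label probability.

```python
import math


def soft_assignments(x, prototypes):
    """Softmax over negative Euclidean distances from x to each prototype."""
    exps = [math.exp(-math.dist(x, v)) for v in prototypes]
    total = sum(exps)
    return [e / total for e in exps]


def column_means(rows):
    """Mean of each column in a non-empty list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def lfr_terms(X, y, protected, prototypes, w):
    """Return (L_x, L_y, L_z): reconstruction, prediction and parity terms."""
    M = [soft_assignments(x, prototypes) for x in X]
    L_x = 0.0
    L_y = 0.0
    for m, x, label in zip(M, X, y):
        # Reconstruct the input as a mixture of prototypes.
        x_hat = [sum(mk * v[d] for mk, v in zip(m, prototypes)) for d in range(len(x))]
        L_x += sum((a - b) ** 2 for a, b in zip(x, x_hat))
        # Predict the label as a mixture of per-prototype weights in (0, 1).
        y_hat = sum(mk * wk for mk, wk in zip(m, w))
        L_y += -label * math.log(y_hat) - (1 - label) * math.log(1 - y_hat)
    # Parity: average prototype assignment should match across the two groups.
    in_group = column_means([m for m, p in zip(M, protected) if p])
    out_group = column_means([m for m, p in zip(M, protected) if not p])
    L_z = sum(abs(a - b) for a, b in zip(in_group, out_group))
    return L_x, L_y, L_z


def lfr_objective(X, y, protected, prototypes, w, Ax=0.01, Ay=1.0, Az=50.0):
    """Weighted sum of the three terms."""
    L_x, L_y, L_z = lfr_terms(X, y, protected, prototypes, w)
    return Ax * L_x + Ay * L_y + Az * L_z


def predicted_labels(X, prototypes, w, threshold=0.5):
    """Binarize predicted label probabilities (assumed role of threshold)."""
    labels = []
    for x in X:
        y_hat = sum(mk * wk for mk, wk in zip(soft_assignments(x, prototypes), w))
        labels.append(1.0 if y_hat > threshold else 0.0)
    return labels


X = [[0.0], [1.0], [0.0], [1.0]]
y = [0, 1, 0, 1]
protected = [True, True, False, False]
prototypes = [[0.0], [1.0]]  # k = 2 prototypes
w = [0.2, 0.8]
print(lfr_objective(X, y, protected, prototypes, w))
print(predicted_labels(X, prototypes, w))
```

In this toy data both groups have identical features, so the parity term is zero. A large Az relative to Ax and Ay pushes the optimizer toward representations in which group membership is hard to recover, at some cost in reconstruction and prediction quality.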
OptimPreproc(optimizer, optim_options, unprivileged_groups=None, privileged_groups=None, verbose=False, seed=None)
Optimized preprocessing is a preprocessing technique that learns a probabilistic transformation of the features and labels in the data, subject to group fairness, individual distortion, and data fidelity constraints and objectives.
F. P. Calmon, D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K. R. Varshney, “Optimized Pre-Processing for Discrimination Prevention.” Conference on Neural Information Processing Systems, 2017.
Based on code available at: https://github.com/fair-preprocessing/nips2017
- optimizer (class) – Optimizer class.
- optim_options (dict) – Options for optimization to estimate the transformation.
- unprivileged_groups (dict) – Representation for unprivileged group.
- privileged_groups (dict) – Representation for privileged group.
- verbose (bool, optional) – Verbosity flag for optimization.
- seed (int, optional) – Seed to make fit and predict repeatable.
This algorithm does not yet use the privileged and unprivileged groups specified during initialization. Instead, it automatically attempts to reduce the statistical parity difference between all possible combinations of groups in the dataset.
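The quantity mentioned in this note can be illustrated with a short sketch. It computes the statistical parity difference, meaning the difference in favorable-outcome rates, for every pair of groups. This shows only the measured quantity, not the optimization itself; the function names are assumptions made for illustration, and the sign convention for a pair (first group minus second, in sorted order) is arbitrary.

```python
from itertools import combinations


def favorable_rate(labels, groups, group, favorable=1):
    """Share of examples in `group` that carry the favorable label."""
    members = [lab for lab, g in zip(labels, groups) if g == group]
    return sum(1 for lab in members if lab == favorable) / len(members)


def pairwise_parity_differences(labels, groups, favorable=1):
    """Statistical parity difference for every unordered pair of groups."""
    rates = {g: favorable_rate(labels, groups, g, favorable) for g in set(groups)}
    return {(a, b): rates[a] - rates[b] for a, b in combinations(sorted(rates), 2)}


labels = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0]
groups = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
for pair, diff in pairwise_parity_differences(labels, groups).items():
    print(pair, diff)
```

A transformation that satisfies the group fairness constraint would bring every one of these pairwise differences close to zero.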
Compute optimal pre-processing transformation based on distortion constraint.
Reweighing is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification.
F. Kamiran and T. Calders, “Data Preprocessing Techniques for Classification without Discrimination,” Knowledge and Information Systems, 2012.
Compute the weights for reweighing the dataset.
Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: Reweighing
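As an illustration of what such weights look like, the sketch below follows the formula from Kamiran and Calders: the weight of a (group, label) cell is the count expected if group and label were independent, divided by the observed count. This is plain Python rather than the library's code, and the function name `reweighing_weights` is an assumption made for illustration.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) cell: expected count / observed count."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


# Unprivileged group "u": 1 favorable out of 4; privileged group "p": 4 out of 6.
groups = ["u"] * 4 + ["p"] * 6
labels = [1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
weights = reweighing_weights(groups, labels)
for cell in sorted(weights):
    print(cell, weights[cell])
```

With these weights the weighted favorable rate is the same in both groups: under-represented cells such as ("u", 1) are weighted up, and over-represented cells such as ("p", 1) are weighted down.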
Train a model on the input and transform the dataset accordingly.
Equivalent to calling fit(dataset) followed by transform(dataset).
Parameters: dataset (Dataset) – Input dataset.
Returns: Output dataset. metadata should reflect the details of this transformation.
Return type: Dataset