aif360.algorithms.preprocessing
Disparate Impact Remover
-
class aif360.algorithms.preprocessing.DisparateImpactRemover(repair_level=1.0, sensitive_attribute='')
Disparate impact remover is a preprocessing technique that edits feature values to increase group fairness while preserving rank-ordering within groups [1].
References
[1] M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, “Certifying and removing disparate impact.” ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015. Parameters: -
fit_transform(dataset)
Run a repairer on the non-protected features and return the transformed dataset.
Parameters: dataset (BinaryLabelDataset) – Dataset that needs repair.
Returns: Transformed Dataset.
Return type: dataset (BinaryLabelDataset)
Note
In order to transform test data in the same manner as training data, the distributions of attributes conditioned on the protected attribute must be the same.
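The idea of rank-preserving repair can be illustrated with a small, simplified sketch. It is not the library's implementation: the helper names `quantile_value` and `repair_feature` are hypothetical, the nearest-rank quantile lookup is an assumption, and only a single numeric feature is handled. Each value is moved toward the median, across groups, of the values found at the same within-group quantile, and `repair_level` interpolates between the original and the fully repaired value.

```python
from statistics import median


def quantile_value(sorted_vals, q):
    """Return the value at quantile q in [0, 1) using a nearest-rank lookup."""
    idx = min(int(q * len(sorted_vals)), len(sorted_vals) - 1)
    return sorted_vals[idx]


def repair_feature(values, groups, repair_level=1.0):
    """Move each value toward the cross-group median at its within-group quantile.

    repair_level=0.0 leaves values unchanged; repair_level=1.0 applies full repair.
    """
    # Collect and sort the feature values of each protected group.
    sorted_by_group = {}
    for v, g in zip(values, groups):
        sorted_by_group.setdefault(g, []).append(v)
    for g in sorted_by_group:
        sorted_by_group[g].sort()

    repaired = []
    for v, g in zip(values, groups):
        own = sorted_by_group[g]
        # Within-group quantile: fraction of the group's values strictly below v.
        q = sum(1 for u in own if u < v) / len(own)
        # Target: median over groups of the value sitting at that quantile.
        target = median(quantile_value(s, q) for s in sorted_by_group.values())
        repaired.append((1 - repair_level) * v + repair_level * target)
    return repaired


values = [1, 2, 3, 4, 11, 12, 13, 14]
groups = ["a"] * 4 + ["b"] * 4
fully_repaired = repair_feature(values, groups, repair_level=1.0)
partially_repaired = repair_feature(values, groups, repair_level=0.5)
```

With full repair both groups end up with the same distribution of values, while the ordering of instances inside each group is unchanged. The quantile lookup depends on the group distributions seen at repair time, which is one way to read the note above about test data.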
-
Learning Fair Representations
-
class aif360.algorithms.preprocessing.LFR(unprivileged_groups, privileged_groups, k=5, Ax=0.01, Ay=1.0, Az=50.0, print_interval=250, verbose=1, seed=None)
Learning fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes [2].
References
[2] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations.” International Conference on Machine Learning, 2013. Based on code from https://github.com/zjelveh/learning-fair-representations
Parameters: - unprivileged_groups (tuple) – Representation for unprivileged group.
- privileged_groups (tuple) – Representation for privileged group.
- k (int, optional) – Number of prototypes.
- Ax (float, optional) – Input reconstruction quality term weight.
- Ay (float, optional) – Output prediction error term weight.
- Az (float, optional) – Fairness constraint term weight.
- print_interval (int, optional) – Print optimization objective value every print_interval iterations.
- verbose (int, optional) – If zero, then no output.
- seed (int, optional) – Seed to make predict repeatable.
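The three weights combine the loss terms into a single objective, following the weighted-sum form in [2]. The sketch below only shows that combination, using the default weights from the signature above; the function name `lfr_objective` and the loss arguments are hypothetical, and how the library computes each individual term is not shown here.

```python
def lfr_objective(loss_x, loss_y, loss_z, Ax=0.01, Ay=1.0, Az=50.0):
    """Weighted sum of the three LFR loss terms.

    loss_x: input reconstruction error
    loss_y: output prediction error
    loss_z: fairness (group parity) penalty
    """
    return Ax * loss_x + Ay * loss_y + Az * loss_z


# Example values chosen only for illustration.
total = lfr_objective(loss_x=2.0, loss_y=0.5, loss_z=0.1)
```

With the defaults, the fairness term dominates: a fairness penalty of 0.1 contributes 5.0 to the objective, while a reconstruction error of 2.0 contributes only 0.02.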
-
fit(dataset, **kwargs)
Compute the transformation parameters that lead to fair representations.
Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: LFR
-
transform(dataset, threshold=0.5, **kwargs)
Transform the dataset using learned model parameters.
Parameters: - dataset (BinaryLabelDataset) – Dataset containing labels that needs to be transformed.
- threshold (float, optional) – Threshold parameter used for binary label prediction.
Returns: Transformed Dataset.
Return type: dataset (BinaryLabelDataset)
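As a rough illustration of the role of threshold, the sketch below turns predicted scores into binary labels. The helper `binarize` is hypothetical, and it assumes that scores strictly above the threshold map to the favorable label 1.0; the library's handling of scores exactly at the threshold may differ.

```python
def binarize(scores, threshold=0.5):
    """Map each score to 1.0 if it exceeds the threshold, else 0.0."""
    return [1.0 if s > threshold else 0.0 for s in scores]


scores = [0.2, 0.5, 0.8]
default_labels = binarize(scores)               # uses threshold=0.5
lenient_labels = binarize(scores, threshold=0.1)
```

Lowering the threshold assigns the favorable label to more instances, so the choice shifts the balance between the two label values in the transformed dataset.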
Optimized Preprocessing
-
class aif360.algorithms.preprocessing.OptimPreproc(optimizer, optim_options, unprivileged_groups=None, privileged_groups=None, verbose=False, seed=None)
Optimized preprocessing is a preprocessing technique that learns a probabilistic transformation that edits the features and labels in the data with group fairness, individual distortion, and data fidelity constraints and objectives [3].
References
[3] F. P. Calmon, D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K. R. Varshney. “Optimized Pre-Processing for Discrimination Prevention.” Conference on Neural Information Processing Systems, 2017. Based on code available at: https://github.com/fair-preprocessing/nips2017
Parameters: - optimizer (class) – Optimizer class.
- optim_options (dict) – Options for optimization to estimate the transformation.
- unprivileged_groups (dict) – Representation for unprivileged group.
- privileged_groups (dict) – Representation for privileged group.
- verbose (bool, optional) – Verbosity flag for optimization.
- seed (int, optional) – Seed to make fit and predict repeatable.
Note
This algorithm does not yet use the privileged and unprivileged groups specified during initialization. Instead, it automatically attempts to reduce the statistical parity difference between all possible combinations of groups in the dataset.
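To make the quantity in this note concrete, the sketch below computes the difference in favorable-label rates for every pair of groups. It illustrates the metric only, not the optimization; the helper names and the toy data are hypothetical.

```python
from itertools import combinations


def favorable_rate(labels, groups, group, favorable=1):
    """Share of instances in `group` that carry the favorable label."""
    in_group = [y for y, g in zip(labels, groups) if g == group]
    return sum(1 for y in in_group if y == favorable) / len(in_group)


def pairwise_parity_differences(labels, groups, favorable=1):
    """Difference in favorable-label rates for every pair of groups."""
    rates = {
        g: favorable_rate(labels, groups, g, favorable)
        for g in sorted(set(groups))
    }
    return {(a, b): rates[a] - rates[b] for a, b in combinations(rates, 2)}


labels = [1, 0, 1, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "c", "c"]
diffs = pairwise_parity_differences(labels, groups)
```

A difference of zero for a pair means both groups receive the favorable label at the same rate; the transformation aims to bring these differences closer to zero.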
-
fit(dataset, sep='=')
Compute the optimal pre-processing transformation based on the distortion constraint.
Parameters: - dataset (BinaryLabelDataset) – Dataset containing true labels.
- sep (str, optional) – Separator for converting one-hot labels to categorical.
Returns: Returns self.
Return type:
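The role of sep can be illustrated with a small sketch that collapses one-hot columns into categorical fields. It assumes one-hot column names have the form feature, separator, value (for example `race=B`); that naming convention and the helper `de_dummy` are assumptions for illustration, not the library's code.

```python
def de_dummy(row, sep="="):
    """Collapse one-hot columns named '<feature><sep><value>' into categorical fields."""
    out = {}
    for column, flag in row.items():
        if sep in column:
            feature, value = column.split(sep, 1)
            # Keep only the value whose indicator is switched on.
            if flag == 1:
                out[feature] = value
        else:
            # Columns without the separator are passed through unchanged.
            out[column] = flag
    return out


row = {"age": 30, "race=A": 0, "race=B": 1}
categorical = de_dummy(row)
```

If the dataset's column names use a different separator, the same value would need to be passed as sep so the one-hot columns are recognised.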
-
fit_transform(dataset, sep='=', transform_Y=True)
Perform fit() and transform() sequentially.
-
transform(dataset, sep='=', transform_Y=True)
Transform the dataset to a new dataset based on the estimated transformation.
Parameters: - dataset (BinaryLabelDataset) – Dataset containing labels that needs to be transformed.
- transform_Y (bool) – Flag that mandates transformation of Y (labels).
Reweighing
-
class aif360.algorithms.preprocessing.Reweighing(unprivileged_groups, privileged_groups)
Reweighing is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification [4].
References
[4] F. Kamiran and T. Calders, “Data Preprocessing Techniques for Classification without Discrimination,” Knowledge and Information Systems, 2012. Parameters: -
fit(dataset)
Compute the weights for reweighing the dataset.
Parameters: dataset (BinaryLabelDataset) – Dataset containing true labels.
Returns: Returns self.
Return type: Reweighing
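The weights can be sketched with the formula from [4]: each (group, label) combination receives the ratio of its expected frequency under independence to its observed frequency. The code below is a minimal stdlib illustration with hypothetical names and toy data, not the library's implementation, and it ignores any pre-existing instance weights.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label): P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


def weighted_favorable_rate(groups, labels, weights, group, favorable=1):
    """Favorable-label rate within `group` after applying the weights."""
    total = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group)
    fav = sum(
        weights[(g, y)]
        for g, y in zip(groups, labels)
        if g == group and y == favorable
    )
    return fav / total


# 1 = privileged group, 0 = unprivileged group; label 1 is favorable.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
rate_privileged = weighted_favorable_rate(groups, labels, weights, group=1)
rate_unprivileged = weighted_favorable_rate(groups, labels, weights, group=0)
```

In this toy data the under-represented combinations (privileged with unfavorable label, unprivileged with favorable label) are weighted up, and the weighted favorable rate becomes equal across the two groups.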
-
fit_transform(dataset)
Train a model on the input and transform the dataset accordingly.
Equivalent to calling fit(dataset) followed by transform(dataset).
Parameters: dataset (Dataset) – Input dataset.
Returns: Output dataset. Its metadata should reflect the details of this transformation.
Return type: Dataset
-
transform(dataset)
Transform the dataset to a new dataset based on the estimated transformation.
Parameters: dataset (BinaryLabelDataset) – Dataset that needs to be transformed.
Returns: Transformed dataset.
Return type: dataset (BinaryLabelDataset)
-