aif360.sklearn.preprocessing.LearnedFairRepresentations
- class aif360.sklearn.preprocessing.LearnedFairRepresentations(prot_attr=None, n_prototypes=5, reconstruct_weight=0.01, target_weight=1.0, fairness_weight=50.0, tol=0.0001, max_iter=200, verbose=0, random_state=None)[source]
Learned Fair Representations.
Learned fair representations is a pre-processing technique that finds a latent representation that encodes the data well while obfuscating information about protected attributes [1]. It can also be used as an in-processing method by using the learned target coefficients to make predictions directly.
References
# Based on code from https://github.com/zjelveh/learning-fair-representations
- Variables:
prot_attr_ (str or list(str)) – Protected attribute(s) used when learning the fair representation.
groups_ (array, shape (n_groups,)) – A list of group labels known to the transformer.
classes_ (array, shape (n_classes,)) – A list of class labels known to the transformer.
priv_group_ (scalar) – The label of the privileged group.
coef_ (array, shape (n_prototypes, 1) or (n_prototypes, n_classes)) – Coefficient of the intermediate representation for classification.
prototypes_ (array, shape (n_prototypes, n_features)) – The prototype set used to form a probabilistic mapping to the intermediate representation. These act as clusters and are in the same space as the samples.
n_iter_ (int) – Actual number of iterations.
- Parameters:
prot_attr (single label or list-like, optional) – Protected attribute(s) to use when learning the representation. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used.
n_prototypes (int, optional) – Size of the set of “prototypes,” Z.
reconstruct_weight (float, optional) – Weight coefficient on the L_x loss term, A_x.
target_weight (float, optional) – Weight coefficient on the L_y loss term, A_y.
fairness_weight (float, optional) – Weight coefficient on the L_z loss term, A_z.
tol (float, optional) – Tolerance for stopping criteria.
max_iter (int, optional) – Maximum number of iterations taken for the solver to converge.
verbose (int, optional) – Verbosity. 0 = silent, 1 = final loss only, 2 = print loss every 50 iterations.
random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator used for shuffling data and seeding weights.
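The three weights above combine the loss terms into a single objective, A_x * L_x + A_y * L_y + A_z * L_z. The sketch below illustrates that combination for binary labels with NumPy. It is not the library's implementation: the helper names (soft_assign, lfr_objective), the use of squared Euclidean distance, and averaging rather than summing over samples are all assumptions made for illustration.

```python
import numpy as np


def soft_assign(X, prototypes):
    """Map each sample to a probability distribution over prototypes."""
    # Squared Euclidean distance from every sample to every prototype
    # (assumed distance measure; the library may use a different one).
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


def lfr_objective(X, y, is_priv, prototypes, w, A_x=0.01, A_y=1.0, A_z=50.0):
    """Return the weighted objective and its three terms (binary labels)."""
    M = soft_assign(X, prototypes)          # (n_samples, n_prototypes)
    X_hat = M @ prototypes                  # reconstruction of the inputs
    y_hat = np.clip(M @ w, 1e-6, 1 - 1e-6)  # predicted P(y = 1)

    # L_x: reconstruction error between samples and their reconstruction.
    L_x = np.mean(np.sum((X - X_hat) ** 2, axis=1))
    # L_y: cross-entropy between true and predicted labels.
    L_y = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    # L_z: gap between the two groups' average prototype memberships.
    L_z = np.abs(M[is_priv].mean(axis=0) - M[~is_priv].mean(axis=0)).sum()
    return A_x * L_x + A_y * L_y + A_z * L_z, (L_x, L_y, L_z)


# Toy data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.integers(0, 2, size=20)
is_priv = np.arange(20) % 2 == 0            # toy group membership
prototypes = rng.normal(size=(5, 3))        # matches n_prototypes=5
w = rng.uniform(size=5)                     # one target weight per prototype

total, (L_x, L_y, L_z) = lfr_objective(X, y, is_priv, prototypes, w)
print(f"L_x={L_x:.3f}  L_y={L_y:.3f}  L_z={L_z:.3f}  total={total:.3f}")
```

With the default weights, the fairness term is scaled 50 times more heavily than the target term and 5000 times more heavily than the reconstruction term, so lowering fairness_weight trades group parity for reconstruction and prediction quality.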
Methods
fit
Compute the transformation parameters that lead to fair representations.
fit_transform
Fit to data, then transform it.
get_metadata_routing
Get metadata routing of this object.
get_params
Get parameters for this estimator.
predict
Transform the targets using the learned model parameters.
predict_proba
Transform the targets using the learned model parameters.
score
Return the mean accuracy on the given test data and labels.
set_fit_request
Request metadata passed to the fit method.
set_output
Set output container.
set_params
Set the parameters of this estimator.
set_score_request
Request metadata passed to the score method.
transform
Transform the dataset using the learned model parameters.
- fit(X, y, priv_group=1, sample_weight=None)[source]
Compute the transformation parameters that lead to fair representations.
- Parameters:
X (pandas.DataFrame) – Training samples.
y (array-like) – Training labels.
priv_group (scalar, optional) – The label of the privileged group.
sample_weight (array-like, optional) – Sample weights.
- Returns:
self
- predict(X)[source]
Transform the targets using the learned model parameters.
- Parameters:
X (pandas.DataFrame) – Input samples.
- Returns:
numpy.ndarray – Transformed targets.
- predict_proba(X)[source]
Transform the targets using the learned model parameters.
- Parameters:
X (pandas.DataFrame) – Input samples.
- Returns:
numpy.ndarray – Transformed targets. Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
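As a rough illustration of how class probabilities could follow from the learned prototypes and target coefficients, the sketch below covers the binary case only, where the coefficients have shape (n_prototypes, 1). The function name predict_proba_sketch, the squared-distance softmax, and the assumption that the coefficient column refers to the second entry of the class list are illustrative choices, not confirmed library behaviour.

```python
import numpy as np


def predict_proba_sketch(X, prototypes, coef):
    """Binary case: coef has shape (n_prototypes, 1)."""
    # Soft assignment of each sample to the prototypes (assumed form).
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 - (-d2).max(axis=1, keepdims=True)
    M = np.exp(logits)
    M /= M.sum(axis=1, keepdims=True)
    # Membership-weighted average of the per-prototype coefficients.
    p_pos = (M @ coef).ravel()
    # Columns follow the class order: [P(classes[0]), P(classes[1])].
    return np.column_stack([1.0 - p_pos, p_pos])


# Hypothetical fitted values, standing in for classes_, prototypes_, coef_.
classes = np.array([0, 1])
prototypes = np.array([[0.0, 0.0], [3.0, 3.0]])
coef = np.array([[0.1], [0.9]])

X_new = np.array([[0.0, 0.0], [3.0, 3.0], [1.0, 1.0]])
proba = predict_proba_sketch(X_new, prototypes, coef)
labels = classes[proba.argmax(axis=1)]  # hard labels, as predict would return
print(proba.round(3), labels)
```

Samples near the first prototype inherit its low coefficient and are assigned class 0, while the sample sitting on the second prototype is assigned class 1.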
- set_fit_request(*, priv_group: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') LearnedFairRepresentations [source]
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
priv_group (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for priv_group parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') LearnedFairRepresentations [source]
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- transform(X)[source]
Transform the dataset using the learned model parameters.
- Parameters:
X (pandas.DataFrame) – Samples to transform.
- Returns:
pandas.DataFrame – Transformed samples.