aif360.sklearn.preprocessing.LearnedFairRepresentations

class aif360.sklearn.preprocessing.LearnedFairRepresentations(prot_attr=None, n_prototypes=5, reconstruct_weight=0.01, target_weight=1.0, fairness_weight=50.0, tol=0.0001, max_iter=200, verbose=0, random_state=None)

Learned Fair Representations.

Learned fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes [1]. It can also be used as an in-processing method by utilizing the learned target coefficients.

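References

[1] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations,” International Conference on Machine Learning, 2013.

Based on code from https://github.com/zjelveh/learning-fair-representations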

Variables:
  • prot_attr_ (str or list(str)) – Protected attribute(s) used to define the groups.

  • groups_ (array, shape (n_groups,)) – A list of group labels known to the transformer.

  • classes_ (array, shape (n_classes,)) – A list of class labels known to the transformer.

  • priv_group_ (scalar) – The label of the privileged group.

  • coef_ (array, shape (n_prototypes, 1) or (n_prototypes, n_classes)) – Coefficient of the intermediate representation for classification.

  • prototypes_ (array, shape (n_prototypes, n_features)) – The prototype set used to form a probabilistic mapping to the intermediate representation. These act as clusters and are in the same space as the samples.

  • n_iter_ (int) – Actual number of iterations.
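
To illustrate how prototypes_ and coef_ relate to a sample, here is a minimal sketch, not the library's implementation. It assumes a plain Euclidean distance and a softmax over negative distances for the probabilistic mapping; the helper names soft_assign, reconstruct, and target_score are hypothetical.

```python
import math

def soft_assign(x, prototypes):
    """Softmax over negative Euclidean distances from x to each prototype."""
    dists = [math.dist(x, v) for v in prototypes]
    # Shift by the smallest distance so every exponent is <= 0 (numerical stability).
    d_min = min(dists)
    weights = [math.exp(-(d - d_min)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

def reconstruct(m, prototypes):
    """Probability-weighted average of the prototypes (same space as the samples)."""
    n_features = len(prototypes[0])
    return [sum(m_k * v[j] for m_k, v in zip(m, prototypes))
            for j in range(n_features)]

def target_score(m, coef):
    """Probability-weighted average of one coefficient per prototype."""
    return sum(m_k * w_k for m_k, w_k in zip(m, coef))

# Two prototypes in a 2-feature space (stand-ins for prototypes_ and coef_).
prototypes = [[0.0, 0.0], [4.0, 0.0]]
coef = [0.1, 0.9]

x = [1.0, 0.0]                     # closer to the first prototype
m = soft_assign(x, prototypes)     # mapping probabilities, one per prototype
x_hat = reconstruct(m, prototypes)
y_hat = target_score(m, coef)
```

Because the sample sits closer to the first prototype, most of the probability mass lands there, and both the reconstruction and the target score are pulled toward that prototype's values.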

Parameters:
  • prot_attr (single label or list-like, optional) – Protected attribute(s) to use when learning the representation. If more than one attribute is given, all combinations of values (intersections) are considered. The default, None, uses all protected attributes from the dataset.

  • n_prototypes (int, optional) – Size of the set of “prototypes,” Z.

  • reconstruct_weight (float, optional) – Weight coefficient on the L_x loss term, A_x.

  • target_weight (float, optional) – Weight coefficient on the L_y loss term, A_y.

  • fairness_weight (float, optional) – Weight coefficient on the L_z loss term, A_z.

  • tol (float, optional) – Tolerance for stopping criteria.

  • max_iter (int, optional) – Maximum number of iterations taken for the solver to converge.

  • verbose (int, optional) – Verbosity. 0 = silent, 1 = final loss only, 2 = print loss every 50 iterations.

  • random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator for shuffling data and seeding weights.
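
The three weights trade the loss terms off against each other. The sketch below assumes the objective is the weighted sum A_x·L_x + A_y·L_y + A_z·L_z, with L_x as mean squared reconstruction error, L_y as binary cross-entropy, and L_z as the per-prototype gap in average mapping probability between two groups, following the formulation in [1]. The exact reductions used by the library may differ, and lfr_objective is a hypothetical helper.

```python
import math

def lfr_objective(X, X_hat, y, y_hat, M, is_priv,
                  reconstruct_weight=0.01, target_weight=1.0,
                  fairness_weight=50.0):
    """Return (total, L_x, L_y, L_z) for a two-group, binary-label setting."""
    n = len(X)
    # L_x: mean squared error between samples and their reconstructions.
    L_x = sum((a - b) ** 2
              for x, xh in zip(X, X_hat)
              for a, b in zip(x, xh)) / n
    # L_y: mean binary cross-entropy; eps guards against log(0).
    eps = 1e-12
    L_y = -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
               for t, p in zip(y, y_hat)) / n
    # L_z: for each prototype, the absolute difference between the average
    # mapping probability of the privileged and unprivileged groups.
    priv = [m for m, g in zip(M, is_priv) if g]
    unpriv = [m for m, g in zip(M, is_priv) if not g]
    L_z = sum(abs(sum(m[k] for m in priv) / len(priv)
                  - sum(m[k] for m in unpriv) / len(unpriv))
              for k in range(len(M[0])))
    total = (reconstruct_weight * L_x
             + target_weight * L_y
             + fairness_weight * L_z)
    return total, L_x, L_y, L_z

# Toy values: two samples, two prototypes, one sample per group.
X = [[1.0, 0.0], [3.0, 0.0]]
X_hat = [[1.0, 0.0], [2.0, 0.0]]
y = [0, 1]
y_hat = [0.5, 0.5]
M = [[0.8, 0.2], [0.4, 0.6]]
is_priv = [True, False]

total, L_x, L_y, L_z = lfr_objective(X, X_hat, y, y_hat, M, is_priv)
```

With the default weights, the fairness term dominates the total in this toy case, which shows why a large fairness_weight pushes the two groups toward similar prototype assignments.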

Methods

fit

Compute the transformation parameters that lead to fair representations.

fit_transform

Fit to data, then transform it.

get_metadata_routing

Get metadata routing of this object.

get_params

Get parameters for this estimator.

predict

Transform the targets using the learned model parameters.

predict_proba

Transform the targets using the learned model parameters.

score

Return the mean accuracy on the given test data and labels.

set_fit_request

Request metadata passed to the fit method.

set_output

Set output container.

set_params

Set the parameters of this estimator.

set_score_request

Request metadata passed to the score method.

transform

Transform the dataset using the learned model parameters.

fit(X, y, priv_group=1, sample_weight=None)

Compute the transformation parameters that lead to fair representations.

Parameters:
  • X (pandas.DataFrame) – Training samples.

  • y (array-like) – Training labels.

  • priv_group (scalar, optional) – The label of the privileged group.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

self

predict(X)

Transform the targets using the learned model parameters.

Parameters:

X (pandas.DataFrame) – Input samples.

Returns:

numpy.ndarray – Transformed targets.

predict_proba(X)

Transform the targets using the learned model parameters.

Parameters:

X (pandas.DataFrame) – Input samples.

Returns:

numpy.ndarray – Transformed targets. Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
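
The column ordering matters when turning probabilities back into labels. The following is a small sketch of that relationship, assuming a binary problem where a single positive-class probability is expanded into two columns and the predicted label is the most probable class; to_two_column and predict_labels are hypothetical helpers, not part of the library.

```python
def to_two_column(p_pos):
    """Expand positive-class probabilities into [P(class 0), P(class 1)] rows."""
    return [[1.0 - p, p] for p in p_pos]

def predict_labels(proba, classes):
    """Pick, for each row, the class whose column has the highest probability."""
    labels = []
    for row in proba:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels

classes = [0, 1]                     # stand-in for classes_, in column order
proba = to_two_column([0.2, 0.7])    # one row per sample
labels = predict_labels(proba, classes)
```

Column i of the probability array corresponds to classes[i], so reordering the class list without reordering the columns would silently flip the predictions.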

set_fit_request(*, priv_group: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → LearnedFairRepresentations

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in scikit-learn version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • priv_group (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for priv_group parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → LearnedFairRepresentations

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in scikit-learn version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

transform(X)

Transform the dataset using the learned model parameters.

Parameters:

X (pandas.DataFrame) – Samples to transform.

Returns:

pandas.DataFrame – Transformed samples.