aif360.sklearn.inprocessing.GridSearchReduction

class aif360.sklearn.inprocessing.GridSearchReduction(prot_attr, estimator, constraints, constraint_weight=0.5, grid_size=10, grid_limit=2.0, grid=None, drop_prot_attr=True, loss='ZeroOne', min_val=None, max_val=None)

Grid search reduction for fair classification or regression.

Grid search is an in-processing technique that can be used for fair classification or fair regression. For classification, it reduces fair classification to a sequence of cost-sensitive classification problems and returns, among the candidates searched, the deterministic classifier with the lowest empirical error subject to fair classification constraints [1]. For regression, it uses the same principle to return a deterministic regressor with the lowest empirical error subject to the constraint of bounded group loss [2].
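To make the regression constraint concrete, the following is a minimal sketch of a bounded group loss check: the mean loss of every protected group must stay at or below a chosen bound. It is an illustration only, not the library's implementation. The helper names, the toy data, the squared loss, and the clipping of values to [min_val, max_val] are all assumptions made for this example.

```python
from collections import defaultdict


def group_losses(y_true, y_pred, groups, min_val, max_val):
    """Mean squared loss per group, with values clipped to [min_val, max_val].

    Hypothetical helper: the clipping step is an assumption about how the
    min_val/max_val bounds are typically applied.
    """
    def clip(value):
        return max(min_val, min(max_val, value))

    totals = defaultdict(float)
    counts = defaultdict(int)
    for yt, yp, group in zip(y_true, y_pred, groups):
        totals[group] += (clip(yt) - clip(yp)) ** 2
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def satisfies_bounded_group_loss(losses, upper_bound):
    """True if every group's mean loss is at most upper_bound."""
    return all(loss <= upper_bound for loss in losses.values())


# Toy data, illustrative values only.
y_true = [0.0, 1.0, 1.0, 0.0]
y_pred = [0.0, 0.5, 1.0, 1.0]
groups = ["a", "a", "b", "b"]

losses = group_losses(y_true, y_pred, groups, min_val=0.0, max_val=1.0)
print(losses)
print(satisfies_bounded_group_loss(losses, upper_bound=0.3))
```

In this toy case group "b" has a higher mean loss than group "a", so a bound that group "a" satisfies can still be violated overall.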

References

Parameters:
  • prot_attr – String or array-like column indices or column names of protected attributes.

  • estimator – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights, e.g. scikit-learn classifiers/regressors. For classification, labels y and predictions returned by predict(X) are either 0 or 1.

  • constraints (str or fairlearn.reductions.Moment) – If string, keyword denoting the fairlearn.reductions.Moment object defining the disparity constraints – e.g., “DemographicParity” or “EqualizedOdds”. For a full list of possible options see self.model.moments. Otherwise, provide the desired Moment object defining the disparity constraints.

  • constraint_weight – When the selection_rule is “tradeoff_optimization” (default, no other option currently) this float specifies the relative weight put on the constraint violation when selecting the best model. The weight placed on the error rate will be 1-constraint_weight.

  • grid_size (int) – The number of Lagrange multipliers to generate in the grid.

  • grid_limit (float) – The largest Lagrange multiplier to generate. The grid will contain values distributed between -grid_limit and grid_limit by default.

  • grid (pandas.DataFrame) – Instead of supplying a size and limit for the grid, users may specify the exact set of Lagrange multipliers they desire using this argument in a DataFrame.

  • drop_prot_attr (bool) – Flag indicating whether to drop protected attributes from training data.

  • loss (str) – String identifying loss function for constraints. Options include “ZeroOne”, “Square”, and “Absolute”.

  • min_val – Loss function parameter for “Square” and “Absolute”, typically the minimum of the range of y values.

  • max_val – Loss function parameter for “Square” and “Absolute”, typically the maximum of the range of y values.
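The interaction of grid_size, grid_limit, and constraint_weight can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it assumes a single constraint with evenly spaced multipliers, and the helper names and the (error, violation) numbers are hypothetical. The selection step mirrors the description above, weighting the constraint violation by constraint_weight and the error by 1-constraint_weight.

```python
def multiplier_grid(grid_size, grid_limit):
    """Evenly spaced multipliers in [-grid_limit, grid_limit].

    Assumption: one constraint and uniform spacing, chosen for simplicity.
    """
    if grid_size == 1:
        return [0.0]
    step = 2 * grid_limit / (grid_size - 1)
    return [-grid_limit + i * step for i in range(grid_size)]


def select_best(candidates, constraint_weight):
    """Pick the candidate with the smallest weighted sum of error and violation."""
    def tradeoff(candidate):
        return ((1 - constraint_weight) * candidate["error"]
                + constraint_weight * candidate["violation"])
    return min(candidates, key=tradeoff)


grid = multiplier_grid(grid_size=5, grid_limit=2.0)

# Hypothetical (error, violation) pairs, one per multiplier in the grid.
stats = [(0.30, 0.02), (0.24, 0.05), (0.20, 0.12), (0.22, 0.20), (0.28, 0.30)]
candidates = [
    {"multiplier": m, "error": e, "violation": v}
    for m, (e, v) in zip(grid, stats)
]

balanced = select_best(candidates, constraint_weight=0.5)
accuracy_first = select_best(candidates, constraint_weight=0.0)
print(balanced["multiplier"], accuracy_first["multiplier"])
```

With constraint_weight=0.0 the selection ignores the violation entirely and reduces to picking the lowest-error candidate; raising the weight shifts the choice toward candidates with smaller violations.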

Methods

  • fit – Train a less biased classifier or regressor with the given training data.

  • get_metadata_routing – Get metadata routing of this object.

  • get_params – Get parameters for this estimator.

  • predict – Predict output for the given samples.

  • predict_proba – Probability estimates.

  • score – Return the mean accuracy on the given test data and labels.

  • set_params – Set the parameters of this estimator.

  • set_score_request – Request metadata passed to the score method.

fit(X, y)

Train a less biased classifier or regressor with the given training data.

Parameters:
  • X (pandas.DataFrame) – Training samples.

  • y (array-like) – Training output.

Returns:

self

predict(X)

Predict output for the given samples.

Parameters:

X (pandas.DataFrame) – Test samples.

Returns:

numpy.ndarray – Predicted output per sample.

predict_proba(X)

Probability estimates.

For classification, the returned estimates for all classes are ordered by class label.

Parameters:

X (pandas.DataFrame) – Test samples.

Returns:

numpy.ndarray – The probability of each sample for each class in the model, with classes ordered as in self.classes_.
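Because the columns follow the order of self.classes_, a probability row can be mapped back to a label by indexing into the class list. The sketch below uses plain lists as stand-ins for the returned array and for self.classes_; the helper name and the values are hypothetical.

```python
def label_from_proba(proba_row, classes):
    """Return the class whose probability column is largest in this row."""
    best_column = max(range(len(proba_row)), key=lambda i: proba_row[i])
    return classes[best_column]


classes = [0, 1]                      # stand-in for self.classes_
proba = [[0.8, 0.2], [0.35, 0.65]]    # stand-in for predict_proba(X) output

labels = [label_from_proba(row, classes) for row in proba]
print(labels)
```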

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GridSearchReduction

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
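The four request options and the UNCHANGED default can be summarized with a small pure-Python model. This is a conceptual sketch of the rules listed above, not scikit-learn's routing code: the function names are hypothetical, and the plain ValueError stands in for whatever error the meta-estimator actually raises.

```python
UNCHANGED = "$UNCHANGED$"


def update_request(current, new):
    """Keep the existing request when the new value is UNCHANGED."""
    return current if new == UNCHANGED else new


def route_sample_weight(request, provided):
    """Return the keyword arguments a meta-estimator would forward to score.

    `provided` is a dict of the metadata the user passed to the meta-estimator.
    """
    if request is True:
        # Requested: forwarded if present, silently skipped otherwise.
        if "sample_weight" in provided:
            return {"sample_weight": provided["sample_weight"]}
        return {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and providing it is an error.
        if "sample_weight" in provided:
            raise ValueError("sample_weight was provided but not requested")
        return {}
    # A string is an alias: the user passes the metadata under that name.
    if request in provided:
        return {"sample_weight": provided[request]}
    return {}


weights = [1.0, 2.0]
print(route_sample_weight(True, {"sample_weight": weights}))
print(route_sample_weight("score_weight", {"score_weight": weights}))
print(update_request(True, UNCHANGED))
```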

New in scikit-learn version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.