aif360.algorithms.inprocessing.GridSearchReduction
class aif360.algorithms.inprocessing.GridSearchReduction(estimator, constraints, prot_attr=None, constraint_weight=0.5, grid_size=10, grid_limit=2.0, grid=None, drop_prot_attr=True, loss='ZeroOne', min_val=None, max_val=None)
Grid search reduction for fair classification or regression.
Grid search is an in-processing technique that can be used for fair classification or fair regression. For classification, it reduces fair classification to a sequence of cost-sensitive classification problems and returns, among the candidates searched, the deterministic classifier with the lowest empirical error subject to fair classification constraints [1]. For regression, it uses the same principle to return a deterministic regressor with the lowest empirical error subject to the constraint of bounded group loss [2].
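The idea of searching a grid of candidates and keeping the best one can be pictured with a small, library-free sketch. It assumes evenly spaced multipliers between -grid_limit and grid_limit and a weighted-sum selection rule based on constraint_weight (both parameters are described below). The helper names make_grid and select_best and the error/violation numbers are hypothetical illustrations, and the library's actual grid construction and metric definitions may differ.

```python
def make_grid(grid_size, grid_limit):
    """Evenly spaced multipliers in [-grid_limit, grid_limit] (an assumption)."""
    if grid_size == 1:
        return [0.0]
    step = 2.0 * grid_limit / (grid_size - 1)
    return [-grid_limit + i * step for i in range(grid_size)]


def select_best(candidates, constraint_weight):
    """Pick the candidate minimizing the weighted sum of violation and error."""
    def score(candidate):
        return (constraint_weight * candidate["violation"]
                + (1.0 - constraint_weight) * candidate["error"])
    return min(candidates, key=score)


# Hypothetical (error, violation) pairs, one per multiplier. In a real run each
# pair would come from a model trained on the reweighted problem for that multiplier.
toy_metrics = {
    -2.0: (0.30, 0.20),
    -1.0: (0.22, 0.12),
    0.0: (0.15, 0.10),
    1.0: (0.18, 0.02),
    2.0: (0.28, 0.01),
}

grid = make_grid(grid_size=5, grid_limit=2.0)
candidates = [
    {"multiplier": lam, "error": toy_metrics[lam][0], "violation": toy_metrics[lam][1]}
    for lam in grid
]

balanced = select_best(candidates, constraint_weight=0.5)
accuracy_only = select_best(candidates, constraint_weight=0.0)
fairness_only = select_best(candidates, constraint_weight=1.0)
```

With these toy numbers, a constraint_weight of 0.5 favors the candidate that gives up a little accuracy for a much smaller violation, while the extreme weights 0.0 and 1.0 reduce to picking the lowest error or the lowest violation, respectively.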
References
[1] A. Agarwal, A. Beygelzimer, M. Dudik, J. Langford, and H. Wallach, “A Reductions Approach to Fair Classification,” International Conference on Machine Learning, 2018.
[2] A. Agarwal, M. Dudik, and Z. Wu, “Fair Regression: Quantitative Definitions and Reduction-based Algorithms,” International Conference on Machine Learning, 2019.
Parameters:
- estimator – An estimator implementing methods fit(X, y, sample_weight) and predict(X), where X is the matrix of features, y is the vector of labels, and sample_weight is a vector of weights; labels y and predictions returned by predict(X) are either 0 or 1 – e.g. scikit-learn classifiers/regressors.
- constraints (str or fairlearn.reductions.Moment) – If string, keyword denoting the fairlearn.reductions.Moment object defining the disparity constraints – e.g., “DemographicParity” or “EqualizedOdds”. For a full list of possible options see self.model.moments. Otherwise, provide the desired Moment object defining the disparity constraints.
- prot_attr – String or array-like column indices or column names of protected attributes.
- constraint_weight – When the selection_rule is “tradeoff_optimization” (default, no other option currently), this float specifies the relative weight put on the constraint violation when selecting the best model. The weight placed on the error rate will be 1-constraint_weight.
- grid_size (int) – The number of Lagrange multipliers to generate in the grid.
- grid_limit (float) – The largest Lagrange multiplier to generate. The grid will contain values distributed between -grid_limit and grid_limit by default.
- grid (pandas.DataFrame) – Instead of supplying a size and limit for the grid, users may specify the exact set of Lagrange multipliers they desire using this argument in a DataFrame.
- drop_prot_attr (bool) – Flag indicating whether to drop protected attributes from training data.
- loss (str) – String identifying loss function for constraints. Options include “ZeroOne”, “Square”, and “Absolute”.
- min_val – Loss function parameter for “Square” and “Absolute”, typically the minimum of the range of y values.
- max_val – Loss function parameter for “Square” and “Absolute”, typically the maximum of the range of y values.
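The estimator contract described in the parameter list (fit accepting sample_weight, predict returning 0 or 1) can be illustrated with a toy class. This is a minimal sketch, not part of aif360 or scikit-learn: WeightedMajorityClassifier is a hypothetical name, and in practice one would usually pass a scikit-learn estimator that supports sample_weight.

```python
class WeightedMajorityClassifier:
    """Toy estimator: always predicts the label carrying the larger total weight."""

    def fit(self, X, y, sample_weight=None):
        # Fall back to uniform weights when none are supplied.
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        weight_of_ones = sum(w for label, w in zip(y, sample_weight) if label == 1)
        weight_of_zeros = sum(w for label, w in zip(y, sample_weight) if label == 0)
        self.label_ = 1 if weight_of_ones > weight_of_zeros else 0
        return self

    def predict(self, X):
        # One 0/1 prediction per row of X.
        return [self.label_ for _ in X]


X = [[0.0], [1.0], [2.0]]
y = [0, 0, 1]

unweighted = WeightedMajorityClassifier().fit(X, y).predict(X)
reweighted = WeightedMajorityClassifier().fit(X, y, sample_weight=[1.0, 1.0, 5.0]).predict(X)
```

The point of the sketch is that changing sample_weight changes the fitted model. Support for per-sample weights is what allows a reduction to pose each cost-sensitive problem to the same underlying estimator.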
Methods
- fit – Learns model with less bias.
- fit_predict – Train a model on the input and predict the labels.
- fit_transform – Train a model on the input and transform the dataset accordingly.
- predict – Obtain the predictions for the provided dataset using the model learned.
- transform – Return a new dataset generated by running this Transformer on the input.
__init__(estimator, constraints, prot_attr=None, constraint_weight=0.5, grid_size=10, grid_limit=2.0, grid=None, drop_prot_attr=True, loss='ZeroOne', min_val=None, max_val=None)
Parameters: identical to the class parameters listed above.