aif360.sklearn.datasets.fetch_lawschool_gpa

aif360.sklearn.datasets.fetch_lawschool_gpa(subset='all', *, data_home=None, cache=True, binary_race=True, fillna_gender='female', usecols=['race', 'gender', 'lsat', 'ugpa'], dropcols=None, numeric_only=False, dropna=True)

Load the Law School GPA dataset.

Optionally binarizes ‘race’ to ‘white’ (privileged) or ‘black’ (unprivileged). The other protected attribute is gender (‘male’ is privileged and ‘female’ is unprivileged). The outcome variable is standardized first year GPA (‘zfygpa’). Note: the outcome is continuous, so this is a regression task.

Parameters:
  • subset ({'train', 'test', or 'all'}, optional) – Select the dataset to load: ‘train’ for the training set, ‘test’ for the test set, ‘all’ for both.

  • data_home (string, optional) – Specify another download and cache folder for the datasets. By default all AIF360 datasets are stored in ‘aif360/sklearn/data/raw’ subfolders.

  • cache (bool) – Whether to cache downloaded datasets.

  • binary_race (bool, optional) – If True, keep only white and black students.

  • fillna_gender (str or None, optional) – Fill NA values for gender with this value. If None, leave as NA. Note: this is used for backward compatibility with tempeh and may be dropped in later versions.

  • usecols (single label or list-like, optional) – Feature column(s) to keep. All others are dropped.

  • dropcols (single label or list-like, optional) – Feature column(s) to drop.

  • numeric_only (bool) – Drop all non-numeric feature columns.

  • dropna (bool) – Drop rows with NAs.
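
The parameters above can be combined as in the following minimal sketch. The helper name `load_lawschool_splits` is hypothetical and not part of AIF360; it only forwards documented arguments. Running it assumes that `aif360` is installed and that the dataset can be downloaded or is already cached.

```python
def load_lawschool_splits(data_home=None):
    """Fetch the 'train' and 'test' subsets of the Law School GPA dataset.

    Hypothetical helper for illustration only. It needs the aif360 package
    and, on first use, network access to download the data.
    """
    # Imported lazily so that defining this helper does not require aif360.
    from aif360.sklearn.datasets import fetch_lawschool_gpa

    # 'subset' selects the split; 'data_home' and 'dropna' are keyword-only.
    train = fetch_lawschool_gpa("train", data_home=data_home, dropna=True)
    test = fetch_lawschool_gpa("test", data_home=data_home, dropna=True)
    return train, test
```

Calling `train, test = load_lawschool_splits()` would then give one returned tuple per subset; passing a `data_home` path redirects the download and cache folder.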

Returns:

namedtuple – Tuple containing X, y, and sample_weights for the Law School GPA dataset accessible by index or name.
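
To illustrate what "accessible by index or name" means, here is a small sketch that uses a stand-in namedtuple from the standard library. The type name `Dataset`, its field names, and the placeholder values are assumptions made for illustration; they are not taken from the AIF360 source, and the real fields hold the dataset's features, targets, and weights.

```python
from collections import namedtuple

# Hypothetical stand-in for the returned tuple. The field names are an
# assumption based on the description above, not the library's definition.
Dataset = namedtuple("Dataset", ["X", "y", "sample_weight"])

# Placeholder values only; they are not real rows from the dataset.
ds = Dataset(X=[[30.0, 3.1]], y=[0.25], sample_weight=[1.0])

# Access by name and access by index return the same object.
features_by_name = ds.X
features_by_index = ds[0]

# Tuple unpacking also works, in field order.
X, y, sample_weight = ds
```

Unpacking is convenient when all fields are needed at once, while access by name keeps call sites readable when only one field is used.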