aif360.sklearn.datasets.fetch_adult

aif360.sklearn.datasets.fetch_adult(subset='all', data_home=None, binary_race=True, usecols=[], dropcols=[], numeric_only=False, dropna=True)

Load the Adult Census Income Dataset.

When binary_race is True (the default), binarizes ‘race’ to ‘White’ (privileged) or ‘Non-white’ (unprivileged). The other protected attribute is ‘sex’ (‘Male’ is privileged and ‘Female’ is unprivileged). The outcome variable is ‘annual-income’: ‘>50K’ (favorable) or ‘<=50K’ (unfavorable).
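As a rough illustration of the grouping described above, the following standalone sketch collapses raw race categories into the two groups. It does not call AIF360: the helper `binarize_race` and the sample values are hypothetical, and the library's actual implementation may differ.

```python
# Hypothetical helper: illustrates the grouping only, not AIF360's internals.
def binarize_race(race):
    """Return 'White' unchanged and collapse every other value to 'Non-white'."""
    return "White" if race == "White" else "Non-white"


# Sample values chosen for illustration only.
raw = ["White", "Black", "Asian-Pac-Islander", "White", "Other"]
binary = [binarize_race(r) for r in raw]
print(binary)
```

Passing binary_race=False would presumably leave the original categories in place, so a mapping like this is only relevant to the default behavior.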

Note

By default, the data is downloaded from OpenML. See the OpenML adult page for details.

Parameters:
  • subset ({'train', 'test', or 'all'}, optional) – Select the dataset to load: ‘train’ for the training set, ‘test’ for the test set, ‘all’ for both.
  • data_home (string, optional) – Specify another download and cache folder for the datasets. By default all AIF360 datasets are stored in ‘aif360/sklearn/data/raw’ subfolders.
  • binary_race (bool, optional) – If True, group all non-white races together into a single ‘Non-white’ category.
  • usecols (single label or list-like, optional) – Feature column(s) to keep. All others are dropped.
  • dropcols (single label or list-like, optional) – Feature column(s) to drop.
  • numeric_only (bool) – Drop all non-numeric feature columns.
  • dropna (bool) – Drop rows with NAs.
Returns:

namedtuple – Tuple containing X, y, and sample_weights for the Adult dataset, each accessible by index or by name.
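To make "accessible by index or name" concrete, here is a minimal sketch using a stand-in namedtuple from the standard library. The type name `Dataset`, the toy values, and the exact field names (taken from the wording above) are assumptions; the object returned by the library may name its fields differently.

```python
from collections import namedtuple

# Hypothetical stand-in for the returned tuple; field names follow the
# description above and are an assumption about the real object.
Dataset = namedtuple("Dataset", ["X", "y", "sample_weights"])

# Toy placeholder values instead of the real feature table and labels.
data = Dataset(
    X=[[39, 13], [50, 13]],
    y=["<=50K", ">50K"],
    sample_weights=[1.0, 1.0],
)

# Access a field by name ...
features = data.X
# ... or by position ...
labels = data[1]
# ... or unpack all three fields at once.
X, y, weights = data
```

Unpacking is convenient when the three parts are passed straight to an estimator, while access by name keeps longer scripts easier to read.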

Examples

>>> from aif360.sklearn.datasets import fetch_adult
>>> adult = fetch_adult()
>>> adult.X.shape
(45222, 13)
>>> adult_num = fetch_adult(numeric_only=True)
>>> adult_num.X.shape
(48842, 5)