Hello,
I am experimenting with fastai on tabular data and want to normalize/scale some columns, but I can’t get the transforms specified in procs to work on my dataset.
I tried reproducing the tutorial from the docs on Kaggle, but the transforms aren’t applied to the data there either.
Here is the code if you want to reproduce it at home:
from fastai.tabular.all import *
path = untar_data(URLs.ADULT_SAMPLE)
path.ls()
df = pd.read_csv(path/'adult.csv')
splits = RandomSplitter(valid_pct=0.2)(range_of(df))
to = TabularPandas(df, procs=[Categorify, FillMissing, Normalize],
                   cat_names=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race'],
                   cont_names=['age', 'fnlwgt', 'education-num'],
                   y_names='salary',
                   splits=splits)
dls = to.dataloaders(bs=64)
dls.show_batch()
dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names="salary",
                                  cat_names=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race'],
                                  cont_names=['age', 'fnlwgt', 'education-num'],
                                  procs=[Categorify, FillMissing, Normalize])
dls.show_batch()
| | workclass | education | marital-status | occupation | relationship | race | education-num_na | age | fnlwgt | education-num | salary |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | Private | HS-grad | Divorced | Priv-house-serv | Unmarried | Black | False | 62.000000 | 272209.000628 | 9.0 | <50k |
1 | Local-gov | HS-grad | Married-civ-spouse | Transport-moving | Husband | White | False | 49.000000 | 67001.004943 | 9.0 | <50k |
2 | Private | Bachelors | Divorced | Prof-specialty | Unmarried | White | False | 35.000000 | 225749.998765 | 13.0 | <50k |
3 | Local-gov | HS-grad | Divorced | Adm-clerical | Unmarried | White | False | 37.000000 | 272471.002093 | 9.0 | <50k |
4 | ? | HS-grad | Divorced | ? | Not-in-family | White | False | 46.000000 | 37671.997994 | 9.0 | <50k |
5 | State-gov | HS-grad | Never-married | Sales | Unmarried | Black | False | 30.000000 | 312767.002367 | 9.0 | <50k |
6 | Private | HS-grad | Divorced | Other-service | Not-in-family | White | False | 63.000001 | 174826.000719 | 9.0 | <50k |
7 | Federal-gov | Some-college | Married-civ-spouse | Adm-clerical | Husband | White | False | 61.000000 | 229062.000296 | 10.0 | >=50k |
8 | State-gov | Assoc-voc | Divorced | Tech-support | Unmarried | White | False | 45.000000 | 37671.997994 | 11.0 | <50k |
9 | Private | HS-grad | Married-civ-spouse | Adm-clerical | Wife | White | False | 20.000000 | 221661.000241 | 9.0 | <50k |
| | workclass | education | marital-status | occupation | relationship | race | education-num_na | age | fnlwgt | education-num | salary |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | Private | Some-college | Married-civ-spouse | Adm-clerical | Wife | White | False | 36.000000 | 461336.999879 | 10.0 | <50k |
1 | Federal-gov | Some-college | Married-civ-spouse | Tech-support | Husband | White | False | 29.000000 | 309777.997350 | 10.0 | >=50k |
2 | Self-emp-not-inc | Some-college | Divorced | Craft-repair | Not-in-family | White | False | 58.000001 | 195835.000087 | 10.0 | <50k |
3 | Private | HS-grad | Never-married | Craft-repair | Not-in-family | White | False | 39.000000 | 248693.996852 | 9.0 | <50k |
4 | Private | Doctorate | Married-civ-spouse | #na# | Husband | White | False | 64.000000 | 47298.001898 | 16.0 | >=50k |
5 | Local-gov | 11th | Married-civ-spouse | Craft-repair | Husband | Black | False | 64.000000 | 244903.001857 | 7.0 | >=50k |
6 | Private | Some-college | Divorced | Exec-managerial | Not-in-family | White | False | 35.000000 | 187119.000058 | 10.0 | <50k |
7 | Private | Assoc-acdm | Never-married | Other-service | Own-child | White | False | 25.000000 | 82559.995684 | 12.0 | <50k |
8 | Private | Assoc-acdm | Divorced | Prof-specialty | Unmarried | White | False | 42.000000 | 341204.005738 | 12.0 | >=50k |
9 | ? | HS-grad | Never-married | ? | Not-in-family | White | False | 40.000000 | 84232.001317 | 9.0 | <50k |
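
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: a display step that decodes a batch before printing it would show the original scale even when the stored values are normalized. Values such as 272209.000628 and 63.000001 look like integers that went through a normalize-then-denormalize round trip. The sketch below is plain Python, not fastai's implementation; `fit_normalizer`, `encode` and `decode` are hypothetical helper names used only for illustration.

```python
from statistics import mean, pstdev

def fit_normalizer(values):
    """Compute the mean and (population) standard deviation of a column."""
    return mean(values), pstdev(values)

def encode(values, mu, sigma):
    """Normalize: what a model would receive as input."""
    return [(v - mu) / sigma for v in values]

def decode(values, mu, sigma):
    """Undo the normalization: what a human-readable display might show."""
    return [v * sigma + mu for v in values]

# A few 'age' values taken from the first table above.
ages = [62.0, 49.0, 35.0, 37.0, 46.0]

mu, sigma = fit_normalizer(ages)
normalized = encode(ages, mu, sigma)      # centered on 0, unit spread
displayed = decode(normalized, mu, sigma) # back on the original scale

print("normalized:", normalized)
print("displayed: ", displayed)
```

If this is what is happening here, looking at the processed data directly instead of the `show_batch()` display, for example `to.train.xs.head()` or `dls.one_batch()`, should show whether `Normalize` was actually applied.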