I decided to branch this conversation out from another thread.
Basically, I have a regression problem to solve. This is how I set up my data and my model:
procs = [Categorify]

data = TabularDataBunch.from_df(path=path,
                                df=df,
                                dep_var=dep_var,
                                valid_idx=valid_idx,
                                procs=procs,
                                cat_names=cat_col,
                                cont_names=con_col,
                                c=1)

# Scale the observed target extremes by 20% when setting y_range.
range_scale = 1.2
y_range = (
    float(df.iloc[train_idx].Vfinal_min_Vf50.min() * range_scale),
    float(df.iloc[train_idx].Vfinal_min_Vf50.max() * range_scale)
)

def rmse(pred, targ):
    "RMSE between `pred` and `targ`."
    return torch.sqrt(((targ - pred)**2).mean())

emb_szs = {'weekday': 3}
learn = get_tabular_learner(data,
                            layers=[200, 100],
                            emb_szs=emb_szs,
                            y_range=y_range) #, metrics=rmse)
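For what it's worth, the metric itself behaves as expected on plain tensors:

import torch
pred = torch.tensor([1.0, 2.0, 3.0])
targ = torch.tensor([1.5, 2.0, 2.5])
rmse(pred, targ)  # tensor(0.4082)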
I cannot use my own metric (RMSE): it fails with an error indicating that fastai is treating this as a classification problem (i.e. it thinks my validation set has 96 levels while the model predicts 89):
RuntimeError: The size of tensor a (96) must match the size of tensor b (89) at non-singleton dimension 1
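A quick way to see that the task was mislabelled (a sketch; data.c is the attribute fastai uses to size the final layer, so given the model below I would expect it to report 89 here rather than 1):

print(data.c)  # expected 1 for regression; instead, one output per distinct target level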
Moreover, when I run learn.layer_groups, I get the following output:
[Sequential(
(0): Embedding(6, 3)
(1): Dropout(p=0.0)
(2): BatchNorm1d(7, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Linear(in_features=10, out_features=200, bias=True)
(4): ReLU(inplace)
(5): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Linear(in_features=200, out_features=100, bias=True)
(7): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): Linear(in_features=100, out_features=89, bias=True)
)]
Notice that the last layer outputs 89 features, while I only need one: a single real number.
FastAI version 1.0.22 had a “hidden” parameter c one could pass to TabularDataBunch, as discussed in the parent thread, but in the current version 1.0.24 that parameter has no effect on the kind of network being constructed.
Update
I noticed that kwargs do not get passed on to the data created by TabularDataBunch.from_df (see the code). Instead, you need to set c manually after the data is created, e.g.
data = TabularDataBunch.from_df(path=path,
                                df=df,
                                dep_var=dep_var,
                                valid_idx=valid_idx,
                                procs=procs,
                                cat_names=cat_names,
                                cont_names=cont_names)
data.c = 1
...
learn = get_tabular_learner(data,
                            layers=[200, 100],
                            emb_szs=emb_szs,
                            y_range=y_range)
The resulting network looks like it may be capable of regression, with only one output in the last layer:
[Sequential(
(0): Embedding(6, 3)
(1): Dropout(p=0.0)
(2): BatchNorm1d(7, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Linear(in_features=10, out_features=200, bias=True)
(4): ReLU(inplace)
(5): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Linear(in_features=200, out_features=100, bias=True)
(7): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): Linear(in_features=100, out_features=1, bias=True)
)]
Unfortunately, that breaks something down the line when I try to train the model with learn.fit_one_cycle(1, 1e-2). I believe fastai still thinks I am training a classification task, this time with a single output (i.e. 1 or 0), and it trips on the fact that my targets lie outside the [0, 1] range:
RuntimeError Traceback (most recent call last)
<ipython-input-20-3ea49add0339> in <module>
----> 1 learn.fit_one_cycle(1, 1e-2)
/opt/conda/lib/python3.6/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
18 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
19 pct_start=pct_start, **kwargs))
---> 20 learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
21
22 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):
/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
160 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
161 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162 callbacks=self.callbacks+callbacks)
163
164 def create_opt(self, lr:Floats, wd:Floats=0.)->None:
/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
92 except Exception as e:
93 exception = e
---> 94 raise e
95 finally: cb_handler.on_train_end(exception)
96
/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
82 for xb,yb in progress_bar(data.train_dl, parent=pbar):
83 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 84 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)
85 if cb_handler.on_batch_end(loss): break
86
/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
23
24 if opt is not None:
---> 25 loss = cb_handler.on_backward_begin(loss)
26 loss.backward()
27 cb_handler.on_backward_end()
/opt/conda/lib/python3.6/site-packages/fastai/callback.py in on_backward_begin(self, loss)
219 def on_backward_begin(self, loss:Tensor)->None:
220 "Handle gradient calculation on `loss`."
--> 221 self.smoothener.add_value(loss.detach().cpu())
222 self.state_dict['last_loss'], self.state_dict['smooth_loss'] = loss, self.smoothener.smooth
223 for cb in self.callbacks:
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/generic/THCTensorCopy.cpp:70
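One thing I have not tried systematically yet is forcing a regression loss explicitly. A minimal sketch, assuming the learner's loss function can simply be reassigned after construction:

import torch.nn.functional as F

# Swap out the cross-entropy loss fastai picked for a plain MSE loss.
# The model outputs shape (bs, 1) while the targets come in as (bs,),
# so flatten the predictions before comparing.
def mse_flat(pred, targ):
    return F.mse_loss(pred.view(-1), targ.float())

learn.loss_func = mse_flat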
Any suggestions?
Update 2
I believe one of the culprits is the choice of cross-entropy as the loss function. The switch occurs within data_block.py, inside the label_cls function. My fastai code base did not have the if isinstance(it, (float, np.float32)): return FloatList line; when I add it, another error pops up, so the solution is not there yet.
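If label_cls really dispatches on the type of the first label value, a workaround worth trying (a sketch, not yet verified against the error above) is to make sure the dependent variable is a float dtype before the DataBunch is built, so that the FloatList branch is taken:

import numpy as np

# Cast the target column to float so label_cls resolves it to FloatList
# (regression) rather than CategoryList (classification).
df[dep_var] = df[dep_var].astype(np.float32)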