I’m having trouble finding a learning rate (LR) and running a fit on a tabular dataset containing only continuous variables.
Here is what the data looks like:
I’m trying to predict the “peak” value from the following continuous variables: “ml”, “depth”, “dist”, and “azmith”. However, when I try to plot the learning rate finder, I get an empty plot:
Sometimes I run exactly the same code and get an actual plot, though! In any case, no matter what I do, if I try to fit using any learning rate, for example:
I get the following error:
Any help would be appreciated, thanks!
At first glance it looks like you are missing the y_range option in your tabular_learner.
Without it, the learner will see this as a logistic regression problem instead of a linear regression problem, and hence your metric is confused.
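If it helps to see what y_range actually does: fastai squashes the model’s raw output through a sigmoid rescaled to the given range, so predictions always land inside it. A minimal pure-Python sketch (not fastai’s actual code, and the numbers here are made up):

```python
import math

def scale_to_y_range(raw_output, y_low, y_high):
    """Sketch of the y_range idea: squash a raw activation through a
    sigmoid, then rescale it so the result lies in (y_low, y_high)."""
    sigmoid = 1 / (1 + math.exp(-raw_output))
    return (y_high - y_low) * sigmoid + y_low

# A large raw activation still lands just below the upper bound:
scale_to_y_range(10.0, 0.0, 100.0)   # close to 100
scale_to_y_range(-10.0, 0.0, 100.0)  # close to 0
```

So a sensible y_range for your data would be roughly (0, max_peak * 1.2) or similar, wide enough to cover the targets.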
Thanks very much for your quick reply.
I was working off the Lesson 4 example, which did not use y_range. I can now use it to get a consistent output (albeit not a very good one, but still).
It also seems the latter float error has to do with non-log metrics.
As of right now there are way too many parameters that I don’t understand. I’ll have to put this problem down for the moment, go through Lesson 6 and the Rossmann example to understand a bit better what’s going on, and then come back to this.
I had a follow-up question, actually. Even with a proper y_range parameter, I keep getting the same runtime error as above if I use metrics=accuracy. The only way I can avoid it is to use metrics=exp_rmspe. Do you have any suggestions as to why this may be?
You are using linear regression instead of logistic regression.
Check the documentation for what accuracy means:
As you can see, it expects bs x n_classes as input. Your model outputs a float, so accuracy instead receives bs x 1 as input.
Accuracy is a classification metric.
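To make the shape mismatch concrete, here is a rough pure-Python sketch of what a classification accuracy metric does (this is an illustration, not fastai’s implementation):

```python
def accuracy_sketch(preds, targs):
    """Each row of `preds` is a vector of per-class scores (bs x n_classes);
    argmax picks the predicted class, which is compared to the target."""
    hits = 0
    for row, targ in zip(preds, targs):
        pred_class = max(range(len(row)), key=lambda i: row[i])  # argmax
        hits += (pred_class == targ)
    return hits / len(targs)

# Classification: bs x n_classes scores, integer class targets -> fine.
accuracy_sketch([[0.1, 0.9], [0.8, 0.2]], [1, 0])  # 1.0

# Regression: a bs x 1 float output has only one "class" to argmax over,
# so the predicted class is always 0, and comparing it to a float target
# is meaningless.
accuracy_sketch([[3.7], [12.1]], [3.5, 12.0])  # 0.0, nonsense
```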
Thanks for your response, that is very helpful!
You can write a custom accuracy metric yourself that typecasts the target value to long. It will run fine, but it is not a correct metric to use for this kind of problem, since accuracy is a classification metric.
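A minimal pure-Python illustration of that workaround (a hypothetical helper, not fastai code): casting the float target to an integer makes the comparison run without error, but the resulting number is meaningless for a regression problem.

```python
def long_accuracy_sketch(preds, targs):
    """Workaround sketch: truncate float targets to int ("long") so an
    accuracy-style comparison runs. It executes, but the result tells
    you nothing useful about regression quality."""
    hits = sum(round(p) == int(t) for p, t in zip(preds, targs))
    return hits / len(targs)

long_accuracy_sketch([3.0, 5.0], [3.2, 7.9])  # runs, but the value is not meaningful
```

For regression you would normally report something like RMSE or exp_rmspe instead.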
In addition, I am not quite sure how the learner knows that @FCFC is trying to do regression rather than classification.
Here is what I mean by that.
I haven’t worked much with the convenience function TabularDataBunch() and the like, but more with the data block API. In order to build a (tabular) data bunch for regression, I specify that the output is FloatList. Namely, adapting from this fastai tutorial, I use something like:
data = (TabularList.from_df(df, path=adult, cat_names=cat_names, cont_names=cont_names, procs=procs)
        .split_by_rand_pct()
        .label_from_df(cols=dep_var, label_cls=FloatList)
        .databunch())
Note the extra label_cls=FloatList argument to label_from_df.
I don’t see anything like that in FCFC’s code, and I don’t see why playing around with y_range in the learner would help with this, as the learner comes after creating the data bunch.
Here is a (timestamped) link to a fastai video where Jeremy explains this.