Fastai v1 questions


(Fred Guth) #1

I am working with v1 and I have a few questions (not issues) and couldn’t find another thread well suited for them. Here they go:

  1. Can’t find *_save_name in fit. What is the suggested way to save models now? Create a callback? Is there a callback like the old SaveBestModel?

  2. I noticed that there is still a TrainingPhase scheduler. How can one use it? I haven’t found it in the docs.

  3. recorder.plot is the new sched.plot, right? Is there a way to make it show the axis names, and a legend for plot_losses?


#2

The SaveBestModel callback doesn’t exist yet. If you implement one, don’t hesitate to suggest it in a PR (in a notebook).
The TrainingPhase object goes with GeneralScheduler, both of them are documented here with an example.
recorder.plot does indeed replace sched.plot. If you want to customize your plot, you should just copy-paste its code and add whatever you want (the losses and lrs will be in learn.recorder.losses and learn.recorder.lrs respectively).
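As a rough illustration of that kind of customization, here is a minimal sketch that plots loss against learning rate with named axes and a legend. It uses plain matplotlib, not the library's own plotting code; the function name `plot_lr_vs_loss` and the sample values are made up for the example, and it assumes the recorded values can be converted with `float()`.

```python
import matplotlib.pyplot as plt

def plot_lr_vs_loss(lrs, losses, skip_start=0, skip_end=0):
    """Plot loss against learning rate with named axes and a legend."""
    # Convert to plain floats so Python numbers and scalar tensors both work.
    lrs = [float(lr) for lr in lrs]
    losses = [float(loss) for loss in losses]
    # Optionally drop points at the start and end of the run.
    end = len(lrs) - skip_end
    lrs, losses = lrs[skip_start:end], losses[skip_start:end]

    fig, ax = plt.subplots()
    ax.plot(lrs, losses, label="train loss")
    ax.set_xscale("log")            # learning rates usually span several decades
    ax.set_xlabel("Learning rate")
    ax.set_ylabel("Loss")
    ax.legend()
    return fig, ax

# Stand-in values; in a notebook you would pass
# learn.recorder.lrs and learn.recorder.losses instead.
example_lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
example_losses = [2.3, 2.1, 1.4, 0.9, 3.0]
fig, ax = plot_lr_vs_loss(example_lrs, example_losses)
```

Returning `fig, ax` leaves room for further tweaks (titles, extra curves) without touching the helper.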

Hope that helps!


(Fred Guth) #3

Thanks


(Jeremy Howard) #4

Maybe not a PR with a notebook, but a forum post linking to a notebook would be better? Since we’re not actually wanting to merge notebooks into the repo, right?


#5

You’re right, a forum post linking to a notebook is best.


(Pranjal Yadav) #6

I’m not sure where to put this question so putting it here.

I’m trying to replicate the https://github.com/fastai/fastai_old/blob/master/dev_nb/006_carvana.ipynb notebook for the TSG challenge, and the only problem I’m facing is with the learner.

When I do

learn = ConvLearner(data, tvm.resnet34, 2, custom_head=head,
metrics=metrics, loss_fn=CrossEntropyFlat())

I get

/usr/local/lib/python3.6/dist-packages/fastai/vision/learner.py in _resnet_split(m)
33 def _default_split(m:Model): return (m[1],)
34 # Split a resnet style model
---> 35 def _resnet_split(m:Model): return (m[0][6],m[1])
36
37 _default_meta = {'cut':-1, 'split':_default_split}

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in __getitem__(self, idx)
66 return self.__class__(OrderedDict(list(self._modules.items())[idx]))
67 else:
---> 68 return self._get_item_by_idx(self._modules.values(), idx)
69
70 def __setitem__(self, idx, module):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in _get_item_by_idx(self, iterator, idx)
58 idx = operator.index(idx)
59 if not -size <= idx < size:
---> 60 raise IndexError('index {} is out of range'.format(idx))
61 idx %= size
62 return next(islice(iterator, idx, None))

IndexError: index 6 is out of range

It seems odd because the implementation in the learner is straightforward, and my data source and head are fine. Can anyone point me in the right direction?


(Ramon) #7

A question about tfm_y and class DataSetTfm

Am I correct that apply_tfms uses random values to generate a new augmentation?
If so, the random values applied to x differ from those applied to y. When y holds bounding boxes, that means y gets a different transformation than the image.

    def __getitem__(self,idx:int)->Tuple[ItemBase,Any]:
        "Return tfms(x),y."
        x,y = self.ds[idx]
        x = apply_tfms(self.tfms, x, **self.kwargs)
        if self.tfm_y: y = apply_tfms(self.tfms, y, **self.y_kwargs)
        return x, y

#8

Nope, in the y_kwargs we added the magic line do_resolve=False which means the resolved arguments (everything random) are kept as is.
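
To make the idea concrete, here is a toy sketch of "resolve once, reuse for y". It is not the library's implementation: `RandomShift` and `apply_tfms_sketch` are hypothetical names, and the only thing taken from the post is the `do_resolve` flag and what it controls.

```python
import random

class RandomShift:
    """Hypothetical transform: shifts every number in a list by a random offset."""
    def __init__(self, max_shift=10):
        self.max_shift = max_shift
        self.resolved = None  # offset drawn by the most recent resolve()

    def resolve(self):
        # Draw the random parameter once and remember it.
        self.resolved = random.randint(-self.max_shift, self.max_shift)

    def __call__(self, values):
        return [v + self.resolved for v in values]

def apply_tfms_sketch(tfms, item, do_resolve=True):
    """Apply each transform; re-draw random parameters only if do_resolve is True."""
    for tfm in tfms:
        if do_resolve:
            tfm.resolve()
        item = tfm(item)
    return item

tfms = [RandomShift()]
x = [0, 100]   # stand-in for an image
y = [0, 100]   # stand-in for its target, e.g. box coordinates
new_x = apply_tfms_sketch(tfms, x)                    # draws a fresh offset
new_y = apply_tfms_sketch(tfms, y, do_resolve=False)  # reuses that same offset
```

Because the second call skips `resolve()`, `new_x` and `new_y` are shifted by the same amount, which is the behaviour needed for targets that must stay aligned with the image.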


#9

What do you want to do with the argument 2? It’s where the pretrained model is cut, so currently you’re cutting at the second layer of the model (instead of -2) which gives you this error. It’s best to leave it blank and let the library figure out for you where to cut in general.
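
A small sketch of why 2 and -2 behave so differently, using a plain Python list in place of the model. It assumes the body is built by slicing the list of top-level children up to the cut index; `make_body` and `split_point` are illustrative names, and the child names are those of torchvision's resnet34.

```python
# Top-level children of torchvision's resnet34, in order.
resnet_children = ["conv1", "bn1", "relu", "maxpool",
                   "layer1", "layer2", "layer3", "layer4",
                   "avgpool", "fc"]

def make_body(children, cut):
    """Keep the children before index `cut` (assumed slicing behaviour)."""
    return children[:cut]

def split_point(body):
    """Mimic the m[0][6] lookup shown in the traceback."""
    return body[6]

body_pos = make_body(resnet_children, 2)    # only conv1 and bn1 survive
body_neg = make_body(resnet_children, -2)   # everything except avgpool and fc

try:
    split_point(body_pos)
    failed = False
except IndexError:
    failed = True   # two layers only, so index 6 is out of range
```

With the cut at -2 the body keeps eight children and index 6 is valid; with the cut at 2 it keeps two, which matches the IndexError reported above.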


(Ramon) #10

Cool, thanks


(Fred Guth) #11

I have created 3 callbacks TerminateOnNaN, EarlyStopping and SaveModel, they are in this notebook:

They are very similar to the Keras callbacks of the same name.
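
For readers who have not used the Keras versions, the logic behind the three callbacks can be sketched in framework-free Python. This is not the code from the notebook and not a fastai API: `TrainingMonitor`, `save_fn`, `patience` and `min_delta` are hypothetical names chosen for the illustration.

```python
import math

class TrainingMonitor:
    """Hypothetical helper combining save-best, early stopping and a NaN check."""
    def __init__(self, save_fn, patience=3, min_delta=0.0):
        self.save_fn = save_fn        # called with the epoch number on improvement
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum drop that counts as improvement
        self.best = math.inf
        self.wait = 0

    def on_epoch_end(self, epoch, val_loss):
        """Return True when training should stop."""
        if math.isnan(val_loss):
            return True               # terminate on NaN
        if val_loss < self.best - self.min_delta:
            self.best, self.wait = val_loss, 0
            self.save_fn(epoch)       # new best: save the model
            return False
        self.wait += 1
        return self.wait >= self.patience

# Simulated validation losses standing in for a real training loop.
saved_epochs = []
monitor = TrainingMonitor(save_fn=saved_epochs.append, patience=2)
val_losses = [0.9, 0.7, 0.8, 0.75, 0.6]
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if monitor.on_epoch_end(epoch, loss):
        stopped_at = epoch
        break
```

In a real setup `save_fn` would write the model weights to disk; here it only records which epochs would have been saved.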


(Pranjal Yadav) #12

I interpreted the argument incorrectly: I thought it meant the 2nd layer from the end, which is equivalent to -2. I’m leaving it blank as you suggested. Thanks!


(Fred Guth) #13

I posted the notebook and I could send a PR with the py code, but I didn’t understand if I should send a PR or not.


#14

The notebook looks great, I’ll get working on it to merge the content with the library.
The idea is that we (Jeremy or I) would like to handle the incorporation of big contributions into the library ourselves, to make sure they take advantage of everything that is already there and follow the same coding style, since we often find we have to rewrite a lot of things in PRs.
The downside is that you won’t get your name on a commit so we’re also creating a file (probably called CHANGES.md) where we will cite all those contributions (yours would be the first there) and link to the original notebooks where those were introduced (like the one you did).

This is an experimental process, so please don’t hesitate to give us any feedback.


(Jeremy Howard) #15

Also, it’s extremely helpful if notebooks can include a few cells containing assert statements that check that things are working correctly. That way we have tests we can immediately add to the test suite. I updated CONTRIBUTING.md yesterday to mention the need for tests.


(Zubair Shah) #16

Hi,
Is there a way to set the batch size in the new library? I am training a language model and it keeps throwing a memory error after running for a few percent. I think reducing bs might solve this problem.


#17

Whenever you call a method to get a DataBunch, you can pass bs = ... to set the batch size (default is 64).


(Fred Guth) #18

Would it be possible to use the information on the hardware to set the best batch size instead of using a 64 default? Similar to what is done with n_workers. If we can get the size of GPU RAM and know how much data the model uses, we could find a more suitable default, no?


#19

The data object doesn’t know the model though.
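
A back-of-the-envelope sketch shows where the model comes in. It assumes memory use is roughly a fixed cost (weights, optimizer state) plus a per-sample cost that scales with batch size; `estimate_batch_size` and all the numbers are hypothetical, and the per-sample figure is exactly the model-dependent quantity the data object cannot supply.

```python
def estimate_batch_size(gpu_bytes, fixed_bytes, per_sample_bytes, headroom=0.9):
    """Largest power-of-two batch size that fits the memory budget (rough estimate)."""
    # Keep some headroom, then subtract the cost that does not depend on batch size.
    budget = gpu_bytes * headroom - fixed_bytes
    if per_sample_bytes <= 0 or budget < per_sample_bytes:
        return 0
    max_samples = int(budget // per_sample_bytes)
    # Round down to a power of two.
    return 1 << (max_samples.bit_length() - 1)

GIB = 1024 ** 3
MIB = 1024 ** 2
# Made-up figures: 8 GiB card, 1 GiB fixed cost, 100 MiB per sample.
bs = estimate_batch_size(gpu_bytes=8 * GIB, fixed_bytes=1 * GIB,
                         per_sample_bytes=100 * MIB)
```

With these figures the budget allows 63 samples, which rounds down to a batch size of 32. The estimate is only as good as the per-sample number, which has to be measured with the actual model.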


(Fred Guth) #20

:-/ Ok.