Fastai v1 questions

fredguth · October 11, 2018, 2:31pm

I am working with v1 and I have a few questions (not issues) and couldn’t find another thread well suited for them. Here they go:

I can’t find *_save_name in fit. What is the suggested way to save models now? Create a callback? Is there a callback like the old SaveBestModel?
I noticed that there is still a TrainingPhase scheduler. How can one use it? I haven’t found it in the docs.
recorder.plot is the new sched.plot, right? Is there a way to make it plot the axis names? And a legend for plot_losses?

sgugger · October 11, 2018, 2:47pm

The SaveBestModel callback doesn’t exist yet. If you implement one, don’t hesitate to suggest it in a PR (in a notebook).
The TrainingPhase object goes with GeneralScheduler, both of them are documented here with an example.
recorder.plot does indeed replace sched.plot. If you want to customize your plot, you should just copy-paste its code and add whatever you want (the losses and lrs will be in learn.recorder.losses and learn.recorder.lrs respectively).

Hope that helps!
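
A minimal sketch of that copy-and-customize approach, assuming the two recorder attributes behave like plain sequences of floats. The function names `plot_lr_vs_loss` and `demo` are hypothetical and not part of fastai, and the numbers in `demo` are made up for illustration.

```python
def plot_lr_vs_loss(ax, lrs, losses, label="train loss"):
    """Draw loss against learning rate on a matplotlib Axes, with axis names and a legend."""
    if len(lrs) != len(losses):
        raise ValueError("lrs and losses must have the same length")
    ax.plot(lrs, losses, label=label)
    ax.set_xscale("log")            # learning rates usually span several orders of magnitude
    ax.set_xlabel("learning rate")
    ax.set_ylabel("loss")
    ax.legend()
    return ax


def demo():
    """Show the plot with made-up values (requires matplotlib)."""
    import matplotlib.pyplot as plt

    # Stand-ins for learn.recorder.lrs and learn.recorder.losses.
    lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
    losses = [2.3, 2.1, 1.4, 0.9, 3.0]
    fig, ax = plt.subplots()
    plot_lr_vs_loss(ax, lrs, losses)
    plt.show()
```

Passing the Axes in explicitly keeps the helper reusable, for example to draw training and validation curves side by side.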

fredguth · October 11, 2018, 2:56pm

Thanks

jeremy · October 11, 2018, 4:31pm

Maybe not a PR with a notebook, but a forum post linking to a notebook would be better? Since we’re not actually wanting to merge notebooks in to the repo, right?

sgugger · October 11, 2018, 4:56pm

You’re right, a forum post linking to a notebook is best.

PranY · October 12, 2018, 4:35pm

I’m not sure where to put this question so putting it here.

I’m trying to replicate the https://github.com/fastai/fastai_old/blob/master/dev_nb/006_carvana.ipynb notebook for the TSG challenge, and the only problem I’m facing is with the learner.

When I do

learn = ConvLearner(data, tvm.resnet34, 2, custom_head=head,
metrics=metrics, loss_fn=CrossEntropyFlat())

I get

/usr/local/lib/python3.6/dist-packages/fastai/vision/learner.py in _resnet_split(m)
33 def _default_split(m:Model): return (m[1],)
34 # Split a resnet style model
---> 35 def _resnet_split(m:Model): return (m[0][6],m[1])
36
37 _default_meta = {'cut':-1, 'split':_default_split}

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in __getitem__(self, idx)
66 return self.__class__(OrderedDict(list(self._modules.items())[idx]))
67 else:
---> 68 return self._get_item_by_idx(self._modules.values(), idx)
69
70 def __setitem__(self, idx, module):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in _get_item_by_idx(self, iterator, idx)
58 idx = operator.index(idx)
59 if not -size <= idx < size:
---> 60 raise IndexError('index {} is out of range'.format(idx))
61 idx %= size
62 return next(islice(iterator, idx, None))

IndexError: index 6 is out of range

It seems odd because the implementation in learner is straightforward, and my data source and head are perfectly fine. Can anyone point me in the right direction?

ramon · October 12, 2018, 6:56pm

A question about tfm_y and class DataSetTfm

Am I correct that apply_tfms uses random values to generate a new augmentation?
If so, the random values applied to x are different from the random values applied to y. In case y contains bounding boxes, a different transformation is applied to y than to x.

    def __getitem__(self,idx:int)->Tuple[ItemBase,Any]:
        "Return tfms(x),y."
        x,y = self.ds[idx]
        x = apply_tfms(self.tfms, x, **self.kwargs)
        if self.tfm_y: y = apply_tfms(self.tfms, y, **self.y_kwargs)
        return x, y

sgugger · October 12, 2018, 7:03pm

Nope, in the y_kwargs we added the magic line do_resolve=False which means the resolved arguments (everything random) are kept as is.
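
To illustrate the idea behind do_resolve=False, here is a toy sketch in plain Python. It is not fastai's implementation: `RandomShift` and `apply_all` are hypothetical stand-ins, and only the `do_resolve` flag name comes from the post above. The point is that random parameters are drawn once for x and then reused unchanged for y.

```python
import random


class RandomShift:
    """Hypothetical transform: shifts every value by a randomly drawn offset."""

    def __init__(self, max_shift=10):
        self.max_shift = max_shift
        self.offset = 0

    def resolve(self):
        # Draw fresh random parameters for the next application.
        self.offset = random.randint(-self.max_shift, self.max_shift)

    def __call__(self, values):
        return [v + self.offset for v in values]


def apply_all(tfms, item, do_resolve=True):
    """Apply each transform in turn, redrawing random parameters only if asked."""
    for tfm in tfms:
        if do_resolve:
            tfm.resolve()
        item = tfm(item)
    return item


tfms = [RandomShift()]
x = [0, 100]   # stand-in for image coordinates
y = [50, 60]   # stand-in for a bounding box edge

new_x = apply_all(tfms, x)                    # draws a random offset
new_y = apply_all(tfms, y, do_resolve=False)  # reuses that same offset

# Both inputs moved by exactly the same amount.
assert new_y[0] - y[0] == new_x[0] - x[0]
```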

sgugger · October 12, 2018, 7:06pm

What do you want to do with the argument 2? It’s where the pretrained model is cut, so currently you’re cutting at the second layer of the model (instead of -2), which gives you this error. In general, it’s best to leave it blank and let the library figure out where to cut.
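
The error can be reproduced with a plain list, assuming the body is built by slicing the model's top-level children up to the cut index. The list of names is a stand-in for the children of a torchvision ResNet, and `cut_body` and `split_point` are hypothetical helpers that mirror the `m[0][6]` lookup in the traceback.

```python
# Stand-in for the top-level children of a torchvision ResNet.
RESNET_CHILDREN = ["conv1", "bn1", "relu", "maxpool",
                   "layer1", "layer2", "layer3", "layer4",
                   "avgpool", "fc"]


def cut_body(children, cut):
    """Keep everything before index `cut` (assumed slicing behaviour)."""
    return children[:cut]


def split_point(body):
    """Mirror of the m[0][6] lookup done by the resnet split function."""
    return body[6]


body_right = cut_body(RESNET_CHILDREN, -2)  # drops avgpool and fc, keeps 8 children
body_wrong = cut_body(RESNET_CHILDREN, 2)   # keeps only conv1 and bn1

print(len(body_right), split_point(body_right))

try:
    split_point(body_wrong)
except IndexError as err:
    # Only two children remain, so index 6 does not exist.
    print("IndexError:", err)
```

With a cut of 2 the body holds only two children, so any later lookup at index 6 fails; with -2 the body keeps eight children and the lookup succeeds.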

ramon · October 12, 2018, 7:22pm

Cool, thanks

fredguth · October 13, 2018, 3:33am

I have created three callbacks: TerminateOnNaN, EarlyStopping and SaveModel. They are in this notebook:

https://github.com/fredguth/fastai_playground/blob/master/playground.ipynb

They are very similar to the Keras callbacks of the same name.
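
The logic shared by callbacks of this kind can be sketched without any framework. This is not the notebook's code: `BestMetricTracker`, its `on_epoch_end` hook and the `save_fn` parameter are hypothetical names, and the loss values are made up. The sketch saves whenever the monitored value improves, stops after `patience` epochs without improvement, and stops immediately on NaN.

```python
import math


class BestMetricTracker:
    """Hypothetical helper combining save-best, early stopping and NaN checks."""

    def __init__(self, save_fn, patience=3, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.save_fn = save_fn      # called with the epoch number on improvement
        self.patience = patience
        self.mode = mode
        self.best = math.inf if mode == "min" else -math.inf
        self.wait = 0               # epochs since the last improvement

    def _improved(self, value):
        return value < self.best if self.mode == "min" else value > self.best

    def on_epoch_end(self, epoch, value):
        """Return True when training should stop."""
        if math.isnan(value):
            return True             # terminate on NaN
        if self._improved(value):
            self.best = value
            self.wait = 0
            self.save_fn(epoch)     # save the best model so far
            return False
        self.wait += 1
        return self.wait >= self.patience


saved_epochs = []
tracker = BestMetricTracker(save_fn=saved_epochs.append, patience=3)

stopped_at = None
for epoch, val_loss in enumerate([0.9, 0.7, 0.8, 0.75, 0.72, 0.6]):
    if tracker.on_epoch_end(epoch, val_loss):
        stopped_at = epoch
        break

print(saved_epochs, tracker.best, stopped_at)
```

In a real training loop, `save_fn` would write the model weights to disk instead of appending to a list.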

PranY · October 13, 2018, 8:10am

I interpreted the argument incorrectly: I thought it meant the 2nd layer from the end, which is equivalent to -2. I’m leaving it blank as you suggested. Thanks!

fredguth · October 13, 2018, 5:33pm

I posted the notebook and could send a PR with the py code, but I’m not sure whether I should send a PR or not.

sgugger · October 13, 2018, 5:45pm

The notebook looks great, I’ll get working on it to merge the content with the library.
The idea is that we (Jeremy or I) would like to incorporate big contributions into the library ourselves, to make sure they take advantage of everything that is already there and follow the same coding style, since we often find we have to rewrite a lot of things in PRs.
The downside is that you won’t get your name on a commit, so we’re also creating a file (probably called CHANGES.md) where we will cite all those contributions (yours would be the first there) and link to the original notebooks where they were introduced (like the one you did).

This is an experimental process, so please don’t hesitate to give us any feedback.

jeremy · October 13, 2018, 6:15pm

Also, it’s extremely helpful if notebooks can include a few cells containing assert statements that check that things are working correctly. That way we have tests we can immediately add to the test suite. I updated CONTRIBUTING.md yesterday to mention the need for tests.
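
As a rough illustration of what such a cell might look like, the sketch below defines a tiny helper and checks it with bare asserts. The helper `is_bad_loss` is hypothetical and only stands in for whatever the notebook actually implements.

```python
import math


def is_bad_loss(loss):
    """Hypothetical helper: True if the loss is NaN or infinite."""
    return math.isnan(loss) or math.isinf(loss)


# Check cell: each assert fails loudly if the behaviour changes.
assert is_bad_loss(float("nan"))
assert is_bad_loss(float("inf"))
assert not is_bad_loss(0.25)
```

Cells like this can be copied into a test function almost unchanged.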

zubair1.shah · October 16, 2018, 11:56pm

Hi,
Is there a way to set the batch size in the new library? I am training a language model and it keeps throwing a memory error after running for a few percent. I think reducing bs might solve this problem.

sgugger · October 17, 2018, 2:35pm

Whenever you call a method to get a DataBunch, you can pass bs = ... to set the batch size (default is 64).

fredguth · October 17, 2018, 2:40pm

Would it be possible to use information about the hardware to set the best batch size instead of using a default of 64? Similar to what is done with n_workers. If we can get the size of the GPU RAM and know how much memory the model uses, we could find a more suitable default, no?

sgugger · October 17, 2018, 2:49pm

The data object doesn’t know the model though.

fredguth · October 17, 2018, 5:47pm

:-/ Ok.