Fastai v2 chat

For the major changes I keep an eye on notebook 50 with the DataBlock examples; otherwise, yeah, I keep track of the commits or look out for the specific things I want to follow.

2 Likes

OK so I kind of have a feeling of what is going on with the AWD_QRNN classifier:

In SentenceEncoder there is a line that I believe removes the padding from the input before passing it to your model:

for i in range(0, sl, self.bptt):
    #Note: this expects that sequence really begins on a round multiple of bptt
    real_bs = (input[:,i] != self.pad_idx).long().sum()             # Find the count of non-padded items
    r,o = self.module(input[:real_bs,i: min(i+self.bptt, sl)])     # Pass to model

For example, from the logs below you can see that the full batch has shape (8, 850), but it then gets chopped up into chunks of various shapes (like (3, 72), (4, 72) or (5, 72)) according to real_bs. It is this chopped-up input that gets passed to your model.

real_bs : 3, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([3, 72])

real_bs : 4, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([4, 72])

real_bs : 5, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([5, 72])
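
To make that slicing concrete, here is a tiny standalone sketch with dummy shapes matching the logs (not the actual fastai code, just the same indexing):

import torch

# Toy illustration of the SentenceEncoder slicing above (assumed shapes)
bs, sl, bptt, pad_idx = 8, 850, 72, 1
input = torch.full((bs, sl), pad_idx, dtype=torch.long)
input[:3] = 2                                            # pretend only the first 3 sequences have real tokens here

i = 0
real_bs = (input[:, i] != pad_idx).long().sum()          # -> tensor(3)
chunk = input[:real_bs, i: min(i + bptt, sl)]            # -> shape (3, 72), what the model actually sees
print(real_bs.item(), chunk.shape)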

AWD_QRNN Classifier
The above works fine for the awd_qrnn language model; however, introducing the SortedDL dataloader and before_batch=pad_input seems to cause issues, specifically with the _get_source function in the QRNNLayer class.

The function concatenates the previous input (prevX) with the current input. However, because real_bs can change between calls, the tensor shapes of prevX and the current input can become misaligned.

def forward(self, inp, hid=None):
    y = self.linear(self._get_source(inp))      # <-- _get_source() function called
    ...

def _get_source(self, inp):
    if self.window == 1: return inp
    dim = (1 if self.batch_first else 0)
    inp_shift = [torch.zeros_like(inp[:,:1] if self.batch_first else inp[:1]) if self.prevX is None else self.prevX]
    if self.backward: inp_shift.insert(0, inp[:,1:] if self.batch_first else inp[1:])
    else:             inp_shift.append(inp[:,:-1] if self.batch_first else inp[:-1])
    inp_shift = torch.cat(inp_shift, dim)        # <-- This is where the QRNN classifier fails
    return torch.cat([inp, inp_shift], 2)

An example of the size discrepancy can be seen in the logs below:

real_bs : 1, input size : torch.Size([64, 2055]) , pad_idx : 1      <-- real_bs = 1
AWD_QRNN inp size: torch.Size([1, 72])
QRNN Module input size : torch.Size([1, 72, 50])

input size: torch.Size([1, 72, 50]), window: 2, batch_first: True, backward : False  # _get_source()
prevX size: torch.Size([1, 1, 50])
inp_shift size : torch.Size([1, 1, 50])
output size : torch.Size([1, 72, 100])       <--- Successful concat outputted from _get_source()

real_bs : 3, input size : torch.Size([64, 2055]) , pad_idx : 1       <-- real_bs changes
AWD_QRNN inp size: torch.Size([3, 72])
QRNN Module input size : torch.Size([3, 72, 50])

input size: torch.Size([3, 72, 50]), window: 2, batch_first: True, backward : False  # _get_source() 
prevX size: torch.Size([1, 1, 50])         <-- Will cause the error when concatted

I think that’s what’s happening, but I’m not 100% sure what the best way to resolve it is: whether some padding needs to be re-added in _get_source, or something else.
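
For reference, here’s a toy repro of that mismatch with the shapes copied from the logs above (plain PyTorch, nothing fastai-specific):

import torch

# prevX was saved when real_bs was 1; the next chunk arrives with real_bs = 3
prevX = torch.zeros(1, 1, 50)       # (bs=1, 1, emb)
inp   = torch.zeros(3, 72, 50)      # (bs=3, seq, emb)

inp_shift = [prevX, inp[:, :-1]]    # the batch_first=True, backward=False branch above
torch.cat(inp_shift, dim=1)         # RuntimeError: batch dims (1 vs 3) don't match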

@sgugger, @jeremy would you have any suggestions on how to resolve this? cc: @fmobrj75

Thanks for the great analysis. @sgugger has made a bunch of changes to padding recently, but hasn’t been testing the QRNN AFAIK, so I wouldn’t be surprised if things have cropped up.

We’re on a book deadline until Feb 10, so please ping us after that if we don’t follow up…

1 Like

Probably yes

1 Like

Quick question:

is there a fast way to get access to all my transformed y’s after an item transform (such as PointScaler), across an entire dataloader?

I’m currently trying this:
tfmd = [dls.after_item.point_scaler(x[1]) for x in dls.dataset]

If anybody is interested in dataloaders that can process partial epochs, here is my implementation, which randomly picks a subset of the data at each epoch.

The intention was to handle datasets such as ImageNet and get validation metrics more frequently.

#export
@delegates()
class PartialDL(TfmdDL):
    '''Randomly select a subset of the data at each epoch'''
    def __init__(self, dataset=None, bs=None, partial_n=None, **kwargs):        
        super().__init__(dataset=dataset, bs=bs, **kwargs)
        self.partial_n = min(partial_n, self.n) if partial_n else None

    def get_idxs(self):
        if self.partial_n is None: return super().get_idxs()
        return list(np.random.choice(self.n, self.partial_n, replace=False))
    
    def __len__(self):
        if self.partial_n is None: return super().__len__()
        return self.partial_n//self.bs + (0 if self.drop_last or self.partial_n%self.bs==0 else 1)


#export
@patch
@delegates(Datasets.dataloaders)
def partial_dataloaders(self:FilteredBase, partial_n, bs=64, **kwargs):
    xtra_kwargs = [{}] * (self.n_subsets-1)
    return self.dataloaders(bs=bs, dl_type=PartialDL, dl_kwargs=({'partial_n':partial_n}, *xtra_kwargs), **kwargs)
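
Usage would look roughly like this (hypothetical names, just to show the call):

# Hypothetical usage sketch: `dsets` is any Datasets/TfmdLists object you already have
dls = dsets.partial_dataloaders(partial_n=10_000, bs=64)
# each training epoch now draws 10,000 random items instead of the full dataset,
# so validation metrics come around much more often on something like ImageNet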

I could add an option to prevent re-drawing items until they have all been drawn, or allow a “non-shuffled” version, but I wanted to keep it simple for now. It works with both TfmdLists and Datasets.

Let me know if you are interested in a PR for it.

6 Likes

I noticed 21_vision.learner.ipynb does not execute fully anymore.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-bd945bd2ef1c> in <module>
----> 1 learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1))
      2 learn = unet_learner(dls, models.resnet34, pretrained=True, n_in=4)

<ipython-input-28-429ea9b9ca5e> in unet_learner(dls, arch, loss_func, pretrained, cut, splitter, config, n_in, n_out, **kwargs)
     11     model = models.unet.DynamicUnet(body, n_out, size, **config)
     12     learn = Learner(dls, model, loss_func=loss_func, splitter=ifnone(splitter, meta['split']), **kwargs)
---> 13     if pretrained: learn.freeze()
     14     return learn

~/Projects/fastai2/fastai2/learner.py in freeze(self)
    564 
    565 @patch
--> 566 def freeze(self:Learner): self.freeze_to(-1)
    567 
    568 @patch

~/Projects/fastai2/fastai2/learner.py in freeze_to(self, n)
    559 @patch
    560 def freeze_to(self:Learner, n):
--> 561     if self.opt is None: self.create_opt()
    562     self.opt.freeze_to(n)
    563     self.opt.clear_state()

~/Projects/fastai2/fastai2/learner.py in create_opt(self)
    233     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
    234     def create_opt(self):
--> 235         self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
    236         if not self.wd_bn_bias:
    237             for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-23-75cf9ffd62a9> in _resnet_split(m)
      1 #export
      2 def _xresnet_split(m): return L(m[0][:3], m[0][3:], m[1:]).map(params)
----> 3 def  _resnet_split(m): return L(m[0][:6], m[0][6:], m[1:]).map(params)
      4 def _squeezenet_split(m:nn.Module): return L(m[0][0][:5], m[0][0][5:], m[1:]).map(params)
      5 def _densenet_split(m:nn.Module): return L(m[0][0][:7],m[0][0][7:], m[1:]).map(params)

~/Projects/fastcore/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

~/Projects/fastcore/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

~/Projects/fastcore/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

~/Projects/fastcore/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

~/Projects/fastcore/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

~/Projects/fastcore/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

~/Projects/fastai2/fastai2/torch_core.py in params(m)
    496 def params(m):
    497     "Return all parameters of `m`"
--> 498     return [p for p in m.parameters()]
    499 
    500 # Cell

AttributeError: 'tuple' object has no attribute 'parameters'

It runs if we do pretrained=False but will fail at the same step when we do learn.fit(1).

Hi @sgugger. Tried to update to the latest version of fastai2 (v0.0.8) and am now receiving an error when using text_classifier_learner:

learn = text_classifier_learner(dbunch_fwd, 
                                AWD_LSTM, 
                                seq_len=72,
                                pretrained=False, 
                                config=config, 
                                metrics=[accuracy], 
                                path=path, 
                                drop_mult=0.7,
                                loss_func=CrossEntropyLossFlat()
                               ).to_fp16()

error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-30-a900cdbe143c> in <module>
      7                                 path=path,
      8                                 drop_mult=0.7,
----> 9                                 loss_func=CrossEntropyLossFlat()
     10                                ).to_fp16()
     11 

/media/hdd3tb/data/fastai2/fastai2/text/learner.py in text_classifier_learner(dls, arch, seq_len, config, pretrained, drop_mult, n_out, lin_ftrs, ps, max_len, **kwargs)
    173                                 drop_mult=drop_mult, lin_ftrs=lin_ftrs, ps=ps, max_len=max_len)
    174     meta = _model_meta[arch]
--> 175     learn = TextLearner(dls, model, splitter=meta['split_clas'], **kwargs)
    176     if pretrained:
    177         if 'url' not in meta:

/media/hdd3tb/data/fastai2/fastai2/text/learner.py in __init__(self, model, dls, loss_func, alpha, beta, moms, **kwargs)
     51     def __init__(self, model, dls, loss_func, alpha=2., beta=1., moms=(0.8,0.7,0.8), **kwargs):
     52         super().__init__(model, dls, loss_func, moms=moms, **kwargs)
---> 53         self.add_cbs([ModelReseter(), RNNTrainer(alpha=alpha, beta=beta)])
     54 
     55     def save_encoder(self, file):

NameError: name 'RNNTrainer' is not defined

Change of name wasn’t properly saved in the notebook, should be fixed now.

1 Like

Thanks!

How can I use nbdev to jump through the library? I tried doing:

from nbdev.showdoc import *
nb_source_link(Pipeline)

But I get the error:

AssertionError: Use `Config.create` to create a `Config` object the first time

I tried calling nbdev_build_lib inside my fastai clone but that did not help

@lgvaz see here: (and the following discussion mentioned)

1 Like

By the way, for those navigating the documentation who may find the search confusing, you can jump to the right page by first seeing what notebook it’s in:

DataBlock??

Which will give us fastai2/data/block.py
We can take this and put it directly into the URL like so:

dev.fast.ai/data.block.html
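
For another example (going by the tracebacks earlier in the thread), Learner lives in fastai2/learner.py, so that page should be dev.fast.ai/learner.html.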

1 Like

Looking at _resnet_split, I think the issue is with m[1:], as I get the same error if I do:

learn = unet_learner(dls, models.resnet34, pretrained=False)
learn.model[1:].parameters

The other sections of the model (m[0][:6] and m[0][6:]) can return parameters.
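
If it helps narrow things down, here is a quick diagnostic I’d try (hypothetical, it just mirrors the groups _resnet_split builds):

# Print the type of each group _resnet_split constructs; whichever shows up as a
# tuple is the one that makes params() fail with "'tuple' object has no attribute 'parameters'"
m = learn.model
for i, group in enumerate([m[0][:6], m[0][6:], m[1:]]):
    print(i, type(group))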

Let me know if anybody can confirm there is a bug and I can file an issue in the repo if necessary.

If I had to guess, does it work if you pass pretrained=True?

With pretrained=True the model does not load at all.
With pretrained=False it works, but the same error happens later when we do learn.fit(1).

With something like this, I’d put in an issue (since they are working on the book) so they know to get to it :slight_smile: (as I can’t see where that would go wrong). Are you using the most recent git version?

1 Like

Yes, I’m using the most recent git version and just tried to run the entire notebook 21.
I’ll just file an issue. I was just worried that I had done something wrong, as I thought all notebooks were run through CI before being merged.

1 Like

Dear Jeremy,
I hope you are ok.
I would like to attend the new course in March 2020.
How can I access it?

You can apply to join the course in SF here: https://www.usfca.edu/data-institute/certificates/deep-learning-part-one

(For online live streaming, you need an invite–they’ve already gone out, based on forum participation.)