Fastai v2 chat

I don’t think it would harm to add it if we want it in this specific case. It wouldn’t be called when you show since a TensorImage knows how to show itself, but would be called when you call decode.
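 
For example (a sketch of the behaviour, assuming a standard pipeline, not code from the notebooks):

dls.show_batch()        # TensorImage knows how to show itself, so decodes isn't needed
xb, yb = dls.one_batch()
dls.decode((xb, yb))    # runs every transform's decodes, including an added one like below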

Just in case it helps at all, this is what I ended up doing:

@IntToFloatTensor
def decodes(self, o:TensorImage):
    return o.clamp(0., 1.).mul_(self.div).byte()

The case of self.div=None is broken at the encoding part, so I wonder if it’s actually needed at all, as it defaults to 255.

Installation docs: ‘nbdev_’?
I just followed the installation instructions at https://github.com/fastai/fastai2
and it says to run nbdev_install_git_hooks … but I don’t seem to have any executables related to ‘nbdev’ installed along with fastai.
Is there something left out of the docs that we still need to do? How could these instructions be clarified?

edit: I installed via

git clone https://github.com/fastai/fastai2
cd fastai2
conda env create -f environment.yml
source activate fastai2
pip install fastai2

Seems an additional “pip install nbdev” is needed, but then “make test” produces a list of other missing dependencies:

  • tensorboard
  • wandb
  • Pillow

Perhaps these could be added to the environment.yml file, or the install instructions edited to include them?

Wandb and tensorboard are ultimately going to be separate extensions, to avoid having too many dependencies; that’s why they are not in the deps. Pillow is in the dependencies in settings.ini, so the pip install should have grabbed it.

nbdev is in the dev requirements, so it only gets installed if you did

pip install -e .[dev]

inside the fastai2 repo.

Ok, thanks for clarifying.
I’ll submit a PR for the README.md that explains this in the “## Tests” section.

It looks like nbdev is more than just a ‘dev’ requirement: The second cell of fastai2/nbs/10_tutorial.pets.ipynb reads

from nbdev.showdoc import *

Is it ok that this fails with ModuleNotFoundError: No module named 'nbdev' when non-dev users try to run the tutorial?

That’s just for the show_doc cells. If you are running the notebooks, you need the dev requirements, yes. But if you are using fastai2 on its own, you don’t need nbdev, hence it not being an official requirement.
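 
If you want to run the tutorial notebooks without the dev install, one possible workaround (just a sketch, not an official recommendation) is to make the import optional:

try:
    from nbdev.showdoc import *               # only needed to render the show_doc cells
except ModuleNotFoundError:
    def show_doc(*args, **kwargs): pass       # no-op stub so the rest of the notebook runs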

@sgugger I’m now tackling the problem of a “partial” DataLoader. The idea is to train on large datasets such as ImageNet and get validation metrics more frequently, while still using the entire dataset.

It seems like I should follow the approach from weighted_dataloaders. Would the right place be the same notebook, 14a_callback.data (which, judging by the name, may be misplaced)?

Hey @fmobrj75, did you manage to resolve this bug? I’m getting the same error in the same situation (lm encoder loaded, trying to run classification).

Hi, Morgan. Not yet. It seems to be a bug in the library code for the QRNN in the current version of fastai2. For now I have dropped the QRNN model; I am getting much better results with AWD-LSTM than with my fastai v1 AWD-QRNN on the same tasks.

Ah ok, thanks for letting me know! I’ll give it a bit more investigation and post here if I can get it working. If not, I guess I’ll switch to AWD-LSTM.

I can’t believe I now have a new shortcut on my phone screen to https://github.com/fastai/fastai2/commits/master

What better way to keep up to date with such a fast evolving library?

For the major changes I keep an eye on notebook 50 with the DataBlock examples; otherwise, yeah, I keep track of the commits or look for the specific things I want to watch out for.

OK so I kind of have a feeling of what is going on with the AWD_QRNN classifier:

In SentenceEncoder there is a line that I believe removes the padding from the input before passing it to your model:

for i in range(0, sl, self.bptt):
    #Note: this expects that sequence really begins on a round multiple of bptt
    real_bs = (input[:,i] != self.pad_idx).long().sum()             # Find the count of non-padded items
    r,o = self.module(input[:real_bs,i: min(i+self.bptt, sl)])     # Pass to model

For example, from the logs below you can see that the full batch has size (8, 850), but it then gets chopped up into chunks of various sizes (like (3, 72), (4, 72) or (5, 72)) according to real_bs. It is this chopped-up input that gets passed to your model.

real_bs : 3, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([3, 72])

real_bs : 4, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([4, 72])

real_bs : 5, input size : torch.Size([8, 850]) , pad_idx : 1
AWD_LSTM inp size: torch.Size([5, 72])
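 
To make the padding arithmetic concrete, here is a toy version of the real_bs computation on a front-padded batch (illustrative values only; pad_idx=1 as in the logs above):

import torch

pad_idx = 1
batch = torch.tensor([[5, 9, 2, 7, 3],
                      [6, 3, 8, 4, 2],
                      [1, 1, 2, 9, 5],
                      [1, 1, 1, 5, 8]])   # the last two rows start with padding
real_bs = (batch[:, 0] != pad_idx).long().sum()
print(real_bs)   # tensor(2): only the first two rows reach the model for this chunk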

AWD_QRNN Classifier
The above works fine for the awd_qrnn language model; however, introducing the SortedDL dataloader with before_batch=pad_input seems to cause issues, specifically with the _get_source function in the QRNNLayer class.

The function concatenates the previous input (prevX) with the current input. However, because real_bs can change between chunks, the tensor shapes of prevX and the current input can end up misaligned.

def forward(self, inp, hid=None):
    y = self.linear(self._get_source(inp))      # <-- _get_source() called here
    ...

def _get_source(self, inp):
    if self.window == 1: return inp
    dim = (1 if self.batch_first else 0)
    inp_shift = [torch.zeros_like(inp[:,:1] if self.batch_first else inp[:1]) if self.prevX is None else self.prevX]
    if self.backward: inp_shift.insert(0, inp[:,1:] if self.batch_first else inp[1:])
    else:             inp_shift.append(inp[:,:-1] if self.batch_first else inp[:-1])
    inp_shift = torch.cat(inp_shift, dim)       # <-- this is where the QRNN classifier fails
    return torch.cat([inp, inp_shift], 2)

An example of the size discrepancy can be seen in the logs below:

real_bs : 1, input size : torch.Size([64, 2055]) , pad_idx : 1      <-- real_bs = 1
AWD_QRNN inp size: torch.Size([1, 72])
QRNN Module input size : torch.Size([1, 72, 50])

input size: torch.Size([1, 72, 50]), window: 2, batch_first: True, backward : False  # _get_source()
prevX size: torch.Size([1, 1, 50])
inp_shift size : torch.Size([1, 1, 50])
output size : torch.Size([1, 72, 100])       <--- Successful concat outputted from _get_source()

real_bs : 3, input size : torch.Size([64, 2055]) , pad_idx : 1       <-- real_bs changes
AWD_QRNN inp size: torch.Size([3, 72])
QRNN Module input size : torch.Size([3, 72, 50])

input size: torch.Size([3, 72, 50]), window: 2, batch_first: True, backward : False  # _get_source() 
prevX size: torch.Size([1, 1, 50])         <-- Will cause the error when concatted
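 
A minimal repro of the mismatch with toy shapes (illustrative, not the actual training tensors):

import torch

prevX = torch.zeros(1, 1, 50)      # stored when the previous chunk had real_bs == 1
inp   = torch.zeros(3, 72, 50)     # the next chunk arrives with real_bs == 3
inp_shift = [prevX, inp[:, :-1]]   # same construction as _get_source (batch_first, forward)
torch.cat(inp_shift, dim=1)        # RuntimeError: batch dims 1 vs 3 don't match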

I think that’s what’s happening, but I’m not 100% sure what the best way to resolve it is: whether some padding needs to be re-added in _get_source, or something else.

@sgugger, @jeremy, would you have any suggestions on how to resolve this? cc: @fmobrj75

Thanks for the great analysis. @sgugger has made a bunch of changes to padding recently, but hasn’t been testing the QRNN AFAIK, so I wouldn’t be surprised if issues have cropped up.

We’re on a book deadline until Feb 10, so please ping us after that if we don’t follow up…

Probably yes

Quick question:

Is there a fast way to get access to all my transformed y’s after an item transform (such as PointScaler), across an entire dataloader?

I’m currently trying this:
tfmd = [dls.after_item.point_scaler(x[1]) for x in dls.dataset]
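 
An alternative sketch, if going through the batches is acceptable (note this also applies any batch transforms, and the train loader shuffles and may drop the last partial batch, so the valid loader is the safer target):

import torch
ys = torch.cat([yb for _, yb in dls.valid])   # collect post-transform targets batch by batch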

If anybody is interested in dataloaders that can process partial epochs, here is my implementation, which randomly picks a subset of the data at each epoch.

The intention was to handle datasets such as ImageNet and get validation metrics more frequently.

#export
@delegates()
class PartialDL(TfmdDL):
    '''Select randomly partial quantity of data at each epoch'''
    def __init__(self, dataset=None, bs=None, partial_n=None, **kwargs):        
        super().__init__(dataset=dataset, bs=bs, **kwargs)
        self.partial_n = min(partial_n, self.n) if partial_n else None

    def get_idxs(self):
        if self.partial_n is None: return super().get_idxs()
        return list(np.random.choice(self.n, self.partial_n, replace=False))
    
    def __len__(self):
        if self.partial_n is None: return super().__len__()
        return self.partial_n//self.bs + (0 if self.drop_last or self.partial_n%self.bs==0 else 1)


#export
@patch
@delegates(Datasets.dataloaders)
def partial_dataloaders(self:FilteredBase, partial_n, bs=64, **kwargs):
    xtra_kwargs = [{}] * (self.n_subsets-1)
    return self.dataloaders(bs=bs, dl_type=PartialDL, dl_kwargs=({'partial_n':partial_n}, *xtra_kwargs), **kwargs)

I could add an option to avoid drawing an item again until all items have been drawn, or allow a “non-shuffled” version, but I wanted to keep it simple for now. It works with both TfmdLists and Datasets.
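 
For reference, a usage sketch (assuming dsets is a Datasets object built the usual way):

dls = dsets.partial_dataloaders(partial_n=3200, bs=64)
len(dls.train)    # 3200 // 64 = 50 batches per (partial) training epoch
# only the training dataloader gets partial_n; validation still covers everything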

Let me know if you are interested in a PR for it.

I noticed 21_vision.learner.ipynb no longer executes fully.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-bd945bd2ef1c> in <module>
----> 1 learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1))
      2 learn = unet_learner(dls, models.resnet34, pretrained=True, n_in=4)

<ipython-input-28-429ea9b9ca5e> in unet_learner(dls, arch, loss_func, pretrained, cut, splitter, config, n_in, n_out, **kwargs)
     11     model = models.unet.DynamicUnet(body, n_out, size, **config)
     12     learn = Learner(dls, model, loss_func=loss_func, splitter=ifnone(splitter, meta['split']), **kwargs)
---> 13     if pretrained: learn.freeze()
     14     return learn

~/Projects/fastai2/fastai2/learner.py in freeze(self)
    564 
    565 @patch
--> 566 def freeze(self:Learner): self.freeze_to(-1)
    567 
    568 @patch

~/Projects/fastai2/fastai2/learner.py in freeze_to(self, n)
    559 @patch
    560 def freeze_to(self:Learner, n):
--> 561     if self.opt is None: self.create_opt()
    562     self.opt.freeze_to(n)
    563     self.opt.clear_state()

~/Projects/fastai2/fastai2/learner.py in create_opt(self)
    233     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
    234     def create_opt(self):
--> 235         self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
    236         if not self.wd_bn_bias:
    237             for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-23-75cf9ffd62a9> in _resnet_split(m)
      1 #export
      2 def _xresnet_split(m): return L(m[0][:3], m[0][3:], m[1:]).map(params)
----> 3 def  _resnet_split(m): return L(m[0][:6], m[0][6:], m[1:]).map(params)
      4 def _squeezenet_split(m:nn.Module): return L(m[0][0][:5], m[0][0][5:], m[1:]).map(params)
      5 def _densenet_split(m:nn.Module): return L(m[0][0][:7],m[0][0][7:], m[1:]).map(params)

~/Projects/fastcore/fastcore/foundation.py in map(self, f, *args, **kwargs)
    360              else f.format if isinstance(f,str)
    361              else f.__getitem__)
--> 362         return self._new(map(g, self))
    363 
    364     def filter(self, f, negate=False, **kwargs):

~/Projects/fastcore/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    313     @property
    314     def _xtra(self): return None
--> 315     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    316     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    317     def copy(self): return self._new(self.items.copy())

~/Projects/fastcore/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     39             return x
     40 
---> 41         res = super().__call__(*((x,) + args), **kwargs)
     42         res._newchk = 0
     43         return res

~/Projects/fastcore/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    304         if items is None: items = []
    305         if (use_list is not None) or not _is_array(items):
--> 306             items = list(items) if use_list else _listify(items)
    307         if match is not None:
    308             if is_coll(match): match = len(match)

~/Projects/fastcore/fastcore/foundation.py in _listify(o)
    240     if isinstance(o, list): return o
    241     if isinstance(o, str) or _is_array(o): return [o]
--> 242     if is_iter(o): return list(o)
    243     return [o]
    244 

~/Projects/fastcore/fastcore/foundation.py in __call__(self, *args, **kwargs)
    206             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    207         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 208         return self.fn(*fargs, **kwargs)
    209 
    210 # Cell

~/Projects/fastai2/fastai2/torch_core.py in params(m)
    496 def params(m):
    497     "Return all parameters of `m`"
--> 498     return [p for p in m.parameters()]
    499 
    500 # Cell

AttributeError: 'tuple' object has no attribute 'parameters'

It runs with pretrained=False, but then fails at the same step when we do learn.fit(1).
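 
If it helps the debugging, a quick diagnostic sketch (mirroring the groups _resnet_split builds; per the traceback, params chokes because one group element is a plain tuple, which has no .parameters()):

m = learn.model
for g in [m[0][:6], m[0][6:], m[1:]]:
    print(type(g))   # any group that shows up as <class 'tuple'> will break params()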