Fastai v2 chat

I think this should work, and it looks like the right approach to me. Make sure to test the two dls are indeed shuffled the same way.

2 Likes

Awesome! Thanks Sylvain! :slight_smile: And re: are they the same:

dl2.dls[0].get_idxs() == dl2.dls[1].get_idxs() is True :slight_smile:

For more, see my notebook here:

https://www.kaggle.com/muellerzr/fastai2-tabular-vision-starter-kernel
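The idea behind that check can be sketched in plain Python: two loaders that shuffle with the same seed produce the same index order. (The `SeededLoader` class below is a made-up stand-in, not fastai's internals; fastai exposes the real order via `get_idxs()`.)

```python
import random

class SeededLoader:
    """Toy loader that shuffles its indices with a fixed seed (hypothetical;
    fastai's DataLoader exposes the real order via get_idxs())."""
    def __init__(self, n, seed):
        self.n, self.seed = n, seed

    def get_idxs(self):
        idxs = list(range(self.n))
        random.Random(self.seed).shuffle(idxs)
        return idxs

# Two loaders sharing a seed yield identical shuffles, so paired
# tabular/vision batches line up row for row.
a, b = SeededLoader(10, seed=42), SeededLoader(10, seed=42)
print(a.get_idxs() == b.get_idxs())  # True
```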

2 Likes

How do you combine models? Any tiny example would be nice.

Working on an example, but depending on how it does I’ll either release it now or later :wink:

That would be great :grinning: Last week I was searching for such an example for v2 (not for the competition, though).

The best example from a model perspective is the Kaggle competition that mixed Tab + Vision + Text (it was a pet adoption one). The DataLoaders should be somewhat the same as what I had above, so that should help with merging the models.
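The merging idea can be sketched in plain PyTorch (a minimal sketch; the `MixedModel` name, layer sizes, and input shapes are all made up, not taken from the Kaggle kernel): each body produces a feature vector, the vectors are concatenated, and a small head maps the joint vector to the output.

```python
import torch
import torch.nn as nn

class MixedModel(nn.Module):
    """Hypothetical tabular + vision fusion: concatenate per-modality
    features, then classify from the joint vector."""
    def __init__(self, n_tab_feats=8, n_img_feats=16, n_out=2):
        super().__init__()
        # Toy bodies: real models would use an embedding MLP and a CNN.
        self.tab_body = nn.Sequential(nn.Linear(10, n_tab_feats), nn.ReLU())
        self.img_body = nn.Sequential(nn.Flatten(),
                                      nn.Linear(3 * 8 * 8, n_img_feats), nn.ReLU())
        self.head = nn.Linear(n_tab_feats + n_img_feats, n_out)

    def forward(self, x_tab, x_img):
        feats = torch.cat([self.tab_body(x_tab), self.img_body(x_img)], dim=1)
        return self.head(feats)

model = MixedModel()
out = model(torch.randn(4, 10), torch.randn(4, 3, 8, 8))
print(out.shape)  # torch.Size([4, 2])
```

This only works if the two dataloaders feed matching rows, which is exactly why the shuffle order check above matters.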

Is the DataLoaders class supposed to load all the data from a subclass of IterableDataset into memory? Or am I missing some setting to disable the _listify call? I tried it with:
DataLoaders.from_dsets(train_ds=train_dl, valid_ds=valid_dl)

Just checking if there are any thoughts on this since it’d really help me at the moment due to my data depending on earlier timesteps. Precomputing the whole dataset into independent datapoints is infeasible.

I might be missing something, but the names train_dl and valid_dl suggest to me that they are dataloaders. If so, you should be able to create the DataLoaders object using the regular __init__ function.

Actually, which version of fastai are you using?

1 Like

Tried v1 first, but read that it didn’t support IterableDataset. So I’m on v2 now.

And yeah, sorry, those names are from when I messed around trying to get it to work. They are torch IterableDataset subclasses, not dataloaders. I tried wrapping them in DataLoader classes like you said, but it seems to try and use the __getitem__ method of Dataset instead of __iter__. Anyway, I didn’t mean to hijack this thread. I’ll look around some more, maybe post it in the help channel.
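For reference, plain PyTorch's own DataLoader does consume an IterableDataset via __iter__ rather than __getitem__, which is what makes streaming data with timestep dependencies workable there. A minimal sketch (the `WindowStream` class is hypothetical):

```python
import torch
from torch.utils.data import IterableDataset, DataLoader

class WindowStream(IterableDataset):
    """Hypothetical streaming dataset: each item depends on earlier
    timesteps, so items are produced by __iter__, not __getitem__."""
    def __init__(self, series, window=3):
        self.series, self.window = series, window

    def __iter__(self):
        for i in range(len(self.series) - self.window):
            # sliding input window -> next-step target
            yield (torch.tensor(self.series[i:i + self.window], dtype=torch.float32),
                   torch.tensor(self.series[i + self.window], dtype=torch.float32))

# torch's DataLoader detects IterableDataset and iterates instead of indexing.
dl = DataLoader(WindowStream(list(range(10))), batch_size=4)
xb, yb = next(iter(dl))
print(xb.shape, yb.shape)  # torch.Size([4, 3]) torch.Size([4])
```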

Ah okay, yeah, I wasn’t able to find where from_dsets takes keyword arguments for the datasets. So I was wondering if you were on a different version: https://github.com/fastai/fastai2/blob/master/fastai2/data/core.py#L124

I’m currently trying to use a torchvision.transforms transform that works in my plain PyTorch dataloader, but when I use it as item_tfms in a fastai ImageDataLoaders I get the error:
“Could not do one pass in your dataloader, there is something wrong in it”

I guess there is a straightforward way to do it, but so far I haven’t found a hint in the docs or the forum threads. Maybe somebody knows how to get that working?

Can you grab a batch and show what the error says @MicPie? Or if you’re using the DataBlock can you show the result of dblock.summary()?

1 Like

Thank you for your help. :smiley:

I could track down the error:
The image transform was also applied to the y-data, which was of type TensorCategory, and this broke it.

I could fix it with the following setup:

@patch
def augfunc(x:Image.Image):
    <aug func code here>

class AugFunc(Transform):
    def encodes(self, x:(Image.Image,*TensorTypes)): return x.augfunc()

But I am not sure if this is the most elegant way?

Looks to be the right way if they’re of type PILImage and not quite TensorImages when they get there :slight_smile:
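The type-restriction idea at work here can be sketched with the standard library's singledispatch (just an analogy for fastai's typedispatch, not its implementation): the augmentation fires for the image type and falls through to a no-op default, so a category label passes through untouched. `FakeImage` and `Category` are made-up stand-ins for PILImage and TensorCategory.

```python
from functools import singledispatch

class FakeImage:   # stand-in for PILImage
    def __init__(self, px): self.px = px

class Category:    # stand-in for TensorCategory
    def __init__(self, label): self.label = label

@singledispatch
def augment(x):
    """Default: leave non-image inputs (e.g. labels) untouched."""
    return x

@augment.register
def _(x: FakeImage):
    """Only image inputs are transformed."""
    return FakeImage(x.px * 2)

img, lbl = augment(FakeImage(3)), augment(Category("cat"))
print(img.px, lbl.label)  # 6 cat
```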

1 Like

I wanted to implement support for bounding boxes in WandbCallback but there does not seem to be a “normalized way” to handle them:

  • I could not find an example where a model is trained on them
  • there is a DataBlock example in 50_tutorial.datablock.ipynb (not for training though) where the output is split in 2 blocks: one is a list of bounding boxes and the other one is the corresponding list of classes

I think ideally TensorBBox would also contain the class so we can easily identify we are working on a bounding box detection problem through typedispatch.

It could be anything else but I can’t find a “normalized way” of treating these problems like we get with semantic segmentation (where we can reliably access optional labels with get_meta('codes')).

2 Likes

Here is one example.

I do like the idea of this, as I can’t think of a reason a bounding box wouldn’t have a label to go along with it, at least for general use. But I also don’t know their reasoning for splitting it up.

1 Like

In Kaggle, how do you do prediction on the test set using fastai2?
If someone could help, it would be great. I’m not understanding test_dl and get_preds in fastai2.

So let’s say I have a list of file names for my test set. Here would be my steps

  1. Get my file names
  2. Run dl = learn.dls.test_dl(fnames)
  3. Run preds = learn.get_preds(dl=dl)

Thanks, this worked.
I have one more question: I am working on multi-label classification. I got preds by the method you described, but I have a problem here.
Code:

thresh=0.2
labelled_preds = [' '.join([learn.dls.vocab[i] for i,p in enumerate(pred) if p > thresh]) for pred in preds]

I am getting
RuntimeError: bool value of Tensor with more than one value is ambiguous

please HELP

How can I check which transforms are applied?
I want to keep only vertical and horizontal flips, so I set these values:
batch_tfms=aug_transforms(flip_vert=True, max_rotate=0, min_zoom=1, max_zoom=1, max_lighting=0, max_warp=0),
The datablock.summary gives something, but I’m still not sure if something else is happening:
Pipeline: IntToFloatTensor -> AffineCoordTfm
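One way to read that summary line: a pipeline's printed form is simply the list of transforms it will run, so the `IntToFloatTensor -> AffineCoordTfm` output is the full inventory for that stage. A toy sketch of that pattern (plain Python, not fastai's Pipeline class):

```python
class Pipeline:
    """Toy pipeline whose repr lists its transforms, mirroring the
    'Pipeline: IntToFloatTensor -> AffineCoordTfm' style of summary output."""
    def __init__(self, *tfms): self.tfms = tfms

    def __call__(self, x):
        # Apply each transform in order.
        for t in self.tfms: x = t(x)
        return x

    def __repr__(self):
        return "Pipeline: " + " -> ".join(t.__name__ for t in self.tfms)

def int_to_float(x): return x / 255.0   # stand-in for IntToFloatTensor
def flip(x): return -x                  # stand-in for a flip transform

pipe = Pipeline(int_to_float, flip)
print(pipe)       # Pipeline: int_to_float -> flip
print(pipe(255))  # -1.0
```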