A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

I’m confused by the wording here. Can you show an example of this issue?

dls.show_batch() gives the following trace:

```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-...> in <module>()
----> 1 dls.show_batch()

~/anaconda3/lib/python3.6/site-packages/fastai2/data/core.py in show_batch(self, b, max_n, ctxs, show, **kwargs)
     88
     89     def show_batch(self, b=None, max_n=9, ctxs=None, show=True, **kwargs):
---> 90         if b is None: b = self.one_batch()
     91         if not show: return self._pre_show_batch(b, max_n=max_n)
     92         show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)

~/anaconda3/lib/python3.6/site-packages/fastai2/data/load.py in one_batch(self)
    127     def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)
    128     def one_batch(self):
--> 129         if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
    130         with self.fake_l.no_multiproc(): res = first(self)
    131         if hasattr(self, 'it'): delattr(self, 'it')

ValueError: This DataLoader does not contain any batches
```

Interesting. Are you splitting your data? (Also, remove the quotation marks when doing the code wrapping and it will work.) When you see the preview it should look something like so:

    Hello World

What does the DataBlock code you’re building with look like?

1 Like

Let’s say I have a 224x224 image for segmentation, with one big segment (mask of class 2) of 50x50 px and another (mask of class 5) of only 3x3 px…

If it’s not being very successful (i.e. this is something like a class imbalance), you could probably adjust the loss function to make it weighted?
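
For example, something along these lines might work (a rough sketch, not from the notebooks: `dls` is assumed to be your segmentation DataLoaders, and the class count and weight values are made-up placeholders):

```python
from fastai2.vision.all import *

# Hypothetical: six mask classes (0-5), upweighting the tiny class-5 regions.
# These weight values are placeholders -- tune them for your data.
class_weights = torch.tensor([1., 1., 1., 1., 1., 10.]).cuda()  # drop .cuda() on CPU

# axis=1 because segmentation predictions are (bs, n_classes, h, w);
# `weight` is passed through to the underlying nn.CrossEntropyLoss
learn = unet_learner(dls, resnet34,
                     loss_func=CrossEntropyLossFlat(axis=1, weight=class_weights))
```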

1 Like

Well, accuracy is about 50%, and I can weight it, true… I read somewhere that a big difference in mask sizes can be problematic, so I’m looking at some solutions… My masks are relatively balanced…

My extent of segmentation is what I’ve shown in the course, so I’m not much help there, apologies!

Thank you very much…

I am not splitting the data. The DataBlock code is exactly what you used in your object detection notebook. Here it is:

    db = DataBlock(blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
                   splitter=RandomSplitter(),
                   get_items=get_train_imgs,
                   getters=getters,
                   item_tfms=item_tfms,
                   batch_tfms=batch_tfms,
                   n_inp=1)

Also …

    # Map each image filename to its (bounding boxes, labels) pair
    img2bbox = dict(zip(imgs, lbl_bbox))

    # Peek at the first entry
    first = {k: img2bbox[k] for k in list(img2bbox)[:1]}; first

    # Getters: the image path, its bounding boxes, and its labels
    getters = [lambda o: path/o, lambda o: img2bbox[o][0], lambda o: img2bbox[o][1]]

    item_tfms = [Resize(224)]
    batch_tfms = [Rotate(), Flip(), Dihedral(), Normalize.from_stats(*imagenet_stats)]

    def get_train_imgs(noop): return imgs

As you can see, I have made only minimal changes (only replaced imgs and lbl_bbox with direct values). Nothing extra…

1 Like

What does len(dls.train) and len(dls.valid) provide?

Forgot to include the output of

    first = {k: img2bbox[k] for k in list(img2bbox)[:1]}; first
    {'0_Cancel 3.pdf.jpg': ([[1010, 319, 661, 89],
       [178, 438, 887, 89],
       [427, 2955, 464, 52],
       [1882, 2847, 542, 63],
       [416, 3018, 679, 119],
       [398, 100, 590, 72]],
      ['ik', 'pn', 'phn', 'don', 'pha', 'in'])}

The outputs of len(dls.train) and len(dls.valid) are as follows:
(0, 1)

There’s our issue. How big is your dataset? This len tells us how many batches are contained in each DataLoader. As an example of what to expect, on my segmentation dataset I get (75, 13): 75 batches in the train, 13 in the valid.

1 Like

Actually the dataset is small. I have only 30 images.
Yeah, I see that len(dls.train) is 0. I don’t know why that is.

Also, I was randomly checking this:
dls.dataset.split_idx
And it gave 0.

Could that be a problem with the splitting? I am just guessing…
Let me know what else I can provide as output.

You should try reducing your batch size to 2-3 (e.g. bs=2). This is part of the issue: you have a very small dataset. But yes, that could be it too. The next step would be to split via an IndexSplitter so we know exactly what goes where.
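
For instance (a small sketch; `path` stands in for whatever source you pass when building the DataLoaders):

```python
# Rebuild the DataLoaders from the DataBlock above with a much smaller batch size
dls = db.dataloaders(path, bs=2)
len(dls.train), len(dls.valid)  # should no longer show 0 training batches
```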

Hi Zachary. Thank you so much for your help and time. That worked perfectly!!!

Thanks once again :slight_smile:

So to explain what was going on: fastai’s default batch size is 64. You don’t have 64 items, so when it builds and splits the data it can’t form a single full training batch. If we reduce the batch size, it works better :slight_smile:
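
To make the arithmetic concrete with the numbers from this thread:

```python
# 30 images with RandomSplitter()'s default 20% validation split -> 24 train / 6 valid
# train: 24 // 64 = 0 full batches (the train DataLoader drops the last
#        partial batch by default), so len(dls.train) == 0
# valid: partial batches are kept, so len(dls.valid) == 1
# with bs=2 instead: 24/2 = 12 train batches and 6/2 = 3 valid batches
```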

1 Like

Hmm, I get it now! Thanks!! :slight_smile:

There is another problem now: the bounding boxes are not aligning well. It seems they haven’t been rescaled or something. Could you point out what I might have missed?

1 Like

Hi Zach, been learning a lot with your notebooks, thanks a lot!

I was going through the MultiModel notebook (all those heads are blowing my mind :stuck_out_tongue:) and have been comparing it with the MultiLabel notebook. I had some questions.

  1. Why don’t we use MultiCategoryBlock here like we did with the Planet dataset?

  2. Is there a different way to plot top losses for this kind of model?

I tried:


    interp = ClassificationInterpretation.from_learner(learn)
    losses,idxs = interp.top_losses()

    # sanity check: one loss and one index per validation item
    len(dls.valid_ds)==len(losses)==len(idxs)
    interp.plot_top_losses(9, figsize=(15,10))

I need a little help understanding the len(dls.valid_ds)==len(losses)==len(idxs) line; what is that for? It returns True for me.

The error I get:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-48-be969192d4f7> in <module>()
    ----> 1 interp.plot_top_losses(9, figsize=(15,10))

    /usr/local/lib/python3.6/dist-packages/fastai2/interpret.py in plot_top_losses(self, k, largest, **kwargs)
         39         x1,y1,outs = self.dl._pre_show_batch(b_out, max_n=k)
         40         if its is not None:
    ---> 41             plot_top_losses(x, y, its, outs.itemgot(slice(len(inps), None)), self.preds[idx], losses,  **kwargs)
         42         #TODO: figure out if this is needed
         43         #its None means that a batch knows how to show itself as a whole, so we pass x, x1

    TypeError: only integer tensors of a single element can be converted to an index

Because each head is doing something different. If we wanted a single-headed model for all three, then yes, we could.
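
Roughly, the two framings differ like so (an illustration, not the notebooks’ exact code):

```python
# Planet-style: one head predicting many labels at once for a single task
#   blocks=(ImageBlock, MultiCategoryBlock)
# MultiModel-style: several heads, each predicting its own single category
#   blocks=(ImageBlock, CategoryBlock, CategoryBlock, CategoryBlock), n_inp=1
```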

plot_top_losses should work; let me know if it doesn’t. Ideally you’d maybe want to do three plots, or label by the head; not sure, I haven’t looked into that.

(Which is exactly why you get that error: it expects a single output, not three. We’d have to adjust it for our multimodel somehow.)

1 Like

BTW, for those of us having issues with the bytes handling: apparently it’s already in the type annotation. What I didn’t realize is that it wants bytes, not BytesIO! (It will do that wrapping itself.) So during deployment we can just do:

(assuming we’re in analyze):

    data = await request.form()
    img_bytes = await (data["file"].read())  # raw bytes -- do not wrap in BytesIO
    learn.predict(img_bytes)
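
For context, a minimal endpoint around that snippet might look like this (a sketch assuming a Starlette app; the route name, response shape, and export filename are assumptions, not this app’s actual code):

```python
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from fastai2.vision.all import load_learner

app = Starlette()
learn = load_learner('export.pkl')  # hypothetical export path

@app.route('/analyze', methods=['POST'])
async def analyze(request):
    data = await request.form()
    img_bytes = await (data['file'].read())  # raw bytes, as predict accepts
    pred, pred_idx, probs = learn.predict(img_bytes)
    return JSONResponse({'result': str(pred)})
```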

Ping @mrfabulous1 because I know you were wanting this info :wink:

1 Like