Fastai v2 vision

It strongly depends on how your model returns its output, too. I don't have any examples of show_results with multi-target, so there might be something broken in fastai2.

Got it. I’ll take a look at that and see. It’s the RetinaNet architecture used in previous lectures. IIRC I had a separate function that was used to show the results. I’ll get that working; then maybe we can get some inspiration on how to fit it in.

I implemented the “non-pretrained” version and am now working on the pretrained version.
I just wanted to check whether, for n_in>3, the additional weights should be 0, as I understand they won’t ever learn anything.

They will still have gradients, so they will learn :slight_smile:

Oops… makes sense!
I just sent a PR.

Maybe this helps:

Quick question: should the points that come out of PointScaler be on a scale of -1 to 1, or their own scale? Because the second is what is currently happening. On my particular dataset, I manually calculated what the y range was:

# Run every target through PointScaler, then take the overall min/max
tfmd_pnts = [dls.after_item.point_scaler(x[1]) for x in dls.dataset]
min_pnt = float(min(t.min() for t in tfmd_pnts))
max_pnt = float(max(t.max() for t in tfmd_pnts))

I do not get -1, 1; I get -2.0491, 3.5536 for my dataset (and I verified no point went off the image’s range by accident). This matters because sometimes, when running a multi-point regression model without explicitly declaring a y_range, the points will all stack in the middle. Could this be due to the fact that Resize occurs after PointScaler? Should it in fact be the other way around, so everything scales to the new image size?

I do think this is the problem. Currently PointScaler is applied before Resize, which means _scale_pnts is calculated with the wrong (pre-resize) sz, producing this problem:

def _scale_pnts(y, sz, do_scale=True, y_first=False):
    if y_first: y = y.flip(1)
    res = y * 2/tensor(sz).float() - 1 if do_scale else y
    return TensorPoint(res, img_size=sz)
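For intuition, here is a standalone plain-Python sketch of the same formula (no fastai dependency; `scale_point` and `unscale_point` are hypothetical helpers, not part of the library). Scaling a coordinate against its *own* image size always lands in [-1, 1], and the inverse mapping recovers the pixel coordinates:

```python
def scale_point(xy, sz):
    """Map pixel coords to [-1, 1] relative to image size sz=(w, h),
    mirroring the `y * 2/sz - 1` formula in _scale_pnts."""
    return tuple(c * 2 / s - 1 for c, s in zip(xy, sz))

def unscale_point(xy, sz):
    """Inverse mapping: [-1, 1] back to pixel coordinates."""
    return tuple((c + 1) * s / 2 for c, s in zip(xy, sz))

# A point inside a (1025, 721) image stays within [-1, 1]:
p = scale_point((1024, 720), (1025, 721))
assert all(-1 <= c <= 1 for c in p)

# Round-tripping recovers the original coordinates:
q = unscale_point(p, (1025, 721))
```

If a different `sz` were used for the two calls, the round trip would land on the wrong pixel, which is why the size used at each stage matters.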

I did try to fix the issue by putting PointScaler after Resize, but that just creates another problem: Resize starts to fail with TensorPoint. The root of this issue is this line inside Resize.encodes: orig_sz = _get_sz(x).
_get_sz returns an empty tuple because no img_size _meta has been attached to TensorPoint yet.

I’ve been trying for some hours now to attach img_size to TensorPoint before Resize.encodes gets called, but I’ve failed every time :sleepy:

Yes, PointScaler is called before the resizing, but that shouldn’t impact anything: points will be scaled with respect to the actual size of the original image. Or are you passing coordinates assuming the image has already been resized?

No, they’re just the original coordinates for the original sizes. The documentation says everything is on a scale of -1 to 1, but that isn’t what I’m seeing. So is that not true? Or does it vary depending on the starting and ending image sizes? Thanks @sgugger :slight_smile:

Double-check what points are being sent, but if they are indeed inside the images, they should have coordinates between -1 and 1. If anything, having Resize happen before PointScaler would be the reason you see wrong coordinates.

A quick example of the problem:

Original image size: (1025, 721)
New image size: (224,224)
Point: (1024,720)

If we follow how it is being done, i.e.
newX = 1024 * 2/224 - 1

We get 8.14 as our point. I think it should actually be using the original size, not the transformed one, for this to work properly (that leads to roughly 0.99, which is what is expected), and I don’t think that’s what’s being done, or else we wouldn’t see values around -2 in my DataLoader above.
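Plugging the numbers in (just restating the arithmetic above as a quick sanity check):

```python
# Scaling the point against the resized width (224) blows past [-1, 1]:
wrong = 1024 * 2 / 224 - 1
print(round(wrong, 2))   # 8.14

# Scaling against the original width (1025) keeps it inside [-1, 1]:
right = 1024 * 2 / 1025 - 1
print(round(right, 3))   # 0.998
```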

But PointScaler is applied before Resize, so how do we explain the values > 1 seen in the DataLoader?

That is what I keep saying: it will not take 224 as the size, since Resize happens after PointScaler, so the size used at this stage will be (1025, 721).

Got it, that just clicked. My apologies @sgugger :slight_smile: So how would this then explain the ys being above 1 or below -1 when output from the DataLoader? I can verify that all input coordinates were in fact on the image when it came in.

I can’t say; you need to debug your data pipeline step by step. I was just explaining why the order of the transforms needs to be this way :wink:

I’ll look into it and report back :slight_smile:

Thanks for your guidance, it’s greatly appreciated :slight_smile:

Would the new dblock.summary be able to provide this? (just curious)

Try it and see what the output is :wink:

Will do!