0.0.29 regression bug

I was going through this sweet notebook by @muellerzr, and after updating fastai2 the results are off.

The first time I ran it I was using:
fastai2==0.0.17
fastcore==0.1.17
torchvision==0.5.0
torch==1.4.0

Got these results:

Then I updated to:
fastai2==0.0.29
fastcore==0.1.38
torch==1.6.0
torchvision==0.7.0

and got these results:

That is indeed an annoying bug… Augmentations stayed the same?

Edit: can recreate this on my end, filed an issue

Okay, we’ve isolated the issue (just to keep you updated), and I’m working on a PR now. What’s happening is that when points go off-screen, we don’t actually clamp them properly. My own clamp function didn’t solve this either: you can still get a result such as [ 1.0661, -0.0437] when our points need to be between -1 and 1.
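
To make the failure mode concrete, here is a minimal sketch in plain Python. It assumes keypoints are (x, y) pairs in normalized coordinates where the image spans -1 to 1. `out_of_bounds` and `clamp_points` are hypothetical helper names, not fastai2 functions, and clamping is only one possible way to handle off-screen points, not necessarily what the PR does.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def out_of_bounds(points: List[Point], lo: float = -1.0, hi: float = 1.0) -> List[Point]:
    """Return the points with at least one coordinate outside [lo, hi]."""
    return [p for p in points if not all(lo <= c <= hi for c in p)]

def clamp_points(points: List[Point], lo: float = -1.0, hi: float = 1.0) -> List[Point]:
    """Clamp every coordinate into [lo, hi] (hypothetical helper)."""
    return [tuple(min(max(c, lo), hi) for c in p) for p in points]

# The first point is the example quoted above; the second is an in-range point.
pts = [(1.0661, -0.0437), (0.25, -0.5)]
bad = out_of_bounds(pts)
fixed = clamp_points(pts)
print(bad)    # [(1.0661, -0.0437)]
print(fixed)  # [(1.0, -0.0437), (0.25, -0.5)]
```

On a tensor of points, `torch.clamp(pts, -1, 1)` performs the same coordinate-wise clamp.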

I was wondering about that. I remember seeing some predictions off the picture but wasn’t sure if it was just a bad prediction.

Even if the clamping function wasn’t working, shouldn’t the results still be the same because they were both using it?

I don’t deny something else may be afoot here, but first let’s fix the bug we know is a bug :slight_smile: The other option would be to try training without warp, etc. and see if it’s still present. We may have simply gotten lucky. We also fixed some hyperparameters that weren’t being passed down and some defaults in the image augmentations that weren’t quite the same (related to the probabilities), so this could be another factor.

I’ll try without the augmentations right now.

OK, so it looks like there’s definitely something going on with Flip() in the batch transforms. I’m going to show them all, because 0.0.17 with all batch tfms looks better than 0.0.29 using any single batch tfm. You’d expect it to be easier for 0.0.29 to come out ahead there, since there are only 3 epochs and one batch tfm.

0.0.17 with all batch tfms:
[2 images]

0.0.29:
all batch tfms:
[2 images]

just flip batch tfm:
[2 images]

no batch tfms:
[2 images]

just warp batch tfm:
[2 images]

just rotate batch tfm:
[2 images]

just zoom batch tfm:
[2 images]

just clamp batch tfm:
[2 images]
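
The comparison above is a one-at-a-time ablation over the batch transforms. Below is a small sketch of how those run configurations could be enumerated. The transform names are plain strings taken from the labels in this post, and `ablation_configs` is a hypothetical helper, not part of fastai2. Each configuration would still need to be mapped to the actual transform objects and trained for the same number of epochs.

```python
from typing import Dict, List

def ablation_configs(tfms: List[str]) -> Dict[str, List[str]]:
    """Build 'all', 'none', and one 'just <tfm>' configuration per transform."""
    configs = {"all batch tfms": list(tfms), "no batch tfms": []}
    for name in tfms:
        configs[f"just {name} batch tfm"] = [name]
    return configs

# Labels only; these strings stand in for the real transform objects.
batch_tfms = ["flip", "warp", "rotate", "zoom", "clamp"]
configs = ablation_configs(batch_tfms)
for label, active in configs.items():
    print(f"{label}: {active}")
```

Keeping the list of configurations in one place makes it easier to run the exact same set under both library versions.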

I’d assume everything below here is from 0.0.29?

oh yea.

How do you reply like that, with part of another post in your reply?

Select the text you want to specifically quote, and a “Quote” button will pop up and add it in :slight_smile:

sweeeet :sunglasses: thanks!

@pattyhendrix the next step is to look at the outputs from just using flip: are any of them going outside of [-1, 1]? I’ll look and see if flip’s behavior changed lately, but I have not noticed a major change to that specific augmentation. The only big difference is that flip’s default p was changed from 0.5 to 1.0, so to properly recreate it we should set flip’s p to 0.5: use Flip(p=0.5) and verify that it is 0.5 when running by checking dls.after_batch. This may not actually need to be a thing, but it is what I would check first.
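
As a sanity check on that reasoning: in a simplified model, a horizontal flip of normalized points only negates x, so by itself it cannot push a point that starts inside [-1, 1] outside that range. The sketch below is plain Python with hypothetical helpers (`flip_points`, `maybe_flip`, `in_range`). It is not fastai2’s Flip implementation, just an illustration of what p controls.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]

def flip_points(points: List[Point]) -> List[Point]:
    """Mirror points horizontally in coordinates where x spans [-1, 1]."""
    return [(-x, y) for x, y in points]

def maybe_flip(points: List[Point], p: float, rng: random.Random) -> List[Point]:
    """Apply the flip with probability p, otherwise return the points unchanged."""
    return flip_points(points) if rng.random() < p else list(points)

def in_range(points: List[Point]) -> bool:
    """True if every coordinate lies in [-1, 1]."""
    return all(-1.0 <= c <= 1.0 for pt in points for c in pt)

rng = random.Random(0)  # fixed seed so the run is repeatable
pts = [(0.9, -0.2), (-1.0, 0.5)]
always = maybe_flip(pts, p=1.0, rng=rng)  # p=1.0: every call flips
never = maybe_flip(pts, p=0.0, rng=rng)   # p=0.0: no call flips
# With p=0.5, roughly half of the calls return flipped points.
flips = sum(maybe_flip(pts, 0.5, rng) != pts for _ in range(1000))
print(always, in_range(always))
print(never, flips)
```

Under this model, a flipped point that lands outside the range was either already outside before the flip or was moved there by another transform.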

Flip()’s default p for me was 0.5, so I switched it to 1 and it looks like it worked:

So I put all the batch tfms back on with Flip(p=1) and got:

It’s about 3 seconds slower than 0.0.17 and the valid loss is way worse, but the predictions look OK: not as good as 0.0.17, but better than I’d expect from 0.0.17 if it had the valid loss of 0.0.29.
