Questions about the 1st notebook

Hello everyone,

I just watched the first 3 lessons and I want to make sure that I understand everything before I move on. (I will definitely do the assignment and apply my new skills to some Kaggle datasets.)

There are some things that I still don’t understand, like “precompute=True”, but I am pretty sure that this is explained somewhere on the forums.

So I would like to ask more about the things that just aren’t clear to me.

  1. When we use data augmentation, does it increase the size of our dataset by a factor of 6, or does it just use a different version of the same image in every epoch?

  2. Another thing that isn’t clear to me is the image sizes. I understand that if you have a 500x400 image you can just crop a 224x224 chunk from the middle, but in the cats vs dogs dataset there are some images that have a dimension smaller than 224. So does the convnet fill the remaining area with white pixels, or how is this problem handled?

  3. In the function ‘most_by_correct’ there is this line:
    mult = -1 if (y==1)==is_correct else 1
    I am not sure what the condition does there, or whether it affects the mult variable at all.

Also, if anyone else is relatively new, or feels like coaching a newbie, you can add me on Discord: snek#0551 :sunglasses:

I’ll try my best to answer your first two questions. I am a beginner too, so please cross-check my answers against the others that will hopefully come.

  1. As far as I understand, data augmentation does not add more images to your dataset as such; it just gives a different (augmented) version of each input at each iteration. This makes your network more robust, since it learns invariant features.

  2. The 224x224 setting actually resizes your image. I believe there is also a padding parameter that can “fill” the blanks.
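To make the first point concrete, here is a minimal sketch of on-the-fly augmentation in plain Python. It is not the course library's implementation: `random_flip` and `run_epochs` are hypothetical names, a left-right flip stands in for the real transforms, and tiny nested lists stand in for images. What it shows is that every epoch sees the same number of images, each one possibly transformed differently.

```python
import random

def random_flip(image, rng):
    """Mirror the image left-to-right about half of the time (stand-in augmentation)."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

def run_epochs(dataset, n_epochs, seed=0):
    """Yield (epoch, augmented images); the dataset itself is never enlarged."""
    rng = random.Random(seed)
    for epoch in range(n_epochs):
        # A fresh random transform is drawn for every image in every epoch.
        yield epoch, [random_flip(img, rng) for img in dataset]

# Two tiny 2x3 "images" standing in for real pictures.
dataset = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [0, 1, 2]],
]

for epoch, images in run_epochs(dataset, n_epochs=3):
    # Each epoch contains exactly len(dataset) images.
    print(epoch, len(images), images[0][0])
```

Under this scheme the stored dataset stays the same size; the variety comes from drawing new random transforms each time an image is loaded.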

I will wait for someone else to answer your third question.

@Ashka Thanks for replying.

I might not have specified the first question well enough. What I meant is that data augmentation usually increases your dataset size: when you augment every image 5 different ways, you end up with a dataset 6 times bigger. The question was whether you use all 6 versions of the image in every epoch, or whether you, for example, use the non-augmented version in the first epoch, then maybe a different version in the second epoch, and so on :slight_smile: . Now that I think about it, it’s probably a pretty stupid question and you just use all images in every epoch.

About the second question: I don’t know how I did not realise that you can just upscale the images :slight_smile:

For the augmentation, one thing that I find useful is to look at it as part of your data distribution (you augment everything, so you now have the same distribution as initially, just with more examples). So basically, at every iteration you take a minibatch of your sample, and ultimately you’ll go through all of it. Unless your batch size is the size of your dataset, you won’t be using all the images at each epoch.

I am pretty sure that 1 epoch = 1 pass through all your data regardless of batch size.
So if you have a small batch size there will be more steps in one epoch than with a bigger batch size, but it is still only 1 epoch. :sunglasses:
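A quick way to check this is to count the steps. The sketch below assumes a hypothetical dataset of 1000 images and a loader that keeps the last, partially filled batch; `steps_per_epoch` is just an illustrative helper.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of minibatches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

n_samples = 1000  # hypothetical dataset size
for batch_size in (32, 64, 256):
    # Smaller batches mean more steps, but each case is still one full pass.
    print(batch_size, steps_per_epoch(n_samples, batch_size))
```

Whatever the batch size, the steps in one epoch together cover all the samples.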

Oh…well I am now even more interested in the answer. Sorry for the wrong info…

Edit: I found this thread that might answer your question

What I thought is that augmentation creates copies of the same image at different angles. To make the GPU work better, it takes a square of size 224, so it is possible that a large image doesn’t fit into it.

So, if we consider a dog image as an example: if the image is long, the dog’s head may not be completely inside the square.
After performing the augmentation, it will look through all the copies for useful information. For example, in some of them the head may show clearly enough to separate dog from cat.

And for your 3rd question:
mult = -1 if (y==1)==is_correct else 1

It returns a numerical value (-1 or 1) instead of a boolean. Also, the values we get from np.exp(…) are always positive, so multiplying them by a numerical -1 or 1 just flips their sign, which makes them easy to compare in whichever order we need.
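To see what the line does, it helps to evaluate it for every combination of inputs. In the sketch below, `choose_mult` is a hypothetical wrapper around the exact expression from the question. The second part is an assumption about how `mult` is used afterwards (the thread does not show the rest of the function): multiplying scores by -1 before an ascending sort reverses the order.

```python
def choose_mult(y, is_correct):
    # Python parses this as: -1 if ((y == 1) == is_correct) else 1
    # so the condition does decide which value mult gets.
    return -1 if (y == 1) == is_correct else 1

for y in (0, 1):
    for is_correct in (True, False):
        print(f"y={y} is_correct={is_correct} -> mult={choose_mult(y, is_correct)}")

# Assumed usage: the sign controls the sort direction of some scores.
scores = [0.2, 0.9, 0.5]  # made-up values for illustration
lowest_first = sorted(range(len(scores)), key=lambda i: 1 * scores[i])
highest_first = sorted(range(len(scores)), key=lambda i: -1 * scores[i])
print(lowest_first, highest_first)
```

So `mult` is -1 when `y == 1` and `is_correct` agree (both true or both false), and 1 otherwise.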