Wiki: Lesson 2

That helps a lot. Can you explain how precompute works? Why train the model with precompute=True when you’re just going to set precompute=False before training with transformations? I know Jeremy talks about this in the video, but I don’t understand why precompute=True does not take into account the transformations.

EDIT: In this post I’ve described my intuition on what precompute=True does. Sorry for this erroneous explanation; it is quite misleading. Thank you @jeremy for pointing it out. Please check out this post, Wiki: Lesson 2, for more clarity. See also: precompute in fastai.

Precompute is a kind of hack Jeremy engineered to incorporate augmentation.
Image augmentation artificially creates training images through different kinds of processing, or combinations of processing, such as random rotations, flips, shifts, etc. Suppose we create five different augmentations for each image and we have 10 GB of training images; we will end up with 50 GB+ of data. So, precompute=False means that we don’t generate these augmented examples. Computation-wise, generating these (in RAM) is fast, but storing them on the hard disk is a slower process; the time to store (RAM to hard disk) these 50 GB+ of data is comparatively very high. When precompute=True, these augmented images are generated during training and sent to the model (in RAM/VRAM), then deleted. This is an optimized way of computing.
This is my understanding.
Understanding the concepts behind iterables, iterators and generators will help in implementing optimizations like these.
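The generate-on-the-fly idea can be sketched with a plain Python generator (a minimal sketch of the concept, not fastai's actual implementation; augmented_batches and the fake "transform" are my own illustration):

```python
import random

def augmented_batches(images, n_augments=5):
    """Yield augmented copies lazily, one at a time, instead of
    materializing every augmented image on disk up front."""
    for img in images:
        for _ in range(n_augments):
            # Stand-in for a real transform (rotation, flip, shift, ...):
            # tag the image with a random rotation angle.
            angle = random.uniform(-10.0, 10.0)
            yield (img, angle)

# Nothing is generated until we iterate; each example lives only in RAM
# while the model consumes it, then is garbage-collected.
gen = augmented_batches(["img0.jpg", "img1.jpg"], n_augments=5)
first_img, first_angle = next(gen)
```

This is why the storage-cost argument above never arises: the 50 GB+ of augmented data is never materialized all at once.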

I’m having the same issue. How do I set up this data?

Yes, you need to download it; this is explained in lesson 3.

Hello,
I am trying to code the Dog Breed notebook. I am getting an error when I try to execute
learn = ConvLearner.pretrained(arch, data, precompute=True)
The last line of the error trace shows [Errno 2] No such file or directory: ‘/home/paperspace/fastai/courses/dl1/fastai/weights/resnext101_64x4d.pth’

Please help.

Hi Bikash,

This post should help:

@mhmoodlan
Cyclical Learning Rates for Training Neural Networks
Link to the paper used in the learning rate finder: https://arxiv.org/abs/1506.01186


Hi,

I’m a little confused about the learning rate finder. Should it be used on an untrained model? I’m trying to implement the algorithm myself in Keras (on an untrained example CNN model – see the gist at the bottom).

I thought I would write my own learning rate scheduler to feed into a Keras model, but when plotting my loss vs. learning rate, the loss just hovers around 2 and then explodes when the learning rate gets close to 1.

I start with a learning rate of 1e-5. At the end of each batch, I update my learning rate:

newLR = initialLR* (1 + 0.04)^{batchNumber}

which gives me a learning rate schedule similar to the lecture notes:

[image: iterations vs. learning rate]
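For reference, that schedule can be computed directly (a quick sketch; lr_at_batch is my own name):

```python
def lr_at_batch(batch_number, initial_lr=1e-5, growth=0.04):
    """Exponential LR-finder schedule: newLR = initialLR * (1 + 0.04)^batchNumber."""
    return initial_lr * (1.0 + growth) ** batch_number

# Starting at 1e-5 and growing 4% per batch, the learning rate crosses 1.0
# around batch 294 (log(1e5) / log(1.04) is roughly 293.5), which matches the
# point where the loss explodes.
schedule = [lr_at_batch(b) for b in range(300)]
```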

When I compare this learning rate to the loss values at the end of each batch, I get a relatively flat loss value then it explodes:

[image: learning rate vs. loss]

Not sure what I’m missing here… I created a Gist of the code I used. Most of the action happens in the on_batch_end method of my callback.

I expected to see the loss decrease in a nice downhill kind of way to a valley, the way the learning rate vs. loss curve is shown in the lecture, but mine isn’t even close to that.

Any advice would be much appreciated!

When I run lesson1-rxt50.ipynb I get ‘invalid index to scalar variable.’ in the save_metrics function after learn.fit().
Can somebody help me? Thanks.

I have the same issue. There is a double “[0]” in sgdr.py:79.
If you change it to a single “[0]”, it will work correctly.

But anyway, I think there should be a conditional check on vals to support both single and multiple validation losses;
.fit(…, all_val=True) will pass multiple val losses here.


This isn’t what happens. No data augmentation occurs when precompute=True. Search the forums for “precompute” to see the correct answer to this question.


I tried to use .fit(…, all_val=True) and another issue came up.
I changed the source code a little by adding some if/else to avoid indexing the scalar vals[0].
Thank you very much for your answer.

Sorry, I meant that if you pass it, it will break again:

  • without all_val, only one [0] is needed
  • with all_val, a double [0] is needed

I’m not sure what the all_val parameter is for, so I don’t know the right solution yet, but
removing the extra [0] is a quick fix for now.
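A sketch of the conditional check suggested above (a guess at the shapes involved, based only on this thread; first_val_loss is my own helper name):

```python
def first_val_loss(vals):
    """Return the first validation loss whether fit() recorded a plain
    scalar per epoch or, with all_val=True, a sequence of per-metric
    losses nested one level deeper (assumption from the thread)."""
    v = vals[0]
    if isinstance(v, (list, tuple)):  # all_val=True: one level deeper
        return float(v[0])
    return float(v)                   # plain scalar loss
```

With something like this, save_metrics would not need to know in advance whether all_val was passed.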

As long as the training loss is above or equal to the validation loss, there is no overfitting; however, if your validation loss starts increasing above your training loss, that is a sign of overfitting.
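That rule of thumb can be written as a tiny check (my own sketch; the names are illustrative):

```python
def is_overfitting(train_losses, val_losses):
    """Overfitting signal per the rule above: validation loss has risen
    above training loss and is still increasing."""
    if len(val_losses) < 2:
        return False
    return (val_losses[-1] > train_losses[-1]
            and val_losses[-1] > val_losses[-2])

is_overfitting([0.50, 0.40], [0.45, 0.38])  # val below train -> False
is_overfitting([0.50, 0.30], [0.45, 0.55])  # val above train and rising -> True
```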

Thanks Bellomatic, the solution worked for me.

Could you please explain why this process of precomputing the activations is not used with the transformed versions of the image?

Why would you not want to precompute the activations of the transformed images? Is it because they are random, so the precomputed activations won’t be useful when a new random transformation is used in the next training run?


That’s correct. Because a new augmented image is randomly generated every time, there is no point in caching its activations.
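The caching argument can be made concrete with a toy sketch (my own illustration, not fastai code): precomputing amounts to memoizing the frozen layers' output per input, which only helps when the same input recurs.

```python
import random

feature_cache = {}

def frozen_body(x):
    """Stand-in for the frozen convolutional layers: deterministic per input."""
    return hash(x) % 997  # toy "activation"

def get_features(img_id, augment=False):
    if augment:
        # A random transform makes every pass a fresh input, so a cached
        # activation would never be hit again -- precomputing buys nothing.
        return frozen_body((img_id, random.random()))
    # Without augmentation the same input recurs, so caching pays off:
    # the expensive forward pass runs once per image.
    if img_id not in feature_cache:
        feature_cache[img_id] = frozen_body(img_id)
    return feature_cache[img_id]
```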

For anyone who has trouble loading the planet file data, I made an ipynb that guides you through it: https://github.com/daphn3cor/Fastai-Planet-Files


This is an awesome tip, but I’m not understanding the novelty aspect. How does it differ from augmenting with zoomed images, which we learned to do previously? Conceptually it isn’t any different. Is there something different going on under the hood?

01:32:45 Undocumented Pro-Tip from Jeremy: train on a small size, then use ‘learn.set_data()’ with a larger data set (like 299 instead of 224 pixels)

I looked some more, and the zoom actually crops. So that’s the only difference, right? If it were a non-cropping zoom, there would be no difference between these approaches?

Hello,

I’m trying to perform dog breed image classification on my laptop and I’m getting the issue below. It looks like it’s because the site has a bad SSL implementation. The issue occurs while trying to download resnet from the PyTorch website.

Any ideas on how to resolve this issue would be appreciated.

Thanks,
Sravan