# Wiki: Lesson 1

(Sairam) #282

Thanks for doing this for all of us Jeremy & Rachel!

I have two questions re Lesson 1:

Regarding the Cyclical Learning Rate paper: Is there a way we can use this method to determine the optimum learning rate even when we aren't optimizing with plain SGD? For example, what if our loss function is a combination of two different losses, or something like cross-entropy loss?

Where to put the sample images for the homework assignment: In the video, Jeremy asked us to put in a few images from 2 classes of our choice and train the network on those classes. In the data subfolder, we already have the dog and cat subfolders. Do we remove those and put in our new image class folders? If we don't, then the network will try to classify 4 different categories, right?

(Marc) #283

Sorry, can only help with question 2:

1. Create a separate folder in the `data/` directory.
2. Point the `PATH` variable (it is set in the 4th code cell of the notebook) to your new folder. So instead of `PATH = "data/dogscats/"` it should then read e.g. `PATH = "data/myimages/"`. Don't forget to run the cell.
3. Make sure the folder contains the structure mentioned in the lecture and used in the dogs-vs-cats example (`train` and `valid` folders, with one subfolder per class).
4. The notebook will now basically run end-to-end with your dataset. (There are a few hardcoded things, like looking at cat pics first; there you will also have to adjust the path manually…)
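The steps above can be sketched in a few lines; `data/myimages/` and the class names below are just placeholders for your own dataset:

```python
import os

# Hypothetical dataset root and class names -- substitute your own.
PATH = "data/myimages/"
classes = ["bears", "wolves"]

# Create the layout ImageClassifierData.from_paths expects:
# one subfolder per class under both train/ and valid/.
for split in ("train", "valid"):
    for cls in classes:
        os.makedirs(os.path.join(PATH, split, cls), exist_ok=True)

# Check the structure is in place before dropping your images in.
for split in ("train", "valid"):
    print(split, sorted(os.listdir(os.path.join(PATH, split))))
```

After that, copy your images into the matching class subfolders and point `PATH` at the new root.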

#284

Couple dumb questions about stuff mentioned in first video:

1. The Universal Approximation Theorem is mentioned as requiring an exponentially large network, but then it's said that backpropagation helps with that. Is there a version of the theorem that quantifies how much it helps? E.g. does the required size become polynomial?

2. Learning rate finder is reminiscent of old fashioned numerical root finders and the like, used in calculators and desktop programs. There’s a famous article by W. Kahan about the HP-34C solver from 1979: http://www.hpl.hp.com/hpjournal/pdfs/IssuePDFs/1979-12.pdf (starts at page 20 of the pdf). Is this similar? Is traditional numerics much help in machine learning?

3. Similarly is it reasonable to find the minimum by numerical differentiation and then looking for derivative = 0 with a traditional root finder?

4. The demo showing the different layers of a DNN recognizing features showed a layer recognizing circles. But since the input is a 3x3 grid, would that actually recognize circles of only a specific size? Do actual deep learning algorithms manage to recognize shapes like circles regardless of their size? Does anyone train on Fourier transforms of the input images, or anything like that?

Sorry to be so low level early in the course, the opposite of the advice about going top-down. Those issues just jumped out at me.

The course looks great, thanks a million for doing it.
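For what it's worth, the learning-rate finder from question 2 can be illustrated with a library-free toy sketch (this is a stand-alone illustration of the idea, not the fast.ai implementation): run a few gradient steps on a simple quadratic loss at each candidate rate and watch where the loss stops shrinking and starts diverging.

```python
# Toy LR range test on f(x) = x**2 (gradient 2*x), no libraries needed.
# For each candidate learning rate, take a few SGD steps from the same
# starting point and record the final loss.

def loss_after_steps(lr, steps=10, x0=5.0):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # one SGD step using the gradient of x**2
    return x * x

candidates = [0.001, 0.01, 0.1, 0.5, 1.0, 1.5]
for lr in candidates:
    print(f"lr={lr:<6} final loss={loss_after_steps(lr):.6f}")
```

The loss keeps improving as the rate grows, then blows up past a threshold (here at lr=1.5); the finder picks a rate just below that blow-up point.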

#285

Has anyone tried to run this on Kaggle? I tried, but it fails on `ConvLearner.pretrained()`; it complains about failing to download the model.

I then changed to using `ConvLearner.lsuv_learner()`. Is that the correct thing to do?

(Rahim Shamsy) #286

I recently came across the `sched` method being used on the object returned by `ConvLearner.pretrained` (the source of `ConvLearner.pretrained` suggests that the returned object is constructed via a `cls(...)` call).

I want to understand this `sched()` method that is applied to the returned object. If I use the `??` Jupyter notebook shortcut for accessing code docs, how would I reveal the `sched()` method? I tried the following:

`??ConvLearner.pretrained.sched`

to which I get the output: Object `ConvLearner.pretrained.sched` not found.

Thanks,
Rahim

(Sairam) #288

Thanks Marc!

(Murali Mohana Krishna) #289

Hi, I am trying to run the code in lesson 1 and am getting a CUDA error. It seems that after running `fit` it is not releasing the CUDA resource. So when I run the cell:

```python
arch = resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 2)
```

I get the error:

`RuntimeError: cuda runtime error (46) : all CUDA-capable devices are busy or unavailable at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58`

How can I solve this?

(Marc) #291

Never seen this myself, but you don’t seem to be the only one with this type of error.

Maybe this helps; see the last comment about switching modes of your GPU:

Other than that, are you aware whether you are running multiple processes that use the GPU?
I am not sure what to make of this SO answer, but maybe keep looking in that direction:

In general, just googling for your error message will often help you find the solution. It’s what I did with your message above.

(Gang Cheng) #292

This may be related to some GPU memory being consumed by driving the display, especially with a 4K+ monitor. You may want to separate display and compute onto different cards.

(Gang Cheng) #293

The image size is tied to your particular problem and the computing power you have. Larger images need more GPU memory and will be slower in training. However, if you downsize the images too much, important features may get lost. Medical images typically need higher resolution than other kinds of samples.
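To put rough numbers on that trade-off: even just the input batch grows with the square of the image side length. A back-of-the-envelope sketch (float32 RGB images; a real network uses many times this for intermediate activations, gradients, and weights):

```python
# Approximate memory for one batch of RGB float32 input images.
# The quadratic growth with image size is the point here.

def input_batch_mb(bs, sz, channels=3, bytes_per_float=4):
    return bs * channels * sz * sz * bytes_per_float / 2**20

for sz in (224, 299, 512):
    print(f"sz={sz}: {input_batch_mb(64, sz):.1f} MB for a batch of 64")
```

Doubling the image side quadruples the memory, which is why larger images usually force a smaller batch size.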

(Gang Cheng) #294

This V2 is very different from V1, since it puts more emphasis on learning the high-level picture (the top-down approach) by abstracting more of the implementation into the fast.ai library. It is not a course about learning how to use TF, Keras, etc.

(Kevin Chow) #295

(Amal) #296

How do I change the batch size?
I can’t find this variable in the code!

#297

The batch size appears as `bs` in the `get_augs()` function definition.

(Carlos Vouking) #298

You can change your batch size like so:

`data = ImageClassifierData.from_paths(path, tfms=tfms, bs=30, …)`

Hope this helps.

(Amal) #299

I have a question…
How can the classifier assess itself on the test set when it doesn't know the correct answers for the test set?
I mean, how can the classifier be sure it will achieve a certain accuracy, say 98% or 99%, when it only knows the correct answers for the validation set but not the test set?

How can it be sure it will achieve exactly the same accuracy on the test set as on the validation set?
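It can't be sure: the validation accuracy is only an estimate of how the model will do on unseen data. A tiny stand-alone sketch of the distinction, with made-up labels:

```python
# Accuracy is just the fraction of predictions matching known labels.
def accuracy(preds, labels):
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

# Validation set: labelled data the model never trained on, so this
# accuracy is an *estimate* of performance on unseen data.
val_labels = [1, 0, 1, 1, 0, 1, 0, 1]
val_preds  = [1, 0, 1, 0, 0, 1, 0, 1]
print(accuracy(val_preds, val_labels))  # 0.875

# Test set (e.g. on Kaggle): the labels are hidden from us, so we can
# only submit predictions and let the organiser compute the accuracy.
# Validation accuracy is our best guess of it, not a guarantee.
```

If the validation set is representative and large enough, the two numbers are usually close, but they are never guaranteed to be identical.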

(Kofi Asiedu Brempong) #300

Hey guys, check out my first Medium post; it's based on an image classifier I wrote.

#301

Hi everyone!

I'm having a hard time using the fast.ai library on a Linux machine (Scientific Linux 7) that I SSH into. In short: when building resnet50, the remote machine is unable to locate the pre-trained model.

I set up fast.ai on the machine by following the instructions on the wiki. When I try to build the model:

```python
PATH = 'my_data/hep_images/'
sz = 300
arch = resnet50
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz), bs=32)
learn = ConvLearner.pretrained(arch, data, precompute=True)
```

This results in the following error:

```
FileNotFoundError                         Traceback (most recent call last)
<ipython-input> in <module>()
      2 arch = resnet50
      3 data = ImageClassifierData.from_paths(PATH,tfms=tfms_from_model(arch, sz),bs=32 )
----> 4 learn = ConvLearner.pretrained(arch, data, precompute=True)

/mnt/scratch/eab326/fastai/courses/dl1/fastai/conv_learner.py in pretrained(cls, f, data, ps, xtra_fc, xtra_cut, custom_head, precompute, pretrained, **kwargs)
    111             pretrained=True, **kwargs):
    112         models = ConvnetBuilder(f, data.c, data.is_multi, data.is_reg,
    114         return cls(data, models, precompute, **kwargs)
    115

/mnt/scratch/eab326/fastai/courses/dl1/fastai/conv_learner.py in __init__(self, f, c, is_multi, is_reg, ps, xtra_fc, xtra_cut, custom_head, pretrained)
     38         else: cut,self.lr_cut = 0,0
     39         cut-=xtra_cut
---> 40         layers = cut_model(f(pretrained), cut)
     41         self.nf = model_features[f] if f in model_features else (num_features(layers)*2)

/mnt/scratch/eab326/anaconda3/envs/fastai/lib/python3.6/site-packages/torchvision/models/resnet.py in resnet50(pretrained, **kwargs)
    186     model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
    187     if pretrained:
    189     return model
    190

     55     model_dir = os.getenv('TORCH_MODEL_ZOO', os.path.join(torch_home, 'models'))
     56     if not os.path.exists(model_dir):
---> 57         os.makedirs(model_dir)
     58     parts = urlparse(url)
     59     filename = os.path.basename(parts.path)

/mnt/scratch/eab326/anaconda3/envs/fastai/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
    218         return
    219     try:
--> 220         mkdir(name, mode)
    221     except OSError:
    222         # Cannot rely on checking for EEXIST, since the operating system

FileNotFoundError: [Errno 2] No such file or directory: '/home/eab326/.torch/models'
```

I’d appreciate any help resolving this issue. Thank you!
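The traceback ends in `os.makedirs` failing to create `/home/eab326/.torch/models`, which suggests the default cache location isn't creatable on that node. One hedged workaround, run before building the model: point torchvision's model cache somewhere writable via the `TORCH_MODEL_ZOO` environment variable (which the traceback shows torchvision consulting), or pre-create the default directory. The scratch path below is a placeholder.

```python
import os

# Option 1: pre-create the default cache directory torchvision expects.
os.makedirs(os.path.expanduser("~/.torch/models"), exist_ok=True)

# Option 2: redirect the model cache to a writable location
# (placeholder path -- substitute one that exists on your machine).
os.environ["TORCH_MODEL_ZOO"] = "/tmp/torch_models"
os.makedirs(os.environ["TORCH_MODEL_ZOO"], exist_ok=True)
```

Note that the environment variable must be set before the pretrained model download is triggered.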

(Fred Guth) #302

I guess that your question got lost among the others, and as I didn't see anyone answer, I will try to explain (I suspect you know by now, but at least it will be recorded for posterity).

So, an epoch is one pass through the whole training set. The batch size is how many images you process at once. See, images are represented as tensors (n-dimensional arrays), and you can do your calculations with big tensors or small tensors.

If you use big tensors it is faster, because you need fewer computations to pass through all your data. In other words, you need fewer iterations to finish an epoch. But you will also need more memory to keep that big tensor (the batch) in GPU RAM. In general, you want to keep batches as big as your GPU memory allows.

To set the batch size, you can try different sizes and keep an eye on your GPU usage (the `nvidia-smi` command). If you set it too high, your code will throw a runtime error; if that happens, reduce the batch size.
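As a quick sanity check on the relationship above, iterations per epoch is just the dataset size divided by the batch size, rounded up (toy numbers below):

```python
import math

def iterations_per_epoch(n_images, batch_size):
    # One epoch visits every image once; the last batch may be partial.
    return math.ceil(n_images / batch_size)

n = 23000  # roughly the size of the dogs-vs-cats training set
for bs in (16, 64, 256):
    print(f"bs={bs}: {iterations_per_epoch(n, bs)} iterations per epoch")
```

A 4x bigger batch means 4x fewer iterations per epoch, at the cost of 4x the memory per batch.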