Lesson 2: further discussion ✅

Can someone tell me what this system error means and how to correct it?

Hey, thanks @kat6123, the problem has been solved. The issue was the path of the folder.

Question: I have training data in an NPY file whose shape is (814, 440, 440, 1), which consists of 814 images of size 440 x 440 x 1, and the labels are in an (814, 1) NPY file containing ones and zeroes. Is there a way to create a data bunch from this data? I used to train this with Keras, but I don’t quite know how to do it with the fastai library.

Thank you!
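
One possible starting point, sketched with plain PyTorch rather than a confirmed fastai recipe: convert the arrays to tensors, move the channel axis to where PyTorch expects it, and wrap them in a `TensorDataset`. The small random arrays below are stand-ins for the real files (which would come from `np.load`), and the 80/20 split and batch size are arbitrary assumptions. As far as I recall, fastai v1's `DataBunch.create` can wrap PyTorch datasets like these, but treat that as something to check against the docs.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Stand-in arrays with the same layout as in the question, just smaller:
# images are (N, H, W, 1) and labels are (N, 1) with values 0 or 1.
images = np.random.rand(20, 44, 44, 1).astype(np.float32)
labels = np.random.randint(0, 2, size=(20, 1))

# PyTorch models expect channels first: (N, H, W, C) -> (N, C, H, W).
x = torch.from_numpy(images).permute(0, 3, 1, 2)
# Flatten the labels to shape (N,) and use integer class indices.
y = torch.from_numpy(labels).squeeze(1).long()

dataset = TensorDataset(x, y)

# Hold out 20% of the items for validation (an arbitrary choice).
n_valid = len(dataset) // 5
train_ds, valid_ds = random_split(dataset, [len(dataset) - n_valid, n_valid])

train_dl = DataLoader(train_ds, batch_size=8, shuffle=True)
valid_dl = DataLoader(valid_ds, batch_size=8)

# Pull one batch to check the shapes.
xb, yb = next(iter(train_dl))
```

Note that a single-channel image will not match a pretrained ResNet's three-channel input without repeating the channel or changing the first layer.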

@jeremy In the lesson 2 video lecture (1.10.10), Jeremy talks about unbalanced data: he says to do nothing about unbalanced classes and to just try training the network on the unbalanced data. But the paper in the resources section (https://arxiv.org/pdf/1710.05381.pdf) says in its conclusion that “The effect of class imbalance on classification performance is detrimental”. Can you clarify this?

I was going through the lesson 2 SGD notebook, and when I run the code I get this error.

Code:
def update():
    y_hat = x@a
    loss = mse(y, y_hat)
    #if t % 10 == 0: print(loss)
    loss.backward()
    with torch.no_grad():
        a.sub_(lr * a.grad)
        a.grad.zero_()

for i in range(0,100):
    update()
lr = 1e-1

And the following error:

RuntimeError                              Traceback (most recent call last)

in ()
      1 for i in range(0,100):
----> 2     update()
      3 lr = 1e-1

2 frames

in update()
      3     loss = mse(y, y_hat)
      4     #if t % 10 == 0: print(loss)
----> 5     loss.backward()
      6     with torch.no_grad():
      7         a.sub_(lr * a.grad)

/usr/local/lib/python3.6/dist-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    105                 products. Defaults to False.
    106         """
--> 107         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    108
    109     def register_hook(self, hook):

/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     91     Variable._execution_engine.run_backward(
     92         tensors, grad_tensors, retain_graph, create_graph,
---> 93         allow_unreachable=True)  # allow_unreachable flag
     94
     95

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Can anyone help me out?
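
A likely cause, though I can't see the rest of the notebook: this error is what PyTorch raises when `backward()` is called on a loss that does not depend on any tensor with `requires_grad=True`. In the lesson notebook the weights are wrapped in `nn.Parameter` before the loop, so it is worth checking that this cell was run. Below is a minimal sketch with synthetic data; the data, the starting values of `a` and the placement of `lr` before the loop are my assumptions, not the original notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic linear data: a random feature plus a constant column for the bias.
n = 100
x = torch.ones(n, 2)
x[:, 0].uniform_(-1.0, 1.0)
true_a = torch.tensor([3.0, 2.0])
y = x @ true_a + 0.1 * torch.randn(n)

def mse(y_true, y_pred):
    return ((y_true - y_pred) ** 2).mean()

# nn.Parameter marks the tensor as requiring gradients. Without this (or
# requires_grad=True), loss.backward() raises the RuntimeError quoted above.
a = nn.Parameter(torch.tensor([-1.0, 1.0]))

# Define the learning rate before the loop that uses it.
lr = 1e-1

def update():
    y_hat = x @ a
    loss = mse(y, y_hat)
    loss.backward()
    with torch.no_grad():
        a.sub_(lr * a.grad)  # gradient descent step
        a.grad.zero_()       # reset the gradient for the next iteration
    return loss.item()

losses = [update() for _ in range(100)]
```

If `a` is a plain tensor created without `requires_grad=True`, the same loop fails on the first `loss.backward()`.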

This is a question about the train/test/validation split. I notice that accuracy is measured on the validation set, and I was wondering whether the validation set is used to fine-tune the model. If so, wouldn’t that lead to overfitting? Isn’t it better to set aside a piece of the data, never use it during training, and only evaluate on it after the model has been fine-tuned? I might be missing something. Thank you!

Yes, you can set aside a subset of your data and treat it as your test set. You would then evaluate your model on the test set at the very end, which gives you a result that can be considered a fairly honest judgement of your model. But the same care that goes into creating a validation set has to go into creating a test set as well.
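
To make that concrete, here is a minimal sketch of a three-way split over item indices. The function name, the 20%/10% fractions and the fixed seed are illustrative assumptions; a purely random split like this is only appropriate when the items are independent of each other.

```python
import random

def three_way_split(n_items, valid_frac=0.2, test_frac=0.1, seed=42):
    """Shuffle indices once, then carve out test, validation and training parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    n_test = int(n_items * test_frac)
    n_valid = int(n_items * valid_frac)

    test_idx = indices[:n_test]                    # touched only at the very end
    valid_idx = indices[n_test:n_test + n_valid]   # used to tune while training
    train_idx = indices[n_test + n_valid:]         # used to fit the model
    return train_idx, valid_idx, test_idx

train_idx, valid_idx, test_idx = three_way_split(1000)
```

The test indices stay untouched until all tuning against the validation set is finished.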

@dreambeats Thank you for the response! Now, is it fair then to compare the results to the state of the art even if they are not measured on the test set?

Can you expect them to be exactly the same? Probably not. As long as your test set is a decent representation of the general population of observations, you can take the results on it quite seriously. One caveat, however: even if your model generalises well, it’s hard to say how well it will do on the official test set, because it’s entirely possible that your model happens to do slightly worse on the data in there. This is common in Kaggle competitions, where placings on the public and private leaderboards tend to vary (sometimes by quite a bit) for exactly this reason. Just because you did well on one test set doesn’t mean you’ll do equally well on another, but you usually won’t do much worse.

@dreambeats makes sense! Thank you again for the response.

I’m worried about how much of the math I should know. If I complete the SGD notebook and make sure I understand everything in it will I be OK to continue?

Welcome to the forum @nole

You do not need to worry too much about the maths behind the scenes. In fact, that is the beauty of fastai’s top-down approach. First you develop a general intuition for the concepts and keep moving through the lessons and practising. In the end everything will gel together, and you will become more curious about the maths behind it. I would say keep moving ahead without worrying too much about the unknown.

Thank you for the welcome @dinesh.chauhan! And thank you for the reply. This definitely eases my worries. I was afraid that if I didn’t fully get something I’d lose the ability to continue. I’ll make my web app and try not to worry too much!

Hi guys,

It’s been pretty rough setting up Colab and working out the right paths.

This is the part where the video goes out of sync with the notebook.

I have not run this step yet, as it is not explained. Can anyone please advise or point me in the right direction?
I think the step is trying to filter the bad pictures out of the three bear classes, but what the video shows is not the same as the screenshot below. Thanks

1st post today! Revisiting here Lesson 2 by trying to load the images as .npy files through torchvision.datasets.DatasetFolder() -> torch.utils.data.DataLoader -> ImageDataBunch

When building: learn = cnn_learner(data, models.resnet34, metrics=error_rate), the following error pops up:


AttributeError                            Traceback (most recent call last)
in
----> 1 learn = cnn_learner(data, models.resnet34, metrics=error_rate)

/opt/conda/lib/python3.6/site-packages/fastai/vision/learner.py in cnn_learner(data, base_arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, init, concat_pool, **kwargs)
     94     "Build convnet style learner."
     95     meta = cnn_config(base_arch)
---> 96     model = create_cnn_model(base_arch, data.c, cut, pretrained, lin_ftrs, ps=ps, custom_head=custom_head,
     97         split_on=split_on, bn_final=bn_final, concat_pool=concat_pool)
     98     learn = Learner(data, model, **kwargs)

/opt/conda/lib/python3.6/site-packages/fastai/basic_data.py in __getattr__(self, k)
    120         return cls(*dls, path=path, device=device, dl_tfms=dl_tfms, collate_fn=collate_fn, no_check=no_check)
    121
--> 122     def __getattr__(self,k:int)->Any: return getattr(self.train_dl, k)
    123     def __setstate__(self,data:Any): self.__dict__.update(data)
    124

/opt/conda/lib/python3.6/site-packages/fastai/basic_data.py in __getattr__(self, k)
     36
     37     def __len__(self)->int: return len(self.dl)
---> 38     def __getattr__(self,k:str)->Any: return getattr(self.dl, k)
     39     def __setstate__(self,data:Any): self.__dict__.update(data)
     40

/opt/conda/lib/python3.6/site-packages/fastai/basic_data.py in DataLoader___getattr__(dl, k)
     18 torch.utils.data.DataLoader.__init__ = intercept_args
     19
---> 20 def DataLoader___getattr__(dl, k:str)->Any: return getattr(dl.dataset, k)
     21 DataLoader.__getattr__ = DataLoader___getattr__
     22

AttributeError: 'DatasetFolder' object has no attribute 'c'

Thanks for any advice.
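
Reading the traceback, `cnn_learner` asks for `data.c` (the number of classes), and that lookup is forwarded from the data bunch to the DataLoader and finally to the underlying dataset. torchvision's `DatasetFolder` exposes `classes` but no `c`, hence the error. One possible workaround is to set `c` on the dataset yourself before building the data bunch. The sketch below uses a hypothetical stand-in dataset class instead of a real `DatasetFolder` so that it runs without any files; other attributes may be looked up the same way, so this may not be the only missing piece.

```python
import torch
from torch.utils.data import Dataset

class NpyFolderLike(Dataset):
    """Hypothetical stand-in for a DatasetFolder of .npy files."""

    def __init__(self, samples, classes):
        self.samples = samples   # list of (tensor, class_index) pairs
        self.classes = classes   # class names, as DatasetFolder provides

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        return self.samples[i]

classes = ["black", "grizzly", "teddy"]
samples = [(torch.zeros(3, 8, 8), i % 3) for i in range(6)]
ds = NpyFolderLike(samples, classes)

# The attribute fastai is looking for does not exist yet...
has_c_before = hasattr(ds, "c")

# ...so add it by hand: the number of classes.
ds.c = len(ds.classes)
```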

Hi, I am trying to run the lesson2-download colab notebook. I have downloaded csv files to my local machine containing urls for images of teddy, black and grizzly bears. When I run all the appropriate cells up to the download images section, it does not return any errors. However, when I run the download_images cell, an error is returned saying that the path I passed does not exist, even though I can clearly see in my google drive that the path does exist. Here is an image of the error. Any help would be appreciated, thanks!

use ‘My Drive’ in your path instead of My Drive.
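
The reply above is terse. One reading is that the folder name should be written with a plain space inside the quoted string, not in the shell-escaped form `My\ Drive`. In a Python string the backslash is kept as part of the name, so the escaped form points at a folder that does not exist. A small sketch of the difference, using a temporary directory as a stand-in for the Drive mount point (the folder names are made up for illustration):

```python
from pathlib import Path
import tempfile

# Temporary directory standing in for the Google Drive mount point.
mount = Path(tempfile.mkdtemp())

# Create the destination folder; the space in "My Drive" needs no escaping.
dest = mount / "My Drive" / "bears" / "teddy"
dest.mkdir(parents=True, exist_ok=True)

# The same path written as a plain string with a normal space.
plain = Path(f"{mount}/My Drive/bears/teddy")

# A shell-style escape: the backslash becomes part of the folder name.
escaped = Path(f"{mount}/My\\ Drive/bears/teddy")

plain_exists = plain.exists()
escaped_exists = escaped.exists()
```

Checking `path.exists()` right before the download call is a quick way to see which form the notebook is actually using.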

Did you manage to fix it?
I got the same error and don’t know how to solve it.

[QUESTION] HOW TO DOWNLOAD IMAGES WITH FOR-LOOP?

In class, Jeremy downloads images this way:

image

That’s OK if we have only a few classes, but when the number of categories is much larger, for example 50, we have to use a for-loop instead.

Just like this:

I ran the code above. It works for the first and second iterations but fails at the third: I get two folders named ‘teddy’ and ‘black’, each containing 120 pictures, but no folder named ‘grizzly’.

Outcome like this:

HERE IS MY QUESTION:
Actually, when I use Jeremy’s method (downloading each category one by one), the outcome above also appears, but it doesn’t matter. When I use a for-loop, however, it breaks the loop. How can I fix it?

problem solved:
open this file ‘fastai/core.py’,
delete line 227: ‘import sys;sys.exit(1)’,
save and quit,
done.
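
The fix above edits the installed library, so it will be lost on reinstall or upgrade. An alternative that leaves fastai untouched, assuming the loop really is being stopped by the `sys.exit(1)` call quoted above, is to catch `SystemExit` inside the loop. `SystemExit` is not a subclass of `Exception`, so a plain `except Exception` would not catch it. In the sketch below, `download_category` is a hypothetical stand-in for the real download call, and which class triggers the exit is arbitrary.

```python
import sys

def download_category(name):
    """Hypothetical stand-in for the real download call.

    Exits the interpreter for one class, mimicking the library line quoted above.
    """
    if name == "black":
        sys.exit(1)
    return f"downloaded {name}"

classes = ["teddy", "black", "grizzly"]
results = {}

for name in classes:
    try:
        results[name] = download_category(name)
    except SystemExit as exc:
        # Record the failure and carry on with the next class.
        results[name] = f"skipped (exit code {exc.code})"
```

With this pattern the loop reaches every class even when one call tries to exit.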