Lesson 1 In-Class Discussion ✅

digital.entomologist · July 12, 2019, 11:16am

The rates aren’t changing drastically. After the last epoch I get error rates between 6.1% and 7.1%.
Once I also observed what looked like overfitting in the 4th epoch: the 3rd epoch had an error rate of 6.8%, while the 4th epoch had an error rate of just under 7%.

Here are 2 example results:

Am I misunderstanding the use of setting the seed here?

Are you observing the same or is this some issue with how I ran it?

disposableraft · July 12, 2019, 4:41pm

Hi all!

I’m new to deep learning and I’m wondering about how the number of classes affects the model’s output. Specifically, what’s happening when you train for one class versus three classes?

For lesson one’s homework, I thought I would build a Husky identifier (huskies are hot dogs, too). I used images from the Stanford Dogs dataset. The dataset has 192 Siberian huskies, which I divided into three directories: train (60%), valid (20%) and test (20%). So the training set is 116 images, and the other two both have 38 images.

I guessed that the model could learn with only one class, and that class scores were independent of each other. But, running learn.fit_one_cycle returned zeros for all loss and error columns.

So I added classes and observed that the error and loss rates changed, in some cases increasing slightly and in other cases decreasing.

Can someone please shed light on how the number of classes affects the output?

Thanks!

digital.entomologist · July 12, 2019, 5:07pm

Hi @disposableraft,

If you are thinking about how the number of output categories influences what you might expect, it helps to consider how chance affects the expected result.

If you have 2 different classes, a naive guess can achieve 50% accuracy (and therefore a 50% error rate) simply by flipping a coin to pick the output.

If you have 3 classes, the expected accuracy drops to about 33%, with the error rate rising to about 67%.

For any number N of equally likely classes, the expected result of random guessing is accordingly:

Accuracy: 1/N
Error rate: 1-1/N
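
A quick way to check these numbers is to simulate uniform random guessing. The sketch below is only an illustration, assuming balanced classes and a guesser that picks every class with equal probability; the function name chance_error_rate is made up for this example.

```python
import random

def chance_error_rate(n_classes, n_samples=100_000, seed=0):
    """Estimate the error rate of uniform random guessing by simulation.

    Assumes the true labels are spread evenly over the classes.
    """
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_samples):
        label = rng.randrange(n_classes)   # true class, drawn uniformly
        guess = rng.randrange(n_classes)   # naive guess, also uniform
        wrong += guess != label
    return wrong / n_samples

for n in (1, 2, 3, 10):
    print(n, "classes:", round(chance_error_rate(n), 3), "vs 1-1/N =", round(1 - 1 / n, 3))
```

With enough samples the simulated value settles close to 1-1/N, and with a single class it is exactly 0 because the guess can never differ from the label.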

A neural network that you train to reduce the error rate should give you a lower error rate than chance.

A classification problem with a single class is not really a classification problem, because the output is always the same. Any approach that has only one option to pick must pick it, so the predicted results always match the expected results. Therefore, the accuracy is 100% and the error rate is 0%.
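
The same reasoning can explain why the loss columns show zeros too. Assuming the model ends in a softmax and is trained with cross-entropy loss (an assumption about the setup, not something stated in the posts above), a single output always receives probability 1, and the negative log of 1 is 0. A minimal sketch in plain Python, with helper names chosen just for this example:

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target])

# One class: whatever the raw score is, its probability is 1, so the loss is 0.
single_class_loss = cross_entropy([3.7], target=0)

# Three classes: the correct class gets less than probability 1, so the loss is positive.
three_class_loss = cross_entropy([3.7, 1.2, -0.5], target=0)

print(single_class_loss, three_class_loss)
```

Because softmax makes the class scores compete for a total probability of 1, the scores are not independent of each other, which is why adding classes changes the reported loss and error.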

This means that the more classes you have, the harder it is to get a high accuracy score.

disposableraft · July 12, 2019, 5:28pm

Thank you @digital.entomologist! (And great name, btw.) This is super helpful.

I partly grasped that the zero values with single-class classification were related to the fact that all images belonged to that class (i.e., not a problem), but I wasn’t sure that was the case. This clarifies it.

Thank you again!

ryba183 · July 12, 2019, 10:17pm

Hi,

I have a problem while running

learn.fit_one_cycle(4)

The 1st cycle is OK, but after the 2nd one starts I get this message:

[Errno 22] Invalid argument

and training is interrupted.

digital.entomologist · July 13, 2019, 6:53am

Hi @ryba183,

Are you running this on Windows?
I am unable to check the actual source code right now but that error code seems to be connected to I/O operations.
I suspect it is trying to persist the epoch results after running it and is unable to open the file.

It might be that the path is wrong.

Could you maybe post the whole stack trace?

ryba183 · July 13, 2019, 9:04pm

Hi,

Thank you for your help.
This is my problem:
OSError Traceback (most recent call last)
in
----> 1 learn.fit_one_cycle(4)

~/anaconda3/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
20 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,
21 final_div=final_div, tot_epochs=tot_epochs, start_epoch=start_epoch))
---> 22 learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
23
24 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, wd:float=None):

~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
198 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
199 if defaults.extra_callbacks is not None: callbacks += defaults.extra_callbacks
--> 200 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
201
202 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
104 if not cb_handler.skip_validate and not learn.data.empty_val:
105 val_loss = validate(learn.model, learn.data.valid_dl, loss_func=learn.loss_func,
--> 106 cb_handler=cb_handler, pbar=pbar)
107 else: val_loss=None
108 if cb_handler.on_epoch_end(val_loss): break

~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
55 val_losses,nums = [],[]
56 if cb_handler: cb_handler.set_dl(dl)
---> 57 for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
58 if cb_handler: xb, yb = cb_handler.on_batch_begin(xb, yb, train=False)
59 val_loss = loss_batch(model, xb, yb, loss_func, cb_handler=cb_handler)

~/anaconda3/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
70 self.update(0)
71 try:
---> 72 for i,o in enumerate(self._gen):
73 if i >= self.total: break
74 yield o

~/anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
73 def __iter__(self):
74 "Process and returns items from DataLoader."
---> 75 for b in self.dl: yield self.proc_batch(b)
76
77 @classmethod

~/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __iter__(self)
191
192 def __iter__(self):
--> 193 return _DataLoaderIter(self)
194
195 def __len__(self):

~/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
467 # before it starts, and __del__ tries to join but will get:
468 # AssertionError: can only join a started process.
--> 469 w.start()
470 self.index_queues.append(index_queue)
471 self.workers.append(w)

~/anaconda3/lib/python3.7/multiprocessing/process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect

~/anaconda3/lib/python3.7/multiprocessing/context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):

~/anaconda3/lib/python3.7/multiprocessing/context.py in _Popen(process_obj)
275 def _Popen(process_obj):
276 from .popen_fork import Popen
--> 277 return Popen(process_obj)
278
279 class SpawnProcess(process.BaseProcess):

~/anaconda3/lib/python3.7/multiprocessing/popen_fork.py in __init__(self, process_obj)
18 self.returncode = None
19 self.finalizer = None
---> 20 self._launch(process_obj)
21
22 def duplicate_for_child(self, fd):

~/anaconda3/lib/python3.7/multiprocessing/popen_fork.py in _launch(self, process_obj)
68 code = 1
69 parent_r, child_w = os.pipe()
---> 70 self.pid = os.fork()
71 if self.pid == 0:
72 try:

OSError: [Errno 22] Invalid argument

digital.entomologist · July 13, 2019, 10:02pm

Some additional questions:

Does this happen every time you try?
Are you sure you have GPU support enabled?

Could you please try running this with a smaller batch size and share whether the same occurs? Thanks!

knakamura13 · July 14, 2019, 11:25pm

@ryba183 I haven’t found a solution per se, but this error might be related to memory. The line that actually crashes is self.pid = os.fork() at the end of your stack trace, so your machine may be having trouble creating a new process because it is running out of memory.

You should try reducing the batch size and then retraining your model.
Look for the following line of code in your notebook:

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs).normalize(imagenet_stats)

and replace bs=bs with bs=16 to set the batch size to 16.
If 16 fails, try 8.

The bs parameter sets the number of images that will be loaded into memory at the same time.

Another option in the future is to try and use smaller images (fewer pixels = less memory required) instead of reducing the batch size; either way, the memory load would be reduced to (hopefully) prevent any errors from popping up.
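
To get a feeling for how both knobs scale, here is a rough back-of-the-envelope sketch. It counts only the raw input batch as 32-bit floats with 3 color channels (both assumptions made for this example); the activations and gradients inside the network need much more memory, but they also grow with the batch size and the image area. The function name input_batch_bytes is hypothetical.

```python
def input_batch_bytes(bs, size, channels=3, bytes_per_value=4):
    """Bytes needed for one batch of square images stored as float32 values."""
    return bs * channels * size * size * bytes_per_value

# Compare a few batch size / image size combinations.
for bs, size in [(64, 224), (16, 224), (8, 224), (64, 112)]:
    mib = input_batch_bytes(bs, size) / 2**20
    print(f"bs={bs:>2} size={size}: {mib:.2f} MiB per input batch")
```

Halving the batch size halves this number, while halving the image side length cuts it to a quarter, since the pixel count depends on the square of the side length.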

good74152 · July 15, 2019, 1:59pm

I am training on the Dogs vs. Cats dataset using a Kaggle kernel.

I have a big problem showing the result with pred = learn.predict(data):
it raises AttributeError: apply_tfms

I’m sure the databunch has the ds_tfms parameter.
I use this: data = ImageDataBunch.from_lists(trainpath, fnames, ds_tfms=get_transforms(), size=224, bs=64, labels = labels)
So why does the error occur?

Another question: why is ds_tfms a necessary parameter when creating a databunch?
The tfms are for data augmentation, but if I have a lot of training images and don’t want to do data augmentation, what should I do when creating the databunch?

Thank you for your help.

digital.entomologist · July 15, 2019, 4:49pm

Hi @good74152,

The error you are describing is not so much about whether the ds_tfms parameter is passed. Instead, something tries to access the attribute apply_tfms, which does not exist on the object. This is a strong sign that whatever you are passing to the function is not what it expects.

This is what the docs say:
"predict can be used to get a single prediction from the trained learner on one specific piece of data you are interested in."

So you cannot use it the way you are trying to (by passing the whole databunch).
If you want to get a prediction for a single piece of data, try passing a single item instead, for example one taken from:

data.test_ds[0]

You might want to check out get_preds() if you want to get predictions for more data points:

https://docs.fast.ai/basic_train.html#Learner.get_preds
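
The failure mode itself can be reproduced without fastai. The toy classes below are hypothetical stand-ins written for this example and do not mirror fastai's real implementation; they only show why handing a whole collection to a function that expects a single item ends in an AttributeError.

```python
class ToyImage:
    """Hypothetical single data item that knows how to transform itself."""
    def __init__(self, name):
        self.name = name

    def apply_tfms(self, tfms):
        # A real item would apply the transforms; the toy version returns itself.
        return self


class ToyBunch:
    """Hypothetical container of many items; note it has no apply_tfms method."""
    def __init__(self, items):
        self.items = items


def toy_predict(item, tfms=()):
    """Expects ONE item and calls a method that only single items provide."""
    item = item.apply_tfms(tfms)
    return f"prediction for {item.name}"


bunch = ToyBunch([ToyImage("cat.1.jpg"), ToyImage("dog.1.jpg")])

print(toy_predict(bunch.items[0]))    # works: a single item
try:
    toy_predict(bunch)                # fails: the whole container
except AttributeError as err:
    print("AttributeError:", err)
```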

ryba183 · July 15, 2019, 9:23pm

Thank you,

It is working now. I was using WS, and that is why there was a problem.
Now I have Anaconda on Windows and I am using CUDA, and it is OK.
I have changed bs to 8 and it works well.

Thank you,
Piotr

ryba183 · July 15, 2019, 9:25pm

Hi,

Thank you, I have changed bs to a smaller value. Now I have CUDA.

Thanks,
Piotr

Demistocle · July 17, 2019, 8:21pm

Just wanting to make sure I got things right: is the probability of the actual class the probability that the model gave for the actual class? So if it is 0.02, it gave that particular class a 0.02% probability?

dmx · July 18, 2019, 3:45am

Just finished lesson 1 with Google Cloud Platform (Compute Engine). I plan to use the leaf image set from leafsnap.com to train the AI to recognize leaves. Wondering if anyone has done it?

erikishiru · July 18, 2019, 8:22am

I’m trying to classify ice hockey and field hockey images using the resnet34 model.
I followed the link provided in the lecture on how to create my own dataset.
But when I try to look at my data, it throws an OSError saying:

OSError: cannot identify image file PosixPath('data/hockey/field/118.UNC-field-hockey-championship.png')

digital.entomologist · July 18, 2019, 2:30pm

I am correcting myself here:

The expected accuracy for each INDIVIDUAL classification (e.g. 1 image) is 1/N.
The overall expected accuracy is more complicated to calculate.

digital.entomologist · July 18, 2019, 2:34pm

Hi @erikishiru,

Could you please post the full stack trace?
I suspect, though, that it fails while trying to open the image.

Are you able to manually open the image?
Could you try removing that specific image and running it again to see whether the image is the problem?
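
If many files were downloaded, checking them by hand gets tedious. The sketch below is one possible way to flag suspicious files using only the standard library: it compares the first bytes of each file with the well-known signatures of PNG, JPEG and GIF. It is a header check only, so a truncated or otherwise corrupt image can still pass it, and the helper names are made up for this example.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif87a": b"GIF87a",
    "gif89a": b"GIF89a",
}

def sniff_image_type(path):
    """Return the detected format name, or None if the header matches nothing."""
    with open(path, "rb") as f:
        head = f.read(8)
    for name, signature in SIGNATURES.items():
        if head.startswith(signature):
            return name
    return None

def find_suspect_files(folder):
    """List files under folder whose first bytes do not look like a known image."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and sniff_image_type(p) is None
    )

# Example call, assuming the images live under data/hockey:
# for p in find_suspect_files("data/hockey"):
#     print("not a recognizable image:", p)
```

Files saved from image search results are sometimes HTML pages or other formats with a .png extension, which would produce exactly this kind of "cannot identify image file" error. The lesson 2 download notebook uses fastai's verify_images helper for a more thorough check.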

ryba183 · July 18, 2019, 9:10pm

Hi,

I have just built my model to recognize types of balls (tennis, baseball, golf for now).
I did things as in the tutorial. But how do I use this model or verify it?
Can I just somehow input a new photo and expect it to tell me what kind of ball it is?

For now I have results like this:

epoch	train_loss	valid_loss	error_rate	time
0	0.218861	0.108038	0.025157	00:57
1	0.222147	0.113225	0.031447	00:56

mrfabulous1 · July 25, 2019, 1:42pm

Hi ryba183, hope you are well!
Open up the lesson2 notebook.
Scroll down towards the end, to the section labeled Putting your model in production.

The first 5 instructions under this heading show you how to test your model.

Cheers mrfabulous1