Wiki: Lesson 3

Hi all.

I am working through the Lesson 3 Rossmann notebook.

When I get to the command (just below the Sample subsection)

m.fit(lr, 3, metrics=[exp_rmspe])

I get KeyError: <weakref at 0x7fb6eb4e2688; to 'tqdm' at 0x7fb6eed23c88>

The details are below. Has anyone dealt with this error, or does anyone know what to do about it?


KeyError Traceback (most recent call last)
<ipython-input-...> in <module>()
----> 1 m.fit(lr, 3, metrics=[exp_rmspe])

~/fastai/courses/dl1/fastai/learner.py in fit(self, lrs, n_cycle, wds, **kwargs)
285 self.sched = None
286 layer_opt = self.get_layer_opt(lrs, wds)
--> 287 return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
288
289 def warm_up(self, lr, wds=None):

~/fastai/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
232 metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
233 swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 234 swa_eval_freq=swa_eval_freq, **kwargs)
235
236 def get_layer_groups(self): return self.models.get_layer_groups()

~/fastai/courses/dl1/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, **kwargs)
121 if hasattr(cur_data, 'val_sampler'): cur_data.val_sampler.set_epoch(epoch)
122 num_batch = len(cur_data.trn_dl)
--> 123 t = tqdm(iter(cur_data.trn_dl), leave=False, total=num_batch)
124 if all_val: val_iter = IterBatch(cur_data.val_dl)
125

~/fastai/courses/dl1/fastai/imports.py in tqdm(*args, **kwargs)
45 if in_notebook():
46 def tqdm(*args, **kwargs):
---> 47 clear_tqdm()
48 return tq.tqdm(*args, file=sys.stdout, **kwargs)
49 def trange(*args, **kwargs):

~/fastai/courses/dl1/fastai/imports.py in clear_tqdm()
41 inst = getattr(tq.tqdm, '_instances', None)
42 if not inst: return
---> 43 for i in range(len(inst)): inst.pop().close()
44
45 if in_notebook():

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tqdm/_tqdm.py in close(self)
1096 # decrement instance pos and remove from internal set
1097 pos = abs(self.pos)
-> 1098 self._decr_instances(self)
1099
1100 # GUI mode

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tqdm/_tqdm.py in _decr_instances(cls, instance)
436 with cls._lock:
437 try:
--> 438 cls._instances.remove(instance)
439 except KeyError:
440 if not instance.gui: # pragma: no cover

~/anaconda3/envs/fastai/lib/python3.6/_weakrefset.py in remove(self, item)
107 if self._pending_removals:
108 self._commit_removals()
--> 109 self.data.remove(ref(item))
110
111 def discard(self, item):

KeyError: <weakref at 0x7fb6eb4e2688; to 'tqdm' at 0x7fb6eed23c88>
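Update, in case it helps others: the failure is inside fastai's clear_tqdm(), which closes leftover progress bars whose weakrefs tqdm has already dropped. Updating tqdm and fastai via conda env update may make it go away; failing that, a minimal workaround sketch, assuming that race really is the cause as the traceback suggests, is to rebind clear_tqdm with a version that tolerates the missing reference:

# Hedged workaround sketch: replace fastai's clear_tqdm with a version
# that tolerates bars whose weakrefs are already gone.
import tqdm as tq
import fastai.imports

def safe_clear_tqdm():
    inst = getattr(tq.tqdm, '_instances', None)
    if not inst:
        return
    try:
        for _ in range(len(inst)):
            inst.pop().close()
    except KeyError:
        inst.clear()  # the reference was already dropped; discard the rest

# the tqdm wrapper in fastai/imports.py looks clear_tqdm up as a module
# global, so rebinding it takes effect on the next m.fit(...) call
fastai.imports.clear_tqdm = safe_clear_tqdm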

Same thing here: I get the "Watch this video on YouTube" message too if I try to play the embedded video directly from http://course.fast.ai/lessons/lesson3.html .

I am browsing from Barcelona, Spain. I thought that might be the problem, but I switched to browsing through a proxy in the US and still got the same message.

Hi all, here is a multi-label image classification challenge where you can apply the skills learned in this lesson:
https://www.hackerearth.com/challenge/competitive/deep-learning-3/


Hey @shubham3121,
that's a nice problem for practising multi-label classification.


Hi everyone, I tried fine-tuning ResNet50 with Keras. Training works fine, but the validation accuracy is stuck at 0.50. I really can't figure out why, because the code is very simple and works without problems with custom sequential models.

I attach a PDF of the notebook with the offending code :slight_smile:

Thank you so much.

resnet50_keras_fine_tune.pdf (70.6 KB)

http://forums.fast.ai/t/subject-lesson-3-video-is-this-the-intended-behaviour/16072

I think it's fixed now.


Please note that the Kaggle client interface has changed. Please go to

https://github.com/Kaggle/kaggle-api
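For reference, here is a rough sketch of fetching a competition dataset with the new client. This is an illustration, not from the course materials: it assumes pip install kaggle, an API token at ~/.kaggle/kaggle.json, and that you have accepted the competition rules on the website.

# Python API of the new Kaggle client; the CLI equivalent is
# `kaggle competitions download -c rossmann-store-sales`.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads the token from ~/.kaggle/kaggle.json
api.competition_download_files('rossmann-store-sales', path='data/rossmann')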

I'm not sure you have a problem. Note that some of your code is missing from your PDF at a page break.

Okay, I see what's at the page break.

Your problem may be that you've taken two steps out of the notebook if you compare it to the one linked below; each step builds on the previous one. You are missing the 3-epoch block, which is followed by the split block and then the fine-tuning fit, so your model trains for only one epoch (see the sketch after the link).
The other big difference is that you only have 8000 images belonging to 2 classes; I think it should be 23000.
See:

https://github.com/fastai/fastai/blob/master/courses/dl1/keras_lesson1.ipynb
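Roughly, the pattern in that notebook looks like the sketch below. It is written from memory, so treat the exact split point, epoch counts, and the commented-out generator calls as assumptions rather than the notebook's literal code.

# Freeze -> train new head -> partially unfreeze -> fine-tune at a low lr.
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.optimizers import Adam

base = ResNet50(weights='imagenet', include_top=False)
x = GlobalAveragePooling2D()(base.output)
preds = Dense(2, activation='softmax')(x)
model = Model(inputs=base.input, outputs=preds)

# step 1: train only the new head for a few epochs with the base frozen
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit_generator(train_gen, epochs=3, validation_data=val_gen, ...)

# step 2: unfreeze the top of the base network and fine-tune gently
split_at = 140  # illustrative split point
for layer in model.layers[:split_at]:
    layer.trainable = False
for layer in model.layers[split_at:]:
    layer.trainable = True
model.compile(optimizer=Adam(lr=1e-5), loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit_generator(train_gen, epochs=1, validation_data=val_gen, ...)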

Thank you so much for your help, @RogerS49. The dataset has fewer images because I intentionally reduced it, and there is only one epoch because even with more epochs the validation accuracy stays stuck at 0.5.

I noticed a similar problem in the original code as well, which is exactly why I'd like to know whether there is a possible solution, or at least a reasonable explanation :smile: Thanks.

Question on how to save weights on every restart of cosine annealing:

learn.fit(lrs, 3, cycle_len=1, cycle_mult=2)
This runs 3 cycles with 2 restarts: (1), (1,2), (1,2,3,4), where each digit represents one epoch, so the cycle lengths are 1, 2, and 4 epochs.

Is there any way to save the weights at each cosine restart rather than only at the end?
By the time everything is done, the model might be quite overfitted.
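The fit_gen signature in the traceback earlier in this thread shows a cycle_save_name parameter; passing it should checkpoint the weights at the end of every cycle rather than only once at the end. The exact file naming and the load_cycle call below are from memory, so treat them as assumptions:

# Save a checkpoint after each cosine-annealing cycle; with cycle_len=1 and
# cycle_mult=2 this should write one weights file per cycle under models/.
learn.fit(lrs, 3, cycle_len=1, cycle_mult=2, cycle_save_name='my_model')

# afterwards, reload whichever cycle generalised best, e.g. the second one:
# learn.load_cycle('my_model', 1)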

I am also having that problem. I think the point of watching on course.fast.ai is just so we can find things like the wiki, but obviously you have already found all the resources :slight_smile: So I think you're good: watch it on YouTube, but post your questions here, not in the YouTube comments. I don't think there is any difference in the video content, since the one that is normally on course.fast.ai is just embedded from YouTube.

I'm having an issue with running the Rossmann example.

Step 8: for t in tables: display(DataFrameSummary(t).summary())

I am running into an issue with pandas.

It’s telling me:

module 'pandas.core.common' has no attribute 'is_numeric_dtype'

Any ideas on how to fix this?


Hi, I also encountered the same issue, and I googled a bit.

At this time, the easiest solution is to skip the cell, because it just displays summaries of the dataframes and does not manipulate any element in them.

The issue comes from the pandas_summary library.
It calls the pandas is_numeric_dtype function, but the location of that function has changed; this is why the error occurs.
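If you'd rather not skip the cell, a hedged workaround (assuming pandas_summary is still looking the function up on pandas.core.common, which is exactly what the error message says) is to re-attach the moved function where the library expects it, in a cell run before the summary call:

# pandas 0.23 moved is_numeric_dtype out of pandas.core.common;
# put it back there so pandas_summary can find it again.
import pandas.core.common as com
from pandas.api.types import is_numeric_dtype

com.is_numeric_dtype = is_numeric_dtype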


Heya D,
I ran into the same issue, but continuing on with the notebook I couldn't see that it affected anything; the remainder of the lesson ran fine. (When I tried it a few days ago it was getting hung up on the tables, but I do git pull and conda env update pretty often, so that might have helped.)


I'm going to try moving to the fast.ai AMI instead of the SageMaker prebuilt image, as I'm wondering if that's my issue.

The issue is that a new pandas version, 0.23, was released in early May, and it breaks DataFrameSummary.

I was able to resolve the issue by going back to pandas 0.22 with the command

conda install pandas=0.22


You can also use for t in tables: print(t.describe())
It will show the same data, just not in an HTML table.


exp_rmspe: metric vs. explicit calculation.
I was able to run through the whole Rossmann notebook. I also created my own copy from scratch to make sure I fully understand it. Overall, I am pretty familiar with the idea and the code base behind the scenes now.
But I have a small question regarding the results. See the following lines from the notebook:

m.fit(lr, 2, metrics=[exp_rmspe], cycle_len=4)
# m.save('awang_val0')
# m.load('awang_val0')
x,y=m.predict_with_targs()
len(x), len(y), exp_rmspe(x,y)

For the first line, I got the following output:
epoch trn_loss val_loss exp_rmspe
0 0.007935 0.012073 0.108804
1 0.006928 0.010902 0.09866
2 0.00581 0.010534 0.09842
3 0.005843 0.010511 0.097418
4 0.007132 0.011242 0.10067
5 0.007127 0.011706 0.100811
6 0.005701 0.010629 0.097536
7 0.005325 0.010472 0.097566
For the last line, I got:
(38399, 38399, 0.10119025162931043)
From the last epoch of fit() to m.predict_with_targs(), nothing about the learner m changed. I checked the code; both exp_rmspe values were calculated on the validation set. Shouldn't they be the same?

For the validation set, I used:
val_idx = np.flatnonzero((df.index<=datetime.datetime(2014,9,17)) & (df.index>=datetime.datetime(2014,8,1)))
If I use the original one:
val_idx=[0]
Then they matched.
It seems that exp_rmspe returns a slightly different result when the validation set contains more than one element.
Am I missing anything?
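One possible explanation, though this is an assumption about fastai 0.7's internals rather than something I have verified line by line: during fit() the metric is computed per mini-batch and then averaged over batches, while exp_rmspe(x, y) after predict_with_targs() scores the whole validation set in one pass. Because RMSPE takes a square root of a mean, the average of per-batch values is generally not equal to the full-set value, and the difference disappears when the validation set fits in a single batch (or is a single element). A toy illustration with synthetic numbers:

# Averaging a root-of-mean metric over batches differs from computing it
# once over the full set; all numbers here are synthetic.
import numpy as np

def rmspe(pred, targ):
    return np.sqrt(np.mean(((targ - pred) / targ) ** 2))

rng = np.random.RandomState(0)
targ = rng.uniform(1.0, 10.0, 38400)
pred = targ * (1 + rng.normal(0, 0.1, 38400))

full = rmspe(pred, targ)
bs = 128
per_batch = [rmspe(pred[i:i+bs], targ[i:i+bs]) for i in range(0, len(targ), bs)]
print(full, np.mean(per_batch))  # close, but not identical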

I'm a little bit confused by the matrix product in the fully connected layer of the conv-example.xlsx demonstration.
Jeremy says, from approximately 1:09:10 to 1:09:23, that "a fully connected layer is doing classic traditional matrix product, basically just going through each pair in turn, multiplying them together and then adding them up to a matrix product". But the formula in cell EN4 at 1:09:23 contains Excel's SUMPRODUCT() function, which does element-wise multiplication.
I thought it might have something to do with the fact that we have two 12x13 matrices in the maxpool layer, each multiplied by a corresponding 12x13 matrix of dense weights, and this could be seen as a 2x12x13 array multiplied by a 2x12x13 array, but the traditional matrix product is only defined for two-dimensional matrices.
In short: Jeremy says it's a matrix product, but I only see an element-wise product. What did I miss?
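For what it's worth, one way to reconcile the two views: for a single output activation, a row-times-column matrix product is exactly an element-wise multiply followed by a sum, i.e. a dot product, which is what SUMPRODUCT computes. If you flatten the two 12x13 maxpool grids and the two matching weight grids into long vectors, the dense activation is their dot product, which is presumably the "matrix product" Jeremy means:

# A fully connected layer's output unit is a dot product of the flattened
# inputs and weights, i.e. elementwise-multiply-then-sum (Excel SUMPRODUCT).
import numpy as np

acts = np.random.rand(2, 12, 13)     # maxpool activations
weights = np.random.rand(2, 12, 13)  # dense-layer weights, same shape

sumproduct = (acts * weights).sum()   # what the Excel formula does
dot = acts.ravel() @ weights.ravel()  # classic 1xN times Nx1 matrix product
assert np.isclose(sumproduct, dot)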

@amir01, did you get an answer to your question? I have a related question.

I am doing the Planet competition, and initially trained my model using size 64 (I created an ImageClassifierData object with that size, or to be precise an object of one of its superclasses).

Then, after training with data augmentation, I trained with differential learning rates, changing the size to 256. But when I check the value of the size variable on the data object, I still get 64.

Isn't it supposed to be 256 at this point?

Hope you can help. Thanks.
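If it helps: the learner keeps whatever data object it was built with, so changing a size variable elsewhere does not affect it. The usual pattern in the planet notebook is to build a fresh data object at the new size and hand it to the learner with set_data. A sketch, assuming the helper names from lesson2-image_models.ipynb (f_model, PATH, label_csv, and val_idxs are defined earlier in that notebook):

# Rebuild the data at the larger size and give it to the existing learner.
def get_data(sz):
    tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down,
                           max_zoom=1.05)
    return ImageClassifierData.from_csv(PATH, 'train-jpg', label_csv,
                                        tfms=tfms, suffix='.jpg',
                                        val_idxs=val_idxs, test_name='test-jpg')

learn.set_data(get_data(256))  # now learn.data.sz should report 256
learn.freeze()                 # then continue training at the new size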