Lesson 5 In-Class Discussion

Chris_Palmer · December 4, 2017, 11:45am

Getting an error tonight (12/03/2017 in USA) when trying to run the MovieLens notebook. I had performed a git pull and conda env update before running the notebook:

The full trace:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-9-675a60c20fae> in <module>()
----> 1 learn.fit(1e-2, 2, wds=wd, cycle_len=1, cycle_mult=2)

~/fastai/courses/dl1/fastai/learner.py in fit(self, lrs, n_cycle, wds, **kwargs)
    190         self.sched = None
    191         layer_opt = self.get_layer_opt(lrs, wds)
--> 192         self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
    193 
    194     def lr_find(self, start_lr=1e-5, end_lr=10, wds=None):

~/fastai/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, metrics, callbacks, use_wd_sched, **kwargs)
    137         n_epoch = sum_geom(cycle_len if cycle_len else 1, cycle_mult, n_cycle)
    138         fit(model, data, n_epoch, layer_opt.opt, self.crit,
--> 139             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, **kwargs)
    140 
    141     def get_layer_groups(self): return self.models.get_layer_groups()

~/fastai/courses/dl1/fastai/model.py in fit(model, data, epochs, opt, crit, metrics, callbacks, **kwargs)
     80         stepper.reset(True)
     81         t = tqdm(iter(data.trn_dl), leave=False, total=len(data.trn_dl))
---> 82         for (*x,y) in t:
     83             batch_num += 1
     84             for cb in callbacks: cb.on_batch_begin()

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/tqdm/_tqdm.py in __iter__(self)
    951 """, fp_write=getattr(self.fp, 'write', sys.stderr.write))
    952 
--> 953             for obj in iterable:
    954                 yield obj
    955                 # Update and possibly print the progressbar.

~/fastai/courses/dl1/fastai/dataloader.py in __iter__(self)
     64     def __iter__(self):
     65         with ThreadPoolExecutor(max_workers=self.num_workers) as e:
---> 66             for batch in e.map(self.get_batch, iter(self.batch_sampler)):
     67                 yield get_tensor(batch, self.pin_memory)
     68 

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in result_iterator()
    584                     # Careful not to keep a reference to the popped future
    585                     if timeout is None:
--> 586                         yield fs.pop().result()
    587                     else:
    588                         yield fs.pop().result(end_time - time.time())

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    423                 raise CancelledError()
    424             elif self._state == FINISHED:
--> 425                 return self.__get_result()
    426 
    427             self._condition.wait(timeout)

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/thread.py in run(self)
     54 
     55         try:
---> 56             result = self.fn(*self.args, **self.kwargs)
     57         except BaseException as exc:
     58             self.future.set_exception(exc)

~/fastai/courses/dl1/fastai/dataloader.py in get_batch(self, indices)
     60     def __len__(self): return len(self.batch_sampler)
     61 
---> 62     def get_batch(self, indices): return self.collate_fn([self.dataset[i] for i in indices])
     63 
     64     def __iter__(self):

~/fastai/courses/dl1/fastai/dataloader.py in <listcomp>(.0)
     60     def __len__(self): return len(self.batch_sampler)
     61 
---> 62     def get_batch(self, indices): return self.collate_fn([self.dataset[i] for i in indices])
     63 
     64     def __iter__(self):

~/fastai/courses/dl1/fastai/column_data.py in __getitem__(self, idx)
     11 
     12     def __len__(self): return len(self.y)
---> 13     def __getitem__(self, idx): return [o[idx] for o in self.xs] + [self.y[idx]]
     14 
     15     @classmethod

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/pandas/core/series.py in __getitem__(self, key)
    621         key = com._apply_if_callable(key, self)
    622         try:
--> 623             result = self.index.get_value(self, key)
    624 
    625             if not is_scalar(result):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2555         try:
   2556             return self._engine.get_value(s, k,
-> 2557                                           tz=getattr(series.dtype, 'tz', None))
   2558         except KeyError as e1:
   2559             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1230
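
The last fastai frame in this trace is self.y[idx] in column_data.py, which ends up inside pandas Series.__getitem__. The thread does not confirm the root cause. As a minimal sketch, assuming y is a pandas Series whose integer index has gaps (the series below is hypothetical), this is how such a lookup can raise a KeyError even though the position exists:

```python
import pandas as pd

# Hypothetical ratings column whose integer index has gaps,
# e.g. after rows were filtered out of a DataFrame.
y = pd.Series([4.0, 3.5, 5.0], index=[10, 20, 30])


def lookup_raises_keyerror(series, key):
    """Return True if series[key] raises KeyError."""
    try:
        series[key]
    except KeyError:
        return True
    return False


# With an integer index, y[1] looks for the *label* 1, which is absent.
label_lookup_fails = lookup_raises_keyerror(y, 1)

# Positional access does not depend on the index labels.
by_position = y.iloc[1]
as_array = y.values[1]
```

If this is what is happening, passing a plain NumPy array (for example via .values) or resetting the index before building the dataset would sidestep the label lookup.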

ecdrid · December 4, 2017, 12:29pm

Have you tried restarting the kernel?

ecdrid · December 4, 2017, 1:24pm

Have a look at this notebook contributed by merajat

https://nbviewer.jupyter.org/github/MeRajat/fastai_workspace/blob/master/seed_data/visualize_layers.ipynb

pierreguillou · December 4, 2017, 2:43pm

Hello, I do not have Microsoft Excel and tried to open @jeremy’s spreadsheet collab_filter.xlsx with OpenOffice and Google Sheets without success. Are there any other possibilities?

miguel_perez · December 4, 2017, 2:56pm

@pierreguillou you can open these Excel files with LibreOffice or OpenOffice, but the macros will not work (and you need Jeremy’s macros to make these notebooks work). The problem is that Excel’s macro language is proprietary and different from the one used in open source programs.

So… as far as I know, unless you edit these macros and adapt them to OpenOffice or LibreOffice, I don’t think there is much to do. A pity, because MS makes you buy the whole Office 365 package, which is overkill if you just want to run these notebooks. One option, maybe: get the trial version?

hiromi · December 4, 2017, 3:03pm

I haven’t tried it myself, but this might be an option. Please let us know how it goes if you try

pierreguillou · December 4, 2017, 3:39pm

Thanks @hiromi for the idea, but it does not work with my Microsoft Excel Online. The file cannot be opened (I took it from https://github.com/fastai/fastai/tree/master/courses/dl1/excel).

sermakarevich · December 4, 2017, 5:30pm

It turned out to be pretty easy to do this on your own. You can substitute rankings with something similar:

advertised campaigns - sites - were there conversions
stores - products - sales level

Anything related to your business should be just fine and probably more interesting to discover.

jeremy · December 4, 2017, 6:11pm

You don’t need the macros - all they do is copy the 2 output cells back to the two input cells. But you can see a full epoch in the spreadsheet without needing to do this at all.

Chris_Palmer · December 4, 2017, 6:56pm

@jeremy - is it possible that a bug has been introduced into the fast.ai dataset library for collaborative filtering, or is the problem in the MovieLens data? Whenever I follow the notebook, it crashes immediately on learn.fit, inside a pandas call.

I have restarted everything. I have tried upgrading all packages, the fast.ai library is up-to-date as is conda env. I have deleted the data directory and re-unzipped to re-create it. The steps before this run OK so the data looks OK.

I do not get this error using learn.fit with our Dogs and Cats data set - but it’s a different learner…

The line it fails on (I have also tried with the SGD optimizer and get the same problem):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2555         try:
   2556             return self._engine.get_value(s, k,
-> 2557                                           tz=getattr(series.dtype, 'tz', None))
   2558         except KeyError as e1:
   2559             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 55759

jeremy · December 4, 2017, 7:04pm

It is indeed possible! I’ve been fixing things up for tonight’s NLP class, and may well have caused a problem in earlier notebooks. Let me check now and get back to you…

Chris_Palmer · December 4, 2017, 7:13pm

Thanks!

I also have a problem further on when creating the examples from scratch:

fit(model, data, 3, opt, F.mse_loss)

Epoch
0% 0/3 [00:00<?, ?it/s]


  0%|          | 0/1251 [00:00<?, ?it/s]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-09a5bc469dee> in <module>()
----> 1 fit(model, data, 3, opt, F.mse_loss)

~/fastai/courses/dl1/fastai/model.py in fit(model, data, epochs, opt, crit, metrics, callbacks, **kwargs)
     83             batch_num += 1
     84             for cb in callbacks: cb.on_batch_begin()
---> 85             loss = stepper.step(V(x),V(y))
     86             avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
     87             debias_loss = avg_loss / (1 - avg_mom**batch_num)

~/fastai/courses/dl1/fastai/model.py in step(self, xs, y)
     41         if isinstance(output,(tuple,list)): output,*xtra = output
     42         self.opt.zero_grad()
---> 43         loss = raw_loss = self.crit(output, y)
     44         if self.reg_fn: loss = self.reg_fn(output, xtra, raw_loss)
     45         loss.backward()

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in mse_loss(input, target, size_average)
    817 
    818 def mse_loss(input, target, size_average=True):
--> 819     return _functions.thnn.MSELoss.apply(input, target, size_average)
    820 
    821 

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py in forward(ctx, input, target, *args)
     45         output = input.new(1)
     46         getattr(ctx._backend, update_output.name)(ctx._backend.library_state, input, target,
---> 47                                                   output, *ctx.additional_args)
     48         return output
     49 

TypeError: CudaMSECriterion_updateOutput received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, torch.cuda.DoubleTensor, torch.cuda.FloatTensor, bool), but expected (int state, torch.cuda.FloatTensor input, torch.cuda.FloatTensor target, torch.cuda.FloatTensor output, bool sizeAverage)

jeremy · December 4, 2017, 7:39pm

OK these should both be fixed now - thanks for reporting the problem!

Chris_Palmer · December 4, 2017, 7:51pm

Thanks @jeremy - all good for the first part, but in the “from scratch” area of my notebook I again got the cuda error. Then I saw that you had replaced y = ratings['rating'] with y = ratings['rating'].astype(np.float32) - implementing this fixed things.

I guess anyone running old code based on the previous version of the notebook will still get those cuda errors…
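
The TypeError earlier in the thread reports a torch.cuda.DoubleTensor target where a torch.cuda.FloatTensor was expected. A small sketch of why the astype(np.float32) cast matters: pandas stores floating-point columns as float64 by default, which corresponds to a DoubleTensor, while float32 corresponds to a FloatTensor. The ratings frame below is a hypothetical stand-in for the notebook's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the ratings table used in the notebook.
ratings = pd.DataFrame({"rating": [4.0, 3.5, 5.0]})

# pandas defaults to 64-bit floats for floating-point columns.
y_default = ratings["rating"]

# Casting to 32-bit floats matches the dtype of the model's output.
y_fixed = ratings["rating"].astype(np.float32)

default_dtype = str(y_default.dtype)
fixed_dtype = str(y_fixed.dtype)
print(default_dtype, fixed_dtype)
```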

thiago · December 4, 2017, 8:48pm

Hi guys!

Is there a way to calculate the accuracy of the CollabFilter? From the validation loss alone it is difficult to understand how good the algorithm is, especially when you are using your own dataset and don’t have a benchmark.

I tried to use accuracy_thresh(0), which uses accuracy_multi behind the scenes, but the result is always the same: 0.3414.

sermakarevich · December 4, 2017, 8:58pm

I used:

AUC for 2 class CF
% of correctly classified in each bucket for multiclass
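
A rough sketch of the second idea, using plain NumPy rather than any fastai metric. It assumes a "bucket" means a whole-star rating and that a prediction counts as correct when it rounds to the same star; both are interpretations, not something stated above. The rmse helper is an extra: it is just the square root of an MSE validation loss, which puts the error back in rating units. All names and sample values here are hypothetical.

```python
import numpy as np


def rmse(preds, targets):
    """Root mean squared error, in the same units as the ratings."""
    return float(np.sqrt(np.mean((preds - targets) ** 2)))


def bucket_accuracy(preds, targets, buckets=(1, 2, 3, 4, 5)):
    """For each bucket present in targets, the fraction of its ratings
    whose rounded (and clipped) prediction lands in the same bucket."""
    rounded = np.clip(np.rint(preds), min(buckets), max(buckets))
    result = {}
    for b in buckets:
        mask = targets == b
        if mask.any():
            result[b] = float((rounded[mask] == b).mean())
    return result


# Made-up predictions and true ratings, for illustration only.
preds = np.array([4.6, 3.2, 1.4, 4.9, 2.6])
targets = np.array([5.0, 3.0, 2.0, 5.0, 3.0])

per_bucket = bucket_accuracy(preds, targets)
error = rmse(preds, targets)
```

Per-bucket numbers make it easier to spot a model that does well on common ratings and poorly on rare ones, which a single loss value hides.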

EricPB · December 4, 2017, 9:05pm

Video timelines for Lesson 5

00:00:01 Review of students’ articles and works
- “Structured Deep Learning” for structured data using Entity Embeddings,
- “Fun with small image data-sets (part 2)” with unfreezing layers and downloading images from Google,
- “How do we train neural networks” technical writing with detailed walk-through,
- “Plant Seedlings Kaggle competition”
00:07:45 Starting the 2nd half of the course: what’s next ?
MovieLens dataset: build an effective collaborative filtering model from scratch
00:12:15 Why a matrix factorization and not a neural net ?
Using Excel solver for Gradient Descent ‘GRG Nonlinear’
00:23:15 What are the negative values for ‘movieid’ & ‘userid’, and more student questions
00:26:00 Collaborative filtering notebook, ‘n_factors=’, ‘CollabFilterDataset.from_csv’
00:34:05 Dot Product example in PyTorch, module ‘DotProduct()’
00:41:45 Class ‘EmbeddingDot()’
00:47:05 Kaiming He Initialization (via DeepGrid),
sticking an underscore ‘_’ in PyTorch, ‘ColumnarModelData.from_data_frame()’, ‘optim.SGD()’
Pause
00:58:30 ‘fit()’ in ‘model.py’ walk-through
01:00:30 Improving the MovieLens model in Excel again,
adding a constant for movies and users called “a bias”
01:02:30 Function ‘get_emb(ni, nf)’ and Class ‘EmbeddingDotBias(nn.Module)’, ‘.squeeze()’ for broadcasting in PyTorch
01:06:45 Squashing the ratings between 1 and 5, with Sigmoid function
01:12:30 What happened in the Netflix prize, looking at ‘column_data.py’ module and ‘get_learner()’
01:17:15 Creating a Neural Net version “of all this”, using the ‘movielens_emb’ tab in our Excel file, the “Mini net” section in ‘lesson5-movielens.ipynb’
01:33:15 What is happening inside the “Training Loop”, what the optimizer ‘optim.SGD()’ and ‘momentum=’ do, spreadsheet ‘graddesc.xlsm’ basic tab
01:41:15 “You don’t need to learn how to calculate derivatives & integrals, but you need to learn how to think about them spatially”, the ‘chain rule’, ‘jacobian’ & ‘hessian’
01:53:45 Spreadsheet ‘Momentum’ tab
01:59:05 Spreadsheet ‘Adam’ tab
02:12:01 Beyond Dropout: ‘Weight-decay’ or L2 regularization

thiago · December 5, 2017, 1:00am

Thanks @sermakarevich!

sabzo · December 5, 2017, 1:11am

Embedding Dimensions:
In the Excel spreadsheet the embedding for movies is a 5x15 matrix (before the bias is applied) and the user embedding matrix is 15x5.

However, the embedding matrices used in the Lesson 5 notebook seem to be off: users is 671x50 and movies is 9066x50. I thought that to multiply two matrices, the number of columns of the first must match the number of rows of the second…

It seemed like Jeremy first multiplied the movie and user embeddings (replacing empty ratings with zero) to determine the initial predictions and to update the factors in the embedding matrices.

Do I have this right?
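
A shape check that may help here, written with NumPy and random values rather than the notebook's actual model. It assumes each (user, movie) pair is scored by a row-wise dot product of the two 50-wide embedding rows, which is how the EmbeddingDot class listed in the timeline is usually described; treat that as an assumption. Under it, both matrices can be stored as (count x 50), and the full user-by-movie matrix only needs one side transposed.

```python
import numpy as np

# Sizes quoted in the post above; the values are random placeholders.
n_users, n_movies, n_factors = 671, 9066, 50
rng = np.random.default_rng(0)
user_emb = rng.normal(size=(n_users, n_factors))    # 671 x 50
movie_emb = rng.normal(size=(n_movies, n_factors))  # 9066 x 50

# Full prediction matrix: transpose one side so the inner dimensions match.
# (671, 50) @ (50, 9066) -> (671, 9066)
all_preds = user_emb @ movie_emb.T

# A mini-batch of (user, movie) pairs only needs the matching rows:
# multiply them elementwise and sum over the factor axis.
user_idx = np.array([0, 5, 670])
movie_idx = np.array([10, 9065, 3])
batch_preds = (user_emb[user_idx] * movie_emb[movie_idx]).sum(axis=1)

# Both routes give the same numbers for those pairs.
same = np.allclose(batch_preds, all_preds[user_idx, movie_idx])
```

So the 5x15 and 15x5 layout in the spreadsheet and the 671x50 and 9066x50 layout in the notebook describe the same computation; the spreadsheet just stores one matrix already transposed.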

mcallara · December 5, 2017, 2:16pm

I had the same issue with the Rossmann notebook. I fixed it by replacing:
md = ColumnarModelData.from_data_frame(PATH, val_idx, df, yl, cat_flds=cat_vars, bs=128, test_df=df_test)

with

md = ColumnarModelData.from_data_frame(PATH, val_idx, df, yl.astype(np.float32), cat_flds=cat_vars, bs=128, test_df=df_test)

until (I think) it is modified directly inside the method.