Single Prediction on exported Collaborative Filtering model

Hey everyone,

I’m building my first collaborative filter using fastai2 following the fastbook/08_collab.ipynb notebook.

I’ve been able to successfully import my data, find my learning rate, train my model, and export it.

With all the training data loaded, it’s easy for me to take the dot product and add the biases to make predictions, even for my ultimate use case: returning a list of all items for a given user, ranked in descending order of predicted score.
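To make the "dot product plus biases" step concrete, here is a minimal pure-Python sketch. The weight values are made up for illustration and stand in for the learned user/item factors and biases; the names `score` and `rank_items` are hypothetical helpers, not fastai functions.

```python
# Hypothetical stand-ins for learned weights: 3 users, 4 items, 2 latent factors.
user_factors = [[0.1, 0.3], [0.5, -0.2], [0.0, 0.4]]
item_factors = [[0.2, 0.1], [-0.3, 0.7], [0.4, 0.4], [0.0, -0.5]]
user_bias = [0.05, -0.10, 0.00]
item_bias = [0.10, 0.00, -0.05, 0.20]


def score(user_idx, item_idx):
    """Dot product of the two factor vectors plus both bias terms."""
    dot = sum(u * i for u, i in zip(user_factors[user_idx], item_factors[item_idx]))
    return dot + user_bias[user_idx] + item_bias[item_idx]


def rank_items(user_idx):
    """Return (item_idx, score) pairs sorted by descending score."""
    scores = [(i, score(user_idx, i)) for i in range(len(item_factors))]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_items(0)
for item_idx, s in ranking:
    print(item_idx, round(s, 3))
```

If the model was trained with a y_range, the fastbook version also squashes the raw score through a sigmoid scaled to that range. That step is monotonic, so it should not change the ordering of the ranked list.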

However, I’m having trouble making use of the predict() method, even prior to exporting the model.

At this point, I have a vague understanding that I must somehow pre-process my data in order to get predictions out of the model.

I’d appreciate any help pointing me in the right direction!

On a newly trained model with training data included, here’s what my model returns from learn.summary():

EmbeddingDotBias (Input shape: ['64 x 2'])
================================================================
Layer (type)         Output Shape         Param #    Trainable 
================================================================
Embedding            64 x 50              1,052,100  True      
________________________________________________________________
Embedding            64 x 50              2,078,300  True      
________________________________________________________________
Embedding            64 x 1               21,042     True      
________________________________________________________________
Embedding            64 x 1               41,566     True      
________________________________________________________________

Total params: 3,193,008
Total trainable params: 3,193,008
Total non-trainable params: 0

Optimizer used: <function Adam at 0x7f0a7f087050>
Loss function: FlattenedLoss of MSELoss()

Model unfrozen

Callbacks:
  - TrainEvalCallback
  - Recorder
  - ProgressCallback
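As a sanity check on the summary above, the parameter counts can be reproduced by hand. This sketch assumes the two 1-wide embeddings are the user and item biases (so their sizes give the number of user and item classes) and that the 50-wide embeddings are the latent factors.

```python
# Sizes read off the summary: bias embeddings have one parameter per class.
n_users, n_items, n_factors = 21_042, 41_566, 50

user_emb = n_users * n_factors   # first Embedding row
item_emb = n_items * n_factors   # second Embedding row
total = user_emb + item_emb + n_users + n_items

print(user_emb, item_emb, total)
```

The three numbers match the 1,052,100, 2,078,300, and 3,193,008 reported by learn.summary().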

However, when I call learn.predict(learn.dls.train.all_cols.iloc[0]), I get a stack trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-37-d2fa5476b9bb> in <module>
----> 1 learn.predict(learn.dls.train.all_cols.iloc[0])

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastai2/learner.py in predict(self, item, rm_type_tfms, with_input)
    227 
    228     def predict(self, item, rm_type_tfms=None, with_input=False):
--> 229         dl = self.dls.test_dl([item], rm_type_tfms=rm_type_tfms)
    230         inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)
    231         dec = self.dls.decode_batch((*tuplify(inp),*tuplify(dec_preds)))[0]

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastai2/tabular/data.py in test_dl(self, test_items, rm_type_tfms, **kwargs)
     30     def test_dl(self, test_items, rm_type_tfms=None, **kwargs):
     31         to = self.train_ds.new(test_items)
---> 32         to.process()
     33         return self.valid.new(to, **kwargs)
     34 

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastai2/tabular/core.py in process(self)
    138     def show(self, max_n=10, **kwargs): display_df(self.new(self.all_cols[:max_n]).decode().items)
    139     def setup(self): self.procs.setup(self)
--> 140     def process(self): self.procs(self)
    141     def loc(self): return self.items.loc
    142     def iloc(self): return _TabIloc(self)

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in __call__(self, o)
    183         self.fs.append(t)
    184 
--> 185     def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
    186     def __repr__(self): return f"Pipeline: {' -> '.join([f.name for f in self.fs if f.name != 'noop'])}"
    187     def __getitem__(self,i): return self.fs[i]

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in compose_tfms(x, tfms, is_enc, reverse, **kwargs)
    136     for f in tfms:
    137         if not is_enc: f = f.decode
--> 138         x = f(x, **kwargs)
    139     return x
    140 

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in __call__(self, x, **kwargs)
     70     @property
     71     def name(self): return getattr(self, '_name', _get_name(self))
---> 72     def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
     73     def decode  (self, x, **kwargs): return self._call('decodes', x, **kwargs)
     74     def __repr__(self): return f'{self.name}: {self.encodes} {self.decodes}'

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
     94     "A `Transform` that modifies in-place and just returns whatever it's passed"
     95     def _call(self, fn, x, split_idx=None, **kwargs):
---> 96         super()._call(fn,x,split_idx,**kwargs)
     97         return x
     98 

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
     80     def _call(self, fn, x, split_idx=None, **kwargs):
     81         if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 82         return self._do_call(getattr(self, fn), x, **kwargs)
     83 
     84     def _do_call(self, f, x, **kwargs):

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     84     def _do_call(self, f, x, **kwargs):
     85         if not _is_tuple(x):
---> 86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
     87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
     96         if not f: return args[0]
     97         if self.inst is not None: f = MethodType(f, self.inst)
---> 98         return f(*args, **kwargs)
     99 
    100     def __get__(self, inst, owner):

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastai2/tabular/core.py in encodes(self, to)
    199         self.classes = {n:CategoryMap(to.iloc[:,n].items, add_na=(n in to.cat_names)) for n in to.cat_names}
    200 
--> 201     def encodes(self, to): to.transform(to.cat_names, partial(_apply_cats, self.classes, 1))
    202     def decodes(self, to): to.transform(to.cat_names, partial(_decode_cats, self.classes))
    203     def __getitem__(self,k): return self.classes[k]

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastai2/tabular/core.py in transform(self, cols, f, all_col)
    160     def transform(self, cols, f, all_col=True):
    161         if not all_col: cols = [c for c in cols if c in self.items.columns]
--> 162         if len(cols) > 0: self[cols] = self[cols].transform(f)
    163 
    164 # Cell

/opt/conda/envs/fastai/lib/python3.7/site-packages/fastcore/foundation.py in __getitem__(self, k)
    268     def __init__(self, items): self.items = items
    269     def __len__(self): return len(self.items)
--> 270     def __getitem__(self, k): return self.items[k]
    271     def __setitem__(self, k, v): self.items[list(k) if isinstance(k,CollBase) else k] = v
    272     def __delitem__(self, i): del(self.items[i])

TypeError: list indices must be integers or slices, not L

Here’s learn.dls.train.all_cols.iloc[0]:

USER_ID    16840.000000
ITEM_ID    40385.000000
SCORE      0.562633
Name: 100105, dtype: float64

And type(learn.dls.train.all_cols.iloc[0]):

pandas.core.series.Series
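One possible reading of the traceback: predict() wraps the item in a plain Python list (`self.dls.test_dl([item], ...)`), and the last frame then indexes that list with a collection of column names (`self.items[k]`). The sketch below reproduces that failure mode in pure Python. `ItemsWrapper` is a hypothetical stand-in, not the fastcore class, and a dict is used in place of a table that can be indexed by column name.

```python
class ItemsWrapper:
    """Hypothetical container that forwards indexing to `.items`."""

    def __init__(self, items):
        self.items = items

    def __getitem__(self, k):
        return self.items[k]


row = {"USER_ID": 16840, "ITEM_ID": 40385, "SCORE": 0.562633}

# Indexing by column name works when `.items` supports it (a dict here).
table_like = ItemsWrapper(row)
print(table_like["USER_ID"])

# Wrapping the row in a list first means `.items` is a list,
# and a list cannot be indexed with a list of column names.
wrapped_in_list = ItemsWrapper([row])
try:
    wrapped_in_list[["USER_ID", "ITEM_ID"]]
except TypeError as err:
    message = str(err)
    print(message)
```

If that reading is right, one thing to try is to skip predict() and build the test DataLoader from a one-row DataFrame instead of a Series, for example `dl = learn.dls.test_dl(learn.dls.train.all_cols.iloc[[0]])` followed by `learn.get_preds(dl=dl)`. Both methods appear in the traceback, but this is an untested assumption and has not been confirmed in this thread.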

Here are my learn.dls.classes:

{'USER_ID': (#21042) ['#na#',-9221652811967581228,-9221628377334701256,-9221293271364246775,-9220623858600595297,-9217989984838361784,-9216054495925762037,-9215287826836550985,-9215252638705101368,-9213431121804536050...],
'ITEM_ID': (#41566) ['#na#',9,41,45,100,114,235,256,294,310...]}
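The class lists above double as the vocabulary: the position of a raw ID in the list is the index the embedding layer sees. A minimal sketch of that lookup, using only the first few ITEM_ID entries shown above; `encode` is a hypothetical helper, and the fallback to index 0 ('#na#') for unseen IDs is an assumption about how unknown categories are handled.

```python
# First few entries copied from the ITEM_ID class list; index 0 is '#na#'.
item_classes = ['#na#', 9, 41, 45, 100, 114, 235, 256, 294, 310]

# Reverse lookup: raw ID -> embedding row index.
item_to_idx = {raw_id: idx for idx, raw_id in enumerate(item_classes)}


def encode(raw_id, lookup):
    """Map a raw ID to its index, falling back to 0 ('#na#') for unseen IDs."""
    return lookup.get(raw_id, 0)


print(encode(45, item_to_idx))   # 3
print(encode(999, item_to_idx))  # 0
```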

Finally, when I call learn.get_preds():

inp, preds, _, dec_preds = learn.get_preds(with_input=True, with_decoded=True)

inp, preds, and dec_preds are all tensors. How can I map these back to the original class labels?
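Going the other way, from encoded indices back to labels, is a plain list lookup into the class list. A minimal sketch, assuming the ID columns of inp hold integer indices into learn.dls.classes; the index values below are made up, and with real tensors one would first convert them to Python ints (for example with `.tolist()`).

```python
# First few entries copied from the ITEM_ID class list; index 0 is '#na#'.
item_classes = ['#na#', 9, 41, 45, 100, 114, 235, 256, 294, 310]

# Hypothetical encoded values, as they might appear in the ITEM_ID column of `inp`.
encoded_items = [3, 1, 0, 8]

# Decode each index by looking it up in the class list.
decoded = [item_classes[i] for i in encoded_items]
print(decoded)  # [45, 9, '#na#', 294]
```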

Any help, or a pointer to the right resources, is greatly appreciated!

Was there any update on this? I am getting the same issue with learn.predict on a collab model.

This is the error I keep getting. I might not be passing the right kind of item here:
TypeError: list indices must be integers or slices, not list

Hi, did you get this figured out? I was trying to get a prediction for a single row of the data frame I used to create the dataloaders per https://docs.fast.ai/tabular.learner.html#TabularLearner.predict, but got the following error:

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in __getitem__(self, k)
    110     def __init__(self, items): self.items = items
    111     def __len__(self): return len(self.items)
--> 112     def __getitem__(self, k): return self.items[list(k) if isinstance(k,CollBase) else k]
    113     def __setitem__(self, k, v): self.items[list(k) if isinstance(k,CollBase) else k] = v
    114     def __delitem__(self, i): del(self.items[i])

TypeError: list indices must be integers or slices, not list

Any help on this? I still get the same error.

Struggling with the same thing today. I’ve tried all sorts of permutations on how to call learn.predict, but I invariably get the same TypeError that others here are getting. Even the example that ChatGPT came up with failed to work! Unfortunately, I’m also not finding the answer in the fastai library docs, and I haven’t been able to find working example code anywhere.

The third exercise in the “Further Research” section of fastbook Chapter 8 is to “Complete this notebook using the full MovieLens dataset, and compare your results to online benchmarks.” Surely someone has done this and shared their notebook, which ought to contain the solution to this issue. But how to find that notebook…?

I found another thread which links to an example that allegedly works. I haven’t had a chance to work through that example yet, but wanted to provide a link here in case anyone is following along. The example does not call learn.predict(); it uses some other method.