Predicting on a single row with Rossmann Data


(Johnny Lee) #1

Hi, I’m trying to figure out how to do a single prediction in the Rossmann notebook. I modified the code from single image prediction, but I’m getting the following error.

df, y, nas, mapper = proc_df(joined_samp, 'Sales', do_scale=True)
test = df.iloc[0].values
m.predict_array(test[None])


TypeError Traceback (most recent call last)
in ()
1 test = df.iloc[0].values
----> 2 m.predict_array(test[None])

~/fastai/courses/dl1/fastai/learner.py in predict_array(self, arr)
328 def predict_array(self, arr):
329 self.model.eval()
--> 330 return to_np(self.model(to_gpu(V(T(arr)))))
331
332 def TTA(self, n_aug=4, is_test=False):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
355 result = self._slow_forward(*input, **kwargs)
356 else:
--> 357 result = self.forward(*input, **kwargs)
358 for hook in self._forward_hooks.values():
359 hook_result = hook(self, input, result)

TypeError: forward() missing 1 required positional argument: 'x_cont'


(Anders) #2

I’m wondering about this as well.


#3

I’ve been trying to do this as well. I’ve made little progress. If someone can help us out by pointing us in the right direction, it’d be great.

Based on how to perform a prediction on a single image, I came up with this:

single_row = df_test.head(1).as_matrix()
m.predict_array(single_row[None])

And I get an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-98-0319cad5735c> in <module>()
      1 single_row = df_test.head(1).as_matrix()
----> 2 m.predict_array(single_row[None])

~/fastai-learn/rossmann-store-sales/fastai/learner.py in predict_array(self, arr)
    366     def predict_array(self, arr):
    367         self.model.eval()
--> 368         return to_np(self.model(to_gpu(V(T(arr)))))
    369 
    370     def TTA(self, n_aug=4, is_test=False):

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    355             result = self._slow_forward(*input, **kwargs)
    356         else:
--> 357             result = self.forward(*input, **kwargs)
    358         for hook in self._forward_hooks.values():
    359             hook_result = hook(self, input, result)

TypeError: forward() missing 1 required positional argument: 'x_cont'

(Taylor Harrison) #4

It looks like the predict_array function doesn’t work in this case because of the way MixedInputModel is defined: its forward function takes two arguments, one for the categorical variables and one for the continuous variables. To get a prediction for a single row, use the following code:

cat_values = [[3, 2, 2]]
cont_values = [[6.4, 78.4, 29.8]]
learn.model.eval()
pred = to_np(learn.model(to_gpu(V(T(np.array(cat_values)))), to_gpu(V(T(np.array(cont_values))))))

where cat_values is a 2D array whose columns hold the categorical variable indices (make sure the columns are in the same order as in your pandas dataframe), and cont_values is the same thing for the continuous variables. The three columns here are only an example; each array needs one column per variable your model was trained with. Your single-number prediction should be in the pred numpy array. You can also pass in multiple rows to get multiple predictions at once.
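
As a rough sketch of the shapes involved (NumPy only, no fastai; `make_batch`, `n_cat`, and `n_cont` are hypothetical names introduced purely for illustration), each array needs one row per prediction and one column per variable the model was trained with:

```python
import numpy as np

def make_batch(cat_row, cont_row, n_cat, n_cont):
    """Shape one row into the two 2D arrays a two-input model expects.

    n_cat and n_cont are the numbers of categorical and continuous
    variables the model was trained with (hypothetical parameters).
    """
    # Categorical indices must be integers; continuous values are floats.
    x_cat = np.asarray(cat_row, dtype=np.int64).reshape(1, -1)
    x_cont = np.asarray(cont_row, dtype=np.float32).reshape(1, -1)
    if x_cat.shape[1] != n_cat:
        raise ValueError(f"expected {n_cat} categorical columns, got {x_cat.shape[1]}")
    if x_cont.shape[1] != n_cont:
        raise ValueError(f"expected {n_cont} continuous columns, got {x_cont.shape[1]}")
    return x_cat, x_cont

x_cat, x_cont = make_batch([3, 2, 2], [6.4, 78.4, 29.8], n_cat=3, n_cont=3)
print(x_cat.shape, x_cont.shape)  # (1, 3) (1, 3)
```

A check like this fails early with a readable message when the column counts do not match, instead of failing somewhere inside the model.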

Side note: you can find the index of each category by looking at the df.column_name.cat.categories attribute, where column_name is the name of the column you’re interested in and df is the dataframe (this only works after the column has been set as categorical with df[column_name].astype('category').cat.as_ordered()). Just set the correct column in the cat_values array to the index of the category you want.
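
For instance, with a made-up column (the `StoreType` name and values below are hypothetical), the lookup might look like this. I believe proc_df shifts the codes up by one so that 0 can stand for missing values, so check whether you need to add 1 before passing the index to the model:

```python
import pandas as pd

# Hypothetical toy frame; the column name and values are made up.
df = pd.DataFrame({"StoreType": ["a", "c", "b", "a"]})
df["StoreType"] = df["StoreType"].astype("category").cat.as_ordered()

# Categories are stored in sorted order; a value's position is its index.
categories = df["StoreType"].cat.categories
code_for_c = categories.get_loc("c")

# The integer codes pandas assigns to each row of the column.
codes = df["StoreType"].cat.codes.tolist()
print(list(categories), code_for_c, codes)
```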


#5

That does not seem to work for me. Here is the code that I used:

cat_values = [[3, 2, 2]]
cont_values = [[6.4, 78.4, 29.8]]
m.model.eval()
pred = to_np(m.model(to_gpu(V(T(np.array(cat_values)))), to_gpu(V(T(np.array(cont_values))))))

I get the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-16-53cdaaaf6b19> in <module>()
----> 1 pred = to_np(m.model(to_gpu(V(T(np.array(cat_values)))), to_gpu(V(T(np.array(cont_values))))))

/net/vaosl01/opt/NFS/sw/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    355             result = self._slow_forward(*input, **kwargs)
    356         else:
--> 357             result = self.forward(*input, **kwargs)
    358         for hook in self._forward_hooks.values():
    359             hook_result = hook(self, input, result)

/net/vaosl01/opt/NFS/sw/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/column_data.py in forward(self, x_cat, x_cont)
    112     def forward(self, x_cat, x_cont):
    113         if self.n_emb != 0:
--> 114             x = [e(x_cat[:,i]) for i,e in enumerate(self.embs)]
    115             x = torch.cat(x, 1)
    116             x = self.emb_drop(x)

/net/vaosl01/opt/NFS/sw/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/column_data.py in <listcomp>(.0)
    112     def forward(self, x_cat, x_cont):
    113         if self.n_emb != 0:
--> 114             x = [e(x_cat[:,i]) for i,e in enumerate(self.embs)]
    115             x = torch.cat(x, 1)
    116             x = self.emb_drop(x)

/net/vaosl01/opt/NFS/sw/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/variable.py in __getitem__(self, key)
     76                 return IndexSelect.apply(self, 0, key)
     77             # else fall through and raise an error in Index
---> 78         return Index.apply(self, key)
     79 
     80     def __setitem__(self, key, value):

/net/vaosl01/opt/NFS/sw/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py in forward(ctx, i, index)
     87             result = i.index(ctx.index)
     88         else:
---> 89             result = i.index(ctx.index)
     90             ctx.mark_shared_storage((i, result))
     91         return result

IndexError: index 3 is out of range for dimension 1 (of size 3)

Also, could you explain your answer a bit more? With reference to lesson 3, at the testing stage we have already built our model and our ColumnarModelData md, from which we get our learner m. The data that was passed to these had already been processed using the proc_df function. I’m confused as to what we pass as a “single” row.

My idea is:

  1. Grab a random row(s) from the joined_test dataframe.
  2. Follow the same processing steps that were done to get the df_test dataframe, at which point we only have numbers.
  3. Pass this through the trained model and get a prediction.
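
Roughly, the steps above could be sketched as follows. This is only an illustration with made-up columns and values: `train_categories`, `means`, `stds`, and `process_row` are hypothetical names, and the +1 offset on the category codes is my assumption about what proc_df does.

```python
import numpy as np
import pandas as pd

# Made-up training frame standing in for the real Rossmann data.
train = pd.DataFrame({
    "StoreType": ["a", "b", "c", "a"],
    "CompetitionDistance": [100.0, 200.0, 300.0, 400.0],
})
cat_vars, cont_vars = ["StoreType"], ["CompetitionDistance"]

# Remember what training used: category order and scaling statistics.
train_categories = {c: sorted(train[c].unique()) for c in cat_vars}
means = train[cont_vars].mean()
stds = train[cont_vars].std(ddof=0)

def process_row(row):
    """Turn one raw row into (x_cat, x_cont) arrays of shape (1, n)."""
    # Encode with the training categories; unseen values get code -1.
    # Assumption: codes are shifted by +1 so that 0 means missing/unknown.
    x_cat = [
        int(pd.Categorical([row[c]], categories=train_categories[c],
                           ordered=True).codes[0]) + 1
        for c in cat_vars
    ]
    # Scale with the training mean and standard deviation.
    x_cont = [(row[c] - means[c]) / stds[c] for c in cont_vars]
    return (np.array([x_cat], dtype=np.int64),
            np.array([x_cont], dtype=np.float32))

new_row = pd.Series({"StoreType": "c", "CompetitionDistance": 250.0})
x_cat, x_cont = process_row(new_row)
print(x_cat, x_cont)
```

The resulting x_cat and x_cont would then be the two arrays passed to the model in step 3.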

Is that right/doable?