Hi everyone, I am taking part in a Kaggle competition that requires loading the test data from a Python generator, so I cannot add the test set when creating the TabularList and DataBunch.
data = (TabularList.from_df(market_train_df, cat_names=cat_names, cont_names=cont_names, procs=procs)
        .random_split_by_pct()
        .label_from_df(cols=dep_var)
        .databunch())
I use the code above to create data and learn. I then predict with the following code, where y1 is the pandas DataFrame produced by the generator.
days = env.get_prediction_days()
for (market_obs_df, news_obs_df, predictions_template_df) in days:
    x1, y1, z1 = predictions_template_df, market_obs_df, news_obs_df
    preds = learn.predict(y1)
However, this code raises the following error:
TypeError Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError: 'volume'
I also tried preds = learn.predict(y1.iloc[9:10]), and the same error occurs.
Interestingly, predicting a single row works:
preds = learn.predict(y1.iloc[9])
I thought the problem was the normalization step in the preprocessing, but since a single row can be predicted, I have no idea what is going wrong.
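One difference between the calls that may be relevant: y1.iloc[9] returns a single row as a pandas Series, while y1 and y1.iloc[9:10] are both DataFrames. The sketch below shows this on a made-up frame (the columns and values are placeholders, not the competition data). That learn.predict only handles a single row is my guess from the behaviour above, not something I have confirmed.

```python
import pandas as pd

# Toy stand-in for market_obs_df; column names and values are placeholders.
df = pd.DataFrame({"volume": [100.0, 200.0, 300.0],
                   "close": [1.0, 2.0, 3.0]})

row = df.iloc[1]      # integer position -> one row, returned as a Series
block = df.iloc[1:2]  # slice -> a one-row DataFrame, not a Series

print(type(row).__name__, type(block).__name__)  # Series DataFrame
print(row["volume"])   # a Series is indexed by column name -> scalar
print(block.shape)     # (1, 2): still two-dimensional
```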
For now I am predicting row by row as shown here, but I know there must be a better way.
first = True
for i in range(4):
    pred = learn.predict(y1.iloc[i])[2]
    if first:
        preds = pred
        first = False
    else:
        preds = torch.cat((preds, pred))
preds = preds.view(-1, 2)
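A slightly tidier form of the same row-by-row workaround, as a sketch only: collect the per-row tensors in a list and stack them once at the end. The helper name predict_rows and the stand-in fake_predict are hypothetical. With the real learner I would pass lambda row: learn.predict(row)[2] instead, assuming the third element of the result is the tensor I want, as in the loop above.

```python
import pandas as pd
import torch

def predict_rows(predict_one, df):
    """Call predict_one on every row (passed as a Series) and stack the results."""
    outputs = [predict_one(row) for _, row in df.iterrows()]
    return torch.stack(outputs)  # shape: (n_rows, n_outputs)

# Hypothetical stand-in for `lambda row: learn.predict(row)[2]`,
# returning a 2-element tensor per row like the loop above.
def fake_predict(row):
    return torch.tensor([float(row["volume"]), 1.0])

# Toy frame; the column name and values are placeholders.
df = pd.DataFrame({"volume": [100.0, 200.0, 300.0]})
preds = predict_rows(fake_predict, df)
print(preds.shape)
```

This still runs the model one row at a time, so it avoids the flag variable and the final view(-1, 2) but not the per-row cost.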
Does anyone have an idea what is causing this? Thanks a lot!