Issue with Custom Preds?

muellerzr · May 18, 2019, 7:42pm

Hi everyone,

I wanted a way to streamline validating a test dataframe, so I quickly whipped up this function. It was working okay until my most recent tabular model started reporting an accuracy about 50% lower than it should be. Could someone with an extra pair of eyes look this over for me? Thanks! (Also, could it be due to the two different batch sizes, for instance if I set my train batch size to 1000?)

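from fastai.tabular import *  # fastai v1; provides TabularList

def PredictTest(df, learn, dep_var):
  # Reuse the column names and preprocessing steps from the learner's training data
  data = learn.data.train_ds.x
  path = learn.path
  cat_names = data.cat_names
  cont_names = data.cont_names
  procs = data.procs
  # Build a databunch from the test dataframe, with no validation split
  testData = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
       .split_none()
       .label_from_df(cols=dep_var)
       .databunch())
  # validate() returns the loss followed by the metrics; this assumes accuracy is the first metric
  results = learn.validate(testData.train_dl)
  acc = float(results[1]) * 100
  print("Test accuracy of: " + str(acc))
  return acc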
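
On the batch-size question, here is a minimal sketch in plain Python (not fastai) of how batch size can and cannot affect a reported accuracy. It assumes per-batch accuracies are averaged with each batch weighted by its size, and that a training-style loader may drop the last incomplete batch; neither assumption is confirmed in this thread, and `batches` and `weighted_accuracy` are hypothetical helpers written only for illustration.

```python
from typing import List, Sequence


def batches(items: Sequence[int], bs: int, drop_last: bool = False) -> List[Sequence[int]]:
    """Split items into consecutive batches of size bs (hypothetical helper)."""
    out = [items[i:i + bs] for i in range(0, len(items), bs)]
    # Optionally discard a final batch that is smaller than bs
    if drop_last and out and len(out[-1]) < bs:
        out = out[:-1]
    return out


def weighted_accuracy(correct: Sequence[int], bs: int, drop_last: bool = False) -> float:
    """Average per-batch accuracy, weighting each batch by its size."""
    total, seen = 0.0, 0
    for b in batches(correct, bs, drop_last):
        batch_acc = sum(b) / len(b)
        total += batch_acc * len(b)
        seen += len(b)
    # No batches left means there is nothing to average
    return total / seen if seen else float("nan")


# 1 = correct prediction, 0 = wrong; ten made-up test rows, seven of them correct
correct = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

acc_bs2 = weighted_accuracy(correct, bs=2)
acc_bs3 = weighted_accuracy(correct, bs=3)
acc_bs4_dropped = weighted_accuracy(correct, bs=4, drop_last=True)
acc_bs1000_dropped = weighted_accuracy(correct, bs=1000, drop_last=True)
print(acc_bs2, acc_bs3, acc_bs4_dropped, acc_bs1000_dropped)
```

Under these assumptions the batch size alone does not change the result, but dropping the last incomplete batch does, and a batch size larger than the test set leaves nothing to evaluate.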

I appreciate the help!

Zach

muellerzr · June 3, 2019, 11:04pm

This approach is wrong: you need to swap the dataloaders, as shown here: