Hi everyone,
I wanted a way to streamline validating a test dataframe, so I quickly wrote this function. It was working fine until my most recent tabular model started reporting an accuracy roughly 50% lower than it should be. Could someone with an extra pair of eyes look this over for me? Thanks! (Also, could it be caused by the two different batch sizes, for instance if I set my train batch size to 1000?)
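from fastai.tabular import *  # fastai v1; provides TabularList

def PredictTest(df, learn, dep_var):
    # Reuse the column names and preprocessing steps from the learner's training data
    data = learn.data.train_ds.x
    path = learn.path
    cat_names = data.cat_names
    cont_names = data.cont_names
    procs = data.procs
    # Build a databunch from the test dataframe; split_none() puts every row in the training set
    testData = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                .split_none()
                .label_from_df(cols=dep_var)
                .databunch())
    # validate() returns [loss, *metrics]; this assumes accuracy is the first metric
    results = learn.validate(testData.train_dl)
    acc = float(results[1]) * 100
    print("Test accuracy of: " + str(acc))
    return acc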
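To make the batch-size question concrete, here is a rough pure-Python sketch (not fastai code). It assumes, without my having confirmed it for this setup, that a training dataloader drops a final batch that is shorter than the batch size. The helper names `batches` and `weighted_accuracy` and the `flags` data are made up for illustration.

```python
def batches(items, batch_size, drop_last):
    """Split items into consecutive batches, optionally dropping a short final batch."""
    out = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_last and out and len(out[-1]) < batch_size:
        out = out[:-1]
    return out


def weighted_accuracy(correct_flags, batch_size, drop_last=False):
    """Accuracy over the rows that survive batching, weighting each batch by its size."""
    kept = batches(correct_flags, batch_size, drop_last)
    n_rows = sum(len(b) for b in kept)
    if n_rows == 0:
        return None  # every row was dropped, so there is nothing to score
    n_correct = sum(sum(b) for b in kept)
    return n_correct / n_rows


# Hypothetical per-row results: 1 = correct prediction, 0 = wrong.
flags = [1, 1, 1, 0, 1, 0, 1, 1, 1, 1]

full = weighted_accuracy(flags, batch_size=4, drop_last=False)     # scores all 10 rows
dropped = weighted_accuracy(flags, batch_size=4, drop_last=True)   # scores only the first 8 rows
nothing = weighted_accuracy(flags, batch_size=1000, drop_last=True)  # batch larger than the data

print(full, dropped, nothing)
```

If that assumption holds, a batch size close to (or larger than) the number of test rows means a large share of the rows is never scored, so the reported number could differ noticeably from the accuracy over the whole dataframe.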
I appreciate the help!
Zach