Hi all, I’m making my own small example data to follow along with chapter 9, to try to understand the fastai library better (and actually just understand pandas better). I’m making a dataframe like this:
import random
import pandas as pd

df = pd.DataFrame(index=range(20), columns=range(4))
for dfRowIndex in range(df.shape[0]):
    # columns 1-3: random floats between 0 and 100
    df.loc[dfRowIndex, 1] = random.uniform(0, 100)
    df.loc[dfRowIndex, 2] = random.uniform(0, 100)
    df.loc[dfRowIndex, 3] = random.uniform(0, 100)
    # column 0 (the target): 1 if column 1 > column 3, else 0
    if df.loc[dfRowIndex, 1] > df.loc[dfRowIndex, 3]:
        df.loc[dfRowIndex, 0] = int(1)
    else:
        df.loc[dfRowIndex, 0] = int(0)
print(df)
The idea is that column 0 is 1 if column 1 is greater than column 3, and 0 otherwise. Column 2 has no effect on anything.
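For comparison, here is a minimal sketch of the same rule built in a vectorized way. The names `rng` and `df_vec` and the seed are my own assumptions for illustration, and it uses NumPy's generator instead of `random`. One difference I'm aware of: a frame created empty with only `index` and `columns` starts out with object-dtype columns, whereas this version gives float and integer columns from the start. I don't know whether that matters for the errors further down.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the example is reproducible (assumption).
rng = np.random.default_rng(0)

# Columns 1-3: uniform floats in [0, 100), stored as float64.
df_vec = pd.DataFrame(rng.uniform(0, 100, size=(20, 3)), columns=[1, 2, 3])

# Column 0 (the target): 1 where column 1 > column 3, else 0.
df_vec.insert(0, 0, (df_vec[1] > df_vec[3]).astype(int))

print(df_vec.dtypes)
print(df_vec.head())
```

Printing `df_vec.dtypes` next to `df.dtypes` from the loop version is a quick way to see how the two frames differ before handing either one to TabularPandas.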
Then I try this:
procs = [Categorify, FillMissing, Normalize]
numberOfValidationRows = 5
splits = (list(range(numberOfValidationRows, df.shape[0])),
          list(range(0, numberOfValidationRows)))
cat_names = [0]
cont_names = list(range(1, 4))
y_names = [0]
to = TabularPandas(df, procs, cat_names, cont_names,
                   y_names=y_names, y_block=CategoryBlock, splits=splits)
to.show(1)
The last line causes an error: “ValueError: Columns must be same length as key”.
I can’t see why. Separately, whether or not I call to.show(), if I do something like this:
dls = to.dataloaders(5)
learn = tabular_learner(dls, y_range=(0,1), layers=[500,250],
                        n_out=1, loss_func=F.mse_loss)
learn.lr_find()
it causes a different error: “RuntimeError: CUDA driver error: unknown error”.
(The code in the chapter 9 notebook works fine on the same server.)
Maybe the second error has the same cause as the first?
Any pointers to what I’m doing wrong here? Am I misunderstanding something to do with how a dataframe becomes a TabularPandas? Thanks!