Kaggle Humpback Whale Identification

sctenvoorde · February 7, 2019, 3:22pm

I found a solution on another thread (TabularDataBunch Error: "Your validation data contains a label that isn't present in the training set, please fix your data."):

# After loading the dataset, grab the targets and build a sorted array of unique values.
classes = df['SalePrice'].unique()
classes.sort()

# Later, pass that array so the labels are treated as categorical values.
.label_from_df(cols=dep_var, classes=classes)
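
To show why this helps, here is a minimal sketch in plain pandas. The column names `Image` and `Id`, the toy rows, and the split are made up for illustration, and it does not call fastai at all. It only shows that a class list built from the full DataFrame covers labels that appear only in the validation rows, which is the situation that error message complains about.

```python
import pandas as pd

# Hypothetical toy data: the label "w_3" appears only in the last row,
# which ends up in the validation split below.
df = pd.DataFrame({
    "Image": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
    "Id": ["w_2", "w_1", "w_2", "w_3"],
})
train_df, valid_df = df.iloc[:3], df.iloc[3:]

# Classes inferred from the training rows alone miss "w_3".
train_only = sorted(train_df["Id"].unique())

# Classes built from the full DataFrame cover both splits.
classes = sorted(df["Id"].unique())

# Labels present in validation but absent from the training-only list.
missing = set(valid_df["Id"]) - set(train_only)
print("train-only classes:", train_only)
print("all classes:", classes)
print("missing from train-only:", missing)
```

Passing the full `classes` list explicitly, as in the snippet above, means the label set no longer depends on which rows land in the training split.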

mizzourah2006 · February 7, 2019, 3:43pm

Are you using the Kaggle kernels for this? I tried to use the kernel for the Cancer detection one and I can’t save out my predictions because it always freezes when I try to commit my kernel after I start training.

tomdraug · February 7, 2019, 3:57pm

I never managed to successfully commit a Kaggle kernel. I run them on my own machine and on Colab without any problems.

mizzourah2006 · February 7, 2019, 4:08pm

Ok, cool. That’s what I was wondering. I have my own machine, but I use it for work too, and I stupidly only allocated 80 GB of disk space to my Linux partition when I split it a year ago. So downloading 10 GB worth of data pretty much eats up all my free space. I figured Kaggle’s kernels might be a way for me to test it out, but I guess not. I might look into the Google Colab setup for trying out Kaggle competitions. I appreciate the feedback, as I was getting extremely frustrated wasting hours of my time only to get the commit error.

tomdraug · February 7, 2019, 4:49pm

Oh I see! Colab is quite slow, but it works. And it’s free.

dipam7 · May 13, 2019, 12:40pm

Hey, Kaggle kernels have worked perfectly fine for me so far. Also, if you want to alter your partitions, you can use an application such as GParted. Check out its tutorials online; it is pretty easy to use.
Cheers