So here is what happened (still not 100% sure, but we have a fix)
Cause: fastai 1.0.46 on the Kaggle kernel
File: lr_find.py
Function: on_train_end()
self.learn.load('tmp') triggers the purge function, which somehow makes it open the tmp file at path('…/input/tmp.pkl') instead of at the model_dir path.
Resolution:
Changing the call to self.learn.load('tmp', purge=False) resolves the issue.
fastai 1.0.48 already includes this fix.
My advice, if you want to use a Kaggle kernel (P100 GPU):
In the first cell, run
!conda install -c fastai fastai --yes
When it finishes, run import fastai; fastai.__version__
and check that it reports 1.0.48.
Now you can go back and set learn = cnn_learner(..., model_dir = '/kaggle/model')
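A minimal sketch of that version check, for anyone who would rather test it in code than read the number by eye. The helper names version_tuple and has_lr_find_fix are made up for illustration and are not part of fastai; the only thing taken from this thread is that 1.0.48 is the release with the fix.

```python
def version_tuple(version):
    """Turn '1.0.48' into (1, 0, 48); stops at the first non-numeric part (e.g. 'dev0')."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_lr_find_fix(version, fixed_in="1.0.48"):
    """Hypothetical helper: True if `version` is at least the release named in this thread."""
    return version_tuple(version) >= version_tuple(fixed_in)


try:
    import fastai
    print(fastai.__version__, has_lr_find_fix(fastai.__version__))
except ImportError:
    # Lets the sketch run outside a kernel that has fastai installed.
    print("fastai is not installed in this environment")
```

If the check comes back False, rerunning the conda install from the first cell and restarting the kernel is the next thing to try.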
@shlyakh: You can find out which version of fastai you are using by putting in this code: import fastai; fastai.__version__
@heye0507: It’s weird that you get an error with my code, because I used (and am still using) version 1.0.46 as well. From your error message it looks like your code still points at the input directory. It might also have something to do with how the learner is created. I create it like this: learn = create_cnn(data, models.resnet34, model_dir = '/tmp/models', metrics=error_rate)
But anyway: now we have two options, and with the update to 1.0.48 this problem seems to be gone, just like you wrote.
Is there a way to download a model after some epochs in Kaggle without committing? If I train a GAN model, for example, the commit will go on forever, because I normally stop the running cell manually. In Colab I usually go to the folder structure and download the model file, or save it to Google Drive.
Yes, I know about the CSV method, but it only works for CSV files of around 2 MB: https://www.kaggle.com/rtatman/download-a-csv-file-from-a-kernel
It is a pkl file. Being able to download any file would be great, though: even after committing, I cancel the commit and I get an error that the folder structure is too large (more than 6) and there are no output files. I’m looking for a workaround now.
The solution seems to be zipping the files and then deleting the folder contents, so as not to get the too-many-outputs error:
!zip -r output.zip /kaggle/working/
!rm -rf /kaggle/working/*
So I was able to get the commit to pass, but there is no output zip file. Is there a specific directory I need to move output.zip to so that it shows up as output?
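One possible explanation, offered as a guess rather than a confirmed diagnosis: if the notebook's current directory is /kaggle/working, the zip command writes output.zip into that same folder, and the rm that follows deletes it along with everything else. The sketch below builds the archive in a temporary staging directory, clears the folder, and only then moves the archive back. The function name archive_and_clear is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_and_clear(working_dir, archive_name="output"):
    """Zip everything in working_dir, empty it, then leave only the zip behind."""
    working = Path(working_dir)
    with tempfile.TemporaryDirectory() as staging:
        # Build the archive outside working_dir so the cleanup cannot delete it.
        zip_path = shutil.make_archive(
            str(Path(staging) / archive_name), "zip", root_dir=str(working)
        )
        # Remove the original files and folders.
        for entry in working.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
        # Move the finished archive back as the single remaining file.
        final_path = working / (archive_name + ".zip")
        shutil.move(zip_path, str(final_path))
    return final_path
```

In a kernel this would be called as archive_and_clear('/kaggle/working'), assuming that is the directory Kaggle collects outputs from.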
You probably haven’t saved the model in the correct directory… Otherwise, the model should be downloadable from the output section of the kaggle kernel
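To make the "correct directory" point concrete: with model_dir set to somewhere like '/kaggle/model' or '/tmp/models' (as in the posts above), the saved file sits outside the folder used for outputs. A rough sketch of copying it across follows. The helper name copy_to_output is hypothetical, and '/kaggle/working' as the output folder is an assumption based on the zip command earlier in the thread.

```python
import shutil
from pathlib import Path


def copy_to_output(model_file, output_dir="/kaggle/working"):
    """Copy a saved model file into the folder assumed to be collected as kernel output."""
    src = Path(model_file)
    if not src.is_file():
        raise FileNotFoundError(f"no saved model at {src}")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps; it returns the destination path.
    return Path(shutil.copy2(str(src), str(out / src.name)))
```

Pass it whatever file your save or export call actually wrote; after a commit, the copy should then be listed in the output section if the assumption about the folder holds.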
Hey all, I’m getting Bus Errors when trying to look at a batch from my DataBunch. I’ve successfully downloaded the images into the kernel (I don’t know if this is the correct way to say that) and have created the data bunch but the following line is throwing errors:
Hi. When I was doing the second lesson I wanted to try running the bear recognition model on my own machine. As they say in the video, it’s better to train the neural network on a GPU, but once the model is ready I can use a CPU for image recognition. I saved a model to a file in a Kaggle kernel but I can’t find a way of downloading it. Is there a good way to get files out of a kernel, or would it be easier to do this in a different environment?