Platform: Kaggle Kernels

heye0507 · March 12, 2019, 7:16am

So here is what happened (still not 100% sure, but we have a fix)

Cause: the Kaggle kernel ships fastai 1.0.46

File: lr_find.py
Function: on_train_end()
self.learn.load('tmp') triggers the purge function, which somehow opens the tmp file at path('…/input/tmp.pkl') instead of under the model_dir path.

Resolve:
Changing the call to self.learn.load('tmp', purge=False) resolves the issue.

fastai 1.0.48 already includes this fix.

My advice, if you want to use a Kaggle kernel (P100 GPU): in the first cell, run

!conda install -c fastai fastai --yes

When it finishes, run
import fastai
fastai.__version__

and check that it reports 1.0.48.

Now you can go back and create the learner with learn = cnn_learner(..., model_dir = '/kaggle/model')
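
For anyone who prefers the version check as code, here is a minimal sketch. The helper names parse_version and has_purge_fix are made up for illustration and are not part of fastai; the only fact taken from this thread is that 1.0.48 carries the fix.

```python
import re


def parse_version(version):
    """Turn a version string such as '1.0.48' into a tuple of ints: (1, 0, 48)."""
    numbers = re.findall(r"\d+", version)[:3]
    return tuple(int(n) for n in numbers)


def has_purge_fix(version):
    """True if the version is 1.0.48 or newer (the release this thread says has the fix)."""
    return parse_version(version) >= (1, 0, 48)


# Only report on fastai if it is actually installed in this environment.
try:
    import fastai
    print(fastai.__version__, "has the fix:", has_purge_fix(fastai.__version__))
except ImportError:
    print("fastai is not installed in this environment")
```

If has_purge_fix returns False, rerun the conda install cell above and restart the kernel before creating the learner.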

piaoya · March 12, 2019, 8:47am

@shlyakh: You can find out which version of fastai you are using by putting in this code:
import fastai; fastai.__version__

@heye0507: It’s weird that you get an error with my code, because I used (and am still using) version 1.0.46 as well. In your error message it seems like your code still points at the input folder? It might also have something to do with the learner. I create it like this:
learn = create_cnn(data, models.resnet34, model_dir = '/tmp/models', metrics=error_rate)

But anyway: now we have two options, and with the update to 1.0.48 this problem seems to be gone, just like you wrote.

gan · March 13, 2019, 1:15am

Is there a way to download a model after some epochs in Kaggle without committing? If I train a GAN model, the commit will go on forever, because I stop the running cell manually. In Colab I usually go to the folder structure and download the model file, or save it in Google Drive.

heye0507 · March 13, 2019, 5:08am

There is a way to download a CSV file without committing… but I don’t know anything about downloading a model… I assume it is a .pth file?

gan · March 13, 2019, 2:45pm

Yes, I know about the CSV method, but it only works for CSV files of around 2 MB: https://www.kaggle.com/rtatman/download-a-csv-file-from-a-kernel
It is a pkl file. Being able to download any file would be great, though: even after committing, I cancel the commit and get an error that the folder structure is too large (more than 6), with no output files. I’m looking for a workaround now.

ilovescience · March 14, 2019, 12:28am

I would try this:
https://www.kaggle.com/rtatman/download-a-csv-file-from-a-kernel#467667

from IPython.display import FileLinks
FileLinks('.') # input argument is specified folder

I do not know if there is a download limit though…
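
To see what would be offered for download and how big each file is, something like the sketch below might help. list_files_with_sizes and show_link are hypothetical helpers written for this post; FileLink is the single-file counterpart of FileLinks in IPython.display.

```python
from pathlib import Path

# IPython is only available inside a notebook-like environment, so guard the import.
try:
    from IPython.display import FileLink, display
except ImportError:
    FileLink = None


def list_files_with_sizes(folder="."):
    """Return (relative path, size in MB) for every file under folder, largest first."""
    root = Path(folder)
    files = [
        (str(p.relative_to(root)), p.stat().st_size / 1e6)
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(files, key=lambda item: item[1], reverse=True)


def show_link(path):
    """Render a clickable link for one file in a notebook, or print the path otherwise."""
    if FileLink is None:
        print(path)
    else:
        display(FileLink(str(path)))


# Example use in a notebook cell:
# for name, size_mb in list_files_with_sizes("."):
#     print(f"{size_mb:8.2f} MB  {name}")
```

Linking one file at a time with show_link keeps the cell output short when the folder holds many checkpoints.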

gan · March 14, 2019, 4:00pm

The solution seems to be zipping the files and then deleting the folder contents, so as not to get the too-many-outputs error:
!zip -r output.zip /kaggle/working/
!rm -rf /kaggle/working/*

currently testing this
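
The same idea can be sketched in Python. One thing to watch with the shell version: if the notebook's working directory is /kaggle/working, output.zip is created inside that folder and the rm command deletes it along with everything else. The helpers below (zip_folder and clear_folder_except, both hypothetical names) build the archive in a temporary directory first and skip it during cleanup.

```python
import shutil
import tempfile
from pathlib import Path


def zip_folder(src_dir, dest_zip):
    """Zip everything under src_dir into dest_zip and return the archive path."""
    src_dir, dest_zip = Path(src_dir), Path(dest_zip)
    with tempfile.TemporaryDirectory() as tmp:
        # Build the archive outside src_dir so it cannot end up inside itself.
        archive = shutil.make_archive(
            str(Path(tmp) / dest_zip.stem), "zip", root_dir=str(src_dir)
        )
        dest_zip.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(archive, str(dest_zip))
    return dest_zip


def clear_folder_except(folder, keep):
    """Delete everything directly under folder except the file `keep`."""
    keep = Path(keep).resolve()
    for entry in Path(folder).iterdir():
        if entry.resolve() == keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


# Example use on Kaggle (paths taken from the commands above):
# archive = zip_folder("/kaggle/working", "/kaggle/working/output.zip")
# clear_folder_except("/kaggle/working", archive)
```

After both calls the working folder holds only output.zip, which keeps the number of output files down to one.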

gan · March 14, 2019, 4:58pm

So I was able to get the commit to pass, but there is no output zip file. Is there a specific directory I need to move output.zip to so that it shows up as output?

Daniel · March 17, 2019, 1:48am

Thanks for sharing it!
I tried it just now, a model file with size 241MB can be downloaded successfully!

ilovescience · March 17, 2019, 4:47am

@Daniel glad to hear this worked!

ilovescience · March 17, 2019, 4:49am

You probably haven’t saved the model in the correct directory… Otherwise, the model should be downloadable from the output section of the kaggle kernel

gan · March 17, 2019, 3:15pm

I tried this method and this is what I get

gan · March 17, 2019, 3:17pm

it is saved under /kaggle/working/stylegan/results/, I was able to upload the model to dropbox from kaggle using a script

ilovescience · March 17, 2019, 10:39pm

This doesn’t look like it is from the Kaggle kernel… you should be able to just click the link for the file you want and download it…

gan · March 19, 2019, 3:09pm

When I click on the link, that is the page I get

sammbeller · March 22, 2019, 3:48pm

Hey all, I’m getting Bus Errors when trying to look at a batch from my DataBunch. I’ve successfully downloaded the images into the kernel (I don’t know if this is the correct way to say that) and have created the data bunch but the following line is throwing errors:

data.show_batch(rows=3, figsize=(7,8), num_workers=0)

I’m also getting bus errors when trying to commit the kernel. Is this a problem on Kaggle’s end?

A4KA5H · March 22, 2019, 6:41pm

I was facing the same problem as you; this is the solution I came up with

scadaboosh · March 22, 2019, 11:29pm

Hey all!

I am getting errors with the lesson 3 planets kernel, could someone help me with the updated commands? I think ImageItemList got removed.

yappo · April 10, 2019, 8:10pm

Hi. While doing the second lesson I wanted to try running the bear recognition model on my own machine. As they say in the video, it’s better to train the neural network on a GPU, but once the model is ready I can use a CPU for image recognition. I saved a model to a file in a Kaggle kernel but I can’t find a way to download it. Is there a good way to get files out of a kernel, or would it be easier to do this in a different environment?

ilovescience · April 11, 2019, 7:07am

I am sorry I did not get back to you. I did not see this response until now. Did you get this to work?