Platform: Kaggle Kernels

So here is what happened (still not 100% sure, but we have a fix)

Cause: fastai version 1.0.46 on the Kaggle kernel

File: lr_find.py
Function: on_train_end()
self.learn.load('tmp') triggers the purge function, which for some reason tries to open the tmp file at path('…/input/tmp.pkl') instead of under the model_dir path.

Resolve:
Calling self.learn.load('tmp', purge=False) instead resolves the issue.

fastai 1.0.48 already has this fix… :joy:

My advice, if you want to use a Kaggle kernel (P100 GPU):
In the first cell, run

!conda install -c fastai fastai --yes

When that is done, run
import fastai
fastai.__version__

and check that it is 1.0.48.
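The same check can be done in code rather than by eye. This is only a minimal sketch, assuming plain dotted version strings such as 1.0.46; the helper names version_tuple and has_load_fix are made up for illustration and are not part of fastai.

```python
def version_tuple(version):
    """Turn a dotted version string such as '1.0.48' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each piece; stop at suffixes like 'dev0'.
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_load_fix(version, fixed_in="1.0.48"):
    """True if `version` is at least `fixed_in` (1.0.48 is the version named in this post)."""
    return version_tuple(version) >= version_tuple(fixed_in)
```

In a kernel you would call it as has_load_fix(fastai.__version__) after import fastai.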

Now you can go back and set learn = cnn_learner(..., model_dir = '/kaggle/model')

@shlyakh: You can find out which version of fastai you are using by putting in this code:
import fastai; fastai.__version__

@heye0507: It’s weird that you get an error with my code, because I used (and am still using) version 1.0.46 as well. In your error message it seems like your code still points at the input? It might also have something to do with the learner. I create it like this:
learn = create_cnn(data, models.resnet34, model_dir = '/tmp/models', metrics=error_rate)

But anyway: now we have two options, and with an update to 1.0.48 this problem seems to be gone, just like you wrote :slight_smile:
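Both posts set model_dir to a folder outside the input directory. Here is a rough sketch of choosing such a folder, assuming the input folder is read-only while the candidates may be writable. pick_model_dir is a hypothetical helper, not a fastai function, and the paths in the usage note are simply the two mentioned in this thread.

```python
import os


def pick_model_dir(candidates):
    """Return the first candidate directory that exists (or can be created) and is writable."""
    for path in candidates:
        try:
            # Create the directory if it is missing; skip candidates we cannot create.
            os.makedirs(path, exist_ok=True)
        except OSError:
            continue
        if os.access(path, os.W_OK):
            return path
    raise OSError("no writable directory among: %r" % (list(candidates),))
```

The result could then be passed on, for example model_dir = pick_model_dir(['/kaggle/model', '/tmp/models']).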


Is there a way to download a model after some epochs in Kaggle without committing? For example, if I train a GAN model, the commit will go on forever, because I stop the cell manually. In Colab I usually go to the folder structure and download the model file, or save it in Google Drive.

There is a way to download a csv file without committing… but I don’t know anything about downloading the model… I assume it is a .pth file?

Yes, I know about the csv method, but it only works for csv files of around 2 MB: https://www.kaggle.com/rtatman/download-a-csv-file-from-a-kernel
Mine is a pkl file. Being able to download any file would be great, though: even after committing (I cancel the commit) I get an error that the folder structure is too large, more than 6, and there are no output files. I’m looking for a workaround now.

I would try this:
https://www.kaggle.com/rtatman/download-a-csv-file-from-a-kernel#467667

from IPython.display import FileLinks
FileLinks('.') # input argument is specified folder

I do not know if there is a download limit though…


The solution seems to be zipping the files and then deleting the folder contents, to avoid the too-many-outputs error:
!zip -r output.zip /kaggle/working/
!rm -rf /kaggle/working/*

currently testing this

So I was able to get the commit to pass, but there is no output zip file. Is there a specific directory I need to move output.zip to for it to show up as output?
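One possible explanation, assuming the notebook runs with /kaggle/working as its current directory (the posts do not say): output.zip is created inside /kaggle/working, so the rm -rf /kaggle/working/* that follows deletes it along with everything else. Here is a sketch of a variant that builds the archive in a temporary folder first and only moves it back after the cleanup. zip_and_clear is a made-up helper name, and this has not been tried on a real commit.

```python
import os
import shutil
import tempfile


def zip_and_clear(work_dir, archive_name="output"):
    """Zip everything in work_dir, empty work_dir, then move the zip back into it."""
    staging = tempfile.mkdtemp()
    try:
        # Build the archive outside work_dir so the cleanup below cannot delete it.
        archive = shutil.make_archive(
            os.path.join(staging, archive_name), "zip", root_dir=work_dir
        )
        # Remove every file and folder that was just archived.
        for entry in os.listdir(work_dir):
            path = os.path.join(work_dir, entry)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
        # Put the single zip file back so it is the only thing left in work_dir.
        return shutil.move(archive, os.path.join(work_dir, archive_name + ".zip"))
    finally:
        shutil.rmtree(staging, ignore_errors=True)
```

On Kaggle this would be called as zip_and_clear('/kaggle/working'), leaving only output.zip in the working folder.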

Thanks for sharing it!
I tried it just now, a model file with size 241MB can be downloaded successfully!

@Daniel glad to hear this worked!

You probably haven’t saved the model in the correct directory… Otherwise, the model should be downloadable from the output section of the kaggle kernel

I tried this method and I get this:


It is saved under /kaggle/working/stylegan/results/. I was able to upload the model to Dropbox from Kaggle using a script.

This doesn’t look like it is from the Kaggle kernel… you should be able to just click the link for the file you want and download it…

When I click on the link, that is the page I get.

Hey all, I’m getting Bus Errors when trying to look at a batch from my DataBunch. I’ve successfully downloaded the images into the kernel (I don’t know if this is the correct way to say that) and have created the data bunch but the following line is throwing errors:

data.show_batch(rows=3, figsize=(7,8), num_workers=0)

I’m also getting bus errors when trying to commit the kernel. Is this a problem on Kaggle’s end?


I was facing the same problem as you; this is the solution I came up with:

Hey all!

I am getting errors with the lesson 3 planets kernel, could someone help me with the updated commands? I think ImageItemList got removed.

Hi. When I was doing the second lesson, I wanted to try running the bear recognition model on my machine. As they say in the video, it’s better to train the neural network on a GPU, but once the model is ready I can use a CPU for image recognition. I saved a model to a file in a Kaggle kernel but I can’t find a way to download it. Is there a good way to get files from a kernel, or would it be easier to do this in a different environment?

I am sorry I did not get back to you. I did not see this response until now. Did you get this to work?