Platform: Kaggle Kernels

Thanks @init_27 and @wdhorton for this wonderful work!

There seem to be a lot of tricks and effort poured in to make these work in Kaggle kernels. Even downloading the dataset seems very tricky.

Every time I run Lesson 1 pets on Kaggle, downloading works fine; but when I start a new kernel to download the Pets dataset, I get the same error message below:

ConnectionError: HTTPSConnectionPool(host='s3.amazonaws.com', port=443): Max retries exceeded with url: /fast-ai-imageclas/oxford-iiit-pet.tgz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f23189d9f60>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

Here are my questions:

  1. In your Lesson 1 pets notebook, it seems we have to re-download the pet dataset every time we run the notebook. Does that mean we can't save the pet dataset somewhere on Kaggle for reuse?
  2. What do I need to do to download the pet dataset in my own new kernel?
  3. How can I download my trained model/weights from a Kaggle kernel to my local computer?

Thanks!

I have saved the dataset over here

If you do `learn.save()` and commit your kernel, it should be under Output for you to download.
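A minimal sketch of what I mean (assuming `learn` is your trained Learner; the name `'stage-1'` is just an example):

```python
# Save the weights from inside the kernel; after you Commit, the resulting
# .pth file shows up under the kernel's Output tab for download.
learn.save('stage-1')  # written as stage-1.pth under learn.path/learn.model_dir
```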

Also, that error is probably because you have the internet connection turned off.


Can someone tell me where we can upload the pre-trained AWD LSTM bits (if that is possible)?

The competition I’m working in doesn’t allow submissions when “Internet Connected” = True.

Hi @ilovescience

Thanks for your very helpful response! The downloading problem and the dataset reuse problem have been dealt with beautifully.

Also, thanks for uploading the Oxford pets dataset! I found it very, very slow to upload; actually, I think it is not uploading at all. Do you know why?

As for weights, do you know how to save weights on Kaggle and reuse them later?

Thanks a lot!

I am not sure why it was not uploading for you…

I have tried using `learn.save('stage-1')`, and sometimes the file will show up under the Output section of the kernel when you commit, and sometimes it does not. It depends on the weird directory structure of the Kaggle Kernels and where you have the model directory. I will try to figure out how to do that.


I was able to get the model weights to show up as output. If you save the model in the main directory, the weights appear under Output. Please see this kernel
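Roughly, the pattern is as follows (a sketch; `path='.'` and `model_dir='.'` are my way of keeping the file in the kernel's working directory, since the default path can point at the read-only input folder):

```python
# Keep the weights next to the notebook so Kaggle surfaces them as output
# after a commit, instead of burying them in a models/ subfolder.
learn = create_cnn(data, models.resnet34, metrics=error_rate,
                   path='.', model_dir='.')
learn.fit_one_cycle(4)
learn.save('stage-1')  # -> ./stage-1.pth, downloadable from Output
```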


FYI, Kaggle now seems to be using fastai 1.0.45.

Yes, you are right.

To be more specific, we can save the model with `learn.save('/kaggle/working/model_1')`.
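Loading it back later works the same way, e.g. (a sketch; this relies on the absolute path winning when fastai joins it with `learn.path`/`learn.model_dir`):

```python
learn.save('/kaggle/working/model_1')  # writes /kaggle/working/model_1.pth
learn.load('/kaggle/working/model_1')  # restores those weights in a later run
```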

If I'm in a competition where internet connection is not allowed on kernels, how can I use a pre-trained fastai resnet34 CNN?

Yeah, `1.0.45`.

You can upload your pre-trained AWD LSTM as a dataset on Kaggle; then, in your Kaggle kernel, you can "Add Data" and select your dataset.
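A sketch of what that can look like inside the kernel, assuming the attached dataset contains the WT103 weights and vocab files (the dataset name `wt103` and the file names below are examples, and the exact `language_model_learner` signature varies a bit across fastai 1.0.x versions):

```python
# Copy the attached files somewhere writable, then point fastai at them with
# pretrained_fnames instead of letting it try to download WT103.
!mkdir -p /kaggle/working/models
!cp /kaggle/input/wt103/lstm_wt103.pth /kaggle/working/models/
!cp /kaggle/input/wt103/itos_wt103.pkl /kaggle/working/models/

learn = language_model_learner(data_lm, AWD_LSTM, pretrained=False,
                               model_dir='/kaggle/working/models',
                               pretrained_fnames=['lstm_wt103', 'itos_wt103'])
```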

I tried using this approach, but the learner fails to load the pretrained model, even though that's the one it downloaded itself a few minutes before.

The notebook describes all the steps and the errors that occurred. Do you know what's wrong there?

I update the kernels every weekend, so any updates pushed out mid-week cause some breakage now and then.
Sorry for the issues, I'll try to update them mid-week :slight_smile:

Regarding the `learn = create_cnn(data, models.resnet34, metrics=error_rate, pretrained=False, model_dir=str(models_path.absolute()))` in your notebook - this was how I did it.

My dataset path was `model_path = "/kaggle/input/resnet34/resnet34"`. I copied my downloaded resnet34 into `/kaggle/working/models/`:

`!cp -r /kaggle/input/resnet34/ /kaggle/working/models/`

Then I initialised my model:

`model = create_cnn(data, models.resnet34, metrics=accuracy, path=".", pretrained=False, model_dir='/kaggle/working/models/resnet34/resnet34')`

Then you can fit with that.

@init_27
Kaggle kernels allow us to run `!pip install fastai==1.0.46` to install the latest version fine, but when the library is loaded, the latest version is not used.

See this proof in Kaggle
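For reference, this is how I'm checking which version actually got imported; as far as I understand, the kernel has to be restarted after `!pip install` before the new version takes effect, because the already-imported module stays in memory:

```python
import fastai
print(fastai.__version__)  # still reports the old version until the kernel restarts
```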

Thank you for the suggestion. I don't quite get how it's different from my solution, except for the missing step of loading the copied model. As far as I understand the process, it only creates a new network with the resnet34 architecture and randomly initialized weights. It doesn't use the copied file in any way.

I've updated the notebook with your example. It supports my guess that the weights from resnet34.pth weren't used. I might have messed up the directories somehow, though. Could you please share the output of the following commands?

# What's in your datasource directory
!ls -lah /kaggle/input/resnet34/
# The working directory should have the content of /kaggle/input/resnet34/ I guess
# Why is it /kaggle/working/models/resnet34/resnet34 you're using for your model_dir?
# Is there any special file that create_cnn can read?
!ls -lah /kaggle/working/models/
!ls -lah /kaggle/working/models/resnet34
!ls -lah /kaggle/working/models/resnet34/resnet34

And just to be sure, did you compare the results of using the model from the dataset, as you suggest, with the results of loading it (on another machine, maybe)?
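In case it helps anyone else stuck offline: a workaround I've seen elsewhere (not verified in this thread, and the cache path and filenames are assumptions about the torch version Kaggle ships) is to drop the uploaded weights into torch's download cache under the exact filename torchvision expects, so that `pretrained=True` finds them without internet:

```python
# torchvision's model zoo checks $TORCH_HOME/models (~/.torch/models on the
# torch builds Kaggle had at the time) before downloading; if the checkpoint
# is already there under its expected hashed name, no network call is made.
!mkdir -p ~/.torch/models
!cp /kaggle/input/resnet34/resnet34.pth ~/.torch/models/resnet34-333f7ec4.pth

learn = create_cnn(data, models.resnet34, metrics=error_rate)  # pretrained=True by default
```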

I have a few beginner questions, as doing the course using Kernels doesn’t seem to be working for me.

Most importantly, how can you actually save your work? I'm trying to do lesson 1, and have been trying for about 2 days to get through it. It's taking up to 40 minutes to train a model, and sometimes this is long enough for the kernel to stop responding/time out. No error message, it just stops responding completely. When I reload the kernel, I would like to be able to load the model so I don't have to spend 60+ minutes going through the notebook again. Unfortunately, I can't figure out how to do that, as the saved models don't seem to persist if I have to refresh the page. Is this even possible with kernels?

I also tried looking into exporting the .pth files after saving the model, but it looks like you have to 'commit' the kernel to do this - and that process takes over 90 minutes, requires running the notebook start to finish, and comes with no promise that it won't time out, so I unfortunately haven't been able to save my work this way.

I assume I'm just not understanding the workflow I should be using with kernels. Can someone provide me some guidance? As it is, I can't really even get through the first lesson, since it is taking so long and I don't have a way of saving progress. Do other platforms have similar limitations?


Hi,

You can use Google Colab for fastai. Please check this link to get started:
https://course.fast.ai/start_colab.html

Same question for Lesson 2 – I want to pull the weights and labels.csv for a dataset I spent time cleaning up.

The timeout is a known problem with Kaggle Kernels if you do not commit. But if you commit, it should not time out, and you should be able to export your model as .pth files.
