Non-Beginner Discussion

I’m getting 404s on these, maybe not public yet.

Thanks - try now.



Sorry, I forgot that I forced my Notebook to not duplicate the images!

I mean, I don’t know why, when getting the images from the dataset, it gets two folders with the same images: Mushrooms and mushrooms/Mushrooms.

It can be seen in the copy you made:

But in my copy, I only used the folder Mushrooms:

So I think the question now is: is the model that sees each image twice overfitting, or is the duplication acting as a form of augmentation?

Sorry and thanks again.

Oh, sorry, I hadn’t seen your answer.

I would try that.

The difference was that one training run used every image twice.
I then asked about that in Jeremy’s response.

Thank you.

Yeah sounds like the original had a bug so it’s not actually working correctly. Sorry I missed that!

Does anyone see any issue with installing packages to Paperspace persistent storage and then adding this location to system path?

To save time I have been doing the following:

  • Create a directory in /notebooks (the home dir) called libs
  • Install packages to this dir using the --target argument: !pip install --target=./libs wandb.
  • Add this location to sys.path:
import sys
pkg_path = "./libs"
sys.path.append(pkg_path)  # make the packages installed in ./libs importable

The above cell is then the only cell you need to run each time you restart a Paperspace machine.

Does anyone see any fault in this approach? It saves me precious minutes (package install is a little slow in Paperspace) when I have a quick idea to try out, but unsure if setting sys.path has any other impacts (none that I can see so far).
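One side effect worth checking is import order. Here is a minimal sketch, assuming the ./libs directory described above; the helper name add_persistent_libs and its prepend flag are made up for illustration. Appending keeps the preinstalled packages first, so a copy in ./libs is only imported when the image does not already ship that package, while prepending lets ./libs shadow the preinstalled versions. Resolving to an absolute path keeps the entry valid if the notebook later changes its working directory.

```python
import sys
from pathlib import Path


def add_persistent_libs(path="./libs", prepend=False):
    """Add a persistent package directory to sys.path exactly once.

    Hypothetical helper for illustration; `path` is assumed to be the
    directory that `pip install --target` wrote to.
    """
    resolved = str(Path(path).resolve())  # absolute, independent of later chdir
    if resolved not in sys.path:
        if prepend:
            sys.path.insert(0, resolved)  # ./libs wins over preinstalled packages
        else:
            sys.path.append(resolved)     # preinstalled packages win
    return resolved


libs = add_persistent_libs("./libs")
```

Running the cell more than once does not add duplicate entries, so it is safe to leave at the top of a notebook.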


@stantonius The Paperspace images come with a certain amount pre-installed, but anything beyond that is unfortunately something you have to reinstall / reload each time.

Another approach would be to create a custom ‘notebook’ based on a custom docker image. You could potentially use the fastai docker container as a base (more on those here), but you might want to add some custom logic on top of that. I described how I did that with the IceVision library and connected it to a custom notebook on Paperspace in this blog post.

That said, I’m not sure I’d bother with it for just simple fastai experimentation. Even when I needed to do it for my side project, it still felt like a distraction from the main show in town: i.e. training my models. YMMV :slight_smile:


@stantonius @strickvl please remember to keep non-beginner topics, like modifying the python sys path and creating docker images, away from any topic labelled “beginner”. I’ve moved your discussion to the non-beginner topic now.


I’m about to use timm, but just making sure - I could use timm models in unet_learner as well as I could in vision_learner? (Or it’s just for vision_learner)

Another question from the tutorial. Say, we have this:

import timm
import torch

model = timm.create_model('resnet34')
x     = torch.randn(1, 3, 224, 224)
model(x).shape

torch.Size([1, 1000])

What does each axis of the output’s shape mean? I guessed it was the number of layers, but I’m not sure.

Right now unet_learner does not support timm.

The output holds the logits for each of the classes: the first axis is the batch (one image here) and the second has one entry per class, and ImageNet has 1000 classes…
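To make the two axes concrete, here is a small sketch in plain Python. Nested lists stand in for the tensor and the values are random numbers, not real model outputs; the softmax helper is written out by hand for illustration. Each row is one image in the batch and each column is the score for one class, so turning a row into probabilities and taking the largest entry gives the predicted class index.

```python
import math
import random

random.seed(0)
batch_size, num_classes = 1, 1000  # matches torch.Size([1, 1000])

# Fake logits: one row per image, one column per class.
logits = [[random.gauss(0, 1) for _ in range(num_classes)]
          for _ in range(batch_size)]


def softmax(row):
    """Turn one row of logits into probabilities that sum to 1."""
    top = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


probs = [softmax(row) for row in logits]
# Index of the most likely class for each image in the batch.
pred = [max(range(num_classes), key=row.__getitem__) for row in probs]
shape = (len(logits), len(logits[0]))
```

With a batch of 8 images the first axis would be 8 and the second would stay at 1000.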


Hey everyone,

I have been iterating through Jeremy’s notebook Iterate like a grandmaster! | Kaggle and trying to take on the suggestion of using the patent-trained model and incorporating it with fastai. However, I am hitting a wall: when I try to cut/slice/manipulate a PyTorch model

# example of what I thought was the simplest model modification,
# which is just deconstructing and reconstructing the model
model = nn.Sequential(*[l for l in model_d.children()])

and pass it a batch, I get this error:

TypeError: forward() got an unexpected keyword argument 'input_ids'

I know the batch is formatted correctly, because when I pass one batch to the HF-delivered model, I get the expected output.
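A likely cause, offered as a guess rather than a confirmed diagnosis: the forward method of nn.Sequential takes a single positional input, while a HF batch is usually passed as keyword arguments (input_ids=..., attention_mask=...). The sketch below reproduces that mismatch in plain Python. TinySequential and KeywordAdapter are hypothetical stand-ins written for this illustration, not PyTorch, HF or fastai classes.

```python
class TinySequential:
    """Stand-in for a sequential container: forward takes ONE positional input."""

    def __init__(self, *layers):
        self.layers = layers

    def forward(self, input):
        for layer in self.layers:
            input = layer(input)  # each layer's output feeds the next layer
        return input

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class KeywordAdapter:
    """Hypothetical wrapper: takes the batch as keywords, passes input_ids positionally."""

    def __init__(self, body):
        self.body = body

    def __call__(self, input_ids, **unused):
        return self.body(input_ids)


body = TinySequential(lambda ids: [i + 1 for i in ids], sum)
batch = {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]}

try:
    body(**batch)  # keyword call into a positional-only forward
    error = None
except TypeError as exc:
    error = str(exc)

result = KeywordAdapter(body)(**batch)  # the adapter bridges the two conventions
```

The adapter silently drops attention_mask, and rebuilding a model from children() also drops whatever the original forward did between layers, so this only shows why the TypeError appears; it is not a complete fix.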

Any ideas what I am doing wrong here? I have searched high and low and cannot find the answer, but that usually means I have overlooked something simple.

BTW this is clearly a basic example, but this is after I had tried experimenting with custom architectures and always eventually hit this same error. I ultimately wanted to write a blog for this group that outlines how to use models from other libs in fastai, but I have fallen at the first hurdle.

For reference, my notebook here describes the issues I am facing in more detail (same Kaggle competition data as JH’s notebook).

Extra points: I found when I was creating the Datasets object myself for this custom task, I had to specifically move the input tensors to the GPU via the .cuda() method in my custom Transform. However when I follow the HF or any fastai tutorial, it seems this is done automatically. I tried looking in both repos but can’t seem to see where this happens. Any ideas?

Many thanks for any guidance or feedback

I just found out about this intriguing project and it seems to resonate with what Jeremy was discussing in today’s lesson about Transformers being good for GPUs but ULMFiT/RNNs being better for larger contexts. It appears to be based on something called “Attention-Free Transformers”. Is anybody familiar with this type of work? Is it worth pursuing?


Am I correct in understanding that huggingface does not provide a facility to configure ssh-keys? Lots of googling hasn’t turned up anything. Is there some alternative way to configure git so that git push doesn’t ask for my username & password each time?

[Edit:] Heh! I discovered something new… Git - gitcredentials Documentation
So all I needed to do at my local machine WSL console was…

Read the help files to consider security implications

$ git help credential-cache
$ git help credential-store

Implement the one I chose…

$ git config --global credential.helper store

and then at the next push, entered my huggingface username/password for the last time.


Maybe you can talk to BlinkDL in the EleutherAI discord, he is frequently sharing his progress there.


Hey guys, any idea how to deploy my model into an online app? I just need a website where one can upload a photo, and get it predicted and decoded with the model.

I barely caught the thing with Binder, and others

How is what you’re looking to do different from what we covered in lesson 2?


Oops! I might have forgotten that. I was probably looking at an old tutorial again. Thanks for pointing me to HuggingFace, Gradio… I will check these out.

You don’t say that you’ve considered HuggingFace and don’t want to use it, so use that!

Otherwise, I see lots of options googling for: gradio web hosting.
Can you report on three options you find there, so we can get a better feel for what sort of service you are looking for?

To be honest, I just didn’t know about HuggingFace. I was exploring different options but only found Binder, in an old version of the fastai tutorials/classes, and got lost. Rewatching class #2 of this year (2022) showed me new options such as HuggingFace, which is what I’m going to try first… Will keep you posted :wink:


A tip, btw: while helping with the transcriptions I’m picking up details that I missed when just viewing the lessons…