@orangelmx Can verify this experience – multi-GPU training is full of unexpected, non-linear hassles, like race conditions, which are very difficult to trace and debug. It's just too easy to come unstuck in some non-obvious way and run out of talent especially quickly. This is exacerbated if you have two different GPU models: you will experience bottlenecking from the lower-powered GPU and all sorts of other issues.
Oh, is it ‘Nguyen van dui’ you’re talking about? I think I’ve seen it on social media every now and then. Now that you mention it, in some of the vids/pics the background tiles do match.
Thanks for resolving this mystery. Not including links here so as not to go too off-topic
To expand on this answer a little: segmentation in fastai is quite picky about the mask format. The background pixels need to have the value 0, and all the other classes should use consecutive integer values 1, 2, and so forth. This makes inspecting the mask images in an image viewer quite difficult, since all your pixels are almost black.
The clever trick @VishnuSubramanian used helps a lot for binary masks (where pixels are either background – black (0) – or foreground – white (255)): you can get away with dividing the mask by 255 to get only 0 or 1 as outputs.
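As a sketch, the remapping really is just an integer division (plain Python lists here to stay self-contained; in practice you'd load the mask with PIL and divide the NumPy array by 255):

```python
# Minimal sketch of remapping a binary mask so fastai sees classes {0, 1}.
# Real code would operate on a NumPy array loaded from the mask image.

def binarize_mask(mask_rows):
    """Map background (0) -> 0 and foreground (255) -> 1."""
    return [[px // 255 for px in row] for row in mask_rows]

mask = [
    [0,   0,   255],
    [255, 255, 0],
]
print(binarize_mask(mask))  # → [[0, 0, 1], [1, 1, 0]]
```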
BTW, it would be great if the SegmentationDataloader could handle arbitrary colors (with a vocab-like mechanism), or at least sanity-check the inputs a bit and show better error messages.
While using the pet classifier, I noticed that if I used the whole picture for inference, it classified it as a “cat”, but if I used the edit functionality (the pencil in the top-right corner of the picture) and cropped it to just the head, it correctly identified it as a “dog”.
I thought it was interesting that the classifier took into account various body configurations and the relationships that may/may not exist. OTOH, this dog’s ears look very cat-like, so it’s impressive that the classifier still picks it as a dog when shown just the head (cropped from the same picture).
I’ve been looking at the docs and experimenting tonight but haven’t come up with a solution yet @sambit. Using Google Colab, I generally can’t see the downloaded file in the file explorer; only if I open the terminal can I locate it in the hidden folder .fastai.
My interim hack is to move files with a terminal command from a notebook cell, e.g.:

```python
path = untar_data(URLs.IMDB_SAMPLE)
!mv {path}/'texts.csv' /content/drive/MyDrive/Notebooks/
```
The biggest issue I am finding with using fastai and Google Colab is that artifacts like trained models are so easily lost. The config.ini folder locations for data, models, and archive would work great in a static environment, but as far as I understand right now, this isn’t possible with Colab. I hope to be proven wrong!
My understanding is that a new instance is a clean slate: you have to start over unless you download or move your artifacts elsewhere, and this can be triggered as soon as you close your laptop lid. I’ve used PyTorch Lightning to resume interrupted training runs from the most recent epoch checkpoint, saving time and energy. I am wondering if fastai can or could do this?
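For the resume side of this, fastai’s Learner does have `learn.save()`/`learn.load()` and a `SaveModelCallback` for checkpointing during training (worth checking the fastai docs for exact semantics). The general checkpoint-and-resume pattern looks like this stdlib-only sketch, with fake “training” and a JSON file standing in for saved model weights:

```python
# Generic checkpoint-and-resume pattern (stdlib only). In fastai you'd
# save real model weights instead of this JSON state.
import json
import os
import tempfile

def train(total_epochs, ckpt_path):
    # Resume from the last checkpoint if one exists.
    state = {"epoch": 0, "loss": None}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    for epoch in range(state["epoch"], total_epochs):
        # Fake "training" step; a real loop would update model weights here.
        state = {"epoch": epoch + 1, "loss": 1.0 / (epoch + 1)}
        with open(ckpt_path, "w") as f:  # checkpoint after every epoch
            json.dump(state, f)
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
train(3, ckpt)         # first "session" runs epochs 0-2, then gets cut off
print(train(5, ckpt))  # → {'epoch': 5, 'loss': 0.2}  (resumed at epoch 3)
```

The key point is that the resume logic costs only a few lines once saving happens every epoch; losing the Colab instance then costs you at most one epoch of work, provided the checkpoint file lives on mounted Drive rather than the ephemeral disk.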
I tend to agree with your observations. I have found colab’s drive situation to be, at the very least, confusing. This is not so much a function of fastai but just the way the colab ecosystem is setup. I have not tried to continue with data saved across session boundaries (mostly because of this).
I think this can be mitigated with extra coding, checkpointing, and whatnot. I would think that saving to my Google Drive would at least keep the downloaded data (config.ini mappings notwithstanding), but I have not made much effort in this regard because I find dealing with Google Drive simply too clunky and unwieldy.
I might be wrong, but I feel that since there is no data for the third category, that third category won’t be recognized by the model at all. So option (c) is probably my best guess.
I mean, you’re doing all those matrix operations, calculating the loss, getting the gradients, adjusting the weights, etc., only for the network to learn to predict 0 for the OTHER class, regardless of what the inputs are.
I’d be very surprised if that class activates at all for any image. I haven’t tried this myself yet, but I’d like to know what happens practically.
My intuition says something like
d) [x, y, z] where x + y ~= 1.0 & z ~= 0.0
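For intuition, here is what softmax does with a hypothetical set of logits where training has pushed the unused OTHER class’s logit far down (the logit values are made up purely for illustration):

```python
import math

def softmax(logits):
    """Standard softmax: exponentiate and normalize so outputs sum to 1."""
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for [dog, snake, OTHER]: with no OTHER training
# examples, its logit tends to get driven very negative.
probs = softmax([2.0, 1.5, -8.0])
print([round(p, 4) for p in probs])
```

The first two outputs share essentially all the probability mass (x + y ≈ 1.0) while the third sits near zero, which matches the (c)/(d) intuition above.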
I’d guess c too - and with different values than 0.5 if the dog or snake training data had houses in the background (hopefully 0.9, 0.1, 0 - as I like my houses without snakes).
Let’s get it to 5 answers because I’m interested in the answer too
I’m guessing c with a small caveat.
(but more like some small, non-zero probability of it predicting OTHER)
My intuition says that since there is no “understanding”, the network may activate on features it has learned for the known categories and mistake the OTHER input for one of them.
@jeremy I implemented a Python function reindex(dest, start_idx=0, ext=None) that takes a directory (or a directory of directories) and uniquely reindexes all of the files across all directories. Optionally, a starting index and extension can be provided. Importantly, when it reindexes, it preserves the non-numeric stem of the file name, to the extent that it was used for labeling. I plan to wrap it in a script so it can be used from either the shell or within Jupyter, and post it on GitHub.
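A rough sketch of what such a function might look like (this is my own stdlib reconstruction from the description above, not the author’s actual code; only the signature reindex(dest, start_idx=0, ext=None) comes from the post):

```python
import pathlib

def reindex(dest, start_idx=0, ext=None):
    """Uniquely re-number files under dest (recursively).

    Keeps the non-numeric part of each filename stem (often the class
    label) and replaces the numeric tail with one globally unique running
    index. Optionally start from start_idx and restrict to extension ext.
    """
    dest = pathlib.Path(dest)
    files = sorted(p for p in dest.rglob("*") if p.is_file())
    if ext is not None:
        files = [p for p in files if p.suffix == ext]

    # First pass: move everything to temporary names so a final name can
    # never collide with a file that has not been renamed yet.
    staged = []
    for i, p in enumerate(files):
        label = p.stem.rstrip("0123456789_-")  # non-numeric part of the stem
        tmp = p.rename(p.with_name(f"__reindex_tmp_{i}{p.suffix}"))
        staged.append((tmp, label, p.suffix))

    # Second pass: assign the final, globally unique indices.
    idx = start_idx
    for tmp, label, suffix in staged:
        new_name = f"{label}_{idx}{suffix}" if label else f"{idx}{suffix}"
        tmp.rename(tmp.with_name(new_name))
        idx += 1
    return idx  # next unused index
```

For example, a tree containing dogs/dog_1.jpg, dogs/dog_2.jpg, and cats/cat_1.jpg would come out as cat_0.jpg, dog_1.jpg, dog_2.jpg (files are visited in sorted order, and each keeps its label prefix while getting a unique index).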