Fastai v2 chat

couple of quick questions:

  1. does the progress bar work in v2?
  2. is there an easy way to set an epoch size smaller than my whole dataset, or a good way to do that? I have 650k images - would love to do something like randomly sample 50k for each epoch so I can early-stop with more granularity.

oh - I think if I import callbacks.progress it will patch progress into defaults.callbacks

edit: yep

from local.callback.progress import *

that's sort of interesting that importing a module monkey-patches new superpowers into the learner. I think it's useful, but I wasn't expecting it.

Yeah, you can always use the callbacks directly - but to use the Learner shortcuts, monkey-patching is really the only way to make that work.

yeah, I don't see a better way, and I kind of like how it works now that I know about it… just wasn't used to seeing that

one question about this line…

defaults.callbacks = [TrainEvalCallback, Recorder, ProgressCallback]

is there any possibility of accidentally clobbering defaults set by some other module? does import order matter?

Ultimately all callbacks will be imported together, so we will have to be careful to import them in an order that leaves defaults.callbacks with all the defaults.
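
For anyone else surprised by this, the pattern is roughly the following (a minimal sketch with made-up names and placeholder classes, not the actual fastai source): the module's top-level code runs at import time, appends its callback to defaults.callbacks, and can attach convenience methods to Learner the same way.

# progress_sketch.py - hypothetical module showing the import-time patching pattern
from types import SimpleNamespace

# stand-ins for the real fastai objects, just to keep the sketch self-contained
defaults = SimpleNamespace(callbacks=[])

class Learner:
    def __init__(self): self.cbs = list(defaults.callbacks)

class ProgressCallback:
    "Placeholder for the real progress-bar callback."

# --- everything below runs when the module is imported ---
# this append is why `from ...progress import *` makes the bar a default
defaults.callbacks.append(ProgressCallback)

# a Learner shortcut can be monkey-patched in at import time too (hypothetical name)
def drop_progress(self):
    "Remove the progress callback from this learner."
    self.cbs = [cb for cb in self.cbs if cb is not ProgressCallback]
    return self

Learner.drop_progress = drop_progress

The clobbering risk raised above comes from a module assigning defaults.callbacks = [...] outright (as in the line quoted earlier) rather than appending, which would drop anything registered before it - that's why import order matters.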

does v2 have the equivalent of .to_distributed yet? or is there already a good distributed v2 training script lying around for inspiration?

apologies if it's super obvious in the source; I didn't see it (I do see stuff to return the right model if distributed, so I guess I could wrap up the model myself?)

I'm using MixedPrecision btw, in case that affects anything

No, it doesn't exist yet, although in v1 it's fairly simple, so feel free to port it over.
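
For reference while porting: the core of distributed training is plain PyTorch - roughly, wrap the model in DistributedDataParallel and give each process its own shard of the data. A rough sketch of just those two pieces (this is not fastai API; the helper name is made up and the callback wiring around it is left out):

import torch
from torch.nn.parallel import DistributedDataParallel
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def make_distributed(model, dataset, batch_size, local_rank):
    "Hypothetical helper: the plain-PyTorch core of distributed training."
    # one process per GPU; local_rank usually comes from torch.distributed.launch
    torch.cuda.set_device(local_rank)
    torch.distributed.init_process_group(backend='nccl', init_method='env://')

    # each process iterates over a different shard of the data
    sampler = DistributedSampler(dataset)
    dl = DataLoader(dataset, batch_size=batch_size, sampler=sampler)

    # gradients are all-reduced across processes after each backward pass
    model = DistributedDataParallel(model.cuda(), device_ids=[local_rank])
    return model, dl

The processes would then be launched with something like `python -m torch.distributed.launch --nproc_per_node=<num_gpus> train.py`; a v2 port is mostly about wrapping this in a callback so the Learner sees the wrapped model and the sharded DataLoader.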

thanks - I might. One other thing I tripped on: skm is not defined when I import local.metrics. On my local fork I added

import sklearn.metrics as skm

to the top of metrics.py

wasn't sure if that was correct or if I'm importing things incorrectly? maybe you don't want to bake in that sklearn.metrics dependency?

The corresponding cell is not exported; that's a mistake. Will fix now!

thanks! I'll work on getting up to speed on the test/commit process so I can just fix trivial stuff like this without bugging you guys

Honestly, it's often faster for us to be told of an issue on the forum and fix it ourselves than to deal with merging a PR. So feel free to just ping us when you see a problem.

ok happy to do it that way too

it is probably more useful for me to work on real-world problems and see what I trip on anyway?

Yes that's very useful, Fred!

One thing that I feel would be very useful is the ability to run inference on either a batch or a single example using a loaded model. This can be useful in the context of a Kaggle competition, but more broadly it would allow for building a lot of interesting things. It might be that I am doing something wrong, but currently this does not seem possible. I tried looking into this myself so as not to bother anyone, but didn't get very far.

A bonus here would be an ability to perform TTA. I am not even that interested in a TTA method, but rather something that would allow me to specify augmentations to be applied to data during inference and that would give me the preds from a single inference run. I suspect I could get this as is right now if I can figure out how I can apply augmentations to the validation set (this is not something I have explored).

One other thing I have been interested in doing is custom sampling of the examples that go into an epoch, but now, looking at the code, I think this should be fairly straightforward to do somewhere in between DataLoader.sampler and DataLoader.create_batches :slight_smile:
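
A minimal sketch of that idea in plain PyTorch (class and argument names are made up, and this targets a standard DataLoader rather than fastai's): a sampler that draws a fresh random subset of indices every epoch, so each epoch only sees, say, 50k of the 650k images.

import random
from torch.utils.data import DataLoader, Sampler

class SubsetPerEpochSampler(Sampler):
    "Hypothetical sampler: each epoch iterates over a fresh random subset of size n."
    def __init__(self, dataset_len, n):
        self.dataset_len, self.n = dataset_len, n

    def __iter__(self):
        # the DataLoader calls __iter__ once per epoch, so this re-draws every epoch
        return iter(random.sample(range(self.dataset_len), self.n))

    def __len__(self):
        return self.n

# usage: dl = DataLoader(train_ds, batch_size=64,
#                        sampler=SubsetPerEpochSampler(len(train_ds), 50_000))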

I can tell from the commit history that both you and Sylvain are very busy - I am sure that this functionality is coming at some point; it would just be helpful to know if you are planning on working on the ability to predict using a loaded model anytime soon. If not, maybe I'll take another stab at it, though I am thinking that reordering the things I was planning to work on would probably be my best bet.

@radek I'm looking at this too, because I'm now at the point where I want to try submitting a model to the Kaggle competition. Right now I'm doing roughly this, based on the #13 learner notebook:

dbunch = dblock.databunch(train_data,  
                          bs=batch_size, 
                          val_bs=batch_size, 
                          ds_tfms=Resize(img_size), 
                          dl_tfms=aug_transforms())

test_ds = dblock.datasource(test_df)
test_dl = TfmdDL(test_ds, bs=batch_size)
dbunch.dls += (test_dl,)

opt_func = partial(Adam, lr=3e-3, wd=0.01)
#model = xresnet.xresnet50(c_in=channels,c_out=num_cat)
model = xresnet.xresnet18(c_in=channels,c_out=num_cat)
cb_funcs = [partial(MixedPrecision, clip=0.1)]
metrics = [accuracy_multi]
learn = Learner(dbunch, model, opt_func=opt_func, loss_func=BCEWithLogitsLossFlat(), 
                cb_funcs=cb_funcs, metrics=metrics)


learn = learn.load('submission_1')

learn.get_preds(ds_idx=2)

it's crashing inside the progress callback because n_epoch doesn't exist.
In validation, and in get_preds, self._do_begin_fit(n_epoch) is not called inside the learner.

I'm debugging this now; when I figure it out I'll let you know

I agree with all your suggestions though, especially the sampler - I was thinking the same thing, and it would be great for this particular competition (brain injury) because 1) there are a zillion images, and 2) there are some natural ways you'd want to segment the random draws (by patient, by class type, etc.)

anyway, somehow I'm going to extract a submission out of this model today; I'll let you know when I figure out how

my approach to this was going to be to override get_items in DataBlock and combine it with a callback on the learner

so I would initialize the databunch with a dataframe

the get_items method would draw n samples from the dataframe using whatever complicated sampling scheme you want

in an end-of-epoch callback, I would reset the random draws for the next epoch

hope that makes sense; it's a half-baked plan at the moment
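
A rough sketch of that plan (heavily hedged: the helper and callback names are made up, the Callback/after_epoch hook assumes the fastai2-style events, and whether the DataLoaders pick up the new draw without rebuilding the databunch is exactly the open question):

from fastai2.callback.core import Callback  # assumes the fastai2 package layout

class EpochResampler:
    "Hypothetical helper: holds the full dataframe and re-draws n rows on demand."
    def __init__(self, df, n):
        self.df, self.n = df, n
        self.resample()

    def resample(self):
        # a fresh random draw; this is what get_items returns for the next epoch
        self.current = self.df.sample(self.n).reset_index(drop=True)

    def get_items(self, source):
        # pass this as DataBlock(get_items=resampler.get_items, ...)
        return self.current

class ResampleCallback(Callback):
    "Re-draw the sample at the end of every epoch."
    def __init__(self, resampler): self.resampler = resampler
    def after_epoch(self): self.resampler.resample()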

I've created a Tuple class that inherits from tuple but adds some nice things - particularly element-wise ops. This is especially useful for arithmetic with image shapes. For instance, here's a refactoring I just did in crop_pad:

So much nicer! :slight_smile:
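
For anyone curious what element-wise ops buy you, here's a toy sketch of the idea (not the actual fastai implementation, just the pattern): shape arithmetic like halving a size or subtracting a padding becomes a single expression instead of a zip/loop.

class Tuple(tuple):
    "Toy version: a tuple that broadcasts arithmetic element-wise."
    def _op(self, other, op):
        # allow Tuple <op> Tuple as well as Tuple <op> scalar
        if not isinstance(other, (tuple, list)): other = [other] * len(self)
        return type(self)(op(a, b) for a, b in zip(self, other))

    def __add__(self, other): return self._op(other, lambda a, b: a + b)
    def __sub__(self, other): return self._op(other, lambda a, b: a - b)
    def __mul__(self, other): return self._op(other, lambda a, b: a * b)
    def __floordiv__(self, other): return self._op(other, lambda a, b: a // b)

# e.g. the kind of shape arithmetic that shows up in crop_pad:
sz = Tuple((480, 640))
print(sz // 2)         # (240, 320)
print(sz - (100, 50))  # (380, 590)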

not T? I guess that's taken by tensor
next up - dictionaries :smiley:

The approach @radek describes is closer to what sounds sensible to me. Did you try that and find it didn't work out for you?

It's not used for that any more. But I don't think Tuple is used enough to be worth a single-char symbol - at least not yet…