I haven't tried anything yet; I was just sharing how I thought I could do it, because I have this datablock with a bunch of callbacks that almost does it anyway. I didn't actually know there was a sampler until he mentioned it, because I haven't used it.
I'll understand his way first and try it when I get there, after I get the predictions working. Appreciate the redirect.
The best way to understand DataLoader and TfmdDL is to read their source code and tests. They're extremely simple in terms of implementation, but extremely flexible. We'll need lots of tutorial examples to show how to take full advantage of this. Examples created for Kaggle competitions will be fantastic role models for others to follow - I'm happy to code review any draft approaches that you all come up with. It's possible you'll find things that aren't easy to do, in which case we can modify the library until it is easy.
I'll address more things during the day and will share whatever code I end up writing on sampling / inference. Part of the benefit of putting in writing what I struggled with yesterday was that I think I came up with ways to work around those issues. But the problem is that I would not like to work around the framework. Not in the sense that I don't want to put in the time or that I don't know how to - I've done that before and would be happy to do it again. It's just that I feel there is less value for everyone in this approach.
For inference in a Flask app, I think I would probably want to call the model on a tensor directly? That's what I have done in the past on a project for a fashion startup. Even if not a permanent solution, maybe at least in the interim that would be OK. I would still use the building blocks that v2 provides, the context managers, etc. I would also learn a bit more about the framework, so that would be nice as well. We probably want to forgo the whole DataLoader mechanism for inference anyhow.
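To make the "call the model on a tensor directly" idea concrete, here is a minimal sketch of the pattern. Everything in it (`preprocess`, `toy_model`, `decode_pred`, the normalization stats, the class names) is a stand-in I made up for illustration, not fastai v2 API; with a real PyTorch model you'd also put the model in eval mode and wrap the call in `torch.no_grad()`.

```python
# Hedged sketch: direct-model inference, skipping the DataLoader machinery.
# All names here are illustrative stand-ins, not fastai v2 API.

def preprocess(pixels, mean=0.5, std=0.25):
    """Normalize raw values the same way the training pipeline did."""
    return [(p - mean) / std for p in pixels]

def toy_model(x):
    """Stand-in for `model(x)`: returns per-class scores."""
    s = sum(x)
    return [s, -s]  # two-class scores

def decode_pred(scores, classes=("cat", "dog")):
    """Map the argmax score back to a class label."""
    return classes[scores.index(max(scores))]

def predict(pixels):
    # the whole "inference endpoint" in one call:
    # normalize -> forward pass -> decode
    return decode_pred(toy_model(preprocess(pixels)))

print(predict([0.9, 0.8, 0.7]))  # -> cat
```

The point of the sketch is the shape of the code path: whatever a Flask handler receives gets preprocessed exactly as during training, passed to the model once, and decoded, with no batching or DataLoader involved.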
For inference in a Kaggle competition, I think I could figure out how to get the augmented tensors and manually iterate over the dataloader using context managers, setting the model to eval, stopping to recalculate BN values, etc. I could do the TTA manually, if it even makes sense in the context of this competition. Maybe going this route is not a bad idea? I am just thinking out loud, but it's a bit hard for me to draw the line between what the framework could do out of the box vs piecing functionality together from components. I think the answer is fairly simple though - whatever the framework doesn't do, and is not easily achieved using callbacks / subclassing and overriding a simple method, would probably best be built from individual 'lego' bricks and shared if possible, so others have a chance of referencing it.
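The manual TTA loop described above reduces to a small pattern: run the model on several augmented views of each item and average the predictions. Here is a hedged sketch of just that pattern; `flip` and `linear_model` are toy stand-ins (assumptions for illustration), not fastai v2's TTA implementation.

```python
# Hedged sketch of manual test-time augmentation (TTA):
# predict on the original item plus each augmented copy, then average.

def flip(x):
    """Toy augmentation: reverse the sequence (stand-in for a real transform)."""
    return list(reversed(x))

def linear_model(x, weights=(0.2, 0.3, 0.5)):
    """Stand-in for an eval-mode model forward pass: a fixed linear scorer."""
    return sum(w * xi for w, xi in zip(weights, x))

def tta_predict(item, tfms):
    """Average predictions over the identity view plus each augmentation."""
    views = [item] + [t(item) for t in tfms]
    preds = [linear_model(v) for v in views]
    return sum(preds) / len(preds)

print(tta_predict([1.0, 2.0, 3.0], tfms=[flip]))  # averages 2.3 and 1.7 -> 2.0
```

In a real competition pipeline the loop would iterate over a dataloader with the model in eval mode and gradients disabled, but the averaging logic is exactly this.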
Those are just some random thoughts, but I am trying to orient myself in how to get things done with v2 while still being a good member of the community. In particular, I hope that neither Jeremy nor Sylvain mind what I write - I do respect your time, and I neither want to impede your work in whatever order you deem best (seems text models have been getting a lot of love recently!), nor is it my intention to waste your time on some Radek goose chases. BTW, you seem very good at not letting people on the Internet impede your work, so maybe there is nothing there to worry about.
Could I please ask if the Adam optimizer in v2 is of the AdamW variety? Also, does it handle setting the eps for us?
Is there some description you could please point me to of the TTA trick where you train for a little while and then search for useful transforms? Probably the biggest question on my mind - would training a little mean training for a couple of epochs, until the model just starts doing something useful? Roughly how much training should one do here? Or is it training to the point where the model achieves 80%-90% of its final accuracy - a point that can be reached rather quickly, but where the model can still benefit from further training?
Adam in v2 is AdamW unless you pass true_wd=False. You can also set the eps, which defaults to 1e-5 (larger than the default value in PyTorch, but we found it led to more stable training and allowed slightly bigger learning rates).
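To illustrate what `true_wd` changes, here is a hedged single-parameter sketch of the difference between classic L2 weight decay and AdamW's decoupled decay. This is a from-scratch illustration of the standard AdamW idea, not fastai v2's actual optimizer code; the default `eps=1e-5` below just mirrors the value mentioned above.

```python
import math

# Hedged sketch: with L2 decay, wd*p is folded into the gradient before
# the Adam moments are computed; with decoupled (AdamW-style) decay,
# lr*wd*p is subtracted from the parameter directly.

def adam_step(p, grad, m, v, lr=1e-3, beta1=0.9, beta2=0.99,
              eps=1e-5, wd=0.01, true_wd=True, t=1):
    if not true_wd:
        grad = grad + wd * p          # classic L2: decay goes through the gradient
    m = beta1 * m + (1 - beta1) * grad        # first moment
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    if true_wd:
        p = p - lr * wd * p           # decoupled (AdamW-style) weight decay
    return p, m, v
```

Because Adam rescales the gradient by the second moment, folding the decay into the gradient (L2) and applying it directly to the parameter (AdamW) give genuinely different updates, which is the point of the `true_wd` flag.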
The second one above is thanks to the new Tuple class that I recently added, which supports broadcasted arithmetic ops on tuples - really helpful for arithmetic on array shapes, as you see here.
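For readers who haven't seen the source, here is a hedged minimal sketch of what a broadcasting tuple can look like. The class name `BTuple` and its three ops are my own illustration, not fastai v2's actual Tuple implementation, which is richer.

```python
# Hedged sketch of a tuple with broadcasted arithmetic: an op with a
# scalar applies elementwise; an op with an equal-length sequence pairs
# elements up. Not fastai v2's real Tuple class.

class BTuple(tuple):
    def _op(self, other, f):
        if isinstance(other, (list, tuple)):
            assert len(self) == len(other), "length mismatch"
            return BTuple(f(a, b) for a, b in zip(self, other))
        return BTuple(f(a, other) for a in self)  # scalar broadcast

    def __add__(self, other): return self._op(other, lambda a, b: a + b)
    def __mul__(self, other): return self._op(other, lambda a, b: a * b)
    def __floordiv__(self, other): return self._op(other, lambda a, b: a // b)

# handy for arithmetic on array shapes:
shape = BTuple((64, 128))
print(shape // 2)      # -> (32, 64)
print(shape + (8, 8))  # -> (72, 136)
```

Note that `__add__` here deliberately replaces tuple concatenation with elementwise addition, which is exactly the behavior change that makes shape arithmetic convenient.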
In v2, is there a way to create the X dataset using a Pipeline from source items, but the y labels from a CSV file? I am trying to convert .dcm images into the X dataset, while the y labels come from a .csv file in the RSNA Kaggle challenge.
Easiest would be to combine everything you need (filenames and labels) into a DataFrame, and then follow any of the sample approaches for Planet in nb 50. Let us know how you go!
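The core of the suggestion is the join: pair each filename with its label before building anything else. Here is a hedged stdlib sketch of that step; in practice you'd do this with a pandas DataFrame as suggested above, and the column names `ID` and `Label` are assumptions for illustration, not the RSNA csv's real schema.

```python
import csv, io

# Hedged sketch: join filenames to labels from a CSV using only the stdlib.
# Column names `ID` / `Label` and the filenames are made up for illustration.
labels_csv = io.StringIO("ID,Label\nimg_001,1\nimg_002,0\n")
label_of = {row["ID"]: int(row["Label"]) for row in csv.DictReader(labels_csv)}

items = ["img_001.dcm", "img_002.dcm"]   # e.g. from a folder listing
pairs = [(fn, label_of[fn.split(".")[0]]) for fn in items]
print(pairs)  # -> [('img_001.dcm', 1), ('img_002.dcm', 0)]
```

Once filenames and labels live in one table, any of the standard labelling approaches (like the Planet examples in nb 50) can read both columns from the same row.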
I guess that’s more of a feature request:
And please let me know if that’s the right place to ask or if there is another forum / discussion for that.
Are there any plans to add a SOTA NLP zoo similar to huggingface/transformers, which ships 32+ pre-trained PyTorch models including (Ro)BERTa, GPT-2, and XLM?
Maybe as a “contrib” branch?
They already have some Swift / CoreML ports, so on the surface that looks like a nice complement to the recent Swift.AI work, but I guess that might need to be examined a bit more.
I am well aware that fast.ai already comes with ULMFiT and similar models, but it would be wonderful to add GPT-2 and BERT to enable new kinds of application development. For example, the folks at TabNine use the GPT-2 model to train a predictive code-autocomplete / code-snippet generator for several programming languages, and that is a pretty useful application.