Fastai v2 chat

I think that comes from scikit learn, it’s not a V2 thing. Labels are encoded as 0 and 1 in a binary classification task, so you need to say which one is the positive class to compute precision and recall properly. It’s not about the number of classes, it’s about the type of errors.

How do you determine that? My classes are strings, and I assume fastai does some conversion under the hood. Is there a mapping showing which of my classes is 0 and which is 1?

You can check your class names with db.vocab (where db is the name of your databunch) and if you index into db.vocab you can retrieve the corresponding class name of class 0 or 1.
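For instance (a minimal sketch — `db` stands in for whatever your databunch/DataLoaders is called, and the class names are made up):

```python
# `db` is assumed to be your databunch/DataLoaders built from string labels
print(db.vocab)       # e.g. ['cat', 'dog'] -- position in this list is the encoded label
print(db.vocab[0])    # the class encoded as 0
print(db.vocab[1])    # the class encoded as 1 (usually treated as the positive class)
```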

Yes, that’s supposed to be the case. If you have two classes, say cat and dog, precision and recall will be different depending on which class is the positive class. Take the example of recall (sketched in code below the list):

  • if cat is 0 and dog is 1: of all true dog images, how many were correctly predicted as dog?
  • if cat is 1 and dog is 0: of all true cat images, how many were correctly predicted as cat?
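To see this concretely, here is a small scikit-learn sketch (the label arrays are made up; only pos_label changes between the two calls):

```python
from sklearn.metrics import recall_score

# 0 = cat, 1 = dog in this made-up example
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# Of all true dogs (label 1), how many were predicted as dog?
print(recall_score(y_true, y_pred, pos_label=1))  # 2/3 ≈ 0.67

# Of all true cats (label 0), how many were predicted as cat?
print(recall_score(y_true, y_pred, pos_label=0))  # 1/2 = 0.50
```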

Are there any plans to do anything with https://www.fast.ai/2020/01/13/self_supervised/ in fastai2? (Just saw the post today.) My study group is currently re-implementing SimCLR in fastai2, and I was wondering if we could contribute something.

1 Like

IIRC Jeremy said it would possibly be in part 2 (this was said over Twitter)

Also it’s been implemented (in case you wanted to check/compare notes :slight_smile: )

3 Likes

Thank you! Yeah, sadly I can’t figure out how to use Twitter, due to the endless stream of Tweets. I probably skip more than I actually read. (I follow 12 people)

Thank you for the link to the notebook as well. Will be great to compare. How do you keep up with all of these things? Twitter as well?

1 Like

For the most part I keep an eye on Imagenette every so often to see what’s up there (that’s where that implementation of SimCLR came from), Twitter (I follow 88 specific people :wink: ), and occasionally Papers with Code. (Sadly, mostly Twitter.)

1 Like

One issue I have come across a few times now is forgetting that when you define your own custom loss function, it tends to need an activation function and a decodes method. Otherwise, how do you translate the output of the model into labels?

The decodes method on the loss function doesn’t show up anywhere that I know of… would it be useful to add a warning when adding a loss function without it? Or, since this mostly shows up with learn.show_results()/get_preds, maybe a warning when those are called?
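For clarity, here’s roughly what I mean — a minimal sketch of a custom loss with the two extra methods (the class name and the cross-entropy body are just for illustration):

```python
import torch.nn.functional as F

class MyCrossEntropy:
    "Sketch of a custom loss that get_preds/show_results can decode."
    def __call__(self, output, target):
        # raw model outputs (logits) in, scalar loss out
        return F.cross_entropy(output, target)

    def activation(self, output):
        # turns raw logits into probabilities for get_preds/show_results
        return F.softmax(output, dim=-1)

    def decodes(self, output):
        # turns the (activated) outputs into predicted class indices
        return output.argmax(dim=-1)
```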

Possibly. Or in the metric documentation at the very beginning. It’s not absolutely mandatory, so I worry about a “check”: see the fact that rmspe, etc. don’t have it.

That is true, some losses will not require decodes/activation.

I was mostly talking about the very specific case of a custom loss though, i.e. one not in the fastai library. My main problem with adding it to the documentation is that the fastai loss functions do include information about this case: http://dev.fast.ai/layers

My problem here is that I did not know my loss function had a problem, because I keep forgetting loss functions have decodes methods.

My initial form of debugging was to check the transforms, but the decoding is not present there, and therefore I could not find the code responsible for the decodes.

What about including it as a part of the regular transforms pipeline instead of the loss function?

Maybe loss functions could have a transform, that is then pulled off and added to the transforms pipeline at learner creation?

These are just ideas; I just think having one decodes method that isn’t part of the transforms is a bit confusing from a user’s perspective. That’s not to say I don’t sort of see why it is this way… for example, you wouldn’t want to apply the loss function’s decodes to the actual labels, which would generally happen if you simply added it to the transformation pipeline.

Edit: Maybe also a shape check on Categorize decodes? (Is the label shape the same as the predicted labels?)

Does anyone have a working example of an ensemble of (vision) models using fastai2? I can see a Callback as a solution for this, but I don’t know whether it’ll backpropagate the gradients through both models.

Do you mean like a multi-modal model of models? What is the scenario? Usually ensemble-based designs are inference-only, so you train each model individually.

Multiple vision models (not multimodal). It makes sense to do inference only, so should I use get_preds to do the job? I’m looking for a way to benchmark the results on the validation set using the Learner’s validate method, since I’ll get the values for all the metrics. So can I plug this behavior in somewhere in the training loop, but have it only run on the validation step?

Yup. So just your standard ensembling, i.e. get_preds on each model, combine and average the predictions, then use the labels from one of them to check it via accuracy (or whatever metric).

An example is here, where I did a forward and backward LM with SentencePiece too: https://github.com/muellerzr/fastai-Experiments-and-tips/blob/master/IMDB%20Spacy%20with%20SentencePiece/SentencePiece%20plus%20Spacy.ipynb (v1, but the averaging towards the end is still the same).
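In rough terms, the averaging would look something like this (a sketch; learn1 and learn2 are assumed to be two trained Learners evaluated on the same validation set):

```python
# Probabilities and targets from each trained learner on the same validation set
preds1, targs = learn1.get_preds()
preds2, _     = learn2.get_preds()

# Average the predicted probabilities, then check accuracy against the labels
avg_preds = (preds1 + preds2) / 2
acc = (avg_preds.argmax(dim=1) == targs).float().mean()
print(acc)
```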

1 Like

On the ensemble subject, does anyone know how to ensemble multiclass models? I understand @muellerzr’s approach of averaging the predictions in a binary classification problem, but can this still be applied in a multiclass problem? From my understanding, if an average is done in a multiclass problem, it’s possible we will end up with float values instead of integers. How can this be avoided? Or am I totally incorrect?

What about averaging the output probabilities instead of the final predictions?

2 Likes

That’s the method I was trying to describe above as well (and what the code does) :slight_smile:

1 Like

I think in ensembles you are always ‘mixing’ (I should say… ensembling =P ) the predicted probabilities, optionally with different weightings (e.g. in a two-model ensemble, if I think model A is more accurate I can give it a 60% weighting vs. 40% for model B’s output). It makes no sense to ‘average’ the class prediction integers – if 1=cat, 2=dog, 3=cow, two models that gave the highest probabilities to cat and cow respectively obviously do not mean that the real answer should be dog!

I guess we can caveat this by saying: unless there is some form of ordering in your classes where, say, one class really sits in between two others. But in that case, you probably want to do something different anyway, e.g. regression.
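A quick sketch of that weighted mixing on probabilities (the probability tensors and the 60/40 weights are made up for illustration):

```python
import torch

# Made-up predicted probabilities for 2 samples x 3 classes (cat, dog, cow)
probs_a = torch.tensor([[0.7, 0.2, 0.1],
                        [0.1, 0.3, 0.6]])
probs_b = torch.tensor([[0.5, 0.1, 0.4],
                        [0.2, 0.2, 0.6]])

# Weight model A at 60% and model B at 40%, then take the argmax over classes
ensembled = 0.6 * probs_a + 0.4 * probs_b
preds = ensembled.argmax(dim=1)   # tensor([0, 2]) -> cat and cow
```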

Yijin

I am looking for ways to handle an imbalanced dataset for a text classification problem. So far, many have pointed out to me that one should use WeightedRandomSampler or weighted cross-entropy as the loss function.

Changing the loss function did not yield any benefits, so I am looking for ways to use WeightedRandomSampler as part of the training data loader, based on the approach described here.

Could you point to any examples in fastai v2? The closest one I can find is the Kaggle kernel by @ilovescience showing OverSamplingCallback, but this is not yet part of v2.
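Not a fastai v2 example, but for reference, the plain PyTorch version of the sampler idea looks roughly like this (the labels and batch size are made up, and train_ds stands in for your actual Dataset):

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# Made-up integer labels for an imbalanced training set (900 vs. 100 samples)
labels = torch.tensor([0] * 900 + [1] * 100)

# Weight each sample by the inverse frequency of its class
class_counts = torch.bincount(labels).float()
sample_weights = 1.0 / class_counts[labels]

sampler = WeightedRandomSampler(weights=sample_weights,
                                num_samples=len(labels),
                                replacement=True)

# The sampler replaces shuffling in the training DataLoader
# train_dl = DataLoader(train_ds, batch_size=64, sampler=sampler)
```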

2 Likes