Fastai v2 chat

I think that comes from scikit learn, it’s not a V2 thing. Labels are encoded as 0 and 1 in a binary classification task, so you need to say which one is the positive class to compute precision and recall properly. It’s not about the number of classes, it’s about the type of errors.

How do you determine that? My classes are strings, and I assume fastai does some conversion under the hood. Is there a mapping showing which of my classes is 0 and which is 1?

You can check your class names with db.vocab (where db is the name of your databunch) and if you index into db.vocab you can retrieve the corresponding class name of class 0 or 1.
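For instance (a minimal sketch — `db` stands in for whatever your databunch/DataLoaders is called, and the class names are made up):

```python
# `db` is assumed to be your databunch/DataLoaders built from string labels
print(db.vocab)       # e.g. ['cat', 'dog'] -- position in this list is the encoded label
print(db.vocab[0])    # the class encoded as 0
print(db.vocab[1])    # the class encoded as 1 (usually treated as the positive class)
```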

Yes, that’s supposed to be the case. If you have two classes, say cat and dog, precision and recall will be different depending on which class is the positive class. Take the example of recall (sketched in code below the list):

  • if cat is 0 and dog is 1: of all true dog images, how many were correctly predicted as dog?
  • if cat is 1 and dog is 0: of all true cat images, how many were correctly predicted as cat?
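To see this concretely, here is a small scikit-learn sketch (the label arrays are made up; only pos_label changes between the two calls):

```python
from sklearn.metrics import recall_score

# 0 = cat, 1 = dog in this made-up example
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# Of all true dogs (label 1), how many were predicted as dog?
print(recall_score(y_true, y_pred, pos_label=1))  # 2/3 ≈ 0.67

# Of all true cats (label 0), how many were predicted as cat?
print(recall_score(y_true, y_pred, pos_label=0))  # 1/2 = 0.50
```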

Are there any plans to do anything with https://www.fast.ai/2020/01/13/self_supervised/ in fastai2? (Just saw the post today.) My study group is currently re-implementing SimCLR in fastai2, and I was wondering if we could contribute something.

1 Like

IIRC Jeremy said it would possibly be in part 2 (this was said over Twitter)

Also it’s been implemented (in case you wanted to check/compare notes :slight_smile: )

3 Likes

Thank you! Yeah, sadly I can’t figure out how to use Twitter, due to the endless stream of Tweets. I probably skip more than I actually read. (I follow 12 people)

Thank you for the link to the notebook as well. Will be great to compare. How do you keep up with all of these things? Twitter as well?

1 Like

For the most part I keep an eye on Imagenette every so often to see what’s up there (that’s where that implementation of SimCLR came from), Twitter (I follow 88 specific people :wink: ), and occasionally Papers with Code. (Sadly, mostly Twitter.)

1 Like

One issue I have come across a few times now is forgetting that when you define your own custom loss function, it tends to need an activation function and a decodes method. Otherwise, how do you translate the output of the model into labels?

The decodes method on the loss function doesn’t show up anywhere that I know of… would it be useful to add a warning when adding a loss function without it? Or, since this mostly shows up with learn.show_results()/get_preds, maybe a warning when those are called?
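For clarity, here’s roughly what I mean — a minimal sketch of a custom loss with the two extra methods (the class name and the cross-entropy body are just for illustration):

```python
import torch.nn.functional as F

class MyCrossEntropy:
    "Sketch of a custom loss that get_preds/show_results can decode."
    def __call__(self, output, target):
        # raw model outputs (logits) in, scalar loss out
        return F.cross_entropy(output, target)

    def activation(self, output):
        # turns raw logits into probabilities for get_preds/show_results
        return F.softmax(output, dim=-1)

    def decodes(self, output):
        # turns the (activated) outputs into predicted class indices
        return output.argmax(dim=-1)
```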

Possibly. Or in the metric documentation at the very beginning. It’s not absolutely mandatory, so I worry about a “check”: see the fact that rmspe, etc. don’t have it.

That is true, some losses will not require decodes/activation.

I was mostly talking about the very specific case of a custom loss though, i.e. one not in the fastai library. My main problem with adding it to the documentation is that the fastai loss functions do include information about this case: http://dev.fast.ai/layers

My problem here is that I did not know my loss function had a problem, because I keep forgetting loss functions have decodes methods.

My initial form of debugging was to check the transforms, but the decoding is not present there, and therefore I could not find the code responsible for the decodes.

What about including it as a part of the regular transforms pipeline instead of the loss function?

Maybe loss functions could have a transform, that is then pulled off and added to the transforms pipeline at learner creation?

These are just ideas; I just think having one decodes method that isn’t part of the transforms is a bit confusing from a user’s perspective. That’s not to say I don’t sort of see why it is this way… for example, you wouldn’t want to apply the loss function’s decodes to the actual labels, which would generally happen if you simply added it to the transformation pipeline.

Edit: Maybe also a shape check on Categorize decodes? (Is the label shape the same as the predicted labels?)

Does anyone have a working example of an ensemble of (vision) models using fastai2? I can see a Callback as a solution for this, but I don’t know whether it’ll backpropagate the gradients through both models.

Do you mean like a multi-modal model of models? What is the scenario? Usually ensemble-based designs are inference-only, so you train each model individually.

Multiple vision models (not multimodal). It makes sense to do inference only, so should I use get_preds to do the job? I’m looking for a way to benchmark the results on the validation set using the Learner’s validate method, since I’ll get the values for all the metrics. So can I plug this behavior in somewhere in the training loop, but have it only run on the validation step?

Yup. So just your standard ensembling, i.e. get_preds on each model, combine and average the predictions, then use the labels from one of them to check it via accuracy (or whatever metric).

An example is here, where I did a forward and backward LM with SentencePiece too: https://github.com/muellerzr/fastai-Experiments-and-tips/blob/master/IMDB%20Spacy%20with%20SentencePiece/SentencePiece%20plus%20Spacy.ipynb (v1, but the averaging towards the end is still the same).
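In rough terms, the averaging would look something like this (a sketch; learn1 and learn2 are assumed to be two trained Learners evaluated on the same validation set):

```python
# Probabilities and targets from each trained learner on the same validation set
preds1, targs = learn1.get_preds()
preds2, _     = learn2.get_preds()

# Average the predicted probabilities, then check accuracy against the labels
avg_preds = (preds1 + preds2) / 2
acc = (avg_preds.argmax(dim=1) == targs).float().mean()
print(acc)
```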

1 Like

On the ensemble subject, does anyone know how to ensemble multiclass models? I understand @muellerzr’s approach of averaging the predictions in a binary classification problem, but can this still be applied in a multiclass problem? From my understanding, if an average is done in a multiclass problem, it’s possible we will end up with float values instead of integers. How can this be avoided? Or am I totally incorrect?

What about averaging the output probabilities instead of the final predictions?

2 Likes

That’s the method I was trying to describe above as well (and what the code does) :slight_smile:

1 Like

I think in ensembles you are always ‘mixing’ (I should say… ensembling =P ) the predicted probabilities, optionally with different weightings (e.g. in a two-model ensemble, if I think model A is more accurate I can give it a 60% weighting vs. 40% for model B’s output). It makes no sense to ‘average’ the class prediction integers – if 1=cat, 2=dog, 3=cow, two models that gave the highest probabilities to cat and cow respectively obviously do not mean that the real answer should be dog!

I guess we can caveat this by saying: unless there is some form of ordering in your classes where, say, one class really sits in between two others. But in that case, you probably want to do something different anyway, e.g. regression.
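A quick sketch of that weighted mixing on probabilities (the probability tensors and the 60/40 weights are made up for illustration):

```python
import torch

# Made-up predicted probabilities for 2 samples x 3 classes (cat, dog, cow)
probs_a = torch.tensor([[0.7, 0.2, 0.1],
                        [0.1, 0.3, 0.6]])
probs_b = torch.tensor([[0.5, 0.1, 0.4],
                        [0.2, 0.2, 0.6]])

# Weight model A at 60% and model B at 40%, then take the argmax over classes
ensembled = 0.6 * probs_a + 0.4 * probs_b
preds = ensembled.argmax(dim=1)   # tensor([0, 2]) -> cat and cow
```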

Yijin

I am looking for ways to handle an imbalanced dataset for a text classification problem. So far, many have pointed out to me that one should use WeightedRandomSampler or weighted cross-entropy as the loss function.

Changing the loss function did not yield any benefits, so I am looking for ways to use WeightedRandomSampler as part of the training data loader, based on the approach described here.

Could you point to any examples in fastai v2? The closest one I can find is the Kaggle kernel by @ilovescience showing OverSamplingCallback, but this is not yet part of v2.
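Not a fastai v2 example, but for reference, the plain PyTorch version of the sampler idea looks roughly like this (the labels and batch size are made up, and train_ds stands in for your actual Dataset):

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# Made-up integer labels for an imbalanced training set (900 vs. 100 samples)
labels = torch.tensor([0] * 900 + [1] * 100)

# Weight each sample by the inverse frequency of its class
class_counts = torch.bincount(labels).float()
sample_weights = 1.0 / class_counts[labels]

sampler = WeightedRandomSampler(weights=sample_weights,
                                num_samples=len(labels),
                                replacement=True)

# The sampler replaces shuffling in the training DataLoader
# train_dl = DataLoader(train_ds, batch_size=64, sampler=sampler)
```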

2 Likes