FYI you need triple backticks on their own line to do code blocks. I’ve edited your post for you.
Thank you Jeremy.
Note that in v2, metrics are different from v1; you have to fill in this template:
```python
class Metric():
    "Blueprint for defining a metric"
    def reset(self): pass
    def accumulate(self, learn): pass
    @property
    def value(self): raise NotImplementedError
    @property
    def name(self): return class2attr(self, 'Metric')
```
where `reset` is like what you do at `on_epoch_begin`, `accumulate` corresponds to `on_batch_end`, and `value` is like `on_epoch_end`.
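For example, a minimal metric following that template might look like the sketch below (illustrative only; it assumes `learn.pred` and `learn.y` are same-shaped tensors and simply averages an elementwise error):

```python
class MAEMetric(Metric):
    "Illustrative sketch: mean absolute error accumulated over an epoch."
    def reset(self): self.total, self.count = 0., 0
    def accumulate(self, learn):
        pred, targ = learn.pred, learn.y
        self.total += (pred - targ).abs().sum().item()
        self.count += targ.numel()
    @property
    def value(self): return self.total / self.count if self.count else None
    @property
    def name(self): return "mae"
```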
For your specific problem, however, there is already a class called `AccumMetric` that gathers targets and predictions and then applies any function you like to them. And `roc_auc_score` from scikit-learn is already wrapped in `RocAuc`.
Ah, thank you. I'm looking at the source code and am aware of `RocAuc`, but its description says it's restricted to binary classification problems, whereas in my case I have 3 categories.
According to the sklearn docs it can handle multilabel too: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html . I haven’t tried it myself.
As far as I know, multilabel is when each sample carries multiple labels, like lesson 3's Amazon satellite images. That's different from this case, where we have multiple classes but one label per sample. Please correct me if I'm wrong.
You're not wrong - although I'm not sure exactly what behavior you're expecting for multinomial AUC. It's not something I've really looked into, but I believe it's generally handled by reducing it to per-class binomial (one-vs-rest) problems…
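For reference, recent scikit-learn releases (0.22+) can compute that kind of averaged one-vs-rest AUC directly from class probabilities; a quick sketch with made-up data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical: 3-class targets and softmax probabilities (rows sum to 1)
y_true  = np.array([0, 2, 1, 0, 2, 1])
y_probs = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.3, 0.6],
                    [0.2, 0.5, 0.3],
                    [0.6, 0.3, 0.1],
                    [0.1, 0.1, 0.8],
                    [0.3, 0.4, 0.3]])

# One-vs-rest macro-averaged AUC (requires scikit-learn >= 0.22)
print(roc_auc_score(y_true, y_probs, multi_class='ovr', average='macro'))
```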
I'm working on ISIC 2017 Task 3 (lesion classification), where there are 3 classes: Melanoma, Seborrheic Keratosis, and Nevus. Its ranking metric is the AUC of the first two classes (one AUC for each).
I will try to use Sylvain's template. Thank you.
I'm also looking for a proper way of solving single-label image classification with multiple categories, but I have no clue how to implement it using the v2 metrics…
I found an example snippet on the web (a self-contained example of the behavior):
```python
from sklearn.metrics import roc_curve, auc
from sklearn import datasets
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC
from sklearn.preprocessing import label_binarize
#from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
%matplotlib

iris = datasets.load_iris()
X, y = iris.data, iris.target
y = label_binarize(y, classes=[0, 1, 2])
n_classes = 3

# shuffle and split training and test sets
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.33, random_state=0)

# classifier
clf = OneVsRestClassifier(LinearSVC(random_state=0))
y_score = clf.fit(X_train, y_train).decision_function(X_test)

# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Plot of a ROC curve for a specific class
for i in range(n_classes):
    plt.figure()
    plt.plot(fpr[i], tpr[i], label='ROC curve (area = %0.2f)' % roc_auc[i])
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic example')
    plt.legend(loc="lower right")
    plt.show()
```
For version 1, two methods are defined:
```python
def roc_curve(input:Tensor, targ:Tensor):
    "Computes the receiver operator characteristic (ROC) curve by determining the true positive ratio (TPR) and false positive ratio (FPR) for various classification thresholds. Restricted binary classification tasks."
    targ = (targ == 1)
    desc_score_indices = torch.flip(input.argsort(-1), [-1])
    input = input[desc_score_indices]
    targ = targ[desc_score_indices]
    d = input[1:] - input[:-1]
    distinct_value_indices = torch.nonzero(d).transpose(0,1)[0]
    threshold_idxs = torch.cat((distinct_value_indices, LongTensor([len(targ) - 1]).to(targ.device)))
    tps = torch.cumsum(targ * 1, dim=-1)[threshold_idxs]
    fps = (1 + threshold_idxs - tps)
    if tps[0] != 0 or fps[0] != 0:
        fps = torch.cat((LongTensor([0]), fps))
        tps = torch.cat((LongTensor([0]), tps))
    fpr, tpr = fps.float() / fps[-1], tps.float() / tps[-1]
    return fpr, tpr
```
and
```python
def auc_roc_score(input:Tensor, targ:Tensor):
    "Computes the area under the receiver operator characteristic (ROC) curve using the trapezoid method. Restricted binary classification tasks."
    fpr, tpr = roc_curve(input, targ)
    d = fpr[1:] - fpr[:-1]
    sl1, sl2 = [slice(None)], [slice(None)]
    sl1[-1], sl2[-1] = slice(1, None), slice(None, -1)
    return (d * (tpr[tuple(sl1)] + tpr[tuple(sl2)]) / 2.).sum(-1)
```
and they are called per class like this (note both take `(input, targ)`, and `auc_roc_score` computes the area directly):
```python
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    # convert the numpy columns to tensors; fastai's roc_curve returns only (fpr, tpr)
    scores, targs = torch.tensor(y_score[:, i]), torch.tensor(y_test[:, i])
    fpr[i], tpr[i] = roc_curve(scores, targs)
    roc_auc[i] = auc_roc_score(scores, targs)
```
What is the proper way of doing this in version 2 using metrics?
For anyone with the same interest in per-class AUC, I'm using the following in v2, thanks to the template given by Sylvain, with a predefined `classes = [...]`, since I'm not yet able to extract the dbunch's classes like `data.classes` in v1. Any fixes or refactoring towards a standard metric are welcome.
```python
class AUC(Metric):
    "AUC score for each class in single-label multi-class classifications."
    def __init__(self, main_class=0):
        super().__init__()
        self.main_class = main_class
    def reset(self): self.targs, self.preds = [],[]
    def accumulate(self, learn):
        pred = learn.pred
        targ = learn.y
        pred, targ = to_detach(pred), to_detach(targ)
        self.preds.append(pred)
        self.targs.append(targ)
    @property
    def value(self):
        if len(self.preds) == 0: return
        preds = torch.cat(self.preds)
        targs = torch.cat(self.targs)
        idx = (targs==self.main_class)
        targs = torch.zeros(targs.size())
        targs[idx] = 1
        preds = F.softmax(preds, dim=1)[:, self.main_class]
        return skm.roc_auc_score(targs.cpu().numpy(), preds.cpu().numpy())
    @property
    def name(self): return f'{classes[self.main_class]} AUC'
```
Thank you for sharing. I modified it a bit:
```python
class AUC(Metric):
    "AUC score for each class in single-label multi-class classifications."
    def __init__(self, main_class=0, classes=noop):
        super().__init__()
        self.main_class = main_class
        self.classes = classes
    def reset(self): self.targs, self.preds = [],[]
    def accumulate(self, learn):
        pred = learn.pred
        targ = learn.y
        pred, targ = to_detach(pred), to_detach(targ)
        self.preds.append(pred)
        self.targs.append(targ)
    @property
    def value(self):
        if len(self.preds) == 0: return
        preds = torch.cat(self.preds)
        targs = torch.cat(self.targs)
        idx = (targs==self.main_class)
        targs = torch.zeros(targs.size())
        targs[idx] = 1
        preds = F.softmax(preds, dim=1)[:, self.main_class]
        return skm.roc_auc_score(targs.cpu().numpy(), preds.cpu().numpy())
    @property
    def name(self): return f'{self.classes[self.main_class]} AUC'
```
and used it like this:
```python
metrics = [accuracy] + [AUC(c, databunch.vocab) for c in range(databunch.c)]

def get_learner2():
    learn = cnn_learner(databunch, xresnet50, opt_func=opt_func, metrics=metrics)
    return learn.to_fp16()
```
That makes sense. Thank you, I appreciate the feedback.
Let me ask a different question (still on segmentation). I'm trying to build a custom dataset right now, with the databunch and Learner generated like so:
```python
ds12 = DataBlock(blocks=(ImageBlock, ImageBlock(cls=PILMask)),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 get_y=lambda o: path/'masks_machine'/f'{o.stem}{o.suffix}')

dbunch = ds12.databunch(path/'img', bs=4, item_tfms=Resize(224),
                        batch_tfms=[*aug_transforms(size=224), Normalize(*imagenet_stats)])

codes = ['Person', 'Background', 'Other']
dbunch.vocab = codes

learn = unet_learner(dbunch, resnet50, metrics=accuracy)
```
When I do `learn.lr_find()` I get a size mismatch error:
```
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-17-d81c6bd29d71> in <module>()
----> 1 learn.lr_find()
8 frames
/usr/local/lib/python3.6/dist-packages/fastai2/callback/schedule.py in lr_find(self, start_lr, end_lr, num_it, stop_div, show_plot)
188 n_epoch = num_it//len(self.dbunch.train_dl) + 1
189 cb=LRFinder(start_lr=start_lr, end_lr=end_lr, num_it=num_it, stop_div=stop_div)
--> 190 with self.no_logging(): self.fit(n_epoch, cbs=cb)
191 if show_plot: self.recorder.plot_lr_find()
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
279 try:
280 self.epoch=epoch; self('begin_epoch')
--> 281 self._do_epoch_train()
282 self._do_epoch_validate()
283 except CancelEpochException: self('after_cancel_epoch')
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _do_epoch_train(self)
254 try:
255 self.dl = self.dbunch.train_dl; self('begin_train')
--> 256 self.all_batches()
257 except CancelTrainException: self('after_cancel_train')
258 finally: self('after_train')
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in all_batches(self)
232 def all_batches(self):
233 self.n_iter = len(self.dl)
--> 234 for o in enumerate(self.dl): self.one_batch(*o)
235
236 def one_batch(self, i, b):
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in one_batch(self, i, b)
238 try:
239 self._split(b); self('begin_batch')
--> 240 self.pred = self.model(*self.xb); self('after_pred')
241 if len(self.yb) == 0: return
242 self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
--> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/fastai2/layers.py in forward(self, x)
374 for l in self.layers:
375 res.orig = x
--> 376 nres = l(res)
377 # We have to remove res.orig to avoid hanging refs and therefore memory leaks
378 res.orig = None
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
--> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/fastai2/layers.py in forward(self, x)
389 "Merge a shortcut with the result of the module by adding them or concatenating them if `dense=True`."
390 def __init__(self, dense:bool=False): self.dense=dense
--> 391 def forward(self, x): return torch.cat([x,x.orig], dim=1) if self.dense else (x+x.orig)
392
393 #Cell
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 224 and 1205 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
```
In short, it seems to be a bug in the unet architecture, I believe. Actually, it looks like an issue with how the sizes are computed: my databunch doesn't seem to be resizing accordingly.
`size = dbunch.train_ds[0][0].size` gave me `(1205, 800)`; however, `size = dbunch.one_batch()[0].shape[-2:]` does give the correct value, which would then tell me that my item transform wasn't applied in time for the unet to grab it?
Note that since you use `after_item` in the `DataLoader`, it's logical that the items in the dataset are not resized yet. As you said, the sizes are correct in the `DataLoader`. I will look at a unet example and see where this could come from. Can you debug the sizes of `x` and `x.orig` in the meantime? This is probably some wrong padding somewhere, and they are very close but not exactly the same.
So, suddenly, without changing the code whatsoever, it's working this morning. The size being passed in is still (1200, 800) (not the 224 I want to train with), but the architecture trains without the mismatch. However, that's still a large memory usage. Running it through the size debugger now.
Putting this here in case someone needs to do the same and wants to use the debugger: put a `Debugger()` layer just before the layer you want the size of; this drops you into the Python debugger. Step once so it passes through the next layer, then use the `interact` command. Now you can inspect any variables that are in scope, e.g. `x.size()` and `x.orig.size()` (or `result.size()`). To exit, press `CTRL + D`.
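The idea behind such a layer is just a breakpoint inside `forward`; a minimal sketch of the pattern (names are illustrative, and fastai's own `Debugger` layer may differ in detail):

```python
import pdb
import torch.nn as nn

class DebugLayer(nn.Module):
    "Sketch of a debugging layer: drops into pdb so activations can be inspected."
    def forward(self, x):
        pdb.set_trace()   # type `interact`, then e.g. x.size(); CTRL+D to leave
        return x

# Hypothetical usage: splice it in front of the layer you want to inspect
model = nn.Sequential(nn.Conv2d(3, 8, 3), DebugLayer(), nn.ReLU())
```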
Edit: realized that was just the debugging layer.
@sgugger the size being passed in looks right (`[4, 99, 224, 224]`). So it's all working now; I'm just unsure why we're using more memory now (a lot more memory than v1).
Am I correct in noticing that the RandTransforms inside 09_data_augment are not cloning the Image data but are mutating it in place and passing it back? Is this just because, in the Pipeline, things are constantly being created and destroyed each batch? Any reason I shouldn't follow this pattern for audio? Thanks.
Edit: For anyone wondering about timing, removing the tensor clone and the return of a new AudioItem shaved my transform from 400 microseconds down to 100.
Edit 2: Should a transform that is applied 100% of the time, but has a random element, extend RandTransform? Or is that overkill and it is better off as a plain function? (I'm leaning towards the latter.)
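For what it's worth, `RandTransform` does support this pattern: `p=1.` means it always runs, while `before_call` is still invoked once per batch to draw the random state. A rough sketch (the transform name and parameters are illustrative, and the exact API may differ across fastai2 versions):

```python
import random
from fastai2.vision.all import *   # in later releases: from fastai.vision.all import *

class RandBrightness(RandTransform):
    "Illustrative: always applied (p=1.), but draws a fresh random shift each batch."
    def __init__(self, max_change=0.2, p=1., **kwargs):
        super().__init__(p=p, **kwargs)
        self.max_change = max_change
    def before_call(self, b, split_idx):
        super().before_call(b, split_idx)          # sets self.do based on p
        self.change = random.uniform(-self.max_change, self.max_change)
    def encodes(self, x: TensorImage):
        # assumes float images scaled to [0, 1]
        return (x + self.change).clamp(0., 1.)
```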
I have more of a feature request (I'll try to implement it later, after the holidays, if someone who knows more about this particular area hasn't already). One of the biggest debugging headaches with segmentation is when you don't have the proper number of labels: you get a very strange failure in the form of a CUDA assert error. It would be nice to have a hint saying what the expected number of labels is versus what was actually in the databunch. I'm not sure yet how to find this out exactly, but it's an issue I know was present in v1, and this would be a small but great improvement for v2.
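Something like the following rough sketch is what I have in mind (the helper name and exact attributes are hypothetical; it just compares the label values actually present in a batch of masks against the number of codes):

```python
import torch

def check_mask_labels(dbunch, codes):
    "Hypothetical helper: fail early if mask values don't fit the given codes."
    n_classes = len(codes)
    xb, yb = dbunch.one_batch()                 # yb: batch of integer masks
    found = torch.unique(yb.cpu())
    bad = [int(v) for v in found if v < 0 or v >= n_classes]
    if bad:
        raise ValueError(
            f"Masks contain label values {bad} but only {n_classes} codes were given "
            f"({codes}); this mismatch is the usual cause of the CUDA assert during training.")
```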
@jeremy I'm just curious: I see that in the latest commit to the old xresnet you removed the `norm_type` being passed to the stem and to the resnet blocks. Why is that?
ResBlock and ConvLayer have the right norm_type by default, so no need to pass anything.
I was working on style transfer, and for that I would like to use InstanceNorm. Is there no way of doing that with the XResNet then?