Pcamellon
(Pedro Pablo Camellón Quintero)
October 23, 2021, 7:04am
21
Hi! How do I define the weight values? I have a dataset of ~112,000 chest X-ray images. Only ~2,000 of them contain nodules, which are the ones I most want to identify!
Thanks in advance.
Pcamellon
(Pedro Pablo Camellón Quintero)
October 26, 2021, 4:17pm
22
I’ve found this:
self.weights = np.array([0 if i == 0 else 1 / i for i in counts])
self.weights = torch.DoubleTensor((self.weights)[self.labels])
or alternatively
self.weights = np.divide(1, counts, out=np.zeros_like(counts, dtype=np.float64), where=(counts!=0))
self.weights = torch.DoubleTensor((self.weights)[self.labels])
from:
@ilovescience Thanks so much for making this.
One problem I’ve found is that in the most extreme case, when there is a class which appears only once in the entire dataset, the single example may end up in the validation set, meaning the length of counts is less than the number of classes, so it cannot be indexed by self.labels.
I think this could be resolved by changing counts to be the same as self.label_counts (np.bincount instead of np.unique), and then each item in self.weights will be 0 if…
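For anyone who wants to try the np.bincount idea from the quote, here is a minimal sketch of how it might look. The function name per_sample_weights and the num_classes argument are my own placeholders, not part of the original sampler code, and it assumes labels are integers in the range 0..num_classes-1.

```python
import numpy as np

def per_sample_weights(labels, num_classes):
    """Inverse-frequency weight for each sample (hypothetical helper).

    Classes that do not occur in `labels` get a class weight of 0
    instead of causing a division by zero or an indexing error.
    """
    labels = np.asarray(labels, dtype=np.int64)
    # bincount with minlength returns at least one entry per class,
    # even for classes with no samples in this split.
    counts = np.bincount(labels, minlength=num_classes)
    # 1 / count where count != 0; the preallocated zeros stay elsewhere.
    class_weights = np.divide(
        1.0,
        counts,
        out=np.zeros_like(counts, dtype=np.float64),
        where=(counts != 0),
    )
    # Look up the weight of each sample's class.
    return class_weights[labels]

# Toy example: class 1 never appears, class 2 appears once.
weights = per_sample_weights([0, 0, 0, 2], num_classes=3)
print(weights)
```

With weights like these, every class that is present has the same total weight, so the rare class (the nodule images in the question above) would be drawn about as often as the common one. The array can then be converted to a tensor and passed to torch.utils.data.WeightedRandomSampler if that is the sampler you are using.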