I’m surprised that I can’t find anyone asking about this already.
There is a great series of articles on this topic with some PyTorch code available:
To summarize my problem, the loss function looks like this:
    import torch
    import torch.nn as nn

    class QuantileLoss(nn.Module):
        def __init__(self, quantiles):
            super().__init__()
            self.quantiles = quantiles

        def forward(self, preds, target):
            assert not target.requires_grad
            assert preds.size(0) == target.size(0)
            losses = []
            for i, q in enumerate(self.quantiles):
                # column i of preds is the prediction for quantile q
                errors = target - preds[:, i]
                losses.append(torch.max((q - 1) * errors, q * errors).unsqueeze(1))
            loss = torch.mean(torch.sum(torch.cat(losses, dim=1), dim=1))
            return loss
Which you would use like this:
    quantiles = [.05, .5, .95]
    loss_func = QuantileLoss(quantiles)
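To make the expected shapes concrete, here is a minimal vectorized sketch of the same pinball loss. The function name `quantile_loss` and the dummy tensors are my own illustration, not from the articles; it assumes `preds` has shape `(batch, n_quantiles)` and `target` has shape `(batch,)`.

```python
import torch

def quantile_loss(preds, target, quantiles):
    # Hypothetical helper: preds is (batch, n_quantiles), target is (batch,)
    q = torch.tensor(quantiles, dtype=preds.dtype, device=preds.device)
    # Broadcast the single target column against every quantile column
    errors = target.unsqueeze(1) - preds            # (batch, n_quantiles)
    losses = torch.max((q - 1) * errors, q * errors)
    # Sum over quantiles, then average over the batch
    return losses.sum(dim=1).mean()

quantiles = [.05, .5, .95]
preds = torch.zeros(4, 3)   # dummy predictions, one column per quantile
target = torch.ones(4)      # dummy single-valued targets
loss = quantile_loss(preds, target, quantiles)
```

Note that the target stays a single value per sample; only the predictions need one column per quantile.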
The problem I am having is that the network produces only one output, because the data source has only one target. I’m not sure how I would add a custom head with three linear layers that output one value each (assuming my example with three quantiles).
In the example code I cited above the model is built like this:
            # ... inside the model's __init__ ...
            final_layers = [
                nn.Linear(64, 1) for _ in range(len(self.quantiles))
            ]
            self.final_layers = nn.ModuleList(final_layers)

        def forward(self, x):
            tmp_ = self.base_model(x)
            return torch.cat([layer(tmp_) for layer in self.final_layers], dim=1)
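A rough, self-contained sketch of how that pattern could be wrapped around an existing body follows. The class name `QuantileModel`, the `n_features` argument, and the small stand-in body are assumptions for illustration only; in practice the body would be whatever feature extractor the project already uses, with its final single-output layer removed.

```python
import torch
import torch.nn as nn

class QuantileModel(nn.Module):
    """Hypothetical wrapper: a shared body plus one linear output per quantile."""
    def __init__(self, body, n_features, quantiles):
        super().__init__()
        self.body = body
        self.quantiles = quantiles
        self.final_layers = nn.ModuleList(
            [nn.Linear(n_features, 1) for _ in quantiles]
        )

    def forward(self, x):
        feats = self.body(x)  # (batch, n_features)
        # Each head yields (batch, 1); concatenate to (batch, n_quantiles)
        return torch.cat([layer(feats) for layer in self.final_layers], dim=1)

# Stand-in body for illustration; a real one would be e.g. a CNN backbone
body = nn.Sequential(nn.Linear(10, 64), nn.ReLU())
model = QuantileModel(body, n_features=64, quantiles=[.05, .5, .95])
out = model(torch.randn(8, 10))
```

As far as I can tell, a single `nn.Linear(n_features, len(quantiles))` layer should produce the same output shape and be functionally equivalent to the separate heads, so either form could serve as the custom head.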
I found a cool example project that does regression on images of people’s faces to predict their age. I’m trying to build a POC with it, in case anyone wants a toy project to try this out themselves:
P.S. The Monte Carlo dropout technique they demonstrate looks pretty useful too.