I’m surprised that I can’t find anyone asking about this already.
There is a great series of articles on this topic with some PyTorch code available:
To summarize my problem, the loss function looks like this:
import torch
import torch.nn as nn

class QuantileLoss(nn.Module):
    def __init__(self, quantiles):
        super().__init__()
        self.quantiles = quantiles

    def forward(self, preds, target):
        assert not target.requires_grad
        assert preds.size(0) == target.size(0)
        losses = []
        for i, q in enumerate(self.quantiles):  # self.quantiles, not the bare name
            # Pinball loss: (q - 1) * e when e < 0, q * e when e >= 0
            errors = target - preds[:, i]
            losses.append(torch.max((q - 1) * errors, q * errors).unsqueeze(1))
        loss = torch.mean(torch.sum(torch.cat(losses, dim=1), dim=1))
        return loss
Which you would use like this:
quantiles = [.05, .5, .95]
loss_func = QuantileLoss(quantiles)
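For a quick sanity check, here’s a minimal sketch of calling it on dummy tensors (the batch size of 8 and the shapes — predictions of shape (batch, n_quantiles) against a target of shape (batch,) — are my own assumptions):

```python
import torch
import torch.nn as nn

# QuantileLoss as defined above, repeated so this snippet runs on its own
class QuantileLoss(nn.Module):
    def __init__(self, quantiles):
        super().__init__()
        self.quantiles = quantiles

    def forward(self, preds, target):
        losses = []
        for i, q in enumerate(self.quantiles):
            errors = target - preds[:, i]
            losses.append(torch.max((q - 1) * errors, q * errors).unsqueeze(1))
        return torch.mean(torch.sum(torch.cat(losses, dim=1), dim=1))

quantiles = [.05, .5, .95]
loss_func = QuantileLoss(quantiles)

preds = torch.randn(8, len(quantiles))  # one column per quantile
target = torch.randn(8)                 # single true value per sample
loss = loss_func(preds, target)
print(loss.item())  # a non-negative scalar
```

Since max((q - 1) * e, q * e) is non-negative for any error e and 0 < q < 1, the loss is always >= 0, which is a handy thing to assert while debugging.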
The problem I am having is that the network only produces one output, because the data source only has one target value. I’m not sure how to add a custom head with three linear layers that each output one value (assuming three quantiles, as in my example).
In the example code I cited above, the model is built like this:
final_layers = [
    nn.Linear(64, 1) for _ in range(len(self.quantiles))
]
self.final_layers = nn.ModuleList(final_layers)

def forward(self, x):
    tmp_ = self.base_model(x)
    return torch.cat([layer(tmp_) for layer in self.final_layers], dim=1)
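To show how those pieces fit into a complete model, here’s a minimal sketch of a network with one single-unit head per quantile. The class name `MultiQuantileNet`, the fully connected trunk, the 64-unit hidden size, and the input size are my own placeholders, not from the articles:

```python
import torch
import torch.nn as nn

class MultiQuantileNet(nn.Module):
    def __init__(self, n_features, quantiles):
        super().__init__()
        self.quantiles = quantiles
        # Shared trunk; swap in whatever backbone you actually use.
        self.base_model = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
        )
        # One Linear(64, 1) head per quantile, registered via ModuleList
        # so their parameters are visible to the optimizer.
        self.final_layers = nn.ModuleList(
            nn.Linear(64, 1) for _ in range(len(quantiles))
        )

    def forward(self, x):
        tmp_ = self.base_model(x)
        # Concatenate the head outputs -> shape (batch, n_quantiles)
        return torch.cat([layer(tmp_) for layer in self.final_layers], dim=1)

model = MultiQuantileNet(n_features=10, quantiles=[.05, .5, .95])
out = model(torch.randn(4, 10))
print(out.shape)  # torch.Size([4, 3])
```

The output then has one column per quantile, which is exactly the shape `QuantileLoss` indexes with `preds[:, i]`.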
I found a cool example project that does a regression on images of people’s faces to predict their age, and I’m trying to do a POC based on it. It makes a nice toy project if anyone else wants to try this out themselves:
Thanks,
Bob
P.S. The monte carlo dropout technique they demonstrate looks pretty useful too.