I’m not quite sure why I’m getting NaN. I know this can occur when 0 or a negative value is input to a logarithm. However, my data has no negative values, and I’m accounting for 0 values by using the torch.log1p() function, which adds 1 to the input before taking the logarithm.
Is it simply the model itself outputting negative predictions?
Most likely the issue is that preds holds values smaller than or equal to -1. Even with log1p that would still be a problem: if preds contains, for instance, -3, then -3 + 1 = -2, whose logarithm is undefined on the real numbers. One solution is to clamp the predictions with preds = preds.clamp(min=0) to ensure a minimum value of 0; another is to normalize them with fastai’s sigmoid_range, e.g., preds = sigmoid_range(preds, 0, 10).
Alternatively, preds might contain NaNs, which can be checked by preds.isnan().sum().
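If your loss is something along the lines of RMSLE, a quick sanity check like the following should surface both problems (preds and targs here are just stand-in tensors for your batch):

```python
import torch

# stand-in tensors; substitute the actual predictions and targets from your batch
preds = torch.tensor([2.5, -3.0, 0.0, 4.1])
targs = torch.tensor([2.0,  1.0, 0.5, 4.0])

print(preds.isnan().sum())   # any NaNs already present in the predictions?
print((preds <= -1).sum())   # any values that would break log1p?

# clamping first guarantees log1p never receives a value <= -1
safe_preds = preds.clamp(min=0)
rmsle = torch.sqrt(((torch.log1p(safe_preds) - torch.log1p(targs)) ** 2).mean())
print(rmsle)
```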
Before writing this post, I did consider clamping the predictions to be non-negative, since, as you also said, log1p wouldn’t help for negative values, but I wanted to see if there was a better way.
My hesitation was that if I clamped, the clamped predictions wouldn’t be fed back into the model. But I suppose the model will figure that out itself from the calculated loss?
I can’t use sigmoid_range, however, because the outputs are continuous.
Thanks to the power of backpropagation and PyTorch’s autograd engine, you need not worry about any problems arising from clamping. However, I’d recommend including the clamp operation in the model’s forward pass because it would, in a sense, become part of the network’s definition and not merely a step in the calculation of the loss function. That is, the model’s outputs would always have to be clamped to get reliable results, be it during inference or training.
Also, you can think of clamping values at 0 as the equivalent of ReLU - in fact, you could insert ReLU as the final layer of your network rather than clamping.
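Here is a minimal sketch of what I mean; the class name and layer sizes are made up purely for illustration:

```python
import torch.nn as nn

class SimpleRegressor(nn.Module):
    def __init__(self, n_in):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_in, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        # clamping inside forward makes the non-negativity part of the model itself,
        # so training and inference both produce valid inputs for log1p
        return self.layers(x).clamp(min=0)
```

Swapping the final clamp for nn.ReLU() as the last layer of self.layers would have the same effect.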
sigmoid_range scales a tensor’s content to a given range and is meant for continuous data.
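For example:

```python
import torch
from fastai.torch_core import sigmoid_range

x = torch.tensor([-5.0, 0.0, 5.0])
print(sigmoid_range(x, 0, 10))   # every value squashed into the open interval (0, 10)
```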
ReLUs were indeed coming to mind as I was tinkering with this!
I’m using fastai to create my learner and so far haven’t specified any model for it. If I wanted to use ReLU as an activation, would I have to create my own PyTorch model and pass it to the learner, or is there a function that would let me specify it more easily?
I did find the tabular_config function, to which I passed torch.nn.ReLU via the act_cls parameter, and then passed the resulting config to the learner. That didn’t seem to work though, as I still got NaNs.
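For reference, this is roughly what I had (reconstructed from memory, so the exact call may differ slightly):

```python
import torch
from fastai.tabular.all import *

# dls is my TabularDataLoaders
config = tabular_config(act_cls=torch.nn.ReLU())
learn = tabular_learner(dls, config=config)
```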
Oh, I see. Judging by the docs, I still don’t think it would work for my case, because I wouldn’t know what the maximum value should be. Unless I just take the maximum value from my dataset?
To the best of my knowledge, fastai does not natively offer an option for specifying a final activation layer, but assuming the Learner is an instance of TabularLearner and fastai automatically constructed the model, appending ReLU to it is as straightforward as learn.model.layers.append(nn.ReLU()).
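For example, assuming dls is your TabularDataLoaders and you are happy with the architecture fastai builds by default:

```python
from fastai.tabular.all import *

learn = tabular_learner(dls)           # dls: your TabularDataLoaders
learn.model.layers.append(nn.ReLU())   # clamp the final output at 0 via ReLU
learn.fit_one_cycle(5)
```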
Yes, you could determine the maximum value from your dataset or prior domain knowledge. If you do decide to proceed with sigmoid_range, you can pass the desired range to the y_range argument of TabularLearner. Ultimately, you should experiment with both methods to evaluate the efficacy of each.
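Here is a sketch of the second option, using a tiny synthetic dataset purely for illustration (with your own data you would plug in your DataLoaders and your target’s maximum):

```python
import numpy as np, pandas as pd
from fastai.tabular.all import *

# tiny synthetic regression dataset standing in for your real data
df = pd.DataFrame({'x1': np.random.rand(100), 'x2': np.random.rand(100)})
df['target'] = 5 * df['x1'] + 3 * df['x2']

dls = TabularDataLoaders.from_df(df, cont_names=['x1', 'x2'], y_names='target', bs=16)

# y_range applies sigmoid_range to the model's output; leave a little headroom
# above the observed maximum, since a sigmoid never quite reaches its upper bound
hi = float(df['target'].max()) * 1.1
learn = tabular_learner(dls, y_range=(0, hi), metrics=rmse)
learn.fit_one_cycle(3)
```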