Changing Criterion During Training Provides Good Results

So PyTorch doesn’t look like it has Huber loss as a built-in criterion, and I haven’t coded a custom loss function for it yet.
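
(As far as I know, `nn.SmoothL1Loss` is basically Huber with a fixed threshold of 1, but if you want a tunable delta, rolling your own isn't too bad. Here's a rough sketch of what I have in mind; the `delta` value and the mean reduction are just my assumptions, not anything official:)

```python
import torch
import torch.nn as nn

class HuberLoss(nn.Module):
    """Rough custom Huber loss: quadratic for small errors, linear for large ones."""
    def __init__(self, delta=1.0):
        super().__init__()
        self.delta = delta

    def forward(self, pred, target):
        error = pred - target
        abs_error = error.abs()
        quadratic = 0.5 * error ** 2                          # used when |error| <= delta
        linear = self.delta * (abs_error - 0.5 * self.delta)  # used when |error| > delta
        return torch.where(abs_error <= self.delta, quadratic, linear).mean()

# usage: criterion = HuberLoss(delta=1.0); loss = criterion(model(x), y)
```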

In the meantime I’ve been thinking about ways to address the distributional issues on the pre-processing side of things. I posted another thread on the topic here, but I also wanted to ask you guys if you had any thoughts on pre-processing. I would like to take the log of some of my features, but I can’t do that with others because they contain many negative values. So I’m wondering if there are other ideas for how to process these features. Can I take the log of just some features but not others? Am I asking for trouble by scaling and normalizing different features in different ways? A sketch of what I mean is below.
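
To make the question concrete, here's roughly the kind of mixed preprocessing I'm imagining (the column names, the feature groupings, and the signed-log idea for columns with negatives are all just placeholders I made up, not something I've validated):

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical feature groups -- substitute your own columns.
pos_cols = ["income", "age"]        # strictly positive: plain log transform
mixed_cols = ["balance_change"]     # contains negatives: signed log instead
other_cols = ["temperature"]        # leave on its original scale, just standardize

# sign(x) * log1p(|x|) keeps the sign while compressing large magnitudes
signed_log = FunctionTransformer(lambda x: np.sign(x) * np.log1p(np.abs(x)))

preprocess = ColumnTransformer([
    ("log", FunctionTransformer(np.log1p), pos_cols),
    ("signed_log", signed_log, mixed_cols),
    ("scale", StandardScaler(), other_cols),
])

# Toy data just to show the shapes; real data would come from my dataset.
X = pd.DataFrame({
    "income": [35_000.0, 72_000.0],
    "age": [29.0, 54.0],
    "balance_change": [-120.5, 300.0],
    "temperature": [18.2, 25.6],
})
X_processed = preprocess.fit_transform(X)
```

Is treating each group of columns differently like this reasonable, or does the inconsistency cause problems downstream?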