Hello everyone!
I have a question about how to craft features such that they lead to the best results for my Neural Networks.
Let me start with two examples I face in my current projects.
- In one of my NNs I take wind direction as an input, measured in degrees (0-360). It was brought to my attention that, from an input perspective, this might be a source of bias. If the wind shifts from NW to NE, the value goes from the high 300s to below 50. This numerical representation can lead to some strange cusps in my data when training the RNN.
  - Part of my solution to this was to split wind direction into 8 columns and make them binary: either the wind is blowing that way or it is not.
- The other example is with some labels that I have in a loan default prediction NN. If I transform text labels such as "Credit Card Refinancing" or "Car Loan" into numerical representations, wouldn't this cause bias based on which label gets which number? For example, "Car Loan" (mapped to 9) would somehow be better than "Credit Card Refinancing" (mapped to 2).
  - I had a similar idea of turning all these different purposes into their own binary columns that act as inputs.
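To make the two binary-column ideas concrete, here is a minimal sketch in plain Python. The function names, the choice of eight 45-degree sectors centred on the compass points, and the two-item category list are assumptions for illustration only; the real loan-purpose list would have more entries.

```python
# Hypothetical helpers sketching the "one binary column per category" idea.

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def wind_to_sector_columns(degrees):
    """Map a wind direction in degrees to 8 binary columns, one per compass sector."""
    # Assumption: each sector spans 45 degrees and is centred on its compass
    # point, so N covers 337.5-360 and 0-22.5. The modulo handles the wrap-around.
    index = int(((degrees % 360) + 22.5) // 45) % 8
    return [1 if i == index else 0 for i in range(len(SECTORS))]

def one_hot(label, categories):
    """Map a text label to one binary column per known category."""
    if label not in categories:
        raise ValueError(f"unknown label: {label!r}")
    return [1 if label == c else 0 for c in categories]

# Only the two labels mentioned in the post; the real list is longer.
PURPOSES = ["Car Loan", "Credit Card Refinancing"]

print(wind_to_sector_columns(350))    # falls in the N sector
print(wind_to_sector_columns(10))     # also falls in the N sector
print(one_hot("Car Loan", PURPOSES))
```

With this layout, 350 degrees and 10 degrees land in the same column, so the jump at the 0/360 boundary disappears. No loan purpose is given a larger number than another, since every label gets its own column.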
My main question is: what is the name of this data problem, and what are some of the more recent thoughts on how to convert text labels into numerical representations that won't inadvertently add bias to my network?