@MicPie,
thank you for sharing your thoughts and all the insightful links. Meanwhile, I came to realize that all the correlation tests are actually pretty meaningless in practice. In my last experiment, I simply used Bayesian optimization and it just worked out of the box and delivered results far superior to anything I have done myself.
In a nutshell, building any deep learning model consists of three optimization problems:
- Find the best features
- Find the best model
- Find the best model hyperparameters.
For (1), you can use trial and error, leverage domain knowledge, or simply use genetic algorithms. The latter deliver roughly the same result as a brute-force search, but about 10X faster.
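To make the genetic-algorithm idea concrete, here is a minimal toy sketch of GA-based feature selection. Everything in it is made up for illustration: the fitness function, which features count as "informative", and the population settings. In a real pipeline, `fitness` would be a cross-validated model score.

```python
import random

# Toy setup: pretend features 1, 3, and 5 are the informative ones
# (purely illustrative -- in practice you don't know this in advance).
INFORMATIVE = {1, 3, 5}
N_FEATURES = 10

def fitness(mask):
    # Reward picking informative features, lightly penalize extras.
    # Stand-in for a cross-validated validation score.
    picked = {i for i, bit in enumerate(mask) if bit}
    return len(picked & INFORMATIVE) - 0.1 * len(picked - INFORMATIVE)

def mutate(mask, rate=0.1):
    # Flip each bit with a small probability.
    return [bit ^ (random.random() < rate) for bit in mask]

def crossover(a, b):
    # Single-point crossover between two parent masks.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=20, generations=40):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # keep the fittest half
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

random.seed(0)
best = evolve()
```

Because the fittest half survives unchanged each generation, the best mask never gets worse, which is what makes this so much cheaper than exhaustively scoring all 2^10 feature subsets.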
For (2), you can use any existing model if it is good enough, build your own, or use deep neuroevolution to find your best model. The latter is computationally expensive.
For (3), you're lucky, because that's the easiest part: there are plenty of tools at your disposal. For starters, use known best practices, as fast.ai does all the time. Then there is Bayesian optimization, which is arguably a bit shaky when used standalone. Some folks question whether it is any better than random search, and indeed, the evidence seems inconclusive. However, when combined with evolutionary search strategies, it delivers superb results, although in a non-deterministic way. In practice, I got the optimal result within three runs or so, so that's actually a non-issue.
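The evolutionary half of that combination is simple enough to sketch in a few lines. Below is a toy (1+λ)-style hill climb over two hyperparameters. The `val_loss` function, its "optimal" settings (lr=0.01, dropout=0.3), and the starting guess are all invented for illustration; in reality each evaluation would be a training run.

```python
import random

# Made-up stand-in for validation loss after a training run.
# It is minimized at lr=0.01, dropout=0.3 (chosen arbitrarily here).
def val_loss(lr, dropout):
    return (lr - 0.01) ** 2 * 1e4 + (dropout - 0.3) ** 2

def evolve_hparams(generations=50, children=8):
    parent = {"lr": 0.1, "dropout": 0.5}       # initial guess
    for _ in range(generations):
        # Parent competes against mutated children; lr is mutated
        # multiplicatively (it lives on a log-ish scale), dropout additively.
        candidates = [parent] + [
            {"lr": abs(parent["lr"] * random.gauss(1, 0.2)),
             "dropout": min(max(parent["dropout"] + random.gauss(0, 0.05), 0.0), 0.9)}
            for _ in range(children)
        ]
        parent = min(candidates, key=lambda h: val_loss(**h))
    return parent

random.seed(0)
best = evolve_hparams()
```

Keeping the parent in the candidate pool means the loss can only improve, which is why a handful of runs is usually enough to land near the optimum even though the search itself is random.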
Also, there are model-agnostic tools to tune hyperparameters, although I have not explored them yet.
I think Jeremy recently made a point about not teaching Reinforcement Learning because it is in fact no better than any good search algorithm. A valid point, and I discovered recently that even a crappy model optimized with a genetic algorithm easily outperforms the most sophisticated RL system by a wide margin. Conversely, those RL agents that actually do well in practice usually lean heavily on optimization, either plain brute-force search or a genetic algorithm.
More recently, the decades-old genetic algorithm has made a strong comeback, now labeled "Deep Neuroevolution", to auto-optimize RL agents and deep nets with millions of parameters. Uber invests heavily in that field because of the simple reality that any auto-optimizer does better than any human whenever your model becomes complex. And there is no shortage of complex models out there, but there is certainly a shortage of highly optimized ones. It is very telling that Google, Uber, and OpenAI are all moving beyond RL and towards Deep Neuroevolution, since all three face the problem of optimizing large and complex models.
https://eng.uber.com/deep-neuroevolution/
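To show the core idea in miniature: evolve a tiny network's weights by selection and mutation alone, with no gradients or backprop anywhere. The network size, mutation scale, and the XOR task are my own illustrative choices, a far cry from the million-parameter nets in the Uber work, but the loop is the same shape.

```python
import numpy as np

# Learn XOR with a 2-4-1 network whose 12 weights are evolved, not trained.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

def forward(w, X):
    W1 = w[:8].reshape(2, 4)          # input -> 4 hidden units
    W2 = w[8:12].reshape(4, 1)        # hidden -> 1 output
    return (np.tanh(X @ W1) @ W2).ravel()

def loss(w):
    return float(np.mean((forward(w, X) - y) ** 2))

rng = np.random.default_rng(0)
pop = rng.normal(size=(50, 12))       # 50 random weight vectors
for _ in range(200):
    scores = np.array([loss(w) for w in pop])
    elite = pop[scores.argsort()[:10]]               # keep the 10 best
    # Next generation: elites plus noisy copies of randomly chosen elites.
    mutants = elite[rng.integers(0, 10, 40)] + rng.normal(0, 0.1, size=(40, 12))
    pop = np.vstack([elite, mutants])

best = min(pop, key=loss)
```

Every step is an independent forward pass, which is exactly why this scales so well across plain CPU cores: there is nothing to synchronize except the selection step.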
Reinforcement Learning may or may not survive the decade, but Deep Neuroevolution is here to stay, simply because it solves a ton of really hard optimization problems within a reasonable time and with reasonable resources. By reasonable, I mean 48 cores instead of the hundreds of GPUs/TPUs Google loves to use in its experiments. When you can auto-optimize a common architecture and its hyperparameters within an hour or less, I'll take that any day.