My name is Fabio, I'm 26, I live in Berlin, Germany, and I'm currently a Ph.D. candidate in theoretical astrophysics.
I implemented DeepMind’s Dueling DDQN in TensorFlow from scratch and tuned it over the course of a few weeks until it matched the scores DeepMind reports for the Atari games Pong and Breakout. There are several threads/posts on the web (here, here and here) where people report having difficulties matching those scores, especially in Breakout.
Since I learned a lot while implementing and especially debugging it, I decided to explain the individual components of DQN in detail in a single Jupyter notebook. Each component's explanation, including worked examples of the relevant equations, is directly followed by its implementation.
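To give a flavour of the two ideas behind "Dueling DDQN", here is a minimal NumPy sketch. It is not code from the notebook: the function names, array shapes, and the discount factor `gamma=0.99` are assumptions chosen for illustration. The first function combines a state value and per-action advantages into Q-values (the dueling part), the second computes training targets where the online network picks the next action and the target network evaluates it (the double part).

```python
import numpy as np

def dueling_q_values(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    value:     shape (batch, 1)
    advantage: shape (batch, n_actions)
    """
    return value + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a)).

    rewards, dones: shape (batch,); dones is 1.0 for terminal transitions
    q_next_online, q_next_target: shape (batch, n_actions), both evaluated at s'
    gamma: discount factor (0.99 is an assumed value for this sketch)
    """
    # The online network selects the action ...
    best_actions = np.argmax(q_next_online, axis=1)
    # ... and the target network estimates its value.
    next_q = q_next_target[np.arange(len(rewards)), best_actions]
    # Terminal transitions get no bootstrapped value.
    return rewards + gamma * next_q * (1.0 - dones)
```

In a full agent both functions would operate on network outputs for a minibatch sampled from the replay memory; here they work on plain arrays so the arithmetic is easy to check by hand.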
If you are interested in learning about RL and especially DQN, take a look: