Uncertainty in Deep Learning (Bayesian neural networks, Gaussian processes)


(Jason P Morrison) #1

I am really enjoying reading Yarin Gal’s work drawing connections between deep learning techniques and Bayesian Inference / Gaussian Processes.

Particularly fascinating is the idea of producing useful uncertainty estimates from deep neural networks by (I’m simplifying a little) adding dropout to your network, keeping it active at prediction time, running prediction across many of the thinned networks (Monte Carlo dropout), and measuring the variance across those predictions.
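For concreteness, here’s a minimal sketch of that idea in Keras – purely illustrative, and it assumes a trained Keras model `model` that contains Dropout layers and a test batch `x_test` (both hypothetical names). Passing the learning-phase flag as 1 is what keeps dropout switched on at prediction time:

```python
import numpy as np
from keras import backend as K

# Build a forward-pass function that respects the learning phase,
# so dropout stays active when we set the phase to 1 (training mode).
mc_predict = K.function([model.input, K.learning_phase()], [model.output])

T = 100  # number of stochastic forward passes
mc_samples = np.array([mc_predict([x_test, 1])[0] for _ in range(T)])

pred_mean = mc_samples.mean(axis=0)  # MC-dropout predictive mean
pred_var = mc_samples.var(axis=0)    # spread across the thinned networks
```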

I’m looking forward to trying this in some NLP classification and regression projects of mine, to see if the uncertainty metrics align with the samples that I intuit the network should be uncertain about.

“What My Deep Model Doesn’t Know…”, Yarin Gal 2015:

Quite a bit more in this (very readable, so far!) thesis:
“Uncertainty in Deep Learning”, Yarin Gal 2016:

“So hopefully it can now be seen as a more complete body of work, accessible to as large an audience as possible, and also acting as an introduction to the field of what people refer to today as Bayesian Deep Learning.”

Has anyone tried this, or other techniques for producing uncertainty measures? I found a StackOverflow thread,
“How to calculate prediction uncertainty using Keras?”, although it appears to be missing the model-precision term tau = l**2 * (1 - model.p) / (2 * N * model.weight_decay) (code from “What my deep model doesn’t know”).
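For reference, here’s a hedged sketch of how that term slots into the predictive variance, following the formula above. The names `l` (prior length-scale), `p` (dropout probability), `N` (training-set size), and `weight_decay` (L2 coefficient) are stand-ins for whatever values you trained with, and `mc_samples` is the stack of stochastic predictions from the MC-dropout passes:

```python
# Inverse model precision from Gal's "What my deep model doesn't know" code.
# All names below are placeholders for your own training setup.
tau = l**2 * (1 - p) / (2 * N * weight_decay)

# MC-dropout predictive variance = sample variance across the stochastic
# forward passes + the inverse model precision (the term the SO thread omits).
predictive_variance = mc_samples.var(axis=0) + 1.0 / tau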

Somewhat related: ever since seeing this tongue-in-cheek cartoon https://twitter.com/Azaliamirh/status/907692703604817920, I have been curious where the intersection of Gaussian processes and modern deep learning is contributing fruitful techniques – anyone have some fun papers in that area?


(Jason P Morrison) #2

Quick follow-up on why I think this is SUPER INTERESTING for “Practical Deep Learning for Coders”: I think uncertainty metrics can be a very valuable tool for intuiting where our model is good and where it can be improved (joining my debugging pantheon of “print your results”, “plot your results”, and “occasionally use LIME”).

It’s also interesting when we have very small datasets but could hand-label more. I find myself in that situation often, and uncertainty metrics can be used in a process called “active learning”: you look for the kinds of examples the model is least certain about and provide more labeled samples in those areas, which should maximize the ROI on time spent hand labeling. This might be intuitive for some instances of classification or regression, but perhaps less so for image recognition or more sophisticated tasks. There are some interesting details in the thesis and in Gal and Islam’s work on active learning with image data; a rough sketch of the selection step is below.
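This is my own illustration rather than code from the thesis – given MC-dropout class probabilities over an unlabeled pool (`mc_probs`, a hypothetical array of shape (T, n_pool, n_classes)), one simple acquisition rule is to pick the examples with the highest predictive entropy and send those for hand labeling:

```python
import numpy as np

def most_uncertain(mc_probs, k=10):
    """Return indices of the k pool examples the model is least certain about."""
    mean_probs = mc_probs.mean(axis=0)  # average over the T stochastic passes
    # Predictive entropy per example (small epsilon avoids log(0))
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:k]  # label these next
```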


(Mark Hoffmann) #3

I work on a lot of structured data problems as well, and recently came across this same dropout-based approach for estimating uncertainty on those problems. I would find this incredibly useful too.

Has anyone thought about adding this to the fast.ai ColumnarModelData module? Any interest in scoping out what it might take to do so?
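For scoping purposes, here’s a minimal, hypothetical PyTorch sketch of the core change (not fast.ai API, and it ignores ColumnarModelData internals): keep only the Dropout modules in train mode at inference and average over several stochastic passes.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=50):
    """MC-dropout inference for a trained PyTorch model (illustrative helper).

    Puts the model in eval mode (so e.g. BatchNorm uses running stats) but
    switches Dropout modules back to train mode so they keep dropping units
    at prediction time.
    """
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()

    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])

    return samples.mean(dim=0), samples.var(dim=0)  # predictive mean, variance
```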