Uncertainty in Deep Learning (Bayesian networks, Gaussian processes)

I am really enjoying reading Yarin Gal’s work drawing connections between deep learning techniques and Bayesian Inference / Gaussian Processes.

Particularly fascinating is the idea of producing useful uncertainty metrics from deep neural networks by (I’m simplifying a little) adding dropout to your network, running prediction many times with dropout still active so that each pass uses a different thinned network (Monte Carlo dropout), and measuring the variance across the predictions.
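To make the recipe concrete, here is a minimal NumPy sketch. The tiny network, its random weights, and the names `forward_with_dropout` and `mc_dropout_predict` are all hypothetical stand-ins for a trained model; this is an illustration of the idea, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed weights standing in for a trained 1-hidden-layer regressor.
W1 = rng.normal(size=(3, 16))
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1))
b2 = np.zeros(1)

def forward_with_dropout(x, p=0.5):
    """One stochastic forward pass: a fresh dropout mask gives one thinned network."""
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    mask = rng.random(h.shape) >= p    # keep each unit with probability 1 - p
    h = h * mask / (1.0 - p)           # inverted dropout scaling
    return h @ W2 + b2

def mc_dropout_predict(x, n_samples=100, p=0.5):
    """Mean and variance of the predictions across n_samples thinned networks."""
    samples = np.stack([forward_with_dropout(x, p) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)

x = rng.normal(size=(4, 3))            # batch of 4 inputs with 3 features each
mean, var = mc_dropout_predict(x)
print(mean.shape, var.shape)
```

The variance returned per input is the uncertainty signal: inputs where the thinned networks disagree get a larger value.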

I’m looking forward to trying this in some NLP classification and regression projects of mine, to see if the uncertainty metrics align with the samples that I intuit the network should be uncertain about.

“What My Deep Model Doesn’t Know…”, Yarin Gal 2015:

There is quite a bit more in this (very readable, so far!) thesis:
“Uncertainty in Deep Learning” Yarin Gal 2016:

So hopefully it can now be seen as a more complete body of work, accessible to as large an audience as possible, and also acting as an introduction to the field of what people refer to today as Bayesian Deep Learning.

Has anyone tried this? Or other techniques for producing uncertainty measures? I found a StackOverflow thread, “How to calculate prediction uncertainty using Keras?”, although its answer appears to be missing the model precision term tau = l**2 * (1 - model.p) / (2 * N * model.weight_decay), whose inverse is added to the predictive variance (code from “What My Deep Model Doesn’t Know…”).
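For reference, here is a small sketch of how that term enters the regression variance, following the formula quoted above. The function name, the sample predictions, and the values chosen for the length scale, dropout probability, training set size, and weight decay are all hypothetical.

```python
import statistics

def predictive_variance(samples, length_scale, dropout_p, n_train, weight_decay):
    """Variance across MC dropout predictions plus the inverse model precision 1/tau."""
    tau = length_scale**2 * (1 - dropout_p) / (2 * n_train * weight_decay)
    return statistics.pvariance(samples) + 1.0 / tau

# Hypothetical stochastic forward passes for a single regression input.
samples = [2.1, 1.9, 2.4, 2.0, 1.6]
var = predictive_variance(samples, length_scale=10, dropout_p=0.5,
                          n_train=1000, weight_decay=1e-5)
print(var)
```

Without the 1/tau term, the variance only reflects disagreement between the thinned networks and leaves out the observation noise implied by the weight decay and dropout settings.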

Somewhat related: ever since seeing this tongue-in-cheek cartoon https://twitter.com/Azaliamirh/status/907692703604817920 I have been curious where the intersection of Gaussian processes and modern deep learning is producing fruitful techniques. Anyone got some fun papers in that area?


Quick followup on why I think this is SUPER INTERESTING for “Practical Deep Learning for Coders” – I think that uncertainty metrics can be a very valuable tool for intuiting where our model is good and where it can be improved (joining my debugging pantheon of “print your results” and “plot your results” and “occasionally use LIME”).

It’s also interesting when we have very small datasets but could hand-label more. I find myself in that situation often, and uncertainty metrics can be used in a process called “active learning”, where you look for the kinds of examples that the model is least certain about and provide it with more labeled samples in those areas. This should maximize your ROI on time spent hand-labeling. This might be intuitive for some instances of classification or regression, but perhaps less so for image recognition or more sophisticated tasks. The thesis has some interesting details on this; see Gal and Islam on “Active learning with image data”.
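As a rough sketch of that selection step, the snippet below ranks unlabeled examples by the entropy of their averaged MC dropout predictions and returns the most uncertain ones. Predictive entropy is only one possible acquisition function and is an assumption here, as are the function name and the toy probabilities.

```python
import numpy as np

def select_most_uncertain(mc_probs, k):
    """Return indices of the k pool examples with the highest predictive entropy.

    mc_probs has shape (n_samples, n_pool, n_classes): class probabilities
    from n_samples stochastic forward passes over the unlabeled pool.
    """
    mean_probs = mc_probs.mean(axis=0)                                  # (n_pool, n_classes)
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)  # (n_pool,)
    return np.argsort(entropy)[::-1][:k]

# Hypothetical MC dropout outputs: 2 passes, 3 pool examples, 2 classes.
mc_probs = np.array([
    [[0.99, 0.01], [0.60, 0.40], [0.90, 0.10]],
    [[0.97, 0.03], [0.40, 0.60], [0.80, 0.20]],
])
to_label = select_most_uncertain(mc_probs, k=2)
print(to_label)
```

The returned indices are the examples you would hand-label next, after which the model is retrained and the pool is scored again.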


I work on a lot of structured data problems as well and recently came across this same approach for determining uncertainty for these problems using dropout. I would find this incredibly useful as well.

Has anyone thought about adding this to the fast.ai module for ColumnarModelData? Any interest in scoping out what it might take to do so?



I recently saw two implementations of Monte Carlo Dropout (MCDO) in PyTorch. The links could be a good starting point for adding it to the fastai library: RelayNet MCDO and Monte-Carlo Dropout and its variants.

In the first link, these are the relevant changes made to RelayNet to add MCDO. I am guessing similar changes would be necessary for the fastai model implementations. It might be easy to achieve with a decorator-style wrapper that uses the original model as the base class.

    import numpy as np
    import torch

    def train(self, mode=True):
        super().train(mode)
        # to do MC dropout we would like to keep dropout also during evaluation
        for module in self.modules():
            if 'dropout' in module.__class__.__name__.lower():
                module.train(True)  # dropout stays active even when mode is False
        return self

    def predict(self, input, times=10):
        # assumes forward() returns class probabilities, which the entropy terms need
        results = list()
        with torch.no_grad():
            for _ in range(times):
                out = self.forward(input)
                results.append(out.cpu().numpy())  # collect each stochastic pass

        results = np.asarray(results, dtype=float)
        average = results.mean(axis=0).squeeze()
        per_class_entropy = -np.sum(results * np.log(results + 1e-12), axis=0)
        overall_entropy = -np.sum(results * np.log(results + 1e-12), axis=(0, 2))  # 1 is batch size

        return average, per_class_entropy / times, overall_entropy / times, results

I’m looking into building uncertainty estimates into a natural language classification problem I’m working on.

Has anyone had experience doing this?

I’m currently in the process of digging into the fastai library to find out where would be best to implement something like Monte Carlo Dropout at test time (similar to the above comments).
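One library-agnostic way to do this in plain PyTorch is sketched below: put the model in eval mode, switch only the dropout layers back to train mode, and average several stochastic passes. The helper names `enable_mc_dropout` and `mc_predict` and the toy classifier are hypothetical, and this is not fastai's API. Custom dropout layers defined by a library would need to be added to the `isinstance` check.

```python
import torch
import torch.nn as nn

def enable_mc_dropout(model):
    """Put the model in eval mode, then switch only the dropout layers back to train mode."""
    model.eval()
    for module in model.modules():
        if isinstance(module, (nn.Dropout, nn.Dropout2d, nn.Dropout3d)):
            module.train()
    return model

def mc_predict(model, x, n_samples=20):
    """Mean and variance of the softmax outputs over n_samples stochastic passes."""
    enable_mc_dropout(model)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.var(dim=0)

# Hypothetical toy classifier standing in for a real text model.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Dropout(0.5), nn.Linear(32, 3))
mean, var = mc_predict(model, torch.randn(4, 8))
print(mean.shape, var.shape)
```

Keeping the rest of the model in eval mode matters when it contains layers such as batch norm, which should keep using their running statistics at test time.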


Quite a late reply, I know, but this Medium post from Daniel Huynh covers how to implement MC Dropout in fastai. The related Twitter discussion might also be interesting.


Thanks, this is interesting.
