How to use dropout at prediction time?

I’m looking to try and use dropout at prediction time to get a distribution of predictions.

I was inspired by this thread:

How I understand it is:

Run ~100 predictions with dropout still applied to get ~100 different predictions, then by measuring the variance of those predictions you get a measure of 'uncertainty' for each sample.

Some boilerplate code I’ve been experimenting with using MNIST:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5, padding=2)
        self.dropout = nn.Dropout2d(p=0.3)
        self.fc1 = nn.Linear(7*7*20, 1000)
        self.fc2 = nn.Linear(1000, 10)

    def forward(self, x):
        # dropout at every activation layer
        x = F.relu(F.max_pool2d(self.dropout(self.conv1(x)), 2))
        x = F.relu(F.max_pool2d(self.dropout(self.conv2(x)), 2))
        x = x.view(-1, 7*7*20)
        x = F.relu(self.fc1(self.dropout(x)))
        x = self.fc2(self.dropout(x))  # no ReLU on the logits
        return F.log_softmax(x, dim=1)

model = Net().to(device)

...[Train model]...

# Create function to re-enable dropout on a model in eval() mode
def apply_dropout(m):
    if type(m) == nn.Dropout2d:
        m.train()  # put only the dropout layers back into train mode
# Predict the class of X using model
def predict_class(model, X):
    model.eval()
    model.apply(apply_dropout)  # re-enable dropout at pred time (see func above)
    outputs = model(X)
    _, pred = torch.max(outputs, 1)
    return pred.cpu().numpy()

# Predict T times to get a distribution of predictions
def predict(model, X, T=100):
    list_of_preds = []
    standard_pred = predict_class(model, X)
    y1 = []
    y2 = []
    for _ in range(T):
        _y1 = model(X)
        _y2 = F.softmax(_y1, dim=1)
        y1.append(_y1.detach().cpu().numpy())
        y2.append(_y2.detach().cpu().numpy())
        list_of_preds.append(predict_class(model, X))  # predict T times
    return standard_pred, np.array(y1), np.array(y2), np.array(list_of_preds)

By calling predict() and passing it the model and an input batch, 100 stochastic forward passes are made with Dropout2d still applied.

What my empirical testing has shown is that 'bad' examples (ones that don't look like the others) have a higher prediction variance than 'good' examples.
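One simple way to turn the T class predictions returned by predict() into a per-sample uncertainty score is the variation ratio: the fraction of stochastic passes that disagree with the modal class. Below is a minimal, self-contained sketch using plain Python lists in place of the real (T, N) array of labels (the data here is made up for illustration):

    # Hedged sketch: scoring uncertainty from T stochastic class predictions.
    # `all_preds` stands in for the (T, N) `list_of_preds` array from predict().
    from collections import Counter

    def variation_ratio(preds_for_one_sample):
        """1 - (count of modal class) / T; 0.0 means all passes agree."""
        counts = Counter(preds_for_one_sample)
        modal_count = counts.most_common(1)[0][1]
        return 1.0 - modal_count / len(preds_for_one_sample)

    # T=5 passes over N=2 samples (rows = passes, columns = samples)
    all_preds = [
        [3, 7],
        [3, 7],
        [3, 1],
        [3, 7],
        [3, 2],
    ]

    # transpose so each inner tuple holds one sample's T predictions
    per_sample = list(zip(*all_preds))
    scores = [variation_ratio(p) for p in per_sample]
    print(scores)  # sample 0 is fully agreed on (0.0); sample 1 is not (0.4)

A 'bad' example in the sense above would show up with a score well above 0, while confidently classified examples sit at or near 0. You could equally use the variance of the softmax outputs (y2 in predict()) instead of hard labels.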

Does anyone know how I would go about doing this with the fastai library?

Could I do it with a wrapper function on the Learner class?

I’m currently investigating this now but if anyone has a clue, that’d be amazing.


Hi, were you able to make any progress with this? I am trying to implement this method now as well, and would love to see if you had any success.