Absolutely — we would typically expect the training loss to be higher than the validation loss, and the reason is that dropout is not applied on the validation set. Diving into the code in model.py, look at *fit* (where all the action happens): *model_stepper.reset(True)* puts the model in training mode (i.e. dropout is enabled). When we get round to validation, *validate* calls *stepper.reset(False)*, which puts the model in eval mode and disables dropout:

```
def validate(stepper, dl, metrics):
    batch_cnts,loss,res = [],[],[]
    stepper.reset(False)                 # eval mode: dropout disabled
    with no_grad_context():
        for (*x,y) in iter(dl):
            preds, l = stepper.evaluate(VV(x), VV(y))
            if isinstance(x,list): batch_cnts.append(len(x[0]))
            else: batch_cnts.append(len(x))
            loss.append(to_np(l))
            res.append([f(preds.data, y) for f in metrics])
    # batch-size-weighted averages of the loss and each metric
    return [np.average(loss, 0, weights=batch_cnts)] + list(
        np.average(np.stack(res), 0, weights=batch_cnts))
```
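To see why dropout alone pushes the training-mode loss above the eval-mode loss on the same data, here's a minimal NumPy sketch. Everything in it (the toy linear layer, the targets, the p=0.5 rate) is made up for illustration — it just implements standard inverted dropout by hand:

```
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 50))               # activations for a batch
w = rng.normal(size=50)                       # toy linear "layer"
y = x @ w + rng.normal(scale=0.1, size=1000)  # targets with a little noise

def forward(a, p, training):
    """Inverted dropout: zero units with prob p, rescale survivors by 1/(1-p)."""
    if training:
        mask = rng.random(a.shape) > p
        a = a * mask / (1 - p)
    return a @ w

def mse(pred): return np.mean((pred - y) ** 2)

train_loss = mse(forward(x, 0.5, training=True))   # dropout on
valid_loss = mse(forward(x, 0.5, training=False))  # dropout off
# train_loss comes out well above valid_loss, purely from dropout noise
```

The weights are identical in both passes — the only difference is the dropout mask, and that noise alone is enough to produce the gap.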

The training loss reported during fitting, by contrast, is an exponentially weighted moving average of the per-batch losses:

```
loss = model_stepper.step(V(x),V(y), epoch)
avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
debias_loss = avg_loss / (1 - avg_mom**batch_num)
t.set_postfix(loss=debias_loss)
```

*avg_mom* is a fixed constant of 0.98, and the division by *(1 - avg_mom\*\*batch_num)* debiases the average in the early batches, when *avg_loss* is still dragged towards its zero initialisation.
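That snippet can be run standalone to see the smoothing in action — the per-batch losses below are made up for illustration:

```
avg_mom = 0.98
avg_loss = 0.0
raw_losses = [2.0, 1.5, 1.2, 1.0, 0.9]   # hypothetical per-batch losses
for batch_num, loss in enumerate(raw_losses, start=1):
    avg_loss = avg_loss * avg_mom + loss * (1 - avg_mom)
    debias_loss = avg_loss / (1 - avg_mom ** batch_num)
print(debias_loss)  # ≈ 1.31, a smoothed blend of all five batches
```

Without the debiasing term the reported loss after batch 1 would be 0.04 rather than 2.0, because *avg_loss* starts at zero; the correction makes the displayed number sensible from the very first batch.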

Hope this helps! It’s worth looking at the source code. The *fit* method is a little less readable than it was, but it is all still pretty accessible