CH4 - model is improving, but results always look poor

Hello, I’ve been fighting with this for weeks and could use some advice on how best to debug this issue.

While working on the full MNIST classifier, I’ve found that when I run training, my batch accuracy is either very low or stuck at a single value. But when I then manually apply my model to a batch and calculate the batch accuracy on my own (after the training step has updated the weights), the value and the predictions look quite good.

I’ve shared a cached set of results to demonstrate what I’m seeing here: training example
Or there’s the actual notebook.

If you look at In[19], you can see I’m running training and printing every batch result, which shows pretty bad-looking results. However, right after training has run, in In[20] and In[21], I manually run the same accuracy metric on a batch with the same model, and the accuracy and predictions look pretty decent. When I run training again (In[22]), the values look bad again.

I’m currently trying various callbacks in the learning loop to get an idea of where things are going wrong, but I’m a bit lost. It seems to me like backpropagation isn’t occurring, or possibly my batch_accuracy function just isn’t working properly inside the learner.
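For reference, this is the kind of manual check I’m doing. This is only a sketch (my actual `batch_accuracy` runs on tensors inside the learner; the scores and labels here are made-up values), but it shows the logic I expect: take the argmax of each example’s class scores and compare it to the true label.

```python
def batch_accuracy(preds, targets):
    # preds: one list of per-class scores per example
    # targets: the true class index for each example
    correct = sum(
        max(range(len(scores)), key=scores.__getitem__) == target
        for scores, target in zip(preds, targets)
    )
    return correct / len(targets)

# hypothetical mini-batch: 3 examples, 2 classes
preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
targets = [1, 0, 0]
print(batch_accuracy(preds, targets))  # 2 of 3 correct -> 0.666...
```

When I run the equivalent of this by hand on a batch after training, I get sensible values, which is why the numbers printed during training confuse me.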

Any advice would be greatly appreciated, thanks!