learn.TTA stuck after first epoch ends

While using learn.tta, I noticed that it doesn't proceed to the subsequent epochs after processing the first batch of test set images. It has been stuck at this point for 20 minutes, and it gets stuck at the same place even after I restart my machine. What could be happening here?

Turns out the problem was WandbCallback running its after_epoch hook. This is test time, so I don't think all of the functionality in that callback works there, hence the long lag before the progress bar is even printed at the end of the current step.
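For anyone hitting the same thing, one workaround is to take the logging callback off the learner while TTA runs and put it back afterwards. Below is a minimal, self-contained sketch of that pattern. `ToyLearner`, `SlowLogger` and `without` are hypothetical stand-ins I made up for illustration, not fastai or wandb classes. As far as I know, fastai's Learner has `remove_cb` and a `removed_cbs` context manager that serve the same purpose, but check the docs for your version before relying on that.

```python
from contextlib import contextmanager


class SlowLogger:
    """Hypothetical stand-in for a logging callback such as WandbCallback."""

    def __init__(self):
        self.calls = 0

    def after_epoch(self):
        # Imagine an expensive logging step here; we only count the calls.
        self.calls += 1


class ToyLearner:
    """Hypothetical minimal learner, only here to show the pattern."""

    def __init__(self, cbs):
        self.cbs = list(cbs)

    @contextmanager
    def without(self, cb_type):
        # Temporarily drop every callback of the given type.
        removed = [cb for cb in self.cbs if isinstance(cb, cb_type)]
        self.cbs = [cb for cb in self.cbs if not isinstance(cb, cb_type)]
        try:
            yield self
        finally:
            # Restore the callbacks so later training still logs.
            self.cbs.extend(removed)

    def tta(self, n=4):
        # Each augmented pass ends with the after_epoch hooks firing.
        for _ in range(n):
            for cb in self.cbs:
                cb.after_epoch()
        return n


logger = SlowLogger()
learn = ToyLearner([logger])

# Run the test-time passes without the logging callback attached.
with learn.without(SlowLogger):
    passes = learn.tta()
```

After the `with` block the callback is attached again, so training runs after TTA keep logging as before.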
