When does it make sense to use bootstrapping?

Cool, thanks, I’ll check out the links! Another link I stumbled on that seems very interesting: https://stats.stackexchange.com/questions/26088/explaining-to-laypeople-why-bootstrapping-works

Just to clarify one part of my question: I’m curious whether it would make sense to bootstrap while training even a single NN (so not as a means of ensembling NNs that have each been trained on somewhat different versions of the training data). I’m not sure whether this makes sense or would end up working any differently from the normal approach, but to be as bootstrappy as possible, imagine doing away with the concept of an epoch entirely: each minibatch is freshly sampled with replacement from the training set, and you just keep sampling and taking gradient-descent steps until you’re satisfied. Something like the sketch below.
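
To make the idea concrete, here’s a minimal sketch of the epoch-free, sample-with-replacement loop I have in mind. The model, data, loss, and step budget are all just toy stand-ins (plain PyTorch, a linear model on random data), not anything from the links above:

```python
import torch

# Toy stand-ins for whatever model/data/loss you'd actually use.
model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

X = torch.randn(1000, 10)  # full training set (features)
y = torch.randn(1000, 1)   # full training set (targets)

batch_size = 32
num_steps = 5000  # no epochs: just a step budget (or some convergence check)

for step in range(num_steps):
    # Each minibatch is drawn i.i.d. with replacement from the training set,
    # so duplicates within a batch are possible and no "epoch" ever completes.
    idx = torch.randint(0, X.shape[0], (batch_size,))
    loss = loss_fn(model(X[idx]), y[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The only difference from the usual setup is that `idx` is sampled with replacement each step, rather than the data being shuffled once and walked through in epochs.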