TL;DR: I had an idea that it might be interesting to use spaced repetition for curriculum learning, similar to Anki. I have not implemented it yet; I'm sharing it in case someone else wants to try it too.
I first heard about “curriculum learning” above in this topic.
I was thinking about how to continue training during the inference stage, i.e. after a model is released to production, so it can continue to learn, and I came up with the idea of training using spaced repetition, like Anki does for human learning.
We would schedule items for training based on how well the model is doing with them. If it fails on an item, schedule that item for study again ASAP, e.g. in the next batch of the same epoch. If it succeeds on an item, schedule it out into the future a bit. When the model repeatedly succeeds on an item, it would review that item at exponentially increasing intervals, e.g. 1 day, 2 days, 4 days, 8 days, … (or rather, some number of batches / epochs later).
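To make the scheduling rule concrete, here is a toy sketch of just the bookkeeping, with no model involved. Everything in it is an assumption for illustration: the `SpacedScheduler` class and its method names are hypothetical, time is counted in batches rather than days, the starting interval of 2 batches and the doubling factor are arbitrary choices, and what counts as "success" (a correct prediction, a loss under some threshold, ...) is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class SpacedScheduler:
    """Hypothetical spaced-repetition scheduler; time is measured in batches."""
    base_interval: int = 2   # batches to wait after a first success (assumed value)
    factor: int = 2          # interval multiplier on repeated success (assumed value)
    step: int = 0            # index of the current batch, i.e. the scheduler's clock
    interval: dict = field(default_factory=dict)  # item id -> current interval
    due: dict = field(default_factory=dict)       # item id -> step when next due

    def add(self, item_id):
        """Register a new item, due immediately (this also works mid-training)."""
        self.interval[item_id] = 0
        self.due[item_id] = self.step

    def next_batch(self, batch_size):
        """Return up to batch_size items that are due, most overdue first."""
        ready = [i for i, d in self.due.items() if d <= self.step]
        ready.sort(key=lambda i: self.due[i])
        return ready[:batch_size]

    def report(self, item_id, success):
        """Reschedule an item after the model has been checked on it."""
        if success:
            prev = self.interval[item_id]
            # First success starts at base_interval; later ones grow exponentially.
            self.interval[item_id] = self.base_interval if prev == 0 else prev * self.factor
            self.due[item_id] = self.step + self.interval[item_id]
        else:
            # Failure resets the interval and puts the item in the very next batch.
            self.interval[item_id] = 0
            self.due[item_id] = self.step + 1

    def tick(self):
        """Advance the clock by one batch."""
        self.step += 1


# Usage: two items, one success and one failure in the first batch.
sched = SpacedScheduler()
for item in ["a", "b"]:
    sched.add(item)
batch = sched.next_batch(8)        # both items are due at step 0
sched.report("a", success=True)    # "a" comes back 2 batches later
sched.report("b", success=False)   # "b" comes back in the next batch
sched.tick()
```

In a real training loop, `next_batch` would pick which examples to feed the model, and `report` would be called per example after the forward pass. Whether a batch that is not full should be topped up with items that are not yet due is an open design choice this sketch ignores.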
I haven’t tried this yet for machine learning, but spaced repetition can work very well for human study, so I guess it could work well for machine learning too. It should also make it easy to add new items to the training set later on (e.g. in production, or for subsequent stages of learning). I’ve personally used Anki to learn kanji and martial arts techniques, and it’s surely much more effective than just rereading the whole dataset repeatedly, as we do in each normal epoch without curriculum learning.
It would make sense to combine this with other curriculum learning ideas, such as tackling the easier cases first.
edit: I found a paper mentioning this approach and some more advanced approaches, which cites some other papers on the topic too: https://arxiv.org/pdf/2011.00080.pdf