Can a research mathematician not understand their own dissertation and the implications for learning

I would like to share an article I just came across:

This resonates deeply with something I have been wondering about lately. Why is it that if I don’t work with RNNs for a couple of weeks, I effectively have to bring myself back up to speed on many of the things I knew how to implement in the past? Why is it that when figuring out how to train a convolutional neural net, I need to feverishly jump around the fastai codebase even though I have trained convnets using it before, some of them quite non-standard / customized, etc.?

The answer is that this is okay. This is just how it is. With each notebook I implement I learn something and improve my meta problem solving skills. But it is natural at the same time that I will forget specific things pertinent to a subject area over time.

I have a colleague at work who is very good at mathematics and he tells me the same thing - he understands many things at a high level but would need to refresh his memory to speak in detail about nearly anything. And that is okay.

I think this is the difference between someone who grew up in a field and a newcomer. I have another colleague who just started learning to program. I told him no one knows it all, and that I constantly have to look up the API even of things I use on a daily basis, such as dicts or lists. For example, just today I wanted a defaultdict with [] as the default value, but I forgot that I had to pass something callable at construction time, not an empty [] (which makes sense in a lot of ways, d’oh). And most of the things I look up are even more trivial than this. It completely blew his mind when I told him that.
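For anyone who runs into the same gotcha, here is what it looks like in practice:

```python
from collections import defaultdict

# defaultdict wants something *callable* at construction time,
# not a ready-made default value. Passing the list type works:
d = defaultdict(list)  # list() is called to make a fresh [] per missing key
d["a"].append(1)

# Passing an actual empty list raises TypeError ("first argument must
# be callable or None"):
# bad = defaultdict([])
```

Requiring a callable also means every missing key gets its own fresh list, instead of all keys silently sharing one mutable default.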

I am coming to the realization that these unreasonable expectations are what fuels impostor syndrome and what makes it so hard to get into new, complex fields such as math / machine learning. I should probably drop the uneasiness I feel about rereading parts of the fastai codebase or something RNN-related, in the same way that I couldn’t care less that I constantly need to look up the simplest things about the APIs of the languages I work with.


“You probably won’t remember this, but the “forgetting curve” theory explains why learning is hard.”

Since I learnt about that a couple of decades ago, I have rebuilt all my learning methods around it.


Paul Graham (the famous Lisp hacker and Y Combinator founder) in his short essay “How You Know” touches the same subject. Highly recommended 4 minute read. I like the part where he compares acquiring experiences with compiling from source to your brain.


Concretely, when should we go over something again? As we start forgetting things, once we have forgotten half of it, or nearly all of it? At what point on the forgetting curve is it most valuable to repeat the material?

If I am reading this right, it makes sense to keep reviewing material we read / learn from lectures / understand through writing code for a while longer, rather than dropping it as soon as we gain an initial understanding. This also means that learning to solve a problem using a specific library doesn’t count for all that much on its own - for retention you have to come back to the library over and over again and keep using the functionality on slightly permuted problems, or on problems that increase in complexity as you go.

This would then also reward being able to stay focused on a specific area / specific set of tools over time.

Thank you for bringing this up - I definitely need to think more on how I can take advantage of spaced repetition to a greater extent.

At the same time I think being worried too much about forgetting is also part of the problem, at least for me. I have forgotten how to derive the derivative of a softmax followed by cross entropy loss (I can recall the math has nice simplifying properties though), but I am not sure I should be too worried about this. My ability to do things has certainly not decreased as a result, and if I really wanted to know how this is derived I could probably look it up very quickly. But worrying about forgetting is something that has occupied my mind for way too long, and I think I have read into it too much. The worry being that one cannot be ‘good’ (whatever that means) at a field such as ML / math if they don’t remember every step needed to arrive at a result, even if they grok the big picture of why the result is correct.
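For the record (and to reassure my future self when I look this up again), the “nice simplifying property” is just this - a quick sketch assuming the standard definitions of softmax over logits $z$ and cross entropy against a one-hot target $y$:

$$
p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad L = -\sum_i y_i \log p_i
$$

Using $\frac{\partial p_i}{\partial z_k} = p_i(\delta_{ik} - p_k)$ and the chain rule:

$$
\frac{\partial L}{\partial z_k} = -\sum_i \frac{y_i}{p_i}\, p_i(\delta_{ik} - p_k) = -y_k + p_k \sum_i y_i = p_k - y_k
$$

since $\sum_i y_i = 1$. So the gradient of softmax followed by cross entropy collapses to “prediction minus target”.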

Maybe the reality of forgetting is just something that everyone experiences, and it doesn’t necessarily need to make us less effective :slight_smile: And for the things that are really worthwhile and that we want to remember, we should look into spaced repetition.

Sidenote: For anyone interested, there is a great piece of software, Anki, that quite a few people have used successfully for spaced repetition.


Really good read! I like the metaphor of compiling he uses. Even though we might not be able to recall everything word for word, or every step of a derivation, there is still something that remains with us for having done it at some point in our lives :slight_smile: And the model of the world that we build through experiences can be quite useful even though we can’t pinpoint how we got to where we are.

1 Like

When you’re just about to forget, but can remember it with effort. It’s important to not look up the answer at this point, because the effort of remembering is what makes it stick.

Something like the following revision schedule works pretty well:

  • 2 hours
  • 1 day
  • 4 days
  • 2 weeks
  • 2 months
  • a year
  • 3 years
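As a toy illustration, that schedule can be encoded directly. This is just a sketch of the idea - the function name, the clamping at the last interval, and the day counts I use for “2 months” and “a year” are my own assumptions:

```python
from datetime import datetime, timedelta

# The revision schedule from the post above, as roughly geometrically
# growing intervals (day counts for months/years are approximations).
INTERVALS = [
    timedelta(hours=2),
    timedelta(days=1),
    timedelta(days=4),
    timedelta(weeks=2),
    timedelta(days=60),       # ~2 months
    timedelta(days=365),      # ~1 year
    timedelta(days=365 * 3),  # ~3 years
]

def next_review(last_review: datetime, reviews_so_far: int) -> datetime:
    """When should the card come up next? Stays at the longest interval."""
    idx = min(reviews_so_far, len(INTERVALS) - 1)
    return last_review + INTERVALS[idx]
```

After the schedule runs out, the card just keeps coming back every ~3 years rather than being retired.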

This article motivated my use of Anki for learning (and also this Forum search to find this thread):

Maybe this is of help for others.

Best regards

1 Like

Also, Anki will automatically adjust the forgetting curve depending on the feedback for each card.
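For the curious, the adjustment works roughly like the classic SM-2 algorithm that Anki’s scheduler descends from. This is a hedged sketch of SM-2 itself, not Anki’s actual code - Anki’s real implementation differs in several details (answer buttons instead of 0–5 grades, fuzzing, learning steps):

```python
def update_ease(ease: float, quality: int) -> float:
    """SM-2-style ease update. quality: 0 (blackout) .. 5 (perfect recall).

    Hard recalls shrink the ease factor, easy ones grow it slightly.
    """
    ease += 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    return max(ease, 1.3)  # SM-2 floors the ease factor at 1.3

def next_interval(prev_interval_days: float, ease: float) -> float:
    # Each successful review grows the interval geometrically by the ease factor.
    return prev_interval_days * ease
```

So a card you keep rating as hard will come back more and more often, while an easy card drifts out toward very long intervals - which is exactly the per-card curve adjustment described above.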

Since Jeremy mentioned “spaced repetition” in one of the videos (and his “Chinese adventures” online somewhere), I have started using Anki as a kind of note-taking app for all kinds of things, from numpy / pandas / vim commands to ML concepts. Then repeating things while idling somewhere really helps.

I can also recommend this excellent course which provides a lot of insights into learning and methods to improve (including of course spaced repetition):

I think this course should be mandatory for everyone :wink: I wish I had come across this years ago…
It covers other interesting topics like procrastination, all grounded in brain science.


I know that this is an older post but I would love to know more about your learning methods (a lot more please).

What really matters in learning when we know that our memory is going to fade? All of my learning seems meaningless, a mere activity of the “past”. I deeply read and explore a topic only to forget everything about it a few months later - even the practical stuff. What truly matters then? What should I focus on that will actually reward me?

Hi Everybody
There are lots of memory stories. My favourite is from the 1920s, when intellectuals were discussing this subject at a cafe. They called for the bill and the waiter rattled off the list. They paid. Later they called the waiter back, but he could not remember what they had ordered. This is called the completed-task theory: the mind closes the book on finished business. Maybe that is why many languages say goodbye with “until next time” - it leaves the connection open.

The next story is from Einstein. When asked why he kept a notebook he replied because his mind had better things to do than store facts.

I agree with the original contributor. I have forgotten most of what I learned, and if I look at my university exam papers I cannot even name the symbols, let alone answer the questions.

I spent twenty years writing IBM assembler, but nowadays I mostly write C# and occasionally Python, and I have to refer back to code examples whenever I want to do something.

I think the problem is depth. Nowadays a modern computer language can be learnt in a day, with a second day for the obscure bits, whereas before it might have taken much longer.

Originally there were few APIs, because you would either write the code yourself or copy some old code. “Hacking” originally meant copying and reusing code. Nowadays the skill is knowing that a subroutine / function exists to do what you require, rather than having the knowledge to write it yourself.

Finally there is old age and the cluttered mind. Dementia is interesting in old people: they can remember historic details but not what they had for breakfast. Hence the mind must have very-long-term memory, middle-term memory which I refresh by practice, and short-term memory which never goes anywhere.

Please note these are just my thoughts. I have no technical knowledge in this area.

Regards Conwyn

1 Like