That “downplaying” apology applies to me too: you are doing the hard work here, and I am just a bystander.
The problem is not so simple. As far as I know, there’s a three-way split: train, validation, and test set (see the rough sketch below the list).
Train - seen by the model and used to adjust the weights.
Validation - not seen by the model and used to continuously evaluate progress, but it is seen by you, the human decision-maker.
Test - not seen by the model and not used during training in any manner, not even to print its loss or accuracy. It is used only once, at the very end. Ideally, this data stays hidden under a rock until you finish all your work.
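Just to make the idea concrete, here’s a minimal sketch of such a three-way split. I’m using scikit-learn’s train_test_split and made-up dummy data with an assumed 80/10/10 split; this isn’t fastai-specific code, only an illustration of the principle.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy data purely for illustration: 1000 samples, 20 features.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

# First carve off the test set (10%) and hide it "under a rock".
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42)

# Split the remaining 90% into train and validation;
# 1/9 of the remainder is roughly 10% of the original data.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=1/9, random_state=42)

# Train on (X_train, y_train), monitor progress on (X_val, y_val),
# and evaluate on (X_test, y_test) exactly once, at the very end.
```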
This is really the most honest approach, with probably no chance of accidental overfitting. And of course, this is how Kaggle works as well: you don’t see their test set while you work, and you only get a number reported back to you.
This point may be explained further in fastai, I am not sure. But if not, you can easily google it.
Having said all this, I get JH’s point. There’s a good chance you achieved equal or higher accuracy than the paper, since you aren’t tuning the hyperparameters like mad, only very mildly via the learning rate or the number of epochs, if at all. But I would be careful if the claim is a comparison against a research paper and then gets mentioned on Twitter; in that case the result will have to be defended more rigorously. I perfectly understand we are all just learning here; I only shared this because I have run into trouble before by leaving out this kind of work.