What is the ideal validation split size for sequence-to-sequence training?

(Sai Prasanna) #1

I assumed the standard 10-20% split would apply, but a grammar correction paper I came across used only 5k sentences for its validation set, and OpenNMT recommends roughly the same size.

Is this enough? And is there a particular reason why a larger validation set wouldn't matter for this type of training?
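
To make the comparison concrete, here is a rough sketch of the two options I am weighing: holding out a percentage of the corpus versus holding out a fixed number of sentence pairs. The corpus size, the function names, and the toy sentence pairs are all made up for illustration; this is not taken from the paper or from OpenNMT.

```python
import random


def split_fraction(pairs, fraction=0.1, seed=0):
    """Hold out a fixed fraction of the sentence pairs for validation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * fraction)
    # Returns (train, valid)
    return shuffled[n_valid:], shuffled[:n_valid]


def split_fixed(pairs, n_valid=5000, seed=0):
    """Hold out a fixed number of sentence pairs, regardless of corpus size."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_valid = min(n_valid, len(shuffled))  # guard against tiny corpora
    # Returns (train, valid)
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical parallel corpus of 100,000 (source, target) pairs.
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(100_000)]

train_a, valid_a = split_fraction(pairs, fraction=0.1)  # percentage-based split
train_b, valid_b = split_fixed(pairs, n_valid=5000)     # fixed-size split

print(len(train_a), len(valid_a))
print(len(train_b), len(valid_b))
```

With the percentage approach the validation set grows with the corpus, while the fixed approach keeps it at 5k no matter how much training data there is, which is the part I am unsure about.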