ULMFiT - German

jolackner (Johannes Lackner) July 3, 2019, 2:36pm 106

That’s because, as of fastai 1.0.53, all language model shapes are made divisible by 8 by default (half-precision training is much faster that way). Thankfully, Sylvain Gugger posted a workaround here:
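For context, and separate from that workaround: here is a minimal sketch of what "divisible by 8" can mean for a vocabulary, assuming the size is rounded up by appending filler tokens. The helper `pad_to_multiple` and the filler name `xxfake` are placeholders for illustration, not a copy of fastai's code.

```python
def pad_to_multiple(itos, multiple=8, filler="xxfake"):
    """Return a copy of `itos` padded with `filler` until its length is a multiple of `multiple`.

    Hypothetical helper for illustration only; not part of the fastai API.
    """
    padded = list(itos)
    remainder = len(padded) % multiple
    if remainder:
        # Append just enough filler tokens to reach the next multiple.
        padded.extend([filler] * (multiple - remainder))
    return padded


vocab = ["xxunk", "xxpad", "der", "die", "das"]
padded = pad_to_multiple(vocab)
print(len(vocab), "->", len(padded))  # 5 -> 8
```

If something like this is what happens, a vocab built with a newer version can end up a few entries longer than the embedding matrix of a model pretrained with an older one, which would explain a shape mismatch when loading weights.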
