I’m very happy to share the Dutch dataset and the weights of the trained language model!
However, for the former, I’m not sure whether I can publish it, since the contents are scraped. I’ve read the website’s disclaimer and there’s nothing in it that forbids it, but I’m still not quite sure about the legal side.
For the language model weights, do you have any pointers on how I can best package and describe them? I’ve never shared network weights before.