ULMFit - Italian - v1

(Davide Boschetto) #1

Hi there!
Let’s open this thread… The original assignee for Italian has not been active since June of last year, so I thought I would start a new thread and possibly use fastai_v1 to build something for the Italian language!

Dumping itwiki from Wikimedia is still the first required step, followed by JSON conversion using wikiextractor. This is what I remember for now: I’ll see where to go next!
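
For anyone following along, here is a rough sketch of how the converted dump could be read back in Python. It assumes wikiextractor was run with its `--json` option, so that each output file (named like `AA/wiki_00`) holds one JSON object per line with `title` and `text` fields. The function name `iter_articles`, the `extracted` directory and the `min_chars` cutoff are made up for illustration and are not part of any ULMFit script.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_articles(extracted_dir: str, min_chars: int = 1) -> Iterator[dict]:
    """Yield {'title', 'text'} dicts from wikiextractor JSON-lines output.

    Assumption: files are named wiki_* and contain one JSON object per line.
    """
    for path in sorted(Path(extracted_dir).rglob("wiki_*")):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue  # skip blank lines between records
                doc = json.loads(line)
                text = doc.get("text", "").strip()
                # Drop empty or very short pages (e.g. redirects, stubs).
                if len(text) >= min_chars:
                    yield {"title": doc.get("title", ""), "text": text}


if __name__ == "__main__":
    # "extracted" is a hypothetical output directory name.
    for i, article in enumerate(iter_articles("extracted", min_chars=200)):
        print(article["title"], len(article["text"]))
        if i >= 4:
            break
```

Streaming the articles one at a time keeps memory use low, which matters because the full dump will not fit comfortably in RAM on most machines.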

There are some problems (I’m forced to use Windows, and most ULMFit code is written for fastai 0.7, which requires torch 0.3.1, which in turn has trouble on Windows), but I think they are easily solvable given enough time!

Hopefully someone else will jump in :slight_smile:

(Andrea de Luca) #2

I tried to build a viable ULMFit model for Italian, but I did not succeed.

Let me try to find my old code… Meanwhile, let’s stay in touch…


(Fabrizio) #3

Ciao Davide, it might be better to wait for part 2 v3 and see what is coming. Training on Wikipedia is just the first step, and a fairly easy one. However, you still need a dataset in Italian to validate your work, and that is a bit trickier. If you want, let me know about your plans.