And ULMFiT has superior results on the CLS dataset: https://arxiv.org/pdf/1909.04761.pdf
Is anyone willing to share pretrained (English) ULMFiT or MultiFiT LM weights, with the SentencePiece tokenizer?
Update:
I trained it myself: https://www.kaggle.com/manyregression/fastai-en-wiki-500kk-pretrained-sp
There's also more in the versions of this notebook: a 100kk-token run and AWD-LSTM weights.
https://www.kaggle.com/manyregression/sp-wikitext-vocab-lm-ipynb?scriptVersionId=27995530
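For reference, a minimal fastai v1 sketch of the kind of setup those kernels describe: pretraining an AWD-LSTM LM on raw Wikipedia text with a SentencePiece vocab and saving the weights plus vocab for reuse. The paths, column name, vocab size, and hyperparameters are assumptions, not taken from the kernels.

```python
# Minimal sketch, assuming fastai v1 and a csv of raw Wikipedia text
# with a single 'text' column; all paths and sizes are placeholders.
from fastai.text import *
import pandas as pd
import pickle

path = Path('data/wiki_en')            # assumed data folder
df = pd.read_csv(path/'wiki.csv')      # assumed raw wiki text dump

# SPProcessor replaces the default spaCy tokenizer + numericalizer with a
# SentencePiece model trained on this corpus
data_lm = (TextList.from_df(df, path, cols='text',
                            processor=SPProcessor(max_vocab_sz=15000))
           .split_by_rand_pct(0.1)
           .label_for_lm()
           .databunch(bs=128))

learn = language_model_learner(data_lm, AWD_LSTM, pretrained=False, drop_mult=0.1)
learn.fit_one_cycle(10, 3e-3, moms=(0.8, 0.7))

# persist both the weights and the vocab: downstream fine-tuning/classification
# must reuse exactly this token-to-index mapping
learn.save('sp_wiki_lm')
learn.save_encoder('sp_wiki_enc')
pickle.dump(data_lm.vocab.itos, open(path/'models'/'sp_itos.pkl', 'wb'))
```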
Another question: there's no point in using the pretrained https://s3.amazonaws.com/fast-ai-modelzoo/wt103-fwd weights if I chose SentencePiece, right?
Correct. You need consistent indices and tokens for encoding (training) and decoding (inference).
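To make the "consistent indices and tokens" point concrete, here is a small fastai v1 sketch of the two pairings that stay consistent. `data_lm_spacy` and `data_lm_sp` are assumed databunches built with the default tokenizer and with SPProcessor respectively; the file names in the second pairing are placeholders.

```python
# Sketch, assuming fastai v1. The pretrained embedding rows are matched to your
# new vocab by token string, and wt103's word-level tokens barely overlap with
# SentencePiece subwords, so mixing them leaves you with mostly random embeddings.
from fastai.text import *

# consistent pairing (a): default spaCy tokenization + the wt103 weights fastai downloads
learn_a = language_model_learner(data_lm_spacy, AWD_LSTM, pretrained=True)

# consistent pairing (b): SentencePiece databunch + weights pretrained on the same
# SentencePiece vocab, passed as (weights, itos) file names under learn.path/models
learn_b = language_model_learner(data_lm_sp, AWD_LSTM,
                                 pretrained_fnames=('sp_wiki_lm', 'sp_itos'))
```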
Funny, but I got slightly worse results when I fine-tuned the pretrained spaCy-tokenized weights with SP and then trained a classifier: https://www.kaggle.com/manyregression/fastai-ulmfit-google-quest-classifier-spacy?scriptVersionId=27771121
Any ideas why the ULMFiT English regression model pretrained on 500kk wiki tokens failed, while the 100kk one just gave worse results?
Here's the 500kk version: https://www.kaggle.com/manyregression/fastai-ulmfit-google-quest-sp?scriptVersionId=28040078
For 100kk, the Spearman metric was 0.26 at best.
Hi, I built a Persian language model.
Here is the topic.
Hi, I'm interested in knowing about your work. I'm a PhD student at Tehran University.
Could someone guide me on how to implement MultiFiT for a new language (Persian)?
This is the notebook:
It reads a pretrained model for Japanese, but I guess there is no such model for Persian. Also, I don't know the format of the models. I found a pretrained model for Persian at the following link,
however, I don't know if that model fits the project above.
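For what it's worth, here is a hedged fastai v1 sketch of the format a ULMFiT/MultiFiT pretrained LM is usually distributed in and how to point a notebook at it for a new language. All file names and the `data_lm_fa` databunch are hypothetical; a downloaded Persian model only "fits" if it provides matching weights and vocab (and the SentencePiece model, if it was trained with one).

```python
# Sketch, assuming fastai v1; every file name below is a placeholder.
from fastai.text import *

path = Path('data/persian')
# expected layout under path/'models' (assumed):
#   fa_wt_lm.pth  - AWD-LSTM (or QRNN, for MultiFiT) weights
#   fa_itos.pkl   - pickled list mapping index -> token, in the same order
#                   the weights were trained with
learn = language_model_learner(data_lm_fa, AWD_LSTM,
                               pretrained_fnames=('fa_wt_lm', 'fa_itos'),
                               drop_mult=0.3)
learn.fit_one_cycle(1, 1e-2)  # fine-tune the LM on a Persian target corpus first
```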
I was so glad to have Ines and Matt presenting in person about the new features of spaCy v3.0. Highlights include the pipeline configuration system that stores all the settings and hyperparameters in one place, and integrations with other popular open-source tools (such as Weights & Biases and FastAPI). My favorite feature is the ability to define (i.e. hard-code) your own acronyms for specific domains or use cases. Enjoy!