Lesson4-IMDB LanguageModelData.from_text_files

sam2 · March 13, 2018, 4:36pm

After generating the model data object md with:

md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)

I want to give myself and my computer a break.

Is there a way to save the md?
My TEXT object is saved as a pickle.
md appears to be a generator and hence can’t be pickled.
Can I build md from the pickled TEXT?
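
To illustrate the pickling issue above, here is a minimal sketch using only the standard library. The `build_vocab` and `batches` helpers are hypothetical stand-ins (they are not the torchtext or fastai objects): a plain vocabulary-like object round-trips through pickle, a generator does not, and the cheap batching step can be rebuilt from the restored object.

```python
import pickle


def build_vocab(tokens, min_freq=1):
    """Hypothetical stand-in for the expensive-to-build TEXT object."""
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    # Keep tokens seen at least min_freq times, in sorted order.
    itos = sorted(t for t, c in counts.items() if c >= min_freq)
    return {"itos": itos, "stoi": {t: i for i, t in enumerate(itos)}}


def batches(ids, bs):
    """Generator yielding consecutive chunks of ids, bs items at a time."""
    for start in range(0, len(ids), bs):
        yield ids[start:start + bs]


tokens = "the movie was good the movie was long".split()
vocab = build_vocab(tokens)

# A plain dict of lists and dicts round-trips through pickle.
restored = pickle.loads(pickle.dumps(vocab))

# A generator object does not: pickle raises TypeError.
ids = [vocab["stoi"][t] for t in tokens]
try:
    pickle.dumps(batches(ids, bs=4))
    generator_pickled = True
except TypeError:
    generator_pickled = False

# The batching step can be rebuilt from the restored object instead.
rebuilt_ids = [restored["stoi"][t] for t in tokens]
rebuilt_batches = list(batches(rebuilt_ids, bs=4))
```

Whether the same pattern applies to md depends on how the library builds it from TEXT, which this sketch does not cover.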

Kasianenko · August 7, 2018, 1:24pm

In the Train part of the notebook

github.com

fastai/fastai/blob/master/courses/dl1/lesson4-imdb.ipynb

you can create a learner:

learner = md.get_model(opt_fn, em_sz, nh, nl,
                       dropouti=0.05, dropout=0.05, wdrop=0.1, dropoute=0.02, dropouth=0.05)
learner.reg_fn = partial(seq2seq_reg, alpha=2, beta=1)
learner.clip=0.3

and save or load its encoder after training the model:

learner.save_encoder('adam1_enc')
learner.load_encoder('adam1_enc')

sam2 · August 7, 2018, 2:01pm

@Kasianenko
Igor, I wanted to save the model data object after spending a long time building the TEXT object.
I wanted to take a break before starting to train the model.
After training, I would save the model (the learner object) and load it as you suggest.

Kasianenko · August 7, 2018, 2:37pm

You can save the learner before you start training; that will save the initial model, before any training.

sam2 · August 7, 2018, 4:50pm

@Kasianenko
Thank you. I also noticed that the execution of line:

md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)

does not take as long as it used to, so my question about saving md is no longer critical for me. I am on PyTorch 0.4, which may be why the execution time is better.

Mariam · December 24, 2018, 5:15am

If I am starting a new session, how would I load the learner if md is not defined?

immarried · February 3, 2019, 8:29am

Did you find out how to do this?