Fastai_v1, adding features

Okay, thanks, I will try this out. I think for backward compatibility, one simple function can be added that does this. It will allow people to use their already built models in fastai v1.

Edit:
What about 1.decoder.weight and 1.decoder.bias? I think the old models do not have a decoder part. Are they saved separately?

Here is the function that does this. You might want to add this to the new library.

from collections import OrderedDict

import torch

def convert(path_to_old_model, path_to_save_converted_model):
    """Convert the weights of an old fastai language model stored at `path_to_old_model`
    to the fastai v1 naming scheme and save them under `path_to_save_converted_model`."""
    old_wgts = torch.load(path_to_old_model, map_location=lambda storage, loc: storage)
    new_wgts = OrderedDict()
    # embedding layers
    new_wgts['encoder.weight'] = old_wgts['0.encoder.weight']
    new_wgts['encoder_dp.emb.weight'] = old_wgts['0.encoder_with_dropout.embed.weight']
    # the three AWD-LSTM layers; the raw hh weight is copied into both the `_raw` slot
    # and the weight-dropped module slot, matching the old checkpoints
    for i in range(3):
        new_wgts[f'rnns.{i}.weight_hh_l0_raw'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.weight_ih_l0'] = old_wgts[f'0.rnns.{i}.module.weight_ih_l0']
        new_wgts[f'rnns.{i}.module.weight_hh_l0'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.bias_ih_l0'] = old_wgts[f'0.rnns.{i}.module.bias_ih_l0']
        new_wgts[f'rnns.{i}.module.bias_hh_l0'] = old_wgts[f'0.rnns.{i}.module.bias_hh_l0']

    torch.save(new_wgts, path_to_save_converted_model + 'converted_model.pth')
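
For example, with hypothetical paths (an old weights file and an output directory ending in a slash, since the function concatenates strings), a call would look like this:

convert('models/old_lm.h5', 'models/')   # writes 'models/converted_model.pth'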

I don’t know whether this has been suggested already, but it would be nice to have an option that, alongside finding the learning rate, also grid searches a few weight decays, very similar to the figure from Smith’s paper.

I made something quick for myself. It’s currently called wd_finder, and the plotting function plot_wd looks something like this.

If people are interested I can share the notebook and we can add it to the library.
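
A minimal sketch of the idea, assuming fastai v1’s Learner.lr_find and Recorder; the names wd_finder and plot_wd just mirror the post, this is not the actual notebook code:

import matplotlib.pyplot as plt

def wd_finder(learn, wds=(1e-6, 1e-4, 1e-2, 1e-1)):
    "Run the LR finder once per weight decay and collect the recorded curves."
    results = []
    for wd in wds:
        learn.wd = wd       # Learner.wd is the default weight decay used when fitting
        learn.lr_find()     # the finder restores the model weights when it finishes
        losses = [float(l) for l in learn.recorder.losses]
        results.append((wd, list(learn.recorder.lrs), losses))
    return results

def plot_wd(results):
    "Overlay the loss-vs-lr curves for each weight decay, as in Smith's figure."
    for wd, lrs, losses in results:
        plt.plot(lrs, losses, label=f'wd={wd}')
    plt.xscale('log'); plt.xlabel('learning rate'); plt.ylabel('loss'); plt.legend()
    plt.show()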


Do you have it in a public notebook somewhere? A GitHub repo?


I will clean up the code and share a GitHub repo =)


Here is a link to the repo… One could improve the progress bar… Hope it helps!

The function at the end creates the following graph (using the standard wd values or ones provided by the user).


I was looking into the codebase and noticed that the spaCy tokenizer is initialized like this: https://github.com/fastai/fastai/blob/68466b5268d5c7a6c42389798f9bb7daf0154139/fastai/text/transform.py#L23

In order to leverage alpha tokenization support for other languages as well (https://spacy.io/usage/models#alpha-support), what do you think about initializing as follows:

try:
  self.tok = spacy.load(lang)
except Exception:
  self.tok = spacy.blank(lang)

Good point. There’s actually no need to use spacy.load(lang) for tokenization, so we switched to spacy.blank(lang).
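
For anyone wondering what the change means in practice: spacy.blank only builds the language’s tokenizer, so it also works for alpha-support languages that have no downloadable pretrained model. A quick check:

import spacy

nlp = spacy.blank('xx')   # 'xx' is spaCy's multi-language class; no model download needed
doc = nlp('fastai makes NLP easier')
print([t.text for t in doc])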


This is a really good idea.


Minor feature request: consistent data load function behavior with regard to file names.

Small thing, but from_csv, from_df, and from_folder expect file paths to images that are relative to path, while from_name_re, from_name_func, and from_lists require absolute file paths, even though path is one of the arguments to all of the ImageDataBunch data loading functions. As a new user of the library I found this confusing, and I expect other new users will as well.
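
To illustrate (the dataset root and regex pattern below are hypothetical, not from the post):

from fastai.vision import ImageDataBunch, get_image_files

path = 'data/pets'   # hypothetical dataset root

# from_folder / from_csv / from_df resolve image files relative to `path`
data = ImageDataBunch.from_folder(path, train='train', valid='valid')

# from_name_re / from_name_func / from_lists need the file names to be full paths,
# even though `path` is passed as well
fnames = get_image_files(path + '/images')
data = ImageDataBunch.from_name_re(path, fnames, pat=r'/([^/]+)_\d+.jpg$')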

I’m using a DS with valid_pct, which uses random_split, but after reopening the notebook, np.random.uniform returns a different split, which spoils further training.

Could we add an additional seed argument to random_split, defaulting to len(arrs[0])?

def random_split(valid_pct:float, *arrs:NPArrayableList, seed:int=None)->SplitArrayList:
    "Randomly split `arrs` with `valid_pct` ratio. Good for creating a validation set."
    # A default of len(arrs[0]) can't be evaluated in the signature, so use None
    # as a sentinel and fall back to it here.
    if seed is None: seed = len(arrs[0])
    np.random.seed(seed)
    is_train = np.random.uniform(size=(len(arrs[0]),)) > valid_pct
    return arrays_split(is_train, *arrs)

And expose it via kwargs from the constructor.


No, we don’t want to remove randomness by default. Just set the seed in your notebook.
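
For example, pinning NumPy’s seed at the top of the notebook, before building the DataBunch, keeps the split stable across restarts:

import numpy as np

np.random.seed(42)   # random_split draws from np.random.uniform, so this fixes the split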

Hi,

Not a huge deal, but I keep having to pull up the docs to remind myself of the order of what is presented when I call plot_top_losses (is it actual/predicted or predicted/actual?). It might be more verbose than you would want, but I would change it so it appears like this:

Code change here.

Notebook where I played with this (under “Results” section) here.

Let me know if you would like me to open a PR. Otherwise, I’m sure I’ll remember the order eventually :slight_smile:


If you added a title containing the list in order, I’d merge that PR :slight_smile:
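
A minimal sketch of the kind of thing being discussed, assuming fastai v1’s interpretation API; the suptitle text here is just an illustration, not the actual change:

import matplotlib.pyplot as plt
from fastai.vision import ClassificationInterpretation

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(10, 10))
# add an overall title so the per-image ordering never has to be looked up again
plt.suptitle('predicted / actual / loss / probability')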

@jeremy can fastai add adversarial training? I know this is purely a research focus right now, but in production it would be better if the models were robust to noise as well.

I’d be happy to contribute or help out.
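
For context, a minimal sketch of what one adversarial-training step (FGSM, Goodfellow et al. 2015) looks like in plain PyTorch, independent of any fastai API:

import torch.nn.functional as F

def fgsm_training_step(model, xb, yb, opt, epsilon=0.03):
    "Train on inputs perturbed in the direction that increases the loss."
    xb = xb.clone().detach().requires_grad_(True)
    F.cross_entropy(model(xb), yb).backward()
    xb_adv = (xb + epsilon * xb.grad.sign()).detach()   # FGSM perturbation
    opt.zero_grad()
    loss = F.cross_entropy(model(xb_adv), yb)
    loss.backward()
    opt.step()
    return loss.item()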

Minor feature request: allow the use of a custom data loader, i.e. for image classification it would mean adding an optional dataloader parameter to the ImageDataBunch.create static method.

Not sure how relevant this is to the library’s evolution roadmap, but it would probably be interesting to have a facade (or maybe build this logic into ImageDataBunch) to convert a “plain” PyTorch dataset class/instance into a dataset supported by the library. Also, it would be nice to have a way to directly pass nn.Module subclasses into the learner. Then you could do something like:

class FancyCustomModel(nn.Module):
    # ... some stuff to create and set up the model

    def split(self):
        return (self[1], )

train_ds, valid_ds = MNIST(train=True), MNIST(train=False)
bunch = ImageDataBunch.create(train_ds, valid_ds)
learn = ClassificationLearner(bunch, FancyCustomModel())
learn.fit_one_cycle(1)

The main reason for this proposal is to make the PyTorch <-> fastai integration even more seamless than it already is.

Agree, I often follow this pattern using a “ModelModifier”:

class ModelModifier:
    def __init__(self, arch):
        self.arch = arch
    def __call__(self, pretrained):
        module = self.arch(pretrained)
        # do something with the model before passing it back to create_cnn
        return module

learn = create_cnn(data, ModelModifier(models.resnet50), metrics=error_rate)


Any PyTorch dataset should already work fine with the library. Did you find any problems?
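
For reference, a sketch of wrapping a plain torchvision dataset with fastai v1; the tiny model and transforms below are placeholders, not part of the library:

import torch.nn as nn
import torch.nn.functional as F
from fastai.basics import DataBunch, Learner
from torchvision import datasets, transforms

class Flatten(nn.Module):
    def forward(self, x): return x.view(x.size(0), -1)

tfm = transforms.ToTensor()
train_ds = datasets.MNIST('data', train=True, download=True, transform=tfm)
valid_ds = datasets.MNIST('data', train=False, download=True, transform=tfm)

data = DataBunch.create(train_ds, valid_ds, bs=64)
model = nn.Sequential(Flatten(), nn.Linear(28 * 28, 10))
learn = Learner(data, model, loss_func=F.cross_entropy)
learn.fit_one_cycle(1)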


I also played with your weight decay finder, but with my two datasets the results almost always overlaid each other.
What is your experience with it? Does this depend on the data/network, or did you encounter the same behavior for different datasets?

Thank you for sharing your notebook!

Kind regards
Michael