Okay thanks, I will try this out. I think for backward compatibility, one simple function could be added that does this. It would allow people to use their already built models in fastai v1.
Edit:
What about 1.decoder.weight and 1.decoder.bias? I think the old models do not have a decoder part. Are they saved separately?
Here is a function that does this. You might want to add it to the new library.
from collections import OrderedDict
import torch

def convert(path_to_old_model, path_to_save_converted_model):
    """
    path_to_old_model is the path to the old model, and
    path_to_save_converted_model is the directory where the converted model is saved
    """
    old_wgts = torch.load(path_to_old_model, map_location=lambda storage, loc: storage)
    new_wgts = OrderedDict()
    new_wgts['encoder.weight'] = old_wgts['0.encoder.weight']
    new_wgts['encoder_dp.emb.weight'] = old_wgts['0.encoder_with_dropout.embed.weight']
    for i in range(3):
        # Note: weight_hh_l0 is deliberately filled from the *_raw tensor;
        # with weight dropout the non-raw copy is recreated from it at runtime.
        new_wgts[f'rnns.{i}.weight_hh_l0_raw'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.weight_ih_l0'] = old_wgts[f'0.rnns.{i}.module.weight_ih_l0']
        new_wgts[f'rnns.{i}.module.weight_hh_l0'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.bias_ih_l0'] = old_wgts[f'0.rnns.{i}.module.bias_ih_l0']
        new_wgts[f'rnns.{i}.module.bias_hh_l0'] = old_wgts[f'0.rnns.{i}.module.bias_hh_l0']
    torch.save(new_wgts, path_to_save_converted_model + 'converted_model.pth')
I don’t know whether this was suggested or not, but it would be nice to have an option that, along with finding the learning rate, also grid-searches a few weight decays, very similar to the figure from Smith’s paper.
Minor feature request: consistent data load function behavior with regard to file names.
Small thing, but from_csv, from_df, and from_folder expect image file paths relative to path, while from_name_re, from_name_func, and from_lists require absolute file paths, even though path is an argument to all of the ImageDataBunch data loading functions. As a new user of the library I found this confusing, and I expect other new users may as well.
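One way to make the loaders consistent would be a tiny helper that treats a file name as relative to path unless it is already absolute. This is just a sketch of the idea; resolve_fname is a made-up name, not an existing fastai function:

```python
from pathlib import Path

def resolve_fname(path, fname):
    # Treat fname as relative to path unless it is already absolute,
    # so every loader can accept either form.
    fname = Path(fname)
    return fname if fname.is_absolute() else Path(path) / fname

print(resolve_fname('/data', 'train/img.jpg'))  # /data/train/img.jpg
print(resolve_fname('/data', '/abs/img.jpg'))   # /abs/img.jpg
```

Each loading function could call something like this on every name before opening the file.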
I’m using a dataset with valid_pct, which uses random_split, but after reopening the notebook random.uniform returns a different split, which spoils further training. Could we add an extra seed argument to random_split, defaulting to len(arrs[0])?
def random_split(valid_pct:float, *arrs:NPArrayableList, seed:int=None)->SplitArrayList:
    "Randomly split `arrs` with `valid_pct` ratio. Good for creating a validation set."
    # A default of len(arrs[0]) can't be written in the signature, so fall back to it here.
    np.random.seed(seed if seed is not None else len(arrs[0]))
    is_train = np.random.uniform(size=(len(arrs[0]),)) > valid_pct
    return arrays_split(is_train, *arrs)
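To illustrate why a deterministic default seed fixes the notebook-reopening problem, here is a standalone numpy sketch (not the fastai implementation) of just the masking step, seeded by the dataset length:

```python
import numpy as np

def split_mask(n, valid_pct, seed=None):
    # Hypothetical stand-in for random_split's masking step: the seed
    # defaults to the dataset length, so rerunning gives the same split.
    np.random.seed(seed if seed is not None else n)
    return np.random.uniform(size=(n,)) > valid_pct

m1 = split_mask(100, 0.2)
m2 = split_mask(100, 0.2)  # e.g. after reopening the notebook
print((m1 == m2).all())    # True: same default seed, same validation set
```

Passing an explicit seed still lets users vary the split deliberately.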
Not a huge deal, but I keep having to pull up the docs to remind myself of the order in which things are presented when I call plot_top_losses (is it actual/predicted or predicted/actual?). It might be more verbose than you would want, but I would change it so it appears like this:
@jeremy can fastai add adversarial training? I know this is purely a research focus right now, but in production it would be better if models were robust to noise as well.
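For context on what such a feature would involve, here is a toy numpy sketch of the FGSM perturbation at the core of many adversarial training schemes (Goodfellow et al.): nudge the input along the sign of the loss gradient, then also train on the perturbed example. The logistic-regression setup below is purely illustrative, not fastai code:

```python
import numpy as np

# Toy logistic-regression model: fixed weights, one input, target label 1.
w, b = np.array([2.0, -1.0]), 0.5
x, y = np.array([0.3, 0.7]), 1.0

def loss_grad_x(x):
    p = 1 / (1 + np.exp(-(w @ x + b)))  # sigmoid prediction
    return (p - y) * w                   # gradient of BCE loss w.r.t. x

eps = 0.1
x_adv = x + eps * np.sign(loss_grad_x(x))  # FGSM adversarial example
print(x_adv)  # [0.2 0.8]
# Adversarial training would then also fit on (x_adv, y).
```

A fastai hook could generate x_adv per batch inside the training loop via a callback.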
Minor feature request: allow the use of a custom data loader. For image classification, this would mean adding an optional dataloader parameter to the ImageDataBunch.create static method.
Not sure how relevant it is to the library’s evolution roadmap, but it would probably be interesting to have a facade (or maybe to build this logic into ImageDataBunch) that converts a “plain” pytorch dataset class/instance into a dataset supported by the library. It would also be nice to have a way to pass nn.Module subclasses directly into the learner. Then you could do something like:
class FancyCustomModel(nn.Module):
    # ... some stuff to create and set up the model
    def split(self):
        return (self[1],)

train_ds, valid_ds = MNIST(train=True), MNIST(train=False)
bunch = ImageDataBunch.create(train_ds, valid_ds)
learn = ClassificationLearner(bunch, FancyCustomModel())
learn.fit_one_cycle(1)
The main reason for this proposal is to make the pytorch <-> fastai integration even more seamless than it already is.
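A minimal sketch of what such a facade could look like, assuming the library only needs __len__/__getitem__ plus class metadata from the dataset (DatasetFacade and its attributes are hypothetical names, not an existing fastai API):

```python
class DatasetFacade:
    # Wraps any pytorch-style dataset (anything indexable with a length)
    # and adds the class metadata the library-side code would expect.
    def __init__(self, ds, classes):
        self.ds, self.classes = ds, classes
        self.c = len(classes)  # number of target classes

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        return self.ds[i]

pairs = [([0.1], 0), ([0.2], 1)]  # stands in for MNIST(train=...)
wrapped = DatasetFacade(pairs, classes=['zero', 'one'])
print(len(wrapped), wrapped.c)  # 2 2
```

ImageDataBunch.create could apply such a wrapper automatically when it receives a dataset without the expected attributes.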
Agreed, I often follow this pattern using a “ModelModifier”:

learn = create_cnn(data, ModelModifier(models.resnet50), metrics=error_rate)

class ModelModifier:
    def __init__(self, arch):
        self.arch = arch

    def __call__(self, pretrained):
        module = self.arch(pretrained)
        # do something with the model before passing it back to create_cnn
        return module
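For what it’s worth, the same pattern can also be written as a closure instead of a class. This sketch (model_modifier is a made-up name) behaves the same when passed anywhere an arch callable taking a pretrained flag is expected:

```python
def model_modifier(arch):
    # Returns a callable with the same (pretrained) signature as arch,
    # tweaking the module before handing it back.
    def make(pretrained):
        module = arch(pretrained)
        # do something with the model here before returning it
        return module
    return make

def dummy_arch(pretrained):  # stand-in for models.resnet50
    return {'pretrained': pretrained}

modified = model_modifier(dummy_arch)
print(modified(True))  # {'pretrained': True}
```

Whether the class or the closure reads better is mostly a matter of taste; the class makes the stored arch explicit.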
I also played with your weight decay finder, but with my two datasets the results almost always overlaid each other.
What is your experience with it? Does this depend on the data/network, or did you encounter the same behavior across different datasets?