Is drop_mult the same as Dropout

I have used the language model learner and there is a parameter named drop_mult, with a default value of 1. I wanted to know whether this is actually dropout or something different. To my knowledge, dropout is always less than 1, so any clarity would be greatly appreciated. Thanks.

As far as I understand, the different archs like LSTM have multiple dropout probabilities for different things. Once those are set, drop_mult scales all of them, so you can change all the dropout probabilities simultaneously while keeping their relative sizes.
E.g. the defaults for LSTM are
hidden_p:float=0.2, input_p:float=0.6, embed_p:float=0.1, weight_p:float=0.5
so using drop_mult=1.5 will effectively set those to
hidden_p:float=0.3, input_p:float=0.9, embed_p:float=0.15, weight_p:float=0.75
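As a quick sanity check, here's a minimal sketch (plain Python, using the LSTM default values quoted above) of what that scaling amounts to:

```python
# Default LSTM dropout probabilities (the values quoted above)
config = {'hidden_p': 0.2, 'input_p': 0.6, 'embed_p': 0.1, 'weight_p': 0.5}
drop_mult = 1.5

# Multiply every dropout probability by drop_mult; relative sizes are preserved
scaled = {k: v * drop_mult for k, v in config.items()}
# scaled['hidden_p'] is now 0.3, scaled['input_p'] is 0.9, etc.
```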

Ok, I get it now. Is there any documentation that I could refer to? The fastai docs do not explain this in great detail.


It only says: drop_mult is applied to all the dropout weights of the config

At this point I can just say: don’t hesitate to also take a look at the code. That’s what I did when I saw your question. I have no idea about NLP :smiley: and have never used this API, but it took me less than a minute to figure out…

When you click the [source] link next to the documentation you get directly linked to this part of the source code:

Source code:
def get_language_model(arch:Callable, vocab_sz:int, config:dict=None, drop_mult:float=1.):
    "Create a language model from `arch` and its `config`, maybe `pretrained`."
    meta = _model_meta[arch]
    config = ifnone(config, meta['config_lm'].copy())

    # SEE HERE
    for k in config.keys(): 
        if k.endswith('_p'): config[k] *= drop_mult    # HERE

    tie_weights,output_p,out_bias = map(config.pop, ['tie_weights', 'output_p', 'out_bias'])
    init = config.pop('init') if 'init' in config else None
    encoder = arch(vocab_sz, **config)
    enc = encoder.encoder if tie_weights else None
    decoder = LinearDecoder(vocab_sz, config[meta['hid_name']], output_p, tie_encoder=enc, bias=out_bias)
    model = SequentialRNN(encoder, decoder)
    return model if init is None else model.apply(init)

def language_model_learner(data:DataBunch, arch, config:dict=None, drop_mult:float=1., pretrained:bool=True,
                           pretrained_fnames:OptStrTuple=None, **learn_kwargs) -> 'LanguageLearner':
    "Create a `Learner` with a language model from `data` and `arch`."

    # SEE HERE
    model = get_language_model(arch, len(data.vocab.itos), config=config, drop_mult=drop_mult) # HERE

    meta = _model_meta[arch]
    learn = LanguageLearner(data, model, split_func=meta['split_lm'], **learn_kwargs)
    if pretrained:
        if 'url' not in meta: 
            warn("There are no pretrained weights for that architecture yet!")
            return learn
        model_path = untar_data(meta['url'], data=False)
        fnames = [list(model_path.glob(f'*.{ext}'))[0] for ext in ['pth', 'pkl']]
    if pretrained_fnames is not None:
        fnames = [learn.path/learn.model_dir/f'{fn}.{ext}' for fn,ext in zip(pretrained_fnames, ['pth', 'pkl'])]
    return learn

Just by doing Ctrl+F on the GitHub page and searching for drop_mult, you can see it’s passed to the above function, where it is used in only one line.
Don’t assume the fastai code is too complicated to look inside. Usually it’s pretty simple :wink:


Oh, I will make it a point to explore the underlying code. Thanks a lot.


For those with the same doubt, here’s the line of code that uses drop_mult:

if k.endswith('_p'): config[k] *= drop_mult
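To see that line in isolation, here’s a toy sketch (the config keys are made up for illustration, not the full fastai config) showing that only keys ending in `_p` get scaled, while everything else in the config is left alone:

```python
# Toy config: two dropout probabilities plus two non-dropout settings
config = {'embed_p': 0.1, 'output_p': 0.4, 'n_layers': 3, 'tie_weights': True}
drop_mult = 0.5

# Same logic as in get_language_model: scale only the dropout probabilities
for k in config.keys():
    if k.endswith('_p'): config[k] *= drop_mult

# embed_p and output_p are halved; n_layers and tie_weights are untouched
```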