How does slice behave for float inputs?

From what I understood from reading the docs, slice behaves in two ways:

  • When you pass only one argument (like the learning rate lr in your example), the last layer group gets lr as its learning rate and all other groups get lr/10.
  • When you pass two arguments (like 1e-5, 5e-2), the first layer group gets 1e-5 as its learning rate and the last group gets 5e-2. The groups in between get learning rates spaced geometrically (evenly on a log scale) between those two values.
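To make the two cases concrete, here is a rough sketch in plain Python of how I understand the behaviour. It is not the library's actual code: sketch_lr_range and its n_groups parameter are names I made up for illustration, and it assumes at least two layer groups.

```python
def sketch_lr_range(lr, n_groups):
    """Return one learning rate per layer group for a slice `lr`.

    Hypothetical helper that mimics the behaviour described above;
    assumes n_groups >= 2.
    """
    if lr.start is None:
        # slice(lr): the last group gets lr, every earlier group gets lr / 10
        return [lr.stop / 10] * (n_groups - 1) + [lr.stop]
    # slice(start, stop): multiply by a constant factor from group to group,
    # so the rates are evenly spaced on a log scale
    step = (lr.stop / lr.start) ** (1 / (n_groups - 1))
    return [lr.start * step ** i for i in range(n_groups)]


# One argument: slice(1e-3) stores 1e-3 as `stop` and leaves `start` as None
print(sketch_lr_range(slice(1e-3), 3))

# Two arguments: first group 1e-5, last group 5e-2, geometric steps in between
print(sketch_lr_range(slice(1e-5, 5e-2), 3))
```

With three groups and slice(1e-5, 5e-2), the middle group lands on the geometric mean of the two endpoints (roughly 7.1e-4), not on their arithmetic average.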

You can look it up in the docs under discriminative layer training; scroll down a bit to lr_range.

Hope this helps :slight_smile:
