Hi all,
Here at Esri, we’ve been using fast.ai to develop a library to train geospatial deep learning models.
We ran a few experiments and found that picking a learning rate roughly midway, or two-thirds of the way, along the section of the loss curve where the loss is decreasing worked best for us.
Here’s the code we used, in case others want to try it out:
import matplotlib.pyplot as plt


def find_lr(losses, lrs):
    # Drop the noisy readings at the start and end of the LR range test.
    losses_skipped = 5
    trailing_losses_skipped = 5
    losses = losses[losses_skipped:-trailing_losses_skipped]
    lrs = lrs[losses_skipped:-trailing_losses_skipped]

    n = len(losses)
    max_start = 0
    max_end = 0

    # Finding the longest valley: lds[i] is the length of the longest
    # strictly decreasing subsequence of losses ending at index i.
    lds = [1] * n
    for i in range(1, n):
        for j in range(0, i):
            if losses[i] < losses[j] and lds[i] < lds[j] + 1:
                lds[i] = lds[j] + 1
            if lds[max_end] < lds[i]:
                max_end = i
                max_start = max_end - lds[max_end]

    # Pick something midway, or 2/3rd of the way through the valley to be more aggressive.
    sections = (max_end - max_start) / 3
    final_index = max_start + int(sections) + int(sections / 2)

    # Plot the loss curve on a log LR axis and mark the suggested learning rate.
    fig, ax = plt.subplots(1, 1)
    ax.plot(lrs, losses)
    ax.set_ylabel("Loss")
    ax.set_xlabel("Learning Rate")
    ax.set_xscale('log')
    ax.xaxis.set_major_formatter(plt.FormatStrFormatter('%.0e'))
    ax.plot(lrs[final_index], losses[final_index],
            markersize=10, marker='o', color='red')
    plt.show()

    return lrs[final_index]
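If you want to try it against the output of fastai's LR range test, here's a rough sketch of how we call it. It assumes the recorder exposes the recorded learning rates and losses as learn.recorder.lrs and learn.recorder.losses; attribute names can differ between fastai versions, so treat it as an illustration rather than the exact API.

# Sketch only: assumes an existing fastai Learner `learn` whose recorder exposes
# `lrs` and `losses` after the LR range test (names may vary by fastai version).
learn.lr_find()

suggested_lr = find_lr(
    losses=[float(l) for l in learn.recorder.losses],  # per-step losses as plain floats
    lrs=list(learn.recorder.lrs),                      # learning rates tried at each step
)

learn.fit_one_cycle(10, suggested_lr)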
Let us know if you’d like to see this in a PR.