wgpubs
(WG)
November 9, 2017, 10:07pm
1
I was looking in the lesson1-vgg notebook and I saw learn.fit(lr, 3, cycle_len=3)
code towards the end, but I don’t know what was driving the decision?
As we train, are there some general guidelines we should follow re: the cycle_len and cycle_mult parameters, based on the scores we are seeing (e.g., the train/val loss and val accuracy)?
1 Like
jeremy
(Jeremy Howard)
November 10, 2017, 1:09am
2
We’ll try and develop these guidelines as we go thru the lessons, although I think there’s a certain amount of experimentation that’s still required…
1 Like
Haven't looked at the lesson1-vgg notebook, but I played around with cycle_len and cycle_mult a little bit.
Sometimes when I used
learn.fit(lr, 3, cycle_len=3)
I found that the training loss was still much higher than the validation loss.
Therefore I could use learn.fit(lr, cycle_len=1, cycle_mult=2) and still not overfit.
But sometimes using cycle_mult resulted in overfitting, i.e. the training loss becoming much lower than the validation loss, and I couldn't use it.
I always started without cycle_mult and, based on the results, decided whether I needed to use it.
Looking forward to the guidelines…
wgpubs
(WG)
November 10, 2017, 1:24am
4
It seems to me that using learn.fit(lr, 3, cycle_len=3)
is useful after you have a pretty good model, as the decrease in learning rate(s) is applied much more slowly: over the course of 3 epochs instead of just 1 in this example.
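To make the "applied much more slowly" point concrete, here is a minimal sketch of a cosine-annealed learning rate within one cycle. It assumes the SGDR-style cosine shape that decays from the starting rate towards zero; the function name and the batches_per_epoch value are hypothetical, and the library's own schedule may differ in detail.

```python
import math

def cosine_annealed_lr(lr_max, iteration, iters_per_cycle):
    """Learning rate at a 0-based mini-batch `iteration` within one cycle.

    Assumes a cosine decay from lr_max towards 0 over `iters_per_cycle` batches.
    """
    # The cosine term goes from 1 to -1 across the cycle, so the factor goes from 1 to 0.
    factor = (1 + math.cos(math.pi * iteration / iters_per_cycle)) / 2
    return lr_max * factor

batches_per_epoch = 100  # hypothetical value, for illustration only
lr = 0.01

for cycle_len in (1, 3):
    iters = cycle_len * batches_per_epoch
    # Learning rate after the same amount of training (half an epoch) in each setup
    print(cycle_len, cosine_annealed_lr(lr, batches_per_epoch // 2, iters))
```

Half an epoch in, the cycle_len=1 schedule has already dropped to half the starting rate, while the cycle_len=3 schedule is still close to it.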
KevinB
(Kevin Bird)
November 24, 2017, 9:43pm
5
I’m working on figuring out exactly what these arguments do. I thought cycle_mult would just change how many epochs are run so my guess was that it would be ((1 epoch * 1 cycle_length) * 2 cycle_mult), but that doesn’t appear to be how it works. I am currently doing a learner.fit(3e-3, 20, wds=1e-6, cycle_len=1, cycle_mult=2)
which I thought would be 40 total epochs, so I was letting it run overnight. It is still training, currently at epoch 441. Does anybody have insight into the cycle_mult parameter?
wgpubs
(WG)
November 24, 2017, 9:47pm
6
cycle_mult=2
makes each cycle twice as long (in epochs) as the previous one.
So … you have 20 cycles, with lengths:
1
1*2 = 2
2*2 = 4
4*2 = 8
8*2 = 16
…
The cycle_mult
parameter is saying, “Do the SGDR over a much longer set of examples as you go from cycle to cycle”
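The arithmetic can be sketched in a few lines of Python. This is only an illustration of the doubling described here, not the library's own code; cycle_lengths is a hypothetical helper name.

```python
def cycle_lengths(n_cycle, cycle_len=1, cycle_mult=1):
    """Epochs in each cycle: cycle_len, cycle_len*cycle_mult, cycle_len*cycle_mult**2, ..."""
    return [cycle_len * cycle_mult ** i for i in range(n_cycle)]

# Five cycles, starting at 1 epoch and doubling each time
lengths = cycle_lengths(5, cycle_len=1, cycle_mult=2)
print(lengths)       # [1, 2, 4, 8, 16]
print(sum(lengths))  # 31

# Without cycle_mult, 3 cycles of 3 epochs each
print(sum(cycle_lengths(3, cycle_len=3)))  # 9

# 20 cycles with doubling: 2**20 - 1 epochs in total
print(sum(cycle_lengths(20, cycle_len=1, cycle_mult=2)))  # 1048575
```

The total grows geometrically with the number of cycles, so a small increase in the cycle count roughly doubles the training time.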
5 Likes
wgpubs
(WG)
November 24, 2017, 9:50pm
7
FYI: The first number after the learning rate is the number of cycles … not the number of epochs.
1 Like
KevinB
(Kevin Bird)
November 24, 2017, 10:15pm
9
So my current run is going to take 1048575 total epochs (2^20 - 1). I may need to rethink my strategy… thanks for the explanation.
jeremy
(Jeremy Howard)
November 24, 2017, 10:16pm
10
You may need to buy more GPUs…
2 Likes
wgpubs
(WG)
November 24, 2017, 10:22pm
11
… or maybe all the GPUs!
@KevinB, I’m interested to hear what your loss and accuracy look like at epoch 441, though. What have you noticed in terms of things getting better or not as this has progressed?
KevinB
(Kevin Bird)
November 24, 2017, 10:27pm
12
This is for the predict the happiness competition, but here is what it looks like (columns: epoch, training loss, validation loss). 3.18 is the best for the parameters I’m using, at least:
[ 0. 4.46455 4.40694]
[ 1. 3.78111 3.68151]
[ 2. 3.64669 3.57838]
[ 3. 3.59818 3.50164]
[ 4. 3.47426 3.41417]
[ 5. 3.40503 3.36634]
[ 6. 3.37427 3.35349]
[ 7. 3.46382 3.39218]
[ 8. 3.43258 3.36339]
[ 9. 3.37776 3.33089]
[ 10. 3.33042 3.30413]
[ 11. 3.28036 3.28624]
[ 12. 3.26289 3.26632]
[ 13. 3.23597 3.25855]
[ 14. 3.24087 3.26026]
[ 15. 3.38016 3.32014]
[ 16. 3.35569 3.31741]
[ 17. 3.34304 3.30325]
[ 18. 3.31614 3.29608]
[ 19. 3.30345 3.28457]
[ 20. 3.27406 3.274 ]
[ 21. 3.24832 3.26164]
[ 22. 3.2306 3.24961]
[ 23. 3.2108 3.23925]
[ 24. 3.18269 3.23447]
[ 25. 3.18531 3.22781]
[ 26. 3.16189 3.22197]
[ 27. 3.14961 3.21416]
[ 28. 3.12464 3.21537]
[ 29. 3.13175 3.21721]
[ 30. 3.11997 3.21413]
[ 31. 3.29961 3.27805]
[ 32. 3.30252 3.27648]
[ 33. 3.28079 3.26912]
[ 34. 3.2694 3.2659]
[ 35. 3.25986 3.26362]
[ 36. 3.25798 3.25808]
[ 37. 3.25004 3.25705]
[ 38. 3.24428 3.25316]
[ 39. 3.22762 3.24886]
[ 40. 3.22248 3.24073]
[ 41. 3.20404 3.2386 ]
[ 42. 3.20709 3.23556]
[ 43. 3.1896 3.22998]
[ 44. 3.17944 3.22585]
[ 45. 3.16321 3.22228]
[ 46. 3.16504 3.21484]
[ 47. 3.14242 3.21536]
[ 48. 3.14741 3.2123 ]
[ 49. 3.12625 3.20362]
[ 50. 3.10344 3.20547]
[ 51. 3.09274 3.2037 ]
[ 52. 3.09811 3.20076]
[ 53. 3.07678 3.19759]
[ 54. 3.0647 3.19866]
[ 55. 3.06034 3.19094]
[ 56. 3.04002 3.19406]
[ 57. 3.05049 3.19022]
[ 58. 3.05384 3.18993]
[ 59. 3.0535 3.19103]
[ 60. 3.04397 3.19231]
[ 61. 3.04068 3.19205]
[ 62. 3.03304 3.19169]
[ 63. 3.23848 3.25088]
[ 64. 3.22621 3.24744]
[ 65. 3.22434 3.24304]
[ 66. 3.22906 3.24308]
[ 67. 3.24857 3.24918]
[ 68. 3.21878 3.24352]
[ 69. 3.23039 3.23571]
[ 70. 3.22374 3.23938]
[ 71. 3.20965 3.23735]
[ 72. 3.21444 3.23378]
[ 73. 3.20674 3.23376]
[ 74. 3.2078 3.23165]
[ 75. 3.1987 3.22595]
[ 76. 3.19938 3.23507]
[ 77. 3.18689 3.22284]
[ 78. 3.18084 3.22244]
[ 79. 3.18653 3.22262]
[ 80. 3.17729 3.21731]
[ 81. 3.16459 3.21774]
[ 82. 3.17705 3.21972]
[ 83. 3.16235 3.21643]
[ 84. 3.16654 3.21328]
[ 85. 3.15455 3.20769]
[ 86. 3.15155 3.20684]
[ 87. 3.12451 3.20813]
[ 88. 3.14947 3.20421]
[ 89. 3.12834 3.20194]
[ 90. 3.12413 3.20637]
[ 91. 3.11801 3.19935]
[ 92. 3.12102 3.19921]
[ 93. 3.11484 3.19797]
[ 94. 3.10516 3.19669]
[ 95. 3.09042 3.19513]
[ 96. 3.09301 3.19484]
[ 97. 3.09507 3.19433]
[ 98. 3.05678 3.19744]
[ 99. 3.08352 3.19124]
[ 100. 3.08538 3.18486]
[ 101. 3.06861 3.18697]
[ 102. 3.04848 3.18873]
[ 103. 3.05528 3.18864]
[ 104. 3.03432 3.19123]
[ 105. 3.04564 3.18712]
[ 106. 3.0321 3.18311]
[ 107. 3.04318 3.18195]
[ 108. 3.00695 3.18685]
[ 109. 3.02886 3.18409]
[ 110. 3.0148 3.18659]
[ 111. 3.01486 3.18448]
[ 112. 2.99889 3.18207]
[ 113. 2.9952 3.18396]
[ 114. 2.98288 3.18145]
[ 115. 2.98388 3.18279]
[ 116. 2.99322 3.18393]
[ 117. 2.96559 3.18501]
[ 118. 2.99808 3.18357]
[ 119. 2.98889 3.18103]
[ 120. 2.99486 3.18449]
[ 121. 3.01123 3.18102]
[ 122. 2.97897 3.18479]
[ 123. 2.99516 3.18105]
[ 124. 3.00145 3.18284]
[ 125. 2.9775 3.18499]
[ 126. 2.96427 3.18408]
[ 127. 3.20856 3.22821]
[ 128. 3.19646 3.22887]
[ 129. 3.19737 3.2281 ]
[ 130. 3.19659 3.23014]
[ 131. 3.18768 3.22808]
[ 132. 3.18439 3.22152]
[ 133. 3.19678 3.22389]
[ 134. 3.17703 3.22035]
[ 135. 3.183 3.21862]
[ 136. 3.192 3.22406]
[ 137. 3.17358 3.21746]
[ 138. 3.17765 3.21561]
[ 139. 3.17347 3.21992]
[ 140. 3.16725 3.21501]
[ 141. 3.17361 3.22105]
[ 142. 3.18161 3.21504]
[ 143. 3.17081 3.2157 ]
[ 144. 3.16785 3.21799]
[ 145. 3.18129 3.21546]
[ 146. 3.17449 3.22056]
[ 147. 3.18341 3.22368]
[ 148. 3.1668 3.21352]
[ 149. 3.16325 3.21293]
[ 150. 3.15678 3.21001]
[ 151. 3.15519 3.20601]
[ 152. 3.15513 3.20976]
[ 153. 3.15034 3.20938]
[ 154. 3.16605 3.20754]
[ 155. 3.16452 3.20608]
[ 156. 3.14818 3.20636]
[ 157. 3.14552 3.20417]
[ 158. 3.14942 3.20291]
[ 159. 3.1508 3.2004]
[ 160. 3.14307 3.20419]
[ 161. 3.14126 3.20458]
[ 162. 3.12931 3.19923]
[ 163. 3.13057 3.20211]
[ 164. 3.12807 3.19782]
[ 165. 3.13312 3.19905]
[ 166. 3.122 3.19798]
[ 167. 3.12411 3.20108]
[ 168. 3.12643 3.19731]
[ 169. 3.11415 3.19349]
[ 170. 3.1147 3.20588]
[ 171. 3.11652 3.19945]
[ 172. 3.12725 3.19218]
[ 173. 3.1065 3.19732]
[ 174. 3.10854 3.19247]
[ 175. 3.09941 3.1938 ]
[ 176. 3.10473 3.19225]
[ 177. 3.09112 3.19396]
[ 178. 3.10195 3.19826]
[ 179. 3.08704 3.19312]
[ 180. 3.09166 3.1934 ]
[ 181. 3.10786 3.19031]
[ 182. 3.08331 3.19182]
[ 183. 3.09221 3.18711]
[ 184. 3.08179 3.18716]
[ 185. 3.08024 3.18457]
[ 186. 3.07388 3.18814]
[ 187. 3.06453 3.18827]
[ 188. 3.06165 3.18272]
[ 189. 3.06251 3.18189]
[ 190. 3.04889 3.18953]
[ 191. 3.0525 3.18797]
[ 192. 3.04449 3.1874 ]
[ 193. 3.04376 3.18817]
[ 194. 3.06242 3.18087]
[ 195. 3.0322 3.18314]
[ 196. 3.03113 3.1842 ]
[ 197. 3.03513 3.18485]
[ 198. 3.04283 3.17946]
[ 199. 3.02527 3.18595]
[ 200. 3.02092 3.18397]
[ 201. 3.03514 3.1869 ]
[ 202. 3.02593 3.18493]
[ 203. 3.01437 3.18325]
[ 204. 3.01286 3.18669]
[ 205. 2.99889 3.18552]
[ 206. 3.01792 3.18054]
[ 207. 2.9965 3.18535]
[ 208. 3.01588 3.17894]
[ 209. 3.00635 3.18027]
[ 210. 2.99369 3.18455]
[ 211. 3.00367 3.18162]
[ 212. 2.97965 3.18553]
[ 213. 2.98531 3.18345]
[ 214. 2.98842 3.18356]
[ 215. 2.98086 3.1793 ]
[ 216. 2.98962 3.17905]
[ 217. 2.98082 3.17802]
[ 218. 2.97403 3.18094]
[ 219. 2.9635 3.18309]
[ 220. 2.97349 3.1811 ]
[ 221. 2.95721 3.18013]
[ 222. 2.95493 3.18436]
[ 223. 2.96337 3.18425]
[ 224. 2.97471 3.17979]
[ 225. 2.94692 3.1816 ]
[ 226. 2.94737 3.18807]
[ 227. 2.94457 3.18876]
[ 228. 2.95111 3.18588]
[ 229. 2.95226 3.1825 ]
[ 230. 2.94176 3.18738]
[ 231. 2.94058 3.18557]
[ 232. 2.92664 3.18686]
[ 233. 2.94096 3.18446]
[ 234. 2.92679 3.18767]
[ 235. 2.93604 3.1893 ]
[ 236. 2.92983 3.18651]
[ 237. 2.93712 3.18657]
[ 238. 2.92763 3.18536]
[ 239. 2.91183 3.18708]
[ 240. 2.92262 3.18508]
[ 241. 2.93369 3.18783]
[ 242. 2.91738 3.18658]
[ 243. 2.91192 3.18659]
[ 244. 2.90401 3.18836]
[ 245. 2.90573 3.19187]
[ 246. 2.91176 3.18448]
[ 247. 2.89789 3.18768]
[ 248. 2.92199 3.18865]
[ 249. 2.95667 3.18836]
[ 250. 2.91485 3.18726]
[ 251. 2.91499 3.18881]
[ 252. 2.90999 3.18696]
[ 253. 2.95363 3.18801]
[ 254. 2.91879 3.19104]
[ 255. 3.18585 3.21711]
[ 256. 3.15349 3.21782]
[ 257. 3.17448 3.21382]
[ 258. 3.15787 3.21656]
[ 259. 3.16026 3.21496]
[ 260. 3.15147 3.21426]
[ 261. 3.17213 3.21468]
[ 262. 3.17185 3.21178]
[ 263. 3.15907 3.20836]
[ 264. 3.15182 3.20767]
[ 265. 3.1555 3.20518]
[ 266. 3.15253 3.20411]
[ 267. 3.1485 3.20625]
[ 268. 3.1689 3.20582]
[ 269. 3.14913 3.22122]
[ 270. 3.16579 3.2045 ]
[ 271. 3.14788 3.20755]
[ 272. 3.15643 3.20427]
[ 273. 3.15718 3.20722]
[ 274. 3.14071 3.20992]
[ 275. 3.16481 3.20389]
[ 276. 3.15759 3.20123]
[ 277. 3.15416 3.20439]
[ 278. 3.14959 3.20693]
[ 279. 3.14725 3.20563]
[ 280. 3.15985 3.20083]
[ 281. 3.14692 3.20297]
[ 282. 3.15456 3.20174]
[ 283. 3.15595 3.19926]
[ 284. 3.14787 3.20199]
[ 285. 3.14507 3.20397]
[ 286. 3.14591 3.19952]
[ 287. 3.15124 3.20179]
[ 288. 3.15774 3.19583]
[ 289. 3.1451 3.19506]
[ 290. 3.13534 3.19637]
[ 291. 3.13396 3.20323]
[ 292. 3.15714 3.19812]
[ 293. 3.13982 3.19932]
[ 294. 3.13436 3.19839]
[ 295. 3.13446 3.19633]
[ 296. 3.13403 3.19485]
[ 297. 3.13775 3.20192]
[ 298. 3.13662 3.194 ]
[ 299. 3.13606 3.19968]
[ 300. 3.13387 3.19622]
[ 301. 3.13052 3.19662]
[ 302. 3.13198 3.2016 ]
[ 303. 3.12945 3.19628]
[ 304. 3.13171 3.19744]
[ 305. 3.12496 3.19912]
[ 306. 3.14476 3.19347]
[ 307. 3.13251 3.19482]
[ 308. 3.12623 3.19412]
[ 309. 3.12149 3.19529]
[ 310. 3.1284 3.19336]
[ 311. 3.12367 3.19773]
[ 312. 3.12846 3.19223]
[ 313. 3.11204 3.19342]
[ 314. 3.1422 3.19149]
[ 315. 3.11776 3.19178]
[ 316. 3.12142 3.19607]
[ 317. 3.11681 3.18779]
[ 318. 3.12573 3.19127]
[ 319. 3.10654 3.19032]
[ 320. 3.11076 3.19401]
[ 321. 3.11436 3.19012]
[ 322. 3.11232 3.19633]
[ 323. 3.10879 3.19501]
[ 324. 3.11051 3.18754]
[ 325. 3.10838 3.18807]
[ 326. 3.11247 3.18908]
[ 327. 3.10665 3.18599]
[ 328. 3.10373 3.18751]
[ 329. 3.09219 3.18933]
[ 330. 3.10141 3.1876 ]
[ 331. 3.10448 3.19008]
[ 332. 3.10334 3.18564]
[ 333. 3.0944 3.18539]
[ 334. 3.11026 3.18411]
[ 335. 3.09931 3.18543]
[ 336. 3.0942 3.18723]
[ 337. 3.08586 3.18744]
[ 338. 3.09629 3.18583]
[ 339. 3.09604 3.18758]
[ 340. 3.09592 3.18723]
[ 341. 3.09072 3.18506]
[ 342. 3.09361 3.18747]
[ 343. 3.08194 3.18066]
[ 344. 3.08133 3.18055]
[ 345. 3.07916 3.1856 ]
[ 346. 3.09275 3.18677]
[ 347. 3.08215 3.18291]
[ 348. 3.07476 3.18514]
[ 349. 3.0911 3.18281]
[ 350. 3.07228 3.18434]
[ 351. 3.06323 3.1856 ]
[ 352. 3.07143 3.18012]
[ 353. 3.06727 3.18156]
[ 354. 3.05717 3.18419]
[ 355. 3.07105 3.18394]
[ 356. 3.06096 3.18657]
[ 357. 3.0556 3.18497]
[ 358. 3.07169 3.18537]
[ 359. 3.06159 3.18307]
[ 360. 3.06539 3.18341]
[ 361. 3.05943 3.1866 ]
[ 362. 3.05638 3.18451]
[ 363. 3.04482 3.18558]
[ 364. 3.05643 3.1812 ]
[ 365. 3.04275 3.18405]
[ 366. 3.05358 3.18319]
[ 367. 3.04215 3.17748]
[ 368. 3.06701 3.17835]
[ 369. 3.03953 3.1828 ]
[ 370. 3.03784 3.18116]
[ 371. 3.04168 3.1819 ]
[ 372. 3.04285 3.17969]
[ 373. 3.03461 3.18588]
[ 374. 3.02604 3.18361]
[ 375. 3.03675 3.18021]
[ 376. 3.02754 3.18702]
[ 377. 3.03286 3.1842 ]
[ 378. 3.02803 3.18226]
[ 379. 3.02469 3.17857]
[ 380. 3.02006 3.17989]
[ 381. 3.03287 3.18093]
[ 382. 3.03386 3.17457]
[ 383. 3.01906 3.18206]
[ 384. 3.02523 3.18208]
[ 385. 3.0249 3.17841]
[ 386. 3.01594 3.18479]
[ 387. 3.00656 3.18223]
[ 388. 3.03734 3.17651]
[ 389. 3.01972 3.18381]
[ 390. 2.99733 3.18375]
[ 391. 3.00744 3.17919]
[ 392. 3.01168 3.17915]
[ 393. 3.00454 3.18034]
[ 394. 2.99668 3.18494]
[ 395. 3.01433 3.17575]
[ 396. 3.00007 3.17726]
[ 397. 2.99665 3.17951]
[ 398. 3.00307 3.18112]
[ 399. 2.98843 3.18654]
[ 400. 2.99738 3.18487]
[ 401. 2.99188 3.18256]
[ 402. 2.99771 3.17806]
[ 403. 2.9969 3.1752]
[ 404. 2.9822 3.18257]
[ 405. 2.99659 3.17716]
[ 406. 2.98294 3.18228]
[ 407. 2.99136 3.18174]
[ 408. 3.00483 3.17646]
[ 409. 2.98342 3.18306]
[ 410. 2.96817 3.18673]
[ 411. 2.9779 3.18051]
[ 412. 2.97003 3.18111]
[ 413. 2.97236 3.1865 ]
[ 414. 2.97085 3.17836]
[ 415. 2.99252 3.17634]
[ 416. 2.96375 3.18693]
[ 417. 2.95817 3.18452]
[ 418. 2.96447 3.18923]
[ 419. 2.9524 3.18971]
[ 420. 2.95469 3.18977]
[ 421. 2.95515 3.18554]
[ 422. 2.95378 3.18953]
[ 423. 2.95134 3.18415]
[ 424. 2.96292 3.18495]
[ 425. 2.95274 3.18735]
[ 426. 2.94469 3.18708]
[ 427. 2.95185 3.18246]
[ 428. 2.95539 3.18414]
[ 429. 2.9421 3.18857]
[ 430. 2.94309 3.18717]
[ 431. 2.96466 3.18244]
[ 432. 2.93257 3.18498]
[ 433. 2.92841 3.19228]
[ 434. 2.94315 3.18088]
[ 435. 2.93953 3.18709]
[ 436. 2.92503 3.19111]
[ 437. 2.93082 3.19063]
[ 438. 2.95543 3.17995]
[ 439. 2.91526 3.19005]
[ 440. 2.9235 3.18787]
[ 441. 2.92913 3.1913 ]
[ 442. 2.90686 3.19078]
[ 443. 2.91982 3.19346]
[ 444. 2.91682 3.19154]
[ 445. 2.90894 3.19157]
[ 446. 2.91718 3.18939]
[ 447. 2.92191 3.19028]
[ 448. 2.91324 3.19256]
[ 449. 2.91879 3.18983]
[ 450. 2.91476 3.18906]
[ 451. 2.90406 3.19049]
[ 452. 2.90904 3.18919]
[ 453. 2.92424 3.18695]
[ 454. 2.91702 3.1899 ]
[ 455. 2.90734 3.19451]
[ 456. 2.91905 3.18995]
[ 457. 2.90503 3.1893 ]
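The jumps in the second column of this log (at epochs 7, 15, 31, 63, 127 and 255) line up with where a new cycle would begin if each cycle is twice as long as the last. A small sketch of that calculation, assuming 0-based epoch numbering as in the log; restart_epochs is a hypothetical helper, not part of any library.

```python
from itertools import accumulate

def restart_epochs(n_cycle, cycle_len=1, cycle_mult=2):
    """0-based epoch index at which each cycle after the first one begins."""
    lengths = [cycle_len * cycle_mult ** i for i in range(n_cycle)]
    # The running total of cycle lengths is the epoch where the next cycle starts.
    return list(accumulate(lengths))

print(restart_epochs(8))  # [1, 3, 7, 15, 31, 63, 127, 255]
```

The restarts at epochs 1 and 3 are not visible as jumps because the loss is still falling quickly that early. By the same arithmetic the next restart would come at epoch 511, so the log stops partway through the ninth cycle.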
KevinB
(Kevin Bird)
November 24, 2017, 10:31pm
13
I stopped it, and it actually seems to have kept the most recent one, so that is actually pretty decent if you just want it to run for a long time.
wgpubs
(WG)
November 24, 2017, 10:33pm
14
This is actually kind of awesome!
You got some pretty good and stable results by epoch 62. Might have been able to get similar results with just:
learner.fit(3e-3, 3, wds=1e-6, cycle_len=20)
KevinB
(Kevin Bird)
November 24, 2017, 10:34pm
15
Yeah, obviously not what I was going for, but I agree. It ended up being a decent way to figure out what I should be shooting for.