Use cyclical learning rate with triplet loss

I’m training the MGN (ReID) network on the public Market-1501 dataset, and it takes a long time to converge. I’m wondering whether anyone has tried using a cyclical learning rate to speed up training with triplet loss. In my own experiments, a large learning rate often leads to a collapse of the embedding subspaces, and the triplet loss gets stuck at the selected margin. Even reducing the learning rate afterwards does not recover the performance.
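In case it helps frame the question, here’s a minimal sketch of the setup I’m describing, assuming PyTorch’s built-in `CyclicLR` and `TripletMarginLoss`. The model and batches are placeholders, and every hyperparameter value is an assumption to be tuned, not a recommendation:

```python
import torch
import torch.nn as nn

# Placeholder embedder standing in for MGN; swap in the real model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 64, 256))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5, momentum=0.9)

# Cyclical LR: a conservative max_lr is the main knob for avoiding the
# subspace collapse described above. All values here are assumptions.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer,
    base_lr=1e-5,        # floor of each cycle
    max_lr=3e-4,         # deliberately modest ceiling
    step_size_up=2000,   # batches per half-cycle
    mode="triangular2",  # amplitude halves every cycle
)

triplet_loss = nn.TripletMarginLoss(margin=0.3)

for _ in range(4):  # dummy batches; use a real PK identity sampler in practice
    anchor, positive, negative = (torch.randn(8, 3, 128, 64) for _ in range(3))
    optimizer.zero_grad()
    loss = triplet_loss(model(anchor), model(positive), model(negative))
    loss.backward()
    optimizer.step()
    scheduler.step()  # CyclicLR expects a step per batch, not per epoch
```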

My guess is that a cyclical learning rate does not work well for metric-learning objectives, whereas with a classification loss on ground-truth labels the same schedule actually boosts training performance.
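For contrast, this is the kind of label-supervised setup I mean: a softmax ID loss on ground-truth identities, which MGN combines with the triplet term anyway. The dimensions and linear stand-ins below are illustrative only:

```python
import torch
import torch.nn as nn

embed_dim, num_ids = 256, 751  # Market-1501 train split has 751 identities

embedder = nn.Linear(2048, embed_dim)       # stand-in for the backbone features
classifier = nn.Linear(embed_dim, num_ids)  # ID head supervised by labels

feats = torch.randn(8, 2048)
labels = torch.randint(0, num_ids, (8,))

emb = embedder(feats)
id_loss = nn.CrossEntropyLoss()(classifier(emb), labels)

# MGN trains with both terms; the label-supervised term keeps identity
# clusters separated even when a large LR disturbs the metric structure.
pos = embedder(torch.randn(8, 2048))
neg = embedder(torch.randn(8, 2048))
total = id_loss + nn.TripletMarginLoss(margin=0.3)(emb, pos, neg)
total.backward()
```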

Hoping someone can clarify this.