I was looking at the paper from Baidu Research on improving distance metric learning with an angular loss (http://research.baidu.com/Public/uploads/5acc20706a719.pdf).
I have experience with a simple triplet network, but I am having a lot of trouble moving from the triplet loss to a simple triplet + angular loss.
I implemented the triplet loss function in TensorFlow, and it works well:
    def triplet_loss(self, anchor, out_a, out_b, squared):
        """
        FaceNet: A Unified Embedding for Face Recognition and Clustering
        <https://arxiv.org/abs/1503.03832>
        """
        with tf.name_scope("triplet-loss"):
            distance_pos = self.euclidean_distance(anchor, out_a, squared=squared)
            distance_neg = self.euclidean_distance(anchor, out_b, squared=squared)
            triplet_loss = tf.maximum(0.0, self.margin + distance_pos - distance_neg)
            total_loss = tf.reduce_mean(triplet_loss)
            return distance_pos, distance_neg, triplet_loss, total_loss

    def euclidean_distance(self, a, b, squared=False):
        eps = 1e-12
        with tf.name_scope("euclidean-distance"):
            if squared:
                return tf.reduce_sum(tf.square(tf.subtract(a, b)), 1)
            return tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(a, b)), 1) + eps)
However, when I changed it to the angular loss, optimization became much slower than with the simple triplet loss. Here is my code for the angular loss in TensorFlow:
    def angular_loss(self, anchor, out_a, out_b, in_degree, squared, alpha=45):
        """
        Deep Metric Learning with Angular Loss
        <https://arxiv.org/pdf/1708.01682.pdf>
        """
        with tf.name_scope("angular-loss"):
            if not in_degree:
                alpha = self.deg2rad(alpha)
            out_c = tf.div(tf.add(anchor, out_a), 2.0)
            distance_pos = self.euclidean_distance(anchor, out_a, squared=squared)
            distance_bc = self.euclidean_distance(out_b, out_c, squared=squared)
            tan_dist = 4.0 * (tf.tan(alpha) ** 2) * distance_bc
            loss = tf.maximum(distance_pos - tan_dist, 0.0)
            total_loss = tf.reduce_mean(loss)
            return distance_pos, distance_bc, loss, total_loss
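One way to sanity-check the TensorFlow version is to compare its per-triplet output with a plain-Python reference on a few hand-built triplets. The sketch below is only my reading of the paper's formulation, max(0, ||a - p||^2 - 4 tan^2(alpha) ||n - c||^2) with c = (a + p) / 2. It assumes squared Euclidean norms and an angle given in degrees, and the function name `angular_loss_reference` is hypothetical, not something from the paper or from TensorFlow.

```python
import math


def angular_loss_reference(anchor, positive, negative, alpha_deg=45.0):
    """Angular loss for a single triplet, using squared Euclidean norms.

    Hypothetical reference helper for checking another implementation.
    Inputs are equal-length sequences of floats.
    """
    # math.tan works in radians, so convert the angle from degrees first.
    alpha = math.radians(alpha_deg)
    # Midpoint of anchor and positive: c = (a + p) / 2.
    center = [(a + p) / 2.0 for a, p in zip(anchor, positive)]
    # Squared distances ||a - p||^2 and ||n - c||^2.
    dist_ap = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    dist_nc = sum((n - c) ** 2 for n, c in zip(negative, center))
    # Hinge: the loss is zero once the negative is far enough from the midpoint.
    return max(0.0, dist_ap - 4.0 * math.tan(alpha) ** 2 * dist_nc)


if __name__ == "__main__":
    a, p = [0.0, 0.0], [2.0, 0.0]
    print(angular_loss_reference(a, p, [1.0, 0.0]))  # negative sits on the midpoint
    print(angular_loss_reference(a, p, [1.0, 5.0]))  # negative far from the midpoint
```

Two things are worth comparing against this reference: the result with `squared=True` versus `squared=False`, since the formula above uses squared norms, and the value of `alpha` that actually reaches the tangent. Tangent functions that work in radians treat a raw 45 as 45 radians, which gives a different coefficient than tan(45°) = 1.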
For the embeddings, I used the last fully connected layer (4096-dimensional) of a pretrained VGG network. I tried different optimizers and learning rates, but in the end the model is much worse than with the simple triplet loss. Does anyone have experience with angular/triplet networks? Is my implementation correct according to the paper?