How do we get F1 scores for our validation set?

The answer is in this thread: F1 Score as metric
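
For a quick sanity check, here is a minimal sketch of computing F1 by hand from validation predictions. It assumes a binary problem where you already have the true labels and predicted labels as plain lists; the names `f1_score_binary`, `val_targets`, and `val_preds` are made up for illustration and are not taken from the linked thread or from any library.

```python
def f1_score_binary(targets, preds, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(targets, preds))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: return 0.0 by convention
        # (this also avoids dividing by zero).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical validation labels and predictions, for illustration only.
val_targets = [1, 0, 1, 1, 0, 0]
val_preds = [1, 0, 0, 1, 1, 0]

val_f1 = f1_score_binary(val_targets, val_preds)
print(f"validation F1: {val_f1:.3f}")
```

If scikit-learn is available, `sklearn.metrics.f1_score(val_targets, val_preds)` should give the same number for the binary case, so the hand-rolled version is mostly useful for seeing what the metric actually measures.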