Lesson 2, image classification: consistency of results after each run?

I implemented a horse color breed classifier. After fitting an initial model, I unfroze it and trained again, and every time I was getting overfitting with slight or no improvement in accuracy. Then, on another run, I got an improvement, with a training loss of 0.22 vs. a validation loss of 0.27, and plotting the losses did not show overfitting. So my question is: how is it possible to get such different results from one run to the next? I guess there is an element of luck given the random batch selection, but shouldn't the results be more or less consistent? Does that mean that if you get no improvement, or you get overfitting, you shouldn't give up and should just run it a few more times until you get something better? Is this a viable approach?
I have 4 classes and around 900 images in a balanced dataset, split 80-20.
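
For a sense of scale, here is a rough back-of-the-envelope sketch of how much a measured validation accuracy can move from sampling noise alone with a validation set this small. It assumes the 80-20 split leaves about 180 validation images and treats each prediction as an independent coin flip (a binomial approximation). The accuracy values in the loop are hypothetical placeholders, not my actual results, and the function name is just made up for illustration.

```python
import math


def accuracy_standard_error(accuracy: float, n_val: int) -> float:
    """Binomial standard error of an accuracy measured on n_val samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_val)


n_total = 900                  # approximate dataset size from the post
n_val = round(n_total * 0.2)   # 80-20 split -> about 180 validation images

# Hypothetical accuracies, only to show the order of magnitude of the noise.
for acc in (0.80, 0.90, 0.95):
    se = accuracy_standard_error(acc, n_val)
    print(f"accuracy={acc:.2f}  n_val={n_val}  standard error={se:.3f}")
```

This only covers the noise in the validation estimate itself. Differences between unseeded runs also come from random weight initialization of the new layers, batch order, and data augmentation, which this sketch does not model.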