Kaggle Leaderboard: Bad Cats/Dogs Competition Results

When I submitted my predictions to Dogs vs. Cats Redux: Kernels Edition, my first score was about 0.9, i.e. very high. Could somebody tell me what score to expect when you finetune only the top layer? And what about finetuning the conv layers? What score should I expect in that case?

I ask because it seems I’m doing something wrong with my VGG16 training. As far as I can tell, even without doing anything fancy, the log loss should be lower, right?

We might need more information to help, such as your submission.csv and a code example of how you generate the predictions.
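
In the meantime, one thing that may be worth checking is how confident the submitted probabilities are, since log loss punishes a confident wrong answer very heavily. Here is a minimal sketch of that effect. The labels and predictions are made up for illustration, the `log_loss` and `clip` helpers are hypothetical names written for this example, and the 0.02/0.98 clipping bounds are an assumption, not a recommended setting. Whether this is actually the cause of the 0.9 score is only a guess until we see the submission.

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy; predictions are clamped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def clip(preds, lo=0.02, hi=0.98):
    """Pull extreme probabilities away from 0 and 1 (bounds are an assumption)."""
    return [min(max(p, lo), hi) for p in preds]

# Made-up data: ten images, nine confident correct answers and one confident miss.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

raw = log_loss(y_true, y_pred)
clipped = log_loss(y_true, clip(y_pred))

print("raw log loss:    ", round(raw, 4))
print("clipped log loss:", round(clipped, 4))
```

In this toy case the accuracy is 90% either way, but the single confident miss dominates the unclipped score, so a high log loss does not necessarily mean the model itself is bad.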