My dataset looks different from the ImageNet dataset. Should I still use transfer learning?

I made my dataset by extracting a basketball player's poses over several frames and stacking them into a single frame. Here is an example of a player shooting from the free-throw line.
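
To make the stacking step concrete, here is a minimal sketch of the idea, not the actual pipeline. It assumes each pose is a list of joint pixel coordinates and that later frames are drawn brighter so the order of the motion stays visible. The function name `stack_poses`, the canvas size, and the coordinates are all hypothetical.

```python
from typing import List, Tuple

# One pose = one (row, col) pixel position per joint (assumed representation).
Pose = List[Tuple[int, int]]


def stack_poses(poses: List[Pose], height: int, width: int) -> List[List[int]]:
    """Draw the joints of every frame onto one grayscale canvas.

    Later frames get a higher intensity, so time is encoded as brightness.
    """
    canvas = [[0] * width for _ in range(height)]
    n = len(poses)
    for t, pose in enumerate(poses):
        intensity = round(255 * (t + 1) / n)
        for row, col in pose:
            # Skip joints that fall outside the canvas.
            if 0 <= row < height and 0 <= col < width:
                # Keep the brightest value if two frames hit the same pixel.
                canvas[row][col] = max(canvas[row][col], intensity)
    return canvas


# Three made-up frames with two joints each (hypothetical coordinates).
frames = [
    [(1, 1), (2, 1)],
    [(1, 2), (2, 2)],
    [(1, 3), (2, 3)],
]
image = stack_poses(frames, height=4, width=5)
```

A real version would probably draw the limbs between joints as well, and could use a separate color channel per frame instead of brightness.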

I understand that the image does not look like anything in the ImageNet dataset apart from the human skeleton. Would you still recommend training the CNN with transfer learning, as Jeremy taught in lecture 1, or should I train a brand-new network from scratch?