Best way to detect bad input before prediction? (medical image segmentation)

I trained a `unet_learner` to segment images from cardiac ultrasound. Now I am trying to analyze frames taken from an ultrasound clip. Some frames, however, are garbage and do not show the patient's heart. This is expected: during a study the patient may move, or their respirations may momentarily obscure the ultrasound's view. My question is: how would you recommend I identify these garbage frames instead of attempting to analyze them with the learner?

Here’s an example:

Good frame with accurate segmentation:

Sequence of bad frames where lung obscures the heart:
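
To make the question concrete, here's a minimal sketch of one filter I've been considering (the function names are mine, and it assumes I can pull per-class softmax probabilities out of the learner as a NumPy array): flag a frame as garbage when the model's mean per-pixel prediction entropy is high, on the theory that the model should be uncertain everywhere when the heart isn't visible.

```python
import numpy as np

def frame_entropy(probs):
    """Mean per-pixel entropy of softmax probabilities.

    probs: array of shape (C, H, W) where the C class
    probabilities sum to 1 at each pixel. Higher values mean
    the model is less certain about the frame overall.
    """
    eps = 1e-12  # avoid log(0)
    return float(-(probs * np.log(probs + eps)).sum(axis=0).mean())

def is_garbage(probs, threshold=0.5):
    """Flag a frame when mean entropy exceeds a (hand-picked) threshold."""
    return frame_entropy(probs) > threshold

# Toy check with two classes on a 4x4 frame:
# a confident prediction (one class near 1.0 everywhere)
confident = np.zeros((2, 4, 4))
confident[0] = 0.99
confident[1] = 0.01

# and an uncertain prediction (near-uniform classes)
uncertain = np.full((2, 4, 4), 0.5)
```

With these toy inputs, `is_garbage(confident)` comes out False and `is_garbage(uncertain)` comes out True, but I have no idea whether the entropy of a real ultrasound model actually separates obscured frames this cleanly, or whether a small supervised good/bad frame classifier would be more reliable.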