Emotion detection in movies

Hello fast.ai community!

I’m working on horror detection in movies. I have tried several models, but I can’t get the results to improve.
First, I extracted visual features from every frame of each movie using VGG16, then used machine learning classifiers to predict induced fear for each frame. I also extracted audio features with openSMILE and ran classification on those, but the results were no better. I then tried applying ResNeXt (as used in lesson 1) directly to the frames, but saw no improvement either. A major problem is that the data is unbalanced. Overall accuracy looks good, but that is only because of the imbalance: precision and recall are very poor.
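To make the imbalance issue concrete, here is a minimal sketch in plain Python. It uses hypothetical toy labels (95 non-fear frames, 5 fear frames), not my real data, and the function names are just for illustration. It shows how a model that always predicts the majority class gets high accuracy with zero precision and recall, and how inverse-frequency class weights could be computed (the same heuristic scikit-learn uses for `class_weight="balanced"`).

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (fear) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical toy data: 0 = no fear, 1 = fear (the rare class).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
prec, rec, f1 = precision_recall_f1(y_true, y_pred)
weights = balanced_class_weights(y_true)

print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
print("class weights:", weights)
```

Weights like these could be passed to a weighted loss or used to oversample the fear frames, so that mistakes on the rare class cost more during training. Whether that helps on this dataset is something I would still need to test.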

I would be very grateful if anyone has insights to share.