Tabular Data: making predictions with a condition

Hello all

I am trying to figure out how to make a prediction that must satisfy a certain condition (tabular data).

For instance, let’s say I have data on 50 people for each year over the past decades (1970, 1971, …, 2018, 2019). My goal is to choose exactly one person from each year based on my data.

To illustrate, I am adding NBA MVP data. I want to predict the MVP for every year based on many parameters.

I think the answer depends on how you choose to formalize your problem. One option is to treat it as binary classification with two class labels: MVP and not MVP. Creating the labels should be easy in the NBA case, but might not be easy in the general case. However, this approach can lead to a very skewed training set with far more negative instances than positive ones (49 non-MVPs for every MVP in your example), much like a fraud detection problem. You might also think of treating it as multi-class classification, or even as a regression problem. Domain-specific knowledge should help you make the choice. Another factor to consider is whether you can obtain labeled data and whether you can collect a larger volume of samples.
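If you go the binary classification route, the "one person per year" condition can be enforced after the model has scored everyone, by keeping only the top-scoring candidate within each year. Here is a minimal sketch of that step, assuming you already have one score per player per year. The function name `pick_one_per_year` and the scores are made up for illustration.

```python
def pick_one_per_year(rows):
    """rows: iterable of (year, player, score) tuples, where score is the
    model's estimated probability (or any ranking score) of being MVP."""
    best = {}
    for year, player, score in rows:
        # Keep the highest-scoring candidate seen so far for this year.
        if year not in best or score > best[year][1]:
            best[year] = (player, score)
    return {year: player for year, (player, _) in sorted(best.items())}


# Made-up scores, for illustration only.
scores = [
    (2018, "Player A", 0.62),
    (2018, "Player B", 0.30),
    (2019, "Player C", 0.12),
    (2019, "Player D", 0.09),
]
picks = pick_one_per_year(scores)
print(picks)
```

Note that in 2019 nobody scores above 0.5, so a plain thresholded prediction would pick no one for that year. Ranking within the year still returns exactly one person.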

Thank you very much!

Try Random Forest, as discussed in Lessons 1-3 of the DL course:
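For what it's worth, here is a rough sketch of how that could look with scikit-learn's `RandomForestClassifier`. This is not taken from the course. The data is synthetic (10 seasons of 50 players with 3 random features), and the "MVP" label is assigned by a made-up rule just so there is something to fit. `class_weight="balanced"` is one possible way to handle the skew between MVP and non-MVP rows.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: 10 seasons x 50 players, 3 made-up features.
n_years, n_players = 10, 50
years = np.repeat(np.arange(2010, 2010 + n_years), n_players)
X = rng.normal(size=(n_years * n_players, 3))

# Label exactly one "MVP" per season: the player with the largest feature sum
# (an arbitrary rule, only so the example has labels).
y = np.zeros(len(years), dtype=int)
for yr in np.unique(years):
    idx = np.flatnonzero(years == yr)
    y[idx[np.argmax(X[idx].sum(axis=1))]] = 1

# Train on the earlier seasons and hold out the last two.
train = years < 2018
clf = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
)
clf.fit(X[train], y[train])

# Score the held-out players, then keep the top-scoring one in each season.
test_years = years[~train]
proba = clf.predict_proba(X[~train])[:, 1]
picks = {}
for yr in np.unique(test_years):
    idx = np.flatnonzero(test_years == yr)
    picks[int(yr)] = int(idx[np.argmax(proba[idx])])  # row index in held-out set
print(picks)
```

Splitting by season rather than by random rows keeps players from the same year out of both train and test at once, which seems closer to how you would use the model on a new year.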