Hi everyone! I haven't fully completed part 1 yet, so I'm still a newbie, but I would appreciate hints on how to approach a real-world problem.
The task itself (I changed the topic and simplified a few things):
Imagine an email center with many customers to send emails to. The overall goal is to improve the ratio of successful opens to send attempts.
We have approximately 100,000,000 emails, each with a status (success/fail) and a send time (including the date). I was going to transform the time into the same format Jeremy used, since success correlates strongly with day of week, day of month, etc.
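To make the date transformation concrete, here is a rough sketch of the kind of features I have in mind. It uses only the Python standard library; the function name date_parts and the exact set of fields are my own assumptions, not necessarily the same columns Jeremy's helper produces.

```python
import calendar
from datetime import datetime


def date_parts(sent_at: datetime) -> dict:
    """Expand one send timestamp into simple calendar features.

    The field names are hypothetical; pick whatever the model needs.
    """
    # monthrange returns (weekday of the 1st, number of days in the month)
    days_in_month = calendar.monthrange(sent_at.year, sent_at.month)[1]
    return {
        "year": sent_at.year,
        "month": sent_at.month,
        "day": sent_at.day,                          # day of month, 1..31
        "day_of_week": sent_at.weekday(),            # Monday = 0 ... Sunday = 6
        "day_of_year": sent_at.timetuple().tm_yday,  # 1..366
        "hour": sent_at.hour,
        "is_month_start": sent_at.day == 1,
        "is_month_end": sent_at.day == days_in_month,
    }


# Example: an email sent on Friday, 31 March 2017 at 14:05
features = date_parts(datetime(2017, 3, 31, 14, 5))
print(features)
```

With 100,000,000 rows I would probably do the same thing in a vectorized way on a data frame rather than row by row, but the features would be the same.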
Users have several characteristics, but one of them is very descriptive and has the largest number of discrete values: users are split into 5,000 groups, and users within a group are very alike in terms of success probability.
For a given user, we should be able to suggest the days on which it is best to email them, given that the average number of emails per user shouldn't be higher than N.
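As a baseline for the group feature, I could compute an open rate per group. A sketch, assuming the data can be read as (group_id, opened) pairs; the smoothing toward the global rate and the prior_weight parameter are my own additions, so that groups with only a few emails do not get extreme rates.

```python
from collections import defaultdict


def smoothed_group_rates(records, prior_weight=20.0):
    """Return {group_id: open rate shrunk toward the global open rate}.

    records: iterable of (group_id, opened) pairs, opened is True/False.
    prior_weight: hypothetical knob, acts like that many "virtual" emails
    sent at the global rate and added to every group.
    """
    sent = defaultdict(int)
    opened = defaultdict(int)
    for group_id, was_opened in records:
        sent[group_id] += 1
        opened[group_id] += int(was_opened)

    total_sent = sum(sent.values())
    global_rate = sum(opened.values()) / total_sent if total_sent else 0.0

    return {
        g: (opened[g] + prior_weight * global_rate) / (sent[g] + prior_weight)
        for g in sent
    }


# Toy example with two groups and two emails each
rates = smoothed_group_rates(
    [("a", True), ("a", True), ("b", False), ("b", False)],
    prior_weight=2.0,
)
print(rates)
```

This is only a baseline to compare against; a model could just as well learn its own representation of the 5,000 groups.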
Some real-world constraints:
We stop sending messages as soon as an email is opened successfully.
The longer we wait, the more likely it is that the user no longer uses this email address.
I found a positive correlation between the time we wait before the next email and success (an email sent the day after the previous one is less likely to be opened than one sent 3 days after it).
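The waiting-time effect can be checked with something like the sketch below. It assumes the log can be read as (user_id, sent_date, opened) tuples; those names and the function open_rate_by_gap are hypothetical.

```python
from collections import defaultdict
from datetime import date


def open_rate_by_gap(emails):
    """Return {gap_in_days: (opens, attempts, open_rate)}.

    emails: iterable of (user_id, sent_date, opened) tuples.
    Only emails that have an earlier email to the same user are counted,
    because the gap is measured from the previous send.
    """
    by_user = defaultdict(list)
    for user_id, sent_date, opened in emails:
        by_user[user_id].append((sent_date, opened))

    counts = defaultdict(lambda: [0, 0])  # gap -> [opens, attempts]
    for history in by_user.values():
        history.sort(key=lambda row: row[0])  # oldest first
        for (prev_date, _), (cur_date, opened) in zip(history, history[1:]):
            gap = (cur_date - prev_date).days
            counts[gap][0] += int(opened)
            counts[gap][1] += 1

    return {gap: (o, n, o / n) for gap, (o, n) in sorted(counts.items())}


# Toy log: three users, gaps of 1 and 3 days
log = [
    ("u1", date(2017, 1, 1), False),
    ("u1", date(2017, 1, 2), False),
    ("u1", date(2017, 1, 5), True),
    ("u2", date(2017, 1, 1), False),
    ("u2", date(2017, 1, 4), True),
    ("u3", date(2017, 1, 3), False),
    ("u3", date(2017, 1, 4), False),
]
gap_stats = open_rate_by_gap(log)
print(gap_stats)
```

This only shows the raw correlation. Since we stop emailing after the first open, users who reach later emails are not a random sample, so I would not read the numbers as causal without more care.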
So if you have ideas on how to solve this, please share. So far I have only a very vague understanding of where to start. Thanks!