Need someone smart to identify the approach/architecture

Hi everyone! I haven't fully completed part 1, so I'm still a newbie, but I would appreciate hints on what to do with some real-world stuff.

Task itself (I modified the topic and simplified a few things):

Imagine an email center with multiple customers to send emails to. The overall task is to improve the open rate, i.e. the ratio of successful opens to send attempts.

  • we have approx. 100,000,000 emails, each with a status (success/fail) and a timestamp (incl. day), which I was going to transform into the same format Jeremy used, since there is a strong correlation between success and day of week, day of month, etc.

  • users have several different characteristics, but one of them is very descriptive and has the largest number of discrete options (users are grouped into 5,000 groups, where users within a group are very alike in terms of success probability)

Final result:
For a given user, we should be able to say on which days we should email him, knowing that the average number of emails per user shouldn't be higher than N.

Some real-world constraints:

  • we stop sending messages to a user as soon as an "email open" succeeds
  • the longer we wait, the more likely it is that the user no longer uses this email address
  • I found a positive correlation between how long we wait before the next email and success (an email sent the next day is less likely to be opened than one sent 3 days later); a quick way to sanity-check this is sketched below
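
To illustrate the kind of check I mean, here's a minimal pandas sketch (the file name and the columns `user_id`, `sent_at`, `opened` are placeholders for the actual schema):

```python
import pandas as pd

# Hypothetical schema: one row per email with user_id,
# sent_at (datetime) and opened (0/1).
df = pd.read_csv("emails.csv", parse_dates=["sent_at"])
df = df.sort_values(["user_id", "sent_at"])

# Days elapsed since the previous email to the same user.
df["gap_days"] = df.groupby("user_id")["sent_at"].diff().dt.days

# Mean open rate by gap length: if the correlation is real,
# the rate should climb as gap_days grows (up to a point).
print(df.groupby("gap_days")["opened"].mean().head(14))
```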

So please, if you have ideas on how to solve this, share them; I still have only a vague understanding of where to start. Thanks!

This assertion is incorrect: in the sales world, a longer wait means the consumer would probably forget about your product.

Could be. I don't really know, and honestly don't care much; it just should be something the neural net can pick up on, and that's why I mention it here. Any ideas on the subject itself?

I think if you follow the structured data lecture pretty closely, you'll have a good guide for this. It does sound like you're working with a pretty big dataset, though, and while that's great for the final model, it will only complicate things before you get there. I'd suggest taking a small sample of the data (maybe 1 million rows or so) and building the model with that first. Once you get that figured out, you can try scaling up.
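
For example, something like this (a sketch: `emails.csv` and the `group_id` column are assumptions about your data):

```python
import pandas as pd

# Grab a random ~1M-row sample to prototype on.
df = pd.read_csv("emails.csv")
sample = df.sample(n=1_000_000, random_state=42)

# Or sample whole user groups instead, so the 5,000-group
# feature keeps its within-group structure in the sample.
groups = df["group_id"].drop_duplicates().sample(n=500, random_state=42)
sample_by_group = df[df["group_id"].isin(groups)]
```

Sampling whole groups rather than random rows keeps the within-group structure intact, which matters if that 5,000-group feature is as predictive as you say.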

But otherwise it seems pretty straightforward: you have the data, you can do the datetime parsing that Jeremy does, and see how that goes.
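
For reference, here's a plain-pandas version of that expansion (fastai's `add_datepart` produces a similar set of columns automatically; `sent_at` is an assumed column name):

```python
import pandas as pd

df = pd.read_csv("emails.csv", parse_dates=["sent_at"])

# Expand the timestamp into categorical parts the model can use.
dt = df["sent_at"].dt
df["year"] = dt.year
df["month"] = dt.month
df["day_of_month"] = dt.day
df["day_of_week"] = dt.dayofweek        # 0 = Monday
df["is_month_start"] = dt.is_month_start
df["is_month_end"] = dt.is_month_end
```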

Do you have any specific questions?

Well, I looked closely into it, and the datetime parsing is the obvious part (I mentioned it in the description).

  1. the most confusing part for me is that I have no idea how to make it show me the most optimal days, properly spaced and everything (it's not just the individual best-performing days, it's a set that has to work together)
  2. yeah, taking a smaller amount of data for some subgroups is a good idea; I'm not facing any trouble yet, as nothing is currently working )

I'm not entirely sure I know what you're trying to say here, but I'll give you a few solutions to what I think you're talking about.

If you are referring to feature importance (i.e. knowing which features contributed the most to good predictions), then I'd suggest using a Random Forest, which gives you importances for free. If you are set on using a neural net, look into permutation importance (sometimes called "feature permutation"); a sketch of both follows.
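
Here the toy `X`/`y` just stand in for your engineered features and open/fail labels:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy stand-ins for your real feature matrix and open/fail target.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((5000, 4)),
                 columns=["day_of_week", "day_of_month", "gap_days", "group_id"])
y = rng.integers(0, 2, size=5000)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)
rf = RandomForestClassifier(n_estimators=100, n_jobs=-1).fit(X_train, y_train)

# Impurity-based importances come for free with the forest.
print(dict(zip(X.columns, rf.feature_importances_)))

# Permutation importance works for any model (incl. a wrapped neural
# net): shuffle one column at a time, measure the validation-score drop.
result = permutation_importance(rf, X_val, y_val, n_repeats=5, random_state=42)
print(dict(zip(X.columns, result.importances_mean)))
```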

If you are referring to using the best-performing days as the target / predicted values (i.e. y), then I would just treat each day (or each set of days) as a different class, e.g. as below.
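
Again just a sketch, with `sent_at` and `opened` as assumed column names:

```python
import pandas as pd

df = pd.read_csv("emails.csv", parse_dates=["sent_at"])

# Each day of week becomes its own class (0 = Monday .. 6 = Sunday),
# using only the emails that were actually opened.
opened = df[df["opened"] == 1]
y = opened["sent_at"].dt.dayofweek

# Or collapse days into coarser sets, e.g. weekday vs. weekend.
y_sets = y.map(lambda d: "weekend" if d >= 5 else "weekday")
```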

It might be helpful to post a sample of your dataset (just the head or something) so we can help you more; it's still not entirely clear what you're trying to do or what your issues are.