I am currently using a traditional machine learning model, Random Forest, with Spark Streaming and Spark MLlib to solve a phishing detection problem.
Which Deep Learning Model to Use:
I was thinking of solving the same problem using deep learning. From what I have read online, phishing detection can be implemented efficiently using RNN/LSTM models and NLP. In our problem space we have two sets of data available:
URLs, each labeled phish or non-phish.
URL content (web page content), each labeled phish or non-phish.
I am wondering how to map these two sets of data onto LSTM/NLP. Should we train an LSTM on the URLs, and if so, how would it detect patterns in them? And should we use NLP techniques to learn from the URL content?
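To make the URL half of the question concrete, this is roughly how I imagine preparing URLs for a sequence model. It is only a sketch under my own assumptions: character-level encoding, a vocabulary of printable ASCII, and a fixed maximum length are guesses on my part, and the names `encode_url`, `VOCAB`, `PAD` and `UNK` are hypothetical. The URLs and labels are made-up placeholders, not real data.

```python
import string

# Assumption: treat each URL as a sequence of characters.
# Id 0 is reserved for padding, id 1 for characters outside the vocabulary.
PAD, UNK = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}

def encode_url(url, max_len=200):
    """Map a URL to a fixed-length list of integer character ids."""
    # Truncate long URLs, then look up each character (UNK if unseen).
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]
    # Right-pad short URLs so every row has the same length.
    return ids + [PAD] * (max_len - len(ids))

# Placeholder examples only; labels are invented for illustration.
urls = [
    "http://example.com/login",
    "http://example.net/verify?id=123",
]
labels = [0, 1]  # 0 = non-phish, 1 = phish

batch = [encode_url(u, max_len=64) for u in urls]
```

My understanding is that rows like those in `batch` could then be fed to an embedding layer followed by an LSTM, with `labels` as the target, but I am not sure whether this is the right way to let the model pick up patterns in URLs.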
How to Get Real-Time Prediction Using TensorFlow:
Do we integrate the real-time pipeline with the deep learning model? Do I still need to use Spark Streaming with a model generated using deep learning?
Please let me know your thoughts.