I’m a weekend deep learner (this really is a thing).
A few days ago, though, I ran into the need to classify close to 100,000 documents before the end of the year.
The task is really simple: destroy the documents or do not destroy them, based on criteria that have already been published.
Preparing the neural network with softmax and embeddings was not really difficult (thanks @jeremy), and I ended up with an accuracy of 93.45% against the team's classifications on close to 5,000 of the 100,000 items.
Everyone on the team was really impressed with the confusion matrix, but then someone said:
This method looks really “math” oriented and very complex.
We are not going to be able to prove any particular answer from the model.
With Excel I can show a list of keywords, then apply a VLOOKUP and update the rows.
We can provide a visual answer to each one of our questions based on simple observation.
My initial idea was to provide the same Excel spreadsheet with an extra column that includes the output of the softmax function (very similar to the cats and dogs example).
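To make that idea concrete, here is a minimal sketch of how the extra column could be produced. It is only an illustration under assumptions: the column names (`doc_id`, `title`, `predicted`, `confidence`), the label order in `LABELS`, the sample rows, and the raw scores are all made up, and the raw scores stand in for whatever the real network outputs before softmax.

```python
import csv
import io
import math

# Assumed label order for the two network outputs (hypothetical).
LABELS = ["keep", "destroy"]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def add_probability_columns(rows, logits_per_row):
    """Return copies of the rows with the predicted label and its probability."""
    annotated = []
    for row, logits in zip(rows, logits_per_row):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        new_row = dict(row)
        new_row["predicted"] = LABELS[best]
        new_row["confidence"] = round(probs[best], 4)
        annotated.append(new_row)
    return annotated


# Made-up spreadsheet rows and made-up raw model scores, one pair per row.
rows = [
    {"doc_id": "A-001", "title": "Invoice 2009"},
    {"doc_id": "A-002", "title": "Board minutes"},
]
logits = [[0.2, 2.9], [1.5, 1.1]]

annotated = add_probability_columns(rows, logits)

# Write CSV text that Excel can open; a real script would write to a file.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(annotated[0].keys()))
writer.writeheader()
writer.writerows(annotated)
print(buf.getvalue())
```

Sorting or filtering the sheet on the `confidence` column would let the team review the least certain documents by hand first, which may be easier to explain than the model itself.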
I need your help.
Is there any easy way to display the data in a way that makes sense to the end users?
For example, could I cluster the data between the two options, or more options if available?
Any help or pointers would be really appreciated.