Performance is not stellar (top 25%) but learning how to do feature engineering on a Pandas dataframe was interesting. The trickiest part was working around a bug in fastai 1.0.24 (which has now been fixed).
Running a little behind on the lectures and implementations, but I finally created my first classifier, which tells John Oliver (the comedian) apart from Steve Mnuchin (US Treasury Secretary).
I only managed about 72% accuracy, though, even after removing useless images. After wasting a lot of time I figured out the reason: many of the images in the data bunch are transformed in such a way that they no longer contain either of them. I turned off max_warp and set max_zoom = 1, but still didn’t get much improvement, so I will probably go ahead with the next lecture and come back to this once Jeremy addresses the get_transforms() function.
Nevertheless, a look at the transformed images and the issue:
I definitely found that actively doing, rather than passively watching the lecture, helps in learning better. Hopefully I will do a lot more from now on.
I might be super late to this party but I FINALLY got my very first web app up and running with my very first trained ML model, so Yay!
I built a facial expression recognition (FER) model, trained it on the KDEF dataset and then experimented with drastically different test images via my web app.
The KDEF dataset classifies emotions into 7 types, and I use all of them. After the very first training cycle, I got an error rate of 6.4% and almost identical training loss (TL) and validation loss (VL).
After many, many iterations trying different numbers of epochs and learning rates, I was able to get the model down to an error rate of 1.9%, but my training loss ended up higher than validation loss by a tiny(?) bit:
PS: I have tons of questions on learning rates and epochs (some listed in the notebook) and would really appreciate it if someone could provide feedback on my approach and results!
At this stage, I put this model into production using Zeit and tested it on other images.
I observed that the KDEF dataset uses males and females in the 20-30 age group and the images seemed predominantly Caucasian. So I wanted to test my model against images of older women, people of color, kids/babies etc. Sharing some results below (more results and related questions in the notebook):
Overall my model has not done well on random test images despite the 1.9% error rate during training - I’m not sure if it’s because of the way the KDEF dataset was created, because my model is overfitted, or something else. My error rate had a downward trend all along, barring a few fluctuations, so I don’t think it is overfitted… but I would love clarification on this!
Now I am really interested in understanding what the model “learnt” and why certain images from the test set got mis-classified… I’ve been guessing myself crazy! But I need to better understand what I did so far before I try anything else.
The entire notebook is available here.
Give the app a go here!
Please please provide feedback/answers/questions… all of this is so new that it’ll be good to get validation on the approach and results. Thank you!
After lesson 4, I tried to combine tabular data with NLP, particularly in Spanish.
I took a tabular dataset from an e-commerce marketplace with the objective of predicting products’ condition (new or used) based on listings’ features. It includes 100k records and after some data pre-processing (not included in the attached notebooks), I ended up with 30 features, including: 17 categorical, 12 continuous and 1 text field (listing’s title).
The process included:
1. Creating a tabular model without the text feature (accuracy: 91.5%).
2. Creating an NLP model to predict from the listing title:
2.2. Training a language model in Spanish from scratch: I used a Wiki corpus trimmed to around 130 million tokens (training for 6 epochs took 10 hours on a GTX 1080TI, reaching an accuracy of 30.5%).
2.3. Applying ULMFiT: first training a domain language model (accuracy: 34.3%) and then the classifier itself (accuracy: 81.5%). Then, the classifier was used to predict on the entire data set (probability of the product being new given the title).
3. Creating a new tabular model, this time adding as a new feature the prediction coming from the NLP model (final accuracy: 92.4%).
I also tried extracting the last linear layer’s activations (50) from the NLP model and feeding them into the tabular model, but it didn’t improve accuracy. Something I didn’t get to try was removing the output layer of both models, concatenating the outputs and feeding them into a linear model (unlike my simpler model, this would backprop to both models).
In this case, the effort of training the NLP model (particularly the Spanish model from scratch) improved accuracy by just under 1 percentage point (91.5% to 92.4%). However, it was a nice learning exercise and now I have a Spanish pre-trained model that will hopefully be useful for other projects thanks to ULMFiT.
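For anyone curious what the third step looks like mechanically, here is a minimal sketch of adding a text model's prediction as one more continuous column. It is not the notebook's code: the field names, the example listings and the `title_model_prob_new` stand-in are all made up for illustration, and a real pipeline would call the trained ULMFiT classifier instead of the keyword rule.

```python
# Hypothetical listings; field names and values are invented for this sketch.
listings = [
    {"title": "zapatillas nuevas en caja", "price": 120.0, "seller_sales": 340},
    {"title": "bicicleta usada buen estado", "price": 80.0, "seller_sales": 3},
]


def title_model_prob_new(title):
    """Stand-in for the trained text classifier: returns P(new | title).

    This keyword rule exists only to keep the sketch runnable.
    """
    words = title.lower().split()
    if "usada" in words or "usado" in words:
        return 0.1
    if "nueva" in words or "nuevo" in words or "nuevas" in words:
        return 0.9
    return 0.5


def add_text_feature(rows, prob_fn, text_field="title", new_field="title_prob_new"):
    """Return copies of the rows with the text model's probability added
    as an extra continuous feature; the original rows are left untouched."""
    return [{**row, new_field: prob_fn(row[text_field])} for row in rows]


augmented = add_text_feature(listings, title_model_prob_new)
for row in augmented:
    print(row["title"], row["title_prob_new"])
```

One caveat with this kind of stacking: if the text classifier predicts on rows it was trained on, the new feature can look better than it really is, so out-of-fold predictions are a common precaution.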
I finally wrote up my blog post about creating the guitar classification model. Over the past few days I decided to redo the exercise and incorporate the new data block API, progressive resizing and other goodies of fastai v1.
Please let me know if my description of the one-cycle-routine, progressive resizing, etc. is off.
I think the results came out really nice and I’m still amazed how well progressive resizing works!
Continuing our series of updates to our aircraft classifier project, I have added the Data Block API and progressively resized the dataset, from 32x32, to 64x64, to finally 128x128. We are now at 99.3% accuracy. Hooray.
Using the new model I have created this web app. Check it out at: deepair-v2.
I’ve written the following short Medium post describing the details of the process.
The accompanying notebook can be found at this gist.
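As a rough picture of the progressive resizing schedule mentioned above (32x32, then 64x64, then 128x128), here is a library-agnostic sketch. The `train_at_size` callback is hypothetical: in a real project it would rebuild the data at the given size, keep the current model weights and train for a few epochs. The dummy callback below does no training at all.

```python
def progressive_resizing(sizes, train_at_size, epochs_per_stage=2):
    """Train at each image size in turn, smallest first.

    `train_at_size(size, epochs)` is a hypothetical callback that should
    reload the data at `size`, continue from the current weights, train
    for `epochs` and return a validation metric.
    """
    history = []
    for size in sorted(sizes):
        metric = train_at_size(size, epochs_per_stage)
        history.append((size, metric))
    return history


def fake_train(size, epochs):
    """Dummy callback so the sketch runs on its own; it trains nothing."""
    print(f"training at {size}x{size} for {epochs} epochs")
    return None


stages = progressive_resizing([32, 64, 128], fake_train)
```

The point of the loop is only that the same weights carry over from one size to the next, so the cheap small-image stages do most of the early learning.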
I have been playing around with audio classification, using bachir’s strategy of transforming the audio signal into an image that represents its spectrogram, and then performing transfer learning on those images using the guidelines from the first three lessons. I tried this with the dataset from the TensorFlow speech recognition challenge from Kaggle last year (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge/) and I got an interesting result. The dataset comprises short utterances containing commands such as up, down, stop, go etc. In my first trial, I excluded the categories unknown and silence to facilitate training.
The best result was superior to the first place in the private leaderboard of Kaggle 10 months ago. However, I’d need to include the unknown and silence categories to perform a fair comparison.
I also applied this same approach to emotion recognition from speech using the IEMOCAP database (https://sail.usc.edu/iemocap/). This database contains speech signals uttered by actors and labeled in categories such as sadness, happiness, anger and so on. I started with two classes with a decent amount of data (one thousand samples each) and the first results are encouraging: I got about 93% accuracy differentiating between anger and sadness. I’m curious to see the performance for the entire dataset.
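In case the audio-to-image step is unclear, here is a minimal sketch of computing a magnitude spectrogram with only the standard library. It is an illustration, not the code used for the results above: the frame length, hop size and test tone are assumptions, and in practice a library with a proper FFT (and usually a mel scale plus log scaling) would be used instead of this naive DFT.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes (the non-negative frequency bins).
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic input: a 1 kHz sine sampled at 8 kHz (not taken from any dataset).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

The resulting 2D array (time frames by frequency bins) is what gets saved as an image and fed to the image classifier.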
I am working on an object verification problem where new categories need to be added regularly. (It is similar to face verification: we first show the identity card and a model verifies whether the face matches.)
I have tested the technique with the MNIST data set, using digits 0 to 7 only; digits 8 and 9 will be added later. At the moment I use only the binarized code from the last layer. I tested with digit 9 and the results are quite OK. Every other digit has low similarity with it (<60%), except digit 7, which has 90% similarity with digit 9. I will continue testing with Euclidean distance rather than this binary Hamming distance.
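To make the two comparisons concrete, here is a small sketch of Hamming similarity on binarized codes next to Euclidean distance on the raw activations. The activation values and the 0.5 threshold are made up for illustration and are not taken from the MNIST experiment.

```python
import math


def binarize(activations, threshold=0.5):
    """Turn real-valued activations into a 0/1 code."""
    return [1 if a >= threshold else 0 for a in activations]


def hamming_similarity(code_a, code_b):
    """Fraction of bit positions where two equal-length codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    matches = sum(1 for a, b in zip(code_a, code_b) if a == b)
    return matches / len(code_a)


def euclidean_distance(vec_a, vec_b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))


# Invented last-layer activations for two images.
acts_a = [0.9, 0.1, 0.6, 0.4, 0.8, 0.2, 0.7, 0.3, 0.55, 0.05]
acts_b = [0.6, 0.4, 0.9, 0.1, 0.45, 0.2, 0.7, 0.3, 0.95, 0.45]

similarity = hamming_similarity(binarize(acts_a), binarize(acts_b))
distance = euclidean_distance(acts_a, acts_b)
print(similarity, distance)
```

Binarizing throws away how far each activation is from the threshold, so two inputs can share most bits while their raw activations still differ noticeably; that is one reason the Euclidean comparison may separate 7 and 9 better.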
That is the “data wrangling” part - prepping raw data into a format more useful for analysis or ML models. How you slice and dice depends on what you are trying to do, and the design of your model. It looks like you may want to construct tabular data and perhaps look at relationships of other dimensions based on day of the week and/or time.
If that’s the case, you might be interested in the concept of “embeddings for categorical variables”. Here is a useful blog post and workshop video from Rachel on the subject:
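As a rough illustration of the idea (separate from the resources above), an embedding is just a lookup table that maps each category to a small dense vector, which is then concatenated with the continuous features. The class name, the dimension and the example row below are assumptions for the sketch; in a real model the vectors are learned during training rather than left random.

```python
import random


class CategoryEmbedding:
    """Lookup table mapping each category to a small dense vector."""

    def __init__(self, categories, dim, seed=0):
        rng = random.Random(seed)
        self.index = {cat: i for i, cat in enumerate(categories)}
        # Randomly initialised here; a real model would learn these weights.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                        for _ in categories]

    def __call__(self, category):
        return self.weights[self.index[category]]


days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
day_emb = CategoryEmbedding(days, dim=4)

# Model input for one row: continuous features followed by the day's embedding.
row = {"hour": 14.5, "day": "Sat"}
features = [row["hour"]] + day_emb(row["day"])
print(features)
```

Compared with one-hot encoding, this lets the model place similar categories (say, Saturday and Sunday) close together if that helps the prediction.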