Superconvergence? Explain pls
Some Face Expression Recognition (FER) with fastai v1 :
Been working on a Classifier of 13 different Brands of Swiss Watches… got a 0.280851 error_rate, after 7 epochs and tuning the learning rate…will experiment with more epochs.
I have been working on American sign language dataset. I have used resnet34 model and I have got an accuracy of 99.97%. I have used opencv for making live prediction of signs made by hand via. webcam. Here are some of the top losses
Link to ASL-live-predictor github repository:
Look at references from fast.ai: https://www.fast.ai/2018/07/02/adam-weight-decay/
Here is another example of application of Fast.AI library to the problems in cancer genomic domain. The problem is discriminating between true and false variants detected by automatic workflows in tumor-normal cancer sequencing. Comments, questions are very welcomed, Thanks!
Well, that’s just pretty awesome.
I am impressed by 1 cycle policy and superconvergence methods do automatic so I write a blog on it and it’s is my first blog can fastai community give me feedbacks about blogs or am I getting the concepts correctly to relate superconvergence and regularization.
I just published SuperConvergence with inbuilt regularization -
Thank you for that post, that was very useful - here is what happened:
following Rachel’s post, I split my training and test set more rigorously (the key here was to have all the images of the same city either in the train set or the test set, and not in both). For anyone running into the same issue, split your sets yourself, then use
Unfortunately the result of this was a much lower test accuracy - around 60%; and no fiddling with the training seemed to improve the result. So I went and got more data
I downloaded data from the largest 25,000 cities in the world; around 100,000 labeled images in the dataset. Interestingly, resnet34 (which was better on the small dataset) turned out to be much worse than resnet50 on the large dataset. Resnet34 got to about 75% accuracy while resnet50 easily got to 80% Finetuning also helped a lot with this larger dataset, almost 5 points improvement.
Conclusion: with this new model, accuracy is now back to 85% (using rigorously split train and test sets).
The result is extremely pleasing, and the webapp is now almost creepy to try out; it seems to get almost everything right: yourcityfrom.space
I wanted to check how fast.ai initialized weights in CNNs and was surprised to see that it changed from last year!
So I wrote a blog post on “He Initialization”
Weight initialization for CNNs: A Deep Dive into He Initialization
Nice job! Do you have any interpretation on how it’s classifing?
IMO, for local installations it is always better to use docker wherever possible, to avoid messing up your installed packages. You don’t want to end up with incompatible versions. Cheers.
For a project like this I don´t see the need for a docker, I use multiple conda environments and it works just fine
I think data augmentation will help in this situation.
Yes, we have much trouble and I would say pain, getting the model converted to ONNX binary protobuf file which contains both the network structure and parameters of the model we exported. We are able to go from:
fast.ai + PyTorch ONNX Caffe2 Android devices.
The SqueezeNet model in plain PyTorch from torchvision runs fine on Android devices. Exporting fast.ai ResNet-family of models (we have tested ResNet18, a smaller model with lesser parameters and reasonable GFLOPS on mobile ARM processor/GPU(DSP)) through the ONNX exporter with Caffe2 backend proceed successfully (no error). However, when we put the 2 pb files on the Android app, it runs and crashed on Android devices. Our initial investigation found the crash could be due to discrepancies all over the ONNX specifications and around the tooling that follows ONNX.
The bottleneck is in this step: “Export the model to run on mobile devices”, where ONNX-Caffe2 backend rewrite ONNX graph to Caffe2 NetDef.
Currently, in our project, we are mostly doing detective work on the ONNX and Caffe2 sides. In this project, we learned and discovered a big revelation on how “production-readiness of PyTorch” after the PyTorch 1.0 release announcement, about 2 months ago. We are trying to demystify that announcement.
Please bear with us. Our response might be delay as we are super busy now:
- to get the iOS part done
- make it easier and more pleasant to ship and test your neural network model in PyTorch on mobile devices
- ironing out all the kinks that slow down the shipping process
Supercovergence refers to a consequence of the one cycle policy: as the policy involves partly training with a high learning rate, a regularisation effect (overfitting prevention) and shorter training time with stable prediction performance are achieved.
More in-depth explanation by @sgugger: https://sgugger.github.io/the-1cycle-policy.html#the-1cycle-policy
and the original paper:
Actually Docker on Windows 10 can run Linux container images as well.
Just made a classifier which can recognize the language from the script.
I have chosen 4 Indian languages. ( Bengali, Gujarati, Hindi and Telugu )
Architecture: ResNet 34
Accuracy: 95 %
The model was trained on electronic scripts. The test images were having newspaper, posters, textbooks etc. i.e. the test images distribution was much larger than training.
Even then the classifier worked with a very good accuracy.
I’ve used a model trained on the KDEF dataset to analyze Jeremy’s facial expressions in the lesson videos: https://gist.github.com/jerbly/695df9598eeb64788cb16075cf8a3d4a
Time series classification: General Transfer Learning with Convolutional Neural Networks
I have performed some time series classification (TSC) experiments applying the standard transfer learning approach we have learned in this course with the idea that it wouldn’t be very useful when applied to time series.
However, the results are great: 93.3% accuracy (very close to the state of the art which is 96.7%).
I think this demonstrates that the fastai transfer learning approach is pretty powerful, and may be applied to lots of different domains, beyond the type of images included in Imagenet.
The goal of the experiment is to identify the country of origin of 30 samples of olive oil based on spectrographs (univariate time series).
The experiment combines 2 approaches that have not been widely used in the TSC domain yet:
- Time series encoding as images. I have tested several approaches, and the one that works better is one where time series are mapped to an image by means of a polar coordinate transformation called the Gramian Angular Field (GAF). This creates a rich graphical representation of a univariate time series. If you are interested you can read this paper (Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks). Based on this the univariate time series on the left is transformed into the image on the right.
- Transfer learning with resnet 50. This is a difficult test as the model has not been previously trained on time series images.
I have used a small UCR time series dataset (Olive oil, with 30 train and 30 test samples) to confirm really test transfer learning.
No transformations have been used.
If you are interested, I have created a gist.
I now plan to apply this same approach to other difficult time series classification problems to get more insight into this technique. Also to modify the standard approach using some transformations which might be useful.
Feedback on the experiment is very welcome.
Edit: if you are interested in time series applications, I encourage you to participate in the Time series/ sequential data study group thread where you’ll be able to share your experience, ask questions, etc.