Share your work here ✅

Hello everyone,

I made a simple binary classifier to identify building architecture styles - Gothic or Renaissance.


Images were downloaded from Google. I used a resnet34 model and got an accuracy of 90.5%. The source is available here. Feedback and suggestions are welcome.

4 Likes

When loading the image data, try setting num_workers to zero. It slows things down, but it’s possible to train something.

Even if you set num_workers=0, learn.lr_find() is still interrupted.

Hey everyone,

After being inspired by @suvash and @nikhil.ikhar, I decided to look into Arabic handwritten characters. I found a dataset published last year for Arabic handwritten characters where they achieved a SoTA of 94.9%.

After a couple of hours of work with the fastai library, plus some tweaks and fine-tuning, I was able to hit an accuracy score of 96.9% :muscle: :boom:

I’ve collated my work in a notebook and pushed it to my Github if you would like to take a look - https://github.com/oasis789/Arabic-Handwritten-Characters-Dataset

10 Likes

I work on time series data for system and business monitoring. The most common way to work with time series data is directly on the time series data points and use things like moving averages, regression, and neural networks.

I have always wondered how well it would work to simply convert the time series data to images and use convolutional neural networks for image classification. The intuition is that “we know an anomaly when we see one” - so why not just do that?

TL;DR is that using fast.ai on time series images I generated, I have been able to consistently get to around 96-97% accuracy on this task.

This is kind of amazing because the time series data I trained on are from different time series domains (like service API latency versus purchase volumes) and generally the thought has been that we need to fine tune for each domain.
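To make the "time series as image" idea concrete, here is a minimal sketch of turning a series into a small black-and-white pixel grid. It is an illustration only: the post does not say how the images were generated (most likely with a plotting library), and the function name `series_to_image` and the grid height are assumptions made for this example.

```python
def series_to_image(values, height=32):
    """Rasterize a 1-D series into a height x len(values) grid of 0/1 pixels.

    Hypothetical helper for illustration; not the method used in the post.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    image = [[0] * len(values) for _ in range(height)]
    for col, v in enumerate(values):
        # Row 0 is the top of the image, so larger values get smaller row indices.
        row = (height - 1) - round((v - lo) / span * (height - 1))
        image[row][col] = 1
    return image


# A made-up series that is flat except for a spike at the right edge.
series = [10, 11, 10, 12, 11, 10, 11, 40]
img = series_to_image(series, height=8)

# Print the grid so the spike is visible in the top-right corner.
for row in img:
    print("".join("#" if px else "." for px in row))
```

Because each series is rescaled to its own minimum and maximum, the picture only shows shape, not units. That may be part of why one classifier can work across domains like latency and purchase volume.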

Here are some examples.

Anomaly: this time series has a spike toward the very end. I generated the images with some buffer at the right. I may experiment with making this lag window narrower in the future.
api-01-a-003066

Normal Time Series: This example is normal at the right edge, which is where we want to detect anomalies. The spike toward the left might have been an anomaly at that point in time, but we are not interested in that now.
app1-03-n-000505

I have roughly 100 anomaly images and 400 normal images.
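With a roughly 100/400 split, it helps to know what accuracy a trivial model would reach. The short sketch below computes the accuracy of always predicting the most common class. The helper name `majority_baseline_accuracy` is made up for this example, and the counts are the approximate ones quoted above.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


# Approximate image counts from the post.
counts = {"anomaly": 100, "normal": 400}
baseline = majority_baseline_accuracy(counts)
print(f"Majority-class baseline: {baseline:.0%}")  # 80%
```

So 96-97% is well above the 80% a "always normal" predictor would get, although per-class metrics such as recall on the anomaly class would say more than accuracy alone.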

Notebook is shared here.

Training results:

21 Likes

Hi everyone!

I managed to make an immune cell classifier and I’m serving it on FloydHub thanks to @whatrocks. I got my data from Paul Mooney, who made it available on Kaggle. The model has an accuracy of about 0.95, but I still think I can make it better. Check it out here: Immune cell classifier (https://www.floydlabs.com/serve/Pw4dG2dnXCBPXKEY7SBVAK). In case I’ve taken it offline, here’s a screenshot.

Sooo excited that the fast.ai library has enabled me to achieve this so quickly. It’s been a dream of mine to make this for months. Here’s the repository with the code and the model: https://github.com/Shuyib/fast-ai-projects

17 Likes

Awesome @uwaisiqbal - I’ve been looking for data sets related to Arabic over the last few days. Mainly for purposes of Arabic text extraction from images. This data set looks like a good starting point, thanks!

I put together a flower classifier trained on the Oxford-Flower-102 dataset. It has 102 classes of flowers, and I used the train/valid/test split provided with the dataset. With that split, only 10 images per class are used for training (1,020 in total), and the same number for validation.

The test set contains 6149 images.

After fine-tuning a Resnet50 model, without data augmentation, I got a 92.26% accuracy on the test set. I think it’s a great model considering how little data was used for training and the number of classes.

In case someone else wants to use the data, here is the code to split the data after downloading it into train/valid/test and put it in an ImageNet folder-structure (execute in a Jupyter notebook for the magic commands to work):

import scipy.io as sio
import numpy as np
import pandas as pd


data_split = sio.loadmat('raw_data/setid.mat')
image_labels = sio.loadmat('raw_data/imagelabels.mat')['labels'][0]

trn_id = data_split['trnid'][0]
val_id = data_split['valid'][0]
tst_id = data_split['tstid'][0]

data={'filename': np.arange(1, len(image_labels)+1), 'label': image_labels}
split = []
for file_index in data['filename']:
    if file_index in trn_id:
        split.append('train')
    elif file_index in val_id:
        split.append('valid')
    elif file_index in tst_id:
        split.append('test')
data['filename'] = ['image_'+str(n).zfill(5)+'.jpg' for n in data['filename']]
data['split'] = split

# This dataframe has three columns: filename, label, and split (train/test/valid)
image_labels = pd.DataFrame(data)


# Execute once to create dir structure + copy the data from the jpg folder to the new labeled, split folders

# ! mkdir raw_data/train
# ! mkdir raw_data/test
# ! mkdir raw_data/valid

# for split in ['train', 'valid', 'test']:
#     for label in range(102):
#         label = str(label+1)
#         ! mkdir 'raw_data/'$split'/'$label

# # Copy images to their labeled folders

# for index, row in image_labels.iterrows():
#     #print(row['filename'])
#     fname = row['filename']
#     split = row['split']
#     label = row['label']
#     ! cp 'raw_data/jpg/'$fname 'raw_data/'$split'/'$label'/'
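
The commented-out shell commands above only work inside Jupyter on a Unix-like system. As a rough alternative, here is a pure-Python sketch that does the same folder creation and copying with `pathlib` and `shutil`. The function name `copy_into_split_folders` is made up for this example, and the demo uses placeholder files in a temporary directory rather than the real dataset.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_split_folders(rows, src_dir, dst_root):
    """Copy each image into dst_root/<split>/<label>/, creating folders as needed.

    rows is an iterable of (filename, split, label) tuples, e.g. built with
    image_labels[['filename', 'split', 'label']].itertuples(index=False).
    """
    src_dir, dst_root = Path(src_dir), Path(dst_root)
    copied = 0
    for filename, split, label in rows:
        target_dir = dst_root / str(split) / str(label)
        target_dir.mkdir(parents=True, exist_ok=True)  # safe to re-run
        shutil.copy(src_dir / filename, target_dir / filename)
        copied += 1
    return copied


# Tiny self-contained demo with made-up files in a temporary directory.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "jpg").mkdir()
demo_rows = [
    ("image_00001.jpg", "train", 77),
    ("image_00002.jpg", "valid", 77),
    ("image_00003.jpg", "test", 3),
]
for name, _, _ in demo_rows:
    (demo_root / "jpg" / name).write_bytes(b"placeholder")

n_copied = copy_into_split_folders(demo_rows, demo_root / "jpg", demo_root)
print(n_copied, "files copied under", demo_root)
```

For the real data, the call would presumably look like `copy_into_split_folders(image_labels[['filename', 'split', 'label']].itertuples(index=False), 'raw_data/jpg', 'raw_data')`, assuming the dataframe built above.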

6 Likes

I am working on something similar in my office! Nice to see your results. Would try to apply that in my work. Thanks!

Which datasets have you managed to find?

I have access to a dataset for OCR with Arabic scientific manuscripts from this project - https://www.primaresearch.org/RASM2018/

I’d be interested in working together on Arabic OCR if you’d be up for that

2 Likes

First, thank you for alerting me to Crestle’s policy until year-end for storage. It even has a real terminal, unlike Colab. GCP keeps running out of resources and Paperspace stopped saving my updated notebooks a while ago (which may have been fixed - I haven’t checked). I ended up using Colab, which leads me to…

Second, I ran the core portions of Lesson 2 on Colab w/o incident. By ‘core’, I mean the learners and the models and even the image verifier. However, Colab doesn’t support widgets, so the FileDeleter/ImageDeleter functions do not work. Otherwise, it seemed fine.

Did you have a different experience w/ Colab?

Thanks for sharing. Someone at my office asked about use cases for NNs, and time series was one of my examples. Now I have a working example I can share with him.

@GiantSquid and I created this Chrisifier:

https://still-refuge-41112.herokuapp.com/

For aviation enthusiasts, I’ve updated the aircraft classifier project that classifies aircraft into civilian, military (manned) and UAV (unmanned) categories. I have pruned the dataset and updated the model. Using resnet50, it has improved quite a bit.

Using the new model I have created this web app. Check it out at: deepair.

I’ve written the following short Medium post describing some of the details. The accompanying notebook can be found at this gist.

1 Like

Hi everyone,

I have created a binary image classifier that distinguishes abnormal from normal brain MR images, and deployed it.

You can test it here: https://brain-mr-images-classification.now.sh/

Thank you so much for all your help and support.

9 Likes

I used the material in the first two lectures to train a classifier for detecting plant diseases through the sounds insects make (still a hypothesis). I have been working on this project with researchers in the U.K. for a while now and was having problems with analysing raw audio but was able to solve it by converting the data to images with librosa.

I trained a classifier for the different experiment conditions (e.g. infected plant or non infected plant, males only or males and females together):

classconf

I then compared the usefulness of different dimensionality reduction techniques (PCA, t-SNE, UMAP) for visualizing the data. For example, PCA here:

21 Likes

These results look encouraging - is your validation set from plants that are well separated from those in the training set?

Hi everyone,
I made a bird classifier based on what I learned in lectures 1 and 2, so here it is.
I am using a pre-trained ResNet34 model.
The accuracy is 91%.
The confusion matrix:

Notebook Url

1 Like

I successfully trained a text classifier on legal judgments based on the lesson3-imdb notebook. A multi-label classifier would have been more suitable for the use case I had in mind, but I went ahead and trained a 19-way classifier. It gave very strong results out of the box (82.56% accuracy), considering the huge imbalance in the number of documents per topic, the number of classes (19 instead of just positive/negative), and the fact that I used most of the fastai default settings. The errors the classifier made were reasonable misclassifications due to overlapping subject matter.

What’s so amazing is that the fastai library makes it so easy to get quick results. There were frequent changes to the library in the past week as I was working on this but the actual training of the model was pretty straightforward once the library updates settled, with only a small amount of digging into the source code required to understand what was going on.

I will be presenting my results and how I used fastai’s ULMFiT this Wednesday at the National University of Singapore’s School of Computing Project Showcase (I’ve been participating in a deep learning study group there). Here’s the poster I’ve prepared for it. Will also put up a more detailed Medium post after I run a few more experiments.

I look forward to seeing more of everyone’s impressive work!

Confusion matrix

Sample judgment page

28 Likes

Hi all, I tackled another problem from the cancer genomics domain: cancer type classification using gene expression data. This is my subject area, hence almost the same topic for all of my work :wink: This time, I peeked a bit at the structured data documentation and did not convert the data into images (although I am sure we can represent this data as such). Overall accuracy is 93.9%, a tiny bit better than the recent paper that addressed this problem.


Thanks!

14 Likes