Share your work here ✅

Hi Akshay, sorry for the late reply. I actually did a bunch of things that day, so I'm not sure which one worked. :smile: But one of them was to downgrade my Python from 3.7 to 3.6 to get the VSCode Python and Jupyter extensions working. Can you try just this step and see if it works for you? All the best!

1 Like

Some initial results colorizing black and white images using perceptual loss: https://medium.com/@btahir/colorizing-pakistan-5697f7754b2a

Github: https://github.com/btahir/colorizing-pakistan

2 Likes

Not my work but I thought you might find this interesting nonetheless:

Mannino, Robert G., et al. “Smartphone app for non-invasive detection of anemia using only patient-sourced photos.” Nature Communications 9 (2018).
https://www.nature.com/articles/s41467-018-07262-2

An algorithm was then written in MATLAB utilizing robust multi-linear regression with a bisquare weighting algorithm

MATLAB!!

I note with interest from this article:

The app is part of the PhD work of former biomedical engineering graduate student Rob Mannino, PhD, who was motivated to conduct the research by his own experience living with beta-thalassemia, an inherited blood disorder caused by a mutation in the beta-globin gene.

Pairs nicely with one of my favourite PG essays:

The way to get startup ideas is not to try to think of startup ideas. It’s to look for problems, preferably problems you have yourself.

4 Likes

I have been playing around with an MNIST classifier for a few weeks, and noticed in Lecture 7 that Jeremy did a resnet-like MNIST model that is similar to what I have been doing. So I thought I’d share it here.

In my model, I reliably get an accuracy on the test set of 99.82 percent. This is with a committee of 35 classifiers, where I average their softmax outputs to come up with a predicted probability distribution.

The individual nets average 99.73 percent accuracy, and the committee is formed by taking n of them with no selection criteria whatsoever.
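
In code, the committee prediction is roughly the following (a minimal sketch, assuming models is a list of the trained nets and x is a batch of test images):

import torch
import torch.nn.functional as F

def committee_predict(models, x):
    # average the per-net softmax distributions, then take the argmax as the prediction
    probs = [F.softmax(m(x), dim=1) for m in models]
    avg = torch.stack(probs).mean(dim=0)
    return avg.argmax(dim=1)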

So for the 10,000-element test set, a 99.82 percent accurate classifier has 18 misclassifications. Here is a typical run:

[Image: the misclassified test digits from a typical run]

Interestingly, I think all or nearly all of these are genuine misclassifications. In other words, I agree with ground truth here. Most of the characters are marginal and I can certainly see why my classifier chose what it did, but the ground truth labels are, in my opinion, subtly perceptive and correct. So there is still room for improvement!

I’ve posted the business end of my model below, which you can plug into Jeremy’s notebook. I use one-cycle training with ten epochs total for a model, with a slight variation on the momentum schedule.
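
Schematically, the training call is something like this (a sketch only, assuming the fastai imports from Jeremy's notebook and that data is an MNIST DataBunch; the max_lr and moms values here are just placeholders):

learn = Learner(data, mnist_model(), loss_func=nn.CrossEntropyLoss(), metrics=accuracy)
learn.fit_one_cycle(10, max_lr=1e-2, moms=(0.95, 0.85))  # ten epochs of one-cycle with a tweaked momentum range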

My data augmentation is different, and I think that's what pushes this classifier beyond any published result I've seen. (The best I've seen published is 99.79 percent; I'd be interested if anybody has seen better than that.)

For augmentation, I use a small amount of random elastic distortion, a small amount of random rotation (within plus or minus 10 degrees), and I randomly crop the 28x28 images to a 25x25 size, then resize back up to 28x28 (this has the effect of translating the image a little and also thickening it as a side effect).

At inference time, I do the same augmentation, except that instead of the random crop I do a single 25x25 center crop (with resize back to 28x28).
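
In torchvision terms the two pipelines look roughly like this (a sketch, not my exact code; ElasticTransform only exists in recent torchvision versions, and the alpha/sigma values are placeholders):

from torchvision import transforms

train_tfms = transforms.Compose([
    transforms.ElasticTransform(alpha=30.0, sigma=4.0),  # small random elastic distortion
    transforms.RandomRotation(10),                       # within plus or minus 10 degrees
    transforms.RandomCrop(25),                           # random 25x25 crop -> slight translation
    transforms.Resize(28),                               # resize back up, which also thickens strokes
    transforms.ToTensor(),
])

test_tfms = transforms.Compose([
    transforms.CenterCrop(25),                           # single deterministic crop at inference
    transforms.Resize(28),
    transforms.ToTensor(),
])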

The model itself is the same idea as Jeremy's: inspired by ResNet, but with fewer layers. Also, the skip connections surround a single conv layer. I err on the side of too many batchnorms. There is a small dense layer at the end (about 5000 parameters). I use no dropout, no weight decay, and the Adam optimizer with a one_cycle schedule.

There are about 100x more parameters in my model than in Jeremy's, even though the layer structure is similar. My convolutions just output many more channels. Still, each net trains in about 2 minutes on my 1080 Ti.

import torch
import torch.nn as nn
import torch.nn.functional as F

bn_params = {}  # extra BatchNorm kwargs (e.g. momentum); left empty here, set as desired

def mnist_model():
    return nn.Sequential(
               nn.Conv2d(in_channels=1, out_channels=128, kernel_size=5, padding=2),
               nn.ReLU(),

               Residual(128),
               nn.MaxPool2d(2),
               Residual(128),

               nn.BatchNorm2d(128),
               nn.Conv2d(128, 256, 3, padding=1),
               nn.ReLU(),
               nn.MaxPool2d(2),
               Residual(256),

               nn.BatchNorm2d(256),
               nn.Conv2d(256, 512, 3, padding=1),
               nn.ReLU(),
               nn.MaxPool2d(2, ceil_mode=True),
               Residual(512),

               nn.BatchNorm2d(512),
               nn.AvgPool2d(kernel_size=4),
               Flatten(),
               nn.Linear(512,10),
               # Softmax provided during training.
           )

class Residual(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.bn = nn.BatchNorm2d(d, **bn_params)
        self.conv3x3 = nn.Conv2d(in_channels=d, out_channels=d, kernel_size=3, padding=1)

    def forward(self, x):
        # pre-activation batchnorm, then a skip connection around a single conv layer
        x = self.bn(x)
        return x + F.relu(self.conv3x3(x))

class Flatten(nn.Module):
    def forward(self, x):
        return x.view(x.size()[0], -1)
6 Likes

Great work!!
Waiting for the blog.

Hey! I recently competed on Kaggle in the PLAsTiCC competition for classifying astronomical objects, together with @henripal @marcmuc @oguiza @paul and @Takezo. It was my first Kaggle competition, and I wanted to keep a record of what I learned, so I wrote a blog post about it here. I summarize the problem statement, the most important techniques used by the winners, and the lessons on ‘how to Kaggle’ that I picked up from the team.

16 Likes

Hello,

I continued playing with transfer learning, now with some very interesting publicly available datasets. With minor or no tuning, the fastai library keeps showing amazing results.

All details and notebooks are in: https://github.com/martin-merener/deep_learning/tree/master/more_transfer_learning

Briefly what I did:

I built 4 classifiers for image recognition problems:

  1. 57,000 images of fruits, for classification into 83 different categories.
  2. 10,000 images of skin lesions, for classification into 7 types of skin-cancer diagnoses.
  3. 5,800 Chest X-Ray images that are classified into Pneumonia or Normal.
  4. 85,000 images of retinas, for classification into 4 categories (normal, and 3 diseases).

In some cases the accuracy is really high on the test set. The selection of hyper-parameters was very simple (almost all defaults), and test performance was computed only once at the end. No cherry-picking of data partitions, runs, datasets, etc.
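
The recipe for each dataset was essentially the standard fastai v1 transfer-learning loop, roughly like this (a sketch with placeholder values, assuming path points at an images-by-folder dataset; the exact settings are in the notebooks in the repo):

from fastai.vision import *

data = ImageDataBunch.from_folder(path, valid_pct=0.2, ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)
learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)                              # train the new head
learn.unfreeze()
learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))    # fine-tune the whole network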

Performances on test sets:

  1. Fruits: 99% accuracy. This is much higher than the performance achieved in the paper that introduces the dataset, but their model is trained from scratch with an ad-hoc architecture, not with transfer learning.

  2. Skin lesions: 83% accuracy. This was the most challenging, possibly due to the unbalanced training set. However, when normalizing the confusion matrix by row and by column, the performance seems pretty good (see the sketch after this list).

[Image: skin-lesion test confusion matrix]

  3. Pneumonia: 93% accuracy. Here it is interesting that even though the performance has room for improvement, the rate of false negatives is very low, with P[predict=Normal | actual=Pneumonia] < 0.01.

[Image: pneumonia test confusion matrix]

  4. Retina: 99% accuracy.

[Image: retina test confusion matrix]
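
Regarding the row/column normalization mentioned for the skin lesions: normalizing the confusion matrix by row gives the recall for each true class, and normalizing by column gives the precision for each predicted class. A minimal numpy sketch, assuming cm holds the raw confusion-matrix counts:

import numpy as np

cm = np.asarray(cm, dtype=float)               # cm[i, j]: count of true class i predicted as class j
row_norm = cm / cm.sum(axis=1, keepdims=True)  # rows sum to 1: recall per true class
col_norm = cm / cm.sum(axis=0, keepdims=True)  # columns sum to 1: precision per predicted class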

9 Likes

Followed Jeremy's advice and just created my very first blog on Medium.

This is my very first blog post in English, and it's also my very first blog post ever. So I'm not sure whether I violated any rules or failed to cite some papers properly, and I'm not familiar with how to format it to look prettier. If anyone has time to take a look and kindly offer some feedback and advice, it would be highly appreciated.

7 Likes

I'm working on a hand-tracking problem using a webcam. The result is below; the dataset is a 1 min 30 s video corresponding to 350 images.


Source code for the prediction:
import cv2
from fastai import *
from fastai.vision import *

path = Path('data/1')
empty_data = ImageDataBunch.load_empty(path)
learn = create_cnn(empty_data, models.resnet34).load('stage-3')

cap = cv2.VideoCapture('output.avi')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output_play.avi', fourcc, 20.0, (640,480))

while(cap.isOpened()):
    ret, frame = cap.read()
    if ret:
        # workaround: write the frame to disk and re-open it as a fastai Image
        cv2.imwrite('test1.png', frame)
        img = open_image(Path('test1.png'))
        y = learn.predict(img)[0]
        # predicted point is in [-1, 1]; map it back to pixel coordinates for a 480x640 frame
        loc = (y.data + 1) * torch.Tensor([[240, 320]])
        # flip (row, col) -> (x, y) for OpenCV and cast to int for cv2.circle
        loc = loc[0].flip(0).numpy().astype(int)
        cv2.circle(frame, (loc[0], loc[1]), 10, (0, 0, 255), -1)
        cv2.imshow('frame', frame)

        out.write(frame)
        # press any key to advance to the next frame, 'q' to quit
        if cv2.waitKey(0) & 0xFF == ord('q'):
            break
    else:
        break
        
cap.release()
out.release()
cv2.destroyAllWindows()

Next steps to improve the result:

  1. I use OpenCV to work with the webcam and video, but I haven't succeeded in creating a fastai Image object directly from an OpenCV image. For now I write the frame to a .png file and call open_image on that file. I found that a numpy array can be used to create an Image object, but I haven't managed to make that work either.
  2. The tracking point still vibrates too much. I will try to use a recurrent network so that the point takes information from the previous frame; maybe that can help reduce the vibration.

Hope to receive help :smiley: Thank you in advance

7 Likes

Sounds cool! Any chance you could share a notebook or similar so we can see what you did?

Hi,

I used the approach in the Super Resolution notebook to try to get paintings to look more like photographs. As my “crappify function”, I used ImageMagick to convert the pets photos into images that looked like paintings. I then trained a model and looked at how well the model worked with real paintings.
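
I won't reproduce the exact ImageMagick command here, but the crappify step was something in the spirit of this sketch (the -paint/-blur settings are placeholders, and wrapping the convert binary in Python is only for illustration):

import subprocess
from pathlib import Path

def crappify(src: Path, dest: Path):
    # oil-paint-style effect via ImageMagick; the specific options and values are illustrative only
    subprocess.run(['convert', str(src), '-paint', '4', '-blur', '0x1', str(dest)], check=True)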

It's amazing what it does with another photo (not in the training or validation set) when I apply the same ImageMagick function and then ask the model to turn it back into a photograph:

Original photo:

After ImageMagick conversion:

After the model tries to turn it back into a photo:

The results on actual paintings are mixed, likely because there are many different ways an actual painting can differ from a photo, while ImageMagick is very consistent in its artistry.

Notebook here. Any suggestions welcome!

Thanks

18 Likes

Hi Everyone,
I finally made it to the end of Lecture 7 and decided to summarize all of the mistakes I've made over two years of trying to take up fast.ai and discovering all sorts of new ways to fail at it.

Here is my essay on “How not to do fast.ai”

-Sanyam

7 Likes

Hi everyone,
I started with the Travelling Santa Kaggle competition, even though it's about optimization rather than deep learning. This is my code. I am also attempting the Humpback Whale Identification challenge now.

I added a siamese network to my Kaggle whale identification starter pack.

I feel this is quite an interesting way of using the library and in general the concept of a siamese network is quite neat.
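
Stripped of the fastai plumbing in the notebook, the core idea is just a shared encoder applied to two images, with the distance between the two embeddings indicating whether they show the same whale (a minimal sketch, not the actual code from the starter pack):

import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder              # the same weights process both inputs

    def forward(self, x1, x2):
        e1, e2 = self.encoder(x1), self.encoder(x2)
        return F.pairwise_distance(e1, e2)  # small distance -> likely the same whale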

Anyhow, in my tweet on this I mention a project idea. When working through the v2 version of this course I picked up many little projects like this and found them very useful for learning. To a large extent, this continues to be my modus operandi.

I'm not saying one can get a lot of mileage out of this project, but who knows; people often ask for ideas on what to work on, so maybe this will be useful to someone.

Here is the associated thread on Kaggle where maybe some discussion will take place.

13 Likes

Here’s a function I wrote to convert a cv2 frame into a fastai image. This will solve point (1) for you:

import numpy as np
import torch
from fastai.vision import Image

def from_cv2(arr: np.ndarray) -> Image:
    "Return `Image` object created from image in array `arr` in cv2 format."
    # cv2 returns image as rows, cols, bgr
    # reverse the order of the last rank bgr -> rgb
    # make a copy to fix the -1 stride which torch can't process
    # move the last rank to the front to make rgb, rows, cols
    rgb_rc = np.moveaxis(arr[..., ::-1].copy(), 2, 0)
    return Image(torch.from_numpy(rgb_rc).float().div_(255))

Remove these lines:

        cv2.imwrite('test1.png',frame)
        img = open_image(Path('test1.png'))

And replace with:

        img = from_cv2(frame)
7 Likes

Thank you a lot!! I will try it.

Hi Pankaj,
First of all, thanks for your amazing work on creating this guide. I am currently using it to deploy my first classification model as a web app. There's one thing that has been bothering me though, and I hope you can help me with it: I notice that sometimes the images are rotated after uploading (especially when the image is wider than it is tall). Is there a way to fix this? Much thanks!

Thanks Quan. The rotation is a known issue with browsers when camera pictures are uploaded in landscape mode (width longer than height), so the current starter pack uses Canvas methods in JS and forces portrait-mode images on upload.
You can try playing with these JS hacks in utility.js (comment/uncomment/add new code) under the static folder:

var canvas = document.createElement('canvas');
// FORCEFUL to portrait mode only images
if (w > h) {
    // canvas.width = h;
    // canvas.height = w;
    canvas.width = w;
    canvas.height = h;
    var ctx = canvas.getContext('2d');
    // move the rotation point to the center of the rect
    // ctx.translate( h / 2, w / 2);
    ctx.translate( w / 2, h / 2);
    ctx.rotate(90 * Math.PI / 180);
    // ctx.drawImage(image, -h / 2, -w / 2, h, w);
    ctx.drawImage(image, -w / 2, -h / 2, w, h);
}

1 Like

When I said this, I really meant it :slight_smile: Here is a new addition: fluke detection.

Here are a couple of detections:
[Image: example fluke detections]

I was quite surprised that doing something so simple (though with the use of a pretrained model) could give such good results. Also, I only used 300 examples for training.

I annotated the images myself (details in the NB) and I found the process really valuable. I learned a lot about the data and it gave me good food for thought on what I would like my model to do. If I ever have to go through such an ordeal again, possibly with more data, I am getting a gaming mouse!

16 Likes

Thanks Jeremy. I just put the finishing touches on the code and the notebook, which gives a walk-through, and put it on GitHub here.

1 Like