Share your work here ✅

Here’s what I wrote after two lessons! (in Spanish)

Thanks to fast.ai & its community!

1 Like

I know this isn’t a great error rate, but for Lesson 1, not really knowing what I’m doing and using a fairly small dataset (~200 images), I feel pretty good about this model!

I was surprised how well back/front yard was predicted. One interesting case was a back yard with a couch being categorized as a living room. In retrospect, I should’ve added a patio category to account for something like that.

1 Like

Building semantic point clouds for the Kaggle Lyft 3D Object Detection for Autonomous Vehicles Competition

@simonjhb and I just published the first blog post about our experience in the Lyft 3D object detection competition.

Eventually we ended up with a custom model, starting from the Lyft public kernel and borrowing a lot of ideas from the PointPillars paper and the Entity Embedding paper.
To find a way to leverage both point clouds and camera data, we trained the network on a Semantic Point Cloud, where the semantic class of each point is treated as a sort of 2D categorical variable.
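The idea of a semantic channel per point can be sketched like this; everything here (the shapes, and a fixed random table standing in for a learned embedding) is a made-up illustration, not our actual model code:

```python
import numpy as np

# Hypothetical sketch: attach a per-point semantic class (a categorical
# variable) to an (N, 3) point cloud via an embedding lookup. In a real
# model the table would be a learned nn.Embedding; a fixed random table
# stands in for it here.
rng = np.random.default_rng(0)

n_points, n_classes, emb_dim = 5, 4, 3
points = rng.random((n_points, 3))        # x, y, z coordinates
sem_class = np.array([0, 2, 2, 1, 3])     # semantic class id per point

embedding_table = rng.standard_normal((n_classes, emb_dim))
sem_features = embedding_table[sem_class]  # (n_points, emb_dim) lookup

# Concatenate geometry and semantic features into one input tensor.
semantic_cloud = np.concatenate([points, sem_features], axis=1)
print(semantic_cloud.shape)  # (5, 6)
```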

On the road to a bronze medal, we faced a lot of challenging and interesting problems, such as:

  • Dealing with a pretty big dataset (around 120 GB) and the consequent need to focus on performance for almost every task, rebooting our GCP instance with a T4 GPU for training and with 32 cores and no GPU for post-processing.

  • Lots of pre- & post-processing: plenty of linear algebra to manipulate point clouds, old-school computer vision tricks (e.g. contours, erosion & dilation…) and fairly advanced PyTorch techniques like grid_sample and scatter operations that made it possible to deal with that amount of data.

  • Leveraging fastai with a plain PyTorch model, focusing on the training loop, the data block API and custom transformations to deal with tensors that have both continuous and categorical channels.
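To illustrate the scatter-style operations mentioned above, here is a minimal numpy sketch of a scatter-add accumulating per-point features into a bird's-eye-view grid, similar in spirit to the PointPillars scatter step (the real pipeline used PyTorch ops on the GPU; the grid size and feature values here are made up):

```python
import numpy as np

# Hypothetical scatter-add: accumulate per-point features into a BEV grid.
grid_h, grid_w, n_feat = 4, 4, 2
points_xy = np.array([[0, 0], [0, 0], [3, 1], [2, 2]])  # grid cell per point
feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])

bev = np.zeros((grid_h, grid_w, n_feat))
# np.add.at performs an unbuffered scatter-add, so repeated indices
# (two points falling into the same cell) accumulate correctly.
np.add.at(bev, (points_xy[:, 0], points_xy[:, 1]), feats)

print(bev[0, 0])  # [4. 6.] -- two points fell into cell (0, 0)
```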

We’re releasing the code on GitHub soon, along with more deep-dive articles on the various techniques we’ve used.

17 Likes

@ste this is absolutely awesome! Very well done! I can’t wait to see the source code and read more of the articles. There is so much to learn here :slight_smile:

3 Likes

Hi Dave, fabulous work! It’s really inspiring! By the way, I’m really interested in how you prepared the dataset, as you mentioned “convert geojson and crop individual buildings …”. Would you mind sharing some of your experience with that? It would be a great help! Thanks in advance!

Will

Hi everyone,

I’ve been working on a fastai version of the segmentation part of https://mateuszbuda.github.io/2017/12/01/brainseg.html - in which Mateusz achieved “83.60% mean DSC and 87.33% median DSC”.
It would be great if someone could take a look at https://www.kaggle.com/peter88b/brain-segmentation-fastai/output?scriptVersionId=24252708 to see if I’ve made any mistakes, because I’m getting a dice score of just over 0.9, which looks like a significant improvement, but I’m not 100% sure my results are directly comparable.
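For anyone comparing numbers, the Dice similarity coefficient (DSC) I’m quoting is just 2·|A∩B| / (|A| + |B|) over binary masks; a minimal numpy sketch (not the notebook’s exact metric code):

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice similarity coefficient for binary masks: 2|A∩B| / (|A|+|B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_score(pred, target), 4))  # 0.6667
```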

I’ve made both the training and inference notebooks public:
https://www.kaggle.com/peter88b/brain-segmentation-fastai
https://www.kaggle.com/peter88b/brain-segmentation-inference-fastai

Thanks,

Pete

Hello everyone,
I created my own image segmentation dataset by using the Carla Simulator. Here is an example from the dataset:

I then trained a model on the data, and it worked quite well. If you are interested, have a look at my Github repositories:

My Dataset for image segmentation

Tutorial on how to create your own dataset & Code for training a UNet model on the dataset

There is only one point I’m not happy about. I noticed that the ImageSegmentationMask.show() function sometimes switches the color palette. I checked the mask arrays and they have exactly the same codes, so it seems this is not an issue with my dataset but with the plot function used. Does anyone know a way to fix that?
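In case it helps anyone hitting the same palette issue: matplotlib rescales a colormap to each array’s min/max, so a mask that happens to be missing one class gets different colors for the same codes. Pinning vmin/vmax to the full range of class codes usually stabilizes the palette; a small sketch with made-up masks (not the fastai show() internals):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt

n_classes = 5
mask_a = np.array([[0, 1], [2, 4]])  # class 3 missing -> max is 4
mask_b = np.array([[0, 1], [2, 3]])  # class 4 missing -> max is 3

fig, axes = plt.subplots(1, 2)
for ax, mask in zip(axes, [mask_a, mask_b]):
    # Fixed vmin/vmax: code 2 maps to the same color in both plots.
    ax.imshow(mask, cmap="tab10", vmin=0, vmax=n_classes - 1)
plt.close(fig)
```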

7 Likes

The first demo of my audio project Manatee Chat! It took me about a year to get here… :blush: It is still struggling with partial manatee calls, but it is getting better. It was trained on a small dataset and I did not use any transformations or augmentations yet (that will be the next step).

https://manatee-chat-demo.appspot.com/

4 Likes

I started working through the first Lectures and wanted to share my initial project. As a Subaru fan, I wanted to create a classifier that distinguishes between two of the models, Forester and XV, which happen to look somewhat alike (being that one is a cross-over and the other a compact SUV). This and the fact that there are several generations of the models make the task at least mildly interesting/non-trivial.
Here is a preview of the data:


1. Initial Learning Rate Estimation and Model Training

Note that at this point we are already doing quite well.

2. Unfreeze and Continuously Check Progress/Re-estimate LR

Things are still going well (no overfitting!) so we will do more

Now we seem to have ended up with a good set of weights
So let’s see the confusion matrix … lol
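For reference, the confusion matrix is just counts of (actual, predicted) pairs; a tiny sketch with hypothetical labels (in fastai, ClassificationInterpretation.plot_confusion_matrix produces the real one from a trained learner):

```python
import numpy as np

# Hypothetical ground truth and predictions; the real numbers come from
# the trained model, not from this sketch.
classes = ["Forester", "XV"]
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

cm = np.zeros((2, 2), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1  # rows: actual class, columns: predicted class

print(cm)
# [[2 1]
#  [1 2]]
```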
3. Conclusion
We were able to achieve an excellent solution thanks to transfer learning (using a ResNet50 architecture) and the incredible fastai library. Overall we had just 616 data points, which definitely contradicts the common belief that one needs a lot of data to successfully use Deep Learning in practice.
The full notebook and the data are available at: ClassifyingSubarus
I am looking forward to exploring further projects and getting more familiar with the inner workings of the library, thanks Jeremy and team for a great course and library!

5 Likes

Thanks! Has anybody tried LIME on text data? Are there any other good explainability tools/packages?

Nice! I bet @aza, @radek and @britt would be interested - they’re working on this area too (https://stochasticlabs.org/portfolio/ai-animal-intelligence/).

4 Likes

Thank you! How interesting! This is what I am working on too (marine mammals mostly: whales, dolphins and manatees). The first step is to identify calls, then to classify them, and finally (hopefully) maybe use NLP models to search for underlying structure (like people have tried with lost human languages, e.g. Linear A or the Voynich Manuscript).

3 Likes

Thank you so much @jeremy for the heads up on this :slight_smile:

Natalija, your work is amazing. I know a little bit about what goes into hosting a deep learning model; to be able to put together an end-to-end example like this is really impressive.

I am however even more excited about your post above :slight_smile: I think there would be a lot of overlap between our work and a lot of grounds to share :slight_smile: I don’t want to hijack this thread even further, let me please send you a PM :slight_smile:

1 Like

I want to share my second mini project from the 4th lesson.

I built a phishing classifier using fast.ai tabular data and the following dataset: https://data.mendeley.com/datasets/h3cgnj8hft/1

The dataset contains 48 features extracted from 5000 phishing webpages and 5000 legitimate webpages.

I obtained 98% accuracy, outperforming benchmarks obtained with traditional ML algorithms used for phishing detection like Random Forest and SVM. For instance, the paper related to the dataset says: “The overall experimental results suggest that HEFS performs best when it is integrated with Random Forest classifier, where the baseline features correctly distinguish 94.6% of phishing and legitimate websites using only 20.8% of the original features.” https://www.sciencedirect.com/science/article/pii/S0020025519300763#ec-research-data

So, like @jeremy said in the 4th lesson: “It’s not true that neural nets are not useful for tabular data; in fact, they are extremely useful.”
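One reason tabular neural nets do so well is entity embeddings for categorical features. As a side note, fastai sizes those embeddings with a heuristic; the sketch below mirrors one published variant of that rule, but check emb_sz_rule in your fastai version for the exact formula (and note that this phishing dataset’s 48 features may be largely numeric):

```python
# Sketch of the kind of heuristic fastai uses to pick an embedding size
# per categorical column (mirrors one published variant of emb_sz_rule;
# your fastai version's exact formula may differ).
def emb_size(n_categories: int) -> int:
    return min(600, round(1.6 * n_categories ** 0.56))

# A binary feature gets a tiny embedding; high-cardinality ones get more,
# capped at 600 dimensions.
print(emb_size(2), emb_size(1000))
```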

I really appreciate the work that Jeremy, Rachel and the folks from the fast.ai team are doing to bring AI to everyone!

Update: Here is the notebook: https://github.com/johnagr/Phishing-Classifier-

10 Likes

Part 1: Lesson 1 - Adventures
I used 1675 images of men and women and built a model to predict man vs. woman. I got pretty good results. Here’s my work:


1 Like

I am positive this was a ton of work, but if you get around to sharing your source code, I would love to learn how you approached this problem beyond what Jeremy taught on the Rossmann dataset. :blush:

If not, could you maybe explain whether you tweaked the architecture?

1 Like

Hi Johnpal
Nice work!
mrfabulous1 :smiley::smiley:

1 Like

to identify calls, then to classify them and finally (hopefully) maybe use NLP models to search for underlying structure

You may also be interested in this new 8 TB dataset of whale calls I stumbled on the other day: http://www.soest.hawaii.edu/ore/dclde/dataset/

DCLDE being the 2020 Workshop on Detection, Classification, Localization and Density Estimation of Marine Mammals using Passive Acoustics

1 Like

Hello Jeremy, in fact it wasn’t. I used the tabular notebook from the class. I added a few lines, mostly to pre-process the dataset, and used the one-cycle policy.
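For anyone curious, the one-cycle policy mentioned here just shapes the learning rate as a warm-up to a peak followed by a long anneal. This pure-Python sketch mirrors the idea, not fastai’s exact fit_one_cycle schedule; all the constants are illustrative:

```python
import math

# Rough sketch of the learning-rate shape behind the one-cycle policy:
# cosine warm-up from lr_max/div up to lr_max, then cosine anneal to ~0.
def one_cycle_lr(pct: float, lr_max: float = 1e-2,
                 div: float = 25.0, pct_warmup: float = 0.25) -> float:
    def cos_interp(start, end, t):
        return start + (end - start) * (1 - math.cos(math.pi * t)) / 2
    if pct < pct_warmup:
        return cos_interp(lr_max / div, lr_max, pct / pct_warmup)
    return cos_interp(lr_max, lr_max / 1e4, (pct - pct_warmup) / (1 - pct_warmup))

# Low start, peak at the end of warm-up, near zero at the end of training.
print(one_cycle_lr(0.0), one_cycle_lr(0.25), one_cycle_lr(1.0))
```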

I was going to share the notebook in a github repository, but I had to leave home.

I will share it on Monday.

2 Likes

Really?!

Awesome! I have seen some amazing things happening with the fastai tabular framework, for example a winning solution for one of the Kaggle competitions that used k-folds. I haven’t been able to completely go through that framework yet, but it’s on my list.

I have worked with the tabular model and used the embeddings to get better results with a random forest, but I look forward to seeing your approach. I still think there is so much value in tabular data, even though NLP is what fascinates me the most.

Best regards. I’ll be back Monday night to see if you were able to upload it. Thanks a million!

1 Like