Share your work here ✅

jmp · June 3, 2022, 4:39am

Attending the lesson where we went through NLP for absolute-beginners made me decide to take on NLP as homework, and revive some work on lithology analysis (lithology: “the science of describing the physical characteristics of the stuff under your feet”).

I have a first blog post on this, powered by fastpages of course. No machine learning so far by the way, “only” discovering the data…

Jeremy shared his opinion at some point that there is a lot of value that could be created from NLP, and this comment stayed with me over the past weeks. There is still a lot of information out there captured perhaps like this, and a lot of knowledge and insights (as well as ethical questions by the way) to gain from this data.

[Image: soil borehole log showing the stratification description of the subsoil encountered]

edit: external images rendering in preview but not rendering in the final post

Zakia · June 4, 2022, 8:45pm

I need some help. I’m working on an NLP model, and one column of the dataset (not the target variable) has 172 distinct values. It’s a ‘Drug Name’ column. What is the best practice for handling it, please?

Thanks in advance. Much appreciated.

jeremy · June 4, 2022, 10:57pm

You’ll need to use an embedding. You can learn about that in the book, or in the next two lessons we’ll cover it.
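For readers who have not met embeddings yet, here is a minimal sketch of the idea for a categorical column: each distinct value gets an integer index, and that index selects a row in a table of small vectors. The drug names, the vector size and the `embed` helper are assumptions made up for illustration; this is plain Python, not the fastai implementation.

```python
import random

# Hypothetical drug names; the real column would have 172 distinct values.
drug_names = ["aspirin", "ibuprofen", "metformin", "aspirin"]

# Map each distinct category to an integer index.
vocab = {name: i for i, name in enumerate(sorted(set(drug_names)))}

# One small vector per category. In a real model these start out random
# and are updated by gradient descent together with the other weights.
emb_dim = 4  # assumed size, chosen only for illustration
rng = random.Random(0)
emb_table = [[rng.gauss(0, 0.01) for _ in range(emb_dim)] for _ in vocab]

def embed(name):
    """Look up the vector for one category value (hypothetical helper)."""
    return emb_table[vocab[name]]

vectors = [embed(n) for n in drug_names]
```

In PyTorch the lookup table would typically be a `torch.nn.Embedding(num_embeddings, embedding_dim)` layer, so the vectors are learned during training rather than fixed.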

Whity · June 7, 2022, 5:13pm

I need some help.

In a first project, I segmented a very small structure of the inner ear from a single MR sequence and got decent results, with a DCS of 0.76 on the validation set.
Now I would like to segment adenomas of the parathyroid glands, where I first have to label the pathology in more than one MR sequence; one of them will be a dynamic sequence (the most important one). The sequences will have different orientations, resolutions, etc.

Now I feel a little bit lost where to start, how to handle the different sequences etc. A little guidance is more than welcome;-).
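Assuming "DCS" above refers to the Dice similarity coefficient commonly used to score segmentations, the metric can be sketched as follows. The function name and the toy masks are illustrative only, and the masks are flattened to 0/1 lists to keep the example dependency-free.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks (sequences of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Both masks empty: treat as perfect agreement (a convention, not a rule).
    return 1.0 if total == 0 else 2 * intersection / total

# Toy masks: two overlapping voxels, three foreground voxels in each mask.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 0, 1, 1, 1, 0]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly; 0.0 means they do not overlap at all.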

Mattr · June 9, 2022, 10:31pm

It’s been a long time coming, but with the help of the daily fastai walk thrus I have submitted my first entries to a Kaggle competition from my Paperspace notebook.

You can find out how to do this yourself by following Radek’s post Practice walk-thru 6 / chp1 on Kaggle! and the official walk thrus Official course walk-thrus ✅ - #108. Walk thru 8 was Jeremy giving the demo from his own GPU but the same process works for Paperspace notebooks as well. I highly recommend the walk thrus for beginners.

Third Entry after walk-thru 9:

JackV · June 10, 2022, 4:22am

Submitted my first Kaggle entry following Jeremy’s code from today’s Official Walkthru (Week 8), on Radek’s recommendation in this thread.

yiyimarz · June 15, 2022, 2:24am

Inspired by @pcuenq’s project that uses CLIP to search photos, I created an app that searches for frames in YouTube videos based on the text you type in. To use the app, all you need is the link to a YouTube video. You can use very intuitive queries. For example, you can use “Macaulay Culkin screams with hands on his cheeks” to get the iconic scream face from a Home Alone movie clip.

One super interesting property when we apply CLIP on movie clips is that we can leverage the subtitles. Subtitles are essentially images of text. Because the CLIP model is multimodal, it is able to read subtitles and develop a much more comprehensive understanding of images based on semantic information from both the frame itself and the subtitles. It means we can include the content of the dialog in our search. For example, you can search “Vizzini says inconceivable” in The Princess Bride to get all the frames when Vizzini says inconceivable.

I had a lot of fun building this app, as well as playing with it. Thanks @ilovescience for the awesome tutorial about Gradio and Hugging Face. That’s all you need to launch an app on Hugging Face Spaces.

I wrote a blogpost here: It Happened One Frame: incredibly accurate video content search using the OpenAI CLIP model
The app is hosted here: It Happened One Frame
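A common way to do this kind of text-to-frame search with CLIP-style embeddings is to embed the query and every frame into the same vector space and rank frames by cosine similarity. The sketch below illustrates only that ranking step and is not taken from the app: the 3-dimensional vectors are made-up stand-ins for real CLIP embeddings, and `top_frames` is a hypothetical helper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_frames(query_vec, frame_vecs, k=2):
    """Return the indices of the k frames most similar to the query."""
    ranked = sorted(
        enumerate(frame_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [index for index, _ in ranked[:k]]

# Made-up vectors standing in for one text embedding and three frame embeddings.
frame_vecs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query_vec = [0.9, 0.1, 0.0]
best = top_frames(query_vec, frame_vecs)
```

In a real pipeline the frame vectors would be computed once per video and reused for every query, so only the text has to be embedded at search time.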

jeremy · June 15, 2022, 6:26am

Got a tweet I can share?

jmp · June 15, 2022, 12:30pm

Great and intriguing work. Addictive indeed. I was reminded of a scene from The Hurt Locker today: “James’ face standing in front with cereals boxes in the background”. Not quite the frame I pedantically wanted, but it certainly picked the “cereal aisle” frames in the overall supermarket scene. Impressive. I’ll read the post.

mike.moloch · June 15, 2022, 1:54pm

Oh dear Lord! This was a little too eerie!

pcuenq · June 15, 2022, 4:02pm

That’s amazing! I loved reading the blog post too.

radikubwa · June 16, 2022, 1:16pm

I managed to work with a couple of researchers to create a cell explorer. It is able to identify the causative agents of the diseases Trypanosomiasis, Leishmaniasis and Malaria, with extensions for inference. In addition, it helps with counting using a couple of widgets. Hopefully this will be a start towards using machine learning models to aid us in the diagnostic laboratory and to build other related tools. In case you want to check it out along with the source code, reach out via email using the link here and I’ll send it to you.

The fastbook was also immensely helpful in helping me understand how to use some functionality in the fastai library. Thanks for checking it out and helping to review the manuscript @poppingtonic.

yiyimarz · June 16, 2022, 4:04pm

yes! https://twitter.com/YiYiMarz/status/1537465564104257536?s=20&t=E9ljAxdgTAM1SxPp7hkIUw

zerotosingularity · June 19, 2022, 10:07am

I’ve been keeping a list of fast.ai docker containers since v1.0.60, and continue to build new versions as they get released.

Today I added support for fast.ai 2.7.2 with PyTorch 1.10.0 or 1.11.0:

https://hub.docker.com/repository/docker/seemeai/fastai

Mattr · June 21, 2022, 9:31pm

I’ve just trained my first decent ensemble image model on Kaggle and am moving up the leaderboard thanks to Jeremy’s Walkthrus and Kaggle experiments!
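The post does not say how the ensemble was built. One simple and widely used approach is to average the class probabilities predicted by several models and take the most likely class. The sketch below assumes that approach; the two "models" and their probabilities are invented for illustration.

```python
def ensemble_average(model_probs):
    """Average per-class probabilities across models for each sample.

    model_probs[m][s][c] is the probability model m assigns to class c
    for sample s.
    """
    n_models = len(model_probs)
    return [
        [sum(per_class) / n_models for per_class in zip(*sample)]
        for sample in zip(*model_probs)
    ]

# Invented predictions from two models for two samples and two classes.
model_a = [[0.6, 0.4], [0.2, 0.8]]
model_b = [[0.8, 0.2], [0.4, 0.6]]

avg = ensemble_average([model_a, model_b])
# Index of the highest averaged probability for each sample.
predicted = [max(range(len(p)), key=p.__getitem__) for p in avg]
```

Averaging tends to help most when the individual models make different kinds of mistakes, for example when they use different architectures or image sizes.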

devforfu · June 23, 2022, 8:37pm

I’m also using fastai to take part in the ongoing UW-Madison GI Tract Image Segmentation competition. (Shared some examples before.) My results aren’t that impressive, and the competition is still running. But I haven’t been at such a rank (even temporarily) for a long time.

No matter how it ends, it feels like I am now iterating pretty quickly and spending much less time writing boilerplate, focusing on modeling instead. Also, competing became less stressful and more joyful because of the shifted focus.

jeremy · June 24, 2022, 1:16am

Top 10% is pretty good!

FraPochetti · June 28, 2022, 8:44am

In this one I wanted to show how you can BOTH train and deploy a license plate recognition system in a fully Dockerized way.

The serving container just adds a couple of libraries on top of the training one, guaranteeing a completely reproducible end2end pipeline.

If you are into Docker, Amazon Textract, IceVision and FastAPI feel free to dive in!

Twitter, LinkedIn posts

Mattr · July 1, 2022, 10:21pm

I participated in the fastai hackathon this week and, with the help of my team @AllenK, @Rkap and @Sanjib, we created an incident prioritisation app: Risk Predictor - a Hugging Face Space by mrosinski

msivanes · July 3, 2022, 3:19pm

Based on Jeremy’s notebook on Kaggle - Getting Started with NLP for absolute beginners, I tried replicating the same approach on the dataset Women’s Clothing Reviews to determine if a customer will recommend the product based on clothing reviews.

Kaggle Notebook: getting-started-with-nlp-womens-clothing-reviews | Kaggle

Approach
Using only the text fields (title and review), I trained a deberta-v3-small transformer model to predict whether the customer recommends the product.
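One way to feed several text fields to a single transformer is to join them into one string with short field markers before tokenizing. The sketch below shows that step only; the `TITLE:` / `REVIEW:` markers and the `build_input` helper are assumptions for illustration, not the exact format used in the notebook.

```python
def build_input(title, review):
    """Join two text fields into one model input string (hypothetical format)."""
    # Missing values become empty strings so every row yields valid text.
    title = (title or "").strip()
    review = (review or "").strip()
    return f"TITLE: {title}; REVIEW: {review}"

example = build_input("Love it", " Fits well and looks great ")
```

The resulting strings would then be passed to the model's tokenizer in place of a single raw column.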

Next Steps

This is a rich dataset with text, categorical and numerical fields, and I will be experimenting with some of the ideas shared in the topic Tabular Model with NLP Text features.