Entity/phrase extraction from text

Hi all,

I’m pretty new to AI and machine learning, but I’m having a blast going through the first part of the fast.ai course. I’ve been playing around with the text classification example and have managed to build a few models of my own that work incredibly well. Now I want to push the boundaries of fast.ai (or at least my current understanding of them) with some more challenging tasks.

I’m at a point where I want to start extracting specific elements out of the text, not just classify it. I think what I want is Named Entity Recognition, but I’m not 100% sure. The goal I’m trying to achieve now is to identify when a medical procedure is being discussed in a report. I have a large quantity of data to train on, and the time/manpower to label the data myself. I’m just wondering if fast.ai supports the ability to do this. If it isn’t NER that I’m looking for, can you suggest what would be the fastest way to extract this sort of information from text?
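
For context, NER is usually framed as token-level tagging: every token in the text gets a label, commonly in the BIO scheme (B = beginning of an entity, I = inside one, O = outside). Here is a minimal sketch of what labeled data for the procedure task might look like. The `spans_to_bio` helper, the `PROCEDURE` label, and the example sentence are all made up for illustration; this is not a fast.ai API, just plain Python showing the data format.

```python
import re

def spans_to_bio(text, spans):
    """Turn character-level spans [(start, end, label), ...] into (token, BIO tag) pairs.

    Hypothetical helper for illustration; tokenization is a naive regex split.
    """
    # Words and single punctuation marks, each with its character offsets.
    tokens = [(m.group(), m.start(), m.end())
              for m in re.finditer(r"\w+|[^\w\s]", text)]
    tagged = []
    for tok, start, end in tokens:
        tag = "O"  # default: token is outside any entity
        for s, e, label in spans:
            if start >= s and end <= e:
                # First token of the span gets B-, the rest get I-.
                tag = ("B-" if start == s else "I-") + label
                break
        tagged.append((tok, tag))
    return tagged

# Made-up example sentence with one hand-labeled span.
text = "Patient underwent laparoscopic cholecystectomy on Monday."
start = text.index("laparoscopic")
end = start + len("laparoscopic cholecystectomy")

tagged = spans_to_bio(text, [(start, end, "PROCEDURE")])
for tok, tag in tagged:
    print(f"{tok:20} {tag}")
```

Labeling at this level is more work than labeling whole sentences, but it lets a model point at the exact words that name the procedure.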

If fast.ai isn’t able to do this, would the next best step be to break my text into sentences and label each sentence as positive or negative for a procedure, as well as label each sentence (that has a procedure) with the type of procedure it is? Then I can create two models, which combined should give me an idea if a procedure is happening in the report.
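
A rough sketch of how that two-model pipeline could fit together is below. The two classifiers are passed in as plain functions; the keyword-based `toy_has_procedure` and `toy_procedure_type` are stand-ins I made up so the example runs on its own, and in practice they would be replaced by the trained text classifiers. The sentence splitter is also deliberately naive and would likely struggle with abbreviations in real reports.

```python
import re

def split_sentences(report):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]

def find_procedures(report, has_procedure, procedure_type):
    """Run the two-stage idea: detect procedure sentences, then label their type."""
    found = []
    for sentence in split_sentences(report):
        if has_procedure(sentence):          # model 1: procedure yes/no
            found.append((sentence, procedure_type(sentence)))  # model 2: which type
    return found

# Stand-in "models" (assumption: keyword rules in place of trained classifiers).
KEYWORDS = {"appendectomy": "surgery", "biopsy": "biopsy"}

def toy_has_procedure(sentence):
    return any(k in sentence.lower() for k in KEYWORDS)

def toy_procedure_type(sentence):
    for keyword, label in KEYWORDS.items():
        if keyword in sentence.lower():
            return label
    return "unknown"

report = ("The patient arrived with abdominal pain. "
          "An appendectomy was performed. Recovery was uneventful.")
results = find_procedures(report, toy_has_procedure, toy_procedure_type)
print(results)
```

The report counts as containing a procedure whenever `results` is non-empty, and each entry says which sentence triggered it and what type was predicted.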

Thanks for your help, everyone. I know this is probably a really broad question; I’m just hoping to be pointed in the right direction so I can focus my research/studying/poking around the internet :slight_smile:

Hi https://forums.fast.ai/u/shensmobile, hope all is well!
I am currently playing with the same type of models as you. Unfortunately, I can’t solve your problem. However, if I were in your position, I would create very small datasets and try all the possibilities you mentioned in your post.

Cheers mrfabulous1 :grinning::grinning: