Please give me advice! What's the best way to go about doing Named-Entity Recognition (NER)?


(Myles) #1

Hi all!

I’m currently taking the 2019 fast.ai course and I’m finding it really interesting. I love the philosophy of reducing the lines of code necessary whilst still using SOTA techniques. I’m very impressed.

From what I’ve seen so far, the course covers computer vision, NLP (in the form of sentiment analysis), and tabular data. However, the project I’m currently working on has a lot to do with Named Entity Recognition and potentially a bit of Summarisation too. Does fast.ai deal with NER at all?

I basically want to pull structured information out of unstructured data, and not just nouns, adjectives, etc., which is more POS-tagging: I want to extract custom fields from text. I’ve been reading about Flair Embeddings (thanks to the PapersWithCode site Jeremy shared on Twitter) and BERT and ULMFiT and spaCy/prodigy, and I’m honestly not sure where to go or how to start.

It’s probably also worth mentioning that I’m quite new to this side of things, so while I want a system that will let me produce very accurate results, I’m also looking for something that isn’t too dense or difficult to implement, which is also why I like the fast.ai course.

Thanks! :slight_smile:


(Matthew Teschke) #2

I don’t believe fastai currently supports NER.

Flair seems like it would be a good approach, though it’s so new that I haven’t had a chance to try it out yet. Its documentation does say it supports training your own models, which would fit your use case.

Last summer we did a review of 5 NER libraries (blog post here), which could be a good starting point for you. I know some of the libraries we evaluated allowed for custom tagging, but I forget exactly which ones did.
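
To make "custom tagging" a bit more concrete: many NER taggers are trained on token-level labels in a BIO-style scheme (B = beginning of an entity, I = inside it, O = outside any entity), and the structured fields are recovered by grouping those labels into spans. Below is a minimal sketch of that grouping step in plain Python. It is not tied to Flair, spaCy, or fastai; the helper name `bio_to_spans` and the labels `INVOICE_NO` and `DUE_DATE` are hypothetical and only there for illustration.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group token-level BIO tags into (label, text) entity spans.

    Hypothetical helper for illustration; real libraries ship their own
    span-extraction utilities.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        # "B-DUE_DATE" -> ("B", "-", "DUE_DATE"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")

        if prefix == "B" or (prefix == "I" and label != current_label):
            # Start a new entity. A stray "I-" tag whose label does not
            # match the open entity is treated as the start of a new one.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (or anything unrecognised) closes any open entity.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []

    # Flush an entity that runs to the end of the sentence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Made-up example sentence with hypothetical custom labels.
tokens = ["Invoice", "INV-1042", "is", "due", "on", "3", "March", "2019"]
tags = ["O", "B-INVOICE_NO", "O", "O", "O",
        "B-DUE_DATE", "I-DUE_DATE", "I-DUE_DATE"]

fields = bio_to_spans(tokens, tags)
print(fields)
```

In practice the tags would come from a trained model rather than being written by hand; the point is only that "custom fields" boil down to choosing your own label set and annotating training text with it.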


(Myles) #3

Apologies for the delay, and thank you for getting back to me! I’ll definitely check this stuff out.