Question classifier

I am trying to classify whether a sentence is a question or not.
For example:

statement : do you like food?

predicted_question : 1

statement : The boy who sat beside him was his son.

predicted_question : 0

How can I approach this problem? And how can I prepare a dataset for it?

Check if there is a question mark in the string :slight_smile:
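Joking aside, that check makes a handy baseline to compare any trained model against. A minimal sketch, where the function name `has_question_mark` is just made up for illustration:

```python
def has_question_mark(sentence: str) -> int:
    """Return 1 if the sentence contains a question mark, else 0."""
    return int("?" in sentence)


# The two examples from the original post.
examples = [
    "do you like food?",
    "The boy who sat beside him was his son.",
]

predictions = [has_question_mark(s) for s in examples]
print(predictions)  # [1, 0]
```

If a trained classifier cannot beat this on sentences that keep their punctuation, it is probably not learning much beyond the question mark.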

On a more serious note, I think the classifier would be challenged by labeled sentences where the question marks are removed.

Didn’t get to NLP myself yet as you can see :wink:

The first step is to create the dataset. You can create a DataFrame with one column for the sentences (questions and non-questions) and another for the labels. Then you can use the data block API to create a DataBunch. You can follow the lesson 3 notebook; the problem is very similar, except that instead of positive/negative you have question/statement.
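To make the two-column layout concrete, here is a rough sketch using only the standard library. The column names `text` and `label` are my own assumption, and the rows are just the two examples from the first post; a real dataset would need many more sentences of both kinds.

```python
import csv
import io

# Illustrative rows only: label 1 = question, label 0 = statement.
rows = [
    {"text": "do you like food?", "label": 1},
    {"text": "The boy who sat beside him was his son.", "label": 0},
]


def to_csv(rows):
    """Serialize the labeled rows into CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["text", "label"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

A CSV like this can be loaded into a DataFrame (for example with pandas `read_csv`) and then passed on to whatever data loading API you are using.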

You can try to remove the question marks from the questions to see how well the model can do.
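Something along these lines would do for stripping the question marks before training. This is only a sketch, and `strip_question_marks` is a hypothetical helper name:

```python
def strip_question_marks(sentence: str) -> str:
    """Remove every '?' and any trailing whitespace left behind."""
    return sentence.replace("?", "").rstrip()


print(strip_question_marks("do you like food?"))  # do you like food
print(strip_question_marks("The boy who sat beside him was his son."))
```

Applying it to every sentence, statements included, keeps the preprocessing identical for both classes, so the model has to rely on word order and wording rather than punctuation.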

I hope this helps :slight_smile:
