Homework ideas or assignments for the course







  • PetFinder - Combining images, text, tabular for prediction.


  • TBA


Other Competitions

  • Dravidian-CodeMix - sentiment analysis for Dravidian languages in code-mixed text found on social media
  • IEEE BigData 2020 Cup - a data mining challenge to predict escalations in customer technical support using natural language techniques
  • NLC2CMD - translate English descriptions of command-line tasks to their corresponding Bash syntax
  • Contradictory, My Dear Watson: Detecting contradiction and entailment in multilingual text using TPUs. This is a playground-type competition based on Natural Language Inference (NLI) to determine whether pairs of sentences are related. Participants are challenged to create an NLI model from a dataset including text from 15 different languages.
  • Hate Speech and Offensive Content Identification in Indo-European Languages provides a forum and data challenge for promoting multilingual research on detecting problematic content. This year the dataset contains 10K annotated tweets in English, German, and Hindi. The first subtask is to detect hate, offensive, or profane content in the text. The second subtask is more granular: it classifies the type of such content.

FYI, the first example (fast.ai Datasets) gives me a 404 error when I try to open it. I guess this was from v3 of the course and those links don't work any more?


Thanks for reporting. Jeremy already fixed the broken link.

Thanks for your suggestions.

[Wiki Update]

  • Tabular Baselines
  • Multimodal type dataset

I edited the link to the fast.ai dataset. Thank you for sharing this post.