Share your work here ✅

Which Watersport?

I like alliteration and water, so this question appealed to me. Try it out here…

The build process.

I’m lazy, so I thought to myself: why make up a list of watersports when I can scrape one from https://en.wikipedia.org/wiki/List_of_water_sports? So I used the following code to do that semi-automatically… (manually removing erroneous and duplicate entries from the list)
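
For anyone wanting to try the same trick, here is a minimal sketch of the scraping idea using only the Python standard library. It is not my original code: the names `ListLinkParser`, `extract_candidates` and `fetch_html` are made up for illustration, and it assumes the sports appear as links inside list items on the page. It still picks up unrelated list links, so the manual clean-up step is needed afterwards.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ListLinkParser(HTMLParser):
    """Collect the text of every <a> tag that sits inside an <li> element."""

    def __init__(self):
        super().__init__()
        self.li_depth = 0      # how many <li> elements we are currently inside
        self.in_link = False   # True while inside an <a> within an <li>
        self.names = []        # link texts, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_depth += 1
        elif tag == "a" and self.li_depth > 0:
            self.in_link = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "li" and self.li_depth > 0:
            self.li_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.names[-1] += data


def extract_candidates(html):
    """Return list-item link texts, de-duplicated case-insensitively."""
    parser = ListLinkParser()
    parser.feed(html)
    seen, result = set(), []
    for name in parser.names:
        clean = " ".join(name.split())  # collapse stray whitespace
        if clean and clean.lower() not in seen:
            seen.add(clean.lower())
            result.append(clean)
    return result


def fetch_html(url):
    """Download a page as text (the User-Agent string is a placeholder)."""
    request = Request(url, headers={"User-Agent": "watersports-list-sketch"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `extract_candidates(fetch_html("https://en.wikipedia.org/wiki/List_of_water_sports"))` would give a raw candidate list to prune by hand.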

I ended up with 37 categories, which in hindsight is perhaps a bit overboard, but anyway…
To clean the categories I downloaded all the images locally, then uploaded the following dataset to Kaggle… https://www.kaggle.com/datasets/bencoman/watersports
This was before I learned about the built-in cleaning tools in Lesson 2.

Training used RandomSplitter for the validation split and fine-tuned resnet18 to produce the inference model watersports.pkl, with the following code…
https://www.kaggle.com/code/bencoman/which-watersport-2-train-mode
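
The notebook above has the real thing. As a rough sketch of what such a training step looks like in fastai, something along these lines should work. The folder layout (one sub-folder per category), the 20% validation split, the seed, the 192px resize and the epoch count are all assumptions for illustration, not necessarily what the notebook uses, and `train_watersports` is a made-up name.

```python
def train_watersports(image_dir, epochs=3):
    """Fine-tune resnet18 on a folder of images and export the model.

    Assumes image_dir contains one sub-folder per category,
    e.g. image_dir/Kayaking/*.jpg (an assumed layout).
    """
    # Imported here so the function can be defined without fastai installed.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        error_rate, get_image_files, parent_label, resnet18, vision_learner,
    )

    dls = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_items=get_image_files,
        # Hold out a random 20% of the images for validation (assumed value).
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
        get_y=parent_label,            # label = name of the parent folder
        item_tfms=Resize(192),         # assumed image size
    ).dataloaders(image_dir)

    learn = vision_learner(dls, resnet18, metrics=error_rate)
    learn.fine_tune(epochs)
    learn.export("watersports.pkl")    # written relative to learn.path
    return learn
```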

It took a while to get my system set up properly, which I documented here:

In summary, I created a new HuggingFace space for my app, cloned that repo to my local machine, installed LFS, and downloaded the inference model. I then copied the contents of app.py to a local Jupyter notebook, applocal.ipynb, to test in, downloaded a few example images, and committed and pushed the lot to the HuggingFace space.
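
For reference, an app.py for this kind of space can be very small. The sketch below is an assumption about its general shape rather than a copy of my file: `to_label_dict` and `build_demo` are illustrative names, and it assumes a Gradio image input feeding a fastai learner loaded from watersports.pkl.

```python
def to_label_dict(vocab, probs):
    """Pair each category name with its probability as a plain float."""
    return {str(label): float(p) for label, p in zip(vocab, probs)}


def build_demo(model_path="watersports.pkl"):
    """Build a Gradio interface around the exported fastai model."""
    # Imported here so the helper above can be used without these packages.
    import gradio as gr
    from fastai.vision.all import load_learner

    learn = load_learner(model_path)

    def classify(img):
        # predict returns (decoded label, label index, per-class probabilities)
        _, _, probs = learn.predict(img)
        return to_label_dict(learn.dls.vocab, probs)

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(),
        outputs=gr.Label(num_top_classes=3),  # show the three best guesses
    )


# In app.py the last line would be:
# build_demo().launch()
```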

Now I’m surprised at how well it did, particularly distinguishing between similar categories like:

  • Snorkeling, Scuba diving, Cave diving, Free diving, Wreck diving, Spearfishing
  • Fin swimming, Mermaiding
  • Kayaking, Canoe polo, Outrigger boating, Dragon boating, Rowing, Paddle boarding
  • Water skiing, Barefoot skiing
  • Body boarding, Body surfing, Surfing, Kite boarding

Some things still to experiment with:

  • Using RandomResizedCrop - I wonder whether crops that show only generic areas of water affect the training - I presume it learns that these are irrelevant.
  • Trying a higher level ResNet
  • Reviewing the confusion matrix
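
On the confusion matrix point, fastai's `ClassificationInterpretation.from_learner(learn)` offers `plot_confusion_matrix()` and `most_confused()`. To show what that review boils down to, here is a small library-free sketch that counts which pairs of categories get mixed up most often. The function name mirrors fastai's for familiarity, but this is an illustrative reimplementation and the labels in the example are made up, not results from my model.

```python
from collections import Counter


def most_confused(actual, predicted, min_count=1):
    """List (actual, predicted, count) for wrong predictions, most frequent first."""
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_count]


# Made-up labels, purely to show the output shape.
actual = ["Kayaking", "Kayaking", "Rowing", "Surfing", "Kayaking"]
predicted = ["Rowing", "Rowing", "Rowing", "Body surfing", "Kayaking"]
for a, p, n in most_confused(actual, predicted):
    print(f"{a} mistaken for {p}: {n}")
```

With 37 categories the full matrix is hard to read, so a ranked list of confused pairs like this is probably the more useful view.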