I am a complete novice with machine learning and have just finished using my own dataset for the first half of lesson 1.
I couldn’t find any step-by-step instructions and, since this is the first bit of self-learning the course requires, I thought it would be useful to post what I did.
Any feedback on the method, especially on mistakes I have made, would be appreciated.
The upload steps apply only to those using Paperspace.
These resources were helpful to me:
- Q3 and Q4 from Reshama’s FAQ in working out how to structure the data (found from this thread)
- Beecoder’s Cricket or Baseball example
Either way, I hope this is helpful to a beginner in a similar situation.
- Collect at least 15 images for each category
- Create a folder locally. I used ‘cowhorse’.
- Inside this folder, create two subfolders: ‘train’ and ‘valid’
- Inside both of these subfolders (train and valid), create another two subfolders for your two categories. In my case ‘cow’ and ‘horse’.
- Put around 80% of your images in the ‘train’ subfolders, and the rest in the ‘valid’ subfolders.
- Your finished structure should look like this: cowhorse/train/cow, cowhorse/train/horse, cowhorse/valid/cow and cowhorse/valid/horse
- Zip up this folder
- SFTP into your Paperspace box
- Navigate to /fastai/courses/dl1/data/ and upload your .zip file here
- Unzip the file
- In your Jupyter notebook, change PATH to point to the directory you just uploaded
- Due to the low sample size, you may want to change the learning rate of the model (I changed mine to 0.2)
- Run through all the commands and you should see the model using your new dataset.
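The local folder steps (creating ‘train’ and ‘valid’, splitting roughly 80/20, and zipping) can also be scripted rather than done by hand. Below is a minimal sketch, not part of what I actually did. It assumes your images start out in one folder per category (for example raw_images/cow and raw_images/horse); the names `build_dataset`, `zip_dataset` and `raw_images` are made up for illustration.

```python
import random
import shutil
from pathlib import Path

# Assumption: only these extensions count as images.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def build_dataset(source_dir, dest_dir, train_fraction=0.8, seed=0):
    """Copy source_dir/<category>/* into dest_dir/{train,valid}/<category>/.

    Returns a dict mapping category -> (number in train, number in valid).
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    counts = {}
    for category_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        images = sorted(
            f for f in category_dir.iterdir() if f.suffix.lower() in IMAGE_EXTS
        )
        rng.shuffle(images)
        # Roughly 80% go to train (rounded down), the rest to valid.
        n_train = int(len(images) * train_fraction)
        splits = {"train": images[:n_train], "valid": images[n_train:]}
        for split, files in splits.items():
            target = dest_dir / split / category_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, target / f.name)
        counts[category_dir.name] = (n_train, len(images) - n_train)
    return counts


def zip_dataset(dest_dir):
    """Zip dest_dir so the archive unpacks to a single top-level folder."""
    dest_dir = Path(dest_dir).resolve()
    return shutil.make_archive(
        str(dest_dir), "zip", root_dir=dest_dir.parent, base_dir=dest_dir.name
    )


# Example usage (hypothetical folder names):
# counts = build_dataset("raw_images", "cowhorse")
# archive_path = zip_dataset("cowhorse")  # creates cowhorse.zip next to the folder
```

With 15 images per category this puts 12 in ‘train’ and 3 in ‘valid’. The resulting cowhorse.zip is what you would then upload over SFTP and unzip on the Paperspace box.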