Shall we create a shared GitHub repository to collaborate? What do you say? It would make collaboration much easier. We can link to the Google spreadsheet in the README to distribute the work, or use the Projects feature of GitHub repos to assign tasks.
I can create it and make it either private or public. I just need your GitHub usernames.
@iacolippo That sounds great. I would be happy to collaborate in that way. Each collaborator could have their own repository or branch.
What do you guys think?
Great idea. I think you should create the GitHub repo; public for now should be OK?
Once people are making progress, they can check in their notebooks and add a line in the spreadsheet, or we can move to the GitHub project features as needed.
Let’s start lightweight first? Spreadsheet and public repo with a README for now?
To add you as collaborators in the organization I need your usernames. I’ve added two people already; the rest of you, please send me your username by PM.
For the ones of you that I already added, you should have an invitation in your inbox!
I would also like to participate! My GitHub name is Gokkulnath (https://github.com/Gokkulnath).
Looking forward to uncovering exciting results, and thanks for the opportunity!
I have added everyone who gave me their GitHub username. We can use the repo wiki to share ideas and common practices so that the experiments stay uniform.
FYI gang, something I’ve noticed is that this kind of joint project works best when individuals decide to just go ahead and implement stuff - then as they go, they can provide updates on progress and make specific requests for additional bits of work that need to be done, which others can then get to work on.
My suggestion: don’t wait for someone to organize you all into a group and distribute work to you, since I’ve noticed in practice this rarely happens at all, and even when it does it tends to be slower than just enthusiastic individuals jumping in and getting to work!
This is a great point. And of course remember the idea we discussed in class - using smaller images for your experiments is a great way to do experiments more quickly, and the insights are likely to be similar to what you’d get with larger images.
I am trying to find existing literature around this “incremental learning” problem. I could find work related to class-incremental learning (adding new classes as we run more batches) and some ways to avoid catastrophic forgetting - closer to Jeremy’s idea. I am yet to see anything where the same classes are used in each epoch but with stage-wise increase in batch sizes - Leslie’s Idea.
Some links which might be useful to folks working on this problem:
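To make the "stage-wise increase in batch sizes" idea concrete, here is a minimal sketch of what such a schedule could look like. It is only an illustration: the function name `batch_size_schedule`, the doubling factor, and the even split of epochs into stages are my own assumptions, not anything taken from Leslie's description or from the fastai library.

```python
def batch_size_schedule(n_epochs, start_bs, n_stages, growth=2):
    """Return one batch size per epoch, multiplied by `growth` at each new stage.

    Hypothetical helper for illustration only: epochs are split as evenly
    as possible into `n_stages` stages, and every epoch sees the same classes.
    """
    if n_epochs < 1 or n_stages < 1 or n_stages > n_epochs:
        raise ValueError("need 1 <= n_stages <= n_epochs")
    schedule = []
    for epoch in range(n_epochs):
        # Integer arithmetic maps each epoch to a stage index in [0, n_stages).
        stage = epoch * n_stages // n_epochs
        schedule.append(start_bs * growth ** stage)
    return schedule


if __name__ == "__main__":
    for epoch, bs in enumerate(batch_size_schedule(6, 32, 3)):
        print(f"epoch {epoch}: batch size {bs}")
```

In an actual experiment you would rebuild the data loader with the new batch size at the start of each stage; how to do that depends on the library version you are using, so it is left out here.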
Please find the docker setup here. This can be nice for getting you off the ground in your experiments.
There are two paths you can take. You can either use this as a blueprint for creating your own environment locally on your machine (most of the commands you might need are in the Dockerfile), or you can fork the repo and do your work in it as is. In the second case, as you make changes or create new notebooks, they should show up in the workspace folder of the repository. You should then be able to make a git commit and push them to GitHub (for sharing the results with others or for storing your own work).
I haven’t had a chance to use this setup extensively (this is the first Docker image I have defined), so your mileage may vary.
I went for relative cleanliness, but the issue with this setup is that the fastai library is hard to get to (it doesn’t live in the workspace folder, only in the container). For the type of work we are setting out to do here this can be limiting. If there is interest, I can create a separate branch which uses a fastai repo living in the workspace folder, but this comes with a couple of rough edges.
I haven’t had a chance to test this with the most recent version of the library, so it pulls down a specific tag on my own fork of fastai. If needed, I’ll point it to fastai master or some earlier commit once I have had a chance to check that it works.