Research collaboration opportunity with Leslie Smith

@nachiket273 @boxreb14 @mcskinner @MicPie @bushaev @nirantk @deepnarainsingh @abi @radek @Borz

Shall we create a shared GitHub repository to collaborate? What do you say? It would make collaboration much easier. We can link to the Google spreadsheet in the README to distribute the work, or use the Projects feature of GitHub repos to assign tasks :slight_smile:

I can create it and make it either private or public, I just need your GitHub usernames.

If I forgot anyone, @ me


Hi, I would be interested in helping too!

@iacolippo That sounds great. I would be happy to collaborate in that way. Each collaborator could have their own repository or branch.
What do you guys think?

Hi, I am interested in contributing to this project. Please add me in as well. Thanks

Count me in too, if possible.

@iacolippo Using GitHub should be a good way to collaborate. I am also interested in this project. Please count me in. Thanks.

I’m in too!

Great idea. I think you should create the GitHub repo; public should be OK for now?

Once people are making progress, they can check in their notebooks and add a line in the spreadsheet, or we can move to the GitHub project features as needed.

Let’s start lightweight: a spreadsheet and a public repo with a README for now?

~Abi

My GitHub name is also MicPie (https://github.com/MicPie).
Thank you for the setup!


Sure, my GitHub username is nachiket273 (https://github.com/nachiket273).


I created a GitHub organization to make things even easier. Provisional name: theresizers :slight_smile: https://github.com/theresizers I’m open to suggestions.

I’ve created the repository: https://github.com/theresizers/smart-dataset-growth Same thing here, I’m open to suggestions for the name :smile:

I’ve put the Google spreadsheet in the README.

To add you as collaborators in the organization I need your usernames. I’ve added two people already; the rest of you, please send me your username by PM.

Those of you I have already added should have an invitation in your inbox!


I would also like to participate! My GitHub name is Gokkulnath (https://github.com/Gokkulnath).
Looking forward to uncovering exciting results, and thanks for the opportunity!


I’ve added everyone who gave me their GitHub username. We can use the repo wiki to share ideas and common practices so that the experiments stay uniform.


Yes, I am interested.

FYI gang, something I’ve noticed is that these kinds of joint projects work best when individuals decide to just go ahead and implement stuff - then as they go, they can provide updates on progress and make specific requests for additional bits of work that need to be done, which others can then pick up.

My suggestion: don’t wait for someone to organize you all into a group and distribute work to you, since I’ve noticed in practice this rarely happens at all, and even when it does it tends to be slower than just enthusiastic individuals jumping in and getting to work! :slight_smile:


This is a great point. And of course remember the idea we discussed in class - using smaller images for your experiments is a great way to do experiments more quickly, and the insights are likely to be similar to what you’d get with larger images.
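
As a rough back-of-the-envelope illustration of why smaller images speed things up: assuming square images and a per-image cost roughly proportional to the number of pixels (a reasonable first approximation for convolutional models, not a measured result), the saving grows with the square of the side length. The sizes 128 and 224 below are illustrative and not taken from this thread.

```python
def pixel_ratio(small_side, large_side):
    """How many times more pixels a large square image has than a small one."""
    return (large_side / small_side) ** 2


# Illustrative sizes only (assumption, not from the thread).
ratio = pixel_ratio(128, 224)
print(ratio)  # 3.0625
```

So under that assumption, an experiment at 128px would process roughly a third of the pixels per image compared to 224px.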


Nice idea…
My GitHub username is https://github.com/VishuCyrus


Sorry @Leslie the people have spoken :wink:


I am trying to find existing literature around this “incremental learning” problem. I could find work related to class-incremental learning (adding new classes as we run more batches) and some ways to avoid catastrophic forgetting - closer to Jeremy’s idea. I have yet to see anything where the same classes are used in each epoch but with a stage-wise increase in batch sizes - Leslie’s idea.

Some links which might be useful to folks working on this problem:

  1. 2014 work at MSR
  2. https://arxiv.org/abs/1611.07725
  3. https://arxiv.org/abs/1708.06977
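
To make the distinction from class-incremental learning concrete, here is a minimal sketch of the "same classes, growing amount of data" setup. It is only one possible reading of the idea, not Leslie's or Jeremy's actual method: the function name `staged_subsets`, the stage fractions, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def staged_subsets(labels, fractions=(0.25, 0.5, 1.0), seed=0):
    """Yield one sorted list of sample indices per stage.

    Every stage draws from all classes; only the per-class sample count grows.
    Stages are nested: each stage contains the indices of the previous one.
    The fractions are arbitrary placeholders (assumption).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)  # one fixed random order per class, reused by all stages
    for frac in fractions:
        stage = []
        for indices in by_class.values():
            keep = max(1, round(frac * len(indices)))  # at least one sample per class
            stage.extend(indices[:keep])
        yield sorted(stage)


# Toy labels (hypothetical): two classes with 8 samples each.
labels = ["cat"] * 8 + ["dog"] * 8
stages = list(staged_subsets(labels))
print([len(s) for s in stages])  # [4, 8, 16]
```

Each stage's index list could be used to build the training subset for that stage; in class-incremental learning, by contrast, the set of labels itself would change between stages.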

Please find the Docker setup here. It can help get your experiments off the ground.

There are two paths you can take. You can either use this as a blueprint for creating your own environment locally on your machine (most of the commands you might need are in the Dockerfile), or you can fork the repo and do your work in it as is. In the second case, as you make changes or create new notebooks, they should show up in the workspace folder of the repository. You should then be able to make a git commit and push them to GitHub (for sharing the results with others or storing your own work).

I haven’t had a chance to use this setup extensively (this is the first Docker image I have defined), so your mileage may vary.

I went for relative cleanliness, but the issue with this setup is that the fastai library is hard to get to (it doesn’t live in the workspace folder, only in the container). For the type of work we are setting out to do here this can be limiting. If there is interest, I can create a separate branch that uses a fastai repo living in the workspace folder, but this comes with a couple of rough edges.

I haven’t had a chance to test this with the most recent version of the library, so it pulls down a specific tag on my own fork of fastai. If needed, I’ll point it to fastai master or some earlier commit once I have had a chance to verify that it works.
