Collaborative note creation and way to proceed?

PegasusWithoutWinds · March 1, 2019, 6:35am

During the already very engaging discussion at How to do fastai - Study plans & Learning strategies started by @AmanDaVinci, Jeremy jumped onto the idea that:

I am thinking that maybe we could have our star note takers @PoonamV, @hiromi, and any other heroes whom I might regretfully have forgotten create a GitHub repo and start the work there. Then everyone else who wants to contribute could add more content to it?

Maybe we could also use a more integrated solution like Gitbook or something similar? Gitbook used to be popular, but recently people seem to be turning away from it. Could people more experienced with it voice their opinions? Any recommendations on other possible tools and frameworks for such a task?

init_27 · March 1, 2019, 6:36am

I suggest that we could have a wiki-thread here and once the course is over we could push the final version of the wiki(s) to GH.

AmanDaVinci · March 1, 2019, 6:41am

I second the idea of having a wiki-thread here first and then polishing & publishing it somewhere else. It will be easier to concentrate on the forums when the course is underway.

miwojc · March 1, 2019, 8:54am

The notes for part 1 were great. To make them even better, what about using Jupyter Notebook as the tool for taking notes? That way you can both read and run code / experiment…

PoonamV · March 1, 2019, 10:27am

I think writing code and notes together, with some nice images in between, will make things easier to understand. Jupyter notebooks on GitHub will do that. They can also be turned into web pages later, like docs.fast.ai.

PegasusWithoutWinds · March 1, 2019, 11:22am

I certainly would love to use Jupyter Notebook, as we already have ongoing development of infrastructure surrounding it. It is awesome in every way, but doing source control on notebooks could pose a rather steep learning curve for newcomers. Those who have contributed to the fastai docs certainly know what I mean.

If we decide to take this route, we also need to figure out a version control strategy. It could be the same as what we did for the docs, or maybe we could try something different.

@stas any words of wisdom?

miwojc · March 1, 2019, 4:01pm

That’s true but that’s part two so maybe doable?

PegasusWithoutWinds · March 1, 2019, 4:05pm

Don’t get me wrong. I am all for it. We should definitely try it out and see how people react.

Most likely we need to tweak the version control strategy a bit. The big idea itself is on the mark.

stas · March 1, 2019, 5:17pm

I’d avoid using Jupyter notebooks as a collaborative tool for multiple concurrent editors like the plague, unless every participant wants to learn how to merge and resolve conflicts in JSON files, and do it a lot. That wouldn’t encourage people to contribute. You would have to learn how to use nbdime, which lets you merge JSON notebook files in a visual way.
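To make the JSON point concrete, here is a minimal sketch using only the Python standard library. It builds two versions of a tiny notebook as plain dicts and lists the JSON lines that differ between them. The helper names (`make_notebook`, `changed_lines`) and the sample contents are made up for illustration, and the notebook carries only the basic nbformat 4 fields, so treat it as a toy rather than a picture of a real lesson notebook.

```python
import difflib
import json


def make_notebook(prose, execution_count):
    """Build a minimal nbformat-4 style notebook as a plain dict (illustrative only)."""
    return {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": [prose]},
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [],
                "source": ["x = 1"],
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def json_lines(notebook):
    """Serialize the notebook the way it would sit on disk, one JSON line per list item."""
    return json.dumps(notebook, indent=1, sort_keys=True).splitlines()


def changed_lines(old, new):
    """Return only the added and removed lines of a unified diff, without headers."""
    diff = difflib.unified_diff(json_lines(old), json_lines(new), lineterm="", n=0)
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


if __name__ == "__main__":
    before = make_notebook("Lesson 8 notes", execution_count=1)
    # One editor rewords the prose and re-runs the code cell.
    after = make_notebook("Lesson 8 notes, revised", execution_count=2)
    for line in changed_lines(before, after):
        print(line)
```

Re-running a cell changes `execution_count` even though no code was touched, so two people who merely execute the same notebook can end up with conflicting JSON. Real notebooks also store outputs, which makes the differences larger.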

Collaborating on nbs is ok, if you use a token approach which we more or less use for the fastai docs. i.e. I check out the latest docs, quickly edit them and push my changes in. We try to avoid situations where 2 people edit the same notebook at the same time. It’s also very easy to make a mistake and overwrite someone’s changes when doing the merging. This is easy to catch with code, but almost impossible to catch with prose.

I recommend sticking to simple formats that are easy to edit and merge, especially at the beginning. If you have to include runnable code, plain Python scripts are probably the best way. And if really desired, there could be a simple tool that combines the final pieces into a notebook, i.e. combines the text with the Python scripts and automatically runs the outcome, like we do with the docs. Once the notes are more or less complete, it’s OK to continue with the finished combined notebook, since future changes would be infrequent and merge conflicts therefore unlikely.

So you use .md files under git, and if you need to include runnable code with outputs, you simply add: code/lesson_8_example_1.py inside the .md file, and add that Python script to the repo. Then, at the end of the note-writing period, when there are almost no changes left to add, write a script to split that .md file into text cells and code cells (sucking in the content of all the .py files) and save them into a notebook.
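A minimal sketch of such a conversion script could look like the following. It rests on several assumptions that are not spelled out in the post: a script reference is a line containing nothing but a path of the form `code/<name>.py`, paths are resolved relative to the repo root, and the notebook is written as plain nbformat 4 JSON with the standard library rather than through a notebook library. The function names are hypothetical, and executing the resulting notebook to fill in outputs is left out.

```python
import json
import re
from pathlib import Path

# Assumed convention: a line holding nothing but a path such as
# code/lesson_8_example_1.py marks where that script's content belongs.
SCRIPT_LINE = re.compile(r"^\s*(code/[\w./-]+\.py)\s*$")


def markdown_cell(text):
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(text):
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": text,
    }


def md_to_notebook(md_path, repo_root):
    """Split a markdown file into text cells and code cells, pulling in .py files."""
    cells, prose = [], []

    def flush_prose():
        # Turn the prose collected so far into one markdown cell.
        text = "\n".join(prose).strip()
        if text:
            cells.append(markdown_cell(text))
        prose.clear()

    for line in Path(md_path).read_text(encoding="utf-8").splitlines():
        match = SCRIPT_LINE.match(line)
        if match:
            flush_prose()
            script = Path(repo_root) / match.group(1)
            cells.append(code_cell(script.read_text(encoding="utf-8").rstrip()))
        else:
            prose.append(line)
    flush_prose()
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(notebook, out_path):
    """Save the notebook dict as an .ipynb (JSON) file."""
    Path(out_path).write_text(json.dumps(notebook, indent=1), encoding="utf-8")
```

Usage would be along the lines of `write_notebook(md_to_notebook("lesson_8.md", "."), "lesson_8.ipynb")`, where both file names are placeholders. Because the .md and .py sources stay plain text, contributors only ever merge text files, and the notebook is a generated artifact.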

stas · March 1, 2019, 5:23pm

The other approach is to have a note captain for each class and have contributors contribute edits, suggestions and improvements directly to the captain and she/he does all the editing and supervision. No source control issues. I did that while writing my first book and it worked great.

And again, once the notes are complete, then they can be released as a notebook and then it shouldn’t be a problem to add occasional fixes.

I think you’d want to have a coordinator/supervisor for each class anyway, otherwise it might be tricky if there is a disagreement on what Jeremy really meant when he said “Ugh!”

stas · March 1, 2019, 5:54pm

If you consider using the discourse wiki, I’d first investigate its reliability as a concurrent editing tool. At least once I have seen my changes overwritten by another editor who was editing at the same time. We both started with the same edit, then both were editing concurrently. I saved first, he saved second, and my changes were replaced by his. Not good.

But I haven’t investigated whether it’s a bug in discourse or it was a fluke (if there is such a thing in software engineering). I just know now not to edit wiki posts in discourse if the system is telling me someone else is either already editing it or has just started editing, since I no longer trust it.

s.s.o · March 1, 2019, 6:05pm

Maybe an alternative would be using google docs which is suitable for collaborative editing + comments etc. and also possible to web publish…

stas · March 1, 2019, 11:20pm

Google Docs is indeed a good one for collaborating.

Also I see that people do send PRs with fixes to Hiromi’s notes:
https://github.com/hiromis/notes/pulls?q=is%3Apr+is%3Aclosed
so perhaps nothing needs to be changed, other than making it easy on Hiromi so that others could have commit access and help with PRs.

PegasusWithoutWinds · March 2, 2019, 7:00am

I will attempt to summarize your points here so that I can make sure I get what you mean. Please let me know if I misunderstood anything again.

Jupyter Notebook is not a good collaborative tool for multiple concurrent editors. Too high a collaboration overhead.
Follow the token approach we used for fastai docs.
Stick to simple file formats that are easy to collaborate on, like markdown with code cell annotation, then turn them into Jupyter Notebooks at the end using automated workflow.
Have a note captain in charge of each class and serve as the conflict resolver and final arbitrator.
Discourse wiki might have some issue with concurrent editing.

PegasusWithoutWinds · March 2, 2019, 7:01am

Actually, now thinking about it, why not give Google Colab a try? It is Jupyter Notebook interface but with the collaborative editing support of Google docs.

Maybe we can just let the note captains decide what approach they would like to take, and see how things work out.

init_27 · March 2, 2019, 8:04am

I vote for Google Docs. We can collaborate on the notes and, after Part 2 is complete and we’ve created highly polished notes, upload them to a GH repo.

I was worried about someone accidentally deleting or editing notes, but Google Docs shows the complete history, so that wouldn’t be a problem.

Are there any other shortcomings in this approach?

miwojc · March 2, 2019, 9:13am

Bookdown seems to be a nice tool https://bookdown.org/yihui/bookdown/

It provides means for collaboration

miwojc · March 2, 2019, 9:40am

Regardless of the tool choice, which is very important, I agree that we need a person or steering committee to be in charge of this project…

miwojc · March 6, 2019, 9:51pm

Is this a way to deliver lesson notes?

https://colab.research.google.com/github/tensorflow/swift/blob/master/docs/site/tutorials/walkthrough.ipynb

jeffhale · March 6, 2019, 11:14pm

Colab seems worth a shot - COLLABoration in Jupyter Notebooks is what it’s designed for