[git] An easier jupyter notebook modification commit process?


(Jeremy Howard (Admin)) #121

Yup there’s no way to make that work in Windows AFAIK without creating a little .bat wrapper.


#122

No it’s working even if the file doesn’t have the .py extension. Even when you want to execute it, python tools\fastai-nbstripout works fine (think I had a typo when I didn’t manage to make it work, probably a / instead of a \).


(Stas Bekman) #123

Thank you, @sgugger!

I removed the check that didn’t work on windows, so you all can now switch to using the new tools/trust-origin-git-config. Your workflow can be updated to:

git clone https://github.com/fastai/fastai_v1
cd fastai_v1
tools/trust-origin-git-config

If you experience any problems please let me know.


#124

So on windows, last instruction should be

python tools\trust-origin-git-config

otherwise it works properly.


(Fred Monroe) #125

what do you guys think of this tool (jupytext)?

seemed like a possible solution for dealing w/ jupyter notebooks in a collaborative / git environment


(Stas Bekman) #126

Thank you for the feedback, @sgugger

  1. Would you still need to include python if the script has .py in it?
  2. When you use python as you have shown does it have to be \ or will / work as a path separator?

(Jeremy Howard (Admin)) #127

  1. Yes - Windows cmd doesn’t support script files as executable
  2. It needs \ on Windows cmd
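
One way to sidestep both points above is to launch the script through the Python interpreter and let Python build the path. This is only a minimal sketch: the helper names `script_command` and `run_script` are hypothetical and not part of the fastai tools, and it assumes the script lives under `tools/` in the checkout.

```python
import subprocess
import sys
from pathlib import Path


def script_command(repo_root, script_name):
    """Build an argv list that runs tools/<script_name> with the current interpreter.

    pathlib joins the parts with the native separator (\\ on Windows, / elsewhere),
    and passing the interpreter explicitly avoids relying on the script itself
    being executable, which Windows cmd does not support.
    """
    script = Path(repo_root) / "tools" / script_name
    return [sys.executable, str(script)]


def run_script(repo_root, script_name):
    """Run the script from the repository root and return its exit code."""
    result = subprocess.run(script_command(repo_root, script_name), cwd=repo_root)
    return result.returncode
```

For example, `run_script(".", "trust-origin-git-config")` from inside the clone should behave like `python tools\trust-origin-git-config` on Windows and `python tools/trust-origin-git-config` elsewhere.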

(Stas Bekman) #128

Thank you, all, for your input. I have updated the docs to include a note on how to invoke this on windows. Hopefully it’ll be smooth sailing from here on.

wrt the original issue with the quoted filepath inside .git/config, which led to the creation of the new script: I submitted a bug report to the git dev list and it started a big discussion, which hasn’t yet resulted in any outcomes, but I trust something good will come out of it.


(Stas Bekman) #129

Thank you, Fred, for mentioning jupytext.

Looking through the demo it appears that it deletes everything but code, and that won’t work for what has been developing here - we do keep outputs and some other important notebook fields in the documentation notebooks. And down the road when code notebooks have been more or less completed it is possible that outputs will be stored again, while still deleting other notebook fields. i.e. we want to have that fine control over what gets stored under git, and jupytext takes it away.

I agree though that it’d be far easier if the stored format wasn’t JSON but some plain text - so merging/diffing would be much easier. Though nbdime handles the diff/merge quite well. Just make sure you have it installed and configured.
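
To make the "fine control" point concrete, here is a rough sketch of a strip step that can keep or drop outputs while always clearing other noisy fields. It is not the actual fastai-nbstripout code; the function name `strip_notebook`, the `keep_outputs` flag and the choice of which metadata keys to keep are all assumptions made for illustration.

```python
import copy

# Assumption: these top-level metadata keys are worth keeping; everything else is noise.
KEEP_TOP_LEVEL_METADATA = ("kernelspec", "language_info")


def strip_notebook(nb, keep_outputs=True):
    """Return a stripped copy of a notebook given as a dict (parsed .ipynb JSON).

    Execution counts and per-cell metadata are always cleared. Outputs are kept
    for documentation-style notebooks and dropped when keep_outputs is False.
    """
    nb = copy.deepcopy(nb)  # never mutate the caller's notebook
    metadata = nb.get("metadata", {})
    nb["metadata"] = {k: v for k, v in metadata.items() if k in KEEP_TOP_LEVEL_METADATA}

    for cell in nb.get("cells", []):
        cell["metadata"] = {}
        if cell.get("cell_type") != "code":
            continue
        cell["execution_count"] = None
        if keep_outputs:
            # Keep the rendered output but drop the counter that changes on every run.
            for output in cell.get("outputs", []):
                if "execution_count" in output:
                    output["execution_count"] = None
        else:
            cell["outputs"] = []
    return nb
```

A docs notebook would go through `strip_notebook(nb)` and a code notebook through `strip_notebook(nb, keep_outputs=False)`; the result can then be written back with `json.dump`.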


(Jeremy Howard (Admin)) #130

@stas could you tell me how to create a directory that doesn’t run stripout, or runs it with different params? I’d like to create a directory containing rendered notebooks for people to look at.


(Stas Bekman) #131

I think all you need to do is to move fastai_v1/.gitattributes to dev_nb if you want those notebooks not to be under dev_nb. I think we should do it anyway, since this setup is only relevant for things under dev_nb.

If, however, you want them as a subfolder under dev_nb, create .gitattributes in that new subfolder and inside you specify:

*.ipynb -filter

which will override its parent .gitattributes configuration. The leading - before filter means ‘Unset’.

However, why not use the .gitattributes from docs? You will end up with stripped notebooks which will keep the output. And no other irrelevant nb noise.


(Fred Guth) #132

Maybe we could automatically check whether each notebook is valid JSON, and not let the PR be merged if it is not.

I know this kind of thing is possible on GitHub; some projects use Travis CI, which may be overkill for fastai. Unfortunately, I don’t have much experience in this area.
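
A check like that could be sketched as below and run by whatever CI or hook ends up being used. The names `invalid_notebooks` and `main` are made up for this example, and nothing here is an existing fastai tool; it only relies on Python's standard `json` module.

```python
import json


def invalid_notebooks(paths):
    """Return a list of (path, error message) for files that are not valid JSON."""
    problems = []
    for path in paths:
        try:
            with open(path, encoding="utf-8") as f:
                json.load(f)
        except (OSError, ValueError) as err:
            # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
            problems.append((path, str(err)))
    return problems


def main(argv):
    """Print each broken notebook and return a non-zero exit code if any were found."""
    problems = invalid_notebooks(argv)
    for path, err in problems:
        print(f"{path}: {err}")
    return 1 if problems else 0
```

Wired up as a script with `sys.exit(main(sys.argv[1:]))`, a non-zero exit code would fail the CI job and so block the merge.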


(Stas Bekman) #133

If you follow the developer install instructions, you will find:

tools/run-after-git-clone

which already takes care of doing the right thing. So your PR will be validated and done correctly by the filter that that script installs. You only need to run it once per git clone. For more details please see: this document.