Submitting PR to fastai V2

arora_aman · September 7, 2019, 3:48am

Submitting Pull Request to fastai V2

I am a Wiki thread! You can edit me.

Creating a PR using gh - the official GitHub CLI

Install gh from here
Create a conda environment with fastai2 requirements, include nbdev.
a) git clone https://github.com/fastai/fastai
b) cd fastai
c) conda env create -f environment.yml
d) source activate fastai
e) pip install -e ".[dev]"
f) Create an editable install of fastcore outside of fastai2 (see docs here)
Run nbdev_install_git_hooks inside the repo folder
Checkout to new feature branch (this is where new feature/bug-fix should be made)
git checkout -b <enter--relevant-feature-branch-name>
Make changes to Jupyter Notebook (ie., add tests, update feature, add new feature, add documentation etc.)
Save updated notebook
Run nbdev_build_lib (see Note2 below if this step produces a bug)
Run library tests nbdev_test_nbs
Commit change
git commit -m "done with feature"
gh pr create -B master -b "enter body of PR here" -t "enter title"

This command will automatically create a fork for you if you’re in a repository that you don’t have permission to push to.

Note: Steps 1-3 need to be only performed once when you fork fastai2 for the first time. In subsequent PRs start from the step 4.

Note2: If you get an error like: No such file or directory: '../fastai2/docs/_config.yml' then you need to clone the fastai-docs repo into the same dir as you have fastai2 in. This is because the docs dir in fastai2 is now a symlink to …/fastai-doc

Confirm PR created

You can confirm that your PR has been created by running gh pr list inside fastai2 folder of your computer. You can also check the status of your PR by running gh pr status. More detailed documentation can be found here.

Updating a PR

If you want to change your code after a PR has been created you can do it by sending more commits to the same remote branch. For example:
git checkout -b <enter-branch-name-same-as-before>
git commit -m "updated the feature"
git push
It will automatically show up in the PR on the github page.
If these are small changes they can be squashed together at the merge time and appear as a single commit in the repository.

Merging a PR

This is done by the project maintainers so nothing more for you to do, you’ll see a notification on github when the PR gets merged and closed.

Workaround for authenticating on headless environments

The first time you submit a PR you’ll be asked to authroize gh to work on your behalf (e.g., create forks as needed, submit PRs, etc…). If you are working on a remote server (a.k.a headless environment), as I suspect most will be, follow the instructions here to complete the steps required.

jeremy · September 7, 2019, 4:00am

@arora_aman if you haven’t tried it before, I suggest you take a look at this: https://github.com/github/hub

It makes life way easier for both beginners and experts!

arora_aman · September 7, 2019, 4:06am

Thank you @jeremy

I made my first PR ever https://github.com/fastai/fastai_dev/pull/174 using hub but I think I still messed it up

Can I please request you to make the thread a wiki so that anyone can edit and we can have a complete list of steps for everyone to follow including myself

sgugger · September 7, 2019, 3:27pm

I am polishing the scripts but whenever anyone submit a PR, it will be expected that the local library and notebooks match. If you made a change to the notebooks in one of the exported cells, you can export it to the library with notebook2script.py, if you made a change to the library, you can export it back to the notebooks with script2notebook.py.

The script diff_nb_script.py can let you know if there is a difference (and will be included in the CI early next week).

jeremy · September 7, 2019, 4:38pm

Done.

jeremy · September 7, 2019, 4:39pm

Also the CI isn’t checking for stripped nbs at the moment - there’s extra metadata in some PRs and nothing picking that up.

hiromi · September 7, 2019, 5:11pm

I am not sure what CI/CD tools fastai is using, but GitHub Actions has a hook for on.pull_request where you can run a shell script and check for stripped notebook. I was trying to see if I can get that to run the tests described in README, but it doesn’t have GPU and not great for testing.

jeremy · September 8, 2019, 12:48pm

@hiromi we use Azure Devops. We have things to check for stripping in most places, but it just hasn’t been set up for this folder yet.

sgugger · September 9, 2019, 7:11pm

Update: the CI now automatically tests every commit/PR to check:

that all notebooks can be opened
that all notebooks are stripped from metadata (remember to run tools/run-after-git-clone onece after cloning the repo)
that there is no diff between the library in local and the result of an export in the notebooks.

If you changed something in one of the exported cells of one of the notebooks, either run the last cell of said notebook to export everything or run

python notebook2script.py

in the dev folder of the cloned repo.

If you changed something in one of the files in local that is generated by the notebooks, run

python script2notebook.py

in the dev folder of the cloned repo.
Note that this only works for small changes like a typo fix, if you want to add functions, use the notebooks.

If you want to check you have a clean diff, run

python diff_nb_script.py

in the dev folder of the cloned repo. If it doesn’t print anything, you are good to go.

slawekbiel · September 10, 2019, 8:35pm

@arora_aman Maybe add a section on updating a PR too? Like
git commit -m "improved the feature"
git push origin <enter-branch-name-same-as-before>

I think it’s also possible to do it without creating a new commit and instead git commit --amend the original one and pushing that.

arora_aman · September 11, 2019, 10:11am

@slawekbiel
Good idea! You should just be able to edit the post since its a Wiki thread

jeremy · September 11, 2019, 4:31pm

It’s best when each separate PR has a separate commit history, otherwise it’s a bit harder to merge since the PR has a whole bunch of unrelated stuff in it.

slawekbiel · September 16, 2019, 5:44pm

I’ve edited it again to clarify that you only create a fork once, rather than on every PR

kdorichev · September 23, 2019, 9:24am

Added these scripts’ description to the README.md

tank13 · October 27, 2019, 8:24pm

Thank you for the guide!

Just made my first PR using hub – like arora_aman, I think I still messed it up. A couple issues that came up in the process:

I notice that git add isn’t part of the instructions. I assumed I should add both the notebook and the script that it changed… is this right?
When running python run_notebook.py -- fn <notebook> I got this error:

Traceback (most recent call last):

File “run_notebook.py”, line 18, in
slow:Param(“Run slow tests”, bool)=False, cpp:Param(“Run tests that require c++ extensions”, bool)=False):
File “/home/t/fastai_dev/dev/local/core/script.py”, line 39, in call_parse
func(**args.dict)
File “run_notebook.py”, line 24, in main
for f in sorted(fns): test_nb(f, flags=flags)
File “/home/t/fastai_dev/dev/local/notebook/test.py”, line 113, in test_nb
raise e
File “/home/t/fastai_dev/dev/local/notebook/test.py”, line 110, in test_nb
ep.preprocess(pnb)
TypeError: preprocess() missing 1 required positional argument: ‘resources’

Which I didn’t know what to do with, but it meant that I wasn’t able to run the built-in tests.

sgugger · October 27, 2019, 9:38pm

This error means you don’t have the last version of nbconvert, you should update it.

tank13 · October 27, 2019, 11:14pm

Thank you!!

pete88b · February 6, 2020, 9:44am

Hi everyone,

I’d like to suggest a little change to Hook - should I do this with a pull request, issue or something else?
Looks like run-after-git-clone was removed Nov 2019 - any chance these instructions can be updated? (… I can’t find nbdev_install_git_hooks either :o)
Cheers,
Pete

sgugger · February 6, 2020, 2:39pm

See the section contribution in the README. You need to install nbdev (which is a dev dependency) to have the command nbdev_install_git_hooks.

tcapelle · February 13, 2020, 12:37pm

I edited the wiki to make use of nbdev. Probably using hub we can remove some of the intermediate steps of branching, it should take care of this. (as when you edit a file on the github website and submit a PR)