Developer chat

stas · September 27, 2018, 6:07am

Great!

Finally, would linux-64 be sufficient, or do we need linux-32 as well?

stas · September 27, 2018, 7:20am

@lesscomfortable, do we really need to run this script on all notebooks on every git pull? It’s already pretty slow and will only get slower as the number of notebooks grows.

It’s probably enough to check whether any notebook got modified and only then re-‘trust’ it. I suppose it stores some kind of checksum for each notebook? But then what do we check against as a time reference point? Or perhaps Notary() has an API to do the checking (except it might be just as slow as forcing trust). Can you please investigate? Thank you!

sgugger · September 27, 2018, 12:17pm

Just read the setup.py and I’m not sure cupy should be in the requirements: it’s only used for QRNNs and can limit the install for some users. Also, I plan to rewrite the specific bit in QRNNs that requires cupy as a pytorch custom C++ extension some day and hopefully get rid of cupy completely.

sgugger · September 27, 2018, 1:24pm

trust-doc-nbs runs fine on Windows; however, run-after-git-clone gives me this error:

D:\Work\Deeplearning\fastai_pytorch>python tools\run-after-git-clone
Traceback (most recent call last):
  File "tools\run-after-git-clone", line 31, in <module>
    run_script(path/"trust-origin-git-config")
  File "tools\run-after-git-clone", line 18, in run_script
    result = subprocess.run(cmd.split(), shell=False, check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

sgugger · September 27, 2018, 2:30pm

Successfully (and quickly) installed fastai_pytorch from the repo on a fresh AWS instance (the deep learning AMI) with the following steps:

conda install pytorch-nightly -c pytorch
git clone https://github.com/fastai/fastai_pytorch
cd fastai_pytorch
pip install -e .
tools/run-after-git-clone

The script works perfectly well there (and it didn’t take much time to trust all notebooks when I did a new pull).

stas · September 27, 2018, 3:43pm

Thank you for testing, @sgugger!

It’s odd that it doesn’t print out what it can’t find, no?

If I specify an incorrect file on linux I get:

FileNotFoundError: [Errno 2] No such file or directory: 'tools/trust-origin-git-confi': 'tools/trust-origin-git-confi'

Can you add more diagnostics, so that the script looks like this (copy-n-paste over as is):

import subprocess, os
from pathlib import Path

def run_script(script):

    # check that we can execute the script
    cmd = f"{script}"
    print(f"Executing: {cmd}")
    result = subprocess.run(cmd.split(), shell=False, check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    if result.returncode != 0: print(f"Failed to execute: {script}")
    if result.stdout: print(f"{result.stdout.decode('utf-8')}")
    if result.stderr: print(f"Error: {result.stderr.decode('utf-8')}")

# make sure we are under the root of the project
cur_dir = Path(".").resolve().name
if (cur_dir == "tools"): os.chdir("..")

path = Path("tools")
print("Path", path)
# facilitate trusting/distrusting of the repo-wide .gitconfig
run_script(path/"trust-origin-git-config")

# facilitates trusting notebooks under docs_src
run_script(path/"trust-doc-nbs")
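One possible cause of the Windows failure, offered as an assumption rather than a confirmed diagnosis: Windows does not honor shebang lines, so an extensionless script cannot be launched directly and Popen raises FileNotFoundError. A minimal sketch of a run_script variant that goes through the current interpreter instead, assuming the helper scripts under tools are themselves Python (the returned exit code is an addition for illustration):

```python
import subprocess
import sys
from pathlib import Path

def run_script(script):
    """Run a Python script through the current interpreter and report the outcome."""
    # sys.executable avoids relying on the shebang line, which Windows ignores.
    cmd = [sys.executable, str(script)]
    print(f"Executing: {' '.join(cmd)}")
    result = subprocess.run(cmd, shell=False, check=False,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    if result.returncode != 0: print(f"Failed to execute: {script}")
    if result.stdout: print(result.stdout.decode("utf-8"))
    if result.stderr: print(f"Error: {result.stderr.decode('utf-8')}")
    # Returning the exit code lets the caller decide whether to stop.
    return result.returncode
```

Passing the command as a list also sidesteps cmd.split(), which would break on paths containing spaces.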

jeremy · September 27, 2018, 4:37pm

I think so.

jeremy · September 27, 2018, 4:38pm

If you run it this way, then enter ‘c’ at the pdb prompt, you’ll be able to see what file it can’t find:

python -m pdb tools\run-after-git-clone

lesscomfortable · September 27, 2018, 6:47pm

So checking whether a notebook is signed takes roughly half as much time as signing it. We could speed up the script by checking whether each notebook is signed and only signing it if it isn’t signed already. If you agree, I will modify the script to account for that.
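A rough sketch of that check-then-sign idea, not the actual script. It assumes nbformat’s NotebookNotary, whose check_signature() and sign() methods are used here; the function names trust_if_needed and trust_notebooks are hypothetical:

```python
def trust_if_needed(nb, notary):
    """Sign nb only when its signature is not already trusted.

    Returns True if a new signature was written, False if nb was already trusted.
    """
    if notary.check_signature(nb):
        return False
    notary.sign(nb)
    return True


def trust_notebooks(paths):
    """Apply trust_if_needed to notebook files on disk (requires nbformat)."""
    # Imported lazily so the helper above stays usable without nbformat installed.
    import nbformat
    from nbformat.sign import NotebookNotary

    notary = NotebookNotary()
    signed = []
    for path in paths:
        nb = nbformat.read(str(path), as_version=nbformat.NO_CONVERT)
        if trust_if_needed(nb, notary):
            signed.append(path)
    return signed
```

Whether this is a net win depends on how many notebooks are already trusted: each untrusted one pays for both the check and the signing.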

stas · September 27, 2018, 7:06pm

Thank you for investigating it, @lesscomfortable

If the check fails, the notebook still needs to be signed, so checking first may actually slow things down. So let’s leave it as is for now.

There should be an easier way to detect that a file was modified - we just need to decide what to check against. Perhaps we could drop a last-checked file onto the fs and use its modification timestamp to see whether the nb is newer. That would be much faster, as it’d be just one stat(3) call.
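A minimal sketch of the marker-file idea, with hypothetical names throughout (notebooks_newer_than_marker, update_marker, and whatever marker filename gets chosen). It assumes the notebooks live directly under one directory:

```python
from pathlib import Path

def notebooks_newer_than_marker(nb_dir, marker):
    """Return notebooks under nb_dir modified after the marker file was last touched."""
    marker = Path(marker)
    # With no marker yet, every notebook counts as modified.
    last_checked = marker.stat().st_mtime if marker.exists() else float("-inf")
    return sorted(p for p in Path(nb_dir).glob("*.ipynb")
                  if p.stat().st_mtime > last_checked)

def update_marker(marker):
    """Record 'now' as the new reference point by touching the marker file."""
    Path(marker).touch()
```

The intended flow would be: collect the newer notebooks, trust only those, then call update_marker so the next pull compares against the new timestamp.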

sgugger · September 27, 2018, 7:42pm

Indeed, it’s this one that causes problems:
c:\users\sylvain\anaconda3\envs\fastai1\lib\subprocess.py(997)_execute_child()
That doesn’t help us, since it shows where in the code the file wasn’t found, not which file wasn’t found.

jeremy · September 27, 2018, 7:51pm

I moved some of the docs around so that everything except CONTRIBUTING.md is properly included in the docs site and organized in the sidebar and gets a TOC. (@stas I know you were a bit unsure about whether it’s OK for fastai library users to see this info, but I think it’s OK since it’s clear about who it’s for.)

lesscomfortable · September 27, 2018, 7:52pm

I think your idea is better. However, I don’t think checking signatures and signing only when unsigned will take longer, since there will usually be more unmodified notebooks than modified ones.

ashaw · September 27, 2018, 11:12pm

http://docs.fast.ai/gen_doc.sgen_notebooks.html should have the latest on how to update the notebook documentation and the associated HTML

in the base directory, run something like:
python fastai/gen_doc/sgen_notebooks.py --update_html=True

stas · September 28, 2018, 12:03am

conda, with its dependency twists and mismatched package names, proved to be a bit of a hell.

So I had to build our own torchvision conda package that depends on pytorch-nightly conda package…

Please help me test that it (1) installs and (2) actually works.

Currently only a linux-64/python3.6 conda build is available, so please only try if that’s your setup.

First clear out your environment:

conda uninstall fastai pytorch-nightly pytorch torchvision
pip uninstall fastai pytorch-nightly pytorch torchvision

now:

conda install -c pytorch pytorch-nightly
conda install -c fastai/label/test torchvision=0.2.1=pyhe7f20fa_0

and finally:

conda install -c fastai/label/test fastai

Let me know first whether it installs successfully and then whether you can use it. Please try this even if you normally use an editable install - it’s just a test, and you can uninstall it right away.
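For a quick first pass at the "installs" part, something like the sketch below could be run in the fresh environment. check_imports is a hypothetical helper, and a successful import only shows that the packages load, not that they work correctly together:

```python
import importlib

def check_imports(names):
    """Map each module name to its __version__ (or 'unknown'), or None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report

if __name__ == "__main__":
    # Module names as imported in Python; the conda package names differ for pytorch.
    for name, version in check_imports(["torch", "torchvision", "fastai"]).items():
        print(f"{name}: {version if version is not None else 'NOT IMPORTABLE'}")
```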

There is a deadline, so your help is crucial.

Thank you!

sgugger · September 28, 2018, 1:27am

Had a few commits today and the vision docs are almost finished; only vision.learner is missing for now.

jeremy · September 28, 2018, 3:39am

@sgugger that reminds me - not a big deal but you may want to rename vision_learner and text_learner in the module dependencies diagram too.

stas · September 28, 2018, 3:45am

I suppose you are replying to Jeremy’s suggestion. Could you please run the modified code I pasted, which should tell us what it is trying to find?

I don’t know what to make of those strings you pasted - lacking context.

Thank you.

stas · September 28, 2018, 3:47am

Please go ahead and proceed with your suggestion, @lesscomfortable, we can always measure/tweak it later if we discover that it’s becoming a hurdle. Thank you for looking into it.

stas · September 28, 2018, 5:46am

And btw, we have a fledgling test suite already. Please see:

http://docs.fast.ai/developers.html#test-suite

And you may start adding tests to it as you create new features.

Or if you have some time on your hands, please, write tests for the existing ones.

Note that some of the v0 tests can be ported to v1. https://github.com/fastai/fastai/tree/master/tests
I ported a few subtests from tests/test_core.py of v0, but there are a lot more there to port.

Thank you.