Developer chat

Thank you, @sgugger, for the feedback!

Hmm, it seems to ignore custom channels - odd! Or perhaps it is because you are running on a different arch.

What do you get when you run:

conda search -c pytorch torchvision

I get:

Loading channels: done
# Name                  Version           Build  Channel
torchvision               0.1.8          py27_0  pkgs/free
torchvision               0.1.8          py35_0  pkgs/free
torchvision               0.1.8          py36_0  pkgs/free
torchvision               0.2.0          py27_0  pkgs/main
torchvision               0.2.0  py27hfb27419_1  pytorch
torchvision               0.2.0          py35_0  pkgs/main
torchvision               0.2.0  py35heaa392f_1  pytorch
torchvision               0.2.0          py36_0  pkgs/main
torchvision               0.2.0  py36h17b6947_1  pytorch
torchvision               0.2.1          py27_0  pkgs/main
torchvision               0.2.1          py27_1  pytorch
torchvision               0.2.1          py35_0  pkgs/main
torchvision               0.2.1          py35_1  pytorch
torchvision               0.2.1          py36_0  pkgs/main
torchvision               0.2.1          py36_1  pytorch
torchvision               0.2.1          py37_1  pytorch

You can see that the required torchvision version, 0.2.1, is there. But perhaps there is no Windows package?
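
A question like this can also be answered from the listing text itself by filtering rows on version, channel and Python tag. The sketch below is only an illustration: `parse_rows` and `has_build` are hypothetical helpers, and it assumes the four-column layout shown above rather than any stable conda output format.

```python
def parse_rows(listing):
    """Split conda-search style text into (name, version, build, channel) tuples."""
    rows = []
    for line in listing.splitlines():
        parts = line.split()
        # skip blank lines, the "Loading channels" line and the "# Name ..." header
        if len(parts) != 4 or parts[0].startswith("#"):
            continue
        rows.append(tuple(parts))
    return rows


def has_build(rows, version, channel, py_tag):
    """True if some row matches the version and channel and its build starts with py_tag."""
    return any(v == version and c == channel and b.startswith(py_tag)
               for _, v, b, c in rows)


# a few rows copied from the listing above
sample = """\
Loading channels: done
# Name                  Version           Build  Channel
torchvision               0.2.1          py36_0  pkgs/main
torchvision               0.2.1          py36_1  pytorch
torchvision               0.2.1          py37_1  pytorch
"""
rows = parse_rows(sample)
print(has_build(rows, "0.2.1", "pytorch", "py36"))
```

Note that a listing like this only covers the platform conda was run on, so it cannot say anything about win-64 when produced on linux-64.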

And it didn’t complain about pytorch, hmm.

What do you get when you run:

conda search -c pytorch pytorch

And also:

conda search -c fastai dataclasses
Loading channels: done
# Name                  Version           Build  Channel
dataclasses                 0.6          py36_0  fastai

And these were built for py36, are you on py37?

Apologies if a quick feedback request is turning into a lot of questions - if you’re busy we can do it another time.

For torchvision, I get:

PackagesNotFoundError: The following packages are not available from current channels:

  - torchvision

Current channels:

  - https://conda.anaconda.org/pytorch/win-64
  - https://conda.anaconda.org/pytorch/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/win-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/pro/win-64
  - https://repo.anaconda.com/pkgs/pro/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

Note that the official pytorch instructions tell Windows users to install torchvision via pip install torchvision.

For pytorch, I get:

Loading channels: done
# Name                  Version           Build  Channel
pytorch                   0.4.0 py35_cuda80_cudnn7he774522_1  pytorch
pytorch                   0.4.0 py35_cuda90_cudnn7he774522_1  pytorch
pytorch                   0.4.0 py35_cuda91_cudnn7he774522_1  pytorch
pytorch                   0.4.0 py36_cuda80_cudnn7he774522_1  pytorch
pytorch                   0.4.0 py36_cuda90_cudnn7he774522_1  pytorch
pytorch                   0.4.0 py36_cuda91_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py35_cuda80_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py35_cuda90_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py35_cuda92_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py36_cuda80_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py36_cuda90_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py36_cuda92_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py37_cuda80_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py37_cuda90_cudnn7he774522_1  pytorch
pytorch                   0.4.1 py37_cuda92_cudnn7he774522_1  pytorch

On this, note that there are no nightly wheels or pytorch 1.0 preview for Windows (everyone hates us :wink: ).

And for dataclasses, I get:

Loading channels: done

PackagesNotFoundError: The following packages are not available from current channels:

  - dataclasses

Current channels:

  - https://conda.anaconda.org/fastai/win-64
  - https://conda.anaconda.org/fastai/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/win-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/pro/win-64
  - https://repo.anaconda.com/pkgs/pro/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

One last detail: I’m on python 3.6.5.

Awesome, thank you for the detailed answers, @sgugger. This is super helpful.

So the issue is the platform-specific build:

 conda search -i -c fastai -f dataclasses
[...]
url         : https://conda.anaconda.org/fastai/linux-64/dataclasses-0.6-py36_0.tar.bz2
conda search -i --platform win-64 -c fastai -f dataclasses
Loading channels: done

PackagesNotFoundError: The following packages are not available from current channels:
- dataclasses

This one is not a noarch package; it was only built for linux-64. And now I know a way to test without asking Windows users for help: conda search -i --platform win-64.
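
The same check can be scripted for several platforms at once. This is a minimal sketch, not part of the repo: `search_cmd`, `platform_from_url` and `available_on` are hypothetical names, the URL layout is taken from the output above, and it assumes conda exits with a non-zero status when it reports PackagesNotFoundError.

```python
import shutil
import subprocess


def search_cmd(package, channel, platform):
    """Build the conda search command line for one target platform."""
    return ["conda", "search", "--platform", platform, "-c", channel, package]


def platform_from_url(url):
    """Return the platform subdir (e.g. 'linux-64') from a conda package URL."""
    # assumed layout: .../<channel>/<platform>/<file>
    return url.rstrip("/").split("/")[-2]


def available_on(package, channel, platform):
    """True/False depending on conda's exit status; None if conda isn't on PATH."""
    exe = shutil.which("conda")
    if exe is None:
        return None
    cmd = [exe] + search_cmd(package, channel, platform)[1:]
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0
```

For example, looping `available_on("dataclasses", "fastai", p)` over "linux-64", "win-64" and "osx-64" would show which platforms still need a build.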

Note to self: we need to check that either a noarch package exists or that we build all the versions for the archs we plan to support. (I’m talking about dependencies.)

So we will either need to build those packages for the other platforms (and probably for python 3.7), or have the user install them via pip before doing conda install fastai.

The only issue I see is that torchvision is not available for Windows from the pytorch channel.
So the Windows build will need to have it excluded. Or perhaps do you know whether they plan to make it available after the pytorch 1.0 release?

torchvision and pytorch v1 for windows will be released when the official pytorch v1 comes out, which won’t be for a few more months. Windows users will just have to wait, I’m afraid, unless someone is brave enough to build on windows from source :slight_smile:

Thank you for the insight

Jeremy, are you saying we can just keep conda packages linux-only for the time being?

Also, should we build for python 3.7? Or just 3.6 for the time being?

Yes.

py36 is fine. If you have time py37 is nice to have - AFAIK the only difference is that we don’t need dataclasses on py37 (it’s part of the stdlib).
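
A minimal sketch of how that difference could be expressed when assembling requirements. The `install_requires` helper and the placeholder package list are assumptions for illustration, not the project's actual setup.py.

```python
import sys


def install_requires(version_info=sys.version_info):
    """Return the dependency list for the given Python version."""
    reqs = ["torch", "torchvision"]  # placeholder entries, not the real list
    # dataclasses joined the stdlib in Python 3.7, so the backport
    # is only needed on 3.6
    if tuple(version_info[:2]) < (3, 7):
        reqs.append("dataclasses")
    return reqs


print(install_requires((3, 6, 5)))
print(install_requires((3, 7, 0)))
```

With pip-style requirements the same condition can be written declaratively as an environment marker: dataclasses; python_version < "3.7".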

Images now display in jupyter without requiring .show()


Note that it displays the full size image, so if you want to display a big image you should resize it first (or perhaps just use .show() which has a figsize default).
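
The post does not say how this was implemented. One mechanism Jupyter offers is its rich display protocol: when the last expression in a cell is an object that defines `_repr_png_` returning PNG bytes, the notebook renders those bytes as an image. The sketch below only illustrates that protocol with a hypothetical `SolidImage` class (not fastai's image class); it encodes a single-color PNG by hand so that it runs with the standard library alone.

```python
import struct
import zlib


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    return (struct.pack(">I", len(data)) + kind + data
            + struct.pack(">I", zlib.crc32(kind + data) & 0xFFFFFFFF))


class SolidImage:
    """Hypothetical image: a width x height block of one RGB color."""

    def __init__(self, width, height, rgb=(255, 0, 0)):
        self.width, self.height, self.rgb = width, height, rgb

    def _repr_png_(self):
        # Jupyter looks for this method when displaying a cell result
        # IHDR: width, height, 8-bit depth, color type 2 (RGB), no interlace
        header = struct.pack(">IIBBBBB", self.width, self.height, 8, 2, 0, 0, 0)
        row = b"\x00" + bytes(self.rgb) * self.width  # filter byte + pixels
        pixels = zlib.compress(row * self.height)
        return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", header)
                + _chunk(b"IDAT", pixels) + _chunk(b"IEND", b""))


png = SolidImage(4, 2)._repr_png_()
print(len(png), "bytes")
```

Because the bytes are rendered at their native pixel size, this also matches the remark above about big images showing up full size.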

h/t @ashaw for the initial research.

OK, here are two new scripts for devs to use. Please pay attention:

  • Francisco created a script to automatically make doc notebooks trusted. And it’s now going to be run automatically on git pull via git’s post-merge hook - but you need to install it once on git clone! See notes: tools/trust-doc-nbs.

  • As we now have more than one thing to run on git clone, we now have a wrapper script that calls tools/trust-doc-nbs and tools/trust-origin-git-config in one shot. For details see: tools/run-after-git-clone.
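
For readers unfamiliar with git hooks, here is a rough sketch of what a one-time hook install can look like. It is not the content of the tools scripts mentioned above: `install_post_merge_hook` is a hypothetical helper, and the command written into the hook is only an example.

```python
import stat
import tempfile
from pathlib import Path


def install_post_merge_hook(repo_root, command):
    """Write .git/hooks/post-merge so that `command` runs after each merge."""
    hook = Path(repo_root) / ".git" / "hooks" / "post-merge"
    hook.parent.mkdir(parents=True, exist_ok=True)
    # write bytes so the shebang line keeps a plain "\n" on every platform
    hook.write_bytes(f"#!/bin/sh\n{command}\n".encode("utf-8"))
    # git only runs hooks that are executable
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook


# demo in a throwaway directory standing in for a fresh clone
with tempfile.TemporaryDirectory() as tmp:
    hook = install_post_merge_hook(tmp, "python tools/trust-doc-nbs")
    hook_text = hook.read_bytes().decode("utf-8")
print(hook_text)
```

Hooks live under .git/hooks and are not copied by git clone, which is why an install step is needed once per clone.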

As usual, please let me know if you encounter any problems, Windows and all…

And if you have extra cycles, do let me know whether that section of the doc is clear, now that there are more and more steps and details… I hope it’s not too verbose…

Thank you.

I also feel a need to rename tools/build as it’s so unclear what it does (I initially thought it’d do it all, hence the name).

Unless there are any objections or better name suggestions, I will rename it to: tools/sync-nb-exports

Great!

Finally, would linux-64 be sufficient, or do we need linux-32 as well?

@lesscomfortable, do we really need to run this script on all notebooks on every git pull? It’s pretty slow already and will only get slower over time as the number of notebooks grows.

It’s probably enough to check whether any notebook got modified and only then re-‘trust’ it. I suppose it stores some kind of checksum for each notebook? But then what do we check against as a time reference point? Or perhaps Notary() has an API to do the checking (except it might be just as slow as forcing trust). Can you please investigate? Thank you!

I just read the setup.py and I’m not sure cupy should be in the requirements: it’s only used for QRNNs and can limit the install for some users. Also, I plan to rewrite the specific bit in QRNNs that requires cupy as a pytorch custom C++ extension some day and hopefully get rid of cupy completely.
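
One way to keep such a package out of the hard requirements is to treat it as optional and only fail when the feature that needs it is actually used. The sketch below is an assumption about how that could look, with hypothetical helper names; it is not how the library handles cupy.

```python
import importlib.util


def has_module(name):
    """True if a top-level module is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


def require(name, feature):
    """Raise a readable error when an optional dependency is missing."""
    if not has_module(name):
        raise ImportError(f"{feature} needs the optional package '{name}'")


# only code paths that need the optional package would call require(...)
print(has_module("json"))
```

On the packaging side, setuptools supports the same idea through extras_require, so that only users who ask for the extra get the additional dependency.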

The trust-doc-nbs script runs fine on Windows; however, run-after-git-clone gives me this error:

D:\Work\Deeplearning\fastai_pytorch>python tools\run-after-git-clone
Traceback (most recent call last):
  File "tools\run-after-git-clone", line 31, in <module>
    run_script(path/"trust-origin-git-config")
  File "tools\run-after-git-clone", line 18, in run_script
    result = subprocess.run(cmd.split(), shell=False, check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Sylvain\Anaconda3\envs\fastai1\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Successfully (and quickly) installed fastai_pytorch from the repo on a fresh AWS instance (the deep learning AMI) with the following steps:

conda install pytorch-nightly -c pytorch
git clone https://github.com/fastai/fastai_pytorch
cd fastai_pytorch
pip install -e .
tools/run-after-git-clone

The script works perfectly well there (and it didn’t take much time to trust all notebooks when I did a new pull).

Thank you for testing, @sgugger!

It’s odd that it doesn’t print out what it can’t find, no?

If I specify an incorrect file on linux I get:

FileNotFoundError: [Errno 2] No such file or directory: 'tools/trust-origin-git-confi': 'tools/trust-origin-git-confi'

Can you add more diagnostics, so that the script now looks like this (copy-and-paste it over as is):

import subprocess, os
from pathlib import Path

def run_script(script):

    # check that we can execute the script
    cmd = f"{script}"
    print(f"Executing: {cmd}")
    result = subprocess.run(cmd.split(), shell=False, check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    if result.returncode != 0: print(f"Failed to execute: {script}")
    if result.stdout: print(f"{result.stdout.decode('utf-8')}")
    if result.stderr: print(f"Error: {result.stderr.decode('utf-8')}")

# make sure we are under the root of the project
cur_dir = Path(".").resolve().name
if (cur_dir == "tools"): os.chdir("..")

path = Path("tools")
print("Path", path)
# facilitate trusting/distrusting of the repo-wide .gitconfig
run_script(path/"trust-origin-git-config")

# facilitates trusting notebooks under docs_src
run_script(path/"trust-doc-nbs")
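
A possible cause of the Windows error, stated as an assumption since the thread does not confirm it: Windows does not act on `#!` lines, so an extension-less script cannot be launched directly with shell=False. The sketch below is a variant of run_script that hands the script to the current interpreter and keeps the path as one argument instead of splitting it on whitespace. It assumes both tools are Python scripts; the demo file name is made up.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(script):
    """Run a Python script with the interpreter that is running this one."""
    cmd = [sys.executable, str(script)]
    print(f"Executing: {' '.join(cmd)}")
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        print(f"Failed to execute: {script}")
    if result.stdout:
        print(result.stdout.decode("utf-8"))
    if result.stderr:
        print(f"Error: {result.stderr.decode('utf-8')}")
    return result


# demo with a throwaway extension-less script (hypothetical name)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "say-hello"
    demo.write_text("print('hello from script')\n")
    demo_result = run_script(demo)
```

Using sys.executable also means the child runs in the same conda environment as the parent.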

I think so.

If you run it this way, then enter ‘c’ at the pdb prompt, you’ll be able to see what file it can’t find:

python -m pdb tools\run-after-git-clone

So checking whether a notebook is signed takes roughly half as much time as signing it. We could improve the speed of the script by checking whether a notebook is signed and only signing it if it is not signed already. If you agree, I will modify the script to account for that.

Thank you for investigating it, @lesscomfortable

If the check fails, the notebook still needs to be signed, so we pay for both the check and the signing; this may actually slow things down. So let’s leave it as is for now.
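
The trade-off can be made concrete with a small cost model. It takes the "roughly half" figure above at face value (signing costs 1 unit, checking 0.5) and is only a back-of-the-envelope sketch; the function names are made up.

```python
def cost_always_sign(n, sign_cost=1.0):
    """Total cost of signing every one of n notebooks."""
    return n * sign_cost


def cost_check_first(n, signed_fraction, sign_cost=1.0, check_cost=0.5):
    """Total cost of checking all n notebooks, then signing the unsigned ones."""
    unsigned = n * (1 - signed_fraction)
    return n * check_cost + unsigned * sign_cost


for fraction in (0.0, 0.5, 1.0):
    print(fraction, cost_check_first(100, fraction), cost_always_sign(100))
```

Under these assumptions, checking first only wins when more than half of the notebooks are already signed; if none are, it costs 1.5 times as much as signing everything.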

There should be an easier way to detect a simple modification of the file; we just need to decide what to check against. Perhaps we could drop a last-checked file onto the fs and use its modification timestamp to see whether the nb is newer. That would be much faster as it’d be just one stat(3) call.
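
A minimal sketch of that timestamp idea, assuming a marker file whose name (.last-checked) and the helper names are made up for illustration. It compares each notebook's modification time with the marker's and returns only the newer ones.

```python
import os
import tempfile
from pathlib import Path


def modified_since(stamp, nb_dir):
    """Return notebooks under nb_dir whose mtime is newer than the stamp file."""
    stamp = Path(stamp)
    # a missing stamp means nothing has been checked yet
    last = stamp.stat().st_mtime if stamp.exists() else float("-inf")
    return sorted(p for p in Path(nb_dir).rglob("*.ipynb")
                  if p.stat().st_mtime > last)


def mark_checked(stamp):
    """Create the stamp file or bump its mtime to now."""
    Path(stamp).touch()


# demo with explicit timestamps so the outcome doesn't depend on the clock
with tempfile.TemporaryDirectory() as tmp:
    old, new = Path(tmp) / "old.ipynb", Path(tmp) / "new.ipynb"
    stamp = Path(tmp) / ".last-checked"
    for p in (old, new, stamp):
        p.write_text("{}")
    os.utime(old, (1000, 1000))
    os.utime(stamp, (2000, 2000))
    os.utime(new, (3000, 3000))
    changed = [p.name for p in modified_since(stamp, tmp)]
print(changed)
```

The script would re-trust only the returned notebooks and then call mark_checked on the marker file.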
