Fastai v1 install - Windows

I am having issues installing fastai v1 on Windows 8.1. I have built pytorch with Visual Studio Community 2017, following the steps exactly as detailed on the website

https://github.com/pytorch/pytorch

I had to install the 14.11 toolset because it wasn't installed by default, but this is detailed in the pytorch instructions. I didn't time the build, but it must have taken at least 3 hours. I tried to use ninja, but without any success.
Once it was installed I got a missing DLL error when importing torch, but this was fixed with

conda install -c defaults intel-openmp -f
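As a quick sanity check that the fix took (assuming the test environment is activated), something like the following should print the torch version and whether CUDA is visible:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"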

I think my installation may be missing a few components because some of the tests are failing; specifically, test\test_torch.py fails with the error below.

RuntimeError: No CUDA implementation of 'gesdd'. Install MAGMA and rebuild cutorch (http://icl.cs.utk.edu/magma/) at D:\repos\pytorch\aten\src\thc\generic/THCTensorMathMagma.cu:332

My plan is to address that once I get fastai v1 to “work” (obviously if torch is slightly broken it can’t work 100%).

I can successfully do

import torch

however I cannot install torchvision-nightly. The output from

conda install -c fastai torchvision-nightly

is

Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  • torchvision-nightly
  • pytorch-nightly

Current channels:

I have therefore downloaded torchvision from

https://github.com/pytorch/vision/tree/master/torchvision

and installed following their instructions with

python setup.py install

Following this, the below works without any errors

import torchvision

Next I installed fastai and ran the below without any problems.

from fastai import *

however

from fastai.vision import *

fails with the following error

ModuleNotFoundError: No module named 'fastai.vision'

and

from fastai.text import *

gives me


ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from fastai.text import *

~\fastai\courses\dl1\fastai\text.py in
----> 1 from .core import *
2 from .learner import *
3 from .lm_rnn import *
4 from torch.utils.data.sampler import Sampler
5 import spacy

~\fastai\courses\dl1\fastai\core.py in
----> 1 from .imports import *
2 from .torch_imports import *
3
4 def sum_geom(a,r,n): return a*n if r==1 else math.ceil(a*(1-r**n)/(1-r))
5

~\fastai\courses\dl1\fastai\imports.py in
1 from IPython.lib.deepreload import reload as dreload
----> 2 import PIL, os, numpy as np, math, collections, threading, json, bcolz, random, scipy, cv2
3 import pandas as pd, pickle, sys, itertools, string, sys, re, datetime, time, shutil, copy
4 import seaborn as sns, matplotlib
5 import IPython, graphviz, sklearn_pandas, sklearn, warnings, pdb

ModuleNotFoundError: No module named 'bcolz'
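I also notice that the traceback above is going through ~\fastai\courses\dl1\fastai rather than the installed package, so a quick way to confirm which fastai Python is actually picking up (just a sanity check) is:

import fastai
print(fastai.__file__, fastai.__version__)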

Should I be installing fastai from source, and will this bring in all the dependencies?

Additionally I get the following output

=== Software ===
python version : 3.6.6
fastai version : 1.0.6
torch version : 1.0.0a0+7edfe11
torch cuda ver : 9.2
torch cuda is : available
torch cudnn ver : 7301
torch cudnn is : enabled

=== Hardware ===
torch available : 1

  • gpu0 : GeForce GTX 980M

=== Environment ===
platform : Windows-8.1-6.3.9600-SP0
conda env : test_fastai
python : D:\c_progs\Anaconda3\envs\test_fastai\python.exe
sys.path :
D:\c_progs\Anaconda3\envs\test_fastai\python36.zip
D:\c_progs\Anaconda3\envs\test_fastai\DLLs
D:\c_progs\Anaconda3\envs\test_fastai\lib
D:\c_progs\Anaconda3\envs\test_fastai
C:\Users\b8\AppData\Roaming\Python\Python36\site-packages
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\torchvision-0.2.1-py3.6.egg
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\win32
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\win32\lib
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\Pythonwin
D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\IPython\extensions

from

python -c "import fastai; fastai.show_install(0)"

It looks like the conda install of fastai is currently not setting everything up on Windows. I have now installed from source following the Developer Install instructions, and both

from fastai.vision import *

and

from fastai.text import *

are working and I can successfully run the first three cells in the dogs_cats.ipynb notebook. I am now getting an error when running

learn.fit_one_cycle(1)

which looks like it could be due to the pytorch data loader; however, the 45 tests inside the pytorch test\test_dataloader.py passed without any errors. I have tried to examine the cause of the error with %debug, but without any success. The full output is below. I suspect this is a Windows issue, but I am not sure; any help would be appreciated.


PicklingError Traceback (most recent call last)
in
1 learn = ConvLearner(data, tvm.resnet34, metrics=accuracy)
----> 2 learn.fit_one_cycle(1)

d:\ssdbackup\dev\repos\fastai_v1\fastai\train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
17 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
18 pct_start=pct_start, **kwargs))
---> 19 learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
20
21 def lr_find(learn:Learner, start_lr:Floats=1e-5, end_lr:Floats=10, num_it:int=100, **kwargs:Any):

d:\ssdbackup\dev\repos\fastai_v1\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
137 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
138 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 139 callbacks=self.callbacks+callbacks)
140
141 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

d:\ssdbackup\dev\repos\fastai_v1\fastai\basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
89 except Exception as e:
90 exception = e
---> 91 raise e
92 finally: cb_handler.on_train_end(exception)
93

d:\ssdbackup\dev\repos\fastai_v1\fastai\basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
77 cb_handler.on_epoch_begin()
78
---> 79 for xb,yb in progress_bar(data.train_dl, parent=pbar):
80 xb, yb = cb_handler.on_batch_begin(xb, yb)
81 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)[0]

D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\fastprogress\fastprogress.py in __iter__(self)
59 self.update(0)
60 try:
---> 61 for i,o in enumerate(self._gen):
62 yield o
63 if self.auto_update: self.update(i+1)

d:\ssdbackup\dev\repos\fastai_v1\fastai\data.py in __iter__(self)
50 def __iter__(self):
51 "Process and returns items from DataLoader."
---> 52 for b in self.dl: yield self.proc_batch(b)
53
54 def one_batch(self)->Collection[Tensor]:

D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
817
818 def __iter__(self):
--> 819 return _DataLoaderIter(self)
820
821 def __len__(self):

D:\c_progs\Anaconda3\envs\test_fastai\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
558 # before it starts, and __del__ tries to join but will get:
559 # AssertionError: can only join a started process.
--> 560 w.start()
561 self.index_queues.append(index_queue)
562 self.workers.append(w)

D:\c_progs\Anaconda3\envs\test_fastai\lib\multiprocessing\process.py in start(self)
103 'daemonic processes are not allowed to have children'
104 _cleanup()
--> 105 self._popen = self._Popen(self)
106 self._sentinel = self._popen.sentinel
107 # Avoid a refcycle if the target function holds an indirect

D:\c_progs\Anaconda3\envs\test_fastai\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):

D:\c_progs\Anaconda3\envs\test_fastai\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):

D:\c_progs\Anaconda3\envs\test_fastai\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
63 try:
64 reduction.dump(prep_data, to_child)
---> 65 reduction.dump(process_obj, to_child)
66 finally:
67 set_spawning_popen(None)

D:\c_progs\Anaconda3\envs\test_fastai\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #

PicklingError: Can't pickle <function crop_pad at 0x0000005CBFE58EA0>: it's not the same object as fastai.vision.transform.crop_pad

Thanks to stats, sgugger and of course Jeremy, I now have fastai v1 “working” on my Windows machine.

Unfortunately the data loader has to be run with num_workers=0 at the moment, as pointed out by sgugger, because there is a problem with multiprocessing on Windows. This slows down training significantly, but everything still works well enough for me to run through all the example notebooks except text.ipynb. This means that, for example, in the dogs_cats notebook I had to change the data creation to

data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), tfms=imagenet_norm, size=224, num_workers=0)
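For reference, with that change the training cells from the notebook then run, just more slowly. A rough sketch of the relevant cells (path and the tvm alias for torchvision.models come from earlier cells in the notebook, and the exact API may differ between fastai 1.0.x releases):

from fastai import *
from fastai.vision import *
import torchvision.models as tvm  # as in the notebook

# path points at the dogs/cats data, set up in the earlier notebook cells
data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), tfms=imagenet_norm,
                                  size=224, num_workers=0)
learn = ConvLearner(data, tvm.resnet34, metrics=accuracy)
learn.fit_one_cycle(1)  # runs, just slower with a single worker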

If anyone else is struggling, the steps I found which worked on Windows 8.1 are below:

  1. Create and activate a test environment

    conda create -n test_fastai_v_1 python=3.6

    activate test_fastai_v_1

  2. Install CUDA 9.2 and cuDNN 7.2

  3. Install pytorch

  • From source following their instructions, with Visual Studio Community 2017 (it may take around 3 hours), or you can try

  • Using this wheel, which I built following their instructions, although I can’t guarantee that this will work.

  4. Install torchvision following the instructions under From Source.

  5. Install fastai following the Developer install instructions, excluding

    tools/run-after-git-clone

Using this method I have successfully run through all the example notebooks except text.ipynb. If anyone has a better solution, please let me know.

Hi, I am new to Python and would like to know if there is a “simple” fix to allow the fastai data loader to run in more than a single process on Windows?
That is, creating a data loader as below

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs, num_workers=0)

with

num_workers > 0
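In other words, ideally something like the call below would work directly (the 4 is just an illustrative value); as far as I can tell, at the moment it hits the same PicklingError shown above when the worker processes are spawned:

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs, num_workers=4)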

My understanding of the problem with doing this is that

  1. multiprocessing on Windows requires the data aug functions to be picklable, because Windows can only spawn new processes, unlike Linux where both fork and forkserver can be used, and

  2. the data aug functions are decorated, meaning that they cannot be pickled (a minimal sketch of this mechanism follows below).
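My reading of the “it’s not the same object” PicklingError above is that pickle serializes a function by looking up its module-level name, and a decorator that rebinds that name to a wrapper object breaks the lookup. A minimal sketch of the same mechanism in plain Python (outside fastai, just to illustrate point 2):

import pickle

def crop_pad(x):      # stand-in for the original transform function
    return x

raw_func = crop_pad   # keep a reference to the plain function
crop_pad = object()   # a decorator such as TfmPixel rebinds the module-level
                      # name to a wrapper object, roughly like this

try:
    pickle.dumps(raw_func)    # pickle looks up __main__.crop_pad ...
except pickle.PicklingError as e:
    print(e)                  # ... finds the wrapper instead, so it fails:
                              # "it's not the same object as __main__.crop_pad"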

If this is incorrect then please let me know. In an attempt to understand the inner workings of Python better, I have experimented with removing the TfmPixel decorator from the crop method by renaming crop to crop_method and including

crop = TfmPixel(crop_method)

meaning that

pickle.dumps(crop)

appears to work; however, the method also has a @singledispatch decorator, which gives the following error
AttributeError: Can't pickle local object 'singledispatch.<locals>.register'
when I try to iterate over the data loader.

At this point I realized I should probably ask whether there is a “simple” solution to this problem before I invest any more time.
