Same issue for me when initializing TextLMDataBunch.from_df().
Most likely you need to add num_workers=0 (required on Windows):
from fastai.vision.all import *

# num_workers=0 avoids the Windows multiprocessing issue
dls = ImageDataLoaders.from_df(df, PATH_DATA,
                               fn_col="FILENAME",
                               label_col="LABEL",
                               num_workers=0,
                               bs=4)
Hi, I am facing an issue and have a basic (and most probably stupid) question. I am a beginner with FastAI and am trying to run it on Windows 10. I installed it with Anaconda, following the instructions at https://docs.fast.ai/. When trying to train the model on the GPU, I get an out-of-memory error (probably because the GPU has only 2 GB of memory). Is there a workaround for this?
Otherwise, is it possible to use FastAI without a GPU?
Welcome @shobhit - reducing the batch size (bs) will reduce the amount of GPU memory used, so I’d try that first. Using the CPU will be pretty slow - and I can’t recall how to force this but maybe change the default device?
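Something along these lines should force the CPU. This is only a sketch assuming the defaults.use_cuda / default_device helpers behave as in recent fastai 2.x releases, so check the docs for your version:

import torch
from fastai.torch_core import defaults, default_device

defaults.use_cuda = False        # ask fastai to place everything on the CPU
print(default_device())          # should now report device(type='cpu')

# Or move an existing Learner and its data over explicitly:
# learn.dls = learn.dls.cpu()
# learn.model = learn.model.cpu()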
Thank you @brismith , reducing the batch size worked.
I am running the example script and am getting an error.
RuntimeError: cuda runtime error (801) : operation not supported at ..\torch/csrc/generic/StorageSharing.cpp:258
I’ve tried lots of googling but have not been able to find a solution. Any help is appreciated.
Example Script: fastai/dataloader_spawn.py at master · fastai/fastai · GitHub
EDIT: Using Conda and a GTX 1050. I’ve tried installing the CUDA drivers linked here, and also not using them, since there was no mention of them on the GitHub page. Are they necessary?
EDIT 2: num_workers=0 fixes this, but I can’t figure out how to get multiple workers working, even though I’m not using Jupyter notebooks (see the sketch after the traceback below).
(ml) C:\Users\Ben\Desktop\Python Scripts\BiomeRecigonition>python Example.py
C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\torch\_tensor.py:1023: UserWarning: torch.solve is deprecated in favor of torch.linalg.solveand will be removed in a future PyTorch release.
torch.linalg.solve has its arguments reversed and does not return the LU factorization.
To get the LU factorization see torch.lu, which can be used with torch.lu_solve or torch.lu_unpack.
X = torch.solve(B, A).solution
should be replaced with
X = torch.linalg.solve(A, B) (Triggered internally at ..\aten\src\ATen\native\BatchLinearAlgebra.cpp:760.)
ret = func(*args, **kwargs)
THCudaCheck FAIL file=..\torch/csrc/generic/StorageSharing.cpp line=258 error=801 : operation not supported
Traceback (most recent call last):
File "C:\Users\Ben\Desktop\Python Scripts\BiomeRecigonition\Example.py", line 37, in <module>
learn.lr_find()
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\callback\schedule.py", line 282, in lr_find
with self.no_logging(): self.fit(n_epoch, cbs=cb)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 221, in fit
self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 163, in _with_events
try: self(f'before_{event_type}'); f()
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 212, in _do_fit
self._with_events(self._do_epoch, 'epoch', CancelEpochException)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 163, in _with_events
try: self(f'before_{event_type}'); f()
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 206, in _do_epoch
self._do_epoch_train()
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 198, in _do_epoch_train
self._with_events(self.all_batches, 'train', CancelTrainException)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 163, in _with_events
try: self(f'before_{event_type}'); f()
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\learner.py", line 169, in all_batches
for o in enumerate(self.dl): self.one_batch(*o)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\fastai\data\load.py", line 109, in __iter__
for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
w.start()
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
File "C:\Users\Ben\anaconda3\envs\ml\lib\site-packages\torch\multiprocessing\reductions.py", line 247, in reduce_tensor
event_sync_required) = storage._share_cuda_()
RuntimeError: cuda runtime error (801) : operation not supported at ..\torch/csrc/generic/StorageSharing.cpp:258
(ml) C:\Users\Ben\Desktop\Python Scripts\BiomeRecigonition>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Ben\anaconda3\envs\ml\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
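For reference, Windows starts DataLoader workers with the "spawn" method, which re-imports the script in every worker process, so any training code has to sit behind an if __name__ == '__main__': guard before num_workers > 0 can work at all. Below is a minimal sketch of that layout; the dataset, label function, and hyperparameters are placeholders, not the actual Example.py. Note that even with the guard, error 801 can persist, because Windows does not support sharing CUDA storage between processes:

from fastai.vision.all import *

def is_cat(f):
    # toy labelling rule for the Oxford-IIIT Pets filenames
    return f.name[0].isupper()

def main():
    path = untar_data(URLs.PETS)/'images'            # placeholder dataset
    dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2,
        label_func=is_cat, item_tfms=Resize(224),
        num_workers=2)                               # multiple workers
    learn = cnn_learner(dls, resnet18, metrics=error_rate)
    learn.lr_find()

if __name__ == '__main__':   # required on Windows: spawned workers re-import this file
    main()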
Without the NVIDIA CUDA drivers, it’s likely that when you installed PyTorch you got the CPU-only build rather than the GPU build. See Start Locally | PyTorch for the correct install command.
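A quick way to check which build is installed (plain PyTorch calls, nothing fastai-specific):

import torch

print(torch.__version__)            # pip CPU-only builds are often tagged "+cpu"
print(torch.cuda.is_available())    # False means PyTorch cannot see a CUDA device
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))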
Hi all,
I’m testing my workflow on a Windows machine after previously working solely on Ubuntu, and am getting a weird pause when using functions like learn.get_preds(), learn.fit_one_cycle(), or learn.lr_find().
The pause is consistently 25 seconds every time a function like that is called. I did not have this issue on Ubuntu. The results are fine, but the pause is driving me crazy! Any idea what could be causing this?
Thanks!
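One way to narrow down where a pause like that is spent is the standard-library profiler (not a fastai feature); this sketch assumes learn is defined at module level. If most of the 25 seconds shows up around DataLoader iteration and process startup, trying num_workers=0, as discussed earlier in this thread, would confirm that the Windows worker-spawn overhead is the cause:

import cProfile, pstats

# Profile one call that exhibits the pause and print the slowest steps.
cProfile.run('learn.lr_find()', 'lr_find.prof')
pstats.Stats('lr_find.prof').sort_stats('cumulative').print_stats(20)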