Free GPU credits for Fast.ai Courses

XFFXFF · August 11, 2018, 12:14am

@snarkai
I ran python -m spacy download en, and I got a “Permission denied” message.

I tried to solve it by following https://github.com/snarkai/snark-doc, but it didn’t work.

snarkai · August 11, 2018, 12:14am

@XFFXFF try running the same command with sudo?

XFFXFF · August 11, 2018, 12:17am

XFFXFF · August 11, 2018, 12:40am

@snarkai
I solved it by running sudo /opt/conda/envs/fastai/bin/python -m spacy download en
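
A likely reason the full path matters: sudo usually resets PATH, so a bare python under sudo may not be the interpreter from the fastai conda environment. The sketch below only shows how one might check which interpreter is running and whether the current user can write to its site-packages directory. The helper name describe_interpreter is made up for illustration and is not part of snark or spaCy.

```python
import os
import sys
import sysconfig


def describe_interpreter():
    """Report the running interpreter and whether its site-packages is writable.

    Hypothetical helper for illustration only.
    """
    site_packages = sysconfig.get_paths()["purelib"]
    return {
        "executable": sys.executable,        # e.g. /opt/conda/envs/fastai/bin/python
        "site_packages": site_packages,
        "writable": os.access(site_packages, os.W_OK),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If writable is False for the environment's interpreter, a “Permission denied” during installation would be expected without sudo.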

snarkai · August 11, 2018, 12:47am

@XFFXFF thanks for letting me know! I will add this to the troubleshooting docs.

prajjwal1 · August 11, 2018, 3:45am

Refer to this issue,

I think it has been solved in PyTorch 0.4.

prajjwal1 · August 11, 2018, 3:46am

sudo doesn’t work, and neither does pip. Activating the conda environment and then installing doesn’t work either. How do I upgrade PyTorch to 0.4?

XFFXFF · August 11, 2018, 11:09am

@prajjwal1
try running sudo /opt/conda/envs/fastai/bin/pip

diskandar · August 12, 2018, 2:41pm

Hello, the snark GPU had been working, but when I start it again it gives me this error:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/snark/cli/connect.py", line 44, in connect
raise NotImplemented()
TypeError: 'NotImplementedType' object is not callable

The pod is started, but it will not connect to localhost:8888.

Thank you,
Danny
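
For readers puzzled by the TypeError in the traceback above: in Python, NotImplemented is a constant, not an exception class, so calling it fails before anything can be raised. The code at that line presumably meant NotImplementedError, though that is a guess about the snark source. A minimal sketch of the difference, with made-up function names and a made-up error message:

```python
def connect_buggy():
    # NotImplemented is a singleton constant; calling it raises
    # TypeError: 'NotImplementedType' object is not callable
    raise NotImplemented()


def connect_fixed():
    # NotImplementedError is the exception class intended for this purpose.
    # The message text here is hypothetical.
    raise NotImplementedError("this connection mode is not supported")


def error_name(fn):
    """Call fn and return the name of the exception type it raises, if any."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(error_name(connect_buggy))
    print(error_name(connect_fixed))
```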

diskandar · August 12, 2018, 2:42pm

Just to be clear, it used to work; I have used it several times.

But now it has suddenly stopped working.

snarkai · August 12, 2018, 4:36pm

Hey @diskandar, what’s the version of the snark package? Can you try pip3 install --upgrade --no-cache snark

Another possibility is that you already have something running on port 8888. Run this to check:
$ sudo netstat -tulpn | grep 8888
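
If netstat is not available (for example on macOS, where the flags differ), the same check can be sketched in Python by attempting a TCP connection to the port. The function name port_in_use is made up for illustration; it only tells you that something is listening, not which process.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8888 in use:", port_in_use(8888))
```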

diskandar · August 12, 2018, 7:28pm

Hi, thank you, it now works after pip3 install --upgrade --no-cache snark,
although I have no idea why it happened. This is what is frustrating about learning AI.

There are just so many things that can go wrong in the setup phase, even before I touch the real AI stuff. @jeremy @rachel

But thank you again!

snarkai · August 12, 2018, 7:45pm

@diskandar We’re iterating fast and pushing new updates, so old code becomes deprecated and the package needs to be updated. Apologies for the inconvenience. We aim to make setup effortless: run
pip3 install --upgrade --no-cache snark
snark start --pod_type fast.ai --jupyter
and you can access our cloud GPU servers in your own browser.

danebalia · August 13, 2018, 4:25am

Does anyone know the new promo code for Fast.ai learners? I’ve tried at least 4:
FASTAIG5H7
UWQMTGL
FASTAI6GKZ
FASTAI15

That is for Paperspace.

Kasianenko · August 13, 2018, 5:56am

It is in the first post

corvus · August 13, 2018, 9:00am

@snarkai the kernel keeps dying when I try to run the model learning in lesson1.ipynb

Kasianenko · August 13, 2018, 10:37am

Hi, I’ve been running the lesson4-imdb notebook on a p106 instance

and ran into this error:

learner.fit(3e-3, 4, wds=1e-6, cycle_len=1, cycle_mult=2)


---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-357a8890c905> in <module>()
----> 1 learner.fit(3e-3, 4, wds=1e-6, cycle_len=1, cycle_mult=2)

~/courses/dl1/fastai/learner.py in fit(self, lrs, n_cycle, wds, **kwargs)
    302         self.sched = None
    303         layer_opt = self.get_layer_opt(lrs, wds)
--> 304         return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
    305 
    306     def warm_up(self, lr, wds=None):

~/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
    249             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
    250             swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 251             swa_eval_freq=swa_eval_freq, **kwargs)
    252 
    253     def get_layer_groups(self): return self.models.get_layer_groups()

~/courses/dl1/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, visualize, **kwargs)
    159 
    160         if not all_val:
--> 161             vals = validate(model_stepper, cur_data.val_dl, metrics, seq_first=seq_first)
    162             stop=False
    163             for cb in callbacks: stop = stop or cb.on_epoch_end(vals)

~/courses/dl1/fastai/model.py in validate(stepper, dl, metrics, seq_first)
    233         for (*x,y) in iter(dl):
    234             y = VV(y)
--> 235             preds, l = stepper.evaluate(VV(x), y)
    236             batch_cnts.append(batch_sz(x, seq_first=seq_first))
    237             loss.append(to_np(l))

~/courses/dl1/fastai/model.py in evaluate(self, xs, y)
     77         preds = self.m(*xs)
     78         if isinstance(preds,tuple): preds=preds[0]
---> 79         return preds, self.crit(preds, y)
     80 
     81 def set_train_mode(m):

/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce)
   1159         >>> loss.backward()
   1160     """
-> 1161     return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
   1162 
   1163 

/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel)
    784     if dim is None:
    785         dim = _get_softmax_dim('log_softmax', input.dim(), _stacklevel)
--> 786     return torch._C._nn.log_softmax(input, dim)
    787 
    788 

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58

I read on the forum that this notebook can cause this kind of error, so I decreased the batch size from 64 to 32 and, while the 1080 GPUs were not available, started training the network again. After 1.5 hours it gave the same error, but I lost track of whether it had finished one epoch or not. Any suggestions other than decreasing the batch size?
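
One way to avoid babysitting a run like this is to retry automatically with a smaller batch size whenever the out-of-memory error appears. The sketch below is a generic illustration, not fastai code: fit_with_backoff and train_fn are made-up names, and train_fn is assumed to rebuild the data object and learner for the given batch size before training.

```python
def fit_with_backoff(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size), halving the batch size after each OOM error.

    train_fn is a hypothetical callable that builds the data and model for
    the given batch size and trains. Returns (batch_size_used, result).
    """
    while True:
        try:
            return batch_size, train_fn(batch_size)
        except RuntimeError as exc:
            is_oom = "out of memory" in str(exc)
            # Re-raise anything that is not an OOM error, or give up once
            # halving would go below the minimum batch size.
            if not is_oom or batch_size // 2 < min_batch_size:
                raise
            batch_size //= 2
```

In practice GPU memory held by the failed attempt may also need to be released before retrying, so treat this as a starting point only.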

prajjwal1 · August 13, 2018, 12:13pm

The error remains the same:

Installing collected packages: pytorch
  Running setup.py install for pytorch ... error
    Complete output from command /opt/conda/envs/fastai/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-pzov72we/pytorch/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-8ylbpu2q/install-record.txt --single-version-externally-managed --compile:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-pzov72we/pytorch/setup.py", line 13, in <module>
        raise Exception(message)
    Exception: You should install pytorch from http://pytorch.org

    ----------------------------------------
Command "/opt/conda/envs/fastai/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-pzov72we/pytorch/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-8ylbpu2q/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-pzov72we/pytorch/
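
A note on the exception in this log: the project named pytorch on PyPI is only a placeholder that raises this message, and the installable package is named torch. So something like sudo /opt/conda/envs/fastai/bin/pip install --upgrade torch may be closer to what is needed, although whether that yields a working 0.4 build on this pod is an assumption. To see which version the environment currently has, a small check along these lines can be run with the environment's interpreter (installed_version is a made-up helper name):

```python
import importlib
import importlib.util


def installed_version(module_name):
    """Return a top-level module's __version__, "unknown" if it has none,
    or None if the module is not installed."""
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return getattr(module, "__version__", "unknown")


# Usage, from /opt/conda/envs/fastai/bin/python:
#   print(installed_version("torch"))
```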

moth · August 13, 2018, 1:50pm

I have the same problem

fredguth · August 13, 2018, 3:32pm

You will need to decrease the batch size even more.
But to avoid losing training progress, you can pass best_save_name='savedmodel' to the fit function. With that, fit will save the trained model in a file named savedmodel whenever your metric improves.
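
Applied to the call from the earlier post, that would look like learner.fit(3e-3, 4, wds=1e-6, cycle_len=1, cycle_mult=2, best_save_name='savedmodel'); best_save_name appears in the fit_gen signature shown in the traceback. The idea behind it can be sketched in a few lines of plain Python. This is not fastai's implementation: BestSaver and save_fn are made-up names, and the assumption here is that a higher metric is better.

```python
class BestSaver:
    """Save a checkpoint whenever the tracked metric improves.

    save_fn is a hypothetical callable that writes the model under the
    given name (for example, a wrapper around a learner's save method).
    """

    def __init__(self, save_fn, name, higher_is_better=True):
        self.save_fn = save_fn
        self.name = name
        self.higher_is_better = higher_is_better
        self.best = None

    def on_epoch_end(self, metric):
        """Record the metric and save if it beats the best so far."""
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = metric > self.best
        else:
            improved = metric < self.best
        if improved:
            self.best = metric
            self.save_fn(self.name)
        return improved
```

With this pattern, a crash late in training still leaves the best checkpoint so far on disk.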