Developer chat

(sergii makarevych) #670

Let’s add some :+1: to Stas’s pull request!

1 Like

#671

Read about this on Twitter but just wanted to stop by and say unbelievable, outstanding, amazing job Stas :slight_smile: :+1: Kudos to you!

2 Likes

(Pierre Ouannes) #672

That’s amazing, Stas, very, very nice work!

1 Like

(Kaspar Lund) #673

that is good work thz**64

1 Like

(Johannes Laute) #674

This is amazing! Many thanks

1 Like

(Stas Bekman) #675

Any idea why I get no follow-up on this jupyter issue? I’m dumbfounded: I broke it down to an easy-to-reproduce minimal notebook, confirmed it with a generic third-party install, bisected to find that the problem started with exactly python 3.6.0, and spent hours bisecting on components and custom config to rule them all out, and still not a peep from the jupyter notebook devs :frowning:

Does this problem not bother you at all?

If it does please vote on the issue, perhaps then it’d get some attention.

Or perhaps you’re not using the TOC extension, the magic follow-the-execution-focus feature, and the jump-to-currently-executing-cell shortcut. If so, you’re missing out on being way more efficient than manually scrolling around notebooks that are at times very long. Except all 3 are problematic when this bug gets triggered. TOC is still useful despite the bug, but the other two can’t work with the bug.

Clearly it’s been around for at least 2 years now (since the 3.6.0 release). The bug’s manifestation seems to depend on what each cell contains. The reproducible notebook always manifests the bug, so it should be easy to verify.

3 Likes

(Kerem Turgutlu) #676

Truly thankful :smiley:

1 Like

(Kaspar Lund) #677

It’s a really annoying issue. I run pretty long jobs, and when I use run all cells I cannot see the progress until I get to a cell using fastaiprogress, which does work. I have upvoted.

1 Like

(Kaspar Lund) #678

We use BOS but not EOS in the language model tokenization. Isn’t this inconsistent?
When we read the tokens going forward, we use BOS to signal that a new sentence begins. Shouldn’t we also use EOS, so that when we read the tokens backwards, EOS signals that a new reversed sentence begins?

0 Likes

(Bobak Farzin) #679

I have been thinking about the BOS as the start of the input: the RNN can “reset” whatever is needed for the next pass and proceed from there. Whether the tokens are forward or reversed doesn’t matter, I think; what matters is that you have something that says, “This is a new sequence.” So, when I try the backwards tokens, I reverse them all and then have a BOS at the start of the reversed series. Maybe I got that wrong.

I have also been curious about why we don’t reset_state() when we get a new BOS (or EOS in your case) to be sure we are starting “clean” with the new sentence. That would seem right to me, but I have not tried it out to see if you get a better model.
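A minimal sketch of the reversal described above, assuming tokens are plain Python strings. The marker name `xxbos` and the helper `reverse_with_bos` are illustrative only and are not taken from the tokenizer code being discussed.

```python
BOS = "xxbos"  # assumed marker name, used here only for illustration


def reverse_with_bos(tokens, bos=BOS):
    """Reverse a token list while keeping a single BOS marker at the front."""
    body = [t for t in tokens if t != bos]  # drop any existing marker
    return [bos] + body[::-1]               # reversed text, marker first


forward = [BOS, "the", "cat", "sat"]
backward = reverse_with_bos(forward)
print(backward)  # ['xxbos', 'sat', 'cat', 'the']
```

Either way the model sees “new sequence” first, regardless of reading direction.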

I have a simple flag added to the spacy tokenizer that would allow you to get reversed tokens. Should I put in a PR for that?

0 Likes

#680

It’s hard to do that in practice because you get BOS in one of your batches but not all of them.
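To make this concrete, here is a rough sketch that assumes the token stream is cut into `bs` contiguous rows, as is common for language-model batches. The helpers `batchify` and `bos_mask` and the marker name `xxbos` are hypothetical, not fastai code.

```python
BOS = "xxbos"  # assumed marker name, for illustration only


def batchify(stream, bs):
    """Split one long token stream into bs equal-length rows (remainder dropped)."""
    n = len(stream) // bs
    return [stream[i * n:(i + 1) * n] for i in range(bs)]


def bos_mask(rows, t):
    """For time step t, report which rows of the batch see a BOS token."""
    return [row[t] == BOS for row in rows]


stream = [BOS, "a", "b", "c", BOS, "d", "e", BOS, "f", "g", "h", "i"]
rows = batchify(stream, bs=3)

print(bos_mask(rows, 0))  # [True, True, False]
print(bos_mask(rows, 3))  # [False, True, False]
```

At any given step only some rows start a new sequence, so a reset would presumably have to be applied per row rather than to the whole hidden state at once.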

1 Like

(Florian Mutel) #681

Downloaded fastai on a Windows machine today, tried it for image classification, and noticed that there is a big performance issue due to the use of the torch dataloader. It’s either:

  • Set num_workers to 0, which doesn’t use the GPU optimally,
  • Set num_workers to 8, which uses the GPU at maximum but adds a few minutes (~5) at the beginning of each epoch for Windows to set up the workers …

Any workaround for this?
Refer to this issue for the current state (torch side).
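The trade-off above can at least be encoded as a per-platform default. This is only a sketch: the helper `pick_num_workers` and its `windows_default` parameter are hypothetical, and it does not fix the slow worker start-up on Windows; it just picks one of the two options listed.

```python
import os
import sys


def pick_num_workers(platform_name=None, cpu_count=None, windows_default=0):
    """Choose a DataLoader worker count: a fixed value on Windows, up to 8 elsewhere."""
    platform_name = sys.platform if platform_name is None else platform_name
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if platform_name.startswith("win"):
        # Avoids the per-epoch worker start-up cost, at the price of GPU utilisation.
        return windows_default
    return min(8, cpu_count)


print(pick_num_workers("win32", 16))  # 0
print(pick_num_workers("linux", 16))  # 8
```

The returned value would be passed as the `num_workers` argument of `torch.utils.data.DataLoader`.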

0 Likes

#682

We don’t have any workaround yet, no.

0 Likes

(Andrew Ayres) #683

I’m experiencing the same error.

I followed the pytorch.org instructions to install from source in order to use CUDA on macOS (with an eGPU-hosted NVIDIA GPU). CUDA seems to be working:

In:
import torch
torch.cuda.set_device(0)
torch.cuda.is_available()
Out: True

But when I try to run the notebook cell:

data.show_batch(rows=3, figsize=(7,6))

in the lesson1-pets.ipynb notebook, I get the following RuntimeError:

fastai/vision/transform.py", line 194, in _find_coeffs
return torch.gesv(B,A)[0][:,0]
RuntimeError: B should have at least 2 dimensions, but has 1 dimensions instead

The exception was raised here:

/torch/utils/data/dataloader.py(541)_process_next_batch()
539 self._put_indices()
540 if isinstance(batch, _utils.ExceptionWrapper):
--> 541 raise batch.exc_type(batch.exc_msg)
542 return batch

I’m running TORCH_VERSION 1.1.0. You mentioned the bug was fixed in the latest version of PyTorch. When I check for the latest PyTorch release at https://github.com/pytorch/pytorch/releases, I see it’s listed as v1.0.0, released on 7th Dec 2018, so I’m not sure how I ended up with v1.1.0 by cloning with:

git clone --recursive https://github.com/pytorch/pytorch

When I begin the install, I see:

Building wheel torch-1.1.0a0+04b8a2f

Which seems to correspond with the 1.1.0a0 version number in https://github.com/pytorch/pytorch/blob/master/setup.py - merged two days ago.
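A small sketch of how that version string can be read, assuming it follows the usual PEP 440 layout (release, optional pre-release tag, optional `+local` label). The helper `parse_version` is written for this illustration and only handles this simple shape.

```python
import re

# release "X.Y.Z", optional pre-release such as "a0", optional "+local" label
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?(?:\+(\w+))?$")


def parse_version(text):
    """Split a simple version string into release, pre-release and local parts."""
    m = _VERSION_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, pre_kind, pre_num, local = m.groups()
    return {
        "release": (int(major), int(minor), int(patch)),
        "pre": (pre_kind, int(pre_num)) if pre_kind else None,
        "local": local,
    }


built = parse_version("1.1.0a0+04b8a2f")
released = parse_version("1.0.0")

print(built["release"] > released["release"])  # True
print(built["pre"], built["local"])            # ('a', 0) 04b8a2f
```

Read this way, a build from master is an alpha of the next release, tagged with what looks like a commit hash, so it is expected to be ahead of the latest tagged release.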

If you have any advice about how I could get this error resolved, I’d appreciate it.

0 Likes

#684

I meant it has been fixed in the master of fastai. So with v1.0.41 you shouldn’t have that bug.

0 Likes

(Andrew Ayres) #685

Thanks Sylvain! All working now :slight_smile: (after updating Spacy to v2.0.18, then fastai to v1.0.41).

0 Likes

(Kerem Turgutlu) #686

Actually, I am still getting OOM. I followed the steps above, started a fresh kernel, set an enormous batch size, and got OOM. Am I missing something? Thanks!

0 Likes

(Stas Bekman) #687

In all the excitement it’s easy to miss the point of this discovery. Nobody can eliminate the OOM situation until someone comes out with a bottomless card.

So you will still have just as many OOM events as before. The difference is that now you can recover from it and not need to restart the notebook. Now you can just reduce the batch size (or other parameters) and re-run the same cell.

I will be writing proper documentation shortly. I’m just finishing up some improvements to the code we use in fastai.
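The recover-and-retry idea can be sketched as a loop. This is not the fastai or ipython workaround itself: the helper `fit_with_smaller_bs` is hypothetical, the OOM is simulated by `fake_fit`, and it assumes the error arrives as a `RuntimeError` whose message contains “out of memory”.

```python
def fit_with_smaller_bs(fit_fn, bs, min_bs=1):
    """Call fit_fn(bs), halving bs after each out-of-memory error."""
    while bs >= min_bs:
        try:
            return bs, fit_fn(bs)
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # unrelated error: do not swallow it
            bs //= 2   # recoverable: try again with a smaller batch
    raise RuntimeError("out of memory even at the minimum batch size")


def fake_fit(bs, limit=32):
    """Stand-in for a training call; pretends the card fits at most `limit` items."""
    if bs > limit:
        raise RuntimeError("CUDA out of memory (simulated)")
    return f"trained with bs={bs}"


used_bs, result = fit_with_smaller_bs(fake_fit, bs=256)
print(used_bs, result)  # 32 trained with bs=32
```

In a notebook the same thing happens by hand: lower the value and re-run the cell.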

Besides, if you’re using fastai, you don’t need to patch ipython. Just use the fastai git master version and the workaround is already there (for fit() functions at the moment).

2 Likes

(Kerem Turgutlu) #688

Yep, I intended to use it for fit methods. I re-ran the cells, but I will update fastai and try again. Thanks a lot!

1 Like

(Joseph Catanzarite) #689

In my opinion, making code concise is not as important as making it readable.

0 Likes