Developer chat

sam2 · November 18, 2018, 5:02pm

In V1>basic_data->DataBunch()>def create(…

there is:

line 93 val_bs = (bs*3)//2

Is this intentional?

My 4GB GPU often went OOM right after finishing the training loop of an epoch, at the start of validation. I found this line was the culprit. In my local repo I have set val_bs=bs to work around it.

Curious why it is the way it is
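
For reference, a minimal sketch of what that line computes. The helper name default_val_bs is made up for illustration; only the expression (bs*3)//2 comes from the source.

```python
def default_val_bs(bs: int) -> int:
    """Validation batch size as computed by `val_bs = (bs*3)//2`."""
    return (bs * 3) // 2  # integer division, so the result is rounded down

# The validation batch ends up 1.5x the training batch (rounded down).
for bs in (16, 32, 64):
    print(bs, default_val_bs(bs))
```

So a training bs that just fits in memory can be paired with a validation batch that is half again as large, which is consistent with the OOM showing up only when validation starts.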

digitalspecialists · November 18, 2018, 5:31pm

I have found this too: crashing at the validation stage, which I had never experienced with v0.7.
Maybe it is because I often use fp16 and validation may not (speculation).
You can see it discussed in the topic “Different batch_size for train and valid data loaders”.

sam2 · November 18, 2018, 5:36pm

Very interesting. I will leave it at val_bs=bs for the time being, until I gather enough courage to bump it up in 10% increments.

stas · November 18, 2018, 7:09pm

Use https://github.com/stas00/ipyexperiments to speed up the bs tune up.

If you have any follow ups about this new tool, please use this thread to discuss it.

sam2 · November 18, 2018, 8:54pm

@stas, just tried it out! You are a life-saver!!

stas · November 19, 2018, 11:38pm

Heads up: we now have a tool to query gpu stats that fastai can support, it’s pynvml - and it’s now on both pypi and conda. So most likely it’ll soon be used by the fastai core modules (in particular tests) (and included in fastai dependencies). See the doc above for examples of use. It’s super fast!
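
A minimal sketch of querying GPU memory with pynvml, assuming the package and an NVIDIA driver are present. The helper name gpu_memory_mb is hypothetical and is not part of fastai; it returns None when pynvml or the driver is unavailable.

```python
from typing import Dict, Optional

def gpu_memory_mb(index: int = 0) -> Optional[Dict[str, int]]:
    """Return total/used/free memory in MB for one GPU, or None if unavailable."""
    try:
        import pynvml
    except ImportError:
        return None  # pynvml is not installed
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None  # no NVIDIA driver or NVML library found
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # fields are in bytes
        return {k: int(getattr(info, k)) // 2**20 for k in ("total", "used", "free")}
    except pynvml.NVMLError:
        return None  # e.g. no GPU with that index
    finally:
        pynvml.nvmlShutdown()

print(gpu_memory_mb())
```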

PierreO · November 20, 2018, 1:52am

Unless I’m mistaken, there’s currently no method to label a bounding box with text. Would you be interested in something like this?

sgugger · November 20, 2018, 2:26am

There is, you just have to pass classes on top of your bounding boxes and labels.

PierreO · November 20, 2018, 2:27am

Oh my bad then, sorry about that

sgugger · November 20, 2018, 2:32am

For some reason my update to the docs didn’t fully convert to HTML. You can see it here now.

fredguth · November 20, 2018, 12:59pm

I got an error in basic_train.py line 270 because it does not recognize table=True. Removing this argument from the call solves the problem.

Is it a bug?

fredguth · November 20, 2018, 1:14pm

Another potential bug:
When I create a DataBunch from a TensorDataset, as in lesson 5, and then try to create a ClassificationInterpretation, it breaks.

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(7,7))

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-f4ec02bb4041> in <module>()
      1 interp = ClassificationInterpretation.from_learner(learn)
----> 2 interp.plot_top_losses(9, figsize=(7,7))

~/Code/fastai/fastai/vision/learner.py in plot_top_losses(self, k, largest, figsize)
     96         "Show images in `top_losses` along with their prediction, actual, loss, and probability of predicted class."
     97         tl_val,tl_idx = self.top_losses(k,largest)
---> 98         classes = self.data.classes
     99         rows = math.ceil(math.sqrt(k))
    100         fig,axes = plt.subplots(rows,rows,figsize=figsize)

~/Code/fastai/fastai/basic_data.py in __getattr__(self, k)
     99         return cls(*dls, path=path, device=device, tfms=tfms, collate_fn=collate_fn)
    100 
--> 101     def __getattr__(self,k:int)->Any: return getattr(self.train_dl, k)
    102     def dl(self, ds_type:DatasetType=DatasetType.Valid)->DeviceDataLoader:
    103         "Returns appropriate `Dataset` for validation, training, or test (`ds_type`)."

~/Code/fastai/fastai/basic_data.py in __getattr__(self, k)
     22 
     23     def __len__(self)->int: return len(self.dl)
---> 24     def __getattr__(self,k:str)->Any: return getattr(self.dl, k)
     25 
     26     @property

~/Code/fastai/fastai/basic_data.py in DataLoader___getattr__(dl, k)
      6 __all__ = ['DataBunch', 'DeviceDataLoader', 'DatasetType']
      7 
----> 8 def DataLoader___getattr__(dl, k:str)->Any: return getattr(dl.dataset, k)
      9 DataLoader.__getattr__ = DataLoader___getattr__
     10 

AttributeError: 'TensorDataset' object has no attribute 'classes'

sgugger · November 20, 2018, 2:11pm

That conversation has already been had in another topic: if you’re not using fastai to create your dataset, don’t expect all fastai functionality to work on it. Here the problem is that your TensorDataset doesn’t have the classes attribute that ClassificationInterpretation requires.
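
The traceback above shows attribute lookups being forwarded from the DataBunch to the train loader and finally to the dataset. A rough pure-Python sketch of that forwarding follows; every class here is a hypothetical stand-in, not the fastai or PyTorch implementation.

```python
class PlainDataset:                      # stands in for a TensorDataset
    def __init__(self, items): self.items = items

class LabelledDataset(PlainDataset):     # a dataset that does carry `classes`
    def __init__(self, items, classes):
        super().__init__(items)
        self.classes = classes

class Loader:                            # stands in for the data loader layer
    def __init__(self, dataset): self.dataset = dataset
    # Called only when normal lookup fails: forward to the dataset.
    def __getattr__(self, k): return getattr(self.dataset, k)

class Bunch:                             # stands in for DataBunch
    def __init__(self, train_dl): self.train_dl = train_dl
    # Called only when normal lookup fails: forward to the train loader.
    def __getattr__(self, k): return getattr(self.train_dl, k)

ok = Bunch(Loader(LabelledDataset([1, 2], ["cat", "dog"])))
print(ok.classes)                        # found on the dataset

broken = Bunch(Loader(PlainDataset([1, 2])))
try:
    broken.classes
except AttributeError as e:
    print(e)                             # the error surfaces from the dataset
```

The lookup only succeeds if the object at the end of the chain defines the attribute, which is why the error message names the dataset class rather than the DataBunch.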

sgugger · November 20, 2018, 6:18pm

After experimenting a bit, and going back and forth, we finally settled on adding a MAJ token: each word that begins with a capital is lower cased (as before) but we add xxmaj in front of it to tell the model. It appears to help a little bit.
There is a new pretrained model to match that change: you’ll find it in URLs.WT103_1
The text example notebook has been updated to use it (and went from 79% to 84.5% accuracy in the process!)
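
A minimal sketch of the rule described above, assuming the text is already split into tokens. The function name mark_capitals is made up for illustration, and the actual fastai tokenizer rule may differ in its details.

```python
from typing import List

TK_MAJ = "xxmaj"  # marker token named in the post above

def mark_capitals(tokens: List[str]) -> List[str]:
    """Lower-case each token that starts with a capital and put TK_MAJ before it."""
    out: List[str] = []
    for t in tokens:
        if t[:1].isupper():
            out += [TK_MAJ, t.lower()]
        else:
            out.append(t)
    return out

print(mark_capitals("The movie was Great".split()))
# ['xxmaj', 'the', 'movie', 'was', 'xxmaj', 'great']
```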

fredguth · November 21, 2018, 2:25pm

Sorry, I didn’t see the question in another topic before posting here.

sgugger · November 21, 2018, 3:44pm

A lot of stuff aimed at unifying the API across applications just merged:

every type of item now has a reconstruct method that does the opposite of .data: it takes the tensor data and creates the object back.
show_batch has been internally modified to actually grab a batch and then show it.
show_results now works across applications.
introducing data.export(), which saves the internal information needed for inference (classes, vocab in text, processors in tabular, etc.) in a file named ‘export.pkl’. You can then create an empty_data object by using DataBunch.load_empty(path) (where path points to the folder containing this ‘export.pkl’ file). This also works across applications.

Breaking change:
As a result ImageDataBunch.single_from_classes has been removed as the previous method is more general.
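
To illustrate the general idea of saving inference metadata to a file and restoring it later, here is a rough sketch using plain pickle. The names export_state and load_state are hypothetical; this is not how data.export() or DataBunch.load_empty are implemented, only an analogy under that assumption.

```python
import pickle
import tempfile
from pathlib import Path

def export_state(path, state, fname="export.pkl"):
    """Save the metadata needed at inference time into `path/fname`."""
    with open(Path(path) / fname, "wb") as f:
        pickle.dump(state, f)

def load_state(path, fname="export.pkl"):
    """Load the metadata back from `path/fname`."""
    with open(Path(path) / fname, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as d:
    export_state(d, {"classes": ["cat", "dog"]})
    restored = load_state(d)

print(restored)
```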

piotr.czapla · November 21, 2018, 9:06pm

Awesome! Sylvain, can you point me to the scripts you are using to create the pre-trained model? I’d like to see if I can get some improvements using BiLM training and QRNN.

MicPie · November 25, 2018, 3:28pm

I wrote a small Tensorboard callback to visualize the metrics and the parameter/gradient distributions and histograms: https://nbviewer.jupyter.org/github/MicPie/fastai_course_v3/blob/master/TBLogger_v2.ipynb

It is still a work in progress, because the code needs to be polished and has only been tested with the network in the notebook.

Could this be interesting for the library? If so, how would I best incorporate the needed Logger class (with the Copyleft license)?

Feedback, suggestions, tips, etc. are highly appreciated!

PS: I don’t know if switching to TensorboardX would be a better choice. Maybe somebody has already worked with TensorboardX and can share their experience?

sgugger · November 25, 2018, 8:21pm

I put the latest notebook I used to pretrain a QRNN here. Didn’t fully test the true_wd=False so you’ll have to add that.