Developer chat

(Jeremy Howard (Admin)) #573

Many apologies @hiromi - I made a change in master recently that removes the need for `from fastai import *` any time you use an application (e.g. from import *). However, if you aren’t using an application, you need `from fastai.basics import *`. I’ve fixed all the notebooks now (I hope!)

The 2nd bug you came across is because I added a new `fix_dl` attribute that provides the training dataset, but without shuffling or augmentation. It only works with fastai ItemLists, however, not with generic Datasets. So I’ve fixed that now (in master) to skip creating it for generic datasets.
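Roughly, the fix amounts to something like this (a hypothetical sketch with stand-in classes, not the actual fastai code):

```python
class ItemList:
    """Stand-in for fastai's ItemList."""

class GenericDataset:
    """Stand-in for any plain torch-style Dataset."""

def make_fix_dl(train_ds):
    # only build the non-shuffled "fixed" training loader for fastai
    # ItemLists; skip it entirely for generic datasets
    if not isinstance(train_ds, ItemList):
        return None  # generic dataset: skip creating fix_dl
    # the real version would build a DataLoader over train_ds with
    # shuffle=False and no augmentation; a tuple stands in for it here
    return ('DataLoader', train_ds, {'shuffle': False})
```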

(Kaspar Lund) #574

When I do a git pull, it tries to switch to “” - the new developer branch?

Is this the new dev branch?

(Hiromi Suenaga) #575

Thank you so much for the detailed explanation!

I was hoping I could spot what had changed and fix it somehow, but I’m still a bit slow at getting my bearings. Maybe next time :slight_smile:


Breaking change: in all the text applications, the batch is now the first dimension (and sequence length the second). It won’t impact you if you’re using the fastai models/databunch, but if you were customizing things, they may need to be tweaked.

@piotr.czapla Tagging you as a warning, even if I know you’re not using master.
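In other words (shapes only, with nested lists standing in for tensors):

```python
# A text batch that used to arrive as (seq_len, batch) is now
# (batch, seq_len), so custom code that indexed the old layout
# needs a transpose.

seq_first = [            # old layout: one row per time step
    ['the', 'a'],
    ['cat', 'dog'],
    ['sat', 'ran'],
]

# new layout: one row per sequence in the batch
batch_first = [list(seq) for seq in zip(*seq_first)]
```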

(Stas Bekman) #577

Thanks for the heads up, @Kaspar. I may have done something wrong while trying to add a change to a PR, I will be extra careful next time. I deleted that branch as it has been merged, so you probably shouldn’t have a problem now.

(benedikt herudek) #578

Dear fellow developers: consider grabbing some test scripts

… if I may suggest :wink:

(Deena Blumenkrantz) #579

Is there a thread to report bugs in, so we can discuss them before submitting a new issue on GitHub?

Or should I create a whole new thread for each detected bug?

(Stas Bekman) #580

If you think it’s a bug in fastai you can submit it directly as a github issue, but you can also post here about it if you’re not sure.

(Kaspar Lund) #581

Just a heads up: I have created this issue [] with a reference to a notebook on how to fix the memory overhead in LanguageModelLoader. The proposed version uses less than 5% of the memory of the current one. Testing and validation of accuracy on the English corpus is still needed.

Would love some feedback
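Not Kaspar’s actual notebook, but the general idea can be sketched as generating batches lazily from one flat token stream, instead of materializing every (input, target) pair up front (a hypothetical simplification; plain lists stand in for arrays):

```python
def lazy_lm_batches(tokens, bs, bptt):
    """Yield (x, y) batches of shape (seq_len, bs) from a flat token list.

    Slicing one stream on the fly keeps peak memory small, since no
    full copy of all (input, target) pairs is ever built.
    """
    n = len(tokens) // bs  # tokens per parallel sequence
    # split the stream into bs parallel sequences
    cols = [tokens[i * n:(i + 1) * n] for i in range(bs)]
    for start in range(0, n - 1, bptt):
        seq_len = min(bptt, n - 1 - start)
        # x[t][b] is the token at time step start+t in sequence b;
        # y is x shifted by one step (the prediction targets)
        x = [[cols[b][start + t] for b in range(bs)] for t in range(seq_len)]
        y = [[cols[b][start + t + 1] for b in range(bs)] for t in range(seq_len)]
        yield x, y
```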

(Mikhail Grankin) #582

If you call it twice, it will wrap learn.model twice. Not an issue at runtime, but a problem if you try to save and load the model. May I propose unwrapping the model in the on_train_end method?
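The problem and the proposed fix can be sketched in isolation (Wrapper is a hypothetical stand-in for whatever wraps learn.model):

```python
class Wrapper:
    """Stand-in for whatever wraps learn.model during training."""
    def __init__(self, module):
        self.module = module

def wrap(model):
    # guarding here avoids the double wrap in the first place...
    return model if isinstance(model, Wrapper) else Wrapper(model)

def unwrap(model):
    # ...and unwrapping in on_train_end restores the bare model,
    # so save/load sees the state it expects
    while isinstance(model, Wrapper):
        model = model.module
    return model
```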

(Lankinen) #583

What do you guys think about this piece of code?

def cont_cat_split(df, max_card=20, dep_var=None):
    """Split df columns into continuous and categorical variable names.

    df: A pandas data frame that you wish to take columns from.
    max_card: Maximum cardinality of a continuous variable.
    dep_var: A dependent variable.

    Returns:
    cont_names: A list of names of continuous variables.
    cat_names: A list of names of categorical variables.

    >>> df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'a'], 'col3': [0.5, 1.2, 7.5], 'col4': ['ab', 'ab', 'o']})
    >>> df
       col1 col2 col3 col4
    0     1    a  0.5   ab
    1     2    b  1.2   ab
    2     3    a  7.5    o

    >>> cont_cat_split(df, 20, 'col4')
    (['col3'], ['col1', 'col2'])
    """
    cont_names, cat_names = [], []
    for label in df:
        if label == dep_var:
            continue
        if (len(set(df[label])) > max_card and df[label].dtype == int) or df[label].dtype == float:
            cont_names.append(label)
        else:
            cat_names.append(label)
    return cont_names, cat_names

It makes it easier to choose which columns to label as categorical and which as continuous. I know that sometimes people like to do it by hand, but often it is just a choice like this, so why not create a function for it? I was thinking that the best place might be above add_datepart() in . Can I create a pull request about this, or is it too useless?


Oh, good catch!
Yes unwrapping the model at the end is probably the best way to deal with this, and would get us loading and saving for free.


I think it would be a useful addition. If you make a PR, please note that the docstring should just take one line that explains what your function does (with arguments between backticks if they are mentioned). Then edit the doc notebook tabular.transform (since I think this function should go there) and document your new function at more length (no need to list the parameters like you do); there you can show actual examples.

(Lankinen) #586

Done! I hope it is good enough for the library.

(Rupesh Goud) #587

While I’m trying to run lesson 10, I’m facing issues:
AttributeError: ‘numpy.ndarray’ object has no attribute ‘x’ at this line:
trn_dl = LanguageModelLoader(np.concatenate(trn_lm), bs, bptt)

And my environment is,

=== Software === 
python       : 3.6.6
fastai       : 1.0.38
fastprogress : 0.1.18
torch        : 1.0.0
torch cuda   : 9.0.176 / is **Not available** 

=== Hardware === 
No GPUs available 

=== Environment === 
platform     : Linux-4.4.0-1065-aws-x86_64-with-debian-9.5
distro       : Debian GNU/Linux 9 stretch
conda env    : Unknown
python       : /usr/local/bin/python
sys.path     : 
no supported gpus found on this system

Thanks in advance

Fastai v1 install issues thread
(Duc Haba) #588


After following the install instructions on my laptop, I got the “AttributeError: module ‘typing’ has no attribute ‘_ClassVar’” error during the “from import *”.

I searched the forum, but couldn’t find any posting relating to this error. Any advice would be greatly appreciated. [include image]

(Stas Bekman) #589

Why do we have get_preds's default ds_type as Valid? Isn’t the main use for get_preds with the test set?

So currently we need to use:

predictions = learn.get_preds(ds_type=DatasetType.Test)

to do that. Yuck. How about at least having a thin wrapper?

predictions = learn.get_preds_for_test()

And then there is an issue with the docs, which currently don’t show any default value. It looks like a bug in show_doc; all the methods below it have the same issue, showing: ds_type = ``
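Sketched against stubs (the real Learner and DatasetType come from fastai and are not imported here), the proposed wrapper would be just:

```python
class DatasetType:
    """Stub for fastai's DatasetType enum."""
    Valid, Test = 'Valid', 'Test'

class Learner:
    """Stub Learner: get_preds defaults to the validation set."""
    def get_preds(self, ds_type=DatasetType.Valid):
        return f'preds for {ds_type}'

def get_preds_for_test(learn):
    """Proposed thin wrapper: predictions on the test set."""
    return learn.get_preds(ds_type=DatasetType.Test)
```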

(Quan Tran) #590

Hi guys,
Based on Jeremy’s lesson 6 pet nb, I wrote a class to simplify the process of plotting Grad-CAM (optionally with guided backprop, based on the Grad-CAM paper). I think it would be a nice complement to ClassificationInterpretation and to deep learning model interpretation in general.
The post is originally here, and I am not sure how to move it to the fastai dev topic…
Anyway, I hope this is helpful; if there’s a way to add this to fastai, let me know. I’d love to contribute it to the code base.

(Piotr Czapla) #591

You just saved me hours of debugging :heart:, as we are overwriting the language model loader to create batches for bi-directional training, and everything was working except that the results were random.
Thank you! :slight_smile:

(Mikhail Grankin) #592

I’m doing distributed training on 4 machines with 8 GPUs each. The validation part of the .fit() loop takes more time than everything else combined. I believe that is because there is no distributed inference. Is anybody working on that ATM?