Lesson 3 In-Class Discussion ✅

@hxiao0909 were you able to solve this issue? I am facing the same one. @joshfp @sgugger could you please help me with this?

I just updated the Zeit script to the latest fastai (1.0.40) and it’s working properly. Be sure to pull the latest version of the corresponding notebook, as it uses the new fast way to do inference.

There is a new type of error coming up now, @sgugger:
client.js:34 POST https://fastai-1v3.appspot.com/analyze 500
analyze @ client.js:34
onclick @ (index):28
VM173:1 Uncaught SyntaxError: Unexpected token I in JSON at position 0
at JSON.parse ()
at XMLHttpRequest.xhr.onload (client.js:26)
xhr.onload @ client.js:26
load (async)
analyze @ client.js:24
onclick @ (index):28
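
A hedged reading of the console output above: "Unexpected token I in JSON at position 0" is what a JSON parser reports when the body starts with a bare "I", which would fit a plain-text "Internal Server Error" page sent with the 500 status. That body is an assumption, since the actual response is not shown in the post. The sketch below reproduces the same failure mode in Python with the standard json module; parse_response and the sample bodies are hypothetical names for illustration and are not part of the deployment scripts.

```python
import json

def parse_response(status: int, body: str) -> dict:
    """Parse a JSON body only when the request succeeded (hypothetical helper)."""
    if status != 200:
        # Surface the server's own message instead of a confusing parse error.
        return {"error": f"HTTP {status}: {body[:80]}"}
    return json.loads(body)

# Assumed plain-text body of a 500 response; the real body is not in the post.
error_body = "Internal Server Error"

try:
    json.loads(error_body)
    parsed_directly = True
except json.JSONDecodeError:
    # Same failure mode as the browser's JSON.parse on a non-JSON body.
    parsed_directly = False

guarded = parse_response(500, error_body)
ok = parse_response(200, '{"result": "cat"}')
```

Either way, a 500 status points at the server side, so the server logs are the place to look for the underlying exception.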


@sgugger
The above was when I was trying to use Google App Engine.
When I tried Zeit first, I got this error:


Then I changed now.json to
version:2 and removed all the remaining lines.

But now when I go to the deployed app site,
it shows only the directory structure:
https://zeit-6js94ceor.now.sh/

Actually, what Jeremy once suggested was not to throw the 4th channel out, because that means throwing away information.

For an excellent code example of modifying 3-channel-input pretrained models to take 4 channels (or even more if you wish), see the one by @wdhorton for the Human Protein Atlas competition here.
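
As a rough illustration of the idea (not @wdhorton’s code, which is only linked above), the first convolution’s weights can be widened from 3 to 4 input channels while keeping the pretrained filters. The sketch below works on a plain NumPy array in the (out_channels, in_channels, kernel_h, kernel_w) layout; the function name widen_input_channels and the choice to initialise the extra channel with the mean of the existing filters are assumptions made for this example.

```python
import numpy as np

def widen_input_channels(weight: np.ndarray, new_in: int) -> np.ndarray:
    """Extend conv weights of shape (out, in, kh, kw) to `new_in` input channels."""
    out_ch, in_ch, kh, kw = weight.shape
    if new_in < in_ch:
        raise ValueError("new_in must be >= the existing number of input channels")
    # Assumption: initialise each extra channel with the mean of the existing filters.
    mean_filter = weight.mean(axis=1, keepdims=True)          # (out, 1, kh, kw)
    extra = np.repeat(mean_filter, new_in - in_ch, axis=1)     # (out, new_in - in, kh, kw)
    return np.concatenate([weight, extra], axis=1)

# Random stand-in for a pretrained first-layer weight: 64 filters, 3 channels, 7x7 kernels.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(64, 3, 7, 7))
widened = widen_input_channels(pretrained, 4)
```

The first three channels of widened are the original filters unchanged, so the pretrained behaviour on RGB is preserved and only the new channel has to be learned from scratch.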

I’m working with a Kaggle fake news dataset. When I use:

data_lm = (TextList.from_df(df, cols=['text','type'])
                .random_split_by_pct()
                .label_for_lm()
                .databunch(bs=bs))

I get:

AttributeError                            Traceback (most recent call last)
<ipython-input-8-632209f31b16> in <module>
----> 1 data_lm = (TextList.from_df(df, cols=['text','type'])
      2                 .random_split_by_pct()
      3                 .label_for_lm()
      4                 .databunch(bs=bs))

AttributeError: 'float' object has no attribute 'replace'

and when I try:

data_lm = (TextList.from_df(df, cols=['text','type'])
                .random_split_by_pct(0.2)
                .label_for_lm()
                .databunch(bs=bs))

I get:

AttributeError                            Traceback (most recent call last)
<ipython-input-9-223d106d8d5b> in <module>
      1 data_lm = (TextList.from_df(df, cols=['text','type'])
----> 2                 .random_split_by_pct(0.2)
      3                 .label_for_lm()
      4                 .databunch(bs=bs))

AttributeError: 'float' object has no attribute 'replace'

In other words, same error on the next line. Does anyone know what’s going on here?

Thanks!

NVM! It was caused by NaNs in the text column. I filtered the NaNs out of the df and it works fine!
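
A minimal sketch of why a NaN produces this exact error, using only the standard library. The sample rows and the helper is_missing are made up for illustration; the real column is the text column of the poster’s DataFrame.

```python
import math

def is_missing(value) -> bool:
    """True for None or a float NaN, the usual 'missing' markers in a text column."""
    return value is None or (isinstance(value, float) and math.isnan(value))

# Hypothetical rows standing in for the 'text' column; one entry is missing.
texts = ["breaking news", float("nan"), "another article"]

try:
    [t.replace("\n", " ") for t in texts]
    error_message = None
except AttributeError as exc:
    # A NaN is a float, so string methods such as .replace do not exist on it.
    error_message = str(exc)

# Keep only the rows that actually contain text.
clean_texts = [t for t in texts if not is_missing(t)]
```

With a pandas DataFrame the equivalent filter is usually written as df = df.dropna(subset=['text']) before building the TextList, where 'text' is the column name taken from the post.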

I’m getting a strange error related to displaying DataFrames inside of the fastai Conda environment I’ve created.

The following occurs when calling df.head():

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

~/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/pandas/core/frame.py in _repr_html_(self)
    647         # display HTML, so this check can be removed when support for
    648         # IPython 2.x is no longer needed.
--> 649         if console.in_qtconsole():
    650             # 'HTML output is disabled in QtConsole'
    651             return None

~/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/pandas/io/formats/console.py in in_qtconsole()
    121             ip.config.get('KernelApp', {}).get('parent_appname', "") or
    122             ip.config.get('IPKernelApp', {}).get('parent_appname', ""))
--> 123         if 'qtconsole' in front_end.lower():
    124             return True
    125     except NameError:

AttributeError: 'LazyConfigValue' object has no attribute 'lower'

I’ve tried printing pandas DataFrames in notebooks in other virtual environments on my machine and it works, so I’m not sure what’s going on here.

I’m working on a local machine on Ubuntu 18.04 and I have pulled the latest version of the repo and updated everything.

I’m not clear on something in the IMDB nb. @lesscomfortable I’m hoping you can explain it for me!

In the following part, is the second line of code (data_lm = TextLMDataBunch.load(path, 'tmp_lm', bs=bs)) changing data_lm in any way? I don’t believe it is, but I’m not sure.

data_lm = (TextList.from_folder(path)
           #Inputs: all the text files in path
            .filter_by_folder(include=['train', 'test', 'unsup']) 
           #We may have other temp folders that contain text files so we only keep what's in train and test
            .random_split_by_pct(0.1)
           #We randomly split and keep 10% (10,000 reviews) for validation
            .label_for_lm()           
           #We want to do a language model so we label accordingly
            .databunch(bs=bs))
data_lm.save('tmp_lm')

We have to use a special kind of TextDataBunch for the language model, that ignores the labels (that’s why we put 0 everywhere), will shuffle the texts at each epoch before concatenating them all together (only for training, we don’t shuffle for the validation set) and will send batches that read that text in order with targets that are the next word in the sentence.

The line before being a bit long, we want to load quickly the final ids by using the following cell.

data_lm = TextLMDataBunch.load(path, 'tmp_lm', bs=bs)

In the comment about “The line before being a bit long”, I don’t know if “long” refers to execution time or just to the number of lines of code. The first line that creates data_lm runs fairly quickly, so I don’t really see what’s gained by the second line that creates data_lm using TextLMDataBunch.

Hope this question makes sense! 🙂

There was a question around 44:01 about the particular coding style for the DataBlocks API that uses that kind of method chaining.

In software engineering that’s called a “fluent interface”. You can find out more about it here: https://en.wikipedia.org/wiki/Fluent_interface

Another example that uses this kind of style is the Django QuerySet API.

Just thought that would be interesting to some of you.
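
For anyone who has not seen the pattern, here is a tiny sketch of a fluent interface in plain Python. The Query class and its methods are invented for this example and are not part of fastai or Django; the point is only that each method returns an object, so calls can be chained the way the data block API chains them.

```python
class Query:
    """Tiny fluent interface: each method returns a new Query so calls can be chained."""

    def __init__(self, items):
        self.items = list(items)

    def filter(self, predicate):
        return Query(x for x in self.items if predicate(x))

    def map(self, func):
        return Query(func(x) for x in self.items)

    def take(self, n):
        return Query(self.items[:n])

    def to_list(self):
        # Terminal call: ends the chain and returns plain data.
        return list(self.items)

result = (Query(range(10))
          .filter(lambda x: x % 2 == 0)   # keep even numbers: 0, 2, 4, 6, 8
          .map(lambda x: x * x)           # square them: 0, 4, 16, 36, 64
          .take(3)                        # first three
          .to_list())
```

Wrapping the chain in parentheses, as the notebooks do, lets each step sit on its own line with a comment next to it.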

I’m getting a bunch of errors with Lesson 3; I’m on the Planet Kaggle data.

Showing df results in the following error:

AttributeError                            Traceback (most recent call last)
D:\Anaconda3\envs\fastai_v3\lib\site-packages\IPython\core\formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

D:\Anaconda3\envs\fastai_v3\lib\site-packages\pandas\core\frame.py in _repr_html_(self)
    647         # display HTML, so this check can be removed when support for
    648         # IPython 2.x is no longer needed.
--> 649         if console.in_qtconsole():
    650             # 'HTML output is disabled in QtConsole'
    651             return None

D:\Anaconda3\envs\fastai_v3\lib\site-packages\pandas\io\formats\console.py in in_qtconsole()
    121             ip.config.get('KernelApp', {}).get('parent_appname', "") or
    122             ip.config.get('IPKernelApp', {}).get('parent_appname', ""))
--> 123         if 'qtconsole' in front_end.lower():
    124             return True
    125     except NameError:

AttributeError: 'LazyConfigValue' object has no attribute 'lower'

Then, in the next cell, where src is defined after np.random.seed(42), I get an error that ends with:

Exception: Your validation data contains a label that isn't present in the training set, please fix your data.

Can anyone help?

I’ve also noticed many differences between the code in the 2019 MOOC and the current code in the GitHub repo. Is that intentional? I tried running the exact commands from the MOOC video, but ImageFileList does not appear to be a valid command.
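
For the validation-label exception quoted above, one way to see which labels are involved is to compare the label sets of the two splits directly. This is a sketch under assumptions, not what fastai does internally: the function name, the sample rows, and the use of space-separated tags (the style of the Planet CSV) are all chosen for illustration.

```python
def unseen_validation_labels(train_tags, valid_tags, sep=" "):
    """Return labels that occur in the validation rows but in no training row."""
    train_labels = {label for tags in train_tags for label in tags.split(sep)}
    valid_labels = {label for tags in valid_tags for label in tags.split(sep)}
    return sorted(valid_labels - train_labels)

# Made-up multi-label rows in a space-separated style.
train_tags = ["clear primary", "haze primary water", "cloudy"]
valid_tags = ["clear primary road", "cloudy"]

missing = unseen_validation_labels(train_tags, valid_tags)
```

Here missing would list "road", the one tag that appears only in the validation rows. A rare label can end up only in the validation split purely by chance, depending on the random split.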

Has anyone applied this to the Nuclei dataset (https://www.kaggle.com/c/data-science-bowl-2018) using Mask R-CNN?

I’m trying to train hand segmentation on a new dataset (EgoHands). However, I always get CUDA out of memory, even with a P100 (16 GB GPU). The error appears even with a very small image size and batch size:

size = src_size//32
bs=2

The original image shape is (1280,720)

Can someone suggest how to deal with it? Thank you in advance.

This might be helpful

Hey! The idea behind load is that you don’t have to write that whole chunk of code and re-create the databunch object every time you want to use it. It saves both computation and code. You don’t want to do the same thing twice if you can avoid it.

To your question: it is not changing data_lm, in the same way that loading a model from saved weights does not change the Learner object. However, say you ran it yesterday, created the databunch object, and want to run it again today. In that case there is nothing to overwrite; in other words, there is no data_lm to change. You use .load, and that loads the same databunch you created yesterday with less computation time and fewer lines of code.

Please let me know if this solves your problem.
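
The same build-once, load-later pattern can be sketched with the standard library. This is only an analogy for the workflow described above, not how fastai’s save and load are implemented; pickle, build_dataset, load_or_build and the file name are assumptions for the example.

```python
import pickle
import tempfile
from pathlib import Path

def build_dataset():
    """Stand-in for an expensive preprocessing step (e.g. tokenising a corpus)."""
    return {"vocab": ["the", "movie", "was", "great"], "ids": [[0, 1, 2, 3]]}

def load_or_build(cache_file: Path):
    """Load the cached object if present; otherwise build it once and save it."""
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f), "loaded"
    data = build_dataset()
    with cache_file.open("wb") as f:
        pickle.dump(data, f)
    return data, "built"

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "tmp_lm.pkl"
    first, how_first = load_or_build(cache)     # "yesterday": build and save
    second, how_second = load_or_build(cache)   # "today": load from disk
```

The second call returns an equal object without redoing the work, which is the benefit of loading in a fresh session even when the original build felt quick.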

Hey guys! Can anyone advise whether I can use the lesson3-head-pose.ipynb notebook flow to find several faces in a picture? Each picture would then be labeled with a multidimensional tensor. Would that still be possible if each picture has a different number of faces?

Solution in this post.

@sgugger, am I wrong, or does data.show_batch() in the Jupyter notebook lesson3-camvid.ipynb not show the real input to our U-Net model (real input = image transformed by tfms)?

(I have the same question about learn.show_results(): it does not show what the model predicts, i.e. a mask, but an overlay of the input image and its predicted mask.)

Indeed, data.show_batch() shows this:

But the true input is just a normal image transformed by tfms, no? (And its target label is its mask; see the following screenshot.)

Last question: what code does fastai use to overlay 2 images (an image and its mask, as displayed by data.show_batch() and learn.show_results())?

Thanks.

Hey, put this before your code:
get_ipython().config.get('IPKernelApp', {})['parent_appname'] = ""

such that it reads:
get_ipython().config.get('IPKernelApp', {})['parent_appname'] = ""
df = pd.read_csv(path/'train_v2.csv')
df.head()

Also, I am running Python 3.7.2 on Ubuntu 18.04 on a p2.xlarge instance on AWS.