Part 1 (2020) - Weekly Beginner Only Review and Q&A

My pleasure Daniel :slight_smile: Thanks for your message.

Looking to assign different weights to different image datasets, similar to what we do for class imbalance but for datasets instead. Any example would be much appreciated! @sgugger @wgpubs

I have a question on splitting tabular data. The example Jeremy used in fastbook ch 9 looks at time-series data (where he introduces the np.where function). My question is: how would you split this data if it isn’t time series? In other words, my validation data should come from the same pool as the training data. This is from Ch 9…

In [ ]:
cond = (df.saleYear<2011) | (df.saleMonth<10)
train_idx = np.where(cond)[0]
valid_idx = np.where(~cond)[0]

splits = (list(train_idx), list(valid_idx))
TabularPandas needs to be told which columns are continuous and which are categorical. We can handle that automatically using the helper function cont_cat_split:

In [ ]:
cont,cat = cont_cat_split(df, 1, dep_var=dep_var)
In [ ]:
to = TabularPandas(df, procs, cat, cont, y_names=dep_var, splits=splits)
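For anyone curious what cont_cat_split is doing under the hood, here’s a rough sketch in plain pandas. The function name and the max_card default below are my own stand-ins, not fastai’s actual implementation — the real helper has more options, but the core idea is the same: numeric columns with many distinct values are continuous, everything else is categorical.

```python
import pandas as pd

def cont_cat_split_sketch(df, max_card=20, dep_var=None):
    # Numeric columns with more than max_card distinct values are treated
    # as continuous; everything else is categorical. The dependent variable
    # is excluded from both lists.
    cont, cat = [], []
    for col in df.columns:
        if col == dep_var:
            continue
        if pd.api.types.is_numeric_dtype(df[col]) and df[col].nunique() > max_card:
            cont.append(col)
        else:
            cat.append(col)
    return cont, cat

# Toy frame loosely echoing the bulldozers example (values invented)
df = pd.DataFrame({
    "saleYear":  [2008, 2009, 2010, 2011, 2012, 2013],
    "state":     ["CA", "TX", "CA", "NY", "TX", "CA"],
    "SalePrice": [10.1, 9.8, 10.4, 10.0, 9.9, 10.2],
})
cont, cat = cont_cat_split_sketch(df, max_card=1, dep_var="SalePrice")
print(cont, cat)  # → ['saleYear'] ['state']
```

With max_card=1 (as in the book’s call, where the second positional argument is 1), any numeric column with more than one distinct value ends up continuous.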

What does your dataset look like?

You may be fine using RandomSplitter, but you may not, depending on what your training data look like. To see what I’m talking about, check out Rachel’s article on creating validation sets here.
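For reference, here’s roughly what RandomSplitter does, sketched in plain NumPy. The function name and defaults below are my own stand-ins, not fastai’s actual code — it just shuffles the row indices and carves off a fraction for validation:

```python
import numpy as np

def random_splitter_sketch(n, valid_pct=0.2, seed=42):
    # Shuffle all row indices, then take the first valid_pct of the
    # shuffled order as the validation set and the rest as training.
    rng = np.random.default_rng(seed)
    idxs = rng.permutation(n)
    cut = int(n * valid_pct)
    return list(idxs[cut:]), list(idxs[:cut])

train_idx, valid_idx = random_splitter_sketch(100, valid_pct=0.2)
print(len(train_idx), len(valid_idx))  # → 80 20
```

The resulting (train_idx, valid_idx) pair can be passed to TabularPandas as splits, just like the np.where-based split in the book.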

Your fundamental goal is to create a validation set that reflects your inference task without any data leakage (e.g., including bits of data that help you game the validation results but really hurt your inference-time code). Her article has some great examples of all of this.

-wg

1 Like

Rachel’s article addressed my question. Thank you for sharing Wayde!

1 Like

What a great treasure find! I just started my study group and cleared lesson 1. Now I’m wondering how to proceed and this is like a cool flashback on topics and discussions!

May I ask those who went through this study group: are there any additional comments that would have enhanced it? I’d like to add those to make a great study group.

Again thanks Wayde and your study group for archiving this here and giving me more ideas on how to host mine.

2 Likes

Hi andrewn2000, glad to hear you’re enjoying the course.

Jeremy says not to get too bogged down in the theory too soon, and to focus on creating stuff like blogging and deploying apps — e.g. making an app that uses some of your domain expertise and blogging about your experiences.

Nothing tests what you think you have learned more than blogging or creating apps for others to see and use.

Hope this helps

Cheers mrfabulous1 :smiley: :smiley:

1 Like

wow! your reply couldn’t have come at a better time! I was getting some errors in the vision docs on the learning rate in paperspace.

Instead of trying to troubleshoot them, maybe I’ll just push ahead and see if I can’t code up my own version of the simple classifier I built in v2 of the course.

Thanks again!

Hi andrewn2000 :smiley:

Obviously you have to fix the bugs you encounter while creating a model to use!

However, once you have a model (e.g. a .pkl file), my current strategy is to build an app for each of the chapters and answer all the questions in the chapters.
Once I get a working model, I try to build an app while doing the next notebook. Classification apps are easy, but the others are a bit harder because you have to write code to extract the results from your model. In the next 12 months I hope to have deployed an app and a blog for each of the different categories of machine learning. (PS: it doesn’t count if it’s not online somewhere, so that at least some people you know can see it.)

Good luck mrfabulous1 :smiley: :smiley:

1 Like

Hi, did you find a solution to this problem?
I am also getting the same error.

Yes, I was able to resolve this. I signed up for the Cognitive Services API on Azure for this to work.

Same for me.

Hey Guys,

I’m new to fastai.

In the first chapter I get an error like this:

cuda runtime error (801) : operation not supported at ..\torch/csrc/generic/StorageSharing.cpp:247

Can anybody help here?
I installed Linux on a different device and it runs fine. But I want to use it on Windows.

Thank you very much!

Best regards
David

fastai is not officially supported for Windows.

You may want to explore the forums to see if, and how, folks are able to run on Windows (I know many have tried), but my advice is to just use Linux … it’s supported and it’s what the industry uses for developing ML and DL systems.

Hey wgpubs,

thanks for your response.

I tried to use fastai on a separate computer running Linux (Ubuntu 20.04).

I followed the instructions.

But even here I get an error message which is related to the fastbook module.

What did I do wrong here?

Thanks a lot!

Best regards
David

> ERROR: Complete output from command /snap/jupyter/6/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-za7vvjep/sentencepiece/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-0_vkaeaa --python-tag cp37:
>   ERROR: running bdist_wheel
>   running build
>   running build_py
>   creating build
>   creating build/lib.linux-x86_64-3.7
>   creating build/lib.linux-x86_64-3.7/sentencepiece
>   copying src/sentencepiece/__init__.py -> build/lib.linux-x86_64-3.7/sentencepiece
>   copying src/sentencepiece/sentencepiece_model_pb2.py -> build/lib.linux-x86_64-3.7/sentencepiece
>   copying src/sentencepiece/sentencepiece_pb2.py -> build/lib.linux-x86_64-3.7/sentencepiece
>   running build_ext
>   /bin/sh: 1: pkg-config: not found
>   ./build_bundled.sh: 8: ./build_bundled.sh: git: not found
>   ./build_bundled.sh: 10: ./build_bundled.sh: git: not found
>   ./build_bundled.sh: 12: cd: can't cd to sentencepiece
>   ./build_bundled.sh: 15: ./build_bundled.sh: cmake: not found
>   ./build_bundled.sh: 16: ./build_bundled.sh: nproc: Permission denied
>   ./build_bundled.sh: 16: ./build_bundled.sh: make: not found
>   ./build_bundled.sh: 17: ./build_bundled.sh: make: not found
>   env: 'pkg-config': No such file or directory
>   Failed to find sentencepiece pkg-config
>   ----------------------------------------
>   ERROR: Failed building wheel for sentencepiece
>   ERROR: Complete output from command /snap/jupyter/6/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-za7vvjep/cymem/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-zt98k7kp --python-tag cp37:
>   ERROR: running bdist_wheel
>   running build
>   running build_py
>   creating build
>   creating build/lib.linux-x86_64-3.7
>   creating build/lib.linux-x86_64-3.7/cymem
>   copying cymem/__init__.py -> build/lib.linux-x86_64-3.7/cymem
>   copying cymem/about.py -> build/lib.linux-x86_64-3.7/cymem
>   package init file 'cymem/tests/__init__.py' not found (or not a regular file)
>   creating build/lib.linux-x86_64-3.7/cymem/tests
>   copying cymem/tests/test_import.py -> build/lib.linux-x86_64-3.7/cymem/tests
>   copying cymem/cymem.pyx -> build/lib.linux-x86_64-3.7/cymem
>   copying cymem/__init__.pxd -> build/lib.linux-x86_64-3.7/cymem
>   copying cymem/cymem.pxd -> build/lib.linux-x86_64-3.7/cymem
>   running build_ext
>   building 'cymem.cymem' extension
>   creating build/temp.linux-x86_64-3.7
>   creating build/temp.linux-x86_64-3.7/cymem
>   gcc -pthread -B /home/filipe/miniconda3/envs/JUPYTER/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/snap/jupyter/6/include/python3.7m -I/snap/jupyter/6/include/python3.7m -c cymem/cymem.cpp -o build/temp.linux-x86_64-3.7/cymem/cymem.o -O3 -Wno-strict-prototypes -Wno-unused-function
>   unable to execute 'gcc': No such file or directory
>   error: command 'gcc' failed with exit status 1
>   ----------------------------------------
>   ERROR: Failed building wheel for cymem
>   ERROR: Complete output from command /snap/jupyter/6/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-za7vvjep/srsly/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-_odsi9d0 --python-tag cp37:
>   ERROR: running bdist_wheel
>   running build
>   running build_py
>   creating build
>   creating build/lib.linux-x86_64-3.7
>   creating build/lib.linux-x86_64-3.7/srsly
>   copying srsly/_json_api.py -> build/lib.linux-x86_64-3.7/srsly
>   copying srsly/_msgpack_api.py -> build/lib.linux-x86_64-3.7/srsly
>   copying srsly/util.py -> build/lib.linux-x86_64-3.7/srsly
>   copying srsly/_pickle_api.py -> build/lib.linux-x86_64-3.7/srsly
>   copying srsly/__init__.py -> build/lib.linux-x86_64-3.7/srsly
>   copying srsly/about.py -> build/lib.linux-x86_64-3.7/srsly
>   creating build/lib.linux-x86_64-3.7/srsly/cloudpickle
>   copying srsly/cloudpickle/cloudpickle.py -> build/lib.linux-x86_64-3.7/srsly/cloudpickle
>   copying srsly/cloudpickle/__init__.py -> build/lib.linux-x86_64-3.7/srsly/cloudpickle
>   creating build/lib.linux-x86_64-3.7/srsly/tests
>   copying srsly/tests/test_msgpack_api.py -> build/lib.linux-x86_64-3.7/srsly/tests
>   copying srsly/tests/test_pickle_api.py -> build/lib.linux-x86_64-3.7/srsly/tests
>   copying srsly/tests/test_json_api.py -> build/lib.linux-x86_64-3.7/srsly/tests
>   copying srsly/tests/__init__.py -> build/lib.linux-x86_64-3.7/srsly/tests
>   creating build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_msgpack_numpy.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/util.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_version.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_ext_type.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/__init__.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/exceptions.py -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   creating build/lib.linux-x86_64-3.7/srsly/ujson
>   copying srsly/ujson/__init__.py -> build/lib.linux-x86_64-3.7/srsly/ujson
>   creating build/lib.linux-x86_64-3.7/srsly/tests/cloudpickle
>   copying srsly/tests/cloudpickle/cloudpickle_file_test.py -> build/lib.linux-x86_64-3.7/srsly/tests/cloudpickle
>   copying srsly/tests/cloudpickle/testutils.py -> build/lib.linux-x86_64-3.7/srsly/tests/cloudpickle
>   copying srsly/tests/cloudpickle/__init__.py -> build/lib.linux-x86_64-3.7/srsly/tests/cloudpickle
>   creating build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_extension.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_sequnpack.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_limits.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_memoryview.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_pack.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_read_size.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_buffer.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_subtype.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_case.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_except.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_seq.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_stricttype.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_unpack.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/__init__.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_numpy.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_newspec.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   copying srsly/tests/msgpack/test_format.py -> build/lib.linux-x86_64-3.7/srsly/tests/msgpack
>   creating build/lib.linux-x86_64-3.7/srsly/tests/ujson
>   copying srsly/tests/ujson/test_ujson.py -> build/lib.linux-x86_64-3.7/srsly/tests/ujson
>   copying srsly/tests/ujson/__init__.py -> build/lib.linux-x86_64-3.7/srsly/tests/ujson
>   copying srsly/msgpack/_unpacker.pyx -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_packer.pyx -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/unpack.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/unpack_define.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/pack_template.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/buff_converter.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/unpack_template.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/pack.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/sysdep.h -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_unpacker.cpp -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/msgpack/_packer.cpp -> build/lib.linux-x86_64-3.7/srsly/msgpack
>   copying srsly/ujson/ujson.c -> build/lib.linux-x86_64-3.7/srsly/ujson
>   copying srsly/ujson/JSONtoObj.c -> build/lib.linux-x86_64-3.7/srsly/ujson
>   copying srsly/ujson/objToJSON.c -> build/lib.linux-x86_64-3.7/srsly/ujson
>   copying srsly/ujson/version.h -> build/lib.linux-x86_64-3.7/srsly/ujson
>   copying srsly/ujson/py_defines.h -> build/lib.linux-x86_64-3.7/srsly/ujson
>   running build_ext
>   building 'srsly.msgpack._unpacker' extension
>   creating build/temp.linux-x86_64-3.7
>   creating build/temp.linux-x86_64-3.7/srsly
>   creating build/temp.linux-x86_64-3.7/srsly/msgpack
>   gcc -pthread -B /home/filipe/miniconda3/envs/JUPYTER/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -D_LITTLE_ENDIAN_=1 -I/snap/jupyter/6/include/python3.7m -I. -I/tmp/pip-install-za7vvjep/srsly/include -I/snap/jupyter/6/include/python3.7m -c srsly/msgpack/_unpacker.cpp -o build/temp.linux-x86_64-3.7/srsly/msgpack/_unpacker.o -O2 -Wno-strict-prototypes -Wno-unused-function
>   unable to execute 'gcc': No such file or directory
>   error: command 'gcc' failed with exit status 1
>   ----------------------------------------
>   ERROR: Failed building wheel for srsly
> 
> ERROR: matplotlib 3.3.2 has requirement certifi>=2020.06.20, but you'll have certifi 2019.3.9 which is incompatible.
>     ERROR: Complete output from command /snap/jupyter/6/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-za7vvjep/srsly/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-5ixdcr99/install-record.txt --single-version-externally-managed --compile --user --prefix=:
>     ERROR: running install
>     running build
>     running build_py
>     running build_ext
>     building 'srsly.msgpack._unpacker' extension
>     gcc -pthread -B /home/filipe/miniconda3/envs/JUPYTER/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -D_LITTLE_ENDIAN_=1 -I/snap/jupyter/6/include/python3.7m -I. -I/tmp/pip-install-za7vvjep/srsly/include -I/snap/jupyter/6/include/python3.7m -c srsly/msgpack/_unpacker.cpp -o build/temp.linux-x86_64-3.7/srsly/msgpack/_unpacker.o -O2 -Wno-strict-prototypes -Wno-unused-function
>     unable to execute 'gcc': No such file or directory
>     error: command 'gcc' failed with exit status 1
>     ----------------------------------------
> ERROR: Command "/snap/jupyter/6/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-za7vvjep/srsly/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-5ixdcr99/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-install-za7vvjep/srsly/
> 
> ----------------------------------------

> ModuleNotFoundError                       Traceback (most recent call last)
> <ipython-input-2-2b820b2b946f> in <module>
>       1 #hide
>       2 get_ipython().system('pip install -Uqq fastbook')
> ----> 3 import fastbook
>       4 fastbook.setup_book()
> 
> ModuleNotFoundError: No module named 'fastbook'

Yikes, that’s a bunch of errors :slight_smile:

Honestly, you’ll get more out of the class and be less distracted by these sysadmin issues if you use one of the cloud provider instances (Colab, Gradient, etc…). Personally, I’d go with Colab for the course.

If you want to go down the rabbit hole of setting up your own environment, here are some resources that will help. Also, you’ll want to search the forums for how folks are setting up their local development machines; many have done so, and most probably more proficiently than myself.

https://waydegg.com/making-a-dl-server.html

1 Like

Hey wgpubs,

thanks a lot for your advice! I will use Colab then.

I finished the first 2 lessons only watching the videos.

I have one last question. I want to use fastai to calculate the temperature, pressure, and mixing ratio of a substance by giving it a bunch of ideal (calculated) measurements, which are defined for each property. Every state looks like a polynomial of high order, so changing the temperature, pressure, or mixing ratio changes the “polynomial” a bit. Once the AI is trained I want to feed in measured data. Should I use the plots as pictures to train the AI, since fastai is really good at image recognition? The first 2 lessons talk about classifying pictures of bears, but my case is different. I want a result that looks like this, for example: “(p=3 atm, T=1500K, r = 0.85) is the most likely with 95%”.

Thanks for your advice.

Best regards
David

w/r/t making your data into an image … go for it with this in mind:

… a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too. See pp. 36–39

If it does, then it will likely work (btw, I’d love to see what those image inputs look like)

If it doesn’t, this is probably better suited as a tabular problem (which you’ll get to around ch. 6 or 7 of the book).
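If you go the tabular route, one hypothetical way to frame it: sample each curve at fixed positions so that every measurement becomes one row of features, with (p, T, r) as the targets of a multi-output regression. Everything below — the curve shape, the sample points, the values — is invented purely for illustration:

```python
import numpy as np

# Fixed positions at which every curve is sampled, so each curve
# becomes one fixed-length feature row.
x = np.linspace(0.0, 1.0, 20)

def make_row(p, T, r):
    # Stand-in "polynomial-like" curve whose shape depends on (p, T, r).
    # A real dataset would use the actual calculated measurements instead.
    return p * x**3 + (T / 1000.0) * x**2 + r * x

# Two example states: features are the sampled curves,
# targets are the state variables that produced them.
features = np.stack([make_row(3.0, 1500.0, 0.85),
                     make_row(2.0, 1200.0, 0.50)])
targets = np.array([[3.0, 1500.0, 0.85],
                    [2.0, 1200.0, 0.50]])
print(features.shape, targets.shape)  # → (2, 20) (2, 3)
```

A table shaped like this (one row per measurement, the (p, T, r) columns as dependent variables) is the kind of input the tabular chapters of the book work with.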

See my blog series here for more highlights, tips, & tricks from my re-read of fastbook:

2 Likes

I’m new to Deep Learning and this book. This question is regarding chapter 1. I’m able to run a notebook and the “cats and dogs” model, but when I try to “test” the model in the next step using an image of my own, I encounter an error. I’m hoping to find out what I’m doing wrong and how to remedy it. Thanks

If you’re using Google Colab, the widgets won’t work. Otherwise, if you’re running in Jupyter or Paperspace Gradient, you might need to import the library using: import ipywidgets as widgets