docs.fast.ai is still down due to Github problems. How do I run it locally?
Clone the fastai_docs repo. /docs is just a jekyll site - so follow jekyll instructions for running it.
I was thinking: perhaps we could add a new feature to the course notebooks that automatically adjusts the bs depending on how much GPU RAM the user has - e.g. poll nvidia-smi and set the bs accordingly?
The problem is that if I want to get the latest changes and git pull, I have to wipe out all my local overrides, otherwise it’s a hell of a merge operation. And then I need to remember to go and change them again - a huge waste of time. So it’d be nice if the bs no longer needed to be changed manually (other than in lesson 1, where it’s on purpose too big, to demonstrate the correspondence between bs and GPU RAM).
So in the first few cells, there will be a call:
bs = set_bs_based_on_my_gpu_size('lesson1')
so we will need to have a db of memory requirements for each lesson, and it’d magically sort it out.
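A minimal sketch of how such a lookup could work. Everything here is hypothetical - the function name comes from the post, but the db and all the tier numbers are made up for illustration; the GPU RAM figure would come from polling nvidia-smi (or torch.cuda) in practice:

```python
# Hypothetical db of batch sizes per lesson, keyed by minimum GPU RAM in GB
# (the tiers below are illustrative, not measured values)
BS_DB = {
    'lesson1': [(8, 24), (12, 48), (21, 100)],
}

def set_bs_based_on_my_gpu_size(lesson, gpu_mem_gb):
    """Return the largest batch size whose RAM tier fits in gpu_mem_gb."""
    tiers = sorted(BS_DB[lesson])  # ascending by minimum RAM
    bs = tiers[0][1]               # smallest tier as the safe fallback
    for min_gb, tier_bs in tiers:
        if gpu_mem_gb >= min_gb:
            bs = tier_bs
    return bs
```

So on an 11GB card this would pick the 8GB tier, and on a 16GB card the 12GB tier.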
Alternative idea, have each nb start with:
bs = 24 # 8GB
#bs = 48 # 12GB
#bs = 100 # 21GB
or a dict perhaps, so the user can quickly choose manually rather than re-running the nb again and again and guessing. But of course it can be automated from nvidia-smi info.
I like the idea of setting bs “automagically”, but if you do that only for the lessons (checking the best size based on a db), I wonder if people won’t expect that this is something fastai can always do.
In fastai, bs is defined in DataBunch, but DataBunch has no info on the model. Keras chose to set bs in fit (as far as I remember), where you do have info on the model.
If we change to set bs at fit, or to use set_bs_based_...(learn) after the learner is created, this could be a permanent feature.
This is something I would like to have, as nowadays defining bs is a trial-and-error thing.
FYI: fastai-1.0.12 is out
- change TextDataBunch class method from_folder so that the classes argument is passed to the call to TextDataset
- Strip space from file name when CSV has spaces
- Handle missing
- Pass on the
- Bad handling when final batch has size of 1
- rolled back the numpy dependency to >=1.12 (the anaconda package has an upper pin on it) and pip to >=9.0.1; the old versions are buggy but should be OK for fastai
@ashaw, I have another doc-gen puzzle for you, besides the generation of unique anchors.
Please take a look at tabular.html
Do you see the first link with an anchor (in the main content, not toc)?
<p><a href="/tabular.html#tabular"><code>tabular</code></a> contains all the necessary classes to deal with tabular data, across two modules:</p>
but there is no such target as #tabular. So either you need to add the missing target or not emit those anchors in the links - I don’t know which is the correct way.
And it happens in more than one place in the same file. Therefore I think you just need to add the missing target.
The same problem appears in about a dozen files: basic_train.html callback.html collab.html core.html metrics.html tabular.html text.html vision.data.html vision.html vision.tta.html.
These tools are great! Much appreciated
Hmmm I think https://docs.fast.ai/tabular.html#tabular actually works on the main docs site.
Perhaps the ./checklink-docs-local.sh script breaks because #tabular is not directly included in the local tabular.html file?
Jekyll converts it (through a template) when it serves - https://github.com/fastai/fastai_docs/blob/master/docs/_layouts/post.html#L7
But yes, I think there are still lots of other broken links I need to resolve!
OK, thank you for showing me that it’s js/template-generated and not in the source code.
I guess I can’t rely on a local fs link checker (it’s just much much faster). That’s OK, the live site’s one seems to work and reports a lot more problems. The latest one is here:
Thank you, Andrew.
Edit: I guess I can change the local fs link checker to fire up jekyll, run the check and then take it down - that would do the trick.
bundle exec jekyll build - then it’ll create your site in the _site directory.
Important change in the API of RNNLearner, described in detail here.
Also, removed all get_mnist(), get_adult(), get_imdb() methods from datasets since they were causing:
- the library to import vision and text all the time
- circular dependencies
I’m in the process of updating the docs and all the examples accordingly.
I hope this thread is OK for this: would it make sense to have a check_images method in ImageDataBunch? @Taka and I had a short discussion in the “share your work” thread of the ongoing course. I used a very simple quick-and-dirty for loop to check for file integrity in my little fun notebook:
A parallelized version with a delete flag could be nice for cleaning a dataset.
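For reference, a quick-and-dirty sketch of what such a check_images could look like - hypothetical, not an actual fastai/ImageDataBunch method; it relies on PIL’s verify(), which only does a cheap header/consistency check, not a full decode:

```python
from pathlib import Path
from PIL import Image

def check_images(path, delete=False):
    """Return the list of files under `path` that PIL cannot open/verify.
    With delete=True, broken files are removed as well."""
    broken = []
    for fn in Path(path).glob('**/*'):
        if not fn.is_file():
            continue
        try:
            with Image.open(fn) as img:
                img.verify()  # cheap integrity check, no full decode
        except Exception:
            broken.append(fn)
            if delete:
                fn.unlink()
    return broken
```

A parallelized version of this (e.g. via a process pool) is where the real speedup for large datasets would come from.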
@stas Great idea. But before doing the merge “surgery”, note that there may be other workloads occupying parts of the GPU, and many GPUs to choose from.
How about providing a simple API that
- returns a list of how much RAM is available on each GPU (via nvidia-smi)
- returns the RAM requirements for a given model x, so that we can then set the bs and choose the GPU using whatever policy…
Also, using LMS or other high-bandwidth swapping GPU-CPU RAM mechanisms we can deploy much bigger models with a minimal performance penalty…
For beginners the simpler thing is to have a bs table/multiplier factor, as there are just a few GPU RAM configurations, eg 8GB, 11GB, 16GB, …
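A minimal sketch of the first part of that API, assuming nvidia-smi is on the PATH (the function name is made up; it degrades to an empty list on CPU-only boxes, per the graceful-fallback point above):

```python
import subprocess

def get_gpu_free_mem():
    """Return a list of free memory (in MiB) for each visible GPU,
    or an empty list when nvidia-smi is unavailable (CPU-only setup)."""
    try:
        out = subprocess.run(
            ['nvidia-smi', '--query-gpu=memory.free',
             '--format=csv,noheader,nounits'],
            capture_output=True, text=True, check=True).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []  # no nvidia-smi: fall back gracefully
    return [int(x) for x in out.split()]
```

A policy layer could then pick the GPU with the most free MiB, or skip cards below some threshold.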
I made a new mem-utils branch with the starter code. (new code is in fastai/utils/mem.py)
Also, this should work gracefully on CPU-only setups.
a new release has been published:
- pretrained language model is now downloaded directly in the .fastai/models/ folder. Use
- add an argument to Learner.lr_find() to prevent early stopping, useful for negative losses
- add an argument to SegmentationDataset to choose the PIL conversion mode of the masks
- URLs.download_wt103() has been removed
update on the fastai_docs tools
all tools are now under tools/
fire has been removed; one script can now update specific or all notebooks, and the way arguments are passed changed a bit, to a simpler:
update all notebooks:
update specific notebooks:
tools/update-nbs docs_src/one.ipynb docs_src/one.ipynb
to pass arguments to update_notebooks just add any of:
--update_html --update_nb --update_nb_links --do_execute --update_line_num
and that will indicate --arg=True; otherwise the defaults are used
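For illustration, that flag behavior is exactly what argparse’s store_true action gives you. The flag names match the post, but the wiring below is a hypothetical sketch (defaults shown as False here; the real script’s defaults may differ):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('files', nargs='*')
for flag in ('--update_html', '--update_nb', '--update_nb_links',
             '--do_execute', '--update_line_num'):
    # passing the flag sets it to True; omitting it leaves the default
    parser.add_argument(flag, action='store_true')

args = parser.parse_args(['docs_src/one.ipynb', '--update_html'])
```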
tools/make_sidebar.py and the data was split off into docs_src/sidebar/sidebar_data.py (feel free to relocate the data elsewhere if you don’t like that location, but please not into the tools/ folder - thank you).
updated docs: https://docs.fast.ai/gen_doc.html#Updating-sidebar
Also @sgugger or @ashaw can you please look at two XXX comments I made at: https://docs.fast.ai/gen_doc.html#Update-the-doc I am not sure those are still relevant/correct - can you please validate/remove/fix?
@ashaw, I also noticed that ‘:’ from headers makes it into the #anchor, e.g.:
I don’t think ‘:’ is allowed inside the anchor. Please correct me if I’m wrong.
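For comparison, GitHub/Jekyll-style slug generators typically strip punctuation such as ‘:’ when building anchors rather than keeping it in the id. A rough sketch of that behavior (hypothetical helper, not the doc-gen code):

```python
import re

def make_anchor(header):
    """Build a GitHub-style anchor slug: lowercase, punctuation dropped,
    spaces turned into hyphens."""
    slug = header.strip().lower()
    slug = re.sub(r'[^\w\- ]', '', slug)  # drop ':' and other punctuation
    return slug.replace(' ', '-')
```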
Do you mean to ask if we still need a function that does just one notebook? Yes, definitely.
I made a new mem-utils branch with the starter code. (new code is in fastai/utils/mem.py)
Very good start. Just tested it on 2 different servers…
Now it also reports the memory available on the smaller dedicated graphics card (e.g. a 710), which is not a good candidate for running CUDA.