I see, thanks a lot!
I just updated metrics.ipynb, but I still ran all the docs tests via `./run_tests.sh` and saw two tests fail even though I hadn’t made any changes to them. I wanted to point this out in case it’s helpful.
To reproduce, do the dev install and then run `./run_tests.sh text*`
_____________________________________________________ text.ipynb::Cell 6 ______________________________________________________
Notebook cell execution failed
Cell 6: Cell execution caused an exception
Input:
data_lm.save('data_lm_export.pkl')
data_clas.save('data_clas_export.pkl')
Traceback:
---------------------------------------------------------------------------
IsADirectoryError Traceback (most recent call last)
<ipython-input-7-7dcc871a6781> in <module>
----> 1 data_lm.save('data_lm_export.pkl')
2 data_clas.save('data_clas_export.pkl')
~/fastai-fork/fastai/basic_data.py in save(self, file)
152 warn("Serializing the `DataBunch` only works when you created it using the data block API.")
153 return
--> 154 try_save(self.label_list, self.path, file)
155
156 def add_test(self, items:Iterator, label:Any=None)->None:
~/fastai-fork/fastai/torch_core.py in try_save(state, path, file)
406
407 def try_save(state:Dict, path:Path=None, file:PathLikeOrBinaryStream=None):
--> 408 target = open(path/file, 'wb') if is_pathlike(file) else file
409 try: torch.save(state, target)
410 except OSError as e:
IsADirectoryError: [Errno 21] Is a directory: '/home/turgutluk/.fastai/data/imdb_sample/data_lm_export.pkl'
_____________________________________________________ text.ipynb::Cell 7 ______________________________________________________
Notebook cell execution failed
Cell 7: Cell execution caused an exception
Input:
data_lm = load_data(path, 'data_lm_export.pkl')
data_clas = load_data(path, 'data_clas_export.pkl', bs=16)
Traceback:
---------------------------------------------------------------------------
IsADirectoryError Traceback (most recent call last)
<ipython-input-8-e145eb9fb246> in <module>
----> 1 data_lm = load_data(path, 'data_lm_export.pkl')
2 data_clas = load_data(path, 'data_clas_export.pkl', bs=16)
~/fastai-fork/fastai/basic_data.py in load_data(path, file, bs, val_bs, num_workers, dl_tfms, device, collate_fn, no_check, **kwargs)
275 "Load a saved `DataBunch` from `path/file`. `file` can be file-like (file or buffer)"
276 source = Path(path)/file if is_pathlike(file) else file
--> 277 ll = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
278 return ll.databunch(path=path, bs=bs, val_bs=val_bs, num_workers=num_workers, dl_tfms=dl_tfms, device=device,
279 collate_fn=collate_fn, no_check=no_check, **kwargs)
~/.conda/envs/my_fastai/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module)
364 (sys.version_info[0] == 3 and isinstance(f, pathlib.Path)):
365 new_fd = True
--> 366 f = open(f, 'rb')
367 try:
368 return _load(f, map_location, pickle_module)
IsADirectoryError: [Errno 21] Is a directory: '/home/turgutluk/.fastai/data/imdb_sample/data_lm_export.pkl'
Oh, the doc tests aren’t the ones run in the test suite; you should check with `pytest` or `make test`.
I’ll check what’s wrong with this notebook tomorrow.
How to distinguish `Collection`, `Collection[T_co]`, and `Collection[int]`? We can find all three of them in `index_row`’s doc and source:
index_row [source][test]
index_row(a:Union[Collection[T_co], DataFrame, Series], idxs:Collection[int]) → Any
def index_row(a:Union[Collection,pd.DataFrame,pd.Series],
              idxs:Collection[int]) -> Any:
I didn’t find the online docs for `typing` helpful for figuring out their distinctions. Here are my guesses:
- `Collection[int]` can be a list or tuple of integers;
- `Collection` can be a collection of any type;
- `Collection[T_co]` is a different way of writing `Collection`.
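Here is how I’d sketch my current understanding in code (the `first*` helpers are made up purely for illustration; at runtime the annotations don’t change behavior, they only matter to a static type checker):

```python
from typing import Collection, TypeVar

T_co = TypeVar('T_co', covariant=True)

# Collection        -> any collection, element type unconstrained
# Collection[int]   -> a collection whose elements must be ints
# Collection[T_co]  -> a collection of one generic element type; a checker
#                      binds T_co per call, so the element type flows through

def first_int(xs: Collection[int]) -> int:
    # accepts e.g. [1, 2] or (1, 2); a checker would reject ['a', 'b']
    return next(iter(xs))

def first(xs: Collection[T_co]) -> T_co:
    # the element type of xs becomes the return type: str here, int there
    return next(iter(xs))

print(first_int([10, 20]))  # → 10
print(first(['a', 'b']))    # → a
```

So my guess is that bare `Collection` is effectively `Collection[Any]`, while `Collection[T_co]` links the element type to the rest of the signature.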
Please help me distinguish them, thanks a lot!
@sgugger @stas
Thank you for the step-by-step instructions, Stas! I just submitted my first PR ever using your guide!
One source of uncertainty I had was that the “git Notes – fastai” page mentions:
In the `docs_src` folder, if you made changes to the notebooks, run:
cd docs_src
./run_tests.sh
You will need at least 8GB of free GPU RAM to run these tests.
But based on https://docs.fast.ai/gen_doc_main.html, it seems that one can just modify the notebook (if the edit is just changing the text, which it was in my case) and commit that. Basically I found a lot of helpful information, but I’m proactively apologizing in case my best attempt at following directions still resulted in doing the wrong thing.
You’re correct, @tank13, that was ambiguous. I have modified that step to clarify that it’s only needed if you modify code cells in the doc notebooks: https://docs.fast.ai/dev/git.html#step-5-test-your-changes
I hope the instructions are clearer now. Thank you for flagging that.
And thank you for the kind words - I’m glad you found it useful!
Hi,
Inside the Abbreviation Guide (https://docs.fast.ai/dev/abbr.html) there is no entry for the underscore prefix used for internal names.
Should this be added?
You can add it in a PR, yes. Note this is standard Python practice, not specific to us.
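For anyone adding that entry, a made-up example of the convention (the `Tokenizer` class is purely illustrative, not a fastai API):

```python
# A leading underscore marks a name as internal: not part of the public
# API, skipped by `from module import *`, and free to change at any time.
class Tokenizer:
    def tokenize(self, text):
        "Public method: callers may rely on this."
        return [self._clean(t) for t in text.split()]

    def _clean(self, token):
        "Internal helper: the underscore says 'do not call this from outside'."
        return token.lower().strip(",.")

print(Tokenizer().tokenize("Hello, World."))  # → ['hello', 'world']
```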
While I am writing docs for fastai, sometimes I would like to write a little doc for some frequently used pytorch functions as well. So I have a few questions:
- can a dev version of pytorch work well with a dev version of fastai?
- (if I decide to manually update a few pytorch files with my docs on them) how frequently does a dev version of fastai require a new version of pytorch?
- since I am using the dev version of fastai, every day I sync my fork and local master with the official repo master. Today my fastai version is 1.0.53.dev0, which I assume to be the latest, but I haven’t seen pytorch get updated so far. Does `conda update -c fastai fastai` update pytorch for me? If it can automatically update pytorch when necessary, how often does fastai require a new version of pytorch?
Thanks! @sgugger
I don’t know for PyTorch nightly as we’re developing on the latest stable release usually (since v1.0 is out). We check now and then for breaking changes in the latest nightlies but not all the time. So I’d say to use PyTorch v1.1 with fastai master.
When you sync your repo, you’re up to date with the latest (it’s rather quiet and only bug fixes at the moment as we’re developing our own v1.1). conda update should update PyTorch to the latest stable release, I don’t think it works with the nightlies.
Thanks a lot for your reply!
So, can I say that `conda update -c fastai fastai` won’t update pytorch to the latest stable version for me, but `conda update ...` does?
I have checked that my version of pytorch is 1.0.1.post2, while the latest stable version of pytorch is 1.1. However, when I tried `conda update pytorch torchvision`, I got the following response:
(fastai) ~ conda update pytorch torchvision
Collecting package metadata: done
Solving environment: done
# All requested packages already installed.
How can I update my pytorch to the latest stable version for my dev version of fastai?
Because fastai’s dependencies are already satisfied and you already have the latest fastai release, when you tell conda to update fastai, conda won’t do anything.
If you want to update specific packages that have made new releases since you installed a particular version of fastai, you need to update them explicitly, e.g., in the case of pytorch/torchvision:
conda install -c pytorch pytorch torchvision
Note: I never bother with `conda update`, since `conda install` does the same thing.
Though, you can instruct conda to update all the dependencies, so here you’d do:
conda install -c fastai -c pytorch --update-deps fastai
Or, you can update all packages in your conda environment with:
conda update -c pytorch -c fastai --update-all
(you still need to list the channels `-c pytorch -c fastai` in the above command, otherwise it’ll only check the default channel and whatever is listed in your `~/.condarc`, if anything).
Thank you very much! Very helpful!
Just read this topic today: [Solved] Reproducibility: Where is the randomness coming in?
It mentions how to get reproducible results, but it uses a doc from dev (https://docs.fast.ai/dev/test.html#getting-reproducible-results) which says: set `num_workers=1` (or 0) in your DataLoader/DataBunch.
From the second lesson and Hiromis’ notes (https://github.com/hiromis/notes/blob/master/Lesson2.md), @jeremy explicitly tells us to set the seed in order to always get the same validation set, but doesn’t say anything about `num_workers`.
So, 1) is `num_workers=1` really needed in order to get the same validation set, or is it only needed when executing the tests? And 2) should this be made explicit in the `basic_data` docs, to explain how to get the same training and validation split when needed?
Just a note: I am still learning from the lessons while trying to contribute to the project when I see something that’s missing or confusing in the docs. I hope to contribute more later on as I get more hands on experience.
I also can’t run the tests because I still haven’t got myself a GPU, which restricts me a lot. Is it possible to run them on GCP? How much time do the documentation tests need?
Thanks!
No, this is only if you want to ensure you get the same random batches when training. With `num_workers` set to more than 1, there used to be some problems with the seeds in the various processes, but I think this has been fixed now.
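A minimal sketch of the seed point, using plain `random` rather than fastai’s actual split code: seeding right before the shuffle pins down which items land in the validation set, independently of `num_workers`:

```python
import random

def valid_idxs(seed):
    # stand-in for a random train/valid split of 10 items
    random.seed(seed)       # set the seed just before splitting
    idxs = list(range(10))
    random.shuffle(idxs)
    return idxs[:2]         # the "validation set" indices

print(valid_idxs(42) == valid_idxs(42))  # → True (same split every run)
```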
Note that the basic tests (run with `pytest` in the fastai folder) don’t need a GPU.
It’s been quite a while; I was wondering if there are any doc improvement needs before the new course starts. I see a list of pages that Sylvain suggested could be improved, but I’m guessing many of those pages have been updated already. Thanks!
The new course will use fastai version 2, so we’ll focus on that documentation.
Hi @sgugger, is this still ongoing? I’m not sure if this is an appropriate place to make a suggestion, but I’m doing the online course, and in lesson 7 Jeremy highlights that in this function: `tfms = ([*rand_pad(padding=3, size=28, mode='zeros')], [])` the asterisk operator is used because the `rand_pad` function returns 2 transforms. But when I checked the documentation here to understand what he meant, I didn’t see anything highlighting what `rand_pad` actually returns. It wasn’t until I checked the actual code in the fast.ai library that it became clear to me. I’m wondering if adding a description of what each transform actually returns to the documentation would be helpful to readers. If so, I’d be happy to contribute.
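To show what tripped me up, here’s a toy stand-in for `rand_pad` (not the real function) that also returns a list of two transforms, illustrating why the asterisk is needed:

```python
# A made-up function returning two "transforms" as a list, like rand_pad does.
def make_two_tfms(padding, size):
    return [f"pad({padding})", f"crop({size})"]

nested = [make_two_tfms(3, 28)]   # without *: the whole list is nested inside
flat = [*make_two_tfms(3, 28)]    # with *: the two transforms are spliced in

print(nested)  # → [['pad(3)', 'crop(28)']]
print(flat)    # → ['pad(3)', 'crop(28)']
```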
As said above, we are focusing on the documentation of v2 right now. Still happy to take any PR that makes something better in v1.