if len(folders) !=0 and i==0 and '.' not in folders: continue
The folders variable here represents the folders we want to recurse into. The line says: if we are at the top level (the first step of os.walk()), folders to recurse into have been specified, and '.' is not in that list, then do not include files from the current directory. So this works as you would expect: get_files(path, folders=['A','B']) will get you all the files from path/'A' and path/'B'. But if you call get_files(path, folders=['A','B','.']), it will also include the files directly under path.
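To check this behaviour without installing fastai, here is a minimal sketch. The function list_files and the helper make_tree are hypothetical names of mine, and the loop body is a simplified reconstruction of the logic described above (no extension filtering, no followlinks), not the library's actual source.

```python
import os
import tempfile
from pathlib import Path

def list_files(path, folders=()):
    """Hypothetical, simplified stand-in for the folders logic in get_files."""
    res = []
    for i, (p, d, f) in enumerate(os.walk(path)):
        if folders and i == 0:
            # Top level: only descend into the requested folders.
            d[:] = [o for o in d if o in folders]
        else:
            # Deeper levels: skip hidden directories.
            d[:] = [o for o in d if not o.startswith('.')]
        # The line under discussion: skip top-level files unless '.' was requested.
        if folders and i == 0 and '.' not in folders:
            continue
        res += [Path(p) / name for name in f]
    return res

def make_tree(root):
    """Build a small throwaway directory tree for the demo."""
    for rel in ['top.txt', 'A/a.txt', 'B/b.txt', 'C/c.txt']:
        fp = Path(root) / rel
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_text('x')

with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    without_dot = sorted(p.name for p in list_files(root, folders=['A', 'B']))
    with_dot = sorted(p.name for p in list_files(root, folders=['A', 'B', '.']))

print(without_dot)  # files from A and B only
print(with_dot)     # the same files plus top.txt
```

Folder C is never visited in either call, and top.txt only shows up once '.' is in the list.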
As a side note, I actually found the previous three lines more mind-bending: we are modifying a local variable d that is never used in the code. Only after some debugging did I find this in the Python docs:
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
So you are basically altering a value yielded from a function to change its future yields.
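A small experiment makes the in-place part concrete. This is only a sketch: the function visited and the directory names keep and skip are made up for the demo. It compares slice assignment (which mutates the list that os.walk is holding) with plain rebinding of the local name (which os.walk never sees).

```python
import os
import tempfile
from pathlib import Path

def visited(root, in_place):
    """Return the directories os.walk visits, relative to root."""
    seen = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        seen.append(os.path.relpath(dirpath, root))
        keep = [d for d in dirnames if d != 'skip']
        if in_place:
            dirnames[:] = keep   # mutates the list os.walk will read next
        else:
            dirnames = keep      # only rebinds our local name; no effect on the walk
    return sorted(seen)

with tempfile.TemporaryDirectory() as root:
    for sub in ['keep', 'skip']:
        (Path(root) / sub).mkdir()
    pruned = visited(root, in_place=True)
    unpruned = visited(root, in_place=False)

print(pruned)    # 'skip' is never entered
print(unpruned)  # 'skip' is still visited
```

That is why the fastai code writes d[:] = ... rather than d = ...: the second form would silently do nothing.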
I propose we have a study group meeting tomorrow, about 12 hours from now, and spend a few hours digging into the DataBlocks API.
Trust me when I say that the DataBlocks API is a rabbit hole and we will be sucked into dataloaders, datasets, TfmdLists, Transforms and what not.
But believe me, this will be worth it, as we can then spread out as a group and help the fastai library with documentation and more examples.
I have already started writing basic blog posts about the high-level DataBlocks API, but it would be great to replicate everything with the mid-level API, without DataBlocks.
Excellent, I’ve been inactive too but let’s get right back in it.
A basic understanding of datasets/dataloaders would do. Personally, I get stuck on fastai's many Python tricks rather than on PyTorch or the code as such; for me it's mostly the decorators and class methods that get magically patched and make the whole library work seamlessly.
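For anyone else tripping over the patching, here is a rough sketch of the idea in plain Python. The decorator attach_to and the class Box are hypothetical names for illustration only; fastcore ships its own patch decorator, and I am assuming it works broadly along these lines rather than reproducing its code.

```python
def attach_to(cls):
    """Hypothetical decorator: add the decorated function to cls as a method."""
    def decorator(func):
        # After this, every instance of cls can call func as a normal method.
        setattr(cls, func.__name__, func)
        return func
    return decorator

class Box:
    def __init__(self, items):
        self.items = list(items)

# Box is defined above without a total method; it is bolted on afterwards.
@attach_to(Box)
def total(self):
    return sum(self.items)

result = Box([1, 2, 3]).total()
print(result)
```

So when a method seems to come from nowhere in the library, it is worth searching for a decorator like this defined in another file.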
Jupyter has launched a visual debugger. Its main functionalities: you can set breakpoints in notebook cells and source files, inspect variables, and navigate the call stack. Here is the blog post link.