Developer chat

(Marco Ribeiro) #313

@sgugger Hi,

After this commit I’m getting this error:

/media/ssd32gb/Arquivos/MachineLearning/fastai-repo/fastai/ in on_train_begin(self, pbar, metrics_names, **kwargs)
    214         self.pbar = pbar
    215         self.names = ['epoch', 'train_loss', 'valid_loss'] + metrics_names
--> 216         self.pbar.write('  '.join(self.names), table=True)
    217         self.losses,self.val_losses,self.lrs,self.moms,self.metrics,self.nb_batches = [],[],[],[],[],[]

TypeError: write() got an unexpected keyword argument 'table'


You also have to update fastprogress to 1.0.12.

Fixing notebook 1: ConvLearner not found
(Stas Bekman) #315

(Jeremy Howard (Admin)) #316

@dhoa if you’re wanting to create a PR, try using hub

You can’t push directly to our repo - if you want to save your work to github, you’ll need to make a fork.

(Wayne Nixalo) #317

Found a weird issue trying to use download_images in the course-v3/nbs/dl1/download_images notebook. It’ll fail silently most of the time and occasionally throw a BrokenProcessPool error when trying to download images. I found a workaround: if you run the core function it calls, download_url, directly, download_images starts working again.

My guess is something to do with threading and ProcessPoolExecutor not being able to ‘touch’ download_url, or something. I don’t know how to fix that right now, so I opened an issue with a gist to replicate the bug and the workaround.

I just git pulled the fastai & course-v3 repos and re-did it to make sure.

(Marc Rostock) #318

If I use DataBunch.from_something(), the size=xx parameter is completely ignored unless I also specify transformations. So how do I get simple resizing of my images without other transforms?
This may be by design, but it is very unintuitive as an API. If I can pass a size kwarg to the DataBunch method, it should not depend on the user also specifying some transformations, I think. Suggestion: if size is present and ds_tfms is not, automatically generate a resize transform. What do you think?

PS: my workaround currently looks like this. Is there a better way of simply resizing?!

ds_tfms=([rotate(degrees=0, p=0.0)], [rotate(degrees=0, p=0.0)]), size=64
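The suggestion above could be sketched as plain Python (resolve_tfms is a hypothetical helper, not fastai’s actual API): when size is given but ds_tfms is not, fall back to empty train/valid transform lists so resizing alone is applied, with no augmentation.

```python
# Hypothetical sketch of the suggested behavior, not fastai's real code:
# if `size` is passed but `ds_tfms` is not, synthesize empty transform
# lists so that resizing still happens without any augmentation.
def resolve_tfms(ds_tfms=None, size=None):
    if size is not None and ds_tfms is None:
        ds_tfms = ([], [])  # (train_tfms, valid_tfms): no-op, resize only
    return ds_tfms, size
```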

Resize but no crop (or flip)
(Dien Hoa TRUONG) #319

Is it ok if I add an assert in show_batch to check that batch_size >= rows**2? I have just encountered this problem :smiley:
Like below:
Like below:

def show_batch(self:DataBunch, rows:int=None, figsize:Tuple[int,int]=(12,15), is_train:bool=True)->None:
    bs = self.train_dl.dl.batch_size if is_train else self.valid_dl.dl.batch_size
    assert bs >= rows**2, f"batch_size ({bs}) must be >= rows**2 ({rows**2})"
    show_image_batch(self.train_dl if is_train else self.valid_dl, self.classes,
        denorm=getattr(self,'denorm',None), figsize=figsize, rows=rows)
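The intent of the check can be shown standalone (check_rows is a hypothetical helper, not fastai code): show_batch lays images out in a rows x rows grid, so the batch must contain at least rows**2 samples.

```python
def check_rows(batch_size: int, rows: int) -> bool:
    # a rows x rows grid displays rows**2 images, so the batch must
    # contain at least that many samples to fill it
    return batch_size >= rows ** 2
```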


Careful, wrn_22 is not supposed to be used in a ConvLearner. It’s our implementation for training on CIFAR-10, not a pretrained model.

(Dien Hoa TRUONG) #321

In the callbacks.hooks docs, I think the axis order documented for learn.activation_stats.stats.shape is wrong.

It indicates that:
The saved stats is a FloatTensor of shape (2,num_batches,num_modules) . The first axis is (mean,stdev) .

But it should be (2,num_modules,num_batches), right?
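A plain-Python illustration of the claimed ordering (a sketch with dummy values, not the fastai object itself): with shape (2, num_modules, num_batches), unpacking the first axis gives per-module series of means and stdevs.

```python
# Dummy stand-in for the stats tensor, shaped (2, num_modules, num_batches):
# stats[0] holds means, stats[1] holds stdevs, each num_modules x num_batches.
num_modules, num_batches = 3, 5
stats = [[[0.0] * num_batches for _ in range(num_modules)] for _ in range(2)]
means, stdevs = stats  # first axis is (mean, stdev)
assert len(means) == num_modules and len(means[0]) == num_batches
```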


(Likhit) #322

Thanks for pointing this out. I was confused too, because the example right below it contradicts the stated shape.

(Stas Bekman) #323

And there is a new guide on the block: How to Make a Pull Request (PR) (specific to fastai needs).

Comments and improvement suggestions/PRs are welcome.

(Stas Bekman) #324

There was little feedback on my comment about bs presets (Developer chat), but the issue I raised wasn’t really about bs; it goes beyond that. It applies to any change in the user’s notebook: even if we magically figure out bs so that users won’t need to change it, other user edits will hit the same problem.

Therefore, I’d like to generalize the subject beyond the specifics of that comment, i.e. unrelated to setting bs.

There must be a good way for a user (me) to git pull without overwriting the local copy of a notebook, while still being able to integrate the new changes. Notebooks are very difficult to collaborate on, even with our magical strip-out filters, due to their JSON format. The two issues I encounter are:

  1. git pull failing because of the local changes.

    git stash wouldn’t work well, because git stash pop is guaranteed to produce many conflicts.

    I’m thinking of never changing any course notebook under git, and instead making a copy and changing that. Then I can always git pull without problems.

  2. how to actually merge the updated notebook with my local changes. I guess nbmerge is needed to make that less painful; it can also be configured to run on git stash pop, so it could fix both issues at once.

Does anybody have other good strategies for having the cake and eating it too?
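The pain here comes from notebooks being JSON. As a stdlib-only sketch of why the strip-out filters help (cell_sources and same_sources are hypothetical helpers): comparing only cell sources ignores the outputs and execution counts that make raw JSON diffs so noisy.

```python
import json

def cell_sources(nb_json: str):
    # keep only the code/markdown source of each cell, dropping the
    # outputs and execution counts that churn on every run
    nb = json.loads(nb_json)
    return [''.join(c.get('source', [])) for c in nb.get('cells', [])]

def same_sources(nb_a: str, nb_b: str) -> bool:
    # two notebooks are "the same" for collaboration purposes when their
    # cell sources match, regardless of output differences
    return cell_sources(nb_a) == cell_sources(nb_b)
```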

(Stas Bekman) #325

There is a new tool that makes PRs much easier to create. It will magically handle forking, syncing the forked master, and then making a branch. It will clone the repo or use an existing checkout.


Please give it a try and send me feedback if you encounter any problems or want more magic that I haven’t thought of yet:

curl -O
chmod a+x fastai-make-pr-branch
./fastai-make-pr-branch https your-github-username fastai new-feature

Most developers will probably want ssh instead of https:

./fastai-make-pr-branch ssh your-github-username fastai new-feature

It’s in the fastai repo’s tools/ dir, so you can just git pull and run it directly via tools/fastai-make-pr-branch instead.

Run it without arguments for help:

This program will checkout a forked version of the original repository, sync it with the original, create a new branch and set it up for a PR.


fastai-make-pr-branch auth user_name repo_name new_branch_name

  auth:   ssh or https (use ssh if your github account is setup to use ssh)
  user:   your github username
  repo:   repository name to fork/use
  branch: name of the branch to create


fastai-make-pr-branch ssh myusername fastai new-feature-branch

- if the original repository hasn't been forked yet, the program will fork it (it will ask you for your github password)
- if the program is run from a directory that already contains a checked-out git repository matching the parameters, it will re-use it instead of making a new checkout
- if the requested branch already exists, it'll reuse it
- if your fork's master is out of sync with the original master, it'll sync it first

(Kaspar Lund) #326

Thanks. got it

class WrnLearner(Learner):
    def __init__(self, data:DataBunch, arch:Callable, **kwargs:Any)->None:
        torch.backends.cudnn.benchmark = True
        model = arch()
        super().__init__(data, model, **kwargs)
        apply_init(model, nn.init.kaiming_normal_)

learn = WrnLearner(data, wrn_22, metrics=error_rate)

(Kai Lichtenberg) #327

@Taka it’s implemented in the current release!

(Likhit) #328

Nice, did you send in a PR?

(Kai Lichtenberg) #329

No, it was done by Jeremy

(nok) #330

@sgugger I encounter a ZeroDivisionError: division by zero when I use one_cycle with the conda fastai 1.0.14.

Basically I have a big image set, so I copied 100 images to a sample folder; you can see the file structure in the notebook. I keep getting this zero-division error and am struggling to resolve it. What surprises me is that on the full dataset it runs smoothly without error.

=== Software === 
python version  : 3.7.0
fastai version  : 1.0.14
torch version   : 1.0.0.dev20181022
nvidia driver   : 396.44
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 11441MB | Tesla K80

=== Environment === 
platform        : Linux-4.9.0-8-amd64-x86_64-with-debian-9.5
distro          : #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08)
conda env       : fastai
python          : /home/mediumnok/.conda/envs/fastai/bin/python
sys.path        : 


I’m thinking there are not enough iterations: is your dataset small? Maybe try more than 1 epoch.

(nok) #332

Thanks for your always-quick response. I guess you mean more than 1 batch here.
I slowly reduced the batch size until there were at least 4 batches in an epoch, and the problem disappeared. Last time I got a ZeroDivisionError it was due to the RandomLight transformation; I thought I was running into that issue again until I realized it was something else.

Maybe I could do a small PR that throws a clear error to stop users from running this on a too-small dataset, instead of letting it hit a ZeroDivisionError?
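Such a guard could look roughly like this (check_enough_batches is a hypothetical function, not fastai’s API): fail early with a readable message instead of letting the scheduler divide by zero.

```python
def check_enough_batches(n_items: int, batch_size: int, min_batches: int = 1) -> int:
    # one-cycle schedules interpolate across batches, so too few batches
    # per epoch can surface as a ZeroDivisionError deep in the scheduler
    n_batches = n_items // batch_size
    if n_batches < min_batches:
        raise ValueError(
            f"only {n_batches} batch(es) per epoch, need at least {min_batches}; "
            "reduce batch_size or use more data")
    return n_batches
```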

You can find my attempt to reproduce this error based on examples/vision.ipynb; you can put this notebook in the examples/ directory and run it to reproduce the error. It creates a subfolder “sample” in the original directory and copies 50 images for each class.

├── models
├── sample
│   ├── models
│   ├── train
│   │   ├── 3
│   │   └── 7
│   └── valid
│       ├── 3
│       └── 7
├── train
│   ├── 3
│   └── 7
└── valid
    ├── 3
    └── 7