Developer chat

Yes, I do use the bleeding edge version w/developer install.
Sorry about the confusion, I was referring to Sylvain’s Developer chat
" Merged a big change: Learner objects now determine from the loss function…"

I (tried to) update to that merge, but the version label was still 1.0.15, so I was not sure whether I got the newer version (with Sylvain’s changes) or the previous version without them.
I understand that both before and after the merge are not “releases”, so both are still 1.0.15… so adding something like 1.0.15.20181028 would help differentiate between them.
Just a small bandaid for those at the bleeding-edge…

Thank you for trying the new tool, @nok.

First, to explain how it currently works:

It clones the repo into wherever you are running it from.

It first checks whether the directory you’re in is already a clone of the repo you want; if so, it doesn’t clone again but re-uses the current checkout. The logic is to compare the output of:

git config --get remote.origin.url

with the url you are asking for, so for example if I’m inside the original fastai repo, the above command will return:

git@github.com:fastai/fastai.git

but if I’m asking for the fork of the same, which in my case would be git@github.com:stas00/fastai.git, then it can’t reuse that checkout and must make a new one. And so it does.

However if I’m already inside a checkout that matches: git@github.com:stas00/fastai.git and I am invoking tools/fastai-make-pr-branch for the same repo, it will not do a new checkout and use the current one instead.
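In other words, the reuse decision boils down to a string comparison against the current origin URL. A minimal Python sketch of that logic (the actual script is bash; function names here are illustrative):

```python
import subprocess

def current_remote_url() -> str:
    """Return the origin URL of the checkout we are standing in ('' if none)."""
    try:
        out = subprocess.run(
            ["git", "config", "--get", "remote.origin.url"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # not a git checkout, or git is unavailable
        return ""

def can_reuse_checkout(wanted_url: str) -> bool:
    """True only when the current directory already tracks the repo we want."""
    return current_remote_url() == wanted_url
```

So standing inside `git@github.com:fastai/fastai.git` while asking for `git@github.com:stas00/fastai.git` compares unequal, and a fresh clone is made.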


Now to how we can improve usability. I think the issue is that when you call it from inside the fastai repo, with tools/fastai-make-pr-branch, that’s where it will create the new clone. So ideally it should not be called that way, but as explained here: https://docs-dev.fast.ai/git.html#helper-program

curl -O https://raw.githubusercontent.com/fastai/fastai/master/tools/fastai-make-pr-branch
chmod a+x fastai-make-pr-branch
./fastai-make-pr-branch https your-github-username fastai new-feature

another approach is to position yourself in the base directory where you want the clone to happen:

cd fastai
cd ..
fastai/tools/fastai-make-pr-branch https your-github-username fastai new-feature

or put the script somewhere in your $PATH, so that you could invoke it from anywhere.

or we could instrument it to take an extra argument so that the user can specify where the output should go. So, say, if you do call it from the fastai checkout folder, you could do:

cd fastai
tools/fastai-make-pr-branch https your-github-username fastai new-feature ../put-it-here

Thoughts?

Thank you for the clarification, @gsg. We already have that mechanism in place. It’s .dev0. If you have 1.0.15 then you’re using a released version; if you now do a dev install, you will get 1.0.16.dev0 after git pull - and now you’re on the bleeding edge. You weren’t before.

The timeline is:

...
1.0.14
1.0.15.dev0
1.0.15
1.0.16.dev0
...
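Given that scheme, a quick way to tell a release from a bleeding-edge install is to look for the .dev suffix in fastai.__version__. A tiny illustrative sketch (plain strings, not fastai code):

```python
def is_dev_version(version: str) -> bool:
    """True for developer installs like '1.0.16.dev0', False for releases like '1.0.15'."""
    return ".dev" in version

print(is_dev_version("1.0.15"))       # released version -> False
print(is_dev_version("1.0.16.dev0"))  # bleeding edge -> True
```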

Thanks for the follow-up @stas.
My understanding now is that if we do the developer install

git clone https://github.com/fastai/fastai
cd fastai
tools/run-after-git-clone
pip install -e .[dev]

twice, and fastai.__version__ has not changed between the two installs, then there have been no changes to the fastai code.

In the above case, since it stayed the same, e.g., 1.0.15.dev0,
this indicates that the changes Sylvain announced Sunday morning were not yet in the latest bleeding edge… (or that they were already in Saturday’s git pull)
Correct?

See the updated doc here: https://docs-dev.fast.ai/develop.html#development-editable-install

You only need to do it once. After that you just do git pull and nothing else.

If you do git pull right now you will see 1.0.16.dev0, so now you’re on the bleeding edge and all the commits should be there. Please double check that it’s the case.

You can also run:

git log --oneline

in your checkout to see a short log of everything you have. If you want it pretty:

git log --graph --decorate --pretty=oneline --abbrev-commit

* 4c11bce (HEAD -> master, origin/master, origin/HEAD) improvements:
* 3856935 document version+timeline, adjust levels
*   ffa3f50 Merge branch 'master' of github.com:fastai/fastai
|\
| * 32377a3 new dev cycle: 1.0.16.dev0
| * 0a4e629 CHANGES
| * ef15dda rename create_cnn
* | 67d2ff3 require dev install for PR, plus run-after-git-clone, and split steps
* | b56c79a require coverage for dev, needed for testing on fastai and fastai_docs
* | 2f98955 azure support links
|/
*   14c02c2 Merge branch 'master' of github.com:fastai/fastai
|\
| * 70cb432 Add maybe copy tests (#980)
* | a456e56 Remove model type
|/
* fbd6235 Learner.create_cnn
*   768d606 Merge branch 'master' of github.com:fastai/fastai
|\
| * 2d63ae4 Update CHANGES
| * c037f61 Fix pred_batch
| * 644cb64 create x in cuda for model_sizes() (#990)
* | 01aec14 Learner.create_cnn
|/
* a1ff5c2 Auto activ (#992)
* 7da5bd3 SegmentationDataset classes
* bc255a8 document the issue with missing libcuda.so.1
* 2735255 document gpustat, and nvidia-smi dmon -s u (forum tips)
* bd62fdb add jekyll templates in the package
* 186739f Ensure that plot_pdp accepts an axis name. Fixes #986. (#987)
* ab4a39b Fix saleElapsed vs YearMade interaction plot in ml1/lesson2-rf_interpretation. Fixes #988. (#989)
* 0de3384 move property
*   7d68137 recurse flag

plus, there is CHANGES.md where important changes like bugfixes are logged.

Just pushed 1.0.15. Main change (from CHANGES.md):

ConvLearner ctor is replaced by a function called create_cnn
If you do git pull right now you will see 1.0.16.dev0, so now you’re on the bleeding edge and all the commits should be there. Please double check that it’s the case.

Confirmed!
Bleeding again… :slight_smile:
Thanks!!


How do we go about creating a TODO/HELP-WANTED list, and invite others to contribute on tasks that need to be done?

I have one item that is up for grabs: https://docs-dev.fast.ai/git.html#hub (see HELP-WANTED there). It should be a fun little project, shouldn’t take more than a few hours to figure out. I laid out all the details, and it just needs to be coded in python to support windows users w/o bash.

@stas Perhaps a simple way would be to create a “TODO/HELP-WANTED” category in the Forum.
Each entry under the category could follow a template (eg like the bug reports template),
that describes the requirements, such as, in the above example, “Windows Platform”, etc.
Others may then state their interest and even create small groups to tackle the task together as a teaching opportunity…

New big change: introduced the data block API. Jeremy will explain it more on Tuesday and I’ll document it tomorrow, but the basic idea is that it lets you plug together the different parts of creating a DataBunch as you want, with a lot more flexibility than the current factory methods. Specifically, you tell it:

  • where the filenames are (if applicable)
  • how to determine the label of each input (re pattern, folder names, csv file…)
  • how to create a validation set (random split, folder names, valid indexes…)
  • what Dataset class to apply (ImageDataset, ImageMultiDataset, SegmentationDataset…)
  • which transforms to apply (if applicable)
  • how to databunch it (which is where you set the batch size, the dl transforms…)

Examples are in the 104a and 104b notebooks in the dev folder, but here are a few of them:

Pets datasets from lesson 1

path = untar_data(URLs.PETS)
tfms = get_transforms()
data = (InputList.from_folder(path/'images')
        .label_from_re(r'^(.*)_\d+.jpg$')
        .random_split_by_pct(0.2)
        .datasets(ImageClassificationDataset)
        .transform(tfms, size=224)
        .databunch(bs=64))

Classic dogscats in an Imagenet style folder structure

path = Path('data/dogscats')
tfms = get_transforms()
data = (InputList.from_folder(path)
        .label_from_folder()
        .split_by_folder()
        .datasets(ImageClassificationDataset)
        .transform(tfms, size=224)
        .databunch(bs=64))

Planet dataset (multiclassification problem with labels in a csv file)

path = untar_data(URLs.PLANET_SAMPLE)
tfms = get_transforms()
data = (InputList.from_folder(path)
        .label_from_csv('labels.csv', sep=' ', suffix='.jpg', folder='train')
        .random_split_by_pct(0.2)
        .datasets(ImageMultiDataset)
        .transform(tfms, size=128)
        .databunch(bs=64))

Camvid (segmentation tasks with segmentation masks in another folder):

path = Path('data/camvid')
path_lbl = path/'labels'  # folder holding the segmentation masks
get_y_fn = lambda x: path_lbl/f'{x.stem}_P{x.suffix}'
codes = np.loadtxt(path/'codes.txt', dtype=str)
tfms = get_transforms()
data = (InputList.from_folder(path/'images')
        .label_from_func(get_y_fn)
        .split_by_fname_file('../valid.txt')
        .datasets(SegmentationDataset, classes=codes)
        .transform(tfms, size=128, tfm_y=True)
        .databunch(bs=64))

Facing the same issue, after the latest pull
NameError: name 'ConvLearner' is not defined

Did you make use of create_cnn from vision.learner then?

When I used that like:

learn = create_cnn(data, models.resnet34, metrics=error_rate)

I get the following error:

AttributeError: module 'fastai.vision.data' has no attribute 'c'


Thanks for the new camvid notebook - so elegant.

running:
train_ds = SegmentationDataset(train_fns, y_train_fns)
valid_ds = SegmentationDataset(valid_fns, y_valid_fns)

i get this:

TypeError                                 Traceback (most recent call last)
----> 1 train_ds = SegmentationDataset(train_fns, y_train_fns)
      2 valid_ds = SegmentationDataset(valid_fns, y_valid_fns)

TypeError: __init__() missing 1 required positional argument: 'classes'

I believe it is fixed by adding codes as the classes argument?:

train_ds = SegmentationDataset(train_fns, y_train_fns, codes)
valid_ds = SegmentationDataset(valid_fns, y_valid_fns, codes)

Also there is a:

learn.unfreezefreeze()

that should be changed to:

learn.unfreeze()

A small suggestion would be to use an explicit mapping from segments to classes using a dict with:

  • key as the class’s pixel value in the mask
  • value as the class name
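For instance, such a mapping could look like this (class names and pixel values purely illustrative):

```python
# Hypothetical explicit mapping: mask pixel value -> class name
pixel2class = {0: "void", 1: "building", 2: "tree", 3: "road"}

def class_of(pixel_value: int) -> str:
    """Look up the class for a mask pixel; unknown values raise KeyError."""
    return pixel2class[pixel_value]
```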

You should restart your notebook and make sure you define data somewhere, as Python believes it’s the data module of fastai.vision, judging from your error message.
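A minimal illustration of that shadowing, using a stand-in module object (names are illustrative, no fastai needed):

```python
import types

# If you never bind `data` yourself, the name can end up referring to the
# fastai.vision.data *module* instead of a DataBunch -- and a module has no `c`.
data = types.ModuleType("fastai.vision.data")

print(hasattr(data, "c"))  # False: hence "module ... has no attribute 'c'"
```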

Thanks - note that this is a work in progress so no need to give feedback until it’s done. It’s not in a working state yet.

Thanks for your detailed explanation, I always learn a lot from the tools/tricks that you share.

For now, it feels natural to call it within a fastai repo, but I don’t see anything stopping this tool from being used outside fastai. I am not good at bash scripting, but I saw orig_user_name=fastai inside the script, so from my understanding we can simply change this line and use it in other open source projects as well. So putting it in $PATH makes perfect sense.

I will give it a few more trials and see if I will come back with any question. :slight_smile:

Thanks again.

It reminds me of the Processing language.

I have the developer version installed with the latest pull. I get the following error when trying to create a cnn (from the docs):

AttributeError: type object 'Learner' has no attribute 'create_cnn'

Any ideas?

create_cnn is a function, it’s not inside Learner.

Do the docs need to be updated then? It says here:

learn = Learner.create_cnn(data, models.resnet18, metrics=accuracy)

Yes, orig_user_name can be made into a parameter and then you could use the script with any github project.
That’s why I called it fastai-make-pr-branch - as it hardwires the fastai user :wink:
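For illustration, the parameterization could be sketched like this (a Python stand-in for the bash logic; the function name is an assumption, not part of the script):

```python
def upstream_url(protocol: str, user: str = "fastai", repo: str = "fastai") -> str:
    """Build the upstream clone URL; making `user` a parameter would replace
    the hardwired orig_user_name so the script could serve any GitHub project."""
    if protocol == "ssh":
        return f"git@github.com:{user}/{repo}.git"
    return f"https://github.com/{user}/{repo}.git"
```

With that in place, `upstream_url("https", "numpy", "numpy")` would target a different project without editing the script.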

The only custom thing in the script is that it runs tools/run-after-git-clone if it finds it in the repo.
