V1 API changes too drastic

I notice the API for v1 is changing. Surely it should be frozen for the whole of version 1, so the 1.0.20 API should be the same as the 1.0.30 API?
E.g. the docs still say to create a unet learner like this:

Learner.create_unet(data, models.resnet18)

but it is actually like this now:

unet_learner(data, models.resnet34, metrics=metrics, wd=wd)

How do the version numbers work?

6 Likes

Please check the recent breaking changes in fastai here:

Thanks for that.
However, I would not expect the API to change at all within a major version. If the API changes, you would usually release a version 2.x, 3.x etc., not a minor point release.

2 Likes

Currently fastai makes incompatible API changes in patch versions, which are typically reserved for bug fixes only.

Maybe I should stick to Keras for now, or use PyTorch, and wait for version 2.x when things have settled down?

2 Likes

As this was raised recently again at https://github.com/fastai/fastai/issues/1739, I will share my thoughts on this subject matter.

History is usually a good predictor of the future.

Therefore, given that Sylvain and Jeremy are always at the cutting edge, and often a bit ahead of it, it's very unlikely that the API will ever get stable. So unless you want to stifle their creativity, you should just accept that. The core API may get more stable, but some parts of it (think domain-specific API) will always be in flux. DL is a baby, and as it grows fast it constantly needs new clothes, if that helps as a metaphor from the physical world. Perhaps in 10-20 years it'd be easy to come up with stable APIs.

Currently the only way to avoid rewriting your application code every few weeks is to pick a version and stick to it. Of course, this is not a great solution, since it prevents you from getting bug-fix updates without breaking your code.
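For example (the version numbers here are only illustrative), pinning an exact release keeps an environment from silently picking up breaking changes:

pip install fastai==1.0.46
conda install -c fastai fastai=1.0.46

The trade-off, as noted above, is that a pinned install also never receives bug fixes until you deliberately move the pin.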

The better solution would require several knowledgeable people to become stable branch maintainers who will backport bug-fixes from the dev branch, but will not change the API. This is how it’s done in all big projects. There is no need to reinvent the wheel.

That way Jeremy and Sylvain can continue going full speed at being creative and innovative, and someone else will be sorting out how to keep the stable branch great and stable at the same time.

So the ideal situation is one where the lessons always rely on the unstable master branch, and production code relies on the stable branches.

Documentation also has a problem right now, since the docs always match the git master branch and not any previously released version, so it’s hard to rely on such docs if you are forced to use some older version (or even the last released version). So the docs will also need to be branched to match the code and the website updated to reflect that.

One more thing will be GitHub Issues and PRs - it'll be a mess trying to use the same GitHub repo for dev and stable branches. So most likely either these will need to become forks that don't share the same Issue/PR entry points, or perhaps they can somehow be set up to make it clear to submitters that this entry point is for a stable branch 1.2.x and that one is for the cutting-edge master 1.3.x. I'm sure we can copy what other projects do.

Bottom line is that if you want stability, this project needs a bigger team of knowledgeable, dedicated developers/maintainers. There is no way Jeremy and Sylvain will be able to give you the best in this domain and support older versions at the same time. I'm sure they would have liked to be able to do that too, but there is that problem of having only so many hours in a day.

These are of course my thoughts alone. It’s quite possible that Jeremy and Sylvain will disagree with everything or parts of what I have shared.

3 Likes

Semantic versioning will help communication, and is especially useful for a fast-moving project.

1 Like

I can look at getting more involved with the software engineering aspects - tooling, continuous builds and releases, etc. - if needed.

1 Like

With the point of view provided by @stas, I stand corrected. Semantic versioning will not help.

E.g. it is true that a 1.x1.y1 -> 2.x2.y2 bump conveys breaking changes, and therefore the consumer will be extra aware - which was the point of my filing 1739. Finding Image*List nuked between patch versions, for example, was a rather rude shock for me.

However, given that we want to iterate rapidly - even if we agree to follow semantic versioning - every new release will have breaking changes.

And even if there is 1 breaking change per release, instead of 1.0.46 -> 1.0.47, we will have 46.0.0 -> 47.0.0.

Which doesn’t help at all - unless we follow a release-branching-and-backporting process, which will require a non-trivial amount of resources.

Given that the fastai lib is not ‘go-live’ at the scale of keras/tf/etc., that amount of resources has no ROI.

1 Like

It is quite simple: don’t make a major release until you are happy with the API. Also, using feature and/or release branches is trivial with git.

1 Like

FYI, there will be a major stable release when the second version of the MOOC is released this summer.

4 Likes

fastai hasn’t made any major releases since the complete rewrite, but how does that help?

Also, using feature and/or release branches is trivial with git.

Doing git branch is trivial. Maintaining it and backporting fixes is far, far from trivial. It is very time-consuming and requires library knowledge. Plus, see my notes on docs and git issues/PRs. So unfortunately, this is not quite so simple.

The fastai library as it is right now specializes in training models and everything involved in that, starting from raw datasets. Many of its components are intended for experimenting, fast prototyping, and interactive computing, most of which have no use in production. Here I am talking about methods like show_batch, plot_confusion_matrix, those from the data block API, and all the callbacks for training-loop customization. As I am typing right now, I don’t even know if they are still called that, but I know that they exist and I can easily find them in the docs. So regarding my use of the library, I really don’t care that much if there is a breaking API change. Most are pretty easy to spot and fix.

This really does not stop you from using the library even if you want to use it for production purposes. Once you are happy with the training, you could export the trained PyTorch model and then stay comfortably in the stable production world of PyTorch.
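As a rough sketch of that hand-off (assuming learn is a trained fastai v1 Learner whose underlying model is traceable; attribute names may differ between versions), you can freeze the underlying PyTorch module with TorchScript and serve it with PyTorch alone:

import torch

model = learn.model.eval().cpu()          # the underlying plain torch.nn.Module
example = torch.randn(1, 3, 224, 224)     # dummy input matching your image size
traced = torch.jit.trace(model, example)  # freeze the model into TorchScript
traced.save('export.pt')

# production side: no fastai import needed from here on
scripted = torch.jit.load('export.pt')
preds = scripted(example)

Once the model lives in a file like this, the training code can keep chasing the latest fastai API without affecting what is deployed.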

I am not saying that quality software engineering maintenance is not desired. It certainly is. However, it requires a lot of man-hours, as @stas has explained in great detail.

1 Like

I can look at getting more involved with the software engineering aspects - tooling, continuous builds and releases, etc. - if needed.

Thank you for the offer, @ptah. Currently we have this aspect fully automated and no work needs to be done there. It will require re-tooling work should we start maintaining a stable branch, so if and when that happens your help would be very welcome.

BTW, we do have a git release branch for every release we have made, e.g. the latest one is https://github.com/fastai/fastai/tree/release-1.0.46, so you can test your skills and evaluate how much work it takes by volunteering to be a branch manager. You can make any release branch production-quality stable by becoming its maintainer and backporting bug fixes from master to it.
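Mechanically, a backport is mostly cherry-picking the fix from master onto the release branch (the commit hash below is a placeholder):

git fetch origin
git checkout release-1.0.46
git cherry-pick <sha-of-the-bugfix-commit-on-master>   # resolve conflicts if the branches have diverged
git push origin release-1.0.46

The time-consuming part is not these commands but deciding which commits are pure bug fixes and verifying the branch still passes the test suite afterwards.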

Docs are still an issue, since they are already out of sync with the 1.0.46 release and follow master, but just handling the code would be a good start, and you can see whether this is doable for you. In the least-effort scenario the docs simply remain as they were at the moment of branching, with fixes applied only if need be - a very minor effort, actually. We would just need to make a docs.fast.ai/br/1.0.46/ online version.

P.S. In case my communications come across as if I’m trying to make this look difficult so that we won’t have to do it - quite the contrary, this needs to be done - I’m just showing that it is hard, unsexy work. It’s like working in the QA dept: very unsexy, but no serious business can do well without QA. This is also why we need your help with our test suite, btw.

3 Likes

Ideally, bugfixes are made against the release branch and then merged into master/develop and the feature branches (see the sketch below).
What is your git process like?
A gitflow + forking workflow gives flexibility and the fewest headaches with bugfixes, etc.
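As a sketch of that direction of flow (the branch and fix names are illustrative), the fix starts on the release branch and is then merged forward so master never misses it:

git checkout -b fix/some-bug release-1.0.46   # branch the fix off the release branch
# ...commit the fix and merge it into release-1.0.46 via a PR...
git checkout master
git merge fix/some-bug                        # carry the same fix forward into master/develop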


1 Like

If semantic versioning isn’t being used, then what do the version numbers mean? I can appreciate not wanting the extra burden of following semantic versioning right now, but isn’t that what the 0.x releases are supposed to be?

If the plan is to stay away from semantic versioning for the foreseeable future, then could we at least describe the meaning of the version numbers? What are a, b and c in version a.b.c?

In general terms, 0.7 was the original version, 1.0 is the current/old version, and 2.0 is the newest.