If the difference is big, like bears vs. raccoons, and you already have a good pretrained model (like the ones we use as the base of our training), I would say retrain.
One interesting example of domain shift that's present in astrophysics is when we train models on simulated data, e.g., a simulated Universe of galaxies, but then we want to do inference on observed data from an actual telescope, where you end up with correlated noise, weird systematic effects, and physical effects that are quite difficult to simulate.
Very interesting. What has been done in order to solve this problem?
Jeremy, does fast.ai have methods built in that provide for incremental learning?
(i.e., improving the model gradually over time, a single data point at a time?)
How do you take care of these domain shifts?
Lots of folks are still working on it. But if you have any cool ideas, I'd love to chat!
EDIT: I should mention that lots of smart people, mostly in the deep learning field, are working on domain adaptation. For example, generative modeling along with clever loss functions can help with transferring knowledge across domains! So that would be a great place for us to start.
No, you will have to implement it using the data loading API in fastai.
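Not fastai-specific, but the idea in the question above can be sketched in plain Python: keep a running model and take one small gradient step per arriving example. Everything below (the toy 1-D linear model, the learning rate) is illustrative, not fastai API:

```python
# Toy online (incremental) learning sketch: one SGD step per new data point.
# Plain Python stand-in for the idea -- not fastai code.

def online_sgd_step(w, b, x, y, lr=0.05):
    """Update a 1-D linear model y_hat = w*x + b on a single example."""
    err = (w * x + b) - y                  # gradient of 0.5*err**2 w.r.t. the prediction
    return w - lr * err * x, b - lr * err  # dL/dw = err*x, dL/db = err

# Points arriving one at a time, drawn from y = 2x + 1
stream = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)] * 500

w, b = 0.0, 0.0
for x, y in stream:
    w, b = online_sgd_step(w, b, x, y)

print(round(w, 2), round(b, 2))  # w, b should approach 2 and 1
```

In practice you would also validate on held-out data periodically, since a model updated one point at a time can drift.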
In the manual process, if we are looking at slides of pathological tissue, would you recommend that the viewers be blinded?
For that same example of predicting images from video, how would you recommend running that on a server? Not necessarily a constant video stream, but if users were uploading, say, 5-second videos.
Nice. Datasheets for Datasets really helps you review how you collect the data. It took me 3 weeks to make one.
What is the problem you are referring to? Like a classification problem? We would want the pathologists to manually confirm the predictions of the model, so it might not be blinded.
How would we label data generated by inference? Won't it just amplify the model's errors?
Domain shift ==> use the existing model as a pretrained model
Solve it with fine-tuning
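The note above (existing model as pretrained base + fine-tuning) can be sketched framework-agnostically: freeze the pretrained backbone and train only a new head on target-domain data. This toy example is an illustration, not fastai code; in fastai you would just call learn.fine_tune():

```python
# Toy transfer-learning sketch: frozen "pretrained" features + trainable head.
# All names here are illustrative, not a real library API.

def pretrained_features(x):
    """Stand-in for a frozen pretrained backbone (its weights never change)."""
    return [x, x * x]

def train_head(data, lr=0.1, epochs=2000):
    """Fit only the new head (a linear layer) on target-domain data."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            err = sum(wi * fi for wi, fi in zip(w, f)) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
    return w

# Target-domain data following y = 3*x + 1*x^2
data = [(0.5, 1.75), (1.0, 4.0), (1.5, 6.75)]
w = train_head(data)
print([round(wi, 2) for wi in w])  # head weights should approach [3, 1]
```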
Is there an actual quantifiable approach that you would recommend for comparing human labels with model-generated labels during deployment testing?
You wouldn't want to use unreviewed data for training, no. But base predictions could help a human get through the labeling process more quickly.
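On the "quantifiable approach" question above: one standard option (my suggestion, not something from the course) is to measure agreement between the human and model labels, e.g. raw agreement rate plus Cohen's kappa, which corrects for agreement that would happen by chance:

```python
from collections import Counter

def cohens_kappa(human, model):
    """Agreement between two label lists, corrected for chance (Cohen's kappa)."""
    assert len(human) == len(model)
    n = len(human)
    po = sum(h == m for h, m in zip(human, model)) / n   # observed agreement
    hc, mc = Counter(human), Counter(model)
    # chance agreement: product of each rater's marginal label frequencies
    pe = sum((hc[c] / n) * (mc[c] / n) for c in set(hc) | set(mc))
    return (po - pe) / (1 - pe)

human = ["cat", "cat", "dog", "dog", "cat", "dog"]
model = ["cat", "cat", "dog", "cat", "cat", "dog"]
print(round(cohens_kappa(human, model), 3))  # -> 0.667
```

Values near 1 mean the model labels track the humans well; values near 0 mean agreement is no better than chance.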
I assume there are fastai-type ways of keeping a nightly-updated transfer-learning setup.
Will/could one of the fastai-v4 notebooks have an example of the nightly transfer-learning training the previous person asked about? I would be interested in knowing how to do that most effectively with fastai (rather than how I've done it with PyTorch or TF).
I guess you accidentally unshared your screen.
Ok, I will create a thread then if you are unsure after responding.
I looked up https://dev.fast.ai/vision.data#ImageDataLoaders.from_name_func and added a 'bs' param to the function from 01_intro. It seems to be working now. However, I am concerned that maybe the default batch size is too high, since I have a 1080 and it couldn't handle the default. (Yes, I know that card is old now.)
And those humans need to be diverse. Different points of view are needed to make sure biases are not introduced, if possible.
What is the process of breaking out of a feedback loop? You can't just git revert a model and the data.
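One pragmatic way out (a suggestion, not an official recipe): snapshot the model and the dataset together, so you can roll both back as a unit to a point before the feedback loop started polluting the data. All names below are illustrative:

```python
import hashlib
import os
import pickle
import tempfile

def save_snapshot(store_dir, model_state, dataset):
    """Save model + dataset together so both can be rolled back as one unit."""
    blob = pickle.dumps({"model": model_state, "data": dataset})
    version = hashlib.sha256(blob).hexdigest()[:12]  # content-addressed version id
    with open(os.path.join(store_dir, f"{version}.pkl"), "wb") as f:
        f.write(blob)
    return version

def load_snapshot(store_dir, version):
    """Restore a previously saved (model, dataset) pair."""
    with open(os.path.join(store_dir, f"{version}.pkl"), "rb") as f:
        snap = pickle.loads(f.read())
    return snap["model"], snap["data"]

store = tempfile.mkdtemp()
# Snapshot taken before model predictions started feeding back into the data:
v0 = save_snapshot(store, {"w": 2.0}, [("img1.jpg", "cat")])
# ... later, the feedback loop has polluted the live dataset; roll back:
model, data = load_snapshot(store, v0)
print(model, len(data))
```

The point is that the model and the data it was trained on are versioned together, which is what plain git on the code alone doesn't give you.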