Can crontab be used for incremental training, since you can schedule code to run at different intervals? I suppose it depends on how you wrote your code.
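A minimal sketch of what a cron-driven incremental update could look like, in case it helps. Everything here (the schedule, the paths, and the dataloader step) is an illustrative assumption, not something from the course:

```python
# retrain.py -- hypothetical script a crontab entry could run, e.g. weekly:
#   0 2 * * 0  /usr/bin/python3 /path/to/retrain.py
from fastai.vision.all import load_learner, ImageDataLoaders, Resize

learn = load_learner('export.pkl')   # model exported by the previous run
dls = ImageDataLoaders.from_folder(  # 'new_data/' is a made-up folder of fresh images
    'new_data/', valid_pct=0.2, item_tfms=Resize(224))
learn.dls = dls                      # point the learner at the new data
learn.fine_tune(1)                   # short incremental update
learn.export('export.pkl')           # overwrite the snapshot for the next scheduled run
```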
Do you think Multi-Armed Bandits help break feedback loops?
Jeremy's notebook is not shared on the screen. Is this on purpose?
Any feedback affects and biases future data.
That's awesome @rachel
But you can, if you save your previous models regularly, along with the data used to train them. Andrej Karpathy gave a very good talk about that at the PyTorch dev summit in 2018. I'll try to find the link to it tomorrow.
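Not the talk itself, but as a rough illustration of the "save your models and the data" point, something like this (all names and paths are made up):

```python
# A minimal sketch of snapshotting each training run so you can compare models later.
# `learn` is assumed to be a trained fastai Learner; paths are illustrative.
from datetime import datetime
from pathlib import Path
import shutil

def snapshot_run(learn, train_data_path, out_dir=Path('snapshots')):
    stamp = datetime.now().strftime('%Y%m%d-%H%M%S')
    out_dir.mkdir(exist_ok=True)
    # Learner.export saves relative to learn.path, so resolve to an absolute path
    learn.export((out_dir/f'model-{stamp}.pkl').resolve())
    shutil.copy(train_data_path, out_dir/f'data-{stamp}.csv')  # the data that trained it
```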
Would that rule out deployment in situations where review is illegal/unethical? E.g. security cams for indoor spaces, or call/speech privacy?
Would it be possible to break a feedback loop by adding some noise? This problem seems similar to overfitting, at least to me.
My favourite blog post on why you, yes YOU, should blog
(Will add to Top wiki during break)
For my pathology project, I let the assessors annotate the same set of images individually, then use the Intraclass Correlation Coefficient in R, a mean-differences graph, and an overlay of all annotations for each image. We did this at the beginning of the project and identified individual bias prior to mass annotation. We are also preparing a gold-standard guide that includes exceptions found during the process.
I would like to hear your feedback on our approach.
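If it's useful to others, here is a rough Python equivalent of that agreement check (the original analysis was in R; pingouin and the column names here are my assumptions, not the poster's actual code):

```python
# A minimal sketch of an inter-rater agreement check using the Intraclass
# Correlation Coefficient; the toy data stands in for per-image annotation scores.
import pandas as pd
import pingouin as pg

# long format: one row per (image, assessor) score
df = pd.DataFrame({
    'image':    ['img1']*3 + ['img2']*3 + ['img3']*3 + ['img4']*3,
    'assessor': ['A', 'B', 'C'] * 4,
    'score':    [3, 3, 2,  1, 1, 1,  4, 5, 4,  2, 2, 3],
})

icc = pg.intraclass_corr(data=df, targets='image', raters='assessor', ratings='score')
print(icc[['Type', 'ICC', 'CI95%']])
```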
You can't train a model without labeled data, so you need a process that anonymizes the data in a way that lets humans see it. Never trust a deep learning model that hasn't been properly validated on labeled data.
Yes, either classifying disease/non-diseased tissue or grading/staging severity.
When reporting the effects of interventions sometimes multiple observers are used and they are (double) blinded to control vs treated to minimize bias.
I'm just wondering if we should be following this general principle during the manual phase?
Advice for Better Blog Posts: slightly more detailed advice and avoiding common pitfalls
How to set up fastpages:
If you struggle with the questionnaire, check out the solutions here:
It would if you don't use that data wisely.
This is what semi-supervised learning (SSL) is all about, and it is a huge thing!
I work in fintech on application credit scoring, and I can tell you that SSL (we call it reject inference in the financial domain) is super important.
In a nutshell, the general idea (sketched in code below) is to:
- build a solid model
- run inference on unlabeled data
- pick only the predictions the model is VERY confident about, according to a threshold you set; e.g. in binary classification, predictions with very low/high probabilities
- add these new data points to the originally labeled dataset and train a new model
- keep iterating
EDIT: Look at this paper for context. I implemented it at work and it works really well!
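To make the loop concrete, a minimal sketch of that recipe with scikit-learn; the model, the 0.95 threshold, and the iteration count are illustrative choices, not the paper's exact setup:

```python
# Pseudo-labeling: grow the labeled set using only the most confident predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def pseudo_label(X_lab, y_lab, X_unlab, threshold=0.95, n_iters=5):
    for _ in range(n_iters):
        model = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)  # 1. solid model
        if len(X_unlab) == 0:
            break
        probs = model.predict_proba(X_unlab)[:, 1]                   # 2. inference on unlabeled
        confident = (probs >= threshold) | (probs <= 1 - threshold)  # 3. keep very low/high only
        if not confident.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[confident]])               # 4. add pseudo-labeled points
        y_lab = np.concatenate([y_lab, (probs[confident] >= 0.5).astype(int)])
        X_unlab = X_unlab[~confident]                                # 5. iterate
    return model
```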
In terms of blogging, I have always wondered: even though there is a larger beginner audience, there are probably also more beginner posts, right? That is why I am unsure about writing beginner tutorial posts in particular.
I'm having trouble viewing relevant docs in fastai. I want to know what the `unique` parameter does when I run:

`dls.train.show_batch(max_n=8, nrows=2, unique=True)`

I've tried `doc(show_batch)`, but I'm not getting info; I can't even ctrl+f "unique" and find anything. Does anyone have pointers on how to do this?
Updating fastai
This was added yesterday by @lgvaz and is in the new release from today. In general, it's a good idea to run an update just before the course, since we make a release each Tuesday during the period it runs.
`unique` will plot a batch of the same image. It is used for checking how your transforms look on a single item.
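For example, something along these lines (the PETS setup is just the book's standard example, not what was on screen):

```python
# A minimal sketch: unique=True repeats one item across the batch so you can
# eyeball the effect of your random augmentations.
from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=lambda f: f.name[0].isupper(),          # cats have capitalised filenames
    item_tfms=Resize(224), batch_tfms=aug_transforms())
dls.train.show_batch(max_n=8, nrows=2, unique=True)    # 8 augmented views of one image
```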
I did a git pull an hour ago. `doc()` takes me to the GitHub source, but I still can't trace the code; I'm assuming it's in the kwargs of a method call within this method?
Looking for general advice on going from reading the docs to actually understanding things.
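A couple of things that have worked for me, as a sketch. I believe the method version of `show_batch` lives on `TfmdDL` (worth checking in your version), and you can dump the source directly rather than chasing it on GitHub:

```python
# Ways to dig past doc() into the actual source.
from fastai.vision.all import *
import inspect

doc(TfmdDL.show_batch)                       # doc() on the method, not the dispatched function
print(inspect.getsource(TfmdDL.show_batch))  # dump the source, then search it for 'unique'

# interactively in Jupyter, this shows the same source:
# TfmdDL.show_batch??
```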