CV is only meant for training your model, not production/inference. If you wanted to use all ‘n’ fold models, you’d want to save the Learners in an array and export them all.
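If it helps: using all n exported Learners at inference usually just means averaging their predictions. Here is a plain-Python sketch of that averaging step (the `ensemble` helper and the numbers are made up for illustration; in practice each inner list would come from one fold's `get_preds`):

```python
# Hypothetical sketch: average the predictions of n fold models.
# Each inner list holds one model's predictions for the same test items.
def ensemble(preds_per_model):
    n = len(preds_per_model)
    # zip(*...) groups the predictions item-by-item across models
    return [sum(ps) / n for ps in zip(*preds_per_model)]

fold_preds = [[0.8, 0.1],   # fold 1's predictions for two test items
              [0.6, 0.3],   # fold 2
              [0.7, 0.2]]   # fold 3
print(ensemble(fold_preds))  # averaged predictions per test item
```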
It would depend on the block you use. I can’t guarantee it, but it’s what I’ve found for what I’ve been trying.
Wow! This support works! What time is it in Florida? ;=) Many thanks!
@muellerzr I’m amazed by this walkthrough you are doing. I just watched lesson 2 and in regards to deployments, I do know a little about aws and deploying on aws. If anyone has questions about that, I’m glad to help if I can.
@muellerzr I found the bug! It was numpy that was causing the problem. When I created the mask the datatype was dtype=np.uint, which created an array of uint32 numbers on my Windows machine, while on colab it was uint64. PIL fromarray works only on a uint32 array. That is why it worked on Windows and not on colab.
In the Unknown Labels notebook - “I’m choosing a very high threshold for our metrics as we want only super confident answers (as we only have one label)” - shouldn’t we be setting the thresh on the loss function, loss_func=BCEWithLogitsLossFlat(thresh=0.1), for us to get confident answers?
Hey,
I had a few questions related to putting fastai models in production.
- Do you have experience with putting fastai Tabular models in production?
- How did you handle the preprocessing of the tabular data for the test set?
@navneetkrch for 2, our test_dl (which you can still use with an exported learner) will apply the preprocessing for you. What I mean is: assume ‘df’ is some test dataframe I loaded into pandas:
learn = load_learner(myModel)  # load the exported Learner
dl = learn.dls.test_dl(df)     # builds a test DataLoader, applying the training-time preprocessing
preds, _ = learn.get_preds(dl=dl)
I see the point in the question - I cannot properly answer it, BUT what we are doing in our case in accuracy_multi is this:
return ((inp>thresh)==targ.bool()).float().mean()
So basically, for each potential class that is returned for an image, we only take it as a candidate if the model was at least as confident as the threshold. Then we check it against the actual target to see if we were right or not. So I believe @muellerzr probably meant 0.9 instead of 0.1 if we want to be super sure, as I believe he corrected himself later in the video.
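A plain-Python sketch of that thresholding logic, with made-up confidences, shows why a high thresh means “super confident” (the real accuracy_multi operates on tensors):

```python
# Toy version of the thresholding in accuracy_multi.
# preds: model confidences per class; targs: 1 if the class is present.
def accuracy_multi(preds, targs, thresh):
    hits = [(p > thresh) == bool(t) for p, t in zip(preds, targs)]
    return sum(hits) / len(hits)

preds = [0.95, 0.40, 0.20]  # made-up confidences for three classes
targs = [1, 0, 0]           # only the first class is actually present

# low thresh: 0.40 and 0.20 also count as positives, hurting accuracy
print(accuracy_multi(preds, targs, 0.1))
# high thresh: only the very confident 0.95 prediction survives
print(accuracy_multi(preds, targs, 0.9))
```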
@muellerzr I’m working on Tabular data, and it looks like it doesn’t work without the labels for the test set. I ran your notebook dropping “sales” in the test set and get_preds throws an error. Probably a bug?
It should. Those tabular notebooks are all severely outdated, so it doesn’t surprise me it’s not working. I’d recommend the ones from the course folder in the fastai2 repo under nbs.
Ok, I will try that. Thanks.
Once all the notebooks are done for our Image block, I’ll be moving onto tabular. So probably here in the next month at most.
Hmmm. Yes, probably. You are right. Adjusting the metric lets us see what it really is, but I’d imagine the model would also fit faster and better with this adjustment. I’m wondering if MultiCategoryBlock allows this threshold (and if not, this would be a great PR).
Hi @mgloria, I think this is in a different part of the video. That was in the Multi Label notebook; this is in the Unknown Label notebook.
It’s not - this would be a good PR if you feel up to figuring it out @barnacl (we’ll help along the way). I’m thinking it’s simply a parameter we can pass to MultiCategoryBlock.
It’s being assigned here:
Ah, wouldn’t just setting the thresh in BCEWithLogitsLossFlat’s thresh parameter work?
Let me look at it a little more, I still have a few questions.
I’m trying to run a loop with thresh varying from 0.1 to 0.9 at intervals of 0.05 and seeing how the accuracy varies. I’m not seeing much change at all - not sure if the dataset is easy enough or if I’m doing something wrong. I remember Jeremy said he chose thresh=0.2 for the planet dataset; I’m guessing a sweep like this was how he chose it.
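For what it’s worth, here is a plain-Python version of that sweep on toy numbers (the preds/targs are made up; the real loop would call get_preds once and then evaluate accuracy_multi at each thresh):

```python
# Toy threshold sweep, mirroring the 0.1 -> 0.9 loop at 0.05 intervals.
def accuracy_multi(preds, targs, thresh):
    hits = [(p > thresh) == bool(t) for p, t in zip(preds, targs)]
    return sum(hits) / len(hits)

preds = [0.92, 0.15, 0.40, 0.05, 0.88, 0.55]  # made-up confidences
targs = [1,    0,    0,    0,    1,    1]     # made-up labels

threshes = [round(0.10 + 0.05 * i, 2) for i in range(17)]  # 0.1 .. 0.9
for t in threshes:
    print(t, round(accuracy_multi(preds, targs, t), 3))
```

One thing this makes visible: if the model’s confidences mostly sit near 0 or 1 (i.e. the dataset is easy), the metric barely moves across the sweep, which may be exactly what you’re seeing.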
Yes, but we’d want to bring it in when we initialize our DataBlock (hence a param to MultiCategoryBlock)