It's all Greek to me
A pretty good link for learning math notation and the Greek symbols used in it:
https://www.rapidtables.com/math/symbols/Basic_Math_Symbols.html
If you are returning to AWS after a while, run the following commands to get started:

ssh ubuntu@xx.xx.xx.xx -L8888:localhost:8888
git pull
conda env update
jupyter notebook

xx.xx.xx.xx is your public IP. Please log in to the AWS console to get it.
Interesting. A bounding box captures other information as well. I wonder if there is a better function that covers just the shape of the object; maybe such a function would be hard to label. Maybe!
Probably the next step is to learn which pixels of the bounding box turn into the edges of the object.
But labeling would become a nightmare, I think.
Or perhaps:
That is called segmentation, and we'll be learning it soon.
What if I used the trained model on images that don't always have an object to detect (i.e., not having a bounding box might be a desired feature)?
@DavideBoschetto Then, from what I understand, this wouldn't be the appropriate setup.
We're regressing on the 4 coordinates, which means we're always expecting an output / a bounding box to exist.
I'm guessing the multi-label model has an activation function in there somewhere to determine whether a bounding box should exist. I assume that would be necessary to support an object-less input.
But maybe someone who's a bit more knowledgeable could correct me / give you a better answer?
Yay! Looking forward to the segmentation lecture.
Yeah, I agree, but I don't have a better answer myself.
It's a problem I had to deal with some months ago. I initially assigned an "empty" bbox at the center of the image ([32, 32] for 64x64 images, basically) to object-less images, but the result was that all the uncertain detections were skewed towards the center! And that's a problem I don't want to run into, obviously. I still have to find anything helpful on the topic!
FYI, here is a great post, imo, that briefly but very well summarizes the main computer vision tasks and basic terminology (classification, classification + localization, detection, segmentation: instance vs. semantic): https://luozm.github.io/cv-tasks
Frankly, I don't know the answer, but I'll try to put down the thoughts that came to mind.
If I understood correctly, the model Jeremy developed predicts a combination of a bounding box and the category of the object enclosed in the box. I am assuming that in object detection training there is always a fixed set of object categories, with no "unknown" category as such, although at inference time the model may receive an image containing many categories unknown to it. Since the model is trained to predict only the known categories, and an object-less image contains none of them, the predicted probability of each object's presence should fall well below the threshold. If it does, we can decide not to show a bounding box for that prediction, which gives us no bounding box for object-less images.
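A minimal sketch of that idea, assuming a hypothetical post-processing helper (the function name, threshold, and plain-Python softmax are illustrative, not from the lesson):

```python
import math

def maybe_show_bbox(class_logits, bbox_pred, threshold=0.5):
    # Hypothetical helper: softmax over the known categories, then
    # suppress the predicted box when no category is confident enough
    # (e.g. an object-less image at inference time).
    exps = [math.exp(z) for z in class_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    conf = max(probs)
    label = probs.index(conf)
    if conf < threshold:
        return None  # treat as "no object": draw nothing
    return label, bbox_pred
```

With flat logits every class gets probability 1/n, so nothing clears the threshold and no box is shown; a strongly peaked logit passes through with its box.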
Is anyone else finding themselves constantly turning to the Python debugger on unrelated projects since Jeremy showed it to us? I've used debuggers in C before, but for some reason never bothered to investigate a Python one. I've already solved so many bugs that would have taken me much longer without pdb.
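For anyone who hasn't tried it, the workflow is just to drop a breakpoint into a suspect function (the `mean` function below is a toy example for illustration):

```python
def mean(xs):
    total = 0
    for x in xs:
        total += x
    # When a result looks wrong, drop into the debugger right here:
    # import pdb; pdb.set_trace()
    # then inspect with: p total, p xs, n (next line), s (step in),
    # c (continue), u/d (move up/down the stack)
    return total / len(xs)
```

The `set_trace()` line is commented out here so the snippet runs non-interactively; uncomment it to get the interactive `(Pdb)` prompt at that point.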
In part 1 (I don't remember which lesson), I seem to remember a case where the categories contained a "None of the Above" category that unclassified images fell into. The trained model would presumably put images with no bounding box into that category. But those examples were by definition using categorical variables, not the continuous ones we're using in the bounding box example. So how does one achieve the same thing with continuous variables?
I think the difference here, and the reason to have a "None of the Above" category, is for when the task at hand calls for "no object found" as one of the possible results; for example, if you are looking for a fish in an image and no fish is there. But in this case the image does contain one or more objects that we want to both classify and localize.
I think there are two ways. The first is to rely on the classifier component: as long as the classifier's accuracy is below (or its loss above) some pre-determined threshold, you would not consider it an object, and therefore no bounding box would be generated. The second is to rely on the loss of the bounding-box regressor, so anything above some loss threshold would not generate a bounding box (or at least not make it visible), regardless of the classifier's accuracy.
You have a good point, but as in all of these problems we're relying on the veracity of the loss function. As we train our system the results improve and the loss decreases, so thresholding the loss at multiple points may be difficult.
Yes, this is an interesting question… You are right: it wouldn't really be possible to rely on the loss, as it is only applicable during training, and we have no idea what the loss would be on the test predictions (unless we somehow knew the ground-truth labels). In that case the classifier head is the best indicator to rely on for whether or not to generate bbox predictions.
I think you could write a custom loss function that had a class for "no object detected" and didn't calculate loss for the bounding box points when that class was chosen.
Yes, but that just goes back to relying on the classifier, right? You can already set a threshold on the classifier head and, based on that, not generate the bounding box predictions from the regressor head (perhaps we still have bbox predictions, but we choose not to use them) for that particular "object", so by default the object becomes "no object". Also, if you did have a class for "no object", you could just have the boxes generated as all zeros when that class is triggered, which would effectively achieve the same result.
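A minimal sketch of the custom-loss idea from upthread, assuming a hypothetical "no object" class index, a plain-Python cross-entropy for the classifier head, and an L1 term for the box (all names and choices are illustrative, not from the lesson):

```python
import math

NO_OBJECT = 0  # hypothetical index reserved for the "no object" class

def combined_loss(class_logits, true_class, bbox_pred, true_bbox):
    # Cross-entropy for the classifier head (softmax + negative log-likelihood).
    exps = [math.exp(z) for z in class_logits]
    ce = -math.log(exps[true_class] / sum(exps))
    if true_class == NO_OBJECT:
        # Skip the box term entirely: object-less images contribute
        # no gradient to the regressor head.
        return ce
    # L1 loss for the regressor head, only when a real object is present.
    l1 = sum(abs(p - t) for p, t in zip(bbox_pred, true_bbox))
    return ce + l1
```

With this setup, any bbox prediction on a "no object" target is ignored by the loss, so the regressor is free to output anything (including the all-zeros convention mentioned above) for those images.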
That's a good quick intro. Thanks for sharing.