Part 2 Lesson 8 wiki

It's all Greek to me

Pretty good link for learning math notation and the Greek symbols used in it:
https://www.rapidtables.com/math/symbols/Basic_Math_Symbols.html

4 Likes

If you are returning to AWS after a while, follow commands to get started:

ssh ubuntu@xx.xx.xx.xx -L8888:localhost:8888   # forward port 8888 so the notebook is reachable locally
git pull                                       # pull the latest course notebooks
conda env update                               # update the fastai conda environment
jupyter notebook                               # note: the command is lowercase

xx.xx.xx.xx is your public IP. Please log in to the AWS console to get it.

Interesting. A box captures other info as well. I wonder if there is a better function that covers just the shape of the object; maybe such a function would be hard to label. Maybe!

Probably the next step is to learn all the pixels of the bounding box, which would turn into the edges of the object.

But labeling would become a nightmare, I think.

Or perhaps:

  1. Predict object saliency, something like this: https://github.com/imatge-upc/saliency-salgan-2017
  2. Then, based on the activation map, crop the image so it contains approximately just the object (see the sketch after this list).
  3. Convert the object to edges, something like https://github.com/SKTBrain/DiscoGAN, and then smooth out the edges.
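A minimal sketch of step 2, assuming the saliency model returns an HxW array of per-pixel scores in [0, 1]; the 0.5 threshold and the function name are just illustrative:

    import numpy as np

    def crop_to_salient(img, saliency, thresh=0.5):
        # Crop img (HxWxC) to the bounding box of pixels whose saliency exceeds thresh.
        ys, xs = np.where(saliency > thresh)
        if len(ys) == 0:                       # nothing salient: return the image unchanged
            return img
        top, bottom = ys.min(), ys.max() + 1
        left, right = xs.min(), xs.max() + 1
        return img[top:bottom, left:right]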
3 Likes

That is called segmentation, and we’ll be learning it soon :slight_smile:

10 Likes

What if I used the trained model on images that don't always have an object to detect (i.e., not having a bounding box might be a desired feature)?

@DavideBoschetto Then, from what I understand, this wouldn’t be the appropriate setup.

We're regressing on the 4 coordinates, which means we're always expecting an output, i.e. a bounding box, to exist.
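For context, here's a rough sketch of what that kind of head can look like in PyTorch (not the notebook's exact code; the flattened feature size of 512*7*7 and the 20 classes are assumptions for illustration). The box branch always emits exactly 4 numbers, so there is no built-in way to say "no box":

    import torch.nn as nn

    class BBoxHead(nn.Module):
        """Custom head: always regresses 4 box coordinates plus class scores."""
        def __init__(self, n_features=512 * 7 * 7, n_classes=20):
            super().__init__()
            self.flatten = nn.Flatten()
            self.fc = nn.Linear(n_features, 256)
            self.relu = nn.ReLU()
            self.bbox = nn.Linear(256, 4)          # always 4 coords: no "no box" option
            self.clas = nn.Linear(256, n_classes)  # class scores for the enclosed object

        def forward(self, x):
            x = self.relu(self.fc(self.flatten(x)))
            return self.bbox(x), self.clas(x)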

I'm guessing the multi-label model has an activation function in there somewhere to determine whether a bounding box should exist. I assume that would be necessary to support an object-less input.

But maybe someone who’s a bit more knowledgeable could correct me / give you a better answer?

Yay! Looking forward to the segmentation lecture.

1 Like

Yeah, I agree but don’t have a better answer myself.
It's a problem I had to deal with some months ago. I initially assigned an "empty" bbox at the center of the image (basically [32, 32] for 64x64 images) to object-less images, but the result was that all the uncertain detections were skewed towards the center! And that's a problem I don't want to run into, obviously. I still have to find anything helpful on the topic!

FYI, here is a great post imo that concisely summarizes the main computer vision tasks and basic terminology (classification, classification + localization, detection, segmentation: instance vs. semantic): https://luozm.github.io/cv-tasks

7 Likes

Frankly I don't know the answer, but I'll try to put down the thoughts that came to mind.

If I understood correctly, the model Jeremy developed predicts a combination of a bounding box and the category of the object enclosed in the box. Now I am assuming that in object detection model training there is always a fixed set of object categories, without any "unknown" category as such, although at inference time the model may receive an image with many categories unknown to it. As the model is trained to predict only those known categories, and there are no such categories in object-less images, the predicted probability of each object's presence should be much lower than the threshold level. If it is, we can decide not to show a bounding box for such predictions, which leads us to no bounding box for object-less images.
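At inference time that idea can be expressed as a small sketch (the 0.5 threshold and the assumption that the classifier head returns raw logits are illustrative, not part of the lesson's code):

    import torch.nn.functional as F

    def maybe_show_bbox(bbox_pred, class_logits, thresh=0.5):
        # Return (box, class) only if the most confident class clears the threshold;
        # otherwise treat the image as object-less and draw no box.
        probs = F.softmax(class_logits, dim=-1)
        conf, cls = probs.max(dim=-1)
        if conf.item() < thresh:
            return None
        return bbox_pred, cls.item()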

2 Likes

Is anyone else finding themselves constantly turning to the Python debugger on unrelated projects since Jeremy showed it to us? I've used debuggers in C before, but for some reason never bothered to investigate a Python one. I've already solved so many bugs that would have taken me much longer without pdb. :sweat_smile::sweat_smile:
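For anyone who hasn't tried it yet, dropping into the debugger is a one-liner (Python 3.7+ also has the built-in breakpoint()); a toy example:

    def buggy_sum(xs):
        total = 0
        for x in xs:
            import pdb; pdb.set_trace()   # execution pauses here; inspect x and total
            total += x * 2
        return total

    # At the (Pdb) prompt: `p x` prints a variable, `n` steps to the next line,
    # `c` continues to the next breakpoint, `q` quits.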

5 Likes

In part 1 (I don't remember which lesson), I seem to remember a case where the categories contained a "None of the Above" category that unclassified images fell into. The trained model would presumably put images with no bounding box into that category. But those examples were by definition using categorical variables, not the continuous ones we're using in the bounding box example. So how does one achieve the same thing with continuous variables?

I think the difference here and why to have a “None of the Above” category is for when the task at hand calls for a “No object found” as one of the possible results. Like if you are looking for a fish in an image and no fish is there. But in this case we do have either one or multiple objects in the image that we want to both classify and localize.

I think there are two ways. The first is to rely on the output of the classifier component: as long as the classifier's confidence is above some pre-determined threshold you consider it an object; otherwise no bounding box would be generated. The second is to rely on the loss of the bounding box regressor, so anything above some threshold of loss would not generate a bounding box (or at least not make it visible), regardless of the classifier's output.

1 Like

You have a good point, but as is the case in all of these problems, we're relying on the veracity of the loss function. As we train our system the results improve and the loss decreases. Thresholding the loss at multiple points may be difficult.

Yes, this is an interesting question.
You are right, it wouldn't really be possible to rely on the loss, as it is only applicable during training and we have no idea what the loss would be on the test predictions (unless we somehow knew the ground-truth labels). In that case the classifier head is the best indicator to rely on for whether or not to generate bbox predictions.

Seems like it's an open issue in IPython.

1 Like

I think you could write a custom loss function that had a class for “no object detected” and didn’t calculate loss for the bounding box points when that class was chosen.
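A rough sketch of that idea, assuming the targets come as (bbox, class) pairs and that the last class index (20 here, purely an assumption) is reserved for "no object detected"; the mask zeroes out the L1 box loss for those targets:

    import torch.nn.functional as F

    NO_OBJECT = 20  # assumed index of the "no object detected" class

    def detection_loss(bbox_pred, class_logits, bbox_targ, class_targ):
        # Cross-entropy on the class head plus L1 on the box head,
        # with the box term masked out whenever the target class is "no object".
        clas_loss = F.cross_entropy(class_logits, class_targ)
        has_object = (class_targ != NO_OBJECT).float().unsqueeze(1)   # shape (B, 1)
        bbox_l1 = F.l1_loss(bbox_pred, bbox_targ, reduction='none')   # shape (B, 4)
        return clas_loss + (bbox_l1 * has_object).mean()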

Yes, but that just goes back to relying on the classifier, right? You can already just set a threshold on the classifier head and, based on that, not generate the bounding box predictions from the regressor head for that particular "object" (perhaps we do still have bbox predictions, we just choose not to use them), so by default the object becomes "no object". Also, if you did have a class for "no object", you could just have the boxes generated as all zeros when that class is triggered, which would effectively achieve the same result.

That’s a good quick intro :ok_hand: Thanks for sharing.