Object Detection with RGB-D images?


I’m thinking that adding a depth channel to the image could improve object detection considerably. Nowadays there are several RGB-D cameras available, such as Kinect or Intel RealSense, but I can’t find any post about this on fast.ai. Has anyone tried it? Do you have any resources that could help me take the very first steps? Thank you.


You can have a look at the paper: http://taskonomy.stanford.edu/

I suspect that this kind of model is pretty complicated, and probably best done directly in PyTorch or TF. I have a RealSense and will have a look at this, but I think the dataset will be the biggest issue: you’d need to build a big enough dataset to get something running.
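
For anyone who wants a starting point, here is a minimal sketch of the "add a depth channel" idea in plain NumPy. It is only an illustration, not something taken from the paper above. It assumes the depth map is already aligned to the RGB frame and expressed in metres. The function names `make_rgbd` and `inflate_first_conv` and the `max_depth_m` cutoff are hypothetical. Initialising the extra depth filter as the mean of the pretrained RGB filters is one common heuristic for reusing a 3-channel backbone, not the only option.

```python
import numpy as np


def make_rgbd(rgb, depth, max_depth_m=10.0):
    """Stack an HxWx3 uint8 RGB image and an HxW depth map into an HxWx4 float32 array.

    Assumption: depth is in metres and pixel-aligned with the RGB image.
    max_depth_m is a hypothetical cutoff used to scale depth into [0, 1].
    """
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB and depth must have the same height and width")
    rgb_f = rgb.astype(np.float32) / 255.0
    depth_f = np.clip(depth.astype(np.float32), 0.0, max_depth_m) / max_depth_m
    # Append depth as a fourth channel after R, G, B.
    return np.concatenate([rgb_f, depth_f[..., None]], axis=-1)


def inflate_first_conv(w_rgb):
    """Extend first-layer conv weights from (out, 3, kh, kw) to (out, 4, kh, kw).

    The new depth filter is the mean of the three RGB filters (a heuristic),
    so the pretrained RGB filters are kept unchanged.
    """
    if w_rgb.ndim != 4 or w_rgb.shape[1] != 3:
        raise ValueError("expected weights of shape (out, 3, kh, kw)")
    w_depth = w_rgb.mean(axis=1, keepdims=True)
    return np.concatenate([w_rgb, w_depth], axis=1)


if __name__ == "__main__":
    rgb = np.zeros((480, 640, 3), dtype=np.uint8)       # placeholder RGB frame
    depth = np.full((480, 640), 2.5, dtype=np.float32)  # placeholder depth, 2.5 m everywhere
    print(make_rgbd(rgb, depth).shape)

    w = np.random.randn(64, 3, 7, 7).astype(np.float32)  # stand-in for pretrained weights
    print(inflate_first_conv(w).shape)
```

The inflated weights could then be copied into a 4-input-channel first layer in PyTorch or TF, with the rest of the detector left as it is. Whether this actually helps still depends on having enough labelled RGB-D data, as noted above.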