How do you solve real world problem using Deep Learning?

Hi everyone! I came up with an idea for my side project which is using depth estimation to build a mini car toys that can keep itself away from obstacles but I don’t know what’s a proper approach to solve real world problem in this field? Is that an ideal way to just clone the best method out there and fine tune to fit this problem or read through paper proposed method and write and train model from scratch?


I am not sure I really understand your use-case.
But I would say If it just a matter of “keep a distance d from the nearest object” then this would not call for Machine Learning (or even Deep Learning).

I think some hard coded logic like this might be able to solve it.

if distance_to_nearest< min_distance: 

But if your problem is intelligently navigating an obstacle course, this would be totally different. Or is your problem doing a depth estimation based on just image/camera data?

Anyway I think you have to be more specific to be able to give you a tip.


Hi thanks for your response!
My idea is let a mini car follow a person and be able to avoid obstacles. My approach to obstacle avoidance is using depth estimation for recognizing depth and then apply the logic as you said to control the wheels.

And your question is the depth estimation based on videos? Or tracking of the people?
For depth estimation I would highly recommend a (really cheap) depth sensor (e.g. look at this) or maybe even a kinect camera.
But If you want to really get depth information from a single camera (instead of two or a depth sensor), this is kind of possible today, especially with the recent advances by Mapillary.

I prefer option 2 which can get depth information from a single image. But my question is I don’t know what’s a proper approach to real world problem in deep learning. Thanks.

I think from a single image can be very difficult (and dependent on your camera). But I would go this way:

  1. Look for code and pretrained models from the most recent paper
  2. See if this works in your domain (your camera, setting etc.)
  3. Try training your own model, based on the ideas of the most recent paper (or implementation)