I was looking for a way to do object detection with fastai. I found the RetinaNet notebook from part 2 of the course and ran it.
Everything was fine until I tried using it for inference. I wanted to apply it to video from my webcam (in a separate script), but I ran into an error. I tried adding the export, load, and predict lines to the training notebook itself and got the same error, which is the one below.
Is it not possible to get predictions by calling learn.predict? If not, how do I get them?
PS: I understand this is better suited to the part 2 (2019) forums, and I apologize. I can't access them right now, and I'm using this for my graduation project, so I can't really wait until the course is available online. I thought part 1 was enough for object detection, since it covers segmentation. I was wrong.
learn = load_learner(path='.', file='model.pkl')
with torch.no_grad(): learn.model.eval()
input = Variable(image)
input = input.to(device)
z = learn(input)
show_preds(image, frame, z, detect_thresh=0.35, classes=learn.data.classes)
I got this error:
TypeError: 'Learner' object is not callable
I also tried this:
learn = load_learner(path='.', file='model.pkl')
with torch.no_grad(): z = learn.model.eval()(image)
show_preds(image, frame, z, detect_thresh=0.35, classes=learn.data.classes)
and got this error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 480, 640] instead
This shows that in this case it's expecting a batch, not a single image. Is there a way to get around this?
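I thought the missing batch dimension could be added with unsqueeze, something like this (a sketch only, assuming image is already a normalized [3, 480, 640] float tensor on the right device):

with torch.no_grad():
    batch = image.unsqueeze(0)        # [3, 480, 640] -> [1, 3, 480, 640]
    z = learn.model.eval()(batch)     # forward pass on a batch of one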
Still the same error (on that same line):
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 480, 640] instead
For anyone wanting to do this, I managed to get predictions like this:

defaults.device = torch.device('cuda')
encoder = create_body(models.resnet50, cut=-2)
model = RetinaNet(encoder, 21, final_bias=-4)
state_dict = torch.load('stage2-256.pth')
model.load_state_dict(state_dict['model'], strict=False)
model = model.cuda()
model.eval()
with torch.no_grad():
    z = model(image.unsqueeze_(0).cuda())
Usually you'd use the CPU for inference, but I'm working with a webcam stream, so I need the GPU.
If you want CPU:

state_dict = torch.load('stage2-256.pth', map_location='cpu')

and remove all the .cuda() calls.
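For completeness, here is a rough sketch of how a single webcam frame could be turned into the image tensor used above. None of this is from the original code: it assumes OpenCV (cv2) and ImageNet normalization, and skips resizing, so adjust it to match whatever transforms your model was trained with:

import cv2
import torch

cap = cv2.VideoCapture(0)
ret, frame = cap.read()                       # frame: H x W x 3, BGR, uint8 (e.g. 480 x 640 x 3)
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # BGR -> RGB
image = torch.from_numpy(rgb).float() / 255.  # float tensor in [0, 1]
image = image.permute(2, 0, 1)                # HWC -> CHW, i.e. [3, 480, 640]
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)  # ImageNet stats (assumption)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
image = (image - mean) / std
with torch.no_grad():
    z = model(image.unsqueeze(0).cuda())      # add batch dim and run on the GPU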
Hi Mohamed,
Would you happen to have an example notebook showing how to implement an object detector with fastai? I'm basically looking to convert my multi-label classifier into a multi-object detector.