I want to detect “anomalous” images from a series of images. The image may be a plot or a graph, and there might be some images that deviate significantly from other images. What techniques can I use to, perhaps, tell with some probability that an image is different from most other images seen thus far?
Alright, I have been lingering in these forums for two months, so time to make my first post.
I have not dived into this area, so I am thinking out loud. If all the images are generally of the same type (e.g., one type of object, or MRI scans of the brain):
(1) Train an autoencoder, where the target output is the same as the input.
(2) Run all your images through the first part of this autoencoder (the encoder) to get a vector representation of each image (the middle layer of the autoencoder).
(4) Calculate the average distance of each image's vector w.r.t. the vectors of all other images.
The images with a much higher average distance than usual will be anomalous.
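The distance-scoring part of this idea can be sketched in a few lines. This is only a rough illustration, assuming the encoder has already produced one vector per image: the random `embeddings` array stands in for real autoencoder outputs, and the function names and the median-plus-MAD threshold are my own choices, not something prescribed above.

```python
import numpy as np


def mean_pairwise_distance(embeddings: np.ndarray) -> np.ndarray:
    """Average Euclidean distance from each embedding to all the others."""
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(embeddings ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    dists = np.sqrt(np.maximum(sq_dists, 0.0))  # clip tiny negatives from rounding
    np.fill_diagonal(dists, 0.0)                # distance to itself is exactly 0
    return dists.sum(axis=1) / (len(embeddings) - 1)


def flag_anomalies(embeddings: np.ndarray, n_mads: float = 5.0):
    """Flag images whose average distance is far above the typical value.

    The threshold (median + n_mads * MAD) is an assumed heuristic; tune it
    on your own data.
    """
    scores = mean_pairwise_distance(embeddings)
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    return scores, scores > median + n_mads * mad


# Stand-in data: 200 "normal" embeddings plus one that sits far away.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 32))
outlier = rng.normal(8.0, 1.0, size=(1, 32))
embeddings = np.vstack([normal, outlier])

scores, flags = flag_anomalies(embeddings)
print("most anomalous index:", int(np.argmax(scores)))
print("number flagged:", int(flags.sum()))
```

Using the median and MAD rather than the mean and standard deviation keeps a few extreme images from inflating the threshold that is supposed to catch them.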
I am sure there must be better ways. I see a few links on the subject (haven’t read them yet):
@coderama’s first post is an excellent one - thanks for de-cloaking!
@gohar it’s better to use a “real” loss function instead of an autoencoder if possible, since that way you can be sure that the features that are being used to detect anomalies are the ones you care about. Are there any labels that you could use?
You may also be interested in looking at triplet and siamese networks: https://arxiv.org/abs/1412.6622
Are you able to share your dataset? I’m interested in teaching anomaly detection in part 2, so perhaps we could use your problem as a real world use case?
It's actually simple: train a OneClassSVM or LSAnomaly model on VGG16 or ResNet feature vectors. For prediction, do the same: extract the feature vector of the image you want to check and pass it to the OneClassSVM.
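A minimal sketch of the OneClassSVM variant with scikit-learn is below. The feature arrays are random stand-ins for VGG16 or ResNet feature vectors (extracting those is not shown), and the `nu` and `gamma` values and the scaling step are assumptions to tune rather than recommended settings.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stand-in for CNN feature vectors of "normal" training images (assumed shape).
train_features = rng.normal(0.0, 1.0, size=(300, 64))

# nu is roughly the fraction of training points allowed to fall outside the
# learned region; 0.05 is an assumed starting value.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
model.fit(train_features)

# Stand-in for the feature vector of a new image that looks very different.
new_features = rng.normal(6.0, 1.0, size=(1, 64))

prediction = model.predict(new_features)        # +1 = inlier, -1 = outlier
score = model.decision_function(new_features)   # lower = more anomalous
train_scores = model.decision_function(train_features)

print("prediction:", int(prediction[0]))
print("score:", float(score[0]))
```

`decision_function` gives a continuous score, which is handy if you want to rank images by how unusual they are instead of taking the hard +1/-1 label.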
I am still in part 1, so my apologies if I am not yet eligible to ask part 2 related questions.
I have a project requirement in the oil and gas industry, where I need to detect anomalies in live or still images, such as a leak or spillage from a pipe or equipment.
But unfortunately I don't have data.
So should I use a supervised CNN or an unsupervised approach?
What dataset should I use for modelling, given that the COCO dataset covers only everyday objects?
I have added a video link from the MS Build 2018 conference to clarify the requirement.
The same applies to video analytics, where we need to tell from a video whether there is an oil leak or a gas leak.
How should I start, and what are the best models and datasets for the above problems?