Master's thesis

Problem: object detection under so-called open-set conditions

Explanation: detecting when an object detector fails by predicting "this bounding box contains an object of class X" when the object is in fact of class Y, a class that was not encountered during training.

Baseline: "Dropout Sampling for Robust Object Detection in Open-Set Conditions", Miller et al., ICRA 2018

Abstract: They use the SSD network as a starting point. The proposed method essentially consists of 1) keeping dropout enabled at test time, 2) feeding the same image through SSD multiple times (with dropout active, so the result differs on each pass), and 3) filtering the resulting bounding boxes to detect errors of the kind described in the problem statement.
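Sketch of the sampling idea, not the baseline's implementation: a toy linear classifier stands in for SSD, and the per-class scores of several dropout passes are averaged. The names toy_forward, dropout_sampling and entropy, the toy weights, and the use of the entropy of the averaged scores as the uncertainty measure are assumptions made for illustration. Grouping boxes across passes is left out.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_forward(features, weights, drop_prob, rng):
    """Hypothetical stand-in for one detector forward pass with dropout on."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each feature with probability drop_prob,
    # rescale the surviving features by 1 / keep.
    dropped = [f / keep if rng.random() < keep else 0.0 for f in features]
    logits = [sum(w * f for w, f in zip(row, dropped)) for row in weights]
    return softmax(logits)


def dropout_sampling(features, weights, num_passes=10, drop_prob=0.5, seed=0):
    """Average the class scores of several stochastic forward passes."""
    rng = random.Random(seed)
    samples = [toy_forward(features, weights, drop_prob, rng)
               for _ in range(num_passes)]
    num_classes = len(weights)
    return [sum(s[c] for s in samples) / num_passes
            for c in range(num_classes)]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Toy input: 3 features, 2 classes (values are made up).
features = [1.0, 2.0, 0.5]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mean_scores = dropout_sampling(features, weights, num_passes=10)
uncertainty = entropy(mean_scores)
```

A high entropy of the averaged scores indicates that the passes disagree or that no class dominates. A detection like that would be a candidate for rejection.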

Problem of baseline: multiple forward passes are costly in computation time (they use up to 42 passes per image)

Network setup of baseline: vanilla SSD network with MS COCO classes on SceneNet RGB-D (multiple SceneNet classes grouped into COCO classes)

Own approach: SSD for object detection, GPND for novelty detection; for each bounding box, it is determined whether the object is novel

Algorithm:

Take a scene from SceneNet RGB-D.
Get the 2D data from the scene.
Feed the 2D data for the object to SSD. Classify the object and calculate the classification loss.
Feed the 2D data for the object to the autoencoder. Calculate the novelty score.
The novelty score indicates whether an object is unknown.
If the score is above a certain novelty threshold, discard the detection from further evaluation (similar to the entropy threshold of the baseline).
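The last step of the algorithm can be sketched as a simple filter. The Detection record, its field names, the example scores and the threshold value are all hypothetical. The convention that a higher score means "more novel" follows the threshold rule in the steps above. How the GPND output is mapped to such a score is left open here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple         # (x_min, y_min, x_max, y_max) in pixels
    label: str         # class predicted by the detector
    confidence: float  # detector confidence for that class
    novelty: float     # novelty score for the box crop (higher = more novel)


def filter_known(detections, novelty_threshold):
    """Keep only detections whose novelty score does not exceed the threshold."""
    return [d for d in detections if d.novelty <= novelty_threshold]


# Made-up example values, not results.
detections = [
    Detection((10, 20, 110, 220), "chair", 0.91, novelty=0.12),
    Detection((200, 40, 320, 180), "tv", 0.67, novelty=0.85),
]
kept = filter_known(detections, novelty_threshold=0.5)
```

Only the first detection survives here. The second one is treated as an unknown object and excluded from further evaluation.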

Training: SSD and the Adversarial Autoencoder (AAE) are trained independently. Use the ground-truth bounding boxes for the AAE.
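Using ground-truth boxes for the AAE amounts to cropping each annotated object out of the image. A minimal sketch, assuming images as nested lists of pixel values and boxes as (x_min, y_min, x_max, y_max) with exclusive maxima. The helper crop_box and the box format are assumptions, not taken from SceneNet RGB-D or GPND.

```python
def crop_box(image, box):
    """Return the pixels of `image` inside `box`.

    `image` is a list of rows, `box` is (x_min, y_min, x_max, y_max)
    with exclusive maximum coordinates.
    """
    x_min, y_min, x_max, y_max = box
    height, width = len(image), len(image[0])
    # Clamp to the image so boxes touching the border stay valid.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    return [row[x_min:x_max] for row in image[y_min:y_max]]


# Toy 4x6 "image" whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
crop = crop_box(image, (1, 1, 4, 3))
```

The crops would presumably still need resizing to a fixed input size before they can be fed to the autoencoder.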

Dataset: SceneNet RGB-D dataset

Stretch Goals:

YCB Video dataset
Create a custom dataset with YCB objects using a robot arm
Test both trained networks on this dataset, which features the same objects in different environments

Open Questions: additional use for generator part of GPND beyond autoencoder function?

Evaluation:

networks to compare against each other

SSD initialized with MS COCO pretrained weights, fine-tuned on SceneNet RGB-D (vanilla SSD)
SSD initialized with MS COCO pretrained weights, with added dropout layers, fine-tuned on SceneNet RGB-D (Bayesian SSD)
Vanilla SSD combined with GPND trained on SceneNet RGB-D with MS COCO classes (own approach)

metrics

precision
recall
F1 score
absolute open set error (as defined by baseline)
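
The metrics can be computed from simple counts. The sketch below assumes that detections have already been matched to ground truth, giving counts of true positives, false positives and false negatives. The open-set error is written here as the number of accepted detections that fall on unknown objects. That is my reading of the baseline and should be checked against the paper. Function names are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 score from matched detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def absolute_open_set_error(accepted_is_unknown):
    """Count accepted detections that lie on unknown objects.

    `accepted_is_unknown` holds one boolean per detection that passed the
    threshold: True if the detection falls on an unknown object.
    """
    return sum(1 for is_unknown in accepted_is_unknown if is_unknown)


# Made-up counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
ose = absolute_open_set_error([True, False, True])
```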