Added skeleton replication of Miller et al

Signed-off-by: Jim Martens <github@2martens.de>
2019-08-05 15:53:31 +02:00 · 737f486d6c
parent eef43bf9a6
commit 737f486d6c
1 changed file with 27 additions and 1 deletion
--- a/body.tex
+++ b/body.tex
@@ -657,10 +657,36 @@ Wordnet ID and searching for a fitting COCO class.
 The ground truth for SceneNet RGB-D is stored in protobuf files
 and had to be converted into Python format to use it in the
 codebase. Only ground truth instances that had a matching
-COCO class were saved, the rest discarded. 
+COCO class were saved, the rest discarded.

 \section{Replication of Miller et al.}

+Miller et al. use SSD for the object detection part. They compare
+vanilla SSD, vanilla SSD with entropy thresholding, and the
+Bayesian SSD with each other. The Bayesian SSD was created by
+adding two dropout layers to the vanilla SSD; no other changes
+were made. Miller et al. used weights that were trained on MS COCO
+to predict on SceneNet RGB-D.
+
+As the source code was not available, I had to implement the work of
+Miller et al. myself. For the SSD network I used an implementation
+that is compatible with TensorFlow; this implementation had to be
+changed to work with eager mode. Further changes were made to
+support entropy thresholding.
+
+For the Bayesian variant, observations have to be calculated:
+detections of multiple forward passes for the same image are averaged
+into an observation. This algorithm was implemented based on the
+information available in the paper.
+
+To better understand the SceneNet RGB-D data set, I counted the
+number of instances per COCO class. The counts show a strong class
+imbalance, not just globally but also between trajectories: some
+classes are only present in some trajectories. This makes training
+SSD on SceneNet practically impossible.
+I tried to fine-tune SSD on SceneNet because the
+pre-trained weights did not produce detection results.
+
 \section{Implementing an auto-encoder}

 \chapter{Results}
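
The added text mentions entropy thresholding for the vanilla SSD without showing what it involves. The following is a minimal sketch of one way it could look, not the code from this repository: it assumes each detection comes with a softmax class distribution, computes the Shannon entropy of that distribution, and keeps only detections whose entropy does not exceed a threshold. The function names `softmax_entropy` and `entropy_threshold`, the use of NumPy, and the choice of `<=` for the comparison are assumptions made for illustration.

```python
import numpy as np


def softmax_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy (in nats) of each row of class probabilities.

    `probs` is assumed to have shape (num_detections, num_classes),
    with every row summing to 1.
    """
    # Clip to avoid log(0); the tiny epsilon barely changes the result.
    clipped = np.clip(probs, 1e-12, 1.0)
    return -np.sum(clipped * np.log(clipped), axis=-1)


def entropy_threshold(probs: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask: True for detections with entropy <= threshold.

    Hypothetical helper; whether the boundary case is kept or
    rejected is an assumption of this sketch.
    """
    return softmax_entropy(probs) <= threshold


if __name__ == "__main__":
    # One confident detection and one maximally uncertain detection.
    scores = np.array([
        [1.0, 0.0, 0.0],
        [1 / 3, 1 / 3, 1 / 3],
    ])
    mask = entropy_threshold(scores, threshold=0.5)
    print(mask)  # the confident detection is kept, the uniform one is not
```

A uniform distribution over three classes has entropy ln(3), roughly 1.10 nats, so any threshold below that value rejects it, while a one-hot distribution has entropy close to zero and passes.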