Added skeleton replication of Miller et al

Signed-off-by: Jim Martens <github@2martens.de>
Jim Martens 2019-08-05 15:53:31 +02:00
parent eef43bf9a6
commit 737f486d6c
1 changed files with 27 additions and 1 deletions

@ -657,10 +657,36 @@ Wordnet ID and searching for a fitting COCO class.
The ground truth for SceneNet RGB-D is stored in protobuf files
and had to be converted into Python format to use it in the
codebase. Only ground truth instances that had a matching
COCO class were saved; the rest were discarded.
\section{Replication of Miller et al.}
Miller et al. use SSD for the object detection part. They compare
vanilla SSD, vanilla SSD with entropy thresholding, and the
Bayesian SSD with each other. The Bayesian SSD was created by
adding two dropout layers to the vanilla SSD; no other changes
were made. Miller et al. used weights that were trained on MS COCO
to predict on SceneNet RGB-D.
As the source code was not available, I had to implement the work
of Miller et al. myself. For the SSD network I used an implementation
that is compatible with TensorFlow; this implementation had to be
changed to work with eager mode. Further changes were made to
support entropy thresholding.
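As a sketch of what entropy thresholding computes, assume that
each detection carries a vector of softmax class scores
$\mathbf{s} = (s_1, \dots, s_C)$ over $C$ classes and that the
Shannon entropy is used; the symbols $\mathbf{s}$, $C$, and
$\theta$ are notation introduced here for illustration and are not
taken from Miller et al.
\begin{equation}
H(\mathbf{s}) = -\sum_{c=1}^{C} s_c \log s_c
\end{equation}
Under this assumption, a detection would be kept only if
$H(\mathbf{s}) \leq \theta$ for a chosen threshold $\theta$. The
entropy is $0$ when a single class has score $1$ and reaches its
maximum $\log C$ for a uniform score vector, so a lower threshold
discards more of the uncertain detections.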
For the Bayesian variant, observations have to be calculated:
detections of multiple forward passes for the same image are averaged
into an observation. This algorithm was implemented based on the
information available in the paper.
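The averaging step can be sketched as follows, assuming that $n$
detections from different forward passes have already been grouped
into one observation and that detection $j$ consists of a class
score vector $\mathbf{s}_j$ and bounding box coordinates
$\mathbf{b}_j$. This notation is an assumption made for
illustration; how detections are grouped is not covered by it.
\begin{equation}
\bar{\mathbf{s}} = \frac{1}{n} \sum_{j=1}^{n} \mathbf{s}_j, \qquad
\bar{\mathbf{b}} = \frac{1}{n} \sum_{j=1}^{n} \mathbf{b}_j
\end{equation}
The observation is then represented by the averaged scores
$\bar{\mathbf{s}}$ and the averaged box $\bar{\mathbf{b}}$.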
To better understand the SceneNet RGB-D data set, I counted the
number of instances per COCO class. The counts reveal a huge class
imbalance, not just globally but also between trajectories: some
classes are only present in some trajectories. This makes training
SSD on SceneNet RGB-D practically impossible.
I tried to fine-tune the SSD on SceneNet RGB-D because the
pre-trained weights did not produce any detections.
\section{Implementing an auto-encoder}
\chapter{Results}