Written section about implementing auto-encoder

Signed-off-by: Jim Martens <github@2martens.de>
This commit is contained in:
Jim Martens 2019-08-06 10:54:24 +02:00

Miller et al. use SSD for the object detection part. They compare
vanilla SSD, vanilla SSD with entropy thresholding, and the
Bayesian SSD with each other. The Bayesian SSD was created by
adding two dropout layers to the vanilla SSD; no other changes
were made. Miller et al. use weights that were trained on MS COCO
to predict on SceneNet RGB-D.
As the source code was not available, I had to implement Miller's
[\dots]

RGB-D data set, I counted the number of instances per COCO class and
a huge class imbalance was visible; not just globally but also
between trajectories: some classes are only present in some
trajectories. This makes training with SSD on SceneNet practically
impossible.
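The counting step can be sketched as follows; the annotation format shown is hypothetical and only illustrates the idea, since the actual SceneNet RGB-D ground truth files are structured differently.

```python
from collections import Counter

# Hypothetical annotation format: (trajectory_id, coco_class) pairs.
# The real SceneNet RGB-D ground truth is structured differently.
annotations = [
    (0, "chair"), (0, "chair"), (0, "bed"),
    (1, "chair"),
    (2, "tv"), (2, "tv"), (2, "tv"),
]

# global counts reveal the overall class imbalance
global_counts = Counter(label for _, label in annotations)

# per-trajectory counts reveal classes that are entirely
# absent from some trajectories
per_trajectory = Counter(annotations)
```

Comparing the two counters makes both kinds of imbalance visible: globally skewed class frequencies, and classes that never occur in a given trajectory.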
\section{Implementing an auto-encoder}
Pidhorskyi et al.~\cite{Pidhorskyi2018} released their source code,
but it targets PyTorch; I had to adapt it for TensorFlow. For the
proof of concept, a simpler encoder and generator model was used,
and the adversarial parts were disabled. The encoder starts with
a sigmoid-activated convolutional layer, followed by two
convolutional layers with ReLU as activation function. It ends
with a Flatten layer and a Dense layer.
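A minimal TensorFlow/Keras sketch of such an encoder is shown below; the input shape, filter counts, strides, and the 32-dimensional latent size are all illustrative assumptions, not values taken from the actual implementation.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_encoder(input_shape=(28, 28, 1), latent_dim=32):
    """Sketch of the described encoder; all sizes are assumptions."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # first convolutional layer, sigmoid-activated
        layers.Conv2D(32, 3, strides=2, padding="same", activation="sigmoid"),
        # two ReLU-activated convolutional layers
        layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
        layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),
        # flatten and project to the latent code
        layers.Flatten(),
        layers.Dense(latent_dim),
    ])
```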
Decoding starts with a Dense layer, followed by three transposed
convolutional layers with ReLU as activation function; the last
layer is a transposed convolutional layer with sigmoid as
activation function.
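A matching decoder sketch follows the same hedged assumptions; the Reshape layer and all layer sizes are choices needed to make the shapes line up for 28x28 inputs, not details from the original code.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_decoder(latent_dim=32, channels=1):
    """Sketch of the described decoder; all sizes are assumptions."""
    return keras.Sequential([
        keras.Input(shape=(latent_dim,)),
        # Dense layer restoring a small spatial feature map
        layers.Dense(7 * 7 * 64, activation="relu"),
        layers.Reshape((7, 7, 64)),
        # three ReLU-activated transposed convolutional layers
        layers.Conv2DTranspose(64, 3, strides=1, padding="same",
                               activation="relu"),
        layers.Conv2DTranspose(64, 3, strides=2, padding="same",
                               activation="relu"),
        layers.Conv2DTranspose(32, 3, strides=2, padding="same",
                               activation="relu"),
        # final sigmoid-activated transposed convolution producing the image
        layers.Conv2DTranspose(channels, 3, strides=1, padding="same",
                               activation="sigmoid"),
    ])
```

The two strided transposed convolutions upsample 7x7 back to 28x28, mirroring the downsampling in the encoder sketch.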
The auto-encoder works on the MNIST data set, as expected. It also
works very well on COCO, with one caveat: reconstruction quality is
equally good for all classes, even when the network is trained on
only one of them. Novelty detection is out of the question under
these circumstances.
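For reference, the reconstruction-error criterion that fails here (a common novelty score, not the full generative-probability score of Pidhorskyi et al.) can be sketched as:

```python
import numpy as np

def reconstruction_error(x, x_hat):
    # per-sample mean squared error over all pixel dimensions
    return ((x - x_hat) ** 2).reshape(len(x), -1).mean(axis=1)

def is_novel(x, x_hat, threshold):
    # a sample is flagged as novel when its error exceeds the threshold;
    # this breaks down when the auto-encoder reconstructs samples of
    # unseen classes just as well as samples of the training class
    return reconstruction_error(x, x_hat) > threshold

# toy example: the second sample is reconstructed badly
x = np.zeros((2, 4, 4, 1))
x_hat = x.copy()
x_hat[1] += 0.5
print(is_novel(x, x_hat, threshold=0.1))  # [False  True]
```

When reconstructions are equally good for all classes, the error distributions of known and novel samples overlap and no threshold separates them.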
\chapter{Results}
\chapter{Discussion}