Written section about implementing auto-encoder
Signed-off-by: Jim Martens <github@2martens.de>
parent
b77a2f6789
commit
f52edb95bb
22
body.tex
@@ -683,7 +683,7 @@ Miller et al. use SSD for the object detection part. They compare
 vanilla SSD, vanilla SSD with entropy thresholding, and the
 Bayesian SSD with each other. The Bayesian SSD was created by
 adding two dropout layers to the vanilla SSD; no other changes
-were made. Miller et al. used weights that were trained on MS COCO
+were made. Miller et al. use weights that were trained on MS COCO
 to predict on SceneNet RGB-D.

 As the source code was not available, I had to implement Miller's
@@ -709,10 +709,28 @@ RGB-D data set, I counted the number of instances per COCO class and
 a huge class imbalance was visible; not just globally but also
 between trajectories: some classes are only present in some
 trajectories. This makes training with SSD on SceneNet practically
-impossible.
+impossible.

+\section{Implementing an auto-encoder}

+Pidhorskyi et al.~\cite{Pidhorskyi2018} released their source code,
+but it targets PyTorch; I had to adapt the code for TensorFlow. For
+the proof of concept, a simpler model consisting of an encoder and a
+generator was used; the adversarial parts were disabled. The encoder
+starts with a sigmoid-activated convolutional layer, followed by two
+convolutional layers with ReLU as activation function. It ends
+with a Flatten and a Dense layer.
+Decoding starts with a Dense layer, followed by three transposed
+convolutional layers with ReLU as activation function; the last
+layer is a transposed convolutional layer with sigmoid as
+activation function.

+The auto-encoder works on the MNIST data set, as expected. It
+works very well for COCO as well, with one caveat: it is equally
+good for all classes, even when trained on only one. Novelty
+detection is out of the question under these circumstances.
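
The caveat above can be made concrete with a small sketch. It assumes that novelty is scored by reconstruction error and that an input is flagged as novel when its error exceeds a threshold fitted on the training class; this is a common setup for auto-encoder-based novelty detection, not necessarily the one used in the thesis code. The function names, the quantile-based threshold, and the error values are all hypothetical and chosen only to mirror the observation that unseen classes reconstruct as well as the training class.

```python
from statistics import mean


def reconstruction_error(original, reconstruction):
    """Mean squared error between two equally long pixel sequences."""
    if len(original) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    return mean((o - r) ** 2 for o, r in zip(original, reconstruction))


def fit_threshold(known_errors, quantile=0.95):
    """Pick an error value that roughly `quantile` of known samples stay at or below."""
    ordered = sorted(known_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_novel(error, threshold):
    """Flag a sample as novel when it reconstructs worse than the threshold."""
    return error > threshold


# Hypothetical errors (assumption): the unseen class reconstructs as well
# as the training class, as reported for COCO in the text.
known_class_errors = [0.010, 0.012, 0.011, 0.013, 0.009]
unseen_class_errors = [0.011, 0.010, 0.012, 0.009, 0.013]

threshold = fit_threshold(known_class_errors)
flagged = [is_novel(e, threshold) for e in unseen_class_errors]
```

With these overlapping error values, no sample from the unseen class is flagged, so under the stated assumptions a threshold on reconstruction error cannot separate known from unknown classes.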

 \chapter{Results}

 \chapter{Discussion}