diff --git a/body.tex b/body.tex
index 7ece5f9..d509960 100644
--- a/body.tex
+++ b/body.tex
@@ -683,7 +683,7 @@ Miller et al. use SSD for the object detection part. They compare
 vanilla SSD, vanilla SSD with entropy thresholding, and the Bayesian
 SSD with each other. The Bayesian SSD was created by adding two
 dropout layers to the vanilla SSD; no other changes
-were made. Miller et al. used weights that were trained on MS COCO
+were made. Miller et al. use weights that were trained on MS COCO
 to predict on SceneNet RGB-D.
 
 As the source code was not available, I had to implement Miller's
@@ -709,10 +709,28 @@ RGB-D data set, I counted the number of instances per COCO class and
 a huge class imbalance was visible; not just globally but also between
 trajectories: some classes are only present in some trajectories.
 This makes training with SSD on SceneNet practically
-impossible.
+impossible.
 
 \section{Implementing an auto-encoder}
 
+Pidhorskyi et al.~\cite{Pidhorskyi2018} released their source code,
+but it is for
+PyTorch; I had to adapt the code for TensorFlow. For the proof of
+concept, a simpler model of encoder and generator was used; the
+adversarial parts were disabled for this. The encoder starts with
+a sigmoid-activated convolutional layer, followed by two
+convolutional layers with ReLU as activation function. It ends
+with a Flatten and a Dense layer.
+Decoding starts with a Dense layer, followed by three transposed
+convolutional layers with ReLU as activation function; the last
+layer is a transposed convolutional layer with sigmoid as
+activation function.
+
+The auto-encoder works on the MNIST data set, as expected. It
+works very well for COCO as well, with one caveat: it is equally
+good for all classes, even when trained on only one. Novelty
+detection is out of the question under these circumstances.
+
 \chapter{Results}
 
 \chapter{Discussion}
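
The caveat in the added auto-encoder paragraph can be made concrete with a small sketch. It assumes that novelty is scored by the reconstruction error alone, which is a simplification and not necessarily how Pidhorskyi et al. score novelty. The function names (`reconstruction_error`, `is_novel`, `best_threshold_accuracy`) and all numbers are hypothetical and serve only as an illustration. When the auto-encoder reconstructs unseen classes as well as the trained class, no threshold on the error can separate the two groups.

```python
from statistics import mean


def reconstruction_error(original, reconstructed):
    """Mean squared error between two equally long sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    return mean((o - r) ** 2 for o, r in zip(original, reconstructed))


def is_novel(original, reconstructed, threshold):
    """Flag a sample as novel if its reconstruction error exceeds the threshold.

    Assumption: thresholding the error is the whole novelty score here.
    """
    return reconstruction_error(original, reconstructed) > threshold


def best_threshold_accuracy(known_errors, novel_errors):
    """Best accuracy that any single error threshold reaches on the two groups.

    A sample counts as correct if it is known with error <= threshold
    or novel with error > threshold.
    """
    candidates = sorted(set(known_errors) | set(novel_errors))
    # Also try a threshold below every error (everything is flagged as novel).
    candidates = [candidates[0] - 1.0] + candidates
    total = len(known_errors) + len(novel_errors)
    best = 0.0
    for threshold in candidates:
        correct = sum(e <= threshold for e in known_errors)
        correct += sum(e > threshold for e in novel_errors)
        best = max(best, correct / total)
    return best


if __name__ == "__main__":
    # Made-up error values, not measurements from the thesis.
    separable = best_threshold_accuracy([0.01, 0.02], [0.50, 0.60])
    inseparable = best_threshold_accuracy([0.01, 0.02], [0.01, 0.02])
    print(separable, inseparable)
```

With clearly different error ranges a threshold separates the groups perfectly; with identical error ranges the best threshold is no better than guessing, which is the situation described for COCO.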