Updated decoding pipeline for Bayesian SSD
Signed-off-by: Jim Martens <github@2martens.de>
 body.tex | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)
@@ -57,7 +57,7 @@ conditions (see figure \ref{fig:open-set}).
 In non-technical words this effectively describes
 the kind of situation you encounter with CCTV cameras or robots
 outside of a laboratory. Both use cameras that record
-images. Subsequently a neural network analyses the image
+images. Subsequently, a neural network analyses the image
 and returns a list of detected and classified objects that it
 found in the image. The problem here is that networks can only
 classify what they know. If presented with an object type that
@@ -69,7 +69,7 @@ such a network would falsely assume that a high confidence always
 means the classification is very likely correct. If they use
 a proprietary system they might not even be able to find out
 that the network was never trained on a particular type of object.
-Therefore it would be impossible for them to identify the output
+Therefore, it would be impossible for them to identify the output
 of the network as false positive.
 
 This goes back to the need for automatic explanation. Such a system
@@ -78,7 +78,7 @@ hence mark any classification result of the network as meaningless.
 Technically there are two slightly different approaches that deal
 with this type of task: model uncertainty and novelty detection.
 
-Model uncertainty can be measured with dropout sampling.
+Model uncertainty can be measured, for example, with dropout sampling.
 Dropout is usually used only during training but
 Miller et al.~\cite{Miller2018} use them also during testing
 to achieve different results for the same image making use of
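The test-time dropout sampling this hunk refers to can be sketched in plain NumPy. This is a minimal illustration under assumptions, not the SSD network or Miller et al.'s exact setup: the toy input, weight matrix, layer size, and 0.5 dropout rate are invented for the example. The point is only that keeping dropout active at test time yields a different class distribution on every forward pass over the same input, and the spread across passes reflects model uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward_pass(x, w, drop_rate=0.5):
    """One stochastic forward pass: dropout stays active at test time."""
    mask = rng.random(x.shape) >= drop_rate
    h = np.where(mask, x, 0.0) / (1.0 - drop_rate)  # inverted dropout
    return softmax(h @ w)

# Toy input and single weight layer (illustrative only).
x = rng.standard_normal(16)
w = rng.standard_normal((16, 4))

# Several passes over the same image give differing class probabilities;
# averaging them gives the sampled estimate of the predictive distribution.
probs = np.array([forward_pass(x, w) for _ in range(10)])
mean_probs = probs.mean(axis=0)
```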
@@ -143,19 +143,12 @@ passes is varied to identify their impact.
 delivers better object detection performance under open set
 conditions compared to object detection without it.
 
-\paragraph{Contribution}
-The contribution of this thesis is a comparison between dropout
-sampling and auto-encoding with respect to the overall performance
-of both for object detection in the open set conditions using
-the SSD network for object detection and the SceneNet RGB-D data set
-with MS COCO classes.
-
 \subsection*{Reader's guide}
 
 First, chapter \ref{chap:background} presents related works and
 provides the background for dropout sampling a.k.a Bayesian SSD.
 Afterwards, chapter \ref{chap:methods} explains how the Bayesian SSD
-works, and provides details about the software and source code design.
+works and how the decoding pipelines are structured.
 Chapter \ref{chap:experiments-results} presents the data sets,
 the experimental setup, and the results. This is followed by
 chapter \ref{chap:discussion} and \ref{chap:closing}, focusing on
@@ -473,7 +466,7 @@ image across multiple forward passes.
 
 \subsection{Implementation Details}
 
-For this thesis, an SSD implementation based on Tensorflow and
+For this thesis, an SSD implementation based on Tensorflow~\cite{Abadi2015} and
 Keras\footnote{\url{https://github.com/pierluigiferrari/ssd\_keras}}
 was used. It was modified to support entropy thresholding,
 partitioning of observations, and dropout
@@ -554,8 +547,11 @@ to the following shape of the network output after all
 forward passes: \((batch\_size, \#nr\_boxes \, \cdot \, \#nr\_forward\_passes, \#nr\_classes + 12)\). The size of the output
 increases linearly with more forward passes.
 
-These detections have to be decoded first. Afterwards they are
-partitioned into observations to reduce the size of the output, and
+These detections have to be decoded first. Afterwards,
+all detections are thrown away which do not pass a confidence
+threshold for the class with the highest prediction probability.
+The remaining detections are partitioned into observations to
+further reduce the size of the output, and
 to identify uncertainty. This is accomplished by calculating the
 mutual IOU of every detection with all other detections. Detections
 with a mutual IOU score of 0.95 or higher are partitioned into an
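The two decoding steps this hunk adds — dropping detections whose best class probability falls below a confidence threshold, then grouping the survivors into observations by mutual IOU — can be sketched as follows. Only the 0.95 IOU threshold comes from the text; the detection tuple format, the 0.5 confidence value, and the greedy grouping order are assumptions made for the illustration:

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes (xmin, ymin, xmax, ymax)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def partition(detections, conf_threshold=0.5, iou_threshold=0.95):
    """detections: list of (box, class_probs) tuples. Returns a list of
    observations, each a list of detections whose boxes all mutually
    overlap with IOU >= iou_threshold."""
    # Step 1: discard detections below the confidence threshold.
    kept = [d for d in detections if max(d[1]) >= conf_threshold]
    # Step 2: greedily assign each detection to the first observation
    # whose members it overlaps strongly enough, else open a new one.
    observations = []
    for det in kept:
        for obs in observations:
            if all(iou(det[0], other[0]) >= iou_threshold for other in obs):
                obs.append(det)
                break
        else:
            observations.append([det])
    return observations

detections = [
    ((0, 0, 10, 10), [0.9, 0.1]),     # two strongly overlapping boxes
    ((0, 0, 10, 10.2), [0.85, 0.15]), # -> grouped into one observation
    ((50, 50, 60, 60), [0.8, 0.2]),   # distant box -> own observation
    ((0, 0, 10, 10), [0.4, 0.3]),     # below confidence threshold -> dropped
]
observations = partition(detections)
```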
@@ -575,8 +571,7 @@ varying classifications are averaged into multiple lower confidence
 values which should increase the entropy and, hence, flag an
 observation for removal.
 
-Per class confidence thresholding, non-maximum suppression, and
-top \(k\) selection happen like in vanilla SSD.
+The final step is a top \(k\) selection.
 
 \chapter{Experimental Setup and Results}
 
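The entropy-based flagging described above and the top-k step this hunk keeps can be illustrated as follows. The example distributions and scores are invented for the sketch: when the forward passes of an observation agree, the averaged distribution stays peaked and the entropy low; when they disagree, the average flattens and the entropy rises, flagging the observation for removal.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of an averaged class distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def top_k(observations, scores, k):
    """Keep the k observations with the highest confidence score."""
    order = np.argsort(scores)[::-1][:k]
    return [observations[i] for i in order]

# Passes that agree keep a peaked average; disagreeing passes flatten it.
agree = np.mean([[0.90, 0.05, 0.05], [0.92, 0.04, 0.04]], axis=0)
disagree = np.mean([[0.90, 0.05, 0.05], [0.05, 0.90, 0.05]], axis=0)
```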