Updated decoding pipeline for Bayesian SSD
Signed-off-by: Jim Martens <github@2martens.de>
parent 98d57633ff
commit 3360cb0315
body.tex | 27
@@ -57,7 +57,7 @@ conditions (see figure \ref{fig:open-set}).
 In non-technical words this effectively describes
 the kind of situation you encounter with CCTV cameras or robots
 outside of a laboratory. Both use cameras that record
-images. Subsequently a neural network analyses the image
+images. Subsequently, a neural network analyses the image
 and returns a list of detected and classified objects that it
 found in the image. The problem here is that networks can only
 classify what they know. If presented with an object type that
@@ -69,7 +69,7 @@ such a network would falsely assume that a high confidence always
 means the classification is very likely correct. If they use
 a proprietary system they might not even be able to find out
 that the network was never trained on a particular type of object.
-Therefore it would be impossible for them to identify the output
+Therefore, it would be impossible for them to identify the output
 of the network as false positive.
 
 This goes back to the need for automatic explanation. Such a system
@@ -78,7 +78,7 @@ hence mark any classification result of the network as meaningless.
 Technically there are two slightly different approaches that deal
 with this type of task: model uncertainty and novelty detection.
 
-Model uncertainty can be measured with dropout sampling.
+Model uncertainty can be measured, for example, with dropout sampling.
 Dropout is usually used only during training but
 Miller et al.~\cite{Miller2018} use it also during testing
 to achieve different results for the same image making use of
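The dropout-sampling idea in this hunk can be illustrated with a minimal NumPy sketch (the toy weights and single dropout layer are assumptions for illustration, not the thesis code): dropout stays active at test time, so repeated forward passes over the same input yield different class probabilities, and their spread serves as an uncertainty signal.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, W, drop_rate, rng):
    # Dropout mask stays active at test time (unlike standard inference)
    mask = rng.random(x.shape) >= drop_rate
    h = (x * mask) / (1.0 - drop_rate)  # inverted-dropout scaling
    return softmax(h @ W)

# Hypothetical toy input and weight matrix (4 features, 3 classes)
x = rng.normal(size=(4,))
W = rng.normal(size=(4, 3))

# Multiple stochastic forward passes over the same input
passes = [forward(x, W, drop_rate=0.5, rng=rng) for _ in range(10)]

mean_probs = np.mean(passes, axis=0)  # averaged class probabilities
std_probs = np.std(passes, axis=0)    # spread: larger means more uncertain
```

In the Bayesian SSD pipeline the same averaging is applied per detection rather than to a single classifier output.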
@@ -143,19 +143,12 @@ passes is varied to identify their impact.
 delivers better object detection performance under open set
 conditions compared to object detection without it.
 
 \paragraph{Contribution}
 The contribution of this thesis is a comparison between dropout
 sampling and auto-encoding with respect to the overall performance
 of both for object detection in the open set conditions using
 the SSD network for object detection and the SceneNet RGB-D data set
 with MS COCO classes.
 
 \subsection*{Reader's guide}
 
 First, chapter \ref{chap:background} presents related works and
 provides the background for dropout sampling a.k.a Bayesian SSD.
 Afterwards, chapter \ref{chap:methods} explains how the Bayesian SSD
-works, and provides details about the software and source code design.
+works and how the decoding pipelines are structured.
 Chapter \ref{chap:experiments-results} presents the data sets,
 the experimental setup, and the results. This is followed by
 chapter \ref{chap:discussion} and \ref{chap:closing}, focusing on
@@ -473,7 +466,7 @@ image across multiple forward passes.
 
 \subsection{Implementation Details}
 
-For this thesis, an SSD implementation based on Tensorflow and
+For this thesis, an SSD implementation based on Tensorflow~\cite{Abadi2015} and
 Keras\footnote{\url{https://github.com/pierluigiferrari/ssd\_keras}}
 was used. It was modified to support entropy thresholding,
 partitioning of observations, and dropout
@@ -554,8 +547,11 @@ to the following shape of the network output after all
 forward passes: \((batch\_size, \#nr\_boxes \, \cdot \, \#nr\_forward\_passes, \#nr\_classes + 12)\). The size of the output
 increases linearly with more forward passes.
 
-These detections have to be decoded first. Afterwards they are
-partitioned into observations to reduce the size of the output, and
+These detections have to be decoded first. Afterwards,
+all detections are thrown away which do not pass a confidence
+threshold for the class with the highest prediction probability.
+The remaining detections are partitioned into observations to
+further reduce the size of the output, and
 to identify uncertainty. This is accomplished by calculating the
 mutual IOU of every detection with all other detections. Detections
 with a mutual IOU score of 0.95 or higher are partitioned into an
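The partitioning step described in this hunk can be sketched as follows (a minimal, hypothetical implementation; the corner-coordinate box format and the greedy grouping order are assumptions, not the thesis code):

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def partition(boxes, threshold=0.95):
    """Group detections whose mutual IoU with every group member
    reaches the threshold; each group becomes one observation."""
    observations = []
    for i, box in enumerate(boxes):
        for obs in observations:
            if all(iou(box, boxes[j]) >= threshold for j in obs):
                obs.append(i)
                break
        else:
            observations.append([i])
    return observations

# Two near-identical detections collapse into one observation,
# the clearly separate box stays on its own
groups = partition([(0, 0, 10, 10), (0, 0, 10, 10.1), (20, 20, 30, 30)])
```

The "mutual IOU of every detection with all other detections" requirement is enforced by the `all(...)` check, so every member of an observation overlaps heavily with every other member.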
@@ -575,8 +571,7 @@ varying classifications are averaged into multiple lower confidence
 values which should increase the entropy and, hence, flag an
 observation for removal.
 
-Per class confidence thresholding, non-maximum suppression, and
-top \(k\) selection happen like in vanilla SSD.
+The final step is a top \(k\) selection.
 
 \chapter{Experimental Setup and Results}
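The entropy mechanism mentioned in this hunk can be checked with a small sketch (the toy probability vectors are made up for illustration): averaging disagreeing classifications softens the distribution, which raises its Shannon entropy, so an entropy threshold can flag such observations for removal.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete probability distribution."""
    p = np.asarray(p, dtype=float)
    logs = np.log(p, where=p > 0, out=np.zeros_like(p))
    return float(-(p * logs).sum())

confident = [0.9, 0.05, 0.05]  # forward passes agree on one class
averaged = [0.5, 0.3, 0.2]     # varying classifications, averaged

high = entropy(averaged)       # above a threshold: flag for removal
low = entropy(confident)       # below the threshold: keep
```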