Updated decoding pipeline for Bayesian SSD
Signed-off-by: Jim Martens <github@2martens.de>
 body.tex | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)
@@ -57,7 +57,7 @@ conditions (see figure \ref{fig:open-set}).
 In non-technical words this effectively describes
 the kind of situation you encounter with CCTV cameras or robots
 outside of a laboratory. Both use cameras that record
-images. Subsequently a neural network analyses the image
+images. Subsequently, a neural network analyses the image
 and returns a list of detected and classified objects that it
 found in the image. The problem here is that networks can only
 classify what they know. If presented with an object type that
@@ -69,7 +69,7 @@ such a network would falsely assume that a high confidence always
 means the classification is very likely correct. If they use
 a proprietary system they might not even be able to find out
 that the network was never trained on a particular type of object.
-Therefore it would be impossible for them to identify the output
+Therefore, it would be impossible for them to identify the output
 of the network as false positive.
 
 This goes back to the need for automatic explanation. Such a system
@@ -78,7 +78,7 @@ hence mark any classification result of the network as meaningless.
 Technically there are two slightly different approaches that deal
 with this type of task: model uncertainty and novelty detection.
 
-Model uncertainty can be measured with dropout sampling.
+Model uncertainty can be measured, for example, with dropout sampling.
 Dropout is usually used only during training but
 Miller et al.~\cite{Miller2018} use them also during testing
 to achieve different results for the same image making use of
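The test-time dropout sampling this hunk refers to can be sketched in plain NumPy. This is a minimal illustration under assumptions, not the SSD network or Miller et al.'s exact setup: the toy input, weight matrix, layer size, and 0.5 dropout rate are invented for the example. The point is only that keeping dropout active at test time yields a different class distribution on every forward pass over the same input, and the spread across passes reflects model uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward_pass(x, w, drop_rate=0.5):
    """One stochastic forward pass: dropout stays active at test time."""
    mask = rng.random(x.shape) >= drop_rate
    h = np.where(mask, x, 0.0) / (1.0 - drop_rate)  # inverted dropout
    return softmax(h @ w)

# Toy input and single weight layer (illustrative only).
x = rng.standard_normal(16)
w = rng.standard_normal((16, 4))

# Several passes over the same image give differing class probabilities;
# averaging them gives the sampled estimate of the predictive distribution.
probs = np.array([forward_pass(x, w) for _ in range(10)])
mean_probs = probs.mean(axis=0)
```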
@@ -143,19 +143,12 @@ passes is varied to identify their impact.
 delivers better object detection performance under open set
 conditions compared to object detection without it.
 
-\paragraph{Contribution}
-The contribution of this thesis is a comparison between dropout
-sampling and auto-encoding with respect to the overall performance
-of both for object detection in the open set conditions using
-the SSD network for object detection and the SceneNet RGB-D data set
-with MS COCO classes.
-
 \subsection*{Reader's guide}
 
 First, chapter \ref{chap:background} presents related works and
 provides the background for dropout sampling a.k.a Bayesian SSD.
 Afterwards, chapter \ref{chap:methods} explains how the Bayesian SSD
-works, and provides details about the software and source code design.
+works and how the decoding pipelines are structured.
 Chapter \ref{chap:experiments-results} presents the data sets,
 the experimental setup, and the results. This is followed by
 chapter \ref{chap:discussion} and \ref{chap:closing}, focusing on
@@ -473,7 +466,7 @@ image across multiple forward passes.
 
 \subsection{Implementation Details}
 
-For this thesis, an SSD implementation based on Tensorflow and
+For this thesis, an SSD implementation based on Tensorflow~\cite{Abadi2015} and
 Keras\footnote{\url{https://github.com/pierluigiferrari/ssd\_keras}}
 was used. It was modified to support entropy thresholding,
 partitioning of observations, and dropout
@@ -554,8 +547,11 @@ to the following shape of the network output after all
 forward passes: \((batch\_size, \#nr\_boxes \, \cdot \, \#nr\_forward\_passes, \#nr\_classes + 12)\). The size of the output
 increases linearly with more forward passes.
 
-These detections have to be decoded first. Afterwards they are
-partitioned into observations to reduce the size of the output, and
+These detections have to be decoded first. Afterwards,
+all detections are thrown away which do not pass a confidence
+threshold for the class with the highest prediction probability.
+The remaining detections are partitioned into observations to
+further reduce the size of the output, and
 to identify uncertainty. This is accomplished by calculating the
 mutual IOU of every detection with all other detections. Detections
 with a mutual IOU score of 0.95 or higher are partitioned into an
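The two decoding steps this hunk adds — dropping detections whose best class probability falls below a confidence threshold, then grouping the survivors into observations by mutual IOU — can be sketched as follows. Only the 0.95 IOU threshold comes from the text; the detection tuple format, the 0.5 confidence value, and the greedy grouping order are assumptions made for the illustration:

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes (xmin, ymin, xmax, ymax)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def partition(detections, conf_threshold=0.5, iou_threshold=0.95):
    """detections: list of (box, class_probs) tuples. Returns a list of
    observations, each a list of detections whose boxes all mutually
    overlap with IOU >= iou_threshold."""
    # Step 1: discard detections below the confidence threshold.
    kept = [d for d in detections if max(d[1]) >= conf_threshold]
    # Step 2: greedily assign each detection to the first observation
    # whose members it overlaps strongly enough, else open a new one.
    observations = []
    for det in kept:
        for obs in observations:
            if all(iou(det[0], other[0]) >= iou_threshold for other in obs):
                obs.append(det)
                break
        else:
            observations.append([det])
    return observations

detections = [
    ((0, 0, 10, 10), [0.9, 0.1]),     # two strongly overlapping boxes
    ((0, 0, 10, 10.2), [0.85, 0.15]), # -> grouped into one observation
    ((50, 50, 60, 60), [0.8, 0.2]),   # distant box -> own observation
    ((0, 0, 10, 10), [0.4, 0.3]),     # below confidence threshold -> dropped
]
observations = partition(detections)
```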
@@ -575,8 +571,7 @@ varying classifications are averaged into multiple lower confidence
 values which should increase the entropy and, hence, flag an
 observation for removal.
 
-Per class confidence thresholding, non-maximum suppression, and
-top \(k\) selection happen like in vanilla SSD.
+The final step is a top \(k\) selection.
 
 \chapter{Experimental Setup and Results}
 
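The entropy-based flagging described above and the top-k step this hunk keeps can be illustrated as follows. The example distributions and scores are invented for the sketch: when the forward passes of an observation agree, the averaged distribution stays peaked and the entropy low; when they disagree, the average flattens and the entropy rises, flagging the observation for removal.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of an averaged class distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def top_k(observations, scores, k):
    """Keep the k observations with the highest confidence score."""
    order = np.argsort(scores)[::-1][:k]
    return [observations[i] for i in order]

# Passes that agree keep a peaked average; disagreeing passes flatten it.
agree = np.mean([[0.90, 0.05, 0.05], [0.92, 0.04, 0.04]], axis=0)
disagree = np.mean([[0.90, 0.05, 0.05], [0.05, 0.90, 0.05]], axis=0)
```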