Expanded methods chapter with vanilla SSD explanation

Signed-off-by: Jim Martens <github@2martens.de>
2019-09-19 13:56:55 +02:00
parent 8ba241a8d7
commit 59b09c45ff


@ -129,7 +129,7 @@ This leads to the following hypothesis: \emph{Dropout sampling
delivers better object detection performance under open set
conditions compared to object detection without it.}
For the purpose of this thesis, I will use the vanilla SSD (that is,
the original SSD) as baseline to compare against. In particular,
vanilla SSD uses a per-class confidence threshold of 0.01, an IOU
threshold of 0.45 for the non-maximum suppression, and a top k value
of 200.
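For illustration, these settings can be summarised as follows; the
parameter names are chosen for readability and are not necessarily the
ones used in the implementation:
\begin{verbatim}
# decoding settings of vanilla SSD as used in this thesis
vanilla_ssd_settings = {
    'confidence_thresh': 0.01,  # per-class confidence threshold
    'iou_threshold': 0.45,      # IOU threshold for non-maximum suppression
    'top_k': 200,               # keep at most 200 detections per image
}
\end{verbatim}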
@ -421,16 +421,9 @@ be used to identify and reject these false positive cases.
\label{chap:methods}
This chapter explains the functionality of vanilla SSD, Bayesian SSD,
and the decoding pipelines.
\section{Vanilla SSD}
\begin{figure}
\centering
@ -440,11 +433,29 @@ the uncertainty calculation, and implementation details.
\label{fig:vanilla-ssd}
\end{figure}
Vanilla SSD is based upon the VGG-16 network (see figure
\ref{fig:vanilla-ssd}) and adds extra feature layers. The entire
image (always of size 300x300) is divided into anchor boxes. During
training, each of these boxes is mapped to a ground truth box or to
background. For every anchor box, the offsets to the object and the
class confidences are calculated. The output of the SSD network
consists of the predictions with class confidences, offsets to the
anchor box, anchor box coordinates, and variances. The model loss is
a weighted sum of localisation and confidence loss. As the network
has a fixed number of anchor boxes, every forward pass creates the
same number of detections - 8732 in the case of SSD 300x300.
Notably, the object proposals are made in a single run for an
image - hence the name single shot.
Other techniques like Faster R-CNN employ region proposals
and pooling. For more detailed information on SSD, please refer to
Liu et al.~\cite{Liu2016}.
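Following Liu et al.~\cite{Liu2016}, this weighted sum can be written as
\begin{equation}
	L(x, c, l, g) = \frac{1}{N} \left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right),
\end{equation}
where $N$ is the number of matched anchor boxes, $L_{conf}$ the
confidence loss, $L_{loc}$ the localisation loss, and $\alpha$ a
weighting factor.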
\section{Bayesian SSD for Model Uncertainty}
Networks trained with dropout are general approximate Bayesian
models~\cite{Gal2017}. As such, they can be used for everything a true
Bayesian model could be used for. This idea is applied to SSD in this
thesis: two dropout layers are added to vanilla SSD, after the layers
fc6 and fc7 respectively (see figure \ref{fig:bayesian-ssd}).
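The following is a minimal Keras sketch of this modification; the
dropout rate of 0.5, the layer parameters, and the helper name are
assumptions for illustration, not the exact configuration of the
implementation used in this thesis:
\begin{verbatim}
from tensorflow.keras.layers import Conv2D, Dropout

def fc6_fc7_with_dropout(x, rate=0.5):
    # fc6 and fc7 are convolutional layers in SSD; a dropout layer is
    # appended to each. training=True keeps dropout active during the
    # forward passes at test time, as dropout sampling requires.
    fc6 = Conv2D(1024, (3, 3), dilation_rate=(6, 6), padding='same',
                 activation='relu', name='fc6')(x)
    fc6 = Dropout(rate, name='fc6_dropout')(fc6, training=True)
    fc7 = Conv2D(1024, (1, 1), padding='same',
                 activation='relu', name='fc7')(fc6)
    fc7 = Dropout(rate, name='fc7_dropout')(fc7, training=True)
    return fc7
\end{verbatim}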
\begin{figure}
\centering
@ -454,14 +465,14 @@ two dropout layers after the fc6 and fc7 layers (see figure \ref{fig:bayesian-ss
\label{fig:bayesian-ssd}
\end{figure}
The motivation for this is model uncertainty: an uncertain model will
predict different classes for the same object in the same image across
multiple forward passes. This uncertainty is measured with entropy:
every forward pass results in predictions, which are partitioned into
observations, and subsequently the entropy of each observation is
calculated. A high entropy indicates a near-uniform distribution of
confidences, whereas a low entropy indicates a large confidence in one
class and very low confidences in the other classes.
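As a sketch, let $\bar{q}_c$ denote the aggregated (for example,
averaged over the forward passes) confidence for class $c$ within one
observation. The entropy of this observation is then
\begin{equation}
	H = - \sum_{c=1}^{K} \bar{q}_c \log \bar{q}_c ,
\end{equation}
with $K$ the number of classes; a uniform confidence vector maximises
$H$, a one-hot-like vector minimises it.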
\subsection{Implementation Details}
@ -469,8 +480,11 @@ For this thesis, an SSD implementation based on Tensorflow~\cite{Abadi2015} and
Keras\footnote{\url{https://github.com/pierluigiferrari/ssd\_keras}}
was used. It was modified to support entropy thresholding,
partitioning of observations, and dropout
layers in the SSD model. Entropy thresholding takes place before
the per-class confidence threshold is applied.
The Bayesian variant was not fine-tuned and uses the same weights
as vanilla SSD.
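The order of entropy thresholding and per-class confidence threshold
can be sketched as follows; variable names and array layout are
illustrative and do not correspond to the actual code of the modified
implementation:
\begin{verbatim}
import numpy as np

def filter_observations(confidences, entropy_threshold,
                        confidence_threshold=0.01):
    # confidences: array of shape (num_observations, num_classes)
    # 1) entropy thresholding: drop observations with high entropy
    entropy = -np.sum(confidences * np.log(confidences + 1e-12), axis=1)
    kept = confidences[entropy < entropy_threshold]
    # 2) per-class confidence threshold: zero out low class scores
    return kept * (kept > confidence_threshold)
\end{verbatim}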
\section{Decoding Pipelines}
@ -624,8 +638,8 @@ an open set condition. To this end, the weights for the last
All images of the minival2014 data set were used but only ground truth
belonging to the first 60 classes was loaded. The remaining 20
classes were considered ``unknown'' and no ground truth bounding
boxes for them were provided during the inference phase.
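A minimal sketch of this filtering, assuming COCO-style annotations
and that the known classes are mapped to the ids 1 to 60, could look
like this:
\begin{verbatim}
# Illustrative only: keep ground truth for the 60 known classes,
# the remaining 20 classes stay unknown (no boxes at inference time).
def filter_ground_truth(annotations, known_class_ids=set(range(1, 61))):
    return [ann for ann in annotations
            if ann['category_id'] in known_class_ids]
\end{verbatim}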
\section{Experimental Setup}