
% body thesis file that contains the actual content

\chapter{Introduction}

\subsection*{Motivation}

Famous examples like the automatic soap dispenser which does not
recognize the hand of a black person but dispenses soap when presented
with a paper towel raise the question of bias in computer
systems~\cite{Friedman1996}. Related to this ethical question regarding
the design of so-called algorithms is the question of
algorithmic accountability~\cite{Diakopoulos2014}.

Supervised neural networks learn from input-output relations and
figure out by themselves what connections are necessary for that.
This feature is also their Achilles heel: it makes them effectively
black boxes and prevents any answers to questions of causality.

However, these questions of causality are of enormous consequence when
results of neural networks are used to make life-changing decisions:
Is a correlation enough to bring forth negative consequences
for a particular person? And if so, what is the possible defence
against math? Similar questions can be raised when looking at computer
vision networks that might be used together with so-called smart
CCTV cameras to discover suspicious activity.

This leads to the need for neural networks to explain their results.
Such an explanation must come from the network or an attached piece
of technology to allow mass adoption. This raises the question
of how such an endeavour can be achieved.

For neural networks there are fundamentally two types of tasks:
regression and classification. Regression deals with any case
where the goal for the network is to come close to an ideal
function that connects all data points. Classification, however,
describes tasks where the network is supposed to identify the
class of any given input. In this thesis, I will focus on
classification.

\subsection*{Object Detection in Open Set Conditions}

More specifically, I will look at object detection under open set
conditions. In non-technical words this effectively describes
the kind of situation you encounter with CCTV cameras or robots
outside of a laboratory. Both use cameras that record
images. Subsequently a neural network analyses the image
and returns a list of detected and classified objects that it
found in the image. The problem here is that networks can only
classify what they know. If presented with an object type that
the network was not trained with, as happens frequently in real
environments, it will still classify the object and might even
have a high confidence in doing so. Such an example would be
a false positive. Any ordinary person who uses the results of
such a network would falsely assume that a high confidence always
means the classification is very likely correct. If they use
a proprietary system they might not even be able to find out
that the network was never trained on a particular type of object.
Therefore, it would be impossible for them to identify the output
of the network as a false positive.

This goes back to the need for automatic explanation. Such a system
should by itself recognize that the given object is unknown and
hence mark any classification result of the network as meaningless.
Technically, there are two slightly different approaches that deal
with this type of task: model uncertainty and novelty detection.

Model uncertainty can be measured with dropout sampling.
Dropout is usually used only during training, but
Miller et al.~\cite{Miller2018} also use it during testing
to obtain different results for the same image across
multiple forward passes. The output scores of the forward passes
of the same image are then averaged. If the averaged class
probabilities resemble a uniform distribution (every class has
the same probability), this signifies maximum uncertainty. Conversely,
if there is one very high probability with every other being very
low, this signifies low uncertainty. An unknown object is more
likely to cause high uncertainty, which allows for an identification
of false positive cases.
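
The averaging step can be made concrete with a short sketch. The
notation is introduced here for illustration only and is not taken
from Miller et al.: let $T$ be the number of forward passes, $C$ the
number of classes, and $\mathbf{s}_t$ the vector of class
probabilities produced by pass $t$. The averaged class probabilities
are then
\begin{equation}
  \bar{\mathbf{s}} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{s}_t .
\end{equation}
One common way to condense $\bar{\mathbf{s}}$ into a single
uncertainty value, assumed here as an example, is the entropy
\begin{equation}
  H(\bar{\mathbf{s}}) = - \sum_{c=1}^{C} \bar{s}_c \log \bar{s}_c ,
\end{equation}
with the convention $0 \log 0 = 0$. It reaches its maximum $\log C$
for the uniform distribution $\bar{s}_c = 1/C$ and approaches zero
when a single class has a probability close to one, which matches
the two extreme cases described above.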

Novelty detection is the more direct approach to solve the task.
In the realm of neural networks it is usually done with the help of
auto-encoders that essentially solve a regression task of finding an
identity function that reconstructs on the output the given
input~\cite{Pimentel2014}. Auto-encoders have
internally at least two components: an encoder, and a decoder or
generator. The job of the encoder is to find an encoding that
compresses the input as well as possible while simultaneously
being as loss-free as possible. The decoder takes this latent
representation of the input and has to find a decompression
that reconstructs the input as accurately as possible. During
training these auto-encoders learn to reproduce a certain group
of object classes. The actual novelty detection takes place
during testing: Given an image, and the output and loss of the
auto-encoder, a novelty score is calculated. A low novelty
score signals a known object. The opposite is true for a high
novelty score.
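
As an illustration, the simplest conceivable novelty score is the
reconstruction error itself. The symbols are assumptions made for
this sketch and do not stem from the cited works: let $f$ denote the
encoder, $g$ the decoder, and $x$ the input image, so that
$\hat{x} = g(f(x))$ is the reconstruction. A possible score is
\begin{equation}
  n(x) = \| x - g(f(x)) \|_2^2 ,
\end{equation}
and an input would be flagged as novel if $n(x) > \tau$ for a
threshold $\tau$, a hypothetical parameter that would have to be
chosen on validation data. Inputs resembling the training classes
are reconstructed well and yield a small $n(x)$, whereas unknown
objects tend to yield a larger one. More elaborate scores can
combine this loss with further information about the input and the
output.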

\subsection*{Research Question}

Each of the presented approaches describes one way to solve the aforementioned
problem of explanation. They can be differentiated by measuring
their performance: the best theoretical idea is useless if it does
not perform well. Miller et al. have shown
some success in using dropout sampling. However, the many forward
passes during testing for every image seem computationally expensive.
In comparison a single run through a trained auto-encoder seems
intuitively to be faster. This leads to the hypothesis (see below).

For the purpose of this thesis, I will
use the work of Miller et al. as a baseline to compare against.
They use the SSD~\cite{Liu2016} network for object detection,
modified by added dropout layers, and the SceneNet
RGB-D~\cite{McCormac2017} data set using the MS COCO~\cite{Lin2014}
classes. I will use a simple implementation of an auto-encoder and
novelty detection to compare with the work of Miller et al.
SSD for the object detection and SceneNet RGB-D as the data
set are used for both approaches.

\paragraph{Hypothesis} Novelty detection using auto-encoders
delivers similar or better object detection performance under open set
conditions while being less computationally expensive compared to
dropout sampling.

\paragraph{Contribution}
The contribution of this thesis is a comparison between dropout
sampling and auto-encoding with respect to the overall performance
of both for object detection under open set conditions using
the SSD network for object detection and the SceneNet RGB-D data set
with MS COCO classes.

\chapter{Background and Contribution}

This chapter will begin with an overview of previous work
in the field of this thesis. Afterwards the theoretical foundations
of the work of Miller et al.~\cite{Miller2018} and auto-encoders will
be explained. The chapter concludes with more details about the
research question and the intended contribution of this thesis.

\section{Related Works}

Novelty detection for object detection is intricately linked with
open set conditions: the test data can contain unknown classes.
Bishop~\cite{Bishop1994} investigates the correlation between
the degree of novel input data and the reliability of network
outputs. Pimentel et al.~\cite{Pimentel2014} provide a review
of novelty detection methods published over the previous decade.

There are two primary pathways that deal with novelty: novelty
detection using auto-encoders and uncertainty estimation with
Bayesian networks.

Japkowicz et al.~\cite{Japkowicz1995} introduce a novelty detection
method based on the hippocampus model of Gluck and Myers~\cite{Gluck1993}
and use an auto-encoder to recognize novel instances.
Thompson et al.~\cite{Thompson2002} show that auto-encoders
can learn ``normal'' system behaviour implicitly.
Goodfellow et al.~\cite{Goodfellow2014} introduce adversarial
networks: a generator that attempts to trick the discriminator
by generating samples indistinguishable from the real data.
Makhzani et al.~\cite{Makhzani2015} build on the work of Goodfellow
and propose adversarial auto-encoders. Richter and
Roy~\cite{Richter2017} use an auto-encoder to detect novelty.

Wang et al.~\cite{Wang2018} build upon Goodfellow's work and
use a generative adversarial network for novelty detection.
Sabokrou et al.~\cite{Sabokrou2018} implement an end-to-end
architecture for one-class classification: it consists of two
deep networks, with one being the novelty detector and the other
enhancing inliers and distorting outliers.
Pidhorskyi et al.~\cite{Pidhorskyi2018} take a probabilistic approach
and compute how likely it is that a sample is generated by the
inlier distribution.

Kendall and Gal~\cite{Kendall2017} provide a Bayesian deep learning
framework that combines input-dependent
aleatoric\footnote{captures noise inherent in observations}
uncertainty with epistemic\footnote{uncertainty in the model}
uncertainty. Lakshminarayanan et al.~\cite{Lakshminarayanan2017}
implement a predictive uncertainty estimation using deep ensembles
rather than Bayesian networks. Geifman et al.~\cite{Geifman2018}
introduce an uncertainty estimation algorithm for non-Bayesian deep
neural classification that estimates the uncertainty of highly
confident points using earlier snapshots of the trained model.
Miller et al.~\cite{Miller2018a} compare merging strategies
for sampling-based uncertainty techniques in object detection.
Sensoy et al.~\cite{Sensoy2018} treat prediction confidence
as subjective opinions: they place a Dirichlet distribution on it.
The trained predictor for a multi-class classification is also a
Dirichlet distribution.

Gal and Ghahramani~\cite{Gal2016} show how dropout can be used
as a Bayesian approximation. Miller et al.~\cite{Miller2018}
build upon the work of Miller et al.~\cite{Miller2018a} and
Gal and Ghahramani: they use dropout sampling under open-set
conditions for object detection. Mukhoti and Gal~\cite{Mukhoti2018}
contribute metrics to measure uncertainty for semantic
segmentation. Wu et al.~\cite{Wu2019} propose two innovations
that turn variational Bayes into a robust tool for Bayesian
networks: a novel deterministic method to approximate
moments in neural networks, which eliminates gradient variance, and
a hierarchical prior for parameters together with an
empirical Bayes procedure to select prior variances.


% SSD: \cite{Liu2016}
% ImageNet: \cite{Deng2009}
% COCO: \cite{Lin2014}
% YCB: \cite{Xiang2017}
% SceneNet: \cite{McCormac2017}

\chapter{Methods}

This chapter starts with the design of the source code; the
source code is so much more than a means to an end. The thesis
uses two data sets: MS COCO and SceneNet RGB-D; a section
will explain how these data sets have been prepared.
Afterwards the replication of the work of Miller et al. is
outlined, followed by the implementation of the auto-encoder.

\section{Design of Source Code}

The source code of many published papers is either not available
or seems like an afterthought: it is poorly documented, difficult
to integrate into one's own work, and often does not follow common
software development best practices. Moreover, with TensorFlow,
PyTorch, and Caffe there are at least three machine learning
frameworks. Every research team seems to prefer a different framework
and sometimes even develops its own; this makes it difficult
to combine the work of different authors.
In addition to all this, most papers do not contain proper information
regarding the implementation details, making it difficult to
accurately replicate them if their source code is not available.

Therefore, it was clear to me: I will release my source code and
make it available as a Python package on the PyPI package index.
This makes it possible for other researchers to simply install
a package and use the API to interact with my code. Additionally,
the code has been designed to be future-proof and work with
the announced TensorFlow 2.0 by supporting eager mode.

Furthermore, it is configurable, well documented, and conforms
to the clean code guidelines: evolvability and extendability among
others. Unit tests are part of the code as well to identify common
issues early on, saving time in the process.

Lastly, the SSD implementation from a third-party repository
has been modified to work inside a Python package architecture and
with eager mode.

\section{Preparation of data sets}

Usually, data sets are not perfect when it comes to neural
networks: they contain outliers, invalid bounding boxes, and similar
problematic things. Before a data set can be used, these problems
need to be removed.

For the MS COCO data set, all annotations were checked for
impossible values: bounding box height or width lower than zero,
x1 and y1 bounding box coordinates lower than zero,
x2 and y2 coordinates lower than or equal to zero, x1 greater than x2,
y1 greater than y2, image width lower than x2,
and image height lower than y2. In the last two cases the
bounding box width or height was set to (image width $-$ x1) or
(image height $-$ y1) respectively; in the other cases the annotation
was skipped. If the bounding box width or height afterwards was
lower than or equal to zero, the annotation was skipped as well.
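
The clipping of boxes that extend past the image border can be
written compactly. The notation is introduced here for illustration:
$W$ and $H$ denote the image width and height, and
$(x_1, y_1, w, h)$ the annotated box with $x_2 = x_1 + w$ and
$y_2 = y_1 + h$, which assumes that a box is stored as its top-left
corner plus width and height. The corrected size is
\begin{equation}
  w' = \min(w,\, W - x_1), \qquad h' = \min(h,\, H - y_1) .
\end{equation}
For a box that lies entirely inside the image this leaves $w' = w$
and $h' = h$; otherwise only the overhang is cut off. For example,
with $W = 640$, $x_1 = 600$, and $w = 100$, the box would end at
$x_2 = 700 > W$ and is shortened to $w' = 40$. The annotation is
kept only if $w' > 0$ and $h' > 0$.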

In this thesis SceneNet RGB-D is always used with COCO classes.
Therefore, a mapping between COCO and SceneNet RGB-D classes
was necessary. It was created by manually going through each
WordNet ID and searching for a fitting COCO class.

The ground truth for SceneNet RGB-D is stored in protobuf files
and had to be converted into a Python format for use in the
codebase. Only ground truth instances that had a matching
COCO class were saved; the rest were discarded.

\section{Replication of Miller et al.}

\section{Implementing an auto-encoder}

\chapter{Results}

\chapter{Discussion}

\chapter{Closing}