% Finished exposé except for timetable
% Signed-off-by: Jim Martens <github@2martens.de>
\chapter{Introduction}
Famous examples like the automatic soap dispenser that does not
recognize the hand of a black person but dispenses soap when presented
with a paper towel raise the question of bias in computer
systems~\cite{Friedman1996}. Related to this ethical question
regarding the design of so-called algorithms, a term often used in
public discourse for applied neural networks, is the question of
algorithmic accountability~\cite{Diakopoulos2014}.

\chapter{Conclusion}

The charm of supervised neural networks, namely that they can learn
from input-output relations and figure out by themselves which
connections are necessary for that, is also their Achilles heel. This
feature effectively makes them black boxes. It is possible to question
the training environment, such as potential biases inside the data
sets, or the engineers constructing the networks, but it is not really
possible to question the internal calculations made by a network. On
the one hand, one might argue, it is only math and nothing magical
that happens inside these networks. Clearly it is possible, albeit a
chore, to manually follow the calculations of any given trained
network. After all, it is executed on a computer and at the lowest
level only uses basic math that does not differ between humans and
computers. On the other hand, not everyone is capable of doing so and,
more importantly, doing so does not reveal any answers to questions of
causality.

However, these questions of causality are of enormous consequence when
neural networks are used, for example, in predictive policing. Is a
correlation, a coincidence, enough to bring forth negative
consequences for a particular person? And if so, what is the possible
defence against math? Similar questions can be raised when looking at
computer vision networks that might be used together with so-called
smart CCTV cameras, for example, like those tested at the Berlin
Südkreuz train station. What if a network implies you exhibited
suspicious behaviour?

This leads to the need for neural networks to explain their results.
Such an explanation must come from the network or an attached piece of
technology to allow mass adoption. Obviously, this setting poses the
question of how such an endeavour can be achieved.

For neural networks there are fundamentally two types of tasks:
regression and classification. Regression deals with any case where
the goal of the network is to come close to an ideal function that
connects all data points. Classification, however, describes tasks
where the network is supposed to identify the class of any given
input. In this thesis, I will focus on classification.

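The difference between the two task types can be sketched with a toy
example; the least-squares fit and the nearest-centroid classifier
below are illustrative stand-ins for this exposé, not the networks
used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

# Regression: approximate an ideal function connecting the data points.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 0.5 + 0.05 * rng.normal(size=50)   # noisy samples of 2x + 0.5
slope, intercept = np.polyfit(x, y, deg=1)       # least-squares fit

# Classification: assign any given input to one of a fixed set of classes.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])   # prototypes of class 0 and 1

def classify(point):
    """Return the index of the nearest class prototype."""
    return int(np.argmin(np.linalg.norm(centroids - point, axis=1)))
```

The regression result is a continuous function, while the classifier
always answers with one of the known classes, which is exactly the
property that becomes problematic under open-set conditions below.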
More specifically, I will look at object detection under open-set
conditions. In non-technical words, this effectively describes the
kind of situation you encounter with CCTV cameras or robots outside of
a laboratory. Both use cameras that record images. Subsequently, a
neural network analyses the image and returns a list of detected and
classified objects that it found in the image. The problem here is
that networks can only classify what they know. If presented with an
object type that the network was not trained with, as happens
frequently in real environments, it will still classify the object and
might even have a high confidence in doing so. Such a case would be a
false positive. Any ordinary person who uses the results of such a
network would falsely assume that a high confidence always means the
classification is very likely correct. If they use a proprietary
system, they might not even be able to find out that the network was
never trained on a particular type of object. Therefore, it would be
impossible for them to identify the output of the network as a false
positive.

This goes back to the need for automatic explanation. Such a system
should recognize by itself that the given object is unknown and hence
mark any classification result of the network as meaningless.
Technically, there are two slightly different approaches that deal
with this type of task: model uncertainty and novelty detection.

Model uncertainty can be measured with dropout sampling. Dropout is
usually used only during training, but Miller et
al.~\cite{Miller2018} use it during testing as well to achieve
different results for the same image, making use of multiple forward
passes. The output scores for the forward passes of the same image are
then averaged. If the averaged class probabilities resemble a uniform
distribution (every class has the same probability), this signals
maximum uncertainty. Conversely, if there is one very high probability
with every other being very low, this signifies low uncertainty. An
unknown object is more likely to cause high uncertainty, which allows
for an identification of false positive cases.

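The averaging step described above can be sketched as follows; the
entropy-based uncertainty measure and the toy score vectors are
assumptions for illustration, not the exact procedure of Miller et al.

```python
import numpy as np

def dropout_sampling_uncertainty(score_samples):
    """Average class scores over multiple stochastic forward passes and
    measure uncertainty as the entropy of the averaged distribution.

    score_samples: array of shape (num_passes, num_classes), each row
    the softmax output of one forward pass with dropout kept active.
    """
    mean_scores = score_samples.mean(axis=0)
    # Entropy is maximal for a uniform distribution (likely unknown
    # object) and minimal when a single class dominates.
    entropy = -np.sum(mean_scores * np.log(mean_scores + 1e-12))
    return mean_scores, entropy

# A confident detection: every pass strongly favours class 0.
confident = np.array([[0.9, 0.05, 0.05]] * 10)
# An uncertain detection: the passes disagree, so the average
# approaches a uniform distribution.
uncertain = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]] * 4)

mean_conf, h_conf = dropout_sampling_uncertainty(confident)
mean_unc, h_unc = dropout_sampling_uncertainty(uncertain)
# h_unc exceeds h_conf, flagging the second case as a likely false positive.
```

Thresholding such an entropy value is one plausible way to turn the
averaged scores into an accept/reject decision per detection.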
Novelty detection is the more direct approach to solving the task. In
the realm of neural networks it is usually done with the help of
auto-encoders, which essentially solve a regression task of finding an
identity function that reconstructs the given input on the
output~\cite{Pimentel2014}. Auto-encoders have internally at least two
components: an encoder, and a decoder or generator. The job of the
encoder is to find an encoding that compresses the input as well as
possible while simultaneously being as loss-free as possible. The
decoder takes this latent representation of the input and has to find
a decompression that reconstructs the input as accurately as
possible. During training these auto-encoders learn to reproduce a
certain group of object classes. The actual novelty detection takes
place during testing. Given an image, and the output and loss of the
auto-encoder, a novelty score is calculated. A low novelty score
signals a known object; the opposite is true for a high novelty score.

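The reconstruction-error idea can be sketched minimally with a linear
auto-encoder (principal components) instead of the adversarial
architecture mentioned later; all names and data here are illustrative
assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Known" training data lives close to a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 10))

# Linear auto-encoder: the top-2 principal components act as the
# encoder, their transpose as the decoder.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
encoder = vt[:2]      # compresses a 10-D input to a 2-D latent code
decoder = encoder.T   # reconstructs a 10-D output from the code

def novelty_score(x):
    """Reconstruction error of the auto-encoder: low for inputs that
    resemble the training data, high for inputs off the learned
    subspace."""
    code = (x - mean) @ encoder.T
    recon = mean + code @ decoder.T
    return float(np.linalg.norm(x - recon))

known = rng.normal(size=2) @ basis   # lies in the learned subspace
novel = rng.normal(size=10) * 3.0    # generic point far off the subspace
# novelty_score(novel) is much larger than novelty_score(known).
```

A threshold on this score then decides whether the object detector's
classification for that input should be trusted or discarded.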
Given these two approaches to solving the explanation task described
above, it comes down to performance. At the end of the day the best
theoretical idea does not help in solving the task if it cannot be
implemented in a performant way. Miller et al.\ have shown some
success in using dropout sampling. However, the many forward passes
during testing for every image seem computationally expensive. In
comparison, a single run through a trained auto-encoder intuitively
seems faster. This leads to the following hypothesis:
\emph{Novelty detection using auto-encoders delivers similar or better
object detection performance under open-set conditions while being
less computationally expensive compared to dropout sampling}.

For the purpose of this thesis, I will use the work of Miller et al.\
as the baseline to compare against. They use the SSD~\cite{Liu2016}
network for object detection, modified by added dropout layers, and
the SceneNet RGB-D~\cite{McCormac2017} data set using the MS
COCO~\cite{Lin2014} classes. Instead of dropout sampling, my approach
will use an auto-encoder for novelty detection with all else, like
using SSD for object detection and the SceneNet RGB-D data set, being
equal. With respect to auto-encoders, a recent implementation of an
adversarial auto-encoder~\cite{Pidhorskyi2018} will be used.

The contribution of this thesis is a comparison between dropout
sampling and auto-encoding with respect to the overall performance of
both for object detection under open-set conditions, using the SSD
network for object detection and the SceneNet RGB-D data set with MS
COCO classes.

\chapter{Thesis as a project}

After introducing the topic and the general task ahead, this part of
the exposé will focus on how to get there. This includes a timetable
with SMART goals as well as an outline of the software development
practices used for implementing the code for this thesis.

\section{Software Development}

Most scientific implementations found on GitHub are not built with
distribution in mind. They usually require manual cloning of the
repository, have poor code documentation, and do not follow common
coding standards. This is bad enough by itself but becomes a real
nuisance if you want to use those implementations in your own code. As
they are not packaged as Python packages, using them usually requires
manual workarounds to make them usable as library code, for example,
in a Python package.

The code of this thesis will be developed from the start inside a
Python package structure, which will make it easy to include it later
on as a dependency for other work. After the thesis has been graded,
the package will be uploaded to the PyPI package repository and the
corresponding Git repository will be made publicly available. Any
required third-party implementations, like the SSD implementation for
Keras, which are not already available as Python packages, will be
included as library code according to their respective licences.

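A minimal package skeleton along these lines might use a
\texttt{setup.py} such as the following; the package name, entry
point, and dependency list are placeholders for illustration, not the
final configuration of the thesis code.

```python
# setup.py -- hypothetical skeleton; the package name, CLI entry point,
# and dependencies below are placeholders, not the final configuration.
from setuptools import setup, find_packages

setup(
    name="thesis-novelty-detection",   # placeholder project name
    version="0.1.0",
    packages=find_packages(),
    install_requires=["tensorflow", "numpy"],
    entry_points={
        "console_scripts": [
            # small CLI wrapper around the library-ready code
            "thesis-cli=thesis_novelty_detection.cli:main",
        ],
    },
)
```

Declaring the CLI as a console-script entry point keeps the interface
layer thin while the bulk of the code remains importable library code.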
A large chunk of the code will be written as library-ready code that
can be used in other applications. Only a small part will provide the
interface to the library code. The specifics of the interface cannot
be predicted ahead of time, but it will certainly include a properly
documented CLI, as that will be necessary for the work of the thesis
itself.

TensorFlow will be used as the deep learning framework. To make the
code future-proof, the eager execution mode will be used, as it is the
default in TensorFlow
2.0\footnote{\url{https://medium.com/tensorflow/whats-coming-in-tensorflow-2-0-d3663832e9b8}}.

\section{Stretch Goals}

There are a number of goals that are not included in the following
timetable. Those are optional add-ons that are nice to have but not
critical for the successful completion of the thesis.

\begin{itemize}
\item make my own approach work on the YCB-Video data
  set~\cite{Xiang2017}
\item test dropout sampling and my own approach on a data set
  self-recorded with a robot arm and a mounted Kinect
\item provide a GUI to freely select an image to be classified by the
  trained model and see a visualization of the result
\end{itemize}

\section{Timetable}

% TODO