updated for revised plan (Feb 14)

2019-02-14 11:13:20 +01:00
parent db60c2c726
commit faa9190640

Home.md

# Masterthesis

General idea of the thesis:

1. Realistic scene with objects from the YCB Video dataset (self-recorded or taken from the dataset).
2. Retrieve 2D data from the scene.
3. Segment the point cloud into separate areas, one per object.
4. a) Feed the SSD with the 2D data for an object. Classify the object and calculate the classification loss (cross-entropy loss).
   b) Feed an autoencoder with the 2D data for the object. Calculate the encoding/decoding loss.
5. In the testing phase, the encoding/decoding loss tells us whether an object is unknown.
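The two losses in step 4 can be sketched in a few lines of numpy; the function names are illustrative and not taken from any thesis code:

```python
import numpy as np

def cross_entropy_loss(logits, true_class):
    """Classification loss for step 4a: negative log softmax
    probability of the true class."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[true_class])

def reconstruction_loss(patch, reconstructed):
    """Encoding/decoding loss for step 4b: mean squared error between
    the autoencoder's input and its reconstruction."""
    return float(np.mean((patch - reconstructed) ** 2))

# A confident correct prediction gives a near-zero loss,
# a confident wrong one a large loss.
logits = np.array([5.0, 0.0, 0.0])
print(cross_entropy_loss(logits, 0))  # small
print(cross_entropy_loss(logits, 1))  # large
```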
**Problem**: object detection under so-called open-set conditions.
Explanation: detecting when an object detector fails by producing the prediction "this bounding box contains an object of class X" when the object is in fact of class Y, a class never encountered during training.
**Baseline**: "Dropout Sampling for Robust Object Detection in Open-Set Conditions", Miller et al., ICRA 2018.
**Abstract**: They use the SSD network as a starting point. The method they propose essentially consists of 1) enabling dropout during the forward pass as well, 2) feeding the same image through SSD multiple times (with dropout on, so the result differs each time!), and 3) filtering the resulting bounding boxes to detect errors of the problem type described above.
Preparation:
1. Clean the dataset with dataset_cleaner.py, developed in the master project.
2. Specify the train/validation/test split.
3. Reorganize the dataset inside these splits according to object class rather than movie/frame (metadata only, no duplicate dataset files!).
4. Specify inlier and outlier object classes.
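The preparation steps 2-4 can be sketched as pure metadata bookkeeping; the class names, frame ids, and split ratios below are invented for illustration, and no dataset files are copied:

```python
import random

# Invented example metadata from the cleaning step: (frame_id, object_class)
# pairs; the image files themselves are never copied or moved.
annotations = [(f"frame_{i:04d}", cls)
               for i in range(100)
               for cls in (["mug"] if i % 2 else ["drill", "banana"])]

inliers = {"mug", "drill"}   # classes known during training
outliers = {"banana"}        # classes held out as "unknown"

def reorganize(annotations, train=0.7, val=0.15, seed=42):
    """Group annotations by object class, then split each class's
    frames into train/validation/test (metadata only)."""
    by_class = {}
    for frame, cls in annotations:
        by_class.setdefault(cls, []).append(frame)
    splits = {"train": {}, "val": {}, "test": {}}
    rng = random.Random(seed)
    for cls, frames in by_class.items():
        rng.shuffle(frames)
        n_train = int(len(frames) * train)
        n_val = int(len(frames) * val)
        splits["train"][cls] = frames[:n_train]
        splits["val"][cls] = frames[n_train:n_train + n_val]
        splits["test"][cls] = frames[n_train + n_val:]
    return splits

splits = reorganize(annotations)
for split in ("train", "val"):
    for cls in outliers:
        splits[split].pop(cls, None)  # outlier classes appear only at test time
```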
**Problem of baseline**: multiple forward passes become rather wasteful in terms of computation time (they need up to 20 passes per image).
**Network setup of baseline**: vanilla SSD network with MS COCO classes.
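The baseline's sampling idea can be illustrated with a toy stand-in: instead of a real SSD with dropout layers, a randomly perturbed softmax mimics how each dropout-enabled forward pass yields a different prediction, and the spread across passes serves as the uncertainty signal:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_forward(logits, noise_scale=0.5):
    """Stand-in for one SSD forward pass with dropout enabled:
    every pass perturbs the logits, so the output differs each time."""
    z = logits + rng.normal(0.0, noise_scale, size=logits.shape)
    e = np.exp(z - z.max())
    return e / e.sum()  # softmax probabilities

def dropout_sampling(logits, n_passes=20):
    """Feed the same input through the network n_passes times and
    average the per-pass softmax outputs (Monte Carlo dropout)."""
    samples = np.stack([noisy_forward(logits) for _ in range(n_passes)])
    return samples.mean(axis=0), samples.std(axis=0)

mean_probs, spread = dropout_sampling(np.array([2.0, 0.5, 0.1]))
# A large spread on the winning class marks the detection as unreliable.
```

The cost the baseline pays is visible here: one detection requires `n_passes` full forward passes through the network.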
**Own approach**: SSD for object detection, GPND for novelty detection; for each bounding box it is determined whether the object is novel.
Algorithm:
1. Take a scene from SceneNet RGB-D.
2. Get 2D data from the scene.
3. Feed the SSD with the 2D data for an object. Classify the object and calculate the classification loss.
4. Feed the autoencoder with the 2D data for the object. Calculate the novelty score.
5. The novelty score tells us whether an object is unknown.
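Put together, a per-box decision rule along these lines can be sketched as follows; the detector and autoencoder are toy placeholders, and the threshold is an assumed value (GPND's real novelty score also involves the latent-space density, reduced here to a plain reconstruction error):

```python
import numpy as np

# Assumed threshold; in practice it would be tuned on validation data.
NOVELTY_THRESHOLD = 0.05

def classify_box(patch, ssd_predict, aae_reconstruct):
    """Steps 3-5 for one bounding box: classify with the SSD, then
    decide known vs. unknown from the autoencoder's novelty score."""
    class_id, _confidence = ssd_predict(patch)
    novelty = float(np.mean((patch - aae_reconstruct(patch)) ** 2))
    if novelty > NOVELTY_THRESHOLD:
        return "unknown", novelty
    return class_id, novelty

# Toy stand-ins: a 'detector' that always predicts class 0, and an AAE
# that reconstructs known-looking patches well and unknown ones poorly.
known = np.zeros((8, 8))
unknown = np.ones((8, 8))
ssd_predict = lambda p: (0, 0.9)
aae_reconstruct = lambda p: np.zeros_like(p)  # trained only on 'known'

print(classify_box(known, ssd_predict, aae_reconstruct))    # class 0, low novelty
print(classify_box(unknown, ssd_predict, aae_reconstruct))  # flagged as unknown
```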
Training: independent training of the SSD and the Adversarial Autoencoder (AAE). Use the ground-truth bounding boxes for the AAE.
**Dataset**: SceneNet RGB-D dataset
**Stretch Goals**:
* YCB Video dataset
* create a custom dataset with YCB objects using a robot arm
* test both trained networks on this dataset, which features the same objects but different environments
**Open Questions**: is there an additional use for the generator part of GPND beyond its autoencoder function?