Added explanation for entropy threshold differences
Signed-off-by: Jim Martens <github@2martens.de>
parent bfe85a9c4a
commit fdcf1af62e
body.tex | 12 +++++++++---
@@ -964,9 +964,15 @@ has a high confidence in one class---including the background.
 However, the entropy plays a larger role for the Bayesian variants---as
 expected: the best performing thresholds are 1.0, 1.3, and 1.4 for micro averaging,
 and 1.5, 1.7, and 2.0 for macro averaging. In all of these cases the best
-threshold is not the largest threshold tested. A lower threshold likely
-eliminated some false positives from the result set. On the other hand a
-too low threshold likely eliminated true positives as well.
+threshold is not the largest threshold tested.
+
+This is caused by a simple phenomenon: at some point most or all true
+positives are in, and a higher entropy threshold only adds more false
+positives. Such behaviour is indicated by a stagnating recall at the
+higher entropy levels. For the low entropy thresholds, the low recall
+dominates the \(F_1\) score; the sweet spot is somewhere in the
+middle. For macro averaging, a higher optimal entropy
+threshold indicates worse performance.
 
 \subsection*{Non-Maximum Suppression and Top \(k\)}
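The sweet-spot behaviour the added paragraph describes can be illustrated numerically: as the entropy threshold rises, recall stagnates while false positives keep accumulating, so \(F_1\) peaks at an intermediate threshold. The following sketch uses invented detection counts (all numbers are hypothetical, chosen only to mirror the described trend):

```python
# Hypothetical illustration of the F1 sweet spot: raising the entropy
# threshold admits more detections, so recall rises and then stagnates
# while precision keeps falling. All counts below are invented.
thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]
true_pos   = [40, 70, 80, 82, 82]    # recall stagnates at high thresholds
false_pos  = [5, 20, 45, 80, 130]    # false positives keep accumulating
total_gt   = 100                     # assumed number of ground-truth objects

for t, tp, fp in zip(thresholds, true_pos, false_pos):
    precision = tp / (tp + fp)
    recall = tp / total_gt
    f1 = 2 * precision * recall / (precision + recall)
    print(f"threshold {t:.1f}: precision {precision:.2f}, "
          f"recall {recall:.2f}, F1 {f1:.2f}")
```

With these counts, \(F_1\) peaks at the middle threshold (1.0) rather than at either extreme, matching the observation that the best threshold is not the largest one tested.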