Summary

Most current papers use a dataset different from the model's original training set as the out-of-distribution dataset.
OOD: out of distribution
ODD: out-of-distribution detection
ID: in distribution
ODD is also known as one-class classification, novelty detection, anomaly detection, outlier detection, selective prediction, or open set recognition.

Evaluation Metrics

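AUROC (Area Under the Receiver Operating Characteristic curve): the area under the ROC curve, whose vertical axis is the TPR, TP / (TP + FN), and whose horizontal axis is the FPR, FP / (FP + TN); higher is better.
AUPR (Area Under the Precision-Recall curve): the area under the curve that plots precision against recall; higher is better.
FPR$N$: the probability that an ID sample is wrongly flagged as OOD when $N\%$ of the OOD samples are detected; lower is better.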
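
The three metrics can be computed directly from detector scores. The sketch below is a minimal NumPy version under stated assumptions: OOD is the positive class, a higher score means "more OOD", there are no tied scores, and AUPR is approximated by average precision. The function names are illustrative and do not come from any of the papers.

```python
import numpy as np

def _sorted_labels(scores, is_ood):
    """Return the OOD labels ordered by descending score (assumes no ties)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return np.asarray(is_ood, dtype=bool)[order]

def auroc(scores, is_ood):
    """Area under the ROC curve (TPR against FPR), trapezoidal rule."""
    y = _sorted_labels(scores, is_ood)
    tpr = np.concatenate([[0.0], np.cumsum(y) / y.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~y) / (~y).sum()])
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

def aupr(scores, is_ood):
    """Average precision, a step-wise estimate of the area under the PR curve."""
    y = _sorted_labels(scores, is_ood)
    precision = np.cumsum(y) / np.arange(1, len(y) + 1)
    return float(np.sum(precision * y) / y.sum())

def fpr_at_n(scores, is_ood, n=95):
    """Fraction of ID samples flagged at the threshold that detects about n% of OOD samples."""
    scores = np.asarray(scores, dtype=float)
    is_ood = np.asarray(is_ood, dtype=bool)
    # np.percentile interpolates, so on tiny samples the detection rate is approximate.
    threshold = np.percentile(scores[is_ood], 100 - n)
    return float(np.mean(scores[~is_ood] >= threshold))
```

With few samples the interpolated threshold in `fpr_at_n` only approximately matches the requested detection rate; on realistic test-set sizes the difference is negligible.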

Deep anomaly detection with outlier exposure

(ICLR’2019) Dan Hendrycks, Mantas Mazeika, Thomas G. Dietterich

Main points:
- Adds an Outlier Exposure (OE) mechanism to existing ODD detectors, with the following loss function, which helps improve OOD detection accuracy:
  $$\mathbb{E} _ {(x, y) \sim \mathcal{D} _ {in}} [ \mathcal{L} (f(x), y) + \lambda \mathbb{E} _ {x^{\prime} \sim \mathcal{D} _ {out}^{OE}} [ \mathcal{L}_{OE} (f(x^{\prime}), f(x), y) ] ]$$
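- Adds a loss term on OOD samples to the baseline methods Maximum Softmax Probability, Confidence Branch, and density estimators, which strengthens the original detectors' ability to detect OOD samples.
Code (PyTorch): https://github.com/hendrycks/outlier-exposure
- Experiments cover image and text datasets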
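
For a softmax classifier, one common choice of $\mathcal{L}_{OE}$ is the cross-entropy between the uniform distribution and the prediction on the outlier batch. The following is a minimal NumPy sketch of that variant only; the function names and the default `lam` are assumptions for illustration, not the reference implementation.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def outlier_exposure_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Cross-entropy on ID data plus lam * cross-entropy to uniform on outliers."""
    logp_in = log_softmax(logits_in)
    ce_in = -logp_in[np.arange(len(labels_in)), labels_in].mean()
    logp_out = log_softmax(logits_out)
    # Cross-entropy from the uniform distribution to the prediction:
    # -(1/K) * sum_k log p_k, averaged over the outlier batch.
    ce_uniform = -logp_out.mean(axis=-1).mean()
    return float(ce_in + lam * ce_uniform)
```

The outlier term is smallest when the prediction on an outlier is uniform, so confident predictions on outliers are penalized.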

A baseline for detecting misclassified and out-of-distribution examples in neural networks

(ICLR’17) Dan Hendrycks, Kevin Gimpel

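Main points:
- Uses the confidence given by the softmax as the criterion for deciding whether a sample is OOD
- Softmax often produces high-confidence wrong predictions, so the confidence for any single sample can be misleading; statistically, however, misclassified samples and OOD samples
  receive lower softmax confidence than correctly classified ID samples.
Code (TensorFlow): https://github.com/hendrycks/error-detection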
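
The baseline score is simply the largest softmax probability. A minimal NumPy sketch follows; the threshold value and the function names are illustrative assumptions, and in practice the threshold would be chosen on held-out data.

```python
import numpy as np

def max_softmax_probability(logits):
    """Return the maximum softmax probability for each row of logits."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def flag_ood(logits, threshold=0.5):
    """Flag samples whose confidence falls below the (hypothetical) threshold."""
    return max_softmax_probability(logits) < threshold
```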

Training confidence-calibrated classifiers for detecting out-of-distribution samples

(ICLR’18) Kimin Lee, Honglak Lee, Kibok Lee, Jinwoo Shin

Main points:
- Adds a new loss function (the confidence loss, below) to the training of the original model. For OOD samples it pushes the predictive distribution toward uniform, so the confidence approaches a constant ($1/K$ for $K$ classes); for ID samples it is the usual prediction loss.
  $$\mathop{\min}\limits_{\theta} \ \mathbb{E}_{P_{in} (\hat{\mathbf{x}}, \hat{y}) } \ [-\log P_{\theta}(y = \hat{y}|\hat{\mathbf{x}}) ] + \beta \mathbb{E}_{P_{out} (\mathbf{x})} \lbrack KL(\mathcal{U}(y) \Vert P_{\theta}(y|\mathbf{x})) \rbrack $$
- Trains a generator network to produce OOD samples that lie as close as possible to the ID samples
  $$\mathop{\min}\limits_{G} \mathop{\max}\limits_{D} \ \beta \mathbb{E}_{P_{G} (\mathbf{x})} [ KL (\mathcal{U}(y) \Vert P_{\theta}(y | \mathbf{x})) ] + \mathbb{E}_{P_{in} (\mathbf{x})} [ \log D(\mathbf{x}) ] + \mathbb{E}_{P_{G} (\mathbf{x})} [ \log (1 - D(\mathbf{x}) ) ]$$
- Alternately trains the original network and the GAN with these loss functions until convergence.
Code (PyTorch): https://github.com/alinlab/Confident_classifier
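
The classifier part of the objective (the first formula) can be sketched in NumPy as below. This covers only the confidence loss; the generator and discriminator terms are omitted, and the function names and default `beta` are assumptions for illustration.

```python
import numpy as np

def _log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def kl_uniform_to_prediction(logits):
    """KL(U(y) || P_theta(y|x)) for each sample, computed from raw logits."""
    log_p = _log_softmax(logits)
    k = log_p.shape[-1]
    # sum_y (1/K) * (log(1/K) - log p_y)
    return -np.log(k) - log_p.mean(axis=-1)

def confidence_loss(logits_in, labels_in, logits_out, beta=1.0):
    """Negative log-likelihood on ID samples plus beta * KL term on OOD samples."""
    log_p_in = _log_softmax(logits_in)
    nll = -log_p_in[np.arange(len(labels_in)), labels_in].mean()
    return float(nll + beta * kl_uniform_to_prediction(logits_out).mean())
```

The KL term is zero exactly when the prediction on an OOD sample is uniform, which is the behaviour the loss is meant to encourage.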

Learning Confidence for Out-of-Distribution Detection in Neural Networks

(‘18) Terrance DeVries, Graham W. Taylor

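Main points:
- Adds a module after the network's penultimate layer that outputs how confident the model is in its classification of a sample.
- The module is implemented with one or more fully connected layers
- The loss function is as follows, where $c$ is the predicted confidence and $p_i^{\prime} = c \cdot p_i + (1 - c) y_i$ interpolates between the predicted probability and the target label:
  $$\mathcal{L} = - \sum_{i=1}^{M} \log (p_i^{\prime}) y_i - \lambda \log (c)$$
Code (PyTorch): https://github.com/uoguelph-mlrg/confidence_estimation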
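
The loss above can be evaluated for a batch as in the following NumPy sketch. It assumes the adjusted prediction is the interpolation $p^{\prime} = c \cdot p + (1 - c) y$; the function name, the default `lam`, and the small `eps` added for numerical safety are illustrative assumptions.

```python
import numpy as np

def learned_confidence_loss(probs, confidence, onehot, lam=0.1, eps=1e-12):
    """Mean over the batch of -sum_i log(p'_i) y_i - lam * log(c)."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.asarray(onehot, dtype=float)
    c = np.asarray(confidence, dtype=float).reshape(-1, 1)
    # Low confidence lets the model "ask for hints": p' moves toward the label.
    p_adjusted = c * probs + (1.0 - c) * onehot
    task_loss = -(np.log(p_adjusted + eps) * onehot).sum(axis=1)
    # Asking for hints is penalized so that c does not collapse to 0.
    confidence_penalty = -np.log(c[:, 0] + eps)
    return float((task_loss + lam * confidence_penalty).mean())
```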

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

(ICLR’18) Shiyu Liang, Yixuan Li, R. Srikant

Main points:
- On top of using the softmax score to separate ID and OOD samples, adds temperature scaling and input perturbation.
- Temperature scaling:
  $$S _ {i} (\mathbf{x}; T) = \frac{\exp(f_i(\mathbf{x})/ T)}{\sum _ {j = 1} ^ {N} \exp (f_j(\mathbf{x})/T)}$$
- Input perturbation, where $\hat{y}$ is the predicted label:
  $$\tilde{\mathbf{x}} = \mathbf{x} - \varepsilon \, \mathrm{sign} (-\nabla _ {\mathbf{x}} \log S_{\hat{y}} (\mathbf{x};T)) $$
Code (PyTorch): https://github.com/facebookresearch/odin
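
The two steps can be illustrated on a toy model. In the NumPy sketch below, a linear classifier `f(x) = W @ x + b` stands in for the network so that the gradient can be written analytically; a real implementation would obtain it by backpropagation. The names `W`, `b`, `odin_score` and the default values of `T` and `eps` are assumptions for illustration.

```python
import numpy as np

def softmax_T(logits, T):
    """Temperature-scaled softmax of a single logit vector."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def odin_score(x, W, b, T=1000.0, eps=0.001):
    """Max temperature-scaled softmax after perturbing the input (toy linear model)."""
    x = np.asarray(x, dtype=float)
    S = softmax_T(W @ x + b, T)
    y_hat = int(np.argmax(S))  # predicted class
    # Analytic gradient of log S_{y_hat}(x; T) with respect to x for f(x) = W @ x + b.
    grad_log_s = (W[y_hat] - S @ W) / T
    # x_tilde = x - eps * sign(-grad), i.e. a small step that raises the softmax score.
    x_tilde = x - eps * np.sign(-grad_log_s)
    return float(softmax_T(W @ x_tilde + b, T).max()), x_tilde
```

The perturbation raises the score for every input; the method relies on the increase being larger for ID samples than for OOD samples.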

Out-of-Distribution Detection using Multiple Semantic Label Representations

(NeurIPS’18) Gabi Shalev, Yossi Adi, Joseph Keshet

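Main points:
- Attaches K regression functions on top of the deep network, which output K different word embeddings
- Uses the $L_2$-norm distance between the word embeddings as the criterion for OOD samples
Code: https://github.com/MLSpeech/semantic_OOD (the repository is empty)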

A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks

(NeurIPS’18) Kimin Lee, Kibok Lee, Honglak Lee, Jinwoo Shin

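Main points:
- Applies to both OOD sample detection and adversarial example detection
- Models the feature distribution of each class with a class-conditional Gaussian (this has an approximate theoretical justification); $f(\cdot)$ is the output of the DNN's penultimate layer.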
  $$P(f(\mathbf{x}) | y = c) = \mathcal{N} (f(\mathbf{x}) | \mu _ {c}, \Sigma)$$
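- Uses the Mahalanobis distance between the test sample and the closest class-conditional Gaussian as the ODD score.
- Adds a perturbation to the test sample (in the direction that increases the score) to widen the gap between ID and OOD samples
Code (PyTorch): https://github.com/pokaxpoka/deep_Mahalanobis_detector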
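
The scoring step can be sketched in NumPy as follows, assuming the penultimate-layer features have already been extracted. The input perturbation is left out, and the function names are illustrative rather than taken from the linked repository.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Estimate per-class means and one shared (tied) covariance matrix."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Centre every sample on the mean of its own class, then pool.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)

def mahalanobis_score(f, means, precision):
    """Confidence score: negative squared distance to the closest class Gaussian."""
    diff = means - np.asarray(f, dtype=float)
    d2 = np.einsum("ci,ij,cj->c", diff, precision, diff)
    return float(-d2.min())
```

A higher (less negative) score means the feature lies closer to some class mean, so low scores indicate OOD samples.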

How to know when machine learning does not know

(cleverhans blog by Nicolas Papernot and Nicholas Frosst)

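Main points:
- Extracts the output features of a sample at every layer of the neural network
- At each layer, uses k-nearest neighbors to find the k training samples closest to the test sample
- Decides whether the test sample is OOD from the disagreement between its predicted class and the classes of the retrieved training samples
Code: https://github.com/rodgzilla/machine_learning_deep_knn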
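
The three steps can be sketched as below, assuming the per-layer features are already extracted. This is a simplified agreement score, not the calibrated procedure of the original method; the function name and argument layout are hypothetical.

```python
import numpy as np

def knn_label_agreement(test_feats, train_feats, train_labels, predicted_label, k=3):
    """Fraction of nearest training neighbours, pooled over layers, sharing the predicted label.

    test_feats:  list with one feature vector per layer for a single test sample.
    train_feats: list with one (n_train, d_layer) array per layer.
    """
    train_labels = np.asarray(train_labels)
    votes = []
    for z, bank in zip(test_feats, train_feats):
        bank = np.asarray(bank, dtype=float)
        dists = np.linalg.norm(bank - np.asarray(z, dtype=float), axis=1)
        nearest = np.argsort(dists)[:k]  # indices of the k closest training samples
        votes.append(train_labels[nearest] == predicted_label)
    return float(np.concatenate(votes).mean())
```

A low agreement means the sample's neighbours in feature space contradict the prediction, which is treated as evidence that the sample is OOD.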

Deep One-Class Classification

(ICML’18) Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, Marius Kloft

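Main points:
- Deep Support Vector Data Description (Deep SVDD). Treats OOD detection as a one-class classification problem; related work in this line includes One-Class SVM (OC-SVM), autoencoders, and GANs.
- The goal is to learn a description of the training data: a neural network maps the data into a hypersphere that is as small as possible while enclosing the ID samples. A sample that falls inside the hypersphere is ID; one that falls outside is OOD.
Code (PyTorch): https://github.com/lukasruff/Deep-SVDD
- In the experiments, one class of a dataset serves as ID and the remaining classes as OOD
- Can detect adversarial examples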
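
The hypersphere test can be sketched as below, assuming the network has already mapped the samples to embeddings. Network training is omitted, and choosing the radius as a quantile of the training scores is an assumption made for illustration; the function names are hypothetical.

```python
import numpy as np

def svdd_scores(embeddings, center):
    """Squared distance of each embedding to the hypersphere centre."""
    diff = np.asarray(embeddings, dtype=float) - np.asarray(center, dtype=float)
    return (diff ** 2).sum(axis=1)

def one_class_objective(embeddings, center):
    """Mean squared distance to the centre (regularization term omitted)."""
    return float(svdd_scores(embeddings, center).mean())

def fit_squared_radius(train_embeddings, center, quantile=0.95):
    """Hypothetical rule: squared radius = a high quantile of the training scores."""
    return float(np.quantile(svdd_scores(train_embeddings, center), quantile))

def is_ood(embeddings, center, squared_radius):
    """A sample is flagged OOD when it falls outside the hypersphere."""
    return svdd_scores(embeddings, center) > squared_radius
```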

Deep autoencoding gaussian mixture model for unsupervised anomaly detection

(ICLR’18) Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Daeki Cho, Haifeng Chen

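Main points:
- Deep Autoencoding Gaussian Mixture Model (DAGMM) for unsupervised anomaly detection
- Uses a deep autoencoder to reduce the dimensionality of the input
- Uses a Gaussian mixture model to perform density estimation and produce, for each sample, a score for whether it is OOD
Code (PyTorch): https://github.com/danieltan07/dagmm ; (TensorFlow): https://github.com/Newcomer520/tf-dagmm
- ODDS datasets: http://odds.cs.stonybrook.edu/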
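
The density-estimation score can be sketched as the negative log-likelihood of a low-dimensional code under a Gaussian mixture. The NumPy version below assumes the mixture parameters are already fitted; the autoencoder and the joint training are omitted, and the function name is illustrative.

```python
import numpy as np

def sample_energy(z, weights, means, covs):
    """E(z) = -log sum_k w_k N(z | mu_k, Sigma_k); a higher energy is more anomalous."""
    z = np.asarray(z, dtype=float)
    log_terms = []
    for w, mu, cov in zip(weights, means, covs):
        cov = np.asarray(cov, dtype=float)
        diff = z - np.asarray(mu, dtype=float)
        maha = diff @ np.linalg.solve(cov, diff)  # (z - mu)^T Sigma^{-1} (z - mu)
        _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)
        log_terms.append(np.log(w) - 0.5 * (maha + logdet))
    log_terms = np.array(log_terms)
    m = log_terms.max()  # log-sum-exp trick for numerical stability
    return float(-(m + np.log(np.exp(log_terms - m).sum())))
```

Samples whose energy exceeds a threshold chosen on training data would be reported as anomalies.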