Statistical Methods for Out-of-distribution Detection
- Publication Type:
- Thesis
- Issue Date:
- 2023
This item is open access.
For a network trained on in-distribution (ID) samples, test samples may be out-of-distribution (OOD), that is, drawn from distributions different from that of the ID samples. Accordingly, OOD detection aims to identify OOD samples at test time. The main challenge is that a network can produce high-confidence predictions for OOD samples, which means that it cannot distinguish ID samples from OOD samples. The main causes of this high-confidence issue include the limited number of ID samples and the unavailability of OOD samples during training. One strategy to enhance the detection performance of a network is to make its outputs more sensitive to OOD samples, i.e., the network tends to provide high-confidence predictions for ID samples and low-confidence predictions for OOD samples.
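As an illustration of the confidence-based view described above, the following minimal sketch scores a sample by its maximum softmax probability and flags it as OOD when that confidence falls below a threshold. This is a common baseline, not the method proposed in the thesis; the function names and the threshold value of 0.5 are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Convert raw network outputs (logits) into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_softmax_confidence(logits):
    """Confidence of a prediction: the largest class probability."""
    return max(softmax(logits))

def is_ood(logits, threshold=0.5):
    """Flag a sample as OOD when its confidence is below the threshold.

    The threshold is a hypothetical value; in practice it would be
    tuned on held-out data.
    """
    return max_softmax_confidence(logits) < threshold
```

A network that is sensitive to OOD samples would yield peaked logits (high confidence) for ID inputs and near-uniform logits (low confidence) for OOD inputs, so a single threshold would separate the two groups.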
Improving the OOD sensitivity of a network requires addressing a series of important problems and challenges: (1) Penalizing OOD samples that receive high-confidence predictions can improve the OOD sensitivity. Accordingly, how can specific OOD samples be generated for a network? (2) If only some OOD samples are observed, how should they be involved in the retraining process to balance ID generalization and OOD detection? (3) If OOD samples are unavailable, how can a network be fine-tuned with augmented ID samples to improve the OOD sensitivity? (4) If modifying the network is not allowed, how can an auxiliary network be learned to capture the OOD-sensitive information for the network?
This thesis systematically studies how to effectively solve the aforementioned problems with experimental and theoretical support. Because ID and OOD samples differ significantly, it is essential to consider the data characteristics and data correlations that statistical methods can model. Accordingly, this thesis incorporates statistical methods into deep neural networks to improve the OOD sensitivity. Specifically, it proposes four novel methods, one for each problem. The main ideas are: inferring an implicit generator based on the Shannon entropy to generate high-confidence OOD samples; constructing adaptive supervision information for OOD samples to minimize the disruption to learning to classify ID samples; exploring the data space around ID samples to construct vicinity distributions for OOD samples; and utilizing an auxiliary network to explore the discarded OOD-sensitive information in ID samples according to information bottleneck theory.
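For readers unfamiliar with the Shannon entropy mentioned above, the sketch below computes it for a predictive distribution. It only illustrates the quantity itself and makes no claim about how the thesis uses it to infer a generator; the function name is an assumption for this example.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(p) = -sum_k p_k * log(p_k), in nats.

    Zero-probability classes contribute nothing, by the usual
    convention that 0 * log(0) is treated as 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

Low entropy corresponds to a peaked, high-confidence prediction, while a uniform prediction over K classes attains the maximum entropy log K.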
