HINDBR: Heterogeneous information network based duplicate bug report prediction
- Publisher: IEEE
- Publication Type: Conference Proceeding
- Citation: Proceedings - International Symposium on Software Reliability Engineering, ISSRE, 2020, 2020-October, pp. 195-206
- Issue Date: 2020-10-01
Closed Access
Filename | Description | Size |
---|---|---|
issre20b.pdf | Accepted version | 966.68 kB |
This item is closed access and not available.
©2020 IEEE. Duplicate bug reports often exist in bug tracking systems (BTSs). Almost all existing approaches for automatically detecting duplicate bug reports are based on text similarity. A recent study found that such approaches may become ineffective in detecting duplicates in bug reports submitted after the just-in-time (JIT) retrieval, which is now a built-in feature of modern BTSs (e.g., Bugzilla). This is mainly because the embedded JIT feature suggests possible duplicates in a bug database when a bug reporter types in the new summary field, thereby minimizing the submission of textually similar reports. Although JIT filtering seems effective, a number of bug report duplicates remain undetected. Our hypothesis is that we can detect them using a semantic similarity-based approach. This paper presents HINDBR, a novel deep neural network (DNN) that accurately detects semantically similar duplicate bug reports using a heterogeneous information network (HIN). Instead of matching text similarity alone, HINDBR embeds semantic relations of bug reports into a low-dimensional embedding space where two duplicate bug reports represented by two vectors are close to each other in the latent space. Results show that HINDBR is effective.
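The idea that duplicate reports sit close together in the embedding space can be illustrated with a minimal sketch. It is not the HINDBR model: the three-dimensional vectors, the use of cosine similarity as the closeness measure, the function names, and the 0.9 threshold are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def is_probable_duplicate(emb_a, emb_b, threshold=0.9):
    """Flag two reports as likely duplicates when their embeddings are close.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Hypothetical embeddings: in practice a trained model would produce these.
report_a = [0.9, 0.1, 0.3]
report_b = [0.8, 0.2, 0.3]    # points in nearly the same direction as report_a
report_c = [-0.2, 0.9, -0.4]  # points in a very different direction

print(is_probable_duplicate(report_a, report_b))
print(is_probable_duplicate(report_a, report_c))
```

Under these assumptions, a pair is flagged only when its vectors point in nearly the same direction, regardless of how similar the original report texts look.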
Please use this identifier to cite or link to this item: