Postprint version. Published in Innovations in Systems and Software Engineering, Volume 1, Issue 2, September 1, 2005, pages 116-124.
NOTE: At the time of publication, the author Alex Dekhtyar was not yet affiliated with Cal Poly.
The definitive version is available at https://doi.org/10.1007/s11334-005-0011-3.
To support debugging, maintenance, verification and validation (V&V) and/or independent V&V (IV&V), it is necessary to understand the relationship between defect reports and their related artifacts. For example, one cannot correct a code-related defect report without being able to find the affected code. Information retrieval (IR) techniques have been used effectively to trace textual artifacts to each other. They have generally been applied to the problem of dynamically generating a trace between artifacts in the software document hierarchy after the fact (after development has proceeded to at least the next lifecycle phase). The same techniques can also be used to trace textual artifacts of the software engineering lifecycle to defect reports. We have applied the term frequency–inverse document frequency (TF-IDF) technique with relevance feedback, as implemented in our requirements tracing on-target (RETRO) tool, to the problem of tracing textual requirement elements to related textual defect reports. We have evaluated the technique using a dataset for a NASA scientific instrument. On two subsets of the dataset, the technique achieved recall of over 85% with precision of 69%, and recall of 70% with precision of 99%, respectively.
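To make the approach concrete, the following is a minimal sketch of TF-IDF ranking with one round of relevance feedback, followed by the recall and precision measures used in the evaluation. It is not the RETRO implementation. The tokenizer, the idf variant log(N/df), the Rocchio-style feedback formula with its alpha, beta and gamma weights, and the toy requirement and defect texts are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace, keeping alphabetic tokens only."""
    return [w for w in text.lower().split() if w.isalpha()]

def build_idf(docs):
    """idf(t) = log(N / df(t)) over a dict {doc_id: text} (one common variant)."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return {t: math.log(n / count) for t, count in df.items()}

def vectorize(text, idf):
    """Sparse TF-IDF vector; terms outside the collection vocabulary are dropped."""
    tf = Counter(tokenize(text))
    return {t: f * idf[t] for t, f in tf.items() if t in idf}

def cosine(u, v):
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
           math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [(d, cosine(query_vec, v)) for d, v in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def rocchio(query_vec, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (an assumed feedback scheme): move the query toward
    the mean of relevant vectors and away from the mean of irrelevant ones."""
    updated = {t: alpha * w for t, w in query_vec.items()}
    for vecs, sign, coeff in ((relevant, 1, beta), (irrelevant, -1, gamma)):
        for vec in vecs:
            for t, w in vec.items():
                updated[t] = updated.get(t, 0.0) + sign * coeff * w / len(vecs)
    return {t: w for t, w in updated.items() if w > 0.0}  # drop non-positive weights

def precision_recall(retrieved, relevant):
    """precision = |hits| / |retrieved|, recall = |hits| / |relevant|."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical toy data: one requirement traced against three defect reports.
defects = {
    "D1": "telemetry packet checksum fails after reset",
    "D2": "heater control loop overshoots temperature setpoint",
    "D3": "checksum mismatch in telemetry downlink frame",
}
requirement = "the instrument shall compute a checksum for each telemetry packet"

idf = build_idf(defects)
doc_vecs = {d: vectorize(text, idf) for d, text in defects.items()}
query = vectorize(requirement, idf)
before = dict(rank(query, doc_vecs))

# The analyst marks D3 as a true link and D2 as a false one; the query is updated.
query_fb = rocchio(query, [doc_vecs["D3"]], [doc_vecs["D2"]])
after = dict(rank(query_fb, doc_vecs))
```

In this sketch the requirement plays the role of the query and the defect reports form the document collection. After feedback, terms from the confirmed link enter the query, so reports that share vocabulary with it score higher on the next pass. The two headline results illustrate the usual trade-off between the measures: returning more candidate links tends to raise recall at the cost of precision.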