April 7, 2005.
NOTE: At the time of publication, the author David Janzen was not yet affiliated with Cal Poly.
Despite a half century of advances, the software construction industry still shows signs of immaturity. Professional software development organizations continue to struggle to produce reliable software in a predictable and repeatable manner. Although a variety of development practices are advocated that might improve the situation, developers are often reluctant to adopt new, potentially better practices based on anecdotal evidence alone. As a result, empirical software engineering has gained credibility as a discipline that provides scientific data about practice efficacy on which developers can base critical decisions. This research proposes to apply empirical software engineering techniques to evaluate a new approach that offers the potential to significantly improve the state of software construction.
Test-driven development (TDD) is a disciplined software development practice that focuses on software design by first writing automated unit tests, followed by production code, in short, frequent iterations. TDD focuses the developer’s attention on the software’s interface and behavior while growing the software architecture organically. TDD has recently gained attention with the popularity of the Extreme Programming agile software development methodology. Although TDD has been applied sporadically in various forms for several decades, possible definitions have only recently been proposed. Advocates of TDD rely primarily on anecdotal evidence, with relatively little empirical evidence of the benefits of the practice. A small number of studies have looked at TDD only as a testing practice to remove defects. However, there is no research on the broader efficacy of TDD. This research will be the first comprehensive evaluation of how TDD affects overall software architecture quality beyond just defect density.
My hypothesis is that TDD improves overall software quality, including characteristics such as extensibility, reusability, and maintainability, without significantly impacting cost and programmer productivity. I intend to examine this hypothesis by designing and administering a series of longitudinal empirical studies with undergraduate students and professional programmers. Controlled experiments will be conducted in a set of undergraduate courses. Student programmers will be taught to write automated unit tests integrated with course topics using a new approach that I am calling test-driven learning (TDL). Formal experiments will then compare the quality of software produced with TDD to software produced with a more traditional test-last development approach. A case study or controlled experiment will also be conducted with more experienced programmers in a professional environment. In all of the studies, programmer performance, attitudes toward testing, and future voluntary usage of TDD will also be assessed. The combination of studies in academic and professional environments will establish external validity of the research as well as provide valuable information regarding the effectiveness of TDD at various levels of maturity. The research should also produce several important by-products, including pedagogical materials, a framework for future studies, and observations regarding TDD’s fit in the undergraduate computer science curriculum. Positive results from these studies have the potential to significantly improve the state of software construction. For the first time, professional developers will be able to examine empirical evidence of TDD efficacy both as a testing practice and as a design practice. Additionally, computer science faculty will be encouraged to incorporate TDD into curricula, resulting in better student design and testing skills.
Improved pedagogy combined with widespread adoption of TDD offers the potential of radically improving the software engineering community’s ability to reliably produce, reuse, and maintain quality software.
2005 David Janzen.