Who Tests the Testers?: Avoiding the Perils of Automated Testing

John Wrenn, Shriram Krishnamurthi, Kathi Fisler

SIGCSE International Computing Education Research Conference, 2018


Instructors routinely use automated assessment methods to evaluate the semantic qualities of student implementations and, sometimes, test suites. In this work, we distill a variety of automated assessment methods in the literature down to a pair of assessment models. We identify pathological assessment outcomes in each model that point to underlying methodological flaws. These theoretical flaws broadly threaten the validity of the techniques, and we actually observe them in multiple assignments of an introductory programming course. We propose adjustments that remedy these flaws and then demonstrate, on these same assignments, that our interventions improve the accuracy of assessment. We believe that with these adjustments, instructors can greatly improve the accuracy of automated assessment.


