Thesis Proposal


"Towards Improving the Effectiveness of Automated Program Repair"

Qi Xin

Friday, September 22, 2017, at 2:00 P.M.

Lubrano Conference Room (CIT Room 477 - 4th Floor)

Debugging is laborious. Automated program repair aims to save​ ​people time and effort by repairing a faulty program in an​ ​automated way. A typical automated repair technique accepts​ ​as input a faulty program and a fault-exposing test suite. The​ ​goal is to generate as output a correct, patched program that​ ​passes the test suite. Current automated repair techniques are​ ​still far from maturity in that: (1) they define a huge search​ ​space of patches to support repairing a large set of bugs, but​ ​they are not effective in finding a correct patch within such a​ ​huge space and (2) they are prone to produce a patch that is​ ​test-suite-overfitted, or overfitting. An overfitting, patched​ ​program can pass the test suite but does not actually repair the bug.

To address the two problems, we developed an automated repair technique​ ​ssFix and a patch testing technique DiffTGen. ssFix finds and uses​ ​existing code from a code database that is syntactically related to​ ​the context of a fault to produce patches for its repair. By leveraging​ ​such syntax-related code, ssFix potentially creates a search space of​ ​patches whose size is reduced but is still likely to contain a correct​ ​patch. We evaluated ssFix on 357 bugs in the Defects4J bug dataset. Our​ ​results demonstrate the effectiveness of ssFix: it successfully repaired​ ​20 bugs with valid patches generated. Compared to five other repair​ ​techniques: jGenProg, jKali, NoPol, HDRepair, and ACS, it has a better​ ​repair performance.

A repair technique that uses a test suite as the correctness criterion​ ​can produce overfitting patches. Our patch testing technique DiffTGen can​ ​identify an overfitting patch through test case generation. To do so,​ ​it first employs a test generator to generate new test inputs that uncover semantic differences between the patched program and the original, faulty​ ​program. It next shows the differential semantics as different outputs​ ​to an oracle for correctness judging. DiffTGen determines the patched​ ​program to be overfitting if the output of the patched program is judged​ ​as incorrect, and can produce a test case based on a correct output provided​ ​by the oracle. We evaluated DiffTGen on 89 patches generated by four automated​ ​repair techniques for Java with 79 of the patches being likely to be​ ​overfitting. DiffTGen identifies in total 39 (49.4%) overfitting patches​ ​and yields the corresponding test cases.

The repair performance of ssFix depends on its code search ability in​ ​finding syntax-related code from a code database. I propose to investigate​ ​using different code search approaches for program repair and comparing​ ​their effectiveness. I also propose to combine ssFix and DiffTGen and​ ​evaluate their effectiveness in an iterative bug repair process.

Host: Professor Steve Reiss