Runtime Approaches for Automatic Patch Generation

Get Complete Project Material File(s) Now! »

First Contribution: Runtime-based Patch Generation

The first contribution of this thesis consists of three novel techniques that produce patches based on a buggy execution and its running state. It addresses the first problem presented in Section 1.2, by removing the dependence on a failing test case to generate patches.
The first automatic patch generation technique is a template-based patch generation technique for null pointer exceptions in Java. A template-based automatic program repair technique is a technique that uses a predefined set of patch templates to repair an application, e.g., [11]. We consider null pointer exception since it is a runtime failure, therefore it does not require a failing test case to be able to generate a patch for it. The location of the null pointer exception can be determined directly with the stacktrace.
The second technique is a metaprogramming-based automatic program repair technique. In this context, a metaprogram is a program that can modify itself and change its behavior during its execution. We create a metaprogram specialized in handling null pointer exceptions, which can change its behavior dynamically to handle null pointer exceptions. It does not require a failing test case to reproduce the error. The third and final technique is a code synthesizer. It is designed to synthesize Java expressions by combining constants, variables, and methods with Boolean and mathemat-ical operators, e.g., &&, ||, +, −. These expressions are used to replace buggy conditions and create missing pre-conditions. This technique is the first code synthesizer of the literature that directly uses the runtime information to synthesize code.

Second Contribution: A Study of Patch Validity of Runtime- based Patch Generation

The next contribution presented in this thesis is an analysis of the repair search space of runtime patch generation techniques. This contribution focuses on the second problem presented in Section 1.2, the patch validation. It provides a better picture on the number of valid patches in the search space to enable the creation of new techniques to select the relevant patches without the asserts of a failing test. The repair search space is defined by Martinez [12] as “all explorable bug fixes for a given program and a given bug (whether compilable, executable or correct)”. In our contribution, we define the repair search space of null pointer exception bugs.
We answer to the following questions: How many patches are generated and how many valid patches are in the search space? In the previous contribution, we were only interested to see if it was possible to generate patches. These new questions are relevant to address our second problem on patch validity. We need a deep understanding on what makes the repair search space grows and, more importantly, which criteria are important to determine the patch validity.

Third Contribution: Practical Patch Generation in Production

The final contribution of this thesis aims to group all the acquired knowledge of the two first ones to create the first patch generation techniques ever dedicated to the production environment. This final contribution targets the three problems of this thesis. It proposes a new patch generation technique that does not use failing test cases. It also proposes a new way to validate the generated patches and provides an architecture to handle the potential side-effects of the patch generation techniques.
We design two automatic patch generation systems: one is dedicated to client-side applications and the other one is dedicated to server-side. BikiniProxy is the first technique, it generates patches for JavaScript applications. The JavaScript client-side applications are by nature distributed to each client and are only used for rendering and user interface (UI) interactions. Consequently, the side-effects of the patch generation techniques are limited to only one client and persistent state is also not involved. This environment opens new opportunities for patch generation, patch validation, and side-effect handling. The technique consists of an HTTP proxy that injects error monitoring code in the web application. Once an error is detected, it is considered as a collective knowledge that will be shared between all the clients. This collective knowledge is then used to patch the buggy JavaScript for the next users. The patch is then evaluated by monitoring the patched application. If the monitoring detects an invalid behavior, the generated patch is discarded and no further users will employ it. A database of patches for the web application is thus constituted and can be used by the developers to fix their web applications permanently.
The second technique, Itzal, produces patches and validates them directly from production traffic. The server-side applications are by nature centralized for the data persistence and all the business processing. It implies that any wrong manipulation impacts several users, and potentially the persistent state of the application. In this context, it is not realistic to directly modify the running application to generate patches. Our solution is to duplicate the application and production traffic in a sandboxed environment that can be used by the patch generation techniques. The sandboxing prevents all the potential side effects that the patch generation techniques can have on the production application. Another characteristic of Itzal is that it uses the sandboxed environment to deploy the generated patches and to compare its behavior with the production application. This way Itzal is able to assert the validity of the patches.

Automatic Program Repair

Automatic program repair is a task that aims to change automatically the behavior of a program to fix an invalid behavior. This section presents several aspects of automatic program repair. First, we present the test-based automatic program repair techniques. Second, we present techniques that repair specific aspects of an application, such as repairing inconsistencies in a data structure. Third, we present the literature on analyzing generated patches. Finally, we list the benchmarks of bugs that are used in the evaluation of patch generation techniques. There are two surveys by Gazzola et al. [34] and Monperrus [35] that summarize the literature of automatic program repair techniques. We refer to these two surveys for an exhaustive summarization of existing works on such field. In this related work section, we focus on the works that are the most relevant to patch generation without failing test cases.

READ Molecular modelling of the neuroglobin-cytochrome c complex

Test-based Automatic Program Repair

Test-based automatic repair is an automatic program repair technique that uses the test suite of the program as its specification. In this settings the passing test cases specify the correct behavior of the program and at least one failing test case describes the invalid behavior of the program. The goal of test-based repair techniques is to generate a patch that makes the entire test suite passing. In the literature, there are three main approaches to generate patches: the search-based approach, the deductive-based approach, and the template-based approach. These works are state-of-the-art for the patch generation techniques that we present in this thesis.

Search-based Program Repair Techniques

Search-based repair technique consists in searching for patches in a search space of patch candidates. We present the most important search-based program repair techniques in chronological order. Arcuri et al. [36] present Jaff, the first approach that uses evolutionary optimization for program repair. Debroy and Wong [37] use the mutation operators of mutation testing as repair strategies. This work combines fault localization with program mutation to exhaustively explore a space of candidate patches. GenProg [1] introduces the concept of genetic programming for patch generation. It generates patches by adding, deleting, or replacing elements of the code with existing code in the application. AE [38] is optimized GenProg by reducing its repair search space. RSRepair [39] uses a random search, and it is more efficient in finding patches than the genetic programming approach. Their follow-up work [40] prioritizes test cases to reduce the time to detect invalid patch candidates. SPR [41] defines a set of staged repair operators. It discards as early as possible candidate patches that cannot make the test suite a passing one. Consequently, SPR has more time to explore a smaller and more valuable patch search space. ELIXIR [4] focuses on the generation of patches that contain method calls. They rank the candidate patches with machine learning techniques to filter the patches and to reduce the search space. They extract the machine learning features from the context of the buggy statement. CapGen [42] is a patch generation technique that uses the context of the buggy statement to rank the potential candidates of patches. Their results show that a context-based approach is a valuable technique to address the patch overfitting problem.

Table of contents :

List of algorithms
List of listings
List of figures
List of tables
1 Introduction
1.1 Context
1.1.1 Automatic Patch Generation
1.1.2 Self-healing Runtime Approaches
1.2 Problem Statement
1.2.1 Problem 1: Patch Generation without Failing Test Case
1.2.2 Problem 2: Automatic Patch Validation in Production
1.2.3 Problem 3: Side-effect of Patch Generation in Production
1.2.4 Summary
1.3 Thesis Contributions
1.3.1 First Contribution: Runtime-based Patch Generation
1.3.2 Second Contribution: A Study of Patch Validity of Runtime-based Patch Generation
1.3.3 Third Contribution: Practical Patch Generation in Production .
1.4 Outline
1.5 Publications
2 State of the Art
2.1 Program Monitoring and Analysis
2.1.1 Program Monitoring
2.1.2 Program Behavior Analysis
2.1.3 Conclusion
2.2 Automatic Program Repair
2.2.1 Test-based Automatic Program Repair
2.2.2 Specialized Program Repair Techniques
2.2.3 Analysis of Generated Patches
2.2.4 Benchmark for Automatic Program Repair
2.2.5 Conclusion
2.3 Self-healing
2.3.1 Runtime Failure Recovery
2.3.2 Self-Healing for Security
2.3.3 Failure-Oblivious Computing
2.3.4 Conclusion
3 Runtime Approaches for Automatic Patch Generation
3.1 Automatic Patch Generation for Null Pointer Exception
3.1.1 A Taxonomy of Repair Strategies for Null Pointer Exceptions
3.1.2 Template-Based Patch Generation for Null Pointer Exception .
3.1.3 Metaprogramming-based Patch Generation for Null Pointer Exception
3.1.4 Evaluation
3.1.5 Discussion
3.1.6 Conclusion
3.2 Enriched Expression Synthesizer
3.2.1 An Algorithm for Condition Synthesis
3.2.2 Evaluation
3.2.3 Conclusion
3.3 Summary
4 A Study of the Runtime Repair Search Space
4.1 Cascading Null Pointer Exceptions
4.2 Exploring the Repair Search Space
4.2.1 Basic Definitions
4.2.2 The Failure-oblivious Computing Search Space
4.2.3 FO-EXPLORE: An Algorithm to Explore the Failure-oblivious Computing Search Space
4.2.4 Usefulness of Exploring the Search Space
4.3 Empirical Evaluation
4.3.1 Considered Failure-Oblivious Model
4.3.2 Benchmark
4.3.3 Experimental Protocol
4.3.4 Responses to Research Questions
4.4 Threats to Validity
4.5 Conclusion
5 Contributions to Automatic Patch Generation in Production
5.1 Production Patch Generation for Client-side Applications
5.1.1 Background
5.1.2 Patch Generation for Client-side Applications
5.1.3 Evaluation
5.1.4 Threat to Validity
5.1.5 Conclusion
5.2 Production Patch Generation for Server-side Applications
5.2.1 Patch Generation on Production Failures
5.2.2 Evaluation
5.2.3 Conclusion
5.3 Summary
6 Conclusion and Perspectives
6.1 Summary of the Contributions
6.2 Local Perspectives
6.2.1 Patch Generation without Failing Test Case
6.2.2 Automatic Patch Validation in Production
6.2.3 Side-effect of Patch Generation in Production
6.3 Global Perspectives
6.3.1 Production Oracle for Patch Validity
6.3.2 Interaction Between the Developers and the Generated Patches .
6.4 Final Words
References