Control Flow Analysis for Java Methods
DRAFT - This document does not reflect the current JaCoCo implementation.
Implementing a coverage tool for branch coverage requires detailed analysis of the internal control flow of Java methods. Due to the architecture of JaCoCo this analysis needs to happen on compiled class files (bytecode). This document defines graph structures for control flow analysis of Java bytecode and discusses strategies for probe insertion. Marc R. Hoffmann, July 2010
Motivation and Requirements
- Branch Coverage
- Exception Detection
From Statement Coverage to Branch Coverage
A JaCoCo till version 0.4.x provides statement coverage As as starting point differnce between statement coverage and branch coverage. probe insertion strategy.
1public void example() { 2 a(); 3 if (condition()) { 4 b(); 5 } 6 c(); 7}
1public example() : void 2 L0 3 INVOKESTATIC Example.a() : void 4 L1 5 INVOKESTATIC Example.condition() : boolean 6 IFEQ L3 7 L2 8 INVOKESTATIC Example.b() : void 9 L3 10 INVOKESTATIC Example.c() : void 11 L4 11 RETURN
The Control Flow Graph
- Virtual Entry and Exit Nodes
- Subsequent Instructions
- (Conditional) Jump
- Table/Lookup Switch
- Exception Handlers
- Unhandled Exceptions
Probe Insertion
Code coverage analysis is a runtime metric that provides execution details of the software under test. This requires detailed recording about the instructions (instruction coverage) that have been executed. For branch coverage also the outcome of decisions has to be recorded. In any case execution data is collected by so called probes:
A probe is a sequence of bytecode instructions that can be inserted into a Java method. When the probe is executed, this fact is recorded and can be reported by the coverage runtime.
The only purpose of the probe is to record that it has been executed at least once. The probe does not record the number of times it has been called or collect any timing information. The latter is out of scope for code coverage analysis and more in the objective of a performance analysis tool. Typically multiple probes needs to be inserted into each method, therefore probes needs to be identified. Also the probe implementation and the storage mechanism it depends on needs to be thread safe as multi-threaded execution is a common scenario for java applications (albeit not for plain unit tests). Probes must not have any side effects on the original code of the method. Also they should add minimal overhead.
So to summarize the requirements for execution probes:
- Record execution
- Identification for different probes
- Thread safe
- No side effects on application code
- Minimal runtime overhead
JaCoCo implements probes with a boolean[]
array instance per
class. Each probe corresponds to a entry in this array. Whenever the probe is
executed the entry is set to true
with the following four
bytecode instructions:
ALOAD probearray xPUSH probeid ICONST_1 BASTORE
Note that this probe code is thread safe, does not modify the operand stack
or modify local variables and is also thread safe. It does also not leave the
method though an external call. The only prerequisite is that the probe array
is available as a local variable. For this at the beginning of each method
additional instrumentation code needs to be added to obtain the array instance
associated with the belonging class. To avoid code duplication the
initialization is delegated to a static private method
$jacocoinit()
which is added to every non-interface class.
The size of the probe code above depends on the position of the probe array variable and the value of the probe identifier as different opcodes can be used. As calculated in the table below the overhead per probe ranges between 4 and 7 bytes of additional bytecode:
Possible Opcodes | Min. Size [bytes] | Max. Size [bytes] |
Total: | 4 | 7 |
ALOAD_x , ALOAD 1 |
1 | 2 |
ICONST_x , BIPUSH , SIPUSH , LDC , LDC_W 2 |
1 | 3 |
ICONST_1 |
1 | 1 |
BASTORE |
1 | 1 |
1 The probe array is the first variable after the arguments.
If the method arguments do not consume more that 3 slots the 1-byte opcode can
be used.
2 1-byte opcodes for ids 0 to 5, 2-byte opcode for ids up to 127,
3-byte opcode for ids up to 32767. Ids values of 32768 or more require an
additional constant pool entry. For normal class files it is very unlikely to
require more than 32,000 probes.
- Limitation: Only proves that the probe itself has been executed, assumptions about the surrounding application code is interpolation
- Probe in every edge of the control flow graph
- Every exit path known (branch coverage)
- Block entry known (exceptions within blocks)
Different Types of Edges
- Probe insertion strategies
Runtime Overhead
- Comparison to current basic block probes