Article: |
Artifact: |
Demo Video: |
Presentation at ASE'18: |
Poster at ASE'18: |
We present CPA-SymExec, a tool for symbolic execution that is implemented in the open-source, configurable verification framework CPAchecker. Our implementation automatically detects which symbolic facts to track, in order to obtain a small set of constraints that are necessary to decide reachability of a program area of interest. CPA-SymExec is based on abstraction and counterexample-guided abstraction refinement (CEGAR), and uses a constraint-interpolation approach to detect symbolic facts. We show that our implementation can better mitigate the path-explosion problem than symbolic execution without abstraction, by comparing the performance to the state-of-the-art Klee-based symbolic-execution engine Symbiotic and to Klee itself. For the experiments we use two kinds of analysis tasks: one for finding an executable path to a specific location of interest (e.g., if a test vector is desired to show that a certain behavior occurs), and one for confirming that no executable path to a specific location exists (e.g., if it is desired to show that a certain behavior never occurs).
CPA-SymExec is released under the Apache 2 license and available (inclusive source code) at https://cpachecker.sosy-lab.org. A demonstration video is available at https://youtu.be/qoBHtvPKtnw.
This website consists of the following sections:
We will show the effect of CEGAR on the following example program (download):
extern void __VERIFIER_error();
extern unsigned char __VERIFIER_nondet_uchar();
int main() {
unsigned char a = __VERIFIER_nondet_uchar();
unsigned char b = __VERIFIER_nondet_uchar();
unsigned char c = b + 1;
unsigned char i = 0;
while (a < 100) {
if (__VERIFIER_nondet_uchar())
a++;
else
i++;
}
if (c == b && i > a)
ERROR:
__VERIFIER_error();
}
We want to check the specification that method __VERIFIER_error
is never called.
Without CEGAR, CPA-SymExec will not terminate and compute an infinite symbolic-execution tree that grows exponentially in size:
But with CEGAR, CPA-SymExec first performs an analysis that tracks no information about the symbolic memory and no path constraints. This creates a finite symbolic execution tree, because CPA-SymExec stops computation as soon as states repeat (cf. our paper on Symbolic Execution with CEGAR).
Though, because of the missing information, CPA-SymExec finds an (infeasible) program path to a call to __VERIFIER_error
; the computed symbolic execution tree looks like the following, with the target location where the call to __VERIFIER_error
occurs marked with red.
Through its counterexample-guided refinement, CPA-SymExec derives that it must track the symbolic memory of program variables b
and c
so that it knows at the final if-condition that b == c
is never true. It then restarts analysis and computes the following symbolic-execution tree that does not contain a path to a call to __VERIFIER_error
:
This way, CPA-SymExec can prove that the program fulfills the specification.
Note that i > a
is also false for the found error path, because a >= 100
and i = 0
. Thus, the refinement procedure could also choose to track the symbolic memory of i
and a
, instead of b
and c
. This would lead to an infinite symbolic-execution tree similar to the one of CPA-SymExec without CEGAR.
To make refinement avoid such "bad decisions", we use refinement selection with sliced path prefixes.
If a target path is found, CPA-SymExec produces both a symbolic and an examplary concrete trace.
For this slightly modified program:
extern void __VERIFIER_error();
extern unsigned char __VERIFIER_nondet_uchar();
int main() {
unsigned char a = __VERIFIER_nondet_uchar();
unsigned char b = __VERIFIER_nondet_uchar();
unsigned char c = b + 1;
unsigned char i = 0;
while (a < 100) {
if (__VERIFIER_nondet_uchar())
a++;
else
i++;
}
if (c <= b)
ERROR:
__VERIFIER_error();
}
The following traces are produced:
none: N1 -{INIT GLOBAL VARS}-> N27
line 1: N27 -{void __VERIFIER_error();}-> N28
line 2: N28 -{unsigned char __VERIFIER_nondet_uchar();}-> N29
lines 4-19: N29 -{int main();}-> N30
none: N30 -{Function start dummy edge}-> N2
line 5: N2 -{unsigned char a;}-> N3
line 5: N3 -{a = __VERIFIER_nondet_uchar();}-> N4
line 6: N4 -{unsigned char b;}-> N5
line 6: N5 -{b = __VERIFIER_nondet_uchar();}-> N6
line 7: N6 -{unsigned char c = b + 1;}-> N7
line 8: N7 -{unsigned char i = 0;}-> N8
a == main::a#1;
c == ((unsigned char)(((signed int)main::b#3) + 1));
b == main::b#3;
i == 0U;
line 9: N8 -{while}-> N9
line 9: N9 -{[!(a < 100)]}-> N11
main::a#1 >= 100;
line 16: N11 -{[c <= b]}-> N24
main::a#1 >= 100;
((unsigned char)(((signed int)main::b#3) + 1)) <= main::b#3;
line 17: N24 -{Label: ERROR}-> N25
main::a#1 >= 100;
((unsigned char)(((signed int)main::b#3) + 1)) <= main::b#3;
none: N1 -{INIT GLOBAL VARS}-> N27
line 1: N27 -{void __VERIFIER_error();}-> N28
line 2: N28 -{unsigned char __VERIFIER_nondet_uchar();}-> N29
lines 4-19: N29 -{int main();}-> N30
none: N30 -{Function start dummy edge}-> N2
line 5: N2 -{unsigned char a;}-> N3
line 5: N3 -{a = __VERIFIER_nondet_uchar();}-> N4
a == 104U;
line 6: N4 -{unsigned char b;}-> N5
line 6: N5 -{b = __VERIFIER_nondet_uchar();}-> N6
b == 255U;
line 7: N6 -{unsigned char c = b + 1;}-> N7
c == 0U;
line 8: N7 -{unsigned char i = 0;}-> N8
a == 104U;
c == 0U;
i == 0U;
b == 255U;
line 9: N8 -{while}-> N9
line 9: N9 -{[!(a < 100)]}-> N11
a == 104U;
a == 104U;
line 16: N11 -{[c <= b]}-> N24
c == 0U;
b == 255U;
c == 0U;
b == 255U;
line 17: N24 -{Label: ERROR}-> N25
The experimental data was produced on machines with:
Each experiment run was limited to:
We used 5590 tasks with a reach-safety property from the sv-benchmarks in revision svcomp18
. To run CPA-SymExec, we used CPAchecker in version cpachecker-1.7.7-ase18-symExec
. Symbiotic was used in version SPIN-18
, and Klee was used in revision 86e6f6f8
. To run the experiments in a reproducible manner, we used benchexec
in version 1.16
.
In number of solved tasks, our experiments gave the following results:
CPA-SymExec (no CEGAR) | CPA-SymExec (CEGAR) | Klee | Symbiotic | |
---|---|---|---|---|
Correct true | 196 | 2137 | 446 | 1201 |
Correct false | 590 | 545 | 899 | 848 |
Incorrect true | 1 | 0 | 6 | 3 |
Incorrect false | 22 | 22 | 33 | 9 |
All experimental data can be found in folder data_raw
of our artifact.
A comparison table of our results can be found here (csv).
Tool | HTML | CSV | Logfiles | Raw Results |
---|---|---|---|---|
CPA-SymExec without CEGAR | HTML | CSV | Logfiles | XML |
CPA-SymExec with CEGAR | HTML | CSV | Logfiles | XML |
Klee | HTML | CSV | Logfiles | XML |
Symbiotic | HTML | CSV | Logfiles | XML |
The following quantile plot compares CPA-SymExec, CPA-SymExec without the use of CEGAR, Klee and Symbiotic with each other.
The x-axis describes the n-th fastest task that each individual tool could solve: point 1 on the x-axis describes for each tool the task that took the least time for that tool, point 2 the task that took the second-least time, and so on.
The y-axis describes the CPU time (in seconds) that analysis took: point ~(2000, 400) for Symbiotic means that the 2000-th fastest task took 400 seconds of CPU time for analysis.
With this, it is possible to see how fast a tool is, and how many tasks it can solve---the further to the right and the further to the bottom, the more tasks the tool could solve and the faster it could solve these.
It is clearly visible that, overall, CPA-SymExec (with CEGAR) is competitive with Klee and Symbiotic both in the number of tasks it can solve, and the time it requires to do so. But CPA-SymExec, Klee and Symbiotic have different capabilities.
The following plots show the CPU time (in seconds) that CPA-SymExec (x-axis) and Klee (y-axis) require for each task. Tasks that contain a feasible call to __VERIFIER_error
(false
property) are represented by red crosses; tasks that contain no call to __VERIFIER_error
(true
property) are represented by green circles. The numbers in parantheses below the caption show how many tasks of both properties exist. The plot only shows tasks for which either a correct solution was found or a timeout occurred - tasks for which one of the tools ran into an error are not shown. If a data point is above the line through origin, it means that CPA-SymExec was faster solving that corresponding task. If a data point is below the line through origin, it means that Klee was faster solving that task. The plot also contains supporting lines (in grey) for x * 10
and x * 0.1
, to mark the area at which one tool is ten times faster than the other. Tasks close to 1000 on the x-axis depict an analysis timeout for CPA-SymExec, tasks close to 1000 on the y-axis depict an analysis timeout for Klee. Tasks close to point (1000, 1000) can't be solved by either tool.
The plot shows that, for many tasks that both tools can solve, Klee is significantly faster than CPA-SymExec (points below x * 0.1
line) but that there are lots of tasks that only one of the tools can solve, but not the other. (points at upper and right border, close to 1000 seconds).
The following plots give a more detailed comparison, based on the different benchmark categories:
The following plots show the CPU time (in seconds) that CPA-SymExec (x-axis) and Symbiotic (y-axis) require for each task, followed by the detailed plots per category:
The same applies as for Klee: Symbiotic is faster for tasks that both can solve, but the tools can solve a lot of different tasks.
For our experiments, we used all tasks with a reach-safety property of the sv-benchmarks and without recursion. The benchmark set is categorized according to the features of the verification tasks - the table below gives a quick overview over the used set. The category LDV
is special: its verification tasks are created from industrial-scale Linux modules, by the Linux Driver Verification team.
Category | Tasks | LOC | C features | ||||
---|---|---|---|---|---|---|---|
Sum | Min | Max | Avg | Median | |||
Arrays | 167 | 7275 | 14 | 1161 | 44 | 36 | |
False | 44 | 1575 | 15 | 57 | 36 | 37 | |
True | 123 | 5700 | 14 | 1161 | 46 | 36 | |
BitVectors | 50 | 10511 | 13 | 696 | 210 | 39 | |
False | 14 | 2236 | 13 | 636 | 160 | 32 | |
True | 36 | 8275 | 15 | 696 | 230 | 47 | |
ControlFlow | 94 | 183904 | 94 | 22300 | 1956 | 1645 | |
False | 42 | 83038 | 220 | 10835 | 1977 | 1695 | |
True | 52 | 100866 | 94 | 22300 | 1940 | 1057 | |
ECA | 1149 | 29685918 | 344 | 185053 | 25836 | 4269 | |
False | 411 | 11948617 | 566 | 185053 | 29072 | 4827 | |
True | 738 | 17737301 | 344 | 185053 | 24034 | 2590 | |
Floats | 172 | 47518 | 9 | 1122 | 276 | 37 | |
False | 31 | 970 | 15 | 154 | 31 | 31 | |
True | 141 | 46548 | 9 | 1122 | 330 | 50 | |
Heap | 181 | 142378 | 11 | 4605 | 787 | 649 | |
False | 71 | 53834 | 19 | 4605 | 758 | 661 | |
True | 110 | 88544 | 11 | 4576 | 805 | 503 | |
Loops | 163 | 10026 | 14 | 1647 | 62 | 25 | |
False | 52 | 4035 | 14 | 1647 | 78 | 23 | |
True | 111 | 5991 | 14 | 476 | 54 | 26 | |
ProductLines | 597 | 1160305 | 838 | 3789 | 1944 | 1014 | |
False | 265 | 620859 | 847 | 3789 | 2343 | 2951 | |
True | 332 | 539446 | 838 | 3693 | 1625 | 979 | |
Sequentialized | 273 | 580463 | 330 | 18239 | 2126 | 1098 | |
False | 170 | 325198 | 331 | 15979 | 1913 | 974 | |
True | 103 | 255265 | 330 | 18239 | 2478 | 1223 | |
Systems_DeviceDriversLinux64 | 2744 | 40335619 | 339 | 227732 | 14700 | 9432 | |
False | 355 | 6116255 | 1389 | 85772 | 17229 | 13420 | |
True | 2389 | 34219364 | 339 | 227732 | 14324 | 8288 | |
Total | 5590 | 72163917 | 9 | 227732 | 12909 | 3522 | |
False | 1455 | 19156617 | 13 | 185053 | 13166 | 3000 | |
True | 4135 | 53007300 | 9 | 227732 | 12819 | 3967 |
In the following, we assume that you use the artifact to run the tools, but all command lines can be transferred to the web client.
For a quick introduction, watch our tutorial video (~4 minutes).
This section gives an introduction into the usage of CPA-SymExec from the command line and explains how to run the experiments cited in our paper.
For more information about development in CPAchecker, see the CPAchecker documentation.
We assume that you have:
The system provided by our SoSy-Lab Virtual Machine already meets these requirements.
If you don't have benchexec installed yet, you can install it from the deb-package shipped with the artifact. To do so, run from the artifact's root directory:
dpkg -i debs/python3-tempita_0.5.2-1build1_all.deb
dpkg -i benchexec_1.14-1_all.deb
For more information on benchexec, see the documentation.
To run CPA-SymExec on the example program exampleProgram.c
to check whether the ERROR
-label is reachable, execute from directory cpachecker-1.7
, on the command line:
scripts/cpa.sh -symbolicExecution-Cegar ../exampleProgram.c
CPA-SymExec will find a target path and report the verdict FALSE
(i.e., it is false that the ERROR
-label is unreachable). CPA-SymExec will also create multiple files in the directory output/
. The HTML report Counterexample.2.html
gives a good overview over most of these files. We suggest to use Chrome or Firefox to open the report. The 2
in its name implies that a first target path (i.e., counterexample) was found that was not feasible.
To use CPA-SymExec to generate tests for condition coverage on the example program exampleProgram.c
, execute, from directory cpachecker-1.7
:
scripts/cpa.sh -testCaseGeneration-symbolicExecution-Cegar ../exampleProgram.c
The generated test cases will be created as output/Test.*.harness.c
.
To execute a test, compile it together with the original program and run the product. For example, to compile the test case for test 0 into a executable file testcase
, run, from directory cpachecker-1.7
:
gcc -o testcase output/Test0.harness.c ../exampleProgram.c
Then, to execute, run ./testcase
. (Since the program does not create any output, no behavior will be visible.)
To repeat the experiments for CPAchecker, run, from directory cpachecker-1.7
:
benchexec --container ../benchmark_defs/cpachecker.xml
Logfiles and the result XMLs will be in cpachecker-1.7/results/
.
Klee is run from the existing Symbiotic wrapper script, but without any Symbiotic-specific modifications. Thanks to this, the experiments for Klee and Symbiotic use the exact same Klee and LLVM version.
To repeat the experiments for both Klee and Symbiotic, run, from directory symbiotic
:
benchexec --container ../benchmark-defs/symbiotic.xml
Logfiles and the result XMLs will be in symbiotic/results/
. Result files for Klee are named symbiotic.DATE.results.klee.*.xml
, result files for Symbiotic are named symbiotic.DATE.results.symbiotic.*.xml
.
Note: If you only want to run experiments for Klee, run, from directory symbiotic
: bash benchexec --container -r klee ../benchmark-defs/symbiotic.xml
If you only want to run experiments for Symbiotic, run, from directory symbiotic
: bash benchexec --container -r symbiotic ../benchmark-defs/symbiotic.xml
To create HTML and CSV-tables for the results, run table-generator RESULT-FILES
on the corresponding files. For example, to create tables for all three tools, run:
table-generator \
cpachecker-1.7/results/cpachecker.????-??-??_????.results.symbolicExecution-Cegar.xml.bz2 \
symbiotic/results/symbiotic.????-??-??_????.results.klee.xml.bz2 \
symbiotic/results/symbiotic.????-??-??_????.results.symbiotic.xml.bz2
The program will tell you the name of the generated files on the console output.
You can also create tables for the existing data of our experiments. To do so, run:
table-generator \
data_raw/cpachecker.????-??-??_????.results.symbolicExecution-Cegar.xml.bz2 \
data_raw/symbiotic.????-??-??_????.results.klee.xml.bz2 \
data_raw/symbiotic.????-??-??_????.results.symbiotic.xml.bz2
The program will, again, tell you the name of the generated files on the console output.