Symbolic Execution using CEGAR

Dirk Beyer, Thomas Lemberger

Abstract

Symbolic execution, a standard technique in program analysis, is a particularly successful and popular component in systems for test-case generation. One of the open research problems is that the approach suffers from the path-explosion problem. We apply abstraction to symbolic execution, and refine the abstract model using counterexample-guided abstraction refinement (CEGAR), a standard technique from model checking. We also use refinement selection with existing and new heuristics to influence the behavior and further improve the performance of our refinement procedure. We implemented our new technique in the open-source software-verification framework CPAchecker. Our experimental results show that the implementation is highly competitive.

Supplementary Data

We performed an extensive experimental evaluation to show the benefits of different refinement selection heuristics and the competitiveness of our approach. For creating our results, we use BenchExec, a framework for reliable benchmarking and resource measurement.

Comparison of Refinement Selection Heuristics

Refinement Selection allows the use of different heuristics for interpolant selection when using CEGAR. We performed an experimental evaluation on the most promising of these heuristics to find the one best suited for Symbolic Execution. The raw results of a selection of our evaluation can be found in the following table.

Heuristic
No preference	HTML Table	CSV Table	Log Files
Domain good - width narrow	HTML Table	CSV Table	Log Files
Domain good - short	HTML Table	CSV Table	Log Files
Assumptions most - short	HTML Table	CSV Table	Log Files
Comparison of all of the above	HTML Table	CSV Table

Comparison to other Verifiers

We compared our implementation of Symbolic Execution with CEGAR (SymEx+) in CPAchecker with the implementation of eager Symbolic Execution (SymEx) in CPAchecker and Symbiotic 3, a Symbolic Execution verifier using instrumentation and slicing. The raw results of our experimental evaluation can be found in the following table.

Verifier
SymEx (eager)	HTML Table	CSV Table	Log Files
SymEx+ (CEGAR)	HTML Table	CSV Table	Log Files
Symbiotic 3	HTML Table	CSV Table	Log Files
Comparison of all of the above	HTML Table	CSV Table

Reproducing the Results

Virtual Machine

The easiest way to reproduce our results is to use our replication artifact with our virtual machine It is based on Ubuntu 16.04 and comes with everything you need to run SymEx and SymEx+ with and without BenchExec. You can log into the VM using the username sosy and the password sosy . If you are comfortable with using the VM, you can skip Prerequisites. Continue with Single Verification Task to find out how to run a single verification task. If you do not want to use the VM or the artifact, but install the tools yourself on your own system, proceed with the next section.

Prerequisites

We assume that your machine meets the following or similar requirements:

Operating system based on Linux 3.x
OpenJDK 1.7 (or compatible)
16 GiB of RAM
At least 2 CPU cores

We also assume that you use bash as your command-line shell.

If not already present, you will also have to install git and subversion for getting code repositories and ant to compile CPAchecker. You will need administrator privileges to do so. On Ubuntu, the three can be installed by running:

sudo apt-get install git subversion ant

In addition, choose a base directory cpa-symexec to which you want to extract all the data and programs that are necessary to reproduce our results. This may be any directory you have write-access to. For all steps below, we will assume that you are in that directory (to check, run pwd on your command line).

After you have ensured all of the above, retrieve copies of the verification tasks, the tools and the configurations we used in our work, by following the steps listed below.

Obtain our artifact from Zenodo, extract it and change into directory cpa-symexec.
```
wget https://zenodo.org/record/1158649/files/cpa-symexec.zip?download=1
unzip cpa-symexec.zip
```
cd cpa-symexec

Obtain a copy of the SV-COMP'16 verification tasks and create a symlink called svcomp16.

wget https://zenodo.org/record/1158644/files/sv-benchmarks-svcomp16.zip?download=1
unzip sv-benchmarks-svcomp16.zip
ln -s sv-benchmarks-svcomp16/c/ svcomp16

Obtain a copy of Symbiotic in version 3.0.1 with libstdc from GitHub.

wget https://github.com/staticafi/symbiotic/releases/download/3.0.1/Symbiotic-3.0.1-with-libstdc.tar.gz
tar xzf Symbiotic-3.0.1-with-libstdc.tar.gz && rm Symbiotic-3.0.1-with-libstdc.tar.gz

After perfoming these four steps, the directory subtree of cpa-symexec should look as follows:

./
CPAchecker/
benchmarks/
sv-benchmarks-svcomp16/
svcomp16/
Symbiotic
README

Before telling you how to reproduce our full experiments, we will talk about running a single verification task with both CPAchecker and Symbiotic in the next section. This should give you a better understanding of how the tools are executed and what their output looks like.

Single Verification Task

To describe the analysis of a single verification task with both CPAchecker and Symbiotic, we use the preprocessed C program loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i .
The safety property we want to check is: "The function __VERIFIER_error() is never called." In the given program, this safety property is violated, so we expect our verifiers to return the result false.

Verifying a task with CPAchecker (SymEx and SymEx+)

The following commands use CPAchecker to verify loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i using SymEx+:

cd cpa-symexec/CPAchecker
./scripts/cpa.sh -heap 10000M -valueAnalysis-symbolic-Cegar \
    -spec ../svcomp16/ssh-simplified/ALL.prp \
    ../svcomp16/loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i

The output of the command should look similar to the following:

Running CPAchecker with default heap size (1200M). Specify a larger value with -heap if you have more RAM.
Running CPAchecker with default stack size (1024k). Specify a larger value with -stack if needed.
Using the following resource limits: CPU-time limit of 900s (ResourceLimitChecker.fromConfiguration, INFO)

CPAchecker 1.5-svn (OpenJDK 64-Bit Server VM 1.8.0_91) started (CPAchecker.run, INFO)

The following configuration options were specified but are not used:
 counterexample.checker.config
 counterexample.checker
 (CPAchecker.printConfigurationWarnings, WARNING)

Starting analysis ... (CPAchecker.runAlgorithm, INFO)

Stopping analysis ... (CPAchecker.runAlgorithm, INFO)

Verification result: FALSE. Property violation (error label in line 5) found by chosen configuration.
More details about the verification run can be found in the directory "./output".

Notice the verification result FALSE. This is the result we expected and signalizes that a property violation exists in the program. In contrast, the result TRUE would mean that the program satisifes the checked property. In our case this would mean that no call to __VERIFIER_error() is reachable.

The following command will start CPAchecker to verify the same task using SymEx:

cd cpa-symexec/CPAchecker
./scripts/cpa.sh -heap 10000M -valueAnalysis-symbolic \
    -spec ../svcomp16/ssh-simplified/ALL.prp \
    ../svcomp16/loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i

Notice that only the parameter -valueAnalysis-symbolic-Cegar changed to -valueAnalysis-symbolic. This parameter describes the predefined configuration valueAnalysis-symbolic.properties. If you want to experiment a bit more, all of these predefined configurations can be found in cpa-symexec/CPAchecker/config.

Verifying a task with Symbiotic

The following command will start Symbiotic to verify loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i

cd cpa-symexec/Symbiotic
./symbiotic ../svcomp16/loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i

The output should simply be

  FALSE

If you want to see more information, use the option --debug. Run:

./symbiotic --debug=OPT ../svcomp16/loops/sum01_bug02_sum01_bug02_base.case_false-unreach-call_true-termination.i

where OPT is all, compile, prepare, slicer or empty (just --debug=), depending on the information you would like to see. If OPT is empty, basic messages will be displayed.

You can also play around with Symbiotic a bit more. Run ./symbiotic --help to see all available options.

Recreating benchmark results

Comparison of refinement heuristics

One important note first:
BenchExec uses Linux's cgroups feature. If you encounter problems regarding cgroups, check this setup guide.

To run the benchmarks for CPAchecker, the file cpa-symexec/CPAchecker/scripts/cpa.sh has to be part of your PATH or you have to be in the directory cpa-symexec/CPAchecker. For simplicity, we will go with the latter.

To run all benchmarks regarding the evaluated refinement heuristics, run:

cd cpa-symexec/CPAchecker
./scripts/benchmark.py \
  -r symexec-cegar-noRefSel \
  -r symexec-cegar-dg-narrow \
  -r symexec-cegar \
  -r symexec-cegar-atm \
  ../benchmarks/symex.xml

Depending on your hardware, this may take a long time.
After the benchmark finishes, the results for all heuristics can be found in cpa-symexec/CPAchecker/test/results . For each configuration, there are multiple files:

The file DEFINITION.DATE.results.CONFIG.xml.bz2 contains all results.
For every task category, there is a file DEFINITION.DATE.results.CONFIG.CATEGORY.xml.bz2 that contains only the results for the corresponding category.
The archive DEFINITION.DATE.logfiles.zip contains the tool output for every task in a separate file.

DEFINITION is the file name of the benchmark definition file without the suffix (e.g., symex), DATE is the date and time of the start of the benchmark and CONFIG is the configuration used (e.g., symexec-cegar-noRefSel). One example for such a file name is

      symex.2016-12-24_1830.results.symexec-cegar-noRefSel.ControlFlow.xml.bz2 .

To get a human-readable format for the benchmark results of a single configuration, run:

./scripts/table-generator.py ./test/results/symex.DATE.results.CONFIG.xml.bz2

The output will tell you which files are created. It should look similar to the following:

INFO:     ./test/results/symex.2016-05-01_1257.results.symexec-cegar.xml.bz2
INFO: Merging results...
INFO: Generating table...
INFO: Writing HTML into ./test/results/symex.2016-05-01_1257.results.symexec-cegar.html ...
INFO: Writing CSV  into ./test/results/symex.2016-05-01_1257.results.symexec-cegar.csv ...
INFO: done

To get a human-readable format for all the benchmark results in a single table, run:

./scripts/table-generator.py ./test/results/symex.DATE.results.symexec-cegar-noRefSel.xml.bz2 \
    ./test/results/symex.DATE.results.symexec-cegar-dg-narrow.xml.bz2 \
    ./test/results/symex.DATE.results.symexec-cegar.xml.bz2 \
    ./test/results/symex.DATE.results.symexec-cegar-atm.xml.bz2

where you replace DATE with the start date of the benchmark. A quick ls ./test/results/symex.*.results.xml.bz2 can help you with that.

The output should look similar to the following:

INFO:     ./test/results/symex.2016-07-01_1023.results.symexec-cegar-noRefSel.xml.bz2
INFO:     ./test/results/symex.2016-07-05_2333.results.symexec-cegar-dg-narrow.xml.bz2
INFO:     ./test/results/symex.2016-05-01_1257.results.symexec-cegar.xml.bz2
INFO:     ./test/results/symex.2016-07-01_1023.results.symexec-cegar-atm.xml.bz2
INFO: Merging results...
INFO: Generating table...
INFO: Writing HTML into ./test/results/results.2016-09-18_1150.table.html ...
INFO: Writing CSV  into ./test/results/results.2016-09-18_1150.table.csv ...
INFO: Writing HTML into ./test/results/results.2016-09-18_1150.diff.html ...
INFO: Writing CSV  into ./test/results/results.2016-09-18_1150.diff.csv ...
INFO: done

Two different files will be created, in html and csv format:

The files results.DATE.table.{html,csv} contain all results of all benchmark runs.
The files results.DATE.diff.{html,csv} only contain verification tasks for which at least two configurations have different results. While this file might not be very useful when comparing a big number of configurations, it comes in handy when comparing the performance of two configurations with each other.

The html files are best for viewing the results, while the csv files present an easily machine-readable format.

Comparison of verifiers

Running the benchmarks for the different analyses is very similar to running the benchmarks for the refinement heuristics. Because of this, we will only list the necessary commands.

To run the benchmarks for SymEx, run:

cd cpa-symexec/CPAchecker
./scripts/benchmark.py -r symexec cpa-symexec/benchmarks/symex.xml

The benchmarks for SymEx+, in our used configuration, are also part of the benchmarks for the different refinement heuristics performed above. If you performed these, you will already have them. Otherwise, run from directory cpa-symexec/CPAchecker:

./scripts/benchmark.py -r symexec-cegar ../benchmarks/symex.xml

To run the benchmarks for Symbiotic, the symbiotic executable has to be part of your PATH or you have to be in the directory it is located in. So run:

cd cpa-symexec/Symbiotic
../CPAchecker/scripts/benchmark.py ../benchmarks/symbiotic3.xml
cd -

To get the comparison tables of all three run results, run:

cd cpa-symexec/CPAchecker
./scripts/table-generator.py \
  ./test/results/symex.DATE1.results.symexec.xml.bz2 \
  ./test/results/symex.DATE2.results.symexec-cegar.xml.bz2 \
  ../Symbiotic/test/results/symbiotic3.DATE3.results.symbiotic3.xml.bz2

where you have to replace DATE1 to DATE3 with the correct dates.

Finished!

After executing all these steps, you should have produced all the data provided above yourself. Feel free to play around with CPAchecker and Symbiotic some more. If you have any questions regarding our work, the virtual machine or the results, you can write us at thomas.lemberger@sosy.ifi.lmu.de. And if you are curious about our other publications, visit sosy-lab.org.