Decomposing Software Verification into Off-the-Shelf Components: An Application to CEGAR

Dirk Beyer, Jan Haltermann, Thomas Lemberger, and Heike Wehrheim

Abstract

Techniques for software verification are typically realized as cohesive units of software with tightly coupled components. This makes it difficult to re-use components, and the potential for workload distribution is limited. Innovations in software verification might find their way into practice faster if provided in smaller, more specialized components. In this paper, we propose to strictly decompose software verification: the verification task is split into independent subtasks, implemented by only loosely coupled components communicating via clearly defined interfaces. We apply this decomposition concept to one of the most frequently employed techniques in software verification: counterexample-guided abstraction refinement (CEGAR). CEGAR is a technique to iteratively compute an abstract model of the system. We develop a decomposition of CEGAR into independent components with clearly defined interfaces that are based on existing, standardized exchange formats. Its realization component-based CEGAR (C-CEGAR) concerns the three core tasks of CEGAR: abstract-model exploration, feasibility check, and precision refinement. We experimentally show that — despite the necessity of exchanging complex data via interfaces — the efficiency thereby only reduces by a small constant factor while the precision in solving verification tasks even increases. We furthermore illustrate the advantages of C-CEGAR by experimenting with different implementations of components, thereby further increasing the overall effectiveness and testing that substitution of components works well.

Page Contents

We provide the following content:

  1. The main results of our paper with extended figures (see below)
  2. Proof that the boolean expression 1 <= y + 2 ((y * -1 + 1) / 2)* is equivalent to (y mod 2 = 1)
  3. An overview of SV-COMP 2021 verifiers that shows that Ultimate Automizer is the only reasonable choice for invariant generator, next to CPAchecker.
  4. A replication artifact (VM version and web version).

Main Results

Our implementation of C-CEGAR in CoVeriTeam is available on GitLab.

  1. Overhead on Component-Based Design

    1. Effectiveness

      Decomposing an existing CEGAR implementation into components has (almost) no negative effects on the effectiveness of the approach. Moreover, the new tool can have a higher precision because better components can be used.

    2. Efficiency

      The efficiency of C-Pred decreases only by a constant factor (median smaller than three).

      A scatter plot that shows the run time of Pred and C-Pred. It shows that for almost all tasks, C-Pred is at most 10 times slower than Pred. The above plot (Figure 10a in the paper) shows the run time required by traditional, tightly coupled predicate abstraction in CPAchecker (Pred, x-axis), and the decomposed version in C-CEGAR (C-Pred, y-axis). Run time is measured in CPU time seconds.

      A box plot that shows the factor of run-time increase per task of C-Pred in relation to Pred. It shows that in the median, the run time increase is about 2.8 fold. The above plot (Figure 10b in the paper) shows the distribution of the run-time factor (the increase in runtime) for C-Pred, compared to Pred.

      A plot that shows the factor of run-time increase per task, grouped by the number of required CEGAR iterations to solve a task. The above plot (extended version of Figure 11 in the paper) shows the median factor of run-time increase by C-Pred compared to Pred, grouped by the number of required CEGAR iterations. The width of the bar for i corresponds to the number of verification tasks that require exactly i CEGAR iterations.

  2. Cost of Standardized Formats

    The effectiveness of C-CEGAR reduces by 20% when using standardized formats, whereas the efficiency is not influenced.

    A scatter plot that shows the run time of C-Pred and C-PredWit. It shows that there is no significant run-time difference between the two. The above plot (Figure 13 in the paper) compares the run time required by two versions of the decomposed predicate abstraction in C-CEGAR: C-Pred (x-axis) uses the proprietary predicate map for precision exchange, and C-PredWit (y-axis) uses the standardized format for correctness witnesses. Run time is measured in CPU time seconds.

  3. Benefit of Different Components

    1. Benefit of Different Feasibility Checkers

      C-CEGAR allows a simple exchange of feasibility checkers. The use of conceptually different off-the-shelf checkers can increase the effectiveness of C-CEGAR

    2. Benefit of Different Precision Refiners

      C-CEGAR allows a simple exchange of precision refiners. The use of conceptually different off-the-shelf refiners can increase the effectiveness of C-CEGAR.

More details are available in our research paper.