We are hiring new doctoral researchers, student research assistants, and tutors. Apply now!
2 papers accepted at ASE 2024: BenchCloud and CoVeriTeam GUI

Teaching at LMU

Student Talks at LMU

Thesis talks are part of our Oberseminar. All students and staff are welcome to participate.

Teaching in Winter 2024/25

Courses:

Seminars:

Information on courses tought in past semesters can be found on a separate page.

Regularly Offered Courses

Seminars

For all seminars these rules apply.

Important Links

Projects, Bachelor, and Master Theses

We offer several topics for theses on Bachelor and Master level in our research projects (German).

For the MSc in computer science, it is also possible to do one of our proposed projects as practical training (Modulhandbuch, 2.1 P 1) for 6 or 12 ECTS.

When writing a thesis at our chair, follow our thesis guidelines and evaluation criteria and use these technical hints for writing. You can find more important information in our wiki.

Please contact us if you are interested in one of these topics or if you want to propose your own topic that is related to our chair. Please include your transcript in the request so that we can meet and assess suitable topics that fit your qualification. You can also have a look at our collection of past student reports and theses to get an impression of what a thesis at our chair could look like.

Click on marked topics (▶) to show a more detailed description.
Currently unavailable or finished topics are also shown, and may give inspiration for own ideas.

Formal Verification and Testing

Available topics
Btor2 Frontend for ABC [1] (Mentor: Po-Chun Chien)

Background: Btor2 [1] is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. ABC [2] is a framework for logic synthesis and verification. It supports primary bit-level AIGER circuits but also has a frontend for word-level circuits written in Verilog HDL.

Goal: In this project, we want to implement a new word-level frontend in ABC that handles Btor2. With this new frontend, ABC could verify a Btor2 circuit via bit-blasting.

Requirements: (general information)

  • Ability to program in C
  • Knowledge of sequential logic circuit and SAT/SMT solving

Value Analysis with Initial Precision from YML Correctness Witness [1, 2, 3, 4] (Mentor: Marie-Christine Jakobs)

Value analysis [1] is one verification technique in the software analysis framework CPAchecker [2], which tracks the concrete values of certain variables that are determined by its precision. So far, value analysis may create an empty initial precision or an initial precision from an existing value or predicate precision, which in both cases is the result of a previous analysis run. This restricts the reuse of information in the value analysis. The goal of this thesis is to extend the ability of CPAchecker's value analysis to reuse information from previous analysis runs by using information provided in YML correctness witnesses [3]. The idea is to derive from the invariants in the witness the variables that should be tracked and add them to the precision. In addition to considering variables directly occurring in the witness' invariant, further variables might be derived similar to [4]. An experimental evaluation should evaluate the value of generating initial precisions from witnesses. The evaluation should at least compare initial precision from witnesses generated by the value analysis with initial precision from value precision in context of the reverification of changed programs and optionally of the same program. The evaluation may be extended by a comparison with an initial empty precision as well as a precision from predicate analysis, an initial precision from witnesses generated from other analyses like CPAchecker's predicate analysis, k-induction, or even generated by another tool. Similar to [4], the evaluation may further evaluate the approach in context of cooperative verification.

Parallel Test-Case Generation via Test-Goal Splitting [1, 2, 3, 4] (Mentor: Marie-Christine Jakobs)

Test-case generation based on reachability analyses like CoVeriTest [1] check the reachability of each test goal, e.g., each branch of a program, and generate a test case whenever the check succeeds. The goal of this thesis is to develop an approach to parallelize such test-case generation approaches by splitting the set of test goals into disjoint subsets and then running multiple instances of the test-case generation in parallel, each instance focusing on one particular subset of the test goals. The main conceptual part of the thesis is the development of one or more split strategies. Ideally the strategies should consider dependencies, e.g., dominator information or control-dependencies [2], or other structural relations between test goals. In addition, a baseline strategy, e.g., random splitting, should be provided. Since a test case may cover more than the test goal it is generated for, the parallelization approach should also support to exchange information about covered test goals to avoid generating test cases for already covered test goals. Ideally, the developed approach would support load balancing, i.e., the redistribution of uncovered test goals from one instance of the test-case generation approach to another one. The developed approach must be integrated into CPAchecker's [3] test-case generation and experimentally evaluated on the benchmarks of the international testing competition (Test-Comp) [4]. The evaluation must compare the approach to sequential test-case generation and compare the different strategies.

Improving the performance of the BMC algorithm in CPAchecker [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

Bounded Model Checking (BMC) is one of the most widely used techniques in software verification. One of the most popular BMC tools for C programs is CBMC. Currently it is a lot more efficient than CPAchecker's implementation of BMC. The goal of this thesis is to analyze why CPAchecker's BMC is slower than CBMC and to improve its performance.

Building Violation Witnesses from Error-Triggering Test Suites (Mentor: Thomas Lemberger)

This project is available as a BSc thesis and a MSc research training course with 6 ECTS. Mentoring is available in English and German.

Goal: Build a new command-line tool that instruments C code to record information about inputs on execution. Then use this information to transform test cases from a competition-specific XML format into a YAML-based format that contains more information.

Background: Our chair hosts two competitions:The International Competition on Software Testing (Test-Comp) is a competition for automated test generators for C programs, and the International Competition on Software Verification (SV-COMP) is a competition for automated formal verifiers for C programs. Participating tools in Test-Comp produce test suites in a specialized XML format (link), and tools in SV-COMP produce violation witnesses in a specialized YAML format (link). Violation witnesses can already be transformed into a test suite (publication). This allows participants from SV-COMP to participate in Test-Comp. But test suites can not yet be transformed to violation witnesses.

Details: A test suite contains a set of test cases. Each test case is a sequence of inputs that are given to the program in the order of their appearance. Compared to this, a violation witness contains more information about the execution: It not only contains information about the inputs, but also lists the function call and line number at which the input method is called. So when we want to transform a test case to a violation witness, we need to identify the correct sequence in that the input methods are called during execution of the test. To do this, you will write a command-line tool that takes a program-under-verification (C code) and a single test case, and outputs a corresponding violation witness. To do this, your tool needs to:

  1. Add additional program statements to the program-under-verification that record the line of code from that an input method is called.
  2. Execute the test case on the modified program
  3. Use the recorded information to build a valid violation witness.

The project includes a large-scale evaluation of your tool on all Test-Comp 2025 participants: You will take all test suites produced by participants, use your tool to transform them into violation witnesses, and then validate the witnesses using the SV-COMP 2025 witness validators.

Implementation happens in one of the following languages: Java, Kotlin, Go, or C. Basic knowledge of YAML and XML is helpful.

Integration of Verification-Task Generation into the C Build Process (Mentor: Thomas Lemberger)

This project is available as a BSc thesis and a MSc research training course with 6 ECTS. Mentoring is available in English and German.

Goal: Integrate the recording of compile-time parameters into HarnessForge.

Background: The automated software verification of real-world software poses multiple challenges: First, a verification engineer has to identify the properties to check in the program and write a verification harness that encodes the necessary pre-conditions and post-conditions for the code under analysis. Then, code that the verifier can not handle must be modelled (for example external functions, assembly code, etc.). Last, the original source code, the verification harness and the model must be assembled into a verification task that can be processed by the verifier. Even this last part can be difficult: Verifiers mostly support single input files, only, and the build of a C project can be a complicated process. Our tool HarnessForge tries to solve this: given the definition of a verification harness, model overrides and the source directories of the project, it automatically computes source-code dependencies, assembles a single-file verification task, and slices all unnecessary code from that to keep it concise. But HarnessForge currently still requires the developer to provide all necessary build parameters manually. Your task is to automate this process. With your extension, HarnessForge executes the build, records all necessary build parameters, and uses them directly for its further work. Through this, the user only has to provide a build command and not all necessary build parameters.

Details: This project consists of the following steps:

  1. Create a detailed comparison of existings tools that record build parameters. The result is a well-grounded decision for one tool.
  2. Implement the integration of the selected tool into HarnessForge.
  3. Evaluate your integration with a set of real-world case studies.

Implementation happens in Python, but basic knowledge of C and C build systems like cmake or make are helpful.

Analysis of Dissimilarity of ML Classification Models via Statistical Data Clustering Using Zonotopes and/or Polyhedra (Mentor: Iwo Kurzidem)

This topic is available as a BSc or MSc final thesis. Mentoring is available in English and German.

Goal: In this Master's thesis, we aim to use clustering algorithms to detect systematic errors in Deep Neural Networks (DNNs) by over-approximating the range of outputs for a given set of inputs derived from statistical test data. The clustering algorithms are based on formal methods and include abstractions via intervals, zonotopes, and polyhedra. These methods can model geometric and algebraic properties between input and output ranges, allowing for the analysis of the volume and shape of the high-dimensional input/output space. The objective is to identify and potentially quantify the hypervolume of the high-dimensional zonotopes and/or polyhedra.

Background: Object classification systems based on Deep Neural Networks (DNNs) utilize machine learning (ML) algorithms. However, most of these ML models are trained end-to-end and are, therefore, considered "black boxes." For safety-critical applications, this necessitates additional verification at different levels of abstraction, especially since many properties of data-driven models using ML are only evaluated through statistical testing. This thesis aims to investigate whether clustering available statistical data allows us to make predictions about the fundamental dissimilarity of different ML classification models.

Requirements:

  • Practical knowledge of Machine Learning.
  • Successfully completed the following courses: Formale Spezifikation und Verifikation (FSV) and Formale Sprachen und Komplexitaet (FSK).
  • Ability to program in Python.

Extending Software Verification Witnesses [1] (Mentor: Marian Lingsch-Rosenfeld)

Software Verification Witnesses are a widely adopted tool to exchange information about tools centered on quality assurance, like fuzzers, model checkers, deductive verifiers, between many others. Currently only witnesses can only express a subset of what is required to present a complete violation or proof for all possible programs. Goal of this thesis would be to do an extension of the format and implement support for this extension in one of multiple directions. For example, for other program features like arrays and concurrency, for other programming languages like Java and Rust or for other specifications like memory safety.

Analyzing the Robustness of PeSCo-CPA's Selection Algorithm on Program Changes [1, 2, 3, 4] (Mentor: Max Barth, Marie-Christine Jakobs)

Many incremental reverification techniques assume that the same technique is used for the original and the modified program. However, this might not be ensured if we use an algorithm selector for choosing the most suitable verification technique. Therefore, this thesis should study how robust algorithm selection, in particular PeSCo-CPA's algorithm selection [1,2], is with respect to program changes. To this end, the thesis must extract from the logfile the verification sequence applied by PeSCo-CPA for a given task (program) and analyze whether the extracted sequence stays the same for original and modified program. In addition, the thesis may need to execute PeSCo-CPA on the original and modified programs to get the required logfiles. Ideally, the thesis also analyzes whether or or under which conditions the successful analysis stays the same. As set of pairs of original and modified programs, the thesis shall consider the Linux regression verification tasks [3] and the combination tasks [4].

Reimplementation of Metaval for witness format 2.0 [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

When an automatic model checker finds a proof or a bug, it exports it as a witness. These outputs of the verification process can be independently validated by using a Validator. One of these validators is Metaval, which instruments the program using the information present in the witnesses. Goal of this Thesis is to reimplement Metaval for the new witness format 2.0.

Expressing constraints from Violation Witnesses in Code [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

When an automatic model checker finds a but, it reports it as constraints on the possible execution paths in form of a violation witness. These constraints are then used by a witness validator in order to replay the violation. The goal of this thesis is to make the restrictions on the state space available to all automatic verifiers by introducing them into the source code of the program.

Creating a deductive verifier from CPAchecker [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

Automatic software verification tries to automatically proof that a program fulfills a specification without human input. On the other hand deductive verification requires manually writing annotations to proof the program correct. Usually everything required to build a deductive verifier is inside an automatic verifier. The goal of this thesis is to create a deductive verifier out of the highly successful automatic verifier CPAchecker.

Bridging the gap between automatic and deductive verification [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

Automatic software verification tries to automatically proof that a program fulfills a specification without human input. On the other hand deductive verification requires manually writing annotations to proof the program correct. The goal of this thesis is to combine the strengths of both approaches by writing a tools which generates proof goals using information information provided by the user, but allow some of the information to be missing for it to be automatically verified by an automatic verifier.

Boosting k-induction with ranking functions [1, 2] (Mentor: Marek Jankola)

k-induction is a verification technique for reachability analysis. Ranking function analysis can prove the termination of a loop and map states reachable inside the loop into numbers satisfying multiple conditions. Plain k-induction sometimes assumes unreachable states when trying to infer k-inductive invariant. These states are often pruned by using auxiliary invariants either provided by different reachability analyses or written by human. The goal of the thesis is to use termination analysis to compute ranking function and then use these ranking functions to boost k-induction.

Implementation of a different transformation in T2R [1] (Mentor: Marek Jankola)

C program satisfies termination specification if all the possible executions of the program eventually terminate. It is usually very challenging to automatically prove that a given program satisfies this property. T2R is an algorithm that transforms the program so that the transformed program can be verified using well-performing fine-tuned reachability algorithms. It is implemented in our modular framework for program transformations. Current transformation is inefficient for some reachability techniques like symbolic execution. The goal of this thesis is to implement another transformation that would make the symbolic tree easier to explore for symbolic execution.

Witness Transfromation for: Reduction of Memory-Cleanup Problem of C programs to Reach-Safety Problem [1] (Mentors: Marek Jankola, Marian Lingsch-Rosenfeld)

Algorithms for analysis of C programs are usually designed to verify a concrete class of properties of the C programs. Many of the properties of C programs that are checked by verifiers have a similar form. Memory-Safety and Reach-Safety property are both safety properties of the program with different atomic propositions. By instrumenting C programs, we can convert the Memory-Safety property to the Reach-Safety property, which opens new options for Memory-Safety analysis using algorithms that are used for Reach-safety tasks. After having solved the transformed task, we want to transform the witness i.e. the information the verifier provided for the verdict of the program to a witness for the original program. The goal of the thesis is to implement such a transformation of witnesses.

Gather verification tasks from test suites of compilers and verifiers [1] (Mentor: Philipp Wendler)

The SV-Benchmarks repository is a collection of verification tasks and used for example in competitions of verifiers and other tools. Compilers like GCC and Clang likely have test suites with C programs that would be suitable as verification tasks, if adjusted to the expected format. This should be done systematically for at least these two compilers and potentially others, and the result should be submitted to SV-Benchmarks. If not enough potential tasks can be found, another source of tasks would be regression-test suites of verifiers and testing tools.

Reduction of No-Datarace Problem of C programs to Reach-Safety Problem [1] (Mentors: Marek Jankola, Marian Lingsch-Rosenfeld)

Algorithms for analysis of C programs are usually designed to verify a concrete class of properties of the C programs. Many of the properties of C programs that are checked by verifiers have a similar form. No-Datarace and Reach-Safety property are both safety properties of the program with different atomic propositions. By instrumenting C programs, we can convert the No-Datarace property to the Reach-Safety property, which opens new options for No-Datarace analysis using algorithms that are used for Reach-safety tasks. The goal of the thesis is to develop an idea of transformation C programs so that Reach-Safety analysis proves No-Datarace, implement the transformation in a tool and observe whether it is more efficient to solve No-Overflow using Reach-Safety analysis.

Reduction of Memory-Safety Problem of C programs to Reach-Safety Problem [1] (Mentors: Marek Jankola, Marian Lingsch-Rosenfeld)

Algorithms for analysis of C programs are usually designed to verify a concrete class of properties of the C programs. Many of the properties of C programs that are checked by verifiers have a similar form. Memory-Safety and Reach-Safety property are both safety properties of the program with different atomic propositions. By instrumenting C programs, we can convert the Memory-Safety property to the Reach-Safety property, which opens new options for Memory-Safety analysis using algorithms that are used for Reach-safety tasks. The goal of the thesis is to develop an idea of transformation C programs so that Reach-Safety analysis proves Memory-Safety, implement the transformation in a tool and observe whether it is more efficient to solve Memory-Safety using Reach-Safety analysis.

Reduction of Thread-Liveness Problem of C programs to Reach-Safety Problem [1] (Mentors: Marek Jankola, Marian Lingsch-Rosenfeld)

Algorithms for analysis of C programs are usually designed to verify a concrete class of properties of the C programs. Many of the properties of C programs that are checked by verifiers have a similar form. Thread-Liveness property is a liveness property, whereas Reach-Safety is a safety property. It is well-known that liveness can be transformed to safety. By instrumenting C programs, we can convert the Thread-Liveness property to the Reach-Safety property, which opens new options for Thread-Liveness analysis using algorithms that are used for Reach-safety tasks. The goal of the thesis is to implement the transformation in a tool and observe whether it is more efficient to solve Thread-Liveness using Reach-Safety analysis.

A Range To Condition Converter [1, 2, 3] (Mentor: Marie-Christine Jakobs)

Ranged program analysis [1] and Conditional Model Checking [2] are two different techniques to restrict an analysis to examine a subset of a program's behavior. In ranged program analysis, all paths are ordered and a subset of program paths is described by a range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. In conditional model checking, the subset of paths is described by a an automaton like format, the condition, which accepts all program paths except for the paths in the subset. To let ranged program analysis and conditional model checking cooperate, we need to be able to transform between the two formats that describe subpaths.
This thesis should develop a transformation from ranges to conditions. The goal of the this is develop a transformation, formally show that it is sound, i.e., the transformed condition describes the same paths as the original range. If this does not work, it should at least contain all paths of the original range. In addition, the transformation should be realized in the software analysis tool CPAchecker [3] and it should be evaluated as part of a combination of ranged program analysis and conditional model checking.

A Condition To Range Converter [1, 2, 3] (Mentor: Marie-Christine Jakobs)

Ranged program analysis [1] and Conditional Model Checking [2] are two different techniques to restrict an analysis to examine a subset of a program's behavior. In ranged program analysis, all paths are ordered and a subset of program paths is described by a range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. In conditional model checking, the subset of paths is described by a an automaton like format, the condition, which accepts all program paths except for the paths in the subset. To let ranged program analysis and conditional model checking cooperate, we need to be able to transform between the two formats that describe subpaths.
This thesis should develop a transformation from conditions to ranges. The goal of the this is develop a transformation that ideally converts the condition into one range, but may also generate multiple ranges, formally show that it is sound, i.e., the transformed range(s) describe(s) the same paths as the original condition. If this does not work, it/they should at least contain all paths of the original condition. In addition, the transformation should be realized in the software analysis tool CPAchecker [3] and it should be evaluated as part of a combination of conditional model checking and ranged program analysis, e.g., in form of a sequential combination.

Topics around Distributed Summary Synthesis [1] (Mentor: Matthias Kettl)

There are many approaches for automated software verification, but they are either imprecise, do not scale well to large systems, or do not sufficiently leverage parallelization. This hinders the integration of software model checking into the development process (continuous integration). We propose an approach to decompose one large verification task into multiple smaller, connected verification tasks, based on blocks in the program control flow. For each block, summaries (block contracts) are computed — based on independent, distributed, continuous refinement by communication between the blocks. The approach iteratively synthesizes preconditions to assume at the block entry (computed from postconditions received from block predecessors, i.e., which program states reach this block) and violation conditions to check at the block exit (computed from violation conditions received from block successors, i.e., which program states lead to a specification violation). This separation of concerns leads to an architecture in which all blocks can be analyzed in parallel, as independent verification problems. Whenever new information (as a postcondition or violation condition) is available from other blocks, the verification can decide to restart with this new information. We realize our approach on the basis of configurable program analysis and implement it for the verification of C programs in the widely used verifier CPAchecker. A large experimental evaluation shows the potential of our new approach: The distribution of the workload to several processing units works well, and there is a significant reduction of the response time when using multiple processing units. There are even cases in which the new approach beats the highly-tuned, existing single-threaded predicate abstraction.

Deductive Verification with CPAchecker [1, 2] (Mentor: Martin Spiessl)

CPAchecker is a framework for automatic software verification developed at our chair. As such it has a broad technology stack that should provides everything that is needed for verifying C programs. It should therefore be possible to extend the framework such that one can perform deductive proofs by adding the right annotations to the C program. Support for parsing the specification language ACSL was added recently to CPAchecker's frontend, and for solving the different verification conditions that arise from user-provided annotations one could rely reuse the program analyses that are already part of CPAchecker.

Consolidate pointer-aliasing analyses in CPAchecker [1] (Mentor: Philipp Wendler)

CPAchecker has several different implementations of pointer-aliasing analyses, and the current state of what they support is unclear. This leads to the fact that they are rarely used and causes wrong results in case of pointer aliasing. The goal of this thesis is to create a strong CPA for pointer aliasing (either by merging the existing CPAs or rewriting them) that supports all necessary use cases (fast and imprecise as well as more precise) and migrate all existing uses of other pointer CPAs, such that the old code can be deleted (#358). Afterwards, it may be possible to add additional features (e.g., #742) or extend other components of CPAchecker to make use of the new CPA (e.g., #527, #398).

Analysis of a real world software project: What can be specified, analyzed?
Refinement Selection with Adjustable-Block Encoding [1, 2, 3] (Mentor: Philipp Wendler)
Analysing the effect of compiler optimizations on Verifier performance [1, 2, 3] (Mentor: Marian Lingsch-Rosenfeld)

It has been shown that program transformations can improve the performance of verifiers. Compilers manage to produce highly optimized code which usually contains less instructions, loops and function calls. In this thesis your job will be to analyze what compiler optimizations improve verification tasks and generate a heuristic which optimization options should be used for what programs. A rudimentary framework to run experiments already exists.

Comparing the performance increase different invariant generators have for K-Induction [1, 2, 3] (Mentor: Marian Lingsch-Rosenfeld)

K-Induktion is an analysis which uses candidate invariants to proof the safety of a program. These candidate invariants are currently obtained from a dataflow analysis inside CPAchecker. There are a lot of tools which produce different invariants for C-Programs, it would be interesting to see if they can help improve K-Induction.

Data Integrity Analysis using CPAchecker [1] (Mentor: Marian Lingsch-Rosenfeld)

CPAchecker is a very successfull tool for model checking. In particular it contains a lot of the components required to make use of it for checking data integrity. The goal of this thesis would be to implement a data integrity analysis in CPAchecker and evaluating it on a set of benchmarks.

Adding support for Hyperproperties in CPAchecker [1, 2, 3] (Mentor: Marian Lingsch-Rosenfeld)

CPAchecker is a very successfull tool for model checking. Currently it lacks support for different Hyperproperties, for example secure information flow. Goal of this thesis would be to add support for Hyperproperties to CPAchecker.

Inferring Interpolants using Machine Learning [1, 2, 3] (Mentor: Marian Lingsch-Rosenfeld)

Finding good predicates is a key challenge for verifying programs using predicate analysis. Currently some heuristics and interpolation techniques are used to find new predicates. On the other hand machine learning techniques have been succesfull in determining invariants for programs. The goal of this thesis would be to use the information generated during the exploration of the state-space of a program to find good predicates/interpolants using machine learning techniques.

Building a Deductive Verifier using an Automatic Verifier and E-ACSL [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

Deductive and Automatic Verifiers are two classes of verification methods to proof the safety of program. In this thesis the goal is to close the gap between them by transoforming an annotated program into proof goals and then using an automatic verifier to proof them. Building such a tool would be based upon a tool to treat Verifiers as SMT solvers which understand the intricacies of programming languages.

Any to Any specification transformation [1] (Mentor: Marian Lingsch-Rosenfeld)

A verifier tries to proof that a property holds for a program. There are multiple properties a verifier can usually proof, for example Termination, Reachability, Memory Safety, etc.... Verifiers usually specialize on a couple of properties and optimize their algorithms for it. The goal of this thesis would be to implement a tool which does program transformation from any specification to any other and compare the performance of different verifiers on them.

Dynamic control-flow modifications for CPAchecker [1] (Mentor: Marian Lingsch-Rosenfeld)

It has been shown that program transformations can improve the performance of verifiers. Currently most approaches to program transformations change the code without using information collected during the verification process. The goal of this thesis is to improve upon this by developing some transformations on the source code of a program which make verification easier and incorporate information collected during the verification process.

Currently assigned topics
Detecting Collaterally Covered Goals with CPAchecker's Value Analysis [1, 2, 3, 4] (Mentor: Marie-Christine Jakobs)

Software testing is a standard means in quality assurance supported e.g., by automatic tools for test input generation. Several of these automatic tools generate test suites aiming at maximizing the value wrt. a coverage criterion like branch coverage. One particular class of tools are test-case generators based on model checkers like CPAchecker [1,2,3]. These tools derive test goals from the coverage criterion (e.g., the set of branches in a program) and show the reachability of each goal to derive tests. Often, one generated test can reach more than the test goal it was created for. While it is easy to detect other test goals that were visited before reaching the test goal of interest and model-checking based test-case generators often would have covered those test goals with a previous test anyways, other test goals visited after reaching the test goal need to be explicitly determined. The goal of this thesis is to apply CPAchecker's value analysis [2] to determine test goals reached after the test goal of interest with the test inputs generated from showing the reachability of that test goal and integrate this procedure into CPAchecker's test-case-generation algorithm. Ideally, the value analysis should start after the test goal of interest, but could also start from the beginning with the test. In both cases, the value analysis needs to be configured to follow the path induced by the generated test and the visited test goals must be extracted from the path and marked as covered. In the former case, the initial state for the value analysis needs to be derived from the path reaching the test goal and the analysis needs to stop if a new input is required or the analysis cannot precisely determine a branch direction. In the latter case, the value analysis needs to read the test inputs and the analysis needs to stop if a new input is required but no further input is provided by the test or the analysis cannot precisely determine a branch direction. The tasks of the thesis is to design a concept for the detection approach and its integration into CPAchecker's test-case-generation procedure. The designed approach should be implemented in CPAchecker. Also, its efficiency and effectiveness should be evaluated. For example, one could compare the CPU time, number of covered test goals, and number of generated tests using one or more of CPAchecker's test-case generation configurations with and without the detection approach on the Test-Comp benchmark [4].

WitnessDB: Storing, Querying, and Analyzing Verification Witnesses [1] (Mentor: Henrik Wachowitz)

Verification Witnesses are the result of a Software Verification run. They contain information that should make reproducing the TRUE or FALSE verdict of a given program easier. The annual international competition for software verification SV-COMP produces a large number of these witnesses from a plethora of verification tools. There is one drawback however: the dataset is only available as a large ZIP archive. When unpacked, the archive of just SV-COMP ’24 is already 100 GB in size.

In this Project you will build migrate these witnesses to a Database to make them queryable. The task will require you to:

  1. Set up a Database (first locally, then on a server)
  2. Identify the relevant information in the witnesses (Size, Creating tool, Verdict, etc.)
  3. Setup the Database Schema
  4. Make a partial download possible (i.e., download an archive of all witnesses conforming to a query)
  5. Create a Web Interface to query the Database

Applicants should have experience with SQL, Databases and be proficient in one of Python, Java, Kotlin, Go or C++ as well as have some experience with web technologies. Experience with Docker/Podman and Web Development is a plus.

Analyzing the Robustness of CPA-seq's Selection Algorithm on Program Changes [1, 2, 3, 4, 5, 6, 7] (Mentor: Max Barth, Marie-Christine Jakobs)

Many incremental reverification techniques assume that the same technique is used for the original and the modified program. However, this might not be ensured if we use an algorithm selector for choosing the most suitable verification technique. Therefore, this thesis should study how robust algorithm selection, in particular algorithm selection based on Boolean Features [1], is with respect to program changes. To this end, the thesis must extract the Boolean features, both for original and modified program, study which Boolean Features change and remain identical between original and modified program. In a second step, the selection strategy from CPA-seq [1,2,3,4,5] for the reach-error property should be extracted and based on the extracted features it should be analyzed whether the selected technique stays the same between original and modified program. Ideally, the thesis also analyzes whether or or under which conditions the successful analysis stays the same. As set of pairs of original and modified programs, the thesis shall consider the Linux regression verification tasks [6] and the combination tasks [7].

Analyzing the Robustness of PeSCo's Selection Algorithm on Program Changes [1, 2, 3, 4] (Mentor: Max Barth, Marie-Christine Jakobs)

Many incremental reverification techniques assume that the same technique is used for the original and the modified program. However, this might not be ensured if we use an algorithm selector for choosing the most suitable verification technique. Therefore, this thesis should study how robust algorithm selection, in particular PeSCo's algorithm selection [1,2], is with respect to program changes. To this end, the thesis must extract from the logfile the verification sequence applied by PeSCo for a given task (program) and analyze whether the extracted sequence stays the same for original and modified program. In addition, the thesis may need to execute PeSCo on the original and modified programs to get the required logfiles. Ideally, the thesis also analyzes whether or or under which conditions the successful analysis stays the same. As set of pairs of original and modified programs, the thesis shall consider the Linux regression verification tasks [3] and the combination tasks [4].

Decomposing Partial Order Reduction [1] (Mentor: Marian Lingsch-Rosenfeld)

Partial Order Reduction is a technique to reduce the state space of a concurrent program. The goal of this thesis is to implement a modular version of Partial Order Reduction such that other verifiers can also benefit from this technique.

Verifying Concurrent Programs with Assumptions: A Case Study on Deadlock Detection [1] (Mentor: Marian Lingsch-Rosenfeld)

Many verifiers are specialized for reachability properties. This work tries to express complex properties for concurrent programs as reachability.

Assessing and Improving the Quality of the Test-Comp Subset of the SV Benchmark [1, 2] (Mentor: Marie-Christine Jakobs)

To evaluate the test generation tools participating in the International Competition on Testing, the validator executes the generated tests and measures coverage or reachability of a particular method call. For a reliable validation, i.e., test execution, all variables in a test task (C program plus coverage property) need to be properly initialized before influencing the observable program behavior (i.e., program control flow our output) or simplified before read for the first time. Per rule definition, variables should be initialized with __VERIFIER_nondet_XXX calls. In a first step, this thesis shall analyze the test tasks of the competition (a subset of the software verification competition benchmark [1]) for both categories coverage branches and coverage error-call to detect uninitialized variables usages. To this end, one may compile the programs with clang, gcc, etc. using particular compiler flags for uninitialized variables, apply a static analyzer, or run the validator with (a subset of the) tests generated in the last edition of the competition and dynamically observe uninitialized variable usages for instance with Valgrind [2]. Before one decides for one or multiple of these techniques, one should compare what kind of uninitialized variables the approaches detect, e.g., only uninitialized on some paths, uninitialized array entry, uninitialized memory, etc. In a second step, the uninitialized variable usages should be fixed and the fixes should be included into the software verification competition benchmark via a Merge Request.

An IDE Plugin for Software Verification (Mentor: Henrik Wachowitz)

This topic is available as a BSc thesis or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: We integrate the program-analysis tool CPAchecker into IDEs like VSCode and IntelliJ by building a language server for it.

Background: CPAchecker is a powerful tool for program analysis and formal verification. It automatically finds errors in code. But currently, CPAchecker only has a command-line interface. To make it easier to use CPAchecker during development, we want to integrate it into IDEs. Traditionally, a separate IDE plugin had to be written for each IDE that we want to support, but the language server protocol provides one protocol that allows us to integrate CPAchecker into a variety of IDEs by building a single language server.

Details: You build a language server for CPAchecker and the C programming language. The server receives requests from IDEs that contain program code. The server then runs CPAchecker on that code and responds to the IDE with a list of found issues. A basic web service for communicating with CPAchecker already exists and can be used in the backend. The goal is to be able to both deploy the language server locally and remotely (e.g., in the cloud).

Knowledge in software verification is helpful. An understanding of Java micro-services is helpful. The language server should be written in Java.

In the scope of this topic, you have to create a design of the language server, implement this design, and test the server.

Algorithm Selection for Btor2 Verification Tasks using Circuit Structure Information [1, 2, 3] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 [1] is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C [2] is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers.

Goal: Now that a large collection of tools from both hardware and software verification communities are at our disposal, we want to combine forces of them via machine learning (ML). The objective is to construct a selector using information of the Btor2 circuit structure [3], and train it on a large Btor2 dataset. Given a Btor2 task, the ML-based selector can then predict the best-suited tool/algorithm(s) for solving it.

Requirements: (general information)

  • Ability to program in Python
  • Basic knowledge of graph kernel, Kripke structure, and SAT/SMT solving
  • Practical knowledge of support vector machine, decision/boosting tree, and random forest

Using Correctness Witnesses for the Reverification with k-Induction [1, 2, 3] (Mentor: Marie-Christine Jakobs)

After a program change, the program needs to be reverified. Reverifying the program from scratch can be inefficient and incremental verification techniques try to increase the performance of reverification by reusing information from a previous verification. The goal of this thesis is to propose a solution to use correctness witnesses (in GraphML or in YAML format [1]) to make reverification with k-Induction [2] more efficient. k-Induction already uses correctness witnesses for validation, i.e., reverification of the same instead of a changed program. However, for reverification of a changed program one needs to deal with the changes. Due to the change, the witness information cannot directly be mapped to the program, e.g., because loop locations changed. Therefore, the thesis needs to consider where to map the information from the witness, e.g., one could apply loop invariants to all loops of the function or the closest loop. The developed solution should be implemented in the software analysis framework CPAchecker [3]. In addition, an evaluation should study whether the use of correctness witnesses can improve the efficiency or effectiveness of k-induction. The evaluation may use correctness witnesses constructed by k-Induction or other analyses.

Incremental Ranged Analysis via Range Reuse [1, 2, 3] (Mentor: Marie-Christine Jakobs)

Ranged program analysis [1] is a technique to split the analysis of a program into different parts. The idea is split the program's state space, i.e., its execution paths, into separate parts and analyse each part separately, e.g., in parallel. To describe the different parts, ranged program analysis orders the paths and describes a part by a so-called range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. The goal of this thesis, is to make use of the ranges when reverifying a program after change. In its simplest form, reverification should reuse the ranges and simply skip range computation. To this end, ranges need to be stored and read if this is not provided so far. Next to this simple approach, a more complicated approach identifies those ranges that contain modified execution paths, e.g., using the idea of difference verification with conditions [2] and restrict reverification of those ranges. An even more sophisticated approach aims to restrict the ranges such that they focus on modified paths, e.g., strip outer non-modified and possibly split ranges. The different approaches for incremental ranged analysis should be implemented in CPAchecker [3]compared against standard ranged program analysis. A master thesis ideally also looks into theoretical soundness aspects of the approach.

Condition Reuse for Incremental Conditional Model Checking [1, 2, 3] (Mentor: Marie-Christine Jakobs)

Conditional Model Checking [1] is a technique to split the analysis of a program into several analyses that each analyze parts of the program paths. Thereby, the part an analyzer should explore is encoded into a condition, an automata like format, which accepts all program paths except for the paths that should be analyzed. The goal of this thesis, is to make use of the ranges when reverifying a program after change. The challenge is to ensure that also the modified program paths are analyzed, i.e., are not excluded from analysis. To guarantee this property, existing conditions might need to be updated or new conditions for modified paths that are not covered by the existing conditions must be defined.
In its simplest form, reverification should adapt the set of conditions to include also the modified behavior. Next to this simple approach, a more complicated approach identifies those conditions that contain only non-modified execution paths, e.g., using the idea of difference verification with conditions [2], and excludes them from reverification. An even more sophisticated approach aims to restrict the conditions such that they focus on modified paths, e.g., use a product composition of the condition from difference verification [2] and the original condition. The different approaches for incremental conditional model checking should be implemented in CPAchecker [3] and compared against standard conditional model checking. Also, the soundness of the approach must be shown, i.e., it must be formally proven that all modified paths are verified

Boost Symbolic Memory Graph based Explicit Value Analysis with Numerical Abstract Reasoning (Mentor: Daniel Baier)

Explicit Value Analysis with Symbolic Memory Graphs (SMGs) can analyse programs, including memory operations, fast and efficient. However, Value Analysis often over-approximates values beyond an already encountered bound. The goal is to combine the existing SMG based Value Analysis with the existing Numerical Abstract Domain (APRON) using a composite analysis, reusing the already existing components. This is then implemented in CPAchecker and evaluated against other verification approaches over the SV-COMP benchmark set.

Distributed Incremental Verification [1] (Mentor: Matthias Kettl)

CPAchecker is a framework for automatic software verification developed at our chair. As such it has a broad technology stack that provides everything that is needed for verifying C programs. CPAchecker is able to distribute the analysis of a single task to different workers by decomposing the input program into blocks. Each worker analyzes one block and communicates the results to all other workers over messages to establish the correct verification result. The referenced work uses incremental verification on code blocks. In our setting, the blocks are already available and we can also handle the updates of the summaries in parallel.

Calculating Precise Error Conditions within CPAchecker (Mentor: Matthias Kettl)

There might be several reasons for reaching a specific error location in a program. We want to identify all classes of inputs that trigger the same bug. They should be as abstract as possible but as precise as necessary. For a given program we want to find out, which restrictions to nondeterministic variables are necessary to reach an error location. For example, we might reach error location E1, if the nondeterministic variable x equals 1 and the nondeterministic variable y is less than 1024. Error location E2 might be reachable for x > 100 and y < 100. Otherwise the program does not show faulty behavior. In our approach we are interested in the most abstract restrictions for all inputs for reaching error locations instead of a single testvector like x = 1 & y = 1023 for E1 and/or x = 101 & y = 99 for E2.

Finding Semantically Equivalent Correctness Witnesses [1, 2] (Mentors: Matthias Kettl, Marian Lingsch-Rosenfeld)

Correctness Witnesses are an artifact produced by a verifier which allows a validator to replay the verification process. The task would be to analyze correctness witnesses and find semantically equivalent witnesses from the perspective of a validator.

Hardware Speed-Independent Analysis Selection Heuristrics (Mentor: Martin Spiessl)

Background:Current state-of-the-art verifiers like CPAchecker use some kind of time-based criterion for deciding when to switch to a different analysis. For example, in CPAchecker it might happen that a so-called value analysis is executed for 90 seconds and after that a switch to a different analysis is performed. These time boundaries are optimized to yield the best results in the Competition on Software Verification (SV-COMP), where a certain benchmark set (SV-Benchmarks) is used. SV-COMP is typically executed on our Apollon cluster with nodes that have a Intel Xeon E3-1230 v5 CPU each with a CPU time limit of 900 seconds per verification task. Now this has implications for reproducibility on other hardware. If someone wants to replicate these experiments on newer(faster) or older(slower) hardware and uses the same 900 seconds as time limit, the results might actually not be comparable. For example, on a machine that is tenfold slower than our cluster nodes, the switch after 90 seconds described above could stop the value analysis at a point where it only analyzed 10% as much as on the Apollon nodes.
Goals: 1. One goal is to first proof the existence of the described effect by performing some experiments on our benchmarking hardware. Here also other tools from SV-COMP can be analysed to determine which tools have the same problem. 2. After that one could look into ways to make the execution of CPAchecker more independent of time measures like CPU time or wall time. A simple approach could be to do a small micro benchmark in the beginning to determine the hardware 'speed' and take this into account for time-based decisions (though this is not actially making the analysis independent of time measures and just reduces this effect). This can either be done in the tool or tool-independently as a step before the actual tool is executed. Another (more promising) idea is to introduce alternative metrics for analysis progress and use these to determine some sort of analysis 'fuel' that gets used up by the analysis and is dependent factors like number of analysis rounds, explored states, or similar. This then also opens the way for potentially improving the current heuristics because there might be other time-independent indicators that can predict better whether an analysis will be successful or whether switching the analysis will be beneficial.

Genetic Programming in Software Verification (Mentor: Martin Spiessl)

The idea is to generate verification tasks via genetic programming,i.e. do the following repeatedly on a set of programs:
1. mutation of the programs' ASTs
2. selection of 'fit' programs based by running the generated programs against some verifier and chosing some fitness metric that is determined by the verification results

This can then be used used for a large number of interesting applications:
One very simple would be fault localization: Here the allowed mutations are those that remove program statements, and the fitness is determined by a) the bug has to still be present and b) the shorter the program the fitter.
Another application would be to generate structure-wise simple programs that are still hard to solve or to generate programs that are easy to solve by verifier/approach A, but hard to solve by verifier/approach B.

In the end the challenging part here (and why this probably has not been done really for software verification) is to keep time per roundtrip low enough to get enough iterations. For that the idea would be to start use bounded model checking/ limit the analysis time. Other tactics like speculative execution or using some static method to predict the fitness might also be worth exploring.
From a techical point of view the goal is to use BenchExec for executing the mutated programs each round (and such reduce the wall time for each round).
The language used for implementation will be Python3, which allows to use the pycparser as a frontend.

Verification of Timed Automata with BDDs [1] (Mentors: Prof. Dr. Dirk Beyer, Nico Weise)

Timed automata are model of computation that was invented 30 years ago with the goal of modelling not only the sequence of events but also the time of their occurrence. Timed automata are automata with a finite set of control states and clock variables whose values continuously increase over time. The objective of this thesis topic is to construct a basic reachability analysis for timed models.

Taint Analysis using CPAchecker [1, 2] (Mentor: Marian Lingsch-Rosenfeld)

CPAchecker is a very successfull tool for model checking. In particular it contains a lot of the components required to make use of it for taint analysis. The goal of this thesis would be to implement taint analysis in CPAchecker and evaluating it on a set of benchmarks.

Applicability Study of CPAchecker and CBMC on Open-Source Software Projects (Mentor: Marian Lingsch-Rosenfeld)

Verifiers can proof different properties about many different programs. However, it is not clear how well they perform on real-world programs. The goal of this thesis would be to evaluate the performance of different verifiers on a set of real-world programs.

Applicability Study of UAutomizer, Mopsa, Goblint and PredatorHP on Open-Source Software Projects (Mentor: Marian Lingsch-Rosenfeld)

Verifiers can proof different properties about many different programs. However, it is not clear how well they perform on real-world programs. The goal of this thesis would be to evaluate the performance of different verifiers on a set of real-world programs.

Inferring Predicates using Machine Learning [1, 2, 3] (Mentor: Marian Lingsch-Rosenfeld)

Finding good predicates is a key challenge for verifying programs using predicate analysis. Currently some heuristics and interpolation techniques are used to find new predicates. On the other hand machine learning techniques have been succesfull in determining invariants for programs. The goal of this thesis would be to use the information generated during the exploration of the state-space of a program to find good predicates/interpolants using machine learning techniques.

Non-termination analysis in CPAchecker (Mentor: Marie-Christine Jakobs)
Invariant Generation through Sampling [1] (Mentor: Martin Spiessl)

Finding the right invariants is essential when trying to proof properties about programs with loops. In CPAchecker we currently have a search-based analysis that generates candidate invariants. Another approach is to try to learn and continuously refine the invariant by training a classifier. This should work especially well if the neccessary invariants are linear equalities, as this corresponds to the classification boundary of a linear classifier.

Finished topics
Formal Verification of Different Properties of an Explainable Random Forest Model [1, 2] (Mentor: Iwo Kurzidem)

This topic is available as a BSc thesis and as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: In this Master's Thesis, we want to use formal methods to verify different properties of an explainable ML model (random forest). The goal is to identify which properties of the ML model can be verified, with which approach, and to what degree. If any approach or method is not suitable, we want to find out why and if modifications or adaptations are possible to make it work. ML properties of interest are: Contradictions, identification of sets, logic simplification etc.

Background: Current Autonomous Systems use Machine Learning (ML) based algorithms for object detection, classification, trajectory planning and more. However, most of these ML models are trained end-to-end and are therefore a "black-box". For safety relevant applications this requires an additional verification at different levels of abstraction. Especially explainable ML promises a humanly understandable model, but it is not without its own uncertainties. To solve this, we use formal verification to investigate the properties of a trained RF, provided as JSON file.

Requirements:

  • Practical knowledge of decision trees and random forest
  • Successfully completed the following courses: Formale Spezifikation und Verifikation (FSV) and Formale Sprachen und Komplexitaet (FSK)
  • (optional) Ability to program in Python

Formal Verification of Different Properties of a Random Forest Model [1, 2] (Mentor: Iwo Kurzidem)

This topic is available as a BSc thesis and as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: In this Master's Thesis, we want to use formal methods to verify different properties of an explainable ML model (random forest). The goal is to identify which properties of the ML model can be verified, with which approach, and to what degree. If any approach or method is not suitable, we want to find out why and if modifications or adaptations are possible to make it work. ML properties of interest are: Contradictions, identification of sets, logic simplification etc.

Background: Current Autonomous Systems use Machine Learning (ML) based algorithms for object detection, classification, trajectory planning and more. However, most of these ML models are trained end-to-end and are therefore a "black-box". For safety relevant applications this requires an additional verification at different levels of abstraction. Especially explainable ML promises a humanly understandable model, but it is not without its own uncertainties. To solve this, we use formal verification to investigate the properties of a trained RF, provided as JSON file.

Requirements:

  • Practical knowledge of decision trees and random forest
  • Successfully completed the following courses: Formale Spezifikation und Verifikation (FSV) and Formale Sprachen und Komplexitaet (FSK)
  • (optional) Ability to program in Python

A JSON Export for Control-Flow Automata [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc thesis and as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: We decouple the language parser of CPAchecker by implementing a JSON-based export and import of its C control-flow automaton.

Background: CPAchecker is a large Java framework for program analysis and formal verification. CPAchecker receives a program-under-analysis as input, analyzes it, and produces some output. CPAchecker coarsely consists of two steps: First, it parses the input program into the AST and generates the control-flow automaton (CFA) from the AST. CPAchecker then uses this CFA as intermediate representation, and runs its algorithms on that representation instead of the text-based code. Both CFA-creation and algorithms are implemented within CPAchecker, in a tightly coupled sequence. This means that: (a) Any post-processings on the CFA (for example optimizations) must be implemented within CPAchecker. This makes it more difficult to integrate some existing analyses, like C++-based libraries for static analysis. (b) CPAchecker's CFA creation can not be reused by other tools, so they can not profit from the existing optimizations within CPAchecker.

Details: To solve this, we implement an export and import of the CFA in CPAchecker. As format, we use JSON. This enables other tools to read the CFA and work with it, and CPAchecker can import the result.

Knowledge of the foundations of programming languages (e.g., abstract syntax trees) is required. Knowledge in software verification is helpful. The functionality must be written in Java, as a part of CPAchecker.

In the scope of this topic, you have to design the import/export, implement this design, and evaluate its functionality on the example of C programs.

Upgrade of AngularJS, Refactoring UI and automated testing for CPAchecker [1, 2] (Mentor: Philipp Wendler)

CPAchecker produces as result a report about its verification work. An example can be seen here. This report is a web application based on AngularJS and D3. The goal of this project is to replace AngularJS with Angular and to write tests for the functionality of this report using an appropriate framework for automated tests.

Test Algorithm Selection via Boolean Features [1, 2, 3] (Mentor: Marie-Christine Jakobs)

Automatic test-generation approaches have different strengths. To achieve better results, e.g., higher branch coverage, distinct test-generation approaches should be combined. One method to combine different approaches is algorithm selection, i.e., a selector chooses from a given set of approaches a most suitable approach for the particular test-case generation task (program). This method has already been successfully applied for software verification, e.g., [1]. The goal of this thesis is to apply algorithm selection to the verification-based test-case generation approaches available in the software analysis framework CPAchecker [2]. To this end, the thesis should first compare the performance of different analyses like Predicate Abstraction, Value Analysis, BMC, and symbolic execution with respect to covered test goals on the software verification competition benchmark [3] (in particular its test tasks). Thereby, it should be analyzed how well the features from [1] are suited to determine the best analysis per test task and study , whether there exist program features that would allow for a better determination. In a second step, the determined selection based on the identified features should be implemented in CPAchecker [2], e.g., as an extension of the SelectionAlgorithm together with an appropriate configuration. Ideally, the realized selection procedure is evaluated on the benchmark of the International Competition on Software Testing.

Selection of Time Limits for CoVeriTest via Boolean Features [1, 2, 3, 4] (Mentor: Marie-Christine Jakobs)

Automatic test-generation approaches have different strengths. To achieve better results, e.g., higher branch coverage, distinct test-generation approaches should be combined. One method to combine different approaches is the CoVeriTest [1] approach that interleaves different analysis for test-case generation providing each analysis with an individual time limit for each iteration. In this thesis, we want to use the principle of algorithm selection to choose the time limits per task. Algorithm selection has already been successfully applied for software verification, e.g., [2]. Similar to [2], the goal of this thesis is to use program features to define a selector for the time limits. As a starting point, it should be analyzed how well the features from [2] are suited to select the time limits and whether there exist program features that would allow for a better selection. In a second step, the determined selection based on the identified features should be integrated into CoVeriTest, which is implemented in CPAchecker [2]. Ideally, the realized selection procedure is evaluated on the benchmark of the International Competition on Software Testing.

Verification of Micro Services based on OpenAPI Specifications [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc/MSc thesis or as a research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: We build a tool that reads the API specification of a micro-service, extracts some program properties from it, and uses a Java verifier (formal verifier or fuzz tester) to check these properties.

Background: We want to verify Java micro-services for correct behavior. The interface of most Java micro-services are RESTful HTTP APIs. These are often defined using the OpenAPI specification language in JSON or YAML format. OpenAPI allows developers to define constraints on the allowed HTTP requests and responses. For example: 'age': { 'type': 'integer', 'format': 'int32', 'minimum': 0 }. These constraints can be used for findings bugs in micro-services and verifying that they always work as expected. The available constraints are defined in the JSON Schema Validation.

Details: You build a tool that receives an OpenAPI specification and a Java micro service. The tool reads the OpenAPI specification, extracts the constraints on response parameters (for example age.minimum: 0), turns them into a specification that the Java verifier can understand (for example response.age >= 0), and then runs the verifier to verify that the micro-service implementation always fulfills this specification (or to report test inputs that trigger a bug). The constraints on request parameters can be used to restrict the input values to the micro service and make the verification more precise. Depending on the verifier that is used, the specification may be given to the verifier or directly encoded in the Java code (e.g. to be able to use a fuzzer for verification).

Knowledge in software verification is helpful. An understanding of Java micro-services is necessary. The command-line tool can be written in one of the following languages: Java, Kotlin.

In the scope of this topic, the tool must be designed, implemented, and evaluated.

Verification of Micro Services based on API Specifications [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc/MSc thesis or as a research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: We build a tool that reads the API specification of a micro-service, extracts some program properties from it, and uses a Java verifier (formal verifier or fuzz tester) to check these properties.

Background: We want to verify Java micro-services for correct behavior. The interface of most Java micro-services are RESTful HTTP APIs. These are often defined using the OpenAPI specification language in JSON or YAML format. OpenAPI allows developers to define constraints on the allowed HTTP requests and responses. For example: 'age': { 'type': 'integer', 'format': 'int32', 'minimum': 0 }. These constraints can be used for findings bugs in micro-services and verifying that they always work as expected. The available constraints are defined in the JSON Schema Validation.

Details: You build a tool that receives an OpenAPI specification and a Java micro service. The tool reads the OpenAPI specification, extracts the constraints on response parameters (for example age.minimum: 0), turns them into a specification that the Java verifier can understand (for example response.age >= 0), and then runs the verifier to verify that the micro-service implementation always fulfills this specification (or to report test inputs that trigger a bug). The constraints on request parameters can be used to restrict the input values to the micro service and make the verification more precise. Depending on the verifier that is used, the specification may be given to the verifier or directly encoded in the Java code (e.g. to be able to use a fuzzer for verification).

Knowledge in software verification is helpful. An understanding of Java micro-services is necessary. The command-line tool can be written in one of the following languages: Java, Kotlin.

In the scope of this topic, the tool must be designed, implemented, and evaluated.

A Library for Unit Verification [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc/MSc thesis or as a research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: We want to build a command-line tool that builds a verification harness. The verification harness allows to use software verification like unit testing.

Background: Similar to Unit Testing, we want to use verification to find bugs in individual software methods (or proof that there are no bugs). A software verifier is a tool that checks some program code and a verification property ("the program never triggers any assertion failure"). But there are two issues: (1) For unit verification, there is no library like JUnit that provides methods for managing tests. (2) Users are not used to software verification and don't know how to write a verification harness. The second issue is tackled by the topic "Verification-Harness Synthesis for Unit Verification". The first issue is tackled by this topic.

Details: You build a C library that helps software developers write and execute verification tasks, similar to unit testing with JUnit. The library provides methods to define the range of input values (examples: int temperature = anyInt();, restrict(temperature > -273);) and to assert postconditions (example: assert(output == 0)). Ideally, the library provides utility methods that make it easier to define conditions on complex data structures like structs and/or pointer constructs. The library may also come with a small tool that runs all defined verification-methods with a verifier and report the individual outputs (like a test runner).

We focus on C code. A large selection of verifiers can be used off-the-shelf.

Knowledge in software verification and a basic understanding of C is necessary. The library must be written in C.

In the scope of this topic, the library is designed, implemented, and evaluated.

A Gradle Build Tool to Integrate Verifiers into the Project's Build Pipeline [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc thesis or as a large research training course with 12 ECTS. Mentoring is available in English and German.

Goal: We want to build a tool that allows developers to hook software verifiers into their project's build pipeline.

Background: We want to find bugs in software (or proof that there are no bugs) through verification. A software verifier is a tool that checks some program code and a verification property ("the program never triggers any assertion failure"). But most software verifiers only understand already-preprocessed software code and/or only single code files. This means that users have to first preprocess their source code manually and have to point the verifier to each file to analyze. This can be a cumbersome and difficult process.

Details: You build a gradle plugin that enables everyday-developers to integrate software verifiers into their project. In the end, this will be similar to linters and other build plugins that you know: The developer adds your plugin to their build configuration. This makes the plugin automatically runs the software verifier on the software files on project build. Your plugin hooks into the software build process, retrieves all relevant files, and then passes them to the verifier. Your plugin collects the verifier's results and output them in a nice format to the user. We focus on Java code and projects that are built with gradle.

Experience with 'gradle' and a good understanding of Java is recommended. Your build tool can be written in Java or Kotlin.

In the scope of this topic, you design, implement and evaluate your plugin.

A Maven Build Tool to Integrate Verifiers into the Project's Build Pipeline [1] (Mentor: Thomas Lemberger)

This topic is available as a BSc thesis or as a large research training course with 12 ECTS. Mentoring is available in English and German.

Goal: We want to build a tool that allows developers to hook software verifiers into their project's build pipeline.

Background: We want to find bugs in software (or proof that there are no bugs) through verification. A software verifier is a tool that checks some program code and a verification property ("the program never triggers any assertion failure"). But most software verifiers only understand already-preprocessed software code and/or only single code files. This means that users have to first preprocess their source code manually and have to point the verifier to each file to analyze. This can be a cumbersome and difficult process.

Details: You build a maven plugin that enables everyday-developers to integrate software verifiers into their project. In the end, this will be similar to linters and other build plugins that you know: The developer adds your plugin to their build configuration. This makes the plugin automatically runs the software verifier on the software files on project build. Your plugin hooks into the software build process, retrieves all relevant files, and then passes them to the verifier. Your plugin collects the verifier's results and output them in a nice format to the user. We focus on Java code and projects that are built with maven.

Experience with 'maven' and a good understanding of Java is recommended. Your build tool can be written in Java or Kotlin.

In the scope of this topic, you design, implement and evaluate your plugin.

Reverse Program Synthesis for Backward Reachability Analysis [1] (Mentor: Po-Chun Chien)

Description: In most verification tools, the reachability analysis searches for a path from the initial program location to the error location. The analysis could also be done backward, i.e. tries to reach the initial location from the error location. The goal of this project is to implement a control-flow automata (CFA) transformer in CPAchecker that reverses program executions by following the transitions backward. One could then apply existing analysis to verify the reverse CFA.

Requirements: (general information)

  • Ability to program in Java and understand C code
  • Basic knowledge of control-flow automata

Improving the Encoding of Arrays in Btor2-to-C Translation [1, 2] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 [1] is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C [2] is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software analyzers. Currently, on every write of an array, Btor2C creates a copy of it, even when unnecessary (see this issue). This might put burdens on software analyzers and hinders their performance.

Goal: The goal of this project is to alleviate such situation and allocate a new array only when necessary. Additionally, we also want to implement a translator that converts Btor2 array tasks to pure bit-vector tasks by blasting the array operations (read and write) into a chain of ite operations.

Requirements: (general information)

  • Ability to program in C (and/or Python)
  • Basic knowledge of Kripke structure and SAT/SMT solving

Validation-via-Verification for Hardware Btor2 Circuits [1, 2, 3, 4] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 [1] is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C [2] is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers. If a software verifier finds an error in the program, it generates a violation witness, which records the execution path leading to that error. On the other hand, if a software verifier proves a program correct, it generates a correctness witness [3] containing some program invariants that could reconstruct a correctness proof.

Goal: The goal of this project to certify the correctness witnesses (in GraphML format) produced by software verifiers by following the validation-via-verification approach [4]. The original Btor2 circuit is instrumented with the invariants from a correctness witness, and can then be independently solved by hardware verifiers.

Requirements: (general information)

  • Ability to program in Python (, Java, and/or C)
  • Basic knowledge of Kripke structure and SAT/SMT solving

Violation Witness Translation from C to Btor2 [1, 2, 3] (Mentors: Nian-Ze Lee, Po-Chun Chien)

Background: Btor2 [1] is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C [2] is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers. If a software verifier finds an error in the program, it generates a violation witness [3], which records the execution path leading to that error. On the other hand, if a software verifier proves a program correct, it generates a correctness witness containing some program invariants that could reconstruct a correctness proof.

Goal: The goal of this project to translate the violation witnesses in GraphML format produced by software verifiers to Btor2 format, such that one could use hardware simulators to validate them.

Requirements: (general information)

  • Ability to program in Python (, Java, and/or C)
  • Basic knowledge of Kripke structure and SAT/SMT solving

Implement Backward Bounded Model Checking in CPAchecker [1] (Mentor: Nian-Ze Lee)

Description: Bounded model checking (BMC) is a well-known software analysis technique that searches for execution paths that could lead to the error location from the initial program location. Backward BMC, on the contrary, follows the execution backward and searches paths from the error location to the initial location. The goal of the project is to implement such technique in CPAchecker.

Requirements: (general information)

  • Ability to program in Java and understand C code
  • Basic knowledge of SAT/SMT solving and weakest precondition

Scaling Formal Verification: Parallel Analysis of Functions in CPAchecker [1] (Mentor: Matthias Kettl)

Verifying complex programs with respect to a given specification (e.g., 'check that function reach_error is never called') may be time consuming. To make verifiers more applicable in industry, we want to tackle the problem with a divide-and-conquer approach. We decompose the verification task in multiple smaller tasks that are then verified in parallel. As soon as one of the analyses reports a violation, we abort all other analyses and report the violation. This will lead to many false alarms since we do not know with which input parameters functions will be called. Return values of functions will be ignored and interpreted as unknown value in a first step. From there we can continue in two directions:

  1. Analyze functions under a user defined condition (e.g., a user restricts the values of parameter x of function foo to the interval [0;100])
  2. We identify each function call automatically and try to infer all parameters.
The goals of this topic are:
  1. The verification verdict should be available faster (in terms of passed walltime).
  2. Wrong proofs should be non-existing. In case a program is safe, the decomposition should be a sound over-approximation.
Finally, we will evaluate this approach on the SV-COMP benchmark set. This topic is inspired by the tool Infer

Fault Injection: Generating a Benchmark Set for Automated Fault Repair [1, 2, 3] (Mentor: Matthias Kettl)

The SV-COMP benchmarks set is one of the largest benchmarks sets of C programs for software verification. For all programs in the set we know whether the program fulfills certain properties, e.g., the function 'reach_error' is never called. The benchmarks help to evaluate state-of-the-art verification tools and their effectiveness. For fault localization techniques there is no large benchmark set that provides sufficient metadata for each task to perform a large-scale evaluation. The idea is to create such tasks by manually crafting them, injecting faults in safe tasks, or identifiying fixes for unsafe tasks.

Reconstruction of Violation and Correctness Witnesses for Distributed Block Summaries [1] (Mentor: Matthias Kettl)

CPAchecker is a framework for automatic software verification developed at our chair. As such it has a broad technology stack that should provides everything that is needed for verifying C programs. CPAchecker is able to distribute the analysis of a single task to different workers by decomposing the input program into blocks. Each worker analyzes one block and communicates the results to all other workers over messages to establish the correct verification result. In case we find a violation, we want to provide the user with a counterexample that proves the reachability of a violation. Currently, neither the counterexample nor invariants can be synthesized from the messages. However, witnesses are important for improving the trust in the verfication results.

Implementing and Evaluating a new CPA for validating and finding tubes. [1, 2] (Mentors: Matthias Kettl, Marian Lingsch-Rosenfeld)

Finding and fixing bugs is one of the most expensive parts of the software development process. To expedite it, a good description of all inputs which result in the same error can be helpful. To generate them, we will use CPAchecker to find constraints which result in an error when fulfilled, for example all even numbers as input result in an error. This will be done by first finding an error path in the program being analyzed, followed by refining a first guess for a constraint using an approach based on CEGAR.

Program Transformation in CPAchecker: Design and Implementation of a Source-Respecting Translation from Control-Flow Automata to C Code [1] (Mentor: Thomas Lemberger)
Developing a Verifier Based on Parallel Portfolio with CoVeriTeam (Mentor: Sudeep Kanav)

CoVeriTeam is a framework to compose verification tools. We can compose different tools resulting in a verification tool. This thesis aims to construct a new verifier based on a composition of a few tools that solves more tasks than (most of) the verifiers participating in SVCOMP.
In this work, we aim to develop a verifier that is fast with respect to walltime (not cputime). This verifier will be a portfolio solver. We will develop a new kind of composition called portfolio in CoVeriTeam. This composition will make use of MPI technology to execute the processes on different machines.
evaluation: execute the composed verifier on the SVCOMP benchmarks and compare it with the participants in the walltime track competition. The implementation will be done in Python.

A New Spin on Verification with Symbolic Execution: Symbolic Execution as Formula-Based Predicate Analysis in CPAchecker [1] (Mentor: Thomas Lemberger)
Taint Analysis for CPAchecker [1] (Mentor: Martin Spiessl)

While taint analysis has many applications in practice (especially for software security), there is currently no implementation of a taint analysis in CPAchecker. Also there is no category in the Competition on Software Verification (SV-COMP) for taint analysis. The idea behind this thesis is to add support for a simple taint analysis to CPAchecker, search or create some benchmark tasks and standardize them in such a way that they can be used as part of SV-COMP in a new category.

Mining a Benchmark Set for Partial Fixes in C programs [1] (Mentor: Thomas Lemberger)
Concurrent Software Verification through Block-based Task Partitioning and Continuous Summary Refinement [1] (Mentor: Thomas Lemberger)
Adjustable Block Analysis: Actor-Based Creation of Block Summaries for Scaling Formal Verification [1] (Mentor: Thomas Lemberger)
Mutation based Automatic Program Repair in CPAchecker [1] (Mentors: Sudeep Kanav, Thomas Lemberger)

Debugging software usually consists of three steps: 1. finding an error in the program, 2. finding the cause for that error (the fault), and 3. fixing that fault. By default, fixing the fault is done manually by a programmer, which often consumes vast amount of resources and may introduce new bugs. To ease that task, this topic is concerned with automatic program repair. After finding a potential fault in a program, automatic program repair proposes fixes and often checks the validity of these fixes. The goal is to implement and evaluate a technique for automatic program repair in C programs in the state-of-the-art verification framework CPAchecker. The complexity of the technique and the scope of the topic is based on the thesis type (project, bachelor or master). Programming is done in Java.

String analysis in CPAchecker [1, 2] (Mentor: Philipp Wendler)
String analysis in CPAchecker [1, 2] (Mentor: Philipp Wendler)
LTL software model checking (Mentor: Philipp Wendler)
Conversion of counterexamples found by software verifiers into an executable test harness that exposes the bug in the program (prototype exists) [1] (Mentor: Thomas Lemberger)
Correctness-witness validation using predicate analysis (Mentor: Martin Spiessl)

Verifiers take a program and a specification as input and output whether the program fulfills the specification.If the answer is yes, the verifier can output additional information in form of a correctness witness.This correctness witness can then be used (e.g. by another verifier) to validate that the answer is correct.Our verifier CPAchecker provides many different analyses for verification, but only one of them (called k-Induction) is currently able to validate witnesses.An analysis based on predicate abstraction already exists in CPAchecker that should be able to validate witnesses with minor to moderate programming effort.Once this is working, predicate abstraction could also give insights into which information is missing in the witness.At the end, an evaluation of the new witness validation can be performed on an already existing set of representative programs using our verification cluster.This topic is suitable for a bachelor's thesis, but could also be extended for a master's thesis on request.

Heuristic selection of verification analyses (Mentor: Matthias Dangl)
A JavaScript frontend for CPAchecker (Mentor: Philipp Wendler)
Improve analysis of Java programs in CPAchecker (Mentor: Philipp Wendler)
Support for exceptions in the Java analysis of CPAchecker (Mentor: Philipp Wendler)
Towards understandability of verification result: visualization of graphs produced by verifiers (Mentor: Philipp Wendler)
Predicate analysis based on splitting states for CPAchecker [1, 2] (Mentor: Philipp Wendler)
Improve predicate analysis based on domain-type of variables in CPAchecker [1, 2] (Mentor: Philipp Wendler)
New Approaches and Visualization for Verification Coverage [1] (Mentor: Philipp Wendler)
Application of Software Verification to OpenBSD Network Modules [1] (Mentor: Thomas Lemberger)
Implementation of an Analysis for Symbolic Memory Graphs (SMG) in CPAchecker [1, 2] (Mentor: Thomas Bunk)
Unify analyses (ValueAnalysis, IntervalAnalysis, SignAnalysis) in CPAchecker by using a common data-wrapper for numerical values (integers, bitvectors, floats, ...), optional: support for symbolic values, BDDs, and Octagons [1] (Mentor: Karlheinz Friedberger)
Define and evaluate different refinement approaches for interval domain or BDD domain (widening, splitting, interpolation, path prefixes) [1, 2, 3] (Mentor: Karlheinz Friedberger)
Separation Logic in CPAchecker (Mentor: Martin Spiessl)
Symbolic Memory Graphs in CPAchecker [1, 2] (Mentor: Karlheinz Friedberger)
Witness Modifications for Program Transformations: A Case Study on Side-Effect Removal (Mentor: Marian Lingsch-Rosenfeld)
Multi-process architecture for BAM-parallel to provide cluster-based software analysis, maybe with central or shared database for reached states (Mentor: Karlheinz Friedberger)
Backwards analysis for CPAchecker, initially BDD- and predicate-based (both partially implemented somewhere), later symbolic and explicit analysis. (Mentor: Karlheinz Friedberger)
Good, bad, ugly interpolants. Measure interpolants and build heuristics. Topic might be related to refinement selection. [1] (Mentor: Philipp Wendler)
Hybrid Testcase Generation with CPAchecker [1] (Mentor: Thomas Lemberger)
A Web Frontend For Visualization of Computation Steps and their Results in CPAchecker [1] (Mentor: Thomas Lemberger)
Parallel BDDs in Java (cf. Sylvan) [1] (Mentor: Karlheinz Friedberger)
Fault Localization in Model Checking. Implementation and Evaluation of Fault-Localization Techniques with Distance Metrics [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it - this is a difficult and long-taking process. The aim of this thesis is to help programmers with debugging through the use of distance metrics (cf. Explaining Abstract Counterexamples, 2004).

Fault Localization for Formal Verification. An Implementation and Evaluation of Algorithms based on Error Invariants and UNSAT-cores [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it - this is a difficult and long-taking process. The aim of this thesis is to help programmers with debugging by implementing Error Invariants and bug-finding with unsat cores, techniques for bug-finding based on boolean representations of the faulty program, to (1) automatically find potential causes for faulty behavior and (2) present the found causes.

Test-based Fault Localization in the Context of Formal Verification: Implementation and Evaluation of the Tarantula Algorithm in CPAchecker [1] (Mentor: Thomas Lemberger)

There are many different ways to find out whether a program is buggy, for example testing or formal verification. Once we know that a program is buggy, i.e., that it shows some faulty behavior, a developer has to manually debug the program to find the cause of the problem and fix it - this is a difficult and long-taking process. The aim of this thesis is to help programmers with debugging by implementing Tarantula, a technique for bug-finding based on test coverage, to (1) automatically find potential causes for faulty behavior and (2) present the found causes.

Converting Test Goals to Condition Automata [1] (Mentor: Thomas Lemberger)

Conditional Model Checking and Conditional Testing are two techniques for the combination of respective verification tools. Conditional Testing describes work done through a set of covered test goals and Conditional Model Checking describes work done through condition automata. Because of this discrepancy, the two techniques can not be combined. To bridge this gap, this thesis transforms a set of test goals into a condition automaton to allow easy cooperation between Conditional Testing and Conditional Model Checking.

Implementation and Evaluation of k-Shrinkability [1, 2, 3, 4] (Mentor: Martin Spiessl)
A Language Server and IDE Plugin for CPAchecker [1] (Mentor: Thomas Lemberger)

At the moment, there are two ways to use CPAchecker: Locally through the command-line, and in the cloud through a web interface. Both options require C developers to leave their IDE everytime they want to run CPAchecker, which disrupts their workflow. The goal of this project is to build an IDE plugin that provides a seamless experience of using CPAchecker (e.g., for Eclipse CDT, VisualStudio Code or mbeddr). We would like the plugin to focus on ease of use and immediate feedback that is easy to comprehend. The plugin should:

  1. allow developers to run CPAchecker on their current project,
  2. provide some options for customization (e.g., whether to run CPAchecker locally or in the cloud), and
  3. provide useful feedback (e.g., if a possibly failing assert-Statement was found by CPAchecker)

Specifying Loops with Contracts. [1] (Mentor: Gidon Ernst)
Information flow testing for PGP keyservers [1] (Mentor: Gidon Ernst)
Implement Fuzzing in CPAchecker (Mentor: Gidon Ernst)
SMT-based checking and synthesis of formal refinements (Mentor: Gidon Ernst)
Rely/Guarantee Reasoning for Separation Logic in SecC (Mentor: Gidon Ernst)
Loop Contracts for Boogie and Dafny (Mentor: Gidon Ernst)
Frontends for a deductive verifier (Python) (Mentor: Gidon Ernst)
Frontends for a deductive verifier (Boogie) (Mentor: Gidon Ernst)
Guided fuzz testing with stochastic optimization (Mentor: Gidon Ernst)
Interactive verification debugging [1] (Mentor: Gidon Ernst)
Simplification and solving strategies for SMT (Mentor: Gidon Ernst)
Development of an interactive theorem prover for untyped logic (Mentor: Gidon Ernst)
User-friendly tactics for interactive proofs (Mentor: Gidon Ernst)
Inference of Invariants over Algebraic Data Types (Mentor: Gidon Ernst)
Test-case generation for protocol implementations via fuzzing (Mentor: Gidon Ernst)
Concolic Testing for Liskov's substitution principle (Mentor: Gidon Ernst)
Studie: PBT Driven Design am Beispiel eines Embedded Software Projekts (Mentor: Gidon Ernst)
Fuzzing Strategies in Legion/SymCC (Mentor: Gidon Ernst)
Symbolic Automata Learning for Unbounded Systems (Mentor: Gidon Ernst)
Case-studies in verifying Python programs (Mentor: Gidon Ernst)
Loop summarization for array programs (Mentor: Gidon Ernst)
Formal Verification of Interactive Visual Novels (Mentor: Gidon Ernst)
Security Specifications for Dafny (Mentor: Gidon Ernst)
Penetration Testing of Web-Applications (Mentor: Gidon Ernst)
Experiments with Inference of Loop Contracts in Dafny (Mentor: Gidon Ernst)
A code-complexity analysis on the component level on the example of CPAchecker [1] (Mentor: Thomas Lemberger)
Real-World Requirements for Software Analysis [1, 2, 3, 4, 5, 6] (Mentor: Martin Spiessl)

The National Institute of Standards and Technology (NIST) has published a specification for capabilities that a "Source Code Security Analysis Tool" needs in order to be useful in software security assurance [1][2]. This is accompanied by test suites (subsets of the NIST SARD test suite, e.g. suite 100 and 101 [3]) with programs that have to be checked for certain weaknesses [4][5]. For classification of the weaknesses, CWE-IDs are used.

This now leads to various interesting research questions. A subset of those can be addressed by a project or thesis, depending on the available time and progress.

0. How can the CWE-IDs from the relevant SARD test suites be mapped to the specification that are used in the Competition on Software Verification(SV-COMP)?
Some of the weaknesses, e.g. memory-safety related ones or undefined behavior like overflows, are already checked for in the SV-COMP. For the CWEs that cannot be translated into the specifications currently used in the SV-COMP, a critical discussion should be added on how easy/hard it would be to add these and on how easy/hard it is to check for those properties. The extend of this discussion can be tuned to the scope of the project/thesis.

1. How many of the tests in the test suite can already be checked by the automatic software verifiers that participated in the Competition on Software Verification(SV-COMP)?
One of the claims of the SV-COMP is that it shall reflect real-world problems in software verification. As such, it is of interest to determine whether this is the case by applying the participating tools to tasks that are used in industry, like the SARD test suite.
Provided that the license allows this, the corresponding test cases can be extracted from the relevant SARD test suites and converted into tasks for the SV-COMP. A task in the SV-COMP consists of a program and a specification, so ideally the right specification is automatically determined by parsing the information in the xml manifest of the SARD test suite. After the conversion is complete, an extensive evaluation with the tools that participated in the SV-COMP (and optionally other tools, like the ones listed in [6]) shall be performed.

2. Can CPAchecker be extended (e.g. by new analyses) to find these weaknesses and become a Source Code Security Analyzer like many other tools[6]?
Currently, CPAchecker is not really designed to return multiple findings, as in verification it is usually enough to show one specification violation. For some CWEs there should be support, e.g. memory safety or overflows. Others however (e.g. "CWE-134: Uncontrolled Format String") would require a new analysis to be implemented. For reducing the scope of the thesis/project, one can also just pick a subset of the CWEs and implement the corresponding analysis or analyses. Another way to adapt the scope is to only look at some of the requirements. These are divided into mandatory features SCSA-RM-{1,2,3,4,5,6} and optional features SCSA-RO-{1,2,3}[1].

CWE-Ids for Software Analysis in CPAchecker and SV-COMP (subtopic from 'Real-World Requirements for Software Analysis') [1, 2, 3, 4, 5, 6, 7] (Mentor: Karlheinz Friedberger)

The long description of the supertopic can be found under 'Real-World Requirements for Software Analysis'.
This subtopic is a specialization for finding and documenting CWE-Ids that match them against CPAchecker's analyses. The target is not as broad as the supertopic, but more related to our own software project. Plan:

  • Which CWE-Ids are supported by CPAchecker? Can we simply extend CPAchecker for a few more?
  • Can we describe CWE-Ids with a specification like the BLAST query language?
  • Does SV-COMP provide tasks for the found CWE-Ids?
  • Provide a nice overview (webpage?) for those CWE-Ids?
The thesis does not include much code-related work, but focuses more on literature research and documentation.

Predicate-based Analysis of Concurrent Programs [1, 2, 3] (Mentor: Karlheinz Friedberger)

Issue 464

Reliable Benchmarking and Measurements

Available topics
Timestamps in logs [1] (Mentor: Philipp Wendler)

During benchmarking runs, BenchExec collects the output of the benchmarked tool into files. In some use cases, it would be beneficial if each produced line would be prefixed with a timestamp. For this we need to pipe the tool output through the BenchExec process instead of writing it directly (GitHub issue). While implementing this it might be possible to solve some related issues as well.

Currently assigned topics
BenchExec: Summary Tables and Filtering Improvements [1] (Mentor: Philipp Wendler)

This proposal aims to support some significant feature requests for the HTML table component of BenchExec and refactor the code to improve readability and maintainability. These issues include upgrading React Router to the latest stable version 6, merging the two tables on the Summary page to one simple table, displaying the error counts in the table summary, and reworking the filtering logic. Some additional goals are supporting syntax highlighting in overlay panes, adding an intuitive display for command line options in merged runset, and changing the name of the runsets to be more user-friendly.

Overlay handling with fuse-overlayfs in BenchExec [1] (Mentor: Philipp Wendler)

For benchmarking purposes, BenchExec typically creates a container with a file-system overlay atop the system's file system. However, due to a kernel change, this process is no longer viable for the root file system. Implementing fuse-overlayfs as an alternative overlay mechanism might resolve this issue. This entails developing a fallback mechanism for BenchExec's file-system overlay strategy: initially attempting to utilize the kernel overlayfs, and if unsuccessful, resorting to fuse-overlayfs. This approach ensures BenchExec's functionality remains intact.

Finished topics
Make use of the Intel Cache Allocation Technology [1, 2] (Mentor: Philipp Wendler)

If several benchmarking runs should be executed in parallel on the same system, the memory usage of one run can influence the performance of others (because CPU caches and the memory bandwidth is shared), even if each run gets its own reserved share of the memory. Recent Intel CPUs have a cache allocation technology, which allows to control the usage of caches and memory bandwidth and reduce the influence that parallel benchmarking runs have on each other. The goal of this project is to add support for using this technology to BenchExec.

Find appropriate garbage collector(s) for CPAchecker via a systematic study [1] (Mentor: Philipp Wendler)

A study should be conducted that evaluates options for garbage collectors when executing CPAchecker as described in the linked issue. The goal of this thesis is to have usable suggestions on what garbage collectors CPAchecker should use in which situation. For this it is crucial that the study is performed in a systematic and well-planned manner such that the results are reliable and meaningful.

Extend allocation of CPU cores in BenchExec [1, 2] (Mentor: Philipp Wendler)

BenchExec is a benchmarking framework and assigns a specific set of CPU cores to each benchmark run. The calculation which core to use for which run needs to consider low-level details of the hardware architecture (like hyper threading, NUMA, etc.) in order to avoid interference between runs as far as possible. Our current implementation supports multi-CPU systems with NUMA and hyper threading, but not some more recent features like hybrid-architecture CPUs or CPUs with split Level 3 caches. The goal of this thesis is to implement a more general core allocation (in Python) with support for all kinds of modern CPUs and additional features and rigorous tests. There should also be an experimental evaluation on how different allocation strategies and such hardware details influence benchmarking results.

Use cgroups v2 for BenchExec to improve container isolation [1, 2] (Mentor: Philipp Wendler)

Our benchmarking framework BenchExec uses cgroups in order to limit and measure time and memory usage during benchmarking. The Linux kernel is replacing cgroups with a new API (v2) and we need to support this in BenchExec (#133). Once we have it, we can use cgroup namespaces to better isolate our containers and provide full support for nested containers (#436).

Modern architecture and tests for HTML tables of BenchExec [1] (Mentor: Philipp Wendler)
Improve usability of summary tab for HTML tables of BenchExec [1] (Mentor: Philipp Wendler)
Measuring and optimizing energy consumption of verification work (Mentor: Martin Spiessl)

SMT Solvers and Java SMT

Currently assigned topics
JavaSMT: develop a interface for proofs returned by SMT solvers and add support for existing SMT solvers inside JavaSMT (with automated tests and build documentation). (Mentor: Daniel Baier)

JavaSMT is a framework that provides a common API for several SMT solvers. SMT solvers are capable of returning a proof for a verdict. These proofs however are different from solver to solver. The goal of this thesis is first an analysis of the different proofs returned by the solvers to develop a common interface, then implementing this in JavaSMT and finally connecting all possible SMT solvers to this interface (with automated tests and possible build documentation). This topic is suited for a bachelor's thesis

JavaSMT: add support for UltimateEliminator, a tool for solver-independent quantifier elimination, to JavaSMT. (with automated tests and build documentation). (Mentor: Daniel Baier)

JavaSMT is a framework that provides a common API for several SMT solvers. However, some SMT solvers lack support for quantifier theory and quantifier elimination. UltimateEliminator bridges this gap by offering solver-independent quantifier elimination. The goal of this thesis is adding a solver-independent quantifier layer to JavaSMT and connecting UltimateEliminator to it, allowing all solvers to utilize quantifiers as long as they are eliminated in the end (with automated tests and build documentation). This topic is suited for a bachelor's thesis

JavaSMT: extend the solver-independent SMT2 String parser/composer with missing SMT theories (e.g. Floating-Points, Strings, ...) and add a SMT-LIB2 based SMT solver to JavaSMT based on the added theories (with automated tests and build documentation). (Mentor: Daniel Baier)

JavaSMT is a framework that provides a common API for several SMT solvers. However, there are solver that do not offer parsing/dumping of formulas in the SMT2 format. The goal of this thesis would be the extension of our solver-independent SMT-LIB2 parser/generator layer in JavaSMT, that enables users to generate SMT2 formulas in String form using the API, but also read arbitrary SMT2 formulas into any of the existing solvers APIs without using the solvers parsing abilities. To evaluate this extension, we add a SMT solver using the parser/generator. This topic is suited for a bachelor's thesis

Finished topics
SMT query caching, slicing, and simplification module for Java SMT [1] (Mentor: Karlheinz Friedberger)
Fuzzing for SMT Solvers on API Level [1, 2] (Mentor: Karlheinz Friedberger)

Formal methods use SMT solvers extensively for deciding formula satisfiability. However, solvers may contain critical bugs that lead to unsound results. We can use tools for mutation-based fuzzing on our own API and the underlying SMT solvers. We can report bugs back to the developers and update JavaSMT with fixed solver libraries or bindings.

Implementing a fast SMT Solver based on interval-computation [1] (Mentor: Karlheinz Friedberger)

An SMT formula is a boolean formula that includes numbers and common arithmetic operations. Since formal verification techniques often use SMT formulas to represent programs and models , the speed of solving SMT formulas is a very important aspect of formal verification. Studies show that SMT formulas that are created by model checkers or symbolic execution often have a specific structure: The relation between program variables (e.g., "x == y + z") is dependent on program constraints over these variables that only contain a single variable each, e.g., x > 0, y > 0, and z > 10. This example would be represented as the SMT formula x == y + z ∧ x > 0 ∧ y > 0 ∧ z > 10.

The satisfiability of SMT formulas with this specific structure can be computed through the following decision procedure:
1. For each program variable in the formula, compute the interval(s) of possible values for this variable for which the constraints of this variable are satisfied, and
2. Check whether the relations between program variables can be satisfied with the computed intervals.

The goal of this topic is to implement this procedure and to make it available as a Java library. Work on this in the symbolic execution engine Klee has shown great potential, leading to speedups of up to 55x compared to the use of existing complete SMT solvers.

Non-Deterministic Tests (Flaky Tests)

Currently assigned topics
Modern Tests for BenchExec [1] (Mentor: Philipp Wendler)

BenchExec's tests are currently written using the test framework nose, which is not maintained anymore and has problems on new Python versions. The goal of this project is to switch the test framework used for BenchExec. As a first step, an appropriate framework needs to be chosen. This should be done systematically based on criteria like ease of use, how common the framework is, whether it is well maintained etc. Next, the framework needs to be integrated in BenchExec's project structure and the existing tests need to adjusted where necessary. This might also include restructuring the tests and in general improving their state. Filling gaps in our test coverage with more unit tests would also be highly welcome.

Logical Verification of Cyber-physical Systems

Available topics
Formalization of Communicating Hybrid Systems Axiomatics in Isabelle/HOL (Mentor: Marvin Brieger)

This topic is available as a BSc thesis, a MSc thesis, or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: The safe interaction of computers with their physical environment (e.g. in modern car-controllers or care robots) requires utmost care as human health and life may be directly affected by a malfunction. Formal proofs provide a very high level of certainty about the safety of such cyber-physical systems. The dynamic logic of communicating hybrid programs (dLCHP), which describes interactions of computers, physical behavior, communication, and parallelism, is is a proof system supporting safety reasoning about cyber-physical systems.

In this thesis, parts of the theory of dLCHP are to be formalized in the interactive proof assistant Isabelle/HOL. That is, available pen-and-paper proofs are to be carried over into a mechanized theory in the proof assistant.

Requirements: The student must already have experience in using an interactive theorem prover (e.g. Isabelle/HOL, Coq, Lean), which might have been obtained in a lecture, seminar, or practical course.

Small theorem prover kernels: Uniform substitution for your favorite logic (Mentor: Marvin Brieger)

This topic is available as a MSc thesis or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: Uniform substitution is a proof rule that enables the implementation of a theorem prover kernel with very little code. For example, KeYmaera X, a theorem prover for reasoning about Cyber-physical systems, only has 2 kLOC code in its core, while its predecessor KeYmaera not using uniform substitution had 100 kLOC code.

Uniform substitution as a proof rule is always tailor-made for a certain logical system. In this thesis, uniform substitution for some logical system is to be investigated that is not considered yet. For example, uniform substitution could be studied for Incorrectness logic, which can be used for proving presence of bugs in software, or for separation logic, which can be used to reason about pointers.

Requirements: The student should have got good grades in logic and discrete structures and formal specification and verification

Implementation of a theorem prover microkernel (Mentor: Marvin Brieger)

This topic is available as a BSc thesis, a MSc thesis, or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: The safe interaction of computers with their physical environment (e.g. in modern car-controllers or care robots) requires utmost care as human health and life may be directly affected by a malfunction. Formal proofs provide a very high level of certainty about the safety of such cyber-physical systems. The dynamic logic of communicating hybrid programs (dLCHP), which describes interactions of computers, physical behavior, communication, and parallelism, is is a proof system supporting safety reasoning about cyber-physical systems.

In this thesis, the logic dLCHP is to be implemented as an extension of the interactive proof assistant KeYmaeara X for hybrid systems.

Requirements: The student should have experience in functional programming (e.g. Haskell, Scala) and modern web frameworks (e.g. Angular, React)

Formalization of Game Logic with Sabotage in Isabelle/HOL (Mentor: Marvin Brieger)

This topic is available as a MSc thesis, or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: Game logic with sabotage (GLs) [1] studies the abilities of competing agents to reach their goals if they can be so mean to lay traps for each other. Unlike game logic without sabotage, GLs is equiexpressive to the modal mu-calculus (L-mu), a logic that can be used to characterize important formal methods e.g. model checking. By equiexpresisvenss GLs provides a novel perspective onto these formal methods. Further, the equiexpressivenss is the foundation for the first completeness result for game logic. Recent flaws in related proofs make the formal verification of this proof a significant task.

In this thesis, the equiexpressivness of GLs and the L-mu is to be formalized in the proof assistant Isabelle/HOL.

Requirements: The student must already have experience in using an interactive theorem prover (e.g. Isabelle/HOL, Coq, Lean), which might have been obtained in a lecture, seminar, or practical course.

[1] Noah Abou El Wafa and André Platzer. 2024. Complete Game Logic with Sabotage. (LICS 2024)

Currently assigned topics
Formalization of Communicating Hybrid Systems Axiomatics in Isabelle/HOL (Mentor: Marvin Brieger)

This topic is available as a BSc thesis, a MSc thesis, or as a MSc research training course with 6/12 ECTS. Mentoring is available in English and German.

Goal: The safe interaction of computers with their physical environment (e.g. in modern car-controllers or care robots) requires utmost care as human health and life may be directly affected by a malfunction. Logical reasoning about such systems provides a very high level of certainty about the safety of such cyber-physical systems. The dynamic logic of communicating hybrid programs (dLCHP), which describes interactions of computers, physical behavior, communication, and parallelism, is one logical system supporting safety reasoning about cyber-physical systems.

In this thesis, parts of the theory of dLCHP are to be formalized in the interactive proof assistant Isabelle/HOL. That is, available pen-and-paper proofs are to be carried over into a mechanized theory in the proof assistant.

Requirements: The student must already have experience in using an interactive theorem prover (e.g. Isabelle/HOL, Coq, Lean), which might have been obtained in a lecture, seminar, or practical course.

Formalization of Coincidence Properties for Communication Traces in Isabelle/HOL (Mentor: Marvin Brieger)
Finished topics
Case Studies in the Dynamic Logic of Communicating Hybrid Programs (Mentor: Marvin Brieger)

BDDs, ZDDs and ParallelJBDD

Finished topics
Implement more decision diagrams in ParallelJBDD: CBDDs [1] (Mentor: Stephan Holzner)
Implement more decision diagrams in ParallelJBDD: tagged BDDs [1] (Mentor: Stephan Holzner)

Collaboration with Bundeswehr University

We also offer thesis topics in collaboration with the PATCH lab at Bundeswehr University. These theses will be formally supervised by our chair, but the day to day work will be guided by a mentor from the PATCH lab.

The language for the following theses is English.

Finished topics
SV-comp Benchmarks for Weak Memory Models [1] (Mentor: Sudeep Kanav)

SV-Comp is the International Competition on Software Verification. It serves as a platform for the research community to evaluate tools targeting verification of C and Java programs. A total of 28 tools from 11 different countries participate in SV-Comp’20.

The simplest (and strongest) model of computation for concurrent programs is sequential consistency (SC), i.e. interleaving semantics. However, for performance reasons, mainstream processors perform optimisations that results in more possible executions than those allowed by SC. This new execution can result in bugs in the program which are hard to detect. All executions which are valid according to a certain processor are specified by its memory model.

We aim to create a new category in SV-Comp dealing with memory models. We currently have a collection of more than twelve thousand benchmarks that were used to validate the formalisation of the memory models of x86, Arm and Power processors and the Linux kernel. Those benchmarks are written in the litmus format, which is only supported by a small number of tools. The goal of this thesis is to port the benchmarks from the litmus format to C code. Dartagnan is a verification tool with support for memory model that can already parse all such benchmarks. What is required, is to implement an exporter from Dartagnan’s intermediate representation of programs to C.

Verification Witnesses: from Llvm to C [1] (Mentor: Sudeep Kanav)

Dartagnan is a software verification tool. It takes a C program, it compiles it to Llvm and then it performs the verification on the Llvm code. In 2020, Dartagnan participated for the first time in Sv-Comp, the International Competition on Software Verification. One of the requirements of Sv-Comp, is to provide verification witnesses: a proof of correctness or a counterexample. Dartagnan can already generate a counterexample witness as a graph. However, this graph refers to the Llvm code while Sv-Comp requires to generate a counterexample witness w.r.t to input C program. The information mapping the C instructions with the corresponding Llvm code is provided by the transformation from C to Llvm used by Dartagnan. However it is not currently used. The goal of this thesis is to use this information and convert the counterexamples generated by Dartagnan to witnesses w.r.t the C code, thus fulfilling Sv-Comp requirements.