We are hiring new doctoral researchers, student research assistants, and tutors. Apply now!
2 papers accepted at ASE 2024: BenchCloud and CoVeriTeam GUI

Prof. Dr. Marie-Christine Jakobs

Professor, Reliable Software

Picture of Prof. Dr. Marie-Christine Jakobs

Software and Computational Systems Lab
Department of Computer Science
Ludwig-Maximilians-Universität München (LMU Munich)
Oettingenstraße 67
80538 Munich (Germany)

Office
Room 058, Oettingenstr. 67
E-Mail
lastname@sosy.ifi.lmu.de (You need to replace lastname.)
ORCID
0000-0002-5890-4673

GPG-Key

Please send me encrypted mails!
My GPG key: Public Key
Fingerprint: 78ED 2ABF EBA8 80FD 93DD B385 6C09 F182 7936 0D27

Short Biography

In 2012, Marie-Christine Jakobs finished her studies in computer science at Paderborn University. She received her doctor's degree in 2017 from Paderborn University. From 2017 to 2019 she worked as postdoctoral researcher at LMU Munich. From 2019-2023 Marie-Christine Jakobs was an assistant professor at TU Darmstadt. Since 2023 she has been a professor for at Ludwig-Maximilians-Universität München.

Open Positions

We are currently hiring a research assistant, either on the level of a PhD student or a postdoctoral researcher for our DFG project ReVeriX. As a researcher in this project, you will work on innovative research questions with respect to efficient and flexible reverification of modified programs to safeguard modifed programs against property violations.
For further details consider the job posting below.

Research

Marie-Christine Jakobs main research is in the area of formal methods for verification of software. She is particularly interested in automatic software verification with a focus on static analysis, model checking, and testing. She is interested in theoretical foundations as well as tool development. Since 2012 she has been contributing to the software analysis tool CPAchecker. Her current research interests include incremental verification, verifier combinations, test-case generation, functional equivalence checking, and the validation of verification results.

Current Research Interests

  • Incremental verification To increase the efficiency of reverifying a program after it has been changed, we want to utilize knowledge gained while verifying previous versions of the program. On the one hand, we aim to speed up reverification by reusing information discovered during the verification of an earlier version of the program. We are in particular interested to reuse information from different verifiers. On the other hand, we develop techniques that focus reverification on paths affected by the modification.
  • Verifier combinations Verifiers have different strength and weaknesses. Therefore, we aim to combine different verifiers to get better verifiers.
  • Test-case generation We use verification techniques to automatically generate test cases. More concretely, our approaches check the reachability of all test goals and output test cases for all goals that are proved reachable. Thereby, we focus on combinations of different verification approaches.
  • Functional equivalence checking In this area, we develop methods to check whether original and modified program are functional equivalent. Our main focus is on techniques to encode functional equivalence into verification tasks that can be efficiently solved.
  • Validation of verification results In this area, we work on formats that allow verifiers to provide evidence for their results and on approaches to independently check a verifier's results based on the provided evidence.

Publications

All my publications are available at DBLP and Google Scholar.

Funded Projects (Current and Past)

Currently supervised PhD students

Software Projects

Teaching

Thesis Topics and Projects

Typically, we offer thesis topics in the area of our current research. Below we list currently available topics. Further topics can be discussed on request. All topics offered by our chair are available on our teaching page.

Available topics
Value Analysis with Initial Precision from YML Correctness Witness [1, 2, 3, 4]

Value analysis [1] is one verification technique in the software analysis framework CPAchecker [2], which tracks the concrete values of certain variables that are determined by its precision. So far, value analysis may create an empty initial precision or an initial precision from an existing value or predicate precision, which in both cases is the result of a previous analysis run. This restricts the reuse of information in the value analysis. The goal of this thesis is to extend the ability of CPAchecker's value analysis to reuse information from previous analysis runs by using information provided in YML correctness witnesses [3]. The idea is to derive from the invariants in the witness the variables that should be tracked and add them to the precision. In addition to considering variables directly occurring in the witness' invariant, further variables might be derived similar to [4]. An experimental evaluation should evaluate the value of generating initial precisions from witnesses. The evaluation should at least compare initial precision from witnesses generated by the value analysis with initial precision from value precision in context of the reverification of changed programs and optionally of the same program. The evaluation may be extended by a comparison with an initial empty precision as well as a precision from predicate analysis, an initial precision from witnesses generated from other analyses like CPAchecker's predicate analysis, k-induction, or even generated by another tool. Similar to [4], the evaluation may further evaluate the approach in context of cooperative verification.

Parallel Test-Case Generation via Test-Goal Splitting [1, 2, 3, 4]

Test-case generation based on reachability analyses like CoVeriTest [1] check the reachability of each test goal, e.g., each branch of a program, and generate a test case whenever the check succeeds. The goal of this thesis is to develop an approach to parallelize such test-case generation approaches by splitting the set of test goals into disjoint subsets and then running multiple instances of the test-case generation in parallel, each instance focusing on one particular subset of the test goals. The main conceptual part of the thesis is the development of one or more split strategies. Ideally the strategies should consider dependencies, e.g., dominator information or control-dependencies [2], or other structural relations between test goals. In addition, a baseline strategy, e.g., random splitting, should be provided. Since a test case may cover more than the test goal it is generated for, the parallelization approach should also support to exchange information about covered test goals to avoid generating test cases for already covered test goals. Ideally, the developed approach would support load balancing, i.e., the redistribution of uncovered test goals from one instance of the test-case generation approach to another one. The developed approach must be integrated into CPAchecker's [3] test-case generation and experimentally evaluated on the benchmarks of the international testing competition (Test-Comp) [4]. The evaluation must compare the approach to sequential test-case generation and compare the different strategies.

A Range To Condition Converter [1, 2, 3]

Ranged program analysis [1] and Conditional Model Checking [2] are two different techniques to restrict an analysis to examine a subset of a program's behavior. In ranged program analysis, all paths are ordered and a subset of program paths is described by a range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. In conditional model checking, the subset of paths is described by a an automaton like format, the condition, which accepts all program paths except for the paths in the subset. To let ranged program analysis and conditional model checking cooperate, we need to be able to transform between the two formats that describe subpaths.
This thesis should develop a transformation from ranges to conditions. The goal of the this is develop a transformation, formally show that it is sound, i.e., the transformed condition describes the same paths as the original range. If this does not work, it should at least contain all paths of the original range. In addition, the transformation should be realized in the software analysis tool CPAchecker [3] and it should be evaluated as part of a combination of ranged program analysis and conditional model checking.

A Condition To Range Converter [1, 2, 3]

Ranged program analysis [1] and Conditional Model Checking [2] are two different techniques to restrict an analysis to examine a subset of a program's behavior. In ranged program analysis, all paths are ordered and a subset of program paths is described by a range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. In conditional model checking, the subset of paths is described by a an automaton like format, the condition, which accepts all program paths except for the paths in the subset. To let ranged program analysis and conditional model checking cooperate, we need to be able to transform between the two formats that describe subpaths.
This thesis should develop a transformation from conditions to ranges. The goal of the this is develop a transformation that ideally converts the condition into one range, but may also generate multiple ranges, formally show that it is sound, i.e., the transformed range(s) describe(s) the same paths as the original condition. If this does not work, it/they should at least contain all paths of the original condition. In addition, the transformation should be realized in the software analysis tool CPAchecker [3] and it should be evaluated as part of a combination of conditional model checking and ranged program analysis, e.g., in form of a sequential combination.

Currently assigned topics
Detecting Collaterally Covered Goals with CPAchecker's Value Analysis [1, 2, 3, 4]

Software testing is a standard means in quality assurance supported e.g., by automatic tools for test input generation. Several of these automatic tools generate test suites aiming at maximizing the value wrt. a coverage criterion like branch coverage. One particular class of tools are test-case generators based on model checkers like CPAchecker [1,2,3]. These tools derive test goals from the coverage criterion (e.g., the set of branches in a program) and show the reachability of each goal to derive tests. Often, one generated test can reach more than the test goal it was created for. While it is easy to detect other test goals that were visited before reaching the test goal of interest and model-checking based test-case generators often would have covered those test goals with a previous test anyways, other test goals visited after reaching the test goal need to be explicitly determined. The goal of this thesis is to apply CPAchecker's value analysis [2] to determine test goals reached after the test goal of interest with the test inputs generated from showing the reachability of that test goal and integrate this procedure into CPAchecker's test-case-generation algorithm. Ideally, the value analysis should start after the test goal of interest, but could also start from the beginning with the test. In both cases, the value analysis needs to be configured to follow the path induced by the generated test and the visited test goals must be extracted from the path and marked as covered. In the former case, the initial state for the value analysis needs to be derived from the path reaching the test goal and the analysis needs to stop if a new input is required or the analysis cannot precisely determine a branch direction. In the latter case, the value analysis needs to read the test inputs and the analysis needs to stop if a new input is required but no further input is provided by the test or the analysis cannot precisely determine a branch direction. The tasks of the thesis is to design a concept for the detection approach and its integration into CPAchecker's test-case-generation procedure. The designed approach should be implemented in CPAchecker. Also, its efficiency and effectiveness should be evaluated. For example, one could compare the CPU time, number of covered test goals, and number of generated tests using one or more of CPAchecker's test-case generation configurations with and without the detection approach on the Test-Comp benchmark [4].

Assessing and Improving the Quality of the Test-Comp Subset of the SV Benchmark [1, 2]

To evaluate the test generation tools participating in the International Competition on Testing, the validator executes the generated tests and measures coverage or reachability of a particular method call. For a reliable validation, i.e., test execution, all variables in a test task (C program plus coverage property) need to be properly initialized before influencing the observable program behavior (i.e., program control flow our output) or simplified before read for the first time. Per rule definition, variables should be initialized with __VERIFIER_nondet_XXX calls. In a first step, this thesis shall analyze the test tasks of the competition (a subset of the software verification competition benchmark [1]) for both categories coverage branches and coverage error-call to detect uninitialized variables usages. To this end, one may compile the programs with clang, gcc, etc. using particular compiler flags for uninitialized variables, apply a static analyzer, or run the validator with (a subset of the) tests generated in the last edition of the competition and dynamically observe uninitialized variable usages for instance with Valgrind [2]. Before one decides for one or multiple of these techniques, one should compare what kind of uninitialized variables the approaches detect, e.g., only uninitialized on some paths, uninitialized array entry, uninitialized memory, etc. In a second step, the uninitialized variable usages should be fixed and the fixes should be included into the software verification competition benchmark via a Merge Request.

Using Correctness Witnesses for the Reverification with k-Induction [1, 2, 3]

After a program change, the program needs to be reverified. Reverifying the program from scratch can be inefficient and incremental verification techniques try to increase the performance of reverification by reusing information from a previous verification. The goal of this thesis is to propose a solution to use correctness witnesses (in GraphML or in YAML format [1]) to make reverification with k-Induction [2] more efficient. k-Induction already uses correctness witnesses for validation, i.e., reverification of the same instead of a changed program. However, for reverification of a changed program one needs to deal with the changes. Due to the change, the witness information cannot directly be mapped to the program, e.g., because loop locations changed. Therefore, the thesis needs to consider where to map the information from the witness, e.g., one could apply loop invariants to all loops of the function or the closest loop. The developed solution should be implemented in the software analysis framework CPAchecker [3]. In addition, an evaluation should study whether the use of correctness witnesses can improve the efficiency or effectiveness of k-induction. The evaluation may use correctness witnesses constructed by k-Induction or other analyses.

Incremental Ranged Analysis via Range Reuse [1, 2, 3]

Ranged program analysis [1] is a technique to split the analysis of a program into different parts. The idea is split the program's state space, i.e., its execution paths, into separate parts and analyse each part separately, e.g., in parallel. To describe the different parts, ranged program analysis orders the paths and describes a part by a so-called range. A range is characterized by a lower and upper path that are typically described by a test input and describes all paths that are in order between lower and upper bound. The goal of this thesis, is to make use of the ranges when reverifying a program after change. In its simplest form, reverification should reuse the ranges and simply skip range computation. To this end, ranges need to be stored and read if this is not provided so far. Next to this simple approach, a more complicated approach identifies those ranges that contain modified execution paths, e.g., using the idea of difference verification with conditions [2] and restrict reverification of those ranges. An even more sophisticated approach aims to restrict the ranges such that they focus on modified paths, e.g., strip outer non-modified and possibly split ranges. The different approaches for incremental ranged analysis should be implemented in CPAchecker [3]compared against standard ranged program analysis. A master thesis ideally also looks into theoretical soundness aspects of the approach.

Condition Reuse for Incremental Conditional Model Checking [1, 2, 3]

Conditional Model Checking [1] is a technique to split the analysis of a program into several analyses that each analyze parts of the program paths. Thereby, the part an analyzer should explore is encoded into a condition, an automata like format, which accepts all program paths except for the paths that should be analyzed. The goal of this thesis, is to make use of the ranges when reverifying a program after change. The challenge is to ensure that also the modified program paths are analyzed, i.e., are not excluded from analysis. To guarantee this property, existing conditions might need to be updated or new conditions for modified paths that are not covered by the existing conditions must be defined.
In its simplest form, reverification should adapt the set of conditions to include also the modified behavior. Next to this simple approach, a more complicated approach identifies those conditions that contain only non-modified execution paths, e.g., using the idea of difference verification with conditions [2], and excludes them from reverification. An even more sophisticated approach aims to restrict the conditions such that they focus on modified paths, e.g., use a product composition of the condition from difference verification [2] and the original condition. The different approaches for incremental conditional model checking should be implemented in CPAchecker [3] and compared against standard conditional model checking. Also, the soundness of the approach must be shown, i.e., it must be formally proven that all modified paths are verified

Non-termination analysis in CPAchecker
Finished topics
Test Algorithm Selection via Boolean Features [1, 2, 3]

Automatic test-generation approaches have different strengths. To achieve better results, e.g., higher branch coverage, distinct test-generation approaches should be combined. One method to combine different approaches is algorithm selection, i.e., a selector chooses from a given set of approaches a most suitable approach for the particular test-case generation task (program). This method has already been successfully applied for software verification, e.g., [1]. The goal of this thesis is to apply algorithm selection to the verification-based test-case generation approaches available in the software analysis framework CPAchecker [2]. To this end, the thesis should first compare the performance of different analyses like Predicate Abstraction, Value Analysis, BMC, and symbolic execution with respect to covered test goals on the software verification competition benchmark [3] (in particular its test tasks). Thereby, it should be analyzed how well the features from [1] are suited to determine the best analysis per test task and study , whether there exist program features that would allow for a better determination. In a second step, the determined selection based on the identified features should be implemented in CPAchecker [2], e.g., as an extension of the SelectionAlgorithm together with an appropriate configuration. Ideally, the realized selection procedure is evaluated on the benchmark of the International Competition on Software Testing.

Selection of Time Limits for CoVeriTest via Boolean Features [1, 2, 3, 4]

Automatic test-generation approaches have different strengths. To achieve better results, e.g., higher branch coverage, distinct test-generation approaches should be combined. One method to combine different approaches is the CoVeriTest [1] approach that interleaves different analysis for test-case generation providing each analysis with an individual time limit for each iteration. In this thesis, we want to use the principle of algorithm selection to choose the time limits per task. Algorithm selection has already been successfully applied for software verification, e.g., [2]. Similar to [2], the goal of this thesis is to use program features to define a selector for the time limits. As a starting point, it should be analyzed how well the features from [2] are suited to select the time limits and whether there exist program features that would allow for a better selection. In a second step, the determined selection based on the identified features should be integrated into CoVeriTest, which is implemented in CPAchecker [2]. Ideally, the realized selection procedure is evaluated on the benchmark of the International Competition on Software Testing.

Regular and Current Courses

  • Software Testing
  • Verification of Parallel Programs
  • Einführung in die Informatik: Programmierung und Softwareentwicklung für Nebenfach
  • Bachelor and master seminars in the area of automatic software verification
  • Oberseminar Zuverlässige Software
Please look up the courses offered in the current term on our teaching page.

Regular Courses Taught in the Past

  • Automatic Software Verification
  • Seminar Advanced Approaches in Software Verification