Google Summer of Code 2024
Contributors
Zhengyang John Lu, University of Waterloo
- Topic: Adaptive Algorithm Selection for Btor2 Verification Tasks
- Mentor: Po-Chun Chien and Nian-Ze Lee
- Code: Btor2-Select, Btor2-SelectMC
- Final report
Ideas
Below you can find a
list of selected project ideas for GSoC,
categorized according to our main development projects.
If you are searching for more topics or want to propose your own,
there is a list with more ideas available.
You can also take a look into the tickets for the specific projects on the corresponding
repository websites.
The assigned mentors listed below are not fixed yet and might be changed.
If you have any questions, please contact us!
CPAchecker (website)
CPAchecker is an award-winning open-source framework for software verification. It is written in Java and based on a highly modular architecture that allows to develop and combine a wide range of different analyses. CPAchecker is used for verification of the Linux kernel and has helped to find hundreds of bugs in the kernel.
-
Distributed Summary Synthesis with external Invariants
In modern software development, the need for robust and scalable software verification methods has become paramount. To achieve scalability in software verification. We work on a concept called "Distributed Summary Synthesis" (DSS), a novel approach that divides programs into distinct and coherent code blocks with a predecessor/successor relationship. This creates a framework where all blocks can be analyzed in parallel by communicating through messages. Given the current knowledge of valid program states at the block entry (sent by predecessors), the blocks broadcast reachable violations or the safety of the block to their predecessors and successors respectively. The ultimate goal is to find invariants for every block that are sufficient to proof the absence of bugs. Currently, these invariants are synthesized by CPAchecker's internal predicate analysis with CEGAR. However, such invariants are often only found after a bunch of messages are exchanged. Therefore, we want to extend the framework to also accept external invariants. As a GSoC student delving into the realm of DSS, you will have the opportunity to extend DSS in the prominent verification platform CPAchecker. Your work will involve exploring the intricacies of distributed communication among program blocks and investigating the injection of external invariants to the framework. The invariants will be obtained from correctness witnesses of other verifiers or machine learning based invariant synthesizers.
Requirements: Programming in Java, reading C code, basic understanding of software verification and invariants
Difficulty: intermediate
Project size: large (~350 hours)
Mentor: Matthias Kettl and Marian Lingsch-Rosenfeld
-
Generating Error Descriptions with CPAchecker
Finding and fixing bugs is one of the most expensive parts of the software development process. To expedite it, a good description of all inputs which result in the same error can be helpful. To generate them, we will use CPAchecker to find constraints which result in an error when fulfilled, for example all even numbers as input result in an error. This will be done by first finding an error path in the program being analyzed, followed by refining a first guess for a constraint using an approach based on CEGAR. As a GSoC student, you will have the opportunity to extend CPAchecker by a new algorithm to find usefull error descriptions. Your work will involve learning about existing software verification approaches inside CPAchecker and making use of them to develop the new algorithm.
Requirements: Programming in Java, reading C code, basic understanding of software verification
Difficulty: intermediate
Project size: large (~350 hours)
Mentor: Marian Lingsch-Rosenfeld and Matthias Kettl
-
Btor2 Frontend for CPAchecker
Btor2 is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. In this project, we want to implement a new frontend in CPAchecker to parse a Btor2 verification problem and represent it as a control-flow automaton. With this new frontend, CPAchecker be used as a hardware model checker.
Requirements:
- Ability to program in Java
- Basic knowledge in formal verification, including Kripke structure and SAT/SMT solving
- Practical knowledge of graph kernel, support vector machine, decision/boosting tree, and random forest
Difficulty: intermediate
Project size: large (~350 hours)
Mentor: Po-Chun Chien and Nian-Ze Lee
Infrastructure for Software Verification
Automated software verifiers are powerful tools to find bugs in programs, but they are often complex and difficult to use. Our team aims to simplify this and make it easy to integrate software verifiers into the software-development process.
CPA-Daemon is a gRPC-based microservice that enables users to run the award-winning verifier CPAchecker in a cloud setup. It focuses on ease of use and continuous verification. In the scope of GSoC, we want to improve the infrastructure surrounding CPA-Daemon. We offer the following project:
-
Building a Verification Report for Unit Verification
You will build a HTML report for a unit-verification run. Currently, users can execute CPA-Daemon on their project. This verifies individual methods separately, and produces a text output on the command line with the overall results. But often, users want to get more detailed information about a verification run. For example, when a bug was found, users want to take a look at the bug report. And if no bug was found in a method, users may be interested in the reachability of code and the information our analysis collected. To provide this information to users in an easy way, you will build a HTML report that first shows a summary of all verification runs. From this overview, users can get more details about an individual verification run. Inspiration for this is the default JUnit HTML report.
Expected outcome: An integration in our existing verification runner that produces a HTML report as described above. A focus of the report is on usability.
Requirements: Experience in Java or Kotlin, HTML, CSS, basic experience in Javascript.
Project size: medium (175 hours)
Mentor: Thomas Lemberger
Difficulty: easy
-
Design and Implementation of Behavior-driven Tests and Microservice Load Tests
You will improve the maintainability and reliability of our tests in CPA-Daemon. We plan to do this in two dimensions: First, you improve the maintainability of our tests through behavior-driven tests with Cucumber. You analyze the existing tests (written with plain JUnit), design an improved test architecture on the foundation of Cucumber, and implement that. Second, you ensure with k6 that CPA-Daemon performs reliably under load. You design and implement load tests for CPA-Daemon that reflect actual usage scenarios. If time permits, you integrate Prometheus to collect, maintain, and visualize the load-test results.
Expected outcomes:
- A running integration-test suite for CPA-Daemon based on Cucumber that reaches the same or more branch coverage than the existing integration tests that are based on basic JUnit. The test suite lists features and test cases with a granularity similar to the existing tests.
- A performance-test suite for CPA-Daemon based on k6 that shows the performance of CPA-Daemon in common load scenarios.
- If time permits: A visualization of the load-test results.
Requirements: Experience in Java and on the command line. Basic experience in Javascript. Good foundations of microservices.
Project size: medium (175 hours)
Mentor: Thomas Lemberger
Difficulty: easy
Btor2-Cert (website)
Btor2-Cert is a certifying and validating hardware-verification framework using software analyzers. It translates a word-level hardware circuit in the Btor2 format to a behaviorally equivalent C program using Btor2C, verify the C program with a certifying software analyzer, translate the witness (i.e., certificate) produced by the analyzer back to the Btor2 domain, and validate the witness with Btor2-Val. Btor2-Cert offers a convenient way to integrate different backend software analyzers and provides trustworthy verification results accompanied by inspectable witnesses, making it a flexible and powerful framework for hardware verification.
-
Algorithm Selection for Btor2 Verification Tasks using Circuit Structure Information
Btor2 is a bit-precise language that models word-level sequential circuits, and has been adopted by the Hardware Model Checking Competition. Btor2C is a hardware-to-software translator that converts circuits in Btor2 format to C programs. With Btor2C, one could analyze hardware designs with off-the-shelf software verifiers. Now that a large collection of tools from both hardware and software verification communities are at our disposal, we want to combine forces of them via machine learning (ML). The objective is to construct a graph-kernel-based selector that utilizes the information of the Btor2 circuit structure, and train it on a large Btor2 dataset. Given a Btor2 task, the ML-based selector can then predict the best-suited tool/algorithm(s) for solving it.
Requirements:
- Ability to program in Python
- Basic knowledge in formal verification, including Kripke structure and SAT/SMT solving
- Practical knowledge of graph kernel, support vector machine, decision/boosting tree, and random forest
Difficulty: hard
Project size: large (~350 hours)
Mentor: Po-Chun Chien and Nian-Ze Lee
JavaSMT (website)
JavaSMT is a common API layer for accessing various SMT solvers. It was created out of our experience integrating and using different SMT solvers in the CPAchecker project. JavaSMT can express formulas in the theory of integers, rationals, bitvectors, floating-points, and uninterpreted-functions, and supports model generation, interpolation, formula inspection and transformation.
Integrate Solver Independent Quantifier Elimination
There are plenty of SMT solvers that do not support usage and elimination of quantifiers. This project aims to integrate a solver independent quantifier theory, including quantifier elimination. We would also like to have an automated build process, integration tests, documentation, etc. Instead of implementing this functionality on our own, we could use existing projects that offer this functionality. A possible candidate is Ultimate Eliminator. We expect that solvers with no quantifier support can support quantifiers after the addition and plan on evaluating it using files from the SMT competition (SMT-COMP).
Expected size of the project: 175 hours
Difficulty: intermediate - hard
Skill requirements: proficient in Java
Extend our Solver-Independent SMT-LIB2 Parser/Generator with more Theories and Solver Bindings
Our existing solver-independent SMT-LIB2 parser/generator component can be extended with theories not yet supported (e.g. Floating-Points or Strings). Additionally, the bindings can be extended by incorporating new solvers that use these theories, but offer no public API. We would also like to have an automated build process, integration tests, documentation, etc. A possible candidate is Vampire. We expect the new solver to be usable in JavaSMT via the API and plan on evaluating the final implementation against other solvers using CPAchecker.
Expected size of the project: 175 hours
Difficulty: intermediate
Skill requirements: proficient in Java, beginner in C/C++ (depending on the 3rd party projects used)
PJBDD (website)
PJBDD is a multi-threaded Binary Decision Diagram (BDD) library written in Java. It supports concurrent computation as well as parallel operations and automated resource management. BDDs are compact representation for Boolean formulas and can be used for efficient manipulation.
Optimization with reduction rules
There are various optimization techniques that can be applied to decision diagrams. One such example would be BDDs with complemented edges and edge-specified reductionss (CESRBDDs), which enables a BDD to be reduced even more than already implenented techniques such as tagged or zero-suppressed BDDs, while remaining a compact form. The goal of this project is to incorporate these techniques in PJBDD and perform an evaluation with a case study. We expect these techniques to improve the existing DDs in PJBDD in regards to their speed.
Expected size of the project: 175 hours
Difficulty: intermediate - hard
Skill requirements: proficient in Java
Mentor: Daniel Baier with Marian Lingsch
BenchExec (website)
BenchExec is a benchmarking framework for Linux (written in Python) that is aimed at a high reliability of the results. It can measure the CPU-time and peak memory usage of whole groups of processes. To do so, it makes use of modern Linux features such as cgroups and namespaces, effectively creating a benchmarking container whose resource usage is measured. The concepts and architecture of BenchExec are described in a paper (open access).
-
Improvements to generated tables
For the results of benchmarking runs, BenchExec produces interactive HTML tables (example) with filtering of rows, plots, etc. These tables are implemented as a React app and we have several open feature requests and maintenance tasks, such as upgrading or replacing used libraries. A list of open issues is available in our GitHub issue tracker and several of them are especially suited for GSoC.
Requirements: JavaScript, ideally React
Skill level: beginner or intermediate
Mentor: Philipp Wendler
Project size: small (90h) or medium (175h) depending on specific choice of improvements
-
Timestamps in logs
During benchmarking runs, BenchExec collects the output of the benchmarked tool into files. In some use cases, it would be beneficial if each produced line would be prefixed with a timestamp. For this we need to pipe the tool output through the BenchExec process instead of writing it directly (GitHub issue). While implementing this it might be possible to solve some related issues as well.
Requirements: Python, Linux
Skill level: intermediate to advanced
Mentor: Philipp Wendler
Project size: small (90h) or medium (175h) depending on whether the additional issues are worked on
-
Overlay handling with fuse-overlayfs
For benchmarking, BenchExec creates a container with a file-system overlay on top of the system's file system. However, due to a kernel change this is not possible for the root file system anymore (issue). Using fuse-overlayfs as an alternative overlay implementation might also work and might solve this issue. So we want to implement that and evaluate how well it works compared to the kernel-based overlayfs (GitHub issue).
Requirements: Python, Linux
Skill level: advanced
Mentor: Philipp Wendler
Project size: small (90h)
Instructions for Contributors
Prospective contributors wishing to participate in Google Summer of Code must realize that this is an important professional opportunity. You will contribute code for an award-winning tool chain or parts of its infrastructure. Therefore, we seek contributors who are committed to helping our tools long-term and are willing to both do quality work, and be proactive in communication with their mentors.
You don't have to be a proven developer - in fact, this whole program is meant to facilitate joining our group and take a look at open source communities. However, experience in coding is welcome and should be mentioned in your proposal.
You should take a look at the tools that you plan to work on before the start date. The timeline from Google reserves a lot of time for bonding periods; use that time wisely. Good communication is important. The group members are available via mail (or online/in-person meeting, if needed). You should communicate with your mentor, and formally report progress and plans weekly.
Recommended steps
- Read Google's instructions for participating and the GSoC Contributor Manual.
- Take a look at the list of ideas.
- Get familiar with the project you are interested in, e.g., by trying it out, reading the documentation, etc. If you already see something that can be improved, file a pull request!
- Contact the developers of the project of your choice, e.g., via the mailing list or the mentor, introduce yourself, and meet your fellow developers.
- Come up with a topic for a GSoC project that you're interested in.
- Write a first draft proposal and get someone to review it.
- Submit your proposal using Google's web interface ahead of the deadline.
Contributor Proposal Guidelines
Proposals are the basis of the GSoC projects. Write a clear proposal on what you plan to do, the scope of your work, and why we should choose you to do it. The proposals are the basis of our decisions of which contributor to choose.
Contributors can use the following application template:
- Introduction and Background: You can either choose a problem from the list of ideas or state a new problem for our group. Before offering the solution (your GSoC project), you should first define the problem. What is the current state of things? What is the issue you wish to solve and why? What is your solution?
- Project Goals: Propose a clear list of deliverables, explaining exactly what you promise to do and what you do not plan to do.
- Implementation: Describe what you plan to do as a solution for the problem you defined above. Include technical details, showing that you understand the technology.
- Timeline: Show that you understand the problem and have a solution. Divide the solution into manageable parts (estimated work items, milestones) and describe a detailed realistic plan on how to accomplish your goal. Include time for searching and fixing bugs, and communication. Do not promise what you cannot keep. Other commitments during GSoC, such as a job, vacation, and exams should also be mentioned. GSoC should be treated like a full-time job, and we will expect approximately 30 hours of work per week. Explain how you will work around conflicts. Describe your plans for communication. You will need to participate on weekly communication such as detailed email or personal meeting with your mentor or the team.
- About Me: Provide your contact information (address, email, phone) and write a few sentences about you and why you think you are the best for this job. Previous contributions to our projects are appreciated. Name people (other developers, students, professors) who can act as a reference for you. Mention your field of study if necessary. Tell us if you are submitting proposals to other organizations, and whether or not you would choose our group if given the choice. If your native language is not German or English, you should be able to communicate in one of those languages.
Accepted Contributors
Your primary responsibility is finishing your project under the guidance of your mentors. To do that, you must submit code regularly and stay in frequent communication with your mentor and our team. For the evaluation, you must succeed with communication, coding, and documentation.
Hints
- Submit your proposal early! Early submissions will be read and discussed by mentors more eagerly.
- Keep it simple! Be concise and precise. Provide a clear, descriptive title.
- Know what you are talking about! Do not submit ideas that cannot be accomplished realistically or that are not related to our projects.
- Aim wide! You can submit more than one proposal, with different ideas. If you do submit more than one proposal, tell us which of them you would prefer, if more were selected.