Google Summer of Code at Software and Computational Systems Lab

Google Summer of Code 2025

Ideas

Instructions

Ideas

Below you can find a list of selected project ideas for GSoC, categorized according to our main development projects. If you are searching for more topics or want to propose your own, there is a list with more ideas available. You can also take a look into the tickets for the specific projects on the corresponding repository websites. The assigned mentors listed below are not fixed yet and might be changed.
If you have any questions, please contact us!

CPAchecker (website)

CPAchecker is an award-winning open-source framework for software verification. It is written in Java and based on a highly modular architecture that allows to develop and combine a wide range of different analyses. CPAchecker is used for verification of the Linux kernel and has helped to find hundreds of bugs in the kernel. To get started with CPAchecker you can take a look at its Tutorial Paper.

Exporting and Validating Correctness Witnesses for Memory Safety in CPAchecker

Automatic Software Verifiers like CPAchecker, while great at finding bugs may also contain some bugs themselves. In order to increase the confidence in the results of the verifiers, they export an argument for their result in a machine-readable format called a witness. A witness can be validated independently by another tool, increasing the confidence in the result of the tool which exported them. Currently CPAchecker does not export relevant information when it finds a proof about memory safety properties. As a GSoC student, you will have the opportunity to extend CPAchecker by a new witness export to provide relevant information about memory safety proofs. Additionally you will be implementing the validation of these witnesses inside CPAchecker in order to evaluate the correctness of the exported witnesses. Your work will involve learning about the existing analyses for memory safety, the witness export and the witness validation in CPAchecker in order to develop the required components.

Expected outcome: An prototype for the export and validation of correctness witnesses for memory safety in CPAchecker.

Requirements: Programming in Java, reading C code, basic understanding of software verification

Difficulty: intermediate

Project size: large (~350 hours)

Mentor: Marian Lingsch-Rosenfeld

Infrastructure for Software Verification

Automated software verifiers are powerful tools to find bugs in programs, but they are often complex and difficult to use. Our team aims to simplify this and make it easy to integrate software verifiers into the software-development process.

Design and Implementation of Behavior-driven Tests

CPA-Daemon is a gRPC-based microservice that enables users to run the award-winning verifier CPAchecker in a cloud setup. It focuses on ease of use and continuous verification.

You will improve the maintainability and reliability of our tests in CPA-Daemon. We plan to do this in two dimensions: First, you improve the maintainability of our tests through behavior-driven tests with Cucumber. You analyze the existing tests (written with plain JUnit), design an improved test architecture on the foundation of Cucumber, and implement that.

Issue related to this project: CPA-Daemon#33

Expected outcome: A running integration-test suite for CPA-Daemon based on Cucumber that reaches the same or more branch coverage than the existing integration tests that are based on basic JUnit. The test suite lists features and test cases with a granularity similar to the existing tests.

Requirements: Experience in Java and on the command line. Strong knowledge on the concepts of microservices.

Project size: small (90 hours)

Mentor: Thomas Lemberger

Difficulty: easy

Chat room for communication at Matrix: #sosylab-gsoc:matrix.org
Design and Implementation of a Web Application for the Visualization of Analysis Work in Distributed Summary Synthesis

Distributed Summary Synthesis (DSS) is a technique to distributed an expensive software verification task in the cloud. DSS decomposes the program-under-analysis into individual blocks, and then analyzes each block separately and in parallel. Whenever a block analysis is finished, the produced result is communicated to strengthen the analysis of the other blocks. DSS uses a microservice architecture with a central coordinator service that composes individual block analyses from the available results. Worker services are then responsible to handle individual block analyses. The produced result can be post-conditions or violation conditions: A post-condition is produced when the analysis proves the block safe (under the currently available information). It represents all possible program states that are possible at the end of the block. A violation condition is produced when the analysis finds a potential error (under the currently available information). It represents a condition that, if it holds at the block entry, leads to an error. Conditions are communicated as logical formulas in the SMT-Lib format. (Scientific paper on the concept of DSS).

You will implement a new web application that visualizes the work done by the worker services in DSS. The coordinator already uses a database to store basic information about each block analysis, like the start-time and end-time, and the produced messages. Your web application will connect to this database and render the performed analyses (for a single verification run) in a graph-based visualization. The user should be able to toggle different layers of visualization (for example a visualization of the individual run times, or a visualization of the code-block size that was analyzed). The visualization should provide a high-level overview of the analysis work, and enable a drill-down into the statistics of individual runs. A prototype of the microservice is implemented here. The final implementation should be in Javascript (with next.js and react).

Expected outcomes:
1. Implementation of the described web application
2. Automated integration tests for the backend of the web application
3. Extensive documentation
Issue related to this project: Distributed Summary Synthesis#30

Requirements: Experience in Javascript (next.js/react), foundational knowledge of formal software verification.

Project size: large (350 hours)

Mentor: Akshay Warrier and Thomas Lemberger

Difficulty: medium

Chat room for communication at Matrix: #sosylab-gsoc:matrix.org
Design and Implementation of a Microservice for AI-based, Formal Code-Summary Synthesis

Distributed Summary Synthesis (DSS) is a technique to distributed an expensive software verification task in the cloud. DSS decomposes the program-under-analysis into individual blocks, and then analyzes each block separately and in parallel. Whenever a block analysis is finished, the produced result is communicated to strengthen the analysis of the other blocks. DSS uses a microservice architecture with a central coordinator service that composes individual block analyses from the available results. Worker services are then responsible to handle individual block analyses. The produced result can be post-conditions or violation conditions: A post-condition is produced when the analysis proves the block safe (under the currently available information). It represents all possible program states that are possible at the end of the block. A violation condition is produced when the analysis finds a potential error (under the currently available information). It represents a condition that, if it holds at the block entry, leads to an error. Conditions are communicated as logical formulas in the SMT-Lib format. (Scientific paper on the concept of DSS).

You will implement a new worker service that uses LLMs to produce post-conditions or violation conditions for a given code block. The worker service receives a C code block, a program specification, a set of pre-conditions that are known to hold at the block entry (in SMT-Lib), and violation conditions that are known at the block exit. The worker service then uses LLMs to produce a new post-condition or violation condition. Before sending the result back to the coordinator, the worker performs some validity checks on the expected properties of the logical formulas to increase the chance of correctness. For communication, the worker uses gRPC. It must be implemented in Python.

Expected outcomes:
1. Implementation of the described Python microservice.
2. Automated integration tests for the microservice
3. Extensive documentation
Requirements: Experience in Python. Strong knowledge on the concepts of microservices.

Project size: large (350 hours)

Mentor: Thomas Lemberger

Difficulty: hard

Chat room for communication at Matrix: #sosylab-gsoc:matrix.org
Finding initial Predicates for Predicate Analysis using AI

To proof a program correct, usually the most difficult task is to come up with invariants which show its correctness. Nowadays there are multiple approaches which try to find invariants using AI, for example Code2Inv, Neural Termination, using LLMs, refining invariants using LLMs, between many others. While invariants are very useful tools to proof software correct, it is sometimes difficult to find them, since they need to fulfill strict correctness criteria. Another very successful approach in software verification is predicate analysis. In contrast to invariants, the predicates used by the analysis to proof the program correct, do not need to fulfill strict correctness criteria to be useful. But still coming up with good predicates is very difficult, in particular since good predicates are invariants. As a GSoC student, you will have the opportunity to implement an open-source tool which using ML techniques provides initial predicates for CPAchecker to improve its performance.

Expected outcome: An open-source tool which takes a program and uses machine learning techniques to synthesize candidates for initial predicates encoded as C-Expressions.

Requirements: Programming in Python, reading C code, basic understanding of software verification and ML

Difficulty: intermediate

Project size: large (~350 hours)

Mentor: Marian Lingsch-Rosenfeld
Verifier Selection using LLMs

With the rise of LLMs, there have been large advances in code summarization, coding assistance, between many others. In particular it has now become possible to get good low dimensional embeddings of code, for example NV-Embed or LLM2Vec. These embeddings offer the possibility to improve the selection of best performing verifiers for a task. Current approaches, like Pesco or Graves usually either compute program features manually or train a custom ML algorithm for the selection. As a GSoC student, you will have the opportunity to create your own open-source tool which uses LLMs to encode the task passed to a verifier and predicts which verifier will perform the best on it.

Expected outcome: An open-source tool which takes a program and returns the verifier which is likely to perform the best on the given verification task.

Requirements: Programming in Python, reading C code, basic understanding of software verification and LLMs

Difficulty: intermediate

Project size: large (~350 hours)

Mentor: Marian Lingsch-Rosenfeld
Reimplementing MetaVal in MetaVal++

MetaVal is a tool and approach to validate witnesses using verifiers by instrumenting the program with the witness. Its current implementation uses deprecated technology and a re-implementation is necessary. In particular the approach could only handle witnesses in version 1.0, while the current standard for witnesses is version 2.0. As a GSoC student, you will have the opportunity to reimplement MetaVal in a new tool called MetaVal++ for witnesses in version 2.0.

Expected outcome: An implementation of the MetaVal algorithm adapted to witnesses version 2.0 in MetaVal++.

Requirements: Programming in Python, reading C code, basic understanding of software verification

Difficulty: intermediate

Project size: Medium (~175 hours)

Mentor: Marian Lingsch-Rosenfeld

BenchExec (website)

BenchExec is a benchmarking framework for Linux (written in Python) that is aimed at a high reliability of the results. It can measure the CPU-time and peak memory usage of whole groups of processes. To do so, it makes use of modern Linux features such as cgroups and namespaces, effectively creating a benchmarking container whose resource usage is measured. The concepts and architecture of BenchExec are described in a paper (open access).

Timestamps in logs

During benchmarking runs, BenchExec collects the output of the benchmarked tool into files. In some use cases, it would be beneficial if each produced line would be prefixed with a timestamp. For this we need to pipe the tool output through the BenchExec process instead of writing it directly (GitHub issue). While implementing this it might be possible to solve some related issues as well: #408, #535, and #536.

Requirements: Python, Linux

Skill level: intermediate to advanced

Mentor: Philipp Wendler
Project size: small (90h) or medium (175h) depending on whether the additional issues are worked on

Instructions for Contributors

Prospective contributors wishing to participate in Google Summer of Code must realize that this is an important professional opportunity. You will contribute code for an award-winning tool chain or parts of its infrastructure. Therefore, we seek contributors who are committed to helping our tools long-term and are willing to both do quality work, and be proactive in communication with their mentors.

You don't have to be a proven developer - in fact, this whole program is meant to facilitate joining our group and take a look at open source communities. However, experience in coding is welcome and should be mentioned in your proposal.

You should take a look at the tools that you plan to work on before the start date. The timeline from Google reserves a lot of time for bonding periods; use that time wisely. Good communication is important. The group members are available via mail (or online/in-person meeting, if needed). You should communicate with your mentor, and formally report progress and plans weekly.

Recommended steps

Read Google's instructions for participating and the GSoC Contributor Manual.
Take a look at the list of ideas.
Get familiar with the project you are interested in, e.g., by trying it out, reading the documentation, etc. If you already see something that can be improved, file a pull request!
Contact the developers of the project of your choice, e.g., via the mailing list or the mentor, introduce yourself, and meet your fellow developers.
Come up with a topic for a GSoC project that you're interested in.
Write a first draft proposal and get someone to review it.
Submit your proposal using Google's web interface ahead of the deadline.

Contributor Proposal Guidelines

Proposals are the basis of the GSoC projects. Write a clear proposal on what you plan to do, the scope of your work, and why we should choose you to do it. The proposals are the basis of our decisions of which contributor to choose.

Contributors can use the following application template:

Introduction and Background: You can either choose a problem from the list of ideas or state a new problem for our group. Before offering the solution (your GSoC project), you should first define the problem. What is the current state of things? What is the issue you wish to solve and why? What is your solution?
Project Goals: Propose a clear list of deliverables, explaining exactly what you promise to do and what you do not plan to do.
Implementation: Describe what you plan to do as a solution for the problem you defined above. Include technical details, showing that you understand the technology.
Timeline: Show that you understand the problem and have a solution. Divide the solution into manageable parts (estimated work items, milestones) and describe a detailed realistic plan on how to accomplish your goal. Include time for searching and fixing bugs, and communication. Do not promise what you cannot keep.
Other commitments during GSoC, such as a job, vacation, and exams should also be mentioned. GSoC should be treated like a full-time job, and we will expect approximately 30 hours of work per week. Explain how you will work around conflicts.
Describe your plans for communication. You will need to participate on weekly communication such as detailed email or personal meeting with your mentor or the team.
About Me: Provide your contact information (address, email, phone) and write a few sentences about you and why you think you are the best for this job. Previous contributions to our projects are appreciated. Name people (other developers, students, professors) who can act as a reference for you. Mention your field of study if necessary.
Tell us if you are submitting proposals to other organizations, and whether or not you would choose our group if given the choice.
If your native language is not German or English, you should be able to communicate in one of those languages.

Accepted Contributors

Your primary responsibility is finishing your project under the guidance of your mentors. To do that, you must submit code regularly and stay in frequent communication with your mentor and our team. For the evaluation, you must succeed with communication, coding, and documentation.

Hints

Submit your proposal early! Early submissions will be read and discussed by mentors more eagerly.
Keep it simple! Be concise and precise. Provide a clear, descriptive title.
Know what you are talking about! Do not submit ideas that cannot be accomplished realistically or that are not related to our projects.
Aim wide! You can submit more than one proposal, with different ideas. If you do submit more than one proposal, tell us which of them you would prefer, if more were selected.

Google Summer of Code in Previous Years

To get information about former projects and ideas, you can take a look at the pages from previous years:

Google Summer of Code 2025

Ideas

Instructions

Ideas

CPAchecker (website)

Exporting and Validating Correctness Witnesses for Memory Safety in CPAchecker

Infrastructure for Software Verification

Design and Implementation of Behavior-driven Tests

Design and Implementation of a Web Application for the Visualization of Analysis Work in Distributed Summary Synthesis

Design and Implementation of a Microservice for AI-based, Formal Code-Summary Synthesis

Finding initial Predicates for Predicate Analysis using AI

Verifier Selection using LLMs

Reimplementing MetaVal in MetaVal++

BenchExec (website)

Timestamps in logs

Instructions for Contributors

Recommended steps

Contributor Proposal Guidelines

Accepted Contributors

Hints

Google Summer of Code in Previous Years