ThesisProjects

On this page you will find the available thesis projects at the Software Engineering Division. Please contact the supervisor if you are interested in a topic listed here. Please also take a look at the general rules for master thesis projects at the computer science department.

Master Program

For the slides introducing the Master thesis process at SE, please click HERE.

For the slides introducing the researchers at SE and their topics, please click HERE.

For an example of a thesis proposal accepted at SE, please click HERE.

Template for thesis proposal in Word and PDF.

NOTE: We also encourage thesis projects in close collaboration with companies or coming from your own ideas. However, please secure an academic supervisor before you write a thesis proposal.

Supervisor: Christian Berger:

  • Topics for Bachelor and Master theses are typically related to autonomous driving systems.
  • Managing Large-Scale Vehicle Sensor Recordings in the Cloud (topics: OpenStack, Nvidia GPU; MSc thesis)
  • Systematic Evaluation of Video Compression Algorithms for Large-Scale Vehicle Sensor Recordings (topics: Nvidia GPU, Intel Quick Sync; MSc thesis)
  • Systematic Evaluation of ML-based Video Compression for Large-Scale Vehicle Sensor Recordings (topics: Nvidia GPU, Variational Auto-Encoders; MSc thesis)
  • Automated Construction of Embedded OS for various Hardware Platforms using Docker

Supervisor: Thorsten Berger:

  • thesis together with ABB Research Germany
    • Using Machine Learning for Feature Traceability
  • thesis on Android App evolution
    • we have a dataset of cloned Android games
    • the goal is to systematically integrate cloned apps and to study this process
  • thesis on RoboCode
    • analyze variants of RoboCode robots: https://robocode.sourceforge.io
    • systematically integrate them (perform domain analysis and create a feature model, create target architecture, integrate variants by realizing variation points)
  • thesis together with SUSE Germany and Linux kernel maintainers
    • incorporate a SAT solver (picosat) into the Linux kernel configurator (make xconfig, make menuconfig) to support kernel configuration processes (detect dead features, control visibility of option subtrees, resolve configuration conflicts)
    • more details on the kconfig-sat website
  • thesis together with Danfoss Denmark
    • static analysis of firmware source code to extract configuration constraints and to reverse-engineer/extend a variability model
  • thesis in the context of a novel variation management system (GIT with variability support)
    • intelligent code merge tool which, for instance, can handle code-alignment issues
    • identify side effects of source code using static analysis
    • conceive and implement a feature dashboard, showing developers what features exist, where they reside in the code, and show various metrics about features (e.g., scattering degree, tangling degree, lines of feature code)
  • thesis in the context of mining software repositories
    • study feature annotations in the codebase history of the Mozilla Firefox project
    • study feature ownership in the codebase history of the Linux kernel: Who maintains feature code? What kinds of coordination patterns arise?
    • identify and study merge refactorings in the codebase history of the Linux kernel or another large open-source project
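
To make the "dead feature" detection mentioned in the kernel configurator topic above more concrete: a feature is dead if no valid configuration can enable it. The following Python sketch checks this by brute force over a tiny, made-up feature model; a SAT solver such as picosat would answer the same question without enumerating every assignment. The feature names, constraints, and function names are hypothetical and do not come from the Linux kernel or the kconfig-sat project.

```python
from itertools import product

# Hypothetical features and constraints, for illustration only.
FEATURES = ["NET", "WIFI", "LEGACY_WIFI"]


def is_valid(cfg):
    """Return True if the configuration satisfies all (assumed) constraints."""
    return (
        (not cfg["WIFI"] or cfg["NET"])              # WIFI depends on NET
        and (not cfg["LEGACY_WIFI"] or cfg["WIFI"])  # LEGACY_WIFI depends on WIFI
        and not (cfg["LEGACY_WIFI"] and cfg["NET"])  # LEGACY_WIFI conflicts with NET
    )


def valid_configurations(features, constraint):
    """Yield every assignment of the features that satisfies the constraint."""
    for values in product([False, True], repeat=len(features)):
        cfg = dict(zip(features, values))
        if constraint(cfg):
            yield cfg


def dead_features(features, constraint):
    """Return the features that are enabled in no valid configuration."""
    alive = set()
    for cfg in valid_configurations(features, constraint):
        alive.update(name for name, enabled in cfg.items() if enabled)
    return [name for name in features if name not in alive]


if __name__ == "__main__":
    print(dead_features(FEATURES, is_valid))
```

In this toy model, LEGACY_WIFI transitively requires NET but also conflicts with it, so it can never be enabled. Brute-force enumeration grows exponentially with the number of features, which is why a real configurator would delegate the check to a SAT solver.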

Supervisor: Michel Chaudron:

  • What is the relative value of different disciplines/tasks in software development? Software is sold and hence has commercial value ($$$!). The production of software is knowledge-intensive, in contrast to, e.g., producing clothing or jewellery. This means that the value of software must be due to the thinking and doing of software developers. Is it possible to find out whether there are differences in the value that different tasks/decisions contribute to the development of software? Do requirements, architecting, programming, testing, and deploying all contribute the same value to the final software product?
  • How do contextual factors affect the use, costs, and benefits of model-driven development? Contextual factors include organisational issues: size of team, level of education, geographic distribution, use of outsourcing, ... The study would consist of a literature study and possibly a combination of a survey and interviews at companies. This project can also be done in collaboration with Prof. Andreas Vogelsang in Berlin and with Software Center.
  • 'Documentation/Explanation on Demand': how can we produce (select & abstract) the information about a system that is relevant for a developer for a specific task at hand? In this project, we try to create a proof of concept that can produce 'Documentation on Demand' based on a multitude of available sources of information about the system, such as: source code, documentation, bug reports, test reports, etc. One part of this project is also: how to best present this information and how to integrate this into development tools. We can do this project on a regular PC, or explore how a smart whiteboard can be used as a collaborative tool for program understanding.
  • Study DevOps for MDA / tool-chains for use with model-driven development (collecting data from multiple companies).
  • Analyse the evolution of correspondence (and technical debt) between architecture/design and source code over time.
  • Machine Learning for Software Engineering:
    • how to recognize (and ideally generate) good layouts for UML Class diagrams (based on a large repository of examples).
    • Machine Learning for Traceability: use machine learning to determine which text in software documentation explains which image in that documentation.
  • Automated summarization of software designs (with Rodi Jolak). Which information from designs is key for understanding a system design? This can be an empirical study in industry. How can we identify important (high-level) design concepts from (reverse-engineered) source code?
  • How do software designers explain software designs?
  • Develop best practices for assessing program understanding (esp. for use in scientific experiments). This will involve collecting and reading much literature, and then synthesizing best practices.
  • Also check out ongoing projects on which we can build thesis projects:
  • How do ICT-industries develop/grow/evolve in developing countries?
  • There are several universities/companies abroad with which we can set up an exchange...
  • see B.Sc. projects below (we can reduce the scope to fit M.Sc. timescale)

Supervisor: Francisco de Oliveira Neto:

  • Platforms to support empirical software engineering:
    • Create a platform for automatic experimentation with software test techniques.
    • A formal experiment to investigate disparate software testing techniques.
  • Automatic test optimisation:
    • Automatic test optimisation on continuous integration environment
    • Optimisation of failure exposure rate through automatic test case generation.
    • Model-based Testing (collaboration with Diadrom)
    • Adaptive Test Case Selection for Mechatronic Application (collaboration with Volvo Cars)

Supervisor: Robert Feldt

Supervisor: Gregory Gay

Broadly, I am interested in automation of development and analysis tasks - particularly related to software testing. How can we remove tedious repetitive tasks and help humans work more effectively and efficiently, all while maintaining and increasing the quality of software? For more information, see my website: http://greggay.com

Some specific topics to get you started:

  • Using Machine Learning/Natural Language Processing to incorporate domain knowledge into automated test case generation.
  • Using AI and ML techniques to enhance human developer performance during development and testing.
  • Development of new metrics to guide search-based test generation.
  • Self-adaptive search-based test generation techniques (hyperheuristic search).
  • Design of test oracles that can quantify and account for uncertainty.

Supervisor: Jennifer Horkoff

  • Requirements Engineering & Early Requirements Modeling
  • Software Engineering for Bioinformatics Workflows
    • With F. Gomez, Bioinformatics Core Facility
  • Strategic API Value and Measurements (Software Center Project #26)
    • With Axis, Bosch, Ericsson, Grundfos, & Tetra Pak
  • Large-Scale Agile Requirements Engineering (Software Center Project #27, with Eric Knauss)
    • With Bosch, Ericsson, Grundfos, Siemens, Tetra Pak, Volvo C, & Volvo T
  • Requirements Modeling and Game Development
    • with S. Björk
  • Web-service selection criteria and processes

Supervisor: Regina Hebig:

  • Hybrid Software Processes and Metrics
  • Comprehension for AI systems or DevBots for explaining systems
    • Create a taxonomy and map of approaches for Software visualization and Comprehension that support different Machine Learning/AI technologies
    • Apply and compare different existing approaches to visualizing/comprehending AI-decision & propose an improvement
  • MDE for Machine Learning Systems
    • Develop a small DSL for modelling and generating Machine Learning systems that can be easily integrated into bigger software systems

Supervisor: Rodi Jolak

  • Knowledge Sharing and Distances in Collaborative Software Design (Empirical SE; Knowledge Sharing; Collaboration; Human Aspects): To develop software effectively, a shared software understanding is required. Collaborative software design is one way to capture this shared understanding. Increasingly, in large software engineering projects different distances lead to social barriers between stakeholders. These barriers affect the quantity and quality of knowledge that is shared between stakeholders, thus reducing the quality of the resulting product. While it has been proposed to limit design activities to co-located teams, this might not always be possible or feasible. We think that, despite the technological advances in collaborative design, effective collaboration can only be achieved if we understand how to account for social barriers.
    • Goal: The goal of this project is to study, in-depth, how social barriers affect software design, and how their effects can be reduced.
    • Research Method: Empirical methods, including case studies and experiments.

Supervisor: Eric Knauss

  • Generally interested in case studies in the following areas (especially in combinations of them)
    • Agile
    • Requirements Engineering
    • Continuous Integration, Delivery, Deployment
  • Concrete topics Spring 2018 (Industry contact can be provided):
    • Data requirements for System of Systems (case study on how to define and monitor them)

Supervisor: Philipp Leitner

I supervise master projects in the broad areas of Web and cloud engineering, performance testing, and mining software repositories. Students interested in doing their master's thesis with me are advised to contact me directly, ideally listing their interests in the initial email (what trends and technologies are you excited about and want to know more about? do you have a specific question that you would like to work on as part of your thesis? etc.). I also often advise students on topics of their own design (e.g., if you already have a company and topic that clearly falls under the umbrella outlined above, I will be happy to advise you). You may want to browse my instructions for master students as well.

Supervisor: Patrizio Pelliccione

  • Systematic literature review on lightweight formal methods
  • Automatic construction of models from black-box software artefacts

Supervisor: Riccardo Scandariato:

My research area is security and privacy. Drop by my office if you are interested in a thesis with me. Some examples of available topics are:

  • Security extension for ArchUnit. Creating unit test cases (as assertions) in order to validate the security of a design model.

Supervisors: Jan-Philipp Steghöfer and Salome Maro.

Traceability is "the ability to interrelate any uniquely identifiable software engineering artifacts to any other, maintain required links over time, and use the resulting network to answer questions of both the software product and its development process" (COEST). Traceability is an increasingly important part of software development, especially in safety-critical domains where it is often mandated by certification authorities. All theses will use the open source tool Eclipse Capra as the foundation for the implementation part.

  • Using collaborative features to facilitate traceability maintenance: Proposal
  • A query language for traceability models: Traceability links between software development artifacts accumulate over time and can become quite numerous. In order to use them purposefully, e.g., to create reports and gain an understanding of the development progress, it is therefore necessary to query these links, which are usually available in the form of models. Existing languages like SPARQL are not specialised for the graph structures of these models and do not carry sufficient semantics. The goal of the thesis is therefore to create a DSL for querying traceability information based on a set of scenarios and evaluate the language based on these scenarios and realistic examples.
  • Traceability link quality and improved link suggestions: When working with an evolving system, it is vitally important that all necessary traceability links exist (completeness) and that the ones that do exist add value (correctness). In many cases, developers have a hard time deciding whether a link is (still) correct or whether a link is missing. This thesis should explore conceptual and tool support to address both issues by providing visualisation techniques (e.g., to visualise how often links are used by the thickness of the lines representing the links in a traceability graph) and decision support (e.g., by compiling a list of unlinked artifacts using metrics like centrality to rank them). All suggested solutions will be empirically validated.
  • An experimentation environment for trace link recovery: There are a number of trace link recovery mechanisms that often use some form of information retrieval to match development artifacts with each other and create trace links from these matches. It is often not clear which recovery mechanism works best with a certain data set and it is also not trivial to prepare the data for consumption by such an algorithm. In this thesis, existing code for the use of trace link recovery mechanisms with Eclipse Capra should be extended to provide an experimentation environment that provides semi-automatic data processing and the ability to compare and combine the results of different trace link recovery mechanisms.

Supervisor: Jan-Philipp Steghöfer

  • In-situ traceability: Traceability links are used to connect different development artifacts in different domain-specific languages to each other. These links enable change impact analysis, provide input for certification, and improve program comprehension. At the moment, links are often managed in a separate model, however. That means that when editing an artifact in a DSL, the links are not directly available to the developer. In this thesis, techniques to embed trace links into the editing of DSLs in the Eclipse platform will be developed. For instance, a developer should be able to see requirements that are connected to a class in a UML class diagram in Papyrus. Apart from defining a technical solution to this problem, the solution and its usability will be evaluated in experiments with students and/or practitioners.

Supervisor: Richard Torkar

  • A formal experiment in search-based software testing and mutation testing.




Bachelor Program

Supervisor: Michel Chaudron:

  • Study how the use of software architecture and design documentation affects communication in software development teams
  • Study DevOps for MDA / tool-chains for use with model-driven development.
  • see M.Sc. projects (we can reduce the scope to fit B.Sc. timescale)

Supervisor: Christian Berger:

  • Topics for Bachelor and Master theses are typically related to autonomous driving systems. Proposals (academic & industrial topics) will be continuously announced on Christian Berger's page.
  • If you have an exciting topic or research question in mind, do not hesitate to get in contact with me!

Supervisor: Eric Knauss

  • Extend an RE Tool for Large-Scale Agile System Development with Modeling Capabilities (co-supervision with Grischa)
  • Continuous Deployment for Cars: Demonstrator (potential co-supervision with Håkan Burden)

Supervisor: Rodi Jolak

  • See Master Theses (reduced scope)

Supervisor: Jennifer Horkoff

  • See Master Theses (reduced scope)

Supervisor: Francisco de Oliveira Neto:

  • Developing tools to support the analysis of experiments with software testing techniques
  • Investigating the reproducibility of experiments with similarity-based test case selection
  • Alternative: A reduced version of any of the master thesis proposals (see above)

Supervisor: Gregory Gay:

  • See MS description above, and adjust scope accordingly.
  • Feel free to propose topics related to testing and automation.

Supervisor: Riccardo Scandariato:

  • Module for PlantUML to draw SecDFD (Secure Data Flow Diagrams). In collaboration with Katja Tuma.

Supervisors: Jan-Philipp Steghöfer and Salome Maro:

Traceability is "the ability to interrelate any uniquely identifiable software engineering artifacts to any other, maintain required links over time, and use the resulting network to answer questions of both the software product and its development process" (COEST). Traceability is an increasingly important part of software development, especially in safety-critical domains where it is often mandated by certification authorities. All theses will use the open source tool Eclipse Capra as the foundation for the implementation part.

  • Using traceability links for impact analysis. This thesis will investigate how traceability links can be used to facilitate change impact analysis. See Detailed proposal.
  • Visualising traceability links in context. One issue with the usual traceability visualisations (matrices and graphs) is that the artifacts that are linked to each other are represented outside of their usual context (such as the other model elements the linked elements are linked to). This thesis will investigate how the existing visualisations in Eclipse Capra can be extended to provide this context information. At the same time, novel visualisation techniques like FlexiView are going to be explored.
  • Use automated traceability link discovery to verify manually created links. In many domains, automatically created traceability links are not acceptable since they can introduce erroneous links. In these domains, such as automotive, links are instead created manually. However, manually created traceability links are not guaranteed to be correct. In this thesis, automated techniques are going to be investigated to check the correctness of traceability links. In essence, an automated technique that discovers a manually created link thus verifies it. If the link is not found, however, it could be marked as suspect for further investigation.
  • How bad is a wrong link? Manual creation and maintenance of traceability is labor intensive, especially for large and complex systems. To solve this, researchers have proposed automated approaches to generate traceability links. These automated techniques are promising but they are still not perfect. The links generated can be wrong and some correct links are not generated at all. Therefore, the adoption of these techniques in industry is very low. In this study we want to investigate to what extent wrong links have a negative effect on the development process and product in the end.
  • Automatic creation of links between code elements using commit messages. In agile development, it is customary to include the ticket number in the commit message in order to create a link between the code committed and the ticket. Rath et al. have shown that it is possible to find such links even if the ticket number is missing in the commit message. In this thesis, we want to investigate a related, but slightly different issue: how we can use data from commit messages to derive links between code elements. One possibility is to see all code files that are committed together as related, but this approach only works if we assume that developers really make atomic commits. It would also be possible to use information retrieval approaches on files committed together to determine if there should be a link between them. In addition, instead of linking files, data in commit messages could also be used to, e.g., establish links on the method level. This thesis aims to investigate the different options and evaluate the quality of links found this way.
  • Extract link suggestions from task-focused data from Mylyn. The task-focused interface of Eclipse Mylyn helps developers focus on the parts of the project that are important for the current job. Eclipse Mylyn learns which requirements, code files, tests, and so on are important for a certain task and allows accessing these assets quickly. In this thesis, we want to investigate if this ability can also be used to automatically derive meaningful traceability links and which impact these links have on the quality of a traceability model.
  • Comparison of Traceability Tools. There is a surprising number of commercial and open-source tools available on the market today. Their respective drawbacks and advantages are, however, not clear. Based on preliminary approaches, the goal of this thesis is to first develop a categorisation scheme for traceability tools and then evaluate a selection of the tools based on this scheme. For this purpose, the tool documentation and publicly available material will be complemented with information gathered directly from tool developers and/or users. An additional outcome would be a better understanding of the market forces that shape the feature set of these tools.
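
As a rough illustration of the commit-message topic in the list above, the following Python sketch extracts ticket identifiers from commit messages and treats files that are repeatedly committed together as candidate links. The commit data, the `PROJ-123` ticket format, the `min_support` threshold, and all function names are hypothetical assumptions made for this example; they are not part of Eclipse Capra or of the approach by Rath et al.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed ticket format such as "PROJ-123"; real projects may use another one.
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_ids(message):
    """Return the ticket identifiers mentioned in a commit message."""
    return TICKET_PATTERN.findall(message)


def co_change_counts(commits):
    """Count how often each pair of files was committed together."""
    counts = Counter()
    for commit in commits:
        files = sorted(set(commit["files"]))
        for pair in combinations(files, 2):
            counts[pair] += 1
    return counts


def candidate_links(commits, min_support=2):
    """Return file pairs committed together at least min_support times."""
    counts = co_change_counts(commits)
    return {pair for pair, n in counts.items() if n >= min_support}


# Hypothetical commit history, for illustration only.
COMMITS = [
    {"message": "PROJ-12 fix parser", "files": ["parser.py", "lexer.py"]},
    {"message": "refactor lexer", "files": ["lexer.py", "parser.py", "README.md"]},
    {"message": "PROJ-15 update docs", "files": ["README.md"]},
]

if __name__ == "__main__":
    for commit in COMMITS:
        print(commit["message"], "->", ticket_ids(commit["message"]))
    print(candidate_links(COMMITS))
```

Requiring a pair to co-occur in several commits is one simple way to soften the assumption that every commit is atomic: a single large, mixed commit does not create links on its own.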

Supervisor: Jan-Philipp Steghöfer

  • Automatically evolving trace links when refactoring. Rahimi et al. presented an approach to automatically evolve trace links when refactoring an application. The aim of the thesis is to combine the evolution strategies with the refactoring tools available in the Eclipse IDE and thus support developers in maintaining the consistency of trace links when evolving the system. The results should be implemented using the open source traceability management tool Eclipse Capra and evaluated in experiments with students and/or practitioners.

Supervisor: Hang Yin:

  • Dynamic software update for cyber-physical systems (1-2 students)

Supervisor: Piergiuseppe Mallozzi

  • Topics: Self-driving cars, Machine learning and Artificial Intelligence.
  • Reinforcement Learning applied to autonomous vehicles
  • Runtime monitoring
  • Working on an OpenAI project: https://github.com/openai/universe

Supervisor: Hugo Andrade:

  • Evaluation of software deployment strategies using FPGA (Xilinx Gimme2 board). The goal is to run an experiment on the FPGA board and analyze performance results with respect to different deployment configurations, frameworks and tools.