Top 10 Software Engineer Research Topics for 2024

Software engineering is a dynamic, rapidly changing field that demands a thorough understanding of programming, computer science, and mathematics. As software systems grow more complex, software developers must stay current with industry innovations and trends. Working on software engineering research topics is an important part of staying relevant in the field.

Through research, software engineers can learn about new technologies, approaches, and strategies for developing and maintaining complex software systems, and the range of possible topics is wide. Software engineering research is also vital for improving the functionality, security, and dependability of software systems. It advances the field's state of the art and helps ensure that software engineers can continue to build high-quality, effective software systems.

What are Software Engineer Research Topics?

Software engineer research topics are areas of exploration and study in the rapidly evolving field of software engineering. They include software development approaches, software quality, testing, maintenance, security, machine learning in software engineering, DevOps, and software architecture. Each topic presents distinct problems and opportunities for software engineers to investigate, and each offers room to contribute new technologies, approaches, and strategies for developing and managing complex software systems.

For example, research on agile software development could identify the benefits and drawbacks of agile methodology and develop new techniques for implementing agile practices effectively. Software testing research may explore new testing procedures and tools and assess the efficacy of existing ones. Software quality research may investigate the factors that influence software quality and develop approaches for improving it and reducing faults and errors.

Software metrics are quantitative measures used to assess the quality, maintainability, and performance of software. Research in this area could identify novel measures for evaluating software systems or techniques for using metrics to improve software quality. Continuous integration and deployment (CI/CD) is the practice of integrating code changes into a shared repository and releasing them to production in small, frequent batches. Research here could investigate best practices for establishing CI/CD or develop tools and approaches for automating the entire CI/CD process.
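To make the idea of a software metric concrete, here is a minimal sketch of an approximate cyclomatic complexity count for Python source, built on the standard `ast` module. The function name, the set of node types counted, and the sample snippet are illustrative assumptions; production tools use more complete rules.

```python
import ast

# Node types treated as decision points (a simplifying assumption;
# real tools also count comprehensions, match statements, and more).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes two extra paths.
            decisions += len(node.values) - 1
    return 1 + decisions


# Hypothetical code under measurement.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

complexity = approx_cyclomatic_complexity(SAMPLE)
```

A higher count suggests more independent paths through the code, and therefore more test cases needed to cover them.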

Top Software Engineer Research Topics

In this article we will be going through the following Software Engineer Research Topics:

1. Artificial Intelligence and Software Engineering

Intersections between AI and SE

The creation of AI-powered software engineering tools is one potential research area at the intersection of artificial intelligence (AI) and software engineering. These technologies use AI techniques that include machine learning, natural language processing, and computer vision to help software engineers with a variety of tasks throughout the software development lifecycle. An AI-powered code review tool, for example, may automatically discover potential flaws or security vulnerabilities in code, saving developers a lot of time and lowering the chance of human error. Similarly, an AI-powered testing tool might build test cases and analyze test results automatically to discover areas for improvement. 

Furthermore, AI-powered project management tools may aid in project planning and scheduling, resource allocation, and risk management. AI can also be applied to software maintenance tasks such as automatically discovering and correcting defects or suggesting code refactorings. However, developing such tools presents significant technical and ethical challenges, such as the need for large amounts of high-quality data, the risk of bias in AI algorithms, and the possibility of AI replacing human jobs. Continued research in this area is therefore required to ensure that AI-powered software engineering tools are effective, fair, and responsible.

Knowledge-based Software Engineering

Another study area that overlaps with AI and software engineering is knowledge-based software engineering (KBSE). KBSE entails creating software systems capable of reasoning about knowledge and applying that knowledge to enhance software development processes. The development of knowledge-based systems that can help software engineers in detecting and addressing complicated problems is one example of KBSE in action. To capture domain-specific knowledge, these systems use knowledge representation techniques such as ontologies, and reasoning algorithms such as logic programming or rule-based systems to derive new knowledge from already existing data. 
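As a minimal sketch of the rule-based reasoning mentioned above, the following forward-chaining loop derives new facts from existing ones until nothing more can be inferred. The rule format, fact names, and the code-review scenario are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): if all premises are known, add the conclusion.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical domain knowledge about code changes.
RULES: List[Rule] = [
    (frozenset({"touches_auth_module"}), "security_sensitive"),
    (frozenset({"security_sensitive", "no_unit_tests"}), "needs_manual_review"),
]

derived = forward_chain({"touches_auth_module", "no_unit_tests"}, RULES)
```

Here the second rule only fires because the first one has already added `security_sensitive`, which is the sense in which the system derives new knowledge from existing data.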

KBSE can be used in the context of AI and software engineering to create intelligent systems capable of learning from past experience and applying that information to improve future software development processes. A KBSE system, for example, may generate code based on previous code samples or recommend code snippets based on the requirements of a project. KBSE systems could also improve the precision and efficiency of software testing and debugging by identifying and prioritizing bugs using knowledge-based techniques. As a result, continued research in this area is critical to ensuring that AI-powered software engineering tools are productive, fair, and responsible.

2. Natural Language Processing

Multimodality

Multimodality in Natural Language Processing (NLP) is one of the appealing research ideas for software engineering at the nexus of computer vision, speech recognition, and NLP. The ability of machines to comprehend and generate language from many modalities, such as text, speech, pictures, and video, is referred to as multimodal NLP. The goal of multimodal NLP is to develop systems that can learn from and interpret human communication across several modalities, allowing them to engage with humans in more organic and intuitive ways. 

The building of conversational agents or chatbots that can understand and create responses using several modalities is one example of multimodal NLP in action. These agents can analyze text input, voice input, and visual clues to provide more precise and relevant responses, allowing users to have a more natural and seamless conversational experience. Furthermore, multimodal NLP can be used to enhance language translation systems, allowing them to more accurately and effectively translate text, speech, and visual content.

The development of multimodal NLP systems must take efficiency into account. Because these systems require significant computing power to process and integrate information from multiple modalities, optimizing their efficiency is critical to ensuring that they can operate in real time and give users accurate and timely responses. One way to improve efficiency is to develop algorithms that can evaluate and integrate input from several modalities efficiently.

Overall, efficiency is a critical factor in the design of multimodal NLP systems. Researchers can increase the speed, precision, and scalability of these systems by devising efficient algorithms, pre-processing approaches, and hardware architectures, allowing the systems to run successfully and respond to users in real time.

3. Applications of Data Mining in Software Engineering

Mining Software Engineering Data

Mining software engineering data is a significant research topic that applies data mining techniques to extract insights from the large datasets generated during software development. Its purpose is to uncover patterns, trends, and relationships that can inform software development practices, raise product quality, and make the development process more efficient.

Despite its potential benefits, mining software engineering data faces several obstacles, including data quality, scalability, and data privacy. Continued research is needed to develop more effective data mining techniques and tools, as well as methods for ensuring data privacy and security. By tackling these issues, mining software engineering data can continue to drive improvements in software development practices and overall product quality.

Clustering and Text Mining

Clustering is a data mining approach that is used to group comparable items or data points based on their features or characteristics. Clustering can be used to detect patterns and correlations between different components of software, such as classes, methods, and modules, in the context of software engineering data. 
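The grouping step can be sketched with a small k-means implementation (Lloyd's algorithm) in plain Python. The module metrics below (method count, average complexity) are made-up values, and seeding the centroids with the first `k` points is a simplifying assumption; a library such as scikit-learn would normally be used instead.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Return a cluster label for each point."""
    centroids = list(points[:k])  # naive deterministic seeding (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: _dist2(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Hypothetical per-module metrics: (number of methods, average complexity).
MODULE_METRICS = [(5, 2), (40, 15), (6, 3), (42, 14), (4, 2)]
labels = kmeans(MODULE_METRICS, k=2)
```

With these values the small, simple modules end up in one cluster and the large, complex ones in the other.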

Text mining, on the other hand, is a data mining method used to extract valuable information from unstructured text such as software manuals, code comments, and bug reports. Applied to software engineering data, text mining can reveal patterns and trends in software development processes.
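A minimal sketch of text mining over bug reports is a term-frequency count, shown below using only the standard library. The bug report texts and the stop-word list are invented for illustration, and real pipelines typically add stemming and weighting such as TF-IDF.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Small hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "on", "in", "when", "is", "to", "and", "after", "with"}


def top_terms(reports: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent non-stop-word terms across all reports."""
    counts: Counter = Counter()
    for report in reports:
        tokens = re.findall(r"[a-z]+", report.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical bug report titles.
BUG_REPORTS = [
    "Login page crashes when password is empty",
    "Crash on login after timeout",
    "Timeout when uploading large file",
]

frequent = top_terms(BUG_REPORTS, n=2)
```

Terms that recur across many reports, such as a component name, can point to areas of the system that deserve closer attention.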

4. Data Modeling

Data modeling is an important software engineering research topic, especially in the context of database design and management. It involves developing a conceptual model of the data that a system needs to store, organize, and manage, and establishing the relationships between data elements. One important goal of data modeling research is to make sure that the database schema precisely matches the requirements of the system and its users. This requires working closely with stakeholders to understand their needs and identify the data items that matter most to them.
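As a small illustration of a conceptual data model, the sketch below expresses three entities and their relationships as Python dataclasses. The entities (`Customer`, `Order`, `OrderLine`) and their fields are hypothetical; an actual model would come from stakeholder requirements.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float


@dataclass
class Order:
    order_id: int
    customer: Customer  # many-to-one: each order belongs to one customer
    lines: List[OrderLine] = field(default_factory=list)  # one-to-many

    def total(self) -> float:
        """Sum of quantity times unit price over all order lines."""
        return sum(line.quantity * line.unit_price for line in self.lines)


order = Order(
    order_id=1,
    customer=Customer(customer_id=7, name="Ada"),
    lines=[OrderLine("book", 2, 10.0), OrderLine("pen", 1, 1.5)],
)
```

Writing the model down this way makes the relationships explicit before they are translated into tables and foreign keys.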

5. Verification and Validation

Verification and validation are significant topics in software engineering research because they help ensure that software systems are built correctly and suit the needs of their users. Although the terms are often used interchangeably, they refer to distinct activities in the software development process. Verification is the process of ensuring that a software system meets its specifications and requirements. This involves testing the system to confirm that it behaves as planned and satisfies its functional and performance specifications. Validation, in contrast, is the process of ensuring that a software system fulfils the needs of its users and stakeholders.

This includes confirming that the system serves its intended purpose. Verification and validation are key components of the software development process. By checking that software systems are built correctly and that they satisfy the needs of their users, researchers can help improve the functionality and dependability of software systems, reduce the chance of faults, and ultimately deliver better software products.

6. Software Project Management

Software project management is an important component of software engineering research because it covers the planning, organization, and control of resources and activities needed to finish software projects on time, within budget, and to the required quality standards. One of its key purposes is to ensure that the needs of the project's stakeholders, such as users, clients, and sponsors, are met. This includes defining the project's requirements, scope, and goals, as well as identifying risks and constraints that could affect the project's success.

7. Software Quality

The quality of a software product is defined by how well it conforms to its requirements, how well it performs its intended functions, and how well it meets the needs of its users. It includes attributes such as dependability, usability, maintainability, effectiveness, and security, among others. Software quality is a prominent and essential research topic in software engineering. Researchers are working to provide methodologies, strategies, and tools for evaluating and improving software quality, as well as for predicting and preventing software faults and defects. Overall, software quality research is a large, interdisciplinary field that combines computer science, engineering, and statistics. Its aim is to increase the reliability, accessibility, and overall quality of software products and systems, benefiting both software developers and end users.

8. Ontology

An ontology is a formal specification of a conceptualization of a domain, used in computer science to enable knowledge sharing and reuse. Ontology is a popular and essential area of study in software engineering research. One research topic is the construction of ontologies for specific domains or application areas. For example, a researcher may create an ontology for e-commerce to give software developers and other stakeholders in that domain a common vocabulary and shared knowledge. The integration of several ontologies is another intriguing topic. As the number of ontologies created for various domains and applications grows, there is an increasing need to integrate them to enable interoperability and reuse.
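A very small fragment of such an ontology can be sketched as a class hierarchy with an "is-a" query. The e-commerce concepts below are hypothetical, and the sketch assumes a single superclass per concept and no cycles; real ontologies are usually written in languages such as OWL and cover far richer relations.

```python
from typing import Dict, Set

# Hypothetical e-commerce ontology fragment: concept -> direct superclass.
IS_A: Dict[str, str] = {
    "Ebook": "DigitalProduct",
    "DigitalProduct": "Product",
    "Laptop": "PhysicalProduct",
    "PhysicalProduct": "Product",
}


def ancestors(concept: str) -> Set[str]:
    """Return every superclass of `concept` (assumes the hierarchy has no cycles)."""
    result: Set[str] = set()
    while concept in IS_A:
        concept = IS_A[concept]
        result.add(concept)
    return result


def is_a(concept: str, candidate: str) -> bool:
    """True if `concept` is `candidate` or one of its subclasses."""
    return concept == candidate or candidate in ancestors(concept)


ebook_ancestors = ancestors("Ebook")
```

A shared hierarchy like this lets different tools agree that, for example, every `Ebook` is also a `Product`.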

9. Software Models

In general, a software model acts as an abstract representation of a software system or its components. Software models can be used to help software developers, different stakeholders, and users communicate more effectively, as well as to properly evaluate, design, test, and maintain software systems. The development and evaluation of modeling languages and notations is one research example connected to software models. Researchers, for example, may evaluate the usefulness and efficiency of various modeling languages, such as UML or BPMN, for various software development activities or domains. 

Researchers could also look into using software models for software testing and verification. They may investigate how models might be used to produce test cases or to do model checking, a formal technique for ensuring the correctness of software systems. They may also examine the use of models for monitoring at runtime and software system adaptation.
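To illustrate how a model can produce test cases, the sketch below treats a finite state machine as the model and uses breadth-first search to find a shortest event sequence reaching each state. The login-dialog states and events are hypothetical, and this is only a reachability exploration, not a full model checker.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical model of a login dialog: (state, event) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("logged_out", "submit_valid"): "logged_in",
    ("logged_out", "submit_invalid"): "error",
    ("error", "retry"): "logged_out",
    ("logged_in", "logout"): "logged_out",
}


def shortest_paths(initial: str) -> Dict[str, List[str]]:
    """Map each reachable state to a shortest sequence of events that reaches it."""
    paths: Dict[str, List[str]] = {initial: []}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for (src, event), dst in TRANSITIONS.items():
            if src == state and dst not in paths:
                paths[dst] = paths[state] + [event]
                queue.append(dst)
    return paths


# Each event sequence can serve as the steps of one test case.
test_cases = shortest_paths("logged_out")
```

The same exploration also shows which states are unreachable, which is a simple example of the kind of property a model checker verifies exhaustively.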

SDLC

The Software Development Life Cycle (SDLC) is a software engineering process for planning, designing, developing, testing, and deploying software systems. SDLC is an important research issue in software engineering since it is used by software developers and project managers to manage software projects and ensure the quality of the resulting software products. The development and evaluation of novel software development processes is one SDLC-related research topic. SDLC research also includes the creation and evaluation of different software project management tools and practices.

Researchers may also examine how SDLC is applied in specific sectors or applications. They may, for example, investigate the use of SDLC in the development of safety-critical systems, such as medical equipment or aviation systems, and develop new processes or tools to ensure the safety and reliability of these systems. They may also look into using SDLC to design software systems in newer areas such as the Internet of Things or blockchain technology.

Why is Software Engineering Required?

Software engineering is necessary because it provides a systematic approach to designing, developing, and maintaining reliable, efficient, and scalable software. As software systems have become more complex over time, software engineering has become a vital discipline for ensuring that software meets end-user needs, is reliable, and remains maintainable in the long term.

When the cost of software development is considered, software engineering becomes even more important. Without a disciplined strategy, developing software can result in overinflated costs, delays, and a higher probability of errors that require costly adjustments later. Furthermore, software engineering can help reduce the long-term maintenance costs that occur by ensuring that software is designed to be easy to maintain and modify. This can save money in the long run by lowering the number of resources and time needed to make software changes as needed.

2. Scalability

Scalability is an essential factor in software development, especially for programs that have to manage enormous amounts of data or an increasing number of users. Software engineering provides a foundation for creating scalable software that can evolve over time. The capacity to deploy software to diverse contexts, such as cloud-based platforms or distributed systems, is another facet of scalability. Software engineering can assist in ensuring that software is built to be readily deployed and adjusted for various environments, resulting in increased flexibility and scalability.

3. Large Software

Using software engineering concepts, developers can break large software systems down into smaller, simpler parts. This reduces the software's complexity and makes the system easier to maintain over time. Furthermore, software engineering supports building large software systems in a modular fashion, with each module performing a specific function or set of functions. This makes it easier to add new features or functionality to the product without disrupting the existing codebase.

4. Dynamic Nature

Developers can use software engineering techniques to create dynamic content that is modular and easy to modify when user requirements change. This makes it easier to add new features or functionality to dynamic content without disturbing the existing codebase. Security is another factor to consider for dynamic content. Software engineering can help ensure that dynamic content is generated in a secure manner that protects user data and information.

5. Better Quality Management

Software engineering provides an organized method of quality management in software development. By adhering to software engineering principles, developers can ensure that software is designed, built, and maintained in a way that fulfills quality requirements and provides value to users. Requirements management is one component of quality management in software engineering. Testing and validation are another. By taking an organized approach to testing, developers can check that their software satisfies its requirements and reduce the number of errors that reach users.

In conclusion, software engineering offers a diverse set of research topics with the potential to advance the discipline while improving software development and maintenance practices. This article has explored research topics suitable for software engineering students, including master's students, such as software testing and validation, software security, artificial intelligence, natural language processing, software project management, machine learning, and data mining. Software engineering researchers have an opportunity to explore these and other subjects and contribute to solutions that improve software quality, dependability, security, and scalability.

By staying up to date with the latest research trends and technologies, researchers can make important contributions to software engineering and help tackle some of the most serious difficulties facing software development and maintenance. As software grows more important in business and daily life, the demand for research into new software engineering processes and techniques grows with it. Through their research, software engineering researchers can help shape the future of software creation and maintenance, ensuring that software stays dependable, safe, and efficient in an ever-changing technological context.

Frequently Asked Questions (FAQs)

To find a research topic in software engineering, you can review recent papers and conference proceedings, talk to experts in the field, and evaluate your own interests and experience. You can also combine these approaches.

You should study software development processes, various programming languages and their frameworks, software testing and quality assurance, software architecture, various design patterns that are currently being used, and software project management as a software engineering student. 

Empirical research, experimental research, surveys, case studies, and literature reviews are all types of research in software engineering. Each sort of study has advantages and disadvantages, and the research method chosen is determined by the research objective, resources, and available data. 

Profile

Eshaan Pandey

Eshaan is a Full Stack web developer skilled in MERN stack. He is a quick learner and has the ability to adapt quickly with respect to projects and technologies assigned to him. He has also worked previously on UI/UX web projects and delivered successfully. Eshaan has worked as an SDE Intern at Frazor for a span of 2 months. He has also worked as a Technical Blog Writer at KnowledgeHut upGrad writing articles on various technical topics.

Software Engineering’s Top Topics, Trends, and Researchers

Journal of Software Engineering Research and Development

Metric-centered and technology-independent architectural views for software comprehension

The maintenance of applications is a crucial activity in the software industry. The high cost of this process is due to the effort invested on software comprehension since, in most of cases, there is no up-to-...

Back to the future: origins and directions of the “Agile Manifesto” – views of the originators

In 2001, seventeen professionals set up the manifesto for agile software development. They wanted to define values and basic principles for better software development. On top of being brought into focus, the ...

Investigating the effectiveness of peer code review in distributed software development based on objective and subjective data

Code review is a potential means of improving software quality. To be effective, it depends on different factors, and many have been investigated in the literature to identify the scenarios in which it adds qu...

On the benefits and challenges of using kanban in software engineering: a structured synthesis study

Kanban is increasingly being used in diverse software organizations. There is extensive research regarding its benefits and challenges in Software Engineering, reported in both primary and secondary studies. H...

Challenges on applying genetic improvement in JavaScript using a high-performance computer

Genetic Improvement is an area of Search Based Software Engineering that aims to apply evolutionary computing operators to the software source code to improve it according to one or more quality metrics. This ...

Actor’s social complexity: a proposal for managing the iStar model

Complex systems are inherent to modern society, in which individuals, organizations, and computational elements relate with each other to achieve a predefined purpose, which transcends individual goals. In thi...

Investigating measures for applying statistical process control in software organizations

The growing interest in improving software processes has led organizations to aim for high maturity, where statistical process control (SPC) is required. SPC makes it possible to analyze process behavior, pred...

An approach for applying Test-Driven Development (TDD) in the development of randomized algorithms

TDD is a technique traditionally applied in applications with deterministic algorithms, in which the input and the expected result are known. However, the application of TDD with randomized algorithms have bee...

Supporting governance of mobile application developers from mining and analyzing technical questions in stack overflow

There is a need to improve the direct communication between large organizations that maintain mobile platforms (e.g. Apple, Google, and Microsoft) and third-party developers to solve technical questions that e...

Working software over comprehensive documentation – Rationales of agile teams for artefacts usage

Agile software development (ASD) promotes working software over comprehensive documentation. Still, recent research has shown agile teams to use quite a number of artefacts. Whereas some artefacts may be adopt...

Development as a journey: factors supporting the adoption and use of software frameworks

From the point of view of the software framework owner, attracting new and supporting existing application developers is crucial for the long-term success of the framework. This mixed-methods study explores th...

Applying user-centered techniques to analyze and design a mobile application

Techniques that help in understanding and designing user needs are increasingly being used in Software Engineering to improve the acceptance of applications. Among these techniques we can cite personas, scenar...

A measurement model to analyze the effect of agile enterprise architecture on geographically distributed agile development

Efficient and effective communication (active communication) among stakeholders is thought to be central to agile development. However, in geographically distributed agile development (GDAD) environments, it c...

A survey of search-based refactoring for software maintenance

This survey reviews published materials related to the specific area of Search-Based Software Engineering that concerns software maintenance and, in particular, refactoring. The survey aims to give a comprehen...

Guest editorial foreword for the special issue on automated software testing: trends and evidence

Similarity testing for role-based access control systems.

Access control systems demand rigorous verification and validation approaches, otherwise, they can end up with security breaches. Finite state machines based testing has been successfully applied to RBAC syste...

An algorithm for combinatorial interaction testing: definitions and rigorous evaluations

Combinatorial Interaction Testing (CIT) approaches have drawn attention of the software testing community to generate sets of smaller, efficient, and effective test cases where they have been successful in det...

How diverse is your team? Investigating gender and nationality diversity in GitHub teams

Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a “dream”

Investigating factors that affect the human perception on god class detection: an analysis based on a family of four controlled experiments

Evaluation of design problems in object oriented systems, which we call code smells, is mostly a human-based task. Several studies have investigated the impact of code smells in practice. Studies focusing on h...

On the evaluation of code smells and detection tools

Code smells refer to any symptom in the source code of a program that possibly indicates a deeper problem, hindering software maintenance and evolution. Detection of code smells is challenging for developers a...

On the influence of program constructs on bug localization effectiveness

Software projects often reach hundreds or thousands of files. Therefore, manually searching for code elements that should be changed to fix a failure is a difficult task. Static bug localization techniques pro...

DyeVC: an approach for monitoring and visualizing distributed repositories

Software development using distributed version control systems has become more frequent recently. Such systems bring more flexibility, but also greater complexity to manage and monitor multiple existing reposi...

A genetic algorithm based framework for software effort prediction

Several prediction models have been proposed in the literature using different techniques obtaining different results in different contexts. The need for accurate effort predictions for projects is one of the ...

Elaboration of software requirements documents by means of patterns instantiation

Studies show that problems associated with the requirements specifications are widely recognized for affecting software quality and impacting effectiveness of its development process. The reuse of knowledge ob...

ArchReco: a software tool to assist software design based on context aware recommendations of design patterns

This work describes the design, development and evaluation of a software Prototype, named ArchReco, an educational tool that employs two types of Context-aware Recommendations of Design Patterns, to support us...

On multi-language software development, cross-language links and accompanying tools: a survey of professional software developers

Non-trivial software systems are written using multiple (programming) languages, which are connected by cross-language links. The existence of such links may lead to various problems during software developmen...

SoftCoDeR approach: promoting Software Engineering Academia-Industry partnership using CMD, DSR and ESE

The Academia-Industry partnership has been increasingly encouraged in the software development field. The main focus of the initiatives is driven by the collaborative work where the scientific research work me...

Issues on developing interoperable cloud applications: definitions, concepts, approaches, requirements, characteristics and evaluation models

Among research opportunities in software engineering for cloud computing model, interoperability stands out. We found that the dynamic nature of cloud technologies and the battle for market domination make clo...

Game development software engineering process life cycle: a systematic review

Software game is a kind of application that is used not only for entertainment, but also for serious purposes that can be applicable to different domains such as education, business, and health care. Multidisc...

Correlating automatic static analysis and mutation testing: towards incremental strategies

Traditionally, mutation testing is used as test set generation and/or test evaluation criteria once it is considered a good fault model. This paper uses mutation testing for evaluating an automated static anal...

A multi-objective test data generation approach for mutation testing of feature models

Mutation approaches have been recently applied for feature testing of Software Product Lines (SPLs). The idea is to select products, associated to mutation operators that describe possible faults in the Featur...

An extended global software engineering taxonomy

In Global Software Engineering (GSE), the need for a common terminology and knowledge classification has been identified to facilitate the sharing and combination of knowledge by GSE researchers and practition...

A systematic process for obtaining the behavior of context-sensitive systems

Context-sensitive systems use contextual information in order to adapt to the user’s current needs or requirements failure. Therefore, they need to dynamically adapt their behavior. It is of paramount importan...

Distinguishing extended finite state machine configurations using predicate abstraction

Extended Finite State Machines (EFSMs) provide a powerful model for the derivation of functional tests for software systems and protocols. Many EFSM based testing problems, such as mutation testing, fault diag...

Extending statecharts to model system interactions

Statecharts are diagrams comprised of visual elements that can improve the modeling of reactive system behaviors. They extend conventional state diagrams with the notions of hierarchy, concurrency and communic...

On the relationship of code-anomaly agglomerations and architectural problems

Several projects have been discontinued in the history of the software industry due to the presence of software architecture problems. The identification of such problems in source code is often required in re...

An approach based on feature models and quality criteria for adapting component-based systems

Feature modeling has been widely used in domain engineering for the development and configuration of software product lines. A feature model represents the set of possible products or configurations to apply i...

Patch rejection in Firefox: negative reviews, backouts, and issue reopening

Writing patches to fix bugs or implement new features is an important software development task, as it contributes to raise the quality of a software system. Not all patches are accepted in the first attempt, ...

Investigating probabilistic sampling approaches for large-scale surveys in software engineering

Establishing representative samples for Software Engineering surveys is still considered a challenge. Specialized literature often presents limitations on interpreting surveys’ results, mainly due to the use o...

Characterising the state of the practice in software testing through a TMMi-based process

The software testing phase, despite its importance, is usually compromised by the lack of planning and resources in industry. This can risk the quality of the derived products. The identification of mandatory ...

Self-adaptation by coordination-targeted reconfigurations

A software system is self-adaptive when it is able to dynamically and autonomously respond to changes detected either in its internal components or in its deployment environment. This response is expected to ensu...

Templates for textual use cases of software product lines: results from a systematic mapping study and a controlled experiment

Use case templates can be used to describe functional requirements of a Software Product Line. However, to the best of our knowledge, no efforts have been made to collect and summarize these existing templates...

F3T: a tool to support the F3 approach on the development and reuse of frameworks

Frameworks are used to enhance the quality of applications and the productivity of the development process, since applications may be designed and implemented by reusing framework classes. However, frameworks ...

NextBug: a Bugzilla extension for recommending similar bugs

Due to the characteristics of the maintenance process followed in open source systems, developers are usually overwhelmed with a great amount of bugs. For instance, in 2012, approximately 7,600 bugs/month were...

Assessing the benefits of search-based approaches when designing self-adaptive systems: a controlled experiment

The well-orchestrated use of distilled experience, domain-specific knowledge, and well-informed trade-off decisions is imperative if we are to design effective architectures for complex software-intensive syst...

Revealing influence of model structure and test case profile on the prioritization of test cases in the context of model-based testing

Test case prioritization techniques aim at defining an order of test cases that favors the achievement of a goal during test execution, such as revealing failures as early as possible. A number of techniques ...

A metrics suite for JUnit test code: a multiple case study on open source software

The code of JUnit test cases is commonly used to characterize software testing effort. Different metrics have been proposed in literature to measure various perspectives of the size of JUnit test cases. Unfort...

Designing fault-tolerant SOA based on design diversity

Over recent years, software developers have been evaluating the benefits of both Service-Oriented Architecture (SOA) and software fault tolerance techniques based on design diversity. This is achieved by creat...

Method-level code clone detection through LWH (Light Weight Hybrid) approach

Many researchers have investigated different techniques to automatically detect duplicate code in programs exceeding thousand lines of code. These techniques have limitations in finding either the structural o...

The problem of conceptualization in god class detection: agreement, strategies and decision drivers

The concept of code smells is widespread in Software Engineering. Despite the empirical studies addressing the topic, the set of context-dependent issues that impacts the human perception of what is a code sme...


Software Engineering Institute

Cite this post.

Chicago Citation

Carleton, Anita. "Architecting the Future of Software Engineering: A Research and Development Roadmap." Carnegie Mellon University, Software Engineering Institute's Insights (blog). Carnegie Mellon's Software Engineering Institute, July 12, 2021. https://insights.sei.cmu.edu/blog/architecting-the-future-of-software-engineering-a-research-and-development-roadmap/.

Architecting the Future of Software Engineering: A Research and Development Roadmap

Anita Carleton

Published July 12, 2021, in Software Engineering Research and Development.

This post is coauthored by John Robert, Mark Klein, Doug Schmidt, Forrest Shull, John Foreman, Ipek Ozkaya, Robert Cunningham, Charlie Holland, Erin Harper, and Edward Desautels.

Software is vital to our country’s global competitiveness, innovation, and national security. It also ensures our modern standard of living and enables continued advances in defense, infrastructure, healthcare, commerce, education, and entertainment. As the DoD’s federally funded research and development center (FFRDC) focused on improving the practice of software engineering, the Carnegie Mellon University (CMU) Software Engineering Institute (SEI) is leading the community in creating a multi-year research and development vision and roadmap for engineering next-generation software-reliant systems. This blog post describes that effort.

Software Engineering as Strategic Advantage

In a 2020 National Academy of Science Study on Air Force software sustainment, the U.S. Air Force recognized that “to continue to be a world-class fighting force, it needs to be a world-class software developer.” This concept clearly applies far beyond the Department of Defense. Software systems enable world-class healthcare, commerce, education, energy generation, and more. These systems that run our world are rapidly becoming more data intensive and interconnected, increasingly utilize AI, require larger-scale integration, and must be considerably more resilient. Consequently, significant investment in software engineering R&D is needed now to enable and ensure future capability.

Goals of This Work

The SEI has leveraged its connections with academic institutions and communities, DoD leaders and members of the Defense Industrial Base, and industry innovators and research organizations to:

  • identify future challenges in engineering software-reliant and intelligent systems in emerging, national-priority technical domains, including gaps between current engineering techniques and future domains that will be more reliant on continuous evolution and AI
  • develop a research roadmap that will drive advances in foundational software engineering principles across a range of system types, such as intelligent, safety-critical, and data-intensive systems
  • raise the visibility of software to the point where it receives the sustained recognition commensurate with its importance to national security and competitiveness
  • enable strategic partnerships and collaborations to drive innovation among industry, academia, and government.

Guided by an Advisory Board of U.S. Visionaries and Senior Thought Leaders

To succeed in developing our vision and roadmap for software engineering research and development, it is vital to coordinate the academic, defense, and commercial communities to define an effective agenda and implement impactful results. To help represent the views of all these software engineering constituencies, the SEI formed an advisory board from DoD, industry, academia, research labs, and technology companies to offer guidance. Members of this advisory board include the following:

  • Deb Frincke, advisory board chair, Associate Laboratory Director for National Security Sciences, Oak Ridge National Laboratory
  • Michael McQuade, vice president for research, Carnegie Mellon University
  • Vint Cerf, vice president and chief internet evangelist, Google
  • Penny Compton, vice president for software systems, cyber, and operations, Lockheed Martin Space
  • Tim Dare, deputy director for prototyping and software, Office of the Under Secretary of Defense for Research and Engineering (previous position)
  • Sara Manning Dawson, chief technology officer enterprise security, Microsoft
  • Jeff Dexter, senior director of flight software & cybersecurity, SpaceX
  • Yolanda Gil, president, Association for the Advancement of Artificial Intelligence (AAAI); Director of Knowledge Technologies, Information Sciences Institute at University of Southern California
  • Tim McBride, president, Zoic Studios
  • Nancy Pendleton, vice president and senior chief engineer for mission systems, payloads and sensors, Boeing Defense, Space and Security
  • William Scherlis, director Information Innovation Office, DARPA

In June 2020, the SEI assembled this board to leverage their diverse perspectives and provide strategic advice, influence stakeholders, develop connections, assist in executing the roadmap, and advocate for the use of our results.

Future Systems and Fundamental Shifts in Software Engineering Require New Research Focus

Rapidly deploying software with confidence requires fundamental shifts in software engineering. New types of systems will continue to push beyond the bounds of what current software engineering theories, tools, and practices can support, including (but not limited to):

  • Systems that fuse data at a huge scale, whether for news, entertainment, or intelligence: We will need to continuously mine vast amounts of open-source data streams (e.g., YouTube videos and Twitter feeds) for important information that will in turn drive decision making. This vast stream of data will also drive new ways of constructing systems.
  • Smart cities, buildings, roads, cars, and transport: How will these highly connected systems work together seamlessly? How will we enable safe and affordable transportation and living?
  • Personal digital assistants: How will these assistants learn, adapt, and engage in home and business workflows?
  • Dynamically integrated healthcare: Data from your personal device will be combined with hospital data. How do we meet stringent safety and privacy requirements? How do we evaluate assurance in a highly data-driven environment?
  • Mission-level adaptation for DoD systems: DoD systems will feature mission-level construction of new integrated systems that combine a range of capabilities, such as intel, weapons, and human/machine teaming. The DoD is already moving in this direction, but how can we increase confidence that there will be no unintended consequences?

A Guiding Vision of the Future of Software Engineering

Our guiding vision is one in which the current notion of software development is replaced by the concept of a software pipeline consisting of humans and software as trustworthy collaborators who rapidly evolve systems based on user intent. To achieve this vision, we anticipate the need for not only new development paradigms but also new architectural paradigms for engineering new kinds of systems.

Advanced development paradigms, such as those listed below, lead to efficiency and trust at scale:

  • Humans leverage trusted AI as a workforce multiplier for all aspects of software creation.
  • Formal assurance arguments are evolved to assure and efficiently re-assure continuously evolving software.
  • Advanced software composition mechanisms enable predictable construction of systems at increasingly large scale.

Advanced architectural paradigms, as outlined below, enable the predictable use of new computational models:

  • Theories and techniques drawn from the behavioral sciences are used to design large-scale socio-technical systems, leading to predictable social outcomes.
  • New analysis and design methods facilitate the development of quantum-enabled systems.
  • AI and non-AI components interact in predictable ways to achieve enhanced mission, societal, and business goals.

Research Focus Areas

The fundamental shifts and needed advances in software engineering described above require new areas of research. In close collaboration with our advisory board and other leaders in the software engineering community, we have developed a research roadmap with six focus areas. Figure 1 shows those areas and outlines a suggested course of research topics to undertake. Short descriptions of each focus area and its challenges follow.

Figure 1: Software Engineering Research Roadmap with Research Focus Areas and Research Objectives (10-15 Year Horizon)

  • AI-Augmented Software Development. At almost every stage of the software development process, AI holds the promise of assisting humans. By relieving humans of tedious tasks, AI will leave them better able to focus on tasks that require the creativity and innovation that only humans can provide. To reach this goal, we need to re-envision the entire software development process with increased AI and automation tool support for developers, and we need to ensure we take advantage of the data generated throughout the entire lifecycle. The focus of this research area is on what AI-augmented software development will look like at each stage of the development process and during continuous evolution, where it will be particularly useful in taking on routine tasks.
  • Assuring Continuously Evolving Systems. When we consider the software-reliant systems of today, we see that they are not static (or even infrequently updated) engineering artifacts. Instead, they are fluid, meaning that they are expected to undergo continuing updates and improvements throughout their lifespan. The goal of this research area is therefore to develop a theory and practice of rapid and assured software evolution that enables efficient and bounded re-assurance of continuously evolving systems.
  • Software Construction through Compositional Correctness. As the scope and scale of software-reliant systems continue to grow and change, the complexity of these systems makes it unrealistic for any one person or group to understand the entire system. It is therefore necessary to integrate (and continually re-integrate) software-reliant systems using technologies and platforms that support the composition of modular components, many of which are reused from existing elements that were not designed to be integrated or evolved together. The goal of this research area is to create methods and tools (such as domain-specific modeling languages and annotation-based dependency injection) that enable the specification and enforcement of composition rules that allow (1) the creation of required behaviors (both functionality and quality attributes) and (2) the assurance of these behaviors.
  • Engineering Socio-Technical Systems. Societal-scale software systems, such as today’s commercial social media systems, are designed to keep users engaged to influence them. However, avoiding bias and ensuring the accuracy of information are not always goals or outcomes of these systems. Engineering societal-scale systems focuses on prediction of such outcomes (which we refer to as socially inspired quality attributes) that arise when we consider humans as integral components of the system. The goal is to leverage insights from the social sciences to build and evolve societal-scale software systems that consider qualities such as bias and influence.
  • Engineering AI-enabled Software Systems. AI-enabled systems, which are software-reliant systems that include AI and non-AI components, have some inherently different characteristics than those without AI. However, AI-enabled systems are, above all, a type of software system. These systems have many parallels with the development and sustainment of more conventional software-reliant systems. This research area focuses on exploring which existing software engineering practices can reliably support the development of AI systems, as well as identifying and augmenting software engineering techniques for the specification, design, architecture, analysis, deployment, and sustainment of systems with AI components.
  • Engineering Quantum Computing Systems. Advances in software engineering for quantum are as important as the hardware advances. The goals of this research area are to first enable current quantum computers so they can be programmed more easily and reliably, and then enable increasing abstraction as larger, fully fault-tolerant quantum computing systems become available. Eventually, it should be possible to fully integrate these types of systems into a unified classical and quantum software development lifecycle.

Help Shape Our National Software Research Agenda

Along with the advisory board, our research team has examined future trends in the computing landscape and emerging technologies; conducted a series of expert interviews; and convened multiple workshops for broad engagement and diverse perspectives, including a workshop on Software Engineering Grand Challenges and Future Visions co-hosted with the Defense Advanced Research Projects Agency (DARPA). This workshop brought together leaders in the software engineering research and development community to describe (1) important classes of future software-reliant systems and their associated software engineering challenges, and (2) research methods, tools, and practices that are needed to make those systems feasible. An upcoming SEI blog post will provide a synopsis of what was covered in this workshop.

Your feedback would be appreciated on the software engineering challenges and proposed research focus areas to help inform the National Agenda for Software Engineering Study. Please email [email protected] to send your thoughts and comments on the software engineering study & research roadmap or to volunteer as a potential reviewer of study drafts. Thank you.

Software Engineering: Recently Published Documents

Identifying Non-Technical Skill Gaps in Software Engineering Education: What Experts Expect But Students Don’t Learn

As the importance of non-technical skills in the software engineering industry increases, the skill sets of graduates match less and less with industry expectations. A growing body of research exists that attempts to identify this skill gap. However, only a few studies so far explicitly compare opinions of the industry with what is currently being taught in academia. By aggregating data from three previous works, we identify the three biggest non-technical skill gaps between industry and academia for the field of software engineering: devoting oneself to continuous learning, being creative by approaching a problem from different angles, and thinking in a solution-oriented way by favoring outcome over ego. Eight follow-up interviews were conducted to further explore how the industry perceives these skill gaps, yielding 26 sub-themes grouped into six bigger themes: stimulating continuous learning, stimulating creativity, creative techniques, addressing the gap in education, skill requirements in industry, and the industry selection process. With this work, we hope to inspire educators to give the necessary attention to the uncovered skills, further mitigating the gap between the industry and the academic world.

Opportunities and Challenges in Code Search Tools

Code search is a core software engineering task. Effective code search tools can help developers substantially improve their software development efficiency and effectiveness. In recent years, many code search studies have leveraged different techniques, such as deep learning and information retrieval approaches, to retrieve expected code from a large-scale codebase. However, there is a lack of a comprehensive comparative summary of existing code search approaches. To understand the research trends in existing code search studies, we systematically reviewed 81 relevant studies. We investigated the publication trends of code search studies, analyzed key components, such as codebase, query, and modeling technique used to build code search tools, and classified existing tools into focusing on supporting seven different search tasks. Based on our findings, we identified a set of outstanding challenges in existing studies and a research roadmap for future code search research.
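As a rough illustration of the three components the abstract names (codebase, query, and modeling technique), the sketch below implements a deliberately simple information-retrieval style search. It is not taken from any of the reviewed studies: the `CODEBASE` dictionary, the `tokenize`, `jaccard`, and `search` helpers, and the choice of Jaccard token overlap as the model are all assumptions made for this example.

```python
import re

# Hypothetical toy codebase: snippet name -> source text.
CODEBASE = {
    "read_file": "def read_file(path):\n    with open(path) as f:\n        return f.read()",
    "parse_json": "def parse_json(text):\n    import json\n    return json.loads(text)",
    "sum_list": "def sum_list(values):\n    return sum(values)",
}


def tokenize(text):
    """Return the set of lowercase word tokens, splitting snake_case and camelCase."""
    # Put a space at lower-to-upper boundaries so "parseJson" becomes "parse Json".
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def search(query, codebase, top_k=1):
    """Rank snippets by token overlap with the query; drop snippets with no overlap."""
    q = tokenize(query)
    scored = [(jaccard(q, tokenize(src)), name) for name, src in codebase.items()]
    # Highest score first; ties broken alphabetically by snippet name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


if __name__ == "__main__":
    print(search("read a file from path", CODEBASE))
    print(search("parse JSON text", CODEBASE))
```

A lexical model like this only matches queries that share words with the code, which is one reason the studies mentioned above also explore learned representations.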

Psychometrics in Behavioral Software Engineering: A Methodological Introduction with Guidelines

A meaningful and deep understanding of the human aspects of software engineering (SE) requires psychological constructs to be considered. Psychology theory can facilitate the systematic and sound development as well as the adoption of instruments (e.g., psychological tests, questionnaires) to assess these constructs. In particular, to ensure high quality, the psychometric properties of instruments need evaluation. In this article, we provide an introduction to psychometric theory for the evaluation of measurement instruments for SE researchers. We present guidelines that enable using existing instruments and developing new ones adequately. We conducted a comprehensive review of the psychology literature framed by the Standards for Educational and Psychological Testing. We detail activities used when operationalizing new psychological constructs, such as item pooling, item review, pilot testing, item analysis, factor analysis, statistical property of items, reliability, validity, and fairness in testing and test bias. We provide an openly available example of a psychometric evaluation based on our guideline. We hope to encourage a culture change in SE research towards the adoption of established methods from psychology. To improve the quality of behavioral research in SE, studies focusing on introducing, validating, and then using psychometric instruments need to be more common.

Towards an Anatomy of Software Craftsmanship

Context: The concept of software craftsmanship has early roots in computing, and in 2009, the Manifesto for Software Craftsmanship was formulated as a reaction to how the Agile methods were practiced and taught. But software craftsmanship has seldom been studied from a software engineering perspective. Objective: The objective of this article is to systematize an anatomy of software craftsmanship through literature studies and a longitudinal case study. Method: We performed a snowballing literature review based on an initial set of nine papers, resulting in 18 papers and 11 books. We also performed a case study following seven years of software development of a product for the financial market, eliciting qualitative, and quantitative results. We used thematic coding to synthesize the results into categories. Results: The resulting anatomy is centered around four themes, containing 17 principles and 47 hierarchical practices connected to the principles. We present the identified practices based on the experiences gathered from the case study, triangulating with the literature results. Conclusion: We provide our systematically derived anatomy of software craftsmanship with the goal of inspiring more research into the principles and practices of software craftsmanship and how these relate to other principles within software engineering in general.

On the Reproducibility and Replicability of Deep Learning in Software Engineering

Context: Deep learning (DL) techniques have gained significant popularity among software engineering (SE) researchers in recent years. This is because they can often solve many SE challenges without enormous manual feature engineering effort and complex domain knowledge. Objective: Although many DL studies have reported substantial advantages over other state-of-the-art models on effectiveness, they often ignore two factors: (1) reproducibility: whether the reported experimental results can be obtained by other researchers using authors’ artifacts (i.e., source code and datasets) with the same experimental setup; and (2) replicability: whether the reported experimental result can be obtained by other researchers using their re-implemented artifacts with a different experimental setup. We observed that DL studies commonly overlook these two factors and declare them as minor threats or leave them for future work. This is mainly due to high model complexity with many manually set parameters and the time-consuming optimization process, unlike classical supervised machine learning (ML) methods (e.g., random forest). This study aims to investigate the urgency and importance of reproducibility and replicability for DL studies on SE tasks. Method: In this study, we conducted a literature review on 147 DL studies recently published in 20 SE venues and 20 AI (Artificial Intelligence) venues to investigate these issues. We also re-ran four representative DL models in SE to investigate important factors that may strongly affect the reproducibility and replicability of a study. Results: Our statistics show the urgency of investigating these two factors in SE, where only 10.2% of the studies investigate any research question to show that their models can address at least one issue of replicability and/or reproducibility. More than 62.6% of the studies do not even share high-quality source code or complete data to support the reproducibility of their complex models. Meanwhile, our experimental results show the importance of reproducibility and replicability, where the reported performance of a DL model could not be reproduced for an unstable optimization process. Replicability could be substantially compromised if the model training is not convergent, or if performance is sensitive to the size of vocabulary and testing data. Conclusion: It is urgent for the SE community to provide a long-lasting link to a high-quality reproduction package, enhance DL-based solution stability and convergence, and avoid performance sensitivity on different sampled data.
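To make the reproducibility concern concrete, the following minimal sketch shows how an unrecorded random seed alone can change the outcome of a stochastic optimization run. It is a toy stand-in, not one of the DL models examined in the study: the `train` function, the quadratic objective, the noise level, and the learning rate are all assumptions chosen for illustration.

```python
import random


def train(seed, steps=200, lr=0.05):
    """Toy noisy gradient descent on f(w) = (w - 3)^2 (hypothetical example)."""
    # A private generator, so the seed fully determines this run.
    rng = random.Random(seed)
    w = rng.uniform(-10, 10)  # random initialization
    for _ in range(steps):
        noise = rng.gauss(0, 0.5)  # stands in for mini-batch sampling noise
        grad = 2 * (w - 3) + noise
        w -= lr * grad
    return w


if __name__ == "__main__":
    # Same seed: the result is bit-for-bit identical across runs.
    print(train(seed=0) == train(seed=0))
    # Different seeds: the results differ slightly around the optimum w = 3.
    print(train(seed=0), train(seed=1))
```

Recording the seed alongside the code and data is only one ingredient of a reproduction package; real DL frameworks have additional sources of nondeterminism that this sketch does not model.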

Predictive Software Engineering: Transform Custom Software Development into Effective Business Solutions

The paper examines the principles of the Predictive Software Engineering (PSE) framework. The authors examine how PSE enables custom software development companies to offer transparent services and products while staying within the intended, guaranteed budget. The paper covers all seven principles of PSE: (1) Meaningful Customer Care, (2) Transparent End-to-End Control, (3) Proven Productivity, (4) Efficient Distributed Teams, (5) Disciplined Agile Delivery Process, (6) Measurable Quality Management and Technical Debt Reduction, and (7) Sound Human Development.

Software—A New Open Access Journal on Software Engineering

Software (ISSN: 2674-113X) [...]

Improving bioinformatics software quality through incorporation of software engineering practices

Background: Bioinformatics software is developed for collecting, analyzing, integrating, and interpreting life science datasets that are often enormous. Bioinformatics engineers often lack the software engineering skills necessary for developing robust, maintainable, reusable software. This study presents a review and discussion of the findings and efforts made to improve the quality of bioinformatics software. Methodology: A systematic review was conducted of related literature that identifies core software engineering concepts for improving bioinformatics software development: requirements gathering, documentation, testing, and integration. The findings are presented with the aim of illuminating trends within the research that could lead to viable solutions to the struggles faced by bioinformatics engineers when developing scientific software. Results: The findings suggest that bioinformatics engineers could significantly benefit from the incorporation of software engineering principles into their development efforts. This leads to the suggestion of both cultural changes within bioinformatics research communities and the adoption of software engineering disciplines into the formal education of bioinformatics engineers. Open management of scientific bioinformatics development projects can result in improved software quality through collaboration amongst both bioinformatics engineers and software engineers. Conclusions: While strides have been made both in the identification and the solution of issues of particular import to bioinformatics software development, there is still room for improvement in terms of shifts in both the formal education of bioinformatics engineers and the culture and approaches of managing scientific bioinformatics research and development efforts.

Inter-team communication in large-scale co-located software engineering: a case study

Abstract: Large-scale software engineering is a collaborative effort where teams need to communicate to develop software products. Managers face the challenge of how to organise work to facilitate necessary communication between teams and individuals. This includes a range of decisions from distributing work over teams located in multiple buildings and sites, through work processes and tools for coordinating work, to softer issues including ensuring well-functioning teams. In this case study, we focus on inter-team communication by considering geographical, cognitive and psychological distances between teams, and factors and strategies that can affect this communication. Data was collected for ten test teams within a large development organisation, in two main phases: (1) measuring cognitive and psychological distance between teams using interactive posters, and (2) five focus group sessions where the obtained distance measurements were discussed. We present ten factors and five strategies, and how these relate to inter-team communication. We see three types of arenas that facilitate inter-team communication, namely physical, virtual and organisational arenas. Our findings can support managers in assessing and improving communication within large development organisations. In addition, the findings can provide insights into factors that may explain the challenges of scaling development organisations, in particular agile organisations that place a large emphasis on direct communication over written documentation.

Aligning Software Engineering and Artificial Intelligence With Transdisciplinary

The study examined AI and SE transdisciplinarity to find ways of aligning them to enable the development of an AI-SE transdisciplinary theory. A literature review and analysis method was used. The findings are that AI and SE transdisciplinarity is tacit, with islands within and between them that can be linked to accelerate their transdisciplinary orientation by codification and by internally developing and externally borrowing and adapting transdisciplinary theories. Lack of theory has been identified as the major barrier towards maturing the two disciplines as engineering disciplines. Creating an AI and SE transdisciplinary theory would contribute to maturing AI and SE as engineering disciplines. The implications of the study are that transdisciplinary theory can support mode 2 and 3 AI and SE innovations and provide an alternative for maturing the two disciplines as engineering disciplines. The study’s originality is that it is the first of its kind in SE, AI or their intersections.

Software Engineering Institute

Research Review 2022

At the 2022 Research Review, our researchers detail how they are forging a new path for software engineering by executing the SEI’s technical strategy to deliver tangible results.

Researchers highlight methods, prototypes, and tools aimed at the most important problems facing the DoD, industry, and academia, including AI engineering, computing at the tactical edge, threat hunting, continuous integration/continuous delivery, and machine learning trustworthiness.

Learn how our researchers' work in areas such as model-based systems engineering, DevSecOps, automated design conformance, software/cyber/AI integration, and AI network defense—to name a few—has produced value for the U.S. Department of Defense (DoD) and advanced the state of the practice.

Monday, November 14, 2022

Tuesday, November 15, 2022

Wednesday, November 16, 2022

Research Topics in Software Engineering

This seminar is an opportunity to become familiar with current research in software engineering and more generally with the methods and challenges of scientific research.

Each student will be asked to study some papers from the recent software engineering literature and review them. This is an exercise in critical review and analysis. Active participation is required (a presentation of a paper as well as participation in discussions).

The aim of this seminar is to introduce students to recent research results in the area of programming languages and software engineering. To accomplish that, students will study and present research papers in the area as well as participate in paper discussions. The papers will span topics in both theory and practice, including papers on program verification, program analysis, testing, programming language design, and development tools.

Research in Software Engineering (RiSE)

Our mission is to make everyone a programmer and maximize the productivity of every programmer. This will democratize computing to empower every person and every organization to achieve more. We achieve our vision through open-ended fundamental research in programming languages, software engineering, and automated reasoning. We strongly believe in pushing our research to its logical extreme to positively impact people’s lives.

Foundations

Logical formalisms and theorem proving

Lean, Symbolic Automata, Z3

Programming languages/models

Bosque, Catala, F*, Koka, TLA+

Azure Durable Functions, Netherite, Orleans

High assurance/performance cloud

Correctness

Network Verification, Project Everest, Torch

AI and Big Data

AI at Scale, CHET, Parade

Program analysis tools

Corral, Angelic Verification, Verisol

Program understanding/debugging

MSAGL, Time travel debugging

AI-assisted software development

Future of Program Merge, Trusted AI-assisted Programming

Education and the end-user

CS Education

BBC micro:bit, Microsoft MakeCode

End-user embedded systems

Jacdac, MakeAccessible

Software engineering and programming languages

Software engineering and programming language researchers at Google study all aspects of the software development process, from the engineers who make software to the languages and tools that they use.

About the team

We are a collection of teams from across the company who study the problems faced by engineers and invent new technologies to solve those problems. Our teams take a variety of approaches to solve these problems, including empirical methods, interviews, surveys, innovative tools, formal models, predictive machine learning modeling, data science, experiments, and mixed-methods research techniques. As our engineers work within the largest code repository in the world, the solutions need to work at scale, across a team of global engineers and over 2 billion lines of code.

We aim to make an impact internally on Google engineers and externally on the larger ecosystem of software engineers around the world.

Team focus summaries

Developer Tools

Google provides its engineers with cutting-edge developer tools that operate on a codebase with billions of lines of code. The tools are designed to provide engineers with a consistent view of the codebase so they can navigate and edit any project. We research and create new, unique developer tools that allow us to get the benefits of such a large codebase, while still retaining a fast development velocity.

Developer Inclusion and Diversity

We aim to understand diversity and inclusion challenges facing software developers and evaluate interventions that move the needle on creating an inclusive and equitable culture for all.

Developer Productivity

We use both qualitative and quantitative methods to study how to make engineers more productive. Google uses the results of these studies to improve both our internal developer tools and processes and our external offerings for developers on GCP and Android.

Program Analysis and Refactoring

We build static and dynamic analysis tools that find and prevent serious bugs from manifesting in both Google’s and third-party code. We also leverage this large-scale analysis infrastructure to refactor Google’s code at scale.

Machine Learning for Code

We apply deep learning to Google’s large, well-curated codebase to automatically write code and repair bugs.

Programming Language Design and Implementation

We design, evaluate, and implement new features for popular programming languages like Java, C++, and Go through their standards processes.

Automated Software Testing and Continuous Integration

We design, implement and evaluate tools and frameworks to automate the testing process and integrate tests with the Google-wide continuous integration infrastructure.

Some of our locations

Atlanta

Some of our people

Andrew Macvean

  • Human-Computer Interaction and Visualization
  • Software Systems

Caitlin Sadowski

  • Data Management
  • Software Engineering

Charles Sutton

  • Machine Intelligence
  • Natural Language Processing

Ciera Jaspan

Domagoj Babic

  • Algorithms and Theory
  • Distributed Systems and Parallel Computing

Emerson Murphy-Hill

Franjo Ivancic

  • Security, Privacy and Abuse Prevention

John Penix

Kathryn S. McKinley

  • Hardware and Architecture

Marko Ivanković

Martín Abadi

Hans-Juergen Boehm

Hyrum Wright

Lisa Nguyen Quang Do

John Field

Danny Tarlow

Petros Maniatis

  • Mobile Systems

Albert Cohen

Kaiyuan Wang

Dustin C Smith

Harini Sampath

Phitchaya Mangpo

Wonderful Engineering

Software Engineer Research Paper Topics 2021: Top 5

Whether you’re studying in advance or you’re close to getting that Software Engineering degree, it’s crucial that you look for possible research paper topics in advance. This will help you have an advantage in your course.

First off, remember that software engineering revolves around tech development and improvement.

Hence, your research paper should have the same goal. It shouldn’t be too complex so that you can go through it smoothly. At the same time, it shouldn’t be too easy to the point that it can be looked up online.

Choosing can be a difficult task. Students often end up buying an assignment from a professional writer because of a poor topic choice. To help you land on the best topic for your needs, we have listed the top 5 software engineer research paper topics in the next sections.

Machine Learning

Machine learning is one of the research topics software engineers choose most often. If you’re not yet familiar with it, it’s a field that revolves around producing programs that improve their own performance using existing data and experience.

Basically, machine learning aims to make intelligent tools. Here, you will need to apply various statistical methods in your algorithms, which makes it a complex and long topic.

Even so, the good thing about this field is that it covers a lot of subtopics. These can include using machine learning for face spoof detection, iris detection, sentiment analysis, and the like. Usually, though, machine learning goes hand in hand with some kind of detection system.

Artificial Intelligence

Artificial intelligence is a broader concept than machine learning; machine learning is just one type of AI technique.

AI refers to the human-like intelligence integrated into machines and computer programs. Focusing on this will give you many more topics to write about. Since it’s present in a lot of fields like gaming, marketing, and even everyday automated tasks, you will have more materials to refer to.

Some things that you can write about in your paper include AI’s relationship with software engineering, robotics, and natural language processing. You can also write about the different types of artificial intelligence tools for a more guided research paper.

Internet Of Things

Another topic that you can write about is the Internet of Things, more commonly known as IoT. This refers to devices, machines, or even living beings that are interconnected over a network.

Writing about IoT will open a huge array of possibilities to write about. You can talk about whether the topic is a problem that needs additional solutions or improvements. At the same time, you will be able to talk about specific machine requirements since IoT works mainly with communication servers.

In addition, the concept of the Internet of Things is also used in several fields like agriculture, e-commerce, and medicine. Because of this, you can rest assured that you won’t run out of things to talk about or refer to.

Software Development Models

Next up, we have software development models. If you want to write a research paper on how one can start building an app or other software, then software development models are a good choice of topic.

Here, you can choose to write about what the concept is or delve deeper into its different types. You can look into the Waterfall Model, V-Model, Incremental, RAD, Agile, Iterative, Spiral, and Prototype models. You can choose one, several, or all of the models and then relate them to software engineering.

Clone Management

One of the most important elements in software engineering is the code base. Hence, using this as a research topic will help you stay relevant to your course and its needs. In particular, you can focus on clone management.

Clone management is a task that revolves around keeping a code base free from errors and duplicated code (code clones). What makes this a good topic is that the literature on it is still limited compared to other clone-related topics in software engineering. Hence, you can ensure a distinct topic for your paper.
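
To make the idea of duplicated code concrete, here is a minimal sketch of a clone detector. It is only an illustration under simple assumptions: it compares fixed-size windows of lines after collapsing whitespace, so it finds textually identical fragments only. The function names `normalize` and `find_clones` and the sample snippet are hypothetical, and real clone-management tools use far more robust techniques.

```python
from collections import defaultdict

def normalize(line):
    """Collapse whitespace so formatting differences do not hide a clone."""
    return " ".join(line.split())

def find_clones(source, window=3):
    """Group 1-based start lines whose next `window` normalized lines are identical."""
    lines = [normalize(line) for line in source.splitlines()]
    seen = defaultdict(list)
    for start in range(len(lines) - window + 1):
        block = tuple(lines[start:start + window])
        if all(block):  # skip windows that contain a blank line
            seen[block].append(start + 1)
    return [starts for starts in seen.values() if len(starts) > 1]

# Hypothetical sample: the same loop appears twice with different spacing.
SAMPLE = """\
total = 0
for x in items:
    total += x
print(total)

total = 0
for x in   items:
    total += x
print(total)
"""

print(find_clones(SAMPLE))  # [[1, 6], [2, 7]]
```

Each inner list gives the starting lines of one duplicated fragment; a research paper could compare such a textual baseline against token-based or tree-based detection.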

To land on the best topic, take your interest into account. Look for the field that makes you curious and entertained. In this way, you can build motivation to actually know more about it, and not just for the sake of submitting.

Another good tip is to choose a unique topic. The ones we discussed above can be considered unique since they are some of the latest software-related topics. If you’re going to use a common one, then make sure that you put your own twist on it. You can also consider seeing the topic in a different light.

Anyhow, your research paper, its grade, and overall quality will greatly depend on what you choose to write about.

Topic modeling in software engineering research

  • Open access
  • Published: 06 September 2021
  • Volume 26, article number 120 (2021)

  • Camila Costa Silva (ORCID: orcid.org/0000-0002-3690-1711)
  • Matthias Galster (ORCID: orcid.org/0000-0003-3491-1833)
  • Fabian Gilson (ORCID: orcid.org/0000-0002-1465-3315)

Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling needs to be applied carefully (e.g., depending on the type of textual data analyzed and modeling parameters). Our study aims at describing how topic modeling has been applied in software engineering research with a focus on four aspects: (1) which topic models and modeling techniques have been applied, (2) which textual inputs have been used for topic modeling, (3) how textual data was “prepared” (i.e., pre-processed) for topic modeling, and (4) how generated topics (i.e., word clusters) were named to give them a human-understandable meaning. We analyzed topic modeling as applied in 111 papers from ten highly-ranked software engineering venues (five journals and five conferences) published between 2009 and 2020. We found that (1) LDA and LDA-based techniques are the most frequent topic modeling techniques, (2) developer communication and bug reports have been modelled most, (3) data pre-processing and modeling parameters vary quite a bit and are often vaguely reported, and (4) manual topic naming (such as deducing names based on frequent words in a topic) is common.

1 Introduction

Text mining is about searching, extracting and processing text to provide meaningful insights from the text based on a certain goal. Techniques for text mining include natural language processing (NLP) to process, search and understand the structure of text (e.g., part-of-speech tagging), web mining to discover information resources on the web (e.g., web crawling), and information extraction to extract structured information from unstructured text and relationships between pieces of information (e.g., co-reference, entity extraction) (Miner et al. 2012). Text mining has been widely used in software engineering research (Bi et al. 2018), for example, to uncover architectural design decisions in developer communication (Soliman et al. 2016) or to link software artifacts to source code (Asuncion et al. 2010).

Topic modeling is a text mining and concept extraction method that extracts topics (i.e., coherent word clusters) from large corpora of textual documents to discover hidden semantic structures in text (Miner et al. 2012). An advantage of topic modeling over other techniques is that it helps analyze long texts (Treude and Wagner 2019; Miner et al. 2012), creates clusters as “topics” (rather than individual words) and is unsupervised (Miner et al. 2012).

Topic modeling has become popular in software engineering research (Sun et al. 2016; Chen et al. 2016). For example, Sun et al. (2016) found that topic modeling had been used to support source code comprehension, feature location and defect prediction. Additionally, Chen et al. (2016) found that many repository mining studies apply topic modeling to textual data such as source code and log messages to recommend code refactoring (Bavota et al. 2014b) or to localize bugs (Lukins et al. 2010).

Probabilistic topic models such as Latent Semantic Indexing (LSI) (Deerwester et al. 1990) and Latent Dirichlet Allocation (LDA) (Blei et al. 2003b) discover topics in a corpus of textual documents, using the statistical properties of word frequencies and co-occurrences (Lin et al. 2014). However, Agrawal et al. (2018) warn about systematic errors in the analysis of LDA topic models that limit the validity of topics. Lin et al. (2014) also advise that classical topic models usually generate sub-optimal topics when applied “as is” to small amounts of text or to short text documents.

Considering the limitations of topic modeling techniques and topic models on the one hand and their potential usefulness in software engineering on the other hand, our goal is to describe how topic modeling has been applied in software engineering research. In detail, we explore the following research questions:

RQ1. Which topic modeling techniques have been used and for what purpose? There are different topic modeling techniques (see Section 2), each with their own limitations and constraints (Chen et al. 2016). This RQ aims at understanding which topic modeling techniques have been used (e.g., LDA, LSI) and for what purpose studies applied such techniques (e.g., to support software maintenance tasks). Furthermore, we analyze the types of contributions in studies that used topic modeling (e.g., a new approach as a solution proposal, or an exploratory study).

RQ2. What are the inputs into topic modeling? Topic modeling techniques accept different types of textual documents and require the configuration of parameters (see Section 2.1). Carefully choosing parameters (such as the number of topics to be generated) is essential for obtaining valuable and reliable topics (Agrawal et al. 2018; Treude and Wagner 2019). This RQ aims at analysing types of textual data (e.g., source code), actual documents (e.g., a Java class or an individual Java method) and configured parameters used for topic modeling to address software engineering problems.

RQ3. How are data pre-processed for topic modeling? Topic modeling requires that the analyzed text is pre-processed (e.g., by removing stop words) to improve the quality of the produced output (Aggarwal and Zhai 2012; Bi et al. 2018). This RQ aims at analysing how previous studies pre-processed textual data for topic modeling, including the steps for cleaning and transforming text. This will help us understand if there are specific pre-processing steps for a certain topic modeling technique or types of textual data.

RQ4. How are generated topics named? This RQ aims at analyzing if and how topics (word clusters) were named in studies. Giving meaningful names to topics may be difficult but may be required to help humans comprehend topics. For example, naming topics can provide a high-level view on topics discussed by developers in Stack Overflow (a Q&A website) (Barua et al. 2014) or by end mobile app users in tweets (Mezouar et al. 2018). Analysts (e.g., developers interested in what topics are discussed on Stack Overflow or app reviews) can then look at the name of the topic (i.e., its “label”) rather than the cluster of words. These labels or names must capture the overarching meaning of all words in a topic. We describe different approaches to naming topics generated by a topic model, such as manual or automated labeling of clusters with names based on the most frequent words of a topic (Hindle et al. 2013).

In this paper, we provide an overview of the use of topic modeling in 111 papers published between 2009 and 2020 in highly ranked venues of software engineering (five journals and five conferences). We identify characteristics and limitations in the use of topic models and discuss (a) the appropriateness of topic modeling techniques, (b) the importance of pre-processing, (c) challenges related to defining meaningful topics, and (d) the importance of context when manually naming topics.

The rest of the paper is organized as follows. In Section 2 we provide an overview of topic modeling. In Section 3 we describe other literature reviews on the topic as well as “meta-studies” that discuss topic modeling more generally. We describe the research method in Section 4 and present the results in Section 5. In Section 6, we summarize our findings and discuss implications and threats to validity. Finally, in Section 7 we present concluding remarks and future work.

2 Topic Modeling

Topic modeling aims at automatically finding topics, typically represented as clusters of words, in a given textual document (Bi et al. 2018). Unlike (supervised) machine learning-based techniques that solve classification problems, topic modeling does not use tags, training data or predefined taxonomies of concepts (Bi et al. 2018). Based on the frequencies of words and frequencies of co-occurrence of words within one or more documents, topic modeling clusters words that are often used together (Barua et al. 2014; Treude and Wagner 2019). Figure 1 illustrates the general process of topic modeling, from a raw corpus of documents (“Data input”) to topics generated for these documents (“Output”). Below we briefly introduce the basic concepts and terminology of topic modeling (based on Chen et al. (2016)):

Word \(w\): a string of one or more alphanumeric characters (e.g., “software” or “management”);

Document \(d\): a set of \(n\) words (e.g., a text snippet with five words: \(w_1\) to \(w_5\));

Corpus \(C\): a set of \(t\) documents (e.g., nine text snippets: \(d_1\) to \(d_9\));

Vocabulary \(V\): a set of \(m\) unique words that appear in a corpus (e.g., \(m = 80\) unique words across nine documents);

Term-document matrix \(A\): an \(m\) by \(t\) matrix whose entry \(A_{i,j}\) is the weight (according to some weighting function, such as term-frequency) of word \(w_i\) in document \(d_j\). For example, in a matrix \(A\) with three words and three documents, \(A_{1,1} = 5\) indicates that “code” appears five times in \(d_1\), etc.;

Topic \(z\): a collection of terms that co-occur frequently in the documents of a corpus. Considering probabilistic topic models (e.g., LDA), \(z\) refers to an \(m\)-length vector of probabilities over the vocabulary of a corpus. For example, in a vector \(z_1 = (\text{code}: 0.35;\ \text{test}: 0.17;\ \text{bug}: 0.08)\), 0.35 indicates that when a word is picked from topic \(z_1\), there is a 35% chance of drawing the word “code”, etc.;

Topic-term matrix \(\phi\) (or \(T\)): a \(k\) by \(m\) matrix with \(k\) as the number of topics and \(\phi_{i,j}\) the probability of word \(w_j\) in topic \(z_i\). Row \(i\) of \(\phi\) corresponds to \(z_i\). For example, an entry of 0.05 in the first column of row 3 indicates that the word “code” appears with a probability of 5% in topic \(z_3\), etc.;

Topic membership vector \(\theta_d\): for document \(d_i\), a \(k\)-length vector of probabilities of the \(k\) topics. For example, given a vector \(\theta_{d_i} = (z_1: 0.25;\ z_2: 0.10;\ z_3: 0.08)\), 0.25 indicates that there is a 25% chance of selecting topic \(z_1\) in \(d_i\);

Document-topic matrix \(\theta\) (or \(D\)): a \(t\) by \(k\) matrix with \(\theta_{i,j}\) as the probability of topic \(z_j\) in document \(d_i\). Row \(i\) of \(\theta\) corresponds to \(\theta_{d_i}\). For example, an entry of 0.10 in the first column of row 2 indicates that document \(d_2\) contains topic \(z_1\) with probability of 10%, etc.
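
The vocabulary and term-document matrix defined above can be illustrated with a short sketch. The three toy documents are invented for this example (they are not data from the study), the helper names are hypothetical, and plain term frequency is assumed as the weighting function.

```python
from collections import Counter

# Hypothetical toy corpus: three already-tokenized documents d1..d3.
corpus = [
    "code code code code code test bug".split(),  # d1
    "test test code bug".split(),                 # d2
    "bug bug bug test".split(),                   # d3
]

def build_vocabulary(docs):
    """Return the unique words of the corpus in first-seen order (vocabulary V)."""
    vocab, seen = [], set()
    for doc in docs:
        for word in doc:
            if word not in seen:
                seen.add(word)
                vocab.append(word)
    return vocab

def term_document_matrix(docs, vocab):
    """Return the m-by-t matrix A where A[i][j] is the frequency of word i in document j."""
    counts = [Counter(doc) for doc in docs]
    return [[count[word] for count in counts] for word in vocab]

vocabulary = build_vocabulary(corpus)
A = term_document_matrix(corpus, vocabulary)

for word, row in zip(vocabulary, A):
    print(f"{word:>5}: {row}")
```

With this toy corpus, the first row of `A` belongs to “code” and its first entry is 5, which mirrors the reading of \(A_{1,1} = 5\) in the definition (Python indices start at 0, so that entry is `A[0][0]`).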

Fig. 1 General topic modeling process

2.1 Data Input

Data used as input into topic modeling can take many forms. This requires decisions on what exactly are documents and what the scope of individual documents is (Miner et al. 2012). Therefore, we need to determine which unit of text shall be analyzed (e.g., subject lines of e-mails from a mailing list or the body of e-mails).

To model topics from raw text in a corpus \(C\) (see Fig. 1), the data needs to be converted into a structured vector-space model, such as the term-document matrix \(A\). This typically also requires some pre-processing. Although each text mining approach (including topic modeling) may require specific pre-processing steps, there are some common steps, such as tokenization, stemming and removing stop words (Miner et al. 2012). We discuss pre-processing for topic modeling in more detail when presenting the results for RQ3 in Section 5.4.
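
The common pre-processing steps named above (tokenization, stop-word removal and stemming) can be sketched as follows. This is a deliberately simplified illustration: the stop-word list is a tiny assumed sample, and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's algorithm.

```python
import re

# Tiny illustrative stop-word list (an assumption; real studies use larger lists).
STOP_WORDS = {"the", "a", "an", "is", "are", "in", "of", "to", "and", "when", "it"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def naive_stem(token):
    """Strip one common English suffix if at least three characters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [naive_stem(token) for token in tokenize(text) if token not in STOP_WORDS]

print(preprocess("The tests are failing when the parser handles nested comments"))
# ['test', 'fail', 'parser', 'handl', 'nest', 'comment']
```

The truncated stem “handl” shows why stemming output is meant for counting word occurrences rather than for reading, and why the choice of stemmer can influence the resulting topics.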

2.2 Modeling

Different models can be used for topic modeling. Models typically differ in how they model topics and underlying assumptions. For example, besides LDA and LSI mentioned before, other examples of topic modeling techniques include Probabilistic Latent Semantic Indexing (pLSI) (Hofmann 1999). LSI and pLSI reduce the dimensionality of \(A\) using Singular Value Decomposition (SVD) (Hofmann 1999). Furthermore, variants of LDA have been proposed, such as Relational Topic Models (RTM) (Chang and Blei 2010) and Hierarchical Topic Models (HLDA) (Blei et al. 2003a). RTM finds relationships between documents based on the generated topics (e.g., if document \(d_1\) contains the topic “microservices”, document \(d_2\) contains the topic “containers” and document \(d_n\) contains the topic “user interface”, RTM will find a link between documents \(d_1\) and \(d_2\) (Chang and Blei 2010)). HLDA discovers a hierarchy of topics within a corpus, where each lower level in the hierarchy is more specific than the previous one (e.g., a higher topic “web development” may have subtopics such as “front-end” and “back-end”).

Topic modeling techniques need to be configured for a specific problem, objectives and characteristics of the analyzed text (Treude and Wagner 2019; Agrawal et al. 2018). For example, Treude and Wagner (2019) studied parameters, characteristics of text corpora and how the characteristics of a corpus impact the development of a topic modeling technique using LDA. Treude and Wagner (2019) found that textual data from Stack Overflow (e.g., threads of questions and answers) and GitHub (e.g., README files) require different configurations for the number of generated topics (\(k\)). Similarly, Barua et al. (2014) argued that the number of topics depends on the characteristics of the analyzed corpora. Furthermore, the values of modeling parameters (e.g., LDA’s hyperparameters α and β which control an initial topic distribution) can also be adjusted depending on the corpus to improve the quality of topics (Agrawal et al. 2018).
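
To show where \(k\), α and β enter the picture, the following sketch simulates LDA's generative story rather than fitting a model. In standard LDA, α is the Dirichlet prior on each document's topic distribution and β is the Dirichlet prior on each topic's word distribution; smaller values tend to give sparser distributions. The vocabulary, the parameter values and the function names are assumptions chosen only for illustration.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [value / total for value in draws]

def generate_document(vocab, phi, alpha, length, rng):
    """Generate one document: draw theta, then a topic and a word for each position."""
    k = len(phi)
    theta = sample_dirichlet(alpha, k, rng)  # topic membership vector of this document
    words = []
    for _ in range(length):
        topic = rng.choices(range(k), weights=theta)[0]      # pick a topic from theta
        words.append(rng.choices(vocab, weights=phi[topic])[0])  # pick a word from that topic
    return theta, words

rng = random.Random(0)                                 # fixed seed for repeatability
vocab = ["code", "test", "bug", "ui", "button"]        # hypothetical vocabulary
k, alpha, beta = 2, 0.5, 0.5                           # assumed illustrative settings
phi = [sample_dirichlet(beta, len(vocab), rng) for _ in range(k)]  # topic-term matrix
theta, doc = generate_document(vocab, phi, alpha, length=8, rng=rng)

print("theta:", [round(p, 2) for p in theta])
print("document:", doc)
```

Fitting LDA runs this story in reverse: given only the documents, it estimates the θ and ϕ that most plausibly produced them, which is why the chosen \(k\), α and β influence the topics that come out.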

By finding words that are often used together in documents in a corpus, a topic modeling technique creates clusters of words or topics z_k. Words in such a cluster are usually related in some way, which gives the topic a meaning. For example, we can use a topic modeling technique to extract five topics from unstructured documents such as a collection of Stack Overflow posts. One of the clusters generated could include the co-occurring words “error”, “debug” and “warn”. We can then manually inspect this cluster and by inference suggest the label “Exceptions” to name this topic (Barua et al. 2014).
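The co-occurrence signal that topic models build on can be illustrated with a minimal sketch. It is not a topic model: it only counts how often word pairs appear in the same document. The corpus and the function name `cooccurrence_counts` are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count in how many documents each unordered word pair appears together."""
    counts = Counter()
    for text in documents:
        # Use the set of words so a pair is counted once per document.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


# Hypothetical mini-corpus standing in for Stack Overflow posts.
posts = [
    "error debug warn stack",
    "debug error warn log",
    "layout button css",
    "css layout error",
]

counts = cooccurrence_counts(posts)
for pair, n in counts.most_common(4):
    print(pair, n)
```

Pairs such as ("debug", "error") co-occur in more documents than pairs such as ("debug", "stack"), which is the kind of regularity a topic model groups into a cluster that a human can then label.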

3 Related Work

3.1 Previous Literature Reviews

Sun et al. ( 2016 ) and Chen et al. ( 2016 ), similar to our study, surveyed software engineering papers that applied topic modeling. Table  1 shows a comparison between our study and prior reviews. As shown in the table, Sun et al. ( 2016 ) focused on finding which software engineering tasks have been supported by topic models (e.g., support source code comprehension, feature location, traceability link recovery, refactoring, software testing, developer recommendations, software defects prediction and software history comprehension), and Chen et al. ( 2016 ) focused on characterizing how studies used topic modeling to mine software repositories.

Furthermore, as shown in Table  1 , in comparison to Sun et al. ( 2016 ) and Chen et al. ( 2016 ), our study surveys the literature considering other aspects of topic modeling such as data inputs (RQ2), data pre-processing (RQ3), and topic naming (RQ4). Additionally, we searched for papers that applied topic models to any type of data (e.g., Q&A websites) rather than to data in software repositories. We also applied a different search process to identify relevant papers.

Although some of the search venues of these two previous studies and our study overlap, our search focused on specific venues. We also searched papers published between 2009 and 2020, a period which only partially overlaps with the searches presented by Sun et al. ( 2016 ) and Chen et al. ( 2016 ).

Regarding the data analysed in previous studies, Chen et al. ( 2016 ) analyzed two aspects not covered in our study: (a) tools to implement topic models in papers, and (b) how papers evaluated topic models (note that even though we did not cover this aspect explicitly, we checked whether papers compared different topic models, and if so, what metrics they used to compare topic models). However, different to Chen et al. ( 2016 ) we analyzed (a) the types of contribution of papers (e.g., a new approach); (b) details about the types of data and documents used in topic modeling techniques, and (c) whether and how topics were named. Additionally, we extend the survey of Chen et al. ( 2016 ) by investigating hyperparameters (see Section  2.1 ) of topic models and data pre-processing in more detail. We provide more details and a justification of our research method in Section  4 .

3.2 Meta-studies on Topic Modeling

In addition to literature surveys, there are “meta-studies” on topic modeling that address and reflect on different aspects of topic modeling more generally (and are not considered primary studies for the purpose of our review, see our inclusion and exclusion criteria in Section  4 ). In the following paragraphs we organized their discussion into three parts: (1) studies about parameters for topic modeling, (2) studies on topic models based on the type of analyzed data, and (3) studies about metrics and procedures to evaluate the performance of topic models. We refer to these studies throughout this manuscript when reflecting on the findings of our study.

Regarding parameters used for topic modeling, Treude and Wagner ( 2019 ) performed a broad study on LDA parameters to find optimal settings when analyzing GitHub and Stack Overflow text corpora. The authors found that popular rules of thumb for topic modeling parameter configuration were not applicable to their corpora, which required different configurations to achieve good model fit. They also found that it is possible to predict good configurations for unseen corpora reliably. Agrawal et al. ( 2018 ) also performed experiments on LDA parameter configurations and proposed LDADE, a tool to tune the LDA parameters. The authors found that due to LDA topic model instability, using standard LDA with “off-the-shelf” settings is not advisable. We also discuss parameters for topic modeling in Section  2.2 .

For studies on topic models based on the analyzed data, researchers have investigated topic modeling involving short texts (e.g., a tweet) and how to improve the performance of topic models that work well with longer text (e.g., a book chapter) (Lin et al. 2014 ). For example, the study of Jipeng et al. ( 2020 ) compared short-text topic modeling techniques and developed an open-source library of the short-text models. Another example is the work of Mahmoud and Bradshaw ( 2017 ) who discussed topic modeling techniques specific for source code.

Finally, regarding metrics and procedures to evaluate the performance of topic models, some works have explored how semantically meaningful topics are for humans (Chang et al. 2009 ). For example, Poursabzi-Sangdeh et al. ( 2021 ) discuss the importance of interpretability of models in general (also considering other text mining techniques). Another example is the work of Chang et al. ( 2009 ) who presented a method for measuring the interpretability of a topic model based on how well words within topics are related and how different topics are between each other. On the other hand, as an effort to quantify the interpretability of topics without human evaluation, some studies developed topic coherence metrics . These metrics score the probability of a pair of words from topics being found together in (a) external data sources (e.g., Wikipedia pages) or (b) in the documents used by the model that generated those topics (Röder et al. 2015 ). Röder et al. ( 2015 ) combined different implementations of coherence metrics in a framework. Perplexity is another measure of performance for statistical models in natural language processing, which indicates the uncertainty in predicting a single word (Blei et al. 2003b ). This metric is often applied to compare the configurations of a topic modeling technique (e.g., Zhao et al. ( 2020 )). Other studies use perplexity as an indicator of model quality (such as Chen et al. 2019 and Yan et al. 2016b ).
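To make the two families of metrics concrete, the following is a minimal sketch under stated assumptions. `perplexity` computes the exponential of the negative average log-probability that a model assigned to held-out words. `pairwise_coherence` is a simplified document co-occurrence score with add-one smoothing, written in the spirit of coherence metrics; it is not the framework of Röder et al. (2015). Both function names and the sample data are hypothetical.

```python
import math


def perplexity(word_probabilities):
    """Perplexity from the probability the model assigned to each held-out word."""
    if not word_probabilities:
        raise ValueError("need at least one word probability")
    log_likelihood = sum(math.log(p) for p in word_probabilities)
    return math.exp(-log_likelihood / len(word_probabilities))


def pairwise_coherence(top_words, documents):
    """Average smoothed log co-occurrence ratio over pairs of a topic's top words."""
    doc_sets = [set(d.lower().split()) for d in documents]
    scores = []
    for i, later in enumerate(top_words):
        for earlier in top_words[:i]:
            both = sum(1 for s in doc_sets if later in s and earlier in s)
            base = sum(1 for s in doc_sets if earlier in s)
            if base:  # skip pairs whose reference word never occurs
                scores.append(math.log((both + 1) / base))
    return sum(scores) / len(scores) if scores else 0.0


# A model that spreads probability uniformly over four words has perplexity 4.
print(perplexity([0.25] * 8))

# Hypothetical documents: related words score higher than unrelated ones.
docs = ["error debug warn", "debug error", "css layout"]
print(pairwise_coherence(["error", "debug"], docs))
print(pairwise_coherence(["error", "css"], docs))
```

Lower perplexity indicates less uncertainty when predicting a word, whereas a higher coherence score indicates that a topic's top words tend to appear in the same documents.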

4 Research Method

We conducted a literature survey to describe how topic modeling has been applied in software engineering research. To answer the research questions introduced in Section  1 , we followed general guidelines for systematic literature review (Kitchenham 2004 ) and mapping study methods (Petersen et al. 2015 ). This was to systematically identify relevant works, and to ensure traceability of our findings as well as the repeatability of our study. However, we do not claim to present a fully-fledged systematic literature review (e.g., we did not assess the quality of primary studies) or a mapping study (e.g., we only analyzed papers from carefully selected venues). Furthermore, we used parts of the procedures from other literature surveys on similar topics (Bi et al. 2018 ; Chen et al. 2016 ; Sun et al. 2016 ) as discussed throughout this section.

4.1 Search Procedure

To identify relevant research, we selected high-quality software engineering publication venues. This was to ensure that our literature survey includes studies of high quality that are described at a sufficient level of detail. We identified venues rated as A and A∗ for Computer Science and Information Systems research in the Excellence Research for Australia (CORE) ranking (ARC 2012). Only one journal was rated B (IST), but we included it due to its relevance for software engineering research. These venues are a subset of venues also searched by related previous literature surveys (Chen et al. 2016; Sun et al. 2016), see Section 3. The list of searched venues includes five journals: (1) Empirical Software Engineering (EMSE); (2) Information and Software Technology (IST); (3) Journal of Systems and Software (JSS); (4) ACM Transactions on Software Engineering & Methodology (TOSEM); (5) IEEE Transactions on Software Engineering (TSE). Furthermore, we included five conferences: (1) International Conference on Automated Software Engineering (ASE); (2) ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM); (3) International Symposium on the Foundations of Software Engineering / European Software Engineering Conference (ESEC/FSE); (4) International Conference on Software Engineering (ICSE); (5) International Workshop/Working Conference on Mining Software Repositories (MSR).

We performed a generic search on SpringerLink (EMSE), Science Direct (IST, JSS), ACM DL (TOSEM, ESEC/FSE, ASE, ESEM, ICSE, MSR) and IEEE Xplore (TSE, ASE, ESEM, ICSE, MSR) using the venue (journal or conference) as a high-level filtering criterion. Considering that the proceedings of ASE, ESEM, ICSE, and MSR are published by ACM and IEEE, we searched these venues on both ACM DL and IEEE Xplore to avoid missing relevant papers. We used a generic search string (“topic model[l]ing” and “topic model”). Furthermore, in order to find studies that apply specific topic models but do not mention the term “topic model”, we used a second search string with topic model names (“lsi” or “lda” or “plsi” or “latent dirichlet allocation” or “latent semantic”). This second string was based on the search string used by Chen et al. (2016), who also present a review and analysis of topic modeling techniques in software engineering (see Section 3). We applied both strings to the full text and metadata of papers. We considered works published between 2009 and 2020. The search was performed in March 2021. Limiting the search to the last twelve years allowed us to focus on more mature and recent works.
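As an illustration only, the two search strings can be approximated locally with regular expressions. The digital libraries named above each have their own query syntax, which this sketch does not reproduce; the names `GENERIC`, `NAMED` and `matches_search` are hypothetical.

```python
import re

# First string: "topic model[l]ing" or "topic model" (also matches "topic models").
GENERIC = re.compile(r"topic\s+model(?:l?ing)?", re.IGNORECASE)

# Second string: names of specific techniques, matched as whole words or phrases.
NAMED = re.compile(
    r"\b(?:lsi|lda|plsi|latent dirichlet allocation|latent semantic)\b",
    re.IGNORECASE,
)


def matches_search(text):
    """Report which of the two search strings a piece of text would satisfy."""
    return {
        "generic": bool(GENERIC.search(text)),
        "named": bool(NAMED.search(text)),
    }


print(matches_search("We apply topic modelling to bug reports"))
print(matches_search("An LDA-based approach for feature location"))
```

The word boundaries in `NAMED` keep short acronyms such as "lda" from matching inside unrelated words, which is one reason a second, name-based string can surface papers that never use the phrase "topic model".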

4.2 Study Selection Criteria

We only considered full research papers since full papers typically report (a) mature and complete research, and (b) more details about how topic modeling was applied. Furthermore, to be included, a paper should either apply, experiment with, or propose a topic modeling technique (e.g., develop a topic modeling technique that analyzes source code to recommend refactorings (Bavota et al. 2014b)), and meet none of the exclusion criteria: (a) the paper does not apply topic models (e.g., it applies other text mining techniques and only cites topic modeling in related or future work, such as the paper by Lian et al. (2020)); (b) the paper focuses on theoretical foundation and configurations for topic models (e.g., it discusses how to tune and stabilize topic models, such as Agrawal et al. (2018) and other meta-studies listed in Section 3.2); and (c) the paper is a secondary study (e.g., a literature review like the studies discussed in Section 3.1). We evaluated inclusion and exclusion criteria by first reading the abstracts and then reading full texts.

The search with the first search string (see Section  4.1 ) resulted in 215 papers and the search with the second search string resulted in an additional 324 papers. Applying the filtering outlined above resulted in 114 papers. Furthermore, we excluded three papers from the final set of papers: (a) Hindle et al. ( 2011 ), (b) Chen et al. ( 2012 ), and (c) Alipour et al. ( 2013 ). These papers were earlier and shorter versions of follow-up publications; we considered only the latest publications of these papers (Hindle et al. 2013 ; Chen et al. 2017 ; Hindle et al. 2016 ). This resulted in a total of 111 papers for analysis.

4.3 Data Extraction and Synthesis

We defined data items to answer the research questions and characterize the selected papers (see Table 2). The extracted data was recorded in a spreadsheet for analysis (raw data are available online; see Footnote 1). One of the authors extracted the data and the other authors reviewed it. In case of ambiguous data, all authors discussed the data to reach agreement. To synthesize the data, we applied descriptive statistics and qualitatively analyzed the data as follows:

RQ1: Regarding the data item “Technique”, we identified the topic modeling techniques applied in papers. For the data item “Supported tasks”, we assigned to each paper one software engineering task. Tasks emerged during the analysis of papers (see more details in Section  5.2.2 ). We also identified the general study outcome in relation to its goal (data item “Type of contribution”). When analyzing the type of contribution, we also checked whether papers included a comparison of topic modeling techniques (e.g., to select the best technique to be included in a newly proposed approach). Based on these data items we checked which techniques were the most popular, whether techniques were based on other techniques or used together, and for what purpose topic modeling was used.

RQ2: We identified types of data (data item “Type of data”) in selected papers as listed in Section  5.3.1 . Considering that some papers addressed one, two or three different types of data, we counted the frequency of types of data and related them with the document. Regarding “Document”, we identified the textual document and (if reported in the paper) its length. For the data item “Parameters”, we identified whether papers described modeling parameters and if so, which values were assigned to them.

RQ3: Considering that some papers may have not mentioned any pre-processing, we first checked which papers described data pre-processing. Then, we listed all pre-processing steps found and counted their frequencies.

RQ4: Considering the papers that described topic naming, we analyzed how generated topics were named (see Section 5.5). We used three types of approaches to describe how topics were named: (a) Manual - manual analysis and labeling of topics; (b) Automated - use of automated approaches to assign names to topics; and (c) Manual & Automated - a mix of both manual and automated approaches to analyze and name topics. We also described the procedures performed to name topics.
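The frequency counts used in this synthesis can be sketched as follows. The records, field names and the helper `count_item` are hypothetical stand-ins for the spreadsheet of extracted data; they only mirror the idea that one paper may contribute several values to a data item.

```python
from collections import Counter

# Hypothetical extraction records; a paper may list more than one type of data.
papers = [
    {"id": "P1", "technique": "LDA", "types_of_data": ["source code"]},
    {"id": "P2", "technique": "LSI", "types_of_data": ["source code", "issue/bug reports"]},
    {"id": "P3", "technique": "LDA", "types_of_data": ["developer communication"]},
]


def count_item(records, field):
    """Count occurrences of each value of a data item across all records."""
    counts = Counter()
    for record in records:
        value = record[field]
        # Normalize single values and lists so both are counted the same way.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts


print(count_item(papers, "technique").most_common())
print(count_item(papers, "types_of_data").most_common())
```

Because a paper can contribute several values, the total number of occurrences for a data item may exceed the number of papers.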

5.1 Overview

As mentioned in Section  4.1 , we analyzed 111 papers published between 2009 and 2020 (see Appendix  A.1 - Papers Reviewed). Most papers were published after 2013. Furthermore, most papers were published in journals (68 papers in total, 32 in EMSE alone), while the remaining 43 papers appeared in conferences (mostly MSR with sixteen papers). Table  3 shows the number of papers by venue and year.

5.2 RQ1: Topic Models Used

In this Section we first discuss which topic modeling techniques are used (Section  5.2.1 ). Then, we explore why or for what purpose these techniques were used (Section  5.2.2 ). Finally, we describe the general contributions of papers in relation to their goals (Section  5.2.3 ).

5.2.1 Topic Modeling Techniques

The majority of the papers used LDA (80 out of 111), or an LDA-based technique (30 out of 111), such as Twitter-LDA (Zhao et al. 2011). The other topic modeling technique used is LSI. Figure 2 shows the number of papers per topic modeling technique. The total number (125) exceeds the number of papers reviewed (111), because ten papers experimented with more than one technique: Thomas et al. (2013), De Lucia et al. (2014), Binkley et al. (2015), Tantithamthavorn et al. (2018), Abdellatif et al. (2019) and Liu et al. (2020) experimented with LDA and LSI; Chen et al. (2014) experimented with LDA and Aspect and Sentiment Unification Model (ASUM); Chen et al. (2019) experimented with Labeled Latent Dirichlet Allocation (LLDA) and Label-to-Hierarchy Model (L2H); Rao and Kak (2011) experimented with LDA and MLE-LDA; and Hindle et al. (2016) experimented with LDA and LLDA. ASUM, LLDA, MLE-LDA and L2H are techniques based on LDA.

Figure 2: Number of papers per topic modeling technique

The popularity of LDA in software engineering has also been discussed by others, e.g., Treude and Wagner (2019). LDA is a three-level hierarchical Bayesian model (Blei et al. 2003b). LDA defines several hyperparameters, such as α (probability of topic z_i in document d_i), β (probability of word w_i in topic z_i) and k (number of topics to be generated) (Agrawal et al. 2018).

Thirty-seven (out of 75) papers applied LDA with Gibbs Sampling (GS). Gibbs sampling is a Markov Chain Monte Carlo algorithm that samples from conditional distributions of a target distribution. Used with LDA, it is an approximate stochastic process for computing α and β (Griffiths and Steyvers 2004 ). According to experiments conducted by Layman et al. ( 2016 ), Gibbs sampling in LDA parameter estimation ( α and β ) resulted in lower perplexity than the Variational Expectation-Maximization (VEM) estimations. Perplexity is a standard measure of performance for statistical models of natural language, which indicates the uncertainty in predicting a single word. Therefore, lower values of perplexity mean better model performance (Griffiths and Steyvers 2004 ).
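The sampling idea can be sketched in a few lines of plain Python. This is a minimal collapsed Gibbs sampler, assuming fixed symmetric values for α and β and a tiny hypothetical corpus; it resamples the topic of each word from its conditional distribution and keeps count tables from which document-topic and topic-word estimates can be read. It is an illustration, not the implementation used by any of the reviewed papers, and the names `lda_gibbs` and `top_words` are made up here.

```python
import random


def lda_gibbs(docs, k, alpha, beta, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA with fixed symmetric alpha and beta.

    docs is a list of token lists. Returns (vocab, doc_topic, topic_word),
    where the last two are count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)

    doc_topic = [[0] * k for _ in docs]       # tokens in document d assigned to topic z
    topic_word = [[0] * v for _ in range(k)]  # times word w is assigned to topic z
    topic_total = [0] * k                     # tokens assigned to topic z overall
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(k)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + v * beta)
                    for t in range(k)
                ]
                z = rng.choices(range(k), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Hypothetical corpus of already tokenized documents.
corpus = [
    ["error", "debug", "warn"],
    ["debug", "error", "log"],
    ["css", "layout", "button"],
    ["layout", "css", "button"],
]
vocab, doc_topic, topic_word = lda_gibbs(corpus, k=2, alpha=0.1, beta=0.01)
print(top_words(vocab, topic_word))
```

Because the procedure is stochastic, different seeds can yield different topics, which is one source of the instability of LDA results discussed by Agrawal et al. (2018).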

Thirty papers applied modified or extended versions of LDA (“LDA-based” in Fig. 2). Table 4 shows a comparison between these LDA-based techniques. Eleven papers proposed a new extension of LDA to adapt LDA to software engineering problems (hence the same reference in the third and fourth column of Table 4). An example is the Multi-feature Topic Model (MTM) technique by Xia et al. (2017b), which implements a supervised version of LDA to create a bug triaging approach. The other 19 papers applied existing modifications of LDA proposed by others (third column in Table 4). For example, Hu and Wong (2013) used the Citation Influence Topic Model (CITM), developed by Dietz et al. (2007), which models the influence of citations in a collection of publications.

The other topic modeling technique, LSI (Deerwester et al. 1990), was published in 1990, before LDA, which was published in 2003. LSI is an information extraction technique that reduces the dimensionality of a term-document matrix using a reduction factor k (number of topics) (Deerwester et al. 1990). Compared to LSI, LDA follows a generative process that is statistically more rigorous (Blei et al. 2003b; Griffiths and Steyvers 2004). Of the 16 papers that used LSI, seven papers compared this technique to others:

One paper (Rosenberg and Moonen 2018) compared LSI with two other dimensionality reduction techniques: Principal Component Analysis (PCA) (Wold et al. 1987) and Non-Negative Matrix Factorization (NMF) (Lee and Seung 1999). The authors applied these models to automatically group log messages of continuous deployment runs that failed for the same reasons.

Four papers applied LDA and LSI at the same time to compare the performance of these models to Vector Space Model (VSM) (Salton et al. 1975), an algebraic model for information extraction. These studies supported documentation (De Lucia et al. 2014); bug handling (Thomas et al. 2013; Tantithamthavorn et al. 2018); and maintenance tasks (Abdellatif et al. 2019).

Regarding the other two papers, Binkley et al. ( 2015 ) compared LSI to Query likelihood LDA (QL-LDA) and other information extraction techniques to check the best model for locating features in source code; and Liu et al. ( 2020 ) compared LSI and LDA to Generative Vector Space Model (GVSM), a deep learning technique, to select the best performer model for documentation traceability to source code in multilingual projects.
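The Vector Space Model used as a baseline in several of these comparisons can be illustrated with a minimal sketch. It represents each text as a vector of raw term frequencies and compares two texts by the cosine of the angle between their vectors. Using raw counts instead of a weighting scheme such as tf-idf is a simplifying assumption, and the function name `cosine_similarity` and the sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two raw term-frequency vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical bug report and two candidate source documents.
report = "null pointer crash on startup"
print(cosine_similarity(report, "crash caused by null pointer"))
print(cosine_similarity(report, "update button layout"))
```

Unlike LDA and LSI, this representation keeps one dimension per term, so two texts that describe the same concept with different words receive a similarity of zero.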

5.2.2 Supported Tasks

As mentioned before, we aimed to understand why topic modeling was used in papers, e.g., if topic modeling was used to develop techniques to support specific software engineering tasks, or if it was used as a data analysis technique in exploratory studies to understand the content of large amounts of textual data. We found that the majority of papers aimed at supporting a particular task, but 21 papers (see Table  5 ) used topic modeling in empirical exploratory and descriptive studies as a data analysis technique.

We extracted the software engineering tasks described in each study (e.g., bug localization, bug assignment, bug triaging) and then grouped them into eight more generic tasks (e.g., bug handling) considering typical software development activities such as requirements, documentation and maintenance (Leach 2016). The specific tasks collected from papers are available online (see Footnote 1). Note that we kept “Bug handling” and “Refactoring” separate rather than merging them into maintenance because of the number of papers (bug handling) and the cross-cutting nature (refactoring) in these categories. Each paper was related to one of these tasks:

Architecting: tasks related to architecture decision making, such as selection of cloud or mash-up services (e.g., Belle et al. ( 2016 ));

Bug handling: bug-related tasks, such as assigning bugs to developers, prediction of defects, finding duplicate bugs, or characterizing bugs (e.g., Naguib et al. ( 2013 ));

Coding: tasks related to coding, e.g., detection of similar functionalities in code, reuse of code artifacts, prediction of developer behaviour (e.g., Damevski et al. ( 2018 ));

Documentation: support software documentation, e.g., by localizing features in documentation, automatic documentation generation (e.g., Souza et al. ( 2019 ));

Maintenance: software maintenance-related activities, such as checking consistency of versions of a software, investigate changes or use of a system (e.g., Silva et al. ( 2019 ));

Refactoring: support refactoring, such as identifying refactoring opportunities and removing bad smell from source code (e.g., Bavota et al. ( 2014b ));

Requirements: related to software requirements evolution or recommendation of new features (e.g., Galvis Carreno and Winbladh ( 2012 ));

Testing: related to identification or prioritization of test cases (e.g., Thomas et al. ( 2014 )).

Table  5 groups papers based on the topic modeling technique and the purpose. Few papers applied topic modeling to support Testing (three papers) and Refactoring (three papers). Bug handling is the most frequent supported task (33 papers). From the 21 exploratory studies, 13 modeled topics from developer communication to identify developers’ information needs: 12 analyzed posts on Stack Overflow, a Q&A website for developers (Chatterjee et al. 2019 ; Bajaj et al. 2014 ; Ye et al. 2017 ; Bagherzadeh and Khatchadourian 2019 ; Ahmed and Bagherzadeh 2018 ; Barua et al. 2014 ; Rosen and Shihab 2016 ; Zou et al. 2017 ; Chen et al. 2019 ; Han et al. 2020 ; Abdellatif et al. 2020 ; Haque and Ali Babar 2020 ) and one paper analyzed blog posts (Pagano and Maalej 2013 ). Regarding the other eight exploratory studies, three papers investigated web search queries to also identify developers’ information needs (Xia et al. 2017a ; Bajracharya and Lopes 2009 ; 2012 ); four papers investigated end user documentation to analyse users’ feedback on mobile apps (Tiarks and Maalej 2014 ; El Zarif et al. 2020 ; Noei et al. 2018 ; Hu et al. 2018 ); and one paper investigated historical “bug” reports of NASA systems to extract trends in testing and operational failures (Layman et al. 2016 ).

5.2.3 Types of Contribution

For each study, we identified what type of contribution it presents based on the study goal. We used three types of contributions (“Approach”, “Exploration” and “Comparison”, as described below) by analyzing the research questions and main results of each study. A study could contribute either an “Approach” or an “Exploration”, while “Comparison” is orthogonal, i.e., a study that presents a new approach could present a comparison of topic models as part of this contribution. Similarly, a comparison of topic models can also be part of an exploratory study.

Approach: a study develops an approach (e.g., technique, tool, or framework) to support software engineering activities based on or with the support of topic models. For example, Murali et al. ( 2017 ) developed a framework that applies LDA to Android API methods to discover types of API usage errors, while Le et al. ( 2017 ) developed a technique (APRILE+) for bug localization which combines LDA with a classifier and an artificial neural network.

Exploration: a study applies topic modeling as the technique to analyze textual data collected in an empirical study (in contrast to for example open coding). Studies that contributed an exploration did not propose an approach as described in the previous item, but focused on getting insights from data. For example, Barua et al. ( 2014 ) applied LDA to Stack Overflow posts to discover what software engineering topics were frequently discussed by developers; Noei et al. ( 2018 ) explored the evolution of mobile applications by applying LDA to app descriptions, release notes, and user reviews.

Comparison: the study (that can also contribute an “Approach” or an “Exploration”) compares topic models to other approaches. For example, Xia et al. (2017b) compared their bug triaging approach (based on the so-called Multi-feature Topic Model - MTM) with similar approaches that apply machine learning (Bugzie (Tamrawi et al. 2011)) and SVM-LDA (combining a classifier with LDA (Somasundaram and Murphy 2012)). On the other hand, De Lucia et al. (2014) compared LDA and LSI to define guidelines on how to build effective automatic text labeling techniques for program comprehension.

From the papers that contributed an approach , twenty-two combined a topic modeling technique with one or more other techniques applied for text mining:

Information extraction (e.g., VSM) (Nguyen et al. 2012 ; Zhang et al. 2018 ; Chen et al. 2020 ; Thomas et al. 2013 ; Fowkes et al. 2016 );

Classification (e.g., Support Vector Machine - SVM) (Hindle et al. 2013 ; Le et al. 2017 ; Liu et al. 2017 ; Demissie et al. 2020 ; Zhao et al. 2020 ; Shimagaki et al. 2018 ; Gopalakrishnan et al. 2017 ; Thomas et al. 2013 );

Clustering (e.g., K-means) (Jiang et al. 2019 ; Cao et al. 2017 ; Liu et al. 2017 ; Zhang et al. 2016 ; Altarawy et al. 2018 ; Demissie et al. 2020 ; Gorla et al. 2014 );

Structured prediction (e.g., Conditional Random Field - CRF) (Ahasanuzzaman et al. 2019 );

Artificial neural networks (e.g., Recurrent Neural Network - RNN) (Murali et al. 2017 ; Le et al. 2017 );

Evolutionary algorithms (e.g., Multi-Objective Evolutionary Algorithm - MOEA) (Blasco et al. 2020 ; Pérez et al. 2018 );

Web crawling (Nabli et al. 2018 ).

Pagano and Maalej ( 2013 ) was the only study that contributed an exploration that combined LDA with another text mining technique. To analyze how developer communities use blogs to share information, the authors applied LDA to extract keywords from blog posts and then analyzed related “streams of events” (commit messages and releases by time in relation to blog posts), which were created with Sequential pattern mining.

Regarding comparisons, we found that (1) 13 out of the 63 papers that contribute an approach also include some form of comparison, and (2) ten out of the 48 papers that contribute an exploration also include some form of comparison. We discuss comparisons in more detail below in Section 6.1.2.

5.3 RQ2: Topic Model Inputs

In this section we first discuss the type of data (Section  5.3.1 ). Then we discuss the actual textual documents used for topic modeling (Section  5.3.2 ). Finally, we describe which model parameters were used (Section  5.3.3 ) to configure models.

5.3.1 Types of Data

Types of data help us describe the textual software engineering content that has been analyzed with topic modeling. We identified 12 types of data in selected papers as shown in Table  6 . In some papers we identified two or three of these types of data; for example, the study of Tantithamthavorn et al. ( 2018 ) dealt with issue reports, log information and source code.

Source code (37 occurrences), issue/bug reports (22 occurrences) and developer communication (20 occurrences) were the most frequent types of data used. Seventeen papers used two to four types of data in their topic modeling technique; twelve of these papers used a combination of source code with another type of data. For example, Sun et al. ( 2015 ) generated topics from source code and developer communication to support software maintenance tasks, and in another study, Sun et al. ( 2017 ) used topics found in source code and commit messages to assign bug-fixing tasks to developers.

5.3.2 Documents

A document refers to a piece of textual data that can be longer or shorter, such as a requirements document or a single e-mail subject. Documents are concrete instances of the types of data discussed above. Figure  3 shows documents (per type of data) and how often we found them in papers. The most frequent documents are bug reports (12 occurrences), methods from source code (9 occurrences), Q&A posts (9 occurrences) and user reviews (8 occurrences).

Figure 3: Documents (leaves in the figure) by type of data (nodes in the figure)

We also analyzed document length and found the following:

In general, papers described the length of documents in number of words (see Table 7 and Footnote 2). On the other hand, two papers (Moslehi et al. 2016, 2020) described their documents’ length in minutes of screencast transcriptions (videos with one to ten minutes, no information about the size of transcripts). Sixteen papers mentioned the actual length of the documents, see Table 7. Ten papers that described the actual document length did so when describing the data used for topic modeling; four papers discussed document length while describing results; and one mentioned document length as a metric for comparing different data sources;

Most papers (80 out of 111) did not mention document length and also did not acknowledge any limitations or the impact of document length on topics.

Fifteen papers did not mention the actual document length, but at some point acknowledged the influence of document length on topic modeling. For example, Abdellatif et al. (2019) mentioned that the documents in their data set were “not long”. Similarly, Yan et al. (2016b) did not mention the length of the bug reports used but discussed the impact of the vocabulary size of their corpus on results. Moslehi et al. (2018) mentioned document length as a limitation and acknowledged that using LDA on short documents was a threat to construct validity. According to these authors, using techniques specific for short documents could have improved the outcomes of their topic modeling.

5.3.3 Model Parameters

Topic models can be configured with parameters that impact how topics are generated. For example, LDA has typically been used with symmetric Dirichlet priors over 𝜃 (document-topic distributions) and ϕ (topic-word distributions) with fixed values for α and β (Wallach et al. 2009 ). Wallach et al. ( 2009 ) explored the robustness of a topic model with asymmetric priors over 𝜃 (i.e., varying values for α ) and a symmetric prior (fixed value for β ) over ϕ . Their study found that such topic model can capture more distinct and semantically-related topics, i.e., the words in clusters are more distinct. Therefore, we checked which parameters and values were used in papers. Overall, we found the following:
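The difference between a symmetric and an asymmetric Dirichlet prior can be made concrete with a small sketch that draws one document-topic distribution from each. It relies on the standard construction of a Dirichlet sample from normalized Gamma draws; the concrete values of α and the helper name `sample_dirichlet` are hypothetical and not taken from any reviewed paper.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]


rng = random.Random(42)
k = 4

# Symmetric prior: the same alpha for every topic.
symmetric = sample_dirichlet([0.1] * k, rng)

# Asymmetric prior: hypothetical varying alphas that favor the first topic.
asymmetric = sample_dirichlet([2.0, 0.5, 0.1, 0.1], rng)

print([round(p, 3) for p in symmetric])
print([round(p, 3) for p in asymmetric])
```

Under a symmetric prior every topic has the same expected share of a document, while under an asymmetric prior the expected share of topic i is its α value divided by the sum of all α values, so some topics are expected to be more prevalent than others.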

Eighteen of the 111 papers do not mention parameters (e.g., number of topics k, hyperparameters α and β). Thirteen of these papers use LDA or an LDA-based technique, four papers use LSI, while Liu et al. (2020) use LDA and LSI.

The remaining 93 papers mention at least one parameter. The most frequent parameters discussed were k , α and β :

Fifty-eight papers mentioned actual values for k , α and β ;

Two papers mentioned actual values for α and β , but no values for k ;

Twenty-nine papers included actual values for k but not for α and β ;

Thirty-two (out of 58) papers mentioned other parameters in addition to k , α and β . For example, Chen et al. ( 2019 ) applied L2H (in comparison to LLDA), which uses the hyperparameters γ 1 and γ 2 ;

One paper (Rosenberg and Moonen 2018 ) that applied LSI, mentioned the parameter “similarity threshold” rather than k , α and β .

We then had a closer look at the 60 papers that mentioned actual values for hyperparameters α and β :

α based on k : The most frequent setting (29 papers) was α = 50/ k and β = 0.01 (i.e., α depended on the number of topics, a strategy suggested by Steyvers and Griffiths ( 2010 ) and Wallach et al. ( 2009 )). These values are a default setting in Gibbs Sampling implementations for LDA such as Mallet. Footnote 3

Fixed α and β : Five papers fixed 0.01 for both hyperparameters, as suggested by Hoffman et al. ( 2010 ). Another eight papers fixed 0.1 for both hyperparameters, a default setting in Stanford Topic Modeling Toolbox (TMT); Footnote 4 and three other papers fixed α = 0.1 and β = 1 (these three studies applied RTM).

Varying α or β : Four papers tested different values for α , where two of these papers also tested different values for β ; and one paper varied β but fixed a value for α .

Optimized parameters : Four papers obtained optimized values for hyperparameters (Sun et al. 2015 ; Catolino et al. 2019 ; Yang et al. 2017 ; Zhang et al. 2018 ). These papers applied LDA-GA (as proposed by Panichella et al. ( 2013 )), which uses genetic algorithms to find the best values for LDA hyperparameters. Regarding the actual values chosen for optimized hyperparameters, Catolino et al. ( 2019 ) did not mention the values for hyperparameters; Sun et al. ( 2015 ) and Yang et al. ( 2017 ) mentioned only the values used for k ; and Zhang et al. ( 2018 ) described the values for k , α and β .
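As a minimal sketch of the most frequent setting reported above, the heuristic α = 50/ k with β = 0.01 can be written as a small helper. The function name `heuristic_priors` is hypothetical; it is not part of Mallet or of any reviewed paper, and it only illustrates how α shrinks as the number of topics grows.

```python
def heuristic_priors(k: int, beta: float = 0.01) -> tuple[float, float]:
    """Return (alpha, beta) for the alpha = 50/k, beta = 0.01 heuristic."""
    if k <= 0:
        raise ValueError("k must be a positive number of topics")
    return 50.0 / k, beta


# alpha depends on the number of topics, beta stays fixed
for k in (10, 50, 100):
    alpha, beta = heuristic_priors(k)
    print(f"k={k}: alpha={alpha}, beta={beta}")
```

With this heuristic, a model with many topics receives a small α, which favors documents that concentrate on few topics.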

Regarding the values for k we observed the following:

The 90 papers that mentioned values for k modeled three (Cao et al. 2017 ) to 500 (Li et al. 2018 ; Lukins et al. 2010 ; Chen et al. 2017 ) topics;

Twenty-four (out of 90) papers mentioned that a range of values for k was tested in order to check the performance of the technique (e.g., Xia et al. ( 2017b )) or as a strategy to select the best number of topics (e.g., Layman et al. ( 2016 ));

Although the remaining 66 (out of 90) papers mentioned a single value used for k , most of them acknowledged that they had tried several numbers of topics or used the number of topics suggested by other studies.

As can be seen in Table  7 , there is no common trend in the values chosen for hyperparameters or k depending on the type of document or the document length.

5.4 RQ3: Pre-processing Steps

Thirteen of the papers did not mention what pre-processing steps were applied to the data before topic modeling. Seven papers only described how the data analyzed were selected, but not how they were pre-processed. Table  8 shows the pre-processing steps found in the remaining 91 papers. Each of these papers mentioned at least one of these steps.

Removing noisy content (76 occurrences), Stemming terms (61 occurrences) and Splitting terms (33 occurrences) were the most used pre-processing steps. The least frequent pre-processing step (Resolving negations) was found only in the studies of Noei et al. ( 2019 ) and Noei et al. ( 2018 ). Resolving synonyms and Expanding contractions were also less frequent, with three occurrences each.

Table  9 shows the types of noise removal in papers and their frequency. Most of the papers that described pre-processing steps removed stop words (76 occurrences). Stop words are the most common words in a language, such as “a/an” and “the” in English. Removing stop words allows topic modeling techniques to focus on more meaningful words in the corpus (Miner et al. 2012 ). Eight papers mentioned the stop words list used: Layman et al. ( 2016 ) and Pettinato et al. ( 2019 ) used the SMART stop words list; Footnote 5 Martin et al. ( 2015 ) and Hindle et al. ( 2013 ) used the Natural Language Toolkit English stop words list; Footnote 6 Bagherzadeh and Khatchadourian ( 2019 ), Ahmed and Bagherzadeh ( 2018 ) and Yan et al. ( 2016b ) used the Mallet stop words list; Footnote 7 and Mezouar et al. ( 2018 ) used the Moby stop words list. Footnote 8

As can be seen in Table  9 , some papers removed words based on the frequency of their occurrence (most or least frequent terms) or length (words shorter than four, three or two letters or long terms). Other papers removed long paragraphs. For example, Henß et al. ( 2012 ) removed paragraphs longer than 800 characters because most paragraphs in their data set were shorter than that. We also found two papers that removed short documents: Gorla et al. ( 2014 ) removed documents with fewer than ten words, and Palomba et al. ( 2017 ) removed documents with fewer than three words. The concept of non-informative content depends on the context of each paper. In general, it refers to any data considered not relevant for the objective of the study. For example, Choetkiertikul et al. ( 2017 ), which aimed at predicting bugs in issue reports, removed issues that took too much time to be resolved. Noei et al. ( 2019 ) and Fu et al. ( 2015 ) removed content (end user reviews and commit messages) that did not describe feedback or cause of change.
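The filtering steps described above (removing stop words, short words and short documents) can be sketched as follows. This is an illustration under stated assumptions: the stop-word set is a tiny made-up list rather than one of the cited lists (SMART, NLTK, Mallet, Moby), the thresholds are arbitrary defaults, and document length is counted after token filtering, which may differ from what individual papers did.

```python
# Tiny illustrative stop-word list (an assumption, not one of the cited lists).
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "it"}


def clean_corpus(documents, min_word_len=3, min_doc_words=3):
    """Drop stop words and short words, then drop documents left too short."""
    cleaned = []
    for text in documents:
        tokens = [
            token
            for token in text.lower().split()
            if token not in STOP_WORDS and len(token) >= min_word_len
        ]
        # Keep only documents that still have enough words after filtering.
        if len(tokens) >= min_doc_words:
            cleaned.append(tokens)
    return cleaned


docs = ["The app crashes on login after the update", "It is ok"]
print(clean_corpus(docs))
```

The second document disappears entirely because nothing is left after filtering, which mirrors the empty documents some studies had to remove.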

5.5 RQ4: Topic Naming

Topic naming is about assigning labels (names) to topics (word clusters) to give the clusters a human-understandable meaning. Seventy-five papers (out of 111) did not mention whether or how topics were named. These papers only used the word clusters for analysis and did not require names. For example, Xia et al. ( 2017a ) and Canfora et al. ( 2014 ) did not name topics, but mapped the word clusters to the documents (search queries and source code comments) used as input for topic modeling. These papers used the probability of a document belonging to a topic ( 𝜃 ) to associate a document with the topic with the highest probability.
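Associating a document with its most probable topic amounts to taking the maximum over that document's row of 𝜃. The sketch below assumes 𝜃 is available as a list of per-document probability lists; the helper name `dominant_topic` and the example values are hypothetical.

```python
def dominant_topic(theta_row):
    """Return (topic_index, probability) of the most probable topic."""
    best = max(range(len(theta_row)), key=lambda topic: theta_row[topic])
    return best, theta_row[best]


# One row per document, one column per topic (made-up values).
theta = [
    [0.10, 0.75, 0.15],
    [0.60, 0.30, 0.10],
]
assignments = [dominant_topic(row)[0] for row in theta]
print(assignments)
```

No topic name is needed for this mapping, which is consistent with papers that worked with word clusters directly.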

Of the 36 papers (out of 111) that mentioned topic naming (see Table  10 ), we identified three ways in which topics were named:

Automated: Assigning names to word clusters without human intervention;

Manual: Manually checking the meaning and the combination of words in a cluster to “deduce” a name, sometimes validated with expert judgment;

Manual & Automated: Mix of manual and automated; e.g., topics are manually labeled for one set of clusters to then train a classifier for naming another set of clusters.

Most of the papers (30 out of 36) assigned one name to one topic. However, we identified six papers that used one name for multiple topics (Hindle et al. 2013 ; Pagano and Maalej 2013 ; Bajracharya and Lopes 2012 ; Rosen and Shihab 2016 ) or labeled a topic with multiple names (Zou et al. 2017 ; Gao et al. 2018 ). Two of the papers (Hindle et al. 2013 ; Bajracharya and Lopes 2012 ) that assigned one name to multiple topics used predefined labels, and in the other two papers (Pagano and Maalej 2013 ; Rosen and Shihab 2016 ) the authors interpreted words in the clusters to deduce names.

Regarding the papers that assigned multiple names to a topic, Zou et al. ( 2017 ) assigned no, one or more names, depending on how many words in the predefined word list matched words in clusters. Gao et al. ( 2018 ) used an automated approach to label topics with the three most relevant phrases and sentences from the end user reviews inputted to their topic model. The relevance of phrases and sentences was obtained with the metrics Semantic and Sentiment scores proposed by these authors.
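To illustrate how matching against predefined word lists can yield no, one or several names for a topic, consider the following sketch. It is not the procedure of Zou et al. ( 2017 ): the labels, word lists, the threshold `min_matches` and the function name are all assumptions made for the example.

```python
def match_labels(topic_words, label_words, min_matches=2):
    """Return every label whose word list shares enough words with the topic."""
    topic_set = set(topic_words)
    return [
        label
        for label, words in label_words.items()
        if len(topic_set & set(words)) >= min_matches
    ]


# Hypothetical predefined labels and their word lists.
label_words = {
    "performance": ["slow", "memory", "latency"],
    "security": ["password", "encrypt", "token"],
}

print(match_labels(["memory", "slow", "crash", "password"], label_words))
print(match_labels(["button", "color"], label_words))
```

The first topic receives one name and the second receives none; a topic overlapping with both word lists would receive two.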

6 Discussion

6.1 RQ1: Topic Modeling Techniques

6.1.1 Summary of Findings

LDA is the most frequently used topic model. Almost all papers (95 out of 111) applied LDA or an LDA-based technique, while nine papers applied LSI to identify topics and seven papers used LDA and LSI. Regarding the papers that used LDA-based techniques, eleven (out of 30) proposed their own LDA-based technique (Fu et al. 2015 ; Nguyen et al. 2011 ; Liu et al. 2017 ; Cao et al. 2017 ; Panichella et al. 2013 ; Yan et al. 2016a ; Xia et al. 2017b ; Nguyen et al. 2012 ; Damevski et al. 2018 ; Gao et al. 2018 ; Rao and Kak 2011 ). This may indicate that the default LDA implementation may not be adequate to support specific software engineering tasks or to extract meaningful topics from all types of data. We discuss topic modeling techniques and their inputs further in Section  6.2.2 . Furthermore, we found that topic modeling is used to develop tools and methods to support software engineers and concrete tasks (the most frequently supported task we found was bug handling), but also as a data analysis technique for textual data to explore empirical questions (see for example the “oldest” paper in our sample published in 2009 (Bajracharya and Lopes 2009 )).

One aspect that we did not specifically address in this review, but which impacts the applicability of topic models, is their computational overhead. Computational overhead refers to the processing time and computational resources (e.g., memory, CPU) required for topic modeling. As discussed by others, topic modeling can be computationally intensive (Hoffman et al. 2010 ; Treude and Wagner 2019 ; Agrawal et al. 2018 ). However, we found that only a few papers (seven out of 111) mentioned computational overhead at all. Of these seven papers, five mentioned processing time (Bavota et al. 2014b ; Zhao et al. 2020 ; Luo et al. 2016 ; Moslehi et al. 2016 ; Chen et al. 2020 ), one paper mentioned computational requirements and some processing times (e.g., processor, data pre-processing time, LDA processing time and clustering processing time), and one paper only mentioned that their technique was processed in “few seconds” (Murali et al. 2017 ). Hence, based on the reviewed studies we cannot provide broader insights into the practical applicability and potential constraints of topic modeling based on the computational overhead.

6.1.2 Comparative Studies

As mentioned in Sections  5.2.1 and  5.2.3 , we identified studies that used more than one topic modeling technique and compared their performance. In detail, we found studies that (1) compared topic modeling techniques to information extraction techniques, such as Vector Space Model (VSM), an algebraic model (Salton et al. 1975 ) (see Table  11 ), (2) proposed an approach that uses a topic modeling technique and compared it to other approaches (which may or may not use topic models) with similar goals (see Table  12 ), and (3) compared the performance of different settings for a topic modeling technique or a newly proposed approach that utilizes topic models (see Table  13 ). The column “Metric” of Tables  11 ,  12 and  13 lists the metrics used in the comparisons to decide which techniques performed “better” (based on the metrics’ interpretation). Metrics in bold were proposed for or adapted to a specific context (e.g., SCORE and Effort reduction), while the other metrics are standard NLP metrics (e.g., Precision, Recall and Perplexity). Details about the metrics used to compare the techniques are provided in Appendix  A.2 - Metrics Used in Comparative Studies.

As shown in Table  11 , ten papers compared topic modeling techniques to information extraction techniques. For example, Rosenberg and Moonen ( 2018 ) compared LSI with two other dimensionality reduction techniques (PCA and NMF) to group log messages of failing continuous deployment runs. Nine out of these ten papers presented explorations, i.e., studies experimented with different models to discuss their application to specific software engineering tasks, such as bug handling, software documentation and maintenance. Thomas et al. ( 2013 ) on the other hand experimented with multiple models to propose a framework for bug localization in source code that applies the best performing model.

Four papers in Table  11 (De Lucia et al. 2014 ; Tantithamthavorn et al. 2018 ; Abdellatif et al. 2019 ; Thomas et al. 2013 ) compared the performance of LDA, LSI and VSM with source code and issue/bug reports. Except for De Lucia et al. ( 2014 ), these studies applied Top-k accuracy (see Appendix  A.2 - Metrics Used in Comparative Studies) to measure the performance of models, and the best performing model was VSM. Tantithamthavorn et al. ( 2018 ) found that VSM achieves both the best Top-k performance and the least required effort for method-level bug localization. Additionally, according to De Lucia et al. ( 2014 ), VSM possibly performed better than LSI and LDA due to the nature of the corpus used in their study: LDA and LSI are ideal for heterogeneous collections of documents (e.g., user manuals from different systems), but in the study by De Lucia et al. ( 2014 ) each corpus was a collection of code classes from a single software system.
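For readers unfamiliar with Top-k accuracy, the sketch below uses one common definition: the fraction of queries (e.g., bug reports) for which at least one relevant item appears among the first k ranked results. The exact definition used by each study is given in the appendix and may differ; the function name, bug identifiers and method names here are made up for illustration.

```python
def top_k_accuracy(rankings, relevant, k):
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(
        1
        for query, ranked in rankings.items()
        if any(item in relevant[query] for item in ranked[:k])
    )
    return hits / len(rankings)


# Hypothetical ranked methods per bug report and the truly buggy methods.
rankings = {
    "bug1": ["A.m1", "B.m2", "C.m3"],
    "bug2": ["D.m4", "E.m5", "F.m6"],
}
relevant = {"bug1": {"B.m2"}, "bug2": {"F.m6"}}

for k in (1, 2, 3):
    print(k, top_k_accuracy(rankings, relevant, k))
```

A model is considered better under this metric when relevant items are ranked early enough to fall within the chosen cutoff.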

Ten studies proposed an approach that uses a topic modeling technique and compared it to similar approaches (shown in Table  12 ). In column “Approaches compared” of Table  12 , the approach in bold is the one proposed by the study (e.g., Cao et al. 2017 ) or the topic modeling technique used in their approach (e.g., Thomas et al. 2014 ). All newly proposed approaches were the best performing ones according to the metrics used.

In addition to the papers mentioned in Tables  11 and  12 , four papers compared the performance of different settings for a topic modeling technique or tested which topic modeling technique works best in their newly proposed approach (see Table  13 ). Biggers et al. ( 2014 ) offered specific recommendations for configuring LDA when localizing features in Java source code, and observed that certain configurations outperform others. For example, they found that commonly used heuristics for selecting LDA hyperparameter values ( β = 0.01 or β = 0.1) in source code topic modeling are not optimal (similar to what has been found by others, see Section  3.2 ). The other three papers (Chen et al. 2014 ; Fowkes et al. 2016 ; Poshyvanyk et al. 2012 ) developed approaches which were tested with different settings (e.g., the approach applying LDA or ASUM (Chen et al. 2014 )).

Regarding the datasets used by comparative studies, only Rao and Kak ( 2011 ) used a benchmarking dataset (iBUGS). Most of the comparative studies (13 out of 24) used source code or issue/bug reports from open source software, which are subject to evolution. The advantage of using benchmarking datasets rather than “living” datasets (e.g., an open source Java system) is that their data are static and the same across studies. Additionally, data in benchmarking datasets are usually curated. This means that the results of replicating studies can be compared to the original study when both used the same benchmarking dataset.

Finally, we highlight that each of the above mentioned comparisons has a specific context. This means that, for example, the type of data analyzed (e.g., Java classes), the parameter setting (e.g., k = 50), the goal of the comparison (e.g., to select the best model for bug localization or for tracing documentation in source code) and pre-processing (e.g., stemming and stop word removal) were different. Therefore, it is not possible to “synthesize” the results from the comparisons across studies by aggregating the different comparisons in different papers, even for studies that appear to have similar goals or use the same topic modeling techniques, such as comparing the same models with similar types of data (such as Tantithamthavorn et al. 2018 and Abdellatif et al. 2019 ).

6.2 RQ2: Inputs to Topic Models

6.2.1 Summary of Findings

Source code, developer communication and issue/bug reports were the most frequent types of data used for topic modeling in the reviewed papers. Consequently, most of the documents referred to individual or groups of functions or methods, individual Q&A posts, or individual bug reports; another frequent document was an individual user review (more discussions are in Section  6.2.3 ). We also found that few papers (16 out of 111) mentioned the actual length of documents used for topic modeling (we discuss this more in Section  6.2.2 ).

Regarding modeling parameters, most of the papers (93 out of 111) explicitly mentioned the configuration of at least one parameter, e.g., k , α or β for LDA. We observed that the setting α = 50/ k and β = 0.01 (asymmetric α and symmetric β ) as suggested by Steyvers and Griffiths ( 2010 ) and Wallach et al. ( 2009 ) was frequently used (28 out of 93 papers). Additionally, papers that applied LDA mostly used the default parameters of the tools used to implement LDA (e.g., Mallet, see Footnote 3, with α = 50/ k and β = 0.01 as default). This finding is similar to what has been reported by others, e.g., according to another review by Agrawal et al. ( 2018 ), LDA is frequently applied “as is out-of-the-box” or with little tuning. This means that studies may rely on the default settings of the tools used with their topic modeling technique, such as Mallet and TMT, rather than try to optimize parameters.

6.2.2 Documents and Parameters for Topic Models

Short texts : According to Lin et al. ( 2014 ), topic models such as LDA have been widely adopted and successfully used with traditional media like edited magazine articles. However, applying LDA to informal communication text such as tweets, comments on blog posts, instant messaging and Q&A posts may be less successful. Such user-generated content is characterized by very short document length, a large vocabulary and a potentially broad range of topics. As a consequence, there are not enough words in a document to create meaningful clusters, compromising the performance of the topic modeling. This means that probabilistic topic models such as LDA perform sub-optimally when applied “as is” with short documents even when hyperparameters ( α and β in LDA) are optimized (Lin et al. 2014 ). In our sample there were only two papers that mentioned the use of an LDA-based technique specifically for short documents (Hu et al. 2019 ; Hu et al. 2018 ). Hu et al. ( 2019 ) and Hu et al. ( 2018 ) applied Twitter-LDA with end user reviews. Furthermore, Moslehi et al. ( 2018 ) used a weighting algorithm in documents to generate topics with more relevant words; they also acknowledged that the use of a short text technique could have improved their topic model.

As shown in Table  7 , few papers mentioned the actual length of documents. Considering a single document from a corpus, we observed that most papers potentially used short texts (all documents found in papers are shown in Fig.  3 ). For example, papers used an individual search query (Xia et al. 2017a ), an individual Q&A post (Barua et al. 2014 ), an individual user review (Nayebi et al. 2018 ), or an individual commit message (Canfora et al. 2014 ) as a document. Among the papers that mentioned document length, the shortest documents were an individual commit message (9 to 20 words) (Canfora et al. 2014 ) and an individual method (14 words) (Tantithamthavorn et al. 2018 ). Both studies applied LDA.

Two approaches to improve the performance of LDA when analyzing short documents are pooling and contextualization (Lin et al. 2014 ). Pooling refers to aggregating similar (e.g., semantically or temporally) documents into a single document (Mehrotra et al. 2013 ). For example, among the papers analysed, Pettinato et al. ( 2019 ) used temporal pooling and combined short log messages into a single document based on a temporal order. Contextualization refers to creating subsets of documents according to a type of context; considering tweets as documents, the type of context can refer to time, user and hashtags associated with tweets (Tang et al. 2013 ). For example, Weng et al. ( 2010 ) combined all the individual tweets of an author into one pseudo-document (rather than treating each tweet as a document). Therefore, with the contextualization approach, the topic model uses word co-occurrences at a context level instead of at the document level to discover topics.
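Both strategies come down to grouping short texts under a shared key before topic modeling. The following sketch groups messages either by author (in the spirit of the pseudo-documents of Weng et al. ( 2010 )) or by hour (a simple form of temporal pooling). The record fields `author`, `hour` and `text`, the function name and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def pool_documents(messages, key):
    """Concatenate the texts of all messages that share the same key."""
    pooled = defaultdict(list)
    for message in messages:
        pooled[key(message)].append(message["text"])
    return {group: " ".join(texts) for group, texts in pooled.items()}


# Hypothetical short messages.
messages = [
    {"author": "ana", "hour": 9, "text": "build failed"},
    {"author": "ben", "hour": 9, "text": "tests timeout"},
    {"author": "ana", "hour": 10, "text": "retry build"},
]

by_author = pool_documents(messages, key=lambda m: m["author"])
by_hour = pool_documents(messages, key=lambda m: m["hour"])
print(by_author)
print(by_hour)
```

Each pooled string would then be treated as one document, so word co-occurrences are counted at the level of the group rather than the individual message.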

Hyperparameters : Table  14 shows the hyperparameter settings and types of data of the papers that mentioned the value of at least one model parameter. In Table  14 we also highlight the topic modeling techniques used. Note that some topic modeling techniques (e.g., RTM) can receive more parameters than the ones mentioned in Table  14 (e.g., number of documents, similarity thresholds); all parameters mentioned in papers are available online in the raw data of our study. Footnote 1 When comparing hyperparameter settings, topic modeling techniques and types of data, we observed the following:

Papers that used LDA-GA, an LDA-based technique that optimizes hyperparameters with Genetic algorithms, applied it to data from developer documentation or source code;

LDA was used with all three types of hyperparameter settings across studies. The most common setting was α based on k for developer communication and source code;

Most of the LDA-based techniques applied fixed values for α and β .

Most of the papers that applied only LSI as the topic modeling technique did not mention hyperparameters. As LSI is a simpler model than LDA, it generally requires only the number of topics k . One paper that applied LSI to source code, however, mentioned α and k (Poshyvanyk et al. 2012 ).

Number of topics : By relating the type of data to the number of topics, we aimed at finding whether the choice of the number of topics is related to the data used in the topic modeling techniques (see also Table  7 ). However, the number of topics used and the data in the studies are rather diverse. Therefore, the extent to which we can synthesize practices and offer insights from previous studies on how to choose the number of topics is rather limited.

From the 90 papers that mentioned number of topics ( k ), we found that 66 papers selected a specific number of topics (e.g., based on previous works with similar data or addressing the same task), while 24 papers used several numbers of topics (e.g., Yan et al. ( 2016b ) used 10 to 120 topics in steps of 10). To provide an example of how the number of topics differed even when the same type of data was analyzed with the same topic modeling technique, we looked at studies that applied LDA in textual data from developer communication (mostly Q&A posts) to propose an approach to support documentation. For these papers we found one paper that did not mention k (Henß et al. 2012 ), one paper that modeled different numbers of topics ( k = 10,20,30) (Asuncion et al. 2010 ), one paper that modeled k = 15 (Souza et al. 2019 ) and another paper that modeled k = 40 (Wang et al. 2015 ). This illustrates that there is no common or recommended practice that can be derived from the papers.

Some papers mentioned that they tested several numbers of topics before selecting the most appropriate value for k (with regard to the studies’ goals) but did not mention the range of values tested. Regarding papers that mentioned such a range, we identified four studies (Nayebi et al. 2018 ; Chen et al. 2014 ; Layman et al. 2016 ; Nabli et al. 2018 ) that tested several values for k and used perplexity (see details in Appendix  A.2 - Metrics Used in Comparative Studies) of models to evaluate which value of k generated the best performing model; three studies (Zhao et al. 2020 ; Han et al. 2020 ; El Zarif et al. 2020 ) also selected the number of topics after testing several values for k ; however, they used topic coherence (Röder et al. 2015 ) to evaluate models. One paper (Haque and Ali Babar 2020 ) used both perplexity and topic coherence to select a value for k . Metrics of topic coherence score the probability of a pair of words from the resulting word clusters being found together in (a) external data sources (e.g., Wikipedia pages) or (b) the documents used by the topic model that generated those word clusters (Röder et al. 2015 ).
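As an illustration of variant (b), the sketch below computes a UMass-style coherence score from document co-occurrence counts in the modeled corpus. This particular measure is our choice for the example and is not necessarily the one used in the cited studies; the function name, the smoothing constant `eps` and the toy corpus are assumptions.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents, eps=1.0):
    """Sum of log((D(w_i, w_j) + eps) / D(w_j)) over ordered pairs of top words.

    D(.) counts the documents that contain all given words. Scores closer
    to zero indicate words that co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    # combinations yields index pairs (j, i) with j < i, i.e. w_j ranks higher.
    for j, i in combinations(range(len(top_words)), 2):
        w_j, w_i = top_words[j], top_words[i]
        denominator = doc_freq(w_j)
        if denominator == 0:
            continue  # word never occurs in the corpus; skip the pair
        score += math.log((doc_freq(w_i, w_j) + eps) / denominator)
    return score


# Toy tokenized corpus.
docs = [
    ["bug", "crash", "fix"],
    ["bug", "crash"],
    ["ui", "button"],
    ["ui", "button", "color"],
]
print(umass_coherence(["bug", "crash"], docs))
print(umass_coherence(["bug", "button"], docs))
```

When several values of k are tested, a score like this can be averaged over all topics of each model, and the value of k with the highest average can be preferred.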

6.2.3 Supported Tasks, Types of Data and Types of Contribution

We looked into the relationship between the tasks supported by papers, the type of data used and the types of contributions (see Table  15 ). We observed the following:

Source code was a frequent type of data in papers; consequently it appeared for almost all supported tasks, except for exploratory studies;

Considering exploratory studies, most papers used developer communication (13 out of 21), followed by search queries and end user communication (three papers each);

Papers that supported bug handling mostly used issue/bug reports, source code and end user communication;

Log information was used by papers that supported maintenance, bug handling, and coding;

Considering the papers that supported documentation, three used transcript texts from speech;

From the four papers related to the type of data developer documentation, two supported architecting tasks and the other two, documentation tasks.

Regarding the type of data, URLs and transcripts were only used in studies that contributed an approach.

We found that most of the exploratory studies used data that is less structured. For example, developer communication, such as Q&A posts and conversation threads, generally does not follow a standardized template. On the other hand, issue reports are typically submitted through forms, which enforce a certain structure.

6.3 RQ3: Data Pre-processing

6.3.1 Summary of Findings

Most of the papers (91 out of 111) pre-processed the textual data before topic modeling. Removing noisy content was the most frequent pre-processing step (as is typical for natural language processing), followed by stemming and splitting words. Miner et al. ( 2012 ) consider tokenizing as one of the basic data pre-processing steps in text mining. However, in comparison to other basic pre-processing steps such as stemming, splitting words and removing noise, tokenizing was not frequently found in papers (or at least it was not mentioned).

Eight papers (Henß et al. 2012 ; Xia et al. 2017b ; Ahasanuzzaman et al. 2019 ; Abdellatif et al. 2019 ; Lukins et al. 2010 ; Tantithamthavorn et al. 2018 ; Poshyvanyk et al. 2012 ; Binkley et al. 2015 ) tested how pre-processing steps affected the performance of topic modeling or topic model-based approaches. For example, Henß et al. ( 2012 ) tested several pre-processing steps (e.g., removing stop words, long paragraphs and punctuation) in e-mail conversations analyzed with LDA. They found that removing such content increased LDA’s capability to grasp the actual semantics of software mailing lists. Ahasanuzzaman et al. ( 2019 ) proposed an approach which applies LDA and Conditional Random Field (CRF) to localize concerns in Stack Overflow posts. The authors did not incorporate stemming and stop words removal in their approach because in preliminary tests these pre-processing steps decreased the performance of the approach.

6.3.2 Pre-processing Different Types of Data

Table  16 shows how different types of data were pre-processed. We observed that stemming, removing noise, lowercasing, and splitting words were commonly used for all types of data. Regarding the differences, we observed the following:

For developer communication there were specific types of noisy content that were removed: URLs, HTML tags and code snippets. This might have happened because most of the papers used Q&A posts as documents, which frequently contain hyperlinks and code examples;

Removing non-informative content was frequently applied to end user communication and end user documentation;

Expanding contracted terms (e.g., “didn’t” to “did not”) was applied to end user communication and issue/bug reports;

Removing empty documents and eliminating extra white spaces were applied only in end user communication. Empty documents occurred in this type of data because after the removal of stop words no content was left (Chen et al. 2014 );

For source code there was a specific type of noise to be removed: programming language-specific keywords (e.g., “public”, “class”, “extends”, “if”, and “while”).
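The noise removal described for developer communication (code snippets, HTML tags and URLs in Q&A posts) can be approximated with a few regular expressions. This is a rough sketch under the assumption that posts are HTML strings with code wrapped in `<pre>` or `<code>` elements; the function name and sample post are hypothetical, and a proper HTML parser would be more robust on real data.

```python
import re

# Code wrapped in <pre>...</pre> or <code>...</code> (assumed markup).
CODE_BLOCK = re.compile(r"<pre>.*?</pre>|<code>.*?</code>", re.DOTALL | re.IGNORECASE)
HTML_TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"https?://\S+")


def strip_post_noise(html):
    """Remove code snippets, HTML tags and URLs, then normalize whitespace."""
    text = CODE_BLOCK.sub(" ", html)
    # Tags are removed before URLs so that a URL followed directly by a
    # closing tag does not swallow the tag's neighbors.
    text = HTML_TAG.sub(" ", text)
    text = URL.sub(" ", text)
    return " ".join(text.split())


post = (
    "<p>How do I parse JSON? See https://example.com/guide</p>"
    "<pre><code>json.loads(s)</code></pre>"
)
print(strip_post_noise(post))
```

The final whitespace normalization also covers the removal of extra white spaces mentioned for end user communication.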

Table  16 shows that splitting words, stop words removal and stemming were frequently applied to source code and most of these studies (15) applied these three steps at the same time. Studies that applied these pre-processing steps to source code mostly used methods, classes, or comments in classes/methods as documents. For example, Silva et al. ( 2016 ), who applied LDA, performed these three pre-processing steps on classes from two open source systems using TopicXP (Savage et al. 2010 ). TopicXP is an Eclipse plug-in that extracts source code, pre-processes it and executes LDA. This plug-in implements splitting words, stop words removal and stemming.

Splitting words was the most frequent pre-processing step in source code. Studies used this step to separate Camel Cases in methods and classes (e.g., the class constructor InvalidRequestTest produces the terms “invalid”, “request” and “test”). For example, Tantithamthavorn et al. ( 2018 ) compared LDA, LSI and VSM testing different combinations of pre-processing steps to the methods’ identifiers inputted to these techniques. The best performing approach was VSM with splitting words, stop words removal and stemming.
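A minimal sketch of identifier splitting is shown below. It reproduces the InvalidRequestTest example from the text using a regular expression; the pattern and the function name are our own illustration and not the implementation used by any reviewed study or tool.

```python
import re

# Matches, in order: an acronym followed by a capitalized word ("HTTP" in
# "HTTPResponse"), a capitalized or lowercase word, a trailing acronym,
# or a run of digits. Underscores and other separators are skipped.
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(identifier):
    """Split a camel-case or snake-case identifier into lowercase terms."""
    return [part.lower() for part in _TOKEN.findall(identifier)]


print(split_identifier("InvalidRequestTest"))
print(split_identifier("parseHTTPResponse"))
print(split_identifier("max_value"))
```

The resulting terms can then be passed to stop word removal and stemming, the other two steps that studies commonly combined with splitting.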

Removing stop words in source code refers to the exclusion of the most common words in a language (e.g., “a/an” and “the” in English), as in studies that used other types of data. Removing stop words in source code is also different from removing programming language keywords, and studies mentioned these as separate steps. Lukins et al. ( 2010 ), for example, tested how removing stop words from their documents (comments and identifiers of methods) affected the topics generated by their LDA-based approach. They found that this step did not improve the results substantially.

As mentioned in Section  5.4 , stemming is the process of normalizing words into their single forms by identifying and removing prefixes, suffixes and pluralisation (e.g., “development”, “developer”, “developing” become “develop”). Regarding stemming in source code, papers normalized identifiers of classes and methods, comments related to classes and methods, test cases or a source code file. Three papers tested the effect of this pre-processing step on the performance of their techniques (Tantithamthavorn et al. 2018 ; Poshyvanyk et al. 2012 ; Binkley et al. 2015 ), and one of these papers also tested removing stop words and splitting words (Tantithamthavorn et al. 2018 ). Poshyvanyk et al. ( 2012 ) tested the effect of stemming classes on the performance of their LSI-based approach. The authors concluded that stemming can positively impact feature localization by producing topics (“concept lattices” in their study) that effectively organize the results of searches in source code. Binkley et al. ( 2015 ) compared the performance of LSI, QL-LDA and other techniques. They also tested the effects of stemming (with two different stemmers: Porter Footnote 9 and Krovetz Footnote 10 ) and non-stemming methods from five open source systems. These authors found that they obtained better performance in terms of the models’ Mean Reciprocal Rank (MRR, details in Appendix  A.2 - Metrics Used in Comparative Studies) with non-stemming.
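Mean Reciprocal Rank averages, over all queries, the reciprocal of the rank at which the first relevant result appears. The sketch below follows this standard definition; the function name, query identifiers and rankings are made up, and queries without any relevant result contribute zero.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first relevant item per query."""
    total = 0.0
    for query, ranked in rankings.items():
        for position, item in enumerate(ranked, start=1):
            if item in relevant[query]:
                total += 1.0 / position
                break  # only the first relevant item counts
    return total / len(rankings)


# Hypothetical search results for two feature-location queries.
rankings = {
    "q1": ["Parser.read", "Lexer.next", "Ast.build"],
    "q2": ["Cache.put", "Cache.get", "Store.load"],
}
relevant = {"q1": {"Lexer.next"}, "q2": {"Store.load"}}

print(mean_reciprocal_rank(rankings, relevant))
```

Comparing this value for stemmed and non-stemmed inputs is one way to quantify the kind of effect reported by Binkley et al. ( 2015 ).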

Additionally, we found that even though some papers used the same type of data, they pre-processed data differently since they had different goals and applied different techniques. For example, Ye et al. ( 2017 ), Barua et al. ( 2014 ) and Chen et al. ( 2019 ) used developer communication (Q&A posts as documents). Ye et al. ( 2017 ) and Barua et al. ( 2014 ) removed stop words, code snippets and HTML tags, while Barua et al. ( 2014 ) also stemmed words. On the other hand, Chen et al. ( 2019 ) removed stop words and the least and the most frequent words, and identified bi-grams. Some studies considered the advice on data pre-processing from previous studies (e.g., Chen et al. 2017 ; Li et al. 2018 ), while others adopted steps that are commonly used in NLP, such as noise removal and stemming (Miner et al. 2012 ) (e.g., Demissie et al. 2020 ). This means that the choice of pre-processing steps does not depend only on the characteristics of the type of data inputted to topic modeling techniques.

6.4 RQ4: Assigning Names to Topics

Most papers did not mention if or how they named topics. The majority of papers that explicitly assigned names to topics (27 out of 36) used a manual approach and relied on human judgment (researchers’ interpretation) of words in clusters. One paper (Rosen and Shihab 2016) justified their use of a manual approach by arguing that there was no tool that could give human-readable topics based on word clusters. Thus, the authors checked every generated word cluster and the documents used (individual questions from a Q&A website) to make sure they labeled topics appropriately.

Table  17 shows how topics were named and the type of data analyzed. Table  18 shows how topics were named and the type of contributions they make. We observed the following:

Studies that modeled topics from developer documentation, transcripts and URLs did not mention topic naming. Studies that contributed with both exploration and comparison also did not mention topic naming;

Topics were mostly named in studies that used data from developer communication (ten occurrences) and in exploratory studies (22 occurrences).

From studies that compared topic models or topic modeling-based approaches (see Section  6.1.2 ), only one study (Yan et al. 2016b ) named topics (automatically with predefined labels).

Fourteen papers acknowledged limitations of manual topic naming:

Twelve papers (Bagherzadeh and Khatchadourian 2019 ; Ahmed and Bagherzadeh 2018 ; Martin et al. 2015 ; Hindle et al. 2013 ; Pagano and Maalej 2013 ; Zou et al. 2017 ; Pettinato et al. 2019 ; Layman et al. 2016 ; Ray et al. 2014 ; Tiarks and Maalej 2014 ; Mezouar et al. 2018 ; Abdellatif et al. 2020 ) acknowledged that how topics were named could be a threat to validity. For example, Layman et al. ( 2016 ) mentioned that they did not evaluate the accuracy of the manual topic naming, which was based on their expertise.

Three papers (Hindle et al. 2015; Bajracharya and Lopes 2012; Li et al. 2018) mentioned difficulties in assigning names to topics. Hindle et al. (2015), for example, explained that labeling topics was difficult due to many project-specific and unclear terms in clusters.

One paper (Pettinato et al. 2019) acknowledged that an automated extraction of topic names could replace the manual labeling applied to their data.

Hindle et al. ( 2015 ) provided some recommendations on topic analysis in software engineering based on their experiences. Below are some of their recommendations related to topic naming:

Some of the generated topics will not be relevant (e.g., clusters filled with common terms may not address any particular subject) and topics may be duplicated. This means that not all topics have to be named and used for analysis;

Domain experts can label topics better than non-experts, because they are more familiar with domain-specific keywords that may appear in word clusters;

It is important to rely on the relationship between topics generated and the original data. Hindle et al. ( 2015 ) argued that “the content of the topic can be interpreted in many different ways and LDA does not look for the same patterns that people do”.
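The first recommendation (not every topic needs to be named) can be supported by a simple screening step before manual labeling. The sketch below flags topics whose top words are mostly common terms and pairs of topics whose top words largely overlap. It is one hypothetical heuristic, not a procedure described by Hindle et al. (2015); the function names, the thresholds and the toy topics are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def screen_topics(
    topics: Dict[str, List[str]],
    common_terms: Set[str],
    common_ratio: float = 0.6,   # hypothetical threshold
    duplicate_sim: float = 0.5,  # hypothetical threshold
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (topics dominated by common terms, pairs of near-duplicate topics)."""
    generic = [
        name for name, words in topics.items()
        if words and sum(w in common_terms for w in words) / len(words) >= common_ratio
    ]
    duplicates = [
        (a, b) for a, b in combinations(topics, 2)
        if jaccard(set(topics[a]), set(topics[b])) >= duplicate_sim
    ]
    return generic, duplicates


# Toy word clusters (top words per topic), invented for illustration.
topics = {
    "t0": ["thread", "lock", "race", "deadlock", "mutex"],
    "t1": ["lock", "thread", "mutex", "deadlock", "wait"],
    "t2": ["use", "get", "need", "work", "file"],
}
generic, duplicates = screen_topics(
    topics, common_terms={"use", "get", "need", "work", "try"}
)
```

Flagged topics are only candidates for exclusion or merging. In line with the third recommendation, the final decision should still be checked against the original documents.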

6.5 Implications

The goal of this study was to describe how topic modeling is applied in software engineering research. We found studies that experimented, explored data, or proposed solutions to support different software engineering tasks with topic models. Our findings help researchers and practitioners as follows:

Understand which topic modeling techniques to use for what purpose. Researchers and practitioners who are going to select and apply a topic modeling technique (for example, to refactor legacy systems) may consider the experiences of other studies with similar objectives.

Pre-processing based on the type of data to be modeled. Pre-processing steps depend on the type of data analyzed (e.g., removing HTML tags in developer communication, mainly Q&A posts). Researchers and practitioners who, for example, intend to model topics from source code may consider the same pre-processing steps that other studies applied to source code.

Understand how to name topics. Researchers and practitioners may check how other studies named topics to gain insight into how to give meaning to their own topics.

We present some additional insights:

Appropriateness of topic modeling. Although we found that most papers applied LDA “as is”, it may not be the best approach for other studies or for practical application. LDA is popular because it is an unsupervised model, i.e., it does not require previous knowledge about the data (e.g., pre-defined classes for model training), it is statistically more rigorous than other techniques (e.g., LSI), and it discovers latent relationships (i.e., topics) between documents in a large textual corpus (Griffiths and Steyvers 2004). However, LDA is an unstable and non-deterministic model. This means that others may be unable to replicate the generated topics, even if the same model inputs (data pre-processing and configuration of parameters) are used. Furthermore, LDA performs poorly with short documents (Lin et al. 2014).

Meaningful topics. Topic models should discover semantically meaningful topics. Chang et al. (2009) argue for the importance of the interpretability of topics generated by probabilistic topic modeling techniques such as LDA. To create meaningful and replicable topics with LDA, Mantyla et al. (2018) highlight the importance of stabilizing the topic model (e.g., through tuning (Agrawal et al. 2018)) and advocate the use of stability metrics (e.g., rank-biased overlap - RBO (Mantyla et al. 2018)).

Research opportunities. Researchers interested in investigating topic modeling in software engineering may consider developing guidelines for researchers on how to use topic modeling, depending on the type of data, goals, etc. Further studies may also explore issues related to approaches for naming topics (e.g., based on domain experts), the evaluation of the semantic accuracy of the topics generated (e.g., how meaningful the topics are and whether the context of documents has to be considered), and metrics to measure the performance of topic models supporting different software engineering tasks.
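As an illustration of the stability metric mentioned under “Meaningful topics”, rank-biased overlap compares two ranked lists (for example, the top words of a topic from two LDA runs) by averaging their overlap at increasing depths, with geometrically decreasing weights. The sketch below computes a truncated form, summed only to the depth of the shorter list and without extrapolation. It assumes no repeated items within a list, and it is not necessarily the variant used by Mantyla et al. (2018); the function name and the example word lists are hypothetical.

```python
from typing import Hashable, Sequence


def rbo_truncated(s: Sequence[Hashable], t: Sequence[Hashable], p: float = 0.9) -> float:
    """Truncated rank-biased overlap: (1 - p) * sum_d p^(d-1) * |S[:d] & T[:d]| / d.

    Assumes each list contains no repeated items. Because the sum stops at the
    shorter list's depth k, identical lists score 1 - p**k rather than 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in the open interval (0, 1)")
    depth = min(len(s), len(t))
    seen_s, seen_t = set(), set()
    overlap = 0
    score = 0.0
    for d in range(1, depth + 1):
        x, y = s[d - 1], t[d - 1]
        if x == y:
            overlap += 1
        else:
            # Each new item adds to the overlap if the other list already had it.
            overlap += (x in seen_t) + (y in seen_s)
        seen_s.add(x)
        seen_t.add(y)
        score += p ** (d - 1) * overlap / d
    return (1 - p) * score


# Top words of one topic from two hypothetical LDA runs with different seeds.
run_a = ["bug", "crash", "error", "fix"]
run_b = ["bug", "error", "crash", "patch"]
similarity = rbo_truncated(run_a, run_b)
```

Lower values of `p` put more weight on the first ranks, so the choice of `p` determines how strongly disagreement among the top-ranked words is penalized.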

6.6 Threats to Validity

We analysed the validity threats to our study considering four types of threats to validity in systematic literature mapping studies (Petersen et al. 2015 ):

Theoretical validity. This threat to validity refers to concerns related to capturing the data as intended, i.e., bias and limitations in the data selection and extraction. As we focused on the practice of topic modeling in software engineering, we restricted the search to highly ranked software engineering venues, which generally publish more mature studies. We used “topic model”, “topic model[l]ing”, “lsi”, “lda”, “plsi”, “latent dirichlet allocation”, “latent semantic” as search keywords to find all papers related to topic modeling. To select papers for the survey, we established inclusion and exclusion criteria. One author selected the papers and the others checked whether the selection criteria were applied appropriately. Furthermore, to minimize this threat in relation to data extraction, we first defined the data items (details are in Table 2) to be extracted from papers and the relevance of the data for each research question. Then, one author extracted the data and the others reviewed the results. Controversial data results were discussed to reach agreement.

Descriptive validity. In the context of a literature survey, descriptive validity refers to bias and limitations in data synthesis and the accurate and objective description of the data. To mitigate this threat, we described in detail how the data was synthesized (see Section 4.3); furthermore, one of the authors synthesized the data and the others reviewed the results. Still, data and results depend on what is reported in papers, which was sometimes incomplete, inconsistent or inaccurate (see for example information about document length).

Interpretive validity. This threat to validity refers to bias and limitations in the results of the data analysis. We frequently reviewed the synthesized data during the data analysis, and the authors with more experience in this type of study checked for inconsistencies in results. Still, we recognize that interpretation bias may not have been removed completely.

Repeatability. This threat to validity concerns whether the study and its results can be replicated. To reduce this threat, we described in detail our search procedures (Section 4) and the processes of data selection, extraction and synthesis. We also followed general guidelines for systematic literature reviews as suggested by Kitchenham (2004) and the mapping study method as suggested by Petersen et al. (2015). Furthermore, raw data of our study are available online (footnote 1).

7 Conclusions

We analyzed 111 papers that applied topic modeling. These papers were published in the last twelve years (2009-2020) in ten highly ranked software engineering venues (five conferences and five journals). Below we summarize our findings:

LDA and LDA-based techniques are the most frequently used topic modeling techniques;

Topic modeling was mostly used to develop techniques for handling bugs (e.g., to predict defects). Exploratory studies that use topic modeling as a data analysis technique were also frequent;

Most papers modeled topics from source code (using methods as documents);

Most papers used LDA “as is” and without adapting values of hyperparameters ( α and β );

Most papers describe pre-processing. Some pre-processing steps depend on the type of textual data used (e.g., removal of URL and HTML tags), while others are commonly used in NLP techniques (e.g., stop words removal or stemming);

Only 36 (out of 111) papers named the topics. When naming topics, papers mostly adopted manual topic naming approaches such as deducing names (or labeling pre-defined names) based on the meaning of frequent words in that topic.

By analysing topic modeling techniques, data inputs, data pre-processing, and how topics were named, we identified characteristics and limitations in the use of topic models. Our study can provide insights and references to researchers and practitioners to make the best use of topic modeling, considering the experiences from previous studies.

Our study did not investigate all potential characteristics of topic modeling in software engineering or compare topic models to other text mining techniques. To answer our research questions, we analyzed the data items shown in Table 2. Future studies may investigate other characteristics of the use of topic modeling in software engineering, for example, topic modeling tools or libraries (e.g., Mallet) used; the context of a specific supported software engineering task; or compare topic modeling techniques to other text mining techniques, such as clustering and summarization (e.g., sentence or document embeddings). Furthermore, future work can reflect on other fields or uses of topic modeling to contrast how topic modeling is applied in software engineering. Further studies may also investigate how papers evaluate the performance of their topic modeling techniques, how papers evaluate the quality of the generated topics, and how exactly word clusters were used when topics were not named.

Footnote 1: https://doi.org/10.5281/zenodo.5280890

Footnote 2: This table also shows hyperparameters and the number of topics which are discussed in the following subsection.

Footnote 3: http://mallet.cs.umass.edu/topics.php

Footnote 4: https://nlp.stanford.edu/software/tmt/tmt-0.4/

Footnote 5: http://www.ai.mit.edu/projects/jmlr/papers/volume5/lewis04a/a11-smart-stop-list/english.stop

Footnote 6: https://gist.github.com/sebleier/554280

Footnote 7: https://github.com/mengjunxie/ae-lda/blob/master/misc/mallet-stopwords-en.txt

Footnote 8: http://icon.shef.ac.uk/Moby/mwords.html

Footnote 9: https://tartarus.org/martin/PorterStemmer/

Footnote 10: https://pypi.org/project/krovetz/

Abdellatif A, Costa D, Badran K, Abdalkareem R, Shihab E (2020) Challenges in Chatbot Development: A Study of Stack Overflow Posts. In: Proceedings of the 17th international conference on mining software repositories. https://doi.org/10.1145/3379597.3387472 , vol 12. IEEE/ACM, Seoul, pp 174–185

Abdellatif TM, Capretz LF, Ho D (2019) Automatic recall of software lessons learned for software project managers. Inf Softw Technol 115:44–57. https://doi.org/10.1016/j.infsof.2019.07.006

Aggarwal CC, Zhai C (2012) Mining text data. Springer, New York. https://doi.org/10.1007/978-1-4614-3223-4

Agrawal A, Fu W, Menzies T (2018) What is wrong with topic modeling? And how to fix it using search-based software engineering. Inf Softw Technol 98(January 2017):74–88. https://doi.org/10.1016/j.infsof.2018.02.005

Ahasanuzzaman M, Asaduzzaman M, Roy CK, Schneider KA (2019) CAPS: a supervised technique for classifying Stack Overflow posts concerning API issues. Empir Softw Eng 25:1493–1532. https://doi.org/10.1007/s10664-019-09743-4

Ahmed S, Bagherzadeh M (2018) What do concurrency developers ask about?: A large-scale study using Stack Overflow. In: Proceedings of the international symposium on empirical software engineering and measurement. https://doi.org/10.1145/3239235.3239524 . ACM, Oulu, pp 1–10

Ali N, Sharafi Z, Guéhéneuc Y G, Antoniol G (2015) An empirical study on the importance of source code entities for requirements traceability. Empir Softw Eng 20(2):442–478. https://doi.org/10.1007/s10664-014-9315-y

Alipour A, Hindle A, Stroulia E (2013) A contextual approach towards more accurate duplicate bug report detection. In: IEEE international working conference on mining software repositories. pp 183–192. https://doi.org/10.1109/MSR.2013.662402

Altarawy D, Shahin H, Mohammed A, Meng N (2018) LASCAD: Language-agnostic software categorization and similar application detection. J Syst Softw 142:21–34. https://doi.org/10.1016/j.jss.2018.04.018

ARC (2012) Excellence in Research for Australia (ERA). https://www.arc.gov.au/excellence-research-australia , http://www.arc.gov.au/pdf/era12/ERAFactsheet_Jan2012_1.pdf

Asuncion HU, Asuncion AU, Taylor RN (2010) Software traceability with topic modeling. In: Proceedings of the international conference on software engineering. IEEE/ACM, Cape Town, pp 95–104

Bagherzadeh M, Khatchadourian R (2019) Going big: a large-scale study on what big data developers ask. In: Proceedings of the 27th joint european software engineering conference and symposium on the foundations of software engineering. https://doi.org/10.1145/3338906.3338939 . ACM, Tallinn, pp 432–442

Bajaj K, Pattabiraman K, Mesbah A (2014) Mining questions asked by web developers. In: Proceedings of the 11th working conference on mining software repositories. https://doi.org/10.1145/2597073.2597083 . ACM, Hyderabad, pp 112–121

Bajracharya S, Lopes C (2009) Mining search topics from a code search engine usage log. In: Proceedings of the 6th international working conference on mining software repositories. https://doi.org/10.1109/MSR.2009.5069489 . IEEE, Vancouver, pp 111–120

Bajracharya SK, Lopes CV (2012) Analyzing and mining a code search engine usage log. Empir Softw Eng 17:424–466. https://doi.org/10.1007/s10664-010-9144-6

Barua A, Thomas SW, Hassan AE (2014) What are developers talking about? An analysis of topics and trends in Stack Overflow. Empir Softw Eng 19 (3):619–654. https://doi.org/10.1007/s10664-012-9231-y

Bavota G, Gethers M, Oliveto R, Poshyvanyk D, De Lucia A (2014a) Improving software modularization via automated analysis of latent. ACM Trans Softw Eng Methodol 23(1):1–33. https://doi.org/10.1145/2559935

Bavota G, Oliveto R, Gethers M, Poshyvanyk D, De Lucia A (2014b) Methodbook: Recommending move method refactorings via relational topic models. IEEE Trans Softw Eng 40(7):671–694. https://doi.org/10.1109/TSE.2013.60

Beitzel SM, Jensen EC, Frieder O (2009) MAP. In: Encyclopedia of database systems. https://doi.org/10.1007/978-0-387-39940-9_492 . Springer US, Boston, pp 1691–1692

Belle AB, Boussaidi GE, Kpodjedo S (2016) Combining lexical and structural information to reconstruct software layers. Inf Softw Technol 74:1–16. https://doi.org/10.1016/j.infsof.2016.01.008

Bi T, Liang P, Tang A, Yang C (2018) A systematic mapping study on text analysis techniques in software architecture. J Syst Softw 144:533–558. https://doi.org/10.1016/j.jss.2018.07.055

Biggers LR, Bocovich C, Capshaw R, Eddy BP, Etzkorn LH, Kraft NA (2014) Configuring latent Dirichlet allocation based feature location. Empir Softw Eng 19(3):465–500. https://doi.org/10.1007/s10664-012-9224-x

Binkley D, Lawrie D, Uehlinger C, Heinz D (2015) Enabling improved IR-based feature location. J Syst Softw 101:30–42. https://doi.org/10.1016/j.jss.2014.11.013

Blasco D, Cetina C, Pastor O (2020) A fine-grained requirement traceability evolutionary algorithm: Kromaia, a commercial video game case study. Inf Softw Technol 119:1–12. https://doi.org/10.1016/j.infsof.2019.106235

Blei DM, Jordan MI, Griffiths TL, Tenenbaum JB (2003a) Hierarchical topic models and the nested chinese restaurant process. In: Proceedings of the 16th international conference on neural information processing systems. Neural Information Processing Systems Foundation, Vancouver, pp 17–24

Blei DM, Ng AY, Jordan MI (2003b) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022. https://doi.org/10.1162/jmlr.2003.3.4-5.993

Brank J, Mladenić D, Grobelnik M, Liu H, Mladenić D, Flach PA, Garriga GC, Toivonen H, Toivonen H (2011) F 1-measure. In: Encyclopedia of machine learning. https://doi.org/10.1007/978-0-387-30164-8_298 . Springer US, pp 397–397

Canfora G, Cerulo L, Cimitile M, Di Penta M (2014) How changes affect software entropy: An empirical study. Empir Softw Eng 19:1–38. https://doi.org/10.1007/s10664-012-9214-z

Cao B, Frank Liu X, Liu J, Tang M (2017) Domain-aware Mashup service clustering based on LDA topic model from multiple data sources. Inf Softw Technol 90:40–54. https://doi.org/10.1016/j.infsof.2017.05.001

Capiluppi A, Ruscio DD, Rocco JD, Nguyen PT, Ajienka N (2020) Detecting Java software similarities by using different clustering techniques. Inf Softw Technol 122. https://doi.org/10.1016/j.infsof.2020.106279

Catolino G, Palomba F, Zaidman A, Ferrucci F (2019) Not all bugs are the same: Understanding, characterizing, and classifying bug types. J Syst Softw 152:165–181. https://doi.org/10.1016/j.jss.2019.03.002

Chang J, Blei DM (2009) Relational topic models for document networks. In: Proceedings of the 12th international conference on artificial intelligence and statistics. Society for Artificial Intelligence and Statistics, Clearwater Beach, pp 81–88

Chang J, Blei DM (2010) Hierarchical relational models for document networks. Ann Appl Stat 4(1):124–150. https://doi.org/10.1214/09-AOAS309

Chang J, Boyd-Graber J, Gerrish S, Wang C, Blei DM (2009) Reading tea leaves: How humans interpret topic models. In: Proceedings of the 2009 conference advances in neural information. Neural Information Processing Systems Foundation, Vancouver, pp 288–296

Chatterjee P, Damevski K, Pollock L (2019) Exploratory study of slack q&a chats as a mining source for software engineering tools. In: Proceedings of the 16th international conference on mining software repositories. IEEE, Montreal, pp 1–12

Chen H, Coogle J, Damevski K (2019) Modeling stack overflow tags and topics as a hierarchy of concepts. J Syst Softw 156:283–299. https://doi.org/10.1016/j.jss.2019.07.033

Chen L, Hassan F, Wang X, Zhang L (2020) Taming behavioral backward incompatibilities via cross-project testing and analysis. In: Proceedings of the 42nd international conference on software engineering. https://doi.org/10.1145/3377811.3380436 . IEEE/ACM, Seoul, pp 112–124

Chen N, Lin J, Hoi SC, Xiao X, Zhang B (2014) AR-miner: Mining informative reviews for developers from mobile app marketplace. In: Proceedings of the international conference on software engineering. https://doi.org/10.1145/2568225.2568263 , vol 1. IEEE/ACM, Hyderabad, pp 767–778

Chen TH, Thomas SW, Nagappan M, Hassan AE (2012) Explaining software defects using topic models. In: Proceedings of the international working conference on mining software repositories. https://doi.org/10.1109/MSR.2012.6224280 . IEEE, Zurich, pp 189–198

Chen TH, Thomas SW, Hassan AE (2016) A survey on the use of topic models when mining software repositories. Empir Softw Eng 21(5):1843–1919. https://doi.org/10.1007/s10664-015-9402-8

Chen TH, Shang W, Nagappan M, Hassan AE, Thomas SW (2017) Topic-based software defect explanation. J Syst Softw 129:79–106. https://doi.org/10.1016/j.jss.2016.05.015

Choetkiertikul M, Dam HK, Tran T, Ghose A (2017) Predicting the delay of issues with due dates in software projects. Empir Softw Eng 22:1223–1263. https://doi.org/10.1007/s10664-016-9496-7

Craswell N (2009) Mean reciprocal rank. In: Encyclopedia of database systems. https://doi.org/10.1007/978-0-387-39940-9_488 . Springer US, pp 1703–1703

Croft WB, Metzler D (2010) Search engines: Information retrieval in practice. Addison-Wesley, Reading

Cui D, Liu T, Cai Y, Zheng Q, Feng Q, Jin W, Guo J, Qu Y (2019) Investigating the impact of multiple dependency structures on software defects, IEEE/ACM, Montreal. https://doi.org/10.1109/ICSE.2019.00069

Damevski K, Chen H, Shepherd DC, Kraft NA, Pollock L (2018) Predicting future developer behavior in the IDE using topic models. IEEE Trans Softw Eng 44(11):1100–1111. https://doi.org/10.1109/TSE.2017.2748134

De Lucia A, Di Penta M, Oliveto R, Panichella A, Panichella S (2014) Labeling source code with information retrieval methods: An empirical study. Empir Softw Eng 19(5):1383–1420. https://doi.org/10.1007/s10664-013-9285-5

Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407. https://doi.org/10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9

Demissie BF, Ceccato M, Shar LK (2020) Security analysis of permission re-delegation vulnerabilities in Android apps. Empir Softw Eng 25:5084–5136. https://doi.org/10.1007/s10664-020-09879-8

Dietz L, Bickel S, Scheffer T (2007) Unsupervised prediction of citation influences. In: Proceedings of the 24th international conference on machine learning. https://doi.org/10.1145/1273496.1273526 . ACM, Corvallis, pp 233–240

Dit B, Revelle M, Poshyvanyk D (2013) Integrating information retrieval, execution and link analysis algorithms to improve feature location in software. Empir Softw Eng 18(2):277–309. https://doi.org/10.1007/s10664-011-9194-4

El Zarif O, Da Costa DA, Hassan S, Zou Y (2020) On the relationship between user churn and software issues. In: Proceedings of the 17th international conference on mining software repositories. https://doi.org/10.1145/3379597.3387456 . ACM, New York, pp 339–349

Fowkes J, Chanthirasegaran P, Ranca R, Allamanis M, Lapata M, Sutton C (2016) Autofolding for source code summarization. Proc Int Conf Softw Eng 43(12):649–652. https://doi.org/10.1145/2889160.2889171

Fu Y, Yan M, Zhang X, Xu L, Yang D, Kymer JD (2015) Automated classification of software change messages by semi-supervised Latent Dirichlet Allocation. Inf Softw Technol 57:369–377. https://doi.org/10.1016/j.infsof.2014.05.017

Galvis Carreno LV, Winbladh K (2012) Analysis of user comments: an approach for software requirements evolution. In: Proceedings of the international conference on software engineering. IEEE/ACM, San Francisco, pp 582–591

Gao C, Zeng J, Lyu MR, King I (2018) Online app review analysis for identifying emerging issues. In: Proceedings of the 40th international conference on software engineering. https://doi.org/10.1145/3180155.3180218 . IEEE/ACM, Gothenburg, pp 48–58

Gopalakrishnan R, Sharma P, Mirakhorli M, Galster M (2017) Can latent topics in source code predict missing architectural tactics?. In: Proceedings of the 39th international conference on software engineering, IEEE/ACM, pp 15–26. https://doi.org/10.1109/ICSE.2017.10 . http://ghtorrent.org/

Gorla A, Tavecchia I, Gross F, Zeller A (2014) Checking app behavior against app descriptions. In: Proceedings of the international conference on software engineering. https://doi.org/10.1145/2568225.2568276 . IEEE/ACM, Hyderabad, pp 1025–1035

Griffiths TL, Steyvers M (2004) Finding scientific topics. In: Proceedings of the national academy of sciences. https://doi.org/10.1073/pnas.0307752101 , vol 101. Neural Information Processing Systems Foundation, Irvine, pp 5228–5235

Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: Proceedings of the conference on human language technologies: the 2009 annual conference of the north american chapter of the association for computational linguistics. https://doi.org/10.3115/1620754.1620807 , http://www-nlpir.nist.gov/projects/duc/data.html . Association for Computational Linguistics, Boulder, pp 362–370

Han J, Shihab E, Wan Z, Deng S, Xia X (2020) What do programmers discuss about deep learning frameworks. Empir Softw Eng 25:2694–2747. https://doi.org/10.1007/s10664-020-09819-6

Haque MU, Ali Babar M (2020) Challenges in docker development: a large-scale study using stack overflow. In: Proceedings of the 14th international symposium on empirical software engineering and measurement. https://doi.org/10.1145/3382494.3410693 . IEEE/ACM, Bari, pp 1–11

Hariri N, Castro-Herrera C, Mirakhorli M, Cleland-Huang J, Mobasher B (2013) Supporting domain analysis through mining and recommending features from online product listings. IEEE Trans Softw Eng 39(12):1736–1752. https://doi.org/10.1109/TSE.2013.39

Henß S, Monperrus M, Mezini M (2012) Semi-automatically extracting FAQs to improve accessibility of software development knowledge. In: Proceedings of the international conference on software engineering. https://doi.org/10.1109/ICSE.2012.6227139 . IEEE/ACM, Zurich, pp 793–803

Hindle A, Godfrey MW, Ernst NA, Mylopoulos J (2011) Automated topic naming to support cross-project analysis of software maintenance activities. In: Proceedings of the 33rd international conference on software engineering. ACM, Waikiki, pp 163–172

Hindle A, Ernst NA, Godfrey MW, Mylopoulos J (2013) Automated topic naming: Supporting cross-project analysis of software maintenance activities. Empir Softw Eng 18(6):1125–1155. https://doi.org/10.1007/s10664-012-9209-9

Hindle A, Bird C, Zimmermann T, Nagappan N (2015) Do topics make sense to managers and developers? Empir Softw Eng 20:479–515. https://doi.org/10.1007/s10664-014-9312-1

Hindle A, Alipour A, Stroulia E (2016) A contextual approach towards more accurate duplicate bug report detection and ranking. Empir Softw Eng 21 (2):368–410. https://doi.org/10.1007/s10664-015-9387-3

Hoffman M, Blei D, Bach F (2010) Online learning for latent dirichlet allocation. In: Proceedings of the neural information processing systems conference. https://doi.org/10.1.1.187.1883. Neural Information Processing Systems Foundation, Vancouver, pp 1–9

Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international conference on research and development in information retrieval. ACM, Berkeley, pp 50–57

Hu H, Bezemer CP, Hassan AE (2018) Studying the consistency of star ratings and the complaints in 1 & 2-star user reviews for top free cross-platform Android and iOS apps. Empir Softw Eng 23(6):3442–3475. https://doi.org/10.1007/s10664-018-9604-y

Hu H, Wang S, Bezemer CP, Hassan AE (2019) Studying the consistency of star ratings and reviews of popular free hybrid Android and iOS apps. Empir Softw Eng 24:7–32. https://doi.org/10.1007/s10664-018-9617-6

Hu W, Wong K (2013) Using citation influence to predict software defects. In: Proceedings of the international working conference on mining software repositories. https://doi.org/10.1109/MSR.2013.6624058 . IEEE, San Francisco, pp 419–428

Jiang H, Zhang J, Ren Z, Zhang T (2017) An unsupervised approach for discovering relevant tutorial fragments for APIs. In: Proceedings of the 39th international conference on software engineering. https://doi.org/10.1109/ICSE.2017.12 . IEEE/ACM, Buenos Aires, pp 38–48

Jiang HE, Zhang J, Li X, Ren Z, Lo D, Wu X, Luo Z (2019) Recommending new features from mobile app descriptions. ACM Trans Softw Eng Methodol 28(4):1–29. https://doi.org/10.1145/3344158

Jipeng Q, Zhenyu Q, Yun L, Yunhao Y, Xindong W (2020) Short text topic modeling techniques, applications, and performance: a survey. https://doi.org/10.1109/TKDE.2020.2992485

Jo Y, Oh A (2011) Aspect and sentiment unification model for online review analysis. In: Proceedings of the fourth ACM international conference on Web search and data mining. https://doi.org/10.1145/1935826 . ACM, New York, pp 815–824

Jones JA, Harrold MJ (2005) Empirical evaluation of the tarantula automatic fault-localization technique. In: Proceedings of the 20th international conference on automated software engineering. https://doi.org/10.1145/1101908.1101949 , http://portal.acm.org/citation.cfm?doid=1101908.1101949 . IEEE/ACM, New York, pp 273–282

Kakas AC, Cohn D, Dasgupta S, Barto AG, Carpenter GA, Grossberg S, Webb GI, Dorigo M, Birattari M, Toivonen H, Timmis J, Branke J, Toivonen H, Strehl AL, Drummond C, Coates A, Abbeel P, Ng AY, Zheng F, Webb GI, Tadepalli P (2011) Area under curve. In: Encyclopedia of machine learning. https://doi.org/10.1007/978-0-387-30164-8_28 . Springer US, pp 40–40

Kitchenham BA (2004) Procedures for performing systematic reviews. Keele, UK, Keele University 33(TR/SE-0401):28. https://doi.org/10.1.1.122.3308

Layman L, Nikora AP, Meek J, Menzies T (2016) Topic modeling of NASA space system problem reports research in practice. In: Proceedings of the 13th working conference on mining software repositories. https://doi.org/10.1145/2901739.2901760 . ACM, Austin, pp 303–314

Le TDB, Thung F, Lo D (2017) Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools. Empir Softw Eng 22:2237–2279. https://doi.org/10.1007/s10664-016-9484-y

Leach RJ (2016) Introduction to software engineering, 2nd edn. CRC Press LLC, Boca Raton. https://ebookcentral.proquest.com/lib/canterbury/detail.action?docID=4711469&query=Software+Engineering

Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791

Li H, Chen THP, Shang W, Hassan AE (2018) Studying software logging using topic models. Empir Softw Eng 23:2655–2694. https://doi.org/10.1007/s10664-018-9595-8

Lian X, Liu W, Zhang L (2020) Assisting engineers extracting requirements on components from domain documents. Inf Softw Technol 118(September 2019):106196. https://doi.org/10.1016/j.infsof.2019.106196

Lin T, Tian W, Mei Q, Cheng H (2014) The dual-sparse topic model: Mining focused topics and focused terms in short text. In: Proceedings of the 23rd international conference on world wide web. https://doi.org/10.1145/2566486.2567980 . ACM, Seoul, pp 539–549

Liu Y, Liu L, Liu H, Wang X, Yang H (2017) Mining domain knowledge from app descriptions. J Syst Softw 133:126–144. https://doi.org/10.1016/j.jss.2017.08.024

Liu Y, Lin J, Cleland-Huang J (2020) Traceability support for multi-lingual software projects. In: Proceedings of the 17th international conference on mining software repositories. https://doi.org/10.1145/3379597.3387440 . ACM, Seoul, pp 443–454

Lukins SK, Kraft NA, Etzkorn LH (2010) Bug localization using latent Dirichlet allocation. Inf Softw Technol 52:972–990. https://doi.org/10.1016/j.infsof.2010.04.002

Luo Q, Moran K, Poshyvanyk D (2016) A large-scale empirical comparison of static and dynamic test case prioritization techniques. In: Proceedings of the 24th international symposium on foundations of software engineering. https://doi.org/10.1145/2950290.2950344 . ACM, Seattle, pp 559–570

Mahmoud A, Bradshaw G (2017) Semantic topic models for source code analysis. Empir Softw Eng 22(4):1965–2000. https://doi.org/10.1007/s10664-016-9473-1

Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18(1):50–60. https://doi.org/10.1214/aoms/1177730491 , http://projecteuclid.org/euclid.aoms/1177730491

Manning CD, Raghavan P, Schütze H (2008) Evaluation of Clustering. In: Introduction to information retrieval. chap 16, https://doi.org/10.33899/csmj.2008.163987 . https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html , http://nlp.stanford.edu/IR?book/html/htmledition/evaluation?of?clustering?1.htmlwhereisthesetofclustersan . Cambridge University Press

Mantyla MV, Claes M, Farooq U (2018) Measuring LDA topic stability from clusters of replicated runs, ACM, Oulu. https://doi.org/10.1145/3239235.3267435

Martin W, Harman M, Jia Y, Sarro F, Zhang Y (2015) The app sampling problem for app store mining. In: Proceedings of the 12th international working conference on mining software repositories. https://doi.org/10.1109/MSR.2015.19 . IEEE, Florence, pp 123–133

Martin W, Sarro F, Harman M (2016) Causal impact analysis for app releases in google play. In: Proceedings of the 24th international symposium on foundations of software engineering. https://doi.org/10.1145/2950290.2950320 . ACM, Seattle, pp 435–446

McIlroy S, Ali N, Khalid H, E Hassan A (2016) Analyzing and automatically labelling the types of user issues that are raised in mobile app reviews. Empir Softw Eng 21:1067–1106. https://doi.org/10.1007/s10664-015-9375-7

Mehrotra R, Sanner S, Buntine W, Xie L (2013) Improving LDA Topic Models for Microblogs via Tweet Pooling and Automatic Labeling. In: Proceedings of the 36th International Conference on Research and Development in Information Retrieval. ACM, Dublin, pp 889–892

Mezouar ME, Zhang F, Zou Y (2018) Are tweets useful in the bug fixing process? An empirical study on Firefox and Chrome. Empir Softw Eng 23 (3):1704–1742. https://doi.org/10.1007/s10664-017-9559-4

Miner G, Elder J, Fast A, Hill T, Nisbet R, Delen D (2012) Practical text mining and statistical analysis for non-structured text data applications. Elsevier Science & Technology, Waltham . https://doi.org/10.1016/C2010-0-66188-8

Moslehi P, Adams B, Rilling J (2016) On mining crowd-based speech documentation. In: Proceedings of the 13th working conference on mining software repositories. https://doi.org/10.1145/2901739.2901771 . ACM, Austin, pp 259–268

Moslehi P, Adams B, Rilling J (2018) Feature location using crowd-based screencasts. In: Proceedings of the 15th international conference on mining software repositories. https://doi.org/10.1145/3196398.3196439 . ACM, New York, pp 192–202

Moslehi P, Adams B, Rilling J (2020) A feature location approach for mapping application features extracted from crowd-based screencasts to source code. Empir Softw Eng 25:4873–4926. https://doi.org/10.1007/s10664-020-09874-z

Murali V, Chaudhuri S, Jermaine C (2017) Bayesian specification learning for finding API usage errors. In: Proceedings of the Joint european software engineering conference and symposium on the foundations of software engineering. https://doi.org/10.1145/3106237.3106284 . ACM, Paderborn, pp 151–162

Nabli H, Ben Djemaa R, Ben Amor IA (2018) Efficient cloud service discovery approach based on LDA topic modeling. J Syst Softw 146:233–248. https://doi.org/10.1016/j.jss.2018.09.069

Naguib H, Narayan N, Brügge B, Helal D (2013) Bug report assignee recommendation using activity profiles. In: Proceedings of the international working conference on mining software repositories. https://doi.org/10.1109/MSR.2013.6623999 . IEEE, San Francisco, pp 22–30

Nayebi M, Cho H, Ruhe G (2018) App store mining is not enough for app improvement. Empir Softw Eng 23:2764–2794. https://doi.org/10.1007/s10664-018-9601-1

Nguyen AT, Nguyen TT, Al-Kofahi J, Nguyen HV, Nguyen TN (2011) A topic-based approach for narrowing the search space of buggy files from a bug report. In: Proceedings of the 26th international conference on automated software engineering. https://doi.org/10.1109/ASE.2011.6100062 . IEEE/ACM, Lawrence, pp 263–272

Nguyen AT, Nguyen TT, Nguyen TN, Lo D, Sun C (2012) Duplicate bug report detection with a combination of information retrieval and topic modeling. In: Proceedings of the 27th international conference on automated software engineering. https://doi.org/10.1145/2351676.2351687 . IEEE/ACM, Essen, pp 70–79

Nguyen VA, Boyd-Graber J, Resnik P, Chang J, Graber JB (2014) Learning a concept hierarchy from multi-labeled documents. In: Proceedings of the neural information processing systems conference. Neural Information Processing Systems Foundation, Montreal, pp 1–9

Noei E, Heydarnoori A (2016) EXAF: A search engine for sample applications of object-oriented framework-provided concepts. Inf Softw Technol 75:135–147. https://doi.org/10.1016/j.infsof.2016.03.007

Noei E, Da Costa DA, Zou Y (2018) Winning the app production rally. In: Proceedings of the 26th ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering. https://doi.org/10.1145/3236024.3236044 . ACM, Lake Buena Vista, pp 283–294

Noei E, Zhang F, Wang S, Zou Y (2019) Towards prioritizing user-related issue reports of mobile applications. Empir Softw Eng 24:1964–1996. https://doi.org/10.1007/s10664-019-09684-y

Pagano D, Maalej W (2013) How do open source communities blog? Empir Softw Eng 18(6):1090–1124. https://doi.org/10.1007/s10664-012-9211-2

Palomba F, Salza P, Ciurumelea A, Panichella S, Gall H, Ferrucci F, De Lucia A (2017) Recommending and localizing change requests for mobile apps based on user reviews. In: Proceedings of the 39th international conference on software engineering. https://doi.org/10.1109/ICSE.2017.18 . IEEE/ACM, Buenos Aires, pp 106–117

Panichella A, Dit B, Oliveto R, Di Penta M, Poshynanyk D, De Lucia A (2013) How to effectively use topic models for software engineering tasks? An approach based on Genetic Algorithms. In: Proceedings of the international conference on software engineering. https://doi.org/10.1109/ICSE.2013.6606598 . IEEE/ACM, San Francisco, pp 522–531

Pérez F, Lapeṅa R, Font J, Cetina C (2018) Fragment retrieval on models for model maintenance: Applying a multi-objective perspective to an industrial case study. Inf Softw Technol 103:188–201. https://doi.org/10.1016/j.infsof.2018.06.017

Petersen K, Vakkalanka S, Kuzniarz L (2015) Guidelines for conducting systematic mapping studies in software engineering: An update. Inf Softw Technol 64(1):1–18. https://doi.org/10.1016/j.infsof.2015.03.007

Pettinato M, Gil JP, Galeas P, Russo B (2019) Log mining to re-construct system behavior: An exploratory study on a large telescope system. Inf Softw Technol 114:121–136. https://doi.org/10.1016/j.infsof.2019.06.011

Poshyvanyk D, Gueheneuc YG, Marcus A, Antoniol G, Rajlich V (2007) Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval. https://doi.org/10.1109/TSE.2007.1016 . https://www.researchgate.net/publication/3189749 , vol 33, pp 420–431

Poshyvanyk D, Marcus A, Ferenc R, Gyimóthy T (2009) Using information retrieval based coupling measures for impact analysis. Empir Softw Eng 14(1):5–32. https://doi.org/10.1007/s10664-008-9088-2 , http://www.mozilla.org/

Poshyvanyk D, Gethers M, Marcus A (2012) Concept location using formal concept analysis and information retrieval. ACM Trans Softw Eng Methodol 21(4):1–34. https://doi.org/10.1145/2377656.2377660

Poursabzi-Sangdeh F, Goldstein DG, Hofman JM, Vaughan JW, Wallach H (2021) Manipulating and measuring model interpretability. In: Proceedings of the conference on human factors in computing systems. https://doi.org/10.1145/3411764.3445315 . ACM, Yokohama

Ramage D, Hall D, Nallapati R, Manning CD (2009) Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. In: Proceedings of the conference on empirical methods in natural language processing. https://doi.org/10.5555/1699510.1699543 . ACL/AFNLP, Singapore, pp 248–256

Rao S, Kak A (2011) Retrieval from software libraries for bug localization: A comparative study of generic and composite text models. In: Proceedings of the international conference on software engineering. https://doi.org/10.1145/1985441.1985451 . IEEE/ACM, Waikiki, pp 43–52

Ray B, Posnett D, Filkov V, Devanbu P (2014) A large scale study of programming languages and code quality in GitHub. In: Proceedings of the symposium on the foundations of software engineering, pp 155–165. https://doi.org/10.1145/2635868.2635922

Revelle M, Gethers M, Poshyvanyk D (2011) Using structural and textual information to capture feature coupling in object-oriented software. Empir Softw Eng 16(6):773–811. https://doi.org/10.1007/s10664-011-9159-7

Röder M, Both A, Hinneburg A (2015) Exploring the space of topic coherence measures. In: Proceedings of the eighth ACM international conference on web search and data mining - WSDM ’15. https://doi.org/10.1145/2684822.2685324 . ACM, Shanghai, pp 399–408

Rosen C, Shihab E (2016) What are mobile developers asking about? A large scale study using Stack Overflow. Empir Softw Eng 21:1192–1223. https://doi.org/10.1007/s10664-015-9379-3

Rosenberg CM, Moonen L (2018) Improving problem identification via automated log clustering using dimensionality reduction. In: Proceedings of the international symposium on empirical software engineering and measurement. https://doi.org/10.1145/3239235.3239248 . ACM, Oulu, pp 1–10

Rothermel G, Untcn RH, Chu C, Harrold MJ (2001) Prioritizing test cases for regression testing. IEEE Trans Softw Eng 27(10):929–948. https://doi.org/10.1109/32.962562

Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620. https://doi.org/10.1145/361219.361220

Savage T, Dit B, Gethers M, Poshyvanyk D (2010) TopicXP: exploring topics in source code using latent Dirichlet allocation. IEEE, Timisoara. https://doi.org/10.1109/ICSM.2010.5609654

Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27(3):379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

Shimagaki J, Kamei Y, Ubayashi N, Hindle A (2018) Automatic topic classification of test cases using text mining at an android smartphone vendor. In: Proceedings of the 12th international symposium on empirical software engineering and measurement. https://doi.org/10.1145/3239235.3268927 . IEEE/ACM, Oulu, pp 1–10

Silva B, Sant’anna C, Rocha N, Chavez C (2016) The effect of automatic concern mapping strategies on conceptual cohesion measurement. Inf Softw Technol 75:56–70. https://doi.org/10.1016/j.infsof.2016.03.006

Silva LL, Valente MT, Maia MA (2019) Co-change patterns: A large scale empirical study. J Syst Softw 152:196–214. https://doi.org/10.1016/j.jss.2019.03.014

Soliman M, Galster M, Salama AR, Riebisch M (2016) Architectural knowledge for technology decisions in developer communities: An exploratory study with Stack Overflow. In: Proceedings of the 13th working conference on software architecture. https://doi.org/10.1109/WICSA.2016.13 . IEEE, Venice, pp 128–133

Somasundaram K, Murphy GC (2012) Automatic categorization of bug reports using latent Dirichlet allocation. In: Proceedings of the 5th India software engineering conference. https://doi.org/10.1145/2134254.2134276 , vol 12. ACM, pp 125–130

Souza LB, Campos EC, Madeiral F, Paixão K, Rocha AM, Maia M d A (2019) Bootstrapping cookbooks for APIs from crowd knowledge on Stack Overflow. Inf Softw Technol 111(March 2018):37–49. https://doi.org/10.1016/j.infsof.2019.03.009

Steyvers M, Griffiths T (2010) Probalistic Topic Models. In: Landauer T, McNamara D, Dennis S, Kintsch W (eds) Latent semantic analysis: a road to meaning. https://doi.org/10.1016/s0364-0213(01)00040-4 . University of California, Irvine, pp 993–1022

Sun X, Li B, Leung H, Li B, Li Y (2015) MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks. Inf Softw Technol 66:1–12. https://doi.org/10.1016/j.infsof.2015.05.003

Sun X, Liu X, Li B, Duan Y, Yang H, Hu J (2016) Exploring topic models in software engineering data analysis: A survey, IEEE, Shangai. https://doi.org/10.1109/SNPD.2016.7515925

Sun X, Yang H, Xia X, Li B (2017) Enhancing developer recommendation with supplementary information via mining historical commits. J Syst Softw 134:355–368. https://doi.org/10.1016/j.jss.2017.09.021

Taba SES, Keivanloo I, Zou Y, Wang S (2017) An exploratory study on the usage of common interface elements in android applications. J Syst Softw 131:491–504. https://doi.org/10.1016/j.jss.2016.07.010

Tairas R, Gray J (2009) An information retrieval process to aid in the analysis of code clones. https://doi.org/10.1007/s10664-008-9089-1 , http://www.cis.uab.edu/tairasr/clones/literature , vol 14, pp 33–56

Tamrawi A, Nguyen TT, Al-Kofahi JM, Nguyen TN (2011) Fuzzy set and cache-based approach for bug triaging. In: Proceedings of the 19th ACM symposium on foundations of software engineering. https://doi.org/10.1145/2025113.202516 . ACM, pp 365–375

Tang J, Zhang M, Mei Q (2013) One theme in all views: modeling consensus topics in multiple contexts. In: Proceedings of the 19th international conference on knowledge discovery and data mining. ACM, New York, pp 5–13

Tantithamthavorn C, Lemma Abebe S, Hassan AE, Ihara A, Matsumoto K (2018) The impact of IR-based classifier configuration on the performance and the effort of method-level bug localization. Inf Softw Technol 102(June):160–174. https://doi.org/10.1016/j.infsof.2018.06.001

Teh YW, Jordan MI, Beal MJ, Blei DM (2006) Hierarchical Dirichlet processes. J Am Stat Assoc 101(476):1566–1581. https://doi.org/10.1198/016214506000000302

Thomas SW, Nagappan M, Blostein D, Hassan AE (2013) The impact of classifier configuration and classifier combination on bug localization. IEEE Trans Softw Eng 39(10):1427–1443. https://doi.org/10.1109/TSE.2013.27

Thomas SW, Hemmati H, Hassan AE, Blostein D (2014) Static test case prioritization using topic models. Empir Softw Eng 19:182–212. https://doi.org/10.1007/s10664-012-9219-7

Tiarks R, Maalej W (2014) How does a typical tutorial for mobile development look like?. In: Proceedings of the 11th international conference on mining software repositories. https://doi.org/10.1145/2597073.2597106 . IEEE/ACM, Hyderabad, pp 272–281

Treude C, Wagner M (2019) Predicting good configurations for GitHub and stack overflow topic models. In: Proceedings of the 16th international conference on mining software repositories. https://doi.org/10.1109/MSR.2019.00022 . IEEE, Montreal, pp 84–95

Vargha A, Delaney HD (2000) A critique and improvement of the CL common language effect size statistics of McGraw and Wong. J Educ Behav Stat 25(2):101–132. https://doi.org/10.3102/10769986025002101

Wallach HM, Mimno D, McCallum A (2009) Rethinking LDA: Why priors matter. In: Proceedings of the conference on advances in neural information processing systems. Curran Associates Inc., Vancouver, pp 1973–1981. http://rexa.info/

Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles. In: Proceedings of the international conference on knowledge discovery and data mining. https://doi.org/10.1145/2020408.2020480 . ACM, New York, pp 448–456

Wang W, Malik H, Godfrey MW (2015) Recommending posts concerning API issues in developer Q&A sites. In: Proceedings of the international working conference on mining software repositories. https://doi.org/10.1109/MSR.2015.28 . http://stackoverflow.com/questions/5358219/ . IEEE/ACM, pp 224–234

Wei X, Croft WB (2006) LDA-based document models for ad-hoc retrieval. In: Proceedings of the 29th annual international conference on research and development in information retrieval. https://doi.org/10.1145/1148170.1148204 . ACM, Seattle, pp 178–185

Weng J, Lim EP, Jiang J, He Q (2010) TwitterRank: Finding topic-sensitive influential twitterers. In: Proceedings of the 3rd international conference on web search and data mining. https://doi.org/10.1145/1718487.1718520 . ACM, New York, pp 261–270

Wold S, Esbensen K, Geladi P (1987) Principal component analysis. Chemom Intell Lab Syst 2:37–52. https://doi.org/10.1016/0169-7439(87)80084-9

Xia X, Bao L, Lo D, Kochhar PS, Hassan AE, Xing Z (2017a) What do developers search for on the web? Empir Softw Eng 22(6):3149–3185. https://doi.org/10.1007/s10664-017-9514-4

Xia X, Lo D, Ding Y, Al-Kofahi JM, Nguyen TN, Wang X (2017b) Improving automated bug triaging with specialized topic model. IEEE Trans Softw Eng 43(3):272–297. https://doi.org/10.1109/TSE.2016.2576454

Yan M, Fu Y, Zhang X, Yang D, Xu L, Kymer JD (2016a) Automatically classifying software changes via discriminative topic model: Supporting multi-category and cross-project. J Syst Softw 113:296–308. https://doi.org/10.1016/j.jss.2015.12.019

Yan M, Zhang X, Yang D, Xu L, Kymer JD (2016b) A component recommender for bug reports using Discriminative Probability Latent Semantic Analysis. Inf Softw Technol 73:37–51. https://doi.org/10.1016/j.infsof.2016.01.005

Yang X, Lo D, Li L, Xia X, Bissyandé T F, Klein J (2017) Characterizing malicious Android apps by mining topic-specific data flow signatures. Inf Softw Technol 90:27–39. https://doi.org/10.1016/j.infsof.2017.04.007

Ye D, Xing Z, Kapre N (2017) The structure and dynamics of knowledge network in domain-specific Q&A sites: a case study of stack overflow. Empir Softw Eng 22(1):375–406. https://doi.org/10.1007/s10664-016-9430-z

Zaman S, Adams B, Hassan AE (2011) Security versus performance bugs: A case study on firefox. In: Proceedings - international conference on software engineering. https://doi.org/10.1145/1985441.198545 , pp 93–102

Zeugmann T, Poupart P, Kennedy J, Jin X, Han J, Saitta L, Sebag M, Peters J, Bagnell JA, Daelemans W, Webb GI, Ting KM, Ting KM, Webb GI, Shirabad JS, Fürnkranz J, Hüllermeier E, Matwin S, Sakakibara Y, Flener P, Schmid U, Procopiuc CM, Lachiche N, Fürnkranz J (2011) Precision and recall. In: Encyclopedia of machine learning. https://doi.org/10.1007/978-0-387-30164-8_652 . Springer US, pp 781–781

Zhang E, Zhang Y (2009) Average precision. In: Encyclopedia of database systems. https://doi.org/10.1007/978-0-387-39940-9_482 . Springer US, pp 192–193

Zhang T, Chen J, Yang G, Lee B, Luo X (2016) Towards more accurate severity prediction and fixer recommendation of software bugs. J Syst Softw 117:166–184. https://doi.org/10.1016/j.jss.2016.02.034

Zhang Y, Lo D, Xia X, Scanniello G, Le TDB, Sun J (2018) Fusing multi-abstraction vector space models for concern localization. Empir Softw Eng 23:2279–2322. https://doi.org/10.1007/s10664-017-9585-2

Zhao N, Chen J, Wang Z, Peng X, Wang G, Wu Y, Zhou F, Feng Z, Nie X, Zhang W, Sui K, Pei D (2020) Real-time incident prediction for online service systems. In: Proceedings of the 28th ACM joint meeting european software engineering conference and symposium on the foundations of software engineering. https://doi.org/10.1145/3368089.3409672 , vol 20. ACM, pp 315–326

Zhao WX, Jiang J, Weng J, He J, Lim EP, Yan H, Li X (2011) Comparing twitter and traditional media using topic models. In: Lecture Notes in Computer Science. https://doi.org/10.1007/978-3-642-20161-5-34 , vol 6611. Springer, Berlin, chap Advances i, pp 338–349

Zhao Y, Zhanq F, Shlhab E, Zou Y, Hassan AE (2016) How are discussions associated with bug reworking? an empirical study on open source projects. In: Proceedings of the 10th international symposium on empirical software engineering and measurement. https://doi.org/10.1145/2961111.296259 . IEEE/ACM, Ciudad Real, pp 1–10

Zou J, Xu L, Yang M, Zhang X, Yang D (2017) Towards comprehending the non-functional requirements through Developers’ eyes: An exploration of Stack Overflow using topic analysis. Inf Softw Technol 84(1):19–32. https://doi.org/10.1016/j.infsof.2016.12.003

Download references

Acknowledgements

We would like to thank the editor and the anonymous reviewers for their insightful and detailed feedback that helped us to significantly improve the manuscript.

Author information

Authors and affiliations.

University of Canterbury, Christchurch, New Zealand

Camila Costa Silva, Matthias Galster & Fabian Gilson

Corresponding author

Correspondence to Camila Costa Silva.

Ethics declarations

Conflict of interests.

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Andrea De Lucia

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A.1 Papers reviewed

A.2 Metrics used in comparative studies

The column “Context-specific” indicates if the metric was proposed or adapted to a specific context (“Yes”) or is a standard NLP metric (“No”).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Silva, C.C., Galster, M. & Gilson, F. Topic modeling in software engineering research. Empir Software Eng 26, 120 (2021). https://doi.org/10.1007/s10664-021-10026-0

Accepted: 29 July 2021

Published: 06 September 2021

DOI: https://doi.org/10.1007/s10664-021-10026-0

  • Topic modeling
  • Text mining
  • Natural language processing
  • Literature analysis
  • Frontiers in Computer Science
  • Research Topics

Advances in Software Quality Engineering for Complex Systems

About this Research Topic

Software quality is integral to the development of robust and efficient systems. This is particularly crucial in machine learning and artificial intelligence, where the reliability and accuracy of algorithms dictate their usability and effectiveness. Machine learning relies heavily on data-driven approaches to automate predictive and decision-making processes. Visual analytics in AI leverages computational thinking, a problem-solving process that uses computer science techniques, to analyze and visualize complex data. This process is essential for deciphering trends and patterns in large datasets, a task often encountered in machine learning projects.

Ethical considerations in AI involve addressing biases in machine learning algorithms, ensuring privacy and fairness, and making AI systems accessible and beneficial for a diverse range of users. Sustainability, in turn, relates to developing AI solutions that are environmentally conscious and economically viable over the long term.

This Research Topic aims to understand recent advancements in software engineering, particularly in machine learning, artificial intelligence, computer vision, visual analytics, and software quality. It also covers the latest advances in computational thinking and its role in computing education. Furthermore, the Topic seeks to explore the role of computer vision and visual analytics in pushing the boundaries of software solutions, while also aiming to bridge the gap between technological progress and ethical responsibility, with a focus on AI ethics and sustainability in developing future-proof software solutions. This holistic approach combines cutting-edge technologies with a conscientious consideration of their impact on society and the environment, guiding the future of software engineering towards innovation, ethics, and sustainability.

This article collection invites authors to explore the intersection between software quality and engineering and advanced technologies such as machine learning, artificial intelligence (AI), computer vision, and visual analytics. It also takes into account the role of computational thinking at different educational stages. Submissions may highlight the applications of computer vision and visual analytics techniques in enhancing software functionality, user experience, and overall quality. Authors are encouraged to consider the broader implications of their work in the context of AI ethics and sustainability, including the mitigation of biases in AI, the development of environmentally sustainable AI solutions, and the long-term viability of AI technologies in a rapidly evolving digital landscape. Submissions may include original research, empirical studies, theoretical models, systematic literature reviews, surveys, case studies, lessons-learned papers, and practical applications. All papers will undergo a thorough peer-review process to ensure high-quality and impactful research.

Keywords: Software quality, software engineering, machine learning, artificial intelligence, computer vision, visual analytics, computational thinking, AI ethics and sustainability

Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

  • CAREER Q&A
  • 31 May 2022

Why science needs more research software engineers

  • Chris Woolston

Chris Woolston is a freelance writer in Billings, Montana.

Paul Richmond is a research software engineer in the United Kingdom. Credit: Shelley Richmond

In March 2012, a group of like-minded software developers gathered at the University of Oxford, UK, for what they called the Collaborations Workshop. They had a common vocation — building code to support scientific research — but different job titles. And they had no clear career path. The attendees coined a term to describe their line of work: research software engineer (RSE).

A decade later, RSE societies have sprung up in the United Kingdom, mainland Europe, Australia and the United States. In the United Kingdom, at least 31 universities have their own RSE groups, a sign of the growing importance of the profession, says Paul Richmond, an RSE group leader at the University of Sheffield and a past president of the country’s Society of Research Software Engineering. Nature spoke with Richmond about life as an RSE, the role of software in the research enterprise and the state of the field as it reaches its tenth anniversary.

What do RSEs do?

Fundamentally, RSEs build software to support scientific research. They generally don’t have research questions of their own — they develop the computer tools to help other people to do cool things. They might add features to existing software, clear out bugs or build something from scratch. But they don’t just sit in front of a computer and write code. They have to be good communicators who can embed themselves in a team.

What sorts of projects do they work on?

Almost every field of science runs on software, so an RSE could find themselves working on just about anything. In my career, I’ve worked on software for imaging cancer cells and modelling pedestrian traffic. As a postdoc, I worked on computational neuroscience. I don’t know very much about these particular research fields, so I work closely with the oncologists or neuroscientists or whomever to develop the software that’s needed.

Building code is just one part of the role of a research software engineer. Credit: Norman Posselt/Getty

Why do so many universities support their own RSE groups?

Some high-powered researchers at the top of the academic ladder can afford to hire their own RSE. That engineer might be dedicated to maintaining a single piece of software that’s been around for 10 or 20 years. But most research groups need, or can afford, an RSE only on an occasional basis. If their university has an RSE group, they can hire an in-house engineer for one day a week, or for a month at a time, or whatever they need. In that way, the RSE group is like a core facility. The university tries to ensure a steady workflow for the group, but that’s usually not a problem: there’s no shortage of projects to work on.

What else do RSEs do?

A big part of the job is raising awareness about the importance of quality software. An RSE might train a postdoc or graduate student to develop software on their own. Or they might run a seminar on good software practices. In theory, training 50 people could be more impactful than working on a single project. In practice, it’s often hard for RSEs to find the time for teaching, mentorship and advocacy because they’re so busy supporting research.

Do principal investigators (PIs) appreciate the need for RSEs?

It’s mixed. In the past, researchers weren’t always incentivized to use or create good software. But that’s changing. Many journals now require authors to publish code, and that code has to be FAIR: findable, accessible, interoperable and reusable. That last term is very important: good software is a crucial component of research reproducibility. We explain to PIs that they need reliable code so they won’t have to retract their paper six months later.

Who should consider a career as an RSE?

Many RSEs started out as PhD students or postdocs who worked on software to support their own project. They realized that they enjoyed that part of the job more than the actual research. RSEs certainly have the skills to work in industry but they thrive in an environment of cutting-edge science in academia.

Most RSEs have a PhD — I have a PhD in computer graphics — but that’s not necessarily a requirement. Some RSEs end up on the tenure track; I was recently promoted to professor. Many others work as laboratory technicians or service staff. I would encourage any experienced developers with an interest in research to consider RSE as a career. I would also love to see more people from under-represented groups join the field. We need more diversity going forward.

What’s your advice for RSE hopefuls?

Try working on a piece of open-source software. If possible, do some training in a collaborative setting. If you have questions, talk to a working RSE. Consider joining an association. The UK Society of Research Software Engineering is always happy to advise people about getting into the field or how to stand out in a job application. People in the United States can reach out to the US Research Software Engineer Association.

If you’re a PhD student or postdoc, give yourself a challenge: try to convince your supervisors or PI that they really need to embrace good software techniques. If you can change their minds, it’s a good indication that you have the passion and drive to succeed.

What do you envision for the profession over the next 10 years?

I want to see RSEs as equals in the academic environment. Software runs through the entire research process, but professors tend to get most of the recognition and prestige. Pieces of software can have just as much impact as certain research papers, some of them much more so. If RSEs can get the recognition and rewards that they deserve, then the career path will be that much more visible and attractive.

doi: https://doi.org/10.1038/d41586-022-01516-2

Research Topics

Research in the Monmouth University Computer Science and Software Engineering Department falls into the following areas:

Artificial Intelligence

AI can be described as the study of systems that process data that are usually non-numeric, such as text and images, in such a way that we can extract patterns and information (meanings) from them. We use techniques in natural language processing, information retrieval, information extraction, machine translation, machine learning, data mining, cognitive science, and the semantic web, to name a few.

Associated faculty:

  • Richard Scherl

Biomedical Informatics

According to the American Medical Informatics Association, biomedical informatics is an emerging and interdisciplinary field that studies the effective uses of biomedical data, information, and knowledge to improve human health. Research in the department is focused on biomedical ontologies, which are controlled vocabularies of well-defined terms connected by relationships. Biomedical ontologies are developed to express data in ways that computers can understand the meanings, in order to facilitate biomedical data sharing and knowledge discovery, thus improving healthcare and biomedical research.
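The idea of terms connected by relationships can be sketched as a small graph. The following is a minimal illustration only: the terms, the `is_a` and `affects` relationship names, and the helper functions are hypothetical and are not taken from any real biomedical ontology or from the department's work.

```python
from collections import defaultdict

# Hypothetical terms and relationships, invented for illustration only.
RELATIONS = [
    ("myocardial infarction", "is_a", "heart disease"),
    ("heart disease", "is_a", "cardiovascular disease"),
    ("heart disease", "affects", "heart"),
]


def build_index(relations):
    """Group the objects of each (subject, predicate) pair for fast lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in relations:
        index[(subject, predicate)].append(obj)
    return index


def ancestors(term, index):
    """Return every term reachable from `term` by following is_a links."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index.get((current, "is_a"), []):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


INDEX = build_index(RELATIONS)
print(ancestors("myocardial infarction", INDEX))
```

Because the relationships are explicit, a program can infer that a record tagged with the narrow term also belongs under the broader ones, which is the kind of machine-readable meaning the paragraph above describes.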

Data Analytics

Nowadays, huge amounts of digital data are generated in many domains (social networks, urban environments, scientific fields, business domains) and their volumes are growing faster than ever before. According to several studies, by the year 2020 about 1.7 megabytes of new information will be created every second for every human being on the planet. Such a huge volume of data (i.e., Big Data) can be analyzed and exploited by data analytics algorithms to discover hidden and valuable information (i.e., predictive and descriptive models, patterns, regularities) concerning human dynamics and behaviors, leading to useful applications in both business and scientific fields.

Associated faculty:

  • Jiacun Wang
  • Daniela Rosca

Databases

In computer science, the theories and methods that relate to the storage and retrieval of large collections of data continue to be a fertile area of research. Database research in the department is focused in the areas of database management and information retrieval.

Emergency Management

Emergency management is a process by which all individuals, groups, and communities manage hazards in an effort to avoid or ameliorate the impact of disasters resulting from the hazards. It involves four phases: mitigation, preparedness, response, and recovery. Mitigation efforts attempt to prevent hazards from developing into disasters altogether, or to reduce the effects of disasters when they occur. In the preparedness phase, emergency managers develop plans of action for when the disaster strikes, and analyze and manage required resources. The response phase executes the action plans, which includes the mobilization of the necessary emergency services and dispatch of first responders and other material resources in the disaster area. The aim of the recovery phase is to restore the affected area to its previous state. Effective emergency management relies on thorough integration of emergency plans at all levels of government and non-government involvement.

Formal Methods

A formal method is a mathematical method that uses formal language in the specification, design, construction, and verification of computer systems and software. Formal languages include logic, Petri nets, finite state machines, statecharts, and so on. The development of a formal specification provides insights and an understanding of the software requirements and software design. Because formal specifications are mathematical entities, they may be analyzed using mathematical methods.
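As a rough illustration of analyzing a specification mathematically, a finite state machine can be written as a transition table and checked by exhaustive search. This is only a sketch: the states, events, and function names below are hypothetical, and real formal-methods tools are far more capable than this reachability check.

```python
from collections import deque

# Hypothetical specification of a job lifecycle: (state, event) -> next state.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "finish"): "done",
    ("running", "error"): "failed",
    ("failed", "retry"): "running",
}
INITIAL = "idle"


def reachable_states(transitions, initial):
    """Breadth-first search over the transition table from the initial state."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for (source, _event), target in transitions.items():
            if source == state and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def run(transitions, initial, events):
    """Replay a sequence of events, rejecting any the specification forbids."""
    state = initial
    for event in events:
        if (state, event) not in transitions:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = transitions[(state, event)]
    return state


print(sorted(reachable_states(TRANSITIONS, INITIAL)))
print(run(TRANSITIONS, INITIAL, ["start", "error", "retry", "finish"]))
```

Properties such as "every state is reachable" or "no event leaves the `done` state" can then be checked directly against the table rather than argued informally.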

Networks

Networking is the area of computer science that focuses on the study of multiple computer systems that are connected together using a telecommunication system for data and resource sharing and communication. Networks research in the department is focused in the areas of wireless communications, telecommunications, network security, and network algorithms.

Workflow Management

Workflow management deals with the automation of business processes through software. A workflow management system coordinates process instances according to a formal model of the process, and matches individual activities with properly qualified resources for execution. The business environment today is undergoing rapid and constant changes. The way companies do business, including the business processes and their underlying business rules, should adapt to these changes flexibly with minimum interruption to ongoing operations.
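The matching of activities to qualified resources can be sketched in a few lines. This is a simplified illustration under stated assumptions: the process, the roles, the resource names, and the "first qualified resource" rule are all hypothetical, and a real workflow management system would also track process instances, workload, and availability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    required_role: str


@dataclass(frozen=True)
class Resource:
    name: str
    roles: frozenset


# Hypothetical process model: an ordered list of activities.
PROCESS = [
    Activity("review claim", "clerk"),
    Activity("approve payment", "manager"),
]

# Hypothetical resources and the roles they are qualified for.
RESOURCES = [
    Resource("Ana", frozenset({"clerk"})),
    Resource("Ben", frozenset({"clerk", "manager"})),
]


def assign(process, resources):
    """Pair each activity with the first resource holding the required role."""
    plan = []
    for activity in process:
        qualified = [r for r in resources if activity.required_role in r.roles]
        if not qualified:
            raise LookupError(f"no resource qualified for {activity.name!r}")
        plan.append((activity.name, qualified[0].name))
    return plan


print(assign(PROCESS, RESOURCES))
```

Keeping the process model as data rather than code is one way to let business rules change with minimal interruption: editing `PROCESS` or `RESOURCES` changes the plan without touching `assign`.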

Requirements Engineering for Research Software: A Vision

  • Bajraktari, Adrian
  • Binder, Michelle
  • Vogelsang, Andreas

Modern science is relying on software more than ever. The behavior and outcomes of this software shape the scientific and public discourse on important topics like climate change, economic growth, or the spread of infections. Most researchers creating software for scientific purposes are not trained in Software Engineering. As a consequence, research software is often developed ad hoc without following stringent processes. With this paper, we want to characterize research software as a new application domain that needs attention from the Requirements Engineering community. We conducted an exploratory study based on 8 interviews with 12 researchers who develop software. We describe how researchers elicit, document, and analyze requirements for research software and what processes they follow. From this, we derive specific challenges and describe a vision of Requirements Engineering for research software.

  • Computer Science - Software Engineering

Applications now open for a Summer School in Open Science + Research Software Engineering

Madicken Munk

May 20, 2024

Do you want your science, research, and software to be open and accessible? Do you use or develop software in your research? Do you have some basic skills and would like to build on and expand them?

If this sounds like you, then you might be interested in the upcoming Summer School in Open Science and Research Software Engineering. In July 2024, we will be hosting a five-day workshop on open science and research software engineering at the University of Illinois Urbana-Champaign.

This workshop complements previous summer and winter schools hosted by URSSI on research software engineering. At this school, students will hone their open science skills in addition to building their skillset in research software engineering. To that end, attendees must bring a particular software project to which they will apply what they learn during the school. Throughout the sessions, learners will collaborate with other school participants on their software projects and apply software engineering and open science best practices to make their work visible, citable, and accessible.

This school is aimed at early-career researchers, particularly graduate students and postdocs, who are familiar with basic skills such as interacting with the Unix shell, version control using Git, and Python programming, and would like to learn more about best practices for developing research software and leveraging their research software to practice and enhance their own open science. All disciplines are welcome at this school, including but not limited to practitioners in the sciences, engineering, humanities, social sciences, and economics. If you use or develop software in the course of doing research, you will find applicable skills in this workshop for your work.

Target Audience

Ideal candidates for this workshop are science practitioners who use or develop software in their research and want to share their software in their community of practice, or are contributing to other research projects. These practitioners want to make their research open, accessible, and reproducible by implementing open science best practices in addition to building and contributing to research software.

To get the most benefit from this workshop, we expect students to be familiar with the Unix shell, Python, and git, at the level taught at a Software Carpentry Workshop.

Format and topics

This five-day workshop will enable learners to hone their skills in developing sustainable research software, practicing open science in their workflow, and contributing to their communities of practice. Topics covered will include:

  • The Ethos of Open Science
  • Open Tools and Resources
  • Open Results
  • Software design and modularity
  • Collaborative software development via Git+GitHub
  • Software testing in Python
  • Code review
  • Packaging and distributing Python software
  • Documentation
  • Software, Data, and Documentation Licensing
  • Reproducibility

The school will consist of lectures on these topics along with open hacking time to allow participants to practice the concepts covered in the lectures. To facilitate the hands-on experience, each participant must bring a project to work on throughout the course for applying these concepts.

  • Dates : July 29 - August 02, 2024
  • Location : Urbana, IL - University of Illinois at Urbana-Champaign
  • Cost : Free (supported by a grant from the NASA Transform to Open Science Training call)
  • Travel support is available for non-local participants

Important Dates

  • Application deadline : June 06, 2024
  • Application: summer school application
  • Notification : June 10, 2024

IMAGES

  1. Breakdown of Topics for the Software Engineering Models and Methods KA

  2. Software engineering

  3. Advanced topics in software engineering

  4. 150+ Best Research Paper Topics For Software Engineering

  5. (PDF) Writing good software engineering research papers

  6. Latest Software Engineering Thesis Topics For Research Scholars

VIDEO

  1. software engineering important questions for degree 3rd year 5th semester#degree#software#imp#que

  2. Ethics in Software Engineering: An Unspoken Rule

  3. Software Ecosystems: A New Research Agenda

  4. The technology trends impact on software architecture- Managing Software Architecture

  5. Accelerate Your Software Engineering Career

  6. SOFTWARE ENGINEERING IMPORTANT QUESTIONS // BTECH

COMMENTS

  1. Top 10 Software Engineer Research Topics for 2024

    Top Software Engineer Research Topics. 1. Artificial Intelligence and Software Engineering. Intersections between AI and SE. The creation of AI-powered software engineering tools is one potential research area at the intersection of artificial intelligence (AI) and software engineering. These technologies use AI techniques that include machine ...

  2. Software Engineering's Top Topics, Trends, and Researchers

    For this theme issue on the 50th anniversary of software engineering (SE), Redirections offers an overview of the twists, turns, and numerous redirections seen over the years in the SE research literature. Nearly a dozen topics have dominated the past few decades of SE research—and these have been redirected many times. Some are gaining popularity, whereas others are becoming increasingly ...

  3. Journal of Software Engineering Research and Development

    They wanted to define values and basic principles for better software development. On top of being brought into focus, the ... Philipp Hohl, Jil Klünder, Arie van Bennekum, Ryan Lockard, James Gifford, Jürgen Münch, Michael Stupperich and Kurt Schneider. Journal of Software Engineering Research and Development 2018 6 :15.

  4. Trending Topics in Software Engineering

    Abstract. Software Engineering (SE) is evolving to make the best out of the constantly changing technological trends, ranging from development to deployment to management and decommissioning of software systems. In this new column Trending Topics in Software Engineering, we aim at providing insights, reports, and outlooks on how researchers and ...

  5. Architecting the Future of Software Engineering: A Research and

    In close collaboration with our advisory board and other leaders in the software engineering community, we have developed a research roadmap with six focus areas. Figure 1 shows those areas and outlines a suggested course of research topics to undertake. Short descriptions of each focus area and its challenges follow.

  6. Software Engineering

    Software Engineering. At Google, we pride ourselves on our ability to develop and launch new products and features at a very fast pace. This is made possible in part by our world-class engineers, but our approach to software development enables us to balance speed and quality, and is integral to our success. Our obsession for speed and scale is ...

  7. software engineering Latest Research Papers

    End To End . Predictive Software. The paper examines the principles of the Predictive Software Engineering (PSE) framework. The authors examine how PSE enables custom software development companies to offer transparent services and products while staying within the intended budget and a guaranteed budget.

  8. 2022 Research Review

    Research Review 2022. At the 2022 Research Review, our researchers detail how they are forging a new path for software engineering by executing the SEI's technical strategy to deliver tangible results. Researchers highlight methods, prototypes, and tools aimed at the most important problems facing the DoD, industry, and academia, including AI ...

  9. Machine Learning for Software Engineering

    Keywords: Artificial Intelligence, Machine Learning, Software Engineering, Software Development. Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of ...

  10. Human-Centered Approaches in Modern Software Engineering

    This is the central focus of our issue endeavor, where we aim to explore the convergence of social sciences, design principles, and software engineering to establish a more human-centered paradigm. This Research Topic highlights techniques and concepts that can infuse empathy and fairness into software development.

  11. Software Engineering and Intelligent Systems

    Software engineering and intelligent systems are two dynamic and interrelated fields that have witnessed significant advancements and transformations in recent years. The convergence of these domains has led to the development of innovative applications and solutions that are shaping various industries, from healthcare and finance to transportation and manufacturing. This Research Topic aims to ...

  12. Research Topics in Software Engineering

    Overview. This seminar is an opportunity to become familiar with current research in software engineering and more generally with the methods and challenges of scientific research. Each student will be asked to study some papers from the recent software engineering literature and review them. This is an exercise in critical review and analysis.

  13. PDF A National Agenda for Software Engineering Research & Development

    5.6.5 Research Topics 60 5.7 Engineering AI-Enabled Software Systems Research Focus Area 61 5.7.1 Goals 61 5.7.2 Limitations of Current Practice 62 ... with the software engineering research communities, to working with our distributed team to assemble the study, all of it had to be done in a virtual setting. I want to thank everyone for their ...

  14. Research in Software Engineering (RiSE)

    Research in Software Engineering (RiSE) Our mission is to make everyone a programmer and maximize the productivity of every programmer. This will democratize computing to empower every person and every organization to achieve more. We achieve our vision through open-ended fundamental research in programming languages, software engineering, and ...

  15. 150 Best Research Paper Topics For Software Engineering

    This paper reviews software tools to solve complicated tasks in the analysis of data. The paper compares NVivo, HyperRESEARCH, and Dedoose. Data Scientist and Software Development. Data scientists convert data into insights, giving elaborate guidance to those who use the data to make educated decisions and take action.

  16. Software Engineering and Programming Languages

    Developer Tools. Google provides its engineers with cutting-edge developer tools that operate on a codebase with billions of lines of code. The tools are designed to provide engineers with a consistent view of the codebase so they can navigate and edit any project. We research and create new, unique developer tools that allow us to get the ...

  17. Software Engineer Research Paper Topics 2021: Top 5

    Students often choose to buy an assignment from a professional writer because of a wrong topic choice. Thus, to help you land on the best topic for your needs, we have listed the top 5 software engineer research paper topics in the next sections. Machine Learning. Machine learning is one of the most used research topics of software engineers.

  18. Software Engineering

    Abstract. Software engineering is a pragmatic discipline. From the very beginning, the mindset of the software engineering research community has been focused on solving problems faced by practicing software engineers [1], and hence, much of software engineering work is motivated by pragmatic outcomes.

  19. Topic modeling in software engineering research

    Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic "topics" (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to ...

  20. Advances in Software Quality Engineering for Complex Systems

    Keywords: Software quality, software engineering, machine learning, artificial intelligence, computer vision, visual analytics, computational thinking, AI ethics and sustainability. Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements.

  21. An Analysis of Research in Software Engineering:

    This paper presents a software-aided method for assessment and trend analysis, which can be used in software engineering as well as other research fields in computer science (or other disciplines). The method proposed in this paper is modular and automated compared with the method in prior studies [7, 10-22, 2].

  22. Why science needs more research software engineers

    A big part of the job is raising awareness about the importance of quality software. An RSE might train a postdoc or graduate student to develop software on their own. Or they might run a seminar ...

  23. Research Topics

    Research Topics Research in the Monmouth University Computer Science and Software Engineering Department falls into the following areas: Artificial Intelligence AI can be described as the study of systems that process data that are usually non-numeric, such as text and images, in such a way that we can extract patterns and information (meanings) from them. We use techniques in natural language ...

  24. Requirements Engineering for Research Software: A Vision

    Modern science is relying on software more than ever. The behavior and outcomes of this software shape the scientific and public discourse on important topics like climate change, economic growth, or the spread of infections. Most researchers creating software for scientific purposes are not trained in Software Engineering. As a consequence, research software is often developed ad hoc without ...

  25. Applications now open for a Summer School in Open Science + Research

    In July 2024, we will be hosting a five-day workshop on open science and research software engineering at the University of Illinois Urbana-Champaign. This workshop complements previous summer and winter schools hosted by URSSI on research software engineering. At this school, students will hone their open science skills in addition to building ...

  26. Game Theory: An Introduction, 3rd Edition

    E. N. Barron. ISBN: 978-1-394-16912-2. May 2024. 576 pages. Authoritative and quantitative approach to modern game theory with applications from areas including economics, political science, computer science, and engineering. Game Theory acknowledges the role of mathematics in making logical and advantageous decisions in ...

  27. Applying Machine Learning to Earthquake Engineering: A Scientometric

    Machine Learning (ML) has developed rapidly in recent years, achieving exciting advancements in applications such as data mining, computer vision, natural language processing, data feature extraction, and prediction. ML methods are increasingly being utilized in various aspects of seismic engineering, such as predicting the performance of various construction materials, monitoring the health ...