Organizing Your Social Sciences Research Assignments

Definition and Introduction

Case analysis is a problem-based teaching and learning method that involves critically analyzing complex scenarios within an organizational setting for the purpose of placing the student in a “real world” situation and applying reflection and critical thinking skills to contemplate appropriate solutions, decisions, or recommended courses of action. It is considered a more effective teaching technique than in-class role playing or simulation activities. The analytical process is often guided by questions provided by the instructor that ask students to contemplate relationships between the facts and critical incidents described in the case.

Cases generally include both descriptive and statistical elements and rely on students applying abductive reasoning to develop and argue for preferred or best outcomes [i.e., case scenarios rarely have a single correct or perfect answer based on the evidence provided]. Rather than emphasizing theories or concepts, case analysis assignments emphasize building a bridge of relevancy between abstract thinking and practical application and, by so doing, teach the value of both within a specific area of professional practice.

Given this, the purpose of a case analysis paper is to present a structured and logically organized format for analyzing the case situation. It can be assigned to students individually or as a small group assignment, and it may include an in-class presentation component. Case analysis is predominantly taught in economics and business-related courses, but it is also a method of teaching and learning found in other applied social sciences disciplines, such as social work, public relations, education, journalism, and public administration.

Ellet, William. The Case Study Handbook: A Student's Guide. Revised Edition. Boston, MA: Harvard Business School Publishing, 2018; Rasche, Christoph and Achim Seisreiner. Guidelines for Business Case Analysis. University of Potsdam; Writing a Case Analysis. Writing Center, Baruch College; Volpe, Guglielmo. "Case Teaching in Economics: History, Practice and Evidence." Cogent Economics and Finance 3 (December 2015). doi:https://doi.org/10.1080/23322039.2015.1120977.

How to Approach Writing a Case Analysis Paper

The organization and structure of a case analysis paper can vary depending on the organizational setting, the situation, and how your professor wants you to approach the assignment. Nevertheless, preparing to write a case analysis paper involves several important steps. As Hawes notes, a case analysis assignment “...is useful in developing the ability to get to the heart of a problem, analyze it thoroughly, and to indicate the appropriate solution as well as how it should be implemented” [p.48]. This statement encapsulates how you should approach preparing to write a case analysis paper.

Before you begin to write your paper, consider the following analytical procedures:

  • Review the case to get an overview of the situation. A case can be only a few pages in length; however, it is most often very lengthy and contains a significant amount of detailed background information and statistics, with multilayered descriptions of the scenario, the roles and behaviors of various stakeholder groups, and situational events. Therefore, a quick reading of the case will help you gain an overall sense of the situation and illuminate the types of issues and problems that you will need to address in your paper. If your professor has provided questions intended to help frame your analysis, use them to guide your initial reading of the case.
  • Read the case thoroughly. After gaining a general overview of the case, carefully read the content again with the purpose of understanding key circumstances, events, and behaviors among stakeholder groups. Look for information or data that appears contradictory, extraneous, or misleading. At this point, you should be taking notes as you read because this will help you develop a general outline of your paper. The aim is to obtain a complete understanding of the situation so that you can begin contemplating tentative answers to any questions your professor has provided or, if none were provided, developing answers to your own questions about the case scenario and its connection to the course readings, lectures, and class discussions.
  • Determine key stakeholder groups, issues, and events and the relationships they all have to each other. As you analyze the content, pay particular attention to identifying individuals, groups, or organizations described in the case and identify evidence of any problems or issues of concern that impact the situation in a negative way. Other things to look for include identifying any assumptions being made by or about each stakeholder, potentially biased explanations or actions, explicit demands or ultimatums, and the underlying concerns that motivate these behaviors among stakeholders. The goal at this stage is to develop a comprehensive understanding of the situational and behavioral dynamics of the case and the explicit and implicit consequences of each of these actions.
  • Identify the core problems. The next step in most case analysis assignments is to discern what the core [i.e., most damaging, detrimental, injurious] problems are within the organizational setting and to determine their implications. The purpose at this stage of preparing to write your analysis paper is to distinguish between the symptoms of core problems and the core problems themselves and to decide which of these must be addressed immediately and which problems do not appear critical but may escalate over time. Identify evidence from the case to support your decisions by determining what information or data is essential to addressing the core problems and what information is not relevant or is misleading.
  • Explore alternative solutions. As noted, case analysis scenarios rarely have only one correct answer. Therefore, it is important to keep in mind that the process of analyzing the case and diagnosing core problems, while based on evidence, is a subjective process open to various avenues of interpretation. This means that you must consider alternative solutions or courses of action by critically examining strengths and weaknesses, risk factors, and the differences between short and long-term solutions. For each possible solution or course of action, consider the consequences its implementation may have and how it might lead to new problems. Also, consider thinking about your recommended solutions or courses of action in relation to issues of fairness, equity, and inclusion.
  • Decide on a final set of recommendations. The last stage in preparing to write a case analysis paper is to assert an opinion or viewpoint about the recommendations needed to help resolve the core problems as you see them and to make a persuasive argument for supporting this point of view. Prepare a clear rationale for your recommendations based on examining each element of your analysis. Anticipate possible obstacles that could derail their implementation. Consider any counter-arguments that could be made concerning the validity of your recommended actions. Finally, describe a set of criteria and measurable indicators that could be applied to evaluating the effectiveness of your implementation plan.

Use these steps as the framework for writing your paper. Remember that the more detailed you are in taking notes as you critically examine each element of the case, the more information you will have to draw from when you begin to write. This will save you time.

NOTE: If the process of preparing to write a case analysis paper is assigned as a student group project, consider having each member of the group analyze a specific element of the case, including drafting answers to the corresponding questions used by your professor to frame the analysis. This will help make the analytical process more efficient and ensure that the distribution of work is equitable. This can also help determine who is responsible for drafting each part of the final case analysis paper and, if applicable, the in-class presentation.

Framework for Case Analysis. College of Management. University of Massachusetts; Hawes, Jon M. "Teaching is Not Telling: The Case Method as a Form of Interactive Learning." Journal for Advancement of Marketing Education 5 (Winter 2004): 47-54; Rasche, Christoph and Achim Seisreiner. Guidelines for Business Case Analysis. University of Potsdam; Writing a Case Study Analysis. University of Arizona Global Campus Writing Center; Van Ness, Raymond K. A Guide to Case Analysis. School of Business. State University of New York, Albany; Writing a Case Analysis. Business School, University of New South Wales.

Structure and Writing Style

A case analysis paper should be detailed, concise, persuasive, clearly written, and professional in tone and in the use of language. As with other forms of college-level academic writing, declarative statements that convey information, provide a fact, or offer an explanation or any recommended courses of action should be based on evidence. If allowed by your professor, any external sources used to support your analysis, such as course readings, should be properly cited under a list of references. The organization and structure of case analysis papers can vary depending on your professor’s preferred format, but its structure generally follows the steps used for analyzing the case.

Introduction

The introduction should provide a succinct but thorough descriptive overview of the main facts, issues, and core problems of the case. The introduction should also include a brief summary of the most relevant details about the situation and organizational setting. This includes defining the theoretical framework or conceptual model on which any questions used to frame your analysis were based.

Following the rules of most college-level research papers, the introduction should then inform the reader how the paper will be organized. This includes describing the major sections of the paper and the order in which they will be presented. Unless you are told to do so by your professor, you do not need to preview your final recommendations in the introduction. Unlike most college-level research papers, the introduction does not include a statement about the significance of your findings because a case analysis assignment does not involve contributing new knowledge about a research problem.

Background Analysis

Background analysis can vary depending on any guiding questions provided by your professor and the underlying concept or theory that the case is based upon. In general, however, this section of your paper should focus on:

  • Providing an overarching analysis of problems identified from the case scenario, including identifying events that stakeholders find challenging or troublesome,
  • Identifying assumptions made by each stakeholder and any apparent biases they may exhibit,
  • Describing any demands or claims made by or forced upon key stakeholders, and
  • Highlighting any issues of concern or complaints expressed by stakeholders in response to those demands or claims.

These aspects of the case are often in the form of behavioral responses expressed by individuals or groups within the organizational setting. However, note that problems in a case situation can also be reflected in data [or the lack thereof] and in the decision-making, operational, cultural, or institutional structure of the organization. Additionally, demands or claims can be either internal or external to the organization [e.g., a case analysis involving a president considering arms sales to Saudi Arabia could include managing internal demands from White House advisors as well as demands from members of Congress].

Throughout this section, present all relevant evidence from the case that supports your analysis. Do not simply claim there is a problem, an assumption, a demand, or a concern; tell the reader what part of the case informed how you identified these background elements.

Identification of Problems

In most case analysis assignments, there are problems, and then there are problems. Each problem can reflect a multitude of underlying symptoms that are detrimental to the interests of the organization. The purpose of identifying problems is to teach students how to differentiate between problems that vary in severity, impact, and relative importance. Given this, problems can be described in three general forms: those that must be addressed immediately, those that should be addressed but whose impact is not severe, and those that do not require immediate attention and can be set aside for the time being.

All of the problems you identify from the case should be presented in this section of your paper, with a description based on evidence explaining how the problems vary. If the assignment asks you to conduct research to further support your assessment of the problems, include this in your explanation. Remember to cite those sources in a list of references. Use specific evidence from the case and apply appropriate concepts, theories, and models discussed in class or in relevant course readings to highlight and explain the key problems [or problem] that you believe must be solved immediately and describe the underlying symptoms and why they are so critical.

Alternative Solutions

This section is where you provide specific, realistic, and evidence-based solutions to the problems you have identified and make recommendations about how to alleviate the underlying symptomatic conditions impacting the organizational setting. For each solution, you must explain why it was chosen and provide clear evidence to support your reasoning. This can include, for example, course readings and class discussions as well as research resources, such as books, journal articles, research reports, or government documents. In some cases, your professor may encourage you to include personal, anecdotal experiences as evidence to support why you chose a particular solution or set of solutions. Using anecdotal evidence helps promote reflective thinking about the process of determining what qualifies as a core problem and relevant solution.

Throughout this part of the paper, keep in mind the entire array of problems that must be addressed and describe in detail the solutions that might be implemented to resolve these problems.

Recommended Courses of Action

In some case analysis assignments, your professor may ask you to combine the alternative solutions section with your recommended courses of action. However, it is important to know the difference between the two. A solution refers to the answer to a problem. A course of action refers to a procedure or deliberate sequence of activities adopted to proactively confront a situation, often in the context of accomplishing a goal. In this context, proposed courses of action are based on your analysis of alternative solutions. Your description and justification for pursuing each course of action should represent the overall plan for implementing your recommendations.

For each course of action, you need to explain the rationale for your recommendation in a way that confronts challenges, explains risks, and anticipates any counter-arguments from stakeholders. Do this by considering the strengths and weaknesses of each course of action framed in relation to how the action is expected to resolve the core problems presented, the possible ways the action may affect remaining problems, and how the recommended action will be perceived by each stakeholder.

In addition, you should describe the criteria needed to measure how well the implementation of these actions is working and explain which individuals or groups are responsible for ensuring your recommendations are successful. Also, always consider the law of unintended consequences. Outline difficulties that may arise in implementing each course of action and describe how implementing the proposed courses of action [either individually or collectively] may lead to new problems [both large and small].

Throughout this section, you must consider the costs and benefits of recommending your courses of action in relation to uncertainties or missing information and the negative consequences of success.

Conclusion

The conclusion should be brief and introspective. Unlike a research paper, the conclusion in a case analysis paper does not include a summary of key findings and their significance, a statement about how the study contributed to existing knowledge, or an indication of opportunities for future research.

Begin by synthesizing the core problems presented in the case and the relevance of your recommended solutions. This can include an explanation of what you have learned about the case in the context of your answers to the questions provided by your professor. The conclusion is also where you link what you learned from analyzing the case with the course readings or class discussions. This can further demonstrate your understanding of the relationships between the practical case situation and the theoretical and abstract content of assigned readings and other course content.

Problems to Avoid

The literature on case analysis assignments often includes examples of difficulties students have with applying methods of critical analysis and effectively reporting the results of their assessment of the situation. A common reason cited by scholars is that the application of this type of teaching and learning method is limited to applied fields of social and behavioral sciences and, as a result, writing a case analysis paper can be unfamiliar to most students entering college.

After you have drafted your paper, proofread the narrative flow and revise any of these common errors:

  • Unnecessary detail in the background section. The background section should highlight the essential elements of the case based on your analysis. Focus on summarizing the facts and highlighting the key factors that become relevant in the other sections of the paper by eliminating any unnecessary information.
  • Analysis relies too much on opinion. Your analysis is interpretive, but the narrative must be connected clearly to evidence from the case and any models and theories discussed in class or in course readings. Any positions or arguments you make should be supported by evidence.
  • Analysis does not focus on the most important elements of the case. Your paper should provide a thorough overview of the case. However, the analysis should focus on providing evidence about what you identify as the key events, stakeholders, issues, and problems. Emphasize what you identify as the most critical aspects of the case to be developed throughout your analysis. Be thorough but succinct.
  • Writing is too descriptive. A paper with too much descriptive information detracts from your analysis of the complexities of the case situation. Questions about what happened, where, when, and by whom should only be included as essential information leading to your examination of questions related to why, how, and for what purpose.
  • Inadequate definition of a core problem and associated symptoms. A common error found in case analysis papers is recommending a solution or course of action without adequately defining or demonstrating that you understand the problem. Make sure you have clearly described the problem and its impact and scope within the organizational setting. Ensure that you have adequately described the root causes when describing the symptoms of the problem.
  • Recommendations lack specificity. Identify any use of vague statements and indeterminate terminology, such as “a particular experience” or “a large increase to the budget.” These statements cannot be measured and, as a result, there is no way to evaluate their successful implementation. Provide specific data and use direct language in describing recommended actions.
  • Unrealistic, exaggerated, or unattainable recommendations. Review your recommendations to ensure that they are based on the situational facts of the case. Your recommended solutions and courses of action must be based on realistic assumptions and fit within the constraints of the situation. Also note that the case scenario has already happened; therefore, any speculation or arguments about what could have occurred if the circumstances were different should be revised or eliminated.

Bee, Lian Song et al. "Business Students' Perspectives on Case Method Coaching for Problem-Based Learning: Impacts on Student Engagement and Learning Performance in Higher Education." Education & Training 64 (2022): 416-432; The Case Analysis. Fred Meijer Center for Writing and Michigan Authors. Grand Valley State University; Georgallis, Panikos and Kayleigh Bruijn. "Sustainability Teaching using Case-Based Debates." Journal of International Education in Business 15 (2022): 147-163; Hawes, Jon M. "Teaching is Not Telling: The Case Method as a Form of Interactive Learning." Journal for Advancement of Marketing Education 5 (Winter 2004): 47-54; Dean, Kathy Lund and Charles J. Fornaciari. "How to Create and Use Experiential Case-Based Exercises in a Management Classroom." Journal of Management Education 26 (October 2002): 586-603; Klebba, Joanne M. and Janet G. Hamilton. "Structured Case Analysis: Developing Critical Thinking Skills in a Marketing Case Course." Journal of Marketing Education 29 (August 2007): 132-137, 139; Klein, Norman. "The Case Discussion Method Revisited: Some Questions about Student Skills." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 30-32; Mukherjee, Arup. "Effective Use of In-Class Mini Case Analysis for Discovery Learning in an Undergraduate MIS Course." The Journal of Computer Information Systems 40 (Spring 2000): 15-23; Pessoa, Silvia et al. "Scaffolding the Case Analysis in an Organizational Behavior Course: Making Analytical Language Explicit." Journal of Management Education 46 (2022): 226-251; Ramsey, V. J. and L. D. Dodge. "Case Analysis: A Structured Approach." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 27-29; Schweitzer, Karen. "How to Write and Format a Business Case Study." ThoughtCo. https://www.thoughtco.com/how-to-write-and-format-a-business-case-study-466324 (accessed December 5, 2022); Reddy, C. D. "Teaching Research Methodology: Everything's a Case." Electronic Journal of Business Research Methods 18 (December 2020): 178-188; Volpe, Guglielmo. "Case Teaching in Economics: History, Practice and Evidence." Cogent Economics and Finance 3 (December 2015). doi:https://doi.org/10.1080/23322039.2015.1120977.

Writing Tip

Case Study and Case Analysis Are Not the Same!

Confusion often exists between what it means to write a paper that uses a case study research design and writing a paper that analyzes a case; they are two different types of approaches to learning in the social and behavioral sciences. Professors as well as educational researchers contribute to this confusion because they often use the term "case study" when describing the subject of analysis for a case analysis paper. But you are not studying a case for the purpose of generating a comprehensive, multi-faceted understanding of a research problem. Rather, you are critically analyzing a specific scenario to argue logically for recommended solutions and courses of action that lead to optimal outcomes applicable to professional practice.

To avoid any confusion, here are twelve characteristics that delineate the differences between writing a paper using the case study research method and writing a case analysis paper:

  • Case study is a method of in-depth research and rigorous inquiry; case analysis is a reliable method of teaching and learning. A case study is a modality of research that investigates a phenomenon for the purpose of creating new knowledge, solving a problem, or testing a hypothesis using empirical evidence derived from the case being studied. Often, the results are used to generalize about a larger population or within a wider context. The writing adheres to the traditional standards of a scholarly research study. A case analysis is a pedagogical tool used to teach students how to reflect and think critically about a practical, real-life problem in an organizational setting.
  • The researcher is responsible for identifying the case to study; a case analysis is assigned by your professor. As the researcher, you choose the case study to investigate in support of obtaining new knowledge and understanding about the research problem. The case in a case analysis assignment is almost always provided, and sometimes written, by your professor and either given to every student in class to analyze individually or to a small group of students, or students select a case to analyze from a predetermined list.
  • A case study is indeterminate and boundless; a case analysis is predetermined and confined. A case study can be almost anything [see the tenth item below] as long as it relates directly to examining the research problem. This relationship is the only limit to what a researcher can choose as the subject of their case study. The content of a case analysis is determined by your professor and its parameters are well-defined and limited to elucidating insights of practical value applied to practice.
  • Case study is fact-based and describes actual events or situations; case analysis can be entirely fictional or adapted from an actual situation. The entire content of a case study must be grounded in reality to be a valid subject of investigation in an empirical research study. A case analysis only needs to set the stage for critically examining a situation in practice and, therefore, can be entirely fictional or adapted, all or in part, from an actual situation.
  • Research using a case study method must adhere to principles of intellectual honesty and academic integrity; a case analysis scenario can include misleading or false information. A case study paper must report research objectively and factually to ensure that any findings are understood to be logically correct and trustworthy. A case analysis scenario may include misleading or false information intended to deliberately distract from the central issues of the case. The purpose is to teach students how to sort through conflicting or useless information in order to come up with the preferred solution. Any use of misleading or false information in academic research is considered unethical.
  • Case study is linked to a research problem; case analysis is linked to a practical situation or scenario. In the social sciences, the subject of an investigation is most often framed as a problem that must be researched in order to generate new knowledge leading to a solution. Case analysis narratives are grounded in real-life scenarios for the purpose of examining the realities of decision-making behavior and processes within organizational settings. Case analysis assignments include a problem or set of problems to be analyzed. However, the goal is centered around the act of identifying and evaluating courses of action leading to best possible outcomes.
  • The purpose of a case study is to create new knowledge through research; the purpose of a case analysis is to teach new understanding. Case studies are a choice of methodological design intended to create new knowledge about resolving a research problem. A case analysis is a mode of teaching and learning intended to create new understanding and an awareness of uncertainty applied to practice through acts of critical thinking and reflection.
  • A case study seeks to identify the best possible solution to a research problem; case analysis can have an indeterminate set of solutions or outcomes. Your role in studying a case is to discover the most logical, evidence-based ways to address a research problem. A case analysis assignment rarely has a single correct answer because one of the goals is to force students to confront the real-life dynamics of uncertainty, ambiguity, and missing or conflicting information within professional practice. Under these conditions, a perfect outcome or solution almost never exists.
  • Case study is unbounded in its sources and relies on gathering external information; case analysis is a self-contained subject of analysis. The scope of a case study chosen as a method of research is bounded. However, the researcher is free to gather whatever information and data is necessary to investigate its relevance to understanding the research problem. For a case analysis assignment, your professor will often ask you to examine solutions or recommended courses of action based solely on facts and information from the case.
  • Case study can be a person, place, object, issue, event, condition, or phenomenon; a case analysis is a carefully constructed synopsis of events, situations, and behaviors. The research problem dictates the type of case being studied and, therefore, the design can encompass almost anything tangible as long as it fulfills the objective of generating new knowledge and understanding. A case analysis is in the form of a narrative containing descriptions of facts, situations, processes, rules, and behaviors within a particular setting and under a specific set of circumstances.
  • Case study can represent an open-ended subject of inquiry; a case analysis is a narrative about something that has happened in the past. A case study is not restricted by time and can encompass an event or issue with no temporal limit or end. For example, the current war in Ukraine can be used as a case study of how medical personnel help civilians during a large military conflict, even though circumstances around this event are still evolving. A case analysis can be used to elicit critical thinking about current or future situations in practice, but the case itself is a narrative about something finite and that has taken place in the past.
  • Multiple case studies can be used in a research study; case analysis involves examining a single scenario. Case study research can use two or more cases to examine a problem, often for the purpose of conducting a comparative investigation intended to discover hidden relationships, document emerging trends, or determine variations among different examples. A case analysis assignment typically describes a stand-alone, self-contained situation and any comparisons among cases are conducted during in-class discussions and/or student presentations.

The Case Analysis. Fred Meijer Center for Writing and Michigan Authors. Grand Valley State University; Mills, Albert J., Gabrielle Durepos, and Elden Wiebe, editors. Encyclopedia of Case Study Research. Thousand Oaks, CA: SAGE Publications, 2010; Ramsey, V. J. and L. D. Dodge. "Case Analysis: A Structured Approach." Exchange: The Organizational Behavior Teaching Journal 6 (November 1981): 27-29; Yin, Robert K. Case Study Research and Applications: Design and Methods. 6th edition. Thousand Oaks, CA: Sage, 2017; Crowe, Sarah et al. “The Case Study Approach.” BMC Medical Research Methodology 11 (2011). doi: 10.1186/1471-2288-11-100; Yin, Robert K. Case Study Research: Design and Methods. 4th edition. Thousand Oaks, CA: Sage Publishing, 1994.

  • Last Updated: Mar 6, 2024 1:00 PM
  • URL: https://libguides.usc.edu/writingguide/assignments

The Oxford Handbook of Political Methodology

28 Case Selection for Case‐Study Analysis: Qualitative and Quantitative Techniques

John Gerring is Professor of Political Science, Boston University.

  • Published: 02 September 2009

This article presents some guidance by cataloging nine different techniques for case selection: typical, diverse, extreme, deviant, influential, crucial, pathway, most similar, and most different. It also indicates that if the researcher is starting from a quantitative database, then methods for finding influential outliers can be used. In particular, the article clarifies the general principles that might guide the process of case selection in case-study research. Cases are more or less representative of some broader phenomenon and, on that score, may be considered better or worse subjects for intensive analysis. The article then draws attention to two ambiguities in case-selection strategies in case-study research. The first concerns the admixture of several case-selection strategies. The second concerns the changing status of a case as a study proceeds. Some case studies follow only one strategy of case selection.

Case‐study analysis focuses on one or several cases that are expected to provide insight into a larger population. This presents the researcher with a formidable problem of case selection: Which cases should she or he choose?

In large‐sample research, the task of case selection is usually handled by some version of randomization. However, in case‐study research the sample is small (by definition) and this makes random sampling problematic, for any given sample may be wildly unrepresentative. Moreover, there is no guarantee that a few cases, chosen randomly, will provide leverage into the research question of interest.

In order to isolate a sample of cases that both reproduces the relevant causal features of a larger universe (representativeness) and provides variation along the dimensions of theoretical interest (causal leverage), case selection for very small samples must employ purposive (nonrandom) selection procedures. Nine such methods are discussed in this chapter, each of which may be identified with a distinct case‐study “type”: typical, diverse, extreme, deviant, influential, crucial, pathway, most‐similar, and most‐different. Table 28.1 summarizes each type, including its general definition, a technique for locating it within a population of potential cases, its uses, and its probable representativeness.

While each of these techniques is normally practiced on one or several cases (the diverse, most‐similar, and most‐different methods require at least two), all may employ additional cases, with the proviso that, at some point, they will no longer offer an opportunity for in‐depth analysis and will thus no longer be “case studies” in the usual sense (Gerring 2007, ch. 2). It will also be seen that small‐N case‐selection procedures rest, at least implicitly, upon an analysis of a larger population of potential cases (as does randomization). The case(s) identified for intensive study is chosen from a population and the reasons for this choice hinge upon the way in which it is situated within that population. This is the origin of the terminology: typical, diverse, extreme, et al. It follows that case‐selection procedures in case‐study research may build upon prior cross‐case analysis and that they depend, at the very least, upon certain assumptions about the broader population.

In certain circumstances, the case‐selection procedure may be structured by a quantitative analysis of the larger population. Here, several caveats must be satisfied. First, the inference must pertain to more than a few dozen cases; otherwise, statistical analysis is problematic. Second, relevant data must be available for that population, or a significant sample of that population, on key variables, and the researcher must feel reasonably confident in the accuracy and conceptual validity of these variables. Third, all the standard assumptions of statistical research (e.g. identification, specification, robustness) must be carefully considered, and wherever possible, tested. I shall not dilate further on these familiar issues except to warn the researcher against the unreflective use of statistical techniques. 1 When these requirements are not met, the researcher must employ a qualitative approach to case selection.

The point of this chapter is to elucidate general principles that might guide the process of case selection in case‐study research, building upon earlier work by Harry Eckstein, Arend Lijphart, and others. Sometimes, these principles can be applied in a quantitative framework and sometimes they are limited to a qualitative framework. In either case, the logic of case selection remains quite similar, whether practiced in small‐N or large‐N contexts.

Before we begin, a bit of notation is necessary. In this chapter “N” refers to cases, not observations. Here, I am concerned primarily with causal inference, rather than inferences that are descriptive or predictive in nature. Thus, all hypotheses involve at least one independent variable (X) and one dependent variable (Y). For convenience, I shall label the causal factor of special theoretical interest X₁, and the control variable, or vector of controls (if there are any), X₂. If the writer is concerned to explain a puzzling outcome, but has no preconceptions about its causes, then the research will be described as Y‐centered. If a researcher is concerned to investigate the effects of a particular cause, with no preconceptions about what these effects might be, the research will be described as X‐centered. If a researcher is concerned to investigate a particular causal relationship, the research will be described as X₁/Y‐centered, for it connects a particular cause with a particular outcome. 2 X‐ or Y‐centered research is exploratory; its purpose is to generate new hypotheses. X₁/Y‐centered research, by contrast, is confirmatory/disconfirmatory; its purpose is to test an existing hypothesis.

1 Typical Case

In order for a focused case study to provide insight into a broader phenomenon it must be representative of a broader set of cases. It is in this context that one may speak of a typical‐case approach to case selection. The typical case exemplifies what is considered to be a typical set of values, given some general understanding of a phenomenon. By construction, the typical case is also a representative case.

Some typical cases serve an exploratory role. Here, the author chooses a case based upon a set of descriptive characteristics and then probes for causal relationships. Robert and Helen Lynd (1929/1956) selected a single city “to be as representative as possible of contemporary American life.” Specifically, they were looking for a city with

1) a temperate climate; 2) a sufficiently rapid rate of growth to ensure the presence of a plentiful assortment of the growing pains accompanying contemporary social change; 3) an industrial culture with modern, high‐speed machine production; 4) the absence of dominance of the city's industry by a single plant (i.e., not a one‐industry town); 5) a substantial local artistic life to balance its industrial activity …; and 6) the absence of any outstanding peculiarities or acute local problems which would mark the city off from the midchannel sort of American community. (Lynd and Lynd 1929/1956, quoted in Yin 2004, 29–30)

After examining a number of options the Lynds decided that Muncie, Indiana, was more representative than, or at least as representative as, other midsized cities in America, thus qualifying as a typical case.

This is an inductive approach to case selection. Note that typicality may be understood according to the mean, median, or mode on a particular dimension; there may be multiple dimensions (as in the foregoing example); and each may be differently weighted (some dimensions may be more important than others). Where the selection criteria are multidimensional and a large sample of potential cases is in play, some form of factor analysis may be useful in identifying the most‐typical case(s).
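The idea of weighted, multidimensional typicality can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, and it is not the factor analysis mentioned above: it standardizes each dimension, weights the absolute distance from the mean, and treats the case with the lowest total as the most typical. The function name `typicality_scores`, the city labels, the two dimensions, and the weights are all hypothetical.

```python
from statistics import mean, pstdev


def typicality_scores(cases, weights):
    """Return a weighted sum of absolute z-scores for each case.

    cases   -- dict mapping case name -> dict of dimension -> numeric value
    weights -- dict mapping dimension -> importance weight (hypothetical)
    Lower scores mean the case sits closer to the mean on the weighted dimensions.
    """
    scores = {name: 0.0 for name in cases}
    for dim, weight in weights.items():
        column = [values[dim] for values in cases.values()]
        mu, sd = mean(column), pstdev(column)
        for name, values in cases.items():
            # A dimension with no variation cannot distinguish cases.
            z = 0.0 if sd == 0 else (values[dim] - mu) / sd
            scores[name] += weight * abs(z)
    return scores


# Hypothetical cities scored on two invented dimensions.
cities = {
    "City A": {"growth_rate": 1.0, "industry_share": 10.0},
    "City B": {"growth_rate": 2.0, "industry_share": 20.0},
    "City C": {"growth_rate": 3.0, "industry_share": 30.0},
}
weights = {"growth_rate": 2.0, "industry_share": 1.0}

scores = typicality_scores(cities, weights)
most_typical = min(scores, key=scores.get)
print(most_typical)  # City B, which sits exactly at the mean on both dimensions
```

Using the mean is one choice among several; the same structure would work with the median or mode as the reference point, as the text notes.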

However, the more common employment of the typical‐case method involves a causal model of some phenomenon of theoretical interest. Here, the researcher has identified a particular outcome (Y), and perhaps a specific X₁/Y hypothesis, which she wishes to investigate. In order to do so, she looks for a typical example of that causal relationship. Intuitively, one imagines that a case selected according to the mean values of all parameters must be a typical case relative to some causal relationship. However, this is by no means assured.

Suppose that the Lynds were primarily interested in explaining feelings of trust/distrust among members of different social classes (one of the implicit research goals of the Middletown study). This outcome is likely to be affected by many factors, only some of which are included in their six selection criteria. So choosing cases with respect to a causal hypothesis involves, first of all, identifying the relevant parameters. It involves, secondly, the selection of a case that has a “typical” value relative to the overall causal model; it is well explained. Cases with untypical scores on a particular dimension (e.g. very high or very low) may still be typical examples of a causal relationship. Indeed, they may be more typical than cases whose values lie close to the mean. Thus, a descriptive understanding of typicality is quite different from a causal understanding of typicality. Since it is the latter version that is more common, I shall adopt this understanding of typicality in the remainder of the discussion.

From a qualitative perspective, causal typicality involves the selection of a case that conforms to expectations about some general causal relationship. It performs as expected. In a quantitative setting, this notion is measured by the size of a case's residual in a large‐N cross‐case model. Typical cases lie on or near the regression line; their residuals are small. Insofar as the model is correctly specified, the size of a case's residual (i.e. the number of standard deviations that separate the actual value from the fitted value) provides a helpful clue to how representative that case is likely to be. “Outliers” are unlikely to be representative of the target population.

Of course, just because a case has a low residual does not necessarily mean that it is a representative case (with respect to the causal relationship of interest). Indeed, the issue of case representativeness is an issue that can never be definitively settled. When one refers to a “typical case” one is saying, in effect, that the probability of a case's representativeness is high, relative to other cases. This test of typicality is misleading if the statistical model is mis‐specified. And it provides little insurance against errors that are purely stochastic. A case may lie directly on the regression line but still be, in some important respect, atypical. For example, it might have an odd combination of values; the interaction of variables might be different from other cases; or additional causal mechanisms might be at work. For this reason, it is important to supplement a statistical analysis of cases with evidence drawn from the case in question (the case study itself) and with our deductive knowledge of the world. One should never judge a case solely by its residual. Yet, all other things being equal, a case with a low residual is less likely to be unusual than a case with a high residual, and to this extent the method of case selection outlined here may be a helpful guide to case‐study researchers faced with a large number of potential cases.
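As a rough illustration of residual-based typicality, the Python sketch below fits a bivariate least-squares line of Y on X₁, divides each residual by the standard deviation of all residuals, and ranks cases by the absolute size of that standardized residual. It assumes a single predictor and no controls, which is far simpler than a realistic cross-case model; the function names and the four-case data set are hypothetical.

```python
from statistics import mean, pstdev


def standardized_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares and return each case's
    residual divided by the standard deviation of all residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    scale = pstdev(residuals)
    # A perfect fit leaves nothing to standardize.
    return [0.0 if scale == 0 else r / scale for r in residuals]


def rank_by_typicality(names, x, y):
    """Return (name, standardized residual) pairs, most typical case first."""
    z = standardized_residuals(x, y)
    return sorted(zip(names, z), key=lambda pair: abs(pair[1]))


# Hypothetical cases: X1 is the causal factor, Y the outcome.
names = ["A", "B", "C", "D"]
x1 = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 9.0]

ranking = rank_by_typicality(names, x1, y)
print(ranking[0][0])   # B has the smallest absolute residual
print(ranking[-1][0])  # C has the largest absolute residual
```

Note that case D, with the most extreme outcome, is not the largest outlier here, because it pulls the fitted line toward itself. This is one reason a case should not be judged by its residual alone.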

By way of conclusion, it should be noted that because the typical case embodies a typical value on some set of causally relevant dimensions, the variance of interest to the researcher must lie within that case. Specifically, the typical case of some phenomenon may be helpful in exploring causal mechanisms and in solving identification problems (e.g. endogeneity between X₁ and Y, an omitted variable that may account for X₁ and Y, or some other spurious causal association). Depending upon the results of the case study, the author may confirm an existing hypothesis, disconfirm that hypothesis, or reframe it in a way that is consistent with the findings of the case study. These are the uses of the typical‐case study.

2 Diverse Cases

A second case‐selection strategy has as its primary objective the achievement of maximum variance along relevant dimensions. I refer to this as a diverse‐case method. For obvious reasons, this method requires the selection of a set of cases—at minimum, two—which are intended to represent the full range of values characterizing X₁, Y, or some particular X₁/Y relationship. 3

Where the individual variable of interest is categorical (on/off, red/black/blue, Jewish/Protestant/Catholic), the identification of diversity is readily apparent. The investigator simply chooses one case from each category. For a continuous variable, the choices are not so obvious. However, the researcher usually chooses both extreme values (high and low), and perhaps the mean or median as well. The researcher may also look for break‐points in the distribution that seem to correspond to categorical differences among cases. Or she may follow a theoretical hunch about which threshold values count, i.e. which are likely to produce different values on Y.

Another sort of diverse case takes account of the values of multiple variables (i.e. a vector), rather than a single variable. If these variables are categorical, the identification of causal types rests upon the intersection of each category. Two dichotomous variables produce a matrix with four cells. Three dichotomous variables produce a matrix of eight cells. And so forth. If all variables are deemed relevant to the analysis, the selection of diverse cases mandates the selection of one case drawn from within each cell. Let us say that an outcome is thought to be affected by sex, race (black/white), and marital status. Here, a diverse‐case strategy of case selection would identify one case within each of these intersecting cells—a total of eight cases. Things become slightly more complicated when one or more of the factors is continuous, rather than categorical. Here, case values do not fall neatly into cells. Rather, these cells must be created by fiat—e.g. high, medium, low.
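The cell logic of the three-factor example can be made concrete with a short sketch. The category labels for sex and marital status, the five-case population, and the helper name are all assumptions made for illustration; the rule of taking the first case found in each cell is a placeholder for whatever within-cell selection rule a researcher actually prefers.

```python
from itertools import product

# Hypothetical categories for the three dichotomous factors in the example.
factors = {
    "sex": ["female", "male"],
    "race": ["black", "white"],
    "marital_status": ["married", "unmarried"],
}

# Every intersection of categories is one cell of the typology.
cells = list(product(*factors.values()))

# Hypothetical candidate cases: (case id, sex, race, marital status).
population = [
    (1, "female", "black", "married"),
    (2, "female", "black", "married"),
    (3, "male", "white", "unmarried"),
    (4, "female", "white", "married"),
    (5, "male", "black", "unmarried"),
]

def pick_diverse_cases(population, cells):
    """Return the first case found in each cell, plus the cells left empty."""
    chosen = {}
    for case_id, *profile in population:
        chosen.setdefault(tuple(profile), case_id)
    empty = [cell for cell in cells if cell not in chosen]
    return chosen, empty

chosen, empty = pick_diverse_cases(population, cells)
```

Listing the empty cells is useful in practice, since a diverse-case design is incomplete until every relevant cell is represented or its absence from the population is explained.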

It will be seen that where multiple variables are under consideration, the logic of diverse‐case analysis rests upon the logic of typological theorizing—where different combinations of variables are assumed to have effects on an outcome that vary across types ( Elman 2005 ; George and Bennett 2005 , 235; Lazarsfeld and Barton 1951 ). George and Smoke, for example, wish to explore different types of deterrence failure—by “fait accompli,” by “limited probe,” and by “controlled pressure.” Consequently, they wish to find cases that exemplify each type of causal mechanism. 4

Diversity may thus refer to a range of variation on X or Y, or to a particular combination of causal factors (with or without a consideration of the outcome). In each instance, the goal of case selection is to capture the full range of variation along the dimension(s) of interest.

Since diversity can mean many things, its employment in a large‐N setting is necessarily dependent upon how this key term is defined. If it is understood to pertain only to a single variable (X₁ or Y), then the task is fairly simple. A categorical variable mandates the choice of at least one case from each category—two if dichotomous, three if trichotomous, and so forth. A continuous variable suggests the choice of at least one “high” and “low” value, and perhaps one drawn from the mean or median. But other choices might also be justified, according to one's hunch about the underlying causal relationship or according to natural thresholds found in the data, which may be grouped into discrete categories. Single‐variable traits are usually easy to discover in a large‐N setting through descriptive statistics or through visual inspection of the data.

Where diversity refers to particular combinations of variables, the relevant cross‐case technique is some version of stratified random sampling (in a probabilistic setting) or Qualitative Comparative Analysis (in a deterministic setting) (Ragin 2000). If the researcher suspects that a causal relationship is affected not only by combinations of factors but also by their sequencing, then the technique of analysis must incorporate temporal elements (Abbott 2001; Abbott and Forrest 1986; Abbott and Tsay 2000). Thus, the method of identifying causal types rests upon whatever method of identifying causal relationships is employed in the large‐N sample.

Note that the identification of distinct case types is intended to identify groups of cases that are internally homogeneous (in all respects that might affect the causal relationship of interest). Thus, the choice of cases within each group should not be problematic, and may be accomplished through random sampling or purposive case selection. However, if there is suspected diversity within each category, then measures should be taken to assure that the chosen cases are typical of each category. A case study should not focus on an atypical member of a subgroup.

Indeed, considerations of diversity and typicality often go together. Thus, in a study of globalization and social welfare systems, Duane Swank (2002) first identifies three distinctive groups of welfare states: “universalistic” (social democratic), “corporatist conservative,” and “liberal.” Next, he looks within each group to find the most‐typical cases. He decides that the Nordic countries are more typical of the universalistic model than the Netherlands since the latter has “some characteristics of the occupationally based program structure and a political context of Christian Democratic‐led governments typical of the corporatist conservative nations” ( Swank 2002 , 11; see also Esping‐Andersen 1990 ). Thus, the Nordic countries are chosen as representative cases within the universalistic case type, and are accompanied in the case‐study portion of his analysis by other cases chosen to represent the other welfare state types (corporatist conservative and liberal).

Evidently, when a sample encompasses a full range of variation on relevant parameters one is likely to enhance the representativeness of that sample (relative to some population). This is a distinct advantage. Of course, the inclusion of a full range of variation may distort the actual distribution of cases across this spectrum. If there are more “high” cases than “low” cases in a population and the researcher chooses only one high case and one low case, the resulting sample of two is not perfectly representative. Even so, the diverse‐case method probably has stronger claims to representativeness than any other small‐N sample (including the standalone typical case). The selection of diverse cases has the additional advantage of introducing variation on the key variables of interest. A set of diverse cases is, by definition, a set of cases that encompasses a range of high and low values on relevant dimensions. There is, therefore, much to recommend this method of case selection. I suspect that these advantages are commonly understood and are applied on an intuitive level by case‐study researchers. However, the lack of a recognizable name—and an explicit methodological defense—has made it difficult for case‐study researchers to utilize this method of case selection, and to do so in an explicit and self‐conscious fashion. Neologism has its uses.

3 Extreme Case

The extreme‐case method selects a case because of its extreme value on an independent (X₁) or dependent (Y) variable of interest. Thus, studies of domestic violence may choose to focus on extreme instances of abuse (Browne 1987). Studies of altruism may focus on those rare individuals who risked their lives to help others (e.g. Holocaust resisters) (Monroe 1996). Studies of ethnic politics may focus on the most heterogeneous societies (e.g. Papua New Guinea) in order to better understand the role of ethnicity in a democratic setting (Reilly 2000–1). Studies of industrial policy often focus on the most successful countries (i.e. the NICS) (Deyo 1987). And so forth. 5

Often an extreme case corresponds to a case that is considered to be prototypical or paradigmatic of some phenomenon of interest. This is because concepts are often defined by their extremes, i.e. their ideal types. Italian Fascism defines the concept of Fascism, in part, because it offered the most extreme example of that phenomenon. However, the methodological value of this case, and others like it, derives from its extremity (along some dimension of interest), not its theoretical status or its status in the literature on a subject.

The notion of “extreme” may now be defined more precisely. An extreme value is an observation that lies far away from the mean of a given distribution. This may be measured (if there are sufficient observations) by a case's “Z score”—the number of standard deviations between a case and the mean value for that sample. Extreme cases have high Z scores, and for this reason may serve as useful subjects for intensive analysis.
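As a minimal sketch of the Z-score measure just defined, the snippet below scores seven invented cases on a single variable and picks the one farthest from the mean. The data and names are hypothetical, and the population standard deviation (`pstdev`) is used only as one reasonable choice; the sample version (`stdev`) would rank the cases identically.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Number of standard deviations between each value and the sample mean."""
    mu = mean(values)
    sigma = pstdev(values)  # stdev() is the other common choice
    return [(v - mu) / sigma for v in values]

# Hypothetical scores for seven cases on a single variable of interest.
scores = {"A": 10, "B": 12, "C": 11, "D": 13, "E": 9, "F": 12, "G": 31}

z = dict(zip(scores, z_scores(list(scores.values()))))

# The extreme case is the one farthest from the mean in either direction.
extreme_case = max(z, key=lambda name: abs(z[name]))
```

Taking the absolute value reflects the point that distance from the mean may run in either direction.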

For a continuous variable, the distance from the mean may be in either direction (positive or negative). For a dichotomous variable (present/absent), extremeness may be interpreted as unusual. If most cases are positive along a given dimension, then a negative case constitutes an extreme case. If most cases are negative, then a positive case constitutes an extreme case. It should be clear that researchers are not simply concerned with cases where something “happened,” but also with cases where something did not. It is the rareness of the value that makes a case valuable, in this context, not its positive or negative value. 6 Thus, if one is studying state capacity, a case of state failure is probably more informative than a case of state endurance simply because the former is more unusual. Similarly, if one is interested in incest taboos, a culture where the incest taboo is absent or weak is probably more useful than a culture where it is present or strong. Fascism is more important than nonfascism. And so forth. There is a good reason, therefore, why case studies of revolution tend to focus on “revolutionary” cases. Theda Skocpol (1979) had much more to learn from France than from Austro‐Hungary since France was more unusual than Austro‐Hungary within the population of nation states that Skocpol was concerned to explain. The reason is quite simple: There are fewer revolutionary cases than nonrevolutionary cases; thus, the variation that we explore as a clue to causal relationships is encapsulated in these cases, against a background of nonrevolutionary cases.

Note that the extreme‐case method of case selection appears to violate the social science folk wisdom warning us not to “select on the dependent variable.” 7 Selecting cases on the dependent variable is indeed problematic if a number of cases are chosen, all of which lie on one end of a variable's spectrum (they are all positive or negative), and if the researcher then subjects this sample to cross‐case analysis as if it were representative of a population. 8 Results for this sort of analysis would almost assuredly be biased. Moreover, there will be little variation to explain since the values of each case are explicitly constrained.

However, this is not the proper employment of the extreme‐case method. (It is more appropriately labeled an extreme‐sample method.) The extreme‐case method actually refers back to a larger sample of cases that lie in the background of the analysis and provide a full range of variation as well as a more representative picture of the population. It is a self‐conscious attempt to maximize variance on the dimension of interest, not to minimize it. If this population of cases is well understood—either through the author's own cross‐case analysis, through the work of others, or through common sense—then a researcher may justify the selection of a single case exemplifying an extreme value for within‐case analysis. If not, the researcher may be well advised to follow a diverse‐case method, as discussed above.

By way of conclusion, let us return to the problem of representativeness. It will be seen that an extreme case may be typical or deviant. There is simply no way to tell because the researcher has not yet specified an X₁/Y causal proposition. Once such a causal proposition has been specified one may then ask whether the case in question is similar to some population of cases in all respects that might affect the X₁/Y relationship of interest (i.e. unit homogeneous). It is at this point that it becomes possible to say, within the context of a cross‐case statistical model, whether a case lies near to, or far from, the regression line. However, this sort of analysis means that the researcher is no longer pursuing an extreme‐case method. The extreme‐case method is purely exploratory—a way of probing possible causes of Y, or possible effects of X, in an open‐ended fashion. If the researcher has some notion of what additional factors might affect the outcome of interest, or of what relationship the causal factor of interest might have with Y, then she ought to pursue one of the other methods explored in this chapter. This also implies that an extreme‐case method may transform into a different kind of approach as a study evolves; that is, as a more specific hypothesis comes to light. Useful extreme cases at the outset of a study may prove less useful at a later stage of analysis.

4 Deviant Case

The deviant‐case method selects that case(s) which, by reference to some general understanding of a topic (either a specific theory or common sense), demonstrates a surprising value. It is thus the contrary of the typical case. Barbara Geddes (2003) notes the importance of deviant cases in medical science, where researchers are habitually focused on that which is “pathological” (according to standard theory and practice). The New England Journal of Medicine , one of the premier journals of the field, carries a regular feature entitled Case Records of the Massachusetts General Hospital. These articles bear titles like the following: “An 80‐Year‐Old Woman with Sudden Unilateral Blindness” or “A 76‐Year‐Old Man with Fever, Dyspnea, Pulmonary Infiltrates, Pleural Effusions, and Confusion.” 9 Another interesting example drawn from the field of medicine concerns the extensive study now devoted to a small number of persons who seem resistant to the AIDS virus ( Buchbinder and Vittinghoff 1999 ; Haynes, Pantaleo, and Fauci 1996 ). Why are they resistant? What is different about these people? What can we learn about AIDS in other patients by observing people who have built‐in resistance to this disease?

Likewise, in psychology and sociology case studies may be comprised of deviant (in the social sense) persons or groups. In economics, case studies may consist of countries or businesses that overperform (e.g. Botswana; Microsoft) or underperform (e.g. Britain through most of the twentieth century; Sears in recent decades) relative to some set of expectations. In political science, case studies may focus on countries where the welfare state is more developed (e.g. Sweden) or less developed (e.g. the United States) than one would expect, given a set of general expectations about welfare state development. The deviant case is closely linked to the investigation of theoretical anomalies. Indeed, to say deviant is to imply “anomalous.” 10

Note that while extreme cases are judged relative to the mean of a single distribution (the distribution of values along a single variable), deviant cases are judged relative to some general model of causal relations. The deviant‐case method selects cases which, by reference to some (presumably) general relationship, demonstrate a surprising value. They are “deviant” in that they are poorly explained by the multivariate model. The important point is that deviant‐ness can only be assessed relative to the general (quantitative or qualitative) model. This means that the relative deviant‐ness of a case is likely to change whenever the general model is altered. For example, the United States is a deviant welfare state when this outcome is gauged relative to societal wealth. But it is less deviant—and perhaps not deviant at all—when certain additional (political and societal) factors are included in the model, as discussed in the epilogue. Deviance is model dependent. Thus, when discussing the concept of the deviant case it is helpful to ask the following question: Relative to what general model (or set of background factors) is Case A deviant?

Conceptually, we have said that the deviant case is the logical contrary of the typical case. This translates into a directly contrasting statistical measurement. While the typical case is one with a low residual (in some general model of causal relations), a deviant case is one with a high residual. This means, following our previous discussion, that the deviant case is likely to be an unrepresentative case, and in this respect appears to violate the supposition that case‐study samples should seek to reproduce features of a larger population.

However, it must be borne in mind that the primary purpose of a deviant‐case analysis is to probe for new—but as yet unspecified—explanations. (If the purpose is to disprove an extant theory I shall refer to the study as crucial‐case, as discussed below.) The researcher hopes that causal processes identified within the deviant case will illustrate some causal factor that is applicable to other (more or less deviant) cases. This means that a deviant‐case study usually culminates in a general proposition, one that may be applied to other cases in the population. Once this general proposition has been introduced into the overall model, the expectation is that the chosen case will no longer be an outlier. Indeed, the hope is that it will now be typical, as judged by its small residual in the adjusted model. (The exception would be a circumstance in which a case's outcome is deemed to be “accidental,” and therefore inexplicable by any general model.)
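The claim that deviance is model dependent, and that a deviant case should stop being an outlier once the newly discovered factor enters the model, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the eight cases are invented, `x` stands for the factor in the original model, `z` for the factor uncovered by the case study, and `ols_residuals` is a hypothetical helper that solves the normal equations directly rather than a call to any statistics library.

```python
def ols_residuals(rows, ys):
    """Residuals from least squares of ys on the columns of rows (intercept added)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                A[r] = [vr - A[r][col] * vc for vr, vc in zip(A[r], A[col])]
    beta = [A[i][k] for i in range(k)]
    return [y - sum(b * v for b, v in zip(beta, x)) for x, y in zip(X, ys)]

# Hypothetical data for eight cases: original factor (x), new factor (z), outcome (y).
x = [1, 2, 3, 4, 5, 6, 7, 8]
z = [0, 1, 0, 1, 4, 1, 0, 1]
y = [3.1, 2.4, 5.1, 4.4, 1.0, 6.6, 8.9, 8.5]

before = ols_residuals([(xi,) for xi in x], y)  # general model: y on x only
after = ols_residuals(list(zip(x, z)), y)       # adjusted model: y on x and z

# Index of the case that fits the original model worst.
deviant = max(range(len(y)), key=lambda i: abs(before[i]))
```

In this made-up data the fifth case (index 4) has by far the largest residual under the original model and a small one under the adjusted model, which is the pattern a successful deviant-case study hopes to produce.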

This feature of the deviant‐case study should help to resolve questions about its representativeness. Even if it is not possible to measure the new causal factor (and thus to introduce it into a large‐N cross‐case model), it may still be plausible to assert (based on general knowledge of the phenomenon) that the chosen case is representative of a broader population.

5 Influential Case

Sometimes, the choice of a case is motivated solely by the need to verify the assumptions behind a general model of causal relations. Here, the analyst attempts to provide a rationale for disregarding a problematic case or a set of problematic cases. That is to say, she attempts to show why apparent deviations from the norm are not really deviant, or do not challenge the core of the theory, once the circumstances of the special case or cases are fully understood. A cross‐case analysis may, after all, be marred by several classes of problems including measurement error, specification error, errors in establishing proper boundaries for the inference (the scope of the argument), and stochastic error (fluctuations in the phenomenon under study that are treated as random, given available theoretical resources). If poorly fitting cases can be explained away by reference to these kinds of problems, then the theory of interest is that much stronger. This sort of deviant‐case analysis answers the question, “What about Case A (or cases of type A)? How does that, seemingly disconfirming, case fit the model?”

Because its underlying purpose is different from the usual deviant‐case study, I offer a new term for this method. The influential case is a case that casts doubt upon a theory, and for that reason warrants close inspection. This investigation may reveal, after all, that the theory is validated—perhaps in some slightly altered form. In this guise, the influential case is the “case that proves the rule.” In other instances, the influential‐case analysis may contribute to disconfirming, or reconceptualizing, a theory. The key point is that the value of the case is judged relative to some extant cross‐case model.

A simple version of influential‐case analysis involves the confirmation of a key case's score on some critical dimension. This is essentially a question of measurement. Sometimes cases are poorly explained simply because they are poorly understood. A close examination of a particular context may reveal that an apparently falsifying case has been miscoded. If so, the initial challenge presented by that case to some general theory has been obviated.

However, the more usual employment of the influential‐case method culminates in a substantive reinterpretation of the case—perhaps even of the general model. It is not just a question of measurement. Consider Thomas Ertman's (1997) study of state building in Western Europe, as summarized by Gerardo Munck. This study argues

that the interaction of a) the type of local government during the first period of statebuilding, with b) the timing of increases in geopolitical competition, strongly influences the kind of regime and state that emerge. [Ertman] tests this hypothesis against the historical experience of Europe and finds that most countries fit his predictions. Denmark, however, is a major exception. In Denmark, sustained geopolitical competition began relatively late and local government at the beginning of the statebuilding period was generally participatory, which should have led the country to develop “patrimonial constitutionalism.” But in fact, it developed “bureaucratic absolutism.” Ertman carefully explores the process through which Denmark came to have a bureaucratic absolutist state and finds that Denmark had the early marks of a patrimonial constitutionalist state. However, the country was pushed off this developmental path by the influence of German knights, who entered Denmark and brought with them German institutions of local government. Ertman then traces the causal process through which these imported institutions pushed Denmark to develop bureaucratic absolutism, concluding that this development was caused by a factor well outside his explanatory framework. ( Munck 2004 , 118)

Ertman's overall framework is confirmed insofar as he has been able to show, by an in‐depth discussion of Denmark, that the causal processes stipulated by the general theory hold even in this apparently disconfirming case. Denmark is still deviant, but it is so because of “contingent historical circumstances” that are exogenous to the theory ( Ertman 1997 , 316).

Evidently, the influential‐case analysis is similar to the deviant‐case analysis. Both focus on outliers. However, as we shall see, they focus on different kinds of outliers. Moreover, the animating goals of these two research designs are quite different. The influential‐case study begins with the aim of confirming a general model, while the deviant‐case study has the aim of generating a new hypothesis that modifies an existing general model. The confusion stems from the fact that the same case study may fulfill both objectives—qualifying a general model and, at the same time, confirming its core hypothesis.

Thus, in their study of Roberto Michels's “iron law of oligarchy,” Lipset, Trow, and Coleman (1956) choose to focus on an organization—the International Typographical Union—that appears to violate the central presupposition. The ITU, as noted by one of the authors, has “a long‐term two‐party system with free elections and frequent turnover in office” and is thus anything but oligarchic ( Lipset 1959 , 70). As such, it calls into question Michels's grand generalization about organizational behavior. The authors explain this curious result by the extraordinarily high level of education among the members of this union. Michels's law is shown to be true for most organizations, but not all. It is true, with qualifications. Note that the respecification of the original model (in effect, Lipset, Trow, and Coleman introduce a new control variable or boundary condition) involves the exploration of a new hypothesis. In this instance, therefore, the use of an influential case to confirm an existing theory is quite similar to the use of a deviant case to explore a new theory.

In a quantitative idiom, influential cases are those that, if counterfactually assigned a different value on the dependent variable, would most substantially change the resulting estimates. They may or may not be outliers (high‐residual cases). Two quantitative measures of influence are commonly applied in regression diagnostics (Belsley, Kuh, and Welsch 2004). The first, often referred to as the leverage of a case, derives from what is called the hat matrix. Based solely on each case's scores on the independent variables, the hat matrix tells us how much a change in (or a measurement error on) the dependent variable for that case would affect the overall regression line. The second is Cook's distance, a measure of the extent to which the estimates of all the parameters would change if a given case were omitted from the analysis. Cases with a large leverage or Cook's distance contribute quite a lot to the inferences drawn from a cross‐case analysis. In this sense, such cases are vital for maintaining analytic conclusions. Discovering a significant measurement error on the dependent variable or an important omitted variable for such a case may dramatically revise estimates of the overall relationships. Hence, it may be quite sensible to select influential cases for in‐depth study.
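Both diagnostics can be computed by hand for a bivariate regression, which may help to show what they measure. The sketch below is an illustration under stated assumptions: the six cases are invented, the model has a single independent variable (so the hat-matrix diagonal has a simple closed form), and the function name is hypothetical. With several independent variables one would normally rely on a statistics package instead.

```python
from statistics import mean

def influence_measures(xs, ys):
    """Leverage and Cook's distance for each case in y = a + b*x."""
    n, p = len(xs), 2  # p counts the intercept and the slope
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(e ** 2 for e in resid) / (n - p)
    # Diagonal of the hat matrix: depends only on the independent variable.
    leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]
    # Cook's distance combines the size of the residual with the leverage.
    cooks = [e ** 2 / (p * s2) * h / (1 - h) ** 2 for e, h in zip(resid, leverage)]
    return leverage, cooks

# Hypothetical data: case "F" sits far from the others on x.
cases = ["A", "B", "C", "D", "E", "F"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 8.0]

leverage, cooks = influence_measures(x, y)
highest_leverage = cases[leverage.index(max(leverage))]
most_influential = cases[cooks.index(max(cooks))]
```

With these made-up numbers, case F has both the highest leverage and the largest Cook's distance even though the largest residual belongs to case E, which illustrates the remark that influential cases may or may not be outliers.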

Note that the use of an influential‐case strategy of case selection is limited to instances in which a researcher has reason to be concerned that her results are being driven by one or a few cases. This is most likely to be true in small to moderate‐sized samples. Where N is very large—greater than 1,000, let us say—it is extremely unlikely that a small set of cases (much less an individual case) will play an “influential” role. Of course, there may be influential sets of cases, e.g. countries within a particular continent or cultural region, or persons of Irish extraction. Sets of influential observations are often problematic in a time‐series cross‐section data‐set where each unit (e.g. country) contains multiple observations (through time), and hence may have a strong influence on aggregate results. Still, the general rule is: the larger the sample, the less important individual cases are likely to be and, hence, the less likely a researcher is to use an influential‐case approach to case selection.

6 Crucial Case

Of all the extant methods of case selection perhaps the most storied—and certainly the most controversial—is the crucial‐case method, introduced to the social science world several decades ago by Harry Eckstein. In his seminal essay, Eckstein (1975, 118) describes the crucial case as one “that must closely fit a theory if one is to have confidence in the theory's validity, or, conversely, must not fit equally well any rule contrary to that proposed.” A case is crucial in a somewhat weaker—but much more common—sense when it is most, or least, likely to fulfill a theoretical prediction. A “most‐likely” case is one that, on all dimensions except the dimension of theoretical interest, is predicted to achieve a certain outcome, and yet does not. It is therefore used to disconfirm a theory. A “least‐likely” case is one that, on all dimensions except the dimension of theoretical interest, is predicted not to achieve a certain outcome, and yet does so. It is therefore used to confirm a theory. In all formulations, the crucial case offers a most‐difficult test for an argument, and hence provides what is perhaps the strongest sort of evidence possible in a nonexperimental, single‐case setting.

Since the publication of Eckstein's influential essay, the crucial‐case approach has been claimed in a multitude of studies across several social science disciplines and has come to be recognized as a staple of the case‐study method. 11 Yet the idea of any single case playing a crucial (or “critical”) role is not widely accepted among most methodologists (e.g. Sekhon 2004 ). (Even its progenitor seems to have had doubts.)

Let us begin with the confirmatory (a.k.a. least‐likely) crucial case. The implicit logic of this research design may be summarized as follows. Given a set of facts, we are asked to contemplate the probability that a given theory is true. While the facts matter, to be sure, the effectiveness of this sort of research also rests upon the formal properties of the theory in question. Specifically, the degree to which a theory is amenable to confirmation is contingent upon how many predictions can be derived from the theory and on how “risky” each individual prediction is. In Popper's (1963, 36) words, “Confirmations should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory—an event which would have refuted the theory. Every ‘good’ scientific theory is a prohibition; it forbids certain things to happen. The more a theory forbids, the better it is” (see also Popper 1934/1968). A risky prediction is therefore one that is highly precise and determinate, and therefore unlikely to be achieved by the product of other causal factors (external to the theory of interest) or through stochastic processes. A theory produces many such predictions if it is fully elaborated, issuing predictions not only on the central outcome of interest but also on specific causal mechanisms, and if it is broad in purview. (The notion of riskiness may also be conceptualized within the Popperian lexicon as degrees of falsifiability.)

These points can also be articulated in Bayesian terms. Colin Howson and Peter Urbach explain: “The degree to which h [a hypothesis] is confirmed by e [a set of evidence] depends … on the extent to which P(e|h) exceeds P(e), that is, on how much more probable e is relative to the hypothesis and background assumptions than it is relative just to background assumptions.” Again, “confirmation is correlated with how much more probable the evidence is if the hypothesis is true than if it is false” (Howson and Urbach 1989, 86). Thus, the stranger the prediction offered by a theory—relative to what we would normally expect—the greater the degree of confirmation that will be afforded by the evidence. As an intuitive example, Howson and Urbach (1989, 86) offer the following:

If a soothsayer predicts that you will meet a dark stranger sometime and you do in fact, your faith in his powers of precognition would not be much enhanced: you would probably continue to think his predictions were just the result of guesswork. However, if the prediction also gave the correct number of hairs on the head of that stranger, your previous scepticism would no doubt be severely shaken.
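The soothsayer example can be restated numerically with Bayes' rule. The sketch below is illustrative only: the prior and the likelihoods are invented numbers, chosen to contrast a vague prediction with a risky one, and the function name is hypothetical.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(h|e) from the prior P(h) and the two likelihoods."""
    p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / p_e

# Hypothetical numbers, chosen only to illustrate the contrast.
prior = 0.01  # initial belief that the soothsayer has real foresight

# Vague prediction ("you will meet a dark stranger"): likely even by guesswork.
vague = posterior(prior, p_e_given_h=1.0, p_e_given_not_h=0.9)

# Risky prediction (the exact number of hairs): almost impossible by guesswork.
risky = posterior(prior, p_e_given_h=1.0, p_e_given_not_h=1e-5)
```

Under these assumed values the vague prediction barely moves the posterior above the prior, while the risky prediction moves it close to one: the less probable the evidence is without the hypothesis, the more its occurrence confirms the hypothesis.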

While these Popperian/Bayesian notions 12 are relevant to all empirical research designs, they are especially relevant to case‐study research designs, for in these settings a single case (or, at most, a small number of cases) is required to bear a heavy burden of proof. It should be no surprise, therefore, that Popper's idea of “riskiness” was to be appropriated by case‐study researchers like Harry Eckstein to validate the enterprise of single‐case analysis. (Although Eckstein does not cite Popper the intellectual lineage is clear.) Riskiness, here, is analogous to what is usually referred to as a “most‐difficult” research design, which in a case‐study research design would be understood as a “least‐likely” case. Note also that the distinction between a “must‐fit” case and a least‐likely case—that, in the event, actually does fit the terms of a theory—is a matter of degree. Cases are more or less crucial for confirming theories. The point is that, in some circumstances, a paucity of empirical evidence may be compensated by the riskiness of the theory.

The crucial‐case research design is, perforce, a highly deductive enterprise; much depends on the quality of the theory under investigation. It follows that the theories most amenable to crucial‐case analysis are those which are lawlike in their precision, degree of elaboration, consistency, and scope. The more a theory attains the status of a causal law, the easier it will be to confirm, or to disconfirm, with a single case. Indeed, risky predictions are common in natural science fields such as physics, which in turn served as the template for the deductive‐nomological (“covering‐law”) model of science that influenced Eckstein and others in the postwar decades (e.g. Hempel 1942 ).

A frequently cited example is the first important empirical demonstration of the theory of relativity, which took the form of a single‐event prediction on the occasion of the May 29, 1919, solar eclipse ( Eckstein 1975 ; Popper 1963 ). Stephen Van Evera (1997 , 66–7) describes the impact of this prediction on the validation of Einstein's theory.

Einstein's theory predicted that gravity would bend the path of light toward a gravity source by a specific amount. Hence it predicted that during a solar eclipse stars near the sun would appear displaced—stars actually behind the sun would appear next to it, and stars lying next to the sun would appear farther from it—and it predicted the amount of apparent displacement. No other theory made these predictions. The passage of this one single‐case‐study test brought the theory wide acceptance because the tested predictions were unique—there was no plausible competing explanation for the predicted result—hence the passed test was very strong.

The strength of this test is the extraordinary fit between the theory and a set of facts found in a single case, and the corresponding lack of fit between all other theories and this set of facts. Einstein offered an explanation of a particular set of anomalous findings that no other existing theory could make sense of. Of course, one must assume that there was no—or limited—measurement error. And one must assume that the phenomenon of interest is largely invariant; light does not bend differently at different times and places (except in ways that can be understood through the theory of relativity). And one must assume, finally, that the theory itself makes sense on other grounds (other than the case of special interest); it is a plausible general theory. If one is willing to accept these a priori assumptions, then the 1919 “case study” provides a very strong confirmation of the theory. It is difficult to imagine a stronger proof of the theory from within an observational (nonexperimental) setting.

In social science settings, by contrast, one does not commonly find single‐case studies offering knockout evidence for a theory. This is, in my view, largely a product of the looseness (the underspecification) of most social science theories. George and Bennett point out that while the thesis of the democratic peace is as close to a “law” as social science has yet seen, it cannot be confirmed (or refuted) by looking at specific causal mechanisms because the causal pathways mandated by the theory are multiple and diverse. Under the circumstances, no single‐case test can offer strong confirmation of the theory ( George and Bennett 2005 , 209).

However, if one adopts a softer version of the crucial‐case method, the least‐likely (most difficult) case, then possibilities abound. Indeed, I suspect that, implicitly, most case‐study work that makes a positive argument focusing on a single case (without a corresponding cross‐case analysis) relies largely on the logic of the least‐likely case. Rarely is this logic made explicit, except perhaps in a passing phrase or two. Yet the deductive logic of the “risky” prediction is central to the case‐study enterprise. Whether a case study is convincing or not often rests on the reader's evaluation of how strong the evidence for an argument might be, and this in turn (wherever cross‐case evidence is limited and no manipulated treatment can be devised) rests upon an estimation of the degree of “fit” between a theory and the evidence at hand, as discussed.

Lily Tsai's (2007) investigation of governance at the village level in China employs several in‐depth case studies of villages which are chosen (in part) because of their least‐likely status relative to the theory of interest. Tsai's hypothesis is that villages with greater social solidarity (based on preexisting religious or familial networks) will develop a higher level of social trust and mutual obligation and, as a result, will experience better governance. Crucial cases, therefore, are villages that evidence a high level of social solidarity but which, along other dimensions, would be judged least likely to develop good governance, e.g. they are poor, isolated, and lack democratic institutions or accountability mechanisms from above. “Li Settlement,” in Fujian province, is such a case. The fact that this impoverished village nonetheless boasts an impressive set of infrastructural accomplishments such as paved roads with drainage ditches (a rarity in rural China) suggests that something rather unusual is going on here. Because her case is carefully chosen to eliminate rival explanations, Tsai's conclusions about the special role of social solidarity are difficult to gainsay. How else is one to explain this otherwise anomalous result? This is the strength of the least‐likely case, where all other plausible causal factors for an outcome have been minimized. 13

Jack Levy (2002 , 144) refers to this, evocatively, as a “Sinatra inference:” if it can make it here, it can make it anywhere (see also Khong 1992 , 49; Sagan 1995 , 49; Shafer 1988 , 14–6). Thus, if social solidarity has the hypothesized effect in Li Settlement it should have the same effect in more propitious settings (e.g. where there is greater economic surplus). The same implicit logic informs many case‐study analyses where the intent of the study is to confirm a hypothesis on the basis of a single case.

Another sort of crucial case is employed for the purpose of disconfirming a causal hypothesis. A central Popperian insight is that it is easier to disconfirm an inference than to confirm that same inference. (Indeed, Popper doubted that any inference could be fully confirmed, and for this reason preferred the term “corroborate.”) This is particularly true of case‐study research designs, where evidence is limited to one or several cases. The key proviso is that the theory under investigation must take a consistent (a.k.a. invariant, deterministic) form, even if its predictions are not terrifically precise, well elaborated, or broad.

As it happens, there are a fair number of invariant propositions floating around the social science disciplines (Goertz and Levy forthcoming; Goertz and Starr 2003 ). It used to be argued, for example, that political stability would occur only in countries that are relatively homogeneous, or where existing heterogeneities are mitigated by cross‐cutting cleavages ( Almond 1956 ; Bentley 1908/1967 ; Lipset 1960/1963 ; Truman 1951 ). Arend Lijphart's (1968) study of the Netherlands, a peaceful country with reinforcing social cleavages, is commonly viewed as refuting this theory on the basis of a single in‐depth case analysis. 14

Granted, it may be questioned whether presumed invariant theories are really invariant; perhaps they are better understood as probabilistic. Perhaps, that is, the theory of cross‐cutting cleavages is still true, probabilistically, despite the apparent Dutch exception. Or perhaps the theory is still true, deterministically, within a subset of cases that does not include the Netherlands. (This sort of claim seems unlikely in this particular instance, but it is quite plausible in many others.) Or perhaps the theory is in need of reframing; it is true, deterministically, but applies only to cross‐cutting ethnic/racial cleavages, not to cleavages that are primarily religious. One can quibble over what it means to “disconfirm” a theory. The point is that the crucial case has, in all these circumstances, provided important updating of a theoretical prior.

Heretofore, I have treated causal factors as dichotomous. Countries have either reinforcing or cross‐cutting cleavages and they have regimes that are either peaceful or conflictual. Evidently, these sorts of parameters are often matters of degree. In this reading of the theory, cases are more or less crucial. Accordingly, the most useful (i.e. most crucial) case for Lijphart's purpose is one that has the most segregated social groups and the most peaceful and democratic track record. In these respects, the Netherlands was a very good choice. Indeed, the degree of disconfirmation offered by this case study is probably greater than the degree of disconfirmation that might have been provided by other cases such as India or Papua New Guinea, countries where social peace has not always been secure. The point is that where variables are continuous rather than dichotomous it is possible to evaluate potential cases in terms of their degree of crucialness.

Note that the crucial‐case method of case‐selection, whether employed in a confirmatory or disconfirmatory mode, cannot be employed in a large‐N context. This is because an explicit cross‐case model would render the crucial‐case study redundant. Once one identifies the relevant parameters and the scores of all cases on those parameters, one has in effect constructed a cross‐case model that confirms or disconfirms the theory in question. The case study is thenceforth irrelevant, at least as a means of decisive confirmation or disconfirmation. 15 It remains highly relevant as a means of exploring causal mechanisms, of course. Yet, because this objective is quite different from that which is usually associated with the term, I enlist a new term for this technique.

7 Pathway Case

One of the most important functions of case‐study research is the elucidation of causal mechanisms. But which sort of case is most useful for this purpose? Although all case studies presumably shed light on causal mechanisms, not all cases are equally transparent. In situations where a causal hypothesis is clear and has already been confirmed by cross‐case analysis, researchers are well advised to focus on a case where the causal effect of X1 on Y can be isolated from other potentially confounding factors (X2). I shall call this a pathway case to indicate its uniquely penetrating insight into causal mechanisms. In contrast to the crucial case, this sort of method is practicable only in circumstances where cross‐case covariational patterns are well studied and where the mechanism linking X1 and Y remains dim. Because the pathway case builds on prior cross‐case analysis, the problem of case selection must be situated within that sample. There is no standalone pathway case.

The logic of the pathway case is clearest in situations of causal sufficiency, where a causal factor of interest, X1, is sufficient by itself (though perhaps not necessary) to account for Y's value (0 or 1). The other causes of Y, about which we need make no assumptions, are designated as a vector, X2.

Note that wherever various causal factors are substitutable for one another, each factor is conceptualized (individually) as sufficient ( Braumoeller 2003 ). Thus, situations of causal equifinality presume causal sufficiency on the part of each factor or set of conjoint factors. An example is provided by the literature on democratization, which stipulates three main avenues of regime change: leadership‐initiated reform, a controlled opening to opposition, or the collapse of an authoritarian regime ( Colomer 1991 ). The case‐study format constrains us to analyze one at a time, so let us limit our scope to the first one—leadership‐initiated reform. So considered, a causal‐pathway case would be one with the following features: (a) democratization, (b) leadership‐initiated reform, (c) no controlled opening to the opposition, (d) no collapse of the previous authoritarian regime, and (e) no other extraneous factors that might affect the process of democratization. In a case of this type, the causal mechanisms by which leadership‐initiated reform may lead to democratization will be easiest to study. Note that it is not necessary to assume that leadership‐initiated reform always leads to democratization; it may or may not be a deterministic cause. But it is necessary to assume that leadership‐initiated reform can sometimes lead to democratization on its own (given certain background features).

Now let us move from these examples to a general‐purpose model. For heuristic purposes, let us presume that all variables in that model are dichotomous (coded as 0 or 1) and that the model is complete (all causes of Y are included). All causal relationships will be coded so as to be positive: X1 and Y covary, as do X2 and Y. This allows us to visualize a range of possible combinations at a glance.

Recall that the pathway case is always focused, by definition, on a single causal factor, denoted X1. (The researcher's focus may shift to other causal factors, but may only focus on one causal factor at a time.) In this scenario, and regardless of how many additional causes of Y there might be (denoted X2, a vector of controls), there are only eight relevant case types, as illustrated in Table 28.2. Identifying these case types is a relatively simple matter, and can be accomplished in a small‐N sample by the construction of a truth‐table (modeled after Table 28.2) or in a large‐N sample by the use of cross‐tabs.

Notes: X1 = the variable of theoretical interest. X2 = a vector of controls (a score of 0 indicates that all control variables have a score of 0, while a score of 1 indicates that all control variables have a score of 1). Y = the outcome of interest. A–H = case types (the N for each case type is indeterminate). G, H = possible pathway cases. Sample size = indeterminate.

Assumptions: (a) all variables can be coded dichotomously (a binary coding of the concept is valid); (b) all independent variables are positively correlated with Y in the general case; (c) X1 is (at least sometimes) a sufficient cause of Y.

Note that the total number of combinations of values depends on the number of control variables, which we have represented with a single vector, X2. If this vector consists of a single variable then there are only eight case types. If this vector consists of two variables (X2a, X2b) then the total number of possible combinations increases from eight (2^3) to sixteen (2^4). And so forth. However, none of these combinations is relevant for present purposes except those where X2a and X2b have the same value (0 or 1). “Mixed” cases are not causal pathway cases, for reasons that should become clear.

The pathway case, following the logic of the crucial case, is one where the causal factor of interest, X1, correctly predicts Y while all other possible causes of Y (represented by the vector, X2) make “wrong” predictions. If X1 is (at least in some circumstances) a sufficient cause of Y, then it is these sorts of cases that should be most useful for tracing causal mechanisms. There are only two such cases in Table 28.2: G and H. In all other cases, the mechanism running from X1 to Y would be difficult to discern, either because X1 and Y are not correlated in the usual way (constituting an unusual case, in the terms of our hypothesis) or because other confounding factors (X2) intrude. In case A, for example, the positive value on Y could be a product of X1 or X2. An in‐depth examination of this case is not likely to be very revealing.
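The truth‐table logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's procedure: the function names, the three labels, and the sample of cases are hypothetical, and the letters A–H are not reproduced because the layout of Table 28.2 is not shown here. A factor is treated as "predicting" Y correctly when its score equals Y, since all relationships are coded as positive.

```python
from itertools import product


def classify(x1, x2, y):
    """Label one combination of dichotomous scores (labels are illustrative)."""
    if x1 == y and x2 != y:
        return "pathway"          # only X1 predicts Y correctly
    if x1 == y and x2 == y:
        return "overdetermined"   # X1 and X2 both predict Y; effects confounded
    return "x1 mispredicts"       # X1 and Y do not covary as hypothesized


# Enumerate all 2^3 = 8 combinations of (X1, X2, Y).
truth_table = [(x1, x2, y, classify(x1, x2, y))
               for x1, x2, y in product((0, 1), repeat=3)]
pathway_rows = [(x1, x2, y) for x1, x2, y, label in truth_table
                if label == "pathway"]


def pathway_cases(cases):
    """Return names of pathway cases in a sample.

    cases maps a case name to (x1, controls, y), where controls is a tuple
    of 0/1 scores. "Mixed" cases (controls that disagree) are skipped.
    """
    found = []
    for name, (x1, controls, y) in cases.items():
        if len(set(controls)) != 1:
            continue
        if classify(x1, controls[0], y) == "pathway":
            found.append(name)
    return found


# Hypothetical sample with two control variables per case.
sample = {
    "Case 1": (1, (0, 0), 1),   # X1 present, controls absent, Y present
    "Case 2": (1, (1, 1), 1),   # overdetermined
    "Case 3": (1, (0, 1), 1),   # mixed controls
    "Case 4": (0, (1, 1), 0),   # X1 absent, controls present, Y absent
}

print(pathway_rows)            # [(0, 1, 0), (1, 0, 1)]
print(pathway_cases(sample))   # ['Case 1', 'Case 4']
```

Only two of the eight combinations qualify, matching the two pathway case types identified in the text.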

Keep in mind that because the researcher already knows from her cross‐case examination what the general causal relationships are, she knows (prior to the case‐study investigation) what constitutes a correct or incorrect prediction. In the crucial‐case method, by contrast, these expectations are deductive rather than empirical. This is what differentiates the two methods. And this is why the causal pathway case is useful principally for elucidating causal mechanisms rather than verifying or falsifying general propositions (which are already more or less apparent from the cross‐case evidence). Of course, we must leave open the possibility that the investigation of causal mechanisms would invalidate a general claim, if that claim is utterly contingent upon a specific set of causal mechanisms and the case study shows that no such mechanisms are present. However, this is rather unlikely in most social science settings. Usually, the result of such a finding will be a reformulation of the causal processes by which X1 causes Y or, alternatively, a realization that the case under investigation is aberrant (atypical of the general population of cases).

Sometimes, the research question is framed as a unidirectional cause: one is interested in why 0 becomes 1 (or vice versa) but not in why 1 becomes 0. In our previous example, we asked why democracies fail, not why countries become democratic or authoritarian. So framed, there can be only one type of causal‐pathway case. (Whether regime failure is coded as 0 or 1 is a matter of taste.) Where researchers are interested in bidirectional causality (a movement from 0 to 1 as well as from 1 to 0), there are two possible causal‐pathway cases, G and H. In practice, however, one of these case types is almost always more useful than the other. Thus, it seems reasonable to employ the term “pathway case” in the singular. In order to determine which of these two case types will be more useful for intensive analysis the researcher should look to see whether each case type exhibits desirable features such as: (a) a rare (unusual) value on X1 or Y (designated “extreme” in our previous discussion), (b) observable temporal variation in X1, (c) an X1/Y relationship that is easier to study (it has more visible features; it is more transparent), or (d) a lower residual (thus indicating a more typical case, within the terms of the general model). Usually, the choice between G and H is intuitively obvious.

Now, let us consider a scenario in which all (or most) variables of concern to the model are continuous, rather than dichotomous. Here, the job of case selection is considerably more complex, for causal “sufficiency” (in the usual sense) cannot be invoked. It is no longer plausible to assume that a given cause can be entirely partitioned, i.e. rival factors eliminated. However, the search for a pathway case may still be viable. What we are looking for in this scenario is a case that satisfies two criteria: (1) it is not an outlier (or at least not an extreme outlier) in the general model and (2) its score on the outcome (Y) is strongly influenced by the theoretical variable of interest (X1), taking all other factors into account (X2). In this sort of case it should be easiest to “see” the causal mechanisms that lie between X1 and Y.

Achieving the second desideratum requires a bit of manipulation. In order to determine which (nonoutlier) cases are most strongly affected by X1, given all the other parameters in the model, one must compare the size of the residuals for each case in a reduced‐form model, Y = Constant + X2 + Res_reduced, with the size of the residuals for each case in a full model, Y = Constant + X2 + X1 + Res_full. The pathway case is that case, or set of cases, which shows the greatest difference between the residual for the reduced‐form model and the full model (ΔResidual). Thus,

ΔResidual = |Res_reduced − Res_full|, if |Res_reduced| > |Res_full|.

Note that the residual for a case must be smaller in the full model than in the reduced‐form model; otherwise, the addition of the variable of interest (X1) pulls the case away from the regression line. We want to find a case where the addition of X1 pushes the case towards the regression line, i.e. it helps to “explain” that case.
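The residual comparison described above can be sketched with ordinary least squares in NumPy. This is a minimal illustration under stated assumptions: ΔResidual is taken to be the absolute difference between a case's reduced‐model and full‐model residuals, counted only when the full‐model residual is the smaller of the two in absolute size; the function name `pathway_scores` and the toy data are invented for the example.

```python
import numpy as np


def pathway_scores(x1, x2, y):
    """Return (delta_residual, res_full) for every case.

    x1: scores on the variable of interest, shape (n,)
    x2: scores on the controls, shape (n,) or (n, k)
    y:  scores on the outcome, shape (n,)
    """
    x1 = np.asarray(x1, dtype=float)
    y = np.asarray(y, dtype=float)
    x2 = np.asarray(x2, dtype=float).reshape(len(y), -1)
    const = np.ones((len(y), 1))

    reduced = np.hstack([const, x2])           # Y = Constant + X2
    full = np.hstack([reduced, x1[:, None]])   # Y = Constant + X2 + X1

    res_reduced = y - reduced @ np.linalg.lstsq(reduced, y, rcond=None)[0]
    res_full = y - full @ np.linalg.lstsq(full, y, rcond=None)[0]

    delta = np.abs(res_reduced - res_full)
    # Disqualify cases that X1 pulls away from the regression line.
    delta[np.abs(res_reduced) <= np.abs(res_full)] = np.nan
    return delta, res_full


# Invented toy data: Y equals X2 except in case 3, where X1 adds 5.
x2 = [0, 1, 2, 3, 4, 5]
x1 = [0, 0, 0, 1, 0, 0]
y = [0, 1, 2, 8, 4, 5]

delta, res_full = pathway_scores(x1, x2, y)
best = int(np.nanargmax(delta))   # index of the candidate pathway case
print(best, round(float(delta[best]), 3))
```

In a real application one would also screen candidates on the size of `res_full`, since the first criterion requires that the chosen case not be an extreme outlier in the full model.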

As an example, let us suppose that we are interested in exploring the effect of mineral wealth on the prospects for democracy in a society. According to a good deal of work on this subject, countries with a bounty of natural resources, particularly oil, are less likely to democratize (or once having undergone a democratic transition, are more likely to revert to authoritarian rule) (Barro 1999; Humphreys 2005; Ross 2001). The cross‐country evidence is robust. Yet as is often the case, the causal mechanisms remain rather obscure. In order to better understand this phenomenon it may be worthwhile to exploit the findings of cross‐country regression models in order to identify a country whose regime type (i.e. its democracy “score” on some general index) is strongly affected by its natural‐resource wealth, all other things held constant. An analysis of this sort identifies two countries, the United Arab Emirates and Kuwait, with high ΔResidual values and modest residuals in the full model (signifying that these cases are not outliers). Researchers seeking to explore the effect of oil wealth on regime type might do well to focus on these two cases since their patterns of democracy cannot be well explained by other factors, e.g. economic development, religion, European influence, or ethnic fractionalization. The presence of oil wealth in these countries would appear to have a strong independent effect on the prospects for democratization in these cases, an effect that is well modeled by general theory and by the available cross‐case evidence.

To reiterate, the logic of causal “elimination” is much more compelling where variables are dichotomous and where causal sufficiency can be assumed (X1 is sufficient by itself, at least in some circumstances, to cause Y). Where variables are continuous, the strategy of the pathway case is more dubious, for potentially confounding causal factors (X2) cannot be neatly partitioned. Even so, we have indicated why the selection of a pathway case may be a logical approach to case‐study analysis in many circumstances.

The exceptions may be briefly noted. Sometimes, where all variables in a model are dichotomous, there are no pathway cases, i.e. no cases of type G or H (in Table 28.2). This is known as the “empty cell” problem, or a problem of severe causal multicollinearity. The universe of observational data does not always oblige us with cases that allow us to independently test a given hypothesis. Where variables are continuous, the analogous problem is that of a causal variable of interest (X1) that has only minimal effects on the outcome of interest. That is, its role in the general model is quite minor. In these situations, the only cases that are strongly affected by X1, if there are any at all, may be extreme outliers, and these sorts of cases are not properly regarded as providing confirmatory evidence for a proposition, for reasons that are abundantly clear by now.

Finally, it should be clarified that the identification of a causal pathway case does not obviate the utility of exploring other cases. One might, for example, want to compare both sorts of potential pathway cases—G and H—with each other. Many other combinations suggest themselves. However, this sort of multi‐case investigation moves beyond the logic of the causal‐pathway case.

8 Most‐similar Cases

The most‐similar method employs a minimum of two cases. 16 In its purest form, the chosen pair of cases is similar in all respects except the variable(s) of interest. If the study is exploratory (i.e. hypothesis generating), the researcher looks for cases that differ on the outcome of theoretical interest but are similar on various factors that might have contributed to that outcome, as illustrated in Table 28.3 (A). This is a common form of case selection at the initial stage of research. Often, fruitful analysis begins with an apparent anomaly: two cases are apparently quite similar, and yet demonstrate surprisingly different outcomes. The hope is that intensive study of these cases will reveal one, or at most several, factors that differ across these cases. These differing factors (X1) are looked upon as putative causes. At this stage, the research may be described by the second diagram in Table 28.3 (B). Sometimes, a researcher begins with a strong hypothesis, in which case her research design is confirmatory (hypothesis testing) from the get‐go. That is, she strives to identify cases that exhibit different outcomes, different scores on the factor of interest, and similar scores on all other possible causal factors, as illustrated in the second (hypothesis‐testing) diagram in Table 28.3 (B).

The point is that the purpose of a most‐similar research design, and hence its basic setup, often changes as a researcher moves from an exploratory to a confirmatory mode of analysis. However, regardless of where one begins, the results, when published, look like a hypothesis‐testing research design. Question marks have been removed: (A) becomes (B) in Table 28.3 .

As an example, let us consider Leon Epstein's classic study of party cohesion, which focuses on two “most‐similar” countries, the United States and Canada. Canada has highly disciplined parties whose members vote together on the floor of the House of Commons while the United States has weak, undisciplined parties, whose members often defect on floor votes in Congress. In explaining these divergent outcomes, persistent over many years, Epstein first discusses possible causal factors that are held more or less constant across the two cases. Both the United States and Canada inherited English political cultures, both have large territories and heterogeneous populations, both are federal, and both have fairly loose party structures with strong regional bases and a weak center. These are the “control” variables. Where they differ is in one constitutional feature: Canada is parliamentary while the United States is presidential. And it is this institutional difference that Epstein identifies as the crucial (differentiating) cause. (For further examples of the most‐similar method see Brenner 1976 ; Hamilton 1977 ; Lipset 1968 ; Miguel 2004 ; Moulder 1977 ; Posner 2004 .)

X1 = the variable of theoretical interest. X2 = a vector of controls. Y = the outcome of interest.

Several caveats apply to any most‐similar analysis (in addition to the usual set of assumptions applying to all case‐study analysis). First, each causal factor is understood as having an independent and additive effect on the outcome; there are no “interaction” effects. Second, one must code cases dichotomously (high/low, present/absent). This is straightforward if the underlying variables are also dichotomous (e.g. federal/unitary). However, it is often the case that variables of concern in the model are continuous (e.g. party cohesion). In this setting, the researcher must “dichotomize” the scoring of cases so as to simplify the two‐case analysis. (Some flexibility is admissible on the vector of controls (X2) that are “held constant” across the cases. Nonidentity is tolerable if the deviation runs counter to the predicted hypothesis. For example, Epstein describes both the United States and Canada as having strong regional bases of power, a factor that is probably more significant in recent Canadian history than in recent American history. However, because regional bases of power should lead to weaker parties, rather than stronger parties, this element of nonidentity does not challenge Epstein's conclusions. Indeed, it sets up a most‐difficult research scenario, as discussed above.)

In one respect the requirements for case control are not so stringent. Specifically, it is not usually necessary to measure control variables (at least not with a high degree of precision) in order to control for them. If two countries can be assumed to have similar cultural heritages one needn't worry about constructing variables to measure that heritage. One can simply assert that, whatever they are, they are more or less constant across the two cases. This is similar to the technique employed in a randomized experiment, where the researcher typically does not attempt to measure all the factors that might affect the causal relationship of interest. She assumes, rather, that these unknown factors have been neutralized across the treatment and control groups by randomization or by the choice of a sample that is internally homogeneous.

The most useful statistical tool for identifying cases for in‐depth analysis in a most‐similar setting is probably some variety of matching strategy, e.g. exact matching, approximate matching, or propensity‐score matching. 17 The product of this procedure is a set of matched cases that can be compared in whatever way the researcher deems appropriate. These are the “most‐similar” cases. Rosenbaum and Silber (2001, 223) summarize:

Unlike model‐based adjustments, where [individuals] vanish and are replaced by the coefficients of a model, in matching, ostensibly comparable patterns are compared directly, one by one. Modern matching methods involve statistical modeling and combinatorial algorithms, but the end result is a collection of pairs or sets of people who look comparable, at least on average. In matching, people retain their integrity as people, so they can be examined and their stories can be told individually.

Matching, conclude the authors, “facilitates, rather than inhibits, thick description” ( Rosenbaum and Silber 2001 , 223).

In principle, the same matching techniques that have been used successfully in observational studies of medical treatments might also be adapted to the study of nation states, political parties, cities, or indeed any traditional paired cases in the social sciences. Indeed, the current popularity of matching among statisticians—relative, that is, to garden‐variety regression models—rests upon what qualitative researchers would recognize as a “case‐based” approach to causal analysis. If Rosenbaum and Silber are correct, it may be perfectly reasonable to appropriate this large‐ N method of analysis for case‐study purposes.
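As a rough illustration of exact matching applied to case selection, the sketch below searches a small sample for most‐similar pairs: cases with identical scores on every control that differ on both the factor of interest and the outcome. The function name and data layout are hypothetical. The dichotomous coding of Canada and the United States follows the description of Epstein's study given earlier (shared English political culture, large territory, heterogeneous population, federalism, and loose party structures; parliamentary versus presidential; cohesive versus undisciplined parties), while "Country C" is an invented filler case.

```python
from itertools import combinations


def most_similar_pairs(cases):
    """Return pairs matched exactly on controls but differing on X1 and Y.

    cases maps a name to {"x1": 0/1, "controls": tuple of 0/1, "y": 0/1}.
    """
    pairs = []
    for (a, ca), (b, cb) in combinations(cases.items(), 2):
        if ca["controls"] != cb["controls"]:
            continue   # not held constant on the vector of controls
        if ca["x1"] == cb["x1"] or ca["y"] == cb["y"]:
            continue   # no variation on the cause or on the outcome
        pairs.append((a, b))
    return pairs


# Controls: (English political culture, large territory, heterogeneous
# population, federal, loose party structure). x1 = parliamentary system,
# y = cohesive parties.
cases = {
    "Canada": {"x1": 1, "controls": (1, 1, 1, 1, 1), "y": 1},
    "United States": {"x1": 0, "controls": (1, 1, 1, 1, 1), "y": 0},
    "Country C": {"x1": 1, "controls": (1, 0, 1, 0, 1), "y": 1},  # invented
}

print(most_similar_pairs(cases))   # [('Canada', 'United States')]
```

Approximate or propensity‐score matching would relax the exact equality test on the controls; that extension is not shown here.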

As with other methods of case selection, the most‐similar method is prone to problems of nonrepresentativeness. If employed in a qualitative fashion (without a systematic cross‐case selection strategy), potential biases in the chosen case must be addressed in a speculative way. If the researcher employs a matching technique of case selection within a large‐N sample, the problem of potential bias can be addressed by assuring the choice of cases that are not extreme outliers, as judged by their residuals in the full model. Most‐similar cases should also be “typical” cases, though some scope for deviance around the regression line may be acceptable for purposes of finding a good fit among cases.

X1 = the variable of theoretical interest. X2a–d = a vector of controls. Y = the outcome of interest.

9 Most‐different Cases

A final case‐selection method is the reverse image of the previous method. Here, variation on independent variables is prized, while variation on the outcome is eschewed. Rather than looking for cases that are most‐similar, one looks for cases that are most‐different. Specifically, the researcher tries to identify cases where just one independent variable (X1), as well as the dependent variable (Y), covary, while all other plausible factors (X2a–d) show different values. 18

The simplest form of this two‐case comparison is illustrated in Table 28.4. Cases A and B are deemed “most different,” though they are similar in two essential respects: the causal variable of interest and the outcome.
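The selection rule for a two‐case most‐different comparison can be written as a simple filter: keep pairs that share a score on the causal variable of interest and on the outcome but differ on every control. The sketch below is illustrative only; the function name and the three cases are hypothetical, and the four controls stand in for X2a–d.

```python
from itertools import combinations


def most_different_pairs(cases):
    """Return pairs that agree on X1 and Y but differ on every control.

    cases maps a name to {"x1": 0/1, "controls": tuple of 0/1, "y": 0/1}.
    """
    pairs = []
    for (a, ca), (b, cb) in combinations(cases.items(), 2):
        same_cause = ca["x1"] == cb["x1"]
        same_outcome = ca["y"] == cb["y"]
        controls_all_differ = all(
            u != v for u, v in zip(ca["controls"], cb["controls"])
        )
        if same_cause and same_outcome and controls_all_differ:
            pairs.append((a, b))
    return pairs


# Hypothetical cases scored on X1, four controls (X2a-d), and Y.
cases = {
    "Case A": {"x1": 1, "controls": (1, 0, 1, 0), "y": 1},
    "Case B": {"x1": 1, "controls": (0, 1, 0, 1), "y": 1},
    "Case C": {"x1": 1, "controls": (1, 1, 0, 1), "y": 1},
}

print(most_different_pairs(cases))   # [('Case A', 'Case B')]
```

In practice the requirement that every control differ is often relaxed to "most" controls, which would mean replacing `all(...)` with a threshold on the count of differing scores.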

As an example, I follow Marc Howard's (2003) recent work, which explores the enduring impact of Communism on civil society. 19 Cross‐national surveys show a strong correlation between former Communist regimes and low social capital, controlling for a variety of possible confounders. It is a strong result. Howard wonders why this relationship is so strong and why it persists, and perhaps even strengthens, in countries that are no longer socialist or authoritarian. In order to answer this question, he focuses on two most‐different cases, Russia and East Germany. These two countries were quite different, in all ways other than their Communist experience, prior to the Soviet era, during the Soviet era (since East Germany received substantial subsidies from West Germany), and in the post‐Soviet era, as East Germany was absorbed into West Germany. Yet, they both score near the bottom of various cross‐national indices intended to measure the prevalence of civic engagement in the current era. Thus, Howard's (2003, 6–9) case selection procedure meets the requirements of the most‐different research design: Variance is found on all (or most) dimensions aside from the key factor of interest (Communism) and the outcome (civic engagement).

What leverage is brought to the analysis from this approach? Howard's case studies combine evidence drawn from mass surveys and from in‐depth interviews of small, stratified samples of Russians and East Germans. (This is a good illustration, incidentally, of how quantitative and qualitative evidence can be fruitfully combined in the intensive study of several cases.) The product of this analysis is the identification of three causal pathways that, Howard (2003, 122) claims, help to explain the laggard status of civil society in post‐Communist polities: “the mistrust of communist organizations, the persistence of friendship networks, and the disappointment with post‐communism.” Simply put, Howard (2003, 145) concludes, “a great number of citizens in Russia and Eastern Germany feel a strong and lingering sense of distrust of any kind of public organization, a general satisfaction with their own personal networks (accompanied by a sense of deteriorating relations within society overall), and disappointment in the developments of post‐communism.”

The strength of this most‐different case analysis is that the results obtained in East Germany and Russia should also apply in other post‐Communist polities (e.g. Lithuania, Poland, Bulgaria, Albania). By choosing a heterogeneous sample, Howard solves the problem of representativeness in his restricted sample. However, this sample is demonstrably not representative across the population of the inference, which is intended to cover all countries of the world.

More problematic is the lack of variation on key causal factors of interest—Communism and its putative causal pathways. For this reason, it is difficult to reach conclusions about the causal status of these factors on the basis of the most‐different analysis alone. It is possible, that is, that the three causal pathways identified by Howard also operate within polities that never experienced Communist rule.

Nor does it seem possible to conclusively eliminate rival hypotheses on the basis of this most‐different analysis. Indeed, this is not Howard's intention. He wishes merely to show that whatever influence on civil society might be attributed to economic, cultural, and other factors does not exhaust this subject.

My considered judgment is that the most‐different research design provides minimal leverage into the problem of why Communist systems appear to suppress civic engagement, years after their disappearance. Fortunately, this is not the only research design employed by Howard in his admirable study. Indeed, the author employs two other small‐N cross‐case methods, as well as a large‐N cross‐country statistical analysis. These methods do most of the analytic work. East Germany may be regarded as a causal pathway case (see above). It has all the attributes normally assumed to foster civic engagement (e.g. a growing economy, multiparty competition, civil liberties, a free press, close association with Western European culture and politics), but nonetheless shows little or no improvement on this dimension during the post‐transition era (Howard 2003, 8). It is plausible to attribute this lack of change to its Communist past, as Howard does, in which case East Germany should be a fruitful case for the investigation of causal mechanisms. The contrast between East and West Germany provides a most‐similar analysis since the two polities share virtually everything except a Communist past. This variation is also deftly exploited by Howard.

I do not wish to dismiss the most‐different research method entirely. Surely, Howard's findings are stronger with the intensive analysis of Russia than they would be without. Yet his book would not stand securely on the empirical foundation provided by most‐different analysis alone. If one strips away the pathway‐case (East Germany) and the most‐similar analysis (East/West Germany) there is little left upon which to base an analysis of causal relations (aside from the large‐N cross‐national analysis). Indeed, most scholars who employ the most‐different method do so in conjunction with other methods. 20 It is rarely, if ever, a standalone method. 21

Generalizing from this discussion of Marc Howard's work, I offer the following summary remarks on the most‐different method of case analysis. (I leave aside issues faced by all case‐study analyses, issues that are explored in Gerring 2007.)

Let us begin with a methodological obstacle that is faced by both Millean styles of analysis—the necessity of dichotomizing every variable in the analysis. Recall that, as with most‐similar analysis, differences across cases must generally be sizeable enough to be interpretable in an essentially dichotomous fashion (e.g. high/low, present/absent) and similarities must be close enough to be understood as essentially identical (e.g. high/high, present/present). Otherwise the results of a Millean style analysis are not interpretable. The problem of “degrees” is deadly if the variables under consideration are, by nature, continuous (e.g. GDP). This is a particular concern in Howard's analysis, where East Germany scores somewhat higher than Russia in civic engagement; they are both low, but Russia is quite a bit lower. Howard assumes that this divergence is minimal enough to be understood as a difference of degrees rather than of kinds, a judgment that might be questioned. In these respects, most‐different analysis is no more secure—but also no less—than most‐similar analysis.

In one respect, most‐different analysis is superior to most‐similar analysis. If the coding assumptions are sound, the most‐different research design may be quite useful for eliminating necessary causes. Causal factors that do not appear across the chosen cases—e.g. X2a–d in Table 28.4—are evidently unnecessary for the production of Y. However, it does not follow that the most‐different method is the best method for eliminating necessary causes. Note that the defining feature of this method is the shared element across cases—X1 in Table 28.4. This feature does not help one to eliminate necessary causes. Indeed, if one were focused solely on eliminating necessary causes one would presumably seek out cases that register the same outcomes and have maximum diversity on other attributes. In Table 28.4, this would be a set of cases that satisfy conditions X2a–d, but not X1. Thus, even the presumed strength of the most‐different analysis is not so strong.
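The elimination logic in this paragraph can be made concrete with a short sketch. It assumes dichotomous codings and treats a factor as unnecessary for Y whenever Y occurs in at least one case where the factor is absent. The codings, the variable names, and the function unnecessary_factors are hypothetical illustrations, not values from Table 28.4.

```python
# Hypothetical dichotomous codings (1 = present, 0 = absent) for two cases
# that share the outcome y. Names and values are illustrative assumptions.
SAME_OUTCOME_CASES = {
    "A": {"x1": 1, "x2a": 1, "x2b": 0, "x2c": 1, "x2d": 0, "y": 1},
    "B": {"x1": 1, "x2a": 0, "x2b": 1, "x2c": 0, "x2d": 1, "y": 1},
}
FACTORS = ["x1", "x2a", "x2b", "x2c", "x2d"]


def unnecessary_factors(cases, factors, outcome="y"):
    """Return factors that are absent in at least one case where the
    outcome occurs; such factors cannot be necessary for the outcome."""
    positive_cases = [case for case in cases.values() if case[outcome] == 1]
    return [f for f in factors if any(case[f] == 0 for case in positive_cases)]


print(unnecessary_factors(SAME_OUTCOME_CASES, FACTORS))
# prints ['x2a', 'x2b', 'x2c', 'x2d']
```

Here x1 is not eliminated, but that only leaves it standing as a candidate. As the paragraph notes, a factor shared across every chosen case cannot be tested for necessity by those cases.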

Usually, case‐study analysis is focused on the identification (or clarification) of causal relations, not the elimination of possible causes. In this setting, the most‐different technique is useful, but only if assumptions of causal uniqueness hold. By “causal uniqueness,” I mean a situation in which a given outcome is the product of only one cause: Y cannot occur except in the presence of X. X is necessary, and in some situations (given certain background conditions) sufficient, to cause Y. 22

Consider the following hypothetical example. Suppose that a new disease, about which little is known, has appeared in Country A. There are hundreds of infected persons across dozens of affected communities in that country. In Country B, located at the other end of the world, several new cases of the disease surface in a single community. In this setting, we can imagine two sorts of Millean analyses. The first examines two similar communities within Country A, one of which has developed the disease and the other of which has not. This is the most‐similar style of case comparison, and focuses accordingly on the identification of a difference between the two cases that might account for variation across the sample. A second approach focuses on communities where the disease has appeared across the two countries and searches for any similarities that might account for these similar outcomes. This is the most‐different research design.

Both are plausible approaches to this particular problem, and we can imagine epidemiologists employing them simultaneously. However, the most‐different design demands stronger assumptions about the underlying factors at work. It supposes that the disease arises from the same cause in any setting. This is often a reasonable operating assumption when one is dealing with natural phenomena, though there are certainly many exceptions. Death, for example, has many causes. For this reason, it would not occur to us to look for most‐different cases of high mortality around the world. In order for the most‐different research design to effectively identify a causal factor at work in a given outcome, the researcher must assume that X1—the factor held constant across the diverse cases—is the only possible cause of Y (see Table 28.4). This assumption rarely holds in social‐scientific settings. Most outcomes of interest to anthropologists, economists, political scientists, and sociologists have multiple causes. There are many ways to win an election, to build a welfare state, to get into a war, to overthrow a government, or—returning to Marc Howard's work—to build a strong civil society. And it is for this reason that most‐different analysis is rarely applied in social science work and, where applied, is rarely convincing.

If this seems a tad severe, there is a more charitable way of approaching the most‐different method. Arguably, this is not a pure “method” at all but merely a supplement, a way of incorporating diversity in the sub‐sample of cases that provide the unusual outcome of interest. If the unusual outcome is revolutions, one might wish to encompass a wide variety of revolutions in one's analysis. If the unusual outcome is post‐Communist civil society, it seems appropriate to include a diverse set of post‐Communist polities in one's sample of case studies, as Marc Howard does. From this perspective, the most‐different method (so‐called) might be better labeled a diverse‐case method, as explored above.

10 Conclusions

In order to be a case of something broader than itself, the chosen case must be representative (in some respects) of a larger population. Otherwise—if it is purely idiosyncratic (“unique”)—it is uninformative about anything lying outside the borders of the case itself. A study based on a nonrepresentative sample has no (or very little) external validity. To be sure, no phenomenon is purely idiosyncratic; the notion of a unique case is a matter that would be difficult to define. One is concerned, as always, with matters of degree. Cases are more or less representative of some broader phenomenon and, on that score, may be considered better or worse subjects for intensive analysis. (The one exception, as noted, is the influential case.)

Of all the problems besetting case‐study analysis, perhaps the most persistent—and the most persistently bemoaned—is the problem of sample bias (Achen and Snidal 1989; Collier and Mahoney 1996; Geddes 1990; King, Keohane, and Verba 1994; Rohlfing 2004; Sekhon 2004). Lisa Martin (1992, 5) finds that the overemphasis of international relations scholars on a few well‐known cases of economic sanctions—most of which failed to elicit any change in the sanctioned country—“has distorted analysts view of the dynamics and characteristics of economic sanctions.” Barbara Geddes (1990) charges that many analyses of industrial policy have focused exclusively on the most successful cases—primarily the East Asian NICs—leading to biased inferences. Anna Breman and Carolyn Shelton (2001) show that case‐study work on the question of structural adjustment is systematically biased insofar as researchers tend to focus on disaster cases—those where structural adjustment is associated with very poor health and human development outcomes. These cases, often located in sub‐Saharan Africa, are by no means representative of the entire population. Consequently, scholarship on the question of structural adjustment is highly skewed in a particular ideological direction (against neoliberalism) (see also Gerring, Thacker, and Moreno 2005).

These examples might be multiplied many times. Indeed, for many topics the most‐studied cases are acknowledged to be less than representative. It is worth reflecting upon the fact that our knowledge of the world is heavily colored by a few “big” (populous, rich, powerful) countries, and that a good portion of the disciplines of economics, political science, and sociology are built upon scholars' familiarity with the economics, political science, and sociology of one country, the United States. 23 Case‐study work is particularly prone to problems of investigator bias since so much rides on the researcher's selection of one (or a few) cases. Even if the investigator is unbiased, her sample may still be biased simply by virtue of “random” error (which may be understood as measurement error, error in the data‐generation process, or as an underlying causal feature of the universe).

There are only two situations in which a case‐study researcher need not be concerned with the representativeness of her chosen case. The first is the influential case research design, where a case is chosen because of its possible influence on a cross‐case model, and hence is not expected to be representative of a larger sample. The second is the deviant‐case method, where the chosen case is employed to confirm a broader cross‐case argument to which the case stands as an apparent exception. Yet even here the chosen case is expected to be representative of a broader set of cases—those, in particular, that are poorly explained by the extant model.

In all other circumstances, cases must be representative of the population of interest in whatever ways might be relevant to the proposition in question. Note that where a researcher is attempting to disconfirm a deterministic proposition the question of representativeness is perhaps more appropriately understood as a question of classification: Is the chosen case appropriately classified as a member of the designated population? If so, then it is fodder for a disconfirming case study.

If the researcher is attempting to confirm a deterministic proposition, or to make probabilistic arguments about a causal relationship, then the problem of representativeness is of the more usual sort: Is case A unit‐homogeneous relative to other cases in the population? This is not an easy matter to test. However, in a large‐N context the residual for that case (in whatever model the researcher has greatest confidence in) is a reasonable place to start. Of course, this test is only as good as the model at hand. Any incorrect specifications or incorrect modeling procedures will likely bias the results and give an incorrect assessment of each case's “typicality.” In addition, there is the possibility of stochastic error, errors that cannot be modeled in a general framework. Given the explanatory weight that individual cases are asked to bear in a case‐study analysis, it is wise to consider more than just the residual test of representativeness. Deductive logic and an in‐depth knowledge of the case in question are often more reliable tools than the results of a cross‐case model.
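The residual test mentioned above can be illustrated with a small sketch. It assumes a bivariate ordinary least squares model, which is a simplification chosen for brevity; the chapter does not prescribe a particular model. The case labels, the data, and the functions ols_residuals and rank_by_typicality are hypothetical.

```python
from statistics import mean

# Hypothetical cross-case data: one X and one Y value per case.
NAMES = ["A", "B", "C", "D", "E"]
XS = [1.0, 2.0, 3.0, 4.0, 5.0]
YS = [2.0, 4.0, 12.0, 8.0, 10.0]


def ols_residuals(xs, ys):
    """Fit y = intercept + slope * x by least squares and
    return the residual (observed minus fitted) for each case."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def rank_by_typicality(names, xs, ys):
    """Order cases from most typical (smallest absolute residual)
    to least typical (largest absolute residual)."""
    residuals = ols_residuals(xs, ys)
    return sorted(zip(names, residuals), key=lambda pair: abs(pair[1]))


for name, residual in rank_by_typicality(NAMES, XS, YS):
    print(f"{name}: residual = {residual:+.2f}")
```

In this toy data case C lies furthest from the fitted line and is therefore ranked last. The caveat in the text applies directly: the ranking is only as trustworthy as the model that produces the residuals, and a single outlier also shifts the fitted line for every other case.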

In any case, there is no dispensing with the question. Case studies (with the two exceptions already noted) rest upon an assumed synecdoche: The case should stand for a population. If this is not true, or if there is reason to doubt this assumption, then the utility of the case study is brought severely into question.

Fortunately, there is some safety in numbers. Insofar as case‐study evidence is combined with cross‐case evidence the issue of sample bias is mitigated. Indeed, the suspicion of case‐study work that one finds in the social sciences today is, in my view, a product of a too‐literal interpretation of the case‐study method. A case study tout court is thought to mean a case study tout seul. Insofar as case studies and cross‐case studies can be enlisted within the same investigation (either in the same study or by reference to other studies in the same subfield), problems of representativeness are less worrisome. This is the virtue of cross‐level work, a.k.a. “triangulation.”

11 Ambiguities

Before concluding, I wish to draw attention to two ambiguities in case‐selection strategies in case‐study research. The first concerns the admixture of several case‐selection strategies. The second concerns the changing status of a case as a study proceeds.

Some case studies follow only one strategy of case selection. They are typical, diverse, extreme, deviant, influential, crucial, pathway, most‐similar, or most‐different research designs, as discussed. However, many case studies mix and match among these case‐selection strategies. Indeed, insofar as all case studies seek representative samples, they are always in search of “typical” cases. Thus, it is common for writers to declare that their case is, for example, both extreme and typical; it has an extreme value on X1 or Y but is not, in other respects, idiosyncratic. There is not much that one can say about these combinations of strategies except that, where the cases allow for a variety of empirical strategies, there is no reason not to pursue them. And where the same cases can serve several functions at once (without further effort on the researcher's part), there is little cost to a multi‐pronged approach to case analysis.

The second issue that deserves emphasis is the changing status of a case during the course of a researcher's investigation—which may last for years, if not decades. The problem is acute wherever a researcher begins in an exploratory mode and proceeds to hypothesis‐testing (that is, she develops a specific X1/Y proposition) or where the operative hypothesis or key control variable changes (a new causal factor is discovered or another outcome becomes the focus of analysis). Things change. And it is the mark of a good researcher to keep her mind open to new evidence and new insights. Too often, methodological discussions give the misleading impression that hypotheses are clear and remain fixed over the course of a study's development. Nothing could be further from the truth. The unofficial transcripts of academia—accessible in informal settings, where researchers let their guards down (particularly if inebriated)—are filled with stories about dead‐ends, unexpected findings, and drastically revised theory chapters. It would be interesting, in this vein, to compare published work with dissertation prospectuses and fellowship applications. I doubt if the correlation between these two stages of research is particularly strong.

Research, after all, is about discovery, not simply the verification or falsification of static hypotheses. That said, it is also true that research on a particular topic should move from hypothesis‐generating to hypothesis‐testing. This marks the progress of a field, and of a scholar's own work. As a rule, research that begins with an open‐ended (X‐ or Y‐centered) analysis should conclude with a determinate X1/Y hypothesis.

The problem is that research strategies that are ideal for exploration are not always ideal for confirmation. The extreme‐case method is inherently exploratory since there is no clear causal hypothesis; the researcher is concerned merely to explore variation on a single dimension (X or Y). Other methods can be employed in either an open‐ended (exploratory) or a hypothesis‐testing (confirmatory/disconfirmatory) mode. The difficulty is that once the researcher has arrived at a determinate hypothesis the originally chosen research design may no longer appear to be so well designed.

This is unfortunate, but inevitable. One cannot construct the perfect research design until (a) one has a specific hypothesis and (b) one is reasonably certain about what one is going to find “out there” in the empirical world. This is particularly true of observational research designs, but it also applies to many experimental research designs: Usually, there is a “good” (informative) finding, and a finding that is less insightful. In short, the perfect case‐study research design is usually apparent only ex post facto.

There are three ways to handle this. One can explain, straightforwardly, that the initial research was undertaken in an exploratory fashion, and therefore not constructed to test the specific hypothesis that is—now—the primary argument. Alternatively, one can try to redesign the study after the new (or revised) hypothesis has been formulated. This may require additional field research or perhaps the integration of additional cases or variables that can be obtained through secondary sources or through consultation of experts. A final approach is to simply jettison, or de‐emphasize, the portion of research that no longer addresses the (revised) key hypothesis. A three‐case study may become a two‐case study, and so forth. Lost time and effort are the costs of this downsizing.

In the event, practical considerations will probably determine which of these three strategies, or combinations of strategies, is to be followed. (They are not mutually exclusive.) The point to remember is that revision of one's cross‐case research design is normal and perhaps to be expected. Not all twists and turns on the meandering trail of truth can be anticipated.

12 Are There Other Methods of Case Selection?

At the outset of this chapter I summarized the task of case selection as a matter of achieving two objectives: representativeness (typicality) and variation (causal leverage). Evidently, there are other objectives as well. For example, one wishes to identify cases that are independent of each other. If chosen cases are affected by each other (sometimes known as Galton's problem or a problem of diffusion), this problem must be corrected before analysis can take place. I have neglected this issue because it is usually apparent to the researcher and, in any case, there are no simple techniques that might be utilized to correct for such biases. (For further discussion of this and other factors impinging upon case selection see Gerring 2001, 178–81.)

I have also disregarded pragmatic/logistical issues that might affect case selection. Evidently, case selection is often influenced by a researcher's familiarity with the language of a country, a personal entrée into that locale, special access to important data, or funding that covers one archive rather than another. Pragmatic considerations are often—and quite rightly—decisive in the case‐selection process.

A final consideration concerns the theoretical prominence of a particular case within the literature on a subject. Researchers are sometimes obliged to study cases that have received extensive attention in previous studies. These are sometimes referred to as “paradigmatic” cases or “exemplars” (Flyvbjerg 2004, 427).

However, neither pragmatic/logistical utility nor theoretical prominence qualifies as a methodological factor in case selection. That is, these features of a case have no bearing on the validity of the findings stemming from a study. As such, it is appropriate to grant these issues a peripheral status in this chapter.

One final caveat must be issued. While it is traditional to distinguish among the tasks of case selection and case analysis, a close look at these processes shows them to be indistinct and overlapping. One cannot choose a case without considering the sort of analysis that it might be subjected to, and vice versa. Thus, the reader should consider choosing cases by employing the nine techniques laid out in this chapter along with any considerations that might be introduced by virtue of a case's quasi‐experimental qualities, a topic taken up elsewhere (Gerring 2007, ch. 6).

Abadie, A., Drukker, D., Herr, J. L., and Imbens, G. W. 2001. Implementing matching estimators for average treatment effects in Stata. Stata Journal, 1: 1–18.

Abbott, A. 2001. Time Matters: On Theory and Method. Chicago: University of Chicago Press.

—— and Tsay, A. 2000. Sequence analysis and optimal matching methods in sociology. Sociological Methods and Research, 29: 3–33. 10.1177/0049124100029001001

—— and Forrest, J. 1986. Optimal matching methods for historical sequences. Journal of Interdisciplinary History, 16: 471–94. 10.2307/204500

Achen, C. H., and Snidal, D. 1989. Rational deterrence theory and comparative case studies. World Politics, 41: 143–69. 10.2307/2010405

Allen, W. S. 1965. The Nazi Seizure of Power: The Experience of a Single German Town, 1930–1935. New York: Watts.

Almond, G. A. 1956. Comparative political systems. Journal of Politics, 18: 391–409.

Amenta, E. 1991. Making the most of a case study: theories of the welfare state and the American experience. Pp. 172–94 in Issues and Alternatives in Comparative Social Research, ed. C. C. Ragin. Leiden: E. J. Brill.

Barro, R. J. 1999. Determinants of democracy. Journal of Political Economy, 107: 158–83. 10.1086/250107

Belsley, D. A., Kuh, E., and Welsch, R. E. 2004. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley.

Bennett, A., Lepgold, J., and Unger, D. 1994. Burden‐sharing in the Persian Gulf War. International Organization, 48: 39–75. 10.1017/S0020818300000813

Bentley, A. 1908/1967. The Process of Government. Cambridge, Mass.: Harvard University Press.

Brady, H. E., and Collier, D. (eds.) 2004. Rethinking Social Inquiry: Diverse Tools, Shared Standards. Lanham, Md.: Rowman and Littlefield.

Braumoeller, B. F. 2003. Causal complexity and the study of politics. Political Analysis, 11: 209–33. 10.1093/pan/mpg012

Breman, A., and Shelton, C. 2001. Structural adjustment and health: a literature review of the debate, its role‐players and presented empirical evidence. CMH Working Paper Series, Paper No. WG6: 6. WHO, Commission on Macroeconomics and Health.

Brenner, R. 1976. Agrarian class structure and economic development in pre‐industrial Europe. Past and Present, 70: 30–75. 10.1093/past/70.1.30

Browne, A. 1987. When Battered Women Kill. New York: Free Press.

Buchbinder, S., and Vittinghoff, E. 1999. HIV‐infected long‐term nonprogressors: epidemiology, mechanisms of delayed progression, and clinical and research implications. Microbes Infect, 1: 1113–20. 10.1016/S1286-4579(99)00204-X

Cohen, M. R., and Nagel, E. 1934. An Introduction to Logic and Scientific Method. New York: Harcourt, Brace and Company.

Collier, D., and Mahoney, J. 1996. Insights and pitfalls: selection bias in qualitative research. World Politics, 49: 56–91. 10.1353/wp.1996.0023

Collier, R. B., and Collier, D. 1991/2002. Shaping the Political Arena: Critical Junctures, the Labor Movement, and Regime Dynamics in Latin America. Notre Dame, Ind.: University of Notre Dame Press.

Colomer, J. M. 1991. Transitions by agreement: modeling the Spanish way. American Political Science Review, 85: 1283–302. 10.2307/1963946

Converse, P. E., and Dupeux, G. 1962. Politicization of the electorate in France and the United States. Public Opinion Quarterly, 26: 1–23. 10.1086/267067

Coppedge, M. J. 2004. The conditional impact of the economy on democracy in Latin America. Presented at the conference “Democratic Advancements and Setbacks: What Have We Learnt?”, Uppsala University, June 11–13.

De Felice, E. G. 1986. Causal inference and comparative methods. Comparative Political Studies, 19: 415–37. 10.1177/0010414086019003005

Desch, M. C. 2002. Democracy and victory: why regime type hardly matters. International Security, 27: 5–47. 10.1162/016228802760987815

Deyo, F. (ed.) 1987. The Political Economy of the New Asian Industrialism. Ithaca, NY: Cornell University Press.

Dion, D. 1998. Evidence and inference in the comparative case study. Comparative Politics, 30: 127–45. 10.2307/422284

Eckstein, H. 1975. Case studies and theory in political science. In Handbook of Political Science, vii: Political Science: Scope and Theory, ed. F. I. Greenstein and N. W. Polsby. Reading, Mass.: Addison‐Wesley.

Eggan, F. 1954. Social anthropology and the method of controlled comparison. American Anthropologist, 56: 743–63. 10.1525/aa.1954.56.5.02a00020

Elman, C. 2003. Lessons from Lakatos. In Progress in International Relations Theory: Appraising the Field, ed. C. Elman and M. F. Elman. Cambridge, Mass.: MIT Press.

—— 2005. Explanatory typologies in qualitative studies of international politics. International Organization, 59: 293–326.

Emigh, R. 1997. The power of negative thinking: the use of negative case methodology in the development of sociological theory. Theory and Society, 26: 649–84. 10.1023/A:1006896217647

Epstein, L. D. 1964. A comparative study of Canadian parties. American Political Science Review, 58: 46–59. 10.2307/1952754

Ertman, T. 1997. Birth of the Leviathan: Building States and Regimes in Medieval and Early Modern Europe. Cambridge: Cambridge University Press.

Esping‐Andersen, G. 1990. The Three Worlds of Welfare Capitalism. Princeton, NJ: Princeton University Press.

Flyvbjerg, B. 2004. Five misunderstandings about case‐study research. Pp. 420–34 in Qualitative Research Practice, ed. C. Seale, G. Gobo, J. F. Gubrium, and D. Silverman. London: Sage.

Geddes, B. 1990. How the cases you choose affect the answers you get: selection bias in comparative politics. In Political Analysis, vol. ii, ed. J. A. Stimson. Ann Arbor: University of Michigan Press.

—— 2003. Paradigms and Sand Castles: Theory Building and Research Design in Comparative Politics. Ann Arbor: University of Michigan Press.

George, A. L., and Bennett, A. 2005. Case Studies and Theory Development. Cambridge, Mass.: MIT Press.

—— and Smoke, R. 1974. Deterrence in American Foreign Policy: Theory and Practice. New York: Columbia University Press.

Gerring, J. 2001. Social Science Methodology: A Criterial Framework. Cambridge: Cambridge University Press.

—— 2007. Case Study Research: Principles and Practices. Cambridge: Cambridge University Press.

—— Thacker, S., and Moreno, C. 2005. Do neoliberal policies save lives? Unpublished manuscript.

Goertz, G., and Starr, H. (eds.) 2003. Necessary Conditions: Theory, Methodology and Applications. New York: Rowman and Littlefield.

—— and Levy, J. (eds.) forthcoming. Causal explanations, necessary conditions, and case studies: World War I and the end of the Cold War. Manuscript.

Goodin, R. E., and Smitsman, A. 2000. Placing welfare states: the Netherlands as a crucial test case. Journal of Comparative Policy Analysis, 2: 39–64. 10.1080/13876980008412635

Gujarati, D. N. 2003. Basic Econometrics, 4th edn. New York: McGraw‐Hill.

Hamilton, G. G. 1977. Chinese consumption of foreign commodities: a comparative perspective. American Sociological Review, 42: 877–91. 10.2307/2094574

Haynes, B. F., Pantaleo, G., and Fauci, A. S. 1996. Toward an understanding of the correlates of protective immunity to HIV infection. Science, 271: 324–8. 10.1126/science.271.5247.324

Hempel, C. G. 1942. The function of general laws in history. Journal of Philosophy, 39: 35–48. 10.2307/2017635

Ho, D. E., Imai, K., King, G., and Stuart, E. A. 2004. Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Manuscript.

Howard, M. M. 2003. The Weakness of Civil Society in Post‐Communist Europe. Cambridge: Cambridge University Press.

Howson, C., and Urbach, P. 1989. Scientific Reasoning: The Bayesian Approach. La Salle, Ill.: Open Court.

Humphreys, M. 2005. Natural resources, conflict, and conflict resolution: uncovering the mechanisms. Journal of Conflict Resolution, 49: 508–37. 10.1177/0022002705277545

Jenicek, M.   2001 . Clinical Case Reporting in Evidence‐Based Medicine , 2nd edn. Oxford: Oxford University Press.

Karl, T. L.   1997 . The Paradox of Plenty: Oil Booms and Petro‐states . Berkeley: University of California Press.

Kazancigil, A.   1994 . The deviant case in comparative analysis: high stateness in comparative analysis. Pp. 213–38 in Comparing Nations: Concepts, Strategies, Substance , ed. M. Dogan and A. Kazancigil . Cambridge: Blackwell.

Kemp, K. A. 1986. Race, ethnicity, class and urban spatial conflict: Chicago as a crucial case. Urban Studies, 23: 197–208. 10.1080/00420988620080231

Kendall, P. L. and Wolf, K. M. 1949/ 1955 . The analysis of deviant cases in communications research. In Communications Research, 1948–1949 , ed. P. F. Lazarsfeld and F. N. Stanton. New York: Harper and Brothers. Reprinted as pp. 167–70 in The Language of Social Research , ed. P. F. Lazarsfeld and M. Rosenberg . New York: Free Press.

Kennedy, C. H.   2005 . Single‐case Designs for Educational Research . Boston: Allyn and Bacon.

Kennedy, P.   2003 . A Guide to Econometrics , 5th edn. Cambridge, Mass.: MIT Press.

Khong, Y. F.   1992 . Analogies at War: Korea, Munich, Dien Bien Phu, and the Vietnam Decisions of 1965 . Princeton, NJ: Princeton University Press.

King, G., Keohane, R. O. and Verba, S. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton, NJ: Princeton University Press.

Lakatos, I.   1978 . The Methodology of Scientific Research Programmes . Cambridge: Cambridge University Press.

Lazarsfeld, P. F. and Barton, A. H. 1951. Qualitative measurement in the social sciences: classification, typologies, and indices. In The Policy Sciences, ed. D. Lerner and H. D. Lasswell. Stanford, Calif.: Stanford University Press.

Levy, J. S.   2002 . Qualitative methods in international relations. In Evaluating Methodology in International Studies , ed. F. P. Harvey and M. Brecher. Ann Arbor: University of Michigan Press.

Lijphart, A.   1968 . The Politics of Accommodation: Pluralism and Democracy in the Netherlands . Berkeley: University of California Press.

——  1969 . Consociational democracy.   World Politics , 21: 207–25. 10.2307/2009820

——  1971 . Comparative politics and the comparative method. American Political Science Review , 65: 682–93.

——  1975 . The comparable cases strategy in comparative research.   Comparative Political Studies , 8: 158–77.

Lipset, S. M.   1959 . Some social requisites of democracy: economic development and political development.   American Political Science Review , 53: 69–105. 10.2307/1951731

——  1960/ 1963 . Political Man: The Social Bases of Politics . Garden City, NY: Anchor.

——  1968 . Agrarian Socialism: The Cooperative Commonwealth Federation in Saskatchewan. A Study in Political Sociology . Garden City, NY: Doubleday.

——  Trow, M. A. and Coleman, J. S.   1956 . Union Democracy: The Internal Politics of the International Typographical Union . New York: Free Press.

Lynd, R. S. and Lynd, H. M. 1929/ 1956 . Middletown: A Study in American Culture . New York: Harcourt, Brace.

Mahoney, J. and Goertz, G.   2004 . The possibility principle: choosing negative cases in comparative research.   American Political Science Review , 98: 653–69.

Martin, L. L. 1992. Coercive Cooperation: Explaining Multilateral Economic Sanctions. Princeton, NJ: Princeton University Press.

Mayo, D. G.   1996 . Error and the Growth of Experimental Knowledge . Chicago: University of Chicago Press.

Meckstroth, T.   1975 . “Most different systems” and “most similar systems:” a study in the logic of comparative inquiry.   Comparative Political Studies , 8: 133–77.

Miguel, E.   2004 . Tribe or nation: nation‐building and public goods in Kenya versus Tanzania.   World Politics , 56: 327–62. 10.1353/wp.2004.0018

Mill, J. S. 1843/ 1872 . The System of Logic , 8th edn. London: Longmans, Green.

Monroe, K. R.   1996 . The Heart of Altruism: Perceptions of a Common Humanity . Princeton, NJ: Princeton University Press.

Moore, B., Jr.   1966 . Social Origins of Dictatorship and Democracy: Lord and Peasant in the Making of the Modern World . Boston: Beacon Press.

Morgan, S. L. and Harding, D. J. 2005. Matching estimators of causal effects: from stratification and weighting to practical data analysis routines. Manuscript.

Moulder, F. V.   1977 . Japan, China and the Modern World Economy: Toward a Reinterpretation of East Asian Development ca. 1600 to ca. 1918 . Cambridge: Cambridge University Press.

Munck, G. L.   2004 . Tools for qualitative research. Pp. 105–21 in Rethinking Social Inquiry: Diverse Tools, Shared Standards , ed. H. E. Brady and D. Collier . Lanham, Md. : Rowman and Littlefield.

Njolstad, O.   1990 . Learning from history? Case studies and the limits to theory‐building. Pp. 220–46 in Arms Races: Technological and Political Dynamics , ed. O. Njolstad . Thousand Oaks, Calif.: Sage.

Patton, M. Q.   2002 . Qualitative Evaluation and Research Methods . Newbury Park, Calif.: Sage.

Popper, K. 1934/ 1968 . The Logic of Scientific Discovery . New York: Harper and Row.

——  1963 . Conjectures and Refutations . London: Routledge and Kegan Paul.

Posner, D.   2004 . The political salience of cultural difference: why Chewas and Tumbukas are allies in Zambia and adversaries in Malawi.   American Political Science Review , 98: 529–46.

Przeworski, A. and Teune, H.   1970 . The Logic of Comparative Social Inquiry . New York: John Wiley.

Queen, S.   1928 . Round table on the case study in sociological research.   Publications of the American Sociological Society, Papers and Proceedings , 22: 225–7.

Ragin, C. C.   2000 . Fuzzy‐set Social Science . Chicago: University of Chicago Press.

——  2004 . Turning the tables. Pp. 123–38 in Rethinking Social Inquiry: Diverse Tools, Shared Standards , ed. H. E. Brady and D. Collier.   Lanham, Md. : Rowman and Littlefield.

Reilly, B.   2000 –1. Democracy, ethnic fragmentation, and internal conflict: confused theories, faulty data, and the “crucial case” of Papua New Guinea.   International Security , 25: 162–85. 10.1162/016228800560552

——  and Phillpot, R.   2003 . “Making democracy work” in Papua New Guinea: social capital and provincial development in an ethnically fragmented society.   Asian Survey , 42: 906–27. 10.1525/as.2002.42.6.906

Rogowski, R.   1995 . The role of theory and anomaly in social‐scientific inference.   American Political Science Review , 89: 467–70. 10.2307/2082443

Rohlfing, I. 2004. Have you chosen the right case? Uncertainty in case selection for single case studies. Working Paper, International University, Bremen.

Rosenbaum, P. R.   2004 . Matching in observational studies. In Applied Bayesian Modeling and Causal Inference from an Incomplete‐data Perspective , ed. A. Gelman and X.‐L. Meng . New York: John Wiley.

——  and Silber, J. H.   2001 . Matching and thick description in an observational study of mortality after surgery.   Biostatistics , 2: 217–32. 10.1093/biostatistics/2.2.217

Ross, M.   2001 . Does oil hinder democracy?   World Politics , 53: 325–61. 10.1353/wp.2001.0011

Sagan, S. D.   1995 . Limits of Safety: Organizations, Accidents, and Nuclear Weapons . Princeton, NJ: Princeton University Press.

Sekhon, J. S. 2004. Quality meets quantity: case studies, conditional probability and counterfactuals. Perspectives on Politics, 2: 281–93.

Shafer, M. D.   1988 . Deadly Paradigms: The Failure of U.S. Counterinsurgency Policy . Princeton, NJ: Princeton University Press.

Skocpol, T.   1979 . States and Social Revolutions: A Comparative Analysis of France, Russia, and China . Cambridge: Cambridge University Press.

——  and Somers, M.   1980 . The uses of comparative history in macrosocial inquiry.   Comparative Studies in Society and History , 22: 147–97.

Stinchcombe, A. L.   1968 . Constructing Social Theories . New York: Harcourt, Brace.

Swank, D. H.   2002 . Global Capital, Political Institutions, and Policy Change in Developed Welfare States . Cambridge: Cambridge University Press.

Tendler, J.   1997 . Good Government in the Tropics . Baltimore: Johns Hopkins University Press.

Truman, D. B.   1951 . The Governmental Process . New York: Alfred A. Knopf.

Tsai, L.   2007 . Accountability without Democracy: How Solidary Groups Provide Public Goods in Rural China . Cambridge: Cambridge University Press.

Van Evera, S.   1997 . Guide to Methods for Students of Political Science . Ithaca, NY: Cornell University Press.

Wahlke, J. C.   1979 . Pre‐behavioralism in political science. American Political Science Review , 73: 9–31. 10.2307/1954728

Yashar, D. J.   2005 . Contesting Citizenship in Latin America: The Rise of Indigenous Movements and the Postliberal Challenge . Cambridge: Cambridge University Press.

Yin, R. K.   2004 . Case Study Anthology . Thousand Oaks, Calif.: Sage.

Gujarati (2003) ; Kennedy (2003) . Interestingly, the potential of cross‐case statistics in helping to choose cases for in‐depth analysis is recognized in some of the earliest discussions of the case‐study method (e.g. Queen 1928 , 226).

This expands on Mill (1843/1872 , 253), who wrote of scientific enquiry as twofold: “either inquiries into the cause of a given effect or into the effects or properties of a given cause.”

This method has not received much attention on the part of qualitative methodologists; hence, the absence of a generally recognized name. It bears some resemblance to J. S. Mill's Joint Method of Agreement and Difference ( Mill 1843/1872 ), which is to say a mixture of most‐similar and most‐different analysis, as discussed below. Patton (2002 , 234) employs the concept of “maximum variation (heterogeneity) sampling.”

More precisely, George and Smoke (1974 , 534, 522–36, ch. 18 ; see also discussion in Collier and Mahoney 1996 , 78) set out to investigate causal pathways and discovered, through the course of their investigation of many cases, these three causal types. Yet, for our purposes what is important is that the final sample includes at least one representative of each “type.”

For further examples see Collier and Mahoney (1996) ; Geddes (1990) ; Tendler (1997) .

Traditionally, methodologists have conceptualized cases as having “positive” or “negative” values (e.g. Emigh 1997 ; Mahoney and Goertz 2004 ; Ragin 2000 , 60; 2004 , 126).

Geddes (1990) ; King, Keohane, and Verba (1994) . See also discussion in Brady and Collier (2004) ; Collier and Mahoney (1996) ; Rogowski (1995) .

The exception would be a circumstance in which the researcher intends to disprove a deterministic argument ( Dion 1998 ).

Geddes (2003 , 131). For other examples of casework from the annals of medicine see “Clinical reports” in the Lancet , “Case studies” in Canadian Medical Association Journal , and various issues of the Journal of Obstetrics and Gynecology , often devoted to clinical cases (discussed in Jenicek 2001 , 7). For examples from the subfield of comparative politics see Kazancigil (1994) .

For a discussion of the important role of anomalies in the development of scientific theorizing see Elman (2003) ; Lakatos (1978) . For examples of deviant‐case research designs in the social sciences see Amenta (1991) ; Coppedge (2004) ; Eckstein (1975) ; Emigh (1997) ; Kendall and Wolf (1949/1955) .

For examples of the crucial‐case method see Bennett, Lepgold, and Unger (1994) ; Desch (2002) ; Goodin and Smitsman (2000) ; Kemp (1986) ; Reilly and Phillpot (2003) . For general discussion see George and Bennett (2005) ; Levy (2002) ; Stinchcombe (1968 , 24–8).

A third position, which purports to be neither Popperian nor Bayesian, has been articulated by Mayo (1996, ch. 6). From this perspective, the same idea is articulated as a matter of “severe tests.”

It should be noted that Tsai's conclusions do not rest solely on this crucial case. Indeed, she employs a broad range of methodological tools, encompassing case‐study and cross‐case methods.

See also the discussion in Eckstein (1975) and Lijphart (1969) . For additional examples of case studies disconfirming general propositions of a deterministic nature see Allen (1965); Lipset, Trow, and Coleman (1956) ; Njolstad (1990) ; Reilly (2000–1) ; and discussion in Dion (1998) ; Rogowski (1995) .

Granted, insofar as case‐study analysis provides a window into causal mechanisms, and causal mechanisms are integral to a given theory, a single case may be enlisted to confirm or disconfirm a proposition. However, if the case study upholds a posited pattern of X/Y covariation, and finds fault only with the stipulated causal mechanism, it would be more accurate to say that the study forces the reformulation of a given theory, rather than its confirmation or disconfirmation. See further discussion in the following section.

Sometimes, the most‐similar method is known as the “method of difference,” after its inventor ( Mill 1843/1872 ). For later treatments see Cohen and Nagel (1934) ; Eggan (1954) ; Gerring (2001 , ch. 9 ); Lijphart (1971 ; 1975) ; Meckstroth (1975) ; Przeworski and Teune (1970) ; Skocpol and Somers (1980) .

For good introductions see Ho et al. (2004) ; Morgan and Harding (2005) ; Rosenbaum (2004) ; Rosenbaum and Silber (2001) . For a discussion of matching procedures in Stata see Abadie et al. (2001) .

The most‐different method is also sometimes referred to as the “method of agreement,” following its inventor, J. S. Mill (1843/1872) . See also De Felice (1986) ; Gerring (2001 , 212–14); Lijphart (1971 ; 1975) ; Meckstroth (1975) ; Przeworski and Teune (1970) ; Skocpol and Somers (1980) . For examples of this method see Collier and Collier (1991/2002) ; Converse and Dupeux (1962) ; Karl (1997) ; Moore (1966) ; Skocpol (1979) ; Yashar (2005 , 23). However, most of these studies are described as combining most‐similar and most‐different methods.

In the following discussion I treat the terms social capital, civil society, and civic engagement interchangeably.

E.g. Collier and Collier (1991/2002) ; Karl (1997) ; Moore (1966) ; Skocpol (1979) ; Yashar (2005 , 23). Karl (1997) , which affects to be a most‐different system analysis (20), is a particularly clear example of this. Her study, focused ostensibly on petro‐states (states with large oil reserves), makes two sorts of inferences. The first concerns the (usually) obstructive role of oil in political and economic development. The second sort of inference concerns variation within the population of petro‐states, showing that some countries (e.g. Norway, Indonesia) manage to avoid the pathologies brought on elsewhere by oil resources. When attempting to explain the constraining role of oil on petro‐states, Karl usually relies on contrasts between petro‐states and nonpetro‐states (e.g. ch. 10 ). Only when attempting to explain differences among petro‐states does she restrict her sample to petro‐states. In my opinion, very little use is made of the most‐different research design.

This was recognized, at least implicitly, by Mill (1843/1872 , 258–9). Skepticism has been echoed by methodologists in the intervening years (e.g. Cohen and Nagel 1934 , 251–6; Gerring 2001 ; Skocpol and Somers 1980 ). Indeed, explicit defenses of the most‐different method are rare (but see De Felice 1986 ).

Another way of stating this is to say that X is a “nontrivial necessary condition” of Y .

Wahlke (1979 , 13) writes of the failings of the “behavioralist” mode of political science analysis: “It rarely aims at generalization; research efforts have been confined essentially to case studies of single political systems, most of them dealing …with the American system.”


Case Study – Methods, Examples and Guide

Case Study Research

A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation.

It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied. Case studies typically involve multiple sources of data, including interviews, observations, documents, and artifacts, which are analyzed using various techniques, such as content analysis, thematic analysis, and grounded theory. The findings of a case study are often used to develop theories, inform policy or practice, or generate new research questions.

Types of Case Study

The main types of case study are as follows:

Single-Case Study

A single-case study is an in-depth analysis of a single case. This type of case study is useful when the researcher wants to understand a specific phenomenon in detail.

For example, a researcher might conduct a single-case study of one individual to understand that person's experience of a health condition, or of one organization to explore its management practices. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a single-case study are often used to generate new research questions, develop theories, or inform policy or practice.

Multiple-Case Study

A multiple-case study involves the analysis of several cases that are similar in nature. This type of case study is useful when the researcher wants to identify similarities and differences between the cases.

For example, a researcher might conduct a multiple-case study of several companies to explore the factors that contribute to their success or failure. The researcher collects data from each case, compares and contrasts the findings, and analyzes the data using techniques such as comparative analysis or pattern-matching. The findings of a multiple-case study can be used to develop theories, inform policy or practice, or generate new research questions.

Exploratory Case Study

An exploratory case study is used to explore a new or understudied phenomenon. This type of case study is useful when the researcher wants to generate hypotheses or theories about the phenomenon.

For example, a researcher might conduct an exploratory case study of a new technology to understand its potential impact on society. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as grounded theory or content analysis. The findings of an exploratory case study can be used to generate new research questions, develop theories, or inform policy or practice.

Descriptive Case Study

A descriptive case study is used to describe a particular phenomenon in detail. This type of case study is useful when the researcher wants to provide a comprehensive account of the phenomenon.

For example, a researcher might conduct a descriptive case study of a particular community to document its social and economic characteristics. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a descriptive case study can be used to inform policy or practice or to generate new research questions.

Instrumental Case Study

An instrumental case study uses a particular case as a means of understanding a broader issue or phenomenon. The case itself is of secondary interest; it is chosen because it helps the researcher gain insight into something beyond it.

For example, a researcher might conduct an instrumental case study of a particular policy in order to understand a broader question, such as how poverty can be reduced. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of an instrumental case study can be used to inform policy or practice or to generate new research questions.

Case Study Data Collection Methods

Here are some common data collection methods for case studies:

Interviews

Interviews involve asking questions to individuals who have knowledge or experience relevant to the case study. Interviews can be structured (where the same questions are asked to all participants) or unstructured (where the interviewer follows up on the responses with further questions). Interviews can be conducted in person, over the phone, or through video conferencing.

Observations

Observations involve watching and recording the behavior and activities of individuals or groups relevant to the case study. Observations can be participant (where the researcher actively participates in the activities) or non-participant (where the researcher observes from a distance). Observations can be recorded using notes, audio or video recordings, or photographs.

Documents

Documents can be used as a source of information for case studies. Documents can include reports, memos, emails, letters, and other written materials related to the case study. Documents can be collected from the case study participants or from public sources.

Surveys

Surveys involve asking a set of questions to a sample of individuals relevant to the case study. Surveys can be administered in person, over the phone, through mail or email, or online. Surveys can be used to gather information on attitudes, opinions, or behaviors related to the case study.

Artifacts

Artifacts are physical objects relevant to the case study. Artifacts can include tools, equipment, products, or other objects that provide insights into the case study phenomenon.

How to Conduct Case Study Research

Conducting case study research involves several steps that help ensure the quality and rigor of the study:

  • Define the research questions: The first step in conducting case study research is to define the research questions. The research questions should be specific, measurable, and relevant to the case study phenomenon under investigation.
  • Select the case: The next step is to select the case or cases to be studied. The case should be relevant to the research questions and should provide rich and diverse data that can be used to answer the research questions.
  • Collect data: Data can be collected using various methods, such as interviews, observations, documents, surveys, and artifacts. The data collection method should be selected based on the research questions and the nature of the case study phenomenon.
  • Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions and should aim to provide insights and conclusions relevant to the research questions.
  • Draw conclusions: The conclusions drawn from the case study should be based on the data analysis and should be relevant to the research questions. The conclusions should be supported by evidence and should be clearly stated.
  • Validate the findings: The findings of the case study should be validated by reviewing the data and the analysis with participants or other experts in the field. This helps to ensure the validity and reliability of the findings.
  • Write the report: The final step is to write the report of the case study research. The report should provide a clear description of the case study phenomenon, the research questions, the data collection methods, the data analysis, the findings, and the conclusions. The report should be written in a clear and concise manner and should follow the guidelines for academic writing.
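
The data analysis and validation steps above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a prescribed procedure: it assumes the researcher has already coded excerpts by hand, and the excerpt list, theme names, and function names (`tally_themes`, `sources_per_theme`, `corroborated_themes`) are all hypothetical. It counts how often each theme was coded and flags themes that appear in more than one type of data source, which is one simple way to check whether a finding is corroborated across sources.

```python
from collections import Counter, defaultdict

# Hypothetical coded excerpts: (data source, theme code assigned by the researcher).
coded_excerpts = [
    ("interview", "workload"),
    ("interview", "communication"),
    ("observation", "workload"),
    ("document", "workload"),
    ("document", "training"),
    ("interview", "workload"),
]


def tally_themes(excerpts):
    """Count how often each theme code appears across all excerpts."""
    return Counter(theme for _source, theme in excerpts)


def sources_per_theme(excerpts):
    """Map each theme to the set of data sources in which it was coded."""
    sources = defaultdict(set)
    for source, theme in excerpts:
        sources[theme].add(source)
    return dict(sources)


def corroborated_themes(excerpts, min_sources=2):
    """Return themes coded in at least `min_sources` distinct data sources."""
    return sorted(
        theme
        for theme, found_in in sources_per_theme(excerpts).items()
        if len(found_in) >= min_sources
    )


counts = tally_themes(coded_excerpts)
corroborated = corroborated_themes(coded_excerpts)
```

A tally like this does not replace interpretive analysis; it only summarizes coding decisions the researcher has already made, and the threshold of two sources is an arbitrary choice for the example.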

Examples of Case Study

Here are some examples of case study research:

  • The Hawthorne Studies: Conducted between 1924 and 1932 at the Hawthorne Works plant of the Western Electric Company in Cicero, Illinois, near Chicago, the Hawthorne Studies were a series of studies, associated with Elton Mayo and his colleagues, that examined the impact of the work environment on employee productivity. They included interviews, observations, and experiments.
  • The Stanford Prison Experiment: Conducted in 1971 by Philip Zimbardo, the Stanford Prison Experiment examined the psychological effects of power and authority by simulating a prison environment and assigning participants to the role of guards or prisoners. The study was controversial due to the ethical issues it raised.
  • The Challenger Disaster: The 1986 Space Shuttle Challenger explosion has been the subject of case studies examining its causes. These studies have drawn on interviews, observations, and analysis of data to identify the technical, organizational, and cultural factors that contributed to the disaster.
  • The Enron Scandal: The bankruptcy of the Enron Corporation in 2001 has been the subject of case studies examining its causes. These studies have drawn on interviews, analysis of financial data, and review of documents to identify the accounting practices, corporate culture, and ethical issues that led to the company’s downfall.
  • The Fukushima Nuclear Disaster: The 2011 accident at the Fukushima Daiichi Nuclear Power Plant in Japan has been the subject of case studies examining its causes. These studies have drawn on interviews, analysis of data, and review of documents to identify the technical, organizational, and cultural factors that contributed to the disaster.

Application of Case Study

Case studies have a wide range of applications across various fields and industries. Here are some examples:

Business and Management

Case studies are widely used in business and management to examine real-life situations and develop problem-solving skills. Case studies can help students and professionals to develop a deep understanding of business concepts, theories, and best practices.

Healthcare

Case studies are used in healthcare to examine patient care, treatment options, and outcomes. Case studies can help healthcare professionals to develop critical thinking skills, diagnose complex medical conditions, and develop effective treatment plans.

Education

Case studies are used in education to examine teaching and learning practices. Case studies can help educators to develop effective teaching strategies, evaluate student progress, and identify areas for improvement.

Social Sciences

Case studies are widely used in social sciences to examine human behavior, social phenomena, and cultural practices. Case studies can help researchers to develop theories, test hypotheses, and gain insights into complex social issues.

Law and Ethics

Case studies are used in law and ethics to examine legal and ethical dilemmas. Case studies can help lawyers, policymakers, and ethical professionals to develop critical thinking skills, analyze complex cases, and make informed decisions.

Purpose of Case Study

The purpose of a case study is to provide a detailed analysis of a specific phenomenon, issue, or problem in its real-life context. A case study is a qualitative research method that involves the in-depth exploration and analysis of a particular case, which can be an individual, group, organization, event, or community.

The primary purpose of a case study is to generate a comprehensive and nuanced understanding of the case, including its history, context, and dynamics. Case studies can help researchers to identify and examine the underlying factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and detailed understanding of the case, which can inform future research, practice, or policy.

Case studies can also serve other purposes, including:

  • Illustrating a theory or concept: Case studies can be used to illustrate and explain theoretical concepts and frameworks, providing concrete examples of how they can be applied in real-life situations.
  • Developing hypotheses: Case studies can help to generate hypotheses about the causal relationships between different factors and outcomes, which can be tested through further research.
  • Providing insight into complex issues: Case studies can provide insights into complex and multifaceted issues, which may be difficult to understand through other research methods.
  • Informing practice or policy: Case studies can be used to inform practice or policy by identifying best practices, lessons learned, or areas for improvement.

Advantages of Case Study Research

There are several advantages of case study research, including:

  • In-depth exploration: Case study research allows for a detailed exploration and analysis of a specific phenomenon, issue, or problem in its real-life context. This can provide a comprehensive understanding of the case and its dynamics, which may not be possible through other research methods.
  • Rich data: Case study research can generate rich and detailed data, including qualitative data such as interviews, observations, and documents. This can provide a nuanced understanding of the case and its complexity.
  • Holistic perspective: Case study research allows for a holistic perspective of the case, taking into account the various factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and comprehensive understanding of the case.
  • Theory development: Case study research can help to develop and refine theories and concepts by providing empirical evidence and concrete examples of how they can be applied in real-life situations.
  • Practical application: Case study research can inform practice or policy by identifying best practices, lessons learned, or areas for improvement.
  • Contextualization: Case study research takes into account the specific context in which the case is situated, which can help to understand how the case is influenced by the social, cultural, and historical factors of its environment.

Limitations of Case Study Research

There are several limitations of case study research, including:

  • Limited generalizability: Case studies are typically focused on a single case or a small number of cases, which limits the generalizability of the findings. The unique characteristics of the case may not be applicable to other contexts or populations, which may limit the external validity of the research.
  • Biased sampling: Case studies may rely on purposive or convenience sampling, which can introduce bias into the sample selection process. This may limit the representativeness of the sample and the generalizability of the findings.
  • Subjectivity: Case studies rely on the interpretation of the researcher, which can introduce subjectivity into the analysis. The researcher’s own biases, assumptions, and perspectives may influence the findings, which may limit the objectivity of the research.
  • Limited control: Case studies are typically conducted in naturalistic settings, which limits the control that the researcher has over the environment and the variables being studied. This may limit the ability to establish causal relationships between variables.
  • Time-consuming: Case studies can be time-consuming to conduct, as they typically involve a detailed exploration and analysis of a specific case. This may limit the feasibility of conducting multiple case studies or conducting case studies in a timely manner.
  • Resource-intensive: Case studies may require significant resources, including time, funding, and expertise. This may limit the ability of researchers to conduct case studies in resource-constrained settings.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

  • Open access
  • Published: 27 June 2011

The case study approach

  • Sarah Crowe 1,
  • Kathrin Cresswell 2,
  • Ann Robertson 2,
  • Guro Huby 3,
  • Anthony Avery 1 &
  • Aziz Sheikh 2

BMC Medical Research Methodology, volume 11, Article number: 100 (2011)

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1 , 2 , 3 and 4 ) and those of others to illustrate our discussion[ 3 – 7 ].

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic , instrumental and collective [ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are however not necessarily mutually exclusive categories. In the first of our examples (Table 1), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[3]. In contrast, the other three examples (see Tables 2, 3 and 4) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[4–6]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[4].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[1]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3, for example)[1]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial, giving a new drug to randomly selected individuals and then comparing outcomes with controls)[9], the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4)[6, 10]. Key questions to consider when selecting the most appropriate study design are whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6 ). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7 )[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3 ), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1 ) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3 ) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 – 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2 )[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[22]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach provided a number of contextual factors likely to influence the effectiveness of the intervention and which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
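As a rough illustration of what organising and coding data for later retrieval can look like, the following minimal Python sketch applies a small coding frame to interview excerpts and then retrieves the excerpts for one code, grouped by case. The code names, keywords, site labels and excerpts are all hypothetical, and keyword matching is only a crude stand-in for the researcher's judgement; it is not the procedure used in any of the studies described here, nor a substitute for a dedicated qualitative data analysis package.

```python
from collections import defaultdict

# Hypothetical initial coding frame: code name -> keywords that suggest it.
CODING_FRAME = {
    "access": ["access", "gatekeeper"],
    "workload": ["time", "busy", "workload"],
    "training": ["training", "taught"],
}

# Hypothetical excerpts, each labelled with the case (site) it came from.
EXCERPTS = [
    ("Site A", "We had no time to attend the training sessions."),
    ("Site A", "Getting access to the wards was difficult."),
    ("Site B", "Staff were too busy to use the new system."),
]


def apply_coding_frame(excerpts, frame):
    """Return {code: {case: [excerpt, ...]}} using simple keyword matching."""
    index = defaultdict(lambda: defaultdict(list))
    for case, text in excerpts:
        lowered = text.lower()
        for code, keywords in frame.items():
            if any(keyword in lowered for keyword in keywords):
                index[code][case].append(text)
    return index


def retrieve(index, code):
    """Return the coded excerpts for one code, grouped by case."""
    return {case: list(texts) for case, texts in index.get(code, {}).items()}


coded = apply_coding_frame(EXCERPTS, CODING_FRAME)
```

Grouping the retrieved excerpts by case keeps within-case analysis separate from cross-case comparison, in line with the advice to analyse individual component cases first.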

The Framework approach is a practical approach to managing and analysing large datasets, particularly if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[3, 24]. It comprises five stages: familiarisation; identifying a thematic framework; indexing; charting; and mapping and interpretation. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements (technology, people, and the organisational settings within which they worked) in our study of the introduction of electronic health record systems (Table 3)[5]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[6].
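To make the charting stage of the Framework approach a little more concrete, the sketch below rearranges indexed data into a case-by-theme matrix, which is one common way of laying out a chart. It assumes that indexing has already produced (case, theme, summary) entries; the theme names, case labels and summaries are invented for illustration and do not come from the studies cited above.

```python
# Hypothetical thematic framework and indexed summaries (illustrative only).
THEMES = ["barriers", "facilitators"]
INDEXED = [
    ("Case 1", "barriers", "Limited clinic time for recruitment"),
    ("Case 2", "barriers", "Language needs not met"),
    ("Case 1", "facilitators", "Trusted community contacts"),
]


def chart(indexed, themes):
    """Build a matrix with one row per case and one column per theme."""
    matrix = {}
    for case, theme, summary in indexed:
        if theme not in themes:
            raise ValueError(f"Theme not in framework: {theme!r}")
        # Every case row starts with an empty cell for each theme,
        # so gaps in the data stay visible in the chart.
        row = matrix.setdefault(case, {name: [] for name in themes})
        row[theme].append(summary)
    return matrix


matrix = chart(INDEXED, THEMES)
```

Keeping empty cells explicit makes it easier, at the mapping and interpretation stage, to see which themes were absent from which cases.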

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3 , we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4 ), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[1]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[8, 18–21, 23, 26]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, helps readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[8].

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

References

1. Yin RK: Case study research, design and method. 2009, London: Sage Publications Ltd., 4

2. Keen J, Packwood T: Qualitative research; case study evaluation. BMJ. 1995, 311: 444-446.

3. Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al: Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009, 6 (10): 1-11.

4. Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, et al: The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO). 2008, [http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf]

5. Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al: Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010, 41: c4564-

6. Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group: Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010, 15: 4-10. 10.1258/jhsrp.2009.009052.

7. van Harten WH, Casparie TF, Fisscher OA: The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002, 60 (1): 17-37. 10.1016/S0168-8510(01)00187-7.

8. Stake RE: The art of case study research. 1995, London: Sage Publications Ltd.

9. Sheikh A, Smeeth L, Ashcroft R: Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002, 52 (482): 746-51.

10. King G, Keohane R, Verba S: Designing Social Inquiry. 1996, Princeton: Princeton University Press

11. Doolin B: Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998, 13: 301-311. 10.1057/jit.1998.8.

12. George AL, Bennett A: Case studies and theory development in the social sciences. 2005, Cambridge, MA: MIT Press

13. Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG): Designing theoretically-informed implementation interventions. Implementation Science. 2006, 1: 1-8. 10.1186/1748-5908-1-1.

14. Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A: Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005, 365 (9456): 312-7.

15. Sheikh A, Panesar SS, Lasserson T, Netuveli G: Recruitment of ethnic minorities to asthma studies. Thorax. 2004, 59 (7): 634-

16. Hellström I, Nolan M, Lundh U: 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005, 4: 7-22. 10.1177/1471301205049188.

17. Som CV: Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005, 18: 463-477. 10.1108/09513550510608903.

18. Lincoln Y, Guba E: Naturalistic inquiry. 1985, Newbury Park: Sage Publications

19. Barbour RS: Checklists for improving rigour in qualitative research: a case of the tail wagging the dog?. BMJ. 2001, 322: 1115-1117. 10.1136/bmj.322.7294.1115.

20. Mays N, Pope C: Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

21. Mason J: Qualitative researching. 2002, London: Sage

22. Brazier A, Cooke K, Moravan V: Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008, 7: 5-17. 10.1177/1534735407313395.

23. Miles MB, Huberman M: Qualitative data analysis: an expanded sourcebook. 1994, CA: Sage Publications Inc., 2

24. Pope C, Ziebland S, Mays N: Analysing qualitative data. Qualitative research in health care. BMJ. 2000, 320: 114-116. 10.1136/bmj.320.7227.114.

25. Cresswell KM, Worth A, Sheikh A: Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010, 10 (1): 67. 10.1186/1472-6947-10-67.

26. Malterud K: Qualitative research: standards, challenges, and guidelines. Lancet. 2001, 358: 483-488. 10.1016/S0140-6736(01)05627-6.

27. Yin R: Case study research: design and methods. 1994, Thousand Oaks, CA: Sage Publishing, 2

28. Yin R: Enhancing the quality of case studies in health services research. Health Serv Res. 1999, 34: 1209-1224.

29. Green J, Thorogood N: Qualitative methods for health research. 2009, Los Angeles: Sage, 2

30. Howcroft D, Trauth E: Handbook of Critical Information Systems Research, Theory and Application. 2005, Cheltenham, UK: Northampton, MA, USA: Edward Elgar

31. Blakie N: Approaches to Social Enquiry. 1993, Cambridge: Polity Press

32. Doolin B: Power and resistance in the implementation of a medical management information system. Info Systems J. 2004, 14: 343-362. 10.1111/j.1365-2575.2004.00176.x.

33. Bloomfield BP, Best A: Management consultants: systems development, power and the translation of problems. Sociological Review. 1992, 40: 533-560.

34. Shanks G, Parr A: Positivist, single case study research in information systems: A critical analysis. Proceedings of the European Conference on Information Systems. 2003, Naples

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

Author information

Authors and affiliations

Division of Primary Care, The University of Nottingham, Nottingham, UK

Sarah Crowe & Anthony Avery

Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Kathrin Cresswell, Ann Robertson & Aziz Sheikh

School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Guro Huby

Corresponding author

Correspondence to Sarah Crowe.

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article.

Crowe, S., Cresswell, K., Robertson, A. et al. The case study approach. BMC Med Res Methodol 11 , 100 (2011). https://doi.org/10.1186/1471-2288-11-100

Received : 29 November 2010

Accepted : 27 June 2011

Published : 27 June 2011

DOI : https://doi.org/10.1186/1471-2288-11-100

  • Case Study Approach
  • Electronic Health Record System
  • Case Study Design
  • Case Study Site
  • Case Study Report

BMC Medical Research Methodology

ISSN: 1471-2288

Do Your Students Know How to Analyze a Case—Really?

Just as actors, athletes, and musicians spend thousands of hours practicing their craft, business students benefit from practicing their critical-thinking and decision-making skills. Students, however, often have limited exposure to real-world problem-solving scenarios; they need more opportunities to practice tackling tough business problems and deciding on, and executing, the best solutions.

To ensure students have ample opportunity to develop these critical-thinking and decision-making skills, we believe business faculty should shift from teaching mostly principles and ideas to mostly applications and practices. And in doing so, they should emphasize the case method, which simulates real-world management challenges and opportunities for students.

To help educators facilitate this shift and help students get the most out of case-based learning, we have developed a framework for analyzing cases. We call it PACADI (Problem, Alternatives, Criteria, Analysis, Decision, Implementation); it can improve learning outcomes by helping students better solve and analyze business problems, make decisions, and develop and implement strategy. Here, we’ll explain why we developed this framework, how it works, and what makes it an effective learning tool.

The Case for Cases: Helping Students Think Critically

Business students must develop critical-thinking and analytical skills, which are essential to their ability to make good decisions in functional areas such as marketing, finance, operations, and information technology, as well as to understand the relationships among these functions. For example, the decisions a marketing manager must make include strategic planning (segments, products, and channels); execution (digital messaging, media, branding, budgets, and pricing); and operations (integrated communications and technologies), as well as how to implement decisions across functional areas.

Faculty can use many types of cases to help students develop these skills. These include the prototypical “paper cases”; live cases, which feature guest lecturers such as entrepreneurs or corporate leaders and on-site visits; and multimedia cases, which immerse students in real situations. Most cases feature an explicit or implicit decision that a protagonist (an individual, a group, or an organization) must make.

For students new to learning by the case method, and even for those with case experience, some common issues can emerge; these issues can sometimes be a barrier for educators looking to ensure the best possible outcomes in their case classrooms. Unsure of how to dig into case analysis on their own, students may turn to the internet or rely on former students for “answers” to assigned cases. Or, when assigned to provide answers to assignment questions in teams, students might take a divide-and-conquer approach but not take the time to regroup and provide answers that are consistent with one another.

To help address these issues, which we commonly experienced in our classes, we wanted to provide our students with a more structured approach for how they analyze cases—and to really think about making decisions from the protagonists’ point of view. We developed the PACADI framework to address this need.

PACADI: A Six-Step Decision-Making Approach

The PACADI framework is a six-step decision-making approach that can be used in lieu of traditional end-of-case questions. It offers a structured, integrated, and iterative process that requires students to analyze case information, apply business concepts to derive valuable insights, and develop recommendations based on these insights.

Prior to beginning a PACADI assessment, which we’ll outline here, students should first prepare a two-paragraph summary—a situation analysis—that highlights the key case facts. Then, we task students with providing a five-page PACADI case analysis (excluding appendices) based on the following six steps.

Step 1: Problem definition. What is the major challenge, problem, opportunity, or decision that has to be made? If there is more than one problem, choose the most important one. Often when solving the key problem, other issues will surface and be addressed. The problem statement may be framed as a question; for example, How can brand X improve market share among millennials in Canada? Usually the problem statement has to be re-written several times during the analysis of a case as students peel back the layers of symptoms or causation.

Step 2: Alternatives. Identify in detail the strategic alternatives to address the problem; three to five options generally work best. Alternatives should be mutually exclusive, realistic, creative, and feasible given the constraints of the situation. Doing nothing or delaying the decision to a later date are not considered acceptable alternatives.

Step 3: Criteria. What are the key decision criteria that will guide decision-making? In a marketing course, for example, these may include relevant marketing criteria such as segmentation, positioning, advertising and sales, distribution, and pricing. Financial criteria useful in evaluating the alternatives should be included—for example, income statement variables, customer lifetime value, payback, etc. Students must discuss their rationale for selecting the decision criteria and the weights and importance for each factor.

Step 4: Analysis. Provide an in-depth analysis of each alternative based on the criteria chosen in step three. Decision tables using criteria as columns and alternatives as rows can be helpful. The pros and cons of the various choices as well as the short- and long-term implications of each may be evaluated. Best, worst, and most likely scenarios can also be insightful.

Step 5: Decision. Students propose their solution to the problem. This decision is justified based on an in-depth analysis. Explain why the recommendation made is the best fit for the criteria.

Step 6: Implementation plan. Sound business decisions may fail due to poor execution. To enhance the likelihood of a successful project outcome, students describe the key steps (activities) to implement the recommendation, timetable, projected costs, expected competitive reaction, success metrics, and risks in the plan.
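The decision table mentioned in Step 4 (criteria as columns, alternatives as rows) combined with the weights from Step 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the PACADI framework itself: the criteria names, weights, alternative labels, and 1-to-5 scores below are all hypothetical, and a simple weighted sum is just one possible way to aggregate the table.

```python
from typing import Dict


def weighted_scores(
    weights: Dict[str, float],
    scores: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Return the weighted total for each alternative.

    weights: criterion -> weight (must sum to 1.0)
    scores:  alternative -> {criterion -> score}
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criteria weights must sum to 1.0")
    totals = {}
    for alternative, row in scores.items():
        missing = set(weights) - set(row)
        if missing:
            raise ValueError(f"{alternative} has no score for {sorted(missing)}")
        # Weighted sum across the criteria columns of this row.
        totals[alternative] = sum(weights[c] * row[c] for c in weights)
    return totals


# Hypothetical criteria and weights (Step 3).
weights = {"market_fit": 0.5, "profitability": 0.3, "feasibility": 0.2}

# Hypothetical decision table (Step 4): one row per alternative, scores 1-5.
scores = {
    "A1": {"market_fit": 4, "profitability": 3, "feasibility": 5},
    "A2": {"market_fit": 5, "profitability": 2, "feasibility": 3},
    "A3": {"market_fit": 3, "profitability": 5, "feasibility": 4},
}

totals = weighted_scores(weights, scores)
best = max(totals, key=totals.get)  # candidate for the Step 5 decision

for alternative, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{alternative}: {total:.2f}")
print("Highest weighted score:", best)
```

A table like this supports, but does not replace, the written justification in Step 5: students still need to explain why the weights are reasonable and whether the ranking holds under best, worst, and most likely scenarios.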

PACADI’s Benefits: Meaningfully and Thoughtfully Applying Business Concepts

The PACADI framework covers all of the major elements of business decision-making, including implementation, which is often overlooked. By stepping through the whole framework, students apply relevant business concepts and solve management problems via a systematic, comprehensive approach; they’re far less likely to surface piecemeal responses.

As students explore each part of the framework, they may realize that they need to make changes to a previous step. For instance, when working on implementation, students may realize that the alternative they selected cannot be executed or will not be profitable, and thus need to rethink their decision. Or, they may discover that the criteria need to be revised since the list of decision factors they identified is incomplete (for example, the factors may explain key marketing concerns but fail to address relevant financial considerations) or is unrealistic (for example, they suggest a 25 percent increase in revenues without proposing an increased promotional budget).

In addition, the PACADI framework can be used alongside quantitative assignments, in-class exercises, and business and management simulations. The structured, multi-step decision framework encourages careful and sequential analysis to solve business problems. Incorporating PACADI as an overarching decision-making method across different projects will ultimately help students achieve desired learning outcomes. As a practical “beyond-the-classroom” tool, the PACADI framework is not a contrived course assignment; it reflects the decision-making approach that managers, executives, and entrepreneurs exercise daily. Case analysis introduces students to the real-world process of making business decisions quickly and correctly, often with limited information. This framework supplies an organized and disciplined process that students can readily defend in writing and in class discussions.

PACADI in Action: An Example

Here’s an example of how students used the PACADI framework for a recent case analysis on CVS, a large North American drugstore chain.

The CVS Prescription for Customer Value*

PACADI Stage: Summary Response

Problem: How should CVS Health evolve from the “drugstore of your neighborhood” to the “drugstore of your future”?

Alternatives:

  • A1. Kaizen (continuous improvement)
  • A2. Product development
  • A3. Market development
  • A4. Personalization (micro-targeting)

Criteria (include weights):

  • C1. Customer value: service, quality, image, and price (40%)
  • C2. Customer obsession (20%)
  • C3. Growth through related businesses (20%)
  • C4. Customer retention and customer lifetime value (20%)

Analysis: Each alternative was analyzed by each criterion using a Customer Value Assessment Tool.

Decision: Alternative 4 (A4): Personalization was selected. This is operationalized via: segmentation (move toward segment-of-1 marketing); geodemographics and lifestyle emphasis; predictive data analysis; relationship marketing; people, principles, and supply chain management; and exceptional customer service.

Implementation:

  • Partner with leading medical school
  • Curbside pick-up
  • Pet pharmacy
  • E-newsletter for customers and employees
  • Employee incentive program
  • CVS beauty days
  • Expand to Latin America and Caribbean
  • Healthier/happier corner
  • Holiday toy drives/community outreach

*Source: A. Weinstein, Y. Rodriguez, K. Sims, R. Vergara, “The CVS Prescription for Superior Customer Value—A Case Study,” Back to the Future: Revisiting the Foundations of Marketing from Society for Marketing Advances, West Palm Beach, FL (November 2, 2018).

Results of Using the PACADI Framework

When faculty members at our respective institutions at Nova Southeastern University (NSU) and the University of North Carolina Wilmington have used the PACADI framework, our classes have been more structured and engaging. Students vigorously debate each element of their decision and note that this framework yields an “aha moment”—they learned something surprising in the case that led them to think differently about the problem and their proposed solution.

These lively discussions enhance individual and collective learning. As one external metric of this improvement, we have observed a 2.5 percent increase in student case grade performance at NSU since this framework was introduced.

Tips to Get Started

The PACADI approach works well in in-person, online, and hybrid courses. This is particularly important as more universities have moved to remote learning options. Because students have varied educational and cultural backgrounds, work experience, and familiarity with case analysis, we recommend that faculty members have students work on their first case using this new framework in small teams (two or three students). Additional analyses should then be solo efforts.

To use PACADI effectively in your classroom, we suggest the following:

  • Advise your students that your course will stress critical thinking and decision-making skills, not just course concepts and theory.
  • Use a varied mix of case studies. As marketing professors, we often address consumer and business markets; goods, services, and digital commerce; domestic and global business; and small and large companies in a single MBA course.
  • As a starting point, provide a short explanation (about 20 to 30 minutes) of the PACADI framework with a focus on the conceptual elements. You can deliver this face to face or through videoconferencing.
  • Give students an opportunity to practice the case analysis methodology via an ungraded sample case study. Designate groups of five to seven students to discuss the case and the six steps in breakout sessions (in class or via Zoom).
  • Ensure case analyses are weighted heavily as a grading component. We suggest 30–50 percent of the overall course grade.
  • Once cases are graded, debrief with the class on what they did right and areas needing improvement (30- to 40-minute in-person or Zoom session).
  • Encourage faculty teams that teach common courses to build appropriate instructional materials, grading rubrics, videos, sample cases, and teaching notes.

When selecting case studies, we have found that the best ones for PACADI analyses are about 15 pages long and revolve around a focal management decision. This length provides adequate depth yet is not protracted. Some of our tested and favorite marketing cases include Brand W, Hubspot, Kraft Foods Canada, TRSB(A), and Whiskey & Cheddar.

Art Weinstein

Art Weinstein, Ph.D., is a professor of marketing at Nova Southeastern University, Fort Lauderdale, Florida. He has published more than 80 scholarly articles and papers and eight books on customer-focused marketing strategy. His latest book is Superior Customer Value: Finding and Keeping Customers in the Now Economy. Dr. Weinstein has consulted for many leading technology and service companies.

Herbert V. Brotspies

Herbert V. Brotspies, D.B.A., is an adjunct professor of marketing at Nova Southeastern University. He has over 30 years’ experience as a vice president in marketing, strategic planning, and acquisitions for Fortune 50 consumer products companies working in the United States and internationally. His research interests include return on marketing investment, consumer behavior, business-to-business strategy, and strategic planning.

John T. Gironda

John T. Gironda, Ph.D., is an assistant professor of marketing at the University of North Carolina Wilmington. His research has been published in Industrial Marketing Management, Psychology & Marketing, and Journal of Marketing Management. He has also presented at major marketing conferences including the American Marketing Association, Academy of Marketing Science, and Society for Marketing Advances.

QUALITATIVE METHODS

13 Case Studies

Valéry Ridde, Abdourahmane Coulibaly, and Lara Gautier

Case studies consist of an in-depth analysis of one or more cases, using a variety of methods and theoretical approaches. The choice of cases (single or multiple) studied is crucial. Case studies are particularly suitable for studying the emergence and processes involved in policy implementation and for contributing to theory-based evaluations.

Keywords: Qualitative methods, quantitative methods, mixed methods, case study, theoretical approaches, single/multiple cases, empirical triangulation, analytical generalisation

I. What does this method consist of?

Also used in anthropology, the case study approach has long been used in evaluation, where it is considered not as a method but as a research strategy (Yin 2018). By studying a policy in context and using multiple lines of evidence, the case study (single or multiple) seeks to answer ‘how’ and ‘why’ questions from a systems approach and with the support of theoretical approaches. Conducting a case study for a public policy evaluation follows a standard evaluation process: planning, drafting the protocol, preparing the field, collecting and analysing data, sharing results and making recommendations for policy improvement (Gagnon 2012). As with all evaluations, the choice of methods should follow the objectives and the evaluation question, not the other way around. A case study may thus mobilise qualitative, quantitative and different mixed methods designs.

The case study strategy is therefore appropriate when organising an evaluation of policy emergence, process, relevance or adaptation. It is often mobilised when evaluation teams have little or no control over the events and context that influence policy actions. This is often the case outside of experimental situations, which are rare in the field of public policy. It is therefore mostly recommended for understanding a contemporary, often complex, phenomenon organised in a real context.

The case study approach can be used to explain a public policy, describe it in depth or illustrate a specific situation, which can sometimes be original and enlightening for decision-making. The advantage of case studies is that they can be adapted to different situations where there are multiple variables of interest around a policy. It is also about being able to use multiple sources of data, both quantitative and qualitative, which allow for empirical triangulation. The case study strategy allows theoretical propositions and the state of scientific knowledge to guide data collection and analysis. It fits perfectly with, but is not limited to, theory-based evaluation approaches (see separate chapter on theory-based evaluation).

Many typologies of case studies have been proposed. Firstly, it is possible to use single case studies (involving one policy) or multiple case studies (several policies in the same organisational context or one policy in different contexts). Secondly, these cases can be studied holistically (the policy as a whole) or at different levels of analysis (the dimensions of the policy that the intervention theory will have specified or the particular regional contexts). The choice of case studies should be heuristic (to learn from the study) and strategic (to have data available within the available budget, to answer useful questions). A key criterion for case selection is to have sufficiently relevant information to understand the policy in depth and complexity. Case sampling should therefore be explicit, rigorous and transparent. The selection of case studies can thus be critical, unique, typical, revealing, instrumental, etc. This selection can also be carried out in collaboration between the research and policy teams to ensure that the choices are relevant and feasible. The selection can also be based on prior quantitative analyses to obtain the starting situation of the cases and, for example, choose cases that are very contrasting or very similar in their performance with regard to the policy being analysed.

Sometimes it can also be useful to have a diachronic approach in order to produce longitudinal case studies. For example, analysing a policy over time can reveal the influences of changes in the context or in the strategies of those implementing it, or of those benefiting from it. Starting with cases with similar initial conditions and then studying their evolution is referred to as ‘racing cases’ by Eisenhardt (Gehman et al. 2018).

When analysing the data, the case study approach requires mobilising a replication logic, in addition to the usual analyses specific to the methods (content analysis, thematic analysis, descriptive or inferential statistics, etc.). The idea is to compare, in a systematic and rigorous way, the empirical data and the theory, be it the theory of the policy intervention or a theoretical or conceptual framework used to understand the policy. This process is referred to by Yin as analytical generalisation. When several cases support the same theory, it is possible to suggest the presence of a replication logic (Yin 2010).

Configurations can be heuristic tools for this analysis, whether they are organisational or rooted in critical realism (see separate chapter on realistic evaluation). Furthermore, finding similar patterns, or situations, in different contexts strengthens the ability to generalise the results of case studies. Yin believes that analytical generalisation requires the construction of a very strong case that will be able to withstand the challenges of logical analysis. Thus, it is essential to specify this theoretical rationale at the outset of the case study, either by mobilising a theory or from the state of the art without it being entirely specific to the public policy being analysed. At the beginning of a case study, it is therefore necessary to remain at a relatively high conceptual level, at least higher than the policy under study. Secondly, the empirical results of the case study must show how they align (or not) with the theoretical argument at the outset. Finally, it will be necessary to discuss how this theoretical thinking, based on this particular policy, can also be applied to other situations and policies in the particular case study. The fact that, even at the beginning of the case study, a counter-argument (rival hypotheses) was also formulated, and that empirical evidence was sought during the data collection process (which refutes them), reinforces the validity of this process of analytical generalisation. Finally, the power of multiple case studies is that this analytical generalisation is strengthened when the results of one case are similar to those of other cases.

Some research teams even propose that case studies can lead to theory-building, especially when analysing complex objects such as public policies.

II. How is this method useful for policy evaluation?

Before deciding to embark on a case study approach, two preliminary questions should be asked which will determine the appropriateness of the approach:

Does the phenomenon I am interested in need the case(s) to be understandable? (e.g., Theory-building case studies)

Does the case(s) represent an empirical window that informs the analysis of the wider phenomenon?

Once one or the other has been answered positively, the evaluative questions can be defined:

Under what real-life conditions can public policy X, piloted in context A, be scaled up in contexts B, C, and D?

How did the controversy about public policy Y in context B emerge?

What are the success factors for the implementation of public policy X in context A?

How were public policies Y and Z implemented in context B?

Why did public policy X in context A and B fail, while it had positive effects in context C?

Why did public policy X implemented in context A fail, while public policy Y implemented in the same context A succeeded?

What is it about the characteristics of public policy Z implemented in contexts A, B, and C that informs μ theory-building case studies?

The case study can be used at any point in the evaluation process, ex ante (at the time of policy design), in itinere (during implementation), or ex post (e.g. to better understand the results produced).

III. An example of the use of this method in Burkina Faso

Single and multiple longitudinal case studies were mobilised to study a public health financing policy in Burkina Faso (Ridde 2021).

The World Bank encouraged the government to test in a dozen districts a modality for financing health centres in addition to the state budget. The idea was to organise a performance-based payment system in which health centres and health professionals received additional funds based on the achievement of activity results. For example, for each delivery performed in the centre with a partograph, they received 3.2 euros to be shared between the structure and the staff, according to complex procedures and indicators. Verification and control processes were organised to ensure the reliability of payment claims.

To study the emergence of this new policy, we conducted a single case study (focusing on the policy) to better understand its origin, ideas, proposed solutions, people who proposed it, power issues, etc. We employed a literature review and 14 qualitative in-depth interviews with policy makers, funding agencies and experts on the subject. Using an analytical generalisation approach, we compared this emergence to understand whether what happened in Burkina Faso was also happening in Benin.

To study the implementation of the policy in Burkina Faso, we then used multiple longitudinal case studies. For reasons of time and budget, we selected three districts representing the diversity of situations in which the policy was implemented. Then, within each of these districts, we selected six cases from the primary health centres (about 30 per district) and one case that was the referral hospital (only one per district). The six cases were selected according to the three types of financing strategies that the policy wished to test, so two cases per type. We decided to select two cases with the greatest possible contrast within each of the three types: one high-performing health centre and one low-performing one. Performance was calculated using a quantitative method (time series) on the basis of indicators of health centre attendance in the years preceding the policy. This etic analysis (from the external perspective) ranked all the health centres according to their order of performance to support case selection. The selection also benefited from the emic opinion (from the internal point of view) of local health system managers in order to take into account their own perception of the performance of the centres, beyond the quantitative approach which only gives a partial view of performance. Thus, for each of the seven cases selected per district (7 × 3 = 21 cases in total), we used multiple sources of data to understand the challenges of policy implementation: analysis of documentation, formal qualitative interviews (between 114 and 215 per district) and informal interviews (between 26 and 168 per district), and observations of situations. A data collection grid was also used to measure the fidelity of policy implementation. In order to better understand the evolution of policy implementation, and in particular adaptations over time, three data collection moments were carried out over a 24-month period, thus following the longitudinal multiple case study approach.
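The quantitative (etic) part of this case selection, ranking the health centres within each financing type and keeping the most contrasting pair, can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure the evaluation team actually ran: the centre names, financing-type labels, and performance values are hypothetical, the performance score stands in for whatever the time-series analysis produced, and the emic adjustment by local health system managers is deliberately left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def select_contrasting_cases(centres: List[dict]) -> Dict[str, Tuple[str, str]]:
    """For each financing type, return (highest-performing, lowest-performing) centre."""
    by_type = defaultdict(list)
    for centre in centres:
        by_type[centre["financing_type"]].append(centre)

    selected = {}
    for financing_type, group in by_type.items():
        if len(group) < 2:
            raise ValueError(f"need at least two centres for {financing_type}")
        # Rank centres from lowest to highest performance score.
        ranked = sorted(group, key=lambda c: c["performance"])
        selected[financing_type] = (ranked[-1]["name"], ranked[0]["name"])
    return selected


# Hypothetical centres in one district; scores are illustrative only.
centres = [
    {"name": "HC-01", "financing_type": "type_1", "performance": 0.42},
    {"name": "HC-02", "financing_type": "type_1", "performance": 0.61},
    {"name": "HC-03", "financing_type": "type_1", "performance": 0.88},
    {"name": "HC-04", "financing_type": "type_2", "performance": 0.35},
    {"name": "HC-05", "financing_type": "type_2", "performance": 0.79},
    {"name": "HC-06", "financing_type": "type_3", "performance": 0.55},
    {"name": "HC-07", "financing_type": "type_3", "performance": 0.91},
    {"name": "HC-08", "financing_type": "type_3", "performance": 0.20},
]

selected = select_contrasting_cases(centres)

# Case count as described in the text: two centres per type plus one
# referral hospital per district, across three districts.
cases_per_district = 2 * len(selected) + 1
total_cases = cases_per_district * 3

for financing_type, (high, low) in sorted(selected.items()):
    print(f"{financing_type}: high={high}, low={low}")
print("Cases per district:", cases_per_district, "| total:", total_cases)
```

In practice the ranked list would be a starting point for discussion with local managers rather than a final selection, since the text stresses that the quantitative ranking gives only a partial view of performance.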

Finally, these case studies have also been fruitful in studying, with a qualitative approach and a long immersion in the field, the unexpected consequences (positive or negative) of this policy. Although this dimension of the evaluation is still too little understood, its implementation in Burkina Faso has shown the relevance of this approach (Turcotte-Tremblay et al. 2017). Limiting oneself to the expected effects, which is often implied by an extreme focus on the sole theory of intervention developed by the teams that define the policy, reduces the heuristic scope of the evaluation. While successes are essential, challenges may also be necessary to improve public policies with the help of case studies.

For all these approaches, the analysis was carried out in a hybrid manner, both deductive (with respect to the intervention theory or a conceptual framework) and inductive (original empirical data). The comparison between cases, between districts and between countries allowed for an increase in abstraction in an analytical generalisation process.

IV. What are the criteria for judging the quality of the mobilisation of this method?

Judging the quality of a complex approach such as case studies requires a global vision, going beyond the specific but essential reflections of the usual methods (quantitative and qualitative). To this end, Yin (2018) proposes to study the quality of case studies in terms of four dimensions:

Construct validity (studying the expected policy and not something else): using multiple sources of evidence, describing and establishing a causal chain, involving stakeholders in the validation of the protocol and reports;

Internal validity (confidence in results): compare empirical data with each other and with theory, construct explanatory logics, account for competing and alternative hypotheses, use logical frameworks/theories of intervention;

External validity (ability to generalise results): use theories, use the logic of analytical replication;

Reliability (for the same case study, the same findings): use a policy study protocol, develop a case database.

V. What are the strengths and limitations of this method compared to others?

The main strength of the case study is its ability to ‘incorporate the unique characteristics of each case and to examine complex phenomena in their context’, i.e. in real-life conditions (Stiles 2013, 30).

The case study strategy, due to the abundance and variety of the corpus of data mobilised, and the research methods employed (qualitative, quantitative or mixed), most often allows for a rich description of the public policy(ies) being evaluated and the contexts of implementation. This is particularly true of single case studies, which allow for in-depth analysis. With regard to multiple case studies, the main advantage is that it allows for more potential variation, which increases the robustness of the explanation. The downside is that these strategies require a significant time commitment. Thus, the sheer volume of work can be problematic, especially if the deadlines set by the sponsors are short. In addition, if there are several evaluative questions, or a question that invites the linking of implementation issues to outcomes, then it may be necessary to consider combining the case study (which may focus on process analysis, for example) with another complementary research strategy, such as quasi-experimental approaches (Yin and Ridde, 2012). Finally, several biases may arise – the biased choice of case(s), low statistical power when conducting quantitative analyses. These biases may erode comparability across cases or contexts. The rich justification of the choice of cases (public policies) (Stake 1995) and the description of the context(s), as well as the process of analytical generalisation, described above, help to reduce the impact of these biases.

With regard to theory-building case studies, both advantages and disadvantages of the case study are identified (Stiles 2013). The case study strategy here consists of comparing different statements from theory with one or more observations. This can be done by describing the few cases in theoretical terms. Thus, although each detail can only be observed once, they can be very numerous and therefore useful for theory building. However, the same biases mentioned above are likely to occur (biased case selection, low statistical power). Confidence in individual statements may be eroded by these biases. On the other hand, as many statements are examined – reflecting a variety of contexts and therefore possible variations – the overall strengthening of confidence in the theory may be just as important as in a hypothesis testing study.

Some bibliographical references for further reading

Gagnon, Yves-Chantal. 2012. L’étude de cas comme méthode de recherche. 2nd ed. Québec: Presses de l’Université du Québec.

Gehman, Joel, Vern L. Glaser, Kathleen M. Eisenhardt, Denny Gioia, Ann Langley, and Kevin G. Corley. 2018. “Finding Theory–Method Fit: A Comparison of Three Qualitative Approaches to Theory Building.” Journal of Management Inquiry, 27(3): 284‑300. https://doi.org/10.1177/1056492617706029.

Ridde, Valéry, ed. 2021. Vers une couverture sanitaire universelle en 2030? Éditions science et bien commun. Québec, Canada: Zenodo. https://doi.org/10.5281/ZENODO.5166925.

Stake, Robert E. 1995. The Art of Case Study Research. Thousand Oaks, CA: SAGE Publications.

Stiles, William B. 2013. “Using Case Studies to Build Psychotherapeutic Theories.” Psychothérapies, 33(1): 29‑35. https://doi.org/10.3917/psys.131.0029.

Turcotte-Tremblay, Anne-Marie, Idriss Ali Gali-Gali, Manuela De Allegri, and Valéry Ridde. 2017. “The Unintended Consequences of Community Verifications for Performance-Based Financing in Burkina Faso.” Social Science & Medicine, 191: 226‑36. https://doi.org/10.1016/j.socscimed.2017.09.007.

Yin, Robert K. 2010. “Analytic Generalization.” In Encyclopedia of Case Study Research, by Albert Mills, Gabrielle Durepos, and Elden Wiebe, 6. Thousand Oaks, CA: SAGE Publications. https://doi.org/10.4135/9781412957397.n8.

Yin, Robert K. 2018. Case Study Research and Applications: Design and Methods. 6th ed. Los Angeles: SAGE.

Policy Evaluation: Methods and Approaches. Copyright © by Valéry Ridde, Abdourahmane Coulibaly, and Lara Gautier. Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.

  • Open access
  • Published: 16 April 2024

How does the external context affect an implementation processes? A qualitative study investigating the impact of macro-level variables on the implementation of goal-oriented primary care

  • Ine Huybrechts (ORCID: 0000-0003-0288-1756),
  • Anja Declercq,
  • Emily Verté,
  • Peter Raeymaeckers &
  • Sibyl Anthierens

on behalf of the Primary Care Academy

Implementation Science, volume 19, Article number: 32 (2024)

Background

Although the importance of context in implementation science is not disputed, knowledge about the actual impact of external context variables on implementation processes remains rather fragmented. Current frameworks, models, and studies merely describe macro-level barriers and facilitators, without acknowledging their dynamic character and how they impact and steer implementation. Including organizational theories in implementation frameworks could be a way of tackling this problem. In this study, we therefore investigate how organizational theories can contribute to our understanding of the ways in which external context variables shape implementation processes. We use the implementation process of goal-oriented primary care in Belgium as a case.

Methods

A qualitative study using in-depth semi-structured interviews was conducted with actors from a variety of primary care organizations. Data were collected and analyzed with an iterative approach. We assessed the potential of four organizational theories to enrich our understanding of the impact of external context variables on implementation processes. The organizational theories assessed are as follows: institutional theory, resource dependency theory, network theory, and contingency theory. Data analysis was based on a combination of inductive and deductive thematic analysis techniques using NVivo 12.

Results

Institutional theory helps to understand mechanisms that steer and facilitate the implementation of goal-oriented care (GOC) through regulatory and policy measures. For example, the Flemish government issued policy for facilitating more integrated, person-centered care by means of newly created institutions, incentives, expectations, and other regulatory factors. The three other organizational theories describe both counteracting and reinforcing mechanisms. The financial system hampers interprofessional collaboration, which is key for GOC. Networks between primary care providers and health and/or social care organizations facilitate GOC on the one hand, while on the other hand, technology to support interprofessional collaboration is lacking. Contingent variables such as the aging population and increasing workload and complexity within primary care create circumstances in which GOC is presented as a possible answer.

Conclusions

Insights and propositions that derive from organizational theories can be utilized to expand our knowledge on how external context variables affect implementation processes. These insights can be combined with or integrated into existing implementation frameworks and models to increase their explanatory power.

Contributions to literature

Knowledge on how external context variables affect implementation processes tends to be rather fragmented. Insights on external context in implementation research often remain limited to merely describing macro-context barriers and facilitators.

Organizational theories contribute to our understanding of the impact of external context on an implementation process by explaining the complex interactions between organizations and their environments.

Findings can be utilized to help explain the mechanism of change in an implementation process and can be combined with or integrated into existing implementation frameworks and models to gain a broader picture on how external context affects implementation processes.

In this study, we integrate organizational theories to provide a profound analysis on how external context influences the implementation of complex interventions. There is a growing recognition that the context in which an intervention takes place highly influences implementation outcomes [ 1 , 2 ]. Despite its importance, researchers are challenged by the lack of a clear definition of context. Most implementation frameworks and models do not define context as such, but describe categories or elements of context, without capturing it as a whole [ 2 , 3 ]. Studies often distinguish between internal and external context: micro- and meso-level internal context variables are specific to a person, team, or organization. Macro-level external context variables consist of variables on a broader, socio-economic and policy level that are beyond one’s control [ 4 ].

Overall, the literature provides a rather fragmented and limited perspective on how external context influences the implementation process of a complex intervention. Attempts have been made to define, categorize, and conceptualize external context [ 5 , 6 ]. Certain implementation frameworks and models specifically mention external context, such as the conceptual model of evidence-based practice implementation in public service sectors [ 7 ], the Consolidated Framework for Implementation Research [ 8 ], or the i-PARiHS framework [ 9 ]. However, they remain limited to identifying and describing external context variables. A few studies specifically point towards the actual impact of macro-level barriers and facilitators [ 10 , 11 , 12 ], but they provide only limited insight into how these shape an implementation process. Nonetheless, external contextual variables can be highly disruptive for an organization’s implementation efforts, for example, when fluctuations in funding occur or when new legislation or technology is introduced [ 13 ]. In order to build a more comprehensive view of external context influences, we need a more elaborate theoretical perspective.

Organizational theories as a frame of reference

To better understand how the external context affects the implementation process of a primary care intervention, we build upon the research of Birken et al. [ 13 ], who demonstrate the explanatory power of organizational theories. Organizational theories can help explain the complex interactions between organizations and their environments [ 13 ], providing an understanding of the impact of external context on the mechanism of change in an implementation process. We focus on three of the theories Birken et al. [ 13 ] put forward: institutional theory, resource dependency theory, and contingency theory. We also include network theory in recognition of the importance of interorganizational context and social ties between various actors, especially in primary care settings, which are characterized by a multitude of diverse actors (meaning: participants of a process).

These four organizational theories demonstrate the ways in which organizations interact with their external environment in order to sustain and fulfill their core activities. Each of them does this through a different lens. Institutional theory states that an organization will aim to fulfil the expectations, values, or norms that are imposed upon it in order to achieve a fit with its environment [ 14 ]. This theory helps to understand the relationships between organizations and actors and the institutional context in which they operate. Institutions can broadly be defined as a set of expectations for social or organizational behavior that can take the form of formal structures such as regulatory entities, legislation, or procedures [ 15 ]. Resource dependency theory explains actions and decisions of organizations in terms of their dependence on critical and important resources. It postulates that organizations will respond to their external environment to secure the resources they need to operate [ 16 , 17 ]. This theory helps to gain insight into how fiscal variables can shape the adoption of an innovation. Contingency theory presupposes that an organization’s effectiveness depends on the congruence between situational factors and organizational characteristics [ 18 ]. External context variables such as social and economic change and pressure can impact the way in which an innovation will be integrated. Lastly, network theory in its broader sense underlines the strength of networks: by collaborating in networks, organizations can achieve outcomes that could not be realized by individual organizations acting independently. Networks are about connecting or sharing information, resources, activities, and competences of three or more organizations aiming to achieve a shared goal or outcome [ 19 , 20 ]. Investigating networks helps to gain an understanding of the importance of the interorganizational context and how social ties between organizations affect the implementation process of a complex intervention.

Goal-oriented care in Flanders as a case

In this study, we focus on the implementation of the goal-oriented care (GOC) approach in primary care in Flanders, the Dutch-speaking region of Belgium. Primary care is a highly institutionalized and regulated setting with a high level of professionalism. Healthcare organizations can be viewed as complex adaptive systems that are increasingly interdependent [ 21 ]. The primary care landscape in Flanders is characterized by many primary care providers (PCPs) being either self-employed or working in group practices or community health centers. They are organized and financed at different levels (federal, regional, local). In 2015–2019, a primary care reform was initiated in Flanders in which the region was geographically divided into 60 primary care zones that are governed by care councils. The Flemish Institute of Primary Care was created as a supporting institution aiming to strengthen the collaboration between primary care health and welfare actors. The complex and multisectoral nature of primary care in Flanders forms an interesting setting in which to gain an understanding of how macro-level context variables affect implementation processes.

The concept of GOC implies a paradigm shift [ 22 ] away from a disease- or problem-oriented focus towards a person-centered focus that departs from “what matters to the patient.” Boeykens et al. [ 23 ] state in their concept analysis that GOC could be described as a healthcare approach encompassing a multifaceted, dynamic, and iterative process underpinned by the patient’s context and values. The process is characterized by three stages: goal elicitation, goal setting, and goal evaluation, in which patients’ needs and preferences form the common thread. It is an approach in which PCPs and patients collaborate to identify personal life goals and to align care with those goals [ 23 ]. An illustration of how this manifests at the individual level can be found in Table 1 . The concept of GOC was incorporated in Flemish policies and included in the primary care reform in 2015–2019. It has gained interest in research and policy as a potential catalyst for integrated care [ 24 ]. As such, the implementation of GOC in Flanders provides an opportunity to investigate the external context of a complex primary care intervention. Our main research question is as follows: what can organizational theories tell us about the influence of external context variables on the implementation process of GOC?

Methods

We assess the potential of four organizational theories to enrich our understanding of the impact of external context variables on implementation processes. The organizational theories assessed are as follows: institutional theory, resource dependency theory, network theory, and contingency theory. Qualitative research methods are most suitable to investigate such complex matters, as they can help answer “how” and “why” questions on implementation [ 25 ]. We conducted online, semi-structured in-depth interviews with various primary care actors. These actors all had some level of experience at either meso- or micro-level with GOC implementation efforts.

Sample selection

For our purposive sample, we used the following inclusion criteria: 1) working in a Flemish health/social care context in which initiatives are taken to implement GOC and 2) having at least 6 months of experience. For recruitment, we made an overview of all possible stakeholders that are active in GOC by calling upon the network of the Primary Care Academy (PCA) Footnote 1 . Additionally, a snowballing approach was used in which respondents could refer to other relevant stakeholders at the end of each interview. This led to respondents with different backgrounds (not only medical) and varying roles, such as staff member, project coordinator, or policy maker. We aimed at maximum variation in the type of organizations represented by respondents, such as different governmental institutions and a variety of healthcare/social care organizations. In some cases, paired interviews were conducted [ 26 ] if the respondents were considered complementary in terms of expertise, background, and experience with the topic. An information letter and a request to participate were sent to each stakeholder by e-mail. One reminder was sent in case of nonresponse.

Data collection

Interviews were conducted between January and June 2022 by a sociologist trained in qualitative research methods. Interviews took place online using Microsoft Teams and were audio-recorded and transcribed verbatim. A semi-structured interview guide was used, which included (1) an exploration of the concept of GOC and how the respondent relates to this topic, (2) questions on how GOC became a topic of interest and initiatives within the respondent’s setting, and (3) the perceived barriers and facilitators for implementation. An iterative approach was used between data collection and data analysis, meaning that the interview guide underwent minor adjustments based on insights from earlier interviews in order to get richer data.

Data analysis

All data were thematically analyzed, both inductively and deductively, supported by the software NVivo 12©. For the inductive part, implicit and explicit ideas within the qualitative data were identified and described [ 27 ]. The broader research team, with backgrounds in sociology, medical sciences, and social work, discussed these initial analyses and results. The main researcher then further elaborated this into a broad understanding. This was followed by a deductive part, in which characteristics and perspectives from organizational theories were used as sensitizing concepts, inspired by research from Birken et al. [ 13 ]. This provided a frame of reference and direction, adding interpretive value to our analysis [ 28 ]. These analyses were subject to peer debriefing with our cooperating research team to validate whether the results aligned with their knowledge of GOC processes. This enhances the trustworthiness and credibility of our results [ 29 , 30 ]. Data analysis was done in Dutch, but illustrative quotes were translated into English.

Results

In-depth interviews were performed with n = 23 respondents (see Table 2 ): five interviews were duo interviews, and one interview took place with n = 3 respondents representing one organization. We had n = 6 refusals: n = 3 because of time constraints, n = 1 did not feel sufficiently knowledgeable about the topic, n = 1 changed professional function, and there was n = 1 nonresponse. Respondents related to the macro-context in various ways: we included actors that formed part of the external context (e.g., the Flemish Agency of Care and Health), actors that facilitate and strengthen organizations in the implementation of GOC (e.g., the umbrella organization for community health centers), and actors that actively convey GOC inside and outside their setting (e.g., an autonomous and integral home care service). Interviews lasted between 47 and 72 min. Table 3 gives an overview of the main findings of our deductive analysis with their respective links to the propositions of each of the organizational theories that we applied as a lens.

Institutional theory: laying foundations for a shift towards GOC

For the implementation of GOC in primary care, looking at the data through an institutional theory lens helps us understand the way in which primary care organizations will respond to the social structures surrounding them. Institutional theory describes the influence of institutions, which give shape to organizational fields: “organizations that, in the aggregate, constitute a recognized area of institutional life” [ 31 ] (p. 148). Prevailing institutions within primary care in Flanders can affect how organizations within such organizational fields fulfil their activities. Throughout our interviews, we recognized several dynamics that are described in institutional theory.

First of all, the changing landscape of primary care in Flanders (see 1.2) was often brought up as a dynamic in which GOC is intertwined with other changes. Respondents mention an overall tendency to reform primary care to become more integrated, with the ideas of person-centered care becoming more prominent. These expectations of how primary care should be approached seem to affect the organizational field of primary care: “You could tell that in people’s minds they are ready to look into what it actually means to put the patient, the person central. — INT01” Various policy actors are committed to further steering towards these approaches: “the government has called it the direction that we all have to move towards. — INT23” It was part of the foundations for the most recent primary care reform, leading to the creation of geographical primary care zones governed by care councils and the Flemish Institute of Primary Care as supporting institution.

These newly established actors were viewed by our respondents as catalysts of GOC. They pushed towards the aims to depart from local settings and to establish connections between local actors. Overall, respondents emphasized their added value as they are close to the field and they truly connect primary care actors. “They [care councils] have picked up these concepts and have started working on it. At the moment they are truly the incubators and ecosystems, as they would call it in management slang. — INT04” For an innovation such as GOC to be diffused, they are viewed as the ideal actors who can function as a facilitator or conduit. They are uniquely positioned as they are closely in contact with the practice field and can be a top-down conduit for governmental actors but also are able to address the needs from bottom-up. “In this respect, people look at the primary care zones as the ideal partners. […] We can start bringing people together and have that helicopter view: what is it that truly connects you? — INT23” However, some respondents also mentioned their difficult governance structure due to representation of many disciplines and organizations.

Other regulatory factors mentioned by respondents were other innovations or changes in primary care that were intentionally linked to GOC: e.g., the BelRAI Footnote 2 or Flemish Social Protection Footnote 3 . “The government also provides incentives. For example, family care services will gradually be obliged to work with the BelRAI screener. This way, you actually force them to start taking up GOC. — INT23” For GOC to be embedded in primary care, links with other regulatory requirements can steer PCPs towards GOC. Furthermore, it was sometimes mentioned that an important step would be for the policy level to acknowledge GOC as an element of quality of care and to include the concept in quality standards. This would further formalize and enforce the institutional expectation to move towards person-centered care.

Currently, a challenge at the institutional level, as viewed by most respondents, is that GOC is not, or only to a limited extent, incorporated in the basic education of most primary care disciplines. This leads to most PCPs having only a limited understanding of GOC and to different disciplines not having a shared language in this matter. “You have these primary health and welfare actors who each have their own approach, history and culture. To bring them together and to align them is challenging. — INT10” The absence of GOC as a topic in basic education is mentioned by various respondents as a current shortcoming in effectively implementing GOC in the wider primary care landscape.

Overall, GOC is viewed by our respondents as a topic that has recently gained a lot of interest among individual PCPs, organizations, and governmental actors. The Flemish government has laid some foundations to facilitate this change with newly created institutions and incentives. However, other external context variables can interfere in how the concept of GOC is currently being picked up and what challenges arise.

Resource dependency theory: in search for a financial system that accommodates interprofessional collaboration

Another external context variable that affects how GOC can be introduced is the financial system that is in place. To analyze themes that were raised during the interviews with regard to finances, we utilized a resource dependency perspective. This theory presumes that organizations are dependent on financial resources and seek ways to ensure their continued functioning [ 16 , 17 ]. To a certain extent, this collides with the assumptions of institutional theory, which foregrounds organizations’ conformity to institutional pressures [ 32 ]. Resource dependency theory, in contrast, highlights the differentiation of organizations that seek out competitive advantages [ 32 ].

In this context, respondents mention that their interest and willingness to move towards a GOC approach are held back by the current dominant system of pay for performance in the healthcare system. This financial system is experienced as restrictive, as it does not provide any incentive to PCPs for interprofessional collaboration, which is key for GOC. A switch to a flat fee system (in which a fixed fee is charged for each patient) or bundled payment was often mentioned as desirable. PCPs and health/social care organizations working in a context where they are financially rewarded for a trajectory or treatment of a patient in its entirety ensure that there is no tension with their necessity to obtain financial resources, as described in the resource dependency theory. Many of our respondents voice that community health centers are a good example. They cover different healthcare disciplines and operate with a fixed price per enrolled patient, regardless of the number of services for that patient. This promotes setting up preventive and health-promoting actions, which confirms our finding on the relevance of dedicated funding.

At the governmental level, the best way to finance and give incentives is said to be a point of discussion: “For years, we have been arguing about how to finance. Are we going to fund counsel coordination? Or counsel organization? Or care coordination? — INT04” Macro-level respondents do however mention financial incentives that are already in place to stimulate interprofessional collaboration: fees for multidisciplinary consultation being the most prominent. Other examples were given in which certain requirements were set for funding (e.g., Impulseo Footnote 4 , VIPA Footnote 5 ) that stimulate actors or settings in taking steps towards more interprofessional collaboration.

Nowadays, financial incentives to support organizations to engage in GOC tend to be project grants. However, a structural way to finance GOC approaches is currently lacking, according to our respondents. As a consequence, a long-term perspective for organizations is lacking; there is no stable financing and organizations are obliged to focus on projects instead of normalizing GOC in routine practice. According to a resource dependency perspective, the absence of financial incentives for practicing GOC hinders organizations in engaging with the approach, as they are focused on seeking out resources in order to fulfil their core activities.

A network-theory perspective: the importance of connectedness for the diffusion of an innovation

Throughout the interviews, interorganizational contextual elements were often addressed. A network theory lens states that collaborating in networks can lead to outcomes that could not be realized by individual organizations acting independently [ 19 , 20 ]. Networks consist of a set of actors such as PCPs or health/social care organizations along with a set of ties that link them [ 33 ]. These ties can be state-type ties (e.g., role based, cognitive) or event-type ties (e.g., through interactions, transactions). Both types of ties can enable a flow through which information or innovations can pass as actors interact [ 33 ]. To analyze the implementation process of GOC and how it is diffused through various actors, a network theory perspective can help understand the importance of the connection between actors.

A first observation throughout the interviews in which we notice the importance of networks was in the mentioning of local initiatives that already existed before the creation of the primary care zones/care councils. In the area around Ghent, local multidisciplinary networks already organized community meetings, bringing together different PCPs on overarching topics relating to long-term care for patients with chronic conditions. These regions have a tradition of collaboration and connectedness of PCPs, which respondents mention to be highly valuable: “This ensures that we are more decisive, speaking from one voice with regards to what we want to stand for. — INT23” Respondents voice that the existence of such local networks has had a positive effect on the diffusion of ideas such as GOC, as trust between different actors was already established.

Further mention of the importance of networks could be found in respondents acknowledging one of the presumptions of network theory: working collaboratively towards a specific objective leads to outcomes that cannot be realized independently. This is especially true for GOC, an approach that in essence requires different disciplines to work together: “When only one GP, nurse or social worker starts working on it, it makes no sense. Everyone who is involved with that person needs to be on board. Actually, you need to finetune teams surrounding a person — INT11.” This is why several policy-level respondents mentioned that emphasis was placed on organizing GOC initiatives in a neighborhood-oriented way, in which accessible, inclusive care is pursued by strengthening social cohesion. This way, different types of PCPs got to know each other through these sessions on GOC and would start to become aligned on what it means to provide GOC. However, self-employed PCPs in particular are hard to reach. According to our respondents, occupational groups and care councils are suitable actors to engage these self-employed PCPs, but they are not always much involved in such a network.

With regard to better connecting PCPs and health/social care organizations, respondents also mentioned the absence of connectedness through the technological landscape. Current technological systems and platforms for documenting patient information do not allow for aligning and sharing between disciplines. In Flanders, there is a history of each discipline developing its own software, which lacks centralization or unification: “For years, they have decided to just leave it to the market, in such a way that you ended up with a proliferation of software, each discipline having its own package. — INT06” Most of the respondents mentioning this were aware that the Flemish government is currently working on a unified digital care and support platform and were optimistic about its development.

Contingency theory: how environmental pressure can be a trigger for change

Our interviews were conducted during a rather dynamic and unique period of time in which the impact of social change and pressure was clearly visible: the Flemish primary care reform was ongoing, which led to the creation of care councils and VIVEL, the Flemish Institute of Primary Care (see 3.1.1), and the COVID crisis impacted the functioning of these and other primary care actors. These observed effects of societal changes are reminiscent of the assumptions that are made in contingency theory. In essence, contingency theory presupposes that “organizational effectiveness results from fitting characteristics of the organization, such as its structure, to contingencies that reflect the situation of the organization” [ 34 ] (p. 1). When it comes to the effects of the primary care reform and the COVID crisis, there were several mentions of how primary care actors reorganized their activities to adapt to these circumstances. Representatives of care councils/primary care zones whom we interviewed underlined that they were just at the point where they could again engage with their original action plans, no longer having to take up so many COVID-related tasks. On the one hand, the COVID crisis had forced them to become functional immediately and had also contributed to various primary care actors quickly getting to know them. On the other hand, the COVID crisis had also kept them from their core activities for a while. On top of that, the crisis also triggered a change in the overall view towards data sharing. Some respondents mention a rather protectionist approach towards data sharing, while data sharing has become more normalized during the COVID crisis. This discussion was also relevant for the creation of a unified shared patient record in terms of documenting and sharing patient goals.

Another societal factor that was mentioned as having an impact on the uptake of GOC is the demographic composition of a certain area. It was suggested that areas that are characterized by a patient population with more chronic care needs will be more likely to steer towards GOC as a way of coping with these complex cases. “You always have these GPs who blow it away immediately and question whether this is truly necessary. They will only become receptive to this when they experience needs for which GOC can be a solution — INT11.” On a macro-level, several respondents mentioned that a driver for change is the necessity for change becoming very tangible. As PCPs are confronted with increasing numbers of patients with complex, chronic needs and their work becomes more demanding, the need for change becomes more acute. This finding is in line with what contingency theory underlines: changes in contingency (e.g., a population that is increasingly characterized by aging and multimorbidity) are an impetus for health/social care organizations to adopt a structure that better fits the current environmental characteristics [ 34 ].

Our research demonstrates the applicability of organizational theories to help explain the impact that macro-level context variables have on an implementation process. These insights can be integrated into existing implementation frameworks and models to add the explanatory power of macro-level context variables, which to date is often neglected. The organizational theories demonstrate the ways in which organizations interact with their external environment in order to sustain and fulfill their core activities. As shown in Fig. 1, institutional theory largely explains how social expectations in the form of institutions lead towards the adoption or implementation of an innovation, such as GOC. However, other organizational theories demonstrate how other macro-context elements in different areas can either strengthen or hamper the implementation process.

Fig. 1. How organizational theories can help explain the way in which macro-level context variables affect implementation of an intervention

Departing from the mechanisms that are postulated by institutional theory, we observed that the shift towards GOC is part of a larger Flemish primary care reform in which new institutions have been established and policies have been drawn up to move towards more integrated, person-centered care. To achieve this, governmental actors have placed emphasis on socialization of care, the local context, and establishing ties between organizations in order to become more complementary in providing primary health care [ 35 ]. With various initiatives surrounding this aim, the Flemish government is steering towards GOC. This is reminiscent of the mechanisms that are posed within institutional theory: organizations adapt to prevailing norms and expectations and mimic the behaviors that surround them [ 15 , 36 ].

Throughout our data, we came across concrete examples of how institutionalization takes place. DiMaggio and Powell [ 31 ] describe the subsequent process of isomorphism: organizations start to resemble each other as they conform to their institutional environment. A first mechanism through which this change occurs is coercive isomorphism, which is clearly noticeable in our data. This type of isomorphism results from both formal and informal pressure exerted by organizations on which a dependency relationship exists and from cultural expectations in society [ 31 ]. Person-centered GOC is both formally propagated by governmental institutions and procedures and informally expected by current social tendencies. Care councils within primary care zones explicitly propagate and disseminate ideas and approaches that are desirable at policy level. Another form of isomorphism is professional isomorphism, which relates to our finding that incorporation of GOC in basic education is currently lacking. The presumptions of professional isomorphism underline the importance of this: values, norms, and ideas that are developed during education are bound to find their way into organizations as professionals start operating along these views.

Although many observations in our data back up the assumptions of institutional theory, it should be noted that new initiatives such as the promotion of person-centered care and GOC can collide with earlier policy trends. Martens et al. [ 12 ] examined the Belgian policy process relating to three integrated care projects and concluded that although there is strong support for a change towards a more patient-centered system, the current provider-driven system and institutional design complicate this objective. Furthermore, institutional theory tends to simplify actors as passive adopters of institutional norms and expectations and to overlook the human agency and sensemaking that come with it [ 37 ]. For GOC, it is particularly true that PCPs will actively have to seek out their own style and fit the approach into their own way of working. Moreover, GOC was not just addressed as a governmental expectation but was, for many PCPs, something they inherently stood behind.

Resource dependency theory posits that organizations are dependent on critical resources and adapt their way of working in response to those resources [ 17 ]. From our findings, it seems that the current financial system does not promote GOC, meaning that the mechanisms that are put forward in resource dependency theory are not set in motion. A macro-level analysis of barriers and facilitators in the implementation of integrated care in Belgium by Danhieux et al. [ 10 ] also points towards the financial system and data sharing as two of the main contextual determinants that affect implementation.

Throughout our data, the importance of a network approach was frequently mentioned. Interprofessional collaboration came forward as a prerequisite to make GOC happen, as well as active commitment on different levels. Burns, Nembhard, and Shortell [ 38 ] argue that research efforts on implementing person-centered, integrated care should focus more on the use of social networks to study relational coordination. In terms of interprofessional collaboration, to date, Belgium has a limited tradition of team-based work across different disciplines [ 35 ]. However, when it comes to strengthening a cohesive primary care network, the recently established care councils have become an important facilitator. As a network governance structure, they most closely resemble a Network Administrative Organization (NAO): a separate, centralized administrative entity that is externally governed and not another member providing its own services [ 19 ]. According to Provan and Kenis [ 19 ], this type of governance form is most effective in a rather dense network with many participants and when goal consensus is moderately high, characteristics that are indeed representative of the Flemish primary care landscape. This strengthens our observation that care councils have favorable characteristics and are well-positioned to facilitate the interorganizational context to implement GOC.

Lastly, the presumptions within contingency theory became apparent as respondents talked about how the need for change needs to become tangible for PCPs and organizations to take action, as they are increasingly faced with a shortage of time and means and more complex patient profiles. Furthermore, De Maeseneer [ 39 ] affirms our finding that the COVID-19 crisis could be employed as an opportunity to strengthen primary health care, as health becomes prioritized and its functioning becomes re-evaluated. Overall, contingency theory can help gain insight into how and why certain policy trends or decisions are made. A study by Bruns et al. [ 40 ] found that modifiable external context variables such as interagency collaboration were predictive of policy support for intervention adoption, while unmodifiable external context variables such as the socio-economic composition of a region were more predictive of the fiscal investments that are made.

Strengths and limitations

This study contributes to our overall understanding of implementation processes by looking into real-life implementation efforts for GOC in Flanders. It goes beyond a mere description of external context variables that affect implementation processes and aims to grasp which external context variables influence implementation processes and how. A variety of respondents from different organizations, with different backgrounds and perspectives, were interviewed, and results were analyzed by researchers with backgrounds in sociology, social work, and medical sciences. Results can not only be applied to further develop sustainable implementation plans for GOC but also enhance our understanding of how the external context influences and shapes implementation processes. As most research on contextual variables in implementation processes has until now mainly focused on internal context variables, knowledge of external context variables contributes to gaining a bigger picture of the mechanism of change.

However, this study is limited to the Flemish landscape, and external context variables and their dynamics might differ in other regions or countries. Furthermore, our study has examined and described how macro-level context variables affect the overall implementation processes of GOC. Further research is needed on the link between outer and inner contexts during implementation and sustainment, as explored by Lengnick-Hall et al. [ 41 ]. Another important consideration is that our sample only includes the “believers” in GOC and those who are already taking steps towards its implementation. It is possible that PCPs themselves or other relevant actors who are more skeptical about GOC have a different view on the policy and organizational processes that we explored. Furthermore, data triangulation in which these data are complemented with document analysis could have expanded our understanding and verified the subjective perceptions of respondents.

Insights and propositions that derive from organizational theories can be utilized to expand our knowledge of how external context variables affect implementation processes. Our research demonstrates that the implementation of GOC in Flanders is steered and facilitated by regulatory and policy variables, which set in motion mechanisms that are described in institutional theory. However, other external context variables interact with the implementation process and can further facilitate or hinder it. Assumptions and mechanisms explained within resource dependency theory, network theory, and contingency theory contribute to our understanding of how fiscal, technological, socio-economic, and interorganizational context variables affect an implementation process.

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available due to confidentiality guaranteed to participants but are available from the corresponding author on reasonable request.

The Primary Care Academy (PCA) is a research and teaching network of four Flemish universities, six university colleges, the White and Yellow Cross (an organization for home nursing), and patient representatives that have included GOC as one of their main research domains.

BelRAI, the Belgian implementation of the interRAI assessment tools; these are scientific, internationally validated instruments enabling an assessment of social, psychological, and physical needs and possibilities of individuals in different care settings. The data follows the person and is shared between care professionals and care organizations.

The Flemish Social Protection is a mandatory insurance established by the Flemish government to provide a range of concessions to individuals with long-term care and support needs due to illness or disability.

Impulseo, financial support for general practitioners who start an individual practice or join a group practice

VIPA, grants for the realization of sustainable, accessible, and affordable healthcare infrastructure

Abbreviations

  • GOC: Goal-oriented care
  • PCP: Primary care provider
  • PCA: Primary Care Academy

References

1. Squires JE, Graham ID, Hutchinson AM, Michie S, Francis JJ, Sales A, et al. Identifying the domains of context important to implementation science: a study protocol. Implement Sci. 2015;10(1):1–9.

2. Nilsen P, Bernhardsson S. Context matters in implementation science: a scoping review of determinant frameworks that describe contextual determinants for implementation outcomes. BMC Health Serv Res. 2019;19(1):1–21.

3. Rogers L, De Brún A, McAuliffe E. Defining and assessing context in healthcare implementation studies: a systematic review. BMC Health Serv Res. 2020;20(1):1–24.

4. Huybrechts I, Declercq A, Verté E, Raeymaeckers P, Anthierens S. The building blocks of implementation frameworks and models in primary care: a narrative review. Front Public Health. 2021;9:675171.

5. Hamilton AB, Mittman BS, Eccles AM, Hutchinson CS, Wyatt GE. Conceptualizing and measuring external context in implementation science: studying the impacts of regulatory, fiscal, technological and social change. Implement Sci. 2015;10 BioMed Central.

6. Watson DP, Adams EL, Shue S, Coates H, McGuire A, Chesher J, et al. Defining the external implementation context: an integrative systematic literature review. BMC Health Serv Res. 2018;18(1):1–14.

7. Aarons GA, Hurlburt M, Horwitz SM. Advancing a conceptual model of evidence-based practice implementation in public service sectors. Adm Policy Ment Health Ment Health Serv Res. 2011;38:4–23.

8. Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4(1):1–15.

9. Harvey G, Kitson A. PARIHS revisited: from heuristic to integrated framework for the successful implementation of knowledge into practice. Implement Sci. 2015;11(1):1–13.

10. Danhieux K, Martens M, Colman E, Wouters E, Remmen R, Van Olmen J, et al. What makes integration of chronic care so difficult? A macro-level analysis of barriers and facilitators in Belgium. Int J Integr Care. 2021;21(4).

11. Hamilton AB, Mittman BS, Campbell D, Hutchinson C, Liu H, Moss NJ, Wyatt GE. Understanding the impact of external context on community-based implementation of an evidence-based HIV risk reduction intervention. BMC Health Serv Res. 2018;18(1):1–10.

12. Martens M, Danhieux K, Van Belle S, Wouters E, Van Damme W, Remmen R, et al. Integration or fragmentation of health care? Examining policies and politics in a Belgian case study. Int J Health Policy Manag. 2022;11(9):1668.

13. Birken SA, Bunger AC, Powell BJ, Turner K, Clary AS, Klaman SL, et al. Organizational theory for dissemination and implementation research. Implement Sci. 2017;12(1):1–15.

14. Powell WW, DiMaggio PJ. The new institutionalism in organizational analysis. University of Chicago Press; 2012.

15. Zucker LG. Institutional theories of organization. Annu Rev Sociol. 1987;13(1):443–64.

16. Hillman AJ, Withers MC, Collins BJ. Resource dependence theory: a review. J Manag. 2009;35(6):1404–27.

17. Nienhüser W. Resource dependence theory-how well does it explain behavior of organizations? Management Revue; 2008. p. 9–32.

18. Lammers CJ, Mijs AA, Noort WJ. Organisaties vergelijkenderwijs: ontwikkeling en relevantie van het sociologisch denken over organisaties. Het Spectrum. 2000;6.

19. Provan KG, Kenis P. Modes of network governance: structure, management, and effectiveness. J Public Adm Res Theory. 2008;18(2):229–52.

20. Kenis P, Provan K. Het network-governance-perspectief. Business performance management Sturen op prestatie en resultaat; 2008. p. 296–312.

21. Begun JW, Zimmerman B, Dooley K. Health care organizations as complex adaptive systems. Adv Health Care Org Theory. 2003;253:288.

22. Mold JW. Failure of the problem-oriented medical paradigm and a person-centered alternative. Ann Fam Med. 2022;20(2):145–8.

23. Boeykens D, Boeckxstaens P, De Sutter A, Lahousse L, Pype P, De Vriendt P, et al. Goal-oriented care for patients with chronic conditions or multimorbidity in primary care: a scoping review and concept analysis. PLoS One. 2022;17(2):e0262843.

24. Gray CS, Grudniewicz A, Armas A, Mold J, Im J, Boeckxstaens P. Goal-oriented care: a catalyst for person-centred system integration. Int J Integr Care. 2020;20(4).

25. Hamilton AB, Finley EP. Qualitative methods in implementation research: an introduction. Psychiatry Res. 2019;280:112516.

26. Wilson AD, Onwuegbuzie AJ, Manning LP. Using paired depth interviews to collect qualitative data. Qual Rep. 2016;21(9):1549.

27. Guest G, MacQueen KM, Namey EE. Applied thematic analysis. Sage Publications; 2011.

28. Bowen GA. Grounded theory and sensitizing concepts. Int J Qual Methods. 2006;5(3):12–23.

29. Connelly LM. Trustworthiness in qualitative research. Medsurg Nurs. 2016;25(6):435.

30. Morse JM, Barrett M, Mayan M, Olson K, Spiers J. Verification strategies for establishing reliability and validity in qualitative research. Int J Qual Methods. 2002;1(2):13–22.

31. DiMaggio PJ, Powell WW. The iron cage revisited: institutional isomorphism and collective rationality in organizational fields. Am Sociol Rev. 1983;147–60.

32. de la Luz F-AM, Valle-Cabrera R. Reconciling institutional theory with organizational theories: how neoinstitutionalism resolves five paradoxes. J Organ Chang Manag. 2006;19(4):503–17.

33. Borgatti SP, Halgin DS. On network theory. Organ Sci. 2011;22(5):1168–81.

34. Donaldson L. The contingency theory of organizations. Sage; 2001.

35. De Maeseneer J, Galle A. Belgium’s healthcare system: the way forward to address the challenges of the 21st century: comment on “Integration or Fragmentation of Health Care? Examining Policies and Politics in a Belgian Case Study”. Int J Health Policy Manag. 2023;12.

36. Dadich A, Doloswala N. What can organisational theory offer knowledge translation in healthcare? A thematic and lexical analysis. BMC Health Serv Res. 2018;18(1):1–20.

37. Jensen TB, Kjærgaard A, Svejvig P. Using institutional theory with sensemaking theory: a case study of information system implementation in healthcare. J Inf Technol. 2009;24(4):343–53.

38. Burns LR, Nembhard IM, Shortell SM. Integrating network theory into the study of integrated healthcare. Soc Sci Med. 2022;296:114664.

39. De Maeseneer J. COVID-19: using the crisis as an opportunity to strengthen primary health care. Prim Health Care Res Dev. 2021;22:e73.

40. Bruns EJ, Parker EM, Hensley S, Pullmann MD, Benjamin PH, Lyon AR, Hoagwood KE. The role of the outer setting in implementation: associations between state demographic, fiscal, and policy factors and use of evidence-based treatments in mental healthcare. Implement Sci. 2019;14:1–13.

41. Lengnick-Hall R, Stadnick NA, Dickson KS, Moullin JC, Aarons GA. Forms and functions of bridging factors: specifying the dynamic links between outer and inner contexts during implementation and sustainment. Implement Sci. 2021;16:1–13.

Acknowledgements

We are grateful for the partnership with the Primary Care Academy (academie-eerstelijn.be) and want to thank the King Baudouin Foundation and Fund Daniël De Coninck for the opportunity they offer us to conduct research and have an impact on primary care in Flanders, Belgium. The consortium of the Primary Care Academy consists of the following: Roy Remmen (lead author): Department of Primary Care and Interdisciplinary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Emily Verté: Department of Primary Care and Interdisciplinary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium, and Department of Family Medicine and Chronic Care, Faculty of Medicine and Pharmacy, Vrije Universiteit Brussel, Brussel, Belgium; Muhammed Mustafa Sirimsi: Centre for Research and Innovation in Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Peter Van Bogaert: Workforce Management and Outcomes Research in Care, Faculty of Medicine and Health Sciences, University of Antwerp, Belgium; Hans De Loof: Laboratory of Physio-Pharmacology, Faculty of Pharmaceutical Biomedical and Veterinary Sciences, University of Antwerp, Belgium; Kris Van den Broeck: Department of Primary Care and Interdisciplinary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Sibyl Anthierens: Department of Primary Care and Interdisciplinary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Ine Huybrechts: Department of Primary Care and Interdisciplinary Care, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Peter Raeymaeckers: Department of Sociology, Faculty of Social Sciences, University of Antwerp, Belgium; Veerle Bufel: Department of Sociology, Centre for Population, Family and Health, Faculty of Social Sciences, University of Antwerp, Belgium; Dirk Devroey: Department of Family Medicine and Chronic Care, Faculty of Medicine and Pharmacy, Vrije Universiteit Brussel, Brussel; Bert Aertgeerts: Academic Centre for General Practice, Faculty of Medicine, KU Leuven, Leuven, and Department of Public Health and Primary Care, Faculty of Medicine, KU Leuven, Leuven; Birgitte Schoenmakers: Department of Public Health and Primary Care, Faculty of Medicine, KU Leuven, Leuven, Belgium; Lotte Timmermans: Department of Public Health and Primary Care, Faculty of Medicine, KU Leuven, Leuven, Belgium; Veerle Foulon: Department of Pharmaceutical and Pharmacological Sciences, Faculty Pharmaceutical Sciences, KU Leuven, Leuven, Belgium; Anja Declercq: LUCAS-Centre for Care Research and Consultancy, Faculty of Social Sciences, KU Leuven, Leuven, Belgium; Dominique Van de Velde: Department of Rehabilitation Sciences, Occupational Therapy, Faculty of Medicine and Health Sciences, University of Ghent, Belgium, and Department of Occupational Therapy, Artevelde University of Applied Sciences, Ghent, Belgium; Pauline Boeckxstaens: Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium; An De Sutter: Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium; Patricia De Vriendt: Department of Rehabilitation Sciences, Occupational Therapy, Faculty of Medicine and Health Sciences, University of Ghent, Belgium, and Frailty in Ageing (FRIA) Research Group, Department of Gerontology and Mental Health and Wellbeing (MENT) Research Group, Faculty of Medicine and Pharmacy, Vrije Universiteit, Brussels, Belgium, and Department of Occupational Therapy, Artevelde University of Applied Sciences, Ghent, Belgium; Lies Lahousse: Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, Ghent, Belgium; Peter Pype: Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium, End-of-Life Care Research Group, Faculty of Medicine and Health Sciences, Vrije Universiteit Brussel and Ghent University, Ghent, Belgium; Dagje Boeykens: Department of Rehabilitation Sciences, Occupational Therapy, Faculty of Medicine and Health Sciences, University of Ghent, Belgium, and Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium; Ann Van Hecke: Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium, University Centre of Nursing and Midwifery, Faculty of Medicine and Health Sciences, University of Ghent, Belgium; Peter Decat: Department of Public Health and Primary Care, Faculty of Medicine and Health Sciences, University of Ghent, Belgium; Rudi Roose: Department of Social Work and Social Pedagogy, Faculty of Psychology and Educational Sciences, University Ghent, Belgium; Sandra Martin: Expertise Centre Health Innovation, University College Leuven-Limburg, Leuven, Belgium; Erica Rutten: Expertise Centre Health Innovation, University College Leuven-Limburg, Leuven, Belgium; Sam Pless: Expertise Centre Health Innovation, University College Leuven-Limburg, Leuven, Belgium; Anouk Tuinstra: Expertise Centre Health Innovation, University College Leuven-Limburg, Leuven, Belgium; Vanessa Gauwe: Department of Occupational Therapy, Artevelde University of Applied Sciences, Ghent, Belgium; Didier Reynaert: E-QUAL, University College of Applied Sciences Ghent, Ghent, Belgium; Leen Van Landschoot: Department of Nursing, University of Applied Sciences Ghent, Ghent, Belgium; Maja Lopez Hartmann: Department of Welfare and Health, Karel de Grote University of Applied Sciences and Arts, Antwerp, Belgium; Tony Claeys: LiveLab, VIVES University of Applied Sciences, Kortrijk, Belgium; Hilde Vandenhoudt: LiCalab, Thomas University of Applied Sciences, Turnhout, Belgium; Kristel De Vliegher: Department of Nursing–Homecare, White-Yellow Cross, Brussels, Belgium; and Susanne Op de Beeck: Flemish Patient Platform, Heverlee, Belgium.

This research was funded by fund Daniël De Coninck, King Baudouin Foundation, Belgium. The funder had no involvement in this study. Grant number: 2019-J5170820-211,588.

Author information

Peter Raeymaeckers and Sibyl Anthierens have contributed equally to this work and share senior last authorship.

Authors and Affiliations

Department of Family Medicine and Population Health, University of Antwerp, Doornstraat 331, 2610, Antwerp, Belgium

Ine Huybrechts, Emily Verté & Sibyl Anthierens

Department of Family Medicine and Chronic Care, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090, Jette/Brussels, Belgium

Ine Huybrechts & Emily Verté

LUCAS — Centre for Care Research and Consultancy, KU Leuven, Minderbroedersstraat 8/5310, 3000, Leuven, Belgium

Anja Declercq

Center for Sociological Research, Faculty of Social Sciences, KU Leuven, Parkstraat 45/3601, 3000, Leuven, Belgium

Department of Social Work, University of Antwerp, St-Jacobstraat 2, 2000, Antwerp, Belgium

Peter Raeymaeckers

Contributions

IH wrote the main manuscript text. AD, EV, PR, and SA contributed to the different steps of the making of this manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ine Huybrechts.

Ethics declarations

Ethics approval and consent to participate.

The study protocol was approved by the Medical Ethics Committee of the University of Antwerp/Antwerp University Hospital (reference: 2021-1690). All participants received verbal and written information about the purpose and methods of the study and gave written informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Huybrechts, I., Declercq, A., Verté, E. et al. How does the external context affect an implementation processes? A qualitative study investigating the impact of macro-level variables on the implementation of goal-oriented primary care. Implementation Sci 19, 32 (2024). https://doi.org/10.1186/s13012-024-01360-0

Received: 03 January 2024

Accepted: 28 March 2024

Published: 16 April 2024

DOI: https://doi.org/10.1186/s13012-024-01360-0

  • Contingency theory
  • External context
  • Institutional theory
  • Primary care
  • Implementation process
  • Macro-context
  • Network theory
  • Organizational theories
  • Resource dependency theory

Implementation Science

ISSN: 1748-5908

  • Open access
  • Published: 18 April 2024

The predictive power of data: machine learning analysis for Covid-19 mortality based on personal, clinical, preclinical, and laboratory variables in a case–control study

  • Maryam Seyedtabib   ORCID: orcid.org/0000-0003-1599-9374 1 ,
  • Roya Najafi-Vosough   ORCID: orcid.org/0000-0003-2871-5748 2 &
  • Naser Kamyari   ORCID: orcid.org/0000-0001-6245-5447 3  

BMC Infectious Diseases volume 24, Article number: 411 (2024)

Background and purpose

The COVID-19 pandemic has presented unprecedented public health challenges worldwide. Understanding the factors contributing to COVID-19 mortality is critical for effective management and intervention strategies. This study aims to unlock the predictive power of data collected from personal, clinical, preclinical, and laboratory variables through machine learning (ML) analyses.

A retrospective study was conducted in 2022 in a large hospital in Abadan, Iran. Data were collected and categorized into demographic, clinical, comorbid, treatment, initial vital signs, symptoms, and laboratory test groups. The collected data were subjected to ML analysis to identify predictive factors associated with COVID-19 mortality. Five algorithms were used to analyze the data set and derive the latent predictive power of the variables using Shapley additive explanation (SHAP) values.

Results highlight key factors associated with COVID-19 mortality, including age, comorbidities (hypertension, diabetes), specific treatments (antibiotics, remdesivir, favipiravir, vitamin zinc), and clinical indicators (heart rate, respiratory rate, temperature). Notably, specific symptoms (productive cough, dyspnea, delirium) and laboratory values (D-dimer, ESR) also play a critical role in predicting outcomes. This study highlights the importance of feature selection and the impact of data quantity and quality on model performance.

This study highlights the potential of ML analysis to improve the accuracy of COVID-19 mortality prediction and emphasizes the need for a comprehensive approach that considers multiple feature categories. It highlights the critical role of data quality and quantity in improving model performance and contributes to our understanding of the multifaceted factors that influence COVID-19 outcomes.

Introduction

The World Health Organization (WHO) declared COVID-19 a global pandemic in March 2020 [ 1 ]. The first cases of SARS-CoV-2, a new severe acute respiratory syndrome coronavirus, were detected in Wuhan, China, and the virus rapidly spread to become a global public health problem [ 2 ]. The clinical presentation and symptoms of COVID-19 may be similar to those of Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS); however, the rate of spread is higher [ 3 ]. By December 31, 2022, more than 729 million cases and nearly 6.7 million deaths (0.92%) had been confirmed in 219 countries worldwide [ 4 ]. For many countries, figuring out what measures to take to prevent death or serious illness is a major challenge. Due to the complexity of transmission and the lack of proven treatments, COVID-19 is a major challenge worldwide [ 5 , 6 ]. In middle- and low-income countries, the situation is even more catastrophic due to high illiteracy rates, a very poor health care system, and lack of intensive care units [ 5 ]. In addition, understanding the factors contributing to COVID-19 mortality is critical for effective management and intervention strategies [ 6 ].

Numerous studies have shown several factors associated with COVID-19 outcomes, including socioeconomic, environmental, individual demographic, and health factors [ 7 , 8 , 9 ]. Risk factors for COVID-19 mortality vary by study and population studied [ 10 ]. Age [ 11 , 12 ], comorbidities such as hypertension, cardiovascular disease, diabetes, and COPD [ 13 , 14 , 15 ], sex [ 13 ], race/ethnicity [ 11 ], and dementia and neurologic disease [ 16 , 17 ] are some of the factors associated with COVID-19 mortality. Laboratory factors such as elevated levels of inflammatory markers, lymphopenia, elevated creatinine levels, and ALT are also associated with COVID-19 mortality [ 5 , 18 ]. Understanding these multiple risk factors is critical to accurately diagnose and treat COVID-19 patients.

Accurate diagnosis and treatment of the disease requires a comprehensive assessment that considers a variety of factors. These factors include personal factors such as medical history, lifestyle, and genetics; clinical factors such as observations on physical examinations and physician reports; preclinical factors such as early detection through screening or surveillance; laboratory factors such as results of diagnostic tests and medical imaging; and patient-reported signs and symptoms. However, the variety of characteristics associated with COVID-19 makes it difficult for physicians to accurately classify COVID-19 patients during the pandemic.

In today's digital transformation era, machine learning plays a vital role in various industries, including healthcare, where substantial data is generated daily [ 19 , 20 , 21 ]. Numerous studies have explored machine learning (ML) and explainable artificial intelligence (AI) in predicting COVID-19 prognosis and diagnosis [ 22 , 23 , 24 , 25 ]. Chadaga et al. have developed decision support systems and triage prediction systems using clinical markers and biomarkers [ 22 , 23 ]. Similarly, Khanna et al. have developed an ML and explainable AI system for COVID-19 triage prediction [ 24 ]. Zoabi has developed ML models that predict COVID-19 test results with high accuracy based on a small number of features such as gender, age, contact with an infected person, and initial clinical symptoms [ 25 ]. These studies emphasize the potential of ML and explainable AI to improve COVID-19 prediction and diagnosis. Nonetheless, the efficacy of ML algorithms relies heavily on the quality and quantity of the data used for training. Recent research has indicated that the performance of deep learning algorithms can be significantly enhanced compared to traditional ML methods by increasing the volume of data used [ 26 ]. However, the impact of data volume on model performance can vary based on data characteristics and experimental setup, which highlights the need for careful consideration and analysis when selecting data for model training. While these studies emphasize the importance of features in training ML algorithms for COVID-19 prediction and diagnosis, additional research is required on methods to enhance the interpretability of features.

Therefore, the primary aim of this study is to identify the key factors associated with mortality in COVID-19 patients admitted to hospitals in Abadan, Iran. For this purpose, seven categories of factors were selected (demographic, clinical and conditions, comorbidities, treatments, initial vital signs, symptoms, and laboratory tests), and machine learning algorithms were employed. The predictive power of the data was assessed using 139 predictor variables across the seven feature sets. Our second goal is to improve the interpretability of the extracted important features. To achieve this goal, we use SHAP analysis, which illustrates the impact of features through a diagram.

Materials and methods

Study population and data collection.

Using data from the COVID-19 hospital-based registry database, a retrospective study was conducted from April 2020 to December 2022 at Ayatollah Talleghani Hospital (a COVID‑19 referral center) in Abadan City, Iran.

A total of 14,938 patients were initially screened for eligibility for the study. Of these, 9509 patients were excluded because their reverse transcription polymerase chain reaction (RT-PCR) test results were negative or unspecified. In addition, 1623 patients were excluded because their medical records contained more than 70% incomplete or missing data, and patients younger than 18 years were not included in the study. The exclusion of patients due to incomplete or missing data is a common issue in medical research, particularly in the use of electronic medical records (EMRs) [ 27 ]. The 70% criterion means that the medical records of the excluded patients did not contain at least 30% of the data required for a meaningful analysis. This threshold was set to ensure that the dataset used for the study contained a sufficient amount of complete and reliable information to draw accurate conclusions. Incomplete or missing data in a medical record may relate to key variables such as patient demographics, symptoms, lab results, treatment information, outcomes, or other data points important to the research. Insufficient data can affect the validity and reliability of study results and lead to potential bias or inaccuracies in the findings, so excluding such records helps maintain the quality and integrity of the research findings. After these exclusions, 3806 patients remained. Of these patients, 474 died due to COVID-19, while the remaining 3332 patients recovered and were included in the control group. To obtain a balanced sample, the control group was selected using propensity score matching (PSM).
The PSM refers to a statistical technique used to create a balanced comparison group by matching individuals in the control group (in this case, the survived group) with individuals in the case group (in this case, the deceased group) based on their propensity scores. In this study, the propensity scores for each person represented the probability of death (coded as a binary outcome; survived = 0, deceased = 1) calculated from a set of covariates (demographic factors) using the matchit function from the MatchIt library. Two individuals, one from the deceased group and one from the survived group, are considered matched if the difference between their propensity scores is small. Non-matching participants are discarded. The matching aims to reduce bias by making the distribution of observed characteristics similar between groups, which ultimately improves the comparability of groups in observational studies [ 28 ]. In total, the study included 1063 COVID-19 patients who belonged to either the deceased group (case = 474) or the survived group (control = 589) (Fig.  1 ).
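The matching step described above can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated, and it uses greedy nearest-neighbor matching with a caliper, which is one common variant; the paper does not report the settings passed to matchit, so the function name, the caliper value, and the scores below are hypothetical and this is not a reproduction of the authors' procedure.

```python
def greedy_caliper_match(case_scores, control_scores, caliper=0.05):
    """Pair each case with the nearest unused control within the caliper.

    Returns a list of (case_index, control_index) pairs. Cases with no
    control inside the caliper are left unmatched.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(case_scores):
        if not available:
            break
        # Nearest remaining control by absolute propensity-score distance
        # (ties broken by the lower control index).
        j = min(available, key=lambda k: (abs(control_scores[k] - score), k))
        if abs(control_scores[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical propensity scores, for illustration only.
cases = [0.62, 0.35, 0.90]
controls = [0.33, 0.60, 0.10, 0.65]
matched = greedy_caliper_match(cases, controls, caliper=0.05)
```

In this toy input the third case has no control within the caliper, so it stays unmatched, which mirrors the statement that non-matching participants are discarded.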

Fig. 1. Flowchart describing the process of patient selection

In the COVID-19 hospital-based registry database, 140 primary features in eight main classes were recorded for COVID-19 patients: patient demographics (eight features), clinical and conditions features (16 features), comorbidities (18 features), treatment (17 features), initial vital signs (14 features), symptoms during hospitalization (31 features), laboratory results (35 features), and an output (0 for survived and 1 for deceased). The main features included in the hospital-based COVID-19 registry database are provided in Appendix Table 1.

To ensure the accuracy of the recorded information, discharged patients or their relatives were called and asked to review some of the recorded information (demographic information, symptoms, and medical history). Clinical symptoms and vital signs were referenced to the first day of hospitalization (at admission). Laboratory test results were also referenced to the patient’s first blood sample at the time of hospitalization.

The study analyzed 140 variables in patients' records, normalizing continuous variables and creating a binary feature to categorize patients based on outcomes. To address the issue of an imbalanced dataset, the Synthetic Minority Over-sampling Technique (SMOTE) was utilized. Some classes were combined to simplify variables. For missing data, an imputation technique was applied, assuming a random distribution [ 29 ]. Little's MCAR test was performed with the naniar package to assess whether missing data in a dataset is missing completely at random (MCAR) [ 30 ]. The null hypothesis in this test is that the data are MCAR, and the test statistic is a chi-square value.
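The idea behind SMOTE can be sketched in a few lines: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors. The function below is a simplified illustration under that assumption; the name `smote_sketch`, the value of `k`, and the sample points are hypothetical, and the authors' actual implementation and parameters are not reported.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority samples (already normalized).
minority = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.3, 0.25)]
new_points = smote_sketch(minority, n_new=3)
```

Because every synthetic point is an interpolation of two real minority samples, it always falls inside the range already covered by the minority class.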

The Ethics Committee of Abadan University of Medical Science approved the research protocol (No. IR.ABADANUMS.REC.1401.095).

Predictor variables

All data were collected in eight categories, including demographic, clinical and conditions, comorbidities, treatment, initial vital signs, symptoms, and laboratory tests in medical records, for a total of 140 variables.

The "Demographics" category encompasses eight features, three of which are binary variables and five of which are categorical. The "Clinical Conditions" category includes 16 features, comprising one quantitative variable, 12 binary variables, and five categorical features. "Comorbidities", "Treatment", and "Symptoms" have 18, 17, and 30 binary features, respectively. There is also one quantitative variable in the "Symptoms" category. The "Initial Vital Signs" category features 11 quantitative variables, two binary variables, and one categorical variable. Finally, the "Laboratory Tests" category comprises 35 features, with 33 being quantitative, one categorical, and one binary (Appendix Table 1).

Outcome variable

The primary outcome variable was mortality, with December 31, 2022, as the last date of follow‐up. The feature shows the class variable, which is binary. For any patient in the survivor group, the outcome is 0; otherwise, it is 1. In this study, 44.59% ( n  = 474) of the samples were in the deceased group and were labeled 1.

Data balancing

In case–control studies, it is common to have groups of unequal size since cases are typically fewer than controls [ 31 ]. However, in case–control studies with equal sizes, data balancing may not be necessary for ML algorithms [ 32 ]. When using ML algorithms, data balancing is generally important when there is an imbalance between classes, i.e., when one class has significantly fewer observations than the other [ 33 ]. In such cases, balancing can improve the performance of the algorithm by reducing the bias in favor of the majority class [ 34 ]. For case–control studies of the same size, the balance of the classes has already been reached and balancing may not be necessary. However, it is always recommended to evaluate the performance of the ML algorithm with the given data set to determine the need for data balancing. This is because unbalanced case–control ratios can inflate type I error rates, whereas type I error rates can be deflated in balanced studies [ 35 ].

Feature selection

Feature selection is about selecting important variables from a large dataset to be used in an ML model to achieve better performance and efficiency. Another goal of feature selection is to reduce computational effort by eliminating irrelevant or redundant features [ 36 , 37 ]. Before generating predictions, it is important to perform feature selection to improve the accuracy of clinical decisions and reduce errors [ 37 ]. To identify the best predictors, researchers often compare the effectiveness of different feature selection methods. In this study, we used five common methods, including Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF), to select relevant features for predicting mortality of COVID-19 patients. To avoid overfitting, we performed ten-fold cross-validation when training our dataset. This approach may help ensure that our model is optimized for accurate predictions of health status in COVID-19 patients.

Model development, evaluation, and clarity

In this study, the predictive models were developed with five ML algorithms, including DT, XGBoost, SVM, NB, and RF, using the R programming language (v4.3.1) and its packages [ 38 ]. We used cross-validation (CV) to tune the hyperparameters of our models based on the training subset of the dataset. For training and evaluating our ML models, we used a common technique called ten-fold cross-validation [ 39 ]. The primary training dataset was divided into ten folds, each containing 10% of the total data, using stratified random sampling. For each of the 30% of the data, an ML model was built and trained on the remaining 70% of the data. The performance of the model was then evaluated on the 30%-fold sample. This process was repeated 100 times with different training and test combinations, and the average performance was reported.
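A minimal sketch of stratified k-fold splitting may help make the resampling scheme concrete. It assumes a plain round-robin assignment within each class and a caller-supplied scoring function; the function names are hypothetical, the class counts are those reported for this study, and the sketch does not reproduce the authors' R code or their repeated-resampling setup.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds, keeping the class
    proportions roughly equal across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validate(labels, fit_and_score, k=10, seed=0):
    """Hold out each fold once and average the score returned by the
    caller's fit_and_score(train_indices, test_indices) function."""
    folds = stratified_folds(labels, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)


# Class counts taken from the study: 589 survivors (0) and 474 deaths (1).
labels = [0] * 589 + [1] * 474
folds = stratified_folds(labels, k=10)
```

With these counts, every fold holds 47 or 48 deceased patients, so each held-out fold has nearly the same case proportion as the full sample.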

Performance measures include sensitivity (recall), specificity, accuracy, F1-score, and the area under the receiver operating characteristic curve (AUC ROC). Sensitivity is defined as TP / (TP + FN), whereas specificity is TN / (TN + FP). Accuracy is (TP + TN) / total. F1-score is defined as the harmonic mean of Precision and Recall with equal weight, where Precision equals TP / (TP + FP). Also, AUC refers to the area under the ROC curve. In the evaluation of ML techniques, values were classified as poor if below 50%, ok if between 50 and 80%, good if between 80 and 90%, and very good if greater than 90%. These criteria are commonly used in reporting model evaluations [ 40 , 41 ].
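As a worked illustration, the metrics can be computed directly from the four cells of a confusion matrix using the standard definitions (precision = TP / (TP + FP), accuracy = (TP + TN) / total). The function name and the counts below are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the standard confusion-matrix metrics as a dict."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Harmonic mean of precision and recall with equal weight.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

For these counts the accuracy is 0.85 and the precision is 0.80, which shows that the two quantities are distinct even on the same confusion matrix.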

Finally, the SHapley Additive exPlanations (SHAP) method was used to provide clarity and understanding of the models. SHAP uses cooperative game theory to determine how each feature contributes to the prediction of ML models. This approach allows the computation of the contribution of each feature to the model output [ 42 , 43 ]. For this purpose, the package shapr was used, which includes a modified iteration of the kernel SHAP approach that takes into account the interdependence of the features when computing the Shapley values [ 44 ].
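The game-theoretic idea can be made concrete with an exact Shapley computation on a toy example: each feature's value is its marginal contribution averaged over every ordering of the features. This is a sketch only. Kernel SHAP and shapr approximate these values for real models, whereas the code below enumerates all orderings, which is feasible only for a handful of features; the toy value function and its feature names are hypothetical.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering of the features. `value` maps a frozenset of
    feature names to the model output for that coalition."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        seen = frozenset()
        for f in order:
            with_f = seen | {f}
            totals[f] += value(with_f) - value(seen)
            seen = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


# Hypothetical toy "model": additive effects plus one interaction term.
def toy_value(coalition):
    score = 0.0
    if "d_dimer" in coalition:
        score += 0.30
    if "age" in coalition:
        score += 0.10
    if {"d_dimer", "age"} <= coalition:
        score += 0.06  # interaction shared equally between the two features
    return score


phi = shapley_values(["d_dimer", "age", "sex"], toy_value)
```

The values sum to the difference between the output for all features and the output for none, which is the additivity property that gives SHAP its name.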

Results

Patient characteristics

Table 1 shows the baseline characteristics of patients infected with COVID-19, including demographic data such as age and sex and other factors such as occupation, place of residence, marital status, education level, BMI, and season of admission. A total of 1063 adult patients (≥ 18 years) were enrolled in the study, of whom 589 (55.41%) survived and 474 (44.59%) died. Analysis showed that age was significantly different between the two groups, with a mean age of 54.70 ± 15.60 years in the survivor group versus 65.53 ± 15.18 years in the deceased group ( P < 0.001). There was also a significant association between age and survival, with a higher proportion of patients aged < 40 years in the survivor group (77.0%) than in the deceased group (23.0%) ( P < 0.001). No significant differences were found between the two groups in terms of sex, occupation, place of residence, marital status, and time of admission. However, there was a significant association between educational level and survival, with a lower proportion of patients with a college degree in the deceased group (37.2%) than in the survivor group (62.8%) ( P = 0.017). BMI also differed significantly between the two groups, with the proportion of patients with a BMI > 30 kg/m² being higher in the deceased group (56.5%) than in the survivor group (43.5%) ( P < 0.001).

Clinical and conditions

Table 2 provides important insights into the various clinical and condition characteristics associated with COVID-19 infection outcomes. The results show that patients who survived the infection had a significantly shorter hospitalization time (2.20 ± 1.63 days) compared to those who died (4.05 ± 3.10 days) ( P < 0.001). Patients who were admitted as elective cases had a higher survival rate (84.6%) compared to those who were admitted as urgent (61.3%) or emergency (47.4%) cases. There were no significant differences with regard to the number of infections or family infection history. However, patients who had a history of travel had a lower death rate (40.1%).

A significantly higher proportion of deceased patients had cases requiring CPR (54.7% vs. 45.3%). Patients who had underlying medical conditions had a significantly lower survival rate (38.3%), with hyperlipidemia being the most prevalent condition (18.7%). Patients who had a history of alcohol consumption (12.5%), transplantation (30.0%), chemotropic (21.4%) or special drug use (0.0%), and immunosuppressive drug use (30.0%) also had a lower survival rate. Pregnant patients (44.4%) had similar survival outcomes compared to non-pregnant patients (55.6%). Patients who were recent or current smokers (36.4%) also had a significantly lower survival rate.

Comorbidities

Table 3 summarizes the comorbidity characteristics of COVID-19 infected patients. Out of 1063 patients, 54.84% had comorbidities. Chi-Square tests for individual comorbidities showed that most of them had a significant association with COVID-19 outcomes, with P -values less than 0.05. Among the various comorbidities, hypertension (HTN) and diabetes mellitus (DM) were the most prevalent, with 12% and 11.5% of patients having these conditions, respectively. The highest fatality rates were observed among patients with cardiovascular disease (95.5%), chronic kidney disease (62.5%), gastrointestinal (GI) (93.3%), and liver diseases (73.3%). Conversely, patients with neurology comorbidities had the lowest fatality rate (0%). These results highlight the significant role of comorbidities in COVID-19 outcomes and emphasize the need for special attention to be paid to patients with pre-existing health conditions.

Treatment

The treatment characteristics of the COVID-19 patients and the resulting outcomes are shown in Table 4. The table shows the frequency of patients who received different types of medications or therapies during their treatment. According to the results, the use of antibiotics (35.1%), remdesivir (29.6%), favipiravir (36.0%), and vitamin zinc (33.5%) was significantly associated with a lower mortality rate ( P < 0.001), suggesting that these medications may have a positive impact on patient outcomes. On the other hand, the use of heparin (66.1%), insulin (82.6%), antifungals (89.6%), ACE inhibitors (78.1%), and angiotensin II receptor blockers (ARBs) (83.8%) was significantly associated with increased mortality ( P < 0.001), suggesting that these medications may have a negative effect on the patient's outcome. Taking hydroxychloroquine (51.0%) also appears to be associated with a worse outcome, at a lower level of significance ( P = 0.022). The use of Atrovent, corticosteroids, and non-steroidal anti-inflammatory drugs (NSAIDs) did not show a significant association with survival or mortality rates. Similarly, the use of intravenous immunoglobulin (IVIg), vitamin C, vitamin D, and diuretics did not show a significant association with the patient's outcome.

Initial vital signs

Table 5 provides initial vital sign characteristics of COVID-19 patients, including heart rate, respiratory rate, temperature, blood pressure, oxygen therapy, and radiography test result. The findings show that deceased patients had higher HR (83.03 bpm vs. 76.14 bpm, P < 0.001), lower RR (11.40 bpm vs. 16.25 bpm, P < 0.001), higher temperature (37.43 °C vs. 36.91 °C, P < 0.001), higher SBP (128.16 mmHg vs. 123.33 mmHg, P < 0.001), and higher O2 requirements (invasive: 75.0% vs. 25.0%, P < 0.001) compared to the survived patients. Additionally, deceased patients had higher MAP (99.35 mmHg vs. 96.08 mmHg, P = 0.005) and lower SpO2 percentage (81.29% vs. 91.95%, P < 0.001) compared to the survived patients. Furthermore, deceased patients had higher PEEP levels (5.83 cmH2O vs. 0.69 cmH2O, P < 0.001), higher FiO2 levels (51.43% vs. 8.97%, P < 0.001), and more frequent bilateral pneumonia (63.0% vs. 37.0%, P < 0.001) compared to the survived patients. There appears to be no relationship between diastolic blood pressure and treatment outcome (83.44 mmHg vs. 85.61 mmHg).

Symptoms

Table 6 provides information on the symptoms of patients infected with COVID-19 by survival outcome. The table also shows the frequency of symptoms among patients. The most common symptom reported by patients was fever, which occurred in 67.0% of surviving and deceased patients. Dyspnea and nonproductive cough were the second and third most common symptoms, reported by 40.4% and 29.3% of the total sample, respectively. Other common symptoms listed in the table were malodor (28.7%), dyspepsia (28.4%), and myalgia (25.6%).

The P -values reported in the table show that some symptoms are significantly associated with death, including productive cough, dyspnea, sore throat, headache, delirium, olfactory symptoms, dyspepsia, nausea, vomiting, sepsis, respiratory failure, heart failure, MODS, coagulopathy, secondary infection, stroke, acidosis, and admission to the intensive care unit. Surviving and deceased patients also differed significantly in the average number of days spent in the ICU. There was no significant association between patient outcomes and symptoms such as nonproductive cough, chills, diarrhea, chest pain, and hyperglycemia.

Laboratory tests

Table 7 shows the laboratory values of COVID-19 patients with the average values of the different laboratory results. The results show that the deceased patients had significantly lower levels of red blood cells (3.78 × 10⁶/µL vs. 5.01 × 10⁶/µL), hemoglobin (11.22 g/dL vs. 14.10 g/dL), and hematocrit (34.10% vs. 42.46%), whereas basophils and white blood cells did not differ significantly between the two groups. The percentage of neutrophils (65.59% vs. 62.58%) and monocytes (4.34% vs. 3.93%) was significantly higher in deceased patients, while the percentage of lymphocytes and eosinophils did not differ significantly between the two groups. In addition, deceased patients had higher levels of certain biomarkers, including D-dimer (1.347 mgFEU/L vs. 0.155 mgFEU/L), lactate dehydrogenase (174.61 U/L vs. 128.48 U/L), aspartate aminotransferase (93.09 U/L vs. 39.63 U/L), alanine aminotransferase (74.48 U/L vs. 28.70 U/L), alkaline phosphatase (119.51 IU/L vs. 81.34 IU/L), creatine phosphokinase-MB (4.65 IU/L vs. 3.33 IU/L), and positive troponin I (56.5% vs. 43.5%). The proportion of patients with positive C-reactive protein was also higher in the deceased group.

Other laboratory values with statistically significant differences between the two groups ( P  < 0.001) were INR, ESR, BUN, Cr, Na, K, P, PLT, TSH, T3, and T4. The surviving patients generally had lower values in these laboratory characteristics than the deceased patients.

Model performance and evaluation

Five ML algorithms, namely DT, XGBoost, SVM, NB, and RF, were used in this study to build COVID-19 mortality prediction models. The models were based on the optimal feature set selected in a previous step and were trained on the same data set. The effectiveness of the models was evaluated by calculating sensitivity, specificity, accuracy, F1 score, and AUC metrics. Table 8 shows the results of this performance evaluation. The values from the test set are expressed as the mean (standard deviation).

The results show that the performance of the models varies widely in the different feature categories. The Laboratory Tests category achieved the highest performance, with all models scoring 100% in all metrics. The Symptoms and initial Vital Signs categories also show high performance, with XGBoost achieving the highest accuracy of 98.03% and DT achieving the highest sensitivity of 92.79%.

The Clinical and Conditions category also showed high performance, with all models showing accuracy above 91%. XGBoost achieved the highest sensitivity and specificity of 92.74% and 92.96%, respectively. In contrast, the Demographics category showed the lowest performance, with all models achieving less than 66.5% accuracy.

In summary, the results suggest that certain feature categories may be more useful than others in predicting mortality from COVID-19 and that some ML models may perform better than others depending on the feature category used.

Feature importance

SHapley Additive exPlanations (SHAP) values indicate the importance or contribution of each feature in predicting model output. These values help to understand the influence and importance of each feature on the model's decision-making process.

In Fig.  2 , the mean absolute SHAP values are shown to depict global feature importance. Figure  2 shows the contribution of each feature within its respective group as calculated by the XGBoost prediction model using SHAP. According to the SHAP method, the features that had the greatest impact on predicting COVID-19 mortality were, in descending order: D-dimer, CPR, PEEP, underlying disease, ESR, antifungal treatment, PaO2, age, dyspnea, and nausea.
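Global importance of this kind is obtained by averaging the absolute SHAP value of each feature over all patients and ranking the results. The sketch below assumes a matrix of per-patient SHAP values is already available; the function name, the feature names, and the numbers are hypothetical and do not come from the study.

```python
def global_importance(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across patients."""
    n = len(shap_rows)
    means = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-patient SHAP values (rows) for three features (columns).
names = ["d_dimer", "age", "nausea"]
rows = [
    [0.40, -0.10, 0.02],
    [-0.30, 0.20, -0.01],
    [0.50, -0.15, 0.03],
]
ranking = global_importance(rows, names)
```

Taking absolute values before averaging keeps features whose effects push in opposite directions for different patients from cancelling out.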

Fig. 2. Feature importance based on SHAP values. The mean absolute SHAP values are depicted to illustrate global feature importance. The SHAP values change in the spectrum from dark (higher) to light (lower) color

On the other hand, Fig.  3 presents the local explanation summary that indicates the direction of the relationship between a variable and COVID-19 outcome. As shown in Fig.  3 (I to VII), older age and very low BMI were the two demographic factors with the greatest impact on model outcome, followed by clinical factors such as higher CPR, hospitalization, and hyperlipidemia. Higher mortality rates were associated with patients who smoked and had traveled in the past 14 days. Patients with underlying diseases, especially HTN, died more frequently. In contrast, the use of remdesivir, Vit Zn, and favipiravir is associated with lower mortality. Initial vital signs such as high PEEP, low PaO2 and RR had the greatest impact, as did symptoms such as dyspnea, MODS, sore throat and LOC. A higher risk of mortality is observed in patients with higher D-dimer levels and ESR as the most consequential laboratory tests, followed by K, AST and CPK-MB.

Fig. 3. The SHAP-based feature importance of all categories (I to VII) for COVID-19 mortality prediction, calculated with the XGBoost model. The local explanatory summary shows the direction of the relationship between a feature and patient outcome. Positive SHAP values indicate death, whereas negative SHAP values indicate survival. As the color scale shows, higher values are blue while lower values are orange

Using the feature types listed in Appendix Table  1 , Fig.  4 shows that the performance of ML algorithms can be improved by increasing the number of features used in training, especially in distinguishing between symptoms, comorbidities, and treatments. In addition, the amount and quality of data used for training can significantly affect algorithm performance, with laboratory tests being more informative than initial vital signs. Regarding the influence of features, quantitative features tend to have a more positive effect on performance than qualitative features; clinical conditions tend to be more informative than demographic data. Thus, both the amount of data and the type of features used have a significant impact on the performance of ML algorithms.

Fig. 4. Association between feature sets and performance of machine learning algorithms in predicting COVID-19 mortality

Discussion

The COVID-19 pandemic has presented unprecedented public health challenges worldwide and requires a deep understanding of the factors contributing to COVID-19 mortality to enable effective management and intervention. This study used machine learning analysis to uncover the predictive power of an extensive dataset that includes a wide range of personal, clinical, preclinical, and laboratory variables associated with COVID-19 mortality.

This study confirms previous research on COVID-19 outcomes that highlighted age as a significant predictor of mortality [ 45 , 46 , 47 ], along with comorbidities such as hypertension and diabetes [ 48 , 49 ]. Underlying conditions such as cardiovascular and renal disease also contribute to mortality risk [ 50 , 51 ].

Regarding treatment, antibiotics, remdesivir, favipiravir, and vitamin zinc are associated with lower mortality [ 52 , 53 ], whereas heparin, insulin, antifungals, ACE inhibitors, and ARBs are associated with higher mortality [ 54 ]. This underscores the importance of drug choice in COVID-19 treatment.

Initial vital signs such as heart rate, respiratory rate, temperature, and oxygen therapy differ between surviving and deceased patients [ 55 ]. Deceased patients often have increased heart rate, lower respiratory rate, higher temperature, and increased oxygen requirements, which can serve as early indicators of disease severity.

Symptoms such as productive cough, dyspnea, and delirium are significantly associated with COVID-19 mortality, emphasizing the need for immediate monitoring and intervention [ 56 ]. Laboratory tests show altered hematologic and biochemical markers in deceased patients, underscoring the importance of routine laboratory monitoring in COVID-19 patients [ 57 , 58 ].

ML algorithms were used in the study to predict COVID-19 mortality based on these multilayered variables. XGBoost and Random Forest performed better than other algorithms and had high recall, specificity, accuracy, F1 score, and AUC. This highlights the potential of ML, particularly the XGBoost algorithm, in improving prediction accuracy for COVID-19 mortality [ 59 ]. However, the study's findings differ from those of Moulaei [ 60 ], Nopour [ 61 ], and Mehraeen [ 62 ] in terms of the best-performing ML algorithm and the most influential variables. While Moulaei [ 60 ] found that the random forest algorithm had the best performance, Nopour [ 61 ] and Ikemura [ 63 ] identified the artificial neural network and stacked ensemble models, respectively, as the most effective. Additionally, the most influential variables in predicting mortality varied across the studies, with Moulaei [ 60 ] highlighting dyspnea, ICU admission, and oxygen therapy, and Ikemura [ 63 ] identifying systolic and diastolic blood pressure, age, and other biomarkers. These differences may be attributed to variations in the datasets, feature selection, and model training.

However, it is important to note that the choice of algorithm should be tailored to the specific dataset and research question. In addition, the results suggest that a comprehensive approach that incorporates different feature categories may lead to more accurate prediction of COVID-19 mortality. In general, the results suggest that the performance of ML models is influenced by the number and type of features in each category. While some models consistently perform well across different categories (e.g., XGBoost), others perform better for specific types of features (e.g., SVM for Demographics).

Analysis of the importance of characteristics using SHAP values revealed critical factors affecting model results. D-dimer values, CPR, PEEP, underlying diseases, and ESR emerged as the most important features, highlighting the importance of these variables in predicting COVID-19 mortality. These results provide valuable insights into the underlying mechanisms and risk factors associated with severe COVID-19 outcomes.

The types of features used in ML models fall into two broad categories: quantitative (numerical) and qualitative (binary or categorical). The performance of ML methods can vary depending on the type of features used. Some algorithms work better with quantitative features, while others work better with qualitative features. For example, decision trees and random forests work well with both types of features [ 64 ], while neural networks often work better with quantitative features [ 65 , 66 ]. Accordingly, we consider these levels for the features under study to better assess the impact of the data.

The success of ML algorithms depends largely on the quality and quantity of the data on which they are trained [ 67 , 68 , 69 ]. Recent research, including the 2021 study by Sarker IH. [ 26 ], has shown that a larger amount of data can significantly improve the performance of deep learning algorithms compared to traditional machine learning techniques. However, it should be noted that the effect of data size on model performance depends on several factors, such as data characteristics and experimental design. This underscores the importance of carefully and judiciously selecting data for training.

Limitations

One of the limitations of this study is that it relies on data collected from a single hospital in Abadan, Iran. The data may not be representative of the diversity of COVID-19 cases in different regions, and there may be differences in data quality and completeness. In addition, retrospectively collected data may have biases and inaccuracies. Although the study included a substantial number of COVID-19 patients, the sample size may still limit the generalizability of the results, especially for less common subgroups or certain demographic characteristics.

Future works

Future studies could adopt a multi-center approach to improve the scope and depth of research on COVID-19 outcomes. This could include working with multiple hospitals in different regions of Iran to ensure a more diverse and representative sample. By conducting prospective studies, researchers can collect data in real time, which reduces the biases associated with retrospective data collection and increases the reliability of the results. Increasing sample size, conducting longitudinal studies to track patient progression, and implementing quality assurance measures are critical to improving generalizability, understanding long-term effects, and ensuring data accuracy in future research efforts. Collectively, these strategies aim to address the limitations of individual studies and make an important contribution to a more comprehensive understanding of COVID-19 outcomes in different populations and settings.

Conclusions

In summary, this study demonstrates the potential of ML algorithms in predicting COVID-19 mortality based on a comprehensive set of features. In addition, SHAP-based feature importance improved the interpretability of the models and revealed the variables strongly correlated with mortality. This study highlights the power of data-driven approaches in addressing critical public health challenges such as the COVID-19 pandemic. The results suggest that the performance of ML models is influenced by the number and type of features in each feature set. These findings may be a valuable resource for health professionals to identify high-risk COVID-19 patients and allocate resources effectively.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

World Health Organization

Middle east respiratory syndrome

Severe acute respiratory syndrome

Reverse transcription polymerase chain reaction

Propensity score matching

Synthetic minority over-sampling technique

Missing completely at random

Decision tree

EXtreme gradient boosting

Support vector machine

Naïve bayes

Random forest

Cross-validation

True positive

True negative

False positive

False negative

Machine learning

Artificial Intelligence

Shapley additive explanations

Cardiopulmonary Resuscitation

Hypertension

Diabetes mellitus

Cardiovascular disease

Chronic Kidney disease

Chronic obstructive pulmonary disease

Human immunodeficiency virus

Hepatitis B virus

Such as influenza, pneumonia, asthma, bronchitis, and chronic obstructive airways disease

Gastrointestinal

Such as epilepsy, learning disabilities, neuromuscular disorders, autism, ADD, brain tumors, and cerebral palsy

Such as fatty liver disease and cirrhosis

Blood disease

Skin diseases

Mental disorders

Intravenous immunoglobulin

Non-steroidal anti-Inflammatory drugs

Angiotensin converting enzyme inhibitors

Angiotensin II receptor blockers

Beats per minute

Respiratory rate

Temperatures

Systolic blood pressure

Diastolic blood pressure

Mean arterial pressure

Oxygen saturation

Partial pressure of oxygen in the alveoli

Positive end-expiratory pressure

Fraction of inspired oxygen

Radiography (X-ray) test result

Smell disorders

Indigestion

Level of consciousness

Multiple organ dysfunction syndrome

Coughing up blood; Coagulopathy: bleeding disorder

High blood glucose

Intensive care unit

Red blood cell

White blood cell

Low-density lipoprotein

High-density lipoprotein

Prothrombin time

Partial thromboplastin time

International normalized ratio

Erythrocyte sedimentation rate

C-reactive-protein

Lactate dehydrogenase

Aspartate aminotransferase

Alanine aminotransferase

Alkaline phosphatase

Creatine phosphokinase-MB

Blood urea nitrogen

Thyroid stimulating hormone

Triiodothyronine

Coronavirus disease (COVID-19) pandemic. Available from: https://www.who.int/europe/emergencies/situations/covid-19 . [cited 2023 Sep 5].

Moolla I, Hiilamo H. Health system characteristics and COVID-19 performance in high-income countries. BMC Health Serv Res. 2023;23(1):1–14. https://doi.org/10.1186/s12913-023-09206-z . [cited 2023 Sep 5].

Peeri NC, Shrestha N, Rahman MS, Zaki R, Tan Z, Bibi S, et al. The SARS, MERS and novel coronavirus (COVID-19) epidemics, the newest and biggest global health threats: what lessons have we learned? Int J Epidemiol. 2020;49(3):717–26.

WHO Coronavirus (COVID-19) Dashboard | WHO Coronavirus (COVID-19) Dashboard With Vaccination Data. Available from: https://covid19.who.int/ . [cited 2023 Sep 5].

Dessie ZG, Zewotir T. Mortality-related risk factors of COVID-19: a systematic review and meta-analysis of 42 studies and 423,117 patients. BMC Infect Dis. 2021;21(1):1–28. https://doi.org/10.1186/s12879-021-06536-3 . [cited 2023 Sep 5].

Wong ELY, Ho KF, Wong SYS, Cheung AWL, Yau PSY, Dong D, et al. Views on Workplace Policies and its Impact on Health-Related Quality of Life During Coronavirus Disease (COVID-19) Pandemic: Cross-Sectional Survey of Employees. Int J Heal Policy Manag. 2022;11(3):344–53. Available from: https://www.ijhpm.com/article_3879.html .

Drefahl S, Wallace M, Mussino E, Aradhya S, Kolk M, Brandén M, et al. A population-based cohort study of socio-demographic risk factors for COVID-19 deaths in Sweden. Nat Commun. 2020;11(1):5097.

Islam N, Khunti K, Dambha-Miller H, Kawachi I, Marmot M. COVID-19 mortality: a complex interplay of sex, gender and ethnicity. Eur J Public Health. 2020;30(5):847–8.

Sarmadi M, Marufi N, Moghaddam VK. Association of COVID-19 global distribution and environmental and demographic factors: An updated three-month study. Environ Res. 2020;188:109748.

Aghazadeh-Attari J, Mohebbi I, Mansorian B, Ahmadzadeh J, Mirza-Aghazadeh-Attari M, Mobaraki K, et al. Epidemiological factors and worldwide pattern of Middle East respiratory syndrome coronavirus from 2013 to 2016. Int J Gen Med. 2018;11:121–5.

Risk of COVID-19-Related Mortality. Available from: https://www.cdc.gov/coronavirus/2019-ncov/science/data-review/risk.html . [cited 2023 Aug 26].

Bhaskaran K, Bacon S, Evans SJW, Bates CJ, Rentsch CT, MacKenna B, et al. Factors associated with deaths due to COVID-19 versus other causes: population-based cohort analysis of UK primary care data and linked national death registrations within the OpenSAFELY platform. Lancet Reg Heal. 2021;6:100-9.

Dessie ZG, Zewotir T. Mortality-related risk factors of COVID-19: a systematic review and meta-analysis of 42 studies and 423,117 patients. BMC Infect Dis. 2021;21(1):855. https://doi.org/10.1186/s12879-021-06536-3 .

Talebi SS, Hosseinzadeh A, Zare F, Daliri S, JamaliAtergeleh H, Khosravi A, et al. Risk Factors Associated with Mortality in COVID-19 Patient’s: Survival Analysis. Iran J Public Health. 2022;51(3):652–8.

Singh J, Alam A, Samal J, Maeurer M, Ehtesham NZ, Chakaya J, et al. Role of multiple factors likely contributing to severity-mortality of COVID-19. Infect Genet Evol J Mol Epidemiol Evol Genet Infect Dis. 2021;96:105101.

Bhaskaran K, Bacon S, Evans SJ, Bates CJ, Rentsch CT, MacKenna B, et al. Factors associated with deaths due to COVID-19 versus other causes: population-based cohort analysis of UK primary care data and linked national death registrations within the OpenSAFELY platform. Lancet Reg Heal - Eur. 2021;6:100109. Available from:  https://www.pmc/articles/PMC8106239/ . [cited 2023 Aug 26].

Ge E, Li Y, Wu S, Candido E, Wei X. Association of pre-existing comorbidities with mortality and disease severity among 167,500 individuals with COVID-19 in Canada: A population-based cohort study. PLoS One. 2021;16(10):e0258154. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0258154 . [cited 2023 Aug 26].

Tian S, Liu H, Liao M, Wu Y, Yang C, Cai Y, et al. Analysis of mortality in patients with COVID-19: clinical and laboratory parameters. Open Forum Infect Dis. 2020;7(5). Available from:  https://dx.doi.org/10.1093/ofid/ofaa152 . [cited 2023 Aug 26].

Rashidi HH, Tran N, Albahra S, Dang LT. Machine learning in health care and laboratory medicine: General overview of supervised learning and Auto-ML. Int J Lab Hematol. 2021;43:15–22.

Najafi-Vosough R, Faradmal J, Hosseini SK, Moghimbeigi A, Mahjub H. Predicting hospital readmission in heart failure patients in Iran: a comparison of various machine learning methods. Healthc Inform Res. 2021;27(4):307–14.

Alanazi A. Using machine learning for healthcare challenges and opportunities. Informatics Med Unlocked. 2022;100924:1–5.

Chadaga K, Prabhu S, Sampathila N, Chadaga R, Umakanth S, Bhat D, et al. Explainable artificial intelligence approaches for COVID-19 prognosis prediction using clinical markers. Sci Rep. 2024;14(1):1783.

Chadaga K, Prabhu S, Bhat V, Sampathila N, Umakanth S, Chadaga R, et al. An explainable multi-class decision support framework to predict COVID-19 prognosis utilizing biomarkers. Cogent Eng. 2023;10(2):2272361.

Khanna VV, Chadaga K, Sampathila N, Prabhu S, Chadaga R. A machine learning and explainable artificial intelligence triage-prediction system for COVID-19. Decis Anal J. 2023;100246:1–14.

Zoabi Y, Deri-Rozov S, Shomron N. Machine learning-based prediction of COVID-19 diagnosis based on symptoms. npj Digit Med. 2021;4(1):1–5.

Sarker IH. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput Sci. 2021;2(3):160. Available from: https://doi.org/10.1007/s42979-021-00592-x .

Jones JA, Farnell B. Missing and Incomplete Data Reduces the Value of General Practice Electronic Medical Records as Data Sources in Research. Aust J Prim Health. 2007;13(1):74–80. Available from: https://www.publish.csiro.au/py/py07010 . [cited 2023 Dec 16].

Austin PC. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behav Res. 2011;46(3):399–424.

Torjusen H, Lieblein G, Næs T, Haugen M, Meltzer HM, Brantsæter AL. Food patterns and dietary quality associated with organic food consumption during pregnancy; Data from a large cohort of pregnant women in Norway. BMC Public Health. 2012;12(1):1–11.

Little RJA. A test of missing completely at random for multivariate data with missing values. J Am Stat Assoc. 1988;83(404):1198–202.

Tenny S, Kerndt CC, Hoffman MR. Case Control Studies. Encycl Pharm Pract Clin Pharm Vol 1-3 [Internet]. 2023;1–3:V2-356-V2-366. [cited 2024 Apr 14] Available from: https://www.ncbi.nlm.nih.gov/books/NBK448143/ .

Stanfill B, Reehl S, Bramer L, Nakayasu ES, Rich SS, Metz TO, et al. Extending Classification Algorithms to Case-Control Studies. Biomed Eng Comput Biol. 2019;10:117959721985895. Available from: https://www.pmc/articles/PMC6630079/ .[cited 2023 Sep 3].

Mulugeta G, Zewotir T, Tegegne AS, Juhar LH, Muleta MB. Classification of imbalanced data using machine learning algorithms to predict the risk of renal graft failures in Ethiopia. BMC Med Inform Decis Mak. 2023;23(1):1–17. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-023-02185-5 . [cited 2023 Sep 3].

Sadeghi S, Khalili D, Ramezankhani A, Mansournia MA, Parsaeian M. Diabetes mellitus risk prediction in the presence of class imbalance using flexible machine learning methods. BMC Med Inform Decis Mak. 2022;22(1):36. https://doi.org/10.1186/s12911-022-01775-z .

Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet. 2018;50(9):1335. Available from:  https://www.pmc/articles/PMC6119127/ . [cited 2023 Sep 3].

Miao J, Niu L. A Survey on Feature Selection. Procedia Comput Sci. 2016;91(1):919–26.

Remeseiro B, Bolon-Canedo V. A review of feature selection methods in medical applications. Comput Biol Med. 2019;112:103375.

R Studio Team. A language and environment for statistical computing. R Found Stat Comput. 2021;1.

Training Sets, Test Sets, and 10-fold Cross-validation - KDnuggets. Available from: https://www.kdnuggets.com/2018/01/training-test-sets-cross-validation.html . [cited 2023 Sep 4].

Hossin M, Sulaiman MN. A review on evaluation metrics for data classification evaluations. Int J data Min Knowl Manag Process. 2015;5(2):1.

Seyedtabib M, Kamyari N. Predicting polypharmacy in half a million adults in the Iranian population: comparison of machine learning algorithms. BMC Med Inform Decis Mak. 2023;23(1):84. https://doi.org/10.1186/s12911-023-02177-5 .

Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4765–74.

Greenwell B. Fastshap: Fast approximate shapley values. Man R Packag v0 05. 2020;9–12. https://CRAN.R-project.org/package=fastshap . Last accessed.

Aas K, Jullum M, Løland A. Explaining individual predictions when features are dependent: More accurate approximations to Shapley values. Artif Intell. 2021;298:103502.

Mesas AE, Cavero-Redondo I, Álvarez-Bueno C, Sarriá Cabrera MA, de Maffei Andrade S, Sequí-Dominguez I, et al. Predictors of in-hospital COVID-19 mortality: A comprehensive systematic review and meta-analysis exploring differences by age, sex and health conditions. PLoS One. 2020;15(11):e0241742.

Yanez ND, Weiss NS, Romand J-A, Treggiari MM. COVID-19 mortality risk for older men and women. BMC Public Health. 2020;20(1):1–7.

Sasson I. Age and COVID-19 mortality. Demogr Res. 2021;44:379–96.

Huang I, Lim MA, Pranata R. Diabetes mellitus is associated with increased mortality and severity of disease in COVID-19 pneumonia–a systematic review, meta-analysis, and meta-regression. Diabetes Metab Syndr Clin Res Rev. 2020;14(4):395–403.

Albitar O, Ballouze R, Ooi JP, Ghadzi SMS. Risk factors for mortality among COVID-19 patients. Diabetes Res Clin Pract. 2020;166:108293.

Di Castelnuovo A, Bonaccio M, Costanzo S, Gialluisi A, Antinori A, Berselli N, et al. Common cardiovascular risk factors and in-hospital mortality in 3,894 patients with COVID-19: survival analysis and machine learning-based findings from the multicentre Italian CORIST Study. Nutr Metab Cardiovasc Dis. 2020;30(11):1899–913.

Ssentongo P, Ssentongo AE, Heilbrunn ES, Ba DM, Chinchilli VM. Association of cardiovascular disease and 10 other pre-existing comorbidities with COVID-19 mortality: A systematic review and meta-analysis. PLoS ONE. 2020;15(8):e0238215.

Beran A, Mhanna M, Srour O, Ayesh H, Stewart JM, Hjouj M, et al. Clinical significance of micronutrient supplements in patients with coronavirus disease 2019: A comprehensive systematic review and meta-analysis. Clin Nutr ESPEN. 2022;48:167–77.

Perveen RA, Nasir M, Murshed M, Nazneen R, Ahmad SN. Remdesivir and favipiravir changes hepato-renal profile in COVID-19 patients: a cross sectional observation in Bangladesh. Int J Med Sci Clin Inven. 2021;8(1):5196–201.

El-Arif G, Khazaal S, Farhat A, Harb J, Annweiler C, Wu Y, et al. Angiotensin II Type I Receptor (AT1R): the gate towards COVID-19-associated diseases. Molecules. 2022;27(7):2048.

Ikram AS, Pillay S. Admission vital signs as predictors of COVID-19 mortality: a retrospective cross-sectional study. BMC Emerg Med. 2022;22(1):1–10.

Martí-Pastor A, Moreno-Perez O, Lobato-Martínez E, Valero-Sempere F, Amo-Lozano A, Martínez-García M-Á, et al. Association between Clinical Frailty Scale (CFS) and clinical presentation and outcomes in older inpatients with COVID-19. BMC Geriatr. 2023;23(1):1.

Lippi G, Plebani M. Laboratory abnormalities in patients with COVID-2019 infection. Clin Chem Lab Med. 2020;58(7):1131–4.

Naghashpour M, Ghiassian H, Mobarak S, Adelipour M, Piri M, Seyedtabib M, et al. Profiling serum levels of glutathione reductase and interleukin-10 in positive and negative-PCR COVID-19 outpatients: A comparative study from southwestern Iran. J Med Virol. 2022;94(4):1457–64.

Sharifi-Kia A, Nahvijou A, Sheikhtaheri A. Machine learning-based mortality prediction models for smoker COVID-19 patients. BMC Med Inform Decis Mak. 2023;23(1):1–15.

Moulaei K, Shanbehzadeh M, Mohammadi-Taghiabad Z, Kazemi-Arpanahi H. Comparing machine learning algorithms for predicting COVID-19 mortality. BMC Med Inform Decis Mak. 2022;22(1):2. https://doi.org/10.1186/s12911-021-01742-0 .

Nopour R, Erfannia L, Mehrabi N, Mashoufi M, Mahdavi A, Shanbehzadeh M. Comparison of Two Statistical Models for Predicting Mortality in COVID-19 Patients in Iran. Shiraz E-Medical J 2022 236 [Internet]. 2022;23(6):119172. [cited 2024 Apr 14] Available from: https://brieflands.com/articles/semj-119172 .

Mehraeen E, Karimi A, Barzegary A, Vahedi F, Afsahi AM, Dadras O, et al. Predictors of mortality in patients with COVID-19–a systematic review. Eur J Integr Med. 2020;40:101226.

Ikemura K, Bellin E, Yagi Y, Billett H, Saada M, Simone K, et al. Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study. J Med Internet Res [Internet]. 2021;23(2):e23458. Available from: https://www.jmir.org/2021/2/e23458 .

Breiman L. Random forests. Mach Learn. 2001;45:5–32.

Hinton G, Srivastava N, Swersky K. Neural networks for machine learning lecture 6a overview of mini-batch gradient descent. Cited on. 2012;14(8):2.

Zheng A, Casari A. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists. O’Reilly [Internet]. 2018;218. [cited 2024 Apr 14] Available from: https://www.amazon.com/Feature-Engineering-Machine-Learning-Principles/dp/1491953241 .

Adamson AS, Smith A. Machine Learning and Health Care Disparities in Dermatology. JAMA Dermatology. 2018;154(11):1247–8. Available from:  https://jamanetwork.com/journals/jamadermatology/fullarticle/2688587 . [cited 2023 Sep 15].

Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine Learning and Data Mining Methods in Diabetes Research. Comput Struct Biotechnol J. 2017;1(15):104–16.

Schmidt J, Marques MRG, Botti S, Marques MAL. Recent advances and applications of machine learning in solid-state materials science. Comput Mater. 2019;5(1):83. https://doi.org/10.1038/s41524-019-0221-0 .

Acknowledgements

We thank the Research Deputy of the Abadan University of Medical Sciences for financially supporting this project.

Summary points

∙ How can datasets improve mortality prediction using ML models for COVID-19 patients?

∙ Quantitative variables, followed by qualitative variables, have the greatest effect on model performance.

∙ Intelligent techniques such as SHAP analysis can be used to improve the interpretability of features in ML algorithms.

∙ Well-structured data are critical to help health professionals identify at-risk patients and improve pandemic outcomes.

Funding

This research was supported by grant No. 1456 from the Abadan University of Medical Sciences. However, the funding source did not influence the study design, data collection, analysis and interpretation, report writing, or decision to publish the article.

Author information

Authors and affiliations.

Department of Biostatistics and Epidemiology, School of Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

Maryam Seyedtabib

Research Center for Health Sciences, Hamadan University of Medical Sciences, Hamadan, Iran

Roya Najafi-Vosough

Department of Biostatistics and Epidemiology, School of Health, Abadan University of Medical Sciences, Abadan, Iran

Naser Kamyari

Contributions

MS: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Data curation, Writing–original draft, writing—review & editing, Visualization, Project administration. RNV: Conceptualization, Data curation, Formal analysis, Investigation, Writing–original draft, writing—review & editing. NK: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing–original draft, writing—review & editing, Visualization, Supervision.

Corresponding author

Correspondence to Naser Kamyari.

Ethics declarations

Ethics approval and consent to participate.

This study was approved by the Research Ethics Committee (REC) of Abadan University of Medical Sciences under the ID number IR.ABADANUMS.REC.1401.095. Methods used complied with all relevant ethical guidelines and regulations. The Ethics Committee of Abadan University of Medical Sciences waived the requirement for written informed consent from study participants.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Seyedtabib, M., Najafi-Vosough, R. & Kamyari, N. The predictive power of data: machine learning analysis for Covid-19 mortality based on personal, clinical, preclinical, and laboratory variables in a case–control study. BMC Infect Dis 24, 411 (2024). https://doi.org/10.1186/s12879-024-09298-w

Received: 22 December 2023

Accepted: 05 April 2024

Published: 18 April 2024

DOI: https://doi.org/10.1186/s12879-024-09298-w

  • Predictive model
  • Coronavirus disease
  • Data quality
  • Performance

BMC Infectious Diseases

ISSN: 1471-2334

approaches case study analysis

  • Research article
  • Open access
  • Published: 15 April 2024

What is quality in long covid care? Lessons from a national quality improvement collaborative and multi-site ethnography

  • Trisha Greenhalgh   ORCID: orcid.org/0000-0003-2369-8088 1 ,
  • Julie L. Darbyshire 1 ,
  • Cassie Lee 2 ,
  • Emma Ladds 1 &
  • Jenny Ceolta-Smith 3  

BMC Medicine volume 22, Article number: 159 (2024)

Long covid (post covid-19 condition) is a complex condition with diverse manifestations, uncertain prognosis and wide variation in current approaches to management. There have been calls for formal quality standards to reduce a so-called “postcode lottery” of care. The original aim of this study—to examine the nature of quality in long covid care and reduce unwarranted variation in services—evolved to focus on examining the reasons why standardizing care was so challenging in this condition.

In 2021–2023, we ran a quality improvement collaborative across 10 UK sites. The dataset reported here was mostly but not entirely qualitative. It included data on the origins and current context of each clinic, interviews with staff and patients, and ethnographic observations at 13 clinics (50 consultations) and 45 multidisciplinary team (MDT) meetings (244 patient cases). Data collection and analysis were informed by relevant lenses from clinical care (e.g. evidence-based guidelines), improvement science (e.g. quality improvement cycles) and philosophy of knowledge.

Participating clinics made progress towards standardizing assessment and management in some topics; some variation remained but this could usually be explained. Clinics had different histories and path dependencies, occupied a different place in their healthcare ecosystem and served a varied caseload including a high proportion of patients with comorbidities. A key mechanism for achieving high-quality long covid care was when local MDTs deliberated on unusual, complex or challenging cases for which evidence-based guidelines provided no easy answers. In such cases, collective learning occurred through idiographic (case-based) reasoning, in which practitioners build lessons from the particular to the general. This contrasts with the nomothetic reasoning implicit in evidence-based guidelines, in which reasoning is assumed to go from the general (e.g. findings of clinical trials) to the particular (management of individual patients).

Not all variation in long covid services is unwarranted. Largely because long covid’s manifestations are so varied and comorbidities common, generic “evidence-based” standards require much individual adaptation. In this complex condition, quality improvement resources may be productively spent supporting MDTs to optimise their case-based learning through interdisciplinary discussion. Quality assessment of a long covid service should include review of a sample of individual cases to assess how guidelines have been interpreted and personalized to meet patients’ unique needs.

Study registration

NCT05057260, ISRCTN15022307.

The term “long covid” [ 1 ] means prolonged symptoms following SARS-CoV-2 infection not explained by an alternative diagnosis [ 2 ]. It embraces the US term “post-covid conditions” (symptoms beyond 4 weeks) [ 3 ], the UK terms “ongoing symptomatic covid-19” (symptoms lasting 4–12 weeks) and “post covid-19 syndrome” (symptoms beyond 12 weeks) [ 4 ] and the World Health Organization’s “post covid-19 condition” (symptoms occurring beyond 3 months and persisting for at least 2 months) [ 5 ]. Long covid thus defined is extremely common. In the UK, for example, 1.8 million of a population of 67 million met the criteria for long covid in early 2023 and 41% of these had been unwell for more than 2 years [ 6 ].

Long covid is characterized by a constellation of symptoms which may include breathlessness, fatigue, muscle and joint pain, chest pain, memory loss and impaired concentration (“brain fog”), sleep disturbance, depression, anxiety, palpitations, dizziness, gastrointestinal problems such as diarrhea, skin rashes and allergy to food or drugs [ 2 ]. These lead to difficulties with essential daily activities such as washing and dressing, impaired exercise tolerance and ability to work, and reduced quality of life [ 2 , 7 , 8 ]. Symptoms typically cluster (e.g. in different patients, long covid may be dominated by fatigue, by breathlessness or by palpitations and dizziness) [ 9 , 10 ]. Long covid may follow a fairly constant course or a relapsing and remitting one, perhaps with specific triggers [ 11 ]. Overlaps between fatigue-dominant subtypes of long covid, myalgic encephalomyelitis and chronic fatigue syndrome have been hypothesized [ 12 ] but at the time of writing remain unproven.

Long covid has been a contested condition from the outset. Whilst long-term sequelae following other coronavirus (SARS and MERS) infections were already well-documented [ 13 ], SARS-CoV-2 was originally thought to cause a short-lived respiratory illness from which the patient either died or recovered [ 14 ]. Some clinicians dismissed protracted or relapsing symptoms as due to anxiety or deconditioning, especially if the patient had not had laboratory-confirmed covid-19. People with long covid got together in online groups and shared accounts of their symptoms and experiences of such “gaslighting” in their healthcare encounters [ 15 , 16 ]. Some groups conducted surveys on their members, documenting the wide range of symptoms listed in the previous paragraph and showing that whilst long covid is more commonly a sequel to severe acute covid-19, it can (rarely) follow a mild or even asymptomatic acute infection [ 17 ].

Early publications on long covid depicted a post-pneumonia syndrome which primarily affected patients who had been hospitalized (and sometimes ventilated) [ 18 , 19 ]. Later, covid-19 was recognized to be a multi-organ inflammatory condition (the pneumonia, for example, was reclassified as pneumonitis) and its long-term sequelae attributed to a combination of viral persistence, dysregulated immune response (including auto-immunity), endothelial dysfunction and immuno-thrombosis, leading to damage to the lining of small blood vessels and (thence) interference with transfer of oxygen and nutrients to vital organs [ 20 , 21 , 22 , 23 , 24 ]. But most such studies were highly specialized, laboratory-based and written primarily for an audience of fellow laboratory researchers. Despite demonstrating mean differences in a number of metabolic variables, they failed to identify a reliable biomarker that could be used routinely in the clinic to rule a diagnosis of long covid in or out. Whilst the evidence base from laboratory studies grew rapidly, it had little influence on clinical management—partly because most long covid clinics had been set up with impressive speed by front-line clinical teams to address an immediate crisis, with little or no input from immunologists, virologists or metabolic specialists [ 25 ].

Studies of the patient experience revealed wide geographical variation in whether any long covid services were provided and (if they were) which patients were eligible for these and what tests and treatments were available [ 26 ]. An interim UK clinical guideline for long covid had been produced at speed and published in December 2020 [ 27 ], but it was uncertain about diagnostic criteria, investigations, treatments and prognosis. Early policy recommendations for long covid services in England, based on wide consultation across the UK, had proposed a tiered service with “tier 1” being supported self-management, “tier 2” generalist assessment and management in primary care, “tier 3” specialist rehabilitation or respiratory follow-up with oversight from a consultant physician and “tier 4” tertiary care for patients with complications or complex needs [ 28 ]. In 2021, ring-fenced funding was allocated to establish 90 multidisciplinary long covid clinics in England [ 29 ]; some clinics were also set up with local funding in Scotland and Wales. These clinics varied widely in eligibility criteria, referral pathways, staffing mix (some had no doctors at all) and investigations and treatments offered. A further policy document on improving long covid services was published in 2022 [ 30 ]; it recommended that specialist long covid clinics should continue, though the long-term funding of these services remains uncertain [ 31 ]. To build the evidence base for delivering long covid services, major programs of publicly funded research were commenced in both the UK [ 32 ] and the USA [ 33 ].

In short, at the time this study began (late 2021), there appeared to be much scope for a program of quality improvement which would capture fast-emerging research findings, establish evidence-based standards and ensure these were rapidly disseminated and consistently adopted across both specialist long covid services and in primary care.

Quality improvement collaboratives

The quality improvement movement in healthcare was born in the early 1980s when clinicians and policymakers in the US and UK [ 34 , 35 , 36 , 37 ] began to draw on insights from outside the sector [ 38 , 39 , 40 ]. Adapting a total quality management approach that had previously transformed the Japanese car industry, they sought to improve efficiency, reduce waste, shift to treating the upstream causes of problems (hence preventing disease) and help all services approach the standards of excellence achieved by the best. They developed an approach based on (a) understanding healthcare as a complex system (especially its key interdependencies and workflows), (b) analysing and addressing variation within the system, (c) learning continuously from real-world data and (d) developing leaders who could motivate people and help them change structures and processes [ 41 , 42 , 43 , 44 ].

Quality improvement collaboratives (originally termed “breakthrough collaboratives” [ 45 ]), in which representatives from different healthcare organizations come together to address a common problem, identify best practice, set goals, share data and initiate and evaluate improvement efforts [ 46 ], are one model used to deliver system-wide quality improvement. It is widely assumed that these collaboratives work because—and to the extent that—they identify, interpret and implement high-quality evidence (e.g. from randomized controlled trials).

Research on why quality improvement collaboratives succeed or fail has produced the following list of critical success factors: taking a whole-system approach, selecting a topic and goal that fits with organizations’ priorities, fostering a culture of quality improvement (e.g. that quality is everyone’s job), engagement of everyone (including the multidisciplinary clinical team, managers, patients and families) in the improvement effort, clearly defining people’s roles and contribution, engaging people in preliminary groundwork, providing organizational-level support (e.g. chief executive endorsement, protected staff time, training and support for teams, resources, quality-focused human resource practices, external facilitation if needed), training in specific quality improvement techniques (e.g. plan-do-study-act cycle), attending to the human dimension (including cultivating trust and working to ensure shared vision and buy-in), continuously generating reliable data on both processes (e.g. current practice) and outcomes (clinical, satisfaction) and a “learning system” infrastructure in which knowledge that is generated feeds into individual, team and organizational learning [ 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 ].

The quality improvement collaborative approach has delivered many successes but it has been criticized at a theoretical level for over-simplifying the social science of human motivation and behaviour and for adopting a somewhat mechanical approach to the study of complex systems [ 55 , 56 ]. Adaptations of the original quality improvement methodology (e.g. from Sweden [ 57 , 58 ]) have placed greater emphasis on human values and meaning-making, on the grounds that reducing the complexities of a system-wide quality improvement effort to a set of abstract and generic “success factors” will miss unique aspects of the case such as historical path dependencies, personalities, framing and meaning-making and micropolitics [ 59 ].

Perhaps this explains why, when the abovementioned factors are met, a quality improvement collaborative’s success is more likely but is not guaranteed, as a systematic review demonstrated [ 60 ]. Some well-designed and well-resourced collaboratives addressing clear knowledge gaps produced few or no sustained changes in key outcome measures [ 49 , 53 , 60 , 61 , 62 ]. To identify why this might be, a detailed understanding of a service’s history, current challenges and contextual constraints is needed. This explains our decision, part-way through the study reported here, to collect rich contextual data on participating sites so as to better explain success or failure of our own collaborative.

Warranted and unwarranted variation in clinical practice

A generation ago, Wennberg described most variation in clinical practice as “unwarranted” (which he defined as variation in the utilization of health care services that cannot be explained by variation in patient illness or patient preferences) [ 63 ]. Others coined the term “postcode lottery” to depict how such variation allegedly impacted on health outcomes [ 64 ]. Wennberg and colleagues’ Atlas of Variation , introduced in 1999 [ 65 ], and its UK equivalent, introduced in 2010 [ 66 ], described wide regional differences in the rates of procedures from arthroscopy to hysterectomy, and were used to prompt services to identify and address examples of under-treatment, mis-treatment and over-treatment. Numerous similar initiatives, mostly based on hospital activity statistics, have been introduced around the world [ 66 , 67 , 68 , 69 ]. Sutherland and Levesque’s proposed framework for analysing variation, for example, has three domains: capacity (broadly, whether sufficient resources are allocated at organizational level and whether individuals have the time and headspace to get involved), evidence (the extent to which evidence-based guidelines exist and are followed), and agency (e.g. whether clinicians are engaged with the issue and the effect of patient choice) [ 70 ].

Whilst it is clearly a good idea to identify unwarranted variation in practice, it is also important to acknowledge that variation can be warranted . The very act of measuring and describing variation carries great rhetorical power, since revealing geographical variation in any chosen metric effectively frames this as a problem with a conceptually simple solution (reducing variation) that will appeal to both politicians and the public [ 71 ]. The temptation to expose variation (e.g. via visualizations such as maps) and address it in mechanistic ways should be resisted until we have fully understood the reasons why it exists, which may include perverse incentives, insufficient opportunities to discuss cases with colleagues, weak or absent feedback on practice, unclear decision processes, contested definitions of appropriate care and professional challenges to guidelines [ 72 ].

Research question, aims and objectives

Research question.

What is quality in long covid care and how can it best be achieved?

Aims.

To identify best practice and reduce unwarranted variation in UK long covid services.

To explain aspects of variation in long covid services that are or may be warranted.

Our original objectives were to:

Establish a quality improvement collaborative for 10 long covid clinics across UK.

Use quality improvement methods in collaboration with patients and clinic staff to prioritize aspects of care to improve. For each priority topic, identify best (evidence-informed) clinical practice, measure performance in each clinic, compare performance with a best practice benchmark and improve performance.

Produce organizational case studies of participating long covid clinics to explain their origins, evolution, leadership, ethos, population served, patient pathways and place in the wider healthcare ecosystem.

Examine these case studies to explain variation in practice, especially in topics where the quality improvement cycle proves difficult to follow or has limited impact.

The LOCOMOTION study

LOCOMOTION (LOng COvid Multidisciplinary consortium Optimising Treatments and services across the NHS) was a 30-month multi-site case study of 10 long covid clinics (8 in England, 1 in Wales and 1 in Scotland), beginning in 2021, which sought to optimise long covid care. Each clinic offered multidisciplinary care to patients referred from primary or secondary care (and, in some cases, self-referred), and held regular multidisciplinary team (MDT) meetings, mostly online via Microsoft Teams, to discuss cases. A study protocol for LOCOMOTION, with details of ethical approvals, management, governance and patient involvement has been published [ 25 ]. The three main work packages addressed quality improvement, technology-supported patient self-management and phenotyping and symptom clustering. This paper reports on the first work package, focusing mainly on qualitative findings.

Setting up the quality improvement collaborative

We broadly followed standard methodology for “breakthrough” quality improvement collaboratives [ 44 , 45 ], with two exceptions. First, because of geographical distance, continuing pandemic precautions and developments in videoconferencing technology, meetings were held online. Second, unlike in the original breakthrough model, patients were included in the collaborative, reflecting the cultural change towards patient partnerships since the model was originally proposed 40 years ago.

Each site appointed a clinical research fellow (doctor, nurse or allied health professional) funded partly by the LOCOMOTION study and partly with clinical sessions; some were existing staff who were backfilled to take on a research role whilst others were new appointments. The quality improvement meetings were held approximately every 8 weeks on Microsoft Teams and lasted about 2 h; there was an agenda and a chair, and meetings were recorded with consent. The clinical research fellow from each clinic attended, sometimes joined by the clinical lead for that site. In the initial meeting, the group proposed and prioritized topics before merging their consensus with the list of priority topics generated separately by patients (there was much overlap but also some differences).

In subsequent meetings, participants attempted to reach consensus on how to define, measure and achieve quality for each priority topic in turn, implement this approach in their own clinic and monitor its impact. Clinical leads prepared illustrative clinical cases and summaries of the research evidence, which they presented using Microsoft PowerPoint; the group then worked towards consensus on the implications for practice through general discussion. Clinical research fellows assisted with literature searches, collected baseline data from their own clinic, prepared and presented anonymized case examples, and contributed to collaborative goal-setting for improvement. Progress on each topic was reviewed at a later meeting after an agreed interval.

An additional element of this work package was semi-structured interviews with 29 patients, recruited from 9 of the 10 participating sites, about their clinic experiences with a view to feeding into service improvement (in the other site, no patient volunteered).

Our patient advisory group initially met separately from the quality improvement collaborative. They designed a short survey of current practice and sent it to each clinic; the results of this informed a prioritization exercise for topics where they considered change was needed. The patient-generated list was tabled at the quality improvement collaborative discussions, but patients were understandably keen to join these discussions directly. After about 9 months, some patient advisory group members joined the regular collaborative meetings. This dynamic was not without its tensions, since sharing performance data requires trust and there were some concerns about confidentiality when real patient cases were discussed with other patients present.

How evidence-informed quality targets were set

At the time the study began, there were no published large-scale randomized controlled trials of any interventions for long covid. We therefore followed a model used successfully in other quality improvement efforts where research evidence was limited or absent or it did not translate unambiguously into models for current services. In such circumstances, the best evidence may be custom and practice in the best-performing units. The quality improvement effort becomes oriented to what one group of researchers called “potentially better practices”—that is, practices that are “developed through analysis of the processes of care, literature review, and site visits” (page 14) [ 73 ]. The idea was that facilitated discussion among clinical teams, drawing on published research where available but also incorporating clinical experience, established practice and systematic analysis of performance data across participating clinics would surface these “potentially better practices”—an approach which, though not formally tested in controlled trials, appears to be associated with improved outcomes [ 46 , 73 ].

Adding an ethnographic component

Following limited progress made on some topics that had been designated high priority, we interviewed all 10 clinical research fellows (either individually or, in two cases, with a senior clinician present) and 18 other clinic staff (five individually plus two groups of 5 and 8), along with additional informal discussions, to explore the challenges of implementing the changes that had been agreed. These interviews were not audiotaped but detailed notes were made and typed up immediately afterwards. It became evident that some aspects of what the collaborative had deemed “evidence-informed” care were contested by front-line clinic staff, perceived as irrelevant to the service they were delivering, or considered impossible to implement. To unpack these issues further, the research protocol was amended to include an ethnographic component.

TG and EL (academic general practitioners) and JLD (a qualitative researcher with a PhD in the patient experience) attended a total of 45 MDT meetings in participating clinics (mostly online or hybrid). Staff were informed in advance that there would be an observer present; nobody objected. We noted brief demographic and clinical details of cases discussed (but no identifying data), dilemmas and uncertainties on which discussions focused, and how different staff members contributed.

TG made 13 in-person visits to participating long covid clinics. Staff were notified in advance; all were happy to be observed. Visits lasted between 5 and 8 h (54 h in total). We observed support staff booking patients in and processing requests and referrals, and shadowed different clinical staff in turn as they saw patients. Patients were informed of our presence and its purpose beforehand and given the opportunity to decline (three of 53 patients approached did). We discussed aspects of each case with the clinician after the patient left. When invited, we took breaks with staff and used these as an opportunity to ask them informally what it was like working in the clinic.

Ethnographic observation, analysis and reporting were geared to generating a rich interpretive account of the clinical, operational and interpersonal features of each clinic: what Van Maanen calls an “impressionist tale” [ 74 ]. Our work was also guided by the principles set out by Golden-Biddle and Locke, namely authenticity (spending time in the field and basing interpretations on these direct observations), plausibility (creating a plausible account through rich persuasive description) and criticality (e.g. reflexively examining our own assumptions) [ 75 ]. Our collection and analysis of qualitative data was informed by our own professional backgrounds (two general practitioners, one physical therapist, two non-clinicians).

In both MDTs and clinics, we took contemporaneous notes by hand and typed these up immediately afterwards.

Data management and analysis

Typed interview notes and field notes from clinics were collated in a set of Word documents, one for each clinic attended. They were analysed thematically [ 76 ] with attention to the literature on quality improvement and variation (see “ Background ”). Interim summaries were prepared on each clinic, setting out the narrative of how it had been established, its ethos and leadership, setting and staffing, population served and key links with other parts of the local healthcare ecosystem.

Minutes and field notes from the quality improvement collaborative meetings were summarized topic by topic, including initial data collected by the researchers-in-residence, improvement actions taken (or attempted) in that clinic, and any follow-up data shared. Progress or lack of it was interpreted in relation to the contextual case summary for that clinic.

Patient cases seen in clinic, and those discussed by MDTs, were summarized as brief case narratives in Word documents. Using the constant comparative method [ 77 ], we produced an initial synthesis of the clinical picture and principles of management based on the first 10 patient cases seen, and refined this as each additional case was added. Demographic and brief clinical and social details were also logged on Excel spreadsheets. When writing up clinical cases, we used the technique of composite case construction (in which we drew on several actual cases to generate a fictitious one, thereby protecting anonymity whilst preserving key empirical findings [ 78 ]); any names reported in this paper are pseudonyms.

Member checking

A summary was prepared for each clinic, including a narrative of the clinic’s own history and a summary of key quality issues raised across the ten clinics. These summaries included examples from real cases in our dataset. These were shared with the clinical research fellow and a senior clinician from the clinic, and amended in response to feedback. We also shared these summaries with representatives from the patient advisory group.

Overview of dataset

This study generated three complementary datasets. First, the video recordings, minutes, and field notes of 12 quality improvement collaborative meetings, along with the evidence summaries prepared for these meetings and clinic summaries (e.g. descriptions of current practice, audits) submitted by the clinical research fellows. This dataset illustrated wide variation in practice, and (in many topics) gaps or ambiguities in the evidence base.

Second, interviews with staff ( n  = 30) and patients ( n  = 29) from the clinics, along with ethnographic field notes (approximately 100 pages) from 13 in-person clinic visits (54 h), including notes on 50 patient consultations (40 face-to-face, 6 telephone, 4 video). This dataset illustrated the heterogeneity among the ten participating clinics.

Third, field notes (approximately 100 pages), including discussions on 244 clinical cases from the 45 MDT meetings (49 h) that we observed. This dataset revealed further similarities and contrasts among clinics in how patients were managed. In particular, it illustrated how, for the complex patients whose cases were presented at these meetings, teams made sense of, and planned for, each case through multidisciplinary dialogue. This dialogue typically began with one staff member presenting a detailed clinical history along with a narrative of how it had affected the patient’s life and what was at stake for them (e.g. job loss), after which professionals from various backgrounds (nursing, physical therapy, occupational therapy, psychology, dietetics, and different medical specialties) joined in a discussion about what to do.

The ten participating sites are summarized in Table  1 .

In the next two sections, we explore two issues—difficulty defining best practice and the heterogeneous nature of the clinics—that were key to explaining why quality, when pursued in a 10-site collaborative, proved elusive. We then briefly summarize patients’ accounts of their experience in the clinics and give three illustrative examples of the elusiveness of quality improvement using selected topics that were prioritized in our collaborative: outcome measures, investigation of palpitations and management of fatigue. In the final section of the results, we describe how MDT deliberations proved crucial for local quality improvement. Further detail on clinical priority topics will be presented in a separate paper.

“Best practice” in long covid: uncertainty and conflict

The study period (September 2021 to December 2023) corresponded with an exponential increase in published research on long covid. Despite this, the quality improvement collaborative found few unambiguous recommendations for practice. This gap between what the research literature offered and what clinical practice needed was partly ontological (relating to what long covid is). One major bone of contention between patients and clinicians (also evident in discussions with our patient advisory group), for example, was how far (and in whom) clinicians should look for and attempt to treat the various metabolic abnormalities that had been documented in laboratory research studies. The literature on this topic was extensive but conflicting [ 20 , 21 , 22 , 23 , 24 , 79 , 80 , 81 , 82 ]; it was heavy on biological detail but light on clinical application.

Patients were often aware of particular studies that appeared to offer plausible molecular or cellular explanations for symptom clusters along with a drug (often repurposed and off-label) whose mechanism of action appeared to be a good fit with the metabolic chain of causation. In one clinic, for example, we were shown an email exchange between a patient (not medically qualified) and a consultant, in which the patient asked them to reconsider their decision not to prescribe low-dose naltrexone, an opioid receptor antagonist with anti-inflammatory properties. The request included a copy of a peer-reviewed academic paper describing a small, uncontrolled pre-post study (i.e. a weak study design) in which this drug appeared to improve symptoms and functional performance in patients with long covid, as well as a mechanistic argument explaining why the patient felt this drug was a plausible choice in their own case.

This patient’s clinician, in common with most clinicians delivering front-line long covid services, considered that the evidence for such mechanism-based therapies was weak. Clinicians generally felt that this evidence, whilst promising, did not yet support routine measurement of clotting factors, antibodies, immune cells or other biomarkers or the prescription of mechanism-based therapies such as antivirals, anti-inflammatories or anticoagulants. Low-dose naltrexone, for example, is currently being tested in at least one randomized controlled trial (see National Clinical Trials Registry NCT05430152), which had not reported at the time of our observations.

Another challenge to defining best practice was that, although long covid was often described as a “diagnosis by exclusion”, the high prevalence of comorbidities meant that the “pure” long covid patient untainted by other potential explanations for their symptoms was a textbook ideal. In one MDT, for example, we observed a discussion about a patient who had had both swab-positive covid-19 and erythema migrans (a sign of Lyme disease) in the weeks before developing fatigue, yet local diagnostic criteria for each condition required the other to be excluded.

The logic of management in most participating clinics was pragmatic: prompt multidisciplinary assessment and treatment with an emphasis on obtaining a detailed clinical history (including premorbid health status), excluding serious complications (“red flags”), managing specific symptom clusters (for example, physical therapy for breathing pattern disorder), treating comorbidities (for example, anaemia, diabetes or menopause) and supporting whole-person rehabilitation [ 7 , 83 ]. The evidentiary questions raised in MDT discussions (which did not include patients) addressed the practicalities of the rehabilitation model (for example, whether cognitive therapy for neurocognitive complications is as effective when delivered online as it is when delivered in-person) rather than the molecular or cellular mechanisms of disease. For example, the question of whether patients with neurocognitive impairment should be tested for micro-clots or treated with anticoagulants never came up in the MDTs we observed, though we did visit a tertiary referral clinic (the tier 4 clinic in site H), whose lead clinician had a research interest in inflammatory coagulopathies and offered such tests to selected patients.

Because long covid typically produces dozens of symptoms that tend to be uniquely patterned in each patient, the uncertainties on which MDT discussions turned were rarely about general evidence of the kind that might be found in a guideline (e.g. how should fatigue be managed?). Rather they concerned particular case-based clinical decisions (e.g. how should this patient’s fatigue be managed, given the specifics of this case?). An example from our field notes illustrates this:

Physical therapist presents the case of a 39-year-old woman who works as a cleaner on an overnight ferry. Has had long covid for 2 years. Main symptoms are shortness of breath and possible anxiety attacks, especially when at work. She has had a course of physical therapy to teach diaphragmatic breathing but has found that focusing on her breathing makes her more anxious. Patient has to do a lot of bending in her job (e.g. cleaning toilets and under seats), which makes her dizzy, but Active Stand Test was normal. She also has very mild tricuspid incompetence [someone reads out a cardiology report—not hemodynamically significant].
Rehabilitation guidelines (e.g. WHO) recommend phased return to work (e.g. with reduced hours) and frequent breaks. “Tricky!” says someone. The job is intense and busy, and the patient can’t afford not to work. Discussion on whether all her symptoms can be attributed to tension and anxiety. Physical therapist who runs the breathing group says, “No, it’s long covid”, and describes severe initial covid-19 episode and results of serial chest X-rays which showed gradual clearing of ground glass shadows. Team discussion centers on how to negotiate reduced working hours in this particular job, given the overnight ferry shifts. --MDT discussion, Site D

This example raises important considerations about the nature of clinical knowledge in long covid. We return to it in the final section of the “ Results ” and in the “ Discussion ”.

Long covid clinics: a heterogeneous context for quality improvement

Most participating clinics had been established in mid-2020 to follow up patients who had been hospitalized (and perhaps ventilated) for severe acute covid-19. As mass vaccination reduced the severity of acute covid-19 for most people, the patient population in all clinics progressively shifted to include fewer “post-ICU [intensive care unit]” patients (in whom respiratory symptoms almost always dominated), and more people referred by their general practitioners or other secondary care specialties who had not been hospitalized for their acute covid-19 infection, and in whom fatigue, brain fog and palpitations were often the most troubling symptoms. Despite these similarities, the ten clinics had very different histories, geographical and material settings, staffing structures, patient pathways and case mix, as Table  1 illustrates. Below, we give more detail on three example sites.

Site C was established as a generalist “assessment-only” service by a general practitioner with an interest in infectious diseases. It is led jointly by that general practitioner and an occupational therapist, assisted by a wide range of other professionals including speech and language therapy, dietetics, clinical psychology and community-based physical therapy and occupational therapy. It has close links with a chronic fatigue service and a pain clinic that have been running in the locality for over 20 years. The clinic, which is entirely virtual (staff consult either from home or from a small side office in the community trust building), is physically located in a low-rise building on the industrial outskirts of a large town, sharing office space with various community-based health and social care services. Following a 1-h telephone consultation by one of the clinical leads, each patient is discussed at the MDT and then either discharged back to their general practitioner with a detailed management plan or referred on to one of the specialist services. This arrangement evolved to address a particular problem in this locality—that many patients with long covid were being referred by their general practitioner to multiple specialties (e.g. respiratory, neurology, fatigue), leading to a fragmented patient experience, unnecessary specialist assessments and wasteful duplication. The generalist assessment by telephone is oriented to documenting what is often a complex illness narrative (including pre-existing physical and mental comorbidities) and working with the patient to prioritize which symptoms or problems to pursue in which order.

Site E, in a well-regarded inner-city teaching hospital, had been set up in 2020 by a respiratory physician. Its initial ethos and rationale had been “respiratory follow-up”, with strong emphasis on monitoring lung damage via repeated imaging and lung function tests and on ensuring that patients received specialist physical therapy to “re-learn” efficient breathing techniques. Over time, this site has tried to accommodate a more multi-system assessment, with the introduction of a consultant-led infectious disease clinic for patients without a dominant respiratory component, reflecting the shift towards a more fatigue-predominant case mix. At the time of our fieldwork, each patient was seen in turn by a physician, psychologist, occupational therapist and respiratory physical therapist (half an hour each) before all four staff reconvened in a face-to-face MDT meeting to form a plan for each patient. But whilst a wide range of patients with diverse symptoms were discussed at these meetings, there remained a strong focus on respiratory pathology (e.g. tracking improvements in lung function and ensuring that coexisting asthma was optimally controlled).

Site F, one of the first long covid clinics in UK, was set up by a rehabilitation consultant who had been drafted to work on the ICU during the first wave of covid-19 in early 2020. He had a longstanding research interest in whole-patient rehabilitation, especially the assessment and management of chronic fatigue and pain. From the outset, clinic F was more oriented to rehabilitation, including vocational rehabilitation to help patients return to work. There was less emphasis on monitoring lung function or pursuing respiratory comorbidities. At the time of our fieldwork, clinic F offered both a community-based service (“tier 2”) led by an occupational therapist, supported by a respiratory physical therapist and psychologist, and a hospital-based service (“tier 3”) led by the rehabilitation consultant, supported by a wider MDT. Staff in both tiers emphasized that each patient needs a full physical and mental assessment and help to set and work towards achievable goals, whilst staying within safe limits so as to avoid post-exertional symptom exacerbation. Because of the research interest of the lead physician, clinic F adapted well to the growing numbers of patients with fatigue and quickly set up research studies on this cohort [ 84 ].

Details of the other seven sites are shown in Table  1 . Broadly speaking, sites B, E, G and H aligned with the “respiratory follow-up” model and sites F and I aligned with the “rehabilitation” model. Sites A and J had a high-volume, multi-tiered service whose community tier aligned with the “holistic GP assessment” model (site C above) and which also offered a hospital-based, rehabilitation-focused tier. The small service in Scotland (site D) had evolved from an initial respiratory focus to become part of the infectious diseases (ME/CFS) service; Lyme disease (another infectious disease whose sequelae include chronic fatigue) was also prevalent in this region.

The patient experience

Whilst the 10 participating clinics were very diverse in staffing, ethos and patient flows, the 29 patient interviews described remarkably consistent clinic experiences. Almost all identified the biggest problem to be the extended wait of several months before they were seen and the limited awareness (when initially referred) of what long covid clinics could provide. Some talked of how they cried with relief when they finally received an appointment. When the quality improvement collaborative was initially established, waiting times and bottlenecks were patients’ top priority for quality improvement, and this ranking was shared by clinic staff, who were very aware of how much delays and uncertainties in assessment and treatment compounded patients’ suffering. This issue resolved to a large extent over the study period in all clinics as the referral backlog cleared and the incidence of new cases of long covid fell [ 85 ]; it will be covered in more detail in a separate publication.

Most patients in our sample were satisfied with the care they received when they were finally seen in clinic, especially how they finally felt “heard” after a clinician took a full history. They were relieved to receive affirmation of their experience, a diagnosis of what was wrong and reassurance that they were believed. They were grateful for the input of different members of the multidisciplinary teams and commented on the attentiveness, compassion and skill of allied professionals in particular (“she was wonderful, she got me breathing again”—patient BIR145 talking about a physical therapist). One or two patient participants expressed confusion about who exactly they had seen and what advice they had been given, and some did not realize that a telephone assessment had been an actual clinical consultation. A minority expressed disappointment that an expected investigation had not been ordered (one commented that they had not had any blood tests at all). Several had assumed that the help and advice from the long covid clinic would continue to be offered until they were better and were disappointed that they had been discharged after completing the various courses on offer (since their clinic had been set up as an “assessment only” service).

In the next sections, we give examples of topics raised in the quality improvement collaborative and how they were addressed.

Example quality topic 1: Outcome measures

The first topic considered by the quality improvement collaborative was how (that is, using which measures and metrics) to assess and monitor patients with long covid. In the absence of a validated biomarker, various symptom scores and quality of life scales—both generic and disease-specific—were mooted. Site F had already developed and validated a patient-reported outcome measure (PROM), the C19-YRS (Covid-19 Yorkshire Rehabilitation Scale) and used it for both research and clinical purposes [ 86 ]. It was quickly agreed that, for the purposes of generating comparative research findings across the ten clinics, the C19-YRS should be used at all sites and completed by patients three-monthly. A commercial partner produced an electronic version of this instrument and an app for patient smartphones. The quality improvement collaborative also agreed that patients should be asked to complete the EUROQOL EQ5D, a widely used generic health-related quality of life scale [ 87 ], in order to facilitate comparisons between long covid and other chronic conditions.

In retrospect, the discussions which led to the unopposed adoption of these two measures as a “quality” initiative in clinical care were somewhat aspirational. A review of progress at a subsequent quality improvement meeting revealed considerable variation among clinics, with a wide variety of measures used in different clinics to different degrees. Reasons for this variation were multiple. First, although our patient advisory group were keen that we should gather as much data as possible on the patient experience of this new condition, many clinic patients found the long questionnaires exhausting to complete due to cognitive impairment and fatigue. In addition, whilst patients were keen to answer questions on symptoms that troubled them, many had limited patience to fill out repeated surveys on symptoms that did not trouble them (“it almost felt as if I’ve not got long covid because I didn’t feel like I fit the criteria as they were laying it out”—patient SAL001). Staff assisted patients in completing the measures when needed, but this was time-consuming (up to 45 min per instrument) and burdensome for both staff and patients. In clinics where a high proportion of patients required assistance, staff time was the rate-limiting factor for how many instruments got completed. For some patients, one short instrument was the most that could be asked of them, and the clinician made a judgement on which one would be in their best interests on the day.

The second reason for variation was that the clinical diagnosis and management of particular features, complications and comorbidities of long covid required more nuance than was provided by these relatively generic instruments, and the level of detail sought varied with the specialist interest of the clinic (and the clinician). The modified C19-YRS [ 88 ], for example, contained 19 items, of which one asked about sleep quality. But if a patient had sleep difficulties, many clinicians felt that these needed to be documented in more detail—for example using the 8-item Epworth Sleepiness Scale, originally developed for conditions such as narcolepsy and obstructive sleep apnea [ 89 ]. The “Epworth score” was essential currency for referrals to some but not all specialist sleep services. Similarly, the C19-YRS had three items relating to anxiety, depression and post-traumatic stress disorder, but in clinics where there was a strong focus on mental health (e.g. when there was a resident psychologist), patients were usually invited to complete more specific tools (e.g. the Patient Health Questionnaire 9 [ 90 ], a 9-item questionnaire originally designed to assess severity of depression).

The third reason for variation was custom and practice. Ethnographic visits revealed that paper copies of certain instruments were routinely stacked on clinicians’ desks in outpatient departments and also (in some cases) handed out by administrative staff in waiting areas so that patients could complete them before seeing the clinician. These familiar clinic artefacts tended to be short (one-page) instruments that had a long tradition of use in clinical practice. They were not always fit for purpose. For example, the Nijmegen questionnaire was developed in the 1980s to assess hyperventilation; it was validated against a longer, “gold standard” instrument for that condition [ 91 ]. It subsequently became popular in respiratory clinics to diagnose or exclude breathing pattern disorder (a condition in which the normal physiological pattern of breathing becomes replaced with less efficient, shallower breathing [ 92 ]), so much so that the researchers who developed the instrument published a paper to warn fellow researchers that it had not been validated for this purpose [ 93 ]. Whilst a validated 17-item instrument for breathing pattern disorder (the Self-Evaluation of Breathing Questionnaire [ 94 ]) does exist, it is not in widespread clinical use. Most clinics in LOCOMOTION used Nijmegen either on all patients (e.g. as part of a comprehensive initial assessment, especially if the service had begun as a respiratory follow-up clinic) or when breathing pattern disorder was suspected.

In sum, the use of outcome measures in long covid clinics was a compromise between standardization and contingency. On the one hand, all clinics accepted the need to use “validated” instruments consistently. On the other hand, there were sometimes good reasons why they deviated from agreed practice, including mismatch between the clinic’s priorities as a research site, its priorities as a clinical service, and the particular clinical needs of a patient; the clinic’s—and the clinician’s—specialist focus; and long-held traditions of using particular instruments with which staff and patients were familiar.

Example quality topic 2: Postural orthostatic tachycardia syndrome (POTS)

Palpitations (common in long covid) and postural orthostatic tachycardia syndrome (POTS, a disproportionate acceleration in heart rate on standing, the assumed cause of palpitations in many long covid patients) were the top priority for quality improvement identified by our patient advisory group. Reflecting discussions and evidence (of various kinds) shared in online patient communities, the group were confident that POTS is common in long covid patients and that many cases remain undetected (perhaps misdiagnosed as anxiety). Their request that all long covid patients should be “screened” for POTS prompted a search for, and synthesis of, evidence (which we published in the BMJ [ 95 ]). In sum, that evidence was sparse and contested, but, combined with standard practice in specialist clinics, broadly supported the judicious use of the NASA Lean Test [ 96 ]. This test involves repeated measurements of pulse and blood pressure with the patient first lying and then standing (with shoulders resting against a wall).

The patient advisory group’s request that the NASA Lean Test should be conducted on all patients met with mixed responses from the clinics. In site F, the lead physician had an interest in autonomic dysfunction in chronic fatigue and was keen; he had already published a paper on how to adapt the NASA Lean Test for self-assessment at home [ 97 ]. Several other sites were initially opposed. Staff at site E, for example, offered various arguments:

The test is time-consuming, labor-intensive, and takes up space in the clinic which has an opportunity cost in terms of other potential uses;

The test is unvalidated and potentially misleading (there is a high incidence of both false negative and false positive results);

There is no proven treatment for POTS, so there is no point in testing for it;

It is a specialist test for a specialist condition, so it should be done in a specialist clinic where its benefits and limitations are better understood;

Objective testing does not change clinical management since what we treat is the patient’s symptoms (e.g. by a pragmatic trial of lifestyle measures and medication);

People with symptoms suggestive of dysautonomia have already been “triaged out” of this clinic (that is, identified in the initial telephone consultation and referred directly to neurology or cardiology);

POTS is a manifestation of the systemic nature of long covid; it does not need specific treatment but will improve spontaneously as the patient goes through standard interventions such as active pacing, respiratory physical therapy and sleep hygiene;

Testing everyone, even when asymptomatic, runs counter to the ethos of rehabilitation, which is to “de-medicalize” patients so as to better orient them to their recovery journey.

When clinics were invited to implement the NASA Lean Test on a consecutive sample of patients to resolve a dispute about the incidence of POTS (from “we’ve only seen a handful of people with it since the clinic began” to “POTS is common and often missed”), all but one site agreed to participate. The tertiary POTS centre linked to site H was already running the NASA Lean Test as standard on all patients. Site C, which operated entirely virtually, passed the work to the referring general practitioner by making this test a precondition for seeing the patient; site D, which was largely virtual, sent instructions for patients to self-administer the test at home.

The NASA Lean Test study has been published separately [ 98 ]. In sum, of 277 consecutive patients tested across the eight clinics, 20 (7%) had a positive NASA Lean Test for POTS and a further 28 (10%) a borderline result. Six of 20 patients who met the criteria for POTS on testing had no prior history of orthostatic intolerance. The question of whether this test should be used to “screen” all patients was not answered definitively. But the experience of participating in the study persuaded some sceptics that postural changes in heart rate could be severe in some long covid patients, did not appear to be fully explained by their previously held theories (e.g. “functional”, anxiety, deconditioning), and had likely been missed in some patients. The outcome of this particular quality improvement cycle was thus not a wholesale change in practice (for which the evidence base was weak) but a more subtle increase in clinical awareness, a greater willingness to consider testing for POTS and a greater commitment to contribute to research into this contested condition.
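The proportions quoted above can be re-derived from the counts given in the text. The following is a minimal Python sketch of that arithmetic only; the helper `pct` and the variable names are illustrative assumptions and are not part of the study's own analysis.

```python
# Counts as reported in the text for the NASA Lean Test audit.
TOTAL_TESTED = 277      # consecutive patients tested across the clinics
POSITIVE = 20           # met the criteria for POTS on testing
BORDERLINE = 28         # borderline result
NO_PRIOR_HISTORY = 6    # positives with no prior orthostatic intolerance


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


positive_pct = pct(POSITIVE, TOTAL_TESTED)
borderline_pct = pct(BORDERLINE, TOTAL_TESTED)
# Share of positive cases that had no prior history (hypothetical extra check).
no_history_pct = pct(NO_PRIOR_HISTORY, POSITIVE)

print(f"Positive:   {POSITIVE}/{TOTAL_TESTED} = {positive_pct}%")
print(f"Borderline: {BORDERLINE}/{TOTAL_TESTED} = {borderline_pct}%")
print(f"No prior history among positives: {NO_PRIOR_HISTORY}/{POSITIVE} = {no_history_pct}%")
```

Run as written, the sketch reproduces the 7% and 10% figures and shows that the six patients without a prior history of orthostatic intolerance make up 30% of those who tested positive.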

More generally, the POTS audit prompted some clinicians to recognize the value of quality improvement in novel clinical areas. One physician who had initially commented that POTS was not seen in their clinic, for example, reflected:

“Our clinic population is changing. […] Overall there’s far fewer post-ICU patients with ECMO [extra-corporeal membrane oxygenation] issues and far more long covid from the community, and this is the bit our clinic isn’t doing so well on. We’re doing great on breathing pattern disorder; neuro[logists] are helping us with the brain fogs; our fatigue and occupational advice is ok but some of the dysautonomia symptoms that are more prevalent in the people who were not hospitalized – that’s where we need to improve.” -Respiratory physician, site G (from field visit 6.6.23)

Example quality topic 3: Management of fatigue

Fatigue was the commonest symptom overall and a high priority among both patients and clinicians for quality improvement. It often coexisted with the cluster of neurocognitive symptoms known as brain fog, with both conditions relapsing and remitting in step. Clinicians were keen to systematize fatigue management using a familiar clinical framework oriented around documenting a full clinical history, identifying associated symptoms, excluding or exploring comorbidities and alternative explanations (e.g. poor sleep patterns, depression, menopause, deconditioning), assessing how fatigue affects physical and mental function, implementing a program of physical and cognitive therapy that was sensitive to the patient’s condition and confidence level, and monitoring progress using validated patient-reported outcome measures and symptom diaries.

The underpinning logic of this approach, which broadly reflected World Health Organization guidance [ 99 ], was that fatigue and linked cognitive impairment could be a manifestation of many (perhaps interacting) conditions but that a whole-patient (body and mind) rehabilitation program was the cornerstone of management in most cases. Discussion in the quality improvement collaborative focused on issues such as whether fatigue was so severe that it produced safety concerns (e.g. in a person’s job or with childcare), the pros and cons of particular online courses such as yoga, relaxation and mindfulness (many were viewed positively, though the evidence base was considered weak), and the extent to which respiratory physical therapy had a crossover impact on fatigue (systematic reviews suggested that it may do, but these reviews also cautioned that primary studies were sparse, methodologically flawed, and heterogeneous [ 100 , 101 ]). They also debated the strengths and limitations of different fatigue-specific outcome measures, each of which had been developed and validated in a different condition, with varying emphasis on cognitive fatigue, physical fatigue, effect on daily life, and motivation. These instruments included the Modified Fatigue Impact Scale; Fatigue Severity Scale [ 102 ]; Fatigue Assessment Scale; Functional Assessment of Chronic Illness Therapy – Fatigue (FACIT-F) [ 103 ]; Work and Social Adjustment Scale [ 104 ]; Chalder Fatigue Scale [ 105 ]; Visual Analogue Scale – Fatigue [ 106 ]; and the EQ5D [ 87 ]. In one clinic (site F), three of these scales were used in combination for reasons discussed below.

Some clinicians advocated melatonin or nutritional supplements (such as vitamin D or folic acid) for fatigue on the grounds that many patients found them helpful and formal placebo-controlled trials were unlikely ever to be conducted. But neurostimulants used in other fatigue-predominant conditions (e.g. brain injury, stroke), which also lacked clinical trial evidence in long covid, were viewed as inappropriate in most patients because of lack of evidence of clear benefit and hypothetical risk of harm (e.g. adverse drug reactions, polypharmacy).

Whilst the patient advisory group were broadly supportive of a whole-patient rehabilitative approach to fatigue, their primary concern was fatiguability, especially post-exertional symptom exacerbation (PESE, also known as “crashes”). In these, the patient becomes profoundly fatigued some hours or days after physical or mental exertion, and this state can last for days or even weeks [ 107 ]. Patients viewed PESE as a “red flag” symptom which they felt clinicians often missed and sometimes caused. They wanted the quality improvement effort to focus on ensuring that all clinicians were aware of the risks of PESE and acted accordingly. A discussion among patients and clinicians at a quality improvement collaborative meeting raised a new research hypothesis: that reducing the number of repeated episodes of PESE may improve the natural history of long covid.

These tensions around fatigue management played out differently in different clinics. In site C (the GP-led virtual clinic run from a community hub), fatigue was viewed as one manifestation of a whole-patient condition. The lead general practitioner used the metaphor of untangling a skein of wool: “you have to find the end and then gently pull it”. The underlying problem in a fatigued patient, for example, might be inadequate pacing, an undiagnosed physical condition such as anaemia, disturbed sleep, or a mental health issue. These required (respectively) the chronic fatigue service (comprising an occupational therapist and specialist psychologist and oriented mainly to teaching the techniques of goal-setting and pacing), a “tiredness” work-up (e.g. to exclude anaemia or menopause), investigation of poor sleep (which, not uncommonly, was due to obstructive sleep apnea), and exploration of mental health issues.

In site G (a hospital clinic which had evolved from a respiratory service), patients with fatigue went through a fatigue management program led by the occupational therapist with emphasis on pacing, energy conservation, avoidance of PESE and sleep hygiene. Those without ongoing respiratory symptoms were often discharged back to their general practitioner once they had completed this; there was no consultant follow-up of unresolved fatigue.

In site F (a rehabilitation clinic which had a longstanding interest in chronic fatigue even before the pandemic), active interdisciplinary management of fatigue was commenced at or near the patient’s first visit, on the grounds that the earlier this began, the more successful it would be. In this clinic, patients were offered a more intensive package: an occupational therapy-led fatigue course similar to that in site G, plus input from a dietician to advise on regular balanced meals and caffeine avoidance and a group-based facilitated peer support program which centred on fatigue management. The dietician spoke enthusiastically about how improving diet in longstanding long covid patients often improved fatigue (e.g. because they had often lost muscle mass and tended to snack on convenience food rather than make meals from scratch), though she agreed there was no evidence base from trials to support this approach.

Pursuing local quality improvement through MDTs

Whilst some long covid patients had “textbook” symptoms and clinical findings, many cases were unique and some were fiendishly complex. One clinician commented that, somewhat paradoxically, “easy cases” were often the post-ICU follow-ups who had resolving chest complications; they tended to do well with a course of respiratory physical therapy and a return-to-work program. Such cases were rarely brought to MDT meetings. “Difficult cases” were patients who had not been hospitalized for their acute illness but presented with a months- or years-long history of multiple symptoms with fatigue typically predominant. Each one was different, as the following example (some details of which have been fictionalized to protect anonymity) illustrates.

The MDT is discussing Mrs Fermah, a 65-year-old homemaker who had covid-19 a year ago. She has had multiple symptoms since, including fluctuating fatigue, brain fog, breathlessness, retrosternal chest pain of burning character, dry cough, croaky voice, intermittent rashes (sometimes on eating), lips going blue, ankle swelling, orthopnoea, dizziness with the room spinning which can be triggered by stress, low back pain, aches and pains in the arms and legs and pins and needles in the fingertips, loss of taste and smell, palpitations and dizziness (unclear if postural, but clear association with nausea), headaches on waking, and dry mouth. She is somewhat overweight (body mass index 29) and admits to low mood. Functionally, she is mostly confined to the house and can no longer manage the stairs so has begun to sleep downstairs. She has stumbled once or twice but not fallen. Her social life has ceased and she rarely has the energy to see her grandchildren. Her 70-year-old husband is retired and generally supportive, though he spends most evenings at his club. Comorbidities include glaucoma which is well controlled and overseen by an ophthalmologist, mild club foot (congenital) and stage 1 breast cancer 20 years ago. Various tests, including a chest X-ray, resting and exercise oximetry and a blood panel, were normal except for borderline vitamin D level. Her breathing questionnaire score suggests she does not have breathing pattern disorder. ECG showed first-degree atrioventricular block and left axis deviation. No clinician has witnessed the blue lips. Her current treatment is online group respiratory physical therapy; a home visit is being arranged to assess her climbing stairs. She has declined a psychologist assessment. The consultant asks the nurse who assessed her: “Did you get a feel if this is a POTS-type dizziness or an ENT-type?” She sighs. “Honestly it was hard to tell, bless her.”—Site A MDT

This patient’s debilitating symptoms and functional impairments could all be due to long covid, yet “evidence-based” guidance for how to manage her complex suffering does not exist and likely never will exist. The question of which (if any) additional blood or imaging tests to do, in what order of priority, and what interventions to offer the patient will not be definitively answered by consulting clinical trials involving hundreds of patients, since (even if these existed) the decision involves weighing this patient’s history and the multiple factors and uncertainties that are relevant in her case. The knowledge that will help the MDT provide quality care to Mrs Fermah is case-based knowledge: accumulated clinical experience and wisdom from managing and deliberating on multiple similar cases. We consider case-based knowledge further in the “Discussion”.

Summary of key findings

This study has shown that a quality improvement collaborative of UK long covid clinics made some progress towards standardizing assessment and management in some topics, but some variation remained. This could be explained in part by the fact that different clinics had different histories and path dependencies, occupied a different place in the local healthcare ecosystem, served different populations, were differently staffed, and had different clinical interests. Our patient advisory group and clinicians in the quality improvement collaborative broadly prioritized the same topics for improvement but interpreted them somewhat differently. “Quality” long covid care had multiple dimensions, relating to (among other things) service set-up and accessibility, clinical provision appropriate to the patient’s need (including options for referral to other services locally), the human qualities of clinical and support staff, how knowledge was distributed across (and accessible within) the system, and the accumulated collective wisdom of local MDTs in dealing with complex cases (including multiple kinds of specialist expertise as well as relational knowledge of what was at stake for the patient). Whilst both staff and patients were keen to contribute to the quality improvement effort, the burden of measurement was evident: multiple outcome measures, used repeatedly, were resource-intensive for staff and exhausting for patients.

Strengths and limitations of this study

To our knowledge, we are the first to report both a quality improvement collaborative and an in-depth qualitative study of clinical work in long covid. Key strengths of this work include the diverse sampling frame (with sites from three UK jurisdictions and serving widely differing geographies and demographics); the use of documents, interviews and reflexive interpretive ethnography to produce meaningful accounts of how clinics emerged and how they were currently organized; the use of philosophical concepts to analyse data on how MDTs produced quality care on a patient-by-patient basis; and the close involvement of patient co-researchers and coauthors during the research and writing up.

Limitations of the study include its exclusive UK focus (the external validity of findings to other healthcare systems is unknown); the self-selecting nature of participants in a quality improvement collaborative (our patient advisory group suggested that the MDTs observed in this study may have represented the higher end of a quality spectrum, hence would be more likely than other MDTs to adhere to guidelines); and the particular perspective brought by the researchers (two GPs, a physical therapist and one non-clinical person) in ethnographic observations. Hospital specialists or organizational scholars, for example, may have noticed different things or framed what they observed differently.

Explaining variation in long covid care

Sutherland and Levesque’s framework mentioned in the “Background” section does not explain much of the variation found in our study [ 70 ]. In terms of capacity, at the time of this study most participating clinics benefited from ring-fenced resources. In terms of evidence, guidelines existed and were not greatly contested, but as illustrated by the case of Mrs Fermah above, many patients were exceptions to the guideline because of complex symptomatology and relevant comorbidities. In terms of agency, clinicians in most clinics were passionately engaged with long covid (they were pioneers who had set up their local clinic and successfully bid for national ring-fenced resources) and were generally keen to support patient choice (though not if the patient requested tests which were unavailable or deemed not indicated).

Atsma et al.’s list of factors that may explain variation in practice (see “Background”) includes several that may be relevant to long covid, especially that the definition of appropriate care in this condition remains somewhat contested. But lack of opportunity to discuss cases was not a problem in the clinics in our sample. On the contrary, MDT meetings in each locality gave clinicians multiple opportunities to discuss cases with colleagues and reflect collectively on whether and how to apply particular guidelines.

The key problem was not that clinicians disputed the guidelines for managing long covid or were unaware of them; it was that the guidelines were not self-interpreting. Rather, MDTs had to deliberate on the balance of benefits and harms in different aspects of individual cases. In patients whose symptoms suggested a possible diagnosis of POTS (or who suspected themselves of having POTS), for example, these deliberations were sometimes lengthy and nuanced. Should a test result that is not technically in the abnormal range but close to it be treated as diagnostic, given that symptoms point to this diagnosis? If not, should the patient be told that the test excludes POTS or that it is equivocal? If a cardiology opinion has stated firmly that the patient does not have POTS but the cardiologist is not known for their interest in this condition, should a second specialist opinion be sought? If the gold standard “tilt test” [ 108 ] for POTS (usually available only in tertiary centres) is not available locally, does this patient merit a costly out-of-locality referral? Should the patient’s request for a trial of off-label medication, reflecting discussions in an online support group, be honoured? These are the kinds of questions on which MDTs deliberated at length.

The fact that many cases required extensive deliberation does not necessarily justify variation in practice among clinics. But taking into account the clinics’ very different histories, set-up, and local referral pathways, the variation begins to make sense. A patient who is being assessed in a clinic that functions as a specialist chronic fatigue centre and attracts referrals which reflect this interest (e.g. site F in our sample) will receive different management advice from a patient assessed in a clinic that functions as a telephone-only generalist assessment centre and refers on to other specialties (site C in our sample). The wide variation in case mix, coupled with the fact that a different proportion of these cases were highly complex in each clinic (and in different ways), suggests that variation in practice may reflect appropriate rather than inappropriate care.

Our patient advisory group affirmed that many of the findings reported here resonated with their own experience, but they raised several concerns. These included questions about patient groups who may have been missed in our sample because they were rarely discussed in MDTs. The decision to take a case to MDT discussion is taken largely by a clinician, and there was evidence from online support groups that some patients’ requests for their case to be taken to an MDT had been declined (though not, to our knowledge, in the clinics participating in the LOCOMOTION study).

We began this study by asking “what is quality in long covid care?”. We initially assumed that this question referred to a generalizable evidence base, which we felt we could identify, and we believed that we could then determine whether long covid clinics were following the evidence base through conventional audits of structure, process, and outcome. In retrospect, these assumptions were somewhat naïve. On the basis of our findings, we suggest that a better (and more individualized) research question might be “to what extent does each patient with long covid receive evidence-based care appropriate to their needs?”. This question would require individual case review on a sample of cases, tracking each patient longitudinally including cross-referrals, and also interviewing the patient.

Nomothetic versus idiographic knowledge

In a series of lectures first delivered in the 1950s and recently republished [ 109 ], psychiatrist Dr Maurice O’Connor Drury drew on the later philosophy of his friend and mentor Ludwig Wittgenstein to challenge what he felt was a concerning trend: that the nomothetic (generalizable, abstract) knowledge from randomized controlled trials (RCTs) was coming to override the idiographic (personal, situated) knowledge about particular patients. Based on Wittgenstein’s writings on the importance of the particular, Drury predicted, presciently, that if implemented uncritically, RCTs would result in worse, not better, care for patients, since their use would go hand-in-hand with a downgrading of experience, intuition, subjective judgement, personal reflection, and collective deliberation.

Much conventional quality improvement methodology is built on an assumption that nomothetic knowledge (for example, findings from RCTs and systematic reviews) is a higher form of knowing than idiographic knowledge. But idiographic, case-based reasoning, despite its position at the very bottom of evidence-based medicine’s hierarchy of evidence [110], is a legitimate and important element of medical practice. Bioethicist Kathryn Montgomery, drawing on Aristotle’s notion of praxis, considers clinical practice to be an example of case-based reasoning [111]. Medicine is governed not by hard and fast laws but by competing maxims or rules of thumb; the essence of judgement is deciding which (if any) rule should be applied in a particular circumstance. Clinical judgement incorporates science (especially the results of well-conducted research) and makes use of available tools and technologies (including guidelines and decision-support algorithms that incorporate research findings). But rather than being determined solely by these elements, clinical judgement is guided both by the scientific evidence and by the practical and ethical question “what is it best to do, for this individual, given these circumstances?”.

In this study, we observed clinical management of, and MDT deliberations on, hundreds of clinical cases. In the more straightforward ones (for example, recovering pneumonitis), guideline-driven care was not difficult to implement and such cases were rarely brought to the MDT. But cases like Mrs Fermah (see last section of “Results”) required much discussion on which aspects of which guideline were in the patient’s best interests to bring into play at any particular stage in their illness journey.

Conclusions

One systematic review on quality improvement collaboratives concluded that “[those] reporting success generally addressed relatively straightforward aspects of care, had a strong evidence base and noted a clear evidence-practice gap in an accepted clinical pathway or guideline” (page 226) [60]. The findings from this study suggest that to the extent that such collaboratives address clinical cases that are not straightforward, conventional quality improvement methods may be less useful and even counterproductive.

The question “what is quality in long covid care?” is partly a philosophical one. Our findings support an approach that recognizes and values idiographic knowledge. This includes establishing and protecting a safe and supportive space in which deliberation on individual cases can occur, and valuing and drawing upon the collective learning that occurs in these spaces. It is through such deliberation that evidence-based guidelines can be appropriately interpreted and applied to the unique needs and circumstances of individual patients. We suggest that Drury’s warning about the limitations of nomothetic knowledge should prompt a reassessment of policies that rely too heavily on such knowledge, resulting in one-size-fits-all protocols. We also cautiously hypothesize that the need to centre the quality improvement effort on idiographic rather than nomothetic knowledge is unlikely to be unique to long covid. Indeed, such an approach may be particularly important in any condition that is complex, unpredictable, variable in presentation and clinical course, and associated with comorbidities.

Availability of data and materials

Selected qualitative data (ensuring no identifiable information) will be made available to formal research teams on reasonable request to Professor Greenhalgh at the University of Oxford, on condition that they have research ethics approval and relevant expertise. The quantitative data on the NASA Lean Test have been published in full in a separate paper [98].

Abbreviations

CFS: Chronic fatigue syndrome
ICU: Intensive care unit
JCS: Jenny Ceolta-Smith
JLD: Julie Darbyshire
LOCOMOTION: LOng COvid Multidisciplinary consortium Optimising Treatments and servIces acrOss the NHS
MDT: Multidisciplinary team
ME: Myalgic encephalomyelitis
MERS: Middle East Respiratory Syndrome
NASA: National Aeronautics and Space Administration
OT: Occupational therapy/ist
PESE: Post-exertional symptom exacerbation
POTS: Postural orthostatic tachycardia syndrome
SALT: Speech and language therapy
SARS: Severe Acute Respiratory Syndrome
TG: Trisha Greenhalgh
UK: United Kingdom
US: United States
WHO: World Health Organization

References

Perego E, Callard F, Stras L, Melville-Jóhannesson B, Pope R, Alwan N. Why the Patient-Made Term “Long Covid” is needed. Wellcome Open Res. 2020;5:224.

Greenhalgh T, Sivan M, Delaney B, Evans R, Milne R: Long covid—an update for primary care. bmj 2022;378:e072117.

Centers for Disease Control and Prevention (US): Long COVID or Post-COVID Conditions (updated 16th December 2022). Atlanta: CDC. Accessed 2nd June 2023 at https://www.cdc.gov/coronavirus/2019-ncov/long-term-effects/index.html ; 2022.

National Institute for Health and Care Excellence (NICE) Scottish Intercollegiate Guidelines Network (SIGN) and Royal College of General Practitioners (RCGP): COVID-19 rapid guideline: managing the long-term effects of COVID-19, vol. Accessed 30th January 2022 at https://www.nice.org.uk/guidance/ng188/resources/covid19-rapid-guideline-managing-the-longterm-effects-of-covid19-pdf-51035515742 . London: NICE; 2022.

World Health Organization: Post Covid-19 Condition (updated 7th December 2022), vol. Accessed 2nd June 2023 at https://www.who.int/europe/news-room/fact-sheets/item/post-covid-19-condition#:~:text=It%20is%20defined%20as%20the,months%20with%20no%20other%20explanation . Geneva: WHO; 2022.

Office for National Statistics: Prevalence of ongoing symptoms following coronavirus (COVID-19) infection in the UK: 31st March 2023. London: ONS. Accessed 30th May 2023 at https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/datasets/alldatarelatingtoprevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk ; 2023.

Crook H, Raza S, Nowell J, Young M, Edison P: Long covid—mechanisms, risk factors, and management. bmj 2021;374.

Sudre CH, Murray B, Varsavsky T, Graham MS, Penfold RS, Bowyer RC, Pujol JC, Klaser K, Antonelli M, Canas LS. Attributes and predictors of long COVID. Nat Med. 2021;27(4):626–31.

Reese JT, Blau H, Casiraghi E, Bergquist T, Loomba JJ, Callahan TJ, Laraway B, Antonescu C, Coleman B, Gargano M: Generalisable long COVID subtypes: findings from the NIH N3C and RECOVER programmes. EBioMedicine 2023;87.

Thaweethai T, Jolley SE, Karlson EW, Levitan EB, Levy B, McComsey GA, McCorkell L, Nadkarni GN, Parthasarathy S, Singh U. Development of a definition of postacute sequelae of SARS-CoV-2 infection. JAMA. 2023;329(22):1934–46.

Brown DA, O’Brien KK. Conceptualising Long COVID as an episodic health condition. BMJ Glob Health. 2021;6(9): e007004.

Tate WP, Walker MO, Peppercorn K, Blair AL, Edgar CD. Towards a Better Understanding of the Complexities of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome and Long COVID. Int J Mol Sci. 2023;24(6):5124.

Ahmed H, Patel K, Greenwood DC, Halpin S, Lewthwaite P, Salawu A, Eyre L, Breen A, Connor RO, Jones A. Long-term clinical outcomes in survivors of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome coronavirus (MERS) outbreaks after hospitalisation or ICU admission: a systematic review and meta-analysis. J Rehabil Med. 2020;52(5):1–11.

World Health Organization: Clinical management of severe acute respiratory infection (SARI) when COVID-19 disease is suspected: Interim guidance (13th March 2020). Geneva: WHO. Accessed 3rd January 2023 at https://t.co/JpNdP8LcV8?amp=1 ; 2020.

Rushforth A, Ladds E, Wieringa S, Taylor S, Husain L, Greenhalgh T: Long Covid – the illness narratives. Under review for Sociology of Health and Illness 2021.

Russell D, Spence NJ, Chase J-AD, Schwartz T, Tumminello CM, Bouldin E: Support amid uncertainty: Long COVID illness experiences and the role of online communities. SSM-Qual Res Health. 2022;2: 100177.

Ziauddeen N, Gurdasani D, O’Hara ME, Hastie C, Roderick P, Yao G, Alwan NA. Characteristics and impact of Long Covid: Findings from an online survey. PLoS ONE. 2022;17(3): e0264331.

Evans RA, McAuley H, Harrison EM, Shikotra A, Singapuri A, Sereno M, Elneima O, Docherty AB, Lone NI, Leavy OC. Physical, cognitive, and mental health impacts of COVID-19 after hospitalisation (PHOSP-COVID): a UK multicentre, prospective cohort study. Lancet Respir Med. 2021;9(11):1275–87.

Sykes DL, Holdsworth L, Jawad N, Gunasekera P, Morice AH, Crooks MG. Post-COVID-19 symptom burden: what is long-COVID and how should we manage it? Lung. 2021;199(2):113–9.

Altmann DM, Whettlock EM, Liu S, Arachchillage DJ, Boyton RJ: The immunology of long COVID. Nat Rev Immunol 2023:1–17.

Klein J, Wood J, Jaycox J, Dhodapkar RM, Lu P, Gehlhausen JR, Tabachnikova A, Greene K, Tabacof L, Malik AA et al : Distinguishing features of Long COVID identified through immune profiling. Nature 2023.

Chen B, Julg B, Mohandas S, Bradfute SB. Viral persistence, reactivation, and mechanisms of long COVID. Elife. 2023;12: e86015.

Wang C, Ramasamy A, Verduzco-Gutierrez M, Brode WM, Melamed E. Acute and post-acute sequelae of SARS-CoV-2 infection: a review of risk factors and social determinants. Virol J. 2023;20(1):124.

Cervia-Hasler C, Brüningk SC, Hoch T, Fan B, Muzio G, Thompson RC, Ceglarek L, Meledin R, Westermann P, Emmenegger M, et al. Persistent complement dysregulation with signs of thromboinflammation in active Long Covid. Science. 2024;383(6680):eadg7942.

Sivan M, Greenhalgh T, Darbyshire JL, Mir G, O’Connor RJ, Dawes H, Greenwood D, O’Connor D, Horton M, Petrou S. LOng COvid Multidisciplinary consortium Optimising Treatments and servIces acrOss the NHS (LOCOMOTION): protocol for a mixed-methods study in the UK. BMJ Open. 2022;12(5): e063505.

Rushforth A, Ladds E, Wieringa S, Taylor S, Husain L, Greenhalgh T. Long covid–the illness narratives. Soc Sci Med. 2021;286: 114326.

National Institute for Health and Care Excellence: COVID-19 rapid guideline: managing the long-term effects of COVID-19, vol. Accessed 4th October 2023 at https://www.nice.org.uk/guidance/ng188/resources/covid19-rapid-guideline-managing-the-longterm-effects-of-covid19-pdf-51035515742 . London: NICE 2020.

NHS England: Long COVID: the NHS plan for 2021/22. London: NHS England. Accessed 2nd August 2022 at https://www.england.nhs.uk/coronavirus/documents/long-covid-the-nhs-plan-for-2021-22/ ; 2021.

NHS England: NHS to offer ‘long covid’ sufferers help at specialist centres. London: NHS England. Accessed 10th October 2020 at https://www.england.nhs.uk/2020/10/nhs-to-offer-long-covid-help/ ; 2020 (7th October).

NHS England: The NHS plan for improving long COVID services, vol. Accessed 4th February 2024 at https://www.england.nhs.uk/publication/the-nhs-plan-for-improving-long-covid-services/ . London: Gov.uk; 2022.

NHS England: Commissioning guidance for post-COVID services for adults, children and young people, vol. Accessed 6th February 2024 at https://www.england.nhs.uk/long-read/commissioning-guidance-for-post-covid-services-for-adults-children-and-young-people/ . London: gov.uk; 2023.

National Institute for Health Research: Researching Long Covid: Addressing a new global health challenge, vol. Accessed 9.8.23 at https://evidence.nihr.ac.uk/collection/researching-long-covid-addressing-a-new-global-health-challenge/ . London: NIHR; 2022.

Subbaraman N. NIH will invest $1 billion to study long COVID. Nature. 2021;591(7850):356–356.

Donabedian A. The definition of quality and approaches to its assessment and monitoring. Ann Arbor: Michigan; 1980.

Laffel G, Blumenthal D. The case for using industrial quality management science in health care organizations. JAMA. 1989;262(20):2869–73.

Maxwell RJ. Quality assessment in health. BMJ. 1984;288(6428):1470.

Berwick DM, Godfrey BA, Roessner J. Curing health care: New strategies for quality improvement. The Journal for Healthcare Quality (JHQ). 1991;13(5):65–6.

Deming WE. Out of the Crisis. Cambridge, MA: MIT Press; 1986.

Argyris C: Increasing leadership effectiveness: New York: J. Wiley; 1976.

Juran JM: A history of managing for quality: The evolution, trends, and future directions of managing for quality: Asq Press; 1995.

Institute of Medicine (US): Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy Press; 2001.

McNab D, McKay J, Shorrock S, Luty S, Bowie P. Development and application of ‘systems thinking’ principles for quality improvement. BMJ Open Qual. 2020;9(1): e000714.

Sampath B, Rakover J, Baldoza K, Mate K, Lenoci-Edwards J, Barker P. Whole-System Quality: A Unified Approach to Building Responsive, Resilient Health Care Systems. Boston: Institute for Healthcare Improvement; 2021.

Batalden PB, Davidoff F: What is “quality improvement” and how can it transform healthcare? In . , vol. 16: BMJ Publishing Group Ltd; 2007: 2–3.

Baker G. Collaborating for improvement: the Institute for Healthcare Improvement’s breakthrough series. New Med. 1997;1:5–8.

Plsek PE. Collaborating across organizational boundaries to improve the quality of care. Am J Infect Control. 1997;25(2):85–95.

Ayers LR, Beyea SC, Godfrey MM, Harper DC, Nelson EC, Batalden PB. Quality improvement learning collaboratives. Qual Manage Healthcare. 2005;14(4):234–47.

Brandrud AS, Schreiner A, Hjortdahl P, Helljesen GS, Nyen B, Nelson EC. Three success factors for continual improvement in healthcare: an analysis of the reports of improvement team members. BMJ Qual Saf. 2011;20(3):251–9.

Dückers ML, Spreeuwenberg P, Wagner C, Groenewegen PP. Exploring the black box of quality improvement collaboratives: modelling relations between conditions, applied changes and outcomes. Implement Sci. 2009;4(1):1–12.

Nadeem E, Olin SS, Hill LC, Hoagwood KE, Horwitz SM. Understanding the components of quality improvement collaboratives: a systematic literature review. Milbank Q. 2013;91(2):354–94.

Shortell SM, Marsteller JA, Lin M, Pearson ML, Wu S-Y, Mendel P, Cretin S, Rosen M: The role of perceived team effectiveness in improving chronic illness care. Medical Care 2004:1040–1048.

Wilson T, Berwick DM, Cleary PD. What do collaborative improvement projects do? Experience from seven countries. Joint Commission J Qual Safety. 2004;30:25–33.

Schouten LM, Hulscher ME, van Everdingen JJ, Huijsman R, Grol RP. Evidence for the impact of quality improvement collaboratives: systematic review. BMJ. 2008;336(7659):1491–4.

Hulscher ME, Schouten LM, Grol RP, Buchan H. Determinants of success of quality improvement collaboratives: what does the literature show? BMJ Qual Saf. 2013;22(1):19–31.

Dixon-Woods M, Bosk CL, Aveling EL, Goeschel CA, Pronovost PJ. Explaining Michigan: developing an ex post theory of a quality improvement program. Milbank Q. 2011;89(2):167–205.

Bate P, Mendel P, Robert G: Organizing for quality: the improvement journeys of leading hospitals in Europe and the United States: CRC Press; 2007.

Andersson-Gäre B, Neuhauser D. The health care quality journey of Jönköping County Council, Sweden. Qual Manag Health Care. 2007;16(1):2–9.

Törnblom O, Stålne K, Kjellström S. Analyzing roles and leadership in organizations from cognitive complexity and meaning-making perspectives. Behav Dev. 2018;23(1):63.

Greenhalgh T, Russell J. Why Do Evaluations of eHealth Programs Fail? An Alternative Set of Guiding Principles. PLoS Med. 2010;7(11): e1000360.

Wells S, Tamir O, Gray J, Naidoo D, Bekhit M, Goldmann D. Are quality improvement collaboratives effective? A systematic review. BMJ Qual Saf. 2018;27(3):226–40.

Landon BE, Wilson IB, McInnes K, Landrum MB, Hirschhorn L, Marsden PV, Gustafson D, Cleary PD. Effects of a quality improvement collaborative on the outcome of care of patients with HIV infection: the EQHIV study. Ann Intern Med. 2004;140(11):887–96.

Mittman BS. Creating the evidence base for quality improvement collaboratives. Ann Intern Med. 2004;140(11):897–901.

Wennberg JE. Unwarranted variations in healthcare delivery: implications for academic medical centres. BMJ. 2002;325(7370):961–4.

Bungay H. Cancer and health policy: the postcode lottery of care. Soc Policy Admin. 2005;39(1):35–48.

Wennberg JE, Cooper MM: The Quality of Medical Care in the United States: A Report on the Medicare Program: The Dartmouth Atlas of Health Care 1999: The Center for the Evaluative Clinical Sciences [Internet]. 1999.

DaSilva P, Gray JM. English lessons: can publishing an atlas of variation stimulate the discussion on appropriateness of care? Med J Aust. 2016;205(S10):S5–7.

Gray WK, Day J, Briggs TW, Harrison S. Identifying unwarranted variation in clinical practice between healthcare providers in England: Analysis of administrative data over time for the Getting It Right First Time programme. J Eval Clin Pract. 2021;27(4):743–50.

Wabe N, Thomas J, Scowen C, Eigenstetter A, Lindeman R, Georgiou A. The NSW Pathology Atlas of Variation: Part I—Identifying Emergency Departments With Outlying Laboratory Test-Ordering Practices. Ann Emerg Med. 2021;78(1):150–62.

Jamal A, Babazono A, Li Y, Fujita T, Yoshida S, Kim SA. Elucidating variations in outcomes among older end-stage renal disease patients on hemodialysis in Fukuoka Prefecture, Japan. PLoS ONE. 2021;16(5): e0252196.

Sutherland K, Levesque JF. Unwarranted clinical variation in health care: definitions and proposal of an analytic framework. J Eval Clin Pract. 2020;26(3):687–96.

Tanenbaum SJ. Reducing variation in health care: The rhetorical politics of a policy idea. J Health Polit Policy Law. 2013;38(1):5–26.

Atsma F, Elwyn G, Westert G. Understanding unwarranted variation in clinical practice: a focus on network effects, reflective medicine and learning health systems. Int J Qual Health Care. 2020;32(4):271–4.

Horbar JD, Rogowski J, Plsek PE, Delmore P, Edwards WH, Hocker J, Kantak AD, Lewallen P, Lewis W, Lewit E. Collaborative quality improvement for neonatal intensive care. Pediatrics. 2001;107(1):14–22.

Van Maanen J: Tales of the field: On writing ethnography: University of Chicago Press; 2011.

Golden-Biddle K, Locke K. Appealing work: An investigation of how ethnographic texts convince. Organ Sci. 1993;4(4):595–616.

Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.

Glaser BG. The constant comparative method of qualitative analysis. Soc Probl. 1965;12:436–45.

Willis R. The use of composite narratives to present interview findings. Qual Res. 2019;19(4):471–80.

Vojdani A, Vojdani E, Saidara E, Maes M. Persistent SARS-CoV-2 Infection, EBV, HHV-6 and other factors may contribute to inflammation and autoimmunity in long COVID. Viruses. 2023;15(2):400.

Choutka J, Jansari V, Hornig M, Iwasaki A. Unexplained post-acute infection syndromes. Nat Med. 2022;28(5):911–23.

Connors JM, Ariëns RAS. Uncertainties about the roles of anticoagulation and microclots in postacute sequelae of severe acute respiratory syndrome coronavirus 2 infection. J Thromb Haemost. 2023;21(10):2697–701.

Patel MA, Knauer MJ, Nicholson M, Daley M, Van Nynatten LR, Martin C, Patterson EK, Cepinskas G, Seney SL, Dobretzberger V. Elevated vascular transformation blood biomarkers in Long-COVID indicate angiogenesis as a key pathophysiological mechanism. Mol Med. 2022;28(1):122.

Greenhalgh T, Sivan M, Delaney B, Evans R, Milne R: Long covid—an update for primary care. bmj 2022, 378.

Parkin A, Davison J, Tarrant R, Ross D, Halpin S, Simms A, Salman R, Sivan M. A multidisciplinary NHS COVID-19 service to manage post-COVID-19 syndrome in the community. J Prim Care Commun Health. 2021;12:21501327211010990.

NHS England: COVID-19 Post-Covid Assessment Service, vol. Accessed 5th March 2024 at https://www.england.nhs.uk/statistics/statistical-work-areas/covid-19-post-covid-assessment-service/ . London: NHS England; 2024.

Sivan M, Halpin S, Gee J, Makower S, Parkin A, Ross D, Horton M, O'Connor R: The self-report version and digital format of the COVID-19 Yorkshire Rehabilitation Scale (C19-YRS) for Long Covid or Post-COVID syndrome assessment and monitoring. Adv Clin Neurosci Rehabil 2021;20(3).

The EuroQol Group. EuroQol-a new facility for the measurement of health-related quality of life. Health Policy. 1990;16(3):199–208.

Sivan M, Preston NJ, Parkin A, Makower S, Gee J, Ross D, Tarrant R, Davison J, Halpin S, O’Connor RJ, et al. The modified COVID-19 Yorkshire Rehabilitation Scale (C19-YRSm) patient-reported outcome measure for Long Covid or Post-COVID syndrome. J Med Virol. 2022;94(9):4253–64.

Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991;14(6):540–5.

Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13.

Van Dixhoorn J, Duivenvoorden H. Efficacy of Nijmegen Questionnaire in recognition of the hyperventilation syndrome. J Psychosom Res. 1985;29(2):199–206.

Evans R, Pick A, Lardner R, Masey V, Smith N, Greenhalgh T: Breathing difficulties after covid-19: a guide for primary care. BMJ 2023;381.

Van Dixhoorn J, Folgering H: The Nijmegen Questionnaire and dysfunctional breathing. In . , vol. 1: Eur Respiratory Soc; 2015.

Courtney R, Greenwood KM. Preliminary investigation of a measure of dysfunctional breathing symptoms: The Self Evaluation of Breathing Questionnaire (SEBQ). Int J Osteopathic Med. 2009;12(4):121–7.

Espinosa-Gonzalez A, Master H, Gall N, Halpin S, Rogers N, Greenhalgh T. Orthostatic tachycardia after covid-19. BMJ (Clinical Research ed). 2023;380:e073488–e073488.

Bungo M, Charles J, Johnson P Jr. Cardiovascular deconditioning during space flight and the use of saline as a countermeasure to orthostatic intolerance. Aviat Space Environ Med. 1985;56(10):985–90.

Sivan M, Corrado J, Mathias C. The Adapted Autonomic Profile (Aap) Home-Based Test for the Evaluation of Neuro-Cardiovascular Autonomic Dysfunction. Adv Clin Neurosci Rehabil. 2022;3:10–13. https://doi.org/10.47795/QKBU46715 .

Lee C, Greenwood DC, Master H, Balasundaram K, Williams P, Scott JT, Wood C, Cooper R, Darbyshire JL, Gonzalez AE. Prevalence of orthostatic intolerance in long covid clinic patients and healthy volunteers: A multicenter study. J Med Virol. 2024;96(3): e29486.

World Health Organization: Clinical management of covid-19 - living guideline. Geneva: WHO. Accessed 4th October 2023 at https://www.who.int/publications/i/item/WHO-2019-nCoV-clinical-2021-2 ; 2023.

Ahmed I, Mustafaoglu R, Yeldan I, Yasaci Z, Erhan B: Effect of pulmonary rehabilitation approaches on dyspnea, exercise capacity, fatigue, lung functions and quality of life in patients with COVID-19: A Systematic Review and Meta-Analysis. Arch Phys Med Rehabil 2022.

Dillen H, Bekkering G, Gijsbers S, Vande Weygaerde Y, Van Herck M, Haesevoets S, Bos DAG, Li A, Janssens W, Gosselink R, et al. Clinical effectiveness of rehabilitation in ambulatory care for patients with persisting symptoms after COVID-19: a systematic review. BMC Infect Dis. 2023;23(1):419.

Learmonth Y, Dlugonski D, Pilutti L, Sandroff B, Klaren R, Motl R. Psychometric properties of the fatigue severity scale and the modified fatigue impact scale. J Neurol Sci. 2013;331(1–2):102–7.

Webster K, Cella D, Yost K. The Functional Assessment of Chronic Illness Therapy (FACIT) Measurement System: properties, applications, and interpretation. Health Qual Life Outcomes. 2003;1(1):1–7.

Mundt JC, Marks IM, Shear MK, Greist JM. The Work and Social Adjustment Scale: a simple measure of impairment in functioning. Br J Psychiatry. 2002;180(5):461–4.

Chalder T, Berelowitz G, Pawlikowska T, Watts L, Wessely S, Wright D, Wallace E. Development of a fatigue scale. J Psychosom Res. 1993;37(2):147–53.

Shahid A, Wilkinson K, Marcu S, Shapiro CM: Visual analogue scale to evaluate fatigue severity (VAS-F). In: STOP, THAT and one hundred other sleep scales . edn.: Springer; 2011:399–402.

Parker M, Sawant HB, Flannery T, Tarrant R, Shardha J, Bannister R, Ross D, Halpin S, Greenwood DC, Sivan M. Effect of using a structured pacing protocol on post-exertional symptom exacerbation and health status in a longitudinal cohort with the post-COVID-19 syndrome. J Med Virol. 2023;95(1): e28373.

Kenny RA, Bayliss J, Ingram A, Sutton R. Head-up tilt: a useful test for investigating unexplained syncope. The Lancet. 1986;327(8494):1352–5.

Drury MOC: Science and Psychology. In: The selected writings of Maurice O’Connor Drury: On Wittgenstein, philosophy, religion and psychiatry. edn.: Bloomsbury Publishing; 2017.

Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342(25):1887–92.

Montgomery K: How doctors think: Clinical judgment and the practice of medicine: Oxford University Press; 2005.

Acknowledgements

We are grateful to clinic staff for allowing us to study their work and to patients for allowing us to sit in on their consultations. We also thank the funder of LOCOMOTION (National Institute for Health Research) and the patient advisory group for lived experience input.

Funding

This research is supported by a National Institute for Health Research (NIHR) Long Covid Research Scheme grant (Ref COV-LT-0016).

Author information

Authors and affiliations

Nuffield Department of Primary Care Health Sciences, University of Oxford, Woodstock Rd, Oxford, OX2 6GG, UK

Trisha Greenhalgh, Julie L. Darbyshire & Emma Ladds

Imperial College Healthcare NHS Trust, London, UK

LOCOMOTION Patient Advisory Group and Lived Experience Representative, London, UK

Contributions

TG conceptualized the overall study, led the empirical work, supported the quality improvement meetings, conducted the ethnographic visits, led the data analysis, developed the theorization and wrote the first draft of the paper. JLD organized and led the quality improvement meetings, supported site-based researchers to collect and analyse data on their clinic, collated and summarized data on quality topics, and liaised with the patient advisory group. CL conceptualized and led the quality topic on POTS, including exploring reasons for some clinics’ reluctance to conduct testing and collating and analysing the NASA Lean Test data across all sites. EL assisted with ethnographic visits, data analysis, and theorization. JCS contributed lived experience of long covid and also clinical experience as an occupational therapist; she liaised with the wider patient advisory group, whose independent (patient-led) audit of long covid clinics informed the quality improvement prioritization exercise. All authors provided extensive feedback on drafts and contributed to discussions and refinements. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Trisha Greenhalgh.

Ethics declarations

Ethics approval and consent to participate

The LOng COvid Multidisciplinary consortium Optimising Treatments and servIces acrOss the NHS (LOCOMOTION) study is sponsored by the University of Leeds and approved by Yorkshire & The Humber - Bradford Leeds Research Ethics Committee (ref: 21/YH/0276) and subsequent amendments.

Patient participants in clinic were approached by the clinician (without the researcher present) and gave verbal informed consent for a clinically qualified researcher to observe the consultation. If they consented, the researcher was then invited to sit in. A written record of this verbal consent was made in field notes. It was impractical to seek consent from patients whose cases were discussed (usually with very brief clinical details) in online MDTs. Therefore, clinical case examples from MDTs presented in the paper are fictionalized cases constructed from multiple real cases and with key clinical details changed (for example, comorbidities were replaced with different conditions which would produce similar symptoms). All fictionalized cases were reviewed by our patient advisory group to confirm that they were plausible to lived experience experts.

Consent for publication

No direct patient cases are reported in this manuscript. For details of how the fictionalized cases were constructed and validated, see “Consent to participate” above.

Competing interests

TG was a member of the UK National Long Covid Task Force 2021–2023 and on the Oversight Group for the NICE Guideline on Long Covid 2021–2022. She is a member of Independent SAGE.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Greenhalgh, T., Darbyshire, J.L., Lee, C. et al. What is quality in long covid care? Lessons from a national quality improvement collaborative and multi-site ethnography. BMC Med 22, 159 (2024). https://doi.org/10.1186/s12916-024-03371-6

Received: 04 December 2023

Accepted: 26 March 2024

Published: 15 April 2024

DOI: https://doi.org/10.1186/s12916-024-03371-6

Keywords

  • Post-covid-19 syndrome
  • Quality improvement
  • Breakthrough collaboratives
  • Warranted variation
  • Unwarranted variation
  • Improvement science
  • Ethnography
  • Idiographic reasoning
  • Nomothetic reasoning

BMC Medicine

ISSN: 1741-7015

Case Study: How Aggressively Should a Bank Pursue AI?

  • Thomas H. Davenport
  • George Westerman

A Malaysia-based CEO weighs the risks and potential benefits of turning a traditional bank into an AI-first institution.

Siti Rahman, the CEO of Malaysia-based NVF Bank, faces a pivotal decision. Her head of AI innovation, a recent recruit from Google, has a bold plan. It requires a substantial investment but aims to transform the traditional bank into an AI-first institution, substantially reducing head count and the number of branches. The bank’s CFO worries they are chasing the next hype cycle and cautions against valuing efficiency above all else. Siti must weigh the bank’s mixed history with AI, the resistance to losing the human touch in banking services, and the risks of falling behind in technology against the need for a prudent, incremental approach to innovation.

Two experts offer advice: Noemie Ellezam-Danielo, the chief digital and AI strategy officer at Société Générale, and Sastry Durvasula, the chief information and client services officer at TIAA.

Siti Rahman, the CEO of Malaysia-headquartered NVF Bank, hurried through the corridors of the university’s computer engineering department. She had directed her driver to the wrong building—thinking of her usual talent-recruitment appearances in the finance department—and now she was running late. As she approached the room, she could hear her head of AI innovation, Michael Lim, who had joined NVF from Google 18 months earlier, breaking the ice with the students. “You know, NVF used to stand for Never Very Fast,” he said to a few giggles. “But the bank is crawling into the 21st century.”

  • Thomas H. Davenport is the President’s Distinguished Professor of Information Technology and Management at Babson College, a visiting scholar at the MIT Initiative on the Digital Economy, and a senior adviser to Deloitte’s AI practice. He is a coauthor of All-in on AI: How Smart Companies Win Big with Artificial Intelligence (Harvard Business Review Press, 2023).
  • George Westerman is a senior lecturer at MIT Sloan School of Management and a coauthor of Leading Digital (HBR Press, 2014).

BMC Med Res Methodol

The case study approach

Sarah Crowe

1 Division of Primary Care, The University of Nottingham, Nottingham, UK

Kathrin Cresswell

2 Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Ann Robertson

3 School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Anthony Avery

Aziz Sheikh

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1, 2, 3 and 4) and those of others to illustrate our discussion[ 3 - 7 ].

Table 1. Example of a case study investigating the reasons for differences in recruitment rates of minority ethnic people in asthma research[ 3 ]

Table 2. Example of a case study investigating the process of planning and implementing a service in Primary Care Organisations[ 4 ]

Table 3. Example of a case study investigating the introduction of the electronic health records[ 5 ]

Table 4. Example of a case study investigating the formal and informal ways students learn about patient safety[ 6 ]

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Table 5. Definitions of a case study

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic, instrumental and collective[ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are, however, not necessarily mutually exclusive categories. In the first of our examples (Table 1), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2, 3 and 4) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 - 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3, for example)[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, a randomised controlled trial in which a new drug is given to randomly selected individuals and outcomes are then compared with controls),[ 9 ] the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4)[ 6 , 10 ]. Key questions to consider when selecting the most appropriate study design are whether it is desirable, or indeed possible, to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

Table 6. Example of epistemological approaches that may be used in case study research

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7)[ 1 ]. A theory-driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

Table 7. Example of a checklist for rating a case study proposal[ 8 ]

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may, however, prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.
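
The boundary elements discussed in this section (unit of analysis, time period and types of evidence) can be written down explicitly before fieldwork begins. The following is a minimal Python sketch of one way to do so. The `CaseDefinition` class, its field names and all example values (including the dates) are hypothetical illustrations and are not taken from the studies described here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class CaseDefinition:
    """Hypothetical record of a case's pre-defined boundary."""
    name: str
    unit_of_analysis: str   # e.g. an organisation or a professional group
    start: date             # beginning of the period covered by the case
    end: date               # end of the period covered by the case
    evidence_types: tuple = field(default_factory=tuple)

    def __post_init__(self):
        # A boundary whose end precedes its start is not a usable boundary.
        if self.end < self.start:
            raise ValueError("case end date precedes its start date")

    def covers(self, when: date) -> bool:
        """Return True if an observation date falls inside the boundary."""
        return self.start <= when <= self.end


# Illustrative values only; the site name and dates are invented.
example_case = CaseDefinition(
    name="Hospital A electronic health record implementation",
    unit_of_analysis="organisation",
    start=date(2009, 1, 1),
    end=date(2010, 12, 31),
    evidence_types=("interviews", "observations", "documents"),
)
```

Calling `example_case.covers(some_date)` then gives a simple check of whether a piece of evidence falls inside the period the team agreed to study. As noted above, the real beginning and end of a case are often a matter of judgement, so a record like this documents a decision rather than settling it.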

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry[ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 - 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2)[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach provided a number of contextual factors likely to influence the effectiveness of the intervention and which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
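
To make the idea of applying an initial coding frame systematically more concrete, the following minimal Python sketch indexes text segments against a small frame so that material on each issue can be retrieved later. It is an illustration only: the frame, the segment identifiers and the excerpts are invented, and simple keyword matching is a crude stand-in for the interpretive judgement a researcher exercises when coding, whether by hand or in a qualitative data analysis package.

```python
from collections import defaultdict

# Hypothetical coding frame: code label -> keywords that signal the code.
CODING_FRAME = {
    "access": ["access", "gatekeeper"],
    "training": ["training", "taught"],
}


def code_segments(segments, frame):
    """Apply every code in the frame to every segment.

    Returns a mapping of code label -> list of segment ids, in the order
    the segments were supplied.
    """
    index = defaultdict(list)
    for seg_id, text in segments.items():
        lowered = text.lower()
        for code, keywords in frame.items():
            if any(keyword in lowered for keyword in keywords):
                index[code].append(seg_id)
    return dict(index)


# Invented excerpts, keyed by a site/interview identifier.
segments = {
    "site1_int01": "Access to the ward was negotiated through a gatekeeper.",
    "site1_int02": "Staff said the training was too short.",
    "site2_int01": "Nobody was taught how to use it, and access was patchy.",
}
coded = code_segments(segments, CODING_FRAME)
```

Because the identifiers carry the site name, the resulting index can be filtered by case first, which fits the advice above to analyse each component case before comparing across cases.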

The Framework approach is a practical approach, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation), to managing and analysing large datasets, particularly if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[ 3 , 24 ]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements - technology; people; and the organisational settings within which they worked - in our study of the introduction of electronic health record systems (Table 3)[ 5 ]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[ 6 ].
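
The charting stage of the Framework approach can be pictured as arranging indexed material into a case-by-theme matrix. The Python sketch below shows one possible way to build such a matrix and to spot empty cells. The function names, site labels, themes and summaries are all hypothetical, and the sketch covers only the mechanical part of charting, not the interpretive work of the mapping and interpretation stage.

```python
def chart(indexed_extracts, cases, themes):
    """Build a case-by-theme matrix from (case, theme, summary) tuples."""
    matrix = {case: {theme: [] for theme in themes} for case in cases}
    for case, theme, summary in indexed_extracts:
        matrix[case][theme].append(summary)
    return matrix


def uncharted_cells(matrix):
    """List (case, theme) cells that hold no data and may need review."""
    return [
        (case, theme)
        for case, row in matrix.items()
        for theme, cell in row.items()
        if not cell
    ]


# Invented summaries of indexed data from two hypothetical sites.
extracts = [
    ("Site A", "leadership", "Clinical lead championed the change."),
    ("Site A", "resources", "No protected time for staff."),
    ("Site B", "leadership", "Frequent turnover of managers."),
]
matrix = chart(extracts, cases=["Site A", "Site B"],
               themes=["leadership", "resources"])
gaps = uncharted_cells(matrix)
```

Reading along a row gives the within-case picture and reading down a column supports cross-case comparison. An empty cell, such as the one reported in `gaps`, may prompt a check on whether the theme is truly absent at that site or whether the data were simply not collected.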

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3, we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[ 1 ]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[ 8 , 18 - 21 , 23 , 26 ]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, help readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[ 8 ].

Table 8. Potential pitfalls and mitigating actions when undertaking case study research

Table 9. Stake's checklist for assessing the quality of a case study report[ 8 ]

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

References

  1. Yin RK. Case study research, design and method. 4th ed. London: Sage Publications Ltd.; 2009.
  2. Keen J, Packwood T. Qualitative research; case study evaluation. BMJ. 1995;311:444–446.
  3. Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al. Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009;6(10):1–11.
  4. Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S. The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO) 2008. http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf
  5. Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al. Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010;341:c4564.
  6. Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group. Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010;15:4–10. doi: 10.1258/jhsrp.2009.009052.
  7. van Harten WH, Casparie TF, Fisscher OA. The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002;60(1):17–37. doi: 10.1016/S0168-8510(01)00187-7.
  8. Stake RE. The art of case study research. London: Sage Publications Ltd.; 1995.
  9. Sheikh A, Smeeth L, Ashcroft R. Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002;52(482):746–51.
  10. King G, Keohane R, Verba S. Designing Social Inquiry. Princeton: Princeton University Press; 1996.
  11. Doolin B. Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998;13:301–311. doi: 10.1057/jit.1998.8.
  12. George AL, Bennett A. Case studies and theory development in the social sciences. Cambridge, MA: MIT Press; 2005.
  13. Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG). Designing theoretically-informed implementation interventions. Implementation Science. 2006;1:1–8. doi: 10.1186/1748-5908-1-1.
  14. Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A. Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005;365(9456):312–7.
  15. Sheikh A, Panesar SS, Lasserson T, Netuveli G. Recruitment of ethnic minorities to asthma studies. Thorax. 2004;59(7):634.
  16. Hellström I, Nolan M, Lundh U. 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005;4:7–22. doi: 10.1177/1471301205049188.
  17. Som CV. Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005;18:463–477. doi: 10.1108/09513550510608903.
  18. Lincoln Y, Guba E. Naturalistic inquiry. Newbury Park: Sage Publications; 1985.
  19. Barbour RS. Checklists for improving rigour in qualitative research: a case of the tail wagging the dog? BMJ. 2001;322:1115–1117. doi: 10.1136/bmj.322.7294.1115.
  20. Mays N, Pope C. Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000;320:50–52. doi: 10.1136/bmj.320.7226.50.
  21. Mason J. Qualitative researching. London: Sage; 2002.
  22. Brazier A, Cooke K, Moravan V. Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008;7:5–17. doi: 10.1177/1534735407313395.
  23. Miles MB, Huberman M. Qualitative data analysis: an expanded sourcebook. 2nd ed. CA: Sage Publications Inc.; 1994.
  24. Pope C, Ziebland S, Mays N. Analysing qualitative data. Qualitative research in health care. BMJ. 2000;320:114–116. doi: 10.1136/bmj.320.7227.114.
  25. Cresswell KM, Worth A, Sheikh A. Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010;10(1):67. doi: 10.1186/1472-6947-10-67.
  26. Malterud K. Qualitative research: standards, challenges, and guidelines. Lancet. 2001;358:483–488. doi: 10.1016/S0140-6736(01)05627-6.
  27. Yin R. Case study research: design and methods. 2nd ed. Thousand Oaks, CA: Sage Publishing; 1994.
  28. Yin R. Enhancing the quality of case studies in health services research. Health Serv Res. 1999;34:1209–1224.
  29. Green J, Thorogood N. Qualitative methods for health research. 2nd ed. Los Angeles: Sage; 2009.
  30. Howcroft D, Trauth E. Handbook of Critical Information Systems Research, Theory and Application. Cheltenham, UK: Northampton, MA, USA: Edward Elgar; 2005.
  31. Blaikie N. Approaches to Social Enquiry. Cambridge: Polity Press; 1993.
  32. Doolin B. Power and resistance in the implementation of a medical management information system. Info Systems J. 2004;14:343–362. doi: 10.1111/j.1365-2575.2004.00176.x.
  • Bloomfield BP, Best A. Management consultants: systems development, power and the translation of problems. Sociological Review. 1992; 40 :533–560. [ Google Scholar ]
  • Shanks G, Parr A. Proceedings of the European Conference on Information Systems. Naples; 2003. Positivist, single case study research in information systems: A critical analysis. [ Google Scholar ]


  • Open access
  • Published: 22 April 2024

Virtual reality-empowered deep-learning analysis of brain cells

  • Doris Kaltenecker 1 , 2 , 3 , 4   na1 ,
  • Rami Al-Maskari 4 , 5 , 6 , 7   na1 ,
  • Moritz Negwer 5   na1 ,
  • Luciano Hoeher 5 ,
  • Florian Kofler 6 , 7 , 8 , 9 ,
  • Shan Zhao 4 , 5 ,
  • Mihail Todorov   ORCID: orcid.org/0000-0002-8627-1260 4 , 5 ,
  • Zhouyi Rong 4 , 5 ,
  • Johannes Christian Paetzold 5 , 7 , 10 ,
  • Benedikt Wiestler   ORCID: orcid.org/0000-0002-2963-7772 8 ,
  • Marie Piraud   ORCID: orcid.org/0000-0002-4917-2458 9 ,
  • Daniel Rueckert   ORCID: orcid.org/0000-0002-5683-5889 10 ,
  • Julia Geppert 1 , 2 , 3 ,
  • Pauline Morigny 1 , 2 , 3 ,
  • Maria Rohm   ORCID: orcid.org/0000-0003-3926-1534 1 , 2 , 3 ,
  • Bjoern H. Menze 6 , 11 ,
  • Stephan Herzig   ORCID: orcid.org/0000-0003-3950-3652 1 , 2 , 3 , 12 ,
  • Mauricio Berriel Diaz   ORCID: orcid.org/0000-0003-4670-919X 1 , 2 , 3 &
  • Ali Ertürk   ORCID: orcid.org/0000-0001-5163-5100 4 , 5 , 13 , 14 , 15  

Nature Methods (2024)


  • Fluorescence imaging
  • Machine learning
  • Neuroscience

Automated detection of specific cells in three-dimensional datasets such as whole-brain light-sheet image stacks is challenging. Here, we present DELiVR, a virtual reality-trained deep-learning pipeline for detecting c-Fos + cells as markers for neuronal activity in cleared mouse brains. Virtual reality annotation substantially accelerated training data generation, enabling DELiVR to outperform state-of-the-art cell-segmenting approaches. Our pipeline is available in a user-friendly Docker container that runs with a standalone Fiji plugin. DELiVR features a comprehensive toolkit for data visualization and can be customized to other cell types of interest, as we did here for microglia somata, using Fiji for dataset-specific training. We applied DELiVR to investigate cancer-related brain activity, unveiling an activation pattern that distinguishes weight-stable cancer from cancers associated with weight loss. Overall, DELiVR is a robust deep-learning tool that does not require advanced coding skills to analyze whole-brain imaging data in health and disease.

Analyzing the expression of proteins is essential to understand cellular and molecular processes in physiology and disease. While standard immunohistochemistry is useful for validating protein expression on tissue sections, it does not provide a holistic view of expression patterns in larger samples and information can be lost during slicing 1 , 2 . Tissue clearing and fluorescent imaging solve many of these restrictions and allow unbiased protein expression analysis up to the whole-organism scale 1 , 3 , 4 , 5 .

Whole-brain analysis is essential for detecting areas involved in specific behaviors or conditions. A brain-wide snapshot of the neuronal activity of an animal can be obtained by immunostaining for the expression of immediate early genes such as c-Fos . Unbiased quantification methods for system-level examination at the single-cell resolution are essential to interpret those brain-wide findings 6 , but current automated methods for cell detection and registration to the Allen Mouse Brain Atlas 7 , 8 , 9 are difficult to apply consistently to three-dimensional (3D) whole-brain datasets. Variations in image acquisitions between samples, uneven signal-to-noise ratios across the tissue or low abundance of the target protein limit detection sensitivity and specificity. This requires manual adjustments such as setting sample and volume-specific thresholds or using conservative thresholds that will not capture all information in each sample. Deep-learning-based cell detection methods offer a promising solution to address these challenges; however, their implementation typically demands advanced coding skills, presenting a challenge for users lacking computational expertise.

Here, we developed DELiVR (deep learning and virtual reality mesoscale annotation pipeline), a virtual reality (VR)-aided deep-learning pipeline for detecting c-Fos + cells in cleared mouse brains (Fig. 1a ) that can be extended to other cell types. We generated high-quality annotations of light-sheet microscopy data of cleared whole mouse brains stained for c-Fos in a VR environment. Next, we trained a deep neural network on these data to identify c-Fos + cells across the brain and mapped them automatically to the Allen Brain Atlas. To increase the usability of DELiVR, we packaged it into a single Docker container that runs via a plugin for the open-source software Fiji. DELiVR can also be trained with custom data via Fiji to adapt DELiVR to specific datasets. We used DELiVR to study cancer-related cachexia and found increased neuronal activity in mice with weight-stable cancer in brain areas related to sensory processing and foraging. In contrast, this increase was lost in cachectic animals, suggesting a weight-stable cancer-specific neurophysiological hyperactivation phenotype.

figure 1

a , Summary of VR-aided deep learning for antibody-labeled cell segmentation in mouse brains. (i) Fixed mouse brains are subjected to SHANEL-based antibody labeling, tissue clearing and fluorescent light-sheet imaging. (ii) Volumes of raw data are labeled in VR to generate reference annotations. (iii) The DELiVR pipeline was packaged in a Docker container, controlled via a Fiji plugin. DELiVR segments cells using deep learning and registers them to the Allen Brain Atlas. DELiVR produces per-region cell counts and generates visualizations with all detected cells color coded by atlas region. b , Patch volume of raw data (c-Fos-labeled brain imaged with LSFM) loaded into Arivis VisionVR. Volume size represents 200³ voxel, rendered isotropically. c , Illustration of VR goggles and VR zoomed-in view of the same data as in b . d – f , Using Arivis VisionVR, individual cells were annotated by placing a selection cube on the cell ( d ), fitting the cube to the size of the cell ( e ) and filling ( f ). Scale bar, 10 µm. g , h , Zoomed-in view of raw data (same volume as in b ) ( g ) and annotation overlay generated in VR ( h ). Scale bar, 10 µm. i , Time spent for annotating a test patch using 2D-slice ( n = 7) and VR annotation ( n = 12 with n = 6 annotations performed with Arivis VisionVR and n = 6 annotations performed with syGlass). Data are presented as mean ± s.e.m. *** P = 0.0005, two-sided Mann–Whitney U -test. j , Instance Dice of 2D-slice annotation ( n = 7) versus VR annotation ( n = 12 with n = 6 annotations performed with Arivis VisionVR and n = 6 annotations performed with syGlass). Data are presented as mean ± s.e.m. * P = 0.0445, two-sided unpaired t -test. A.U., arbitrary units.


Reference annotation is faster in VR compared to 2D slices

We used the SHANEL protocol 10 for whole-brain c-Fos immunostaining, tissue clearing and light-sheet fluorescence microscopy (LSFM). To train deep-learning segmentation models in a supervised manner, substantial amounts of high-quality expert annotations are crucial. As common annotation approaches such as ITK-SNAP 11 rely on time-consuming sequential two-dimensional (2D) slice-by-slice annotation, we used a VR approach that allows for full immersion into 3D volumetric data (Fig. 1b,c ). We used two commercial VR annotation software packages (Arivis VisionVR and syGlass 12 ) to evaluate the speed and accuracy of VR in comparison to 2D slice-based annotation in ITK-SNAP.

For annotation using Arivis VisionVR, the annotator defined a region of interest (ROI) in which an adaptive thresholding function was applied, according to the annotator’s input (Fig. 1d–h and Supplementary Video 1 ). In syGlass, the annotation tool allowed the annotator to draw simple 3D shapes as ROIs and adjust a threshold until the annotation was acceptable to the annotator (Extended Data Fig. 1a–d and Supplementary Video 2 ). In ITK-SNAP, individual c-Fos + cells were segmented in each plane of the image stack (Extended Data Fig. 1e and Supplementary Video 3 ). We evaluated the time spent by the annotators for a 100³ voxel sub-volume (depicting 83 c-Fos + cells) as well as the annotation quality of cell instances using the F1 score. We found that VR annotation was significantly ( P  = 0.0005, two-sided Mann–Whitney U -test) faster than 2D-slice annotation (Fig. 1i ) and improved annotation quality (increase in F1 score from 0.7383 to 0.8032 (Fig. 1j )). Thus, we decided to generate reference data in VR for our deep-learning algorithm for c-Fos activity mapping.

DELiVR outperforms threshold-based c-Fos segmentation

To comprehensively analyze neuronal activity across the entire brain, DELiVR detects cells, aligns them to the Allen Brain Atlas and visualizes the segmentation in both image and atlas space. The pipeline therefore consists of multiple steps (Fig. 2a ). First, it downsamples the raw image stack and generates ventricle masks (Extended Data Fig. 2a–c ). It then upscales the masks and uses them to mask the ventricles in the raw image input. DELiVR then uses a customized sliding-window inferer to identify potential cells. Afterwards, a connected component analysis 13 identifies individual cells in the masked images, returns the center-point coordinates and volume of each segmented cell, and the cells are filtered by size. Finally, DELiVR aligns the previously downsampled brain to the Allen Brain Atlas (CCF3, 50 µm per voxel) with mBrainAligner 14 and automatically assigns the corresponding atlas region to each detected cell.
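The connected component step described above can be illustrated with a minimal pure-Python sketch. It is not the DELiVR implementation: the function name `connected_components_3d`, the 6-connectivity rule, and the `min_size`/`max_size` parameters are assumptions chosen for illustration, and the toy mask stands in for a binary segmentation.

```python
from collections import deque

def connected_components_3d(mask, min_size=1, max_size=None):
    """Group 6-connected foreground voxels of a nested-list mask[z][y][x].

    Returns one dict per component with its centroid (z, y, x) and its
    volume in voxels. The size limits are hypothetical filter settings.
    """
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    cells = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill from this unvisited foreground voxel.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                voxels = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    voxels.append((cz, cy, cx))
                    for dz, dy, dx in neighbours:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                volume = len(voxels)
                # Size filter: drop components outside the accepted range.
                if volume < min_size or (max_size is not None and volume > max_size):
                    continue
                centroid = tuple(sum(v[i] for v in voxels) / volume for i in range(3))
                cells.append({"centroid": centroid, "volume": volume})
    return cells

# Toy 3x3x3 mask: one two-voxel blob and one isolated voxel.
mask = [[[0] * 3 for _ in range(3)] for _ in range(3)]
mask[0][0][0] = 1
mask[0][0][1] = 1
mask[2][2][2] = 1
cells = connected_components_3d(mask, min_size=2)
print(cells)
```

With `min_size=2` the isolated voxel is discarded and only the two-voxel blob remains. On whole-brain volumes an optimized library would be needed in place of nested lists; the sketch only shows the logic of labeling, measuring and filtering.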

figure 2

a , Scheme of the DELiVR inference pipeline. All components are packaged in a single Docker container. Raw image stacks serve as input. They are downsampled for atlas alignment and optionally masked (to exclude detection on ventricles). The masked images are then passed on to deep-learning cell detection (inference), which produces binary segmentations. The binarized cell’s center points are subsequently transformed to the Allen Brain Atlas CCF3 space. The cells are visualized in atlas space as (group-wise) heat maps and in image space as color-coded tiff stacks. b , Quantitative comparison of segmentation performance based on instance Dice (F1 score) between different deep-learning architectures and DELiVR. c , F1 scores for non-deep-learning methods (gray) and DELiVR (the same F1 score for DELiVR is used as in b ). d , 3D qualitative comparison between ClearMap, ClearMap2, ‘Optimized’ ClearMap, Ilastik and DELiVR on instance basis. Predicted cells with overlap in reference annotations (TP) are masked in green, predicted cells with no overlap in reference annotations (FP) are masked in red. Undetected reference annotation cells (FN) are marked in blue. TP, true positive; FP, false positive; FN, false negative. Scale bar, 100 µm. e , Whole-brain segmentation output of the detected cells is visualized in atlas space using BrainRender. Scale bar, 1 mm in CCF3 atlas space.

To train and validate our model, we randomly sampled and VR-annotated 48 × 100³ voxel patches (referring to 5,889 cells) from a c-Fos-labeled brain. From these we trained a 3D BasicUNet (Extended Data Fig. 2d ). In addition, we trained recent larger segmentation models, such as transformers 15 , SegResNet 16 and the MONAI DynUnet 17 to determine which model was best suited for our data. Assessing the instance performance by calculating the overlap between individual cells, the 3D BasicUNet architecture showed the best performance (based on F1 score) (Fig. 2b and Extended Data Fig. 2e ). Therefore, we chose the 3D BasicUNet for our DELiVR pipeline.
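As a rough illustration of instance-level evaluation by overlap, the sketch below counts true positives, false positives and false negatives using the definitions given in the Fig. 2d caption. The function name `match_instances`, the representation of each cell as a set of voxel coordinates, and the rule that a single shared voxel counts as overlap are assumptions; the overlap criterion used by the authors is not stated in this text.

```python
def match_instances(predicted, reference):
    """Count TP, FP and FN for two lists of cells.

    Each cell is a set of (z, y, x) voxel coordinates. A predicted cell
    that shares at least one voxel with any reference cell is a true
    positive (assumed criterion); otherwise it is a false positive.
    Reference cells that no prediction touches are false negatives.
    """
    reference_hit = [False] * len(reference)
    tp = 0
    fp = 0
    for pred in predicted:
        overlapped = False
        for i, ref in enumerate(reference):
            if pred & ref:
                overlapped = True
                reference_hit[i] = True
        if overlapped:
            tp += 1
        else:
            fp += 1
    fn = reference_hit.count(False)
    return tp, fp, fn

# Hypothetical toy data: one matched cell, one spurious prediction,
# one reference cell that was missed.
predicted = [{(0, 0, 0), (0, 0, 1)}, {(5, 5, 5)}]
reference = [{(0, 0, 1)}, {(9, 9, 9)}]
tp, fp, fn = match_instances(predicted, reference)
print(tp, fp, fn)
```

From these counts, instance precision is TP / (TP + FP), instance sensitivity is TP / (TP + FN), and the F1 score is their harmonic mean.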

We also compared DELiVR with previously published non-deep-learning models that are applicable to cell detection in 3D images and had code available (ClearMap 7 , ClearMap2 (ref. 18 ) and Ilastik 19 ). Our performance on the test set shows an F1 score of 0.7918 (+89.03% increase), instance sensitivity of 0.8470 (+181.64% increase), instance precision of 0.7434 (+7.74% increase) and a volumetric Dice of 0.6739 (+581.39% increase) compared to the second-best performing method, ClearMap2 (Fig. 2c,d , Extended Data Fig. 2f and Supplementary Table 1 ). We increased the performance of ClearMap based on the F1 score to 0.65 by manually pre-processing image stacks and optimizing parameters for cell detection 3 ; however, DELiVR still had a superior performance. These scores demonstrate a clear improvement over filter and threshold-based segmentation methods as the deep-learning model captures 84.8 times more cells (1,611 true positives) than ClearMap (19 true positives), 2.8 times more cells than ClearMap2 (572 true positives) and 31.2% more than the optimized version of ClearMap (1,228 true positives) while not over-segmenting. For visualization, DELiVR generates a whole-brain segmentation output that exists in the original image space. Here, each segmented cell corresponds to a threshold value fitting to an Area ID of the Allen Brain Atlas and was colored according to the brain region that it belongs to (Extended Data Fig. 3a,b and Supplementary Video 4 ). In addition, we used BrainRender 20 to plot and visualize the detected cells in the atlas space (Fig. 2e and Extended Data Fig. 3c ).
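The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the paragraph above; the function name is illustrative, and the F1 score is computed as the harmonic mean of instance precision and instance sensitivity.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)

# Instance precision and sensitivity reported for DELiVR on the test set.
f1 = f1_from_precision_recall(0.7434, 0.8470)

# True-positive counts reported for each method.
tp_delivr = 1611
tp_clearmap = 19
tp_clearmap2 = 572
tp_clearmap_optimized = 1228

fold_vs_clearmap = tp_delivr / tp_clearmap
fold_vs_clearmap2 = tp_delivr / tp_clearmap2
percent_vs_optimized = (tp_delivr / tp_clearmap_optimized - 1) * 100

print(round(f1, 4))                    # 0.7918
print(round(fold_vs_clearmap, 1))      # 84.8
print(round(fold_vs_clearmap2, 1))     # 2.8
print(round(percent_vs_optimized, 1))  # 31.2
```

The recomputed values agree with the F1 score and the relative true-positive gains stated in the text.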

To increase usability, the entire DELiVR pipeline, encompassing atlas alignment, cell detection and visualization, is available as a single, user-friendly Docker container for both Linux and Windows. Docker is a software platform that allows applications to be bundled and distributed, along with their required components, in a uniform container format 21 . We also developed a dedicated Fiji plugin to seamlessly run the DELiVR Docker container (Fig. 3a–c and Supplementary Video 5 ).

figure 3

a – c , The DELiVR plugin will appear in Fiji upon installation. It can launch DELiVR for inference ( b ) or launch the training Docker to train on domain-specific training data ( c ). d , e , Zoomed-in Arivis VisionVR view of raw data from a CX3CR1 GFP/+ microglia reporter mouse ( d ) and annotation overlay of cell bodies generated in VR ( e ). Scale bar, 10 µm. f , 3D representation of the training evaluation on instance basis; predicted cells with overlap in reference annotations are masked in green (TP), predicted cells with no overlap in reference annotations are masked in red (FP) and reference annotation cells with no corresponding prediction are marked in blue (FN). Following training, DELiVR segments microglia cell bodies with a Dice (F1) score of 0.92. Scale bar, 10 µm. g , Optical section of a CX3CR1 GFP/+ microglia reporter mouse brain hemisphere ( n = 1, sagittal), scanned at ×12 magnification and shown with inverted brightness (microglia appear as black spots). Scale bar, 1 mm. h , Zoomed-in view of the cortex (red inset in g ), with overlaid segmented cells detected by whole-hemisphere DELiVR analysis shown in green ( n = 1). Scale bar, 100 µm. i , Visualization of 12.2 million CX3CR1 GFP/+ microglia across one hemisphere, generated by DELiVR and visualized with Imaris. Color-coding per Allen Brain Atlas CCF3 regions. Scale bar, 1 mm.

Moreover, we provide a Docker container for training that integrates with the DELiVR Fiji plugin (Fig. 3a,c ). This feature allows users to (re-)train DELiVR on other datasets, thereby enhancing DELiVR’s precision and adaptability. Users can choose to fine-tune the existing c-Fos model or train their own model from scratch. For this, one can adjust hyperparameters such as the number of epochs and learning rate. The user-trained model can then be used in the DELiVR pipeline as the inference model. For a comprehensive guide, please consult our ‘DELiVR handbook’ ( Supplementary Note ).

We used our DELiVR training to segment the cell bodies of microglia, the brain’s resident macrophages 22 . We performed whole-brain nanobody labeling in CX3CR1 GFP/+ reporter mice, clearing and LSFM, and annotated microglia somata in VR (Fig. 3d,e ). Training used a dataset of 161 VR-annotated 100³ voxel patches with a total of 3,798 annotated microglia somata. The newly trained model had an F1 score of 0.92, indicating robust performance 23 (Fig. 3f ). We applied this model in the DELiVR pipeline and could detect and map microglia cell bodies throughout the brain. Using DELiVR’s visualization tool, we evaluated the microglia cell body segmentation output generated by DELiVR in our original images (Fig. 3g–i ) and mapped the segmented cells to region IDs of the Allen Brain Atlas (Fig. 3i ). DELiVR thereby allows users to locate and confirm an anatomical or functional sub-area in the original image stack of the brain.

DELiVR identifies activation patterns in tumor-bearing mice

Cancer affects normal physiology locally in the surrounding tissue but can also lead to profound changes in the systemic metabolism of the patient. This is exemplified by the wasting syndrome cancer-associated cachexia (CAC) characterized by involuntary loss of body weight 24 , 25 , 26 and specific changes in brain activity 27 .

To identify brain regions affecting body weight maintenance in cancer, we used DELiVR to compare the neuronal activity patterns between weight-stable cancer and CAC. We subcutaneously transplanted NC26 colon cancer cells that give rise to weight-stable cancer or C26 colon cancer cells that induce weight loss (Fig. 4a ). As expected 28 , no changes in body weight were observed in NC26 tumor-bearing mice compared to controls, whereas C26 tumor-bearing mice showed significant ( P  < 0.0001, one-way analysis of variance (ANOVA) with Sidak post hoc analysis) reductions (Fig. 4b ). The differences in body weight were not due to differences in tumor mass (Fig. 4c ). C26 tumor-bearing mice displayed reduced weights of the gastrocnemius muscle and white adipose tissue depots (Extended Data Fig. 4a–c ). We observed a small but statistically significant ( P  = 0.0479, one-way ANOVA with Sidak post hoc analysis) decrease in brain weights of cachectic C26 versus weight-stable NC26 tumor-bearing mice (Extended Data Fig. 4d ). We performed c-Fos antibody labeling, clearing and imaging of whole brains of these mice and applied DELiVR for whole-brain mapping of neuronal activity. c-Fos + density maps indicated an increase in brain activity in weight-stable NC26 tumor-bearing mice compared to phosphate-buffered saline (PBS) controls, whereas this increase was not present in cachectic C26 tumor-bearing mice (Fig. 4d ).

figure 4

a , Experimental setup. Adult mice were subcutaneously injected with PBS as control, NC26 cells that lead to a weight-stable cancer or cachexia-inducing C26 cancer cells. b , Body weight change of mice at the end of the experiment compared to starting body weight. Tumor weight was subtracted from the final body weight. n (PBS) = 12, n (NC26) = 8, n (C26) = 12. Data are presented as mean ± s.e.m. **** P < 0.0001, one-way ANOVA with Sidak post hoc analysis. c , Tumor weight at the end of the experiment. n (NC26) = 8, n (C26) = 12. Data are presented as mean ± s.e.m. d , Normalized c-Fos+ cell density in brains of PBS controls, mice with weight-stable cancer (NC26) and mice with cancer-associated weight loss (C26), visualized in CCF3 atlas space. n (PBS) = 12, n (NC26) = 8, n (C26) = 12. Scale bars, 2 mm.

The increase in brain activity in NC26 tumor-bearing mice was most pronounced throughout the cortical plate and in the lateral septal complex (Fig. 5a,b ). Overall, we identified 19 areas in NC26 tumor-bearing mice that showed statistically significantly ( P adj < 0.1, two-sided unpaired t -tests with Benjamini–Hochberg multiple-testing correction with family-wise error rate (FWER) = 0.1) increased c-Fos expression compared to PBS controls after multiple-testing correction (Fig. 5a ). We found that NC26-bearing mice also have more c-Fos + cells in the cortical plate, with the most pronounced differences observed in the somatomotor areas (Fig. 5a,b ). NC26 tumors notably increased c-Fos + density in the somatosensory cortex related to the snout, specifically in the mouth region (layer 2/3 and 4) and barrel field layer 4. Furthermore, NC26-bearing mice showed more c-Fos + cells than PBS controls in the primary (layers 1 and 5) and secondary motor areas (layers 2/3 and 5; Fig. 5a,b ). The primary motor cortex layer 5 is especially interesting because it contains extratelencephalic projection neurons that project as far as the spinal cord, among others 29 .

figure 5

a , Brain-region-wise c-Fos+ cell density log₂(fold change) compared between the three groups. * P adj < 0.1 (two-sided unpaired t -tests with Benjamini–Hochberg multiple-testing correction with FWER = 0.1, n (PBS) = 12, n (NC26) = 8, n (C26) = 12). b , Brain areas with significantly different ( P adj < 0.1) c-Fos expression between NC26/C26 (top) or NC26/PBS (bottom) visualized using BrainRender. Red indicates significantly (* P adj < 0.1) more c-Fos+ cells in NC26 in both cases. Two-sided unpaired t -tests with Benjamini–Hochberg multiple-testing correction with FWER = 0.1, n (PBS) = 12, n (NC26) = 8, n (C26) = 12. Scale bars, 1 mm. c , Flattened-cortex visualizations of normalized c-Fos+ cell density for PBS control mice ( n = 12), NC26 ( n = 8) and C26 tumor-bearing mice ( n = 12). Scale bars, 1 mm in flattened-cortex projection space (flattened from CCF3 atlas space). d , c-Fos+ cell density in cortical subregions that were statistically significantly (* P adj < 0.1) different after multiple-testing correction. Two-sided unpaired t -tests with Benjamini–Hochberg multiple-testing correction with FWER = 0.1, n (PBS) = 12, n (NC26) = 8, n (C26) = 12.

We found seven areas that were significantly altered between NC26 and cachectic C26 tumor-bearing mice, whereas we did not observe significant changes ( P adj > 0.1, two-sided unpaired t -tests with Benjamini–Hochberg multiple-testing correction with FWER = 0.1) in c-Fos expression when comparing PBS and cachectic C26 tumor-bearing mice after correcting for multiple testing (Fig. 5a ). When comparing NC26 to C26 tumor-bearing mice, we found that NC26 mice had more c-Fos+ cells overall in the cortical plate, with the differences clustering in the dorsal and agranular lateral retrosplenial cortex as well as a subset of entorhinal cortex (Fig. 5a,c ). Evaluation of c-Fos+ density heat maps offers additional details (Fig. 5d ). In layer 2/3 of the retrosplenial cortex, activity clusters in the anterior third in PBS and C26 mice, whereas in NC26 mice it is both stronger and extends further posteriorly. The retrosplenial cortex is thought to be a site of multisensory integration, spatial integration and environment mapping 30 and is crucially involved in foraging behavior 31 .
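The region-wise comparisons above rely on Benjamini–Hochberg adjusted P values with a threshold of 0.1. A minimal sketch of that adjustment is given below, assuming a plain list of raw P values; the function name and the example P values are hypothetical and are not data from the study. Note that the Benjamini–Hochberg procedure is usually described as controlling the false discovery rate, whereas the text labels the 0.1 threshold as FWER.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P values for four brain regions.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.1 for p in adj_p]
print([round(p, 4) for p in adj_p])  # [0.04, 0.0533, 0.0533, 0.2]
print(significant)                   # [True, True, True, False]
```

In this toy example three of the four regions pass the 0.1 threshold after adjustment.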

Overall, our findings showed that brain activity in weight-stable NC26 cancer-bearing mice is markedly different from that in both cachectic C26 cancer-bearing mice and PBS controls. Specifically, we find a consistent hyperactivation phenotype in NC26 brains that is detectable at the whole-cortex level, but most pronounced in areas relevant to somatosensation at the snout and motor planning, as well as spatial navigation (all of which would be consistent with a foraging-related brain activation pattern). Thus, with DELiVR we were able to identify a neuronal activity pattern specific to the NC26 cancer model.

Here, we present DELiVR, an end-to-end VR-enabled deep-learning-based quantification pipeline for whole-brain cell mapping in cleared mouse brains. We designed it to make deep learning accessible to most biologists via a Fiji front end, not requiring coding skills. We leveraged VR technology to generate reference annotations for training a deep-learning-based segmentation network. DELiVR improves segmentation accuracy compared to current cell detection methods and generates a registered segmentation output that can be examined in the original image and in the atlas spaces. In addition, our Fiji training feature enables users to adapt DELiVR to their specific datasets, increasing its versatility and usability.

Traditional, non-machine-learning solutions for large-scale analysis of c-Fos cell detection, such as ClearMap 7 , 18 , rely on a sophisticated system of thresholding and filtering to detect small structures and classify them as cells. While such approaches have generated valuable information 6 , their performance is limited for data with variable signal-to-noise ratios, as is the case when imaging large volumes such as the entire mouse brain. Though parameters can be adjusted, it is difficult to find a setting that accounts for all cells. Hence, the thresholds tend to be set conservatively, meaning that subtle differences may be lost during threshold-based analysis. A trained deep-learning model learns these local variances, thus providing more accurate cell number estimates than threshold-based methods, as exemplified by DELiVR’s high instance F1 score. Previous approaches for segmenting cells in mouse brains ranged from deep learning 32 , 33 , 34 to random forest algorithms 19 , 35 and threshold-based solutions 7 , 18 , 36 ; however, only a subset of studies published their analysis pipeline and model weights in a working package that can be applied to other datasets. We found that DELiVR’s 3D BasicUNet consistently outperformed all other approaches with available code. In addition to providing highly accurate AI-based cell detection, DELiVR provides a unique and accessible open-source tool, functioning seamlessly within Fiji. It encompasses all steps of brain activity mapping, including cell detection, atlas alignment and visualization, in an easily accessible environment without the need for writing additional computer code.

Our experiments showed that VR is a superior means of annotation and data exploration for volumetric data analysis. Non-VR methods show orthogonal slices, which allows an annotator to outline the shape of individual cells in 2D; however, it obfuscates necessary volumetric information, making annotation challenging and time consuming; an annotator never sees the whole cell, only a cross-section and must scroll through slices to ensure that it is in fact a cell and not background noise. In contrast, VR allows the annotator to capture 3D structures in their entirety, enabling the fast generation of more reliable annotated data.

In future work, it will be interesting to explore the possibility of performing active learning in a VR environment. Active learning is a combined machine-learning training and annotation approach, where a model selectively chooses the most informative or uncertain data points for manual annotation, allowing for efficient model improvement with fewer labeled data points 37 . This approach is currently limited by the possibilities of the VR annotation software application programming interfaces. Using an ensemble of networks or a test time augmentation uncertainty map as well as methods such as Monte-Carlo-based sampling using dropout layers 38 to highlight areas that are ambiguous to the network, the annotator can be guided to even more efficient time use in VR annotation. The annotators’ choices can then be fed into a fine-tuning step to improve the model while annotating.
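To make the idea of uncertainty-guided annotation concrete, the following sketch ranks image patches by predictive uncertainty estimated from several stochastic forward passes, as in Monte-Carlo dropout. Everything here is an illustrative assumption rather than part of DELiVR: the function names, the use of mean binary entropy as the uncertainty score, and the toy probabilities.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a foreground probability p (0 for p in {0, 1})."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def patch_uncertainty(passes):
    """Mean per-voxel entropy of the averaged prediction.

    passes is a list of T stochastic forward passes, each a list with one
    foreground probability per voxel of the patch.
    """
    n_voxels = len(passes[0])
    total = 0.0
    for v in range(n_voxels):
        mean_p = sum(run[v] for run in passes) / len(passes)
        total += binary_entropy(mean_p)
    return total / n_voxels

def select_for_annotation(patches, k):
    """Return the names of the k most uncertain patches."""
    scores = {name: patch_uncertainty(p) for name, p in patches.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical predictions: two passes over two-voxel patches.
patches = {
    "confident": [[0.99, 0.01], [0.98, 0.02]],
    "ambiguous": [[0.40, 0.60], [0.60, 0.50]],
}
chosen = select_for_annotation(patches, k=1)
print(chosen)
```

Patches selected this way could be presented to the annotator first, and the resulting annotations fed into a fine-tuning step.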

We used DELiVR to profile the brain activation patterns of cancer-bearing mice that were either weight-stable or displayed CAC. A mix of reduced food intake, elevated catabolism, increased energy expenditure and inflammation drives weight loss in cancer 26 . The brain was shown to contribute to anorexia in CAC, as it responds to inflammatory cytokines that modulate the activity of neuronal populations that regulate appetite 39 . In addition, activation of neurons in the parabrachial nucleus was shown to suppress appetite in mouse models of CAC 40 . The reduction in brain weight among cachectic C26 tumor-bearing mice aligns with prior reports of decreased brain weight in mice with cachexia-inducing pancreatic tumors 41 . It is currently unclear whether this reduction is due to cell death or white matter loss.

Notably, we found a substantial increase in c-Fos expression in the brains of weight-stable NC26 tumor-bearing mice, especially in motor and sensory areas, and higher-order regions such as the retrosplenial cortex. Those regions are linked to sensorimotor control, motor sequencing and foraging 30 , 42 , 43 . The abundance of sensory-related regions suggests cancer-specific impairment in GABAergic inhibition 44 , driving a hyperactivation phenotype via disinhibition. Whether and how these increases in neuronal activation in weight-stable cancer-bearing mice affect body weight maintenance will be of high interest to explore in future studies.

In conclusion, we present DELiVR: an integrated, easy-to-use pipeline to label, scan and analyze neuronal activity markers across the entire mouse brain and show how VR increases the speed and accuracy of generating reference annotations. Using DELiVR, we find differences in c-Fos expression between cachectic and non-cachectic cancer mouse brains, pointing us to a previously unknown neurophysiological phenotype in cancer-related weight control.

Whole-brain immunolabeling and clearing

Immunostaining for c-Fos was performed using a modified version of SHANEL 10 . All incubation steps were carried out under moderate shaking (300 rpm). For the pretreatment, samples were dehydrated with an ethanol/water series (50%, 70% and 100% ethanol) at room temperature for 3 h per step. Next, samples were incubated in dichloromethane (DCM)/methanol (2:1 v/v) at room temperature for 1 day. Brains were rehydrated with an ethanol/water series (100%, 70% and 50% ethanol and diH₂O) at room temperature for 3 h per step. Samples were incubated in 0.5 M acetic acid at room temperature for 5 h followed by washing with diH₂O. Next, brains were incubated in 4 M guanidine HCl, 0.05 M sodium acetate, 2% v/v Triton X-100, pH 6.0, at room temperature for 5 h followed by washing with diH₂O. Brains were incubated in a mix of 10% CHAPS and 25% N-methyldiethanolamine at 37 °C for 12 h before washing with diH₂O. Blocking was performed by incubating the brains in 0.2% Triton X-100, 10% dimethylsulfoxide and 10% goat serum in PBS shaking at 37 °C for 2 days. Samples were incubated with c-Fos primary antibody (Cell Signaling Technology, 2250, 1:1,000 dilution) in primary antibody buffer (0.2% Tween-20, 5% dimethylsulfoxide, 3% goat serum and 100 µl heparin per 100 ml PBS) shaking at 37 °C for 7 days. The antibody solution was filtered (22-µm pore size) before use. Samples were washed in washing solution (0.2% Tween-20 and 100 µl heparin in 100 ml PBS) shaking at 37 °C for 1 day, during which the washing solution was refreshed five times. Brains were incubated with the secondary antibody (Alexa Fluor 647-conjugated goat anti-rabbit IgG (H + L), Invitrogen, A-21245, 1:500 dilution) in secondary antibody buffer (0.2% Tween-20, 3% goat serum and 100 µl heparin per 100 ml PBS) shaking at 37 °C for 7 days, followed by incubation in washing solution shaking at 37 °C for 1 day, during which the washing solution was refreshed five times.
Brains were dehydrated using 3DISCO 2 with a THF/H₂O series (50%, 70%, 90% and 100% THF) for 12 h per step followed by an incubation in DCM for 1 h. Tissues were incubated in benzyl alcohol/benzyl benzoate (1:2 v/v) until tissue transparency was reached (>4 h).

For microglia labeling, brains of CX3CR1 GFP/+ mice were pretreated via the modified SHANEL protocol as described above and incubated with Atto647N-conjugated anti-GFP nanobooster (Chromotek, gba647n-100, 1:1,000 dilution) with 5% 2-hydroxypropyl-β-cyclodextrin, 0.2% Tween-20 and 6% goat serum in PBS for 5 days at 37 °C. Brains were washed as described above in washing solution shaking at 37 °C for 1 day, during which the washing solution was refreshed five times. Brains were dehydrated with an ethanol/dH₂O series (50%, 70%, 90% and 100% ethanol) at room temperature for 2 h per step and incubated in 100% ethanol overnight. Subsequently, brains were incubated in DCM for 1 h before incubation in benzyl alcohol/benzyl benzoate until tissue transparency was reached.

Light-sheet imaging

Light-sheet imaging for c-Fos labeled brains was conducted through a ×4 objective lens (Olympus XLFLUOR 340) equipped with an immersion-corrected dipping cap mounted on an UltraMicroscope II (LaVision BioTec) coupled to a white light laser module (NKT SuperK Extreme EXW-12). The antibody signal was visualized using a 640/40 nm excitation and 690/50 nm emission filter. Tiling scans (3 × 3 tiles) were acquired with a 15–20% overlap, 60% sheet width and 0.027 NA. The images were taken in 16-bit depth and at a nominal resolution of 1.625 μm per voxel on the xy axes. In the z dimension, we took images in 6-μm steps using left- and right-sided illumination. Whole-brain scans for microglia-labeled CX3CR1 GFP/+ brains were generated with the LaVision BioTec Ultramicroscope Blaze coupled with a LaVision BioTec MI PLAN ×12 objective (0.53 NA, WD = 10 mm, nominal pixel size of 0.54 µm in xy ). Stitching of tile scans was carried out using the ‘Stitch Sequence of Grids of Images’ function of Fiji’s stitching plugin 45 and custom Python scripts.

ClearMap 7 and the CellMap portion 18 of ClearMap2 were used with adapted settings for thresholds and cell sizes that fitted the higher resolution and different signal-to-noise ratios in our dataset. Segmentation masks were saved as tiff stacks by toggling the ‘save’ option in the last segmentation step. ClearMap was ported to Python (v.3.5) before use, but functioned identically 46 . We used only the cell segmentation portions; no pre-processing (for example, ClearMap2’s flat-field correction) or post-processing (such as atlas alignment) was performed. Both pipelines were run for an entire brain and the output was subsequently subdivided into test patches that we used for the comparisons with DELiVR. For ‘optimized ClearMap’ 3 , we performed the following pre-processing steps on our image stack: (1) Background equalization to homogenize the intensity distribution and appearance of the c-Fos + cells over different regions of the brain, using the pseudo-flat-field correction function from the Bio-Voxxel toolbox ( https://doi.org/10.5281/zenodo.5986129 ). (2) Convoluted background removal, to remove all particles bigger than relevant cells. This was performed with the median option in the Bio-Voxxel toolbox. (3) A 2D median filter to remove remaining noise after background removal. (4) Unsharp mask to amplify the high-frequency components of the signal and increase the overall accuracy of the cell detection algorithm of ClearMap. (5) A z -wise removal of artifacts by manually selecting ROIs in Fiji. After pre-processing, ClearMap 7 was applied by following the original publication and considering the threshold levels that we obtained from the pre-processing steps.

Ventricle masking

We wrote an automated pre-processing script that downsamples the image stack to an isotropic 25 × 25 × 25 µm per voxel and then applies a custom-trained random forest to identify ventricles. Specifically, we integrated Ilastik 19 (v.1.4.0b8) with a 3D pixel classifier, which we trained on several downsampled brain image stacks to differentiate between ventricles and brain parenchyma. The pre-processing script then generates a 3D mask stack that our script upsamples to the original image stack dimensions, using bicubic interpolation to avoid aliasing artifacts at ventricle edges. It then masks each original z -plane image with the respective mask, pads it and returns a 16-bit image stack (saved as one big .npy file that can be read via np.memmap).
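The resampling arithmetic behind this step can be sketched as follows, assuming the acquisition voxel size stated in the imaging section (1.625 µm in xy, 6 µm in z) and the 25-µm isotropic target. The helper names and the example stack size are hypothetical, and the sketch does not reproduce the actual DELiVR script or its interpolation.

```python
def zoom_factors(voxel_size_um, target_um=25.0):
    """Per-axis scale factors (z, y, x) that resample to isotropic voxels."""
    return tuple(size / target_um for size in voxel_size_um)


def resampled_shape(shape, factors):
    """Shape after scaling each axis, rounded to whole voxels (at least 1)."""
    return tuple(max(1, round(n * f)) for n, f in zip(shape, factors))


# Acquisition voxel size from the imaging section: 6 µm in z, 1.625 µm in xy.
factors = zoom_factors((6.0, 1.625, 1.625))

# Hypothetical stack size (z, y, x); real stacks will differ.
original_shape = (1000, 4000, 3000)
small_shape = resampled_shape(original_shape, factors)

# Upsampling the mask back to the original grid uses the inverse factors.
inverse_factors = tuple(1.0 / f for f in factors)
```

Because the mask is predicted on a grid that is roughly 4 times coarser in z and 15 times coarser in xy, it has to be scaled back by the inverse factors before it can be applied plane by plane.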

VR annotation for c-Fos + cells was carried out using Arivis VisionVR (v.3.4.0, Carl Zeiss Microscopy Software Center Rostock) or syGlass (v.1.7.2, ref. 12 ). For this purpose, the annotator wore a VR headset (Oculus Rift S) and carried out annotations in VR using hand controllers (Oculus Touch). Slice-by-slice annotation was carried out using ITK-SNAP (v.3.8, ref. 11 ). To compare VR and 2D slice-based annotation, a 100³-voxel volume of c-Fos labeled brain was annotated by the participants and the time until the annotation task was finished was recorded. For training and testing our deep-learning network, we annotated a total of 48 × 100³ voxel patches in VR. All of our training and test patches were furthermore vetted by an expert biologist in ITK-SNAP to ensure that only cells were annotated. We evaluated the annotation quality using the Dice formula as described below. For more details about the annotation process in VR, please see our ‘DELiVR handbook’ provided as a Supplementary Note . Microglia cell bodies were annotated in VR similarly to c-Fos + cells using Arivis VisionVR. Only the somata were annotated, while the microglia processes were excluded.

Deep learning

To automatically segment the cells in all brains, we trained a 3D BasicUNet 47 for DELiVR from the MONAI library 48 . The annotated dataset of 48 × 100³ patches was split into nine patches for testing and 39 patches for training stratified by signal after manual ventricle masking. As an activation function, we chose Mish 49 and as optimizer Ranger21 (ref. 50 ). As a loss function, we used binary cross-entropy loss 17 . For the training of 500 epochs, we set the initial learning rate to 1 × 10⁻³ and the batch size to four. The network was then trained on a single GPU (NVIDIA RTX8000). Instead of conducting model selection, we selected the last checkpoint after 500 epochs of training. To compare the DELiVR 3D BasicUNet with other segmentation models, we trained UNETR 15 , SegResNet 16 and MONAI DynUNET 17 with similar specifications.

The microglia 3D BasicUNet model was trained in a similar fashion for 500 epochs using 161 patches containing 3,798 cells. These were split into 129 patches for training and 32 patches for testing. Training was performed on an NVIDIA A100 GPU.

Evaluation of the segmentation model

Evaluation of the deep-learning model was conducted in a twofold manner. First, we evaluated the volumetric segmentation quality by assessing, for each voxel, whether it was correctly classified as foreground or background using pymia 51 . This volumetric assessment yielded true positives (TPs), false positives (FPs), false negatives (FNs) and true negatives by comparing every prediction voxel with the corresponding reference annotation voxel. Second, we conducted an instance-wise assessment of the segmentation quality, in which we assessed detection rates at the single-cell (instance) level 52 . To fairly evaluate every cell irrespective of the patch, we aggregated the counts across all patches and computed the instance metrics globally 53 .

Volumetric and instance scores were calculated according to the following equations:
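The equations themselves are not reproduced in this text. As a sketch under that caveat, the standard confusion-matrix definitions of Dice (F1), precision and sensitivity are shown below, together with the pooling of counts across patches described above. Whether these match the authors' formulas term for term is an assumption, and the per-patch counts are invented for illustration.

```python
def dice(tp, fp, fn):
    """Dice / F1 score: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def precision(tp, fp):
    """Fraction of predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp, fn):
    """Fraction of reference objects that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def aggregate(counts):
    """Sum (TP, FP, FN) tuples over patches before computing any score."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return tp, fp, fn


# Hypothetical instance counts (TP, FP, FN) for three test patches.
per_patch = [(40, 5, 10), (0, 1, 0), (60, 4, 0)]
tp, fp, fn = aggregate(per_patch)
scores = {
    "dice": dice(tp, fp, fn),
    "precision": precision(tp, fp),
    "sensitivity": sensitivity(tp, fn),
}
```

Pooling the counts first means that a nearly empty patch, such as the second one here, contributes one false positive to the total instead of a per-patch score of zero that would carry the same weight as a densely populated patch.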

Comparison with ClearMap, ClearMap2, ‘Optimized ClearMap’ and Ilastik was performed on a test brain to generate segmentations from which we cropped 100³-voxel patches to avoid artifacts that occur when the methods are applied at the patch level. These patches were then compared to our reference annotation using the same metrics as described above.

Atlas registration and statistical analysis

For atlas registration, we used mBrainAligner 14 , which worked well with our datasets (Supplementary Fig. 1 ). We manually saved the downsampled isotropic 25 × 25 × 25 µm per voxel stacks as .v3draw using Vaa3D 54 . We then wrote an automated script that aligned the image stacks to mBrainAligner’s 50 × 50 × 50 µm per voxel version of the Allen Brain Atlas CCF3 reference atlas, using the LSFM example settings with minor adaptations. Finally, we used mBrainAligner’s swc transformation tool to map the center-point coordinates of our c-Fos + cells into atlas space.

Furthermore, we wrote a custom cell-to-atlas script (reusing parser code from VeSSAP 55 and the Allen Brain Atlas CCF3 atlas file as provided by the Scalable Brain Atlas 56 ) that filters the cells by size, with a user-defined upper and lower limit and returns two tables: a table with each cell as a row, including the region and Allen Brain Atlas color code, etc. and a region table with one region per row, in which the number of c-Fos + cells per region is summarized. For all datasets, the post-processing script generates overview tables that contain cell counts for all regions. We used the latter for uncorrected Student’s t- tests. Finally, we implemented a level-aware multiple-testing script that compares groups at the Allen Brain Atlas’s 11 structure levels. We excluded the fiber tracts from our statistical comparisons.
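The size filter and the two output tables can be illustrated with a minimal sketch. The field names, size limits and example cells are hypothetical, and the code stands in for the idea only, not for the DELiVR script itself.

```python
from collections import Counter


def filter_cells(cells, min_voxels, max_voxels):
    """Keep cells whose size lies within the user-defined limits (inclusive)."""
    return [c for c in cells if min_voxels <= c["size"] <= max_voxels]


def region_table(cells):
    """Count cells per region; returns {region: n_cells}."""
    return dict(Counter(c["region"] for c in cells))


# Hypothetical cells: size in voxels, region as an atlas acronym.
cells = [
    {"id": 1, "size": 12, "region": "RSP"},
    {"id": 2, "size": 2, "region": "RSP"},    # below the lower limit
    {"id": 3, "size": 30, "region": "MOp"},
    {"id": 4, "size": 900, "region": "MOp"},  # above the upper limit
    {"id": 5, "size": 18, "region": "RSP"},
]
kept = filter_cells(cells, min_voxels=5, max_voxels=100)  # cell table
counts = region_table(kept)                               # region table
```

Here `kept` corresponds to the per-cell table and `counts` to the per-region summary that feeds the statistical comparison.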

Visualization

For visualizing the cells and regions in atlas space, we used BrainRender 20 (v.2) with a modified density plot function 46 . To visualize the segmented cells in the original image space, we combined the area-wise color code from the Allen Brain Atlas with the 3D segment mask output by the connected component analysis. The result is a cell mask file in which each cell is color coded according to the brain area that it belongs to. This makes it easy to overlay the mask on the original image data (for example, in Fiji) and allows for direct visual inspection of the segmentation results. Finally, we used the Allen Institute for Brain Science’s cortical flat-map code ( https://github.com/int-brain-lab/atlas ) with adaptations 46 to include our heat maps.

DELiVR Docker and Fiji plugin

We packaged the DELiVR pipeline as provided in GitHub ( https://github.com/erturklab/delivr_cfos ) into a Docker container (base, nvidia/cuda:11.7.2-runtime-ubuntu22.04) including mBrainAligner 14 ( https://github.com/Vaa3D/vaa3d_tools/tree/master/hackathon/mBrainAligner ), Ilastik ( https://www.ilastik.org/download.html , v.1.4.0b8) and TeraStitcher portable 57 ( https://github.com/abria/TeraStitcher/wiki/Binary-packages#terastitcher-portable-no-gui-only-command-line-tools , v.1.11.10). The code included Python (v.3.8), PyTorch (v.1.11), PyTorch Lightning (v.2.0.5), Nibabel (v.5.1.0), MONAI (v.1.2.0), SciPy (v.1.8.1), NumPy (v.1.24.4), Pandas (v.1.4.3), imglib2 ( https://github.com/imglib/imglib2 ) and cc3d ( https://github.com/seung-lab/connected-components-3d ). For details, please see the Docker file on GitHub ( https://github.com/erturklab/delivr_cfos/blob/main/Dockerfile ).

We wrote the Fiji 58 (v.1.52p) plugin in Java (v.1.8, using Maven (v.3.9.5) and Jackson, https://github.com/FasterXML/jackson ) as a front end. This provides a graphical user interface that compiles a config.json with path names and analysis parameters. Subsequently, the plugin calls the Docker container via a shell command and displays the progress of the pipeline. For a more detailed description, please see our ‘DELiVR handbook’ provided as a Supplementary Note .
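The hand-off from front end to container can be sketched in a few lines. The plugin itself is written in Java; the Python below only illustrates the pattern of writing a config.json and assembling a Docker call. All key names, mount points and the image name are hypothetical and do not reflect the actual DELiVR configuration schema.

```python
import json
import os
import tempfile


def write_config(path, raw_dir, out_dir, min_cell_voxels, max_cell_voxels):
    """Write analysis settings to a JSON file; all key names are hypothetical."""
    config = {
        "raw_location": raw_dir,
        "output_location": out_dir,
        "min_cell_voxels": min_cell_voxels,
        "max_cell_voxels": max_cell_voxels,
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return config


def docker_command(image, data_dir, config_path):
    """Build (but do not run) a docker invocation as an argument list."""
    return [
        "docker", "run", "--rm", "--gpus", "all",
        "-v", f"{data_dir}:/data",                # hypothetical mount point
        "-v", f"{config_path}:/config.json",      # hypothetical mount point
        image,
    ]


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    written = write_config(config_path, "/data/raw", "/data/out", 5, 100)
    with open(config_path, encoding="utf-8") as handle:
        reloaded = json.load(handle)
    # "delivr:latest" is a placeholder image name.
    command = docker_command("delivr:latest", tmp, config_path)
```

Passing the argument list to `subprocess.run(command, check=True)` would start the container; keeping the command as a list avoids shell quoting problems with path names that contain spaces.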

Docker for training and Fiji plugin

We packaged the training code ( https://github.com/erturklab/delivr_train ) as a separate Docker container, which is also accessible via the Fiji plugin. The training plugin accepts annotated patches and trains a model specifically for this dataset. This model can then be imported into the inference pipeline for dataset-specific inference for any cell type. The Fiji training plugin compiles a config_train.json and arranges the file layout for the training Docker. It displays the training progress and shows the final test scores at the end.

Cell culture

C26 and NC26 colon cancer cells were cultured in high-glucose DMEM with pyruvate (Life Technologies, 41966052), supplemented with 10% fetal bovine serum (Sigma-Aldrich, F7524) and 1% penicillin-streptomycin (Thermo Fisher, 15140122) as described previously 28 , 59 . Before using the cells for transplantation, cells had a confluence of 80%. Cells were trypsinized, counted and required cell numbers were suspended in Dulbecco’s PBS (Thermo Fisher, 14190250).

Animal experimentation

Experiments were carried out with male BALB/c mice aged 10–12 weeks. They were purchased from Charles River Laboratories, maintained on a 12-h light–dark cycle and fed a regular unrestricted chow diet. The set points in the animal room were 20–24 °C temperature and 45–65% humidity. The mice were injected with 1 × 10⁶ C26 or 1.5 × 10⁶ NC26 colon cancer cells 28 , 59 in 50 µl PBS subcutaneously into the right flank. Control mice were injected with 50 µl PBS. From 5 days after cell implantation, mice were monitored daily for tumor growth and body weight. C26 tumor-bearing mice were considered cachectic when they had lost 10–15% of body weight. Mice were killed following deep anesthesia with a mix of ketamine/xylazine, followed by intracardiac perfusion with heparinized PBS (10 U ml⁻¹ heparin) and then with 4% paraformaldehyde (PFA). Tissues and organs were dissected, weighed and post-fixed at 4 °C overnight. Animal experimentation was performed in accordance with European Union directives and the German Animal Welfare Act (Tierschutzgesetz) and approved by the state ethics committee and the Government of Upper Bavaria (ROB-55.2-2532.Vet_02-18-93).

The 6–8-week-old CX3CR1 GFP/+ (B6.129P-Cx3cr1tm1Litt/J) mice were purchased from The Jackson Laboratory (strain code 005582). They were deeply anesthetized using a combination of midazolam, medetomidine and fentanyl, and intracardially perfused with 15 ml 0.01 M PBS solution (10 U ml⁻¹ heparin) and 15 ml 4% PFA solution. The brain was dissected, post-fixed in 4% PFA for 6 h and then processed for staining and clearing following the SHANEL protocol. CX3CR1 GFP/+ mice were killed for organ withdrawal (Tötung zu Wissenschaftlichen Zwecken/Organentnahme) in accordance with the German law for animal experiments (Tierschutzgesetz), paragraph 4, section 3.

Statistical analysis

Results from biological replicates were expressed as mean ± s.e.m. Statistical analysis was performed using GraphPad Prism (v.9). Normality was tested using Shapiro–Wilk normality tests. To compare two conditions, unpaired Student’s t -tests or Mann–Whitney U -tests were performed. A one-way ANOVA with Sidak’s post hoc test or Kruskal–Wallis tests with Dunn’s multiple comparison test were used to compare three groups. For the c-Fos + density comparison between areas, we used two-sided t -tests followed by Benjamini–Hochberg multiple-testing correction with a false discovery rate (FDR) of 0.1, as implemented in the multipletests function of the statsmodels Python package ( https://www.statsmodels.org/dev/generated/statsmodels.stats.multitest.multipletests.html ).
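For readers unfamiliar with the Benjamini–Hochberg correction, the following dependency-free sketch implements the step-up procedure at a false discovery rate of 0.1. The analysis described above relied on statsmodels instead; the region names and p-values here are invented for illustration.

```python
def benjamini_hochberg(pvalues, fdr=0.1):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (True = reject the null hypothesis),
    in the same order as the input p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= k / m * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[idx] = True
    return reject


# Hypothetical per-region p-values from two-sided t-tests.
region_pvalues = {"RSP": 0.001, "MOp": 0.02, "SSp": 0.04, "VISp": 0.30, "ACA": 0.80}
decisions = benjamini_hochberg(list(region_pvalues.values()), fdr=0.1)
significant = [name for name, keep in zip(region_pvalues, decisions) if keep]
```

With statsmodels installed, the same decision is returned by `multipletests(pvals, alpha=0.1, method="fdr_bh")`, which also reports adjusted p-values.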

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All data that support the findings of this study are available from the corresponding author. We provide the numerical source files of all figures in the supplementary material. Our training and test data as well as the trained network are available on GitHub at https://github.com/erturklab/delivr_cfos (ref. 60 ) and https://github.com/erturklab/delivr_train (ref. 61 ). A subset of representative whole-brain scans is available at the EBI Bioimage Repository (accession code S-BIAD1019 ). Because of limitations on sharing large imaging data online, additional whole-brain scans ( n = 27 whole brains, ~2 TB data) will be made available upon reasonable request. The Allen Brain Atlas (CCF3) was downloaded from the Scalable Brain Atlas repository at https://scalablebrainatlas.incf.org/mouse/ABA_v3 . Source data are provided with this paper.

Code availability

All code to run our pipeline end-to-end is available in GitHub at https://github.com/erturklab/delivr_cfos (ref. 60 ). Training code is available in GitHub at https://github.com/erturklab/delivr_train (ref. 61 ). Docker containers and the plugin can be obtained from https://discotechnologies.org/DELiVR . The code is released under the MIT license.

Ueda, H. R. et al. Tissue clearing and its applications in neuroscience. Nat. Rev. Neurosci. 21 , 61–79 (2020).

Erturk, A. et al. Three-dimensional imaging of solvent-cleared organs using 3DISCO. Nat. Protoc. 7 , 1983–1995 (2012).

Cai, R. et al. Panoptic imaging of transparent mice reveals whole-body neuronal projections and skull-meninges connections. Nat. Neurosci. 22 , 317–327 (2019).

Belle, M. et al. Tridimensional visualization and analysis of early human development. Cell 169 , 161–173 (2017).

Bhatia, H. S. et al. Spatial proteomics in three-dimensional intact specimens. Cell 185 , 5040–5058 (2022).

Molbay, M., Kolabas, Z. I., Todorov, M. I., Ohn, T. L. & Erturk, A. A guidebook for DISCO tissue clearing. Mol. Syst. Biol. 17 , e9807 (2021).

Renier, N. et al. Mapping of brain activity by automated volume analysis of immediate early genes. Cell 165 , 1789–1802 (2016).

Nectow, A. R. et al. Identification of a brainstem circuit controlling feeding. Cell 170 , 429–442 (2017).

Topilko, T. et al. Edinger-Westphal peptidergic neurons enable maternal preparatory nesting. Neuron 110 , 1385–1399 (2022).

Zhao, S. et al. Cellular and molecular probing of intact human organs. Cell 180 , 796–812 (2020).

Yushkevich, P. A. & Gerig, G. ITK-SNAP: An intractive medical image segmentation tool to meet the need for expert-guided segmentation of complex medical images. IEEE Pulse 8 , 54–57 (2017).

Pidhorskyi, S., Morehead, M., Jones, Q., Spirou, G., & Doretto, G. syGlass: interactive exploration of multidimensional images using virtual reality head-mounted displays. Preprint at arXiv https://doi.org/10.48550/arXiv.1804.08197 (2018).

Silversmith, W. cc3d: Connected components on multilabel 3D & 2D images. Version 3.2.1. Zenodo https://doi.org/10.5281/zenodo.5719536 (2021).

Qu, L. et al. Cross-modal coherent registration of whole mouse brains. Nat. Methods 19 , 111–118 (2022).

Hatamizadeh, A. et al. UNETR: transformers for 3D medical image segmentation. in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 1748–1758 (2022).

Myronenko, A. 3D MRI brain tumor segmentation using autoencoder regularization. in Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 4th International Workshop, BrainLes 2018 , Part II, pp. 311–320 (Springer International Publishing, 2018).

Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18 , 203–211 (2021).

Kirst, C. et al. Mapping the fine-scale organization and plasticity of the brain vasculature. Cell 180 , 780–795 (2020).

Berg, S. et al. ilastik: interactive machine learning for (bio)image analysis. Nat. Methods 16 , 1226–1232 (2019).

Claudi, F. et al. Visualizing anatomically registered data with brainrender. eLife 10 , e65751 (2021).

Merkel, D. Docker: lightweight Linux containers for consistent development and deployment. Linux J. 2014 , 2 (2014).

Salter, M. W. & Stevens, B. Microglia emerge as central players in brain disease. Nat. Med. 23 , 1018–1027 (2017).

Kofler, F. et al. Approaching peak ground truth. in 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), 1–6 (IEEE, 2023).

Porporato, P. E. Understanding cachexia as a cancer metabolism syndrome. Oncogenesis 5 , e200 (2016).

Baracos, V. E., Martin, L., Korc, M., Guttridge, D. C. & Fearon, K. C. H. Cancer-associated cachexia. Nat. Rev. Dis. Prim. 4 , 17105 (2018).

Schmidt, S. F., Rohm, M., Herzig, S. & Berriel Diaz, M. Cancer cachexia: more than skeletal muscle wasting. Trends Cancer 4 , 849–860 (2018).

Argiles, J. M., Stemmler, B., Lopez-Soriano, F. J. & Busquets, S. Inter-tissue communication in cancer cachexia. Nat. Rev. Endocrinol. 15 , 9–20 (2018).

Morigny, P. et al. High levels of modified ceramides are a defining feature of murine and human cancer cachexia. J. Cachexia Sarcopenia Muscle 11 , 1459–1475 (2020).

Baker, A. et al. Specialized subpopulations of deep-layer pyramidal neurons in the neocortex: bridging cellular properties to functional consequences. J. Neurosci. 38 , 5441–5455 (2018).

Alexander, A. S. et al. Egocentric boundary vector tuning of the retrosplenial cortex. Sci. Adv. 6 , eaaz2322 (2020).

Carstensen, L. C., Alexander, A. S., Chapman, G. W., Lee, A. J. & Hasselmo, M. E. Neural responses in retrosplenial cortex associated with environmental alterations. iScience 24 , 103377 (2021).

Kim, Y. et al. Mapping social behavior-induced brain activation at cellular resolution in the mouse. Cell Rep. 10 , 292–305 (2015).

Jager, P. et al. Dual midbrain and forebrain origins of thalamic inhibitory interneurons. eLife https://doi.org/10.7554/eLife.59272 (2021).

Tyson, A. L. et al. A deep learning algorithm for 3D cell detection in whole mouse brain image datasets. PLoS Comput. Biol. 17 , e1009074 (2021).

Menegas, W. et al. Dopamine neurons projecting to the posterior striatum form an anatomically distinct subclass. eLife 4 , e10032 (2015).

Matsumoto, K. et al. Advanced CUBIC tissue clearing for whole-organ cell profiling. Nat. Protoc. 14 , 3506–3537 (2019).

Nath, V., Yang, D., Landman, B. A., Xu, D. & Roth, H. R. Diminishing uncertainty within the training pool: active learning for medical image segmentation. IEEE Trans. Med. Imaging 40 , 2534–2547 (2021).

Gal, Y. & Ghahramani, Z. Dropout as a Bayesian approximation: representing model uncertainty in deep learning. in International Conference on Machine Learning , 1050–1059. (PMLR, 2015).

Burfeind, K. G., Michaelis, K. A. & Marks, D. L. The central role of hypothalamic inflammation in the acute illness response and cachexia. Semin. Cell Dev. Biol. 54 , 42–52 (2016).

Campos, C. A. et al. Cancer-induced anorexia and malaise are mediated by CGRP neurons in the parabrachial nucleus. Nat. Neurosci. 20 , 934–942 (2017).

Winnard, P. T. Jr. et al. Brain metabolites in cholinergic and glutamatergic pathways are altered by pancreatic cancer cachexia. J. Cachexia Sarcopenia Muscle 11 , 1487–1500 (2020).

Rolls, E. T., Cheng, W. & Feng, J. The orbitofrontal cortex: reward, emotion and depression. Brain Commun. 2 , fcaa196 (2020).

Basu, R. et al. The orbitofrontal cortex maps future navigational goals. Nature 599 , 449–452 (2021).

Kann, O. The interneuron energy hypothesis: Implications for brain disease. Neurobiol. Dis. 90 , 75–85 (2016).

Preibisch, S., Saalfeld, S. & Tomancak, P. Globally optimal stitching of tiled 3D microscopic image acquisitions. Bioinformatics 25 , 1463–1465 (2009).

Negwer, M. et al. FriendlyClearMap: an optimized toolkit for mouse brain mapping and analysis. Gigascience 12 , giad035 (2022).

Falk, T. et al. U-Net: deep learning for cell counting, detection, and morphometry. Nat. Methods 16 , 67–70 (2019).

Cardoso, J. M. et al. MONAI: an open-source framework for deep learning in healthcare. Preprint at arXiv https://doi.org/10.48550/arXiv.2211.02701 (2022).

Misra, D. A self regularized non-monotonic neural activation function. Preprint at arXiv https://doi.org/10.48550/arXiv.1908.08681 (2019).

Wright, L. & Demeure, N. Ranger21: a synergistic deep learning optimizer. Preprint at arXiv https://doi.org/10.48550/arXiv.2106.13731 (2021).

Jungo, A., Scheidegger, O., Reyes, M. & Balsiger, F. pymia: a Python package for data handling and evaluation in deep learning-based medical image analysis. Comput. Methods Prog. Biomed. 198 , 105796 (2021).

Pan, C. et al. Deep learning reveals cancer metastasis and therapeutic antibody targeting in the entire body. Cell 179 , 1661–1676 (2019).

Kofler, F. et al. blob loss: instance imbalance aware loss functions for semantic segmentation. in Information Processing in Medical Imaging (Springer, 2023).

Peng, H., Ruan, Z., Long, F., Simpson, J. H. & Myers, E. W. V3D enables real-time 3D visualization and quantitative analysis of large-scale biological image data sets. Nat. Biotechnol. 28 , 348–353 (2010).

Todorov, M. I. et al. Machine learning analysis of whole mouse brain vasculature. Nat. Methods 17 , 442–449 (2020).

Bakker, R., Tiesinga, P. & Kotter, R. The scalable brain atlas: instant web-based access to public brain atlases and related content. Neuroinformatics 13 , 353–366 (2015).

Bria, A. & Iannello, G. TeraStitcher: a tool for fast automatic 3D-stitching of teravoxel-sized microscopy images. BMC Bioinform. 13 , 316 (2012).

Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9 , 676–682 (2012).

Morigny, P. et al. Association of circulating PLA2G7 levels with cancer cachexia and assessment of darapladib as a therapy. J. Cachexia Sarcopenia Muscle 12 , 1333–1351 (2021).

DELiVR pipeline. (GitHub, 2024); https://zenodo.org/doi/10.5281/zenodo.10908720

Training code. (GitHub, 2024); https://zenodo.org/doi/10.5281/zenodo.10909998

Acknowledgements

We thank AIME (Berlin) for providing the processing time on their servers. We thank I. Horvath, H. Mai and L. Kümmerle from the Institute for Tissue Engineering and Regenerative Medicine (iTERM, Helmholtz Munich) for annotation tasks. We thank M. Ali and I. Horvath (iTERM) for helping with initial Docker container setup and slurm scripting. We thank L. Harrison from the Institute for Diabetes and Cancer (Helmholtz Munich) for editing the graphical summary. Figures 1a and 3a were generated with BioRender.com. We thank R. Zechner and M. Schweiger from the Institute of Molecular Biosciences (University of Graz) for kindly providing NC26 cancer cell lines. We thank M. Elsner and F. Hellal (iTERM) for proofreading and editing this manuscript. This work was supported by the Vascular Dementia Research Foundation, Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy within the framework of the Munich Cluster for Systems Neurology (EXC 2145 SyNergy, grant no. ID 390857198) and DFG (grant nos. SFB 1052, project A9; TR 296 project 03) as well as the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung) within the NATON collaboration (grant no. 01KX2121) and the HIVacToGC collaboration. This work was also supported by the European Research Council Consolidator grant (CALVARIA, grant no. GA 865323 to A.E.) and Nomis Heart Atlas Project Grant (Nomis Foundation). This work was supported by the European Research Council under the European Union’s Horizon 2020 research and innovation program (949017 to M.R.) and a grant from the Else-Kröner-Fresenius-Stiftung (2020 EKSE.23 to S.H.), as well as the Edith-Haberland-Wagner Stiftung. B.M., B.W. and F.K. are supported through the SFB 824, subproject B12, DFG through TUM International Graduate School of Science and Engineering, GSC 81. B.M. acknowledges support by the Helmut Horten Foundation.

Open access funding provided by Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH).

Author information

These authors contributed equally: Doris Kaltenecker, Rami Al-Maskari, Moritz Negwer.

Authors and Affiliations

Institute for Diabetes and Cancer (IDC), Helmholtz Munich, Neuherberg, Germany

Doris Kaltenecker, Julia Geppert, Pauline Morigny, Maria Rohm, Stephan Herzig & Mauricio Berriel Diaz

Joint Heidelberg-IDC Translational Diabetes Program, Heidelberg University Hospital, Heidelberg, Germany

German Center for Diabetes Research (DZD), Neuherberg, Germany

Institute for Stroke and Dementia Research, Klinikum der Universität München, Ludwig-Maximilians-Universität LMU, Munich, Germany

Doris Kaltenecker, Rami Al-Maskari, Shan Zhao, Mihail Todorov, Zhouyi Rong & Ali Ertürk

Institute for Tissue Engineering and Regenerative Medicine, Helmholtz Munich, Neuherberg, Germany

Rami Al-Maskari, Moritz Negwer, Luciano Hoeher, Shan Zhao, Mihail Todorov, Zhouyi Rong, Johannes Christian Paetzold & Ali Ertürk

Department of Computer Science, TUM Computation, Information and Technology, Technical University of Munich (TUM), Munich, Germany

Rami Al-Maskari, Florian Kofler & Bjoern H. Menze

Center for Translational Cancer Research of the TUM (TranslaTUM), Munich, Germany

Rami Al-Maskari, Florian Kofler & Johannes Christian Paetzold

Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany

Florian Kofler & Benedikt Wiestler

Helmholtz AI, Helmholtz Munich, Neuherberg, Germany

Florian Kofler & Marie Piraud

Department of Computing, Imperial College London, London, United Kingdom

Johannes Christian Paetzold & Daniel Rueckert

Department for Quantitative Biomedicine, University of Zurich, Zurich, Switzerland

Bjoern H. Menze

Chair Molecular Metabolic Control, TU Munich, Munich, Germany

Stephan Herzig

School of Medicine, Koç University, İstanbul, Turkey

Munich Cluster for Systems Neurology (SyNergy), Munich, Germany

Deep Piction, Munich, Germany

Contributions

D.K., R.A., S.H., M.B.D. and A.E. conceptualized the study. A.E. supervised and managed the entire project. D.K. performed in vivo work, antibody labeling, imaging and data processing. R.A. performed VR and deep-learning analysis. M.N. performed atlas registration, data analysis and interpreted the results. L.H. performed VR annotations. F.K. trained the deep-learning c-Fos models and wrote the initial inference pipeline. R.A., M.N. and F.K. improved the pipeline and ran data inference. M.N. and R.A. added visualization. R.A. and F.K. compared model performance metrics. R.A. wrote the Fiji plugin. M.N., R.A. and D.K. tested the Fiji plugin. M.N. and F.K. packaged the pipeline in a Docker container. D.K., R.A., M.N. and L.H. wrote the DELiVR handbook. S.Z. and M.T. supported imaging and antibody labeling. J.G. and P.M. supported in vivo experiments. Z.R. provided microglia data. R.A. trained the deep-learning microglia models. J.C.P. supported data interpretation. M.R. provided funding and useful discussions. A.E., M.R.B., S.H., M.R., B.W., D.R., B.M. and M.P. provided funding. S.H. supported supervision of the study. A.E. and M.B.D. supervised the study. D.K., R.A. and M.N. drafted the manuscript. A.E., D.K., M.N. and R.A. revised the manuscript. F.K. provided critical feedback on the revised manuscript.

Corresponding authors

Correspondence to Mauricio Berriel Diaz or Ali Ertürk.

Ethics declarations

Competing interests

A.E. is a co-founder of Deep Piction. The remaining authors declare no competing interests related to this work.

Peer review

Peer review information

Nature Methods thanks Hirofumi Kobayashi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Nina Vogt, in collaboration with the Nature Methods team.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 VR segmentation in syGlass and 2D-slice-based segmentation using ITK-SNAP.

a, Volume of raw data (c-Fos labeled brain) that was generated by light-sheet microscopy and loaded into syGlass. Volume size represents 200³ voxels, rendered isotropically. b-d, Using VR, individual cells were segmented in syGlass by using three-dimensional Euclidean shapes as ROI and adjusting a threshold until the segmentation was acceptable. Scale bar indicates 5 µm. e, ITK-SNAP view of a single plane of the image stack. Cells were labeled in 2D, slice by slice. Segmentations are color coded by cell ID.

Extended Data Fig. 2 DELiVR pre-processing automatically removes artefacts.

a-c, Horizontal view of an original image slice (a), the proposed mask (b) and the resulting masked image slice (c). Scale bar = 1 mm. d, Architecture of the c-Fos deep-learning network: a MONAI 3D BasicUNet. e, Quantitative comparison (instance precision and sensitivity) of segmentation performance between deep-learning architectures and DELiVR’s 3D BasicUNet. f, Segmentation performance of non-deep-learning methods and DELiVR (scores for DELiVR are the same as used in e).

Extended Data Fig. 3 Whole-brain segmentation output generated with DELiVR.

a, 3D visualization of a whole raw light-sheet image stack. b, 3D view of whole-brain segmentation output of detected cells by DELiVR. The area-wise color code from the Allen Brain Atlas was combined with the 3D segmentation, so that each cell is color coded according to the brain area it was detected in. The segmentation of cells is shown in the original image space. Scale bar = 500 µm. c, Visualization of the detected cells in CCF3 atlas space using BrainRender (same image as in Fig. 2e). Scale bar = 1 mm.

Extended Data Fig. 4 Tissue weights of mice with weight-stable cancer (NC26) and cancer-associated weight loss (C26).

a, Gastrocnemius (GC) muscle weight. n(PBS) = 12, n(NC26) = 8, n(C26) = 12, ****p < 0.0001, ***p = 0.0004, one-way ANOVA with Sidak post hoc analysis. b, Epididymal white adipose tissue (eWAT) weight. n(PBS) = 12, n(NC26) = 8, n(C26) = 12, **p(PBS vs C26) = 0.0040, **p(NC26 vs C26) = 0.0015, Kruskal-Wallis test with Dunn's multiple comparison test. c, Subcutaneous WAT (scWAT) weight. ***p = 0.0003, **p = 0.0019, Kruskal-Wallis test with Dunn's multiple comparison test. d, Brain weight. n(PBS) = 12, n(NC26) = 7, n(C26) = 12, *p = 0.0479, one-way ANOVA with Sidak post hoc analysis. All data are presented as mean values ± SEM.

Supplementary information

Supplementary Information

Supplementary Fig. 1, Supplementary Table 1 and DELiVR handbook.

Reporting Summary

Supplementary Video 1

Annotation of cells in virtual reality using Arivis VisionVR.

Supplementary Video 2

Annotation of cells in VR using syGlass.

Supplementary Video 3

2D-slice-based annotation using ITK-SNAP.

Supplementary Video 4

Whole brain with c-Fos+ cells detected by DELiVR, depicted in the original image space. Cells are color coded by the area they belong to in the Allen Brain Atlas.

Supplementary Video 5

Example run of the ImageJ plugin on Ubuntu Linux.

Source Data Fig. 1

Numerical source data.

Source Data Fig. 2

Source Data Fig. 4

Source Data Fig. 5

Source Data Extended Data Fig. 2

Source Data Extended Data Fig. 4

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kaltenecker, D., Al-Maskari, R., Negwer, M. et al. Virtual reality-empowered deep-learning analysis of brain cells. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02245-2

Received: 03 June 2022

Accepted: 12 March 2024

Published: 22 April 2024

DOI: https://doi.org/10.1038/s41592-024-02245-2
