  • Review Article
  • Open access
  • Published: 03 March 2021

Digital public health surveillance: a systematic scoping review

  • Zahra Shakeri Hossein Abad (ORCID: orcid.org/0000-0003-4519-864X) 1,2
  • Adrienne Kline 1,3
  • Madeena Sultana 1,2
  • Mohammad Noaeen 4
  • Elvira Nurmambetova 1
  • Filipe Lucini (ORCID: orcid.org/0000-0002-6090-6846) 1,5
  • Majed Al-Jefri (ORCID: orcid.org/0000-0002-2293-3632) 1,3
  • Joon Lee (ORCID: orcid.org/0000-0001-8593-9321) 1,2,6

npj Digital Medicine volume 4, Article number: 41 (2021)

  • Public health

The ubiquitous and openly accessible information produced by the public on the Internet has sparked an increasing interest in developing digital public health surveillance (DPHS) systems. We conducted a systematic scoping review in accordance with the PRISMA extension for scoping reviews to consolidate and characterize the existing research on DPHS and identify areas for further research. We used Natural Language Processing and content analysis to define the search strings and searched Global Health, Web of Science, PubMed, and Google Scholar from 2005 to January 2020 for peer-reviewed articles on DPHS, with extensive hand searching. Seven hundred fifty-five articles were included in this review. The studies were from 54 countries and utilized 26 digital platforms to study 208 sub-categories of 49 categories associated with 16 public health surveillance (PHS) themes. Most studies were conducted by researchers from the United States (56%, 426) and were dominated by communicable diseases-related topics (25%, 187), followed by behavioural risk factors (17%, 131). While this review discusses the potential of using Internet-based data as an affordable and instantaneous resource for DPHS, it highlights the paucity of longitudinal studies and the methodological and inherent practical limitations that hinder the successful implementation of a DPHS system. Few studies examined Internet users’ demographics when developing DPHS systems, and 39% (291) of studies did not stratify their results by geographic region. A clear methodology by which the results of DPHS can be linked to public health action has yet to be established, as only six (0.8%) studies deployed their system into a PHS context.

Introduction

Internet technology is now a part of almost everyone’s life. Internet usage among US adults has steadily increased from 52% in 2000 to 90% in 2019 1 . Today, 97% of Internet users worldwide are active on social media, and the average number of social media accounts per Internet user has grown from 6.2 in 2015 to around 8 in 2019 2 . The low-cost data stream available on social media and other Internet-based sources is increasingly harnessed by clinicians, patients, and the general public to disseminate insights into disease trends and promote healthy lifestyles and health policies 3 , 4 . Every minute, people around the world are publicly sharing volumes of personal and communal health information on different digital platforms 5 , such as social media, discussion forums and blogs, and Internet search engines. Digital surveillance data, inspired by the definition of digital epidemiology data by Salathé 6 , is the publicly available user-contributed data not generated with the primary goal of surveillance. This data can provide access to otherwise hard-to-reach populations and has become integral to digital public health surveillance (DPHS). Public health surveillance (PHS), as a tool for monitoring and targeting interventions 7 , is the ongoing systematic collection, analysis, and interpretation of data, tightly integrated with the timely dissemination of these data to those who can undertake effective prevention and control activities 8 , 9 . Beyond their unprecedented volume, these online resources, when used appropriately, can provide an increasingly clear picture of the dynamics and complexities of traditional PHS processes 5 , 10 .
Compared to the data captured through traditional PHS channels, digital resources contain information that can be harnessed to reduce the time to outbreak detection, add more transparency to outbreak information published by governments, and facilitate public health (PH) responses to emerging diseases and population-related risk factors 10 . These resources can be used either for infodemiology (utilizing digital data for mining, analysis, and information aggregation with the ultimate aim of informing PH and public policy) or for infoveillance (infodemiology methods with the main focus on surveillance) 11 . Infodemiology was first formally introduced by Gunther Eysenbach in 2002 to describe the distribution of health information and misinformation on digital platforms 12 and was later extended to other areas of utilizing digital data for PH research, such as outbreak detection, substance use, and drug utilization 13 .

The interactivity of the Internet and the highly networked, hyperlocal, and contextualized nature of digital data offer an unparalleled opportunity for the public, patients, and health officials alike to communicate and address health issues. Examples of such opportunities being realized include profiling vaccine criticisms 14 , mining patients’ narratives about drug experiences on open-access forums 15 , geospatial tracking of the population during disease outbreaks, providing local and near real-time information to support the recognition of an outbreak 16 , 17 , and population-based clustering of behavioural risk factors such as physical inactivity, substance use, and poor diet in large populations 18 , 19 .

Effective DPHS requires an understanding of the potential and pitfalls of digital data for monitoring PH and exploring disease dynamics. Several narrative reviews of the application of digital media in PHS and epidemiology have been published 20 , 21 , 22 , 23 , 24 , 25 , 26 . Bernardo et al. reviewed 32 studies published between 2002 and 2011 that utilized search queries and social media data for infectious diseases surveillance 20 . The authors concluded that even though there are challenges associated with the quality of digital data, there have been successful applications of digital disease surveillance since 2006 and their performances in terms of cost, time, and accuracy compare favourably with those of traditional surveillance systems. This was confirmed by a recent scoping review on using web-data for disease surveillance and epidemiology in which Mavragani studied 338 articles from 2009 to 2018 and highlighted the potential of digital surveillance in health informatics research 26 . Newer reviews on this subject have dealt with the popularity of different surveillance domains over time and summarized recent methodological developments mapped to each domain 27 , 28 . The most recent and extensive digital surveillance review 28 presented a timeline tracking online interest in PH and focused solely on the ethical and validity issues rife in the digital health monitoring revolution. While the topics covered in our review encapsulate those mentioned, this review expands on the notion of DPHS by exploring more platforms and a broader context within the PH field. Moreover, a systematic evaluation is absent from the existing reviews, and most cover only certain platforms or diseases/disorders. Therefore, we aimed to provide a comprehensive synthesis of evidence that adds to the extant literature by filling both of these gaps while providing a proportional topic saturation level.
Our scoping review also provides details on utilizing digital media in different aspects of PHS. This allows future researchers to identify where further work is most needed and which untapped potential deserves more attention in the digital surveillance sphere.

To identify literature on DPHS, we conducted an iterative systematic search with extensive hand searching. Our scoping review was designed, implemented, and reported following the Preferred Reporting Items for Systematic Reviews and Meta Analyses Extension for Scoping Reviews guidelines (PRISMA-ScR) 29 . While there are other well-established guidelines for conducting systematic scoping reviews 30 , 31 , 32 , the detailed reporting guideline, demonstrative examples, and best-practices for large-scale scoping reviews provided by PRISMA-ScR were ideal for our review. The search yielded 4249 articles. Excluding duplicates, we found 2907 studies from which we selected 755 studies of 16 PHS themes, associated with 49 PH categories and 208 sub-categories (Fig. 1 ). The complete list of included articles is provided in Supplementary Note 5 (a1–a755) .
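The selection counts reported above can be checked with simple arithmetic. The following is a minimal sketch that uses only the three counts stated in the text; the dictionary keys and variable names are hypothetical labels chosen for illustration, not PRISMA-ScR terminology, and the split of exclusions between screening stages is not given in this section, so it is not modelled.

```python
# Counts reported in the text; the keys are hypothetical labels for illustration.
flow = {
    "records_identified": 4249,
    "after_duplicate_removal": 2907,
    "included_in_review": 755,
}

# Records dropped at each step of the selection process.
duplicates_removed = flow["records_identified"] - flow["after_duplicate_removal"]
excluded_after_deduplication = (
    flow["after_duplicate_removal"] - flow["included_in_review"]
)

# Share of de-duplicated records that ended up in the review.
inclusion_rate = flow["included_in_review"] / flow["after_duplicate_removal"]

print(f"Duplicates removed: {duplicates_removed}")
print(f"Excluded after de-duplication: {excluded_after_deduplication}")
print(f"Share of de-duplicated records included: {inclusion_rate:.1%}")
```

Roughly one in four de-duplicated records was retained, which gives a sense of how broad the initial search strings were relative to the inclusion criteria.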

Fig. 1: The overall process of article selection following the PRISMA-ScR guideline.

Table 1 lists all PHS themes, their corresponding (sub)categories, and the relevant articles. These themes include behavioural risk factors (BRFs), cancer, chronic disease, communicable diseases, paediatric health, drug utilization, food and nutrition, health practices, health services, environmental hazards, mental health, mortality, vaccine, and urogenital/preconception. Articles that did not coincide with these topics but dealt with PHS were subsumed under the ‘others’ category (e.g., occupational safety). Each paper was contextualized based on the theme it was most closely affiliated with (e.g., BRFs for smoking behaviours and mental health for suicide, depression, bipolar, or eating disorders). More than one context was permitted to capture topics that would fit into two categories (e.g., eating disorders were placed in both the mental health and the chronic disease categories). Many papers harnessed digital data to study the quality of health services, so a category was created to reflect this. Within it, papers on health education/campaigns and communication were placed in a communication subgroup, and those involving emergency departments, nursing homes, and other health services were grouped in the accessibility and quality subgroups.

The surveillance theme with the largest number of publications was ‘communicable disease’ surveillance at 25% (187). The stark rise in the volume of communicable disease publications coincides with the 2016 Zika outbreaks. In 2016, ILI-focused studies were the most common ‘communicable disease’ studies (53%), following a similar distribution to the overall trend of all such studies. In 2017, Zika-focused studies were the most common (36%). Publications in 2017 saw a greater variety of health events studied (Fig. 2 ).

Fig. 2: The temporal trends of the two most prevalent themes of DPHS systems in the literature.

A large proportion of BRFs studies can be linked to policy changes. The peak of e-cigarette publications in 2016 and 2017 (Fig. 2 ) may be attributed to growing international concerns in the preceding years as policymakers noticed vaping products marketed towards youth and young adults. A congressional report in the USA 33 and the WHO FCTC 34 , both in 2014, may have prompted increased research in this area in subsequent years. Similarly, the sudden academic interest in cannabis research in 2016 may result from the rapid legalization and decriminalization of medicinal and recreational cannabis in the preceding years (Fig. 2 ).

Countries, affiliations, and surveillance systems

A total of 79% (593) of the studies included in this review were published by researchers from the USA (426), UK (51), Australia (44), Canada (36), and Italy (36). The most common surveillance themes researched in these countries were communicable diseases, BRFs, chronic disease, drug utilization, and mental health (Fig. 3 a).

Fig. 3: a Top five countries and PHS themes. b The frequency of different combinations of affiliations, PHS themes and the average number of authors per country.

More than 94% (707) of the studies involved authors affiliated with academia, of which 460 were affiliated only with academia. Only 3% (23) of studies had an author affiliated with a government; ten of these studied communicable diseases, and three studied the general aspects of PH (Fig. 3 b). None of these studies investigated the vaccine, environmental hazards, or health practices surveillance systems. Studies that utilized datasets with no geographic focus (36%, 268) were dominated by BRFs, communicable diseases, and chronic diseases. The majority of studies with geographically focused datasets used country-level data, and only 0.7% used ZIP-code level datasets. The studies in this category were dominated by communicable diseases, BRFs, and health services surveillance systems (Fig. 3 b).

Social media platforms and surveillance systems

Starting from 2005, the three most common digital platforms studied were, in descending order, Twitter, Google Trends, and Facebook. Their numbers increased sharply from fewer than three studies per year in 2009 to 78, 49, and 13 studies, respectively, in 2019. Google (Flu) Trends (GT and GFT) were utilized by 41% (76) of publications on communicable diseases, among which 57% (43) aimed to predict outbreaks and seasonal diseases. Of the 69 studies that utilized Twitter to study communicable diseases, 32% (22) mined tweets for outbreak prediction. Facebook, Instagram, and YouTube were mainly utilized to study BRFs, focusing on smoking, substance use, and lifestyle. Fifty percent of studies that used Yelp investigated topics related to ‘health services’, while this number for Facebook, YouTube, Instagram, and GT is less than 2% (Fig. 4 ). Almost half of the studies on ‘mental health’ used Twitter data, and 11 studies used GT to observe the seasonal patterns of Internet search volume for a wide range of mental health terms. More details about the digital platforms used by the included studies are presented in Supplementary Note 3 .

Fig. 4: Surveillance systems that utilized more than one platform were assigned to multiple platforms, up to a maximum of five. Studies that investigated more than five platforms are mapped to the ‘Social Media Platform’ column.

Methods—data collection duration

There was wide variability in data collection duration (Fig. 5 ). Overall, 36% (268) of the included studies had a duration of more than 2 years, 14% had a duration of 1–2 years, and 40% had a duration of less than 1 year, with a greater proportion covering less than 6 months. All surveillance themes followed similar distributions, with some notable exceptions: 53% of chronic disease publications had a duration greater than 2 years, while this number for the communicable diseases and BRFs themes is 44% and 21%, respectively. Notably, urogenital publications had the shortest duration of data collection, with 34% lasting less than 1 month. Indeed, from Table 1 , the associated PH categories (i.e., genital, renal, and urinary) are events with a typically short onset and duration. Moreover, 98% (740) of studies based their analysis on secondary data, that is, longitudinal data that are sometimes collected months or years after the event occurred 35 . Thus, surveillance systems that are developed based on secondary data analysis are more useful for long-term rather than short-term interventions 35 .

Fig. 5: The differences in data collection duration across included studies and the proportion of articles within each time frame across all surveillance systems.

Methods—objectives, data analysis, and findings

We classified the studies based on their overall data collection and analysis methodology (Fig. 6 ). Studies with the main focus on mining, analysis, and information aggregation to inform PH and public policy were placed in the infodemiology category (77%). Studies that emphasized surveillance were classified as infoveillance (23%) 11 . Not surprisingly, 112 (60%) of the publications on communicable diseases are infoveillance studies. This could be because of the great potential of existing digital data, such as search queries and access logs, to explore the public’s digital behaviour and detect epidemic outbreaks. The main objective of infodemiology publications was to mine users’ status updates (O13, 32%), and the most common finding was providing baseline data (F16, 23%). Conversely, the infoveillance studies were dominated by those that showed the predictability (F13, 28%) and applicability (F1, 22%) of digital data for outbreak detection (O14, 31%).

Objectives and findings

From the manual content analysis of the objectives and findings of the included studies, eighteen distinct strands of investigation emerged. The top three were ‘providing baseline information’ on risk patterns and trends in the occurrence of various health events (22%, 163), exploring the ‘applicability’ of utilizing web-based platforms in PHS systems (13%, 98), and ‘identifying users’ digital behaviour’ for evaluating the correlation between online activity and the incidence and temporal trends of risk factors (11%, 84) (Fig. 6 ).

Fig. 6: The bottom charts represent the temporal trends of data analysis used by the included studies and the frequency of articles that identified each of the age/gender/place in their datasets.

Detecting unhealthy advertisements (O1) is the second most frequent objective associated with BRFs publications, with 89% (16) of them related to smoking (69%: e-cigarette/JUUL and LCC). Seventy five percent (12) of these publications showed the prevalence of advertising smoking behaviour (F14), and 19% (3) explored the marketing strategies used by smoking vendors (F10). This implies the utilization of digital resources as marketing platforms for different smoking brands, which may carry major PH risks (Fig. 6 ). Exploring public opinion (O5) and sentiment (O6) towards immunization are the most common objectives in the publications on vaccine surveillance (48%, 23). These objectives are mainly mapped to supportive attitudes (F18) and negative sentiments (F12), respectively. These findings imply the need to design and implement appropriate educational information tailored to different social media platforms, with the main focus on the users who are at risk of excessive exposure to anti-vaccine information. For example, men are far more likely to express a negative opinion about HPV immunization than women a695 , or users who are more often exposed to negative opinions about HPV vaccines are more likely to post negative messages subsequently a697 .

Twenty one percent (13) of publications on drug post-marketing/utilization reported on the applicability (F1) of using Internet-based data in exploring drug safety/adverse drug reaction (ADR) (85%), post-marketing (8%), and drug abuse (7%). Interestingly, two studies showed that Twitter might not be a useful platform for this system, as the ADR reports on Twitter usually underrepresent specific drugs and often do not meet the FDA criteria required for reporting an ADR a468 , a476 . This is in line with a recent systematic review that shows the prevalence of ADR reports on social media varies from 0.2% to 8% of all postings 36 . Sixty three percent (19) of mental health studies reported risk indicators (F7), from which 73% (14) were related to self-harm or suicide attempts. Applying linguistic analysis methods a652 , exploring time-varying features related to suicide risk factors a625 , mapping digital behaviour of different age groups to these indicators a610 , a622 , and emotion analysis a645 are sample exploratory techniques discussed by the publications in this category. In oncology, exploring the digital behaviour of users (F4) can be used to identify temporal trends of cancer risk factor queries, cancer incidence and mortality, and interests in cancer screening, compared to other information-seeking domains 37 . Thirty eight percent (5) of studies placed in the [Cancer/F4] category used GT a167 , a169 , a170 , a175 and Yahoo Buzz Index (YBI) a168 to conduct search-based cancer surveillance and 23% (3) mined user-generated content (O13) on Twitter a161 , a171 , a173 to study cancer information-seeking behaviours and the incidence of some types of cancer.

Age/gender/place and temporal trends of data analysis

Given the primary purpose of surveillance is the monitoring and assessment of the overall health status of population subgroups 9 , analyzing time, demographics (age, gender), and place is a critical component of any PHS system 35 . Since the rise of Internet-based data usage in PHS, great strides have been made in identifying place, gender, and age from anonymous self-reported information on the Internet. Mining users’ profile information a37 , a199 , content analysis a132 , a162 , a727 , population survey a318 , a508 , mapping to local demographic data a630 , and utilizing third-party tools a120 , a201 are some sample techniques used by the studies included in this review to explore these variables. However, relatively few studies have systematically incorporated these epidemiologic parameters in their data analysis, despite the value of these indicators in identifying risk groups (Fig. 6 ). Moreover, it is worth noting that questions of validity, mis-classification of users 38 , and under-counting caused by sampling bias 39 are challenges that still need to be addressed. The data analysis of 61% (460) of studies reflects the results of a specific time window, which, excluding communicable diseases, is the most common type of temporal analysis in all reported surveillance systems. Conversely, temporal analysis of the ‘epidemic occurrence’ of a disease and ‘seasonal patterns’ have been the commonly used inferential analytic approaches in analyzing communicable diseases data (Fig. 6 ). Thirty-two percent (242) of studies did not capture any of the age/gender/place variables for their data analysis, with the majority of them coming from the BRFs category.

Evaluation of the surveillance system

Seventy-four percent (561) of studies evaluated the usefulness of their proposed DPHS system by drawing a mapping between the system’s objectives and outcomes. Among these, 361 (48% of total) studies were evaluated subjectively, 116 (15%) used quantitative methods such as statistical analysis and machine learning (ML) techniques, and 85 (11%) used surveys/qualitative analysis methods. Twenty-five percent (192) of studies used the ‘representativeness’ approach to explore the extent to which the characteristics of reported events can accurately represent the incidence of actual health events 40 (Fig. 6 ). About two-thirds (64%, 120) of the articles on communicable diseases used this approach, followed by studies on environmental hazards (43%, 10). Given that the rate calculation (e.g., seasonal/cyclic incidence of a health event) required for measuring the inclusivity of a system needs an entirely separate data system maintained by an external agency (e.g., Centers for Disease Control and Prevention (CDC) ILI data), utilizing this approach might be more challenging for the other surveillance systems.
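A representativeness check of the kind described above often comes down to comparing a digital signal with an externally maintained reference series. The following is a minimal sketch of one such comparison, a lagged Pearson correlation. It is an illustration under stated assumptions, not the method of any included study: the function names are hypothetical, and the weekly values are made up rather than taken from any real dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(digital, reference, lag):
    """Correlate the digital signal at week t with the reference at week t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(digital, reference)
    return pearson(digital[:-lag], reference[lag:])


# Made-up weekly values for illustration only (not data from any included study).
search_volume = [12, 15, 22, 35, 50, 48, 33, 20, 14, 11]
reported_rate = [1.0, 1.1, 1.4, 2.1, 3.2, 4.6, 4.4, 3.0, 1.9, 1.3]

for lag in range(3):
    r = lagged_correlation(search_volume, reported_rate, lag)
    print(f"lag={lag} week(s): r={r:.2f}")
```

In this toy series the digital signal peaks one week before the reference, so the correlation is higher at a lag of one week than at zero. A high correlation alone does not establish representativeness; it only shows that the two series move together over the window examined.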

Data types and analysis methods

Figure 7 summarizes the frequency of different data types used by the included studies, their mapping to different PHS themes, and the proportion of the studies that applied ML techniques to process each data type. Textual data are the category with the highest number of ML applications (31%), and none of the studies that utilized video data used ML. This meagre rate, of course, reflects the fact that there are several pitfalls to the process of analyzing Internet-based data. ‘Search queries’ is the second most frequent data type. Given its popularity, considerations must be given to the limitations of search query analysis, such as the dynamic changes of health information-seeking behaviour, the uncertainty of information seeker representativeness (e.g., some searches may be generated by bots or news reports), and the limited geographic data that can be gleaned from this data type.

Fig. 7: The mapping between data types used by the included studies and the PHS systems, platforms, and the use of machine learning.

Key findings

We report a comprehensive scoping review to summarize and synthesize evidence from a large and heterogeneous body of literature studying DPHS. The growing body of evidence of DPHS reflects the chronological availability of new digital platforms and new data mining and ML techniques. Our findings show the huge effect of mass media on the public’s information-seeking behaviour. Exploring these behaviours can help PH officials tailor their messages to address PH interests and improve healthcare delivery.

Digital data can help portray the dynamics of PHS systems and allow PH professionals to pinpoint the general concerns or needs of the public during infectious disease events to create location-specific campaigns. For example, the finding that there is no association between dental caries and toothache-related information-seeking behaviours among South American Google users may indicate that this population is unfamiliar with the relationship between dental pain and the final stages of chronic oral diseases a735 .

Our findings show a higher prevalence of digital surveillance systems for communicable diseases (25%, 187). One possible reason for this is that topics such as seasonal outbreaks and epidemics, sexually transmitted and infectious diseases, can be coalesced in this category, making it a far-reaching one. Another reason may be the ease of using relative search volumes for various outbreak-related and infectious diseases using Google Trends, access logs on other social media platforms, as well as the fear/hype surrounding infectious diseases and different epidemics such as H1N1, Ebola, and Zika. Very few papers dealt with ‘disease burden’ (0.3%) and ‘occupational safety’ (0.5%), which came as a surprise given the excellent availability of Google Trends data.

The surveillance themes studied by each country appear to follow international trends (Fig. 3 a). Interestingly, the USA and Australia had a greater proportion of articles studying BRFs, which may be attributed to differences between countries. For instance, according to the UN World Drug Report (2016), the prevalence of cannabis users in the USA and Australia in 2015 surpassed the European average by roughly 4% 41 . Although cannabis remains the most commonly used illicit drug in both countries, Australia has seen a drastic rise in the use of amphetamines and other illicit drugs since 2012. The USA holds the largest market for e-cigarettes and has the most reported vaping-related illnesses, particularly in young people. Furthermore, both countries have significantly more overweight or obese people. Recent reports show that 67% of Australian adults and 71% of American adults (over the age of 20) are overweight. Combined, these factors may contribute to increased research in smoking, lifestyle habits and illicit substance use, which in turn increases the proportion of behavioural risk factor publications.

While the use of user-generated information on the Internet certainly shows promise, especially from the standpoint of providing an alternative and inexpensive solution to PHS, questions remain regarding the validity and generalizability of social media and Internet data 28 . Given the limited length of data (e.g., a tweet), different language styles between Internet users, and no restriction on their writing style, user-generated content often contains a high amount of noise, making the automatic information extraction and classification of free-text data challenging and time-consuming. Moreover, many concerns have been raised about the correctness and the quality of health-related digital data and the detrimental effects that misinformation can have on PH 42 . This concern with misinformation was also apparent during the 2014 Ebola outbreak a335 and the Zika outbreak in 2016 a354 , a357 , a359 , a366 . Table 2 lists the included studies that investigated the spread of inaccurate or incomplete health-related information on the Internet. The number of studies in this category increased from 21 in 2015 to 60 in 2019, with a spike in 2017, comprising 8% of all included studies. Digital misinformation can spread quickly but is difficult to refute. As listed in Table 2 , the majority of research on PH-related misinformation has focused on communicable diseases and BRFs surveillance systems, and most of the misinformation reported by the included studies proliferated via Twitter, news websites, and Facebook, in that order. Sixty-seven percent (40) of these studies analyzed textual data, and 18% (11) contained video data. Studies without a geographic focus were dominated by those on drug utilization, chronic diseases, and vaccines, in that order. Interestingly, studies that investigated misinformation in a specific geographical zone mainly focused on BRFs, communicable diseases, and health services surveillance systems.
Despite this long-standing effort, there is still a clear need for a valid assessment of the potential for harm associated with digital health misinformation and its relative impact for different surveillance systems.

Limitations of the included studies

First, we found that 61% (460) of studies conducted cross-sectional analysis (Fig. 6 ), and thus they were unable to evaluate the longitudinal or temporal dynamics of their findings. These findings might change over time, and longitudinal analysis would be needed before being utilized by PH decision-makers. Ten percent (75) of studies did not even report the time scale of their analysis and only reported the analysis results. Even if the temporal analysis is unrevealing, the usefulness of a PHS system needs to be assessed periodically to ensure that it is serving a useful PH function 35 .

Second, the majority of the studies that utilized digital data for PHS (77%, 581) had an exploratory nature and attempted to gather information and data to inform PH officials about the potential of DPHS in different areas of PHS (Table 1 ). Among these studies, 28% (165) provided baseline data (F16 in Fig. 6 ), 17% (98) investigated the applicability and feasibility of digital data for PHS (F1), and 28% (163) studied users’ digital behaviour and their concerns and opinions about different aspects of PH (F4, F6, F12, and F18). While these studies provide some valuable information on the potential of DPHS, they represent only the first three steps of a PHS process (i.e., planning and design, data collection, and data analysis, Fig. 8 ) and are limited in real-world evaluation (i.e., sensitivity and representativeness analysis) and system deployment.

Fig. 8: The phase coloured in red highlights the key difference between traditional and digital public health surveillance. The summary of current limitations of research on DPHS discussed throughout this review is mapped to and listed below each activity of the process.

Third, around 40% (299) of studies were limited by sample size and scope, as they used labour-intensive methods such as manual coding and qualitative analysis. The majority of the 219 studies that applied NLP methods used rule-based and lexical matching techniques such as topic modelling, sentiment analysis, and language modelling. These methods can only extract abstract themes at a high level, and the subjectivity in the interpretation of their results might limit the generalizability and the accuracy of the findings of these studies.

Fourth, content bias is another limitation of the studies included in our review. User-generated content on the Internet is highly biased, as it reflects information that people are comfortable revealing and may not represent the real spectrum of their feelings and experiences. In addition, our results show that among the 554 studies that used text, image, or video data types, only 20% (111) took into account whether their findings were associated with the user’s personal experience (i.e., self-reported) or not. Thus, there is a clear need for studies capable of determining and mitigating content biases that affect the formation and adoption of digital data for PHS.

Fifth, the final link in the surveillance chain is the timely dissemination of the system’s findings to the general public or PH officials for action. Of the articles included in this review, only six (0.8%) linked their results to public health action. While there is a clear need for rigorous methodologies by which the results of DPHS systems can be converted into usable information, vigilance is still needed regarding the efficacy and safety of these findings to avoid unintended consequences of these results for PH decisions.

Sixth, while the anonymity of Internet users enables individuals with discreditable stigma to reap the benefits of supportive communication on digital media 43 , 44 , the difficulty of ascertaining demographics poses several unresolved questions regarding the inherent population biases of Internet users with different cultural backgrounds or socioeconomic status. Demographics for most digital platforms are not nationally representative and are skewed toward younger age groups and users with higher levels of education 45 , 46 . We found that no studies assessed digital media utilization for vulnerable populations (e.g., low-income, older adults, or people with a disability) who are underrepresented on different digital platforms. Studies on detecting social bots are scarce. Considering the radically increasing rate of childhood obesity with the subsequent adolescent onset of nutrition-related chronic conditions such as diabetes and cardiovascular diseases 47 , 48 , which could be due to the massive exposure of adults and children to unhealthy food and beverages through product placements and promotional advertisements on different digital platforms 49 , 50 , 51 , this topic is vastly underreported by the research on DPHS.

Seventh, among the 379 studies that utilized Twitter, Facebook, and Instagram, 41% (156) confined their analysis to content tagged with specific hashtag(s). These studies represent a biased population of users, and they may have skewed the data by excluding content relevant to the health event under study. Furthermore, from the full text of the 581 studies that did not use hashtags, we manually extracted the methodologies they employed to query the Internet or filter their collected data and found that the majority (71%, 411) used only their subjective opinion and 10% (57) used the existing literature to define their search keywords. Trend analysis (i.e., Google Correlate) and ontology-based keyword extraction were used by 6% (37) and 5% (29) of the studies, respectively. Only 1% (7) of studies used automatic algorithms such as ML, NLP, or lexical analysis to extract context-sensitive keywords. Considering the rapid changes in web search behaviours, the uncertainty regarding the representativeness of pre-defined keywords, and the highly context-sensitive nature of health-related events, keyword querying alone might not be suitable in DPHS a634 .
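The selection effect described in this paragraph can be made concrete with a toy sketch. The posts, the hashtag, and the keyword below are invented for illustration and do not come from any included study; the two helper functions are hypothetical names. The sketch only shows how a hashtag-only filter and a fixed-keyword filter can each miss content the other one captures.

```python
import re

# Invented example posts; none of them come from the reviewed studies.
posts = [
    "Got my #flushot today",
    "Feeling feverish, I think I caught the flu",
    "Flu season is here, stay home if you are sick",
    "New cafe downtown #brunch",
]


def has_hashtag(text, tag):
    """True when the post carries the given hashtag (case-insensitive)."""
    pattern = rf"(?<!\w)#{re.escape(tag)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def mentions_keyword(text, keyword):
    """True when the post contains the keyword as a whole word."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


by_hashtag = [p for p in posts if has_hashtag(p, "flushot")]
by_keyword = [p for p in posts if mentions_keyword(p, "flu")]

print(len(by_hashtag), len(by_keyword))
print(set(by_hashtag) & set(by_keyword))
```

In this toy set the hashtag filter keeps one post and the keyword filter keeps two others, with no overlap, which is the kind of exclusion the paragraph warns about.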

Eighth, compounding the population bias of social media data, 82% (619) of studies analyzed only one platform, potentially leading to false positives. For example, Twitter content on poliomyelitis differs significantly from other English-language media content a410 . Eighty-five percent (638) of studies are limited to English-language content. Given that some of the health-related issues addressed by the included studies may be prevalent in countries other than the USA and countries with large English-speaking populations, the language bias can limit the conclusions to English-speaking populations. For example, the largest burden of cervical cancer is in non-English-speaking countries such as countries in Africa, Asia, and South America a135 , while only English-language tweets were reviewed to study this topic.

Ninth, although the health outcomes of different PHS systems are highly location-dependent and might vary based on local healthcare policies 52 , the results of 36% (274) of the studies reported in this review were not segmented by geographic location, thus limiting the conclusiveness of their results. For example, while search engine data may be a useful tool to study the temporal dynamics of the pollen seasons in Ukraine and China a587 , a595 , the agreement between search queries and pollen concentrations in France is usually poor a588 . Similarly, in studies that investigated drug abuse in the context of varying policies, digital data were shown to be a valuable indicator of drug-related communications a114 – a116 , a123 . However, this limitation is inherent in some of the digital platforms such as Yelp, Reddit, and WikiTrends as they do not make the location of the poster or visitor readily available. More details about the challenges of using specific digital platforms for different PHS topics are presented in Supplementary Note 4 .

DPHS and its challenges

Despite the improvements enabled by digital technologies, the overall process of PHS research has remained constant and contains five main systematic and iterative activities 9 , 53 . Figure 8 illustrates the overall process of DPHS and summarizes the limitations of existing research on DPHS discussed earlier by mapping them to different activities of this process. During the course of this review, we found that the main differences between traditional and DPHS lie in how and for what purposes the data are generated and utilized (highlighted in Fig. 8 ). Following the definition of digital surveillance data used to define the scope of this review, a DPHS system uses digital data voluntarily generated by the public, regardless of the main objectives of the task at hand. Digital data generated through online surveys or polls with a pre-defined surveillance goal or digital content that is not publicly available cannot be considered digital surveillance data. This methodological difference between traditional and digital PHS systems helps explain the challenges mapped to different DPHS activities (listed in Fig. 8 ). Data source bias (e.g., limited platforms and content/population bias), data collection limitations (e.g., subjective filtering), challenging data analysis due to the complexities of unstructured digital data, and lack of sensitivity analysis for evaluating DPHS systems due to the limitations of mapping digital data to national and real-world data are some of the key challenges that still need to be addressed in future work.

Limitations of the scoping review

This study has some limitations. First, the terminology of DPHS is not yet consistently established, and our search strings may not have captured all the existing evidence. To mitigate this, in addition to a literature review and involving domain experts, we used language modelling and lexical analysis to find the context-sensitive terms that represent the field. Second, papers excluded based on our criteria may yet prove relevant to DPHS, despite decisions made by three reviewers. Finally, although we have tried to discuss some of the most important findings in the literature through intuitive and detailed visualization techniques, it is impossible in a limited space to detail all aspects of the studies that utilized digital media for PHS. The supplementary dashboard we present alongside this study presents more interactive results. However, we believe that a more broadly based review of each of the surveillance systems presented in this paper provides necessary contexts for DPHS.

Search strategy and selection criteria

For this scoping review, we searched Global Health, Web of Science, and PubMed for articles published in English, up to January 2020. For each search string, we also searched the first ten pages of Google Scholar, which displayed 20 results per page, to ensure we had included all highly cited articles relevant to the scope of our review. To define the search strings for automated search, we used literature review, manual content analysis, and Natural Language Processing (NLP), including language modelling (i.e., the probability of a given sequence of words in a document) and lexical association analysis (i.e., the co-occurrence of words), to explore the context-sensitive terms relating to DPHS (Supplementary Note 1.1 and Supplementary Table 1 ). The reference lists of the included articles were also screened for additional relevant studies not identified during the automatic search. To assess the performance of the developed search strategy, the sensitivity of more than 200 search strings was tested using a quasi-gold standard 54 set of 80 articles. These articles were selected manually from studies published in four public health journals from 2017 to 2018 (Supplementary Note 1.2 and Supplementary Table 2 ).
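The sensitivity check against a quasi-gold standard can be sketched as follows. This is a minimal illustration, assuming sensitivity is taken as the share of gold-standard articles that a search string retrieves; the function name and the article identifiers are hypothetical, and the counts are invented rather than taken from the review.

```python
def sensitivity(retrieved_ids, gold_ids):
    """Share of gold-standard articles that a search string retrieved."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold standard set is empty")
    return len(gold & set(retrieved_ids)) / len(gold)


# Hypothetical identifiers: a gold standard of 80 articles, and a search
# string that retrieves 60 of them plus 40 unrelated records.
gold = [f"G{i}" for i in range(80)]
retrieved = gold[:60] + [f"X{i}" for i in range(40)]

score = sensitivity(retrieved, gold)
print(f"{score:.2f}")  # 0.75
```

Unrelated records do not lower this measure; it only captures how much of the known-relevant set a search string recovers, so it says nothing about precision.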

We included all studies that were published in English and that either investigated digital data to implement a surveillance system directly (infoveillance) or mined, analyzed, and aggregated information from digital resources to inform PH and public policy for PHS purposes (infodemiology). Digital data in this paper, regardless of its type, refer to the publicly available user-contributed content on the Internet that was not generated with the main purpose of supporting PHS 25 . Digital data sources can be categorized into social networking sites (e.g., Facebook, Twitter); Internet search data (e.g., Google (Flu) Trends); collaborative websites (e.g., Wikipedia); content sharing websites (e.g., YouTube, news websites); and blogs and forums (e.g., Reddit, Yelp) 55 . Thus, we excluded all PHS studies that actively collected data by conducting online surveys, digital polls, and interviews. Moreover, articles that used digital data for personal surveillance (i.e., monitoring potentially exposed individuals to detect early symptoms 35 ) were excluded from this review. We also excluded studies that utilized digital data for purposes other than PHS. For example, studies that reported on leveraging the social structures of digital platforms for health education and research recruitment, or studies that only contributed to developing new ML techniques for PHS were not eligible for inclusion. Full details of the inclusion/exclusion criteria are listed in Supplementary Note 1.4 .

The titles and abstracts of the articles identified by the search strategy were manually screened by three reviewers independently for eligibility according to the inclusion and exclusion criteria. Disagreements about eligibility were settled by discussion among the three reviewers. One reviewer manually assessed the full text of the included publications and identified additional papers that did not meet the eligibility requirements.

Data analysis

A data extraction form was developed and independently piloted on 50 publications by three reviewers. Seven reviewers extracted data from the included articles and two reviewers manually reviewed all fields of the data extraction form and resolved discrepancies by reviewing the full text of the included studies. The following data were extracted from the included papers: authors’ affiliation, number of authors, year of publication, country of authors, country of data collection, platform(s) under study, surveillance theme and (sub) category, objective and findings, the temporal trend of data analysis, surveillance type, age/gender/place mapped to the data, the language of data, analysis methods (i.e., quantitative, qualitative, machine learning), data type (e.g., text, image, video, and search query), duration/start of data collection, evaluation methods, and the methodology of using digital resources for PHS.
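One way to picture the extraction form is as a typed record with one field per extracted item. The sketch below is an assumption about how such a form could be represented, not the authors' actual form: the class name, field names, and example values are all hypothetical, and only a subset of the listed items is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """One row of a hypothetical data extraction form (names are illustrative)."""
    year_of_publication: int
    number_of_authors: int
    author_countries: List[str]
    data_countries: List[str]
    platforms: List[str]
    surveillance_theme: str
    surveillance_category: str
    analysis_methods: List[str]   # e.g. quantitative, qualitative, machine learning
    data_types: List[str]         # e.g. text, image, video, search query
    data_language: str
    temporal_trend: Optional[str] = None  # None if no time scale was reported
    evaluation_methods: List[str] = field(default_factory=list)

    def is_single_platform(self) -> bool:
        """True when the study analyzed exactly one digital platform."""
        return len(self.platforms) == 1


# Invented example values, used only to show the shape of a record.
record = ExtractionRecord(
    year_of_publication=2019,
    number_of_authors=4,
    author_countries=["Canada"],
    data_countries=["Canada"],
    platforms=["Twitter"],
    surveillance_theme="Communicable diseases",
    surveillance_category="Influenza",
    analysis_methods=["machine learning"],
    data_types=["text"],
    data_language="English",
)
print(record.is_single_platform())  # True
```

Keeping optional items such as the time scale as `None` rather than an empty string makes it straightforward to count studies that did not report them.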

To summarize the extracted data from the included articles, we used a descriptive-analytical method to extract contextual and process-oriented information from each study 56 . A qualitative analysis was also conducted using NVivo 10 57 , a software programme for qualitative analysis, to chart the descriptive results and findings of the included studies. We tabulated a hierarchy of digital surveillance systems reported by the included studies and used narrative visualizations to report the findings of this review. We also developed an interactive visual dashboard (available at https://rpubs.com/zshakeri/dphs_dashboard ) to provide insights into the findings with a multidimensional and more granular conceptual structure that is difficult to articulate in text alone. More details about the dashboard are provided in Supplementary Note 2 .

As the primary purpose of this study was to perform scientific paper profiling on internet-based user-generated data in the PHS context, we did not critically appraise the methodological quality of the included studies. However, we will comment on the methodological limitations that could have affected their results and implications.

Data availability

All data generated or analyzed during this review are included in this article and its supplementary information files.

Center, P. R. Internet/broadband fact sheet. https://www.pewresearch.org/internet/fact-sheet/social-media/ (2019). Accessed on July 2020.

Index, G. W. Global Web Index’s Flagship Report on the Latest Trends in Social Media (GlobalWebIndex (GWI), New York City, 2018).

Fung, I. C.-H., Tse, Z. T. H. & Fu, K.-W. The use of social media in public health surveillance. Western Pac. Surveill. Response J. 6 , 3 (2015).

Brownstein, J. S., Freifeld, C. C. & Madoff, L. C. Digital disease detection—harnessing the web for public health surveillance. N. Engl. J. Med. 360 , 2153 (2009).

Kass-Hout, T. A. & Alhinnawi, H. Social media in public health. Br. Med. Bull. 108 , 5–24 (2013).

Salathé, M. Digital epidemiology: what is it, and where is it going? Life Sci. Soc. Policy 14 , 1 (2018).

Jamison, D. T. et al. Disease Control Priorities in Developing Countries (The World Bank, 2006).

Thacker, S. B. et al. Public health surveillance in the united states: evolution and challenges. MMWR Surveill. Summ. 61 , 3–9 (2012).

Teutsch, S. M. Considerations in planning a surveillance system. Principles and Practice of Public Health Surveillance 18–28 (Oxford University Press, New York, NY, 2010).

Salathe, M. et al. Digital epidemiology. PLoS Comput. Biol. 8 , e1002616 (2012).

Eysenbach, G. Infodemiology and infoveillance: tracking online health information and cyberbehavior for public health. Am. J. Prevent. Med. 40 , S154–S158 (2011).

Eysenbach, G. Infodemiology: the epidemiology of (mis)information. Am. J. Med. 113 , 763–765 (2002).

Zeraatkar, K. & Ahmadi, M. Trends of infodemiology studies: a scoping review. Health Inf. Librar. J. 35 , 91–120 (2018).

Ward, J. K., Peretti-Watel, P. & Verger, P. Vaccine criticism on the internet: propositions for future research. Hum.Vaccines & Immunother. 12 , 1924–1929 (2016).

Freifeld, C. C. et al. Digital drug safety surveillance: monitoring pharmaceutical products in twitter. Drug Safe. 37 , 343–350 (2014).

Carneiro, H. A. & Mylonakis, E. Google trends: a web-based tool for real-time surveillance of disease outbreaks. Clin. Infect. Dis. 49 , 1557–1564 (2009).

Nuti, S. V. et al. The use of google trends in health care research: a systematic review. PLoS ONE 9 , e109583 (2014).

Nicholls, J. Everyday, everywhere: alcohol marketing and social media-current trends. Alcohol Alcohol. 47 , 486–493 (2012).

Naslund, J. A. et al. Systematic review of social media interventions for smoking cessation. Addict. Behav. 73 , 81–93 (2017).

Bernardo, T. M. et al. Scoping review on search queries and social media for disease surveillance: a chronology of innovation. J. Med. Internet Res. 15 , e147 (2013).

Sinnenberg, L. et al. Twitter as a tool for health research: a systematic review. Am. J. Public Health 107 , e1–e8 (2017).

Velasco, E., Agheneza, T., Denecke, K., Kirchner, G. & Eckmanns, T. Social media and internet-based data in global systems for public health surveillance: a systematic review. Milbank Q. 92 , 7–33 (2014).

Fung, I. et al. Ebola virus disease and social media: a systematic review. Am J. Infect. Control 44 , 1660–1671 (2016).

Capurro, D. et al. The use of social networking sites for public health practice and research: a systematic review. J. Med. Internet Res. 16 , 1–14 (2014).

Park, H., Jung, H., On, J., Park, S. K. & Kang, H. Digital epidemiology: use of digital data collected for non-epidemiological purposes in epidemiological studies. Healthc. Inform. Res. 24 , 253–262 (2018).

Mavragani, A. Infodemiology and infoveillance: scoping review. J. Med. Internet Res. 22 , e16206 (2020).

Edo-Osagie, O., De La Iglesia, B., Lake, I. & Edeghere, O. A scoping review of the use of twitter for public health research. Comput. Biol. Med. 122 , 1–13 (2020).

Aiello, A., Renson, A. & Zivich, P. Social media—and internet-based disease surveillance for public health. Annu. Rev. Public Health 2020 , 101–118 (2020).

Tricco, A. C. et al. Prisma extension for scoping reviews (prisma-scr): checklist and explanation. Ann. Intern Med. 169 , 467–473 (2018).

Peters, M. D. et al. Guidance for conducting systematic scoping reviews. Int. J. Evid. Based Healthc. 13 , 141–146 (2015).

Arksey, H. & O’Malley, L. Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8 , 19–32 (2005).

Levac, D., Colquhoun, H. & O’Brien, K. K. Scoping studies: advancing the methodology. Implement. Sci. 5 , 69 (2010).

Marynak, K. et al. State laws prohibiting sales to minors and indoor use of electronic nicotine delivery systems-united states, november 2014. Morbid. Mortal. Wkly Rep. 63 , 1145 (2014).

Organization, W. H. et al. Electronic nicotine delivery systems. Report by WHO (WHO, 2014).

Declich, S. & Carter, A. O. Public health surveillance: historical origins, methods and evaluation. Bull. World Health Organ. 72 , 285 (1994).

Golder, S., Norman, G. & Loke, Y. K. Systematic review on the prevalence, frequency and comparative value of adverse events data in social media. Br. J. Clin. Pharmacol. 80 , 878–888 (2015).

Wehner, M. R. & Nead, K. T. Can google help us fight cancer? Lancet Oncol. 19 , 867 (2018).

Hahn, R. A. & Stroup, D. F. Race and ethnicity in public health surveillance: criteria for the scientific use of social categories. Public Health Rep. 109 , 7 (1994).

Aiello, A. E., Renson, A. & Zivich, P. N. Social media–and internet-based disease surveillance for public health. Annu. Rev. Public Health 41 , 101–118 (2020).

German, R. R., Horan, J. M., Lee, L. M., Milstein, B. & Pertowski, C. A. Updated Guidelines for Evaluating Public Health Surveillance Systems; Recommendations from the Guidelines Working Group (MMWR Recomm Rep., 2001).

UNODC. World Drug Report (United Nations Office on Drugs and Crime, 2016).

Chou, W.-Y. S., Oh, A. & Klein, W. M. Addressing health-related misinformation on social media. JAMA 320 , 2417–2418 (2018).

Powell, J., Darvell, M. & Gray, J. The doctor, the patient and the world-wide web: how the internet is changing healthcare. J. R. Soc. Med. 96 , 74–76 (2003).

Yeshua-Katz, D. & Martins, N. Communicating stigma: the pro-ana paradox. Health Commun. 28 , 499–508 (2013).

Kaplan, A. M. & Haenlein, M. Users of the world, unite! the challenges and opportunities of social media. Bus. Horiz. 53 , 59–68 (2010).

Sadah, S. A., Shahbazi, M., Wiley, M. T. & Hristidis, V. A study of the demographics of web-based health-related social media users. J. Med. Internet Res. 17 , e194 (2015).

Sanou, D. et al. Acculturation and nutritional health of immigrants in canada: a scoping review. J. Immigr. Minor. Health 16 , 24–34 (2014).

Smith, K. B. & Smith, M. S. Obesity statistics. Prim. Care 43 , 121–135 (2016).

Olstad, D. L. & Lee, J. Leveraging artificial intelligence to monitor unhealthy food and brand marketing to children on digital media. Lancet Child Adolesc Health 4 , 418–420 (2020).

Dunlop, S., Freeman, B. & Jones, S. C. Marketing to youth in the digital age: The promotion of unhealthy products and health promoting behaviours on social media. Media Commun. 4 , 35–49 (2016).

Potvin Kent, M., Pauzé, E., Roy, E.-A., de Billy, N. & Czoli, C. Children and adolescents’ exposure to food and beverage marketing in social media apps. Pediatr Obes. 14 , e12508 (2019).

Croner, C. M. Public health, gis, and the internet. Ann. Rev. Public Health 24 , 57–82 (2003).

Choi, B. C. The past, present, and future of public health surveillance. Scientifica 2012 , 1–26 (2012).

Golder, S., McIntosh, H. M., Duffy, S. & Glanville, J. Developing efficient search strategies to identify reports of adverse effects in medline and embase. Health Inf. Librar. J. 23 , 3–12 (2006).

Levac, D., Colquhoun, H. & O’Brien, K. K. Scoping studies: advancing the methodology. Implemen. Sci. 5 , 69 (2010).

Bazeley, P. & Jackson, K. Qualitative Data Analysis with NVivo (SAGE publications limited, 2013).

Acknowledgements

This work was supported by a postdoctoral scholarship from the Libin Cardiovascular Institute and the Cumming School of Medicine, University of Calgary. Also, this work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (RGPIN-2014-04743) and funding from the O’Brien Institute for Public Health, University of Calgary.

Author information

Authors and affiliations.

Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Zahra Shakeri Hossein Abad, Adrienne Kline, Madeena Sultana, Elvira Nurmambetova, Filipe Lucini, Majed Al-Jefri & Joon Lee

Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Zahra Shakeri Hossein Abad, Madeena Sultana & Joon Lee

Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Adrienne Kline & Majed Al-Jefri

Department of Electrical and Computer Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada

Mohammad Noaeen

Department of Critical Care Medicine, Cumming School of Medicine, University of Calgary and Alberta Health Services, Calgary, AB, Canada

Filipe Lucini

Department of Cardiac Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Contributions

Z.S.H.A. led developing and implementing the protocol, designed the search strategy and retrieved articles, designed the data extraction process, screened the search results, extracted data, performed the data analysis, interpreted and visualized the results, developed the dashboard, and led on writing the manuscript. M.N. contributed to the data collection, data analysis, and critically reviewed the results. A.K. and M.S. contributed to the screened search results and extracted data. A.K. contributed to the interpretation of clinical results. E.N., F.L., and M.A. contributed to the extracted data. Z.S.H.A. and M.N. reviewed the extracted data and resolved discrepancies by reviewing the full text of the included studies. J.L. conceived the study, contributed to the protocol development, and critically reviewed the results and the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zahra Shakeri Hossein Abad .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Shakeri Hossein Abad, Z., Kline, A., Sultana, M. et al. Digital public health surveillance: a systematic scoping review. npj Digit. Med. 4 , 41 (2021). https://doi.org/10.1038/s41746-021-00407-6

Received : 08 September 2020

Accepted : 21 January 2021

Published : 03 March 2021

DOI : https://doi.org/10.1038/s41746-021-00407-6

Open Access

Peer-reviewed

Research Article

Performance of COVID-19 case-based surveillance system in FCT, Nigeria, March 2020 –January 2021

Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

Affiliations Nigeria Field Epidemiology and Laboratory Training Program, Abuja, Nigeria, African Field Epidemiology Network (AFENET), Abuja, Nigeria

Contributed equally to this work with: Aishat Bukola Usman, Abdulhakeem Abayomi Olorukooba

Roles Methodology, Supervision, Writing – review & editing

Affiliation African Field Epidemiology Network (AFENET), Abuja, Nigeria

Roles Methodology, Supervision, Writing – original draft, Writing – review & editing

Affiliation Department of Community Medicine, Ahmadu Bello University, Zaria, Kaduna, Nigeria

Roles Resources, Visualization, Writing – review & editing

¶ ‡ INA, DJJ, LAL, CCU, and MSB also contributed equally to this work.

Affiliation Department of Medical Laboratory Science, Ahmadu Bello University, Zaria, Kaduna State, Nigeria

Roles Data curation, Writing – review & editing

Affiliation Department of Public Health, Federal Capital Territory Administration, Abuja, Nigeria

Roles Data curation, Project administration, Writing – review & editing

Roles Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

  • Chikodi Modesta Umeozuru, 
  • Aishat Bukola Usman, 
  • Abdulhakeem Abayomi Olorukooba, 
  • Idris Nasir Abdullahi, 
  • Doris Japhet John, 
  • Lukman Ademola Lawal, 
  • Charles Chukwudi Uwazie, 
  • Muhammad Shakir Balogun

  • Published: April 14, 2022
  • https://doi.org/10.1371/journal.pone.0264839

Introduction

The emergence of the novel SARS-CoV-2 has caused a pandemic of coronavirus disease 2019 (COVID-19), which has spread exponentially worldwide. A robust surveillance system is essential for correct estimation of the disease burden and containment of the pandemic. We evaluated the performance of the COVID-19 case-based surveillance system in FCT, Nigeria and assessed its key attributes.

We used a cross-sectional study design, comprising a survey, key informant interviews, record review and secondary data analysis. A self-administered, semi-structured questionnaire was given to key stakeholders to assess the attributes and process of operation of the surveillance system using the CDC’s Updated Guidelines for Evaluating Public Health Surveillance Systems (2001). The data collected, alongside surveillance data from March 2020 to January 2021, were analyzed and summarized using descriptive statistics.

Out of 69,338 suspected cases, 12,595 tested positive by RT-PCR, giving a positive predictive value (PPV) of 18%. Healthcare workers were identified as a high-risk group, with a prevalence of 23.5%. About 82% of respondents perceived the system to be simple, 85.5% posited that the system was flexible and easily accommodated changes, and 71.4% reported that the system was acceptable and expressed willingness to continue participation. Representativeness of the system was 93%, stability 40%, data quality 56.2%, and timeliness 45.5%; the estimated result turnaround time (TAT) was suboptimal.
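The reported PPV can be reproduced from the two counts given above. This is a minimal sketch, assuming the PPV here is simply the share of suspected cases that were confirmed by RT-PCR; the function name is hypothetical.

```python
def positive_predictive_value(confirmed, suspected):
    """Share of suspected cases that were laboratory-confirmed."""
    if suspected <= 0:
        raise ValueError("suspected must be positive")
    return confirmed / suspected


# Counts as reported in the text above.
ppv = positive_predictive_value(12_595, 69_338)
print(f"{ppv:.1%}")  # 18.2%
```

The result rounds to the 18% stated in the text.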

The system was found to be useful, simple, flexible, sensitive, and acceptable, with good representativeness, but its stability, data quality and timeliness were poor. The system meets its initial surveillance objectives, but rapid expansion of sample collection and testing sites, improvement of TAT, sustainable funding, improvement of the electronic database, continuous provision of logistics and supplies, and additional training are needed to address the identified weaknesses, optimize system performance and meet the increasing need for case detection in the wake of a rapidly spreading pandemic. More persons in risk groups should be tested to improve surveillance effectiveness.

Citation: Umeozuru CM, Usman AB, Olorukooba AA, Abdullahi IN, John DJ, Lawal LA, et al. (2022) Performance of COVID-19 case-based surveillance system in FCT, Nigeria, March 2020 –January 2021. PLoS ONE 17(4): e0264839. https://doi.org/10.1371/journal.pone.0264839

Editor: Khin Thet Wai, Freelance Consultant, Myanmar, MYANMAR

Received: September 24, 2021; Accepted: February 17, 2022; Published: April 14, 2022

Copyright: © 2022 Umeozuru et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data cannot be shared publicly because of third party confidentiality issue. Data will be available on request with the approval of the Federal Capital Territory Health Research Ethics Committee, for researchers who meet the criteria for access to confidential data. A non-author point of contact for Federal Capital Territory Administration: The Secretary of Health and Human Services Secretariat, Federal Capital Territory Administration, Abuja, Nigeria. Dr. Abubukar Tafida ( [email protected] ).

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Since its emergence in December 2019, the novel coronavirus SARS-CoV-2 has caused a pandemic of Coronavirus Disease 2019 (COVID-19) which has spread exponentially all over the world. As of July 2021, the disease had affected 220 countries with over 190 million confirmed cases and 4 million deaths worldwide [ 1 ]. Although the case fatality rate of COVID-19, which is estimated at 2%–3%, is lower than that of MERS (40%) and SARS (10%), the pandemic associated with COVID-19 has been more severe [ 2 ]. The first COVID-19 case in Nigeria was recorded on 27th February 2020 [ 3 ]; this index case brought about the activation of the COVID-19 Emergency Operation Center (EOC) at national and sub-national levels. COVID-19 surveillance in Nigeria from 27th February 2020 to 31st January 2021 recorded a total of 131,242 confirmed cases: 104,989 recoveries, 1,586 deaths with a case fatality rate (CFR) of 1.2%, and 24,667 active cases [ 4 ]. The Federal Capital Territory (FCT) reported its first COVID-19 case on 20th March 2020 following laboratory confirmation of the first three COVID-19 cases. The FCT has since become the second epicenter of COVID-19 in Nigeria after Lagos State, with a total of 16,863 confirmed cases from 20th March 2020 to 31st January 2021: 10,983 recoveries, 126 deaths (CFR 1.1%) and 5,754 active cases. The highest proportions of COVID-19 cases and deaths were seen among age groups 31–40 years and 61–70 years, respectively, and males accounted for a higher proportion of confirmed cases and deaths [ 4 , 5 ].

Case detection and contact identification remain the key surveillance objectives for effective containment of COVID-19. A robust surveillance system is essential for correct estimation of the burden of the disease and containment of the pandemic [ 6 ]. To adequately measure the level of COVID-19 pandemic containment, there is a need for robust local and regional epidemiological data [ 7 ]. COVID-19 surveillance aims to enable public health authorities to reduce transmission of COVID-19 in the state, thereby limiting associated morbidity and mortality [ 8 ]. COVID-19 is captured as a mandatory notifiable case-based disease under the Integrated Disease Surveillance and Response (IDSR) 001 with requirements for immediate reporting. The FCT Public Health Department (PHD) coordinates COVID-19 surveillance in FCT and is responsible for the review of data generated in the state, from which information is used for immediate public health action. Surveillance information flows from the lower to higher levels for onward public health action based on the final laboratory outcome, while feedback goes in the reverse direction, with all reporting levels adequately captured in the system (Fig 1). COVID-19 surveillance utilizes both active and passive surveillance.

Fig 1. https://doi.org/10.1371/journal.pone.0264839.g001

Specific objectives of COVID-19 surveillance system are: to enable rapid case detection, conduct disease control interventions, identify, follow-up and quarantine contacts, detect and contain clusters especially among vulnerable and high-risk populations, guide the implementation and adjustment of targeted control measures, monitor and predict the impact on healthcare systems and monitor longer-term epidemiologic trends and evolution of SARS-CoV-2 virus [ 8 , 9 ].

The system is mainly funded by the Federal Capital Territory Administration (FCTA). The Federal Government of Nigeria through the Nigeria Centre for Disease Control (NCDC) provides trainings, funds the laboratory which is a fundamental aspect of COVID-19 surveillance and provides updated guidelines. Partners such as World Health Organization (WHO), the World Bank through Regional Disease Surveillance Systems Enhancement (REDISSE) and Africa Centre for Disease Control (ACDC) also provide funding and technical support to the system.

The FCT COVID-19 surveillance system, like every other public health system, is prone to challenges and setbacks. Although the system has been established and operational since the inception of the pandemic, there is limited information about its overall performance. Evaluation of the COVID-19 surveillance system will ascertain if the system is meeting the set objectives and whether the attributes are effective enough to accomplish these objectives [ 10 , 11 ]. Furthermore, information from the evaluation is crucial for improving the system as well as enhancing preparedness for possible future outbreaks of COVID-19 and other emerging infectious diseases. Several attributes of a surveillance system impact its capacity to monitor health events efficiently [ 12 ]. They include usefulness, simplicity, flexibility, data quality, acceptability, sensitivity, positive predictive value, representativeness, timeliness, and stability. The evaluation of public health surveillance systems entails an assessment of these system attributes [ 13 – 15 ]. We evaluated the performance of the COVID-19 case-based surveillance system in FCT, Nigeria and assessed its key attributes.

Study setting

The FCT, which is in North-central Nigeria, has a landmass of approximately 7,315 km² and is situated within the savannah region with moderate climatic conditions. The territory is currently made up of six local government areas (LGAs), also called area councils, namely: Abaji, Abuja Municipal, Bwari, Gwagwalada, Kuje and Kwali [ 16 ]. The FCT has a 2016 projected population of 3,564,126 [ 17 ]. There are 19 NCDC-accredited Reverse Transcription Polymerase Chain Reaction (RT-PCR) testing laboratories for COVID-19: three government-owned laboratories (National Reference Laboratory, Defense Reference Laboratory and FCTA PCR Laboratory) and 16 private laboratories that cater for persons of interest (POI). There are four active COVID-19 treatment centres in FCT: Gwagwalada Specialist Hospital, National Hospital Abuja, This Day Dome and Idu treatment center. The active sample collection centres are: Maitama District Hospital, Nyanya General Hospital, This Day Dome, Gwagwalada Specialist Hospital, FCT Public Health Department, Federal Medical Center Jabi and Kubwa General Hospital.

The surveillance system uses a digital surveillance tool called the Surveillance Outbreak Response Management and Analysis System (SORMAS) to report COVID-19 cases. SORMAS is an open-source, real-time mobile and web application used as an electronic health surveillance database. The NCDC adopted SORMAS as its primary digital surveillance platform for implementing the IDSR system and customized it for the surveillance of priority diseases of public health importance in Nigeria; a COVID-19 module was developed and added to SORMAS in January 2020 [ 3 ]. The essential surveillance data for COVID-19 are reported, compiled, and analyzed daily, with zero reporting when there are no cases. Data are usually compiled at the state and national levels. More in-depth analyses of age, gender, testing patterns, comorbidities and risk factors, symptomatology and severity, etc. are also conducted periodically.

Other existing surveillance approaches are used along with the essential elements of comprehensive surveillance for COVID-19. These include participatory surveillance, which enables members of the public to self-report signs or symptoms. It relies on voluntary reporting and is frequently facilitated by dedicated smartphone applications and telephone hotlines, which were made available to the public for advice and referral to health care services. This provides an early indication of disease spread in the community [ 10 ]. The data generated are securely stored electronically and in hard copies at the FCT COVID-19 EOC. Data are not disclosed to any party unless the purpose of utilization is clarified and fully authorized by the FCT ethical committee. The data are also electronically backed up and access is limited to authorized key persons. The FCT COVID-19 surveillance report captures data on suspected COVID-19 cases and deaths, active case search, community reporting via the COVID-19 toll-free lines, case investigations, health facility (deaths and suspected cases), laboratory data and treatment center admissions, outcomes and discharges.

Study design

An evaluation of the COVID-19 surveillance system was conducted using a cross-sectional study design. The surveillance system was evaluated using the “2001 United States Centers for Disease Control’s updated guidelines for Evaluating Public Health Surveillance Systems”.

Study population

A total of 30 stakeholders involved in the COVID-19 surveillance system in FCT were interviewed. These include: the Director Public Health, State Epidemiologist, Assistant State Epidemiologist, Incident Manager, Laboratory pillar lead, State Disease Surveillance and Notification Officer (DSNO), assistant state DSNO, local government area DSNOs, SORMAS state surveillance officer, case managers, Laboratory Scientists, community health workers and Partners (WHO, ACDC).

COVID-19 case definitions

Suspected case.

“A suspect case is defined as any person (including severely ill patients) presenting with fever, cough or difficulty in breathing AND who within 14 days before the onset of illness had any of the following exposures: History of travel to any country with confirmed and ongoing community transmission of SARS-CoV-2 OR Close contact with a confirmed case of COVID-19 OR Exposure to a healthcare facility where COVID-19 case(s) have been reported” [ 18 ].

Confirmed case.

“A person with laboratory confirmation of SARS-CoV-2 infection with or without signs and symptoms” [ 18 ].

Probable case.

“A probable case is defined as a person who meets the criteria for a suspect case AND for whom testing for COVID-19 is inconclusive or for whom testing was positive on a pan-coronavirus assay” [ 18 ].

Contact case.

“someone who had contact (within 1 meter) with a confirmed case during their symptomatic period, including one day before symptom onset” [ 18 ].

Data collection

A mixed data collection approach was adopted, comprising key informant interviews (KII), surveys, record reviews and analysis of FCT COVID-19 surveillance data from March 2020 to January 2021. Data collection was done in February 2021. The KIIs were conducted with four key stakeholders (state epidemiologist, incident manager, state DSNO and state SORMAS surveillance officer), who were selected by purposive sampling. The survey was conducted using a self-administered semi-structured questionnaire, which was administered to 30 stakeholders to obtain their inputs in describing the system and to assess their perception of the attributes of the surveillance system. The questionnaire had two sections: Section A collected sociodemographic information of respondents while Section B covered the surveillance system attributes: simplicity, flexibility, data quality, acceptability, sensitivity, positive predictive value, representativeness, timeliness, and stability. The questionnaire outlined various indicators (range 2–5) for assessment of each system attribute. The questionnaire and key informant interview guide were adapted from the United States Centers for Disease Control and Prevention (CDC) 2001 Updated Guidelines for Evaluation of Public Health Surveillance Systems [ 14 ]. SORMAS was the primary data source for this study. FCT COVID-19 surveillance data comprising epidemiological and laboratory variables on suspected and confirmed cases between March 2020 and January 2021 were extracted from the SORMAS platform into an Excel datasheet. Variables of interest were socio-demographic characteristics, date the case was reported, laboratory result and outcome of illness. The surveillance data were used to assess the usefulness of the system.
Documents relevant to COVID-19 surveillance were reviewed. These include Form A0: case investigation form (CIF), Form A1: case initial reporting form for confirmed cases (Day 1), Form A2: case follow-up reporting form (Day 14–21), Form B1: contact initial reporting form for close contacts (Day 1), Form B2: contact follow-up reporting form for close contacts (Day 14–21), National Technical Guidelines for IDSR 2019 version, National Interim Guidelines for Clinical Management of COVID-19 version 4, and the First Few X (FFX) cases and contact investigation protocol for 2019-novel coronavirus (2019-nCoV) infection.

Data analysis

We analyzed the extracted surveillance data using Epi Info 7.0 and Microsoft Excel 2016. Data output was summarized into descriptive statistics (frequencies and proportions) using charts and tables. Case fatality rate was calculated by dividing the number of deaths from COVID-19 over the time period by the number of individuals that tested positive for COVID-19 during that time; the resulting ratio was then multiplied by 100 to give a percentage. Positivity rate was calculated by dividing the number of people that tested positive by the number of people who were tested. Attack rate was calculated by dividing the total number of cases during the time period by the total population at risk. A content analysis procedure was employed for data collected from the KIIs. We analyzed the data from survey questionnaires and scored the responses for various system attributes. To achieve consistency and comparability of findings, we used the evaluation method and scoring system utilized for influenza surveillance evaluations conducted in other African countries [ 11 , 19 ]. A scale from 1 to 3 was used to provide a score for each indicator as follows: < 60% scored 1 (poor performance); 60–79% scored 2 (moderate performance); ≥80% scored 3 (good performance) [ 19 ]. The scores allotted to each indicator were averaged for all indicators evaluated within each attribute to give an overall score for a particular attribute. The nine evaluated attributes were then averaged to get an overall score for the surveillance system.
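As an illustration only, the rate definitions and the 1-to-3 indicator scale described above could be written as follows. This is a minimal Python sketch, not the authors' analysis (which used Epi Info and Excel). The function names, the `per` multiplier for attack rate (the text does not state one) and the treatment of values between 79% and 80% as moderate are assumptions. The example figures are the 125 deaths, 12,595 confirmed cases and 69,338 suspected cases reported in the Results.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def case_fatality_rate(deaths: int, confirmed: int) -> float:
    """Deaths divided by persons who tested positive, times 100."""
    return percentage(deaths, confirmed)


def positivity_rate(positives: int, tested: int) -> float:
    """Persons who tested positive divided by persons tested, times 100."""
    return percentage(positives, tested)


def attack_rate(cases: int, population_at_risk: int, per: int = 100) -> float:
    """Cases divided by population at risk.

    `per` is a hypothetical scaling parameter; the text gives no multiplier.
    """
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return per * cases / population_at_risk


def indicator_score(percent: float) -> int:
    """Map an indicator percentage to the 1-3 scale described in the text.

    Assumption: anything below 80% but at least 60% counts as moderate.
    """
    if percent >= 80:
        return 3  # good performance
    if percent >= 60:
        return 2  # moderate performance
    return 1      # poor performance


cfr = case_fatality_rate(125, 12_595)
positivity = positivity_rate(12_595, 69_338)
print(f"CFR: {cfr:.1f}%  positivity: {positivity:.0f}%")
```

Run with the figures above, the sketch reproduces the reported CFR of about 1% and positivity of about 18%.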

Ethical consideration

Ethical approval (approval number FHREC/2021/01/18/26-03-21) was obtained from the FCT Health Research Ethics Committee. Written informed consent was also obtained from stakeholders.

Findings from KII with stakeholders

Challenges affecting the surveillance system, as indicated by the key stakeholders, include non-harmonization of treatment protocols amongst the treatment centres, prolonged result turnaround time (TAT), incomplete reporting, inadequate funding, sub-optimal testing in all area councils except Abuja municipal, a low case-contact ratio due to prolonged TAT and an insufficient number of personnel for contact tracing, and inadequate office space for the EOC, which hindered adequate physical distancing. The stakeholders also reported community stigmatization and denial as a major challenge, leading to low testing, poor disclosure, inadequate contact tracing and poor adherence to non-pharmaceutical interventions (NPI) by FCT residents due to low risk perception.

Other reported challenges include inadequate vehicles for response activities, an increasing number of infections among healthcare workers, a cumbersome process of phone number extraction from SORMAS that delays result dissemination (compounded by the limited number of personnel to carry out this task), low stock or stock-out of some response commodities, discordant results from private laboratories and an inadequate holding area for suspected cases. Stakeholders stated that the SORMAS platform is extremely slow and so hinders real-time data capture in the field. As a result, a huge disparity was always seen between data captured on SORMAS and the actual surveillance data. Tablets and SIM cards are available for SORMAS operation but there is no designated fund to cater for data and airtime; as a result, some personnel often pay out of pocket to keep the platform updated.

FCT COVID-19 surveillance system attributes

Usefulness.

COVID-19 surveillance as part of IDSR was established to provide prompt detection of and response to the COVID-19 pandemic. A total of 69,338 suspected cases of COVID-19 were reported in FCT between March 2020 and January 2021, and 12,595 (18%) samples tested positive with RT-PCR (Table 1).

Table 1. https://doi.org/10.1371/journal.pone.0264839.t001

A total of 125 deaths were recorded among confirmed cases, giving an overall CFR of 1%. The system showed the distribution of these cases across the six local government areas of the state. Abuja municipal recorded the highest attack rate (56.0), followed by Gwagwalada (15.1), with an overall attack rate of 35.1. Out of 2,384 suspected cases among healthcare workers (HCWs), 604 tested positive for COVID-19, which translates into a prevalence of 25.3% among HCWs. Prevalence was 18.5% among traders, 16.2% among students, and 14.1% among civil servants. The proportion of HCW infection was highest among nurses (28.8%) and doctors (25.8%). The distribution of confirmed cases showed a propagated pattern, as seen in Fig 2, with peaks in weeks 51 and 52; a sharp decline in the number of cases was observed from week 1 in 2021.

Fig 2. https://doi.org/10.1371/journal.pone.0264839.g002

Case fatalities were recorded across the entire period, with the highest number of deaths seen in weeks 28 and 53. About 74% of COVID-19 cases admitted at the treatment centers made a full recovery, 1% died and the outcome was unknown for 25%. The system successfully detected the second wave of the outbreak, which started from week 48 and accounted for more than half of the COVID-19 cases recorded in the entire period (Fig 3). System usefulness had a total score of 98%.

Fig 3. https://doi.org/10.1371/journal.pone.0264839.g003

Simplicity.

The FCT COVID-19 Surveillance System has a simple structure in which data flow uninterrupted from communities to the national level, as indicated in Fig 1. All respondents reported that data tools were easy to use and the COVID-19 case definitions were easy to understand. About 95% reported that COVID-19 posters and other Information, Education and Communication (IEC) materials were displayed in their facilities. Of those that reported availability of COVID-19 IEC materials in their facilities, 95% stated the IEC materials contained COVID-19 case definitions while 53% said they contained COVID-19 case management guidelines. About 76% claimed they report all suspected cases of COVID-19; 48% report suspected cases to the next level through phone calls, 19% report on CIF and 5% report through SORMAS. Eighty-one percent reported that the system constantly detects an increase in COVID-19 cases. Twenty-four percent of respondents reported detected cases to other organizations, and 14% of the respondents who send case reports to other organizations use forms other than the COVID-19 surveillance forms to send the reports. The estimated time spent on collecting, entering, editing, analyzing, storing, backing up and transferring data ranged from 25 minutes to 72 hours among respondents. About 71% of respondents reported ease of working within the system in terms of operation: workload, workflow, the flow of information and inter-unit relationship. About 81% of respondents reported having an adequate number of staff for data collection, and those who reported an inadequate number of staff for data collection needed an average of 13 staff for optimal task performance. The overall score for system simplicity was 81.5%.

Flexibility.

Ninety-one percent of respondents reported that changes in the surveillance system were easily accommodated by the data collection tools and 81% stated that new data elements were easily reported as part of the monthly report. The overall score for system flexibility was 85.5%.

Acceptability.

All respondents expressed willingness to continue their participation in the surveillance system; however, 71% had challenges affecting work efficiency. About 86% had made suggestions or comments about improving the system. The suggestions made include: engagement of more support staff 28%, financial support for staff 28%, improvement of result turnaround time (TAT) 22%, transport logistic support 22%, improvement of electronic database 17%, additional training 17%, automation of result dissemination 17%, improvement of surveillance funding 17%, provision of adequate work tools 17%, more spacious work space 17%, timely supply of essential materials 11%, simplification of CIFs for suspected cases 6%, collaboration with other agencies 6% and proper supervision at sample collection sites 6%. About 57% of those that made suggestions reported that their suggestions were considered. About 90% felt the system appreciates them for doing their job. All respondents needed additional support to carry out their tasks effectively. The overall system acceptability score was 71.4%.

Timeliness.

About 90% of respondents reported availability of written policies on the timeliness of data reporting, while 67% reported having challenges with sending reports on a timely basis. The challenges reported include delays in receiving results from the laboratories, problems with data harmonization, lack of funds for transmitting reports, an overwhelming workload, inadequate manpower, delays in reporting from lower levels, lack of or poor internet access and poor transport logistics. About 71% incurred additional costs for reporting, ranging from 3,000 to 100,000 naira monthly. Respondents reported that TAT ranged from 24 hours to two weeks, the optimal TAT being 48 hours. TAT was suboptimal, as only 29% reported a TAT of 24–48 hours. About 76% of respondents reported data on a daily basis, 19% reported weekly and 5% reported monthly. The timeliness and completeness of reporting were 63% and 70% respectively, below the respective targets of 80% and 100% (Fig 4). The overall score for timeliness was 45.5%.

Fig 4. https://doi.org/10.1371/journal.pone.0264839.g004

Sensitivity.

About 97% of respondents were satisfied with the COVID-19 case definitions, 85% felt the system could detect all cases of COVID-19, 10% reported occurrence of false-negative results and 5% reported delay in the release of patients’ results. About 95% of respondents said the system can correctly detect new cases of COVID-19. Nineteen percent reported frequent cases of misdiagnosis. The problems listed include confusion of COVID-19 signs and symptoms with those of other infectious diseases, positive cases that turn out negative after a retest within 24 hours and lack of community testing. Respondents made the following suggestions for improvement of case detection: siting of sample collection centres in all LGAs, improvement of TAT, raising the index of suspicion among healthcare workers, increased awareness in local languages, establishment of new testing sites, improvement of the sensitivity of the case definition, prompt release of results by testing laboratories, more training and improvement of the sample collection method. A review of FCT COVID-19 surveillance data from March 2020 to January 2021 showed the system was able to detect 12,595 confirmed cases out of a total of 69,338 suspected cases within the period under review. The overall score for system sensitivity was 90.6%.

Positive predictive value.

The FCT COVID-19 surveillance system detected a total of 69,338 suspected cases from March 2020 to January 2021. Samples were collected from these cases and tested at the laboratory. Out of these, 12,595 tested positive by RT-PCR.

Therefore, the positive predictive value for the FCT COVID-19 surveillance system = (12,595/69,338) × 100 = 18%.
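The arithmetic above can be checked with a short Python sketch. It treats every suspected case as tested, as the text states, so the PPV here is numerically the same quantity as the positivity rate. The variable names are illustrative and not taken from the study.

```python
# Figures reported in the text for March 2020 to January 2021.
suspected_cases = 69_338   # cases flagged by the surveillance system and tested
confirmed_cases = 12_595   # RT-PCR positive among those suspected

# PPV: share of system-flagged (suspected) cases that were laboratory confirmed.
ppv_percent = 100 * confirmed_cases / suspected_cases

# Suspected cases that were not confirmed by RT-PCR.
not_confirmed = suspected_cases - confirmed_cases

print(f"PPV = {ppv_percent:.1f}% ({not_confirmed:,} suspected cases not confirmed)")
```

The unrounded value is about 18.2%, which the text reports as 18%.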

Data quality.

About 76% of respondents described the level of completeness of data generated from the COVID-19 surveillance system as partially complete, while 57% reported that the data generated were partially valid. About 71% reported having been supervised on data management, with an average of three supervisions in the last six months; more than 62% received feedback upon completion of supervision. About 5% of respondents rated the care exercised in completing surveillance forms and data management as excellent, 19% rated it very good, 52% good, 19% fair and 5% poor. About 49% of respondents stated that feedback from SORMAS was inadequate. The challenges encountered include poor TAT, incomplete CIFs, most results not being updated on SORMAS and problems with using SORMAS on a real-time basis for data capture. The overall score for the data quality of the system was 56.2%.

Representativeness.

A review of the state SORMAS COVID-19 line-lists from March 2020 to January 2021 showed the FCT COVID-19 surveillance system captures people of all ages (Fig 5).

Fig 5. https://doi.org/10.1371/journal.pone.0264839.g005

Suspected and confirmed cases were distributed across the six LGAs in FCT during the period under review. All respondents said the system captures people of all ages and 86% reported the system captures people from all geographical locations of FCT. Age groups 30–34, 35–39 and 25–29 years were most affected, constituting about 14%, 13.9% and 13.8% of the confirmed cases respectively (Table 2). Males accounted for a higher proportion of cases (56%). The representativeness score was 93%. CFR was highest among age groups 70–79 and 80–89 years (Table 2).

Table 2. https://doi.org/10.1371/journal.pone.0264839.t002

Stability.

About 95% of respondents received feedback from the next level. About 67% used the data collected to make informed decisions, to plan the response and for analysis for transmission to policymakers in the state, while 33% reported they simply transmitted the data to the next level. About 67% claimed the system had been interrupted in the past due to inadequate funding, 52% claimed it was interrupted due to inadequate staff, 33% reported interruption due to stock-out of consumables, 5% due to HCW infection and 5% due to strike action. Only 48% had stipends to carry out their tasks. About 76% needed more resources to carry out their duties effectively: funding (38%), human resources (34%) and material resources (28%). The material resources needed include internet services, adequate PPE, data tools and transport logistics. The overall system stability score was 40%.

Having assessed all key system attributes, system usefulness scored the highest followed by flexibility while stability had the lowest score (Fig 6). The system had an average overall score of 73.5%.
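The overall figure can be reproduced, as a rough check, by averaging the attribute scores reported in the sections above. The Python sketch below assumes that the nine averaged attributes are those given an overall percentage score in the text and that the positive predictive value (18%) is left out. That selection is an inference from the arithmetic and is not stated by the authors.

```python
from statistics import mean

# Overall attribute scores (%) as reported in the text.
# Assumption: PPV (18%) is not one of the nine averaged attributes.
attribute_scores = {
    "usefulness": 98.0,
    "simplicity": 81.5,
    "flexibility": 85.5,
    "acceptability": 71.4,
    "timeliness": 45.5,
    "sensitivity": 90.6,
    "data quality": 56.2,
    "representativeness": 93.0,
    "stability": 40.0,
}

# Simple unweighted mean across the nine attributes.
overall_score = mean(list(attribute_scores.values()))

highest = max(attribute_scores, key=attribute_scores.get)
lowest = min(attribute_scores, key=attribute_scores.get)

print(f"overall = {overall_score:.1f}% (highest: {highest}, lowest: {lowest})")
```

Under this assumption the unweighted mean comes to 73.5%, matching the reported overall score, with usefulness highest and stability lowest.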

Fig 6. https://doi.org/10.1371/journal.pone.0264839.g006

The main goal of surveillance during outbreak management is to detect cases early in order to mount effective public health action to reduce transmission. The FCT COVID-19 surveillance system exhibits some of the attributes of a good surveillance system; however, poor stability, data quality and timeliness were limitations. Our findings showed that the surveillance system had an overall moderate performance, which implies that the system was not performing optimally although it meets the initial objectives of the surveillance. This corresponds with the findings of similar studies conducted on the Malaria Surveillance System in Kano State [ 10 ] and the Avian Influenza Surveillance System in Enugu State, Nigeria [ 21 ], where poor data quality, instability and non-representativeness of surveillance reports were the major limitations. We found that the FCT COVID-19 surveillance system was effective in detecting cases in the communities. This was possible through periodic community testing and targeting of suspected risk groups and persons of interest (POI), especially people coming into the state from countries with a high burden. HCWs were identified as a high-risk group due to the high positivity rate recorded among them. The highest proportion of HCW infections occurred among nurses and doctors, which is worrisome considering that they usually have the first contact with patients. Esohe et al. made a similar observation in a study of facility-based surveillance activities for COVID-19 infection and outcomes among healthcare workers in a Nigerian tertiary hospital [ 22 ]. Targeting risk groups for testing can improve the effectiveness of COVID-19 surveillance in settings where mass testing is not feasible. The FCT COVID-19 surveillance system was found to be useful because it detects the disease in a way that permits early treatment, prevention and control.
The system also provides estimates of COVID-19 related morbidity and mortality, including the identification of factors associated with the disease and detects trends that signal changes in the incidence of the disease. The system provided measurement of deaths attributed to COVID-19, which is a key indicator of the overall impact of the pandemic. Data generated from the system permits assessment of the effect of prevention and control programs and leads to improved clinical, behavioral, social, policy and environmental practices.

The system displayed good simplicity and ease of use, as data flow across the system occurs seamlessly and without interruption and the data tools were deemed easy to use. The simplicity of this system could be attributed to the observed ease of application of the case definitions, which makes it easy to ascertain a case of COVID-19. However, our findings show that one of the major challenges facing the fight against COVID-19 is the similarity of its symptoms to those of many other common diseases. Similarly, in an evaluation of health surveillance systems in South Africa, about three-quarters of healthcare providers acknowledged that the operations of the existing surveillance system were simple and understandable [ 23 ]. The COVID-19 surveillance system was considered flexible since it was well integrated into IDSR, with paper-based forms and electronic forms (SORMAS) used for real-time data collection and reporting. The system was able to adapt to changes according to needs and operational demands with minimal additional costs; this finding was in line with that of a similar study in Pakistan by Atifa et al. [ 6 ]. Although simplicity and flexibility were found to be major strengths of the system, the system requires a steady and considerable amount of funding for smooth, uninterrupted operation.

The system had good acceptability at the state and LGA levels, although more than half of the respondents had challenges affecting work efficiency and all respondents needed extra support for effective operation. The state government and partners need to make an effort to address these challenges and provide the needed support to stakeholders in the system. Despite the reported challenges, all respondents were willing to continue participating in the system. About two-thirds of respondents perceived their contributions and suggestions within the system as being valued. In contrast, elsewhere, more than half of the health workers were dissatisfied with their involvement in surveillance activities [ 11 ]. Similarly, less than half of the respondents in a study in South Africa were willing to be involved in activities within the surveillance system [ 23 ]. Additional findings showed that the majority of respondents recognized COVID-19 to be of high priority compared to other diseases; thus they were willing to engage in surveillance activities. Increased health worker participation in routine surveillance reporting is dependent on the perceived simplicity of the system [ 24 ]. Therefore, the high acceptability of the system can be attributed to simplified data tools, case definitions and training materials, which ease understanding.

The overall timeliness of the system was poor. Although there were written policies on the timeliness of data reporting, challenges were encountered that affected timely reporting, and prolonged TAT from the testing laboratory was a major setback, as this delayed effective public health action and timely containment. This finding aligns with a similar study that evaluated the timeliness and completeness of laboratory-based surveillance of COVID-19 cases in England and reported delays in timeliness, with the delays occurring mostly in the first stage of the reporting process, before laboratory information is keyed onto the surveillance platform [ 25 ]. Data-driven insights to guide public health decision-making for the pandemic response rely heavily on complete and timely data on laboratory-confirmed cases. The SORMAS platform, which is the primary data source for stakeholders, relies on data reported on case investigation forms (CIFs) and by laboratories in order to swiftly inform the epidemiology of the disease. The incomplete collection and reporting of key variables such as symptoms, date of onset, hospitalization and travel history on CIFs, as well as prolonged TAT from the laboratory, prevent the identification of detailed risk factors for transmission and severity of infection. The use of SORMAS as an electronic data collection tool would have improved the timeliness of reporting, but SORMAS was not used on a real-time basis as intended because of technical and connectivity difficulties, so this aim was almost defeated. There is a need to improve the electronic system and integrate it fully with the surveillance system, including testing laboratories, so as to facilitate timely and real-time data collection and reporting.
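The timeliness attribute discussed above can be quantified as laboratory turnaround time (TAT). The following is a minimal sketch only: the record fields (`collected`, `reported`), the example timestamps and the 48-hour target are hypothetical assumptions for illustration and are not taken from the evaluation.

```python
from datetime import datetime
from statistics import median

# Hypothetical records; field names and timestamps are illustrative only.
records = [
    {"collected": "2020-12-01 09:00", "reported": "2020-12-02 15:00"},
    {"collected": "2020-12-01 10:30", "reported": "2020-12-05 10:30"},
    {"collected": "2020-12-02 08:00", "reported": "2020-12-03 08:00"},
]


def turnaround_hours(record):
    """Hours between sample collection and result reporting."""
    fmt = "%Y-%m-%d %H:%M"
    start = datetime.strptime(record["collected"], fmt)
    end = datetime.strptime(record["reported"], fmt)
    return (end - start).total_seconds() / 3600


def timeliness_summary(records, target_hours=48):
    """Median TAT and share of results reported within the assumed target."""
    tats = [turnaround_hours(r) for r in records]
    on_time = sum(t <= target_hours for t in tats)
    return {
        "median_tat_hours": median(tats),
        "proportion_on_time": on_time / len(tats),
    }


summary = timeliness_summary(records)
print(summary)
```

A summary of this kind could be computed per reporting stage (collection to laboratory, laboratory to platform) to locate where delays accumulate.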

The system had good sensitivity, as it was able to detect a good number of COVID-19 cases, both symptomatic and asymptomatic. The system was sensitive enough to detect a second wave of the pandemic in FCT, which accounted for more than half of all cases recorded within the time period. However, misdiagnosis due to the inability to differentiate COVID-19 signs and symptoms from those of other diseases was a major challenge; this could be attributed to the novelty of the disease and to the fact that many diseases share the same combination of symptoms. The PPV of the system was low because COVID-19 symptoms are non-specific and the system was operated so as not to miss cases; as a result, cases that did not meet the case definition were also investigated and tested. The system’s data quality was poor, as the majority of data generated by the system were only partially complete due to the absence of some key variables. This finding is in line with the report of an assessment of the national framework of COVID-19 surveillance in the United States by Ulrich et al. [ 9 ], where incomplete and inconsistent data collection was recorded and the data were reported in a non-standardized way. However, it is in contrast to the findings by Atifa et al. from an evaluation of the COVID-19 Laboratory-Based Surveillance System in Islamabad, Pakistan, where completeness of data and consistency of reporting were found to be good [ 6 ]. Surveillance systems must capture adequate and detailed information about cases to facilitate rapid epidemiological analysis of the disease and effectively inform prompt public health actions. Coordination of the pandemic response in FCT needs to depart from paper-based reporting systems and move towards fully integrated electronic systems to further improve data quality. The goal of electronic reporting of IDSR data is to strengthen the disease surveillance system for prompt detection of public health events and real-time reporting, thereby enabling prompt response to outbreaks and public health actions [ 26 , 22 ].

The system had good representativeness, as cases were detected across all ages and geographical areas of FCT and were reported from both public and private healthcare facilities. Representativeness is important for planning and executing targeted interventions and monitoring progress towards containment. The stability of the system was considered to be fair, as there was no steady, sustainable funding from either government or partners, and activities of the system had been interrupted in the past as a result. Nigeria’s health system is irrefutably fragile as a result of incessant outbreaks that have weakened healthcare frameworks. The COVID-19 pandemic has worsened the pre-existing situation. As expected, the pandemic had overwhelming public health and socioeconomic impacts globally, resulting in a decrease in the epidemiological control of several infectious diseases [ 27 ].
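Two of the attributes discussed above, PPV and data completeness, reduce to simple proportions. The sketch below is illustrative only: the counts, the record layout and the variable names are hypothetical assumptions, and the list of key variables simply mirrors those named in the text (symptoms, date of onset, hospitalization, travel history).

```python
# Key variables named in the text; the identifiers themselves are assumed.
KEY_VARIABLES = ("symptoms", "date_of_onset", "hospitalization", "travel_history")


def positive_predictive_value(confirmed_cases, suspected_cases_tested):
    """Share of investigated (tested) suspected cases that were confirmed."""
    if suspected_cases_tested <= 0:
        raise ValueError("at least one suspected case must have been tested")
    return confirmed_cases / suspected_cases_tested


def completeness(records, key_variables=KEY_VARIABLES):
    """Share of records in which every key variable is filled in."""
    if not records:
        raise ValueError("no records supplied")
    complete = sum(
        all(r.get(k) not in (None, "") for k in key_variables) for r in records
    )
    return complete / len(records)


# Hypothetical case investigation records (not study data).
example_records = [
    {"symptoms": "fever", "date_of_onset": "2020-12-01",
     "hospitalization": False, "travel_history": "none"},
    {"symptoms": "cough", "date_of_onset": "",
     "hospitalization": False, "travel_history": "none"},
    {"symptoms": "fever", "date_of_onset": "2020-12-03",
     "hospitalization": True},
]

ppv = positive_predictive_value(confirmed_cases=120, suspected_cases_tested=1000)
share_complete = completeness(example_records)
print(ppv, share_complete)
```

Testing people who do not meet the case definition enlarges the denominator of the PPV, which is why a deliberately inclusive testing policy lowers it.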

It is also important to mention that the COVID-19 pandemic has uncovered Africa’s inefficient and ineffective health surveillance systems. Although Africa has recorded several outbreaks of emerging and re-emerging infectious diseases such as Ebola virus disease and other epidemic-prone diseases, less attention has been given to strengthening surveillance systems [ 28 ]. Indeed, the impact of the COVID-19 pandemic on health systems in the region has been catastrophic and has stressed the importance of rethinking, reflecting and focusing on lessons learned during the pandemic. Africa, like every other continent, has been affected, and the underlying shortfalls in its health system have aggravated the situation [ 29 ]. As COVID-19 spread widely, healthcare systems in Africa became overwhelmed. A short supply of skilled health workers was recorded, falling 60% below the United Nations’ minimum limit, while sub-Saharan Africa has only 1%–5% of the intensive care unit beds per capita, compared to European and East Asian countries [ 30 ]. In Nigeria, for example, despite the decrease in patient visits during lockdown, the increased healthcare needs of critically ill patients in inadequately equipped and understaffed intensive care units resulted in substantially less time dedicated to non-COVID-19 patient care. Nigeria’s current health systems cannot efficiently cater for the increasing needs of already infected patients who require intensive care for acute respiratory diseases [ 31 ].

Infectious disease surveillance involves continuous vigilance for health occurrences and related events to ensure a quick response [ 22 ]. However, because of low engagement and action from different stakeholders in strengthening surveillance performance, COVID-19 cases may have been under-reported in Africa [ 31 ]. Although recorded cases and mortality may seem low, it has been projected that Africa will likely experience some of the worst effects of the pandemic. It is thus important to ramp up laboratory and diagnostic capacity in an effective and continuous structure across African countries for prompt detection and accurate prediction of COVID-19 and other infectious diseases.

The inefficient surveillance strategies in Africa have led to suboptimal reporting and monitoring of diseases of public health importance, which gives a false sense of decrease in incidence rates of prevalent diseases, ultimately affecting policymaking and eradication strategies [ 28 ]. Low information-based activity from healthcare centers has led to low quality delivery of healthcare services in Nigeria. It has also affected morbidity and mortality rates, along with relevant general data on the leading infections in Nigeria [ 28 ]. The success of any health initiative is jeopardized because of low disease surveillance in Africa. This has led to disorganized healthcare sectors with poor infrastructure, low knowledge trends and low-quality healthcare delivery, which indicate low sustainability. Moreover, it has also affected the accurate assessment of the health systems as well as health promotion programs, leading to unproductive resource allocation by program investors.

It is essential to mention that during the pandemic, several African countries documented a rise in infectious diseases. The COVID-19 pandemic exacerbated the spread of many infectious diseases, such as dengue fever, yellow fever, measles, Lassa fever and malaria, as seen in many African countries [ 32 ]. The primary prevention strategies against these diseases are vaccination and entomological control of vectors; implementation of such strategies on the continent is far below what is needed to control the diseases. The restrictions encountered due to the COVID-19 pandemic led to the interruption of prevention and control programs [ 33 ]. This underscores the burden and challenges of other highly infectious diseases amid the COVID-19 pandemic; these diseases have affected African countries, resulting in over-straining of the already dilapidated healthcare system. For instance, there were several infectious disease outbreaks or re-emergences amid the COVID-19 pandemic in Nigeria. The coexistence of other outbreaks during the COVID-19 pandemic increased the burden on the country’s health system, mainly because the necessary response programs for these re-emerged infectious diseases were redirected to the COVID-19 national response. In particular, during the COVID-19 pandemic, there were yellow fever, cholera and Lassa fever outbreaks in many parts of Nigeria, owing to the country’s inadequate health system, which hampered the development of proper disease responses. As in other countries around the world, the burden added by COVID-19 to the public health system has led to a reduction in the epidemiological control of other infectious diseases [ 29 ]. This failure allowed the outbreaks to increase in frequency and severity, with sizeable mortality rates, including among healthcare workers [ 34 ].

Africa’s healthcare workforce and testing capacities are inadequate to integrate surveillance of COVID-19 and other viral infections; this has led to an escalating chain reaction of crises in the healthcare system [ 28 , 35 ]. Individuals with chronic liver disease have been added to those at increased risk of critical COVID-19; however, the presence of viral hepatitis, malaria, tuberculosis or HIV/AIDS does not directly increase vulnerability to SARS-CoV-2. The high occurrence of poor medical diagnosis among individuals living with viral hepatitis in sub-Saharan Africa could be linked to the lack of SARS-CoV-2 infection restriction guidelines [ 36 ]. Late presentation, misdiagnosis or under-diagnosis of these tropical infectious diseases in Nigeria can be directly linked to the disruption of the healthcare delivery system during COVID-19, as laboratory equipment, infrastructure and manpower have been channeled solely towards COVID-19 [ 36 , 37 ].

Evaluation of the COVID-19 surveillance system in FCT shows the state’s adequate capacity to detect, respond to and contain the disease. Morbidity and mortality data, which are the main measures of COVID-19 burden, highlight the usefulness of the response activities as well as the preventive measures taken. It was crucial to identify the weaknesses and strengths of the FCT’s COVID-19 surveillance system in this short period in order to provide government, stakeholders and partners with robust evidence for better policymaking, strategic planning, improvement actions and sustainability of the system for monitoring pandemics of this kind.

Limitations

The mixed-method approach employed in this study elicited in-depth perceptions of the surveillance system attributes from respondents. However, a limitation of this study was the likelihood of bias in responses due to social desirability, since the study was largely dependent on respondents’ self-reporting. Moreover, the perspectives of the respondents on the surveillance system attributes were restricted to certain concepts as provided in the adapted semi-structured questionnaire. Nevertheless, a comment section was provided to enable respondents to elaborate further on the attributes. Secondly, the surveillance data retrieved and analyzed in our study could be limited due to incomplete data entry and missing variables in the electronic database. The extent of this is likely minimal, since some of the missing data were keyed in manually on the Excel sheet after retrieval from SORMAS. Nonetheless, the data in this study permit an early assessment of the epidemiological and clinical characteristics of COVID-19 in FCT, thereby describing the usefulness of the surveillance system. The substantial proportion of missing data has prompted a systemic effort to improve the data collection process and introduce an electronic CIF. Lastly, we were unable to conduct key informant interviews (KIIs) with case managers due to restricted access at treatment centres; one-on-one interviews with these key stakeholders would have provided richer information about the system as regards management of cases. This was mitigated by using electronic surveys as an alternative method to obtain the necessary data.

This study has provided an early insight into the performance of the COVID-19 surveillance system in FCT, Nigeria, highlighting information necessary for health system strengthening and public health planning. Despite its many strengths, some significant weaknesses and gaps were identified in the FCT COVID-19 surveillance system during this evaluation. These weaknesses need to be addressed with a sense of urgency in order to optimize system performance. The system has played a critical role in the containment of the pandemic, demonstrated by the rapid reduction in the number of confirmed cases, a well-coordinated incident management system, good case management and IPC measures. Data generated from the FCT COVID-19 surveillance system were promptly analyzed and were very useful in providing information on the trend, which subsequently helped in decision making, monitoring the outbreak and planning and improving response activities.

The system was found to be simple, flexible, sensitive, acceptable to stakeholders, with good representativeness. However, the stability, data quality and timeliness should be improved upon. Additionally, with the high operating costs of the surveillance system and a history of dependence on external financial support, the long-term financial sustainability of the system remains uncertain. Overall, the performance of FCT COVID-19 surveillance system was rated moderate as it was observed to be addressing the public health problem for which it was instituted and also meeting its objectives to some reasonable extent although there were major issues affecting the efficiency of the system.

Recommendations

  • The FCT Public Health Department should ensure that COVID-19 surveillance activities are fully funded, as funding from donors might not be sustainable or sufficient to cover all surveillance activities; funding was found to be a major determinant of the stability of the system.
  • Partners (WHO, AFENET, ACDC and REDISEE) should consider increasing funding and technical support made available for FCT COVID-19 surveillance at all levels.
  • The Director of Public Health should ensure that all personnel involved in surveillance activities at the state, LGA and health facility levels are trained and retrained periodically; this will improve the system’s data quality and consistency in reporting.
  • The State Department of Health should provide additional logistic resources to DSNOs in rural areas; vehicles should also be made available for surveillance activities at all times.
  • The FCT Ministry of Health and FCT PHD should strengthen, expand and fully decentralize COVID-19 surveillance activities to all area councils and activate sample collection centres in selected health facilities across the area councils.
  • The FCT PHD should constantly monitor infection prevention and control (IPC) compliance in health facilities to reduce HCW infection.
  • MLSCN and NCDC should conduct external quality assessment of private COVID-19 testing laboratories to ensure accurate results are consistently reported.
  • Efforts to effectively improve system-wide timeliness should be directed to strengthening the first reporting stage from the testing laboratory.
  • NCDC should make a deliberate effort to resolve the observed technical and connectivity issues associated with SORMAS so as to facilitate easy operation of the platform on a real-time basis. This will resolve the large backlog of CIFs not captured on SORMAS. COVID-19 surveillance can then consider phasing out paper documentation, thereby achieving an overall improvement in the system’s timeliness and data quality.
  • The FCT PHD should improve on supportive supervision to LGAs and facilities on proper data collection, timely reporting and stock consumption reports.
  • The state ministry of health should establish more state-owned COVID-19 testing laboratories to support testing in FCT and improve result TAT.

Supporting information

https://doi.org/10.1371/journal.pone.0264839.s001

Acknowledgments

We wish to acknowledge the Director of FCT Public Health Department, Dr. Josephine Okechukwu and the FCT State DSNO Mrs. Fatima Ahmed, for their assistance, guidance and encouragement during the conceptualization of this study and facilitation of surveillance data acquisition process. We also acknowledge the FCT PHD and EOC for their immense support throughout the period of this study.

  • 1. WHO. WHO Coronavirus (COVID-19) Dashboard With Vaccination Data. Available from: https://covid19.who.int/
  • 4. NCDC. COVID-19 Situation Report. Available from: https://ncdc.gov.ng/diseases/sitreps/An update of COVID-19 outbreak in Nigeria_220121_4
  • 14. U. S. Centers for Disease Control. (PDF) Updated Guidelines for Evaluating Public Health Surveillance Systems Recommendations from the Guidelines Working Group. Available from: https://www.researchgate.net/publication/51408718_Updated
  • 15. European Centre for Disease Prevention and Control. Data quality monitoring and surveillance system evaluation—A handbook of methods and applications. 2014. Available from: https://www.ecdc.europa.eu/en/publications-data/
  • 17. National Population Commission. Demographic Statistics. 2017;(May):219–28. Available from: https://nigeriastat.gov.ng
  • 20. Nigeria: Administrative Division (States and Local Government Areas)—Population Statistics, Charts and Map. Available from: https://www.citypopulation.de/php/nigeria-admin.php

Surveillance: The Role of Observation in Epidemiological Studies

  • First Online: 13 December 2023

  • Adetoun F. Asala 2  

Public health surveillance is the cornerstone of public health practice and is critical for improving population health. Public health surveillance is the ongoing, systematic collection, analysis, and interpretation of relevant health data, which play a key role in planning, implementing, and evaluating public health policies and practices as well as in disseminating information needed for disease prevention and control. Surveillance data provide useful information to assess the burden of health outcomes, prioritize public health actions, prevent the spread of disease, and inform decision making. This chapter will explain the benefits of public health surveillance. It will detail: (1) the types and process of public health surveillance, (2) surveillance data sources and approaches to public health surveillance, (3) the significance and relevance of surveillance in public health practice, and (4) the elements of a public health surveillance system. The chapter will also compare public health surveillance to epidemiologic research.

Author information

Authors and affiliations.

Mississippi State Department of Health, Office of Preventive Health, Ridgeland, MS, USA

Adetoun F. Asala

Corresponding author

Correspondence to Adetoun F. Asala .

Editor information

Editors and affiliations.

Department of Epidemiology and Biostatistics, Jackson State University, Jackson, MS, USA

Amal K. Mitra

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Asala, A.F. (2024). Surveillance: The Role of Observation in Epidemiological Studies. In: Mitra, A.K. (eds) Statistical Approaches for Epidemiology. Springer, Cham. https://doi.org/10.1007/978-3-031-41784-9_8

DOI : https://doi.org/10.1007/978-3-031-41784-9_8

Published : 13 December 2023

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-41783-2

Online ISBN : 978-3-031-41784-9

eBook Packages : Medicine Medicine (R0)

  • Methodology
  • Open access
  • Published: 22 September 2021

Use of artificial intelligence for public health surveillance: a case study to develop a machine Learning-algorithm to estimate the incidence of diabetes mellitus in France

  • Romana Haneef   ORCID: orcid.org/0000-0001-7741-0268 1 ,
  • Sofiane Kab 2   na1 ,
  • Rok Hrzic 3   na1 ,
  • Sonsoles Fuentes 1 ,
  • Sandrine Fosse-Edorh 1 ,
  • Emmanuel Cosson 4 , 5 &
  • Anne Gallay 1  

Archives of Public Health volume 79, Article number: 168 (2021)

The use of machine learning techniques is increasing in healthcare, allowing health outcomes to be estimated and predicted from large administrative data sets more efficiently. The main objective of this study was to develop a generic machine learning (ML) algorithm to estimate the incidence of diabetes based on the number of reimbursements over the last 2 years.

We selected a final data set from a population-based epidemiological cohort (i.e., CONSTANCES) linked with the French National Health Database (i.e., SNDS). To develop this algorithm, we adopted a supervised ML approach. The following steps were performed: i. selection of the final data set, ii. target definition, iii. coding of variables for a given window of time, iv. splitting of the final data into training and test data sets, v. variable selection, vi. model training, vii. validation of the model with the test data set and viii. selection of the model. We used the area under the receiver operating characteristic curve (AUC) to select the best algorithm.
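The AUC used for model selection can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition in plain Python; it is an illustration under that standard definition, not the authors' R code, and the labels and scores are made up.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = incident diabetes) and model scores.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(example_labels, example_scores)
print(example_auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; a rank-based computation would be preferable for a cohort-sized test set.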

The final data set used to develop the algorithm included 44,659 participants from CONSTANCES. Of the 3468 variables coded from SNDS data linked to the CONSTANCES cohort, 23 were selected to train different algorithms. The final algorithm to estimate the incidence of diabetes was a Linear Discriminant Analysis model based on the number of reimbursements of selected variables related to biological tests, drugs, medical acts and hospitalization without a procedure over the last 2 years. This algorithm has a sensitivity of 62%, a specificity of 67% and an accuracy of 67% [95% CI: 0.66–0.68].
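Sensitivity, specificity and accuracy all derive from the four cells of a confusion matrix, and a confidence interval for accuracy follows from the size of the test set. The sketch below uses hypothetical counts chosen only so that sensitivity and specificity resemble the reported values; they are not the study's confusion matrix, and the normal-approximation (Wald) interval is an assumption, since the source does not state which interval method was used.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp, z=1.96):
    """Sensitivity, specificity, accuracy and a Wald interval for accuracy."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # true cases correctly flagged
    specificity = tn / (tn + fp)   # non-cases correctly cleared
    accuracy = (tp + tn) / total
    half_width = z * sqrt(accuracy * (1 - accuracy) / total)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "accuracy_ci": (accuracy - half_width, accuracy + half_width),
    }


# Hypothetical counts (not study data).
metrics = classification_metrics(tp=62, fn=38, tn=670, fp=330)
print(metrics)
```

Because incident cases are rare relative to non-cases, accuracy is dominated by specificity, which is consistent with the two reported values being nearly equal.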

Conclusions

Supervised ML is an innovative tool for the development of new methods to exploit large health administrative databases. In the context of the InfAct project, we have developed and applied, for the first time, a generic ML-algorithm to estimate the incidence of diabetes for public health surveillance. The ML-algorithm we have developed has moderate performance. The next step is to apply this algorithm to SNDS to estimate the incidence of type 2 diabetes cases. More research is needed to apply various ML techniques to estimate the incidence of various health conditions.

The availability of administrative data generated from different sources is increasing, and the possibility of linking these data sources with other databases offers a unique opportunity to answer research questions that require a large sample size or detailed data on hard-to-reach populations [ 1 ]. The French National Health Data System (i.e., SNDS [Système National des Données de Santé]) is an example of a large, linked administrative data set, which is used for public health surveillance in France [ 2 ]. It includes the most up-to-date individual-level health information on health insurance claims, hospital discharges and mortality for the whole French population (i.e., 66 million people) [ 2 ]. However, the estimation of health indicators from linked administrative data is challenging for several reasons, such as variability in data sources and data collection methods, the large number of variables available, and a lack of skills and capacity to analyze big data [ 3 ]. More efficient ways of analyzing health information using big data across European countries are required. In that context, the use of artificial intelligence (AI) is increasing in healthcare. AI makes it possible to handle data with a large number of dimensions (features) and units (feature vectors) efficiently and with high precision. AI techniques offer benefits in the estimation of health indicators at both individual and population levels (i.e., improving the social and health policy process). Machine learning (ML) is an application of AI that gives systems the ability to learn automatically and improve from experience without being explicitly programmed [ 4 ]. Supervised learning algorithms build a mathematical model of a set of data that contains both the inputs and the desired outputs [ 5 ]. This approach is based on prior knowledge of what the output values for a given sample should be [ 6 ]. ML techniques have been applied to the diagnosis of certain conditions as well as to outcome prediction and prognosis evaluation with high precision [ 7 , 8 , 9 ].

This study was carried out under the InfAct (Information for Action) project [ 10 ], which is a joint action of Member States aiming to develop a more sustainable European health information system by improving the availability of comparable, robust and policy-relevant health status data and health system performance information. InfAct gathers 40 national health authorities from 28 Member States. This study is part of a work package (WP9) focused on innovation in health information systems (i.e., using data linkages and/or AI) to improve public health surveillance and health system performance for the health policy process. As a first step, we explored the current usage of these innovative techniques (i.e., data linkages and/or AI) in European countries and found that very few countries apply AI to estimate health indicators in their public health activities [ 11 ]. Therefore, the next step was to develop a generic approach by applying these innovative techniques to estimate the health indicators of chronic conditions for improved surveillance.

We used diabetes as a case study for several reasons. First, it is one of the leading causes of morbidity in the world [ 12 ] and its prevalence is increasing among all ages in the European region, mostly due to increases in overweight and obesity, unhealthy diet and physical inactivity [ 13 ]. Second, a training data set using the CONSTANCES cohort had already been developed and used to answer various research questions on diabetes. Third, this study was part of the InfAct project and had to be completed within a limited period. Fourth, estimating the incidence of diabetes is important for developing prevention strategies to reduce its burden. For example, promoting a healthy diet and physical activity in daily life could reduce the risk of developing type 2 diabetes.

The main objective of this study was to develop, for the first time, a generic ML-algorithm to estimate the incidence of diabetes based on the number of reimbursements over the last 2 years, excluding anti-diabetes drugs as predictors and focusing on participants who had been non-diabetic over the last 2 years.

Development of the ML-algorithm

To develop the ML-algorithm, we adopted a supervised ML approach using R (R x64 3.6.3) and the following key libraries: caret 6.0–86, AppliedPredictiveModeling 1.1–7, CORElearn 1.54.2, C50 0.1.3.1, and xgboost 1.3.2.1. The following steps were performed: i. selection of the final data set, ii. target definition, iii. coding of variables for a given window of time, iv. split of the final data into training and test data sets, v. variables selection, vi. model training, vii. validation of the model with the test data set and viii. selection of the model.

i. Selection of final data set

We selected the final data set from a population-based epidemiological cohort (i.e., CONSTANCES) to develop an algorithm to estimate the incidence of diabetes. The participants were recruited by CONSTANCES between January 1, 2012, and December 31, 2014. After final completion, this cohort comprises a nationally representative, randomly selected sample of 50,954 participants aged between 18 and 69 years (inclusive) and living in France [ 14 , 15 ]. The participants are randomly selected from the beneficiaries of the National Health Insurance Fund (i.e., CNAM [Caisse Nationale d’Assurance Maladie]). In this cohort, data are collected using a self-administered questionnaire (SAQ) and a medical questionnaire (MQ); these data were used to define known diabetes cases and pharmacologically treated diabetes [ 16 ]. For known diabetes cases, participants reported having diabetes in the SAQ through the item: “Have you ever been told by a doctor or other health care professional that you had diabetes?” In the medical questionnaire, completed during the medical examination, the physician asked each participant whether they had diabetes. For pharmacologically treated diabetes, two questions in the medical questionnaire related to diabetes treatment: “Are you currently being treated for diabetes with oral medication?” and “Are you currently being treated for diabetes with one or more insulin injections?” [ 16 ].

After completing a SAQ on health status, lifestyle factors, and socioeconomic and demographic characteristics, the participants attended their health screening center for a medical examination, which included a medical questionnaire, a physical examination and blood sampling. This information was then linked with the French National Health Data System (i.e., SNDS). We excluded women who declared having already been diagnosed with gestational diabetes mellitus, pregnant women, participants with no data in the SNDS, participants with incomplete data in the SAQ/MQ or on age at diabetes diagnosis, diabetes cases declared more than 12 months before the SAQ/MQ, and all participants with an antidiabetic drug reimbursement between 12 and 36 months before the SAQ/MQ. We considered gestational diabetes as a special group and excluded these women because of their different physiopathology: some may develop diabetes earlier or later due to hormonal disturbance. Predicting the incidence of diabetes among these women would require specific case definitions.

ii. Target definition

The target definition includes the participants who declared diabetes in CONSTANCES cohort (first occurrence ≤12 months) with the first antidiabetic drug reimbursement between 0 and 12 months before inclusion. The diabetes status at inclusion (M0: inclusion in CONSTANCES) was defined according to CONSTANCES as described above. The linkage with the French National Health Data System (i.e., SNDS) allowed recording all antidiabetic drug reimbursement between 0 and 36 months before inclusion (SAQ/MQ).

Participants without declared diabetes (CONSTANCES SAQ/MQ) and/or antidiabetic drug reimbursement between 0 and 36 months before inclusion were defined as non-diabetes cases (target 0). Participants who declared diabetes in CONSTANCES SAQ/MQ (first occurrence ≤12 months) with the first antidiabetic drug reimbursement ≤12 months before inclusion were defined as incident cases (target 1). These diabetes cases included both type 1 and 2. Participants with antidiabetic drug reimbursement between 12 and 36 months before inclusion or declared diabetes (first occurrence > 12 months) were excluded (see Fig.  1 ). We excluded these participants to avoid the potential influence of anti-diabetes drugs on the estimation of incidence of diabetes (not true incident cases).
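The target rules above can be written down as a small decision function. The following Python sketch is illustrative only (the study itself used R); the function name and argument names are hypothetical, and two readings are assumptions rather than statements from the text: target 0 is taken to mean neither declared diabetes nor any antidiabetic reimbursement in the 0 to 36 months before inclusion, and a reimbursement at exactly 12 months is counted on the incident side.

```python
from typing import Optional


def classify_target(declared_diabetes: bool,
                    months_since_first_declared: Optional[float],
                    first_reimbursement_months_before: Optional[float]) -> Optional[int]:
    """Return 1 (incident case), 0 (non-diabetes case) or None (excluded or not covered).

    first_reimbursement_months_before is the delay between the first antidiabetic
    drug reimbursement and inclusion, or None when no reimbursement was recorded
    in the 0-36 month look-back window.
    """
    m = first_reimbursement_months_before

    # First reimbursement between 12 and 36 months before inclusion: excluded.
    if m is not None and 12 < m <= 36:
        return None

    if declared_diabetes:
        # Declared diabetes with a first occurrence more than 12 months ago: excluded.
        if months_since_first_declared is None or months_since_first_declared > 12:
            return None
        # Recently declared and first treated within the last 12 months: target 1.
        if m is not None and 0 <= m <= 12:
            return 1
        # Recently declared but untreated: not covered by the definition (assumption).
        return None

    # No declared diabetes and no reimbursement at all: target 0 (assumed reading).
    if m is None:
        return 0
    # Reimbursement without declared diabetes: treated as not covered (assumption).
    return None
```

For instance, a participant who declared diabetes 6 months ago and received a first antidiabetic reimbursement 3 months before inclusion would be assigned target 1 under these assumptions.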

figure 1

Target definition in CONSTANCES Cohort, “A case study performed in 2019-20 to develop a Machine Learning-algorithm to estimate the incidence of Diabetes Mellitus in France”. *SAQ: Self-administered Questionnaire. MQ: Medical Questionnaire

iii. Coding of variables for a given window of time

In CONSTANCES, we coded only those variables that were also available in the SNDS, so that the resulting ML-algorithm could be applied to the SNDS to estimate the incidence of diabetes. A total of 3483 continuous variables were coded over the last 24 months before the date of the SAQ and standardized using the z-score transformation (for each data point, we subtracted the mean and divided by the standard deviation). The rationale for a time window of 24 months before the SAQ was to provide a period long enough to study changes in diagnostic procedures, hospitalizations and drug consumption, and thus to estimate the incidence of diabetes with high accuracy. The main categories of variables were: number of medical consultations (50 variables), drugs dispensed, coded using the 5th level of the Anatomical Therapeutic code [ATC 05] (461 variables), biological tests (747 variables), medical acts (i.e., X-ray, surgery, etc.) (2135 variables), all hospitalizations (5 variables), hospitalizations with a procedure (i.e., dialysis, radiotherapy, etc.) (5 variables), hospitalizations without a procedure (5 variables), and hospitalizations related to the following associated health conditions: diabetes, heart failure, stroke, heart attack, foot ulcer, lower limb amputation, ischemic heart disease, transient ischemic attack, end-stage renal failure, diabetic coma, diabetic ketoacidosis and cancer (75 variables).
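As a minimal illustration of the z-score transformation described above, the Python sketch below standardizes one variable (for example, a count of reimbursements per participant). It is not the study code, which was written in R. The use of the sample standard deviation (n − 1 denominator) and the handling of constant variables are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of numbers: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption; needs at least 2 values)
    if sd == 0:
        # A constant variable carries no information; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical counts of blood glucose test reimbursements over 24 months.
counts = [0, 1, 1, 2, 4, 0, 3, 1]
standardized = z_scores(counts)
```

After the transformation each variable has mean 0 and standard deviation 1, which puts counts measured on very different scales (consultations, drugs, hospitalizations) on a comparable footing.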

iv. Split final data set into training and test data sets

The final data set was randomly split into a training data set (80%) and a test data set (20%). In the training data set, the number of positive targets (i.e., target 1 = diabetes treated cases) was far smaller than the number of negative targets (i.e., target 0 = non-diabetes cases): of its 35,728 participants, 35,663 were target 0 and 65 were target 1. To avoid bias in the ML-algorithm and skew in the class distribution, we performed random down sampling of the target 0 group in the training data set until it contained the same number of individuals as the target 1 group, i.e., 65.
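The split and the down sampling step can be sketched as follows. This is a simplified Python illustration, not the authors' R code; the function name, the seed and the simple (non-stratified) random split are assumptions. Only the training part is balanced, while the test part keeps its original class distribution.

```python
import random


def split_and_downsample(labels, train_fraction=0.8, seed=0):
    """Randomly split row indices into train/test, then balance the training part.

    labels: list of 0/1 targets, one per participant.
    Returns (balanced_train_indices, test_indices).
    Assumes the training part holds at least as many negatives as positives.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    cut = int(len(indices) * train_fraction)
    train, test = indices[:cut], indices[cut:]

    positives = [i for i in train if labels[i] == 1]
    negatives = [i for i in train if labels[i] == 0]

    # Random down sampling: keep as many target 0 rows as there are target 1 rows.
    kept_negatives = rng.sample(negatives, len(positives))

    balanced = positives + kept_negatives
    rng.shuffle(balanced)
    return balanced, test


# Toy data with a rare positive class (10 positives among 1000 rows).
toy_labels = [1] * 10 + [0] * 990
balanced_train, test_rows = split_and_downsample(toy_labels)
```

With 65 positives in the training data, as reported above, this procedure would leave a balanced training set of 130 participants.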

The selection of variables and the model was performed using the training data. The test data was used solely to test the final model performance.

v. Variables selection

First, we removed all variables with a variance equal to zero. The ReliefF exp. score was then estimated for each remaining variable, based on its relevance for differentiating between target 1 and target 0. The ReliefF expRank method is noise tolerant and is not affected by feature interactions [ 17 , 18 , 19 ]. All variables were ranked according to the ReliefF exp. score. For continuous variables, the score values range from 0 to 1 [ 18 ]. The cutoff score of 0.01 was selected by visual inspection of the ordered plot of ReliefF values for all variables (the “elbow plot” approach). Variables with a ReliefF exp. score of 0.01 or more were included to train the different models, and variables with a score below 0.01 were excluded.
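The filtering logic of this step (drop constant variables, then keep those at or above the cutoff) can be sketched in a few lines of Python. The sketch does not compute ReliefF scores; it assumes they were obtained elsewhere (in the study, with the CORElearn R package) and are passed in as a dictionary. The function name and the toy values are hypothetical.

```python
from statistics import pvariance


def select_variables(columns, relevance_scores, cutoff=0.01):
    """Keep non-constant variables whose relevance score is at least `cutoff`.

    columns: dict mapping variable name -> list of values (training data).
    relevance_scores: dict mapping variable name -> precomputed relevance score
        (only needed for variables with non-zero variance).
    Returns the selected names, ranked from highest to lowest score.
    """
    # Step 1: remove variables with zero variance.
    non_constant = [name for name, values in columns.items() if pvariance(values) > 0]

    # Step 2: apply the cutoff to the relevance score.
    kept = [name for name in non_constant if relevance_scores[name] >= cutoff]

    # Step 3: rank the retained variables by score.
    return sorted(kept, key=lambda name: relevance_scores[name], reverse=True)


# Toy example with made-up scores.
toy_columns = {"age": [40, 50, 60], "constant_var": [1, 1, 1], "noise_var": [1, 2, 3]}
toy_scores = {"age": 0.04, "noise_var": 0.001}
selected = select_variables(toy_columns, toy_scores)
```

In this toy example only "age" survives: "constant_var" is removed for zero variance and "noise_var" falls below the 0.01 cutoff.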

Steps vi to viii model selection and validation of the model with test data set

The following four models [i.e., 1. Linear discriminant analysis (LDA), 2. Logistic regression (LR), 3. Flexible discriminant analysis (FDA) and 4. Decision tree model (C5)] were applied to the training data set. We also fit three boosted algorithms (1. Boosted logistic regression, 2. Boosted C5, and 3. XGBoost) to test whether these more computationally intensive algorithms perform significantly better than the four more standard models. We used the area under the receiver operating characteristic curve (AUC), both in the cross-validation steps and for the best (final) algorithm. The AUC is the most commonly used performance metric in machine learning studies and measures the ability of a classifier to distinguish between classes. For each model, we compared the performance in terms of AUC. We used five-fold cross-validation repeated three times to fit each model in the training stage. After that, the models’ performances were assessed using the test data set. We calculated the mean of each variable in the two predicted groups (diabetic, non-diabetic) to highlight the relative difference for each predictor; for example, the mean age was 56.70 y for the predicted diabetics and 43.69 y for the predicted non-diabetics.
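To make the evaluation scheme concrete, the Python sketch below shows one way to compute the AUC and to generate the 15 train/validation partitions implied by five-fold cross-validation repeated three times. It is an illustration under assumptions, not the caret implementation used in the study: the AUC is computed as the proportion of positive/negative pairs ranked correctly (ties count one half), and the folds are not stratified by class.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (training_indices, validation_indices) for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
        for held_out in range(k):
            validation = folds[held_out]
            training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield training, validation


# 130 rows corresponds to a balanced training set of 65 + 65 participants.
partitions = list(repeated_kfold(130))
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Each model would be fitted on every training part and scored on the matching validation part, and the resulting AUC values averaged to compare models.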

Sensitivity analysis

An unbalanced training data set skews the class distribution, which may affect the performance of the machine learning algorithm. To address this issue, random resampling was applied to balance the training data set and to avoid biased estimation. We applied the following two techniques: 1. over sampling, the random repeated sampling from the minority class (positive target: diabetes cases), which artificially inflates its prevalence, and 2. down sampling, the random repeated sampling from the majority class (negative target: non-diabetes cases), which artificially reduces its prevalence. The sensitivity analysis was performed for the over and down sampling approaches on the test data set. Then, we automated the model selection process by giving the computer specific metrics, including sensitivity, specificity, positive predictive value, negative predictive value and kappa. Finally, a single model was retained based on its performance and its transferability to other databases.
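The metrics named above all derive from the four cells of a confusion matrix. The Python sketch below shows the standard formulas, including Cohen's kappa; the function name is hypothetical and the counts in the example are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives, true negatives.
    Assumes every denominator is non-zero (both classes present and both predicted).
    """
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": observed_agreement,
        "kappa": (observed_agreement - expected_agreement) / (1 - expected_agreement),
    }


# Illustrative counts only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Kappa corrects the observed accuracy for the agreement expected by chance, which makes it more informative than raw accuracy when one class is rare.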

Final data set

The final data set used to develop the algorithm included 44,659 participants, with 81 incident diabetes cases (target 1) and 44,578 participants without diabetes (target 0) (Fig. 2). The general characteristics of the final data set are described in Table 1. Compared with the non-diabetes group, the incident diabetes group was older and had a higher percentage of men, of treated hypertension and dyslipidemia, and of former smokers, as well as a higher body mass index and a more frequent family history of diagnosed diabetes.

figure 2

Flow chart for the selection of the final data set from CONSTANCES Cohort, “A case study performed in 2019-20 to develop a Machine Learning-algorithm to estimate the incidence of Diabetes Mellitus in France”. *SAQ = Self-administered Questionnaire. MQ = Medical Questionnaire

Variables selection

Out of 3468 continuous variables coded, 23 variables (0.7%) had a ReliefF exp. score above 0.01; they were ranked based on this score and were therefore selected (Fig. 3) (Table 2).

figure 3

Variables selection based on ReliefF Exp Score, “A case study performed in 2019-20 to develop a Machine Learning-algorithm to estimate the incidence of Diabetes Mellitus in France”

The first variable was “age”. The following nine were related to the “number of reimbursements of biological tests performed in last 2 years” (i.e., Alkaline Phosphatase test, Gamma Glutamyl Transferase test, Transaminases (ALAT and ASAT, TGP and TGO) blood test, Uric Acid (Uricemia) blood test, Glucose blood test, Creatinine level blood test, Exploration of a Lipid Anomaly (ELA) blood test, HbA1c test and C-Reactive Protein test). The next seven were related to the “number of reimbursements of various non-diabetes drugs in last 2 years” (i.e., Proton pump inhibitor drugs, antidiarrheal drugs, broad-spectrum Penicillin drugs, bacterial and viral vaccines, Acetic acid derivatives, Propionic acid derivatives and Anilides (Paracetamol)). The following five were related to the “number of reimbursements of various medical acts” (i.e., fundus examination by biomicroscopy with contact lens, functional examination of ocular motricity, binocular vision examination, mammography and X-ray for thorax). The last one was “the total number of hospitalizations without a procedure (i.e., dialysis, chemotherapy) in last 2 years”.

Algorithm to estimate the incidence of diabetes

After the selection of variables, four different models [i.e., 1. Linear discriminant analysis (LDA), 2. Logistic regression (LR), 3. Flexible discriminant analysis (FDA) and 4. Decision tree model (C5)] were trained on the training data set using three repeats of five-fold cross-validation. The performance of each model was tested on the test data set and measured by the AUC, for which an empirical 95% confidence interval was calculated based on 15 resamples with replacement (Fig. 4). We then compared the performance of these four models on the test data set to select one based on the performance metrics (Table 3). We kept the LDA model since it performed better than the other models, with an accuracy of 67% on the test data set (Table 3). The three boosted algorithms improved on the predictive performance of the standard models (Fig. 4). The accuracy of the boosted logistic regression, the boosted C5.0 classification model and XGBoost was 77, 67 and 69%, respectively (see Additional file 1). The results of the sensitivity analysis are reported in Additional file 1.
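An empirical confidence interval of this kind can be obtained by resampling the test set with replacement and recomputing the AUC each time. The Python sketch below uses a percentile bootstrap; whether the authors used exactly this procedure is an assumption, and the helper names are hypothetical. With only 15 resamples the interval is coarse, so it should be read as a rough indication of variability.

```python
import random


def auc(labels, scores):
    """Proportion of positive/negative pairs ranked correctly (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, with 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def bootstrap_auc_ci(labels, scores, n_resamples=15, seed=0):
    """Percentile bootstrap 95% interval for the AUC (assumed procedure)."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_resamples:
        rows = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        resampled_labels = [labels[i] for i in rows]
        if len(set(resampled_labels)) < 2:
            continue  # AUC is undefined when a resample contains a single class
        estimates.append(auc(resampled_labels, [scores[i] for i in rows]))
    estimates.sort()
    return percentile(estimates, 0.025), percentile(estimates, 0.975)


# Toy scores that separate the two classes perfectly.
toy_labels = [0] * 20 + [1] * 20
toy_scores = [i / 40 for i in range(40)]
lower_bound, upper_bound = bootstrap_auc_ci(toy_labels, toy_scores)
```

In practice a much larger number of resamples (hundreds or more) gives a more stable interval, at a higher computational cost.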

figure 4

Area under the receiver operating characteristic curve and the empirical 95% confidence interval (based on fifteen resamples with replacement) for all models using the test dataset, “A case study performed in 2019-20 to develop a Machine Learning-algorithm to estimate the incidence of Diabetes Mellitus in France”

Distribution of means of selected variables in test data set

After the selection of the LDA model, it was applied with the 23 selected variables to the test data set (20% of the final data set of 44,659 = 8931). We compared the means of these continuous variables between the two groups predicted by the LDA algorithm in the test data set: incident diabetes cases (n = 2921) and non-diabetes cases (n = 6010) (Table 4). The 2921 diabetes patients in the test data set are the predicted cases. The means of all selected variables related to the number of reimbursements of biological tests, medicines not used for diabetes treatment and medical acts performed in the last 2 years were significantly higher in the incident diabetes group than in the non-diabetes group. For example, age was the first-ranked variable among the 23 selected variables, with a ReliefF exp. score of 0.04, and was highly discriminant for the incident diabetes group: the mean age of patients was 57 years in the diabetes group compared with 44 years in the non-diabetes group (Table 4).

Following the age variable, nine other selected features, related to the mean number of reimbursements of biological tests, were discriminant (i.e., they distinguished between the two groups being compared), with higher values in the incident diabetes group than in the non-diabetes group. These biological tests are performed to measure the levels of certain enzymes, proteins, glucose and uric acid in the blood in order to check the normal function of the liver, kidneys, pancreas and other organs. For example, the mean number of reimbursements of blood glucose tests in the last 2 years was 1.74 times higher in the predicted diabetes group than in the predicted non-diabetes group. The next group of features was the mean number of reimbursements of drugs: there were seven drugs whose mean number of reimbursements in the last 2 years was higher in the incident diabetes group than in the non-diabetes group. In the category of medical acts, the following three features were discriminant for the incident diabetes group: the mean number of reimbursements in the last 2 years of fundus examination by biomicroscopy with contact lens, of ocular motricity examination and of binocular vision examination.

There were seven unusual features selected by the ML-algorithm and were discriminant in incident diabetes group: mean number of reimbursements of broad-spectrum penicillin, vaccines, propionic acid, Anilides (Paracetamol), mammography, X-ray for thorax and mean number of hospitalizations without any procedure.

We have developed an algorithm based on a supervised ML approach to estimate the incidence of diabetes using a training data set from a cohort study. This algorithm (i.e., the LDA model) was built on 23 variables selected from CONSTANCES, based on the number of reimbursements over the last 2 years. It showed a moderate performance in predicting incident diabetes cases, with a sensitivity of 62% and an accuracy of 67%. Among the 23 selected variables, six were expected because they are related to diabetes, such as age and the blood glucose test. The 17 other variables, such as proton pump inhibitor drugs, were not directly related to diabetes but were more discriminant in the incident diabetes group than in the non-diabetes group.

Main limitations of the ML-algorithm

This study was performed as a proof of concept using an ML approach to estimate the incidence of diabetes from reimbursement data. The results showed low accuracy, and several limitations could explain it. First, the small number of diabetes-treated cases in the final data set, which could be related to the lack of older people in the CONSTANCES cohort (participants were between 18 and 69 years at inclusion), is potentially linked with the low accuracy. Participation in a cohort like CONSTANCES is cumbersome and demands additional time to take part in health examinations. People in poorer health and with co-morbidities (both old and young) already require regular health check-ups and could therefore be less motivated to participate in cohort studies. Thus, a few years of follow-up of the volunteers would be needed to include older age groups and obtain more incident diabetes cases. The risk of developing diabetes increases with age; therefore, including a larger number of older people in the final data set may improve the performance of this algorithm. Second, the time window used to code the variables, the previous 2 years, was long. We chose a longer window to better evaluate changes in diagnostic procedures, number of hospitalizations and drug consumption, and to estimate the incidence of diabetes with high accuracy. More research is needed to explore different time windows and their impact on the accuracy of the estimates. Third, diabetes is a complex medical condition with two major clinical types, type 1 and type 2 diabetes, whose pathology and dynamics of development are very different. Type 1 diabetes is thought to be due to autoimmune destruction of the islets of Langerhans, which host the pancreatic β-cells. 
It is diagnosed at a very early stage of life and is believed to involve a combination of genetic and environmental factors. By contrast, the main causes of type 2 diabetes are related to lifestyle, physical activity, dietary habits and genetics, and it usually develops after 50 years of age. In our study, we defined pharmacologically treated diabetes cases as target 1 and non-diabetes cases as target 0. However, we did not further characterize the pharmacologically treated diabetes cases as type 1 or type 2. Including this information in the model could enhance its accuracy. Fourth, the predictors from reimbursement data may not have a strong ability to predict the diabetes outcome. Fifth, these results highlight the low specificity of the model. Considering the low incidence of diabetes, insufficient model specificity could lead to an overestimation of diabetes incidence, so these results should be interpreted with caution. Sixth, we applied an automated procedure and selected the 23 variables most related to our outcome, based on a ReliefF exp. score of 0.01 or more, without consulting a clinician. For future research, we recommend combining upstream clinical expertise with automated approaches, for example by defining more complex indicators (more specific algorithms) related to specific conditions with a higher risk of developing diabetes, such as hypertension or dyslipidaemia; this could improve the performance of the algorithms. Finally, using the boosted algorithms with over sampling requires a high computational capacity and may take several days to compute the results. The computational capacity that is currently routinely available to public health institutions limits the attractiveness of complex analytical models that, while potentially highly accurate, may require several days to fit. 
This often results in such models being the product of specific research projects instead of being routinely developed in-house to support population health monitoring and clinical decision-making, which unfortunately limits access to the advantages of advanced analytics. We therefore recommend that public health institutions invest in high-performance computing capacity.
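The fifth limitation, overestimation when specificity is insufficient and the outcome is rare, can be illustrated with a short calculation. The Python sketch below combines sensitivity, specificity and the true proportion of cases to give the proportion flagged as cases and the positive predictive value. The proportion 81/44,659 and the sensitivity of 62% come from the text; the specificity of 0.67 is an illustrative assumption, not a figure reported by the authors, and the function names are hypothetical.

```python
def apparent_rate(sensitivity, specificity, true_rate):
    """Proportion of people flagged as cases: true positives plus false positives."""
    return sensitivity * true_rate + (1 - specificity) * (1 - true_rate)


def positive_predictive_value(sensitivity, specificity, true_rate):
    """Share of flagged people who are real cases."""
    true_positives = sensitivity * true_rate
    return true_positives / apparent_rate(sensitivity, specificity, true_rate)


true_rate = 81 / 44659           # incident cases in the final data set (from the text)
sensitivity = 0.62               # reported for the LDA model
assumed_specificity = 0.67       # illustrative assumption, not a reported figure

flagged = apparent_rate(sensitivity, assumed_specificity, true_rate)
ppv = positive_predictive_value(sensitivity, assumed_specificity, true_rate)
```

Under these assumptions, the share of participants flagged as incident cases is far larger than the true share, and only a small fraction of flagged participants are real cases. This is why counting predicted cases directly would overestimate incidence unless specificity is very high or the estimate is corrected for misclassification.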

Despite these limitations, this study has some strengths. First, using a supervised machine learning approach, we have developed an innovative methodology that could be applied to address other research questions. Second, this approach reduces the dimension of a large number of variables (i.e., 3468) and identifies the variables most relevant to the desired outcome (i.e., 23/3468 = 0.7%) more efficiently. The LDA model has been used for variable selection and dimensionality reduction for diabetes diagnosis [ 20 ]. Third, it allows new variables to be identified that were not observed using classical statistical approaches and that can enrich the information used to estimate health indicators.

Our study has highlighted that, among the 23 selected variables in the final model, five could be related to diabetes: “number of Glucose blood test”, “number of reimbursement of HbA1c tests”, “number of Fundus examination by biomicroscopy with contact lens”, “number of reimbursement of functional examination of the ocular motricity” and “number of reimbursement of binocular vision examination”. These tests/examinations could follow a diagnosis of diabetes (without treatment in our study) or characterize a group with higher risk factors but without diagnosed diabetes. Further, three of these five variables are not specific to diabetes (“number of Fundus examination by biomicroscopy with contact lens”, “number of reimbursement of functional examination of the ocular motricity” and “number of reimbursements of binocular vision examination”). In an additional analysis, we implemented the model without these 5 variables; the accuracy was 65% (81% sensitivity and 65% specificity). In France, the screening recommendations for diabetes are based on the blood glucose test; HbA1c is recommended only for the management of diabetes, not for its diagnosis. In 2009 and 2010, the WHO introduced HbA1c as an alternative method to diagnose diabetes, which has since been adopted by many countries. Ophthalmologic problems such as glaucoma, cataract and ocular movement disorders are among the main complications of diabetes. Therefore, the increased frequency of medical acts performed because of diabetes-related complications, such as examinations of visual function, allowed incident diabetes cases to be better characterized. Moreover, the increased use of non-diabetic drugs, along with the biological tests mentioned, in the incident diabetes group may reflect pre-existing comorbidity, or diabetes diagnosed late together with cardiovascular or gastrointestinal diseases. 
The aim was to estimate the incidence of treated diabetes; therefore, all participants with prior antidiabetic drug reimbursements (12–36 months before inclusion) were excluded, and antidiabetic drugs were not included among the 23 selected variables. The main reason for excluding these participants was to capture a true estimate of the incidence of diabetes while taking into account other variables to predict it.

Implications and perspectives for future research

This innovative approach has been applied in two further studies: i. to classify type 1 and type 2 diabetes cases and estimate their prevalence [ 21 ] and ii. to identify the number of undiagnosed diabetes cases using ML algorithms in the SNDS (ongoing). For the first study, the ML-algorithm developed has a sensitivity of 100% and a specificity of 97%; for the second study, the sensitivity is 71% and the specificity is 61%.

The next step is to apply this algorithm to the SNDS to estimate the incidence of type 2 diabetes cases. We recommend further research using ML techniques from the following perspectives: first, to use different time windows (for example 6 months, 12 months or 16 months) to code variables and to explore their impact on the estimates; second, to predict the trend of diabetes over time; and third, to estimate the contribution of determinants of diabetes, such as BMI, dietary habits and physical activity, to the development of type 2 diabetes.

The use of ML techniques (MLTs) to analyze large administrative databases (health and non-health related data sources) is increasing across European countries in order to improve public health surveillance and the health policy process. Supervised machine learning is an innovative methodology for the development of algorithms to exploit large health administrative databases. As a first step, we have developed a generic ML-algorithm with a moderate performance to estimate the incidence of diabetes in a training data set. The results of this study have highlighted important methodological steps in applying MLTs and their implications for large health administrative databases. The next step is to apply this algorithm to the SNDS to estimate the incidence of type 2 diabetes cases. More research is needed to apply various MLTs to estimate the incidence of various health conditions and to calculate the contribution of various risk factors to the development of type 2 diabetes.

Availability of data and materials

Not applicable.

Abbreviations

ML: Machine Learning

SNDS: Système National de Données Santé: French National Health Database

CONSTANCES: A population-based epidemiological cohort

AI: Artificial Intelligence

InfAct: Information for Action i.e., a joint action of Member States to establish a sustainable European health information system

WP: Work Package

CNAM: Caisse Nationale de l’Assurance Maladies des Travailleurs Salaries

SAQ: Self-administered Questionnaire

MQ: Medical Questionnaire

ATC: Anatomical Therapeutic Code

LDA: Linear Discriminant Analysis

LR: Logistic Regression

FDA: Flexible Discriminant Analysis

C5: Decision Tree

AUC: Area under Receiver Operating Characteristics Curve

PPV: Positive Predictive Value

NPV: Negative Predictive Value

95% CI: 95% Confidence Interval

BMI: Body Mass Index

WHO: World Health Organization

Harron K, Dibben C, Boyd J, Hjern A, Azimaee M, Barreto ML, et al. Challenges in administrative data linkage for research. Big Data Soc. 2017;4(2):2053951717745678. https://doi.org/10.1177/2053951717745678 .

Tuppin PRJ, Constantinou P, et al. Value of a national administrative database to guide public decisions: from the. Rev Epidemiol Sante Publique. 2017;65(4):S149–67. https://doi.org/10.1016/j.respe.2017.05.004 .

Bradley CJ, Penberthy L, Devers KJ, Holden DJ. Health Services Research and Data Linkages: Issues, Methods, and Directions for the Future. Health Serv Res. 2010;45(5p2):1468–88.

Machine Learning: https://www.expertsystem.com/machine-learning-definition/ . 2017.

Russell S, Norvig P: Artificial Intelligence: A Modern Approach: https://repository.unimal.ac.id/1022/1/Artificial%20Intelligence%20-%20A%20Modern%20Approach%203rd%20Ed%20-%20Stuart%20Russell%20and%20Peter%20Norvig%2C%20Berkeley%20%282010%29.pdf . University Text Book (Third Edition) 2009.

Soni D: Supervised vs Unsupervised Learning: https://towardsdatascience.com/supervised-vs-unsupervised-learning-14f68e32ea8d . 2018.

Jha S, Topol EJ. Adapting to artificial intelligence: radiologists and pathologists as information specialists. JAMA. 2016;316(22):2353–4. https://doi.org/10.1001/jama.2016.17438 .

Patel VL, Shortliffe EH, Stefanelli M, Szolovits P, Berthold MR, Bellazzi R, et al. The coming of age of artificial intelligence in medicine. Artif Intell Med. 2009;46(1):5–17. https://doi.org/10.1016/j.artmed.2008.07.017 .

Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine Learning and data mining methods in diabetes research. Comput Struct Biotechnol J. 2017;15:104–16. https://doi.org/10.1016/j.csbj.2016.12.005 .

Joint Action on Health Information: https://www.inf-act.eu/ . 2018.

Haneef R, Delnord M, Vernay M, Bauchet E, Gaidelyte R, Van Oyen H, et al. Innovative use of data sources: a cross-sectional study of data linkage and artificial intelligence practices across European countries. Arch Public Health. 2020;78(1):55. https://doi.org/10.1186/s13690-020-00436-9 .

Cho NH, Shaw JE, Karuranga S, Huang Y, da Rocha Fernandes JD, Ohlrogge AW, et al. IDF diabetes atlas: global estimates of diabetes prevalence for 2017 and projections for 2045. Diabetes Res Clin Pract. 2018;138:271–81. https://doi.org/10.1016/j.diabres.2018.02.023 .

WHO-Europe: The challenges of diabetes: http://www.euro.who.int/en/health-topics/noncommunicable-diseases/diabetes/data-and-statistics .

CONSTANCES: http://www.constances.fr/index_EN.php#assets . 2019.

Zins M. Goldberg M, team C: the French CONSTANCES population-based cohort: design, inclusion and follow-up. Eur J Epidemiol. 2015;30(12):1317–28. https://doi.org/10.1007/s10654-015-0096-4 .

Fuentes S, Cosson E, Mandereau-Bruno L, Fagot-Campagna A, Bernillon P, Goldberg M, et al. Identifying diabetes cases in health administrative databases: a validation study based on a large French cohort. Int Jo Public Health. 2019;64(3):441–50. https://doi.org/10.1007/s00038-018-1186-3 .

Chaix B, Kestens Y, Bean K, Leal C, Karusisi N, Meghiref K, et al. Cohort profile: residential and non-residential environments, individual activity spaces and cardiovascular risk factors and diseases--the RECORD cohort study. Int J Epidemiol. 2012;41(5):1283–92. https://doi.org/10.1093/ije/dyr107 .

Kononenko MR-SI: An adaption of Relief for attribute estimation in regression: http://www.clopinet.com/isabelle/Projects/reading/robnik97-icml.pdf . 1997.

Devaney M, Ram A. Machine Learning: proceedings of the fourteenth international conference, Nashville, TN, July 1997 (to appear); 2004.

Çalişir D, Doğantekin E. An automatic diabetes diagnosis system based on LDA-wavelet support vector machine classifier. Expert Syst Appl. 2011;38(7):8311–5. https://doi.org/10.1016/j.eswa.2011.01.017 .

Fuentes S, Hrzic R, Haneef R, Kab S, Fosse-Edorh S, Cosson E. Development of type 1/type 2 classification algorithm through machine learning methods and its application to surveillance using a nationwide database in France in: Diabetologia ; 2020.

Acknowledgements

We thank Marie Zins, who is responsible for the CONSTANCES cohort, for her kind support in accessing and using the data from this cohort.

This research has been carried out in the context of the project ‘801553 / InfAct’ which has received funding from the European Union’s Health Programme (2014–2020).

Author information

Sofiane Kab and Rok Hrzic contributed equally as second authors.

Authors and Affiliations

Department of Non-Communicable Diseases and Injuries, Santé Publique France, 12 rue du Val d’Onse, 94415, Saint-Maurice, France

Romana Haneef, Sonsoles Fuentes, Sandrine Fosse-Edorh & Anne Gallay

Population-Based Epidemiological Cohorts Unit, INSERM UMS 011, Villejuif, France

Sofiane Kab

Department of International Health, Care and Public Health Research Institute – CAPHRI, Maastricht University, Maastricht, The Netherlands

Department of Endocrinology-Diabetology-Nutrition, AP-HP, Avicenne Hospital, Paris 13 University, Sorbonne Paris Cité, CRNH-IdF, CINFO, Bobigny, France

Emmanuel Cosson

Sorbonne Paris Cité, UMR U1153 Inserm/U1125 Inra/Cnam/Université Paris 13, Bobigny, France


Contributions

Conceived and designed the study: RHaneef, SFuentes, RHrzic, AG. Performed the study: RHaneef, SKab, RHrzic, SFuentes. Analyzed the data: RHaneef, SKab, RHrzic, SFuentes. Interpreted the results: all authors. Contributed to the writing of the manuscript: all authors. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Romana Haneef .

Ethics declarations

Ethics approval and consent to participate, consent for publication.

All authors gave consent for publication.

Competing interests

R. Haneef is the first author of this paper and the section editor of “health information system” of “Archives of Public Health”. All other authors declare that they have no competing interests related to the work.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Summary of the results obtained when three boosted algorithms (a boosted version of logistic regression, a boosted C5.0 classification model, and XGBoost) were added to our set of four models: (1) linear discriminant analysis (LDA), (2) logistic regression (LR), (3) flexible discriminant analysis (FDA), and (4) a decision tree model (C5). It includes Table S1.1 (main analysis), Table S1.2 (sensitivity analysis: oversampling), Table S1.3 (sensitivity analysis: downsampling), Table S2 (description of the models used and the set of hyperparameters explored), and Table S3 (area under the curve, AUC).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Haneef, R., Kab, S., Hrzic, R. et al. Use of artificial intelligence for public health surveillance: a case study to develop a machine Learning-algorithm to estimate the incidence of diabetes mellitus in France. Arch Public Health 79 , 168 (2021). https://doi.org/10.1186/s13690-021-00687-0


Received : 25 December 2020

Accepted : 02 September 2021

Published : 22 September 2021

DOI : https://doi.org/10.1186/s13690-021-00687-0


  • Artificial intelligence
  • Machine learning technique
  • Supervised learning
  • Health indicator
  • Diabetes mellitus
  • Electronic health records and public health surveillance

Archives of Public Health

ISSN: 2049-3258


The case for real-time public health surveillance

Public health surveillance is not a new concept. But the COVID-19 pandemic has created a new need for a deeper understanding of it, and for modernization toward real-time surveillance that enhances public health action.

The terms “endemic” and “epidemic” were first defined by Hippocrates around 400 B.C. to distinguish between diseases. With the implementation of new technologies, public health surveillance has the goal of “systematic collection, analysis, and interpretation of health-related data needed for the planning, implementation, and evaluation of public health practice” according to the World Health Organization.

Day-to-day public health surveillance is used for early detection, impact assessment, intervention and implementation, evaluation, and risk assessment, all of which lead to public health actions. A few of the many applications for public health surveillance include chronic diseases such as cancer and diabetes, infectious diseases such as flu and vector-borne diseases, as well as environmental health and injury control.

Timely detection of health signals

Public health surveillance relies heavily on public health information systems to ensure data standardization and integration. Any delay in responding to disease outbreaks constitutes a very serious threat to global public health. This is why real-time surveillance and enhanced communication between health care facilities and all levels of the public health system are crucial to ensure action.

For example, ICF implements the Demographic and Health Surveys (DHS) Program for USAID, providing technical assistance on more than 300 surveys in at least 90 countries. We manage a wide range of data, from infectious diseases (such as malaria and HIV) to nutrition and fertility.

However, survey data is not the same as real-time surveillance. It takes time to collect the information, analyze the data, and publish.

Working with CDC since 2014, ICF has been continually evolving CDC’s BioSense platform with the goal of strengthening CDC's public health surveillance system. BioSense collects mostly emergency department data and some urgent care data, from chief complaints to discharge diagnoses, with the goal of rapidly collecting, evaluating, and sharing syndromic surveillance data to understand and monitor health events. The syndromic data is received within 24 to 48 hours of collection, which makes it a near real-time surveillance system for detecting outbreaks.

BioSense is constantly evolving. It played a large role in the COVID-19 pandemic: not only in finding hotspots of infections but also in tracking vaccine and treatment data, adverse effects of ivermectin, and indirect effects of COVID-19 on mental health. While the BioSense team is working on adding more capabilities, there are some challenges that stem from the data itself and from where the data comes from. Every site is different, so data will look different depending on who enters the syndrome into the system. Teams of data scientists, subject matter experts in specific diseases, and other partners must work together to ensure data standardization.
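As a rough illustration of why standardization matters, the sketch below maps free-text chief complaints, entered differently at different sites, to a common syndrome label. The keyword table, the syndrome names, and the function name are hypothetical and are not taken from BioSense; real syndrome definitions are curated and far richer.

```python
import re

# Hypothetical keyword table: syndrome label -> terms that suggest it.
# Illustrative only; these are not the definitions used by BioSense.
SYNDROME_KEYWORDS = {
    "respiratory": ["cough", "sob", "shortness of breath", "wheezing"],
    "gastrointestinal": ["vomiting", "diarrhea", "nausea"],
    "fever": ["fever", "febrile"],
}


def classify_chief_complaint(text):
    """Return the sorted syndrome labels whose keywords appear in the text."""
    # Lowercase and turn punctuation into spaces so that "Cough," and
    # "cough" from two different sites compare equal.
    normalized = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    normalized = " ".join(normalized.split())
    # Pad with spaces so keywords only match whole words or phrases.
    padded = f" {normalized} "
    matches = set()
    for syndrome, keywords in SYNDROME_KEYWORDS.items():
        if any(f" {keyword} " in padded for keyword in keywords):
            matches.add(syndrome)
    return sorted(matches)


if __name__ == "__main__":
    for complaint in ["Cough, FEVER x3 days", "SOB/wheezing", "ankle pain"]:
        print(complaint, "->", classify_chief_complaint(complaint))
```

A keyword table like this is only a starting point; it makes the normalization step explicit so that subject matter experts can review and extend it.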

Modernizing the immunization data infrastructure

While the pandemic showcased the need for consolidation of immunization data, public health information systems faced several challenges in absorbing large amounts of COVID-19 data alongside the vaccination efforts. One solution was to secure real-time data exchange between different entities. The Immunization (IZ) Gateway is a secure cloud-based platform that was built as a central point of connection for entities wishing to query or update immunization data to/from any number of state immunization information systems.

The IZ Gateway establishes a national immunization registry and enables research and analysis for national situational awareness, effectiveness, adverse effects, and rates of vaccination. Currently used by the CDC for COVID-19 reporting, the IZ Gateway offers a simple, standardized, secure, and scalable solution to exchanging data between different systems. While mostly being used for immunization data, the same platform can be adapted for any data exchange to enhance real-time surveillance for health systems.

Social media is also being used to detect potential outbreaks. The NYC Department of Health and Mental Hygiene partnered with Columbia University to develop data mining software that searches restaurant reviews posted on Yelp. If a review reported a few people getting sick at least 10 hours after eating at the restaurant, a positive signal alerted the team, who could then launch an investigation. Twitter has also been used by some health departments during flu season to let campus students know where to get vaccinated. However, there are some limitations to using social media data for public health surveillance, including a limited reach of users and issues around sharing protected information.
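The review-screening rule described above can be sketched as a simple filter. This is a minimal illustration and not the software built by the health department and Columbia University: the illness terms, the `Review` fields, and the default thresholds (two or more people, onset at least 10 hours after the meal) are assumptions chosen to loosely mirror the description.

```python
from dataclasses import dataclass

# Hypothetical illness vocabulary; the real system's terms are not described here.
ILLNESS_TERMS = ("sick", "vomit", "diarrhea", "food poisoning")


@dataclass
class Review:
    restaurant: str
    text: str
    people_sick: int       # assumed to be extracted upstream, by hand or by a model
    hours_to_onset: float  # assumed hours between the meal and first symptoms


def is_signal(review, min_people=2, min_hours=10.0):
    """Return True when a review meets all three screening criteria."""
    mentions_illness = any(term in review.text.lower() for term in ILLNESS_TERMS)
    return (
        mentions_illness
        and review.people_sick >= min_people
        and review.hours_to_onset >= min_hours
    )


def flag_restaurants(reviews):
    """Return the sorted names of restaurants with at least one positive signal."""
    return sorted({review.restaurant for review in reviews if is_signal(review)})


if __name__ == "__main__":
    sample = [
        Review("Cafe A", "Three of us got sick the next day.", 3, 14.0),
        Review("Cafe B", "Great pasta, no complaints.", 0, 0.0),
        Review("Cafe C", "I felt sick right after dessert.", 1, 0.5),
    ]
    print(flag_restaurants(sample))
```

Keeping the thresholds as parameters lets an epidemiologist tune how sensitive the screen is before a human reviews each flagged restaurant.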

Data-informed decisions regarding outbreaks and patient care

It is essential for researchers, epidemiologists, and policymakers to have access to timely data. Real-time surveillance enables rapid and standardized data sharing to understand how diseases spread, strengthen disease surveillance, and track the overdose crisis and syndromic data.

To continue fighting for better public health systems, we need to ensure interoperability across multiple systems. We also must keep modernizing health IT systems and integrating different workflows at all levels of public health, from clinicians to federal agencies. Continuing to provide and enhance real-time, accurate, epidemiologic and laboratory data will lead to the best public health actions.

Cecilia is a public health leader with nearly 10 years of experience implementing bioinformatics solutions for infectious diseases surveillance and outbreak investigations.


  • Open access
  • Published: 18 January 2024

“Hospitals respond to demand. Public health needs to respond to risk”: health system lessons from a case study of northern Queensland’s COVID-19 surveillance and response

  • Alexandra Edelman 1 , 4 ,
  • Tammy Allen 2 ,
  • Susan Devine 2 ,
  • Paul F. Horwood 2 ,
  • Emma S. McBryde 3 ,
  • Julie Mudd 4 ,
  • Jeffrey Warner 2 &
  • Stephanie M Topp 2 , 5  

BMC Health Services Research volume 24, Article number: 104 (2024)


Background

The vast region of northern Queensland (NQ) in Australia experiences poorer health outcomes and a disproportionate burden of communicable diseases compared with urban populations in Australia. This study examined the governance of COVID-19 surveillance and response in NQ to identify strengths and opportunities for improvement.

Methods

The manuscript presents an analysis of one case-unit within a broader case study project examining systems for surveillance and response for COVID-19 in NQ. Data were collected between October 2020 and December 2021, comprising 47 interviews with clinical and public health staff, document review, and observation in organisational settings. Thematic analysis produced five key themes.

Results

Study findings highlight key strengths of the COVID-19 response, including rapid implementation of response measures, and the relative autonomy of NQ’s Public Health Units to lead logistical decision-making. However, findings also highlight limitations and fragility of the public health system more generally, including unclear accountabilities, constraints on local community engagement, and workforce and other resourcing shortfalls. These were framed by state-wide regulatory and organisational incentives that prioritise clinical health care rather than disease prevention, health protection, and health promotion. Although NQ mobilised an effective COVID-19 response, findings suggest that NQ public health systems are marked by fragility, calling into question the region’s preparedness for future pandemic events and other public health crises.

Conclusions

Study findings highlight an urgent need to improve governance, resourcing, and political priority of public health in NQ to address unmet needs and ongoing threats.


The COVID-19 pandemic has underscored the critical importance of pandemic preparedness and health systems capability to support communicable disease surveillance and response in Australia and globally. Public health surveillance is the cornerstone of public health decision-making and practice and is fundamental to averting epidemics [ 1 , 2 ]. Yet surveillance capacity, and pandemic preparedness, varies widely between and within countries [ 3 ]. Australia’s response to the COVID-19 pandemic led to comparative successes, by international standards, in transmission suppression, health system preparation, and control of case numbers, highlighting key strengths including effective coordination of response efforts across state and federal governments [ 4 , 5 ]. In parallel, lessons from Australia’s pandemic response highlight opportunities to strengthen health system preparedness for future pandemics and other public health challenges [ 6 ].

Public health surveillance in Australia operates at national, sub-national (state and territory) and local levels. Responses to health emergencies, including communicable disease outbreaks, are primarily the responsibility of state and territory health departments [ 7 , 8 ]. Within the state of Queensland, northern Queensland (NQ) is a vast rural and remote region (Fig. 1) which lies within the Indo-Pacific geographic region. The region experiences poorer health outcomes and a disproportionate burden of communicable diseases compared with urban populations in Australia [ 9 ]. Unique geographic, demographic, and environmental conditions in NQ contribute to communicable disease emergence and outbreaks. There are many organisations, funding streams, programs, processes, people, and networks involved in COVID-19 surveillance and response in NQ. When the first COVID-19 cases in the NQ region were confirmed in March 2020, they occurred against a backdrop of concurrent communicable disease challenges and risks, including incursion of mosquito-borne viruses (e.g., Japanese encephalitis) and exotic mosquito vectors [ 10 ], an outbreak of syphilis [ 11 ], and the threat of multi-drug resistant tuberculosis (TB) [ 12 ].

Fig. 1. Hospital and Health Service (HHS) boundaries and Australian Statistical Geography Standard (ASGS) remoteness areas in northern Queensland

Theories of health system governance facilitate investigation of how different actors in a given system or organisation function and operate, with the aim of supporting health system strengthening [ 13 ]. Sheikh et al.’s health system framework recognises that any system, such as surveillance and response systems for COVID-19, comprises “hardware” and “software” components [ 14 ]. System hardware refers to tangible components such as financing, skilled health workforce, information systems, commodities and infrastructure that provide the material basis for service or surveillance [ 14 ]. System software refers to the values, norms, organisational cultures (the “usual way of doing business”), relationships and trust, and processes by which providers or managers are held to account – all of which transform material components into a functioning system [ 14 ]. To support pandemic response efforts and future pandemic preparedness, this study examines the governance of COVID-19 surveillance and response systems in NQ, with attention to both system hardware and software components, to identify key strengths and opportunities for improvement. In particular, the study examines: the role of formal rules (such as clinical protocols and guidelines, algorithms or operating procedures) and informal rules (such as workplace norms and organisational culture) in surveillance and response decision-making at different levels; and the relationships, trust and accountability mechanisms on which COVID-19 surveillance and response systems are enacted.

This study is part of a broader project investigating communicable disease governance systems in NQ. The broader project adopts a single case study design with four embedded units of analysis; the current study reports on just one of those units, COVID-19 surveillance and response [ 15 ].

Case studies enable in-depth examination of complex issues in their real-world contexts. Embedded case study designs allow for one or more sub-units to be the focus of in-depth inquiry while allowing the broader topic to remain the phenomenon of interest [ 15 ]. The boundaries of the case are the communicable disease surveillance and response systems in NQ, with COVID-19 one of four embedded units of analysis in the broader study. The case study context is the broader public health system in NQ (Fig. 2). The four case units were selected for their differences according to organisational, biological, regulatory, and political factors, with the other case units being TB, mosquito-borne arboviruses and malaria, and sexually transmissible infections and blood-borne viruses.

Fig. 2. Case study and case context boundaries, showing the COVID-19 embedded unit of analysis

Data were collected and analysed between October 2020 and June 2022 across three project phases (Fig. 3). Phase 1 involved process and stakeholder mapping of the case predominantly via analysis of government and organisational documentation relating to communicable disease control in NQ. Three key informant interviews were also conducted with individuals occupying high-level roles within the NQ public health system. In Phase 2, a further 44 interviews were conducted to develop detailed case unit reports for each of the four embedded units of analysis. This involved selection of 10–20 individuals per unit of analysis, using both purposive and snowballing selection methods. Many aspects of the discussions with interviewees pertained to multiple diseases, case-level issues, and the case context.

Fig. 3. Three phases of the broader study

Organisational charts from public websites and networks of the investigator team were used to identify potential interview participants occupying key roles at both front line and mid-level management within health care delivery and planning organisations in NQ. Both current and historical perspectives were considered. Role types (Table 1) of interviewees included: clinical staff such as infectious disease physicians, emergency department physicians, and pathologists; public health staff such as public health nurses, environmental health officers, public health medical officers, and epidemiologists; mid-level managers such as medical superintendents; and policy or research personnel holding policy, strategy and/or research roles in government or non-government organisations, including expert consultants. As the focus of the study is on the governance of communicable disease surveillance and response systems, which pertain predominantly to decision-making processes and platforms in key government and non-government organisations in NQ, interviews were not sought with patients or non-expert members of the community. An interview guide was developed for Phase 2 interviews (Appendix 1) with question fields relating to surveillance and response processes, workforce roles and relationships.

Policy document review and (pre-approved) unstructured observations in the organisational setting of three public health offices within NQ Hospital and Health Services (HHSs) were also undertaken in this phase. Observations were conducted to identify key activities and relationships relevant to communicable disease surveillance and response within the main sites of public health decision making in NQ. This involved attendance over several days of one or two researchers (AE and SMT) within busy office spaces and at routine meetings, and accompanying senior staff as they conducted their usual work. An observation template was used to prompt notetaking in the following fields: workplace description, roles, communication and relationships, nature of work, workflow characteristics, and tentative interpretations/theoretical memos. Documents were identified and collected from the websites of relevant organisations (e.g., HHSs, Aboriginal and Torres Strait Islander Community Controlled Health Organisations [AICCHOs], government department websites) as identified by the research team and suggested by interviewees and included: published strategies and operational plans; annual reports; policy reports; federal-level and state legislation and regulations; health service contracts and agreements; performance frameworks; vision statements; action plans; and response/management handbooks.

Interviews from both the first and second phases were transcribed verbatim and were coded inductively in NVivo [QSR]. To develop the coding framework, two researchers (AE and SMT) coded a selection of four interview transcripts (one per unit of analysis) and met to compare approaches and decide on a suitable analysis framework. The resultant framework borrowed from the United States Centers for Disease Control and Prevention (CDC)’s 10 Essential Public Health Services [ 16 ], which reflect key public health processes enabling effective communicable disease surveillance and response, as high-level headings to group lower-level inductive codes that reflected both hardware and software elements. Document and observation data were used to create initial process and stakeholder maps and to support analysis of the interview data through triangulation. Thematic analysis produced case unit themes that reflect patterns of relationships between hardware and software components shaping surveillance and response processes and outcomes. Emerging findings from the work were discussed with individuals occupying roles at executive levels within health service organisations and in key policy roles within government entities. Notes from these meetings were used to verify and contextualise Phase 2 findings and link them to the broader policy context.
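The two-level coding framework described above, with high-level headings grouping lower-level inductive codes that are each tagged as hardware or software, can be sketched as a small data structure. Every heading, code, and excerpt below is a hypothetical placeholder; the study's actual codes are not listed in the text, and the analysis itself was carried out in NVivo rather than in code.

```python
from collections import defaultdict

# Placeholder framework: heading -> {inductive code: "hardware" or "software"}.
# None of these names come from the study; they only show the shape of the data.
FRAMEWORK = {
    "heading A (placeholder)": {
        "reporting system delays": "hardware",
        "informal data sharing": "software",
    },
    "heading B (placeholder)": {
        "staff shortages": "hardware",
        "trust between units": "software",
    },
}


def group_segments(coded_segments):
    """Group (inductive_code, excerpt) pairs under their high-level heading."""
    code_to_heading = {
        code: heading for heading, codes in FRAMEWORK.items() for code in codes
    }
    grouped = defaultdict(list)
    for code, excerpt in coded_segments:
        heading = code_to_heading.get(code, "uncategorised")
        grouped[heading].append((code, excerpt))
    return dict(grouped)


def count_by_element(coded_segments):
    """Count coded segments tagged as hardware versus software."""
    code_to_element = {
        code: element
        for codes in FRAMEWORK.values()
        for code, element in codes.items()
    }
    counts = {"hardware": 0, "software": 0}
    for code, _ in coded_segments:
        element = code_to_element.get(code)
        if element is not None:
            counts[element] += 1
    return counts


if __name__ == "__main__":
    segments = [
        ("staff shortages", "excerpt 1"),
        ("trust between units", "excerpt 2"),
        ("a code not yet in the framework", "excerpt 3"),
    ]
    print(group_segments(segments))
    print(count_by_element(segments))
```

Codes that do not yet appear in the framework fall into an "uncategorised" bucket, which is one way to surface items that two coders would need to discuss when comparing approaches.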

A subsequent Phase 3 of comparative analysis was conducted across all four embedded units of analysis but is not reported here. This manuscript reports findings only from the COVID-19 case unit in Phase 2 of the study.

All methods were carried out in accordance with relevant guidelines and regulations, including seeking informed consent from all interview participants. This project was approved by the Townsville HHS Human Research Ethics Committee (HREC; reference HREC/2019/QTHS/59811), with reciprocal approval received from the James Cook University HREC. Governance (site-specific) approval was received from the Townsville HHS, Cairns and Hinterland HHS, Torres and Cape HHS and Mackay HHS to enable data collection at these sites.

Findings are divided into two major sections. The first provides a descriptive account of shifts in regulatory and organisational settings in response to the pandemic in NQ, and their implications for public health activity in the NQ context. The second section reports thematic findings relating to the strengths of the approaches and the challenges experienced in their implementation, considering both hardware (material) and software (relational) factors.

The NQ health system and features of the pandemic response

Public health services, or the preventive, protective and promotive functions of the NQ health system, are delivered by several organisations. Queensland’s 16 state-funded statutory entities called Hospital and Health Services (HHSs), five of which are located in NQ, are primarily funded to deliver secondary and tertiary health care services via hospital, clinic-based, and outreach services. Within the NQ HHSs, two Public Health Units (PHUs) in the Townsville HHS and the Cairns and Hinterland HHS deliver a range of services including disease notifications, screening services, vaccinations, vector control, and outbreak monitoring and response across the NQ region. The Townsville HHS PHU services its own region and the neighbouring North West and Mackay HHS regions, and at the time of study the Cairns and Hinterland HHS serviced its own region and the Torres and Cape HHS region. Health promotion services are also delivered by the PHUs, although this function was hampered by reductions to Commonwealth funding for health promotion, prevention and early intervention from June 2014 [ 17 ] and cuts made to the health promotion workforce resulting from a Queensland health service restructure and workforce reductions in 2012 and 2013 [ 18 ].

Primary care services in NQ are mainly delivered via private GP clinics, with the federally funded Primary Health Networks (PHNs) functioning as regional commissioning and planning bodies. A network of 12 Aboriginal and Torres Strait Islander Community Controlled Health Organisations (AICCHOs) are non-government organisations that deliver culturally safe primary care services supported by a peak body (the Queensland Aboriginal and Islander Health Council – QAIHC, which is a leadership and policy organisation that represents AICCHOs across Queensland).

On 29 January 2020, in response to the COVID-19 pandemic, Queensland’s Minister for Health and Minister for Ambulance Services declared a public health emergency under section 319 of the Public Health Act 2005 (Qld). The declaration initiated “command and control” [ 19 ] arrangements including activation of an emergency and disaster management policy infrastructure to frame the response, centred around Public Health Directions issued by Queensland’s Chief Health Officer (CHO) [ 20 ]. As in other Australian jurisdictions, a wide suite of measures was introduced in Queensland to reduce COVID-19 transmission, including mask wearing in public, hotel quarantine, and home isolation orders. Biosecurity Zones, also known as COVID-19 Restricted Travel Areas or designated areas, were established in NQ under the Biosecurity Act 2015 between March and June 2020. These Zones restricted entry to remote communities in Queensland for the purpose of slowing the spread of COVID-19 among community members at greater risk of becoming seriously ill if they contracted the virus [ 21 ]. In the Cape and Torres regions, the area north of Mossman and the Atherton Tablelands was a single zone, and in North West Queensland communities, Burketown, Palm Island, Doomadgee, and Mornington were also subject to high level restrictions. These restrictions heavily affected Aboriginal and Torres Strait Islander communities across NQ [ 22 ].

Rapid mobilisation of testing and surveillance systems in NQ included the use of point of care testing and reporting of cases through Queensland’s digital Notifiable Conditions Systems (NoCS). In the NQ HHSs, Health Emergency Operation Centres (HEOCs) were rapidly “stood up” to provide incident management support as part of the pandemic response. The AICCHOs and peak body (QAIHC) also developed and disseminated communication plans and public health messaging to communities about how to protect the community during the pandemic. In parallel, the roll-out of the COVID-19 vaccines commenced nation-wide in February 2021, largely led in Queensland by the HHSs. Logistical considerations for vaccine teams in the NQ HHSs included cold chain management, which involved planning, fridge audits and management of cold chain breaches.

As of September 2022, 233,139 confirmed cases of COVID-19 and 232 deaths were reported in NQ. These figures represented less than 1% of Queensland’s total cases and 11% of COVID-19-related deaths [ 23 ]. The proportion of the Queensland population fully vaccinated was 93.1% in September 2022 [ 23 ].
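For readers who want to relate the two NQ figures to each other, the short sketch below computes a crude case fatality ratio (deaths divided by confirmed cases). This ratio is not reported in the source; it is derived here only to show the arithmetic, the function name is an assumption, and the result depends heavily on how completely cases were ascertained.

```python
# Figures for northern Queensland as of September 2022, taken from the text.
NQ_CONFIRMED_CASES = 233_139
NQ_DEATHS = 232


def crude_case_fatality_ratio(deaths, cases):
    """Return deaths divided by confirmed cases (a crude, unadjusted ratio)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


if __name__ == "__main__":
    ratio = crude_case_fatality_ratio(NQ_DEATHS, NQ_CONFIRMED_CASES)
    print(f"Crude case fatality ratio: {ratio:.2%}")
    print(f"Deaths per 100,000 confirmed cases: {ratio * 100_000:.1f}")
```

With these inputs the ratio comes to roughly 0.1%, or about 100 deaths per 100,000 confirmed cases.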

Key strengths and challenges

Key strengths and challenges are presented against five key themes: rapid implementation and coordination of local response measures; unclear lines of accountability; constraints on/gaps in local community engagement; workforce shortages and burnout; and lack of priority afforded to public health function in health services.

Rapid implementation and coordination of local response measures

A key strength of the COVID-19 response in NQ was the rapid implementation and coordination of logistical response measures locally from March 2020. Directed and framed by centralised powers associated with a state-wide declaration of a public health emergency in January 2020, the operationalisation of large components of the COVID-19 response fell to the two PHUs in NQ. As new Public Health Directions were released by Queensland’s Chief Health Officer, COVID-19 response teams in the PHUs rapidly translated broad advice into logistical activities involving local-level coordination with police, quarantine services, hotels, and a wide range of other stakeholders, in conjunction with testing and contact tracing activities led through the broader HHSs.

Region-wide responses were supported by collegiality and trust between senior public health staff in the HHSs. The two PHU directors and public health medical officers in the HHSs leveraged informal relationships and networks across the NQ region to share information and experiences and to engage in localised response coordination, demonstrating the importance of health system software in enabling response efforts. Moreover, although a major structural reform and devolution process enacted in 2012 (involving the establishment of the HHSs) broke down many previous formal, region-wide communicable disease monitoring and response structures, the cultural legacy of the historical separation of public health services from local hospitals meant that the PHUs still operated with relative administrative autonomy from other local health structures in decision-making for aspects of the COVID-19 response. Despite PHUs being embedded in the HHSs for the purposes of broader (state-wide) management and resource allocation (system hardware), this relative autonomy of PHUs at the local level allowed for rapid decision-making and operationalisation of the Public Health Directions, including proactive development of localised COVID-19 response plans.

“So obviously, there’s a lot of state plans, but […] we’ve been very big on having localised plans as well […] we have pushed quite heavily for our HHSs and our PHUs to plan at least: ‘what would we do if we were in a situation where we needed more help [at a state level] and we couldn’t get it?’” Int 17.

In parallel, and operating almost entirely autonomously from the HHSs, the AICCHOs and peak body representing AICCHOs in the state (QAIHC) rapidly developed and disseminated communications plans and public health messaging to communities about how to protect the community during the pandemic.

Unclear lines of accountability

Despite the rapid mobilisation of staff and resources locally, a recurring theme in participants’ accounts of COVID-19 surveillance and response systems was the lack of clarity in accountabilities framing the response. Although the quasi-independence of the PHUs within the HHSs offered a degree of autonomy as noted above, it also meant that Queensland’s Department of Health and HHS expectations of the role of the PHUs lacked clarity or alignment, or were changeable.

“We function in this little hybrid world where we’re part of the Hospital [and Health Service, i.e., HHS] but also separate from it. Because we were always separate from it and our structures and so forth have been maintained quite separately.” Int 17 .

In addition to reporting internally within their host HHS and delivering services to neighbouring HHSs, the PHUs also responded directly to separate divisions/branches within the Queensland Department of Health. During the COVID-19 pandemic an additional, and separate, disaster management reporting hierarchy was established. These overlapping reporting arrangements created an onerous administrative burden on the PHUs during the COVID-19 pandemic. For example, the local HHSs, and Queensland Department of Health’s Communicable Diseases Branch and COVID-19 Compliance Team, as well as the State Health Emergency Coordination Centre, all required regular collection and reporting by the PHUs of different, though related, COVID-19 data.

Despite these reporting requirements, the Queensland Department of Health provided little operational guidance to support planning and coordination of local pandemic responses in the PHUs. The Queensland Health Public Health Practice Manual (2016) legitimised this separation of operational and strategic responsibilities by casting the role of the Department as a high-level strategic and policy setting (rather than operational) body [ 24 ]. The Manual was introduced to provide clarity around accountabilities for public health following the devolution of responsibility for public health to the HHSs (from the Department of Health) from 2012; but in the COVID-19 response, this expectation of functional separation impeded two-way decision-making between the HHSs and the Department. Several participants expressed a struggle to balance a personal sense of responsibility for the success or otherwise of response measures locally, with an inability to participate in decision-making at a state level about what responses were needed and how they should be delivered. Top-down rules regarding communication and limited control of budgets (i.e. rules controlling system “hardware”) did not marry with the expectation, and the reality, that individuals demonstrate a strong sense of mission, adaptability and responsibility in the crisis (i.e. expectations around system “software”).

“Technically if people [i.e., the Chief Health Officer and Department] are telling you what to do, they should supply you with the adequate advice on how to do it, [but] that’s probably the big gap […] It’s ‘you need to go do this’. Okay, well we’ve got all these issues. ‘Well, they’re operational, you [i.e., the PHU] just need to go work them out yourself. We don’t have the resources for that, that’s operational, you need to go and work those out yourself’.” Int 11 .

Consequently, local public health staff saw themselves as simultaneously accountable for response outcomes yet neglected in key governance processes and resourcing decisions that drove these outcomes, which contributed to feelings of vulnerability and frustration.

“That’s one thing as well that I think I felt and all the [key staff in the PHUs] have felt with COVID - is the fact that they [the Queensland Department of Health] just bypassed us. And that also undermines your credibility and undermines your authority which doesn’t help really.” Int 14 .

Framed by the lack of clarity in accountability arrangements, several participants recounted poorly designed and executed responses in NQ. For example, one participant recounted that there were no processes established for review of biosecurity management plans for people accessing movement restriction exemptions; yet the mechanisms to raise and address this issue at a state level lacked responsiveness.

“We had lots of people travelling in on [movement restriction] exemptions, which at that time, people on exemptions were meant to have biosecurity management plans most often than not, no one was reviewing and approving them […] and we saw gaps in it and we got onto SHECC [the State Health Emergency Coordination Centre] down in Brisbane and they’re like, ‘No one approves them. People just have to say that they’ve got one.’” Int 21 .

Constraints on/gaps in local community engagement

As part of the emergency response, new restrictions had been imposed by the Queensland Department of Health on HHS-led media liaison relating to COVID-19. Under the Queensland Public Health Sub-plan, [ 19 ] Level 2 and Level 3 Events require that all media releases and other public health awareness campaigns related to the event be endorsed by the State Health Coordinator (usually the Chief Health Officer or Deputy Director-General) prior to their release. Participants in the PHUs therefore described encountering restrictions on developing and sharing localised COVID-19-related public health information with their communities.

“ The stuff we would normally do with an outbreak, we then weren’t able to do. We couldn’t do any localised comms. And we weren’t allowed to talk to the media.” Int 17 .

Participants also recounted a backdrop of staff cuts to community engagement roles in Queensland Health that followed the 2012 restructure, and low levels of trust between Queensland Health services and the AICCHO sector. The AICCHO sector represents a pivotal link with local communities; yet participant accounts and written reports indicated that there were few governance mechanisms in Queensland Health-led initiatives to involve the AICCHO sector in COVID-19 planning discussions, either at a local HHS level or with Department bodies. For example, it was reported that AICCHO services were not involved in the development of local pandemic plans, received little to no assistance or communication from HHSs throughout the initial crisis period, and were not consulted in relation to key policies including biosecurity zones or surge workforce [ 25 ]. Combined, these factors diminished the capacity of the PHUs to conduct local-level community engagement as part of the pandemic response.

Consequently, participants reported that some COVID-19-related policies and community education materials were not sufficiently responsive to local contexts and needs. For example, participants described “fly-in-fly-out” models of vaccine implementation in some remote communities that had limited focus on improving Aboriginal and Torres Strait Islander community understanding of the vaccines (Int 15). Some participants also felt that local communities had not been effectively engaged in relation to the enactment of the Biosecurity Zones.

Workforce shortages and burnout

Workforce challenges identified in the study were framed by critical workforce gaps that pre-dated the COVID-19 pandemic. Participant accounts emphasised the significant public health workforce and workload implications of enacting COVID-19 surveillance and response activities at a local level, including operationalising the Public Health Directions in the PHUs. Yet, the availability and deployment readiness of a “surge” workforce in the HHSs was limited.

“The whole thing about how you plan to ramp up contact tracing in the event of a big state-wide or local event is still a struggle because we’ve got a list of people who have been – who are existing, or could be trained as, contact tracing officers. But then getting them released, particularly in circumstances where the hospital is not allowed to turn around and reduce services, is almost impossible. Unless they [HHS administrators] know that it’s a critical staff member, you can’t have them.” Int 15 .

The difficulties mobilising a surge workforce meant that the increased workload burden required to support the COVID-19 response largely fell to existing public health services and teams within the Cairns and Townsville PHUs. In at least the first 12 months of the pandemic, a large proportion of staff in the PHUs were re-allocated – in either full or part-time capacities, or simply on top of other work – to the COVID-19 response from vector control, sexual health, communicable disease control and other business as usual (“BAU”) public health services. In the Townsville PHU at the time of data collection, for example, the vector control team had been entirely re-directed towards assisting with hotel quarantine. Similar disruptions were described in other vector control and BAU activities across NQ.

“We just have to figure out what needs to be sacrificed. And unfortunately, that would probably have to be a lot of BAU [business as usual] stuff, and so only follow up the most urgent cases which is pretty sad.” Int 18 .

Emergency COVID-19 funding enabled the establishment of some new positions in the HHSs to conduct testing, contact tracing and follow up. However, this funding was time-limited, meaning that employment arrangements for new staff were short-term. On top of delivering critical COVID-19 response functions, staff in the PHUs described having to continually re-apply to HHS administrators for funding every three months to support the new positions needed for the ongoing COVID-19 response, causing planning problems and compounding recruitment challenges. As the emergency funding ceased, participants described an erosion of redundancy capacity, meaning an inability among staff remaining involved in the response to step away from their roles even for short periods. As people returned to previous roles, the remaining teams were smaller yet there had not been a concomitant decrease in workload or reduction in “high alert intensity” (Int 17).

The intense workload pressures experienced by public health staff had led to increasing feelings of fatigue and burnout during 2021, eroding motivation to maintain or build relationships (system software) that were essential to many COVID-19 specific and more routine public health functions. Moreover, these experiences were framed by a sense among participants that public health continues to be generally poorly understood, and little valued, by health service administrators whose decisions are central to perceptions of under-resourcing (system hardware) of public health personnel and activities.

“At an administrative level, the hospital […] doesn’t make any allowance for public health being different to how the rest of the hospital works […] one of the big objectives is to provide patient centred care. Of course, public health is not about patient centred care, but everything has to be about patient centred care because that’s the flavour of the current [HHS] strategic plan.” Int 15 .

Lack of priority afforded to public health function in health services

The final theme reflects a widespread concern among participants that public health generally lacks priority – in terms of both values (software) and resourcing commitments (hardware) – within the HHSs. Participants recounted a need, both pre and during COVID-19, to explain and defend the value of public health as a service directorate to HHS administrators. While some felt that the COVID-19 response may have drawn attention to the value of public health (“ Public health […] is sexy all of a sudden” Int 43) others pointed to a hospital-centric approach in the way that COVID-19 emergency funding was overwhelmingly used to support frontline clinical, rather than public health, activities in the HHSs during the study period. One participant recounted that, despite state and federal COVID-19 resourcing including substantial increased support for contact tracing and testing activities, COVID-19 response plans tended to be reactive: demonstrating an underlying, persisting lack of understanding of critical prevention activities.

“One of the difficulties we’ve had is people understanding that public health needs to respond before the curve, not with the curve. Hospitals respond to demand. Public health needs to respond to risk.” Int 17 .

Structurally, there are no key performance indicators in the HHS Service Level Agreements that map to public health (preventive, protective, and promotive) priorities, and a very small proportion of HHS budgets is allocated to public health services. Allocations for Prevention Services – Public Health in the Service Level Agreements (2019/20 to 2021/22), for example, represent only between 0.1% and 2% of HHS non-capital allocations [ 26 ]. Moreover, participants described a lack of transparency in these budget allocations within the HHSs, and the challenge of relying on temporary project-based funding to support several critical public health functions.

“So what is killing us more than anything else is understaffing and temporary funding from multiple pots that we don’t understand. I don’t even understand the slice of the HHS budget that is available to us. We only pretend to budget here.” Int 15 .

Australia has a robust national regulatory architecture for communicable disease surveillance and response; yet identifying gaps and opportunities to strengthen overall preparedness for infectious disease outbreaks and pandemics requires detailed investigation of localised capacities, activities and relationships. This COVID-19 case unit, nested within a broader case study on communicable disease surveillance and response systems in NQ, highlights several critical insights to inform health system strengthening in a region with higher communicable disease risks and unmet health needs. Study findings highlight key strengths of the COVID-19 response, including rapid implementation of response measures, and the relative autonomy of the PHUs to lead logistical decision-making. However, the study also reveals critical limitations of the public health system in NQ, including unclear accountabilities, constraints on local community engagement, and workforce and other resourcing shortfalls – framed by regulatory and organisational elements prioritising clinical health care rather than disease prevention, health protection, and health promotion. We reflect on what these experiences demonstrate with respect to Queensland’s public health surveillance and response capacity, not only in the pandemic but more broadly.

First, the findings highlight an urgent need for prioritisation of public health within Queensland’s decentralised health system. Public health (as distinct from publicly funded health services) is concerned with the protection and promotion of health, and prevention of injury, illness, and disability. It is a core responsibility of government and conceptualised as an integral component of comprehensive primary healthcare; yet our study findings indicate that this distinction is poorly understood in the devolved organisational context of the HHSs in NQ and potentially beyond, even accounting for the experience of COVID-19. Others have criticised the clinical / biomedical orientation of Australia’s health system and the urgent need for policy redirection towards public health, [ 27 ] as well as the need to clarify definitions, terminologies and classifications to identify what is a “public health” activity and facilitate comparison of this activity across jurisdictions [ 28 ]. In Queensland, the organisational position of PHUs – nested within and dependent for resources on board-governed health districts with typically hospital-focused administrators – is poorly suited to public health decision-making. Gaps in support for public health capacity in NQ were a key finding of a Queensland Government review of the COVID-19 response, [ 6 ] underpinning a recommendation to amend legislation to strengthen accountability for population health in Queensland Health. Delivering on this recommendation needs to be accompanied by adequate resourcing for key workforce roles to support a full complement of public health functions across the diverse NQ region. Without this, pandemic responses will remain overwhelmingly reactive, rather than responsive to risk.

Second, the study highlights opportunities to proactively explore public health workforce models that are amenable to scaling up and down to support an effective response, supported by foundational, effective, and permanent public health services in the HHSs. The study demonstrates that emergency funding from a low-capacity base is not sufficient to deliver a sustained public health response: our study identified only modest increases in budgets relative to new responsibilities, inclusive of developing and managing a hotel quarantine system from scratch. The PHUs across the NQ region faced workforce shortages and burnout resulting in significant disruptions to other longstanding public health priorities including sexual health and vector management for arboviruses. The full implications of these workforce redirections globally are yet to be seen, but there are fears that disruptions to prevention and treatment services during the first two years of the COVID-19 pandemic will drive higher case numbers of other diseases such as TB [ 29 ].

Third, the study highlights critical gaps in community engagement capacity within NQ public health systems. While this to some extent reflects the emergency response (“command and control”) approach as well as capacity and resourcing constraints in the PHUs, it may also reflect gaps in capabilities and workforce models within the HHSs to proactively lead engagement with diverse communities across the NQ region. The gaps evident in the current study in the relationship between Queensland Health and the AICCHO sector point to an urgent need for relationship and trust-building to support coordination of community engagement activities and surveillance and response services for NQ Aboriginal and Torres Strait Islander populations. The Queensland Health/QAIHC 2021 joint Health Equity Framework [ 30 ] outlines a strategic framework to drive health equity, and eliminate institutional racism across the health system as a foundation for improving health outcomes for Queensland Aboriginal and Torres Strait Islander peoples. Future planning for community engagement in public health in NQ must be guided by this Framework. There may also be opportunities to strengthen resourcing of local community engagement and feedback mechanisms through emphasising these activities in future pandemic plans.

Overall, although NQ is regarded as having mobilised an effective COVID-19 response, [ 6 ] the findings of the study highlight that NQ public health systems are marked by fragility, calling into question the region’s preparedness for future pandemic events and other public health crises. The findings of this component of the study highlight that to strengthen public health systems in NQ there is an urgent need to improve clarity in accountability relationships for PHUs, strengthen coordination between services, invest in the public health workforce, and improve the responsiveness of services to local need through mechanisms supporting effective community engagement. There is also an opportunity for ongoing research to strengthen the case for public health investment in NQ, state-wide and nationally, by highlighting the overwhelming health benefits and cost savings attributable to responding to risk rather than demand.

Key strengths of the study include its attention to both “hardware” and “software” health system elements to identify key strengths and challenges, and the use of multiple data sources to enable data triangulation. Limitations include the focus of the study on state government-funded services as the entities with legislated responsibility for public health in Queensland. In addition, there were some limitations to the US CDC’s 10 Essential Public Health Services as an analytic framework, including limited guidance on or attention to critical domains in a public health crisis, such as post-disaster recovery, surge capacity (beyond workforce), and continuity of essential services. As fewer interviewees were included from the primary care sector, future governance-focussed research might also seek to explore structures and networks established to support communicable disease surveillance and response functions outside of state-funded services. Patients and the broader public were not represented in the current study, whose focus was on governance systems and their key institutions. As these groups are critical stakeholders, however, future work should consider their perspectives and experiences regarding issues of public health governance.

Study findings highlight key strengths of the COVID-19 response, including rapid implementation of response measures, and the relative autonomy of the PHUs to lead logistical decision-making. However, the study also reveals critical limitations of the public health system in NQ, pertinent to longer term aspirations for pandemic preparedness in Australia. First, the findings highlight an urgent need for prioritisation (both strategic and operational) of public health within Queensland’s decentralised health system. Second, the study demonstrates opportunities to proactively explore public health workforce models that are amenable to surge mobilisation for effective outbreak and pandemic responses. Third, the study highlights critical gaps in community engagement capacity within NQ public health systems, underscoring the importance of approaching future community engagement planning from a foundation of trust between key NQ services. Overall, the findings of the study demonstrate an urgent need for improved governance, resourcing, and political priority of public health in NQ to address unmet needs and ongoing pandemic and other public health threats.

Data Availability

Due to confidentiality agreements, qualitative de-identified data from interviews conducted during the current study are only available from the corresponding author on reasonable request to bona fide researchers. Other data derived from public resources are made available in the article.

Abbreviations

AICCHO: Aboriginal and Torres Strait Islander Community Controlled Health Organisation(s)

Health Emergency Operation Centre(s)

HHS: Hospital and Health Service

PHU: Public Health Unit

Notifiable Conditions Systems

NQ: North Queensland

QAIHC: Queensland Aboriginal and Islander Health Council

1. World Health Organization. Communicable disease surveillance and response systems: a guide to planning. Lyon, France: World Health Organization; 2006.

2. Choi BC. The past, present, and future of public health surveillance. Scientifica. 2012;2012:875253.

3. Bell JA, Nuzzo JB. Global Health Security Index: advancing collective action and accountability amid global crisis. Washington, DC: Nuclear Threat Initiative; 2021.

4. Bennett S. Responding to the pandemic at a national and state public health level. Microbiol Australia. 2021;42(1):13–7.

5. Van Nguyen H, Lan Nguyen H, Thi Minh Dao A, Van Nguyen T, The Nguyen P, Mai Le P, et al. The COVID-19 pandemic in Australia: public health responses, opportunities and challenges. Int J Health Plann Manag. 2022;37(1):5–13.

6. Reform Planning Group. Unleashing the potential: an open and equitable health system: Healthcare for Queenslanders in a pandemic ready world. Brisbane: Queensland Government; 2020.

7. Australian Government Department of Health. Health emergency preparedness and response; 2020. Available from: https://www1.health.gov.au/internet/main/publishing.nsf/Content/health-pubhlth-strateg-bio-index.htm

8. Australian Government Department of Health. Surveillance systems reported in Communicable Diseases Intelligence, 2016; 2020. Available from: https://www1.health.gov.au/internet/main/publishing.nsf/Content/cda-surveil-surv_sys.htm

9. Quinn EK, Massey PD, Speare R. Communicable diseases in rural and remote Australia: the need for improved understanding and action. Rural Remote Health. 2015;15(3):3371.

10. Horwood PF, McBryde ES, Peniyamina D, Ritchie SA. The Indo-Papuan conduit: a biosecurity challenge for Northern Australia. Aust N Z J Public Health. 2018;42(5):434–6.

11. Australian Government Department of Health and Aged Care. National response to syphilis. Canberra: Australian Government; 2022. Available from: https://www.health.gov.au/initiatives-and-programs/national-response-to-syphilis?utm_source=health.gov.au&utm_medium=callout-auto-custom&utm_campaign=digital_transformation

12. Baird T, Donnan E, Coulter C, Simpson G, Konstantinos A, Eather G. Multidrug-resistant tuberculosis in Queensland, Australia: an ongoing cross-border challenge. Int J Tuberc Lung Dis. 2018;22(2):206–11.

13. Pyone T, Smith H, van den Broek N. Frameworks to assess health systems governance: a systematic review. Health Policy Plann. 2017;32(5):710–22.

14. Sheikh K, Gilson L, Agyepong IA, Hanson K, Ssengooba F, Bennett S. Building the field of health policy and systems research: framing the questions. PLoS Med. 2011;8(8):e1001073.

15. Yin RK. Case study research: design and methods. 5th ed. Thousand Oaks, CA: Sage; 2014.

16. Centers for Disease Control and Prevention. 10 Essential Public Health Services. U.S. Department of Health & Human Services; 2021.

17. Queensland Government. Queensland Government submission: Select Committee on Health Inquiry into health policy, administration and expenditure. Brisbane; 2015.

18. Sweet M. On budget day for Queensland, what can be expected from the health cuts? Croakey Health Media. 11 September 2012.

19. Queensland Government. Queensland Health Public Health Sub-plan. Brisbane: Queensland Government; February 2018.

20. Queensland Government. Chief Health Officer Public Health Directions. Brisbane: Queensland Government; 2022. Available from: https://www.health.qld.gov.au/system-governance/legislation/cho-public-health-directions-under-expanded-public-health-act-powers

21. Strict travel restrictions in place for Queensland’s indigenous communities [press release]. Brisbane: The Queensland Cabinet and Ministerial Directory; 27 March 2020.

22. Queensland Government. Factsheet: travel restrictions in Aboriginal and Torres Strait Islander communities during coronavirus. Brisbane: Department of Seniors, Disability Services and Aboriginal and Torres Strait Islander Partnerships; 2020.

23. Queensland Government. Queensland COVID-19 statistics. Brisbane; 2022 [updated 12 September 2022]. Available from: https://www.qld.gov.au/health/conditions/health-alerts/coronavirus-covid-19/government-response/statistics#caseoverview

24. Queensland Department of Health. Public health practice manual. Brisbane: Prevention Division, Queensland Department of Health; 2016.

25. Queensland Aboriginal and Islander Health Council. QAIHC Submission to the Health, Communities, Disability Services and Domestic and Family Violence Prevention Committee: Inquiry into the Queensland Government’s response to COVID-19 in relation to the health response only. Brisbane: QAIHC; 2020.

26. Queensland Health. Service agreements and deeds of amendment. Queensland Government; 2022. Available from: https://www.health.qld.gov.au/system-governance/health-system/managing/agreements-deeds

27. Baum F, Freeman T. Why community health systems have not flourished in high income countries: what the Australian experience tells us. Int J Health Policy Manage. 2022;11(Special Issue on CHS-Connect):49–58.

28. Jorm L, Gruszin S, Churches T. A multidimensional classification of public health activity in Australia. Australia and New Zealand Health Policy. 2009;6(1):9.

29. Roberts L. How COVID hurt the fight against other dangerous diseases. Nature. 2021;592(7855):502–4.

30. Queensland Health and Queensland Aboriginal and Islander Health Council. Making Tracks Together: Queensland’s Aboriginal and Torres Strait Islander Health Equity Framework. Brisbane: Queensland Health and QAIHC; 2021.

Acknowledgements

We would like to thank the study’s advisory group (Emeritus Prof. Ian Wronski AO, Mr Bevan Ah Kee, Dr. Chris Coulter, Ms Sonia Girle, Dr. Richard Gair) for their advice and guidance at each point of the research. We would like to acknowledge and thank all the participants for their time and generosity in sharing their experiences, and providing critical feedback during dissemination forums.

This research was funded by an NHMRC Investigator Award held by SMT (GNT1173004).

Author information

Authors and affiliations.

Menzies School of Health Research, Charles Darwin University, Alice Springs, Northern Territory, Australia

Alexandra Edelman

College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia

Tammy Allen, Susan Devine, Paul F. Horwood, Jeffrey Warner & Stephanie M Topp

Australian Institute of Tropical Health and Medicine, James Cook University, Townsville, Queensland, Australia

Emma S. McBryde

College of Medicine and Dentistry, James Cook University, Townsville, Queensland, Australia

Alexandra Edelman & Julie Mudd

James Cook University, Building 41, Level 2, 1 James Cook Drive, Douglas, Queensland, 4811, Australia

Stephanie M Topp

Contributions

Conceived the study: SMT; Design and methodology: SMT, AE, TA, SD, PH, EM, JM, JW; Data Collection and Management: AE, SMT; Led data Analysis: AE, SMT; Contributed to data interpretation: TA, SD, PH, EM, JM, JW; Wrote first draft: AE; Critical Edits: SMT, TA, SD, PH, EM, JM, JW. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Stephanie M Topp .

Ethics declarations

Ethics approval and consent to participate.

All methods were carried out in accordance with relevant guidelines and regulations including seeking informed consent from all interview participants. This project was approved by the Townsville HHS Human Research Ethics Committee (HREC) as a low-risk project on 28 November 2019: HREC/2019/QTHS/59811 . Reciprocal approval was received from the James Cook University HREC on 15 January 2020. Governance (site-specific) approval was received from the Townsville HHS, Cairns and Hinterland HHS, Torres and Cape HHS and Mackay HHS to enable data collection at these sites.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Appendix 1: Phase 2 - interview guide

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Edelman, A., Allen, T., Devine, S. et al. “Hospitals respond to demand. Public health needs to respond to risk”: health system lessons from a case study of northern Queensland’s COVID-19 surveillance and response. BMC Health Serv Res 24 , 104 (2024). https://doi.org/10.1186/s12913-023-10502-x

Received: 21 July 2023

Accepted: 19 December 2023

Published: 18 January 2024

DOI: https://doi.org/10.1186/s12913-023-10502-x

  • Public health
  • Health systems
  • Pandemic preparedness

BMC Health Services Research

ISSN: 1472-6963

  • Research in practice
  • Open access
  • Published: 07 May 2022

Conducting public health surveillance in areas of armed conflict and restricted population access: a qualitative case study of polio surveillance in conflict-affected areas of Borno State, Nigeria

  • Eric Wiesen   ORCID: orcid.org/0000-0003-2605-3015 1 ,
  • Raymond Dankoli 2 ,
  • Melton Musa 3 ,
  • Jeff Higgins 1 ,
  • Joseph Forbi 1 ,
  • Jibrin Idris 3 ,
  • Ndadilnasiya Waziri 3 ,
  • Oladapo Ogunbodede 3 ,
  • Kabiru Mohammed 4 ,
  • Omotayo Bolu 1 ,
  • Gatei WaNganda 1 ,
  • Usman Adamu 4 &
  • Eve Pinsker 5  

Conflict and Health volume 16, Article number: 20 (2022)

This study examined the impact of armed conflict on public health surveillance systems, the limitations of traditional surveillance in this context, and innovative strategies to overcome these limitations. A qualitative case study was conducted to examine the factors affecting the functioning of poliovirus surveillance in conflict-affected areas of Borno state, Nigeria using semi-structured interviews of a purposeful sample of participants. The main inhibitors of surveillance were inaccessibility, the destroyed health infrastructure, and the destroyed communication network. These three challenges created a situation in which the traditional polio surveillance system could not function. Three strategies to overcome these challenges were viewed by respondents as the most impactful. First, local community informants were recruited to conduct surveillance for acute flaccid paralysis in children in the inaccessible areas. Second, the informants engaged in local-level negotiation with the insurgency groups to bring children with paralysis to accessible areas for investigation and sample collection. Third, GIS technology was used to track the places reached for surveillance and vaccination and to estimate the size and location of the inaccessible population. A modified monitoring system tracked tailored indicators including the number of places reached for surveillance and the number of acute flaccid paralysis cases detected and investigated, and utilized GIS technology to map the reach of the program. The surveillance strategies used in Borno were successful in increasing surveillance sensitivity in an area of protracted conflict and inaccessibility. This approach and some of the specific strategies may be useful in other areas of armed conflict.

Introduction

Global strategy

The Global Polio Eradication Program was established in 1988 with the lofty goal of eradicating polio globally by the year 2000 [ 1 ]. The key strategies for polio eradication are achieving high coverage with 4 doses of polio vaccine for all infants, supplemental polio mass vaccination campaigns to boost polio immunity, sensitive surveillance to rapidly detect poliovirus circulation, and outbreak response vaccination campaigns. Since its inception, the program has reduced the annual incidence of paralytic polio from over 350,000 cases in 125 countries in 1988 to only 140 cases in just two countries in 2020 [ 2 ]. Despite this remarkable progress, the program is now 20 years past the target date and struggling to stop transmission in the remaining indigenous wild poliovirus reservoirs [ 3 ] in parts of Pakistan and areas of conflict in Afghanistan.

Requirements for polio surveillance

Sensitive poliovirus surveillance is a key component of the effort to eradicate polio because it allows the program to rapidly detect and respond to any cases of polio to stop the transmission [ 4 ]. The poliovirus surveillance system is centered primarily on active surveillance for any case of acute flaccid paralysis (AFP) in children with laboratory testing of fecal specimens for poliovirus. Conducting this surveillance well requires a comprehensive network of district surveillance officers and health facility surveillance focal persons to quickly detect, report, and investigate AFP cases as they occur [ 4 ]. This network requires participation by public and private health care providers and is often augmented with support from partners such as the World Health Organization (WHO). There are a set of performance indicators that track the functioning and sensitivity of typical AFP surveillance systems. However, in areas of armed conflict these surveillance systems are challenged.

Challenges in surveillance in armed conflicts

The ability to conduct sensitive surveillance is substantially curtailed in situations of insecurity and inaccessibility due to armed conflict [ 5 ]. Without full access to the population for vaccination and surveillance, poliovirus can circulate undetected. For example, an outbreak of polio in South Sudan was detected in 2008, which, based on poliovirus genomic sequencing analysis, Footnote 1 [ 7 ], represented three years of undetected transmission due to ongoing conflict in that country [ 8 ]. Disruptions to both vaccination and surveillance have led to polio outbreaks and delayed detection in Afghanistan, Somalia, Angola, and the Democratic Republic of Congo as well [ 9 ].

Context in Borno State, Nigeria

This complex problem, which can have far-reaching implications, is exemplified in the northern Nigeria State of Borno where wild poliovirus (WPV) was detected in 2016 and linked to transmission of lineages last detected in 2011, representing five years of undetected transmission due to the ongoing conflict in the state [ 10 ]. For over a decade Northeast Nigeria, and particularly Borno state, has been plagued by ongoing attacks by Boko Haram and offshoot terrorist groups [ 11 ]. These armed groups are responsible for mass killings, hostage takings, and destruction of houses and infrastructure including health facilities. During 2014–2016, Boko Haram gained control of progressively more territory in the state. Approximately 2.2 million people fled their homes due to the terrorist activities and millions more are in need of humanitarian assistance [ 12 ]. A large number of people remained trapped in inaccessible areas of Borno State that the polio program could not access to conduct disease surveillance or vaccination [ 13 ]. Because of this situation, the polio program in Nigeria could not rule out the possibility of continued polio transmission in the state.

This study examined the impact of armed conflict on public health surveillance systems, the limitations of traditional surveillance strategies in this context, and potential strategies to overcome these limitations. The primary question was: how can the conventional polio surveillance system and strategies be modified for areas of conflict and inaccessible populations? Secondary questions focused on exploring the inhibitors of effective surveillance in the context of armed conflict, potential strategies to overcome them, modified performance monitoring mechanisms, and systems for facilitating collaboration for surveillance.

Study design

This study employed a qualitative single case study design to examine the AFP surveillance system in inaccessible areas of Borno State, Nigeria. Inaccessibility was defined as the inability of civilians to safely move in and out of a given area due to the risk of attack by insurgents. Elements of case study research include corroboration of findings from different types of evidence, use of a conceptual framework to guide the research design, and use of appropriate data collection and data analysis techniques to address issues of validity and reliability [ 14 ]. This design was chosen to allow an in-depth exploration of the challenges and strategies at play in a severe conflict situation.

Researcher characteristics

The corresponding author conducted all the interviews and analyses for this study. Based in Atlanta, he had travelled to Borno state twice prior to the study to support the polio eradication program and had met some of the respondents. He was not working on polio eradication in Nigeria at the time of this study and did not attempt to bias or sway them in any way from providing their own perspectives during the interviews. While the use of a sole researcher to collect data is a limitation of the study, several strategies to strengthen validity were employed. These included: (1) triangulation among data sources—a document review was conducted prior to interviews, not to limit the interviews but to make it possible to explore, expand on, and question barriers and enablers tentatively identified through documents; (2) use of a second coder to check the researcher’s application of the codes, until 80% agreement was consistently reached; (3) member checking with the interview respondents to review the analysis and ensure that it accurately reflected their responses; (4) peer debriefing and discussion of results with a group of colleagues who work on polio eradication in areas of armed conflict in Nigeria, Somalia, Pakistan, and Afghanistan. Observations from member checking and peer debriefing [ 15 ] were incorporated into the final discussion of the data and greatly enriched the analysis.
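The second-coder check described above can be illustrated with a minimal sketch. The article does not say which agreement statistic was used, so simple percent agreement is an assumption here, and the code labels and segments are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of segments to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same segments")
    if not coder_a:
        raise ValueError("no segments to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codes applied by each coder to the same ten transcript segments.
primary = ["access", "infra", "comm", "access", "gis",
           "ciia", "ciia", "access", "gis", "comm"]
second = ["access", "infra", "comm", "infra", "gis",
          "ciia", "ciia", "access", "gis", "access"]

agreement = percent_agreement(primary, second)
# Assumed threshold, taken from the 80% figure reported in the text.
meets_threshold = agreement >= 0.80
print(f"agreement = {agreement:.0%}, meets threshold: {meets_threshold}")
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic could be substituted if the coding scheme has few categories.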

Conceptual framework

A conceptual framework (Fig.  1 ) was developed (as Yin recommends for case studies as noted above) encompassing the key factors that affect the AFP surveillance system in conflict-affected areas, and revised following data analysis. The framework was developed through a review of the literature, analysis of the current polio surveillance structure, and reflections on the unique challenges affecting surveillance in Borno [ 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 ]. It includes the systems, assumptions, barriers, theories, and opportunities regarding conducting high quality polio surveillance in conflict-affected areas. The framework identifies the interconnections between these factors to focus on the opportunities to change the current system in ways that will make it more effective in the context of armed conflict. It illustrates the ways in which the current polio surveillance system is hindered in areas of armed conflict and suggests alternatives that may be effective in overcoming those barriers. Finally, it includes novel strategies such as engagement with local communities, engagement with security forces, use of tailored surveillance indicators, and use of remote sensing to assess population dynamics.

Fig. 1 Conceptual Framework

Study sample

The sample for this study was purposefully selected [ 30 ] and encompassed 15 key documents and 16 staff selected for in-depth interviews. The sample strategy was designed to obtain a range of perspectives and experiences and to encapsulate both the field level (district and community) and the higher levels (state and national). The strategy also entailed reaching out to various organizations to understand the issues related to collaboration and information sharing. Finally, the strategy involved selecting various cadres of staff who are familiar with diverse aspects of surveillance including oversight functions, field surveillance, and data analysis. The list of potential staff to interview included 17 staff at the state and national level and 30 staff at the field level. We limited the interviews to a maximum of 30 people to ensure feasibility of the data collection. We purposively chose from among the staff in the sample to obtain a range of perspectives. In order to ensure a diversity of respondents, the sample included at least 3 staff from the district level and staff from at least 3 different agencies. We continued sampling until reaching a point of saturation in which we were hearing the same information and not gaining new insights from subsequent interviews. However, our study has limitations given that not everyone selected for the study opted to participate and there may have been some sample bias due to the participation levels. Key documents were selected to obtain detailed information on the armed conflict in Borno, the humanitarian response to the conflict, and polio eradication program activities in those areas. Documents included published reports, journal articles, program plans, and news media articles (Table 1 ).

The staff interviewed in this study were selected to obtain a range of perspectives from various organizations, position types, and levels that are important for the functioning of the surveillance system (Table 2 ). The primary aim of these interviews was to gain a deeper understanding of the enablers and barriers for effective surveillance and the monitoring and collaboration systems. Five of the staff interviewed worked in four Local Government Areas (LGAs, districts) with high levels of conflict and inaccessibility: Guzamala, Bama, Ngala, and Kukawa. Interviews were continued until reaching a point of saturation where very little new information was gained from additional interviews.

Data collection and data management

Standardized interview guides were developed, pre-tested for clarity and relevance with relevant stakeholders and refined prior to data collection. Separate interview guides were developed for state level and district level interviews. Interviews were conducted between April and August 2020. All semi-structured telephone interviews were recorded and manually transcribed. Each interview was approximately one hour in length. Interview transcriptions were reviewed and cleaned for transcription errors prior to analysis. Reflective memos were produced immediately after each interview.

Data analysis

Data relevant to the study questions and constructs in the 15 documents were extracted using a tool in Microsoft Excel© to create a matrix for analysis by construct and document type. The document extracts were analyzed to better understand the constructs in the study, identify emerging themes, and assess consistency of information among reports as a measure of the reliability of the available data.

Interview data were analyzed using MaxQDA© 2018 software [ 32 ]. Data were coded using both a priori and emergent codes (i.e., grounded in the data) (Table 3 ) following a hybrid approach to coding [ 33 , 34 ]. Two rounds of coding were conducted to ensure that emerging codes and co-occurring codes were fully captured. Analysis was conducted using analytic memos, matrix displays, summary tables, code relations graphs, and code mapping to develop and describe themes and relationships in the data. Content from co-occurring codes was analyzed in further detail through summary tables and grids by organization. Results were compared and contrasted among respondents and respondent groups and also triangulated among interview data and reviewed documents to look for areas of convergence and divergence. Data analysis was conducted concurrently with data collection.
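As an illustration of what counting co-occurring codes involves, the sketch below tallies how often two codes are applied to the same transcript segment. It is not MaxQDA's implementation; the code names and segments are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded segments: each set holds the codes applied to one segment.
segments = [
    {"inaccessibility", "community informants"},
    {"inaccessibility", "GIS"},
    {"community informants", "negotiation", "inaccessibility"},
    {"GIS"},
]


def co_occurrence(coded_segments):
    """Count how many segments carry each unordered pair of codes."""
    counts = Counter()
    for codes in coded_segments:
        # Sorting gives every pair a single canonical ordering.
        for pair in combinations(sorted(codes), 2):
            counts[pair] += 1
    return counts


pairs = co_occurrence(segments)
for (code_a, code_b), n in pairs.most_common():
    print(f"{code_a} + {code_b}: {n}")
```

Pairs with high counts point to segments worth reading side by side, which is the role the summary tables and grids play in the analysis described above.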

Ethical considerations

This study posed little risk to the participants. Only program staff who were already deeply involved in the issues were included. Our study design purposely excluded interviewees who could have been exposed to personal risk from participation in the study, e.g. identified community informants. Informed consent was obtained from each participant prior to conducting the interviews. All responses were kept confidential and no identifying information was retained electronically. Ethical approval was obtained from the Government of Borno State Ethical Review Board, and the case study was determined non-research by the CDC Center for Global Health Human Subjects Research Office and the University of Illinois Ethical Review Board. The researchers affirm that ethical considerations must be paramount in a study of this nature, both in conducting the study and in drawing recommendations from the results for potential application in other conflict-torn areas, e.g. considering what risks there may be to local populations in involving particular actors such as local informants or the military in surveillance.

Role of the funding source

Funds provided by the US Centers for Disease Control and Prevention (CDC) were used to cover the cost of transcribing the interviews. CDC also allowed the primary investigator (a CDC employee) to work on this project during his working hours. The primary investigator conducted the study and made the decision to submit the manuscript for publication while working for CDC. CDC as an agency did not provide input into the study design or analysis.

Findings from the document review and interviews were consistent; the respondents’ interviews were very consistent and highly detailed (Table 4 ). The main inhibitors of surveillance in the conflict areas of Borno State were inaccessibility, and the destruction of both the health care infrastructure and the communication network; respondents unanimously reported that there were no functional health facilities and no cellular network in those areas. The traditional polio surveillance system relies on active surveillance in facilities, passive reporting, and prompt communication and could not function in the inaccessible areas. Figure  2 displays the accessibility by ward (sub-district) in Borno state as of December 2020. Other important challenges to the traditional AFP surveillance system, including traumatizing violence and widespread malnutrition, were considered surmountable. Population movement was viewed as a potential surveillance advantage because migrating families were primarily fleeing inaccessible areas to accessible areas, where they could more easily be captured in the surveillance system.

Respondent 3: “Up to 45% of the state geographic area remain inaccessible. Take for example, there are 27 local governments in the state, only 6 are fully accessible” … “populations living in those areas cannot be reached by the regular teams that conduct AFP surveillance and surveillance for other vaccines preventable diseases. So, some populations are trapped there”

Respondent 1: “So, all those health facilities in those trapped communities have been destroyed.”

Respondent 4: “in those inaccessible areas, communication structures has been destroyed, so GSM networks (Footnote 2) are not available. You won't be able to communicate on phone in those areas.”

Fig. 2 Accessibility in Borno State, December 2020 (provided by the Borno polio Emergency Operations Center)

Three strategies were found to be effective in overcoming these challenges: (1) use of local community informants to conduct surveillance in inaccessible areas; (2) local-level negotiation with insurgency groups to bring children with paralysis to accessible areas for investigation and sample collection; and (3) use of GIS technology to estimate the size and location of the population in inaccessible places (satellite imagery) and to track progress in surveillance. Together, these provided strong cumulative evidence of the absence of WPV transmission in Borno state. Other strategies discussed, but not emphasized by the respondents, were collection and testing of stool specimens from healthy children from inaccessible areas, collaboration with security forces, profiling newly arrived displaced persons, and accessing nomadic populations for surveillance.

Use of local community informants for surveillance

Lay adults who resided in or were able to enter inaccessible areas were recruited as community informants in inaccessible areas (CIIAs) to search for children with suspected AFP. CIIAs were recruited through a snowball approach and included hunters, traders, nomads, and others identified at markets who were uninvolved in government programs, to protect them from anti-government sentiment. No stipend was provided; CIIAs were given an allowance after attending monthly meetings. The settlements they visited depended on whether they could indeed negotiate access. Their exact activities depended on the security risk level in the areas they reached, from simply observing children to directly asking adults if they had any paralyzed children in their or neighboring households. A separate coordination system to monitor CIIAs was set up with ward and LGA coordinators who were also intentionally distanced from the polio program to protect them from anti-government sentiment. Respondents agreed that CIIAs were reaching most, but not all settlements in inaccessible areas. Challenges discussed included reporting of false AFP cases, late reporting, additional costs required to collect specimens, and the inability to directly supervise the work of the informants.

Respondent 5: “the major strength really lies on the ability of the informants to be able to navigate into these inaccessible areas, to be able to interact with the caregivers without any problem.”

Local level negotiation for evacuation of AFP cases

Most respondents (12/16) discussed the strategy of temporarily evacuating children with suspected AFP for confirmation and investigation. Given that CIIAs were not health workers and often illiterate, and inaccessible areas had no electricity, the most feasible but sometimes dangerous approach for collecting specimens and conducting case investigations and clinical examinations was to bring the patient to an accessible area of Borno. Funds were pre-positioned at LGAs to cover lodging, meals, and medical care costs, which played a large role in persuading families to agree to evacuation. While this strategy greatly improved case investigation, cases were often investigated late after onset due to the challenges of evacuation, including travel by foot or horse-drawn cart. It is also not clear if all children with suspected AFP were evacuated; there was no system in place for recording information about suspected AFP in children who could not be evacuated. Of note, many respondents explained that the work of the CIIAs, including evacuation of cases, required direct negotiation with the insurgents at the local level. Several respondents emphasized the importance of CIIAs having established the trust of local insurgent actors.

Respondent 12: "The community informants have been able to gain the trust of the community. So, even if a child of a terrorist needs to be evacuated, these guys can still go ahead and do the vaccination, because they have been trusted, they cannot be attacked. But if a soldier, a military man approaches those communities, the terrorists or the bad boys can engage them in a fight."

Use of GIS technology

Respondents enthusiastically described the benefits of GIS technology for implementing and monitoring surveillance in inaccessible areas. The methods of satellite imagery analysis for assessing populations in Borno have been described elsewhere [ 13 ]. Before the use of satellite imagery, there was conflicting information on the size and location of populations remaining in inaccessible areas. Satellite imagery allowed estimation of inhabitation, population size and precise location of settlements in the inaccessible areas. Over 12,000 settlements in the inaccessible districts were regularly analyzed using satellite imagery to estimate the inaccessible population, prioritize areas for implementing surveillance and vaccination activities, track progress in reaching the population, and advocate with security forces for support in reaching inaccessible populations if needed.

Most respondents discussed the value of GPS-enabled phones as an accountability tool for tracing and documenting the places CIIAs visited, although several reported logistical difficulties in providing phones to CIIAs. This led to the development of a modified surveillance monitoring system focused on process indicators including the number of settlements reached and the number of AFP cases detected and investigated. The monitoring system relied heavily on GIS technology to regularly map the reach of the program and produce reports for program planning (Fig.  3 ). A diverse data team worked in an ongoing process of refining the system and analyzing and reporting the monitoring data. The polio Emergency Operations Center in Borno facilitated strong collaboration across organizations involved in the polio program and the humanitarian response.

Respondent 1: “We use satellite imagery to estimate population, population usually in trapped areas…. And that has really been helpful in the program.”

Respondent 3: “The most important tool is the Geo-Location Tracking Systems, which I call the GTS. That shows that the person has been to a settlement. He cannot be somewhere else and then the geo-location system would show somewhere else. So, the next monitoring system is the geo-location monitoring system that is being used to show that they have visited the community itself.”

Respondent 16: “so being able to use the tracking phones to add another layer of accountability, I think has been extremely valuable. So you can make sure that if somebody says they reach, they reached a settlement… Well, you can see. Alright. Did you actually go there? Did you actually spend enough time to do what you said you did?”
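The accountability check the respondents describe (was the settlement actually reached, and was enough time spent there) can be sketched as follows. This is not the program's tracking system; the coordinates, the 500 m radius, and the 30-minute dwell time are hypothetical values chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def visit_verified(track, settlement, radius_m=500, min_minutes=30):
    """Return True if the track has points within radius_m of the settlement
    spanning at least min_minutes.

    track: list of (minutes_since_start, lat, lon) tuples.
    Simplification: dwell time is the span between the first and last
    in-radius points, so leaving and returning is not distinguished.
    """
    inside = [t for t, lat, lon in track
              if haversine_m(lat, lon, *settlement) <= radius_m]
    return bool(inside) and (max(inside) - min(inside)) >= min_minutes


# Hypothetical settlement centroid and two hypothetical phone tracks.
settlement = (12.0, 13.5)
stay_track = [(0, 12.05, 13.5), (20, 12.001, 13.5),
              (60, 12.002, 13.501), (90, 12.05, 13.55)]
pass_through_track = [(0, 12.05, 13.5), (10, 12.001, 13.5), (15, 12.05, 13.5)]

print(visit_verified(stay_track, settlement))
print(visit_verified(pass_through_track, settlement))
```

The first track stays near the settlement for 40 minutes and passes the check; the second only passes through and does not.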

Fig. 3 Map of surveillance visits in Borno since 2014 as of April 2020 (provided by the Borno polio Emergency Operations Center)

Key findings

Our case study found that the major challenges to standard AFP surveillance activities in conflict-affected areas of Borno state were inaccessibility due to insecurity and the complete destruction of health and communication infrastructure. The most effective strategies to overcome these challenges were the recruitment of community informants with access to inaccessible areas, evacuation of AFP patients for investigation and specimen collection, and use of GIS technology for estimating the population size and location of the inaccessible settlements and tracking surveillance visits in the inaccessible areas. Since 2019 the polio program in Africa has started using GIS more widely for polio surveillance. However, its use is still limited in areas of armed conflict. Implementation of these strategies involves risk and requires a careful balancing of the safety of the local actors with the achievement of public health goals. Although the surveillance data for Borno, as a critical geography, was sufficient for certification of the eradication of indigenous WPV from the World Health Organization (WHO) Region of Africa in August 2020, the remaining challenges include pockets of settlements still unreached by vaccination and surveillance activities, uncertain regularity and quality of surveillance in the inaccessible areas, and challenges with investigating contacts of AFP cases and conducting 60-day follow up examinations when case specimens cannot be promptly collected.

Traditional performance monitoring for polio surveillance relies heavily on tracking the rate of non-polio (NP) AFP detection in children under 15 years of age [ 35 ]. However, monitoring this rate is less useful in areas of armed conflict and insecurity because the populations in those areas are often small and of uncertain size, have severely limited health care access, and have a low likelihood of reporting a background NP AFP case every year. In addition, the NP AFP rate assumes a relatively homogenous level of AFP detection in a given area, and low case detection in inaccessible areas may be masked by high detection in accessible areas within the same administrative area. The risk of assumed homogeneity in the surveillance performance indicators for LGAs within Borno state can be seen in the premature decision by WHO to remove Nigeria from the list of WPV-endemic countries in 2015, after one year without any WPV detection [ 36 ]. Revisions in the performance monitoring system for surveillance in inaccessible areas were necessary, focusing on accurately identifying the populations at risk and using process indicators and GPS tracking of surveillance visits in the inaccessible areas.
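A short worked example shows how an aggregate NP AFP rate can mask the absence of detection in inaccessible areas. The rate is computed here per 100,000 children under 15 per year; the case counts and populations are hypothetical and are not taken from the study.

```python
def np_afp_rate(cases, children_under_15):
    """Non-polio AFP cases per 100,000 children under 15 for one year."""
    if children_under_15 <= 0:
        raise ValueError("population must be positive")
    return cases / children_under_15 * 100_000


# Hypothetical one-year figures for the two parts of a single district.
accessible = {"cases": 12, "children_under_15": 300_000}
inaccessible = {"cases": 0, "children_under_15": 100_000}

rate_accessible = np_afp_rate(**accessible)
rate_inaccessible = np_afp_rate(**inaccessible)
rate_combined = np_afp_rate(
    accessible["cases"] + inaccessible["cases"],
    accessible["children_under_15"] + inaccessible["children_under_15"],
)

print(f"accessible:   {rate_accessible:.1f} per 100,000")
print(f"inaccessible: {rate_inaccessible:.1f} per 100,000")
print(f"combined:     {rate_combined:.1f} per 100,000")
```

With these figures the district-level rate is about 3 per 100,000 even though no cases at all were detected among the 100,000 children in the inaccessible part, which is why the indicator needs to be stratified by accessibility.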

Recommendations

To further improve surveillance performance in the inaccessible areas of Borno State, we recommend developing systems to: (1) report and track suspected AFP cases that are not evacuated for investigation; (2) track the regularity of surveillance visits by CIIAs and categorize settlements by frequency of visits; (3) track the collection of specimens from contacts of AFP cases when specimen collection from patients is not timely; (4) continually enumerate the number of children < 15 years of age unreached by surveillance and < 5 years of age unreached by vaccination using GIS tracking data and satellite imagery analysis; and (5) use this AFP surveillance approach to detect other priority diseases in the inaccessible areas.
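Recommendations (2) and (4) could be supported by a simple tabulation such as the sketch below. The category thresholds, settlement names, visit counts, and population estimates are all hypothetical; in practice the inputs would come from the GIS tracking data and satellite imagery analysis.

```python
from collections import Counter


def visit_category(visits_in_year):
    """Classify a settlement by surveillance visits in the past year.

    Thresholds are illustrative assumptions, not program standards.
    """
    if visits_in_year == 0:
        return "unreached"
    if visits_in_year < 4:
        return "sporadic"
    return "regular"


# Hypothetical settlement records.
settlements = [
    {"name": "S1", "visits": 0, "est_children_under_15": 120},
    {"name": "S2", "visits": 2, "est_children_under_15": 300},
    {"name": "S3", "visits": 9, "est_children_under_15": 450},
    {"name": "S4", "visits": 0, "est_children_under_15": 80},
]

by_category = Counter(visit_category(s["visits"]) for s in settlements)
unreached_children = sum(
    s["est_children_under_15"] for s in settlements if s["visits"] == 0
)

print(dict(by_category))
print(f"children < 15 in unreached settlements: {unreached_children}")
```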

This study suggests useful approaches for other areas of armed conflict. The progress in Borno required sustained efforts with full financial backing, constant innovation, collaboration among partners, attention to data accuracy, and a focus on accountability and transparency. The use of local-level negotiation by community actors to expand access may be useful in other settings where higher-level negotiations are not successful. Collaboration with security forces can be useful for some areas where civilian staff cannot work safely. Furthermore, the development of novel strategies and monitoring systems demonstrated how a bottom-up approach to partner collaboration in Borno was employed to achieve a common goal through innovation, collaboration, attention to data, and accountability. This approach may serve as a model for how staff from government, international agencies, non-governmental organizations and community members can work together. Finally, GIS is a very powerful tool for assessing inhabitation status of settlements in conflict areas and for tracking interventions.

Limitations

This study is subject to several limitations. Only publicly available documents were analyzed. The use of a sole researcher to collect data may have led to subjectivity in the sample selection and analysis. As mentioned above, to address this, several strategies to strengthen validity were employed including development of a robust sampling strategy, triangulation among data sources, use of a second coder, member checking, and peer debriefing. The interviews were limited to 16 respondents, although interviews continued until the point of response saturation. In addition, some cadres of staff sought in the sampling frame did not participate. Finally, it would have been useful to directly interview community-level respondents; however, they were excluded because of the vulnerability of that population.

Conclusions

This study found that, even in the most insecure and inaccessible areas of Borno State, Nigeria, it was possible to conduct sensitive public health surveillance using modified approaches. In August 2020 the countries of the World Health Organization Africa Region were certified as free from WPV, an achievement that rested largely on the vaccination and surveillance activities conducted in the conflict-affected areas of Nigeria, particularly Borno state [ 10 ]. This study revealed a very effective system of collaboration to address an adaptive problem with no easy solutions. The approach used in Borno, along with some of the specific strategies of local negotiated access, collaboration with security forces, and use of GIS technology, may be useful for other public health interventions in areas of armed conflict.

Availability of data and materials

The de-identified datasets used and/or analyzed during the current study are not publicly available due to the sensitive nature of this topic but are available from the corresponding author upon reasonable request. The full report from this study as well as the study protocol are also available from the corresponding author upon reasonable request.

“The sequence of the complete VP1 surface protein coding region is determined by using automated cycle-sequencing procedures described previously [ 6 ] and by comparing the resulting sequences with those in a database of all recent poliovirus isolates. The origins and routes of virus importation are then derived from phylogenetic analysis.”

Cellular phone networks.

Hull HF, et al. Progress toward global polio eradication. J Infect Dis. 1997;175(Supplement_1):S4–9.

Bigouette JP, Wilkinson AL, Tallis G, Burns CC, Wassilak SG, Vertefeuille JF. Progress toward polio eradication—worldwide, January 2019–June 2021. MMWR Morb Mortal Wkly Rep. 2021;70(34):1129–35. https://doi.org/10.15585/mmwr.mm7034a1 .

Chard AN, et al. Progress toward polio eradication—worldwide, January 2018–March 2020. Morb Mortal Wkly Rep. 2020;69(25):784.

Surveillance standards for vaccine-preventable diseases, second edition. Geneva: World Health Organization; 2018. License: CC BY-NC-SA 3.0 IGO.

Nnadi C, Etsano A, Uba B, Ohuabunwo C, Melton M, WaNganda G, Esapa L, Bolu O, Mahoney F, Vertefeuille J, Wiesen E, Durry E. Approaches to vaccination among populations in areas of conflict. J Infect Dis. 2017;216(suppl_1):S368–72. https://doi.org/10.1093/infdis/jix175 .

Liu HM, Zheng DP, Zhang LB, et al. Molecular evolution of a type 1 wild-vaccine poliovirus recombinant during widespread circulation in China. J Virol. 2000;74:11153–61.

Centers for Disease Control and Prevention (CDC). Resurgence of wild poliovirus type 1 transmission and consequences of importation–21 countries, 2002–2005. MMWR Morb Mortal Wkly Rep. 2006;55(6):145–50.

Global ID, World Health Organization. Wild poliovirus type 1 and type 3 importations-15 countries, Africa, 2008–2009. Morb Mortal Wkly Rep. 2009;58(14):357–62.

Tangermann RH, Hull HF, Jafari H, Nkowane B, Everts H, Aylward RB. Eradication of poliomyelitis in countries affected by conflict. Bull World Health Organ. 2000;78(3):330–8.

Leke RGF, et al. Certifying the interruption of wild poliovirus transmission in the WHO African region on the turbulent journey to a polio-free world. Lancet Glob Health. 2020;8:e1345–51.

Amao O. A decade of terror: revisiting Nigeria’s interminable Boko Haram insurgency. Secur J. 2020;33:357–75. https://doi.org/10.1057/s41284-020-00232-8 .

United Nations Office for the Coordination of Humanitarian Affairs. North-east Nigeria Humanitarian Situation Update October 2017. https://reliefweb.int/sites/reliefweb.int/files/resources/24112017_ocha_humanitarian_situation_update.pdf . Accessed 14 Dec 2017.

Higgins J, Adamu U, Adewara K, et al. Finding inhabited settlements and tracking vaccination progress: the application of satellite imagery analysis to guide the immunization response to confirmation of previously-undetected, ongoing endemic wild poliovirus transmission in Borno State, Nigeria. Int J Health Geogr. 2019;18:11. https://doi.org/10.1186/s12942-019-0175-y .

Yin RK. The abridged version of case study research: design and method. In: Bickman L, Rog DJ, editors. Handbook of applied social research methods. Sage Publications Inc; 1998. p. 229–59.

Spall S. Peer debriefing in qualitative research: emerging operational models. Qual Inq. 1998;4(2):280–92.

Bush K. Polio, war and peace. Bull World Health Organ. 2000;78(3):281–2.

Nnadi C, Etsano A, Uba B, Ohuabunwo C, Melton M, WaNganda G, Esapa L, Bolu O, Mahoney F, Vertefeuille J, Wiesen E, Durry E. Approaches to vaccination among populations in areas of conflict. J Infect Dis. 2017;216(suppl_1):S368–72.

Dil Y, Strachan D, Cairncross S, Korkor AS, Hill Z. Motivations and challenges of community-based surveillance volunteers in the northern region of Ghana. J Community Health. 2012;37(6):1192–8.

Curry D, Bisrat F, Coates E, Altman P. Reaching beyond the health post: Community-based surveillance for polio eradication. Dev Pract. 2013;23(1):69–78. https://doi.org/10.1080/09614524.2013.753410 .

WHO-recommended standards for surveillance of selected vaccine preventable diseases. 1999 https://extranet.who.int/iris/restricted/bitstream/handle/10665/64165/WHO_EPI_GEN_98.01_Rev.2.pdf?sequence=1&isAllowed=y . Accessed 26 Jan 2022.

Mbaeyi C, Kamadjeu R, Mahamud A, Webeck J, Ehrhardt D, Mulugeta A. Progress toward polio eradication—Somalia, 1998–2013. J Infect Dis. 2014;210(suppl_1):S173–80.

Centers for Disease Control and Prevention (CDC). Progress toward interrupting wild poliovirus circulation in countries with reestablished transmission–Africa, 2009–2010. MMWR Morb Mortal Wkly Rep. 2011;60(10):306.

Health Sector Nigeria. Health sector bulletin. Northeast Nigeria Humanitarian Response. August 2018. https://www.humanitarianresponse.info/sites/www.humanitarianresponse.info/files/documents/files/ne_nigeria_health_sector_bulletin_8_august_2018.pdf . Accessed 23 Sept 2018.

Bolu O, et al. Progress toward poliomyelitis eradication—Nigeria, January–December 2017. Morb Mortal Wkly Rep. 2018;67(8):253.

Hamisu AW, et al. Strategies for improving polio surveillance performance in the security-challenged Nigerian States of Adamawa, Borno, and Yobe during 2009–2014. J Infect Dis. 2016;213(suppl_3):S136–9.

Hussain SF, Boyle P, Patel P, Sullivan R. Eradicating polio in Pakistan: an analysis of the challenges and solutions to this security and health issue. Global Health. 2016;12(1):63.

Tambini G, Andrus JK, Marques E, Boshell J, Pallansch M, de Quadros OA, Kew O. Direct detection of wild poliovirus circulation by stool surveys of healthy children and analysis of community wastewater. J Infect Dis. 1993;168(6):1510–4.

Hovi T, Shulman LM, Van der Avoort H, Deshpande J, Roivainen M, De Gourville EM. Role of environmental poliovirus surveillance in global polio eradication and beyond. Epidemiol Infect. 2012;140(1):1–13.

International Organization of Migration. Displacement tracking matrix Nigeria. Round XIX Report—October 2017 Nigeria. https://drive.google.com/file/d/0B841q6qT8kS_MVpmRUowdkNrZ0k/view . Accessed 26 Jan 2022.

Patton MQ. Sampling, Qualitative (Purposive). In: Ritzer G, editor. The Blackwell encyclopedia of sociology. 2007. https://doi.org/10.1002/9781405165518.wbeoss012 .

CDC. The Global Polio Eradication Initiative Stop Transmission of Polio (STOP) Program—1999–2013. MMWR Morb Mortal Wkly Rep. 2013;62:501–3.

VERBI Software. 2018. MAXQDA 2018 [computer software]. Berlin, Germany: VERBI Software. https://www.maxqda.com/ .

Brixey JJ, Robinson DJ, Johnson CW, Johnson TR, Turley JP, Patel VL, Zhang J. Towards a hybrid method to categorize interruptions and activities in healthcare. Int J Med Informatics. 2007;76(11–12):812–20.

Fereday J, Muir-Cochrane E. Demonstrating rigor using thematic analysis: a hybrid approach of inductive and deductive coding and theme development. Int J Qual Methods. 2006;5(1):80–92.

Patel JC, Diop OM, Gardner T, Chavan S, Jorba J, Wassilak S, Ahmed J, Snider CJ. Surveillance to track progress toward polio eradication—worldwide, 2017–2018. MMWR Morb Mortal Wkly Rep. 2019;68(13):312–8. https://doi.org/10.15585/mmwr.mm6813a4 .

Morales M, Tangermann RH, Wassilak SG. Progress toward polio eradication—worldwide, 2015–2016. Morb Mortal Wkly Rep. 2016;65(18):470–3.

Acknowledgements

Disclaimers.

The findings represent the personal views of the authors and not the official position of the U.S. Centers for Disease Control and Prevention.

This work was supported by the Centers for Disease Control and Prevention. Funding covered the cost of the corresponding author’s staff time as well as the cost of transcribing the interviews.

Author information

Authors and affiliations.

US Centers for Disease Control and Prevention, Atlanta, USA

Eric Wiesen, Jeff Higgins, Joseph Forbi, Omotayo Bolu & Gatei WaNganda

World Health Organization, Maiduguri, Borno State, Nigeria

Raymond Dankoli

National Stop Transmission of Polio, Abuja, Nigeria

Melton Musa, Jibrin Idris, Ndadilnasiya Waziri & Oladapo Ogunbodede

National Primary Health Care Development Agency, Abuja, Nigeria

Kabiru Mohammed & Usman Adamu

University of Illinois at Chicago, Chicago, USA

Eve Pinsker

Contributions

EW: conceptualization, data collection, analysis, interpretation, draft of manuscript. RD: interpretation, revision of manuscript. MM: conceptualization, data collection, analysis, interpretation, draft of manuscript. JH: interpretation, revision of manuscript. NW: data collection, interpretation, revision of manuscript. OO: data collection, interpretation, revision of manuscript. KM: interpretation, revision of manuscript. UA: data collection, interpretation, revision of manuscript. EP: conceptualization, analysis, revision of manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Eric Wiesen.

Ethics declarations

Ethical approval and consent to participate.

Ethical approval was obtained from the Government of Borno State Ethical Review Board, and the case study was determined non-research by the CDC Center for Global Health Human Subjects Research Office and the University of Illinois Ethical Review Board. All interviewees provided consent to participate in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare that there are no conflicts of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Wiesen, E., Dankoli, R., Musa, M. et al. Conducting public health surveillance in areas of armed conflict and restricted population access: a qualitative case study of polio surveillance in conflict-affected areas of Borno State, Nigeria. Confl Health 16, 20 (2022). https://doi.org/10.1186/s13031-022-00452-2

Received: 31 August 2021

Accepted: 14 April 2022

Published: 07 May 2022

DOI: https://doi.org/10.1186/s13031-022-00452-2

Conflict and Health

ISSN: 1752-1505

Harvard T.H. Chan School of Public Health Case-Based Teaching & Learning Initiative

Teaching cases & active learning resources for public health education

Case library

The Harvard Chan Case Library is a collection of teaching cases with a public health focus, written by Harvard Chan faculty, case writers, and students, or in collaboration with other institutions and initiatives.

Use the filters at right to search the case library by subject, geography, health condition, and representation of diversity and identity to find cases to fit your teaching needs. Or browse the case collections below for our newest cases, cases available for free download, or cases with a focus on diversity. 

Using our case library

Access to cases.

Many of our cases are available for sale through Harvard Business Publishing in the Harvard T.H. Chan case collection. Others are free to download through this website.

Cases in this collection may be used free of charge by Harvard Chan course instructors in their teaching. Contact Allison Bodznick, Harvard Chan Case Library administrator, for access.

Access to teaching notes

Teaching notes are available as supporting material to many of the cases in the Harvard Chan Case Library. Teaching notes provide an overview of the case and suggested discussion questions, as well as a roadmap for using the case in the classroom.

Access to teaching notes is limited to course instructors only.

  • Teaching notes for cases available through Harvard Business Publishing may be downloaded after registering for an Educator account.
  • To request teaching notes for cases that are available for free through this website, look for the "Teaching note available for faculty/instructors" link accompanying the abstract for the case you are interested in; you'll be asked to complete a brief survey verifying your affiliation as an instructor.

Using the Harvard Business Publishing site

Faculty and instructors with university affiliations can register for Educator access on the Harvard Business Publishing website, where many of our cases are available. An Educator account provides access to teaching notes, full-text review copies of cases, articles, simulations, course planning tools, and discounted pricing for your students.

What's New

Atkinson, M.K., 2023. Organizational Resilience and Change at UMass Memorial, Harvard Business Publishing: Harvard T.H. Chan School of Public Health. Available from Harvard Business Publishing.

Abstract: The UMass Memorial Health Care (UMMHC or UMass) case is an examination of the impact of crisis or high uncertainty events on organizations. As a global pandemic unfolds, the case examines the ways in which UMMHC manages crisis and poses questions around organizational change and opportunity for growth after such major events. The case begins with a background of UMMHC, including problems the organization was up against before the pandemic, then transitions to the impact of crisis on UMMHC operations and its subsequent response, and concludes with challenges that the organization must grapple with in the months and years ahead. A crisis event can occur at any time for any organization. Organizational leaders must learn to manage stakeholders both inside and outside the organization throughout the duration of crisis and beyond. Additionally, organizational decision-makers must learn how to deal with existing weaknesses and problems the organization had before crisis took center stage, balancing those challenges with the need to respond to an emergency, all the while not neglecting major existing problem points. This case is well-suited for courses on strategy determination and implementation, organizational behavior, and leadership.

The case describes the challenges facing Shlomit Schaal, MD, PhD, the newly appointed Chair of UMass Memorial Health Care’s Department of Ophthalmology. Dr. Schaal had come to UMass in Worcester, Massachusetts, in the summer of 2016 from the University of Louisville (KY) where she had a thriving clinical practice and active research lab, and was Director of the Retina Service. Before applying for the Chair position at UMass she had some initial concerns about the position but became fascinated by the opportunities it offered to grow a service that had historically been among the smallest and weakest programs in the UMass system and had experienced a rapid turnover in Chairs over the past few years. She also was excited to become one of a very small number of female Chairs of ophthalmology programs in the country. 

Dr. Schaal began her new position with ambitious plans and her usual high level of energy, but immediately ran into resistance from the faculty and staff of the department.  The case explores the steps she took, including implementing a LEAN approach in the department, and the leadership approaches she used to overcome that resistance and build support for the changes needed to grow and improve ophthalmology services at the medical center. 

This case describes efforts to promote racial equity in healthcare financing from the perspective of one public health organization, Community Care Cooperative (C3). C3 is a Medicaid Accountable Care Organization–i.e., an organization set up to manage payment from Medicaid, a public health insurance option for low-income people. The case describes C3’s approach to addressing racial equity from two vantage points: first, its programmatic efforts to channel financing into community health centers that serve large proportions of Black, Indigenous, People of Color (BIPOC), and second, its efforts to address racial equity within its own internal operations (e.g., through altering hiring and promotion processes). The case can be used to help students understand structural issues pertaining to race in healthcare delivery and financing, to introduce students to the basics of payment systems in healthcare, and/or to highlight how organizations can work internally to address racial equity.

Kerrissey, M.J. & Kuznetsova, M., 2022. Killing the Pager at ZSFG, Harvard Business Publishing: Harvard T.H. Chan School of Public Health case collection. Available from Harvard Business Publishing.

Abstract: This case is about organizational change and technology. It follows the efforts of one physician as they try to move their department past using the pager, a device that persisted in American medicine despite having long been outdated by superior communication technology. The case reveals the complex organizational factors that have made this persistence possible, such as differing interdepartmental priorities, the perceived benefits of simple technology, and the potential drawbacks of applying typical continuous improvement approaches to technology change. Ultimately the physician in the case is not able to rid their department of the pager, despite pursuing a thorough continuous improvement effort and piloting a viable alternative; the case ends with the physician having an opportunity to try again and asks students to assess whether doing so is wise. The case can be used in class to help students apply the general concepts of organizational change to the particular context of technology, to discuss the forces of stasis and change in medicine, and to familiarize students with the uses and limits of continuous improvement methods.

Yatsko, P. & Koh, H., 2021. Dr. Joan Reede and the Embedding of Diversity, Equity, and Inclusion at Harvard Medical School, Harvard T.H. Chan School of Public Health case collection. Available from Harvard Business Publishing.

Abstract: For more than 30 years, Dr. Joan Reede worked to increase the diversity of voices and viewpoints heard at Harvard Medical School (HMS) and at its affiliate teaching hospitals and institutes. Reede, HMS’s inaugural dean for Diversity and Community Partnership, as well as a professor and physician, conceived and launched more than 20 programs to improve the recruitment, retention, and promotion of individuals from racial and ethnic groups historically underrepresented in medicine (UiMs). These efforts have substantially diversified physician faculty at HMS and built pipelines for UiM talent into academic medicine and biosciences. Reede helped embed the promotion of diversity, equity, and inclusion (DEI) not only into Harvard Medical School’s mission and community values, but also into the DEI agenda in academic medicine nationally. To do so, she found allies and formed enduring coalitions based on shared ownership. She bootstrapped and hustled for resources when few readily existed. And she persuaded skeptics by building programs using data-driven approaches. She also overcame discriminatory behaviors and other obstacles synonymous with being Black and female in American society. Strong core values and sense of purpose were keys to her resilience, as well as to her leadership in the ongoing effort to give historically marginalized groups greater voice in medicine and science.

Cases Available for Free Download

Alidina, S., Paulus, J. & Kane, N.M., 2009. Malaria and DDT in Uganda, Harvard Business Publishing: Harvard T.H. Chan School of Public Health case collection. Download free of charge.

Abstract: In October 2008, Dr. Richard Mgaga, Head of the Malaria Control Programme in Uganda, reviewed the monthly malaria statistics report for the district of Apac, which in April of 2008 had undergone a pilot indoor residual spraying (IRS) program using DDT in a campaign to prevent mosquitoes from biting and spreading malaria. The campaign was halted by a court injunction requested by organic farmers, exporters and environmentalists in May 2008, and the injunction was upheld by the High Court in June. In early August, the Uganda Health Ministry began spraying a pyrethroid insecticide in place of DDT. Meanwhile the Ugandan Attorney General was challenging the High Court’s decision. Dr. Mugaga was under pressure from the Presidential Malaria Initiative (PMI) to undertake a full program of IRS in 300,000 households in the northern districts of Uganda, including Apac. However, he was unsure whether to proceed, given the opposition and apparent problems that surfaced when the Apac pilot was implemented. Teaching note available for faculty/instructors.

In February 2015, technical staff reviewed the results from a jointly conducted study on malaria control. This study had major implications for malaria in Zambia—and elsewhere. The preliminary analysis strongly suggested that the study’s Mass Drug Administration (MDA) strategy was reducing the incidence of malaria disease. In addition, MDA seemed to be driving down the infection reservoir among asymptomatic people in the study area of the Southern Province of Zambia. Further analysis with mathematical models indicated that if the intervention was sustained so current trends continued, then the MDA strategy would make it possible to eliminate malaria in the Southern Province. 

If malaria could be eliminated in one region of Zambia, that would provide new evidence and motivation to work towards elimination throughout the country, an ambitious goal. But it would not be easy to move from conducting one technical study in a single region to creating a national strategy for malaria elimination. The scientists realized that their new data and analyses—of malaria infections, mosquito populations, and community health worker activities—were not enough. A national malaria elimination effort would require mobilizing many partners, national and local leaders, and community members, and convincing them to get on board with this new approach. 

Teaching note available for faculty/instructors.

In the aftermath of the atrocities endured by the Cambodian people, Friends-International (FI) was established in 1994 to address some of the many protection needs faced by the country’s marginalized children and youth. In the intervening quarter century, FI has grown substantially, both in the scope and complexity of its operations. The organization’s core mission consists of providing comprehensive, innovative, and high quality services to children, youth, and their families, based on a child rights-based approach that informs all of the organization’s programs. FI has established a strong and highly respected presence in Cambodia, building social services for children, operating effective social businesses, and initiating the global ChildSafe Movement. Over time, they have expanded their community-based model to multiple countries. But amidst their expansion, FI has continued to face financial insecurity and a constantly shifting landscape of challenging child protection concerns. At what point might they have been trying to do too much, possibly unduly stretching themselves across too many sectors and borders? Innovation had been a core strength of FI, but was it always appropriate to innovate? The case addresses these common problems.

Gordon, R., 2014. Who Owns Your Story?, Harvard University: Global Health Education and Learning Incubator. Access online.

Abstract: This case uses a role play simulation to illustrate ethical implications when research practices violate cultural taboos and norms. In Who Owns Your Story? the Trilanyi, a fictional Native American tribe based on a real community that is not identified or located in the case, is adversely affected by a high prevalence of diabetes. They ask a university professor with whom they have a close relationship to study their tribe, and they agree to give samples of their blood, which they consider sacred, for the study. Tribe members signed a consent form to participate, but it was unclear whether they realized that the consent covered the university potentially using their blood for other possible research topics beyond diabetes. Ultimately, the study does not discover that the tribe has a genetic predisposition to diabetes. Years later, however, tribe members learn that their samples had also been used to study topics they considered objectionable. The case is based on true events between the Havasupai tribe and the University of Arizona, which ultimately led to a legal suit that was settled out of court. In the case, students are asked to develop and simulate role play negotiations toward an acceptable resolution for all the parties involved.

Focus on Diversity, Equity, and Inclusion

This case describes and explores the development of the first medical transitions clinic in Louisiana by a group of community members, health professionals, and students at Tulane Medical School in 2015.  The context surrounding health in metro New Orleans, the social and structural determinants of health, and mass incarceration and correctional health care are described in detail. The case elucidates why and how the Formerly Incarcerated Transitions (FIT) clinic was established, including the operationalization of the clinic and the challenges to providing healthcare to this population. The case describes the central role of medical students as case managers at the FIT clinic, and how community organizations were engaged in care provision and the development of the model.  The case concludes with a discussion of the importance of advocacy amongst health care professionals.

Guerra, I., et al., 2019. SALUDos: Healthcare for Migrant Seasonal Farm Workers, Harvard University: Social Medicine Consortium. Download free of charge.

Abstract: The SALUDos program began in 2008 as a response to an influx of migrant seasonal farm workers (MSFWs) at a mobile medical unit serving homeless persons in Santa Clara County in Northern California. The program offered patients free and low-cost primary care services, linkage to resources, and advocacy. As the farm workers involved in this program became more involved in their primary care, they advocated for evening hours, transportation, linkage to coverage programs, and health education resources to better understand their medical and psychological conditions. During continual modifications of the SALUDos program, the team sought to understand and address large-scale social forces affecting migrant health through interventions to mitigate health inequities. Teaching note available for faculty/instructors.

This module will present two unfolding case studies based on real-world, actual events. The cases will require participants to review videos embedded into three modules and a summary module:

  • Introduction to Concepts of Social Determinant of Health and Seeking Racial Equity
  • Case Study on Health and Healthcare Context - Greensboro Health Disparities Collaborative (GHDC)
  • Case Study on Social and Community Context - Renaissance Community Cooperative (RCC)
  • Summary (Optional)

The learning objectives for the modules are related to achieving the Healthy People 2020 Social Determinants of Health Objectives – specifically the (1) Health and Healthcare Context, and (2) Social and Community Context.   

Al Kasir, A., Coles, E. & Siegrist, R., 2019. Anchoring Health beyond Clinical Care: UMass Memorial Health Care’s Anchor Mission Project, Harvard Business Publishing: Harvard T.H. Chan School of Public Health case collection. Available from Harvard Business Publishing.

Abstract: As the Chief Administrative Officer of UMass Memorial Health Care (UMMHC) and president of UMass Memorial (UMM) Community Hospitals, Douglas Brown had just received unanimous and enthusiastic approval to pursue his "Anchor Mission" project at UMMHC in Worcester, Massachusetts. He was extremely excited by the board's support, but also quite apprehensive about how to make the Anchor Mission a reality. Doug had spearheaded the Anchor Mission from its earliest exploratory efforts. The goal of the health system's Anchor Mission (an idea developed by the Democracy Collaborative, an economic think tank) was to address the social determinants of health in its community beyond the traditional approach of providing excellent clinical care. He had argued that UMMHC had an obligation as the largest employer and economic force in Central Massachusetts to consider the broader development of the community and to address non-clinical factors, like homelessness and social inequality, that made people unhealthy. To achieve this goal, UMMHC's Anchor Mission would undertake three types of interventions: local hiring, local sourcing/purchasing, and place-based community investment projects. While the board's enthusiasm was palpable and inspiring, Doug knew that sustaining it would require concrete accomplishments and a positive return on any investments the health system made in the project. The approval was just the first step. Innovation and new ways of thinking would be necessary. The bureaucracy behind a multi-billion-dollar healthcare organization would need to change. Even the doctors and nurses would need to change! He knew that the project had enormous potential but would become even more daunting from here.

Johnson, P. & Gordon, R., 2013. Hauwa Ibrahim: What Route to Change?, Harvard University: Global Health Education and Learning Incubator. Access online.

Abstract: This case explores Nigerian attorney Hauwa Ibrahim’s defense of a woman charged with adultery under Islamic Shariah law. One of Nigeria’s first female lawyers, Ibrahim develops a strategy to defend a young married woman, Amina Lawal, against adultery charges that could, if the court judged against her, result in her death. While many Western non-governmental organizations and advocacy groups viewed Lawal’s case as an instance of human rights abuse and called for an abolition of the Shariah-imposed punishment, Ibrahim instead chose to see an opportunity for change within a system that many – especially cultural outsiders – viewed as oppressive. Ibrahim challenged the dominant paradigm by working within it to create change that would eventually reverberate beyond one woman’s case. Willing to start with a framework that saw long-term opportunity and possibility, Ibrahim developed a very measured change approach and theory framed in seven specific principles. Additionally, Ibrahim’s example of challenging her own internal paradigms while also insisting that others do the same invites students to examine their own internal systems and paradigms.

Browse our case library

Singer, S. , 2013. Surgical Safety Simulation Exercise , Harvard T.H. Chan School of Public Health. Abstract In this simulation exercise, students are given the opportunity to think critically about the role of motivation and organizational context in implementing a process innovation. Students work in teams of four to six people to develop recommendations for a hospital president on the best ways to implement a surgical safety checklist. Simulation available upon request from author .

Yatsko, P. & Koh, H. , 2017. Dr. Jonathan Woodson, Military Health System Reform, and National Digital Health Strategy , Harvard Business Publishing: Harvard T.H. Chan School of Public Health case collection. Available from Harvard Business Publishing Abstract Dr. Jonathan Woodson faced more formidable challenges than most in his storied medical, public health, and military career, starting with multiple rotations in combat zones around the world. He subsequently took on ever more complicated assignments, including reforming the country’s bloated Military Health System (MHS) in his role as assistant secretary of defense for health affairs at the U.S. Department of Defense from 2010 to 2016. As the director of Boston University’s Institute for Health System Innovation and Policy starting in 2016, he devised a National Digital Health Strategy (NDHS) to harness the myriad disparate health care innovations taking place around the country, with the goal of making the U.S. health care system more efficient, patient-centered, safe, and equitable for all Americans. How did Woodson—who was also a major general in the U.S. Army Reserves and a skilled vascular surgeon—approach such complicated problems? In-depth research and analysis, careful stakeholder review, strategic coalition building, and clear, insightful communication were some of the critical leadership skills Woodson employed to achieve his missions.

In 2011, in response to two high-profile cases of maternal death during labor and delivery, Ugandan citizens mobilized to prevent maternal mortality by improving the delivery of healthcare services in public hospitals. The Coalition to Stop Maternal Mortality ignited a social movement by utilizing strategic advocacy to hold the Government of Uganda accountable to its constitutional provisions on health service delivery. This case examines the Coalition to Stop Maternal Mortality and its landmark legal initiative, Constitutional Petition No. 16 of 2011, that focused the nation’s attention on the state of health services in Uganda and initiated a nationwide conversation about the role of government in delivering the right to health for all Ugandans. What tactics and strategies can effectively mobilize power to bring about legal and policy change? Would these be enough to achieve the change that the Coalition sought?

When Dr. Marwan started as director of Ramses Hospital in Cairo in 2008, charged by the Minister of Health with improving performance, he found the hospital had been neglected for decades. A Ministry of Health quality audit had recently given the hospital the worst score of the five hospitals designated as critical to the greater Cairo area. 

Dr. Marwan vowed that Ramses Hospital would come in first in the next round of quality audits. Without improving its quality scores, the hospital would be unable to pass the accreditation process required for hospital participation in a new universal social health insurance scheme. In addition—and just as critically—Dr. Marwan needed to develop a longer-term strategy for obtaining the considerable additional resources required to upgrade the long-neglected facility.

Quelch, J.A. & Xia, Q. , 2015. AIP Healthcare Japan: Investing in Japan's Retirement Home Market , Harvard Business Publishing. Available from Harvard Business Publishing Abstract The CEO of a health care-based REIT is considering alternative nursing home investment strategies. Students must consider macro-industry trends, scale and scope issues and consumer segmentation data in making their recommendations.

Cash, R., et al. , 2009. Casebook on ethical issues in international health research , World Health Organization. Publisher's Version Abstract This casebook published by the World Health Organization contains 64 case studies, each of which raises an important and difficult ethical issue connected with planning, reviewing, or conducting health-related research. Available for download free of charge from the World Health Organization in English, Arabic, Russian, and Spanish.

Quigley, K. & Kane, N.M. , 2014. Hillside Hospital: Physician-Led Planning (Parts A & B) , Harvard Business Publishing: Harvard T.H. Chan School of Public Health case collection. Available from Harvard Business Publishing Abstract Bill Hurt, the new CEO of Hillside Hospital, knew that the forecast was grim. The hospital’s service volumes and market share were dropping precipitously. The direct causes of the problems were numerous, but an indirect cause—and major barrier to addressing the declines—lay in the poor relationship between Hillside’s management and its physicians. To engage the physicians in problem-solving, he considered a consultant’s suggestion: turning the clinical planning process over to the physicians, with management involved only at their request. This approach was risky but Bill thought sometimes you got power by giving it up. 

Conducting public health surveillance in areas of armed conflict and restricted population access: a qualitative case study of polio surveillance in conflict-affected areas of Borno State, Nigeria

Affiliations.

  • 1 US Centers for Disease Control and Prevention, Atlanta, USA.
  • 2 World Health Organization, Maiduguri, Borno State, Nigeria.
  • 3 National Stop Transmission of Polio, Abuja, Nigeria.
  • 4 US Centers for Disease Control and Prevention, Atlanta, USA.
  • 5 National Primary Health Care Development Agency, Abuja, Nigeria.
  • 6 University of Illinois at Chicago, Chicago, USA.
  • PMID: 35526017
  • PMCID: PMC9077905
  • DOI: 10.1186/s13031-022-00452-2

This study examined the impact of armed conflict on public health surveillance systems, the limitations of traditional surveillance in this context, and innovative strategies to overcome these limitations. A qualitative case study was conducted to examine the factors affecting the functioning of poliovirus surveillance in conflict-affected areas of Borno state, Nigeria using semi-structured interviews of a purposeful sample of participants. The main inhibitors of surveillance were inaccessibility, the destroyed health infrastructure, and the destroyed communication network. These three challenges created a situation in which the traditional polio surveillance system could not function. Three strategies to overcome these challenges were viewed by respondents as the most impactful. First, local community informants were recruited to conduct surveillance for acute flaccid paralysis in children in the inaccessible areas. Second, the informants engaged in local-level negotiation with the insurgency groups to bring children with paralysis to accessible areas for investigation and sample collection. Third, GIS technology was used to track the places reached for surveillance and vaccination and to estimate the size and location of the inaccessible population. A modified monitoring system tracked tailored indicators including the number of places reached for surveillance and the number of acute flaccid paralysis cases detected and investigated, and utilized GIS technology to map the reach of the program. The surveillance strategies used in Borno were successful in increasing surveillance sensitivity in an area of protracted conflict and inaccessibility. This approach and some of the specific strategies may be useful in other areas of armed conflict.

© 2022. The Author(s).

Grants and funding

  • 001/WHO_/World Health Organization/International

Volume 29, Number 2—February 2023

Sentinel Surveillance System Implementation and Evaluation for SARS-CoV-2 Genomic Data, Washington, USA, 2020–2021

Genomic data provides useful information for public health practice, particularly when combined with epidemiologic data. However, sampling bias is a concern because inferences from nonrandom data can be misleading. In March 2021, the Washington State Department of Health, USA, partnered with submitting and sequencing laboratories to establish sentinel surveillance for SARS-CoV-2 genomic data. We analyzed available genomic and epidemiologic data during presentinel and sentinel periods to assess representativeness and timeliness of availability. Genomic data during the presentinel period was largely unrepresentative of all COVID-19 cases. Data available during the sentinel period improved representativeness for age, death from COVID-19, outbreak association, long-term care facility–affiliated status, and geographic coverage; timeliness of data availability and captured viral diversity also improved. Hospitalized cases were underrepresented, indicating a need to increase inpatient sampling. Our analysis emphasizes the need to understand and quantify sampling bias in phylogenetic studies and continue evaluation and improvement of public health surveillance systems.

Virus genome data can provide useful information for public health practice, particularly when combined with epidemiologic data in real time. Goals of genomic surveillance can include monitoring circulating and emerging variants, detecting and characterizing outbreaks, describing spatiotemporal patterns of virus transmission, supporting epidemiologic and genomic characterization of variants, and pinpointing introduction sources that might be risk factors ( 1 ). Information from a paired genomic and epidemiologic surveillance system can then be translated into public health interventions to prevent disease, control spread, and mitigate outbreaks. Interventions could include planning preparedness according to emerging variant characteristics, changing therapeutic and nonpharmaceutical interventions, and recommending control strategies on the basis of outbreak characteristics. To ensure generalizability and equity when using paired genomic and epidemiologic data for public health purposes, the methods for capturing those data must ensure a representative sample from the population of interest ( 2 , 3 ).

Ongoing global circulation of SARS-CoV-2 and repeated emergence of new variants indicate the need for robust genomic surveillance to inform public health responses ( 4 ). In Washington, USA, surveillance of SARS-CoV-2 is passive and, therefore, focused on cases of COVID-19 in persons seeking testing. In addition, methods for conducting next-generation sequencing introduce limitations on sampling; specimens must contain adequate quantities of viral RNA for sequencing efforts to be successful. Therefore, persons who had mild illness, delayed testing, reinfection, or other characteristics that might lower viral loads are less likely to be represented in sequencing data. Knowing those limitations, the Washington State Department of Health sought to establish a genomic sentinel surveillance system for SARS-CoV-2 in March 2021.

Before sentinel surveillance was initiated, large amounts of genomic data were produced by academic and clinical laboratories in Washington and shared publicly via the GISAID EpiCoV database ( 5 – 7 ). Studies using those data to rapidly produce critical viral transmission and evolution information were published early during the pandemic; however, the populations captured in those data remain unknown ( 8 – 12 ). Sampling bias or systematic differences in sample characteristics between COVID-19 cases with sequenced specimens and total COVID-19 cases is a concern. Using large datasets from a limited number of geographically sparse institutions might produce inaccurate phylogenetic representations of virus distribution and migration within the population ( 13 , 14 ). Specifically, discrete trait analysis is a type of phylogeographic analysis that treats lineage migration between locations as if the location was a discrete trait; models relying on this analysis type assume that sample sizes across subpopulations are proportional to their relative size and random sampling occurs ( 15 ). If 1 population is oversampled, large biases are expected in model output ( 15 ). This concern extends beyond state or country borders because representative sampling is often assumed for contextual data, which provides the backdrop upon which phylogenetic inference is based.

We describe implementing a sentinel surveillance system that enables pairing of genomic and epidemiologic data. In addition, we assessed representativeness and timeliness of genomic data availability before and after system implementation. By performing this evaluation, we provide information regarding populations of sampled cases and limitations on inference affecting genomic data use. To support planning efforts to obtain more equitable and representative sampling, we identified subpopulations that might be systematically excluded from sequencing surveillance. More broadly, we raise awareness regarding sampling bias in convenience-based genomic surveillance systems and support development of robust genomic surveillance systems in additional jurisdictions.

Sentinel Surveillance System Design

In March 2021, the Washington State Department of Health partnered with multiple laboratories to establish a sentinel surveillance program to monitor genomic epidemiology of SARS-CoV-2 within the state. Partner laboratories were selected to maximize geographic coverage and specimen numbers. The initial proportion of randomly selected positive specimens submitted for sequencing was designed to balance geographic coverage regionally and match available sequencing capacity; statewide case coverage varied from 8% to 25% during the study period ( 16 ). In addition to the Washington State Public Health Laboratories, the 6 sentinel laboratories are Atlas Genomics, Confluence Health/Central Washington Hospital, Interpath Laboratories, Incyte Diagnostics Spokane, Northwest Laboratories, and University of Washington Virology Division. PCR cycle threshold (Ct) is capped at 30 for this surveillance system. The surveillance program is supplemented by a national surveillance effort supported by the Centers for Disease Control and Prevention (CDC), which includes multiple commercial laboratories sequencing randomly selected specimens ( 2 ). Methods for next-generation sequencing vary across laboratories, but >90% of sequences are generated by using an Illumina platform ( https://www.illumina.com ); assembly methods also vary.
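The selection step described above (a Ct cap of 30 combined with random selection of a proportion of positive specimens) can be illustrated with a minimal sketch. The record layout, the function name `select_for_sequencing`, the inclusive treatment of Ct = 30, and the order of the two steps (filter first, then sample) are assumptions made for illustration; the text does not specify how the participating laboratories implemented selection.

```python
import random

CT_CAP = 30.0  # Ct cap stated in the text; treating 30 as eligible is an assumption


def select_for_sequencing(specimens, fraction, seed=None):
    """Randomly select a fraction of Ct-eligible positive specimens.

    specimens: list of dicts with a numeric "ct" key (hypothetical layout).
    fraction:  proportion of eligible specimens to submit for sequencing.
    """
    rng = random.Random(seed)
    # Keep only specimens likely to contain enough viral RNA to sequence.
    eligible = [s for s in specimens if s["ct"] <= CT_CAP]
    k = round(len(eligible) * fraction)
    # random.sample draws without replacement.
    return rng.sample(eligible, k)


# Illustrative values only, not data from the study.
specimens = [{"id": i, "ct": ct} for i, ct in enumerate(
    [18.2, 22.5, 31.0, 29.9, 35.4, 24.1, 27.3, 30.0, 19.8, 26.6])]
chosen = select_for_sequencing(specimens, fraction=0.25, seed=1)
```

Fixing the seed makes a draw reproducible for auditing; in routine use the seed would normally be left unset.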

Study Population Evaluation

We included all confirmed COVID-19 cases (SARS-CoV-2 RNA detected by molecular amplification) reported in the Washington Disease Reporting System from January 21, 2020, through December 31, 2021. Using laboratory accession numbers or patient demographics, we linked those cases to sequences uploaded to the GISAID EpiCoV database ( 5 – 7 ) from January 21, 2020, through January 31, 2022, that indicated the state of Washington in the geographic tag. We classified cases as presentinel surveillance if specimens were sequenced before March 1, 2021. We classified cases as sentinel surveillance if specimens were sequenced on or after March 1, 2021, and submitted through the Washington State Department of Health sentinel surveillance program, or if the sequencing laboratory indicated that specimens were randomly selected. Specimens specifically selected for targeted sequencing as part of outbreak investigations because of travel history, known vaccine breakthrough status, or spike gene target failures were not considered sentinel surveillance if sampled outside the random selection process. Washington state and University of Washington Institutional Review Boards determined this project to be a surveillance activity and exempt from review.
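The classification rules in this paragraph can be written as a short decision function. This is a sketch under stated assumptions: the function name, the boolean flags, and the "other" label for specimens that fall into neither group are hypothetical and are not taken from the study's analysis code.

```python
from datetime import date

SENTINEL_START = date(2021, 3, 1)  # cutoff date stated in the text


def classify_case(sequenced_on, via_sentinel_program,
                  lab_reports_random_selection,
                  targeted_outside_random_selection):
    """Return "presentinel", "sentinel", or "other" for one sequenced specimen."""
    if sequenced_on < SENTINEL_START:
        return "presentinel"
    # Targeted specimens (outbreak, travel, vaccine breakthrough, spike gene
    # target failure) sampled outside random selection are not sentinel.
    if targeted_outside_random_selection:
        return "other"
    if via_sentinel_program or lab_reports_random_selection:
        return "sentinel"
    return "other"  # hypothetical label for everything else


print(classify_case(date(2021, 2, 28), False, False, False))
print(classify_case(date(2021, 3, 1), True, False, False))
print(classify_case(date(2021, 6, 15), True, False, True))
```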

Data Analysis

We assessed representativeness of data before and after implementing sentinel surveillance by comparing COVID-19 cases with sequenced specimens to all COVID-19 cases during the same period according to sex, age, race, ethnicity, language, long-term care facility (LTCF) association, occupation, county of residence, outbreak association, travel history, hospitalization, or death. All epidemiologic data analyses were performed using R version 4.0.3 ( 17 ). We compared categorical data by using Pearson χ² test or the formula Σ(|E-O|)/E, where E was expected and O observed counts. Expected counts were calculated by standardization to overall reported cases during the same period. We visualized geographic comparisons by mapping standardized ratios of observed versus expected cases at the county level. We graphed the percentage of cases with sequenced specimens by county and month to visualize spatiotemporal sampling. We evaluated areas with high presentinel sequencing coverage and high or low sentinel sequencing coverage to determine representativeness because data from those areas enabled robust phylogeographic studies.
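A small worked example may help make the representativeness measure concrete. The sketch below assumes one reading of the formula, namely the sum over categories of |E − O| / E, and assumes that expected counts are obtained by scaling the category proportions among all reported cases to the number of sequenced cases. The function names and the counts are illustrative only; the study itself used R, and Python is used here purely for illustration.

```python
def expected_counts(all_cases, n_sequenced):
    """Scale category proportions among all cases to the sequenced total."""
    total = sum(all_cases.values())
    return {k: n_sequenced * v / total for k, v in all_cases.items()}


def deviation_score(observed, all_cases):
    """Sum over categories of |E - O| / E (assumed reading of the formula)."""
    expected = expected_counts(all_cases, sum(observed.values()))
    return sum(abs(e - observed.get(k, 0)) / e
               for k, e in expected.items() if e > 0)


def standardized_ratios(observed, all_cases):
    """Observed / expected count per category, as mapped at the county level."""
    expected = expected_counts(all_cases, sum(observed.values()))
    return {k: observed.get(k, 0) / e for k, e in expected.items() if e > 0}


# Made-up counts by age group, not data from the study.
all_cases = {"<45": 600, "45-64": 300, "65+": 100}
sequenced = {"<45": 50, "45-64": 30, "65+": 20}

score = deviation_score(sequenced, all_cases)
ratios = standardized_ratios(sequenced, all_cases)
```

In this toy example the expected counts are 60, 30, and 10, so the oldest group has a standardized ratio of 2.0 (overrepresented) and the youngest a ratio below 1. A score of 0 would indicate that sequenced cases match all cases exactly across categories.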

To determine variability of genomic data, we constructed phylogenetic trees for 4 scenarios using the Nextstrain ( 18 ) pipeline for SARS-CoV-2. The scenarios were presentinel surveillance with high coverage, low representativeness; presentinel surveillance with high coverage, high representativeness; sentinel surveillance with high coverage, high representativeness; and sentinel surveillance with low coverage, low representativeness. We performed rarefaction analysis to examine how sampling affected the diversity of sequences captured in each of those 4 scenarios. For each value from 1 to n, where n is the total number of available sequences for a location/timeframe of interest, we generated 10 subsampled datasets (sampling without replacement). We counted and plotted the number of unique haplotypes as a function of the number of sampled sequences.
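The rarefaction procedure described above can be sketched as follows. The text states that 10 subsampled datasets were drawn without replacement at each sample size and that unique haplotypes were counted; averaging the 10 counts at each size, the function name, and the use of string labels for haplotypes are assumptions made for this illustration.

```python
import random
from statistics import mean


def rarefaction_curve(haplotypes, replicates=10, seed=0):
    """Mean number of unique haplotypes at each subsample size 1..n.

    haplotypes: list with one haplotype label per sequenced specimen.
    """
    rng = random.Random(seed)
    curve = []
    for k in range(1, len(haplotypes) + 1):
        # random.sample draws k items without replacement.
        counts = [len(set(rng.sample(haplotypes, k)))
                  for _ in range(replicates)]
        curve.append((k, mean(counts)))
    return curve


# Toy input: 6 sequences carrying 3 distinct haplotypes.
curve = rarefaction_curve(["A", "A", "B", "C", "C", "C"])
for size, unique in curve:
    print(size, unique)
```

A curve that is still rising at the largest sample size suggests that additional sampling would capture more haplotypes, whereas a plateau suggests that most circulating diversity has been sampled.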

We assessed timeliness of data by comparing the interval between initial specimen collection and genomic data upload to the GISAID database. We assessed median timeliness by month and compared categorical data uploaded within <14 days, 14–27 days, and ≥28 days after specimen collection.
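The timeliness measures can be illustrated with a brief sketch. Grouping by the month of specimen collection, the function names, and the handling of the upper boundary (intervals of 28 days or more are placed in the last category so that the three categories cover every interval) are assumptions; the dates are invented for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def lag_days(collected, uploaded):
    """Days between specimen collection and upload to GISAID."""
    return (uploaded - collected).days


def lag_category(days):
    """Assign an interval to one of three timeliness categories."""
    if days < 14:
        return "<14 days"
    if days < 28:
        return "14-27 days"
    return "28+ days"


def median_lag_by_month(records):
    """records: iterable of (collection_date, upload_date) pairs."""
    by_month = defaultdict(list)
    for collected, uploaded in records:
        by_month[(collected.year, collected.month)].append(
            lag_days(collected, uploaded))
    return {month: median(lags) for month, lags in by_month.items()}


# Invented dates for illustration only.
records = [
    (date(2021, 8, 1), date(2021, 8, 27)),    # 26 days
    (date(2021, 8, 10), date(2021, 8, 20)),   # 10 days
    (date(2021, 12, 1), date(2021, 12, 16)),  # 15 days
]
medians = median_lag_by_month(records)
```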

During the presentinel surveillance period, 10,653 (3.3%) COVID-19 cases had sequencing information available, compared with 56,106 (12.1%) cases sampled during sentinel surveillance. For all categorical comparisons using Pearson χ² tests, we observed statistically significant differences between presentinel and sentinel cases that had sequencing data. To avoid having a single large discrepancy dominate the representativeness measurement, we used the formula Σ(|E-O|)/E instead of Pearson χ² test to directly compare representativeness between populations ( Table ).

Both presentinel and sentinel cases with sequencing data were generally representative of all COVID-19 cases for sex at birth. During the presentinel surveillance period, older age groups and hospitalized persons with sequenced specimens were overrepresented. Persons who died of COVID-19 were overrepresented by ≈3-fold among presentinel cases with sequencing data compared with cases that had no sequencing data. Sentinel surveillance implementation resolved overrepresentation of decedents, but persons with COVID-19 who were hospitalized or > 65 years of age were underrepresented.

Early during the pandemic, specimens from known outbreak-associated COVID-19 cases were more commonly sequenced, likely reflecting preferential sample selection of those cases for studies. Similarly, sequencing of specimens from LTCF-associated COVID-19 cases was enriched by 2.5-fold. Sentinel surveillance implementation decreased but did not completely resolve enrichment of outbreak-associated cases, whereas LTCF-associated case enrichment was substantially resolved.

Presentinel COVID-19 cases with sequenced specimens had more complete symptom information when compared with all COVID-19 cases. Both presentinel and sentinel cases with sequenced specimens had symptom information reported more frequently compared with all cases.

Persons self-reporting as a racial or ethnic minority were generally overrepresented among presentinel COVID-19 cases with sequenced specimens; race/ethnicity data were less likely to be missing among those cases than among total COVID-19 cases. After sentinel surveillance implementation, persons reporting Hispanic ethnicity or Spanish language preference were overrepresented among COVID-19 cases with sequenced specimens. Differences in missing race data were resolved after sentinel surveillance implementation.

Industry information was missing for most cases. According to the available industry information, agriculture, forestry, fishing and hunting, and healthcare and social assistance were overrepresented among cases with sequenced specimens. Industry information was missing for >90% of cases during the sentinel surveillance period; therefore, industry representation was not assessed in this study.

More persons with sequenced specimens during the presentinel period traveled outside the United States than expected, indicating likely enrichment for international travelers. Travel information was missing for >95% of cases during the sentinel surveillance period; therefore, traveler representation was not assessed in this study.

Reinfection data were captured starting on September 1, 2021; therefore, case-level data were not available for most of the study period. From September through December 2021, reinfection cases were underrepresented in the sequencing data, which might reflect a higher average Ct in this population.

Figure 1. Geographic extent of sequencing data available for COVID-19 cases in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. A) Presentinel surveillance (specimens sequenced before March 1, 2021). B) Sentinel surveillance (specimens sequenced on or after March 1, 2021, through the sentinel surveillance program). Standardized ratios (observed/expected counts) of cases with sequenced specimens are indicated by county. No sequence data were available for 3 counties during the presentinel period.

Before sentinel surveillance implementation, geographic sequencing coverage was variable and focused on western Washington ( Figure 1 ); King, San Juan, Pacific, and Yakima Counties had high coverage. Some areas of the state had little or no data available. After sentinel surveillance implementation, geographic coverage equalized regionally across the state; variable coverage because of sentinel laboratory service areas occurred as expected ( Figure 1 ).

We investigated representativeness further in areas with high presentinel sequencing coverage and high case numbers (Appendix Figure 1). During March–June 2020, Yakima County had 19%–30% sequencing coverage for all COVID-19 cases; high-quality genomic data were available for 1,696 cases. High coverage was partially driven by sequencing specimens from LTCF-associated cases. A total of 25% of cases with sequenced specimens were affiliated with LTCFs, compared with 11% of all COVID-19 cases during that period. Persons with sequenced specimens were more commonly > 65 years of age and less commonly of Hispanic descent or with Spanish language preference.

We performed phylogenetic analysis of all sequenced specimens from Yakima County cases with COVID-19 onset dates during March–June 2020 ( Appendix Figure 2, panel A). During this period, most (63%) sequences were classified as Nextstrain clade 20B (Pango lineage B.1.1), 23% were clade 19B (Pango lineage A), 9% were clade 20A (Pango lineage B.1) and 5% were clade 20C. Comparatively, within the entire state of Washington, clades 20C and 19B (Pango lineage A) were most prevalent during the same period.

Sequencing coverage was also high in Yakima County in February 2021. Sequencing coverage was 26% across all COVID-19 cases, and high-quality genomic data were available for 271 cases. During this period, we observed smaller differences between cases with sequenced specimens and all cases for ethnicity and outbreak-association; otherwise, cases with sequenced specimens were largely representative of all cases during this time. We performed phylogenetic analysis of Yakima cases during February 2021 ( Appendix Figure 2, panel B). The most common lineage identified was 21C (Pango lineage B.1.427/429 or Epsilon), representing 33% of sequences, then 20G (Pango lineage B.1.2) at 29%, 20A at 13%, 20B at 9%, and 20C at 15%. In Washington, 30% of sequences in GISAID were Epsilon in February 2021.

After sentinel surveillance implementation, variability in geographic coverage was diminished regionally but persisted at the county level. We investigated counties with high and low sentinel sequencing coverage to determine effects of variable sentinel specimen sampling. We specifically compared Whatcom County, a county with high coverage from a sentinel laboratory, and Clark County, a county with low coverage. During the sentinel surveillance period, cases with sequenced specimens from Whatcom County were representative of all COVID-19 cases from the county for age, sex, race, death from COVID-19, and LTCF-association. Persons hospitalized for COVID-19 were underrepresented among sentinel surveillance cases, reflecting statewide findings. Outbreak-associated cases and symptomatic persons were slightly overrepresented among sentinel surveillance cases. We performed phylogenetic analysis of cases from Whatcom County during the sentinel surveillance period ( Appendix Figure 2, panel C) and showed a transition from clade 20I (Alpha) to 21A/21I/21J (Delta) dominance, similar to what was observed in Washington overall.

Figure 2. Rarefaction analysis of virus haplotype diversity in Yakima, Clark, and Whatcom Counties in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. Presentinel COVID-19 cases (sequenced before March 1, 2021) with sequenced specimens from Yakima County (2 timepoints) were compared with sentinel COVID-19 cases (sequenced on or after March 1, 2021, through the sentinel surveillance program) with sequenced specimens in Clark and Whatcom Counties. Haplotype count indicates virus diversity.

Clark County had very low sequencing coverage over the sentinel surveillance period, ranging from 0.8% of cases in April 2021 to 4.9% of cases in June 2021. Persons <45 years of age and outbreak-associated cases were overrepresented among cases with sequenced specimens, and hospitalized persons were underrepresented. We performed phylogenetic analysis of cases from Clark County during the sentinel surveillance period ( Appendix Figure 2, panel D). Despite limited coverage, we observed a variant profile similar to that of Whatcom County and Washington overall. We performed rarefaction analysis and found sentinel sampling from Clark and Whatcom counties displayed higher viral diversity than Yakima County at 2 presentinel timepoints ( Figure 2 ). Additional sampling will be required in all scenarios to fully capture circulating viral diversity.
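As an illustration of the rarefaction idea described above, the following minimal Python sketch repeatedly draws random subsamples of a fixed size and counts the distinct haplotypes in each. It is not the authors' analysis code; the haplotype labels, sample sizes, and the function name `rarefaction_curve` are hypothetical and chosen only for demonstration.

```python
import random
from statistics import mean


def rarefaction_curve(haplotypes, depths, n_reps=200, seed=0):
    """Mean number of distinct haplotypes seen in random subsamples of each depth."""
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        if depth > len(haplotypes):
            break  # cannot subsample more specimens than were sequenced
        counts = [len(set(rng.sample(haplotypes, depth))) for _ in range(n_reps)]
        curve[depth] = mean(counts)
    return curve


# Hypothetical haplotype labels for two sets of sequenced specimens.
low_diversity = ["h1"] * 60 + ["h2"] * 30 + ["h3"] * 10
high_diversity = [f"h{i % 25}" for i in range(100)]

depths = [10, 25, 50, 100]
low_curve = rarefaction_curve(low_diversity, depths)
high_curve = rarefaction_curve(high_diversity, depths)
```

Comparing curves at the same subsample depth is what allows diversity to be contrasted between counties with very different numbers of sequenced specimens.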


Figure 3 . Timeliness of sequence data availability in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. Graph shows percentages of COVID-19 cases with sequenced data...

Timeliness of available genomic data in the GISAID database varied over the study period ( Figure 3 ). During the presentinel period, median timeliness ranged from 23 days in February to 98 days in October of 2020; > 50% of sequences were uploaded to GISAID >28 days after specimen collection for most months. During the sentinel period, median timeliness was 26 days in August and 15 days in December of 2021; most sequences were uploaded to GISAID <28 days after specimen collection in all months after sentinel surveillance implementation.
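The timeliness measure can be sketched as the number of days between specimen collection and upload, summarized by the median and by the share of sequences uploaded in fewer than 28 days. The Python example below is only a sketch under that reading; the dates are hypothetical and the function name `timeliness_summary` is an assumption, not part of the study's code.

```python
from datetime import date
from statistics import median


def timeliness_summary(records, threshold_days=28):
    """Return (median lag in days, share of records with lag below the threshold)."""
    lags = [(uploaded - collected).days for collected, uploaded in records]
    share_within = sum(lag < threshold_days for lag in lags) / len(lags)
    return median(lags), share_within


# Hypothetical (collection date, upload date) pairs for one month.
records = [
    (date(2021, 8, 2), date(2021, 8, 20)),
    (date(2021, 8, 5), date(2021, 9, 4)),
    (date(2021, 8, 9), date(2021, 8, 31)),
    (date(2021, 8, 12), date(2021, 9, 16)),
]

median_lag, share_within_28 = timeliness_summary(records)
```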

After a sentinel surveillance system for sequencing SARS-CoV-2 specimens was implemented in Washington, the available data were more epidemiologically and genomically representative of all COVID-19 cases and timelier than data before sentinel surveillance began. Specifically, representativeness of age, death from COVID-19, outbreak-association status, LTCF-affiliated status, and geographic coverage improved; increased viral diversity was also noted. Before sentinel surveillance began, we were unable to identify a county or period with representative sampling, except for Yakima County during February 2021. After implementation, representativeness improved across multiple areas. Increased representativeness is a critical achievement because genomic data are routinely available to public health leaders and decision-makers; ensuring equitable sampling coverage has substantial implications for response planning and interventions. Measuring effects of genomic surveillance on public health responses in Washington was not included in this study; however, methods for measuring and evaluating effectiveness should be explored.

Overrepresentation of older persons in presentinel genomic data was partly driven by selection of LTCF-associated COVID-19 cases and COVID-19 cases resulting in hospitalization or death. After sentinel surveillance began, the decrease in representation of persons > 65 years of age improved overall representativeness but actually resulted in undersampling this age group, possibly indicating poor sequencing coverage by facilities where this population seeks care. Indeed, the sentinel surveillance system underrepresents hospitalized cases; further consideration is needed to improve data capture of both inpatient and outpatient COVID-19 cases. Before sentinel surveillance, outbreak-associated and symptomatic COVID-19 cases were oversampled. After implementation, overrepresentation of those cases decreased but was not resolved. At least 3 possible explanations exist for those findings: specimens from symptomatic SARS-CoV-2–infected persons are more likely to be sequenced because of higher average viral loads, which improves sequencing success; asymptomatic persons might be detected through screening programs not associated with sentinel laboratories; and outbreak-associated specimens might be sent to sentinel laboratories to ensure sequencing for investigative purposes. Random sampling among specimens received at sentinel laboratories could, thereby, still lead to biased samples.

Minority race and ethnicity were more commonly reported among presentinel cases with sequenced specimens; data were also more complete among those cases. Whether true overrepresentation occurred or race data were differentially missing among all cases is unclear. After sentinel surveillance implementation, persons reporting Hispanic ethnicity and Spanish language preference were overrepresented compared with overall cases statewide, which likely reflects the catchment areas of sentinel laboratories. Geographic coverage variability was identified during both presentinel and sentinel surveillance periods. Presentinel coverage focused on western Washington, where laboratories were connected to sequencing capacity. Sentinel surveillance enabled access to sequencing for additional laboratories and ensured greater equitable regional coverage, although variability at the county and subcounty levels remains. Variable coverage and representativeness at the substatewide level should be considered when using genomic data for specific analyses. Increasing geographic coverage will require additional sentinel laboratories that contribute specimens from areas of low coverage.

Other epidemiologic information was of interest in assessing representativeness, including industry and occupation, travel history, and reinfection status. However, data for those variables were incomplete, limiting their usefulness. As public health systems pivot away from capturing data through individual case interviews, datasets available for assessing sampling of specimens for sequencing should be considered. The full potential of genomic epidemiologic surveillance for improving public health requires pairing epidemiologic metadata with genomic data.

Viral diversity has been and continues to be dynamic over the course of the COVID-19 pandemic. Measuring true viral diversity requires random or complete sampling. Actual circulating viral diversity likely differed across locations and timepoints included in our study; if circulating diversity generally increased over time, our conclusions would be biased toward assumption of improved capture because of surveillance.

Other states and countries have used various practices to select SARS-CoV-2 specimens for sequencing. Methods that rely on convenience samples, such as our presentinel system, likely have sampling biases that affect phylogenetic inference. In those settings, weighting cases for inclusion in estimates by using selection probabilities might help to correct bias. Alternatively, approaches to correct for nonrepresentative sampling during analysis, such as inverse probability weighting, should be considered. Even after a sentinel surveillance system is put in place, some biases remain, such as undersampling of hospitalized cases, that should be corrected by diversifying sources of specimens. Ongoing evaluation and improvement of systems is necessary, especially in the context of performing epidemiologic studies. Many epidemiologic studies of COVID-19 have availability of genomic data as an inclusion criterion; if sampling biases are not clarified, biased conclusions might be drawn. Co-development of genomic epidemiology programs alongside bioinformatics programs is needed in public health departments because epidemiologic and phylogenetic analyses are best performed after sampling methods and data limitations are considered.
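To make the inverse probability weighting idea concrete, the sketch below weights each sequenced case by the reciprocal of the probability that a case in its stratum was sequenced, then estimates lineage proportions from the weighted counts. The strata, selection probabilities, and lineage labels are hypothetical, and the approach shown is a generic illustration rather than a method used in this study.

```python
from collections import defaultdict


def ipw_lineage_proportions(sequenced, selection_prob):
    """Estimate lineage proportions, weighting each case by 1 / P(sequenced | stratum)."""
    weighted = defaultdict(float)
    for stratum, lineage in sequenced:
        weighted[lineage] += 1.0 / selection_prob[stratum]
    total = sum(weighted.values())
    return {lineage: weight / total for lineage, weight in weighted.items()}


# Hypothetical selection probabilities: outbreak cases are oversampled.
selection_prob = {"outbreak": 0.5, "community": 0.1}

# Hypothetical sequenced cases as (stratum, lineage) pairs.
sequenced = (
    [("outbreak", "A")] * 10
    + [("outbreak", "B")] * 10
    + [("community", "A")] * 2
    + [("community", "B")] * 8
)

weighted_props = ipw_lineage_proportions(sequenced, selection_prob)
naive_prop_a = sum(1 for _, lineage in sequenced if lineage == "A") / len(sequenced)
```

In this toy example the unweighted share of lineage A is inflated by the oversampled outbreak stratum, and the weighted estimate is lower. In practice the selection probabilities would have to be estimated from case and sequencing data.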

Although representativeness and timeliness were the focus of this study, other features should be considered in the design of surveillance systems, such as simplicity, flexibility, sensitivity, and stability ( 4 ). Sentinel surveillance systems are complicated and require ongoing coordination with laboratory partners; stability requires public health resources. Alternative systems to enable representativeness and timeliness while increasing simplicity and stability could include requirements for specimen submission, such as those commonly used for foodborne pathogens and other notifiable conditions. Sensitivity is essential for the surveillance system goals of rare variant detection and timely surveillance of circulating virus variants. Right-size sampling, such as that performed for influenza surveillance, should be considered ( 19 ; S. Wohl et al., unpub. data, https://www.medrxiv.org/content/10.1101/2021.12.30.21268453v1 ).

Even after careful consideration of surveillance system design for pathogen sequencing and pairing with epidemiologic data, limitations remain because of specimen requirements for sequencing. Studies using surveillance sequencing data should report the following limitations: application of laboratory-based diagnostic testing might depend on many factors that are difficult to assess and increasingly complex because of availability of improved at-home testing, and, among positive test results, those with a low PCR Ct are more likely to be sequenced. Therefore, representativeness of sequencing data is inherently limited.

Assessment of representativeness during presentinel and sentinel surveillance is limited in the causal inferences that can be drawn. Other concurrent factors might have affected representativeness and timeliness during this study period. For example, CDC surveillance efforts were also increased during this timeframe; samples sequenced under CDC surveillance were coded as sentinel and were analyzed as part of the sentinel surveillance system in Washington.

In conclusion, implementing a sentinel surveillance system for sequencing SARS-CoV-2 specimens was associated with improved genomic and epidemiologic representativeness and timeliness of available sequence data in Washington. Ongoing evaluation and improvements will be necessary to ensure representative capture of inpatient settings. As public health leaders discuss changes to COVID-19 surveillance systems nationally, datasets required to assess representativeness of sampling for sequencing should be considered. Cross-jurisdictional sampling bias is a concern when validating phylogeographic methods applications; attention to sampling will improve the usefulness of those datasets for public health practice.

Ms. Oltean is a senior epidemiologist at the Washington State Department of Health and a PhD candidate at the University of Washington School of Public Health. Her interests focus on genomic epidemiology and communicable disease surveillance systems design, implementation, and evaluation.

Acknowledgments

We thank Peter Gibson, Cory Yun, Emily Nebergall, Allison Thibodeau, and Frank Aragona for data linkage and maintenance; Rebecca Thomure for data maintenance; Chris Destro, Renee Takara, Velma Xu, Kelly Thornton, Kelly Burchardt, Guadalupe Munoz-Vargas, Jade Hayes, Bri Spencer, Michelle McCartha, Alexandra Putzier, Micaela Pribic, Sarah Giadone, Linda Peart, Kimberly Dowdle, Katrina Sullivan, Kas Miller, Gillian Conkling, Sarah Haines, Joshua McNamara, Sarah Hulbert, Ashley Romana, Mikelle Quale, Rowan Day, Katelyn Fritz, Edwin Enciso, Helen Dolejsi, Emily Baril, Connor Nels, Kevin Nelson, and Michael Harvey for performing laboratory diagnostic tests and supporting sentinel laboratory specimen selection; Hong Xie, Isabel Arnould, Nathan Breit, Sean Ellis, and Saraswathi Sathees for performing sequencing; Eric Oltean for Python code review and revisions; the originating laboratories Aegis Sciences Corporation, Atlas Genomics, Centers for Disease Control and Prevention, Curative Labs, Fulgent Genetics, Gravity Diagnostics, LLC, Helix, Incyte Diagnostics, Interpath Laboratory, Laboratory Corporation of America, Northwest Laboratory, Overlake Hospital, Quest Diagnostics Incorporated, Seattle Flu Study, University of Washington Virology, Washington State Department of Health Public Health Laboratories for providing specimens for whole-genome sequencing; and the submitting laboratories Altius Institute for Biomedical Research, Centers for Disease Control and Prevention, Curative Labs, Gravity Diagnostics, LLC, Seattle Flu Study, University of Washington Virology, Washington State Department of Health Public Health Laboratories for providing sequence data to GISAID.

Dr. Nickerson is deceased.

This work was supported by the Epidemiology and Laboratory Capacity cooperative agreement from the US Centers for Disease Control and Prevention.

  • Ferdinand AS, Kelaher M, Lane CR, da Silva AG, Sherry NL, Ballard SA, et al. An implementation science approach to evaluating pathogen whole genome sequencing in public health. Genome Med. 2021;13:121.
  • Centers for Disease Control (CDC). Guidelines for evaluating surveillance systems. MMWR Suppl. 1988;37:1–18.
  • German RR, Lee LM, Horan JM, Milstein RL, Pertowski CA, Waller MN; Guidelines Working Group Centers for Disease Control and Prevention (CDC). Updated guidelines for evaluating public health surveillance systems: recommendations from the Guidelines Working Group. MMWR Recomm Rep. 2001;50(RR-13):1–35, quiz CE1–7.
  • Paul P, France AM, Aoki Y, Batra D, Biggerstaff M, Dugan V, et al. Genomic surveillance for SARS-CoV-2 variants circulating in the United States, December 2020–May 2021. MMWR Morb Mortal Wkly Rep. 2021;70:846–50.
  • Shu Y, McCauley J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 2017;22:30494.
  • Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob Chall. 2017;1:33–46.
  • Khare S, Gurry C, Freitas L, Schultz MB, Bach G, Diallo A, et al. GISAID’s role in pandemic response. China CDC Wkly. 2021;3:1049–51.
  • Bedford T, Greninger AL, Roychoudhury P, Starita LM, Famulare M, Huang ML, et al.; Seattle Flu Study Investigators. Cryptic transmission of SARS-CoV-2 in Washington state. Science. 2020;370:571–5.
  • Jorden MA, Rudman SL, Villarino E, Hoferka S, Patel MT, Bemis K, et al.; CDC COVID-19 Response Team. Evidence for limited early spread of COVID-19 within the United States, January–February 2020. MMWR Morb Mortal Wkly Rep. 2020;69:680–4.
  • Fauver JR, Petrone ME, Hodcroft EB, Shioda K, Ehrlich HY, Watts AG, et al. Coast-to-coast spread of SARS-CoV-2 during the early epidemic in the United States. Cell. 2020;181:990–996.e5.
  • Tordoff DM, Greninger AL, Roychoudhury P, Shrestha L, Xie H, Jerome KR, et al. Phylogenetic estimates of SARS-CoV-2 introductions into Washington State. Lancet Reg Health Am. 2021;1:100018.
  • Müller NF, Wagner C, Frazar CD, Roychoudhury P, Lee J, Moncla LH, et al. Viral genomes reveal patterns of the SARS-CoV-2 outbreak in Washington State. Sci Transl Med. 2021;13:eabf0202.
  • Magee D, Scotch M. The effects of random taxa sampling schemes in Bayesian virus phylogeography. Infect Genet Evol. 2018;64:225–30.
  • Lemey P, Rambaut A, Bedford T, Faria N, Bielejec F, Baele G, et al. Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2. PLoS Pathog. 2014;10:e1003932.
  • De Maio N, Wu CH, O’Reilly KM, Wilson D. New routes to phylogeography: a Bayesian structured coalescent approximation. PLoS Genet. 2015;11:e1005421.
  • Washington State Department of Health. SARS-CoV-2 sequencing and variants in Washington state. 2022 [cited 2022 Jun 21]. https://doh.wa.gov/sites/default/files/2022-02/420-316-SequencingAndVariantsReport.pdf
  • R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. 2020 [cited 2022 Sep 19]. https://www.r-project.org
  • Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:4121–3.
  • Association of Public Health Laboratories. Influenza virologic surveillance right size roadmap [cited 2022 Mar 21]. https://www.aphl.org/programs/infectious_disease/influenza/Influenza-Virologic-Surveillance-Right-Size-Roadmap/Pages/default.aspx
  • Figure 1 . Geographic extent of sequencing data available for COVID-19 cases in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. A) Presentinel surveillance (specimens...
  • Figure 2 . Rarefaction analysis of virus haplotype diversity in Yakima, Clark, and Whatcom Counties in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. Presentinel...
  • Figure 3 . Timeliness of sequence data availability in study of sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data, Washington, USA, 2020–2021. Graph shows percentages of COVID-19 cases with sequenced...
  • Table . Comparison of demographic characteristics between COVID-19 cases with sequenced specimens and all confirmed COVID-19 cases in study of presentinel and sentinel surveillance system implementation and evaluation for SARS-CoV-2 genomic data,...

DOI: 10.3201/eid2902.221482

Original Publication Date: January 03, 2023

Table of Contents – Volume 29, Number 2—February 2023

Address for correspondence: Hanna Oltean, Washington State Department of Health, 1610 NE 150th St, Shoreline, WA 98155, USA


Published on 18.4.2024 in Vol 10 (2024)

Projected Time for the Elimination of Cervical Cancer Under Various Intervention Scenarios: Age-Period-Cohort Macrosimulation Study

Original Paper

  • Yi-Chu Chen 1, PhD;
  • Yun-Yuan Chen 2, PhD;
  • Shih-Yung Su 3, PhD;
  • Jing-Rong Jhuang 4, PhD;
  • Chun-Ju Chiang 1, 5, PhD;
  • Ya-Wen Yang 5, MSc;
  • Li-Ju Lin 6, PhD;
  • Chao-Chun Wu 6, MD;
  • Wen-Chung Lee 1, 5, 7, MD, PhD

1 Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan

2 Head Office, Taiwan Blood Services Foundation, Taipei, Taiwan

3 Master Program in Statistics, National Taiwan University, Taipei, Taiwan

4 Institute of Statistical Science, Academia Sinica, Taipei, Taiwan

5 Taiwan Cancer Registry, Taipei city, Taiwan

6 Health Promotion Administration, Ministry of Health and Welfare, Taipei, Taiwan

7 Institute of Health Data Analytics, College of Public Health, National Taiwan University, Taipei, Taiwan

Corresponding Author:

Wen-Chung Lee, MD, PhD

Institute of Health Data Analytics

College of Public Health

National Taiwan University

Room 536, No 17, Xuzhou Road

Taipei, 100

Phone: 886 223511955

Fax: 886 223511955

Email: [email protected]

Background: The World Health Organization aims for the global elimination of cervical cancer, necessitating modeling studies to forecast long-term outcomes.

Objective: This paper introduces a macrosimulation framework using age-period-cohort modeling and population attributable fractions to predict the timeline for eliminating cervical cancer in Taiwan.

Methods: Data for cervical cancer cases from 1997 to 2016 were obtained from the Taiwan Cancer Registry. Future incidence rates under the current approach and various intervention strategies, such as scaled-up screening (cytology based or human papillomavirus [HPV] based) and HPV vaccination, were projected.

Results: Our projections indicate that Taiwan could eliminate cervical cancer by 2050 with either 70% compliance in cytology-based or HPV-based screening or 90% HPV vaccination coverage. The years projected for elimination are 2047 and 2035 for cytology-based and HPV-based screening, respectively; 2050 for vaccination alone; and 2038 and 2033 for combined screening and vaccination approaches.

Conclusions: The age-period-cohort macrosimulation framework offers a valuable policy analysis tool for cervical cancer control. Our findings can inform strategies in other high-incidence countries, serving as a benchmark for global efforts to eliminate the disease.

Introduction

Cervical cancer remains the fourth most common cancer in women worldwide [ 1 ]. The World Health Organization (WHO) [ 2 ] called for the global elimination of cervical cancer; elimination is defined as an incidence of fewer than 4 cases per 100,000 women-years. Eliminating cervical cancer requires scaling up the human papillomavirus (HPV) vaccination of girls and cervical screening [ 3 ].

Randomized trials and follow-up studies are valuable for evaluating the short-term and medium- to long-term impacts of public health intervention policies [ 4 , 5 ]. In contrast, modeling studies offer the ability to estimate and forecast effects over an extended timeframe and can also simulate outcomes across a range of future scenarios [ 6 ]. Numerous microsimulation modeling studies have been conducted to assess the necessary time frame and intervention strategies for the elimination of cervical cancer [ 3 , 7 - 14 ]. These studies rely on numerous assumptions, such as oncogenic potentials, infection dynamics and immunity of HPV, and the natural history of cervical cancer. They require many parameters that are difficult to obtain.

A better alternative is to use macrosimulation modeling. From a macroscopic perspective, disease incidence rates are primarily influenced by 3 temporal factors: age, period, and cohort. The age-period-cohort (APC) model, which accounts for these factors, has been recently used to project future disease burdens [ 15 - 19 ]. This methodology can also estimate future incidence rates of cervical cancer under existing conditions, known as the status quo. In a hypothetical “what if” scenario involving a public health intervention, the baseline incidence rate would be reduced by a certain fraction—captured by the population attributable fraction (PAF) associated with the intervention. By combining the APC model’s baseline estimates with the PAF to factor in reductions from interventions, we can effectively forecast future cervical cancer incidence rates under specific public health programs.

Well-organized cytology-based cervical screening programs have effectively reduced the incidence rates of cervical cancer in many countries [ 20 - 23 ]. Since 1995, Taiwan’s Health Promotion Administration has offered organized cytology-based screenings for women aged 30 years and older. Consequently, the age-standardized incidence rate of invasive cervical cancer in Taiwan declined sharply from 28.0 per 100,000 woman-years in 1997 to 8.2 in 2016. Despite this progress, the incidence rate in Taiwan has yet to meet the criterion for the elimination of cervical cancer. HPV-based screening, which has been shown to be superior to cytology-based methods [ 24 ], may soon be adopted for mass screening. Projecting future trends in cervical cancer incidence under new approaches such as HPV-based screening has been challenging using microsimulation models. However, these projections may be more feasible through macrosimulation techniques.

In this study, macrosimulation was used, combining APC modeling with PAF calculation, to estimate when cervical cancer could be eliminated in Taiwan. We explored the potential to expedite this timeline by implementing more extensive population-based cervical cancer screening (either cytology or HPV based) and enhancing HPV vaccination coverage.

Data Source

In this study, we collected data on cervical cancer cases diagnosed between 1997 and 2016 using the International Classification of Diseases for Oncology, Third Edition code C53. Our extensive data set, which includes population information, originated from the Taiwan Cancer Registry—a meticulously maintained nationwide system established by the Ministry of Health and Welfare. The registry diligently captures and synthesizes information from patients who are newly diagnosed with malignant cancer in hospitals with 50 or more beds in Taiwan, a country with a population size of approximately 23 million. The data quality and completeness of the Taiwan Cancer Registry database have consistently adhered to standards of excellence. Specifically, the completeness is 98.4% (14,833/218,239); the percentage of cases with death certificate only is 0.9% (1079/118,583); the mortality versus incidence ratio is 45.1% (202.89/449.59); the percentage of morphological verification is 93% (109,273/117,504) for all sites combined and 97.6% (103,812/106,423) for all sites excluding the liver; and the data timeliness is 14 months. These statistics demonstrate that the Taiwan Cancer Registry is one of the highest-quality cancer registries in the world [ 25 - 27 ]. The world standard population (WHO 2000 [ 28 ]) proportions were used to calculate age-standardized incidence rates. The projected elimination year was defined as the first year when the age-standardized incidence rate would fall below 4 cases per 100,000 women-years [ 8 ].
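The two definitions in this paragraph, the age-standardized incidence rate and the projected elimination year, can be sketched in Python as follows. The age groups, weights, and rates below are hypothetical placeholders, not the WHO 2000 standard population or Taiwan Cancer Registry data, and the function names are assumptions introduced for illustration.

```python
def age_standardized_rate(age_specific_rates, standard_weights):
    """Direct standardization: weighted mean of age-specific rates."""
    total_weight = sum(standard_weights)
    weighted_sum = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted_sum / total_weight


def elimination_year(rates_by_year, threshold=4.0):
    """First year in which the rate falls below the threshold, or None if it never does."""
    for year in sorted(rates_by_year):
        if rates_by_year[year] < threshold:
            return year
    return None


# Hypothetical age-specific rates (per 100,000 woman-years) and standard weights.
rates = [1.0, 8.0, 12.0]
weights = [0.5, 0.3, 0.2]
asr = age_standardized_rate(rates, weights)

# Hypothetical projected age-standardized rates by calendar year.
projected = {2030: 5.3, 2031: 4.6, 2032: 4.0, 2033: 3.8}
first_year_below = elimination_year(projected)
```

Note that a rate of exactly 4.0 does not count as elimination under the "fewer than 4 cases" definition, so the strict inequality matters.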

Macrosimulation

The APC models were used to project future cervical cancer incidence rates under the status quo. Due to the scarcity of cases in patients younger than 30 years of age, this age group was excluded from the APC modeling. Instead, the average incidence rate between 1997 and 2016 was used as the projected rate for women younger than 30 years of age. This approach likely makes our future decline estimates slightly conservative. However, the impact is minimal as the incidence rate for this age group contributes only a small fraction to the overall rate. For women aged 30 years and older, the projections are outlined below. First, an ensemble of 265 APC models was constructed using data from 1997 to 2006 as the training set. The APC models were then applied to project incidence rates from 2007 to 2016 as the validation set. These projections underwent year-on-year attenuation adjustments ranging from 0% to 100% in 5% increments (21 levels), resulting in 5565 (265 × 21) different projection sets. The symmetric mean absolute percentage error (SMAPE) was used to quantify the prediction error for each projection set, and the set with the lowest SMAPE was selected. Finally, the selected model, refitted with all available data from 1997 to 2016, was used to project future incidence rates up to the year 2050. These projections incorporated the chosen attenuation factor. The future projected incidence rates were determined by calculating age-standardized rates across all age groups. For those younger than 30 years of age, the average incidence rate from 1997 to 2016 was used. For individuals aged 30 years and older, the rates forecasted by the selected APC model were used.
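The selection step can be sketched as follows. The SMAPE formula used here is one common definition (absolute error divided by the mean of the absolute observed and projected values, averaged and expressed as a percentage); the paper does not state which variant was used, so this is an assumption. The observed and candidate values are hypothetical, and the sketch does not implement the attenuation adjustment itself, whose exact form is not given in the text.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (one common definition)."""
    terms = [
        abs(f - a) / ((abs(a) + abs(f)) / 2)
        for a, f in zip(actual, forecast)
    ]
    return 100 * sum(terms) / len(terms)


# Attenuation grid: 0%, 5%, ..., 100% gives 21 levels per APC model.
attenuation_levels = [step * 5 for step in range(21)]
n_projection_sets = 265 * len(attenuation_levels)

# Hypothetical validation-period rates and two candidate projection sets.
observed = [10.0, 9.0, 8.0]
candidates = {
    "model_a": [11.0, 10.0, 9.0],
    "model_b": [10.0, 9.5, 8.5],
}

# Pick the candidate with the lowest validation SMAPE.
best = min(candidates, key=lambda name: smape(observed, candidates[name]))
```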

Next, we calculated the PAFs for various scenarios of cervical cancer screening and HPV vaccination. The PAF in this study was defined as the proportionate reduction in the incidence rate of cervical cancer due to a specific intervention program, as in the following equation [ 29 ]:

$$\mathrm{PAF} = \frac{\sum_i P_i \, \mathrm{IRR}_i - \sum_i P'_i \, \mathrm{IRR}_i}{\sum_i P_i \, \mathrm{IRR}_i} \qquad (1)$$

where $\mathrm{IRR}_i$ is the incidence rate ratio between the $i$th level of the screening or vaccination variable and the reference level ($i = 1$) of no screening and no vaccination, and $P_i$ and $P'_i$ are the proportions of women in the $i$th level of the screening or vaccination variable under the status quo and the specific interventions, respectively.

The incidence rate ratios of cytology-based screening, HPV-based screening, and HPV vaccination were based on the studies of Chen et al [ 30 ], Ronco et al [ 24 ], and Lei et al [ 5 ], respectively. We assume that HPV vaccination administered to 13-year-old girls affords them lifetime effectiveness against cervical cancer [ 31 ]. HPV-based screening may have a higher false positive rate, so it is suggested that screening be performed once every 5-10 years [ 24 ]. Currently, only cytology-based screening is available in Taiwan; HPV-based screening has yet to be implemented. We also assume that the incidence rate ratio of a joint program involving both interventions is the product of the incidence rate ratios of the 2 programs involving only the respective intervention (Table S1 in Multimedia Appendix 1 ). In 2016, the proportion of women with cytology-based screening more than twice in 6 years was 43.5% (data provided by Taiwan’s Health Promotion Administration), and a negligible proportion of women had received HPV vaccination in Taiwan (Table S2 in Multimedia Appendix 1 ).

We assume a gradual increase from 2023 to 2030 in the proportion of women undergoing cytology-based screening more than twice within 6 years, from the current 43.5% to 70%. We also consider a new intervention plan, starting in 2023, that involves switching from cytology-based screening twice within 6 years to HPV-based screening, with the goal of increasing the overall screening proportion from 43.5% to 70% by 2030. We set an intervention scenario in which HPV vaccination coverage reaches 90% from 2018 onward (the Health Promotion Administration in Taiwan has offered HPV vaccination for 13-year-old girls since 2018). Furthermore, the potential impact of combining cytology-based and HPV-based screening with HPV vaccination in 2 different scenarios was evaluated. For simplicity, we assume that compliance with cytology-based screening, compliance with HPV-based screening, and receipt of the HPV vaccination are 3 independent events. The referral rate for positive cervical cytology results, facilitated by community nurses, exceeds 90%. Additionally, Taiwan’s health care system offers almost universal coverage, making effective cervical disease management an integral part of the existing health care infrastructure. As such, this study did not specifically factor in the concept of effective management.
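The scenario assumptions can be sketched in Python under two simplifications that are not stated in the text: the "gradual increase" is treated as linear between 2023 and 2030, and the 90% vaccination coverage is applied to a single illustrative group rather than tracked by birth cohort. The function names are hypothetical.

```python
def linear_ramp(year, start_year, end_year, start_value, end_value):
    """Coverage that rises linearly from start_value to end_value (an assumed shape)."""
    if year <= start_year:
        return start_value
    if year >= end_year:
        return end_value
    fraction = (year - start_year) / (end_year - start_year)
    return start_value + fraction * (end_value - start_value)


def joint_proportions(p_screened, p_vaccinated):
    """Proportions in each (screened, vaccinated) stratum, assuming independence."""
    return {
        (s, v): (p_screened if s else 1 - p_screened)
        * (p_vaccinated if v else 1 - p_vaccinated)
        for s in (False, True)
        for v in (False, True)
    }


# Screening coverage by year under the assumed linear ramp from 43.5% to 70%.
coverage = {
    year: linear_ramp(year, 2023, 2030, 0.435, 0.70) for year in range(2023, 2031)
}

# Joint strata in 2030 for a group with 90% vaccination coverage.
strata_2030 = joint_proportions(coverage[2030], 0.90)
```

Under the independence assumption, each joint stratum proportion is simply the product of the marginal proportions, which is what makes a combined screening and vaccination scenario tractable without individual-level data.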

We used equation 1 to calculate the PAFs for all 6 intervention scenarios considered in this study. The incidence rate under a specific intervention was then calculated using the following equation:

Incidence rate under a specific intervention = incidence rate under the status quo × (1 – PAF under a specific intervention) (2)
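Equation 2 translates directly into code. In the sketch below, the function name and the PAF value of 0.5 are hypothetical; the status quo rate of 8.2 per 100,000 women-years is the 2016 age-standardized rate quoted in the Discussion and is used only as an example input.

```python
def incidence_under_intervention(rate_status_quo: float, paf: float) -> float:
    """Equation 2: status quo incidence rate scaled by (1 - PAF)."""
    if not 0.0 <= paf <= 1.0:
        # Assumed range for a preventive intervention.
        raise ValueError("PAF is expected to lie between 0 and 1")
    return rate_status_quo * (1.0 - paf)


if __name__ == "__main__":
    # 8.2 per 100,000 women-years with a hypothetical PAF of 0.5.
    print(incidence_under_intervention(8.2, 0.5))
```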

Data management and analyses were performed using SAS statistical software (version 9.4; SAS Institute Inc).

Ethical Considerations

The study was based solely on deidentified aggregate data, without access to individual records. This study protocol was approved by the National Taiwan University Research Ethics Committee (NTU-REC 202101HM030) and the data release review board of the Health Promotion Administration, Ministry of Health and Welfare in Taiwan. All methods were performed in accordance with the relevant guidelines and regulations. In addition, the National Taiwan University Research Ethics Committee waived the requirement for informed consent because the study used secondary data containing no personal information.

The selected projection model under the status quo was a polynomial APC model with a log link function and 55% attenuation (SMAPE=6.1%). Figure S1 in Multimedia Appendix 1 presents the observed and model-fitted age-standardized (WHO 2000 standard population [ 28 ]) cervical cancer incidence rates from 1997 to 2016 and the projections from 2017 to 2050. A declining trend in cervical cancer incidence rate was observed in the 20-year study period. Under the status quo observed in 2016 (cytology-based screening compliance of 43.5% and no HPV vaccination), the projected cervical cancer incidence rate will not fall below 4 new cases per 100,000 women-years by 2050.
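For readers unfamiliar with the model selection criterion, the sketch below computes the symmetric mean absolute percentage error (SMAPE) using one common definition. Several variants of SMAPE exist, and the exact formula used by the authors is not stated in this section, so this definition is an assumption.

```python
from typing import Sequence


def smape(observed: Sequence[float], fitted: Sequence[float]) -> float:
    """SMAPE in percent: mean of |F - A| / ((|A| + |F|) / 2) * 100.
    This is one common variant, assumed here for illustration."""
    if len(observed) != len(fitted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(observed, fitted):
        denom = (abs(a) + abs(f)) / 2.0
        # Treat a pair of zeros as a perfect fit to avoid dividing by zero.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100.0 * total / len(observed)


if __name__ == "__main__":
    # Hypothetical observed and fitted rates per 100,000 women-years.
    print(smape([28.0, 15.0, 8.2], [27.0, 15.5, 8.0]))
```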

Figure 1 shows the expected incidence rates of cervical cancer from 2023 to 2030 under 3 scenarios: the current situation, if adherence to cytology-based screening increases to 70%, and if adherence to HPV-based screening rises to 70%. The projection indicates that cervical cancer in Taiwan will not reach the goal of elimination by 2050 if cytology-based screening compliance remains at the current level of 43.5%. However, if cytology-based and HPV-based screening compliance are increased to 70%, cervical cancer elimination can be achieved by 2047 and 2035, respectively.

Figure 2 shows the projected cervical cancer incidence rate with an HPV vaccination coverage of 90%. An HPV vaccination coverage of 90% will eliminate cervical cancer by 2050. Figure 3 presents the projected cervical cancer incidence rates if cytology-based or HPV-based screening is applied (compliance raised to 70%) in conjunction with HPV vaccination (90% coverage). Both joint interventions will help to achieve cervical cancer elimination before 2050 and at an earlier year (2038 and 2033, respectively).

The years of elimination (if before 2050) for the various scenarios are shown in Table 1 . For comparison, the same table also shows the results when the Segi standard population [ 32 ] was used for age standardization. The time to elimination was expedited by a few years when the Segi standard was used.


a The first year when the projected age-standardized cervical cancer incidence rate falls below 4 cases per 100,000 women-years.

b HPV: human papillomavirus.

c The projected age-standardized cervical cancer incidence rate would not fall below 4 cases per 100,000 women-years before 2050.
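The definition of the year of elimination in footnote a can be expressed as a short Python sketch. The function name, the 2050 horizon parameter, and the projected rates in the example are hypothetical; only the threshold of 4 cases per 100,000 women-years comes from the text.

```python
from typing import Mapping, Optional

ELIMINATION_THRESHOLD = 4.0  # cases per 100,000 women-years


def elimination_year(projected: Mapping[int, float],
                     threshold: float = ELIMINATION_THRESHOLD,
                     horizon: int = 2050) -> Optional[int]:
    """First year, up to the horizon, in which the projected
    age-standardized rate falls strictly below the threshold.
    Returns None when the threshold is not crossed in time."""
    for year in sorted(projected):
        if year > horizon:
            break
        if projected[year] < threshold:
            return year
    return None


if __name__ == "__main__":
    # Hypothetical projected rates, for illustration only.
    rates = {2045: 4.3, 2046: 4.1, 2047: 3.9, 2048: 3.8}
    print(elimination_year(rates))
```

Returning `None` corresponds to footnote c, where the rate does not fall below the threshold before 2050.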

Principal Findings

In summary, our macrosimulation analysis projected that Taiwan could eliminate cervical cancer by 2050 through 70% compliance with screening (cytology based or HPV based) or 90% coverage of HPV vaccination. Specifically, the projected elimination years are 2047 or 2035 for screening (cytology based or HPV based, respectively), 2050 for vaccination, and 2038 or 2033 for a combination of both screening and vaccination. This study confirms earlier microsimulation findings that increased screening can fast-track cervical cancer elimination [ 3 , 7 - 14 ]. Our macrosimulation methodology offers both adaptability and ease of implementation, as demonstrated by the SAS code in Multimedia Appendix 2 . Unique to Taiwan is its high HPV vaccine coverage, a legacy of successful vaccine campaigns, facilitated by extensive health care infrastructure [ 33 ]. In stark contrast, Japan saw HPV vaccine coverage collapse from 70% to nearly 0% between 2013 and 2019 during a vaccine hesitancy crisis [ 14 ].

Cytology-based screening is crucial for cervical cancer management [ 34 ]. It is recommended that women aged 30 years and older undergo cytology-based screening at least once every 3 years (effective screening). Historically, cervical cancer incidence has steadily decreased because of opportunistic and organized cytology-based screening [ 35 ]. From 1974 to 1984, Taiwan launched an opportunistic cytology-based screening program for cervical cancer, facilitated through partnerships between the cancer society and gynecology and obstetrics clinics [ 36 ]. A major turning point came in 1995 when Taiwan’s Health Promotion Administration set up an organized cytology-based screening initiative targeting women aged 30 years and older. This represented a significant advancement in Taiwan’s efforts to tackle cervical cancer. Consequently, the age-standardized incidence rate for invasive cervical cancer in Taiwan dropped from 28.0 per 100,000 women-years in 1997 to 8.2 per 100,000 women-years in 2016. This reduction marked a shift from Taiwan being a high-risk region to becoming a low-to-medium-risk area for the disease (Figure S2 in Multimedia Appendix 1 ).

However, the need for improved compliance with effective screening is a global issue, and Taiwan is no exception [ 37 ]. Although approximately 82% of Taiwanese women have undergone at least one screening since 1995, the overall effectiveness of screening implementation remains below the desired level. The participation rate for triennial screenings, aimed at women aged 30 to 69 years, is only around 54%. Moreover, participation drops even further among women aged 69 years and older. These data underscore the pressing need for improved outreach and accessibility to achieve more comprehensive and impactful screening initiatives [ 38 ]. Various strategies have been adopted to promote screening, including educational interventions, physician reminders, incentive programs, mass media campaigns, outreach to community members, and leveraging community health workers [ 39 - 41 ]. Despite all these efforts, the compliance rate of effective screening in 2016 was only 43.5% in Taiwan. Our analysis indicates that cervical cancer in Taiwan can be eliminated in 2047 only if there is 70% compliance with cytology-based screening. To enhance the effectiveness of screening, strategies could include distributing informative pamphlets to schoolchildren to share with adult women in their families; authorizing self-collected HPV screening kits (Taiwan currently only approves the use of clinical-based HPV screening kits); and targeting older, previously unscreened and unvaccinated groups.

Cytology-based screening as a primary mode of cervical cancer screening has been gradually replaced by HPV-based screening, which has 70% greater protection against invasive cervical cancer [ 24 ]. HPV testing, which is recommended by WHO, may increase engagement in cervical cancer screening programs [ 42 ]. Taiwan has been implementing cytology-based screening for more than 30 years and has successfully decreased the incidence rate of cervical cancer through its organized program; however, compliance still needs to be improved. The Health Promotion Administration in Taiwan is now considering implementing the more effective HPV-based screening to achieve the goal of elimination faster. This study demonstrated that HPV-based screening could achieve the goal of eliminating cervical cancer more swiftly than cytology-based screening, given the same conditions. This approach can be used as a model for other countries.

If HPV vaccine efficacy wanes, cervical cancer prevention and screening protocols may need to change [ 43 ]. However, long-term studies since the vaccine’s introduction in 2006 show sustained high antibody levels, suggesting that it could offer near-lifelong protection against cervical cancer [ 44 - 50 ]. Nevertheless, multiple factors hinder the implementation and scaling-up of HPV vaccination, such as vaccine supply shortages [ 51 , 52 ], budgetary constraints [ 53 ], and hesitancy due to vaccine-related side effects [ 54 ]. Globally, only a few high-income countries have offered the HPV vaccine for the target age group with a coverage rate above the WHO-recommended threshold of 90% [ 51 , 55 ]. By comparison, the HPV vaccination program introduced in Taiwan in 2018 has been relatively successful, with coverage rates of 76.8% in 2018 and 86.9% in 2019. This rate is very likely to reach 90% in 2022 (scenario 3 in this study); if it does, we project that cervical cancer in Taiwan can be eliminated in 2050. The effect of HPV vaccination (on 13-year-old girls) can only appear after the vaccinated cohort reaches the high-risk ages (45 years and older) for cervical cancer; this is the cohort effect. This is why it takes longer to achieve the goal of cervical cancer elimination with an HPV vaccination coverage of 90% (scenario 3 in this study) than with cytology- or HPV-based screening with a compliance of 70% (scenarios 1 and 2 in this study): the goal is reached in 2050 versus 2047 or 2035.
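The timing behind the cohort effect can be checked with simple arithmetic. The sketch below uses only figures stated in the text (vaccination at age 13 starting in 2018, high-risk ages from 45 years); the function name is hypothetical, and the calculation is an illustration of the delay rather than a description of the projection model itself.

```python
def year_cohort_reaches_age(vaccination_year: int,
                            age_at_vaccination: int,
                            target_age: int) -> int:
    """Calendar year in which a cohort vaccinated at a given age
    reaches the target age."""
    return vaccination_year + (target_age - age_at_vaccination)


if __name__ == "__main__":
    # Girls vaccinated at age 13 in 2018 reach age 45 in this year.
    print(year_cohort_reaches_age(2018, 13, 45))  # 2050
```

The first publicly vaccinated cohort therefore enters the high-risk age range only around 2050, which is consistent with vaccination alone taking longer than screening to reach the elimination threshold.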

WHO announced the 90-70-90 target to achieve cervical cancer elimination: HPV vaccination coverage rate of 90% of girls by the age of 15 years, twice-lifetime screening of 70% of women (by the age of 35 and 45 years), and treatment of 90% of women with cervical disease [ 23 ]. The respective latest figures in Taiwan were 90% (HPV vaccination coverage rate; the actual figure could be higher because some women were vaccinated at their own expense and were not tallied), 43% (effective screening compliance rate), and 90% (proportion of women who screened positive and received treatment), with the effective screening compliance rate lagging far behind the 90-70-90 target. Our analysis shows that if the effective screening compliance rate can be increased to 70% in conjunction with an HPV coverage rate of 90%, Taiwan can achieve the goal of cervical cancer elimination in 2038 (scenario 4 in this study). However, it is worth noting that although the incidence rates of cervical cancer among all age groups have declined in Taiwan, there is a slightly increasing trend in the 30-34 years age group among the most recent birth cohort (Figure S3 in Multimedia Appendix 1 ). This warrants attention from health authorities.

Strengths and Limitations

The APC macrosimulation framework developed in this study is a useful policy analysis tool for disease control. The policy analysis results in this study can serve as a reference for other countries with a high incidence of cervical cancer. The strengths of this study lie in its rigorous analytical models and high-quality data. Nonetheless, the study is not without limitations. First, we do not have access to individual-level data, and the study is therefore prone to ecological fallacy. Second, we used data spanning from 1997 to 2016 as the basis for making future predictions. This data range predates the delayed diagnosis and registration of cancer cases that occurred due to the COVID-19 pandemic [ 56 ]. Consequently, our short- to medium-term projections for cervical cancer incidence do not account for the pandemic’s effects and may be biased. Nonetheless, this is unlikely to affect the study’s primary conclusions, which focus on long-term projections. Finally, public funding for HPV vaccination in Taiwan is restricted to 13-year-old girls, leaving female individuals in other age groups the choice to opt for private vaccination. This study concentrates mainly on the consequences of the publicly funded HPV vaccination program and does not incorporate the likely effects of privately funded vaccinations in a wider age range. The high rate of HPV vaccination also fosters benefits through herd immunity [ 57 ], which could mean that our evaluations underestimate the full potential impact of the vaccination program.

Acknowledgments

The content of this research may not represent the opinion of the Health Promotion Administration, Ministry of Health and Welfare in Taiwan. The authors used Grammarly and ChatGPT (OpenAI) [ 58 ] to correct the grammar in the paper. This work was supported by grants from the Health Promotion Administration, the Ministry of Health and Welfare in Taiwan (A1111010; Tobacco Health and Welfare Taxation) and the National Science and Technology Council in Taiwan (MOST 111-2314-B-002-089-MY3). The funders had no role in study design, data collection, and analysis; the decision to publish; or the preparation of the paper.

Data Availability

The data sets generated and analyzed during this study are available from the corresponding author on reasonable request.

Conflicts of Interest

None declared.

Additional data regarding incidence rate ratios, proportions of women, and cervical cancer incidence rates.

  • Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394-424.
  • WHO director-general calls for all countries to take action to help end the suffering caused by cervical cancer. World Health Organization. 2018. URL: https://www.who.int/news/item/18-05-2018-who-dg-calls-for-all-countries-to-take-action-to-help-end-the-suffering-caused-by-cervical-cancer [accessed 2024-08-23]
  • Brisson M, Kim JJ, Canfell K, Drolet M, Gingras G, Burger EA, et al. Impact of HPV vaccination and cervical screening on cervical cancer elimination: a comparative modelling analysis in 78 low-income and lower-middle-income countries. Lancet. 2020;395(10224):575-590.
  • Falcaro M, Castañon A, Ndlela B, Checchi M, Soldan K, Lopez-Bernal J, et al. The effects of the national HPV vaccination programme in England, UK, on cervical cancer and grade 3 cervical intraepithelial neoplasia incidence: a register-based observational study. Lancet. 2021;398(10316):2084-2092.
  • Lei J, Ploner A, Elfström KM, Wang J, Roth A, Fang F, et al. HPV vaccination and the risk of invasive cervical cancer. N Engl J Med. 2020;383(14):1340-1348.
  • Campos NG, Demarco M, Bruni L, Desai KT, Gage JC, Adebamowo SN, et al. A proposed new generation of evidence-based microsimulation models to inform global control of cervical cancer. Prev Med. 2021;144:106438.
  • Burger EA, Smith MA, Killen J, Sy S, Simms KT, Canfell K, et al. Projected time to elimination of cervical cancer in the USA: a comparative modelling study. Lancet Public Health. 2020;5(4):e213-e222.
  • Hall MT, Simms KT, Lew JB, Smith MA, Brotherton JMI, Saville M, et al. The projected timeframe until cervical cancer elimination in Australia: a modelling study. Lancet Public Health. 2019;4(1):e19-e27.
  • Xia C, Hu S, Xu X, Zhao X, Qiao Y, Broutet N, et al. Projections up to 2100 and a budget optimisation strategy towards cervical cancer elimination in China: a modelling study. Lancet Public Health. 2019;4(9):e462-e472.
  • Portnoy A, Pedersen K, Trogstad L, Hansen BT, Feiring B, Laake I, et al. Impact and cost-effectiveness of strategies to accelerate cervical cancer elimination: a model-based analysis. Prev Med. 2021;144:106276.
  • Drolet M, Laprise JF, Martin D, Jit M, Bénard É, Gingras G, et al. Optimal human papillomavirus vaccination strategies to prevent cervical cancer in low-income and middle-income countries in the context of limited resources: a mathematical modelling analysis. Lancet Infect Dis. 2021;21(11):1598-1610.
  • Simms KT, Steinberg J, Caruana M, Smith MA, Lew JB, Soerjomataram I, et al. Impact of scaled up human papillomavirus vaccination and cervical screening and the potential for global elimination of cervical cancer in 181 countries, 2020-99: a modelling study. Lancet Oncol. 2019;20(3):394-407.
  • Xia C, Xu X, Zhao X, Hu S, Qiao Y, Zhang Y, et al. Effectiveness and cost-effectiveness of eliminating cervical cancer through a tailored optimal pathway: a modeling study. BMC Med. 2021;19(1):62.
  • Simms KT, Hanley SJB, Smith MA, Keane A, Canfell K. Impact of HPV vaccine hesitancy on cervical cancer in Japan: a modelling study. Lancet Public Health. 2020;5(4):e223-e234.
  • Su SY, Chiang CJ, Yang YW, Lee WC. Secular trends in liver cancer incidence from 1997 to 2014 in Taiwan and projection to 2035: an age-period-cohort analysis. J Formos Med Assoc. 2019;118(1 Pt 3):444-449.
  • Su SY, Lee WC. Mortality trends of liver diseases from 1981 to 2016 and the projection to 2035 in Taiwan: an age-period-cohort analysis. Liver Int. 2019;39(4):770-776.
  • Hsiao BY, Su SY, Jhuang JR, Chiang CJ, Yang YW, Lee WC. Ensemble forecasting of a continuously decreasing trend in bladder cancer incidence in Taiwan. Sci Rep. 2021;11(1):8373.
  • Jhuang JR, Su SY, Chiang CJ, Yang YW, Lin LJ, Hsu TH, et al. Forecast of peak attainment and imminent decline after 2017 of oral cancer incidence in men in Taiwan. Sci Rep. 2022;12(1):5726.
  • Chen YC, Su SY, Jhuang JR, Chiang CJ, Yang YW, Wu CC, et al. Forecast of a future leveling of the incidence trends of female breast cancer in Taiwan: an age-period-cohort analysis. Sci Rep. 2022;12(1):12481.
  • Vaccarella S, Franceschi S, Engholm G, Lönnberg S, Khan S, Bray F. 50 years of screening in the Nordic countries: quantifying the effects on cervical cancer incidence. Br J Cancer. 2014;111(5):965-969.
  • Andrae B, Andersson TML, Lambert PC, Kemetli L, Silfverdal L, Strander B, et al. Screening and cervical cancer cure: population based cohort study. BMJ. 2012;344:e900.
  • Arbyn M, Raifu AO, Weiderpass E, Bray F, Anttila A. Trends of cervical cancer mortality in the member states of the European Union. Eur J Cancer. 2009;45(15):2640-2648.
  • Das M. WHO launches strategy to accelerate elimination of cervical cancer. Lancet Oncol. 2021;22(1):20-21.
  • Ronco G, Dillner J, Elfström KM, Tunesi S, Snijders PJF, Arbyn M, et al. Efficacy of HPV-based screening for prevention of invasive cervical cancer: follow-up of four European randomised controlled trials. Lancet. 2014;383(9916):524-532.
  • Chiang CJ, You SL, Chen CJ, Yang YW, Lo WC, Lai MS. Quality assessment and improvement of nationwide cancer registration system in Taiwan: a review. Jpn J Clin Oncol. 2015;45(3):291-296.
  • Chiang CJ, Wang YW, Lee WC. Taiwan's Nationwide Cancer Registry System of 40 years: past, present, and future. J Formos Med Assoc. 2019;118(5):856-858.
  • Kao CW, Chiang CJ, Lin LJ, Huang CW, Lee WC, Lee MY, et al. Accuracy of long-form data in the Taiwan Cancer Registry. J Formos Med Assoc. 2021;120(11):2037-2041.
  • Standard populations (millions) for age-adjustment. National Institutes of Health. URL: https://seer.cancer.gov/stdpopulations/ [accessed 2024-03-04]
  • Shield KD, Parkin DM, Whiteman DC, Rehm J, Viallon V, Micallef CM, et al. Population attributable and preventable fractions: cancer risk factor surveillance, and cancer policy projection. Curr Epidemiol Rep. 2016;3(3):201-211.
  • Chen YY, You SL, Koong SL, Liu J, Chen CA, Chen CJ, et al. Screening frequency and atypical cells and the prediction of cervical cancer risk. Obstet Gynecol. 2014;123(5):1003-1011.
  • Choi HCW, Jit M, Leung GM, Tsui K, Wu JT. Simultaneously characterizing the comparative economics of routine female adolescent nonavalent human papillomavirus (HPV) vaccination and assortativity of sexual mixing in Hong Kong Chinese: a modeling analysis. BMC Med. 2018;16(1):127.
  • Boniol M, Heanue M. Chapter 7: Age-standardisation and denominators. In: Curado MP, Edwards B, Shin HR, Storm H, Ferlay J, Heanue M, et al, editors. Cancer Incidence in Five Continents. Lyon. IARC Press, International Agency for Research on Cancer; 2007;99-101.
  • Chen DS, Hsu NH, Sung JL, Hsu TC, Hsu ST, Kuo YT, et al. A mass vaccination program in Taiwan against hepatitis B virus infection in infants of hepatitis B surface antigen-carrier mothers. JAMA. 1987;257(19):2597-2603.
  • Rodriguez NM. Participatory innovation for human papillomavirus screening to accelerate the elimination of cervical cancer. Lancet Glob Health. 2021;9(5):e582-e583.
  • Harper DM, Jimbo M. Elimination of cervical cancer depends on HPV vaccination and primary HPV screening. Lancet Infect Dis. 2021;21(10):1342-1344.
  • Chou P, Chen V. Mass screening for cervical cancer in Taiwan from 1974 to 1984. Cancer. 1989;64(4):962-968.
  • Gakidou E, Nordhagen S, Obermeyer Z. Coverage of cervical cancer screening in 57 countries: low average levels and large inequalities. PLoS Med. 2008;5(6):e132.
  • Wu MH, Lin LJ, Wu CC. Towards the elimination of cervical cancer in Taiwan. J Formos Med Assoc. 2022;121(7):1188-1190.
  • Koong SL, Yen AMF, Chen THH. Efficacy and cost-effectiveness of nationwide cervical cancer screening in Taiwan. J Med Screen. 2006;13(Suppl 1):S44-S47.
  • Chen YY, You SL, Chen CA, Shih LY, Koong SL, Chao KY, et al. Effectiveness of national cervical cancer screening programme in Taiwan: 12-year experiences. Br J Cancer. 2009;101(1):174-177.
  • Chiou ST, Lu TH. Changes in geographic variation in the uptake of cervical cancer screening in Taiwan: possible effects of "leadership style factor"? Health Policy. 2014;114(1):64-70.
  • Global strategy to accelerate the elimination of cervical cancer as a public health problem. World Health Organization. 2020. URL: https://www.who.int/publications/i/item/9789240014107 [accessed 2024-02-17]
  • Hoes J, Pasmans H, Schurink-van 't Klooster TM, van der Klis FRM, Donken R, Berkhof J, et al. Review of long-term immunogenicity following HPV vaccination: gaps in current knowledge. Hum Vaccin Immunother. 2022;18(1):1908059.
  • Kjaer SK, Nygård M, Sundström K, Dillner J, Tryggvadottir L, Munk C, et al. Final analysis of a 14-year long-term follow-up study of the effectiveness and immunogenicity of the quadrivalent human papillomavirus vaccine in women from four nordic countries. EClinicalMedicine. 2020;23:100401.
  • Olsson SE, Restrepo JA, Reina JC, Pitisuttithum P, Ulied A, Varman M, et al. Long-term immunogenicity, effectiveness, and safety of nine-valent human papillomavirus vaccine in girls and boys 9 to 15 years of age: interim analysis after 8 years of follow-up. Papillomavirus Res. 2020;10:100203.
  • Artemchuk H, Eriksson T, Poljak M, Surcel HM, Dillner J, Lehtinen M, et al. Long-term antibody response to human papillomavirus vaccines: up to 12 years of follow-up in the finnish maternity cohort. J Infect Dis. 2019;219(4):582-589.
  • Iversen OE, Miranda MJ, Ulied A, Soerdal T, Lazarus E, Chokephaibulkit K, et al. Immunogenicity of the 9-valent HPV vaccine using 2-dose regimens in girls and boys vs a 3-dose regimen in women. JAMA. 2016;316(22):2411-2421.
  • Donken R, Dobson SRM, Marty KD, Cook D, Sauvageau C, Gilca V, et al. Immunogenicity of 2 and 3 doses of the quadrivalent human papillomavirus vaccine up to 120 months postvaccination: follow-up of a randomized clinical trial. Clin Infect Dis. 2020;71(4):1022-1029.
  • Kreimer AR, Sampson JN, Porras C, Schiller JT, Kemp T, Herrero R, et al. Evaluation of durability of a single dose of the bivalent HPV vaccine: the CVT trial. J Natl Cancer Inst. 2020;112(10):1038-1046.
  • Toh ZQ, Russell FM, Reyburn R, Fong J, Tuivaga E, Ratu T, et al. Sustained antibody responses 6 years following 1, 2, or 3 doses of quadrivalent human papillomavirus (HPV) vaccine in adolescent Fijian girls, and subsequent responses to a single dose of bivalent HPV vaccine: a prospective cohort study. Clin Infect Dis. 2017;64(7):852-859.
  • Bonjour M, Charvat H, Franco EL, Piñeros M, Clifford GM, Bray F, et al. Global estimates of expected and preventable cervical cancers among girls born between 2005 and 2014: a birth cohort analysis. Lancet Public Health. 2021;6(7):e510-e521.
  • No authors listed. Meeting of the strategic advisory group of experts on immunization, October 2015—conclusions and recommendations. Wkly Epidemiol Rec. 2015;90(50):681-699.
  • Alonso S, Cambaco O, Maússe Y, Matsinhe G, Macete E, Menéndez C, et al. Costs associated with delivering HPV vaccination in the context of the first year demonstration programme in southern Mozambique. BMC Public Health. 2019;19(1):1031.
  • Hanley SJB, Yoshioka E, Ito Y, Kishi R. HPV vaccination crisis in Japan. Lancet. 2015;385(9987):2571.
  • Human papillomavirus (HPV) vaccination coverage. World Health Organization. 2023. URL: https://immunizationdata.who.int/pages/coverage/hpv.html [accessed 2024-08-23]
  • Oymans EJ, de Kroon CD, Bart J, Nijman HW, van der Aa MA. Incidence of gynaecological cancer during the COVID-19 pandemic: a population-based study in the Netherlands. Cancer Epidemiol. 2023;85:102405.
  • Drolet M, Bénard É, Boily MC, Ali H, Baandrup L, Bauer H, et al. Population-level impact and herd effects following human papillomavirus vaccination programmes: a systematic review and meta-analysis. Lancet Infect Dis. 2015;15(5):565-580.
  • ChatGPT. OpenAI. URL: https://chat.openai.com/ [accessed 2024-03-19]

Abbreviations

Edited by A Mavragani, T Sanchez; submitted 08.02.23; peer-reviewed by Z Liu, YP Liaw; comments to author 21.08.23; revised version received 11.09.23; accepted 08.02.24; published 18.04.24.

©Yi-Chu Chen, Yun-Yuan Chen, Shih-Yung Su, Jing-Rong Jhuang, Chun-Ju Chiang, Ya-Wen Yang, Li-Ju Lin, Chao-Chun Wu, Wen-Chung Lee. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 18.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.

  • Open access
  • Published: 22 April 2024

The influence of maternal prepregnancy weight and gestational weight gain on the umbilical cord blood metabolome: a case–control study

  • Xianxian Yuan   ORCID: orcid.org/0000-0001-8762-8471 1 ,
  • Yuru Ma 1 ,
  • Jia Wang 2 ,
  • Yan Zhao 1 ,
  • Wei Zheng 1 ,
  • Ruihua Yang 1 ,
  • Lirui Zhang 1 ,
  • Xin Yan 1 &
  • Guanghui Li   ORCID: orcid.org/0000-0003-2290-1515 1  

BMC Pregnancy and Childbirth volume 24, Article number: 297 (2024)


Maternal overweight/obesity and excessive gestational weight gain (GWG) are frequently reported to be risk factors for obesity and other metabolic disorders in offspring. Cord blood metabolites provide information on fetal nutritional and metabolic health and could provide an early window of detection of potential health issues among newborns. The aim of the study was to explore the impact of maternal prepregnancy overweight/obesity and excessive GWG on cord blood metabolic profiles.

This case-control study included 33 pairs of mothers with prepregnancy overweight/obesity and their neonates, 30 pairs of mothers with excessive GWG and their neonates, and 32 control mother-neonate pairs. Untargeted metabolomic profiling of umbilical cord blood samples was performed using UHPLC‒MS/MS.

Forty-six metabolites exhibited a significant increase and 60 metabolites exhibited a significant reduction in umbilical cord blood from overweight and obese mothers compared with mothers with normal body weight. Steroid hormone biosynthesis and neuroactive ligand‒receptor interactions were the two top-ranking pathways enriched with these metabolites ( P  = 0.01 and 0.03, respectively). Compared with mothers with normal GWG, in mothers with excessive GWG, the levels of 63 metabolites were increased and those of 46 metabolites were decreased in umbilical cord blood. Biosynthesis of unsaturated fatty acids was the most altered pathway enriched with these metabolites ( P  < 0.01).

Conclusions

Prepregnancy overweight and obesity affected the fetal steroid hormone biosynthesis pathway, while excessive GWG affected fetal fatty acid metabolism. This emphasizes the importance of preconception weight loss and maintaining an appropriate GWG, which are beneficial for the long-term metabolic health of offspring.


The obesity epidemic is an important public health problem in developed and developing countries [ 1 ] and is associated with the emergence of chronic noncommunicable diseases, including type 2 diabetes mellitus (T2DM), hypertension, cardiovascular disease, nonalcoholic fatty liver disease (NAFLD), and cancer [ 2 , 3 , 4 ]. Maternal obesity is the most common metabolic disturbance in pregnancy, and the prevalence of obesity among women of childbearing age is 7.1% to 31.9% in some countries [ 5 ]. In China, the prevalence of overweight and obesity has also increased rapidly in the past four decades. Based on Chinese criteria, the latest national prevalence estimates for 2015–2019 were 34.3% for overweight and 16.4% for obesity in adults (≥ 18 years of age) [ 6 ].

Increasing evidence implicates overnutrition in utero as a major determinant of the health of offspring during childhood and adulthood, which is compatible with the developmental origins of health and disease (DOHaD) framework [ 7 ]. Maternal obesity and excessive gestational weight gain (GWG) are important risk factors for several adverse maternal outcomes, including gestational diabetes and hypertensive disorders, fetal death, and preterm birth [ 8 , 9 , 10 ]. More importantly, they have negative implications for offspring, both perinatally and later in life. Evidence from cohort studies focusing on offspring development confirms the relationship between maternal obesity/excessive GWG and offspring obesity programming [ 11 , 12 , 13 ]. Currently, there is no unified mechanism to explain the adverse outcomes associated with maternal obesity and excessive GWG, which may reflect the independent and interactive effects of the obese maternal phenotype itself and the diet associated with this phenotype. In addition to genetic and environmental factors, metabolic programming may also lead to the intergenerational transmission of obesity through epigenetic mechanisms.

Metabolomics, which reflects the metabolic phenotype of human subjects and animals, is the profiling of metabolites in biofluids, cells and tissues using high-throughput platforms, such as mass spectrometry. It has unique potential in identifying biomarkers for predicting occurrence, severity, and progression of diseases, as well as exploring underlying mechanistic abnormalities [ 14 , 15 ]. Umbilical cord metabolites can provide information about fetal nutritional and metabolic health, and may provide an early window for detection of potential health issues in newborns [ 16 ]. Previous studies have reported differences in umbilical cord metabolite profiles associated with maternal obesity [ 17 , 18 ]. However, the results were inconsistent due to differences in sample sizes, ethnicity and region, and mass spectrometry. In addition, most studies have not considered the difference in the effects of prepregnancy body mass index (BMI) and GWG on cord blood metabolites.

To investigate the relationship between early metabolic programming and the increased incidence of metabolic diseases in offspring, we studied the associations between elevated prepregnancy BMI/excessive GWG and umbilical cord metabolic profiles. Another purpose of this study was to explore whether there were differences in the effects of prepregnancy overweight/obesity and excessive GWG on cord blood metabolites.

Study population

This was a hospital-based case-control study that included women with singleton pregnancies who received prenatal care and delivered vaginally at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, from January 2022 to March 2022. We selected 33 pregnant women with a prepregnancy BMI ≥ 24.0 kg/m² regardless of their gestational weight gain as the overweight/obese group, 30 pregnant women with a prepregnancy BMI of 18.5–23.9 kg/m² and a GWG > 14.0 kg as the excessive GWG group, and 32 pregnant women with a prepregnancy BMI of 18.5–23.9 kg/m² and a GWG of 8.0–14.0 kg as the control group. The ages of the three groups were matched (± 1.0 years), and the prepregnancy BMIs of the excessive GWG and control groups were matched (± 1.0 kg/m²).

The inclusion criteria were a singleton pregnancy, age between 20 and 45 years, full-term delivery (gestational age ≥ 37 weeks), a prepregnancy BMI ≥ 18.5 kg/m², no prepregnancy diabetes mellitus (DM) or hypertension, and no gestational diabetes mellitus (GDM). The exclusion criteria were multiple pregnancy, age less than 20 years or more than 45 years, a prepregnancy BMI < 18.5 kg/m², prepregnancy DM or hypertension, GDM, and the absence of a cord blood sample.

We classified pregnant women into BMI categories based on Chinese guidelines [ 19 ]: normal weight (prepregnancy BMI 18.5–23.9 kg/m²), overweight (prepregnancy BMI 24.0–27.9 kg/m²), and obese (prepregnancy BMI ≥ 28.0 kg/m²). GWG guideline concordance was defined by the 2021 Chinese Nutrition Society recommendations according to prepregnancy BMI. The upper limits of GWG for normal weight, overweight, and obesity were 14.0 kg, 11.0 kg, and 9.0 kg, respectively.
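The grouping rules above can be written down compactly. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and treating the category boundaries as half-open intervals (for example, BMI < 24.0 as normal weight) is an assumption made so that unrounded BMI values are handled consistently.

```python
# Sketch only: names are hypothetical and not taken from the study.

# Upper GWG limits (kg) by prepregnancy BMI category, as quoted in the text.
GWG_UPPER_LIMIT_KG = {"normal": 14.0, "overweight": 11.0, "obese": 9.0}


def bmi_category(bmi):
    """Classify prepregnancy BMI (kg/m^2) with the Chinese cut-offs in the text."""
    if bmi < 18.5:
        return "underweight"  # excluded from the study
    if bmi < 24.0:
        return "normal"
    if bmi < 28.0:
        return "overweight"
    return "obese"


def has_excessive_gwg(bmi, gwg_kg):
    """Return True when GWG exceeds the upper limit for the BMI category."""
    category = bmi_category(bmi)
    if category == "underweight":
        raise ValueError("no GWG limit is quoted in the text for BMI < 18.5")
    return gwg_kg > GWG_UPPER_LIMIT_KG[category]


def study_group(bmi, gwg_kg):
    """Assign one of the three study groups, or None if no group applies."""
    category = bmi_category(bmi)
    if category in ("overweight", "obese"):
        return "overweight/obese"  # enrolled regardless of GWG
    if category == "normal" and gwg_kg > 14.0:
        return "excessive GWG"
    if category == "normal" and 8.0 <= gwg_kg <= 14.0:
        return "control"
    return None  # underweight, or normal weight with GWG < 8.0 kg
```

Under these assumptions, a woman with a prepregnancy BMI of 20.3 kg/m² and a GWG of 17.0 kg falls into the excessive GWG group, while a woman in the overweight/obese group is additionally flagged by `has_excessive_gwg` only when her GWG exceeds 11.0 kg (overweight) or 9.0 kg (obese).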

Written informed consent was obtained from all participants. The study was performed in accordance with the Declaration of Helsinki, and the procedures were approved by the ethics committee of Beijing Obstetrics and Gynecology Hospital, Capital Medical University (2021-KY-037).

Sample and data collection

Maternal and neonatal clinical data were collected from the electronic medical records system of Beijing Obstetrics and Gynecology Hospital. Maternal clinical characteristics included age, height, prepregnancy and predelivery weight, education level, smoking and drinking status during pregnancy, parity, conception method, comorbidities and complications of pregnancy, family history of DM and hypertension, gestational age, mode of delivery, and biochemical results during pregnancy. Prepregnancy BMI was calculated as prepregnancy weight in kilograms divided by the square of height in meters. GWG was determined by subtracting the prepregnancy weight in kilograms from the predelivery weight in kilograms. GDM was defined using the IADPSG diagnostic criteria, based on the fasting and 1- and 2-h glucose concentrations of the oral glucose tolerance test (OGTT) performed at 24 to 28+6 weeks of gestation. Neonatal clinical characteristics included sex, birth weight and length. Macrosomia was defined as a birth weight of 4,000 g or more [ 20 ]. Low birth weight (LBW) was defined as a birth weight less than 2,500 g [ 21 ].
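The derived variables defined above follow directly from the recorded measurements. The Python sketch below illustrates them; the function names and the label "normal birth weight" are hypothetical choices for illustration and do not come from the study.

```python
# Sketch only: names and the "normal birth weight" label are hypothetical.

def prepregnancy_bmi(weight_kg, height_m):
    """Prepregnancy BMI: weight (kg) divided by the square of height (m)."""
    return weight_kg / height_m ** 2


def gestational_weight_gain(predelivery_weight_kg, prepregnancy_weight_kg):
    """GWG: predelivery weight minus prepregnancy weight, both in kg."""
    return predelivery_weight_kg - prepregnancy_weight_kg


def birth_weight_category(birth_weight_g):
    """Apply the macrosomia (>= 4,000 g) and LBW (< 2,500 g) definitions."""
    if birth_weight_g >= 4000:
        return "macrosomia"
    if birth_weight_g < 2500:
        return "low birth weight"
    return "normal birth weight"
```

For example, a prepregnancy weight of 60 kg at a height of 1.60 m gives a BMI of 60 / 1.60² ≈ 23.4 kg/m², and a predelivery weight of 72.5 kg then gives a GWG of 12.5 kg.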

Umbilical cord blood samples were obtained by trained midwives after clamping the cord at delivery. Whole blood samples were collected in EDTA tubes, refrigerated for < 24 h, and centrifuged at 2,000 r.p.m. at 4 ℃ for 10 min. Plasma aliquots were stored at -80 ℃ until shipment on dry ice to Novogene, Inc. (Beijing, China) for untargeted metabolomic analysis.

Untargeted metabolomic analyses

Ultrahigh-performance liquid chromatography tandem mass spectrometry (UHPLC‒MS/MS) analyses were performed using a Vanquish UHPLC system (Thermo Fisher, Germany) coupled with an Orbitrap Q Exactive™ HF mass spectrometer (Thermo Fisher, Germany) at Novogene Co., Ltd. (Beijing, China). Detailed descriptions of the sample preparation, mass spectrometry and automated metabolite identification procedures are provided in the Supplementary materials.

Statistical analysis

Clinical data statistical analysis

Quantitative data are shown as the mean ± standard deviation (SD) or median (interquartile range), and categorical data are presented as percentages. The Mann‒Whitney U test, chi-square test, and general linear repeated-measures model were used to assess the differences between the control and study groups when appropriate. A P value < 0.05 was considered statistically significant. All analyses were performed using the Statistical Package for the Social Sciences version 25.0 (SPSS 25.0) for Windows (SPSS Inc).

Umbilical cord metabolome statistical analysis

The metabolites were annotated using the Human Metabolome Database (HMDB) (https://hmdb.ca/metabolites), the LIPID MAPS database (http://www.lipidmaps.org/), and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (https://www.genome.jp/kegg/pathway.html). Principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) were performed with metaX. We applied univariate analysis (t test) to calculate the statistical significance (P value). Metabolites with a variable importance for the projection (VIP) > 1, a P value < 0.05 and a fold change (FC) ≥ 2 or FC ≤ 0.5 were considered to be differential metabolites. A false discovery rate (FDR) control was implemented to correct for multiple comparisons. The q-value, the FDR analog of the P value, was used with a threshold of 0.2. For clustering heatmaps, the data were normalized using z scores of the intensity areas of differential metabolites and were plotted with the pheatmap package in R.
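The screening rule and the FDR correction described above can be illustrated as follows. This is a minimal Python sketch rather than the metaX or R code used in the study. The text does not name the FDR procedure, so the Benjamini-Hochberg method is an assumption here, and the class and function names are hypothetical. The thresholds are exposed as parameters because the Results section reports screening with FC > 1.2 or < 0.833.

```python
from dataclasses import dataclass


@dataclass
class MetaboliteStat:
    """Per-metabolite summary statistics (hypothetical container)."""
    name: str
    vip: float          # VIP score from the PLS-DA model
    p_value: float      # univariate t-test P value
    fold_change: float  # mean in study group / mean in control group


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def is_differential(stat, vip_min=1.0, p_max=0.05, fc_up=2.0, fc_down=0.5):
    """Apply the VIP, P value and fold-change thresholds from the Methods."""
    fc_ok = stat.fold_change >= fc_up or stat.fold_change <= fc_down
    return stat.vip > vip_min and stat.p_value < p_max and fc_ok
```

With P values of 0.01, 0.04, 0.03 and 0.20, this procedure gives q-values of 0.04, 0.053, 0.053 and 0.20, so all four would pass a q-value threshold of 0.2.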

The correlations among differential metabolites were analyzed with cor() in R (method = "pearson"). The statistical significance of these correlations was calculated with cor.mtest() in R. A P value < 0.05 was considered statistically significant, and correlation plots were drawn with the corrplot package in R. The functions of these metabolites and metabolic pathways were studied using the KEGG database. A metabolic pathway was tested for enrichment of differential metabolites when the ratio x/n > y/N was satisfied, and the pathway was considered significantly enriched when P < 0.05.
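The two calculations in this paragraph can be sketched in plain Python. This is an illustration under stated assumptions, not the R code used in the study. The text does not define x, n, y and N, so the sketch assumes the usual over-representation setup: x differential metabolites in the pathway, n differential metabolites with a pathway annotation, y identified metabolites in the pathway, and N identified metabolites with a pathway annotation. The hypergeometric test is the one named in the Fig. 4 caption. The function names are hypothetical, and the P value of the correlation itself (cor.mtest() in R) is omitted.

```python
from math import comb, sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def is_over_represented(x, n, y, N):
    """Check the precondition x/n > y/N quoted in the text."""
    return x / n > y / N


def enrichment_p_value(x, n, y, N):
    """Upper-tail hypergeometric P value: P(X >= x) when n metabolites are
    drawn from N, of which y belong to the pathway."""
    upper = min(n, y)
    tail = sum(comb(y, k) * comb(N - y, n - k) for k in range(x, upper + 1))
    return tail / comb(N, n)
```

As a small worked case, with N = 10 annotated metabolites, y = 4 of them in a pathway, and n = 3 differential metabolites of which x = 2 are in the pathway, the precondition holds (2/3 > 4/10) and the P value is (36 + 4)/120 = 1/3.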

Demographic characteristics of study participants

The demographic and clinical characteristics of the three groups enrolled in the study are summarized in Table 1. Maternal age and gestational age did not differ significantly among the groups. Compared to the mothers in the excessive GWG and control groups, those in the prepregnancy overweight/obesity group had a significantly higher prepregnancy BMI (25.6 (24.5, 27.2) kg/m²). However, there was no significant difference in prepregnancy BMI between mothers in the excessive GWG group (20.3 ± 1.2 kg/m²) and mothers in the control group (20.6 ± 1.5 kg/m²). Mothers in the excessive GWG group had the highest GWG (17.0 (15.5, 19.1) kg) among the three groups. The mean GWG of the mothers in the prepregnancy overweight/obesity group was 12.9 ± 3.8 kg, which was similar to that of the control group (11.8 ± 1.5 kg). Notably, among the 33 women in the prepregnancy overweight/obesity group, 20 had appropriate GWG, 1 had insufficient GWG, and 12 had excessive GWG. The proportion of mothers who underwent in vitro fertilization and embryo transfer (IVF-ET) in the prepregnancy overweight/obesity group (15.2%) was significantly higher than that in the excessive GWG and control groups. There were no statistically significant differences in the proportions of pregnancy outcomes among the three groups, including preeclampsia, premature rupture of membranes, postpartum hemorrhage, macrosomia, and LBW. Birth weight and birth length did not differ significantly among the babies in the three groups.

The biochemical parameters of the mothers during pregnancy are shown in Table  2 . The levels of triglyceride (TG) and uric acid (UA) of mothers in the prepregnancy overweight/obesity group were significantly higher than those of the mothers in the excessive GWG and control groups in the first trimester. However, there was no significant difference in the blood glucose and lipid levels in the second and third trimesters of pregnancy among the three groups.

PCA and PLS-DA analysis of cord blood metabolites

Functional and taxonomic annotations of the identified metabolites included the HMDB classification annotations, LIPID MAPS classification annotations, and KEGG pathway annotations. The cord blood metabolites included lipids and lipid-like molecules, organic acids and their derivatives, and organoheterocyclic compounds, which were mainly involved in metabolism. To better understand the structure of the cord blood metabolome in cases versus controls, we used unsupervised PCA to identify metabolites contributing the most to observed differences in the dataset. PCA did not clearly separate the three groups. We next used PLS-DA to identify metabolites that were predictive of case versus control status. PLS-DA clearly distinguished the cases from the controls (Fig. 1), both for the prepregnancy overweight/obesity group vs. the control group (R2Y = 0.82, Q2Y = 0.37; R2Y = 0.77, Q2Y = 0.13, respectively) (Fig. 1A) and for the excessive GWG group vs. the control group (R2Y = 0.76, Q2Y = 0.16; R2Y = 0.81, Q2Y = 0.41) (Fig. 1B).

Fig. 1 PLS-DA of identified cord blood metabolites. A the prepregnancy overweight/obesity group vs. the control group; B the excessive GWG group vs. the control group. (a) PLS-DA score plot. The horizontal axis is the score of the sample on the first principal component; the vertical axis is the score of the sample on the second principal component; R2Y represents the interpretation rate of the model, and Q2Y is used to evaluate the predictive ability of the PLS-DA model; when R2Y is greater than Q2Y, the model is considered well established. (b) PLS-DA validation plot. The horizontal axis represents the correlation between randomly grouped Y and the original group Y, and the vertical axis represents the scores of R2 and Q2. (1) POS, positive metabolites; (2) NEG, negative metabolites
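For reference, R2Y and Q2Y are conventionally defined as shown below. These are the textbook definitions for a PLS-DA model with response y. The source does not state which formula or cross-validation scheme metaX applied, so the notation is an assumption: y_i is the class label of sample i, \bar{y} its mean, \hat{y}_i the fitted value from the full model, and \hat{y}_{(i)} the prediction for sample i from a model fitted without that sample.

```latex
R^2Y = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2},
\qquad
Q^2Y = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_{(i)} \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2}
```

Because Q2Y is computed from held-out predictions, it is normally lower than R2Y, and a large gap between the two suggests that the model fits the training samples better than it predicts new ones.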

Maternal prepregnancy overweight/obesity

Differential metabolites were screened according to a PLS-DA VIP > 1.0, an FC > 1.2 or < 0.833, and a P value < 0.05. A total of 106 cord blood metabolites (77 positive metabolites and 29 negative metabolites) differed between the prepregnancy overweight/obesity group and the control group. Compared with those in the control group, the levels of 46 metabolites (19 positive metabolites and 27 negative metabolites) were increased in the prepregnancy overweight/obesity group, among which octopamine showed the largest increase, followed by (2S)-4-Oxo-2-phenyl-3,4-dihydro-2H-chromen-7-yl beta-D-glucopyranoside, N-tetradecanamide, stearamide, and methanandamide (Fig. 2A). Compared with the control group, the prepregnancy overweight/obesity group had 60 metabolites (58 positive metabolites and 2 negative metabolites) with reduced concentrations, among which senecionine showed the largest decrease, followed by 3-(methylsulfonyl)-2H-chromen-2-one, methyl eudesmate, cuminaldehyde, and 2-(tert-butyl)-1,3-thiazolane-4-carboxylic acid (Fig. 2A).

Fig. 2 Stem plots of differential cord blood metabolites. A the prepregnancy overweight/obesity group vs. the control group; B the excessive GWG group vs. the control group. (1) positive metabolites; (2) negative metabolites. Notes: The color of the dot indicates the direction of change: blue represents a decrease and red represents an increase. The length of the stem represents the size of log2(FC), and the size of the dot represents the VIP value

Hierarchical clustering of the differential metabolites from the two comparisons was carried out to show the differences in metabolic expression patterns between groups and within each comparison (Fig. 3). KEGG pathway analysis of differential cord blood metabolites associated with the prepregnancy overweight/obesity group versus the control group is shown in Table 3 and Fig. 4A. The metabolite enrichment analysis revealed that steroid hormone biosynthesis (P value = 0.01) and neuroactive ligand‒receptor interactions (P value = 0.03) were the two pathways that were most altered between the prepregnancy overweight/obesity group and the control group. Nineteen metabolites were distributed in the pathway of steroid hormone biosynthesis, and four metabolites were distributed in the pathway of neuroactive ligand‒receptor interactions. In the steroid hormone biosynthesis pathway, the levels of corticosterone, 11-deoxycortisol, cortisol, testosterone, and 7α-hydroxytestosterone were decreased in the prepregnancy overweight/obesity group relative to those in the control group. In the neuroactive ligand‒receptor interaction pathway, the level of cortisol was decreased and the levels of trace amines were increased in the prepregnancy overweight/obesity group relative to the control group.

Fig. 3 Clustering heat maps of differential cord blood metabolites of the three groups. A positive metabolites; B negative metabolites. Notes: Samples are clustered longitudinally and metabolites transversely. The shorter the clustering branches, the higher the similarity. Horizontal comparison shows how metabolite contents cluster across groups

Fig. 4 KEGG enrichment scatterplots (a) and network plots (b) of differential cord blood metabolites. A the prepregnancy overweight/obesity group vs. the control group; B the excessive GWG group vs. the control group. (1) positive metabolites; (2) negative metabolites. Notes: (a) The horizontal coordinate is x/y (the number of differential metabolites in the corresponding metabolic pathway/the total number of metabolites identified in this pathway). The value represents the enrichment degree of differential metabolites in the pathway. The color of the point represents the P value of the hypergeometric test, and the size of the point represents the number of differential metabolites in the corresponding pathway. (b) The red dot represents a metabolic pathway, the yellow dot represents substance-related regulatory enzyme information, the green dot represents the background substance of a metabolic pathway, the purple dot represents the molecular module information of a class of substances, the blue dot represents a substance chemical reaction, and the green square represents the differential substance obtained by this comparison

Maternal excessive GWG

A total of 109 cord blood metabolites (52 positive metabolites and 57 negative metabolites) differed between the excessive GWG group and the control group. Compared with the control group, in the excessive GWG group, there were 63 metabolites (15 positive metabolites and 48 negative metabolites) with increased concentrations, among which 2-thio-acetyl MAGE was the metabolite with the largest increase, followed by PC (7:0/8:0), lysopc 16:2 (2 N isomer), MGMG (18:2), and thromboxane B2 (Fig.  2 B). Compared with the levels in the control group, the levels of 46 metabolites (37 positive metabolites and 9 negative metabolites) in the excessive GWG group were reduced, among which hippuric acid had the largest decrease, followed by 8-hydroxyquinoline, gamithromycin, 2-phenylglycine, and cefmetazole (Fig.  2 B).

Hierarchical clustering of the differential metabolites from the two comparisons was carried out to show the differences in metabolic expression patterns between groups and within each comparison (Fig. 3). KEGG pathway analysis of the cord blood metabolites associated with the excessive GWG group versus the control group is shown in Table 4 and Fig. 4B. The metabolite enrichment analysis revealed that biosynthesis of unsaturated fatty acids was the most altered pathway between the excessive GWG and control groups (P value < 0.01). There were 13 metabolites distributed in the enriched pathway. The levels of docosapentaenoic acid (DPA), docosahexaenoic acid (DHA), arachidonic acid, adrenic acid, palmitic acid, stearic acid, behenic acid, lignoceric acid, and erucic acid were increased in the excessive GWG group relative to those in the control group.

Our present study found that both maternal prepregnancy overweight/obesity and excessive GWG could affect umbilical cord blood metabolites, and they had different effects on these metabolites. Regardless of their gestational weight gain, the umbilical cord blood of prepregnancy overweight and obese mothers had 46 metabolites increased and 60 metabolites decreased compared with the umbilical cord blood of mothers with normal body weight and appropriate GWG. Steroid hormone biosynthesis and neuroactive ligand‒receptor interactions were the two top-ranking pathways enriched with these metabolites. Compared with mothers with normal prepregnancy BMI and appropriate GWG, in mothers with normal prepregnancy BMI but excessive GWG, the levels of 63 metabolites were increased and those of 46 metabolites were decreased in umbilical cord blood. Biosynthesis of unsaturated fatty acids was the most altered pathway enriched with these metabolites.

There were many differential metabolites in the cord blood between the prepregnancy overweight/obesity group and the control group and between the excessive GWG group and the control group. However, the roles of most of these differential metabolites are unknown. The levels of stearamide and methanandamide were increased in the prepregnancy overweight/obesity group. Stearamide, also known as octadecanamide or kemamide S, belongs to the class of organic compounds known as carboximidic acids. Stearamide, which is increased in the serum of patients with hepatic cirrhosis and sepsis, may be associated with the systemic inflammatory state [ 22 , 23 ]. Methanandamide is a stable analog of anandamide that participates in energy balance mainly by activating cannabinoid receptors. Methanandamide dose-dependently inhibits and excites tension-sensitive gastric vagal afferents (GVAs), which play a role in appetite regulation [ 24 ]. In mice fed a high-fat diet, only an inhibitory effect of methanandamide was observed, and GVA responses to tension were dampened [ 24 , 25 ]. These changes may contribute to the development and/or maintenance of obesity. Moreover, methanandamide can produce dose-related hypothermia and attenuate cocaine-induced hyperthermia by a cannabinoid 1-dopamine D2 receptor mechanism [ 26 ].

Metabolomic pathway analysis of the cord blood metabolite features in the prepregnancy overweight and obesity group identified two significant pathways: steroid hormone biosynthesis and neuroactive ligand‒receptor interaction. In the steroid hormone biosynthesis pathway, the levels of several steroid hormones and their metabolites, including the glucocorticoids corticosterone and cortisol as well as 11-deoxycortisol, testosterone, and 7α-hydroxytestosterone, were decreased in the prepregnancy overweight/obesity group. In addition to their physiological role in the healthy neuroendocrine development and maturation of fetuses and babies, glucocorticoids are essential to human health because they regulate physiological events in mature organs and tissues, such as glucose metabolism, lipid biosynthesis and distribution, food intake, thermogenesis, and mood and learning patterns [ 27 ]. Glucocorticoids have been considered a link between adverse early-life conditions and the development of metabolic disorders in later life [ 28 , 29 , 30 ]. However, there is still much controversy regarding the role of maternal obesity in the fetal steroid hormone biosynthesis pathway. Studies of maternal obesity animal models showed that corticosterone and cortisol levels were increased in the offspring of obese mothers [ 31 , 32 ]. Kumpulainen et al. showed that young adults born to mothers with higher early-pregnancy BMIs had lower average levels of diurnal cortisol, especially in the morning [ 33 ]. Stirrat et al. found that increased maternal BMI was associated with lower maternal cortisol, corticosterone, and 11-dehydrocorticosterone levels. However, there were no associations between maternal BMI and glucocorticoid levels in the cord blood [ 34 ].
Differences in the protocols of these previous studies may explain the mixed findings, including whether cortisol was measured in peripheral blood, cord blood or saliva, the timing of the measurements, and the number of samples. Although the effect of maternal obesity on fetal steroid hormone levels is controversial, dysregulation of glucocorticoids may be a plausible mechanism by which maternal obesity can increase the risk of metabolic disorders and mental health disorders in offspring.

The effect of excessive GWG on umbilical cord blood metabolites is different from that of maternal overweight and obesity. Compared with the control group, in the excessive GWG group, the level of thromboxane B2 was increased and the level of hippuric acid was decreased. Thromboxane B2, which is important in the platelet release reaction, is a stable, physiologically active compound formed in vivo from prostaglandin endoperoxides. Hippuric acid is an acyl glycine formed from the conjugation of benzoic acid with glycine. Several studies have confirmed that both thromboxane B2 and hippuric acid levels are associated with diet. Dietary fatty acids affect platelet thromboxane production [ 35 , 36 , 37 ]. In our study, several fatty acids (e.g., palmitic acid, stearic acid, behenic acid, and lignoceric acid) in the excessive GWG group were also increased, which may have led to the increase in thromboxane B2 levels. Hippuric acid can be detected after the consumption of whole grains and anthocyanin-rich bilberries [ 38 , 39 ]. A healthy diet intervention increased the signals for hippuric acid to incorporate polyunsaturated fatty acids [ 38 ], and the low level of hippuric acid was associated with lower fruit-vegetable intakes [ 39 ]. Maternal overnutrition and unhealthy dietary patterns are the main reasons for excessive GWG [ 40 , 41 ]. Therefore, we speculated that the differences in thromboxane B2 and hippuric acid between the excessive GWG and control groups were associated with maternal diet during pregnancy. The effect of these differential metabolites on the long-term metabolic health of offspring after birth needs further study.

Metabolomic pathway analysis of the cord blood metabolite features in the excessive GWG group identified biosynthesis of unsaturated fatty acids as the significant pathway. The levels of several fatty acids in this pathway were increased in the excessive GWG group, including long-chain saturated fatty acids (e.g., palmitic acid (C 16:0), stearic acid (C 18:0), behenic acid (C 22:0), and lignoceric acid (C 24:0)), monounsaturated fatty acids (erucic acid), and polyunsaturated fatty acids (e.g., DPA, DHA, arachidonic acid, and adrenic acid). Because perinatal fatty acid status can be influenced by maternal dietary modifications or supplementation [ 42 ], we speculated that maternal diet during pregnancy caused the difference in umbilical cord blood fatty acids between the excessive GWG and control groups. A large body of evidence from mechanistic studies supports the potential of fatty acids to influence later obesity. However, the possible mechanisms and observed relationships are complex and related to the types and patterns of fatty acids [ 43 , 44 ]. Maternal dietary fatty acids have been found to induce hypothalamic inflammation, cause epigenetic changes, and alter the mechanisms of energy control in offspring [ 43 ]. Evidence from cell culture and rodent studies showed that polyunsaturated fatty acids might serve several complex roles in fetuses, including the stimulation and/or inhibition of adipocyte differentiation [ 44 ]. It remains unclear whether lower n-6 long-chain polyunsaturated fatty acid levels or higher n-3 long-chain polyunsaturated fatty acid levels are of more relevance and whether the long-term effects differ with offspring age [ 44 ]. 
Although there is a biologically plausible case for the relevance of perinatal fatty acid status in later obesity risk, available data in humans suggest that the influence of achievable modification of perinatal n-3/n-6 status is not sufficient to influence offspring obesity risk in the general population [ 45 ]. Further studies seem justified to clarify the reasons.

A strength of our present study is that we simultaneously analyzed the effects of prepregnancy overweight/obesity and excessive GWG on cord blood metabolites and explored their differences. In addition, to exclude the effect of hyperglycemia on cord blood metabolites, women with prepregnancy diabetes mellitus or gestational diabetes mellitus were excluded from our study. One limitation of our study is that it was a single-center study with a small sample, especially in the prepregnancy overweight/obesity group. In the future, we can expand the sample size, conduct a subgroup analysis of the prepregnancy overweight/obesity group, and analyze the differences in the effects of different degrees of obesity on cord blood metabolites. The prepregnancy overweight/obesity group can be further divided into an appropriate GWG group and an excessive GWG group, and the differences in the effects of these two groups on umbilical cord blood metabolites can be analyzed. Moreover, the dietary pattern of the pregnant woman could affect the production of cord blood metabolites. We did not investigate the dietary patterns of the mothers in this study, which is another limitation. In future studies, we should investigate maternal dietary patterns as a very important confounding variable.

In conclusion, our present study confirmed that both prepregnancy overweight/obesity and excessive GWG could affect umbilical cord blood metabolites, and they had different effects on these metabolites. Prepregnancy overweight and obesity affected the fetal steroid hormone biosynthesis pathway, while normal prepregnancy body weight but excessive GWG affected fetal fatty acid metabolism. This emphasizes the importance of preconception weight loss and maintaining an appropriate GWG, which are beneficial for the long-term metabolic health of offspring.

Availability of data and materials

Data sets generated during the current study are not publicly available but are available from the corresponding author on reasonable request. Requests for the raw data will be judged by a committee including XXY and GHL.

Abbreviations

GWG: Gestational weight gain

UHPLC‒MS/MS: Ultrahigh-performance liquid chromatography tandem mass spectrometry

T2DM: Type 2 diabetes mellitus

NAFLD: Nonalcoholic fatty liver disease

DOHaD: Developmental origins of health and disease

BMI: Body mass index

DM: Diabetes mellitus

GDM: Gestational diabetes mellitus

OGTT: Oral glucose tolerance test

LBW: Low birth weight

SD: Standard deviation

HMDB: Human Metabolome Database

KEGG: Kyoto Encyclopedia of Genes and Genomes

PCA: Principal component analysis

PLS-DA: Partial least-squares discriminant analysis

VIP: Variable importance for the projection

FC: Fold change

IVF-ET: In vitro fertilization and embryo transfer

TG: Triglyceride

DPA: Docosapentaenoic acid

DHA: Docosahexaenoic acid

GVAs: Gastric vagal afferents

Collaborators GBDO, Afshin A, Forouzanfar MH, Reitsma MB, Sur P, Estep K, Lee A, Marczak L, Mokdad AH, Moradi-Lakeh M, et al. Health effects of overweight and obesity in 195 countries over 25 years. N Engl J Med. 2017;377(1):13–27.

Bjerregaard LG, Jensen BW, Angquist L, Osler M, Sorensen TIA, Baker JL. Change in overweight from childhood to early adulthood and risk of type 2 diabetes. N Engl J Med. 2018;378(14):1302–12.

Sharma V, Coleman S, Nixon J, Sharples L, Hamilton-Shield J, Rutter H, Bryant M. A systematic review and meta-analysis estimating the population prevalence of comorbidities in children and adolescents aged 5 to 18 years. Obes Rev. 2019;20(10):1341–9.

Llewellyn A, Simmonds M, Owen CG, Woolacott N. Childhood obesity as a predictor of morbidity in adulthood: a systematic review and meta-analysis. Obes Rev. 2016;17(1):56–67.

Poston L, Caleyachetty R, Cnattingius S, Corvalan C, Uauy R, Herring S, Gillman MW. Preconceptional and maternal obesity: epidemiology and health consequences. Lancet Diabetes Endocrinol. 2016;4(12):1025–36.

Pan XF, Wang L, Pan A. Epidemiology and determinants of obesity in China. Lancet Diabetes Endocrinol. 2021;9(6):373–92.

Barker DJ. The developmental origins of adult disease. J Am Coll Nutr. 2004;23(6 Suppl):588S-595S.

LifeCycle Project-Maternal O, Childhood Outcomes Study G, Voerman E, Santos S, Inskip H, Amiano P, Barros H, Charles MA, Chatzi L, Chrousos GP, et al. Association of gestational weight gain with adverse maternal and infant outcomes. JAMA. 2019;321(17):1702–15.

Aune D, Saugstad OD, Henriksen T, Tonstad S. Maternal body mass index and the risk of fetal death, stillbirth, and infant death: a systematic review and meta-analysis. JAMA. 2014;311(15):1536–46.

Ukah UV, Bayrampour H, Sabr Y, Razaz N, Chan WS, Lim KI, Lisonkova S. Association between gestational weight gain and severe adverse birth outcomes in Washington State, US: a population-based retrospective cohort study, 2004–2013. PLoS Med. 2019;16(12):e1003009.

Starling AP, Brinton JT, Glueck DH, Shapiro AL, Harrod CS, Lynch AM, Siega-Riz AM, Dabelea D. Associations of maternal BMI and gestational weight gain with neonatal adiposity in the Healthy Start study. Am J Clin Nutr. 2015;101(2):302–9.

Voerman E, Santos S, Patro Golab B, Amiano P, Ballester F, Barros H, Bergstrom A, Charles MA, Chatzi L, Chevrier C, et al. Maternal body mass index, gestational weight gain, and the risk of overweight and obesity across childhood: an individual participant data meta-analysis. PLoS Med. 2019;16(2):e1002744.

Heslehurst N, Vieira R, Akhter Z, Bailey H, Slack E, Ngongalah L, Pemu A, Rankin J. The association between maternal body mass index and child obesity: a systematic review and meta-analysis. PLoS Med. 2019;16(6):e1002817.

Newgard CB. Metabolomics and metabolic diseases: where do we stand? Cell Metab. 2017;25(1):43–56.

Johnson CH, Ivanisevic J, Siuzdak G. Metabolomics: beyond biomarkers and towards mechanisms. Nat Rev Mol Cell Biol. 2016;17(7):451–9.

Hivert MF, Perng W, Watkins SM, Newgard CS, Kenny LC, Kristal BS, Patti ME, Isganaitis E, DeMeo DL, Oken E, et al. Metabolomics in the developmental origins of obesity and its cardiometabolic consequences. J Dev Orig Health Dis. 2015;6(2):65–78.

Schlueter RJ, Al-Akwaa FM, Benny PA, Gurary A, Xie G, Jia W, Chun SJ, Chern I, Garmire LX. Prepregnant obesity of mothers in a multiethnic cohort is associated with cord blood metabolomic changes in offspring. J Proteome Res. 2020;19(4):1361–74.

Shokry E, Marchioro L, Uhl O, Bermudez MG, Garcia-Santos JA, Segura MT, Campoy C, Koletzko B. Impact of maternal BMI and gestational diabetes mellitus on maternal and cord blood metabolome: results from the PREOBE cohort study. Acta Diabetol. 2019;56(4):421–30.

Chen C, Lu FC, Department of Disease Control Ministry of Health PRC. The guidelines for prevention and control of overweight and obesity in Chinese adults. Biomed Environ Sci. 2004;17(Suppl):1–36.

The American College of Obstetricians and Gynecologists. Macrosomia: ACOG practice bulletin, number 216. Obstet Gynecol. 2020;135(1):e18–e35.

Goldenberg RL, Culhane JF. Low birth weight in the United States. Am J Clin Nutr. 2007;85(2):584S-590S.

Lian JS, Liu W, Hao SR, Guo YZ, Huang HJ, Chen DY, Xie Q, Pan XP, Xu W, Yuan WX, et al. A serum metabonomic study on the difference between alcohol- and HBV-induced liver cirrhosis by ultraperformance liquid chromatography coupled to mass spectrometry plus quadrupole time-of-flight mass spectrometry. Chin Med J (Engl). 2011;124(9):1367–73.

Ding W, Xu S, Zhou B, Zhou R, Liu P, Hui X, Long Y, Su L. Dynamic plasma lipidomic analysis revealed cholesterol ester and amides associated with sepsis development in critically ill patients after cardiovascular surgery with cardiopulmonary bypass. J Pers Med. 2022;12(11):1838.

Christie S, O’Rielly R, Li H, Nunez-Salces M, Wittert GA, Page AJ. Modulatory effect of methanandamide on gastric vagal afferent satiety signals depends on nutritional status. J Physiol. 2020;598(11):2169–82.

Christie S, O’Rielly R, Li H, Wittert GA, Page AJ. High fat diet induced obesity alters endocannabinoid and ghrelin mediated regulation of components of the endocannabinoid system in nodose ganglia. Peptides. 2020;131:170371.

Rasmussen BA, Kim E, Unterwald EM, Rawls SM. Methanandamide attenuates cocaine-induced hyperthermia in rats by a cannabinoid CB1-dopamine D2 receptor mechanism. Brain Res. 2009;1260:7–14.

Facchi JC, Lima TAL, Oliveira LR, Costermani HO, Miranda GDS, de Oliveira JC. Perinatal programming of metabolic diseases: the role of glucocorticoids. Metabolism. 2020;104:154047.

Reynolds RM, Walker BR, Syddall HE, Andrew R, Wood PJ, Whorwood CB, Phillips DI. Altered control of cortisol secretion in adult men with low birth weight and cardiovascular risk factors. J Clin Endocrinol Metab. 2001;86(1):245–50.

Valtat B, Dupuis C, Zenaty D, Singh-Estivalet A, Tronche F, Breant B, Blondeau B. Genetic evidence of the programming of beta cell mass and function by glucocorticoids in mice. Diabetologia. 2011;54(2):350–9.

Jia Y, Li R, Cong R, Yang X, Sun Q, Parvizi N, Zhao R. Maternal low-protein diet affects epigenetic regulation of hepatic mitochondrial DNA transcription in a sex-specific manner in newborn piglets associated with GR binding to its promoter. PLoS ONE. 2013;8(5):e63855.

Rodriguez JS, Rodriguez-Gonzalez GL, Reyes-Castro LA, Ibanez C, Ramirez A, Chavira R, Larrea F, Nathanielsz PW, Zambrano E. Maternal obesity in the rat programs male offspring exploratory, learning and motivation behavior: prevention by dietary intervention pre-gestation or in gestation. Int J Dev Neurosci. 2012;30(2):75–81.

Tuersunjiang N, Odhiambo JF, Long NM, Shasa DR, Nathanielsz PW, Ford SP. Diet reduction to requirements in obese/overfed ewes from early gestation prevents glucose/insulin dysregulation and returns fetal adiposity and organ development to control levels. Am J Physiol Endocrinol Metab. 2013;305(7):E868-878.

Kumpulainen SM, Heinonen K, Kaseva N, Andersson S, Lano A, Reynolds RM, Wolke D, Kajantie E, Eriksson JG, Raikkonen K. Maternal early pregnancy body mass index and diurnal salivary cortisol in young adult offspring. Psychoneuroendocrinology. 2019;104:89–99.

Stirrat LI, Just G, Homer NZM, Andrew R, Norman JE, Reynolds RM. Glucocorticoids are lower at delivery in maternal, but not cord blood of obese pregnancies. Sci Rep. 2017;7(1):10263.

Prisco D, Filippini M, Francalanci I, Paniccia R, Gensini GF, Serneri GG. Effect of n-3 fatty acid ethyl ester supplementation on fatty acid composition of the single platelet phospholipids and on platelet functions. Metabolism. 1995;44(5):562–9.

Kaapa P, Uhari M, Nikkari T, Viinikka L, Ylikorkala O. Dietary fatty acids and platelet thromboxane production in puerperal women and their offspring. Am J Obstet Gynecol. 1986;155(1):146–9.

Teng KT, Chang CY, Kanthimathi MS, Tan AT, Nesaretnam K. Effects of amount and type of dietary fats on postprandial lipemia and thrombogenic markers in individuals with metabolic syndrome. Atherosclerosis. 2015;242(1):281–7.

Hanhineva K, Lankinen MA, Pedret A, Schwab U, Kolehmainen M, Paananen J, de Mello V, Sola R, Lehtonen M, Poutanen K, et al. Nontargeted metabolite profiling discriminates diet-specific biomarkers for consumption of whole grains, fatty fish, and bilberries in a randomized controlled trial. J Nutr. 2015;145(1):7–17.

Brunelli L, Davin A, Sestito G, Mimmi MC, De Simone G, Balducci C, Pansarasa O, Forloni G, Cereda C, Pastorelli R, et al. Plasmatic hippuric acid as a hallmark of frailty in an Italian cohort: the mediation effect of fruit-vegetable intake. J Gerontol A Biol Sci Med Sci. 2021;76(12):2081–9.

Ferreira LB, Lobo CV, Miranda A, Carvalho BDC, Santos LCD. Dietary patterns during pregnancy and gestational weight gain: a systematic review. Rev Bras Ginecol Obstet. 2022;44(5):540–7.

Tielemans MJ, Garcia AH, Peralta Santos A, Bramer WM, Luksa N, Luvizotto MJ, Moreira E, Topi G, de Jonge EA, Visser TL, et al. Macronutrient composition and gestational weight gain: a systematic review. Am J Clin Nutr. 2016;103(1):83–99.

Lewis RM, Wadsack C, Desoye G. Placental fatty acid transfer. Curr Opin Clin Nutr Metab Care. 2018;21(2):78–82.

Cesar HC, Pisani LP. Fatty-acid-mediated hypothalamic inflammation and epigenetic programming. J Nutr Biochem. 2017;42:1–6.

Demmelmair H, Koletzko B. Perinatal polyunsaturated fatty acid status and obesity risk. Nutrients. 2021;13(11):3882.

Hauner H, Brunner S. Early fatty acid exposure and later obesity risk. Curr Opin Clin Nutr Metab Care. 2015;18(2):113–7.

Acknowledgements

The authors thank the study participants for their involvement and research assistants for their help conducting the study.

This research was funded by the Beijing Natural Science Foundation, grant number 7214231.

Author information

Authors and affiliations.

Division of Endocrinology and Metabolism, Department of Obstetrics, Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital, No. 251, Yaojiayuan Road, Chaoyang District, Beijing, 100026, China

Xianxian Yuan, Yuru Ma, Yan Zhao, Wei Zheng, Ruihua Yang, Lirui Zhang, Xin Yan & Guanghui Li

Department of Obstetrics and Gynecology, The Second Hospital of Jilin University, Changchun, 130041, Jilin, China

Contributions

XXY designed the study. XXY, WZ, LRZ and XY analyzed the data. YRM, JW, YZ and RHY took part in data collection and management. XXY wrote the manuscript. XXY and GHL reviewed the manuscript and contributed to manuscript revision. All authors contributed to the article and approved the submitted version. All authors reviewed the manuscript.

Corresponding author

Correspondence to Guanghui Li.

Ethics declarations

Ethics approval and consent to participate.

This study has been performed in accordance with the Declaration of Helsinki and has been approved by the ethics committee of Beijing Obstetrics and Gynecology Hospital, Capital Medical University (2021-KY-037). Informed consent was obtained from all subjects involved in the study to publish this paper. All methods were carried out in accordance with relevant guidelines and regulations in the declaration.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Yuan, X., Ma, Y., Wang, J. et al. The influence of maternal prepregnancy weight and gestational weight gain on the umbilical cord blood metabolome: a case–control study. BMC Pregnancy Childbirth 24, 297 (2024). https://doi.org/10.1186/s12884-024-06507-x

Received: 30 September 2023

Accepted: 11 April 2024

Published: 22 April 2024

DOI: https://doi.org/10.1186/s12884-024-06507-x

  • Maternal obesity
  • Gestational weight gain
  • Offspring health
  • Metabolites
  • Umbilical cord blood

BMC Pregnancy and Childbirth

ISSN: 1471-2393

Measles — United States, January 1, 2020–March 28, 2024

Weekly / April 11, 2024 / 73(14);295–300

Adria D. Mathis, MSPH 1; Kelley Raines, MPH 1; Nina B. Masters, PhD 1; Thomas D. Filardo, MD 1; Gimin Kim, MS 1; Stephen N. Crooke, PhD 1; Bettina Bankamp, PhD 1; Paul A. Rota, PhD 1; David E. Sugerman, MD 1

What is already known about this topic?

Although endemic U.S. measles was declared eliminated in 2000, measles importations continue to occur. Prolonged outbreaks during 2019 threatened the U.S. measles elimination status.

What is added by this report?

During January 1, 2020–March 28, 2024, a total of 338 U.S. measles cases were reported; 29% of these cases occurred during the first quarter of 2024, almost all in persons who were unvaccinated or whose vaccination status was unknown. As of the end of 2023, U.S. measles elimination status was maintained.

What are the implications for public health practice?

Risk for widespread U.S. measles transmission remains low because of high population immunity. Enhanced efforts are needed to increase routine U.S. vaccination coverage, encourage vaccination before international travel, identify communities at risk for measles transmission, and rapidly investigate suspected measles cases to reduce cases and complications of measles.

Measles is a highly infectious febrile rash illness and was declared eliminated in the United States in 2000. However, measles importations continue to occur, and U.S. measles elimination status was threatened in 2019 as the result of two prolonged outbreaks among undervaccinated communities in New York and New York City. To assess U.S. measles elimination status after the 2019 outbreaks and to provide context to understand more recent increases in measles cases, CDC analyzed epidemiologic and laboratory surveillance data and the performance of the U.S. measles surveillance system after these outbreaks. During January 1, 2020–March 28, 2024, CDC was notified of 338 confirmed measles cases; 97 (29%) of these cases occurred during the first quarter of 2024, representing a more than seventeenfold increase over the mean number of cases reported during the first quarter of 2020–2023. Among the 338 reported cases, the median patient age was 3 years (range = 0–64 years); 309 (91%) patients were unvaccinated or had unknown vaccination status, and 336 case investigations included information on ≥80% of critical surveillance indicators. During 2020–2023, the longest transmission chain lasted 63 days. As of the end of 2023, because of the absence of sustained measles virus transmission for 12 consecutive months in the presence of a well-performing surveillance system, U.S. measles elimination status was maintained. Risk for widespread U.S. measles transmission remains low because of high population immunity. However, because of the increase in cases during the first quarter of 2024, additional activities are needed to increase U.S. routine measles, mumps, and rubella vaccination coverage, especially among close-knit and undervaccinated communities. These activities include encouraging vaccination before international travel and rapidly investigating suspected measles cases.

Introduction

Measles is a highly infectious acute, febrile rash illness with a >90% secondary attack rate among susceptible contacts ( 1 ). High national 2-dose coverage with the measles, mumps, and rubella (MMR) vaccine led to the declaration of U.S. measles elimination* in 2000 ( 2 ). However, this elimination status was threatened in 2019 because of two prolonged outbreaks among undervaccinated communities in New York and New York City; these outbreaks accounted for 29% of all reported cases during 2001–2019 ( 2 ). To assess U.S. measles elimination status after the 2019 outbreaks and to provide context for understanding more recent increases in measles cases in 2024, † CDC assessed the epidemiologic and laboratory-based surveillance of measles in the United States and the performance of the U.S. measles surveillance system during January 1, 2020–March 28, 2024.

Reporting and Classification of Measles Cases

Confirmed measles cases § ( 1 ) are reported to CDC by state health departments through the National Notifiable Disease Surveillance System and directly (by email or telephone) to the National Center for Immunization and Respiratory Diseases. Measles cases are classified by the Council of State and Territorial Epidemiologists as import-associated if they were internationally imported, epidemiologically linked to an imported case, or had viral genetic evidence of an imported measles genotype ( 1 ); cases with no epidemiologic or virologic link to an imported case are classified as having an unknown source ( 1 ). For this analysis, unique sequences were defined as those differing by at least one nucleotide in the N-450 sequence (the 450 nucleotides encoding the carboxyl-terminal 150 nucleoprotein amino acids) based on the standard World Health Organization (WHO) recommendations for describing sequence variants ¶ ( 3 ). Unvaccinated patients were classified as eligible for vaccination if they were not vaccinated according to Advisory Committee on Immunization Practices recommendations ( 4 ). A well-performing surveillance system was defined as one with ≥80% of cases meeting each of the following three criteria: classified as import-associated, reported with complete information on at least eight of 10 critical surveillance indicators (i.e., place of residence, sex, age, occurrence of fever and rash, date of rash onset, vaccination status, travel history, hospitalization, transmission setting, and whether the case was outbreak-related) ( 5 ), and laboratory-confirmed.
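
The surveillance-performance definition above can be expressed as a small check. This is a minimal sketch, not CDC's implementation: the `CaseRecord` layout, the indicator key names, and the function names are assumptions made for illustration. It only encodes the stated rule that at least 80% of cases must meet each of the three criteria, with completeness meaning at least eight of the 10 critical indicators recorded.

```python
from dataclasses import dataclass

# The 10 critical surveillance indicators listed in the report
# (key names are hypothetical).
CRITICAL_INDICATORS = (
    "place_of_residence", "sex", "age", "fever_and_rash",
    "date_of_rash_onset", "vaccination_status", "travel_history",
    "hospitalization", "transmission_setting", "outbreak_related",
)


@dataclass
class CaseRecord:
    """Hypothetical case record; not the actual reporting schema."""
    import_associated: bool
    laboratory_confirmed: bool
    indicators: dict  # indicator name -> value, or None if missing


def has_complete_indicators(case, minimum=8):
    """True if at least `minimum` of the 10 indicators are recorded."""
    recorded = sum(
        case.indicators.get(name) is not None for name in CRITICAL_INDICATORS
    )
    return recorded >= minimum


def is_well_performing(cases, threshold=0.80):
    """Each of the three criteria must be met by >= threshold of cases."""
    n = len(cases)
    if n == 0:
        return False
    shares = (
        sum(c.import_associated for c in cases) / n,
        sum(has_complete_indicators(c) for c in cases) / n,
        sum(c.laboratory_confirmed for c in cases) / n,
    )
    return all(share >= threshold for share in shares)
```

Applied to the counts given later in this report (326 import-associated, 336 with at least 80% of indicators, and 314 laboratory-confirmed, out of 338 cases), each share is above 80%.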

Assessment of Chains of Transmission

Cases were classified into chains of transmission on the basis of known epidemiologic linkages: isolated (single) cases, two-case chains (two epidemiologically linked cases), and outbreaks (three or more epidemiologically linked cases). The potential for missed cases within two-case chains and outbreaks was assessed by measuring the interval between measles rash onset dates in each chain; chains with more than one maximum incubation period (21 days) between cases could indicate a missing case in the chain. This activity was reviewed by CDC, deemed not research, and was conducted consistent with applicable federal law and CDC policy.**
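
The chain classification and the missed-case check described above can be sketched as follows. The function names and the example dates in the usage note are illustrative assumptions; only the size categories and the 21-day maximum incubation period come from the report.

```python
from datetime import date

MAX_INCUBATION_DAYS = 21  # maximum incubation period stated in the report


def classify_chain(n_cases):
    """Label a transmission chain by its number of linked cases."""
    if n_cases < 1:
        raise ValueError("a chain must contain at least one case")
    if n_cases == 1:
        return "isolated case"
    if n_cases == 2:
        return "two-case chain"
    return "outbreak"  # three or more epidemiologically linked cases


def may_have_missed_case(rash_onsets, max_gap_days=MAX_INCUBATION_DAYS):
    """True if any two consecutive rash onsets are more than
    `max_gap_days` apart, which could indicate a missing case."""
    ordered = sorted(rash_onsets)
    return any(
        (later - earlier).days > max_gap_days
        for earlier, later in zip(ordered, ordered[1:])
    )
```

For example, with hypothetical onset dates of January 1 and January 23, 2024, the 22-day gap exceeds one maximum incubation period, so `may_have_missed_case` returns `True`; a gap of exactly 21 days does not trigger the flag.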

Reported Measles Cases and Outbreaks

CDC was notified of 338 confirmed measles cases with rash onset during January 1, 2020–March 28, 2024 ( Figure ); cases occurred in 30 jurisdictions. During 2020, 12 of 13 cases preceded the commencement of COVID-19 mitigation efforts in March 2020. Among the 170 cases reported during 2021 and 2022, 133 (78%) were associated with distinct outbreaks: 47 (96%) of 49 cases in 2021 occurred among Afghan evacuees temporarily housed at U.S. military bases during Operation Allies Welcome, and 86 (71%) of 121 cases in 2022 were associated with an outbreak in central Ohio. During 2023, 28 (48%) of 58 cases were associated with four outbreaks. As of March 28, 2024, a total of 97 cases have been reported in 2024, representing 29% of all 338 measles cases reported during January 1, 2020–March 28, 2024, and more than a seventeenfold increase over the mean number of cases reported during the first quarter of 2020–2023 (five cases).
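
The counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only figures stated above; the variable and helper names are illustrative.

```python
# Confirmed cases by year of rash onset, as stated in the report
# (2024 covers January 1-March 28 only).
cases_by_year = {2020: 13, 2021: 49, 2022: 121, 2023: 58, 2024: 97}


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


total = sum(cases_by_year.values())                       # 338
share_2024 = pct(cases_by_year[2024], total)              # 29

# Outbreak-associated cases during 2021 and 2022.
cases_2021_2022 = cases_by_year[2021] + cases_by_year[2022]   # 170
outbreak_2021_2022 = 47 + 86                                  # 133
outbreak_share = pct(outbreak_2021_2022, cases_2021_2022)     # 78
```

Dividing the 97 cases in 2024 by the rounded first-quarter mean of five gives a ratio of about 19, which is consistent with the report's "more than seventeenfold"; the unrounded mean is not given in the text.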

Characteristics of Reported Measles Cases

The median patient age was 3 years (range = 0–64 years); more than one half of cases (191; 57%) occurred in persons aged 16 months–19 years (Table). Overall, 309 (91%) patients were unvaccinated (68%) or had unknown vaccination status (23%); 29 (9%) had previously received ≥1 MMR vaccine dose. Of the 309 patients who were unvaccinated or had unknown vaccination status, 259 (84%) were eligible for vaccination, 40 (13%) were aged 6–11 months and therefore not recommended for routine MMR vaccination, and 10 (3%) were ineligible for MMR because they were aged <6 months.†† Among 155 (46%) hospitalized measles patients, 109 (70%) cases occurred in persons aged <5 years; 142 (92%) hospitalized patients were unvaccinated or had unknown vaccination status. No measles-associated deaths were reported to CDC.

Imported Measles Cases

Among all 338 cases, 326 (96%) were associated with an importation; 12 (4%) had an unknown source. Among the 326 import-associated cases, 200 (61%) occurred among U.S. residents who were eligible for vaccination but who were unvaccinated or whose vaccination status was unknown. Among 93 (28%) measles cases that were directly imported from other countries, 34 (37%) occurred in foreign visitors, and 59 (63%) occurred in U.S. residents, 53 (90%) of whom were eligible for vaccination but were unvaccinated or had unknown vaccination status. One (2%) case in a U.S. resident occurred in a person too young for vaccination, two (3%) in persons who had previously received 1 MMR vaccine dose, and three (5%) in persons who had previously received 2 MMR vaccine doses. The most common sources of internationally imported cases during the study period were the Eastern Mediterranean (48) and African (24) WHO regions. During the first quarter of 2024, a total of six internationally imported cases were reported from the European and South-East Asia WHO regions, representing a 50% increase over the mean number of importations from these regions during 2020–2023 (mean of two importations per year from each region).

Surveillance Quality Indicators

Overall, all but two of the 338 case investigations included information on ≥80% of the critical surveillance indicators; those two case investigations included information on 70% of critical surveillance indicators. Date of first case report to a health department was available for 219 (65%) case investigations; 127 (58%) cases were reported to health departments on or before the day of rash onset (IQR = 4 days before to 3 days after). Overall, 314 (93%) measles cases were laboratory confirmed, including 16 (5%) by immunoglobulin M (serologic) testing alone and 298 (95%) by real-time reverse transcription–polymerase chain reaction (rRT-PCR). Among 298 rRT-PCR–positive specimens, 221 (74%) were successfully genotyped: 177 (80%) were genotype B3, and 44 (20%) were genotype D8. Twenty-two distinct sequence identifiers (DSIds) ( 3 ) for genotype B3 and 13 DSIds for genotype D8 were detected (Supplementary Figure, https://stacks.cdc.gov/view/cdc/152776 ). The longest period of detection for any DSId was 15 weeks (DSId 8346).

Chains of Transmission

The 338 measles cases were categorized into 92 transmission chains (Table); 62 (67%) were isolated cases, 10 (11%) were two-case chains, and 20 (22%) were outbreaks of three or more cases. Seven (35%) of 20 outbreaks occurred during 2024. §§ The median outbreak size was six cases (range = three–86 cases) and median duration of transmission was 20 days (range = 6–63 days). Among the 30 two-case chains and outbreaks, more than one maximum incubation period (21 days) did not elapse between any two cases.

Because of the absence of endemic measles virus transmission for 12 consecutive months in the presence of a well-performing surveillance system, as of the end of 2023, measles elimination has been maintained in the United States. U.S. measles elimination reduces the number of cases, deaths, and costs that would occur if endemic measles transmission were reestablished. Almost all U.S. measles cases reported since January 2020 were import-associated, had case investigations with complete information on critical surveillance variables, were laboratory-confirmed by rRT-PCR, and underwent genotyping; these findings indicate that the U.S. measles surveillance system is performing well. A variety of transmission chain sizes were detected, including isolated cases, suggesting that sustained measles transmission would be rapidly detected. However, the rapid increase in the number of reported measles cases during the first quarter of 2024 represents a renewed threat to elimination.

Most measles importations were cases among persons traveling to and from countries in the Eastern Mediterranean and African WHO regions; these regions experienced the highest reported measles incidence among all WHO regions during 2021–2022 ( 6 ). During November 2022–October 2023, the number of countries reporting large or disruptive outbreaks increased by 123%, from 22 to 49. Global estimates suggest that first-dose measles vaccination coverage had declined from 86% in 2019 to 83% in 2022, leaving almost 22 million children aged <1 year susceptible to measles ( 6 ).

As has been the case in previous postelimination years ( 7 ), most imported measles cases occurred among unvaccinated U.S. residents. Increasing global measles incidence and decreasing vaccination coverage will increase the risk for importations into U.S. communities, as has been observed during the first quarter of 2024, further supporting CDC’s recommendation for persons to receive MMR vaccine before international travel ( 4 ).

Maintaining high national and local MMR vaccination coverage remains central to sustaining measles elimination. Risk for widespread U.S. measles transmission remains low because of high population immunity; however, national 2-dose MMR vaccination coverage has remained below the Healthy People 2030 target of 95% (the estimated population-level immunity necessary to prevent sustained measles transmission) ( 8 ) for 3 consecutive years, leaving approximately 250,000 kindergarten children susceptible to measles each year ( 9 ). Furthermore, 2-dose MMR vaccination coverage estimates in 12 states and the District of Columbia were <90%, and during the 2022–23 school year, exemption rates among kindergarten children exceeded 5% in 10 states ( 9 ). Clusters of unvaccinated persons place communities at risk for large outbreaks, as occurred during the central Ohio outbreak in 2022: 94% of measles patients were unvaccinated and 42% were hospitalized ( 10 ). Monitoring MMR vaccination coverage at county and zip code levels could help public health agencies identify undervaccinated communities for targeted interventions to improve vaccination coverage while preparing for possible measles outbreaks. As of March 28, 2024, a total of 97 confirmed measles cases have been reported in the United States in 2024, compared with a mean of five cases during the first quarter of each year during 2020–2023. Similar to cases reported during 2020–2023, most cases reported during 2024 occurred among patients aged <20 years who were unvaccinated or whose vaccination status was unknown, and were associated with an importation. Rapid detection of cases, prompt implementation of control measures, and maintenance of high national measles vaccination coverage, including improving coverage in undervaccinated populations, are essential to preventing measles and its complications and to maintaining U.S. elimination status.
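
As a rough illustration of the local coverage monitoring suggested above, the sketch below flags areas whose 2-dose MMR coverage falls below the 95% target. The function name and the county figures are hypothetical and are not drawn from any dataset in this report.

```python
HEALTHY_PEOPLE_2030_TARGET = 0.95  # 2-dose MMR coverage target cited above


def undervaccinated_areas(coverage_by_area, target=HEALTHY_PEOPLE_2030_TARGET):
    """Return (area, coverage) pairs below the target, lowest coverage first."""
    below = [
        (area, coverage)
        for area, coverage in coverage_by_area.items()
        if coverage < target
    ]
    return sorted(below, key=lambda item: item[1])


# Hypothetical coverage estimates, for illustration only.
example_coverage = {"County A": 0.97, "County B": 0.88, "County C": 0.93}
flagged = undervaccinated_areas(example_coverage)
```

Sorting by coverage puts the areas furthest below the target first, which is one simple way to prioritize them for targeted interventions.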

Limitations

The findings in this report are subject to at least three limitations. First, importations might have been underreported: 4% of reported cases during the study period had no known source. Second, case investigations resulting in discarded measles cases (i.e., a diagnosis of measles excluded) are not nationally reportable, which limits the ability to directly evaluate the sensitivity of measles case investigations. However, surveillance remains sufficiently sensitive to detect isolated cases and outbreaks, and robust molecular epidemiology provides further evidence supporting the absence of sustained measles transmission in the United States. Finally, the date of first case report to a health department was not available for 35% of case investigations.

Implications for Public Health Practice

The U.S. measles elimination status will continue to be threatened by global increases in measles incidence and decreases in global, national, and local measles vaccination coverage. Because of high population immunity, the risk of widespread measles transmission in the United States remains low; however, efforts are needed to increase routine MMR vaccination coverage, encourage vaccination before international travel, identify communities at risk for measles transmission, and rapidly investigate suspected measles cases to maintain elimination.

Corresponding author: Adria D. Mathis, [email protected] .

1 Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, CDC.

All authors have completed and submitted the International Committee of Medical Journal Editors form for disclosure of potential conflicts of interest. Stephen N. Crooke reports institutional support from PATH. No other potential conflicts of interest were disclosed.

* Elimination is defined as the absence of endemic measles virus transmission in a defined geographic area for ≥12 months in the presence of a well-performing surveillance system.

† https://emergency.cdc.gov/han/2024/han00504.asp

§ A confirmed measles case was defined as an acute febrile rash illness with laboratory confirmation or direct epidemiologic linkage to a laboratory-confirmed case. Laboratory confirmation was defined as detection of measles virus–specific nucleic acid from a clinical specimen using real-time reverse transcription–polymerase chain reaction or a positive serologic test for measles immunoglobulin M antibody.

¶ Genotyping was performed at CDC and at the Vaccine Preventable Disease Reference Centers of the Association of Public Health Laboratories.

** 45 C.F.R. part 46.102(l)(2), 21 C.F.R. part 56; 42 U.S.C. Sect. 241(d); 5 U.S.C. Sect. 552a; 44 U.S.C. Sect. 3501 et seq.

†† MMR vaccine is not licensed for use in persons aged <6 months.

§§ At the time of this report, six measles outbreaks have ended, and one outbreak is ongoing. A measles outbreak is considered to be over when no new cases have been identified during two incubation periods (42 days) since the rash onset in the last outbreak-related case.

  1. Gastañaduy PA, Redd SB, Clemmons NS, et al. Measles [Chapter 7]. In: Manual for the surveillance of vaccine-preventable diseases. Atlanta, GA: US Department of Health and Human Services, CDC; 2023. https://www.cdc.gov/vaccines/pubs/surv-manual/chpt07-measles.html
  2. Mathis AD, Clemmons NS, Redd SB, et al. Maintenance of measles elimination status in the United States for 20 years despite increasing challenges. Clin Infect Dis 2022;75:416–24. https://doi.org/10.1093/cid/ciab979 PMID:34849648
  3. Williams D, Penedos A, Bankamp B, et al. Update: circulation of active genotypes of measles virus and recommendations for use of sequence analysis to monitor viral transmission. Weekly Epidemiologic Record 2022;97(39):481–92. https://reliefweb.int/report/world/weekly-epidemiological-record-wer-30-september-2022-vol-97-no-39-2022-pp-481-492-enfr
  4. McLean HQ, Fiebelkorn AP, Temte JL, Wallace GS; CDC. Prevention of measles, rubella, congenital rubella syndrome, and mumps, 2013: summary recommendations of the Advisory Committee on Immunization Practices (ACIP). MMWR Recomm Rep 2013;62(No. RR-4):1–34. PMID:23760231
  5. World Health Organization. Measles: vaccine preventable diseases surveillance standards. Geneva, Switzerland: World Health Organization; 2018. https://www.who.int/publications/m/item/vaccine-preventable-diseases-surveillance-standards-measles
  6. Minta AA, Ferrari M, Antoni S, et al. Progress toward measles elimination—worldwide, 2000–2022. MMWR Morb Mortal Wkly Rep 2023;72:1262–8. https://doi.org/10.15585/mmwr.mm7246a3 PMID:37971951
  7. Lee AD, Clemmons NS, Patel M, Gastañaduy PA. International importations of measles virus into the United States during the postelimination era, 2001–2016. J Infect Dis 2019;219:1616–23. https://doi.org/10.1093/infdis/jiy701 PMID:30535027
  8. Truelove SA, Graham M, Moss WJ, Metcalf CJE, Ferrari MJ, Lessler J. Characterizing the impact of spatial clustering of susceptibility for measles elimination. Vaccine 2019;37:732–41. https://doi.org/10.1016/j.vaccine.2018.12.012 PMID:30579756
  9. Seither R, Yusuf OB, Dramann D, Calhoun K, Mugerwa-Kasujja A, Knighton CL. Coverage with selected vaccines and exemption from school vaccine requirements among children in kindergarten—United States, 2022–23 school year. MMWR Morb Mortal Wkly Rep 2023;72:1217–24. https://doi.org/10.15585/mmwr.mm7245a2 PMID:37943705
  10. Tiller EC, Masters NB, Raines KL, et al. Notes from the field: measles outbreak—central Ohio, 2022–2023. MMWR Morb Mortal Wkly Rep 2023;72:847–9. https://doi.org/10.15585/mmwr.mm7231a3 PMID:37535476

FIGURE. Confirmed measles cases, by month of rash onset (N = 338) — United States, January 1, 2020–March 28, 2024

Abbreviations: IgM = immunoglobulin M; rRT-PCR = real-time reverse transcription–polymerase chain reaction; WHO = World Health Organization.

* A case resulting from exposure to measles virus outside the United States as evidenced by at least some of the exposure period (7–21 days before rash onset) occurring outside the United States and rash onset occurring within 21 days of entering the United States without known exposure to measles during that time.

† A case in a transmission chain epidemiologically linked to an internationally imported case.

§ A case for which an epidemiologic link to an internationally imported case was not identified, but for which viral sequence data indicate an imported measles genotype (i.e., a genotype that is not detected in the United States with a pattern indicative of endemic transmission).

¶ A case for which an epidemiologic or virologic link to importation or to endemic transmission within the United States cannot be established after a thorough investigation.

** Percentage is percentage of international importations. Four cases among persons who traveled to both the Eastern Mediterranean and African regions and one case in a person who traveled to both the Eastern Mediterranean and European regions were counted twice.

†† Place of residence, sex, age or date of birth, fever and rash, date of rash onset, vaccination status, travel history, hospitalization, transmission setting, and whether the case was outbreak related.

§§ Includes 65 cases among patients who received both positive rRT-PCR and positive IgM results.

¶¶ Percentage is percentage of total chains.

Suggested citation for this article: Mathis AD, Raines K, Masters NB, et al. Measles — United States, January 1, 2020–March 28, 2024. MMWR Morb Mortal Wkly Rep 2024;73:295–300. DOI: http://dx.doi.org/10.15585/mmwr.mm7314a1 .

  • Online J Public Health Inform
  • v.13(3); 2021


Evaluating multi-purpose syndromic surveillance systems – a complex problem

Roger Morbey

1 Real-time Syndromic Surveillance Team, Field Service, National Infection Service, Public Health England, Birmingham B2 4BH, United Kingdom

Gillian Smith

Isabel Oliver

2 Field Service, National Infection Service, Public Health England, Bristol BS1 6EH, United Kingdom

Obaghe Edeghere

3 Field Epidemiology West Midlands, Field Service, National Infection Service, Public Health England, Birmingham B2 4BH, United Kingdom

4 School of Environmental Sciences, University of East Anglia, Norwich, NR4 7TJ, United Kingdom

Richard Pebody

5 Influenza and Other Respiratory Virus Section, Immunisation and Countermeasures Division, National Infection Service, Public Health England, London NW9 5EQ, United Kingdom

Dan Todkill

Noel McCarthy

6 Warwick Medical School, Division of Health Sciences, University of Warwick, CV4 7AL, United Kingdom

Alex J. Elliot

Surveillance systems need to be evaluated to understand what the system can or cannot detect. The measures commonly used to quantify detection capabilities are sensitivity, positive predictive value and timeliness. However, the practical application of these measures to multi-purpose syndromic surveillance services is complex. Specifically, it is very difficult to link definitive lists of what the service is intended to detect and what was detected.

First, we discuss issues arising from a multi-purpose system, which is designed to detect a wide range of health threats, and where individual indicators, e.g. ‘fever’, are also multi-purpose. Secondly, we discuss different methods of defining what can be detected, including historical events and simulations. Finally, we consider the additional complexity of evaluating a service which incorporates human decision-making alongside an automated detection algorithm. Understanding the complexities involved in evaluating multi-purpose systems helps design appropriate methods to describe their detection capabilities.

Introduction

Syndromic surveillance

Syndromic surveillance involves monitoring health care data on symptoms, signs and diagnoses to provide information for public health action [ 1 ]. Syndromic surveillance is often multi-purpose, using many different syndromes or clinical indicators to monitor different conditions and events of public health interest. Public health organisations may operate a syndromic surveillance ‘service’ that includes several ‘systems’, with each ‘system’ using data from one source, e.g. emergency departments, family doctors or ambulances. An on-going syndromic surveillance service is more than a series of data processing steps; it involves analysis, interpretation, reporting and enabling decision-making for appropriate action. It also requires a cycle of continuous improvement, with development of novel approaches and their subsequent application into the service.

When interpreting information from syndromic surveillance systems, public health practitioners, e.g. epidemiologists or incident directors, need to understand the capabilities of those systems to support decision making and choice of actions. Incident directors and other users want answers to apparently simple questions such as: “How many cases of cryptosporidiosis need to occur before your system detects an outbreak in this area?”; or “How much early warning can you provide of increases in seasonal influenza?”

Evaluating syndromic surveillance - existing evidence base

The Centers for Disease Control and Prevention (CDC) in the United States of America created a framework for evaluating a syndromic surveillance service [ 2 ]. This framework has been widely adopted and used to evaluate both syndromic and traditional non-syndromic surveillance. The framework has been applied to evaluate services both quantitatively and qualitatively [ 3 , 4 ]. Furthermore, a wide range of statistical aberration detection algorithms have been applied to syndromic surveillance, to identify unusual exceedances that might indicate a threat to public health [ 5 - 7 ]. Consequently, much of the published research on quantifying the public health benefit of syndromic surveillance focuses on the use of the statistical algorithms. However, retrospectively identifying that an algorithm can detect outbreaks does not inform whether appropriate public health action was taken by the syndromic surveillance service or the impact on public health [ 8 ]. It is also important to evaluate the service’s decision-making and operational processes [ 9 ]. Surveillance does not end with the generation of a statistical alarm. Following an alarm there will be decisions about the importance of the alarm, possibly further epidemiological investigations and analysis to summarise findings in key messages, and finally there will be decisions about appropriate public health action. Therefore, further work is also needed to evaluate these later stages of syndromic surveillance as well as the detection algorithms.

Similarly, published evaluations of syndromic systems often focus on just one disease or syndrome [ 10 ], whereas syndromic surveillance services are often multi-purpose [ 5 ]. Importantly, syndromic surveillance has the potential to detect future unknown hazards, for instance symptoms resulting from a newly emerging disease, such as COVID-19, for which laboratory tests may not yet be available [ 11 ]. Therefore, there is a gap in our understanding of the detection capabilities of multi-purpose syndromic surveillance services because services are usually only evaluated as if they have a single purpose and only in terms of the ability to generate statistical alarms.

Quantifying the detection capabilities of a multi-purpose service - a complex problem

Ideally, simple clear quantitative measures should be provided to describe a multi-purpose service’s detection capabilities. However, published quantitative estimates for detection capabilities have usually been restricted to single diseases or to the automated part of a service. For example, it is much easier to deliver estimates structured as “the algorithm had a sensitivity of 98% and a specificity of 84% for simulated influenza outbreaks” rather than “this syndromic service resulted in appropriate action 85% of the time, with 20% of actions subsequently found to be unnecessary”. This research focus may be because quantifying the detection capabilities of a multi-purpose syndromic service is not as straightforward as it might initially appear. In fact, this is not just a complicated problem but a complex one. A complicated problem might be large and require considerable resources but can be answered by a single rule-based process, whereas a complex problem requires a range of context-specific methods to obtain answers. Similar issues of complexity have been found in evaluating public health interventions [ 12 ]. Here, we provide a perspective paper on the complexities involved in providing meaningful answers for what can and cannot be detected by a multi-purpose syndromic surveillance service. Thus, we aim to suggest a way forward in tackling this complex problem, which can be adopted by other organisations and countries coordinating a multi-purpose syndromic surveillance service.

Measures for quantifying detection – laboratory tests analogy

Syndromic surveillance systems are often used alongside and complement traditional surveillance systems such as those based on laboratory testing. Therefore, we use laboratory tests as an example to describe how detection capabilities can be quantified. Then, by analogy we discuss what is required to quantify the detection capabilities of syndromic systems.

Quantifying laboratory tests – a ‘simple’ example

A laboratory test needs to be able to identify disease rapidly with few ‘false alarms’ [ 13 ]. Therefore, evaluation measures must include: a measure for how likely the test is to detect disease; a measure for how likely it is to create false alarms; and a measure for how quickly it will detect disease. Firstly, sensitivity (also called recall) can be defined as the proportion of patients with disease correctly identified by a positive test. Secondly, false alarms can be quantified using specificity or positive predictive value (PPV; also called precision). Specificity can be defined as the proportion of tested patients without a disease who have a negative test result, and PPV as the proportion of positive tests that come from patients with the disease. Finally, timeliness can be defined as the time between a sample being taken and the laboratory report being available.

Calculating these quantitative measures for a laboratory test requires a list of patients, with a variable for whether the disease or condition is present, and a linked list of samples, with a variable for whether the laboratory test was positive for the disease or condition ( Figure 1 ).

Figure 1. Results matrix for evaluating the sensitivity and specificity of a single laboratory test
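
As a minimal sketch of the measures defined above, the following Python function computes sensitivity, specificity and PPV from the four cells of a results matrix. The function name `diagnostic_measures`, the `Measures` class and the counts in the usage lines are hypothetical and chosen only for illustration; timeliness is omitted because it is simply the elapsed time between sample and report.

```python
from dataclasses import dataclass


@dataclass
class Measures:
    sensitivity: float
    specificity: float
    ppv: float


def diagnostic_measures(tp: int, fp: int, fn: int, tn: int) -> Measures:
    """Compute measures from a 2x2 results matrix.

    tp: patients with the disease and a positive test
    fp: patients without the disease and a positive test
    fn: patients with the disease and a negative test
    tn: patients without the disease and a negative test
    """
    if tp + fn == 0 or fp + tn == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return Measures(
        sensitivity=tp / (tp + fn),  # diseased patients correctly identified
        specificity=tn / (tn + fp),  # non-diseased patients with a negative test
        ppv=tp / (tp + fp),          # positive tests that come from diseased patients
    )


# Hypothetical counts, for illustration only.
m = diagnostic_measures(tp=90, fp=20, fn=10, tn=880)
print(m)
```

With these illustrative counts the test misses 10 of 100 diseased patients, while 20 of its 110 positive results are false alarms.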

Quantifying syndromic surveillance – a ‘complex’ example

By analogy, it should be possible to create the same quantitative measures i.e. the sensitivity, specificity, PPV and timeliness of a syndromic surveillance service ( Figure 2 ). However, instead of comparing a list of patients and test results, we need a list of events we want to detect and a linked list of detections made by the service (throughout this paper, we will use the term ‘event’ to cover all the different public health threats a service aims to detect, including outbreaks with different aetiologies, public health incidents and the impact of environmental exposures etc., Figure 3 ).

Figure 2. Results matrix for evaluating a multi-purpose syndromic surveillance service.

Figure 3. Types of events that a multi-purpose syndromic surveillance service aims to detect.

In theory, given a linked list of events to be detected and a list of detections reported by a syndromic service, we can quantify the detection capabilities of the service. However, in practice, creating definitive linked lists of events and detections is complex.
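
To make the idea of linked lists concrete, the sketch below links a list of detections to a list of events by date and derives event-level sensitivity, detection-level PPV and a median detection delay. The event windows, detection dates, the 7-day `grace` period and the linking rule itself are all assumptions made for illustration; they are not taken from any particular service.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical event list: (event id, start date, end date).
events = [
    ("E1", date(2020, 1, 6), date(2020, 1, 20)),
    ("E2", date(2020, 3, 2), date(2020, 3, 9)),
    ("E3", date(2020, 6, 1), date(2020, 6, 14)),
]
# Hypothetical dates on which the service reported a detection.
detections = [date(2020, 1, 9), date(2020, 1, 12), date(2020, 4, 15), date(2020, 6, 3)]


def link(events, detections, grace=timedelta(days=7)):
    """Link each detection to the first event whose window (plus grace) contains it."""
    first_hit = {}  # event id -> earliest linked detection date
    unlinked = []   # detections that match no listed event
    for d in sorted(detections):
        for event_id, start, end in events:
            if start <= d <= end + grace:
                first_hit.setdefault(event_id, d)
                break
        else:
            unlinked.append(d)
    return first_hit, unlinked


first_hit, unlinked = link(events, detections)
sensitivity = len(first_hit) / len(events)                  # events with a linked detection
ppv = (len(detections) - len(unlinked)) / len(detections)   # detections linked to an event
delays = [(first_hit[e] - start).days for e, start, _ in events if e in first_hit]
print(sensitivity, ppv, median(delays))
```

The sketch counts every unlinked detection as a false alarm. In practice an unlinked detection may instead reflect an event missing from the list, which is one reason why constructing definitive lists is difficult.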

What do we want to detect with syndromic surveillance?

Multi-purpose surveillance

Syndromic surveillance originated as population-level surveillance for early warning of bioterrorism threats, but it has subsequently been used for early warning of other events and is increasingly used for reassurance of the lack of adverse health impact in a specific context, or for situational awareness after a known exposure [ 1 , 14 ]. A multi-purpose syndromic surveillance service may have multiple objectives [ 2 , 8 , 10 ]:

  • Early warning of unexpected events, e.g. bioterrorism, emerging new diseases, outbreaks;
  • Early warning of aberrant trends by monitoring endemic or seasonal diseases, e.g. scarlet fever or seasonal influenza;
  • Reassurance and monitoring during mass gatherings, e.g. Olympic and Paralympic Games;
  • Situational awareness during pre-identified outbreaks or environmental incidents, e.g. COVID-19, an influenza pandemic or heat wave.

Therefore, a multi-purpose syndromic surveillance service will need to detect a wide range of events, reflecting potential threats to public health, including infectious disease, environmental impacts and mass gatherings ( Figure 3 ).

Compiling a list of events to be detected through multi-purpose surveillance is complex because different types of events are defined in different ways. For example, point-source outbreaks might have a clear start and end date, whilst propagated or seasonal epidemics cannot be clearly defined in this way [ 8 ]. Similarly, how suspected events are validated will vary by type. For infectious diseases, laboratory reports provide a ‘gold-standard’ for incidence; however, independent data may not be available for other types of events, e.g. an increase in hay fever reports. For some types of events, e.g. extreme weather or mass-gatherings, it may be easy to validate exposure but less obvious how to independently validate impact on the population’s health. Consequently, we may be able to create a list of events which have been detected by other surveillance systems (but not those which have not), yet still be uncertain about the timing and size of any public health impacts that the syndromic service needs to detect.

Obtaining historical examples

It is important that syndromic services are evaluated across the full range of event types and different sizes of event [ 17 , 18 ]. However, for some types of event there may be no historical data available or only a limited range of outbreak sizes, locations etc. [ 8 ]. Therefore, synthetic simulated data are often used to evaluate syndromic systems [ 19 ]. Both real historical events and synthetic events have advantages and disadvantages: historical events may be rare, whilst synthetic events may be unrealistic [ 20 ]. The main disadvantage of using synthetic events is that they require modelling assumptions; for example, healthcare seeking behaviours for a range of diseases need to be estimated from other research, which is not straightforward [ 21 ]. A commonly used approach is to ‘inject’ synthetic simulations of events into ‘real’ historic syndromic data [ 5 ]. Furthermore, real scaled events can be injected to reduce modelling assumptions about the relationship between outbreak size and syndromic indicators [ 17 , 22 - 24 ]. However, results will still depend upon assumptions about the lag between exposure, symptom onset and whether a person presents to health care.
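
The injection approach can be sketched as follows. This is an illustrative outline only: the baseline here is randomly generated as a stand-in for real historical counts, the outbreak shape is invented, and the `inject` function and its `scale` parameter are hypothetical names. The sketch does not model the lag between exposure, symptom onset and presentation to health care.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Stand-in for 'real' historical daily counts of one syndromic indicator.
baseline = [random.randint(40, 60) for _ in range(28)]

# Hypothetical outbreak shape: extra presentations per day since onset.
outbreak_shape = [1, 3, 6, 8, 6, 3, 1]


def inject(baseline, shape, start_day, scale=1.0):
    """Return a copy of baseline with a scaled outbreak added from start_day."""
    series = list(baseline)
    for offset, extra in enumerate(shape):
        day = start_day + offset
        if day < len(series):
            series[day] += round(extra * scale)
    return series


injected = inject(baseline, outbreak_shape, start_day=14, scale=2.0)
added = [after - before for before, after in zip(baseline, injected)]
print(added[14:21])
```

Repeating the injection across a range of `scale` values and start days, and then running the detection process on each injected series, gives one way to estimate how large an event must be before it is detected.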

Completeness of event lists

To evaluate a syndromic service, the list of events to be detected must be comprehensive and exclusive ( Figure 3 ). Furthermore, to estimate specificity or PPV, an identified period without such events is also needed. However, even for event types where numerous independently verifiable outbreaks are available, it may be impossible to guarantee that all events have been identified. It is perfectly plausible that syndromic data contain unverified events, for example, increases in respiratory illness have been observed in autumn that cannot be explained by comparison with laboratory data [ 20 ]. These unverified outbreaks within baseline syndromic data can result in lower specificity and PPV estimates [ 8 , 14 ]. Figure 4 summarizes the complexities around defining what needs to be detected by syndromic surveillance, as discussed above.

Figure 4. Reasons why defining ‘events’ to be detected by syndromic surveillance is complex.

Defining detection with syndromic surveillance

Whilst it is relatively straightforward to define the detection parameters for statistical algorithms [ 25 ], it becomes more complex when we consider the whole syndromic surveillance service. Firstly, we need to consider how the service reports detection, which may depend on its ‘surveillance objective’. Secondly, we need to decide how to link detection to events in the context of multi-purpose syndromic surveillance.

Objectives for a syndromic surveillance service

The objective that a syndromic service is fulfilling will affect both the definition of detection and its ability to detect events. For example, when acting as an early warning system a syndromic service may define detection as alerting the appropriate authorities prior to any other surveillance system. Successful early warning depends on a service’s routine surveillance practices and reporting arrangements. By contrast, when providing situational awareness during a known event, the multi-purpose service can focus on a geographical area and subset of syndromic indicators, which will increase the probability of detecting an impact. Also, when providing situational awareness, the service may define detection as identifying small changes in trends, which would not have triggered an early warning response to a hitherto unknown event. Similarly, a service that routinely monitors seasonal diseases (e.g. influenza) may have specifically developed thresholds that are more sensitive than those that warn of undefined new threats [ 26 ]. Finally, the objective of a syndromic service may change when an event becomes publicly known through media reports, e.g. COVID-19. Moreover, syndromic indicators may be affected by changes in patient health-seeking behaviour because of increased awareness after an event [ 8 , 10 , 27 ], or changes in government advice e.g. during a lock-down. In summary, creating a list of detections requires consideration of whether the event was expected and the service’s objective at the time of detection.

Multi-purpose syndromic indicators

The ability to link what is detected by syndromic surveillance to specific events is further complicated because many syndromic indicators are multi-purpose. Whilst some syndromic indicators are very specific (e.g. bloody diarrhoea) others (e.g. gastrointestinal) are designed to have a high sensitivity but low specificity to maximise the chance of detecting events or to ensure that new emerging threats, such as COVID-19, are captured [ 3 , 4 , 28 ]. These broad syndromic indicators may detect a range of different types of events. For example, generic respiratory indicators (e.g. cough or difficulty breathing) have been found to be associated with changing trends in laboratory reports for several different respiratory pathogens [ 20 , 29 - 30 ] as well as seasonal allergies [ 31 ]. Consequently, a syndromic service will often detect an increasing trend but not be able to link it to a specific event or individual organism, without further context. However, the ability to link detection to events may also depend on the objective of the surveillance system. For example, during a known laboratory-confirmed measles outbreak, a syndromic service may use a general indicator, e.g. rash, for situational awareness, which would not be considered as an effective early warning indicator for unknown measles outbreaks [ 10 ]. Furthermore, when laboratory data are not available to verify causal pathogens, syndromic indicators or combinations of symptoms may be used to suggest probable causes of outbreaks [ 3 , 32 ], particularly for multi-system surveillance [ 20 ]. Finally, during a pandemic of an emerging disease like COVID-19, new processes or diagnostic codes may be introduced which have an impact on existing syndromic indicators.

Much of the published research evaluating syndromic surveillance focuses either on just one type of event or on the detection capabilities of statistical algorithms. We have reflected on and highlighted the complexities of evaluating and quantifying the detection capability of a multi-purpose syndromic service, which may explain the lack of published evidence on this subject. However, to address questions from users of syndromic surveillance about detection capabilities, we need to avoid over-simplifications and provide descriptions which directly address the complexities and wide-ranging utility of these services.

Therefore, we argue that syndromic surveillance service evaluations need to measure separately the different types of event that the service aims to detect and to consider all surveillance stages. Whilst the authors support the use of the CDC’s framework for evaluation of surveillance systems [ 2 ], we also believe the complexity of multi-purpose systems needs to be considered in such frameworks. Firstly, separate answers are needed for different types of event, both to address users’ specific questions and because different types of events will require different methods for evaluation. Crucially, these separate evaluations should be done in the context of a multi-purpose service, where other types of events can affect detection capabilities, and should also address the ability to identify causes. Secondly, syndromic services should be evaluated beyond the generation of statistical alarms to provide results that inform public health action. Service evaluations should include consideration of the routine surveillance messages and the impact of public health actions for different event types.

To quantify the detection capabilities of syndromic surveillance it is important to compare events that the system aims to detect with what was detected. However, in this commentary we have shown that for a multi-purpose service, defining and linking these events is complex. The complexities arise from the wide range of events covered by a multi-purpose service and the need to assess not just the performance of statistical algorithms but the whole service process.

Measure each event type separately

When considering a multi-purpose syndromic surveillance service, no single measure can helpfully describe its detection capabilities across all the different types of events it aims to detect. Therefore, it is important to consider all the different types of events to be detected and measure detection capabilities separately for each.

Measuring each type of event separately means that a different approach can be used for different event types, for instance in how events are defined or which user questions are addressed. Involving key internal and external stakeholders (including users of the service) in the evaluation is very important to ensure relevance [ 17 ]. For example, stakeholders can steer how narrowly the event types are defined and help to address issues such as whether it is sufficient to estimate detection for all gastrointestinal outbreaks or whether users require separate estimates for specific pathogens, e.g. cryptosporidium or rotavirus.

When measuring each event type separately there is still a need to consider how other types might affect detection capabilities. For example, does the ability to detect the health impact of air pollution change during an influenza epidemic? Also, where there are multi-purpose indicators, correct detection of one type of event could be considered as a false alarm for detecting another type of event. Importantly, evaluating a multi-purpose service by measuring different event types separately is not the same as performing a series of parallel evaluations in each of which the service is treated as if it had only one purpose.
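
The effect of a multi-purpose indicator on per-type measures can be illustrated with a small sketch. The alarm list below is invented: it assumes eight alarms on one broad respiratory indicator, each later attributed to an event type or to no identified event. The variable names are hypothetical.

```python
from collections import Counter

# Hypothetical alarms on one broad respiratory indicator, each labelled with
# the event type it was later attributed to (None = no event identified).
alarm_causes = ["influenza", "influenza", "hay fever", "air pollution",
                "influenza", None, "hay fever", None]

counts = Counter(alarm_causes)
total = len(alarm_causes)

# PPV for 'any event of interest': every attributed alarm is a true alarm.
ppv_any = (total - counts[None]) / total

# PPV per event type: alarms caused by other event types count as false alarms.
ppv_by_type = {
    cause: n / total for cause, n in counts.items() if cause is not None
}
print(ppv_any, ppv_by_type)
```

In this illustration the same set of alarms has a PPV of 0.75 for detecting any event but only 0.375 for influenza, because the hay fever and air pollution alarms are false alarms from the point of view of influenza surveillance.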

Clearly, it requires much more work to tackle each event type separately, particularly if a range of different approaches are needed. However, this will provide a much richer understanding of the service’s capabilities and enhance users’ interpretation and confidence in the service outputs.

Evaluating each stage in the surveillance process

The automated statistical detection algorithm is just one stage in a syndromic service’s many processes [ 33 ]. The stages can be characterized as: data collection, storage and extraction; aggregation to syndromic indicators; application of detection algorithms; and interpretation, reporting and taking action. It is important to evaluate the service as a whole, so detection involves not just automated alarms but their interpretation, prioritization, reporting and public health impact [ 34 ]. However, evaluating each stage in the process separately can provide useful insights into which factors affect the service’s ability to detect events [ 35 ].

Firstly, evaluating data collection will reveal what proportion of the target population is covered by the service and whether there are any delays in receiving information. For example, a sentinel service will be unable to detect local outbreaks in locations not covered by the system [ 36 ]. Secondly, the underlying codes, diagnoses or free text included in syndromic indicators will determine their sensitivity and specificity [ 28 ]; for example, a multi-purpose indicator may be able to detect different diseases with varying success due to different disease characteristics [ 7 ]. Thirdly, evaluating detection algorithms enables users to choose the most appropriate method for their service, which may vary by event type. Finally, evaluating the interpretation and reporting stage usually involves assessing which automated statistical alarms require further action; therefore, this stage should improve PPV and specificity but with a cost for timeliness and possibly sensitivity [ 6 ]. Considering each stage separately should enable service users to identify areas where a system can be improved, for example by asking what the main causes of delays are, or whether more data are being collected than can be analyzed. Figure 5 summarizes how each stage can impact on sensitivity, PPV and timeliness as discussed above. Each additional stage may introduce delays to timeliness and a drop in sensitivity but should increase the PPV.

Figure 5. Impact on detection capabilities of different stages in syndromic surveillance.
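
The trade-off between stages can be sketched by comparing measures before and after a human review step. Everything in the example is assumed for illustration: the alarm list, the review decisions, the number of true events and the one-day review delay are invented, and `measures` is a hypothetical helper.

```python
# Hypothetical alarms: (linked event id or None, escalated after human review).
alarms = [
    ("E1", True), ("E2", True), ("E3", False), (None, False),
    (None, False), (None, True), ("E4", True), (None, False),
]
n_events = 5           # hypothetical number of true events (E5 never raised an alarm)
review_delay_days = 1  # hypothetical time added by interpretation and reporting


def measures(alarm_list, n_events):
    """Event-level sensitivity and alarm-level PPV for a list of alarms."""
    detected = {event for event, _ in alarm_list if event is not None}
    true_alarms = sum(1 for event, _ in alarm_list if event is not None)
    return len(detected) / n_events, true_alarms / len(alarm_list)


# Measures for the automated algorithm alone, then after review filters the alarms.
algorithm_stage = measures(alarms, n_events)
review_stage = measures([a for a in alarms if a[1]], n_events)
print(algorithm_stage, review_stage, review_delay_days)
```

Under these assumptions the review stage raises PPV from 0.5 to 0.75 but lowers sensitivity from 0.8 to 0.6, because one true event (E3) was not escalated, and it adds a day to the time to detection.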

Future work

We have focussed on the complexities surrounding evaluation of a multi-purpose syndromic service; therefore, we have not considered other important issues such as cost-effectiveness or the added value of additional data sources. However, understanding evaluation complexities will be useful for future studies into these issues. Evaluation of a multi-purpose syndromic surveillance service should not be a one-off process; it should be periodic, creating a positive feedback loop. Information about a service’s detection capabilities should be updated as new evidence comes to light, or in response to major incidents such as the current COVID-19 pandemic. Also, the most valuable information for assessing a service will come from its on-going performance. Therefore, a syndromic service should have clear objectives and maintain a database of past events of different types and detections to enable on-going validation [ 37 ]. The process of identifying the different types of event that the users want a multi-purpose syndromic service to detect should help identify gaps in our knowledge about service detection capabilities, and in turn, this should help guide research priorities.

Acknowledgements

The authors would like to thank the members of PHE’s Real-time syndromic surveillance team who helped develop England’s complex multi-purpose syndromic surveillance system, including: Amardeep Bains, Sally Harcourt, Helen Hughes, Paul Loveridge, Sue Smith and Ana Soriano. RM, GS, IL and AJE are affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emergency Preparedness and Response. GS, OE, IL, NM and AJE are affiliated to the NIHR HPRU in Gastrointestinal Infections. IO is affiliated to the NIHR HPRU in Behavioral Science and Evaluation. The views expressed are those of the author(s) and not necessarily those of the NIHR, PHE or the Department of Health and Social Care.

Financial Disclosure

No Financial Disclosures.

Competing Interests

No Competing Interests.

IMAGES

  1. (PDF) Open data in public health surveillance systems: A case study

    case study health surveillance

  2. Frontiers

    case study health surveillance

  3. The health surveillance cycle • Stockwell Safety

    case study health surveillance

  4. Strengthening the CDC’s Public Health Surveillance System

    case study health surveillance

  5. An Introduction to Health Surveillance

    case study health surveillance

  6. Modernizing our Public Health Surveillance Systems

    case study health surveillance

VIDEO

  1. Health Informatics & Health Analysis specialisms webinar: September 2024 entry

  2. Data Science in Healthcare (case study)

  3. No-Code Data Analytics 📈 Hal9 Case Study: Health Data 🏥

  4. Monitoring

  5. PUBLIC HEALTH SURVEILLANCE |PUBLIC HEALTH NUTRITION |FULL LECTURE |BY SAIMA RAFI |AIOU

  6. Case studies Health Working Lives

COMMENTS

  1. Digital public health surveillance: a systematic scoping review

    The studies were from 54 countries and utilized 26 digital platforms to study 208 sub-categories of 49 categories associated with 16 public health surveillance (PHS) themes.

  2. The Power of Public Health Surveillance

    As Delaware reopens in phases, the Delaware Department of Health and Social Services, Division of Public Health (DPH) - the state's lead health agency - is conducting public health surveillance. Case investigations and contact tracing have impacted disease transmission rates by identifying those needing isolation or quarantine.

  3. Performance of COVID-19 case-based surveillance system in FCT ...

    Methods. We used a cross-sectional study design, comprising a survey, key informant interview, record review and secondary data analysis. A self-administered, semi-structured questionnaire was administered to key stakeholders to assess the attributes and process of operation of the surveillance system using CDC's Updated Guidelines for Evaluation of Public Health Surveillance System 2001.

  4. People-centred surveillance: a narrative review of community-based

    Outbreaks of disease in settings affected by crises grow rapidly due to late detection and weakened public health systems. Where surveillance is underfunctioning, community-based surveillance can contribute to rapid outbreak detection and response, a core capacity of the International Health Regulations. We reviewed articles describing the potential for community-based surveillance to detect ...

  5. Humanitarian led community-based surveillance: case study in Ekondo

    Community-based surveillance (CBS) is defined as the systematic detection and reporting of events of public health significance by community members within a community [].CBS has been used in both; non crisis settings for public health emergencies and crisis situations including conflict and natural disasters [2,3,4].CBS has been used in eradication programs including guinea worm and polio, as ...

  6. Surveillance: The Role of Observation in Epidemiological Studies

    Public health surveillance is the cornerstone of public health practice and is critical for improving population health. Public health surveillance is the ongoing, systematic collection, analysis, and interpretation of relevant health data, which play a key role in planning, implementing, and evaluating public health policies and practices as well as disseminating information needed ...

  7. Use of artificial intelligence for public health surveillance: a case

    Background: The use of machine learning techniques is increasing in healthcare, making it possible to estimate and predict health outcomes from large administrative data sets more efficiently. The main objective of this study was to develop a generic machine learning (ML) algorithm to estimate the incidence of diabetes based on the number of reimbursements over the last 2 years. Methods: We selected a ...

  8. Introduction to Public Health Surveillance

    Public health surveillance is "the ongoing, systematic collection, analysis, and interpretation of health-related data essential to planning, implementation, and evaluation of public health practice" (Field Epidemiology). These materials provide an overview of public health surveillance systems and methods.

  9. The Case For Real-Time Public Health Surveillance

    Researchers, epidemiologists, and policy makers all need timely data. Real-time surveillance enables rapid and standardized data sharing to understand how ...

  10. What is Case Surveillance?

    Case surveillance occurs each time public health agencies at the local, state, or national levels collect information about a case or person diagnosed with a disease or condition that poses a serious health threat to Americans. These diseases and conditions include noninfectious conditions, such as lead poisoning.

  11. Strengthening global health security through health early warning

    The goal of event-based public health surveillance is to identify and assess potential threats to public health by analyzing news stories, media posts, and other sources of information regarding health events. ... (HEWS) during the Hajj. This case study highlighted the benefits of early warning systems in an international mass gathering (MG ...

  12. "Hospitals respond to demand. Public health needs to respond to risk

    The boundaries of the case are the communicable disease surveillance and response systems in NQ, with COVID-19 one of four embedded units of analysis in the broader study. The case study context is the broader public health system in NQ (Fig. 2). The four case units were selected for their differences according to organisational, biological ...

  13. Conducting public health surveillance in areas of armed conflict and

    This study examined the impact of armed conflict on public health surveillance systems, the limitations of traditional surveillance in this context, and innovative strategies to overcome these limitations. A qualitative case study was conducted to examine the factors affecting the functioning of poliovirus surveillance in conflict-affected areas of Borno state, Nigeria using semi-structured ...

  14. Clinical Surveillance, A Concept Analysis: Leveraging Real-Time ...

    Public health surveillance is defined as the "continuous systematic recognition, collection, analysis, interpretation and dissemination of data about a health-related event for the use of public health action to reduce morbidity and mortality and to improve health" (CDC, 2018, p. 1). ... As demonstrated in the "Related" case study (see ...

  15. Case Library

    The Harvard Chan Case Library is a collection of teaching cases with a public health focus, written by Harvard Chan faculty, case writers, and students, or in collaboration with other institutions and initiatives. Use the filters at right to search the case library by subject, geography, health condition, and representation of diversity and identity to find cases to fit your teaching needs.

  17. Sentinel Surveillance System Implementation and Evaluation for SARS-CoV

    Measuring effects of genomic surveillance on public health responses in Washington was not included in this study; however, methods for measuring and evaluating effectiveness should be explored. Overrepresentation of older persons in presentinel genomic data was partly driven by selection of LTCF-associated COVID-19 cases and COVID-19 cases ...

  18. PDF Case Definitions for Public Health Surveillance

    Centers for Disease Control and Prevention. Case definitions for public health surveillance. MMWR 1990;39(No. RR-13):[inclusive page numbers]. Copies can be purchased from Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402-9325. Telephone: (202) 783-3238.

  19. JMIR Public Health and Surveillance

    Background: Depression is often accompanied by changes in behavior, including dietary behaviors. The relationship between dietary behaviors and depression has been widely studied, yet previous research has relied on self-reported data which is subject to recall bias. Electronic device-based behavioral monitoring offers the potential for objective, real-time data collection of a large amount ...

  20. JMIR Public Health and Surveillance

    Background: The World Health Organization aims for the global elimination of cervical cancer, necessitating modeling studies to forecast long-term outcomes. Objective: This paper introduces a macrosimulation framework using age-period-cohort modeling and population attributable fractions to predict the timeline for eliminating cervical cancer in Taiwan.

  21. Surveillance of Infectious Diseases

    An example of the importance of serological surveillance in determining public health policy is included below (analysis by person). ... Morbey R.A., et al. A methodological framework for the evaluation of syndromic surveillance systems: A case study of England. BMC Public Health. 2018;18. doi: 10.1186/s12889-018-5422-9. ...

  22. The influence of maternal prepregnancy weight and gestational weight

    The obesity epidemic is an important public health problem in developed and developing countries and is associated with the emergence of chronic noncommunicable diseases, including type 2 diabetes mellitus (T2DM), hypertension, cardiovascular disease, nonalcoholic fatty liver disease (NAFLD), and cancer [2,3,4]. Maternal obesity is the most common metabolic disturbance in pregnancy, and the ...

  23. Measles

    Methods: Reporting and Classification of Measles Cases. Confirmed measles cases § (1) are reported to CDC by state health departments through the National Notifiable Disease Surveillance System and directly (by email or telephone) to the National Center for Immunization and Respiratory Diseases. Measles cases are classified by the Council of State and Territorial Epidemiologists as import ...

  24. Evaluating multi-purpose syndromic surveillance systems

    A methodological framework for the evaluation of syndromic surveillance systems: a case study of England. BMC Public Health. 18, 544. 10.1186/s12889-018 ... Potential added value of the new emergency care dataset to ED-based public health surveillance in England: an initial concept analysis. Emerg Med J. 36, 459-64. 10.1136/emermed-2018 ...
