“It’s about how you take in things with your brain” - young people’s perspectives on mental health and help seeking: an interview study

Poor mental health in young people has become a growing problem globally over the past decades. However, young people have also been shown to underutilize available healthcare resources. The World Health Organ...

Racial and ethnic disparities in access to community-based perinatal mental health programs: results from a cross-sectional survey

Perinatal mental health is a major public health problem that disproportionately affects people from racial and ethnic minority groups. Community-based perinatal mental health programs, such as peer support gr...

Cervical cancer screening among women with comorbidities: evidence from the 2022 Tanzania demographic and health survey

The aim of this study is to examine cervical cancer screening (CCS) uptake among women living with hypertension and HIV in Tanzania.

Which aspects of education are health protective? a life course examination of early education and adulthood cardiometabolic health in the 30-year study of early child care and Youth Development (SECCYD)

Past research describes robust associations between education and health, yet findings have generally been limited to the examination of education as the number of years of education or educational attainment....

Trends in Parkinson’s disease mortality in China from 2004 to 2021: a joinpoint analysis

This study aimed to analyze the trends of Parkinson’s disease (PD) mortality rates among Chinese residents from 2004 to 2021, provide evidence for the formulation of PD prevention and control strategies to imp...

Associations between Life’s Essential 8 and abdominal aortic calcification among US Adults: a cross-sectional study

Cardiovascular health (CVH) and abdominal aortic calcification (AAC) are closely linked to cardiovascular disease (CVD) and related mortality. However, the relationship between CVH metrics via Life’s Essential...

Factors associated with the use of antibiotics for children presenting with illnesses with fever and cough obtained from prescription and non-prescription sources: a cross-sectional study of data for 37 sub-Saharan African countries

Fever and cough in under-five children are common and predominately self-limiting illnesses. Inappropriate prescribing of antibiotics in sub-Saharan Africa is a significant public health concern. However, pres...

A methodology for estimating SARS-CoV-2 importation risk by air travel into Canada between July and November 2021

Estimating rates of disease importation by travellers is a key activity to assess both the risk to a country from an infectious disease emerging elsewhere in the world and the effectiveness of border measures....

‘We get to learn as we move’: effects and feasibility of lesson-integrated physical activity in a Swedish primary school

Physical activity (PA) promotes health in adults as well as children. At the same time, a large proportion of children do not meet the recommendations for PA, and more school-based efforts to increase PA are n...

Association of dietary calcium intake at dinner versus breakfast with cardiovascular disease in U.S. adults: the national health and nutrition examination survey, 2003–2018

Currently, it is still largely unknown whether the proportion of calcium intake at breakfast and dinner is associated with cardiovascular disease (CVD) in the general population.

Retraction Note: Children’s physical activity level and sedentary behaviour in Norwegian early childhood education and care: effects of a staff-led cluster-randomised controlled trial

This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1186/s12889-024-18629-0.

Racial/ethnic differences in the associations between trust in the U.S. healthcare system and willingness to test for and vaccinate against COVID-19

Trust in the healthcare system may impact adherence to recommended healthcare practices, including willingness to test for and vaccinate against COVID-19. This study examined racial/ethnic differences in the a...

Engaging with immigrant students’ voices in the school environment: an analysis of policy documents through school websites

For students to feel happy and supported in school, it is important that their views are taken seriously and integrated into school policies. However, limited information is available on how the voices of immigra...

Effectiveness of implementing evidence-based approaches to promote physical activity in a Midwestern micropolitan area using a quasi-experimental hybrid type I study design

Many evidence-based physical activity (PA) interventions have been tested and implemented in urban contexts. However, studies that adapt, implement, and evaluate the effectiveness of these interventions in mic...

Prevalence of tobacco use among cancer patients in Iran: a systematic review and meta-analysis

The prevalence of tobacco use among various cancer types in Iran remains a significant concern, necessitating a comprehensive analysis to understand the extent and patterns of consumption. This study aimed to ...

Clusters of 24-hour movement behavior and diet and their relationship with health indicators among youth: a systematic review

Movement-related behaviors (physical activity [PA], sedentary behavior [SB], and sleep) and diet interact with each other and play important roles in health indicators in youth. This systematic review aimed to...

Associations between 47 anthropometric markers derived from a body scanner and relative fat-free mass in a population-based study

Low relative fat-free mass (FFM) is associated with a greater risk of chronic diseases and mortality. Unfortunately, FFM is currently not being measured regularly to allow for individual therapy.

Correction: Sodium, potassium intake, and all-cause mortality: confusion and new findings

The original article was published in BMC Public Health 2024 24 :180

Relationship between resilience at work, work engagement and job satisfaction among engineers: a cross-sectional study

Workplace challenges can negatively affect employees and the organization. Resilience improves work-related outcomes like engagement, satisfaction, and performance. Gaps exist in studying resilience at work, p...

Demographic disparities in the limited awareness of alcohol use as a breast cancer risk factor: empirical findings from a cross-sectional study of U.S. women

Alcohol use is an established yet modifiable risk factor for breast cancer. However, recent research indicates that the vast majority of U.S. women are unaware that alcohol use is a risk factor for breast canc...

Design of a bilingual (FR-UR) website on the sensitive topic of sexual and mental health with Urdu speakers in a Parisian suburb: a qualitative study

This article is a continuation of the Musafir study published in 2020. Following the results of this study, we designed an educational website with Urdu-speaking volunteers, using a participatory approach. Thi...

Assessment of the correlation between KAP scores regarding sugar-sweetened beverage consumption and hyperuricemia amongst Chinese young adults

The prevalence of hyperuricemia in China has been consistently increasing, particularly among the younger generation. The excessive consumption of sugar-sweetened beverages is associated with hyperuricemia. Th...

Problematic usage of the internet among Hungarian elementary school children: a cross-sectional study

Problematic usage of the internet (PUI) is perhaps one of the most frequently studied phenomena of the 21st century receiving increasing attention in both scientific literature and the media. Despite intensive...

Enhancing routine HIV and STI testing among young men who have sex with men: primary outcomes of the get connected clinical randomized trial (ATN 139)

Regular HIV and STI testing remain a cornerstone of comprehensive sexual health care. In this study, we examine the efficacy of Get Connected, a WebApp that combines test locators with personalized educational...

Inter-leg systolic blood pressure difference has been associated with all-cause and cardiovascular mortality: analysis of NHANES 1999–2004

Inter-leg systolic blood pressure difference (ILSBPD) has emerged as a novel cardiovascular risk factor. This study aims to investigate the predictive value of ILSBPD on all-cause and cardiovascular mortality ...

Sex-related inequalities in crude and age-standardized suicide rates: trends in Ghana from 2000 to 2019

Suicide represents a major public health concern, affecting a significant portion of individuals. However, there remains a gap in understanding the age and sex disparities in the occurrence of suicide. Therefo...

Association of daily sitting time and coffee consumption with the risk of all-cause and cardiovascular disease mortality among US adults

Sedentary behavior has been demonstrated to be a modifiable factor for several chronic diseases, while coffee consumption is believed to be beneficial for health. However, the joint associations of daily sitti...

Association of hypertension and depression with mortality: an exploratory study with interaction and mediation models

The association of hypertension and depression with mortality has not been fully understood. We aimed to explore the possible independent or joint association of hypertension and depression with mortality. The...

Knowledge and trust of mothers regarding childhood vaccination in Rwanda

Knowledge and trust are among the factors contributing to vaccine acceptance (VA), and vaccine hesitancy (VH) is one of the top threats to global health. A significant drop in childhood vaccination has been ob...

Associations between COVID-19 incidence, weight status, and social participation restrictions in the U.S.: evidence from the national population, cross-sectional study

To explore the associations between coronavirus infection incidence and weight status and social participation restrictions among community-dwelling adults in the United States.

The impact of internet use on health among older adults in China: a nationally representative study

Aging poses a significant challenge worldwide, with China’s aging status becoming particularly severe. What is the impact of Internet use on the health of the elderly? Existing studies have drawn conflicting c...

The roles of health literacy and social support in the association between smartphone ownership and frailty in older adults: a moderated mediation model

Understanding the role of smartphones to promote the health status of older adults is important in the digital society. Little is known about the effects of having smartphones on physical frailty despite its p...

Relationship between 24-h activity behavior and body fat percentage in preschool children: based on compositional data and isotemporal substitution analysis

This study aims to elucidate the dose‒response relationship between 24-h activity behaviors and body fat percentage (BFP) in Chinese preschool children using a compositional isotemporal substitution model (ISM).

Evaluating compliance with local and International Food Labelling Standards in urban Tanzania: a cross-sectional study of pre-packaged snacks in Dar Es Salaam

Urbanization influences food culture, particularly in low- and middle-income countries where there is an increasing consumption of processed and pre-packaged foods. This shift is contributing to a rise in non-...

Association between the circulating very long-chain saturated fatty acid and cognitive function in older adults: findings from the NHANES

Age-related cognitive decline has a significant impact on the health and longevity of older adults. Circulating very long-chain saturated fatty acids (VLSFAs) may actively contribute to the improvement of cogn...

The association between psychological distress, abusive experiences, and help-seeking among people with intimate partner violence

Intimate partner violence (IPV) is a serious public health problem associated with countless adverse physical and mental health outcomes. It places an enormous economic and public health burden on communities....

Experiences of support for people who access voluntary, community and social enterprise (VCSE) organisations for self-harm: a qualitative study with stakeholder feedback

Prevalence of self-harm in England is rising; however, contact with statutory services remains relatively low. There is growing recognition of the potential role voluntary, community and social enterprise secto...

Subnational estimates of life expectancy at birth in India: evidence from NFHS and SRS data

Mortality estimates at the subnational level are of urgent need in India for the formulation of policies and programmes at the district level. This is the first-ever study which used survey data for the estima...

The association between social connectedness and euthanasia and assisted suicide and related constructs: systematic review

Euthanasia and assisted suicide (EAS) requests are common in countries where they are legal. Loneliness and social isolation are modifiable risk factors for mental illness and suicidal behaviour and are common...

Health effects of holistic housing renovation in a disadvantaged neighbourhood in the Netherlands: a qualitative exploration among residents and professionals

Holistic housing renovations combine physical housing improvements with social and socioeconomic interventions (e.g. referral to social services, debt counselling, involvement in decision-making, promoting soc...

Archetype analysis and the PHATE algorithm as methods to describe and visualize pregnant women’s levels of physical activity knowledge

Knowledge of the physical activity (PA) recommended for pregnant women, and its practical application, has a positive impact on the outcome. Nevertheless, it is estimated that in high-income countries over 40% o...

Correlation of pre-existing comorbidities with disease severity in individuals infected with SARS-CoV-2 virus

Shortly after the first publication on the new disease called Coronavirus Disease 2019 (Covid-19), studies on the causal consequences of this disease began to emerge, initially focusing only on transmission me...

Integrating ‘undetectable equals untransmittable’ into HIV counselling in South Africa: the development of locally acceptable communication tools using intervention mapping

The global campaign for “Undetectable equals Untransmittable” (U = U) seeks to spread awareness of HIV treatment as prevention, aiming to enhance psychological well-being and diminish stigma. Despite its poten...

Migration process of Venezuelan women to Brazil: living conditions and use of health services in Manaus and Boa Vista, 2018–2021

The last decade saw the emergence of a new significant migration corridor due to the mass migration of Venezuelans to neighboring countries in South America. Since 2018, Brazil became the third host country of...

Associations between cardiovascular diseases and cancer mortality: insights from a retrospective cohort analysis of NHANES data

This study explored the association of cardiovascular disease (CVD) with cancer mortality risk in individuals with or without a history of cancer, to better understand the interplay between CVD and cancer outc...

Inflammation mediates the association between furan exposure and the prevalence and mortality of chronic obstructive pulmonary disease: National Health and Nutrition Examination Survey 2013–2018

Although extensive research has established associations between chronic obstructive pulmonary disease (COPD) and environmental pollutants, the connection between furan and COPD remains unclear. This study aim...

Correction: Racial discrimination is associated with food insecurity, stress, and worse physical health among college students

The original article was published in BMC Public Health 2024 24 :883

Longitudinal association of grip strength with cardiovascular and all-cause mortality in older urban Lithuanian population

Ageing populations experience greater risks associated with health and survival. It increases the relevance of identifying variables associated with mortality. Grip strength (GS) has been identified as an impo...

The 5 C model and Mpox vaccination behavior in Germany: a cross-sectional survey

Due to the authorization of the Mpox vaccines, we aimed to identify determinants of the intention to get vaccinated, actively trying to receive vaccination, and for successfully receiving a vaccination in Germ...

Pregnancy period and early-life risk factors for inflammatory bowel disease: a Northern Finland birth cohort 1966 study

The pathogenesis of inflammatory bowel disease (IBD) has not been fully elucidated. The aim of this study was to analyze the pregnancy period, perinatal period, and infancy period risk factors for IBD in a wel...

Annual Journal Metrics

2022 Citation Impact: 2-year Impact Factor 4.5; 5-year Impact Factor 4.7; SNIP (Source Normalized Impact per Paper) 1.661; SJR (SCImago Journal Rank) 1.307

2023 Speed: 32 days from submission to first editorial decision for all manuscripts (median); 173 days from submission to acceptance (median)

2023 Usage: 24,332,405 downloads; 24,308 Altmetric mentions

Peer-review Terminology

The following summary describes the peer review process for this journal:

Identity transparency: Single anonymized

Reviewer interacts with: Editor

Review information published: Review reports; reviewer identities (reviewer opt-in); author/reviewer communication

BMC Public Health

ISSN: 1471-2458

Trending Articles

  • Small extracellular vesicles from young plasma reverse age-related functional declines by improving mitochondrial energy metabolism. Chen X, et al. Nat Aging. 2024. PMID: 38627524
  • Dual-role transcription factors stabilize intermediate expression levels. He J, et al. Cell. 2024. PMID: 38631355
  • Brain endothelial GSDMD activation mediates inflammatory BBB breakdown. Wei C, et al. Nature. 2024. PMID: 38632402
  • ITPRIPL1 binds CD3ε to impede T cell activation and enable tumor immune evasion. Deng S, et al. Cell. 2024. PMID: 38614099
  • Overall Survival with Adjuvant Pembrolizumab in Renal-Cell Carcinoma. Choueiri TK, et al. N Engl J Med. 2024. PMID: 38631003 Clinical Trial.

Latest Literature

  • Am Heart J (1)
  • Am J Clin Nutr (1)
  • Am J Med (1)
  • J Am Acad Dermatol (1)
  • J Biol Chem (6)
  • J Neurosci (5)
  • Nat Commun (1)

BMC Health Services Research

Latest collections open to submissions

New: Digital health and the healthcare workforce

Guest Edited by Kerryn Butler-Henderson & Clair Sullivan

New: Researching and measuring inequalities in healthcare

Guest Edited by Magdalena Szaflarski & Zhonghua Wang

Sustainability in health services

Guest Edited by Virginia McKay & Judith Singleton 

Rural health services research

Guest Edited by Birgit Abelsen and Selina Taylor

Published collections

Advancing epidemic preparedness of health systems  

Guest edited by Yibeltal Assefa Alemu, Carl Abelardo T. Antonio, Julie Balen & Megan Schmidt-Sane

Health services for substance use disorders

Guest edited by Chaisiri Angkurawaranon, Berkeley Franz, João Pedro Silva

Health workforce planning

Guest edited by Madhan Balasubramanian and Sunny C. Okoroafor

Development and validation of the hospice professional coping scale among Chinese nurses

Authors: Yanting Zhang, Li Zheng, Yanling He, Min Han, Yu Wang, Jinyu Xv, Hui Qiu and Liu Yang

“ It’s hard to say anything definitive about what severity really is ”: lay conceptualisations of severity in a healthcare context

Authors: Mille Sofie Stenmarck, David GT Whitehurst, Hilde Lurås and Jorun Rugkåsa

Diabetic kidney disease screening status and related factors: a cross-sectional study of patients with type 2 diabetes in six provinces in China

Authors: Zhang Xia, Xuechun Luo, Yanzhi Wang, Tingling Xu, Jianqun Dong, Wei Jiang and Yingying Jiang

A scoping review of continuous quality improvement in healthcare system: conceptualization, models and tools, barriers and facilitators, and impact

Authors: Aklilu Endalamaw, Resham B Khatri, Tesfaye Setegn Mengistu, Daniel Erku, Eskinder Wolka, Anteneh Zewdie and Yibeltal Assefa

Quality indicators for hospital burn care: a scoping review

Authors: Denise R. Rabelo Suzuki, Levy Aniceto Santana, Juliana Elvira H. Guerra Ávila, Fábio Ferreira Amorim, Guilherme Pacheco Modesto, Leila Bernarda Donato Gottems and Vinicius Maldaner

Most recent articles RSS

Relationship between Organizational Culture, Leadership Behavior and Job Satisfaction

Authors: Yafang Tsai

How nurses and their work environment affect patient experiences of the quality of care: a qualitative study

Authors: Renate AMM Kieft, Brigitte BJM de Brouwer, Anneke L Francke and Diana MJ Delnoij

Proceedings of the 3rd IPLeiria’s International Health Congress

Authors: Catarina Cardoso Tomás, Emanuel Oliveira, D. Sousa, M. Uba-Chupel, G. Furtado, C. Rocha, A. Teixeira, P. Ferreira, Celeste Alves, Stefan Gisin, Elisabete Catarino, Nelma Carvalho, Tiago Coucelo, Luís Bonfim, Carina Silva, Débora Franco…

Characteristics of successful changes in health care organizations: an interview study with physicians, registered nurses and assistant nurses

Authors: Per Nilsen, Ida Seing, Carin Ericsson, Sarah A. Birken and Kristina Schildmeijer

PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews

Authors: Abigail M Methley, Stephen Campbell, Carolyn Chew-Graham, Rosalind McNally and Sudeh Cheraghi-Sohi

Most accessed articles RSS

Innovations for better health and social justice. Edited by: Dr. Magdalena Szaflarski. Collection published: 30 May 2022

Advancing Dementia Care. Edited by: Dr. Clarissa Giebel and Tillie Cryer. Collection published: 13 May 2020

Health Services Research for Opioid Use Disorders. Edited by: Dr. Kim Hoffman. Collection published: 31 March 2020

Management of Infectious Diseases in Health Systems and Services. Edited by: Tillie Cryer. Collection published: 19 March 2020

Aims and scope

BMC Health Services Research  is an open access, peer-reviewed journal that considers articles on all aspects of health services research. The journal has a special focus on digital health, governance, health policy, health system quality and safety, healthcare delivery and access to healthcare, healthcare financing and economics, implementing reform, and the health workforce.  

Become an Editorial Board Member

We are seeking new members to join our international Editorial Board.

BMC Health Services Research Blogs

Highlights of the BMC Series – February 2024

21 March 2024

Highlights of the BMC Series – November 2023

22 December 2023

World AIDS Day 2023: Highlights from the BMC Series

01 December 2023

Roles Data curation, Investigation, Writing – review & editing

Affiliation University of Cambridge School of Clinical Medicine, Cambridge, United Kingdom

Affiliation Eye Institute, Cleveland Clinic Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates

Roles Data curation, Investigation, Writing – original draft, Writing – review & editing

Affiliations University of Cambridge School of Clinical Medicine, Cambridge, United Kingdom, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom

Roles Data curation, Investigation

Affiliation West Suffolk NHS Foundation Trust, Bury St Edmunds, United Kingdom

Affiliation Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom

Affiliation Birmingham and Midland Eye Centre, Sandwell and West Birmingham NHS Foundation Trust, Birmingham, United Kingdom

Affiliation Department of Ophthalmology, Chang Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan

Affiliation Yong Loo Lin School of Medicine, National University of Singapore, Singapore

Roles Data curation, Investigation, Project administration, Writing – review & editing

Affiliation Bedfordshire Hospitals NHS Foundation Trust, Luton and Dunstable, United Kingdom

Affiliation Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore

Roles Writing – review & editing

Affiliations Birmingham and Midland Eye Centre, Sandwell and West Birmingham NHS Foundation Trust, Birmingham, United Kingdom, Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom

Roles Funding acquisition, Project administration

Affiliations Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore, Duke-NUS Medical School, Singapore, Singapore, Byers Eye Institute, Stanford University, Palo Alto, California, United States of America

Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

Affiliations Birmingham and Midland Eye Centre, Sandwell and West Birmingham NHS Foundation Trust, Birmingham, United Kingdom, Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom, Academic Ophthalmology, School of Medicine, University of Nottingham, Nottingham, United Kingdom

  • Arun James Thirunavukarasu, 
  • Shathar Mahmood, 
  • Andrew Malem, 
  • William Paul Foster, 
  • Rohan Sanghera, 
  • Refaat Hassan, 
  • Sean Zhou, 
  • Shiao Wei Wong, 
  • Yee Ling Wong, 

PLOS

  • Published: April 17, 2024
  • https://doi.org/10.1371/journal.pdig.0000341

Table 1

Large language models (LLMs) underlie remarkable recent advances in natural language processing, and they are beginning to be applied in clinical contexts. We aimed to evaluate the clinical potential of state-of-the-art LLMs in ophthalmology using a more robust benchmark than raw examination scores. We trialled GPT-3.5 and GPT-4 on 347 ophthalmology questions before GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training were trialled on a mock examination of 87 questions. Performance was analysed with respect to question subject and type (first order recall and higher order reasoning). Masked ophthalmologists graded the accuracy, relevance, and overall preference of GPT-3.5 and GPT-4 responses to the same questions. The performance of GPT-4 (69%) was superior to GPT-3.5 (48%), LLaMA (32%), and PaLM 2 (56%). GPT-4 compared favourably with expert ophthalmologists (median 76%, range 64–90%), ophthalmology trainees (median 59%, range 57–63%), and unspecialised junior doctors (median 43%, range 41–44%). Low agreement between LLMs and doctors reflected idiosyncratic differences in knowledge and reasoning with overall consistency across subjects and types (p > 0.05). All ophthalmologists preferred GPT-4 responses over GPT-3.5 and rated the accuracy and relevance of GPT-4 as higher (p < 0.05). LLMs are approaching expert-level knowledge and reasoning skills in ophthalmology. In view of the comparable or superior performance to trainee-grade ophthalmologists and unspecialised junior doctors, state-of-the-art LLMs such as GPT-4 may provide useful medical advice and assistance where access to expert ophthalmologists is limited. Clinical benchmarks provide useful assays of LLM capabilities in healthcare before clinical trials can be designed and conducted.

Author summary

Large language models (LLMs) are the most sophisticated form of language-based artificial intelligence. LLMs have the potential to improve healthcare, and experiments and trials are ongoing to explore potential avenues for LLMs to improve patient care. Here, we test state-of-the-art LLMs on challenging questions used to assess the aptitude of eye doctors (ophthalmologists) in the United Kingdom before they can be deemed fully qualified. We compare the performance of these LLMs to fully trained ophthalmologists as well as doctors in training to gauge the aptitude of the LLMs for providing advice to patients about eye health. One of the LLMs, GPT-4, exhibits favourable performance when compared with fully qualified and training ophthalmologists; and comparisons with its predecessor model, GPT-3.5, indicate that this superior performance is due to improved accuracy and relevance of model responses. LLMs are approaching expert-level ophthalmological knowledge and reasoning, and may be useful for providing eye-related advice where access to healthcare professionals is limited. Further research is required to explore potential avenues of clinical deployment.

Citation: Thirunavukarasu AJ, Mahmood S, Malem A, Foster WP, Sanghera R, Hassan R, et al. (2024) Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study. PLOS Digit Health 3(4): e0000341. https://doi.org/10.1371/journal.pdig.0000341

Editor: Man Luo, Mayo Clinic Scottsdale, UNITED STATES

Received: July 31, 2023; Accepted: February 26, 2024; Published: April 17, 2024

Copyright: © 2024 Thirunavukarasu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data are available as supplementary information , excluding copyrighted material from the textbook used for experiments.

Funding: DSWT is supported by the National Medical Research Council, Singapore (NMCR/HSRG/0087/2018; MOH-000655-00; MOH-001014-00), Duke-NUS Medical School (Duke-NUS/RSF/2021/0018; 05/FY2020/EX/15-A58), and Agency for Science, Technology and Research (A20H4g2141; H20C6a0032). DSJT is supported by a Medical Research Council / Fight for Sight Clinical Research Fellowship (MR/T001674/1). These funders were not involved in the conception, execution, or reporting of this review.

Competing interests: AM is a member of the Panel of Examiners of the Royal College of Ophthalmologists and performs unpaid work as an FRCOphth examiner. DSWT holds a patent on a deep learning system to detect retinal disease. DSJT authored the book used in the study and receives royalty from its sales. The other authors have no competing interests to declare.

Introduction

Generative Pre-trained Transformer 3.5 (GPT-3.5) and 4 (GPT-4) are large language models (LLMs) trained on datasets containing hundreds of billions of words from articles, books, and other internet sources [ 1 , 2 ]. ChatGPT is an online chatbot which uses GPT-3.5 or GPT-4 to provide bespoke responses to human users’ queries [ 3 ]. LLMs have revolutionised the field of natural language processing, and ChatGPT has attracted significant attention in medicine for attaining passing level performance in medical school examinations and providing more accurate and empathetic messages than human doctors in response to patient queries on a social media platform [ 3 , 4 , 5 , 6 ]. While GPT-3.5 performance in more specialised examinations has been inadequate, GPT-4 is thought to represent a significant advancement in terms of medical knowledge and reasoning [ 3 , 7 , 8 ]. Other LLMs in wide use include Pathways Language Model 2 (PaLM 2) and Large Language Model Meta AI 2 (LLaMA 2) [ 3 ], [ 9 , p. 2], [ 10 ].

Applications and trials of LLMs in ophthalmological settings have been limited despite ChatGPT’s performance in questions relating to ‘eyes and vision’ being superior to other subjects in an examination for general practitioners [ 7 , 11 ]. ChatGPT has been trialled on the North American Ophthalmology Knowledge Assessment Program (OKAP), and Fellowship of the Royal College of Ophthalmologists (FRCOphth) Part 1 and Part 2 examinations. In both cases, relatively poor results have been reported for GPT-3.5, with significant improvement exhibited by GPT-4 [ 12 , 13 , 14 , 15 , 16 ]. However, previous studies are afflicted by two important issues which may affect their validity and interpretability. First, so-called ‘contamination’, where test material features in the pretraining data used to develop LLMs, may result in inflated performance as models recall previously seen text rather than using clinical reasoning to provide an answer. Second, examination performance in and of itself provides little information regarding the potential of models to contribute to clinical practice as a medical-assistance tool [ 3 ]. Clinical benchmarks are required to understand the meaning and implications of scores in ophthalmological examinations attained by LLMs and are a necessary precursor to clinical trials of LLM-based interventions.

Here, we used FRCOphth Part 2 examination questions to gauge the ophthalmological knowledge base and reasoning capability of LLMs using fully qualified and currently training ophthalmologists as clinical benchmarks. These questions were not freely available online, minimising the risk of contamination. The FRCOphth Part 2 Written Examination tests the clinical knowledge and skills of ophthalmologists in training using multiple choice questions with no negative marking and must be passed to fully qualify as a specialist eye doctor in the United Kingdom.

Question extraction

FRCOphth Part 2 questions were sourced from a textbook for doctors preparing to take the examination [ 17 ]. This textbook is not freely available on the internet, making the possibility of its content being included in LLMs’ training datasets unlikely [ 1 ]. All 360 multiple-choice questions from the textbook’s six chapters were extracted, and a 90-question mock examination from the textbook was segregated for LLM and doctor comparisons. Two researchers matched the subject categories of the practice papers’ questions to those defined in the Royal College of Ophthalmologists’ documentation concerning the FRCOphth Part 2 written examination. Similarly, two researchers categorised each question as first order recall or higher order reasoning, corresponding to ‘remembering’ and ‘applying’ or ‘analysing’ in Bloom’s taxonomy, respectively [ 18 ]. Disagreement between classification decisions was resolved by a third researcher casting a deciding vote. Questions containing non-plain text elements such as images were excluded as these could not be inputted to the LLM applications.

Trialling large language models

Every eligible question was inputted into ChatGPT (GPT-3.5 and GPT-4 versions; OpenAI, San Francisco, California, United States of America) between April 29 and May 10, 2023. The answers provided by GPT-3.5 and GPT-4, together with their whole reply to each question, were recorded for further analysis. If ChatGPT failed to provide a definitive answer, the question was re-trialled up to three times, after which ChatGPT’s answer was recorded as ‘null’ if no answer was provided. Correct answers (‘ground truth’) were defined as the answers provided by the textbook and were recorded for every eligible question to facilitate calculation of performance. Upon their release, Bard (Google LLC, Mountain View, California, USA) and HuggingChat (Hugging Face, Inc., New York City, USA) were used to trial PaLM 2 (Google LLC) and LLaMA (Meta, Menlo Park, California, USA) respectively on the portion of the textbook corresponding to a 90-question examination, adhering to the same procedures between June 20 and July 2, 2023.
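The re-trial rule described above can be sketched in Python. This is only an illustration and not the authors' tooling, since the study entered questions into chatbot interfaces by hand. Here `ask_model` is a hypothetical callable standing in for one submission of a question, `extract_choice` is a hypothetical parser for a definitive answer, and reading 'up to three times' as three re-trials after the first attempt is an assumption.

```python
import re
from typing import Callable, List, Optional, Tuple

MAX_RETRIALS = 3  # assumption: three re-trials after the initial attempt


def extract_choice(reply: str) -> Optional[str]:
    """Hypothetical parser: return the option letter if exactly one is stated."""
    letters = set(re.findall(r"(?i:answer\s*(?:is|:))\s*\(?([A-D])\b", reply))
    return letters.pop() if len(letters) == 1 else None


def record_answer(question: str, ask_model: Callable[[str], str]) -> Tuple[str, List[str]]:
    """Return (recorded answer, every reply); the answer is 'null' if none is definitive."""
    replies: List[str] = []
    for _ in range(1 + MAX_RETRIALS):
        reply = ask_model(question)
        replies.append(reply)  # whole replies are kept for later appraisal
        choice = extract_choice(reply)
        if choice is not None:
            return choice, replies
    return "null", replies


# Toy usage with a scripted stand-in for the chatbot (not real model output).
scripted = iter(["It could be several of these.", "The answer is C."])
answer, replies = record_answer("Example question text", lambda q: next(scripted))
print(answer, len(replies))
```

Keeping every reply, not just the final letter, mirrors the study's recording of whole outputs for the later accuracy and relevance grading.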

Clinical benchmarks

To gauge the performance, accuracy, and relevance of LLM outputs, five expert ophthalmologists who had all passed the FRCOphth Part 2 (E1-E5), three trainees (residents) currently in ophthalmology training programmes (T1-T3), and two unspecialised (i.e. not in ophthalmology training) junior doctors (J1-J2) first answered the 90-question mock examination independently, without reference to textbooks, the internet, or LLMs’ recorded answers. As with the LLMs, doctors’ performance was calculated with reference to the correct answers provided by the textbook. After completing the examination, ophthalmologists graded the whole output of GPT-3.5 and GPT-4 on a Likert scale from 1–5 (very bad, bad, neutral, good, very good) to qualitatively appraise accuracy of information provided and relevance of outputs to the question used as an input prompt. For these appraisals, ophthalmologists were blind to the LLM source (which was presented in a randomised order) and to their previous answers to the same questions, but they could refer to the question text and correct answer and explanation provided by the textbook. Procedures are comprehensively described in the protocol issued to the ophthalmologists ( S1 Protocol ).

Our null hypothesis was that LLMs and doctors would exhibit similar performance, supported by results in a wide range of medical examinations [ 3 , 6 ]. Prospective power analysis was conducted which indicated that 63 questions were required to identify a 10% superior performance of an LLM to human performance at a 5% significance level (type 1 error rate) with 80% power (20% type 2 error rate). This indicated that the 90-question examination in our experiments was more than sufficient to detect ~10% differences in overall performance. The whole 90-question mock examination was used to avoid over- or under-sampling certain question types with respect to actual FRCOphth papers. To verify that the mock examination was representative of the FRCOphth Part 2 examination, expert ophthalmologists were asked to rate the difficulty of questions used here in comparison to official examinations on a 5-point Likert scale (“much easier”, “somewhat easier”, “similar”, “somewhat more difficult”, “much more difficult”).

Statistical analysis

Performance of doctors and LLMs was compared using chi-squared (χ²) tests. Agreement between answers provided by doctors and LLMs was quantified through calculation of Kappa statistics, interpreted in accordance with McHugh’s recommendations [ 19 ]. To further explore the strengths and weaknesses of the answer providers, performance was stratified by question type (first order fact recall or higher order reasoning) and subject using a chi-squared or Fisher’s exact test where appropriate. Likert scale data corresponding to the accuracy and relevance of GPT-3.5 and GPT-4 responses to the same questions were analysed with paired t-tests with the Bonferroni correction applied to mitigate the risk of false positive results due to multiple testing; parametric testing was justified by a sufficient sample size [ 20 ]. A chi-squared test was used to quantify the significance of any difference in overall preference of ophthalmologists choosing between GPT-3.5 and GPT-4 responses. Statistical significance was concluded where p < 0.05. For additional contextualisation, examination statistics corresponding to FRCOphth Part 2 written examinations taken between July 2017 and December 2022 were collected from Royal College of Ophthalmologists examiners’ reports [ 21 ]. These statistics facilitated comparisons between human and LLM performance in the mock examination with the performance of actual candidates in recent examinations. Failure cases where all LLMs provided an incorrect answer were appraised qualitatively to explore any specific weaknesses of the technology.

Statistical analysis was conducted in R (version 4.1.2; R Foundation for Statistical Computing, Vienna, Austria), and figures were produced in Affinity Designer (version 1.10.6; Serif Ltd, West Bridgford, Nottinghamshire, United Kingdom).
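The paired comparison of Likert ratings with a Bonferroni correction can be illustrated with a short Python sketch. It is not the authors' R code. The ratings below are invented for illustration, the p value uses a normal approximation that is only reasonable for large samples (an exact test would use Student's t distribution), and the number of tests entering the correction is left to the caller because the text does not state it.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import List, Sequence


def paired_t(x: Sequence[float], y: Sequence[float]) -> float:
    """Paired t statistic; positive when y exceeds x on average."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def approx_two_sided_p(t: float) -> float:
    """Normal approximation to the two-sided p value (assumption: large n)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented Likert ratings (1-5) for the same questions from one hypothetical rater.
ratings_gpt35 = [2, 3, 3, 4, 2, 3, 1, 4]
ratings_gpt4 = [4, 4, 3, 5, 4, 4, 3, 4]

t_stat = paired_t(ratings_gpt35, ratings_gpt4)
raw_p = approx_two_sided_p(t_stat)
print(t_stat, raw_p, bonferroni([raw_p, 0.04, 0.5]))
```

Multiplying p values by the number of tests is equivalent to dividing the significance threshold by that number; either form keeps the family-wise error rate at or below the nominal level.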

Question sources

Of 360 questions in the textbook, 347 questions (including 87 of the 90 questions from the mock examination chapter) were included [ 17 ]. Exclusions were all due to non-text elements such as images and tables which could not be inputted into LLM chatbot interfaces. The distribution of question types and subjects within the whole set and mock examination set of questions is summarised in Table 1 and S1 Table alongside performance.

Question subject and type distributions presented alongside scores attained by LLMs (GPT-3.5, GPT-4, LLaMA, and PaLM 2), expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3), and unspecialised junior doctors (J1-J2). Median scores do not necessarily sum to the overall median score, as fractional scores are impossible.

https://doi.org/10.1371/journal.pdig.0000341.t001

GPT-4 represents a significant advance on GPT-3.5 in ophthalmological knowledge and reasoning.

Overall performance over 347 questions was significantly higher for GPT-4 (61.7%) than GPT-3.5 (48.41%; χ² = 12.32, p < 0.01), with results detailed in S1 Fig and S1 Table . ChatGPT performance was consistent across question types and subjects ( S1 Table ). For GPT-4, no significant variation was observed with respect to first order and higher order questions (χ² = 0.22, p = 0.64), or subjects defined by the Royal College of Ophthalmologists (Fisher’s exact test over 2000 iterations, p = 0.23). Similar results were observed for GPT-3.5 with respect to first order and higher order questions (χ² = 0.08, p = 0.77), and subjects (Fisher’s exact test over 2000 iterations, p = 0.28). Performance and variation within the 87-question mock examination was very similar to the overall performance over 347 questions, and subsequent experiments were therefore restricted to that representative set of questions.
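As a worked illustration of the overall comparison, the following Python sketch computes a Pearson chi-squared statistic for two proportions. The counts of 214 and 168 correct answers out of 347 are back-calculated from the reported percentages and are therefore an assumption, as is the omission of a continuity correction. With these inputs the statistic comes to approximately 12.32, in line with the value reported above.

```python
from math import erfc, sqrt
from typing import Tuple


def chi2_2x2(correct_a: int, correct_b: int, n_a: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) comparing two proportions."""
    a, b = correct_a, n_a - correct_a  # provider A: correct, incorrect
    c, d = correct_b, n_b - correct_b  # provider B: correct, incorrect
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2.0))
    return chi2, p


# Assumed counts: 61.7% and 48.41% of 347 questions, rounded to whole answers.
chi2, p = chi2_2x2(214, 168, 347, 347)
print(round(chi2, 2), p)
```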

GPT-4 compares well with other LLMs, junior and trainee doctors and ophthalmology experts.

Performance in the mock examination is summarised in Fig 1. GPT-4 (69%) was the top-scoring model, performing to a significantly higher standard than GPT-3.5 (48%; χ² = 7.33, p < 0.01) and LLaMA (32%; χ² = 22.77, p < 0.01), but statistically similarly to PaLM 2 (56%) despite a superior score (χ² = 2.81, p = 0.09). LLaMA exhibited the lowest examination score, significantly weaker than GPT-3.5 (χ² = 4.58, p = 0.03) and PaLM 2 (χ² = 10.01, p < 0.01) as well as GPT-4.

Examination performance in the 87-question mock examination used to trial LLMs (GPT-3.5, GPT-4, LLaMA, and PaLM 2), expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3), and unspecialised junior doctors (J1-J2). Dotted lines depict the mean performance of expert ophthalmologists (66/87; 76%), ophthalmology trainees (60/87; 69%), and unspecialised junior doctors (37/87; 43%). The performance of GPT-4 lay within the range of expert ophthalmologists and ophthalmology trainees.

https://doi.org/10.1371/journal.pdig.0000341.g001

The performance of GPT-4 was statistically similar to the mean score attained by expert ophthalmologists ( Fig 1 ; χ² = 1.18, p = 0.28). Moreover, GPT-4’s performance exceeded the mean mark attained across FRCOphth Part 2 written examination candidates between 2017 and 2022 (66.06%), mean pass mark according to standard setting (61.31%), and the mean official mark required to pass the examination after adjustment (63.75%), as detailed in S2 Table . In individual comparisons with expert ophthalmologists, GPT-4 was equivalent in 3 cases (χ² tests, p > 0.05, S3 Table ), and inferior in 2 cases (χ² tests, p < 0.05; Table 2 ). In comparisons with ophthalmology trainees, GPT-4 was equivalent to all three ophthalmology trainees (χ² tests, p > 0.05; Table 2 ). GPT-4 was significantly superior to both unspecialised junior doctors (χ² tests, p < 0.05; Table 2 ). Doctors were anonymised in analysis, but their ophthalmological experience is summarised in S3 Table . Unsurprisingly, junior doctors (J1-J2) attained lower scores than expert ophthalmologists (E1-E5; t = 7.18, p < 0.01), and ophthalmology trainees (T1-T3; t = 11.18, p < 0.01), illustrated in Fig 1 . Ophthalmology trainees approached expert-level scores with no significant difference between the groups ( t = 1.55, p = 0.18). None of the other LLMs matched any of the expert ophthalmologists, mean mark of real examination candidates, or FRCOphth Part 2 pass mark.

Expert ophthalmologists agreed that the mock examination was a faithful representation of actual FRCOphth Part 2 Written Examination papers with a mean and median score of 3/5 (range 2-4/5).

Results of pair-wise comparisons of examination performance between GPT-4 and the other answer providers. Significantly greater performance for GPT-4 is highlighted green, significantly inferior performance for GPT-4 is highlighted orange. GPT-4 was superior to all other LLMs and unspecialised junior doctors, and equivalent to most expert ophthalmologists and all ophthalmology trainees.

https://doi.org/10.1371/journal.pdig.0000341.t002

LLM strengths and weaknesses are similar to doctors.

Agreement between answers given by LLMs, expert ophthalmologists, and trainee doctors was generally absent (0 ≤ κ < 0.2), minimal (0.2 ≤ κ < 0.4), or weak (0.4 ≤ κ < 0.6), with moderate agreement only recorded for one pairing between the two highest performing ophthalmologists ( Fig 2 ; κ = 0.64) [ 19 ]. Disagreement was primarily the result of general differences in knowledge and reasoning ability, illustrated by strong negative correlation between Kappa statistic (quantifying agreement) and difference in examination performance (Pearson’s r = -0.63, p < 0.01). Answer providers with more similar scores exhibited greater agreement overall irrespective of their category (LLM, expert ophthalmologist, ophthalmology trainee, or junior doctor).
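To make the agreement measure concrete, the sketch below computes a kappa statistic for two answer providers and maps it to an interpretation band. Treating the statistic as Cohen's kappa is an assumption, since the text says only 'Kappa statistics'. The answer strings are invented. The bands below 0.6 and the 'moderate' label follow the text above, while the cut-offs at 0.8 and 0.9 are recalled from McHugh's guidance and should be checked against the cited source.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two sets of answers to the same questions."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both providers give one identical answer throughout")
    return (observed - expected) / (1.0 - expected)


def mchugh_band(kappa: float) -> str:
    """Interpretation bands; the two upper cut-offs are an assumption (see lead-in)."""
    if kappa < 0.2:
        return "none"
    if kappa < 0.4:
        return "minimal"
    if kappa < 0.6:
        return "weak"
    if kappa < 0.8:
        return "moderate"
    if kappa <= 0.9:
        return "strong"
    return "almost perfect"


# Invented multiple-choice answers from two hypothetical answer providers.
provider_1 = list("ABCDABCDAB")
provider_2 = list("ABCDABDCBA")
k = cohen_kappa(provider_1, provider_2)
print(round(k, 3), mchugh_band(k))
```

In this toy case the providers agree on 6 of 10 questions, but 26% agreement would be expected by chance from their answer frequencies, so the chance-corrected value is noticeably lower than the raw agreement.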

Agreement correlates strongly with overall performance and stratification analysis found no particular question type or subject was associated with better performance of LLMs or doctors, indicating that LLM knowledge and reasoning ability is general across ophthalmology rather than restricted to particular subspecialties or question types.

https://doi.org/10.1371/journal.pdig.0000341.g002

Stratification analysis was undertaken to identify any specific strengths and weaknesses of LLMs with respect to expert ophthalmologists and trainee doctors ( Table 1 and S4 Table ). No significant difference between performance in first order fact recall and higher order reasoning questions was observed among any of the LLMs, expert ophthalmologists, ophthalmology trainees, or unspecialised junior doctors ( S4 Table ; χ² tests, p > 0.05). Similarly, only J1 (junior doctor yet to commence ophthalmology training) exhibited statistically significant variation in performance between subjects ( S4 Table ; Fisher’s exact tests over 2000 iterations, p = 0.02); all other doctors and LLMs exhibited no significant variation (Fisher’s exact tests over 2000 iterations, p > 0.05). To explore whether consistency was due to an insufficient sample size, similar analyses were run for GPT-3.5 and GPT-4 performance over the larger set of 347 questions ( S1 Table ; S4 Table ). As with the mock examination, no significant differences in performance across question types ( S4 Table ; χ² tests, p > 0.05) or subjects ( S4 Table ; Fisher’s exact tests over 2000 iterations, p > 0.05) were observed.

LLM examination performance translates to subjective preference indicated by expert ophthalmologists.

Ophthalmologists’ appraisal of GPT-4 and GPT-3.5 outputs indicated a marked preference for the former over the latter, mirroring objective performance in the mock examination and over the whole textbook. GPT-4 exhibited significantly (t-test with Bonferroni correction, p < 0.05) higher accuracy and relevance than GPT-3.5 according to all five ophthalmologists’ grading ( Table 3 ). Differences were visually obvious, with GPT-4 exhibiting much higher rates of attaining the highest scores for accuracy and relevance than GPT-3.5 ( Fig 3 ). This superiority was reflected in ophthalmologists’ qualitative preference indications: GPT-4 responses were preferred to GPT-3.5 responses by every ophthalmologist with statistically significant skew in favour of GPT-4 (χ² test, p < 0.05; Table 3 ).

Accuracy (A) and relevance (B) ratings were provided by five expert ophthalmologists for ChatGPT (powered by GPT-3.5 and GPT-4) responses to 87 FRCOphth Part 2 mock examination questions. In every case, the accuracy and relevance of GPT-4 is significantly superior to GPT-3.5 (t-test with Bonferroni correction applied, p < 0.05). Pooled scores for accuracy (C) and relevance (D) from all five raters are presented in the bottom two plots, with GPT-3.5 (left bars) compared directly with GPT-4 (right bars).

https://doi.org/10.1371/journal.pdig.0000341.g003

t-test results with Bonferroni correction applied showing the superior accuracy and relevance of GPT-4 responses relative to GPT-3.5 responses in the opinion of five fully trained ophthalmologists (positive mean differences favour GPT-4), and χ² test showing that GPT-4 responses were preferred to GPT-3.5 responses by every ophthalmologist in their blinded qualitative appraisals.

https://doi.org/10.1371/journal.pdig.0000341.t003

Failure cases exhibit no association with subject, complexity, or human answers.

The LLM failure cases—where every LLM provided an incorrect answer—are summarised in Table 4 . While errors made by LLMs were occasionally similar to those made by trainee ophthalmologists and junior doctors, this association was not consistent ( Table 4 ). There was no preponderance of ophthalmological subject or first or higher order questions in the failure cases, and questions did not share a common theme, sentence structure, or grammatical construct ( Table 4 ). Examination questions are redacted here to avoid breaching copyright and prevent future LLMs accessing the test data during pretraining but can be provided on request.

Summary of LLM failure cases, where all models provided an incorrect answer to the FRCOphth Part 2 mock examination question. No associations were found with human answers, complexity, subject, theme, sentence structure, or grammatical constructs.

https://doi.org/10.1371/journal.pdig.0000341.t004

Here, we present a clinical benchmark to gauge the ophthalmological performance of LLMs, using a source of questions with very low risk of contamination as the utilised textbook is not freely available online [ 17 ]. Previous studies have suggested that ChatGPT can provide useful responses to ophthalmological queries, but often use online question sources which may have featured in LLMs’ pretraining datasets [ 7 , 12 , 15 , 22 ]. In addition, our employment of multiple LLMs as well as fully qualified and training doctors provides novel insight into the potential and limitations of state-of-the-art LLMs through head-to-head comparisons which provide clinical context and quantitative benchmarks of competence in ophthalmology. Subsequent research may leverage our questions and results to gauge the performance of new LLMs and applications as they emerge.

We make three primary observations. First, performance of GPT-4 compares well to expert ophthalmologists and ophthalmology trainees, and exhibits pass-worthy performance in an FRCOphth Part 2 mock examination. PaLM 2 did not attain pass-worthy performance or match expert ophthalmologists’ scores but was within the spread of trainee doctors’ performance. LLMs are approaching human expert-level knowledge and reasoning in ophthalmology, and significantly exceed the ability of non-specialist clinicians (represented here by unspecialised junior doctors) to answer ophthalmology questions. Second, clinician grading of model outputs suggests that GPT-4 exhibits improved accuracy and relevance when compared with GPT-3.5. Development is producing models which generate better outputs to ophthalmological queries in the opinion of expert human clinicians, which suggests that models are becoming more capable of providing useful assistance in clinical settings. Third, LLM performance was consistent across question subjects and types, distributed similarly to human performance, and exhibited comparable agreement between other LLMs and doctors when corrected for differences in overall performance. Together, this indicates that the ophthalmological knowledge and reasoning capability of LLMs is general rather than limited to certain subspecialties or tasks. LLM-driven natural language processing seems to facilitate similar—although idiosyncratic—clinical knowledge and reasoning to human clinicians, with no obvious blind spots precluding clinical use.

Similarly dramatic improvements in the performance of GPT-4 relative to GPT-3.5 have been reported in the context of the North American Ophthalmology Knowledge Assessment Program (OKAP) [ 13 , 15 ]. State-of-the-art models exhibit far more clinical promise than their predecessors, and expectations and development should be tailored accordingly. Results from the OKAP also suggest that improvement in performance is due to GPT-4 being more well-rounded than GPT-3.5 [ 13 ]. This increases the scope for potential applications of LLMs in ophthalmology, as development is eliminating weaknesses rather than optimising in narrow domains. This study shows that well-rounded LLM performance compares well with expert ophthalmologists, providing clinically relevant evidence that LLMs may be used to provide medical advice and assistance. Further improvement is expected as multimodal foundation models, perhaps based on LLMs such as GPT-4, emerge and facilitate compatibility with image-rich ophthalmological data [ 3 , 23 , 24 ].

Limitations

This study was limited by three factors. First, examination performance is an unvalidated indicator of clinical aptitude. We sought to ameliorate this limitation by employing expert ophthalmologists, ophthalmology trainees, and unspecialised junior doctors answering the same questions as clinical benchmarks; and compared LLM performance to real cohorts of candidates in recent FRCOphth examinations. However, it remains an issue that comparable performance to clinical experts in an examination does not necessarily demonstrate that an LLM can communicate with patients and practitioners or contribute to clinical decision making accurately and safely. Early trials of LLM chatbots have suggested that LLM responses may be equivalent or even superior to human doctors in terms of accuracy and empathy, and experiments using complicated case studies suggest that LLMs operate well even outside typical presentations and more common medical conditions [ 4 , 25 , 26 ]. In ophthalmology, GPT-3.5 and GPT-4 have been shown to be capable of providing precise and suitable triage decisions when queried with eye-related symptoms [ 22 , 27 ]. Further work is now warranted in conventional clinical settings.

Second, while the study was sufficiently powered to detect a less than 10% difference in overall performance, the relatively small number of questions in certain categories used for stratification analysis may mask significant differences in performance. Testing LLMs and clinicians with more questions may help establish where LLMs exhibit greater or lesser ability in ophthalmology. Furthermore, researchers using different ways to categorise questions may be able to identify specific strengths and weaknesses of LLMs and doctors which could help guide design of clinical LLM interventions.

Finally, experimental tasks were ‘zero-shot’ in that LLMs were not provided with any examples of correctly answered questions before they were queried with FRCOphth questions from the textbook. This mode of interrogation entails the maximal level of difficulty for LLMs, so it is conceivable that the ophthalmological knowledge and reasoning encoded within these models is actually even greater than indicated by results here [ 1 ]. Future research may seek to fine-tune LLMs by using more domain-specific text during pretraining and fine-tuning, or by providing examples of successfully completed tasks to further improve performance in that clinical task [ 3 ].

Future directions

Autonomous deployment of LLMs is currently precluded by inaccuracy and fact fabrication. Our study found that despite meeting expert standards, state-of-the-art LLMs such as GPT-4 do not match top-performing ophthalmologists [ 28 ]. Moreover, there remain controversial ethical questions about what roles should and should not be assigned to inanimate AI models, and to what extent human clinicians must remain responsible for their patients [ 3 ]. However, the remarkable performance of GPT-4 in ophthalmology examination questions suggests that LLMs may be able to provide useful input in clinical contexts, either to assist clinicians in their day-to-day work or with their education or preparation for examinations [ 3 , 13 , 14 , 27 ]. Further improvement in performance may be obtained by specific fine-tuning of models with high quality ophthalmological text data, requiring curation and deidentification [ 29 ]. GPT-4 may prove especially useful where access to ophthalmologists is limited: provision of advice, diagnosis, and management suggestions by a model with FRCOphth Part 2-level knowledge and reasoning ability is likely to be superior to non-specialist doctors and allied healthcare professionals working without support, as their exposure to and knowledge of eye care is limited [ 27 , 30 , 31 ].

However, close monitoring is essential to avoid mistakes caused by inaccuracy or fact fabrication [ 32 ]. Clinical applications would also benefit from an uncertainty indicator reducing the risk of erroneous decisions [ 7 ]. As LLM performance often correlates with the frequency of query terms’ representation in the model’s training dataset, a simple indicator of ‘familiarity’ could be engineered by calculating the relative frequency of query term representation in the training data [ 7 , 33 ]. Users could appraise familiarity to temper their confidence in answers provided by the LLM, perhaps reducing error. Moreover, ophthalmological applications require extensive validation, preferably with high quality randomised controlled trials to conclusively demonstrate benefit (or lack thereof) conferred to patients by LLM interventions [ 34 ]. Trials should be pragmatic so as not to inflate effect sizes beyond what may generalise to patients once interventions are implemented at scale [ 34 , 35 ]. In addition to patient outcomes, practitioner-related variables should also be considered: interventions aiming to improve efficiency should be specifically tested to ensure that they reduce rather than increase clinicians’ workload [ 3 ].
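The 'familiarity' indicator suggested above could take many forms; the Python sketch below shows one simple possibility. It is a toy illustration under stated assumptions and not a method described in the cited work: the small `corpus` stands in for training data (which is generally not accessible for proprietary models), tokenisation is a plain lowercase word split, and averaging the per-term relative frequencies is an arbitrary choice of aggregate.

```python
import re
from collections import Counter
from typing import Iterable, List


def tokenise(text: str) -> List[str]:
    """Lowercase alphabetic tokens; a deliberately crude tokeniser."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(corpus: Iterable[str]) -> Counter:
    """Count how often each term occurs across the stand-in corpus."""
    counts: Counter = Counter()
    for document in corpus:
        counts.update(tokenise(document))
    return counts


def familiarity(query: str, counts: Counter) -> float:
    """Mean relative frequency of the query's terms in the corpus (0.0 if nothing to score)."""
    total = sum(counts.values())
    terms = tokenise(query)
    if not terms or total == 0:
        return 0.0
    return sum(counts[t] / total for t in terms) / len(terms)


# Hypothetical two-document corpus standing in for training data.
corpus = ["cataract surgery restores vision", "cataract is common"]
counts = term_counts(corpus)
print(familiarity("cataract", counts), familiarity("keratoconus", counts))
```

A term absent from the corpus scores zero, so queries dominated by rare terms receive a low value that a user could read as a prompt for extra caution. Whether such a score actually tracks answer accuracy would need empirical validation.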

According to comparisons with expert and trainee doctors, state-of-the-art LLMs are approaching expert-level performance in advanced ophthalmology questions. GPT-4 attains pass-worthy performance in FRCOphth Part 2 questions and exceeds the scores of some expert ophthalmologists. As top-performing doctors exhibit superior scores, LLMs do not appear capable of replacing ophthalmologists, but state-of-the-art models could provide useful advice and assistance to non-specialists or patients where access to eye care professionals is limited [ 27 , 28 ]. Further research is required to design LLM-based interventions which may improve eye health outcomes, validate interventions in clinical trials, and engineer governance structures to regulate LLM applications as they begin to be deployed in clinical settings [ 36 ].

Supporting information

S1 Fig. ChatGPT performance in questions taken from the whole textbook.

Mosaic plot depicting the overall performance of ChatGPT versions powered by GPT-3.5 and GPT-4 in 360 FRCOphth Part 2 written examination questions. Performance was significantly higher for GPT-4 than GPT-3.5, and was close to mean human examination candidate performance and pass mark set by standard setting and after adjustment.

https://doi.org/10.1371/journal.pdig.0000341.s001

S1 Table. Question characteristics and performance of GPT-3.5 and GPT-4 over the whole textbook.

Similar observations were noted here to the smaller mock examination used for subsequent experiments. GPT-4 performs to a significantly higher standard than GPT-3.5.

https://doi.org/10.1371/journal.pdig.0000341.s002

S2 Table. Examination statistics corresponding to FRCOphth Part 2 written examinations sat between July 2017-December 2022.

https://doi.org/10.1371/journal.pdig.0000341.s003

S3 Table. Experience of expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3), and unspecialised junior doctors (J1-J2) involved in experiments.

https://doi.org/10.1371/journal.pdig.0000341.s004

S4 Table. Results of statistical tests of variation in performance between question subjects and types, for each trialled LLM, expert ophthalmologist, and trainee doctor.

Statistically significant results are highlighted in green.

https://doi.org/10.1371/journal.pdig.0000341.s005

S1 Protocol. Procedures followed by ophthalmologists to grade the output of GPT-3.5 and GPT-4 in terms of accuracy, relevance, and rater-preference of model outputs.

https://doi.org/10.1371/journal.pdig.0000341.s006

Acknowledgments

The authors extend their thanks to Mr Arunachalam Thirunavukarasu (Betsi Cadwaladr University Health Board) for his advice and assistance with recruitment.

  • 1. Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, et al. Language Models are Few-Shot Learners. In: Advances in Neural Information Processing Systems [Internet]. Curran Associates, Inc.; 2020 [cited 2023 Jan 30]. p. 1877–901. Available from: https://papers.nips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html
  • 2. OpenAI. GPT-4 Technical Report [Internet]. arXiv; 2023 [cited 2023 Apr 11]. Available from: http://arxiv.org/abs/2303.08774
  • 9. Google. PaLM 2 Technical Report [Internet]. 2023 [cited 2023 May 11]. Available from: https://ai.google/static/documents/palm2techreport.pdf
  • 17. Ting DSJ, Steel D. MCQs for FRCOphth Part 2. Oxford University Press; 2020. 253 p.
  • 21. Part 2 Written FRCOphth Exam [Internet]. The Royal College of Ophthalmologists. [cited 2023 Jan 30]. Available from: https://www.rcophth.ac.uk/examinations/rcophth-exams/part-2-written-frcophth-exam/

health research journal articles

Journal of Materials Chemistry B

Vaccine adjuvants: current status, research and development, licensing, and future opportunities.


* Corresponding authors

a Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, CA 90095, USA

b Department of Bioengineering, University of California, Los Angeles, CA 90095, USA

c Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA

Vaccines represent one of the most significant inventions in human history and have revolutionized global health. Generally, a vaccine functions by triggering the innate immune response and stimulating antigen-presenting cells, leading to a defensive adaptive immune response against a specific pathogen's antigen. As a key element, adjuvants are chemical materials often employed as additives to increase a vaccine's efficacy and immunogenicity. For over 90 years, adjuvants have been essential components in many human vaccines, improving their efficacy by enhancing, modulating, and prolonging the immune response. Here, we provide a timely and comprehensive review of the historical development and the current status of adjuvants, covering their classification, mechanisms of action, and roles in different vaccines. Additionally, we perform a systematic analysis of the current licensing processes and highlight notable examples from clinical trials involving vaccine adjuvants. Looking ahead, we anticipate future trends in the field, including the development of new adjuvant formulations, the creation of innovative adjuvants, and their integration into the broader scope of systems vaccinology and vaccine delivery. The article posits that a deeper understanding of biochemistry, materials science, and vaccine immunology is crucial for advancing vaccine technology. Such advancements are expected to lead to the future development of more effective vaccines, capable of combating emerging infectious diseases and enhancing public health.


  • This article is part of the themed collections: Journal of Materials Chemistry B Recent Review Articles and Journal of Materials Chemistry B Emerging Investigators 2024


Y. Cui, M. Ho, Y. Hu and Y. Shi, J. Mater. Chem. B, 2024, Advance Article, DOI: 10.1039/D3TB02861E


  • Front Psychiatry

Mental Health Prevention and Promotion—A Narrative Review


Extant literature has established the effectiveness of various mental health promotion and prevention strategies, including novel interventions. However, comprehensive literature encompassing all these aspects and challenges and opportunities in implementing such interventions in different settings is still lacking. Therefore, in the current review, we aimed to synthesize existing literature on various mental health promotion and prevention interventions and their effectiveness. Additionally, we intend to highlight various novel approaches to mental health care and their implications across different resource settings and provide future directions. The review highlights the (1) concept of preventive psychiatry, including various mental health promotions and prevention approaches, (2) current level of evidence of various mental health preventive interventions, including the novel interventions, and (3) challenges and opportunities in implementing concepts of preventive psychiatry and related interventions across the settings. Although preventive psychiatry is a well-known concept, it is a poorly utilized public health strategy to address the population's mental health needs. It has wide-ranging implications for the wellbeing of society and individuals, including those suffering from chronic medical problems. The researchers and policymakers are increasingly realizing the potential of preventive psychiatry; however, its implementation is poor in low-resource settings. Utilizing novel interventions, such as mobile-and-internet-based interventions and blended and stepped-care models of care can address the vast mental health need of the population. Additionally, it provides mental health services in a less-stigmatizing and easily accessible, and flexible manner. 
Furthermore, employing decision support systems/algorithms for patient management and personalized care and utilizing the digital platform for the non-specialists' training in mental health care are valuable additions to the existing mental health support system. However, more research concerning this is required worldwide, especially in the low-and-middle-income countries.

Introduction

Mental disorder has been recognized as a significant public health concern and one of the leading causes of disability worldwide, particularly with the loss of productive years of the sufferer's life ( 1 ). The Global Burden of Disease report (2019) highlights an increase, from around 80 million to over 125 million, in the worldwide number of Disability-Adjusted Life Years (DALYs) attributable to mental disorders. With this surge, mental disorders have moved into the top 10 significant causes of DALYs worldwide over the last three decades ( 2 ). Furthermore, this data does not include substance use disorders (SUDs), which, if included, would increase the estimated burden manifold. Moreover, if the caregiver-related burden is accounted for, this figure would be much higher. Individual, social, cultural, political, and economic issues are critical determinants of mental wellbeing. An increasing burden of mental diseases can, in turn, contribute to deterioration in physical health and poorer social and economic growth of a country ( 3 ). Mental health expenditure is roughly 3–4% of Gross Domestic Product (GDP) in developed regions of the world; however, the figure is abysmally low in low-and-middle-income countries (LMICs) ( 4 ). Untreated mental health and behavioral problems in childhood and adolescence, in particular, have profound long-term adverse social and economic consequences, including increased contact with the criminal justice system, lower employment rates and lower wages among those employed, and interpersonal difficulties ( 5 – 8 ).

Need for Mental Health (MH) Prevention

Longitudinal studies suggest that individuals with a lower level of positive wellbeing are more likely to acquire mental illness ( 9 ). Conversely, factors that promote positive wellbeing and resilience among individuals are critical in preventing mental illnesses and better outcomes among those with mental illness ( 10 , 11 ). For example, in patients with depressive disorders, higher premorbid resilience is associated with earlier responses ( 12 ). On the contrary, patients with bipolar affective- and recurrent depressive disorders who have a lower premorbid quality of life are at higher risk of relapses ( 13 ).

Recently there has been an increased emphasis on the need to promote wellbeing and positive mental health in preventing the development of mental disorders, for poor mental health has significant social and economic implications ( 14 – 16 ). Research also suggests that mental health promotion and preventative measures are cost-effective in preventing or reducing mental illness-related morbidity, both at the society and individual level ( 17 ).

Although the World Health Organization (WHO) defines health as “a state of complete physical, mental, and social wellbeing and not merely an absence of disease or infirmity,” there has been little effort at the global level, or even stagnation, in implementing effective mental health services ( 18 ). Moreover, in research, the promotive and preventive aspects of mental health have received less attention than those of physical health. Instead, greater emphasis has been given to the illness aspect, such as research on psychopathology, mental disorders, and treatment ( 19 , 20 ). Often, physicians and psychiatrists are unfamiliar with various concepts, approaches, and interventions directed toward mental health promotion and prevention ( 11 , 21 ).

Prevention and promotion of mental health are essential, notably in reducing the growing magnitude of mental illnesses. However, while health promotion and disease prevention are universally recognized concepts in public health, their strategic application to mental health promotion and prevention is often elusive. Furthermore, given the evidence of substantial links between psychological and physical health, the non-incorporation of preventive mental health services is deplorable and has serious ramifications. Therefore, policymakers and health practitioners must be sensitized about linkages between mental and physical health to effectively implement various mental health promotive and preventive interventions, including in individuals with chronic physical illnesses ( 18 ).

The magnitude of the mental health problems can be gauged by the fact that about 10–20% of young individuals worldwide experience depression ( 22 ). As described above, poor mental health during childhood is associated with adverse health (e.g., substance use and abuse), social (e.g., delinquency), academic (e.g., school failure), and economic (e.g., high risk of poverty) outcomes in adulthood ( 23 ). Childhood and adolescence are critical periods for setting the ground for physical growth and mental wellbeing ( 22 ). Therefore, interventions promoting positive psychology empower youth with the life skills and opportunities to reach their full potential and cope with life's challenges. Comprehensive mental health interventions involving families, schools, and communities have resulted in positive physical and psychological health outcomes. However, the data is limited to high-income countries (HICs) ( 24 – 28 ).

In contrast, in low-and-middle-income countries (LMICs), which bear the greatest brunt of mental health problems coupled with a high treatment gap, such interventions have remained neglected in public health ( 29 , 30 ). This issue warrants prompt attention, particularly when global development strategies such as Millennium Development Goals (MDGs) realize the importance of mental health ( 31 ). Furthermore, studies have consistently reported that people with socioeconomic disadvantages are at a higher risk of mental illness and associated adverse outcomes; this is partly attributed to the inequitable distribution of mental health services ( 32 – 35 ).

Scope of Mental Health Promotion and Prevention in the Current Situation

Literature provides considerable evidence on the effectiveness of various preventive mental health interventions targeting risk and protective factors for various mental illnesses ( 18 , 36 – 42 ). There is also modest evidence of the effectiveness of programs focusing on early identification and intervention for severe mental diseases (e.g., schizophrenia and psychotic illness, and bipolar affective disorders) as well as common mental disorders (e.g., anxiety, depression, stress-related disorders) ( 43 – 46 ). These preventive measures have also been evaluated for their cost-effectiveness with promising findings. In addition, novel interventions such as digital-based interventions and novel therapies (e.g., adventure therapy, community pharmacy program, and Home-based Nurse family partnership program) to address the mental health problems have yielded positive results. Likewise, data is emerging from LMICs, showing at least moderate evidence of mental health promotion intervention effectiveness. However, most of the available literature and intervention is restricted mainly to the HICs ( 47 ). Therefore, their replicability in LMICs needs to be established and, also, there is a need to develop locally suited interventions.

Fortunately, there has been considerable progress in preventive psychiatry over recent decades, including research on it. In the light of these advances, there is an accelerated interest among researchers, clinicians, governments, and policymakers to harness the potentialities of the preventive strategies to improve the availability, accessibility, and utility of such services for the community.

The Concept of Preventive Psychiatry

Origins of preventive psychiatry.

The history of preventive psychiatry can be traced back to the early 1900s with the foundation of the national mental health association (erstwhile mental health association), the committee on mental hygiene in New York, and the mental health hygiene movement ( 48 ). The latter emphasized the need for physicians to develop empathy and recognize and treat mental illness early, leading to greater awareness about mental health prevention ( 49 ). Despite that, preventive psychiatry remained an alien concept for many, including mental health professionals, particularly when the etiology of most psychiatric disorders was either unknown or poorly understood. However, recent advances in our understanding of the phenomena underlying psychiatric disorders and the availability of neuroimaging and electrophysiological techniques concerning mental illness and its prognosis have again brought preventive psychiatry to the forefront ( 1 ).

Levels of Prevention

The literal meaning of “prevention” is “the act of preventing something from happening” ( 50 ); the entity being prevented can range from the risk factors for the development of the illness to the onset of illness, its recurrence, or the associated disability. The concept of prevention emerged primarily from infectious diseases; measures like mass vaccination and sanitation promotion have helped prevent the development of the diseases and subsequent fatalities. The original preventive model proposed by the Commission on Chronic Illness in 1957 included primary, secondary, and tertiary preventions ( 48 ).

The Concept of Primary, Secondary, and Tertiary Prevention

The stages of prevention target distinct aspects of the illness's natural course; primary prevention acts at the stage of pre-pathogenesis, that is, when the disease is yet to occur, whereas secondary and tertiary prevention target the phase after the onset of the disease ( 51 ). Primary prevention includes health promotion and specific protection, while secondary and tertiary prevention include early diagnosis and treatment, and measures to decrease disability and support rehabilitation, respectively ( 51 ) ( Figure 1 ).

Figure 1. The concept of primary and secondary prevention [adopted from Prevention: Primary, Secondary, Tertiary by Bauman et al. ( 51 )].

Primary prevention targets those individuals vulnerable to developing mental disorders and their consequences because of their bio-psycho-social attributes. Therefore, it can be viewed as an intervention to prevent an illness, thereby preventing mental health morbidity and potential social and economic adversities. The preventive strategies under it usually target the general population or individuals at risk. Secondary and tertiary prevention target those who have already developed the illness, aiming to reduce impairment and morbidity as soon as possible. However, because these measures are applied to a person who has already developed an illness and is therefore already facing the related suffering, they may not always succeed in curing or managing the illness. Thus, secondary and tertiary prevention measures target the already exposed or diagnosed individuals.

The Concept of Universal, Selective, and Indicated Prevention

The classification of health prevention based on primary/secondary/tertiary prevention is limited in being highly centered on the etiology of the illness; it does not consider the interaction between underlying etiology and risk factors of an illness. Gordon proposed another model of prevention that focuses on the degree of risk an individual is at, which in turn determines the intensity of the intervention. He classified prevention into universal, selective, and indicated prevention. A universal preventive strategy targets the whole population irrespective of individual risk (e.g., maintaining healthy, psychoactive substance-free lifestyles); selective prevention is targeted at those at a higher risk than the general population (socio-economically disadvantaged populations, e.g., migrants, victims of disasters, the destitute, etc.). Indicated prevention aims at those who have established risk factors and are at a high risk of getting the disease (e.g., family history of psychiatric illness, history of substance use, certain personality types, etc.). Nevertheless, these two classifications (the primary, secondary, and tertiary prevention; and universal, selective, and indicated prevention) were intended for and are more appropriate for physical illnesses with a clear etiology or risk factors ( 48 ).

In 1994, the Institute of Medicine (IOM) Committee on Prevention of Mental Disorders proposed a new paradigm that classified primary preventive measures for mental illnesses into three categories. These are indicated, selective, and universal preventive interventions (see Figure 2 ). According to this paradigm, primary prevention was limited to interventions done before the onset of the mental illness ( 48 ). In contrast, secondary and tertiary prevention encompass treatment and maintenance measures ( Figure 2 ).

Figure 2. The interventions for mental illness as classified by the Institute of Medicine (IOM) Committee on Prevention of Mental Disorders [adopted from Mrazek and Haggerty ( 48 )].

Although the boundaries between prevention and treatment often overlap rather than being mutually exclusive, the new paradigm can be used to avoid confusion stemming from the common belief that prevention can take place at all stages of mental health management ( 48 ). The onset of mental illnesses can be prevented by risk reduction interventions, which can involve reducing risk factors in an individual and strengthening protective elements in them. Such interventions target modifiable factors, both risk and protective, associated with the development of the illness through various general and specific interventions. These interventions can work across the lifespan. The benefits are not restricted to a reduction or delay in the onset of illness but also extend to the severity or duration of illness ( 48 ).

On the spectrum of mental health interventions, universal preventive interventions are directed at the whole population without identifiable risk factors. The interventions are beneficial for the general population or sub-groups. Prenatal care and childhood vaccination are examples of preventative measures that have benefited both physical and mental health. Selective preventive mental health interventions are directed at people or a subgroup with a significantly higher risk of developing mental disorders than the general population. Risk groups are those who, because of their vulnerabilities, are at higher risk of developing mental illnesses, e.g., infants with low birth weight (LBW), vulnerable children with learning difficulties or victims of maltreatment, older adults, etc. Specific interventions include home visits and new-born day care facilities for LBW infants, preschool programs for all children living in resource-deprived areas, support groups for vulnerable older adults, etc. Indicated preventive interventions focus on high-risk individuals who have developed minor but observable signs or symptoms of mental disorder, or who have genetic risk factors for mental illness, but who have not fulfilled the criteria of a diagnosable mental disorder. For instance, the parent-child interaction training program is an indicated prevention strategy that offers support to children whose parents have recognized them as having behavioral difficulties.
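The three-way split described above can be read as a simple decision rule. The Python sketch below is only an illustration of that reading: the function name, the enum, and the three boolean inputs are hypothetical simplifications introduced here, not part of the IOM paradigm or of any clinical tool, and real risk assessment is far less clear-cut.

```python
from enum import Enum


class InterventionLevel(Enum):
    UNIVERSAL = "universal"
    SELECTIVE = "selective"
    INDICATED = "indicated"
    TREATMENT = "treatment or maintenance"  # outside primary prevention


def classify_target(meets_diagnostic_criteria: bool,
                    subthreshold_signs_or_genetic_risk: bool,
                    in_higher_risk_group: bool) -> InterventionLevel:
    """Map a (hypothetical) risk profile to an intervention level.

    The three flags are illustrative assumptions, not validated criteria.
    """
    if meets_diagnostic_criteria:
        # A diagnosable disorder falls under treatment, not primary prevention.
        return InterventionLevel.TREATMENT
    if subthreshold_signs_or_genetic_risk:
        # Minor but observable signs, or genetic risk, without a diagnosis.
        return InterventionLevel.INDICATED
    if in_higher_risk_group:
        # Member of a subgroup at higher risk than the general population.
        return InterventionLevel.SELECTIVE
    # No identifiable risk factors: whole-population measures apply.
    return InterventionLevel.UNIVERSAL


# Example: a low-birth-weight infant with no symptoms (higher-risk subgroup).
print(classify_target(False, False, True).value)  # prints "selective"
```

The order of the checks mirrors the text: a diagnosis takes precedence, then sub-threshold signs, then group-level risk, with universal measures as the default.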

The overall objective of mental health promotion and prevention is to reduce the incidence of new cases, additionally delaying the emergence of mental illness. However, promotion and prevention in mental health complement each other rather than being mutually exclusive. Moreover, combining these two within the overall public health framework reduces stigma, increases cost-effectiveness, and provides multiple positive outcomes ( 18 ).

How Prevention in Psychiatry Differs From Other Medical Disorders

Compared to physical illnesses, diagnosing a mental illness is more challenging, particularly when there is still a lack of objective assessment methods, including diagnostic tools and biomarkers. Therefore, the diagnosis of mental disorders is heavily influenced by the assessors' theoretical perspectives and subjectivity. Moreover, a mental illness can still be considered when an individual does not fulfill the diagnostic criteria laid down in classificatory systems but shows detectable dysfunction. Furthermore, the precise timing of disorder initiation or transition from subclinical to clinical condition is often uncertain and inconclusive ( 48 ). Therefore, prevention strategies are well delineated and clear in the case of physical disorders, while they are still less established in mental health.

Terms, Definitions, and Concepts

The terms mental health, health promotion, and prevention have been differently defined and interpreted. It is further complicated by overlapping boundaries of the concept of promotion and prevention. Some commonly used terms in mental health prevention have been tabulated ( Table 1 ) ( 18 ).

Table 1. Commonly used terms in mental health prevention.

Mental Health Promotion and Protection

The term “mental health promotion” also has definitional challenges as it signifies different things to different individuals. For some, it means the treatment of mental illness; for others, it means preventing the occurrence of mental illness; while for others, it means increasing the ability to manage frustration, stress, and difficulties by strengthening one's resilience and coping abilities ( 54 ). It involves promoting the value of mental health and improving the coping capacities of individuals rather than amelioration of symptoms and deficits.

Mental health promotion is a broad concept that encompasses the entire population, and it advocates for a strengths-based approach and tries to address the broader determinants of mental health. The objective is to eliminate health inequalities via empowerment, collaboration, and participation. There is mounting evidence that mental health promotion interventions improve mental health, lower the risk of developing mental disorders ( 48 , 55 , 56 ) and have socioeconomic benefits ( 24 ). In addition, it strives to increase an individual's capacity for psychosocial wellbeing and adversity adaptation ( 11 ).

However, the concepts of mental health promotion, protection, and prevention are intrinsically linked and intertwined. Furthermore, most mental diseases result from a complex interaction of risk and protective factors rather than a definite etiology. Facilitating the development and timely attainment of developmental milestones across an individual's lifespan is critical for positive mental health ( 57 ). Although mental health promotion and prevention are essential aspects of public health with wide-ranging benefits, their feasibility and implementation are marred by financial and resource constraints. The lack of cost-effectiveness studies, particularly from the LMICs, further restricts their full realization ( 47 , 58 , 59 ).

Despite the significance of the topic and a considerable amount of literature on it, a comprehensive review is still lacking that would cover the concept of mental health promotion and prevention and simultaneously discuss various interventions, including the novel techniques delivered across the lifespan, in different settings, and at different levels of prevention. Therefore, this review aims to analyze the existing literature on various mental health promotion and prevention-based interventions and their effectiveness. Furthermore, it attempts to highlight the implications of such interventions in low-resource settings and provide future directions. Such a review would add to the existing literature on mental health promotion and prevention research and provide key insights into the effectiveness of such interventions and their feasibility and replicability in various settings.

Methodology

For the current review, key terms like “mental health promotion,” OR “protection,” OR “prevention,” OR “mitigation” were used to search relevant literature on Google Scholar, PubMed, and Cochrane library databases, considering a time period from 2000 to 2019 ( Supplementary Material 1 ). However, we restricted our search to 2019 for non-original articles (reviews, commentaries, viewpoints, etc.), assuming that it would also cover most of the original articles published until then. Additionally, we included original papers from the last 5 years (2016–2021) so that they would not be missed if not covered by any published review. The time restriction of 2019 for non-original articles was applied to exclude papers published during the Coronavirus disease (COVID-19) pandemic, as the latter was a significant event that brought about substantial change and hence warranted a different approach to cater to the MH needs of the population, including MH prevention measures. Moreover, the COVID-19 pandemic resulted in a flood of novel interventions for mental health prevention and promotion, specifically targeting the pandemic and its consequences, which, if included, could have biased the findings of the current review on various MH promotion and prevention interventions.

A time frame of about 20 years was taken to see the effectiveness of various MH promotion and protection interventions as it would take substantial time to be appreciated in real-world situations. Therefore, the current paper has put greater reliance on the review articles published during the last two decades, assuming that it would cover most of the original articles published until then.

The above search yielded 320 records: 225 articles from Google Scholar, 59 articles from PubMed, and 36 articles from the Cochrane database (flow diagram of records screening). All the records were title/abstract screened by all the authors to establish their suitability for the current review; a bibliographic and gray literature search was also performed. Any doubts or differences in opinion were resolved by mutual discussion. Only those articles directly related to mental health promotion, primary prevention, and related interventions were included in the current review. In contrast, the following records were excluded: those that discussed specific conditions/disorders (post-traumatic stress disorders, suicide, depression, etc.) or a specific intervention (e.g., a specific suicide prevention intervention), especially for a particular population (e.g., disaster victims), and therefore lacked generalizability in terms of mental health promotion or prevention; those not available in the English language; and those whose full text was unavailable. The findings of the review were described narratively.
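The record tally and the exclusion rules above can be expressed as a short Python sketch. This is illustrative only: the screening in the review was carried out manually by the authors, and the `Record` fields and the `is_eligible` helper are hypothetical names introduced here to make the rules explicit.

```python
from dataclasses import dataclass

# Record counts per database, as reported in the text.
RECORDS_BY_SOURCE = {"Google Scholar": 225, "PubMed": 59, "Cochrane": 36}


@dataclass
class Record:
    """Hypothetical screening fields; the review does not define a schema."""
    title: str
    condition_or_population_specific: bool  # e.g., suicide prevention for disaster victims
    in_english: bool
    full_text_available: bool


def is_eligible(record: Record) -> bool:
    """Apply the three exclusion rules described in the methodology."""
    return (
        not record.condition_or_population_specific
        and record.in_english
        and record.full_text_available
    )


# The per-database counts add up to the reported total of 320 records.
total_records = sum(RECORDS_BY_SOURCE.values())
print(total_records)  # prints 320
```

A record is kept only if none of the exclusion rules applies, which matches the "included only if directly related" wording of the text.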

Interventions for Mental Health Promotion and Prevention and Their Evidence

Various interventions have been designed for mental health promotion and prevention. They are delivered and evaluated across the regions (high-income countries to low-resource settings, including disaster-affiliated regions of the world), settings (community-based, school-based, family-based, or individualized); utilized different psychological constructs and therapies (cognitive behavioral therapy, behavioral interventions, coping skills training, interpersonal therapies, general health education, etc.); and delivered by different professionals/facilitators (school-teachers, mental health professionals or paraprofessionals, peers, etc.). The details of the studies, interventions used, and outcomes have been provided in Supplementary Table 1 . Below we provide the synthesized findings of the available research.

The majority of the available studies were quantitative and experimental. Randomized controlled trials comprised a sizeable proportion of the studies; others were quasi-experimental studies and, a few, qualitative studies. The studies primarily focussed on school students or the younger population, while others were explicitly concerned with the mental health of young females ( 60 ). Newer data is emerging on mental health promotion and prevention interventions for elderlies (e.g., dementia) ( 61 ). The majority of the research had taken a broad approach to mental health promotion ( 62 ). However, some studies have focused on universal prevention ( 63 , 64 ) or selective prevention ( 65 – 68 ). For instance, the Resourceful Adolescent Program (RAPA) was implemented across the schools and has utilized cognitive-behavioral and interpersonal therapies and reported a significant improvement in depressive symptoms. Some of the interventions were directed at enhancing an individual's characteristics like resilience, behavior regulation, and coping skills (ZIPPY's Friends) ( 69 ), while others have focused on the promotion of social and emotional competencies among the school children and attempted to reduce the gap in such competencies across the socio-economic classes (“Up” program) ( 70 ) or utilized expressive abilities of the war-affected children (Writing for Recover (WfR) intervention) ( 71 ) to bring about an improvement in their psychological problems (a type of selective prevention) ( 62 ) or harnessing the potential of Art, in the community-based intervention, to improve self-efficacy, thus preventing mental disorders (MAD about Art program) ( 72 ). Yet, others have focused on strengthening family ( 60 , 73 ), community relationships ( 62 ), and targeting modifiable risk factors across the life course to prevent dementia among the elderlies and also to support the carers of such patients ( 61 ).

Furthermore, most of the studies were conducted and evaluated in the developed parts of the world, while emerging economies, as anticipated, lagged far behind in such interventions and related research. The interventions that are specifically adapted for local resources, such as school-based programs involving paraprofessionals and teachers in the delivery of mental health interventions, were shown to be more effective ( 62 , 74 ). Likewise, tailored approaches for low-resource settings such as LMICs may also be more effective ( 63 ). Some of these studies also highlight the beneficial role of a multi-dimensional approach ( 68 , 75 ) and interventions targeting early lifespan ( 76 , 77 ).

Newer Insights: How to Harness Digital Technology and Novel Methods of MH Promotion and Protection

With the advent of digital technology and simultaneous traction on mental health promotion and prevention interventions, preventive psychiatrists and public health experts have developed novel techniques to deliver mental health promotive and preventive interventions. These encompass different settings (e.g., school, home, workplace, the community at large, etc.) and levels of prevention (universal, selective, indicated) ( 78 – 80 ).

The advanced technologies and novel interventions have broadened the scope of MH promotion and prevention, such as addressing the mental health issues of individuals with chronic medical illness ( 81 , 82 ), individuals with severe mental disorders ( 83 ), children and adolescents with mental health problems, and the geriatric population ( 78 ). Further, they have increased the accessibility and acceptability of such interventions by delivering them in a non-stigmatizing and tailored manner. Moreover, they can be integrated into the routine life of individuals.

For instance, internet- and mobile-based interventions (IMIs) have been utilized to monitor health behavior as a form of MH prevention and as a stand-alone self-help intervention. Moreover, the blended approach, such as face-to-face interventions coupled with remote therapies, has expanded the scope of MH promotive and preventive interventions. Simultaneously, it has given way to the stepped-care (step-down or step-up care) approach to treatment and its continuation ( 79 ). Also, because these interventions are more interactive and engaging, they are particularly useful for youth.

The blended model of care has utilized IMIs to a varying degree and at various stages of the psychological interventions. This includes IMIs as a supplementary approach to the face-to-face-interventions (FTFI), FTFI augmented by behavior intervention technologies (BITs), BITs augmented by remote human support, and fully automated BITs ( 84 ).

The stepped-care model of mental health promotion and prevention strategies includes a stepped-up approach, wherein BITs are utilized to manage prodromal symptoms, thereby preventing the onset of a full-blown episode. In the stepped-down approach, more intensive treatments (in-patient or out-patient based interventions) are followed and supplemented with BITs to prevent relapse of the mental illness, such as for previously admitted patients with depression or substance use disorders ( 85 , 86 ).

Similarly, the latest research has developed newer interventions for strengthening the psychological resilience of the public or at-risk individuals, which can be delivered at the level of the home, e.g., the Nurse Family Partnership program (to provide support to young and vulnerable mothers and prevent childhood maltreatment) ( 87 ) and the Family Healing Together program, aimed at improving the mental health of family members living with persons with mental illness (PwMI) ( 88 ). In addition, various novel interventions for MH promotion and prevention are highlighted in Table 2 .

Table 2. Depiction of various novel mental health promotion and prevention strategies.

a/w, associated with; A-V, audio-visual; b/w, between; CBT, Cognitive Behavioral Therapy; CES-Dep., Center for Epidemiologic Studies-Depression scale; CG, control group; FU, follow-up; GAD, generalized anxiety disorders-7; IA, intervention arm; HCWs, Health Care Workers; LMIC, low and middle-income countries; MDD, major depressive disorders; mgt, management; MH, mental health; MHP, mental health professional; MINI, mini neuropsychiatric interview; NNT, number needed to treat; PHQ-9, patient health questionnaire; TAU, treatment as usual .

Furthermore, school/educational institute-based interventions, such as school mental health magazines to increase mental health literacy among teachers and students, have been developed ( 80 ). In addition, workplace mental health promotional activities have targeted administrators, e.g., guided “e-learning” for managers, which has been shown to decrease the mental health problems of employees ( 102 ).

Likewise, digital technologies have also been harnessed in strengthening community mental health promotive/preventive services, such as the mental health first aid (MHFA) Books on Prescription initiative in New Zealand, which provided information and self-help tools through library networks and trained book “prescribers,” particularly in rural and remote areas ( 103 ).

Apart from common mental disorders such as depression, anxiety, and behavioral disorders in childhood and adolescence, novel interventions have been utilized to prevent or manage medical and psychological issues, including premature mortality, among individuals with severe mental illnesses (SMIs). Examples include the Let's Talk About Tobacco web-based intervention and motivational interviewing to prevent tobacco use, weight reduction measures, and the promotion of healthy lifestyles (exercise, sleep, and balanced diets) through individualized devices, thereby reducing the risk of cardiovascular disorders ( 83 ). Similarly, efforts have been made to improve such individuals' coping skills and employment chances through the WorkingWell mobile application in the US ( 104 ).

Apart from digital interventions, newer non-digital interventions have also been utilized to promote mental health and prevent mental disorders among individuals with chronic medical conditions. One such approach is adventure therapy, which aims to support and strengthen the multi-dimensional aspects of self. It includes the physical, emotional or cognitive, social, spiritual, psychological, or developmental rehabilitation of children and adolescents with cancer. Moreover, it is delivered in the natural environment outside the hospital premises, shifting the focus from the illness model to the wellness model ( 81 ). Another strength of this intervention is that it can be delivered by nurses and can facilitate peer support and teamwork.

Another novel approach to MH prevention involves gut-microbiota and dietary interventions. Such interventions have been explored with promising results for early developmental disorders (attention-deficit/hyperactivity disorder, autism spectrum disorders, etc.) ( 105 ). They work under the framework of the shared vulnerability model for common mental disorders and other non-communicable diseases and harness the neuroplasticity potential of the developing brain. Dietary and lifestyle modifications have been recommended for major depressive disorders by the Clinical Practice Guidelines in Australia ( 106 ). As most childhood mental and physical disorders are determined in utero and in the early postnatal period, targeting maternal nutrition is another vital strategy. The utility has been expanded from maternal nutrition to women of childbearing age. The various novel mental health promotion and prevention strategies are shown in Table 2 .

Newer research is emerging that has utilized digital platforms for training non-specialists in diagnosing and managing individuals with mental health problems, such as the Atmiyata Intervention and the SMART MH Project in India and the Allillanchu Project in Peru, to name a few ( 99 ). Such frameworks facilitate task-sharing by non-specialists and help in reducing the treatment gap in these countries. Likewise, digital algorithms or decision support systems have been developed to make mental health services more transparent, personalized, outcome-driven, collaborative, and integrative; one such example is DocuMental, a clinical decision support system (DSS). Similarly, frameworks like i-PROACH, a cloud-based intelligent platform for research outcome assessment and care in mental health, have expanded the scope of the mental health support system, including promoting research in mental health ( 100 ). In addition, the COVID-19 pandemic has resulted in wider dissemination of applications based on evidence-based psycho-social interventions, such as the National Health Service's (NHS's) Mind app and Headspace (teaching meditation via a website or a phone application), which have utilized mindfulness-based practices to address the psychological problems of the population ( 101 ).

Challenges in Implementing Novel MH Promotion and Prevention Strategies

Although novel interventions, particularly internet and mobile-based interventions (IMIs), are effective models for MH promotion and prevention, their cost-effectiveness requires further exploration. Moreover, their feasibility and acceptability in LMICs could be challenging. Some of these could be attributed to poor digital literacy, digital/network-related limitations, privacy issues, and society's preparedness to implement these interventions.

These interventions need to be customized and adapted according to local needs and context, for which implementation and evaluative research are warranted. In addition, the infusion of more human and financial resources for such activities is required. Some reports highlight that many of these interventions do not align with the preferences and usage patterns of the service utilizers. For instance, one exploratory study of mental health app-based interventions targeting youth found that, despite the burgeoning number of applications, they are not aligned with youth's media preferences and learning patterns. They are less interactive, have fewer audio-visual displays, are not youth-specific, are less dynamic, and are single-touch apps ( 107 ).

Furthermore, such novel interventions usually come with high costs. In low-resource settings where service utilizers have limited finances, their willingness to use such services may be doubtful. Moreover, insurance companies, including those in high-income countries (HICs), may not be willing to fund such novel interventions, which restricts the accessibility and availability of interventions.

Although research points to the feasibility and effectiveness of incorporating such novel interventions into routine services such as school, community, or primary care settings, in low-resource settings the resource persons, such as teachers, community health workers, and primary care physicians, are already overburdened. Therefore, their willingness to take up additional tasks is uncertain. Moreover, attitudinal barriers to moving from the traditional service delivery model to novel methods may also impede uptake.

Considering the low MH budget and the low priority given to MH prevention and promotion activities in most low-resource settings, the uptake of such interventions in the public health framework may be lower despite the proven high cost-effectiveness of such activities. Instead, policymakers may be more inclined to invest in the therapeutic aspects of MH.

Such interventions open avenues for personalized and precision medicine/health care vs. the traditional model of MH promotion and preventive interventions ( 108 , 109 ). For instance, multivariate prediction algorithms that use machine learning methods and incorporate biological research, such as genetics, may help in devising tailored interventions, particularly for selective and indicated prevention, for depression, suicide, relapse prevention, etc. ( 79 ). Therefore, more research in this area is warranted.

To be more clinically relevant, greater biological research in MH prevention is required to identify those at higher risk of developing given mental disorders due to existing risk factors or prominent stress ( 110 ). For instance, researchers have utilized a transcriptional approach to identify a biological fingerprint for susceptibility (denoting an abnormal early stress response) to developing post-traumatic stress disorder among psychological trauma survivors by analyzing peripheral blood mononuclear cell gene expression profiles ( 111 ). Identifying such biological markers would help target at-risk individuals through tailored and intensive interventions as a form of selective prevention.

Similarly, such novel interventions can help in targeting the underlying risk factors (such as substance use, poor stress management, family history, and personality traits) and protective factors (e.g., positive coping techniques, social support, and resilience) that influence the given MH outcome ( 79 ). Therefore, again, this opens the scope for tailored interventions rather than a one-size-fits-all model of selective and indicated prevention for various MH conditions.

Furthermore, such interventions can be more accessible for hard-to-reach populations and those facing significant mental health stigma. Finally, they play a huge role in ensuring the continuity of care, particularly when community-based MH services are either limited or not available. For instance, IMIs can maintain the improvement of symptoms among individuals who were previously managed as in-patients (e.g., for suicidality or SUDs) or who received intensive treatment such as cognitive behavior therapy (CBT) for depression or anxiety, thereby helping to prevent relapse ( 86 , 112 ). Hence, such modules need to be developed and tested in low-resource settings.

IMIs (and other novel interventions), being less stigmatizing and easily accessible, provide a platform to engage individuals with chronic medical problems (e.g., epilepsy, cancer, cardiovascular diseases) and non-mental health professionals, thereby making these interventions more relevant and appealing for them.

Lastly, research on prevention-interventions needs to be more robust to adjust for the pre-intervention matching, high attrition rate, studying the characteristics of treatment completers vs. dropouts, and utilizing the intention-to-treat analysis to gauge the effect of such novel interventions ( 78 ).

Recommendations for Low-and-Middle-Income Countries

Although there is growing research on the effectiveness and utility of mental health promotion/prevention interventions across the lifespan and settings, low-resource settings suffer from specific limitations that restrict the full realization of such public health strategies, including the implementation of novel interventions. To overcome these challenges, some potential solutions/recommendations are as follows:

  • The mental health literacy of the population should be enhanced through information, education, and communication (IEC) activities. In addition, these activities should reduce stigma related to mental health problems and promote early identification and help-seeking for mental health-related issues.
  • Involving teachers, workplace managers, community leaders, non-mental health professionals, and allied health staff in mental health promotion and prevention is crucial.
  • Mental health concepts and related promotion and prevention should be incorporated into the education curriculum, particularly at the medical undergraduate level.
  • Training non-specialists such as community health workers on mental health-related issues across an individual's life course and intervening would be an effective strategy.
  • Collaborating with specialists from other disciplines, including complementary and alternative medicines, would be crucial. A provision of an integrated health system would help in increasing awareness, early identification, and prompt intervention for at-risk individuals.
  • Low-resource settings need to develop mental health promotion interventions such as community-and school-based interventions, as these would be more culturally relevant, acceptable, and scalable.
  • Utilizing a digital platform for scaling mental health services (e.g., telepsychiatry services to at-risk populations) and training the key individuals in the community would be a cost-effective framework that must be explored.
  • Infusion of higher financial and human resources in this area would be a critical step, as, without adequate resources, research, service development, and implementation would be challenging.
  • It would also be helpful to identify vulnerable populations and intervene in them to prevent the development of clinical psychiatric disorders.
  • Lastly, involving individuals with lived experiences at the level of mental health planning, intervention development, and delivery would be cost-effective.

Clinicians, researchers, public health experts, and policymakers have increasingly realized the importance of mental health promotion and prevention. Investment in preventive psychiatry appears to be essential considering the substantial burden of mental and neurological disorders and the significant treatment gap. Literature suggests that MH promotive and preventive interventions are feasible and effective across the lifespan and settings. Moreover, various novel interventions (e.g., internet- and mobile-based interventions, new therapies) have been developed worldwide and proven effective for mental health promotion and prevention; however, such interventions are limited mainly to HICs.

Despite the significance of preventive psychiatry in the current world and its wide-ranging implications for the wellbeing of society and individuals, including those suffering from chronic medical problems, it is a poorly utilized public health field for addressing the population's mental health needs. Lately, researchers and policymakers have realized the untapped potential of preventive psychiatry. However, its implementation in low-resource settings is still in its infancy and marred by several challenges. The utilization of novel interventions, such as digital interventions and blended and stepped-care models of care, can address the enormous mental health needs of the population. Additionally, it provides mental health services in a less stigmatizing, easily accessible, and flexible manner. More research on this is required from the LMICs.

Author Contributions

VS, AK, and SG: methodology, literature search, manuscript preparation, and manuscript review. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2022.898009/full#supplementary-material

Published on 19.4.2024 in Vol 26 (2024)

Psychometric Evaluation of a Tablet-Based Tool to Detect Mild Cognitive Impairment in Older Adults: Mixed Methods Study

Authors of this article:

Original Paper

  • Josephine McMurray 1,2*, MBA, PhD
  • AnneMarie Levy 1*, MSc, PhD
  • Wei Pang 1,3*, BTM
  • Paul Holyoke 4, PhD

1 Lazaridis School of Business & Economics, Wilfrid Laurier University, Brantford, ON, Canada

2 Health Studies, Faculty of Human and Social Sciences, Wilfrid Laurier University, Brantford, ON, Canada

3 Biomedical Informatics & Data Science, Yale University, New Haven, CT, United States

4 SE Research Centre, Markham, ON, Canada

*these authors contributed equally

Corresponding Author:

Josephine McMurray, MBA, PhD

Lazaridis School of Business & Economics

Wilfrid Laurier University

73 George St

Brantford, ON, N3T3Y3

Phone: 1 548 889 4492

Email: [email protected]

Background: With the rapid aging of the global population, the prevalence of mild cognitive impairment (MCI) and dementia is anticipated to surge worldwide. MCI serves as an intermediary stage between normal aging and dementia, necessitating more sensitive and effective screening tools for early identification and intervention. The BrainFx SCREEN is a novel digital tool designed to assess cognitive impairment. This study evaluated its efficacy as a screening tool for MCI in primary care settings, particularly in the context of an aging population and the growing integration of digital health solutions.

Objective: The primary objective was to assess the validity, reliability, and applicability of the BrainFx SCREEN (hereafter, the SCREEN) for MCI screening in a primary care context. We conducted an exploratory study comparing the SCREEN with an established screening tool, the Quick Mild Cognitive Impairment (Qmci) screen.

Methods: A concurrent mixed methods, prospective study using a quasi-experimental design was conducted with 147 participants from 5 primary care Family Health Teams (FHTs; characterized by multidisciplinary practice and capitated funding) across southwestern Ontario, Canada. Participants included health care practitioners, patients, and FHT administrative executives. Individuals aged ≥55 years with no history of MCI or diagnosis of dementia rostered in a participating FHT were eligible to participate. Participants were screened using both the SCREEN and Qmci. The study also incorporated the Geriatric Anxiety Scale–10 to assess general anxiety levels at each cognitive screening. The SCREEN’s scoring was compared against that of the Qmci and the clinical judgment of health care professionals. Statistical analyses included sensitivity, specificity, internal consistency, and test-retest reliability assessments.
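As an illustration of the accuracy measures named above, the following minimal Python sketch shows how sensitivity and specificity are conventionally derived from a 2×2 cross-tabulation of an index test against a reference test. The class name, function names, and counts are hypothetical and chosen for illustration only; they are not the study's data or analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from cross-tabulating an index test against a reference test."""
    true_pos: int   # index positive, reference positive
    false_neg: int  # index negative, reference positive
    false_pos: int  # index positive, reference negative
    true_neg: int   # index negative, reference negative


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of reference-positive cases that the index test flags."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of reference-negative cases that the index test clears."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts for illustration only; NOT data from this study.
example = ConfusionCounts(true_pos=30, false_neg=10, false_pos=20, true_neg=60)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```

Both measures depend on which instrument is treated as the reference, which matters when no comprehensive gold standard is available.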

Results: The study found that the SCREEN’s longer administration time and complex scoring algorithm, which is proprietary and unavailable for independent analysis, presented challenges. Its internal consistency, indicated by a Cronbach α of 0.63, was below the acceptable threshold. The test-retest reliability also showed limitations, with moderate intraclass correlation coefficient (0.54) and inadequate κ (0.15) values. Sensitivity and specificity were consistent (63.25% and 74.07%, respectively) between cross-tabulation and discrepant analysis. In addition, the study faced limitations due to its demographic skew (96/147, 65.3% female, well-educated participants), the absence of a comprehensive gold standard for MCI diagnosis, and financial constraints limiting the inclusion of confirmatory neuropsychological testing.
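For readers less familiar with the reliability statistics reported above, the following sketch shows the textbook formulas for Cronbach α (internal consistency) and Cohen κ (chance-corrected agreement between two categorical ratings) using only the Python standard library. The function names and toy inputs are hypothetical, and this is not the authors' analysis code; the SCREEN's item-level scoring is proprietary, so the item layout assumed here is illustrative.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach alpha for rows of participants, each with k item scores."""
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cohen_kappa(first: list[int], second: list[int]) -> float:
    """Cohen kappa for two equal-length lists of categorical ratings."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    labels = set(first) | set(second)
    # Agreement expected by chance from each rating's marginal proportions.
    expected = sum((first.count(lab) / n) * (second.count(lab) / n)
                   for lab in labels)
    return (observed - expected) / (1 - expected)


# Toy inputs for illustration only; NOT data from this study.
toy_items = [[1, 1], [2, 2], [3, 3]]
print(cronbach_alpha(toy_items))
print(cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

A κ near 0 indicates agreement no better than chance, which is why a low κ can coexist with a moderate intraclass correlation coefficient when scores are dichotomized at a cutoff.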

Conclusions: The SCREEN, in its current form, does not meet the necessary criteria for an optimal MCI screening tool in primary care settings, primarily due to its longer administration time and lower reliability. As the number of digital health technologies increases and evolves, further testing and refinement of tools such as the SCREEN are essential to ensure their efficacy and reliability in real-world clinical settings. This study advocates for continued research in this rapidly advancing field to better serve the aging population.

International Registered Report Identifier (IRRID): RR2-10.2196/25520

Introduction

Mild cognitive impairment (MCI) is a syndrome characterized by a slight but noticeable and measurable deterioration in cognitive abilities, predominantly memory and thinking skills, that is greater than expected for an individual’s age and educational level [ 1 , 2 ]. The functional impairments associated with MCI are subtle and often impair instrumental activities of daily living (ADL). Instrumental ADL include everyday tasks such as managing finances, cooking, shopping, or taking regularly prescribed medications and are considered more complex than ADL such as bathing, dressing, and toileting [ 3 , 4 ]. In cases in which memory impairment is the primary indicator of the disease, MCI is classified as amnesic MCI and when significant impairment of non–memory-related cognitive domains such as visual-spatial or executive functioning is dominant, MCI is classified as nonamnesic [ 5 ].

Cognitive decline, more so than cancer and cardiovascular disease, poses a substantial threat to an individual’s ability to live independently or at home with family caregivers [ 6 ]. The Centers for Disease Control and Prevention reports that 1 in 8 adults aged ≥60 years experiences memory loss and confusion, with 35% reporting functional difficulties with basic ADL [ 7 ]. The American Academy of Neurology estimates that the prevalence of MCI ranges from 13.4% to 42% in people aged ≥65 years [ 8 ], and a 2023 meta-analysis that included 233 studies and 676,974 participants aged ≥50 years estimated that the overall global prevalence of MCI is 19.7% [ 9 ]. Once diagnosed, the prognosis for MCI is variable, whereby the impairment may be reversible; the rate of decline may plateau; or it may progressively worsen and, in some cases, may be a prodromal stage to dementia [ 10 - 12 ]. While estimates vary based on sample (community vs clinical), annual rates of conversion from MCI to dementia range from 5% to 24% [ 11 , 12 ], and those who present with multiple domains of cognitive impairment are at higher risk of conversion [ 5 ].

The risk of developing MCI rises with age, and while there are no drug treatments for MCI, nonpharmacologic interventions may improve cognitive function, alleviate the burden on caregivers, and potentially delay institutionalization should MCI progress to dementia [ 13 ]. To overcome the challenges of early diagnosis, which currently depends on self-detection, family observation, or health care provider (HCP) recognition of symptoms, screening high-risk groups for MCI or dementia is suggested as a solution [ 13 ]. However, the Canadian Task Force on Preventive Health Care recommends against screening adults aged ≥65 years due to a lack of meaningful evidence from randomized controlled trials and the high false-positive rate [ 14 - 16 ]. The main objective of a screening test is to reduce morbidity or mortality in at-risk populations through early detection and intervention, with the anticipated benefits outweighing potential harms. Using brief screening tools in primary care might improve MCI case detection, allowing patients and families to address reversible causes, make lifestyle changes, and access disease-modifying treatments [ 17 ].

There is no agreement among experts as to which tests or groups of tests are most predictive of MCI [ 16 ], and the gold standard approach uses a combination of positive results from neuropsychological assessments, laboratory tests, and neuroimaging to infer a diagnosis [ 8 , 18 ]. The clinical heterogeneity of MCI complicates its diagnosis because it influences not only memory and thinking abilities but also mood, behavior, emotional regulation, and sensorimotor abilities, and patients may present with any combination of symptoms with varying rates of onset and decline [ 4 , 8 ]. For this reason, a collaborative approach between general practitioners and specialists (eg, geriatricians and neurologists) is often required to be confident in the diagnosis of MCI [ 8 , 19 , 20 ].

In Canada, diagnosis often begins with screening for cognitive impairment followed by referral for additional testing; this process takes, on average, 5 months [ 20 ]. The current usual practice screening tools for MCI are the Mini-Mental State Examination (MMSE) [ 21 , 22 ] and the Montreal Cognitive Assessment (MoCA) 8.1 [ 3 ]. Both are paper-and-pencil screens administered in 10 to 15 minutes, scored out of 30, and validated as MCI screening tools across diverse clinical samples [ 23 , 24 ]. Universally, the MMSE is most often used to screen for MCI [ 20 , 25 ] and consists of 20 items that measure orientation, immediate and delayed recall, attention and calculation, visual-spatial skills, verbal fluency, and writing. The MoCA 8.1 was developed to improve on the MMSE’s ability to detect early signs of MCI, placing greater emphasis on evaluating executive function as well as language, memory, visual-spatial skills, abstraction, attention, concentration, and orientation across 30 items [ 24 , 26 ]. Scores of <24 on the MMSE or ≤25 on the MoCA 8.1 signal probable MCI [ 21 , 27 ]. Lower cutoff scores for both screens have been recommended to address evidence that they lack specificity to detect mild and early cases of MCI [ 4 , 28 - 31 ]. The clinical efficacy of both screens for tracking change in cognition over time is limited as they are also subject to practice effects with repeated administration [ 32 ].
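The cutoffs stated above (scores of <24 on the MMSE or ≤25 on the MoCA 8.1, each out of 30) can be written as a simple decision rule. The following Python sketch is illustrative only; the function name and tool labels are hypothetical, and a flagged score signals probable MCI for follow-up rather than a diagnosis.

```python
def flags_probable_mci(tool: str, score: int) -> bool:
    """Apply the screening cutoffs described in the text (both scored out of 30)."""
    if not 0 <= score <= 30:
        raise ValueError("score must be between 0 and 30")
    if tool == "MMSE":
        return score < 24    # <24 signals probable MCI
    if tool == "MoCA":
        return score <= 25   # <=25 signals probable MCI
    raise ValueError(f"unknown tool: {tool}")


# The same raw score can be classified differently by the two tools.
for tool in ("MMSE", "MoCA"):
    print(tool, 25, flags_probable_mci(tool, 25))
```

Because the two thresholds differ, a score of 25 is flagged by the MoCA rule but not by the MMSE rule, which is one reason cutoff choice affects reported sensitivity and specificity.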

Novel screening tools, including the Quick Mild Cognitive Impairment (Qmci) screen, have been developed with the goal of improving the accuracy of detecting MCI [ 33 , 34 ]. The Qmci is a sensitive and specific tool that differentiates normal cognition from MCI and dementia and is more accurate at differentiating MCI from controls than either the MoCA 8.1 (Qmci area under the curve=0.97 vs MoCA 8.1 area under the curve=0.92) [ 25 , 35 ] or the Short MMSE [ 33 , 36 ]. It also demonstrates high test-retest reliability (intraclass correlation coefficient [ICC]=0.88) [ 37 ] and is clinically useful as a rapid screen for MCI as the Qmci mean is 4.5 (SD 1.3) minutes versus 9.5 (SD 2.8) minutes for the MoCA 8.1 [ 25 ].

The COVID-19 pandemic and the necessary shift to virtual health care accelerated the use of digital assessment tools, including MCI screening tools such as the electronic MoCA 8.1 [ 38 , 39 ], and the increased use and adoption of technology (eg, smartphones and tablets) by older adults suggests that a lack of proficiency with technology may not be a barrier to the use of such assessment tools [ 40 , 41 ]. BrainFx is a for-profit firm that creates proprietary software designed to assess cognition and changes in neurofunction that may be caused by neurodegenerative diseases (eg, MCI or dementia), stroke, concussions, or mental illness using ecologically relevant tasks (eg, prioritizing daily schedules and route finding on a map) [ 42 ]. Their assessments are administered via a tablet and stylus. The BrainFx 360 performance assessment (referred to hereafter as the 360) is a 90-minute digitally administered test that was designed to assess cognitive, physical, and psychosocial areas of neurofunction across 26 cognitive domains using 49 tasks that are timed and scored [ 42 ]. The BrainFx SCREEN (referred to hereafter as the SCREEN) is a short digital version of the 360 that includes 7 of the cognitive domains included in the 360, is estimated to take approximately 10 to 15 minutes to complete, and was designed to screen for early detection of cognitive impairment [ 43 , 44 ]. Upon completion of any BrainFx assessment, the results of the 360 or SCREEN are added to the BrainFx Living Brain Bank (LBB), which is an electronic database that stores all completed 360 and SCREEN assessments and is maintained by BrainFx. An electronic report is generated by BrainFx comparing an individual’s results to those of others collected and stored in the LBB. Normative data from the LBB are used to evaluate and compare an individual’s results.

The 360 has been used in clinical settings to assess neurofunction among youth [ 45 ] and anecdotally in other rehabilitation settings (T Milner, personal communication, May 2018). To date, research on the 360 indicates that it has been validated in healthy young adults (mean age 22.9, SD 2.4 years) and that the overall test-retest reliability of the tool is high (ICC=0.85) [ 42 ]. However, only 2 of the 7 tasks selected to be included in the SCREEN produced reliability coefficients of >0.70 (visual-spatial and problem-solving abilities) [ 42 ]. Jones et al [ 43 ] explored the acceptability and perceived usability of the SCREEN with a small sample (N=21) of Canadian Armed Forces veterans living with posttraumatic stress disorder. A structural equation model based on the Unified Theory of Acceptance and Use of Technology suggested that behavioral intent to use the SCREEN was predicted by facilitating conditions such as guidance during the test and appropriate resources to complete the test [ 43 ]. However, the validity, reliability, and sensitivity of the SCREEN for detecting cognitive impairment have not been tested.

McMurray et al [ 44 ] designed a protocol to assess the validity, reliability, and sensitivity of the SCREEN for detecting early signs of MCI in asymptomatic adults aged ≥55 years in a primary care setting (5 Family Health Teams [FHTs]). The protocol also used a series of semistructured interviews and surveys guided by the fit between individuals, task, technology, and environment framework [ 46 ], a health-specific model derived from the Task-Technology Fit model by Goodhue and Thompson [ 47 ], to explore the SCREEN’s acceptability and use by HCPs and patients in primary care settings (manuscript in preparation). This study is a psychometric evaluation of the SCREEN’s validity, reliability, and sensitivity for detecting MCI in asymptomatic adults aged ≥55 years in primary care settings.

Study Location, Design, and Data Collection

This was a concurrent, mixed methods, prospective study using a quasi-experimental design. Participants were recruited from 5 primary care FHTs (characterized by multidisciplinary practice and capitated funding) across southwestern Ontario, Canada. FHTs that used a registered occupational therapist on staff were eligible to participate in the study, and participating FHTs received a nominal compensatory payment for the time the HCPs spent in training; collecting data for the study; administering the SCREEN, Qmci, and Geriatric Anxiety Scale–10 (GAS-10); and communicating with the research team. A multipronged recruitment approach was used [ 44 ]. A designated occupational therapist at each location was provided with training and equipment to recruit participants, administer assessment tools, and submit collected data to the research team.

The research protocol describing the methods of both the quantitative and qualitative arms of the study is published elsewhere [ 44 ].

Ethical Considerations

This study was approved by the Wilfrid Laurier University Research Ethics Board (ORE 5820) and was reviewed and approved by each FHT. Participants (HCPs, patients, and administrative executives) read and signed an information and informed consent package in advance of taking part in the study. We complied with recommendations for obtaining informed consent and conducting qualitative interviews with persons with dementia when recruiting patients who may be affected by neurocognitive diseases [ 48 - 50 ]. In addition, at the end of each SCREEN assessment, patients were required to provide their consent (electronic signature) to contribute their anonymized scores to the database of SCREEN results maintained by BrainFx. Upon enrolling in the study, participants were assigned a unique identification number that was used in place of their name on all study documentation to anonymize the data and preserve their confidentiality. A master list matching participant names with their unique identification number was stored in a password-protected file by the administering HCP and principal investigator on the research team. The FHTs received a nominal compensatory payment to account for their HCPs’ time spent administering the SCREEN, collecting data for the study, and communicating with the research team. However, the individual HCPs who volunteered to participate and the patient participants were not financially compensated for taking part in the study.

Participants

Patients were eligible to participate if they were rostered with the FHT, were aged ≥55 years, and had no history of MCI or dementia diagnoses; these criteria were intended to better capture the population at risk of early signs of cognitive impairment [ 51 , 52 ]. It was necessary for the participants to be rostered with the FHTs to ensure that the HCPs could access their electronic medical record to confirm eligibility and record the testing sessions and results and to ensure that there was a responsible physician for referral if indicated. As the SCREEN is administered using a tablet, participants had to be able to read and think in English and discern color, have adequate hearing and vision to interact with the administering HCP, read 12-point font on the tablet, and have adequate hand and arm function to manipulate and hold the tablet. The exclusion criteria used in the study included colorblindness and any disability that might impair the individual’s ability to hold and interact with the tablet. Prospective participants were also excluded based on a diagnosis of conditions that may result in MCI or dementia-like symptoms, including major depression that required hospitalization, psychiatric disorders (eg, schizophrenia and bipolar disorder), psychopathology, epilepsy, substance use disorders, or sleep apnea (without the use of a continuous positive airway pressure machine) [ 52 ]. Patients were required to complete a minimum of 2 screening sessions spaced 3 months apart to participate in the study and, depending on when they enrolled to participate, could complete a maximum of 4 screening sessions over a year.

Data Collection Instruments

GAS-10 Instrument

A standardized protocol was used to collect demographic data, randomly administer the SCREEN and the Qmci (a validated screening tool for MCI), and administer the GAS-10 immediately before and after the completion of the first MCI screen at each visit [ 44 ]. This was to assess participants’ general anxiety as it related to screening for cognitive impairment at the time of the assessment, any change in subjective ratings after completion of the first MCI screen, and change in anxiety between appointments. The GAS-10 is a 10-item, self-report screen for anxiety in older adults [ 53 ] developed for rapid screening of anxiety in clinical settings (the GAS-10 is the short form of the full 30-item Geriatric Anxiety Scale [GAS]) [ 54 ]. While 3 subscales are identified, the GAS is reported to be a unidimensional scale that assesses general anxiety [ 55 , 56 ]. Validation of the GAS-10 suggests that it is optimal for assessing average to moderate levels of anxiety in older adults, with subscale scores that are highly and positively correlated with the GAS and high internal consistency [ 53 ]. Participants were asked to use a 4-point Likert scale (0= not at all , 1= sometimes , 2= most of the time , and 3= all of the time ) to rate how often they had experienced each symptom over the previous week, including on the day the test was administered [ 54 ]. The GAS-10 has a maximum score of 30, with higher scores indicating higher levels of anxiety [ 53 , 54 , 57 ].

HCPs completed the required training to become certified BrainFx SCREEN administrators before the start of the study. To this end, HCPs completed a web-based training program (developed and administered through the BrainFx website) that included 3 self-directed training modules. For the purpose of the study, they also participated in 1 half-day in-person training session conducted by a certified BrainFx administrator (T Milner, BrainFx chief executive officer) at one of the participating FHT locations. The SCREEN (version 0.5; beta) was administered on a tablet (ASUS ZenPad 10.1” IPS WXGA display, 1920 × 1200, powered by a quad-core 1.5 GHz, 64-bit MediaTek MTK 8163A processor with 2 GB RAM and 16-GB storage). The tablet came with a tablet stand for optional use and a dedicated stylus that is recommended for completion of a subset of questions. At the start of the study, HCPs were provided with identical tablets preloaded with the SCREEN software for use in the study. The 7 tasks on the SCREEN are summarized in Table 1 and were taken directly from the 360 based on a clustering and regression analysis of LBB records in 2016 (N=188) [ 58 ]. A detailed description of the study and SCREEN administration procedures was published by McMurray et al [ 44 ].

An activity score is generated for each of the 7 tasks on the SCREEN. It is computed based on a combination of the accuracy of the participant’s response and the processing speed (time in seconds) that it takes to complete the task. The relative contribution of accuracy and processing speed to the final activity score for each task is proprietary to BrainFx and unknown to the research team. The participant’s activity score is compared to the mean activity score for the same task at the time of testing in the LBB. The mean activity score from the LBB may be based on the global reference population (ie, all available SCREEN results in the LBB), or the administering HCP may select a specific reference population by filtering according to factors including but not limited to age, sex, or diagnosis. If the participant’s activity score is >1 SD below the LBB activity score mean for that task, it is labeled as an area of challenge. Each of the 7 tasks on the SCREEN is evaluated independently of the others, producing a report with 7 activity scores showing the participant’s score, the LBB mean score, and the SD. The report also provides an overall performance and processing speed score. The overall performance score is an average of all 7 activity scores; however, the way in which the overall processing speed score is generated remains proprietary to BrainFx and unknown to the research team. Both the overall performance and processing speed scores are similarly evaluated against the LBB and identified as an area of challenge using the criteria described previously. For the purpose of this study, participants’ mean activity scores on the SCREEN were compared to the results of people aged ≥55 years in the LBB.
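The flagging rule described above can be illustrated with a minimal Python sketch. It is not the BrainFx implementation: the activity scores themselves come from a proprietary combination of accuracy and processing speed, so the sketch starts from scores that have already been computed. The function names, task labels, and reference values are hypothetical and serve only to show the comparison against a reference mean and SD.

```python
from statistics import mean

def is_area_of_challenge(score, ref_mean, ref_sd):
    """Return True when a score is more than 1 SD below the reference mean."""
    return score < ref_mean - ref_sd

def summarize_report(scores, reference):
    """scores: {task: activity score}; reference: {task: (mean, SD)}."""
    flagged = [task for task, score in scores.items()
               if is_area_of_challenge(score, *reference[task])]
    return {
        # Overall performance is the plain average of the task scores
        "overall_performance": mean(scores.values()),
        "areas_of_challenge": flagged,
    }

# Hypothetical values for illustration only (not LBB data)
example_scores = {"task_a": 40.0, "task_b": 60.0}
example_reference = {"task_a": (55.0, 10.0), "task_b": (55.0, 10.0)}
report = summarize_report(example_scores, example_reference)
```

Under this reading of the rule, a score exactly 1 SD below the reference mean is not flagged, because the criterion is stated as more than 1 SD below.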

The Qmci evaluated 6 cognitive domains: orientation (10 points), registration (5 points), clock drawing (15 points), delayed recall (20 points), verbal fluency (20 points), and logical memory (30 points) [ 59 ]. Administering HCPs scored the test manually, with each subtest’s points contributing to the overall score out of 100 points, and the cutoff score to distinguish normal cognition from MCI was ≤67/100 [ 60 ]. Cutoffs to account for age and education have been validated and are recommended as the Qmci is sensitive to these factors [ 60 ]. A 2019 meta-analysis of the diagnostic accuracy of MCI screening tools reported that the sensitivity and specificity of the Qmci for distinguishing MCI from normal cognition is similar to usual standard-of-care tools (eg, the MoCA, Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease battery total score, and Sunderland Clock Drawing Test) [ 61 ]. The Qmci has also been translated into >15 different languages and has undergone psychometric evaluation across a subset of these languages. While not as broadly adopted as the MoCA 8.1 in Canada, its psychometric properties, administration time, and availability for use suggested that the Qmci was an optimal assessment tool for MCI screening in FHT settings during the study.

Psychometric Evaluation

To date, the only published psychometric evaluation of any BrainFx tool is by Searles et al [ 42 ] in Athletic Training & Sports Health Care ; it assessed the test-retest reliability of the 360 in 15 healthy adults between the ages of 20 and 25 years. This study evaluated the psychometric properties of the SCREEN and included a statistical analysis of the tool’s internal consistency, construct validity, test-retest reliability, and sensitivity and specificity. McMurray et al [ 44 ] provide a detailed description of the data collection procedures for administration of the SCREEN and Qmci completed by participants at each visit.

Validity Testing

Face validity was outside the scope of this study but was implied, and assumptions are reported in the Results section. Construct validity, that is, whether the 7 activities that make up the SCREEN were representative of MCI, was assessed through comparison with a substantive body of literature in the domain and through principal component analysis using varimax rotation. Criterion validity measured how closely the SCREEN results corresponded to the results of the Qmci (used here as an “imperfect gold standard” for identifying MCI in older adults) [ 62 ]. A BrainFx representative hypothesized that the ecological validity of the SCREEN questions (ie, using tasks that reflect real-world activities to detect early signs of cognitive impairment) [ 63 ] makes it a more sensitive tool than other screens (T Milner, personal communication, May 2018) and allows HCPs to equate activity scores on the SCREEN with real-world functional abilities. Criterion validity was explored first using cross-tabulations to calculate the sensitivity and specificity of the SCREEN compared to those of the Qmci. Conventional screens such as the Qmci are scored by taking the sum of correct responses on the screen and a cutoff score derived from normative data to distinguish normal cognition from MCI. The SCREEN used a different method of scoring whereby each of the 7 tasks was scored and evaluated independently of each other and there were no recommended guidelines for distinguishing normal cognition from MCI based on the aggregate areas of challenge identified by the SCREEN. Therefore, to compare the sensitivity and specificity of the SCREEN against those of the Qmci, the results of each screen were coded into a binary format as 1=healthy and 2=unhealthy. For the SCREEN, healthy denoted no areas of challenge and unhealthy denoted one or more areas of challenge; for the Qmci, healthy denoted a score of ≥67 and unhealthy denoted a score of <67.

Criterion validity was further explored using discrepant analysis via a resolver test [ 44 ]. Following the administration of the SCREEN and Qmci, screen results were evaluated by the administering HCP. HCPs were instructed to refer the participant for follow-up with their primary care physician if the Qmci result was <67 regardless of whether any areas of challenge were identified on the SCREEN. However, HCPs could use their clinical judgment to refer a participant for physician follow-up based on the results of the SCREEN or the Qmci, and all the referral decisions were charted on the participant’s electronic medical record following each visit and screening. In discrepant analysis, the results of the imperfect gold standard [ 64 ], as was the role of the Qmci in this study, were compared with the SCREEN results. A resolver test (classified as whether the HCP referred the patient to a physician for follow-up based on their performance on the SCREEN and the Qmci) was used on discordant results [ 64 , 65 ] to determine sensitivity and specificity. To this end, a new variable, Referral to a Physician for Cognitive Impairment , was coded as the true status (1=no referral; 2=referral was made) and compared to the Qmci as the imperfect gold standard (1=healthy; 2=unhealthy).

Reliability Testing

The reliability of a screening instrument is its ability to consistently measure an attribute and how well its component measures fit together conceptually. Internal consistency identifies whether the items in a multi-item scale are measuring the same underlying construct; the internal consistency of the SCREEN was assessed using the Cronbach α. Test-retest reliability refers to the ability of a measurement instrument to reproduce results over ≥2 occasions (assuming the underlying conditions have not changed) and was assessed using paired t tests (2-tailed), ICC, and the κ coefficient. In this study, participants completed both the SCREEN and the Qmci in the same sitting in a random sequence on at least 2 different occasions spaced 3 months apart (administration procedures are described elsewhere) [ 44 ]. In some instances, the screens were administered to the same participant on 4 separate occasions spaced 3 months apart each, and this provided up to 3 separate opportunities to conduct test-retest reliability analyses and investigate the effects of repeated practice. There are no clear guidelines on the optimal time between tests [ 66 , 67 ]; however, Streiner and Kottner [ 68 ] and Streiner [ 69 ] recommend longer periods between tests (eg, at least 10-14 days) to avoid recall bias, and greater practice effects have been experienced with shorter test-retest intervals [ 32 ].

Analysis of the quantitative data was completed using Stata (version 17.0; StataCorp). Assumptions of normality were not violated, so parametric tests were used. Collected data were reported using frequencies and percentages and compared using the chi-square or Fisher exact test as necessary. Continuous data were analyzed for central tendency and variability; categorical data were presented as proportions. Normality was tested using the Shapiro-Wilk test, and nonparametric data were tested using the Mann-Whitney U test. A P value of <.05 was considered statistically significant, with 95% CIs provided where appropriate. We powered the exploratory analysis to validate the SCREEN using an estimated effect size of 12% (understanding that Canadian prevalence rates of MCI were not available [ 1 ]) and determined that the study required at least 162 participants. For test-retest reliability, using 90% power and a 5% type-I error rate, a minimum of 67 test results was required.

The time taken for participants to complete the SCREEN was recorded by the HCPs at the time of testing; there were 6 missing HCP records of time to complete the SCREEN. For these 6 cases of missing data, we imputed the mean time to complete the SCREEN by all participants who were tested by that HCP and used this to populate the missing cells [ 70 ]. There were 3 cases of missing data related to the SCREEN reports. More specifically, the SCREEN report generated by BrainFx did not include 1 or 2 data points each for the route finding, divided attention, and prioritizing tasks. The clinical notes provided by the HCP at the time of SCREEN administration did not indicate that the participant had not completed those questions, and it was not possible to determine the root cause of the missing data in report generation according to BrainFx (M Milner, personal communication, July 7, 2020). For continuous variables in analyses such as exploratory factor analysis, Cronbach α, and t test, missing values were imputed using the mean. However, for the coded healthy and unhealthy categorical variables, values were not imputed.
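The group-mean imputation used for the 6 missing completion times can be sketched as follows. This is a minimal Python illustration under assumed field names (`hcp`, `minutes`), not the Stata procedure used in the analysis, and the values shown are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def impute_by_hcp(records):
    """Fill missing 'minutes' with the mean of observed values for the same HCP.

    records: list of dicts with keys 'hcp' and 'minutes' (None when missing).
    Returns a new list; the input is left unchanged. Raises KeyError if an
    HCP has no observed times at all, since no group mean exists in that case.
    """
    observed = defaultdict(list)
    for rec in records:
        if rec["minutes"] is not None:
            observed[rec["hcp"]].append(rec["minutes"])
    group_means = {hcp: mean(vals) for hcp, vals in observed.items()}
    return [
        dict(rec, minutes=group_means[rec["hcp"]]) if rec["minutes"] is None else dict(rec)
        for rec in records
    ]

# Hypothetical records for illustration only
example = [
    {"hcp": "A", "minutes": 10.0},
    {"hcp": "A", "minutes": 14.0},
    {"hcp": "A", "minutes": None},
    {"hcp": "B", "minutes": 20.0},
]
filled = impute_by_hcp(example)
```

Imputing within each HCP rather than across the whole sample keeps any administrator-specific pacing in the filled values, at the cost of slightly understating the variance of completion time.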

Data collection began in January 2019 and was to conclude on May 31, 2020. However, the emergence of the global COVID-19 pandemic resulted in the FHTs and Wilfrid Laurier University prohibiting all in-person research starting on March 16, 2020.

Participant Demographics

A total of 154 participants were recruited for the study, and 20 (13%) withdrew following their first visit to the FHT. The data of 65% (13/20) of the participants who withdrew were included in the final analysis, and the data of the remaining 35% (7/20) were removed, either due to their explicit request (3/7, 43%) or because technical issues at the time of testing rendered their data unusable (4/7, 57%). These technical issues were software related (eg, any instance in which the patient or HCP followed the instructions provided but the SCREEN software did not work as expected [ie, objects did not move where they were dragged or tapping on objects failed to highlight them] and the question could not be completed). After attrition, a total of 147 individuals aged ≥55 years with no previous diagnosis of MCI or dementia participated in the study ( Table 2 ). Of the 147 participants, 71 (48.3%) took part in only 1 round of screening on visit 1 (due to COVID-19 restrictions imposed on in-person research that prevented a second visit). The remaining 51.7% (76/147) of the participants took part in ≥2 rounds of screening across multiple visits (76/147, 51.7% participated in 2 rounds; 22/147, 15% participated in 3 rounds; and 13/147, 8.8% participated in 4 rounds of screening).

The sample population was 65.3% (96/147) female (mean 70.2, SD 7.9 years) and 34.7% (51/147) male (mean 72.5, SD 8.1 years), with age ranging from 55 to 88 years; 65.3% (96/147) achieved the equivalent of or higher than a college diploma or certificate ( Table 2 ); and 32.7% (48/147) self-reported living with one or more chronic medical conditions ( Table 3 ). At the time of screening, 73.5% (108/147) of participants were also taking medications with side effects that may include impairments to memory and thinking abilities [ 71 - 75 ]; therefore, medication use was accounted for in a subset of the analyses. Finally, 84.4% (124/147) of participants self-reported regularly using technology (eg, smartphone, laptop, or tablet) with high proficiency. A random sequence generator was used to determine the order for administering the MCI screens; the SCREEN was administered first 51.9% (134/258) of the time.

Construct Validity

Construct validity was assessed through a review of relevant peer-reviewed literature that compared constructs included in the SCREEN with those identified in the literature as 2 of the most sensitive tools for MCI screening: the MoCA 8.1 [ 76 ] and the Qmci [ 25 ]. Memory, language, and verbal skills are assessed in the MoCA and Qmci but are absent from the SCREEN. Tests of verbal fluency and logical memory have been shown to be particularly sensitive to early cognitive changes [ 77 , 78 ] but are similarly absent from the SCREEN.

Exploratory factor analysis was performed to examine the SCREEN’s ability to reliably measure risk of MCI. The Kaiser-Meyer-Olkin measure yielded a value of 0.79, exceeding the commonly accepted threshold of 0.70, indicating that the sample was adequate for factor analysis. The Bartlett test of sphericity returned a chi-square value of χ²(21)=167.1 ( P <.001), confirming the presence of correlations among variables suitable for factor analysis. A principal component analysis revealed 2 components with eigenvalues of >1, cumulatively accounting for 52.12% of the variance, with the first component alone explaining 37.8%. After the varimax rotation, the 2 components exhibited distinct patterns of loadings, with the visual-spatial ability task loading predominantly on the second component. The SCREEN tasks, except for visual-spatial ability, loaded substantially on the components (>0.5), suggesting that the SCREEN possesses good convergent validity for assessing the risk of MCI.

Criterion Validity

The coding of SCREEN scores into a binary healthy and unhealthy outcome standardized the dependent variable to allow for criterion testing. Criterion validity was assessed using cross-tabulations and the analysis of confusion matrices and provided insights into the sensitivity and specificity of the SCREEN when compared to the Qmci. Of the 144 cases considered, 20 (13.9%) were true negatives, and 74 (51.4%) were true positives. The SCREEN’s sensitivity, which reflects its capacity to accurately identify healthy individuals (true positives), was 63.25% (74 correct identifications/117 actual positives). The specificity of the test, indicating its ability to accurately identify unhealthy individuals (true negatives), was 74.07% (20 correct identifications/27 actual negatives). Sensitivity and specificity were then derived using the discrepant analysis and resolver test described previously (whether the HCP referred the participant to a physician following the screens). The results were identical: the estimate of the SCREEN sensitivity was 63.3% (74/117), and the estimate of the specificity was 74.1% (20/27).
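The arithmetic behind these estimates can be reproduced with a short Python sketch. It follows the convention used in this section, in which healthy is the positive class; the false-negative and false-positive counts are not reported directly and are derived here by subtraction from the reported totals. The function and variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of actual positives that the test classified as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of actual negatives that the test classified as negative."""
    return true_neg / (true_neg + false_pos)

# Counts reported in the text: 74 of 117 actual positives, 20 of 27 actual negatives
TRUE_POS, ACTUAL_POS = 74, 117
TRUE_NEG, ACTUAL_NEG = 20, 27

sens = sensitivity(TRUE_POS, ACTUAL_POS - TRUE_POS)  # 43 false negatives by subtraction
spec = specificity(TRUE_NEG, ACTUAL_NEG - TRUE_NEG)  # 7 false positives by subtraction

print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
# prints: sensitivity=63.25%, specificity=74.07%
```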

Internal Reliability

A Cronbach α=0.70 is acceptable, and at least 0.90 is required for clinical instruments [ 79 ]. The estimate of internal consistency for the SCREEN (N=147) was Cronbach α=0.63.
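For readers less familiar with the statistic, Cronbach α for k items is k/(k−1) multiplied by 1 minus the ratio of the summed item variances to the variance of the total score. A minimal Python sketch of that formula follows; it is an illustration with made-up data, not the Stata routine used in the analysis, and the threshold labels simply restate the 0.70 and 0.90 values cited above.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one score per participant."""
    k = len(items)
    # Total score per participant across all items
    totals = [sum(scores) for scores in zip(*items)]
    summed_item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))

def describe_alpha(alpha):
    """Label alpha against the thresholds cited in the text (0.70 and 0.90)."""
    if alpha >= 0.90:
        return "meets threshold for clinical instruments"
    if alpha >= 0.70:
        return "acceptable"
    return "below acceptable"

# Made-up data: three identical items, so alpha should equal 1
demo_alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Applied to the reported estimate, `describe_alpha(0.63)` returns "below acceptable".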

Test-Retest Reliability

Test-retest reliability analyses were conducted using ICC for the SCREEN activity scores and the κ coefficient for the healthy and unhealthy classifications. Guidelines for interpretation of the ICC suggest that anything <0.5 indicates poor reliability and anything between 0.5 and 0.75 suggests moderate reliability [ 80 ]; the ICC for the SCREEN activity scores was 0.54. With respect to the κ coefficient, a κ value of <0.2 is considered to have no level of agreement, a κ value of 0.21 to 0.39 is considered minimal, a κ value of 0.4 to 0.59 is considered weak agreement, and anything >0.8 suggests strong to almost perfect agreement [ 81 ]. The κ coefficient for healthy and unhealthy classifications was 0.15.
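The κ coefficient corrects the observed agreement between two sets of classifications for the agreement expected by chance. The sketch below computes Cohen κ for two visits in plain Python, using the 1=healthy and 2=unhealthy coding described earlier; the example classifications are invented for illustration and are not study data.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen kappa for two equal-length sequences of category labels.

    Undefined (division by zero) when chance agreement equals 1, for example
    when both sequences contain a single identical label throughout.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    labels = set(first) | set(second)
    # Chance agreement from the marginal proportions of each label
    expected = sum(counts_first[lab] * counts_second[lab] for lab in labels) / n**2
    return (observed - expected) / (1 - expected)

# Invented classifications (1=healthy, 2=unhealthy) for illustration only
perfect = cohen_kappa([1, 1, 2, 2], [1, 1, 2, 2])
chance_level = cohen_kappa([1, 1, 2, 2], [1, 2, 1, 2])
```

In the second example, half of the classifications agree, but that is exactly what the marginal proportions predict by chance, so κ is 0 despite 50% raw agreement.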

Analysis of the Factors Impacting Healthy and Unhealthy Results

The Spearman rank correlation was used to assess the relationships between participants’ overall activity score on the SCREEN and their total time to complete the SCREEN; age, sex, and self-reported levels of education; technology use; medication use; amount of sleep; and level of anxiety (as measured using the GAS-10) at the time of SCREEN administration. Lower overall activity scores were moderately correlated with being older ( rs(142)=–0.57; P <.001) and increased total time to complete the SCREEN ( rs(142)=0.49; P <.001). There was also a moderate inverse relationship between overall activity score and total time to complete the SCREEN ( rs(142)=–0.67; P <.001) whereby better performance was associated with quicker task completion. There were weak positive associations between overall activity score and increased technology use ( rs(142)=0.34; P <.001) and higher level of education ( rs(142)=0.21; P =.01).
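The Spearman coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following Python sketch shows that calculation using only the standard library; the data are invented, and the helper names are illustrative rather than taken from the study’s Stata code.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))

# Invented example: scores fall steadily as completion time rises
rho = spearman_rho([1, 2, 3, 4, 5], [10, 9, 7, 4, 1])
```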

A logistic regression model was used to predict the SCREEN result using data from 144 observations. The model’s predictors explain approximately 21.33% of the variance in the outcome variable. The likelihood ratio test indicates that the model provides a significantly better fit to the data than a model without predictors ( P <.001).

The SCREEN outcome variable ( healthy vs unhealthy ) was associated with the predictor variables sex and total time to complete the SCREEN. More specifically, female participants were more likely to obtain healthy SCREEN outcomes ( P =.007; 95% CI 0.32-2.05). For all participants, the longer it took to complete the SCREEN, the less likely they were to achieve a healthy SCREEN outcome ( P =.002; 95% CI –0.33 to –0.07). Age ( P =.25; 95% CI –0.09 to 0.02), medication use ( P =.96; 95% CI –0.9 to 0.94), technology use ( P =.44; 95% CI –0.28 to 0.65), level of education ( P =.14; 95% CI –0.09 to 0.64), level of anxiety ( P =.26; 95% CI –1.13 to 0.3), and hours of sleep ( P =.08; 95% CI –0.06 to 0.93) were not significant.

Impact of Practice Effects

The SCREEN was administered approximately 3 months apart, and separate, paired-sample t tests were performed to compare SCREEN outcomes between visits 1 and 2 (76/147, 51.7%; Table 4 ), visits 2 and 3 (22/147, 15%), and visits 3 and 4 (13/147, 8.8%). Declining visits were partially attributable to the early shutdown of data collection due to the COVID-19 pandemic, and therefore, comparisons between visits 2 and 3 or visits 3 and 4 were not reported. Compared to participants’ SCREEN performance on visit 1, their overall mean activity score and overall processing time improved on their second administration of the SCREEN (score: t(75)=–2.86 and P =.005; processing time: t(75)=–2.98 and P =.004). Even though the 7 task-specific activity scores on the SCREEN also increased between visits 1 and 2, these improvements were not significant, indicating that the difference in overall activity scores was cumulative and not attributable to a specific task ( Table 4 ).
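The paired-sample t statistic is the mean of the within-participant differences divided by its standard error, with n−1 degrees of freedom (75 for the 76 participants with 2 visits). A minimal Python sketch follows. The direction of subtraction (visit 1 minus visit 2) is an assumption chosen so that higher scores at visit 2 yield a negative t, and the data are invented.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(visit1, visit2):
    """Return (t statistic, degrees of freedom) for paired observations.

    Differences are taken as visit1 - visit2 (an assumed convention), so
    higher values at visit 2 produce a negative t.
    """
    diffs = [a - b for a, b in zip(visit1, visit2)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Invented scores for 4 participants, for illustration only
t_stat, df = paired_t([1, 2, 3, 4], [2, 4, 5, 5])
```

The P value would then be read from the t distribution with the returned degrees of freedom; that step is omitted here because the standard library does not provide the t distribution.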

Principal Findings

Our study aimed to evaluate the effectiveness and reliability of the BrainFx SCREEN in detecting MCI in primary care settings. The research took place during the COVID-19 pandemic, which influenced the study’s execution and timeline. Despite these challenges, the findings offer valuable insights into cognitive impairment screening.

Brief MCI screening tools help time-strapped primary care physicians determine whether referral for a definitive battery of more time-consuming and expensive tests is warranted. These tools must optimize and balance the need for time efficiency while also being psychometrically valid and easily administered [ 82 ]. The importance of brevity is determined by a number of factors, including the clinical setting. Screens that can be completed in approximately ≤5 minutes [ 13 ] are recommended for faster-paced clinical settings (eg, emergency rooms and preoperative screens), whereas those that can be completed in 5 to 10 minutes or less are better suited to primary care settings [ 82 - 84 ]. Identifying affordable, psychometrically tested screening tests for MCI that integrate into clinical workflows and are easy to consistently administer and complete may help with the following:

  • Initiating appropriate diagnostic tests for signs and symptoms at an earlier stage
  • Normalizing and destigmatizing cognitive testing for older adults
  • Expediting referrals
  • Allowing for timely access to programs and services that can support aging in place or delay institutionalization
  • Reducing risk
  • Improving the psychosocial well-being of patients and their care partners by increasing access to information and resources that aid with future planning and decision-making [ 85 , 86 ]

Various cognitive tests are commonly used for detecting MCI. These include the Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease, Sunderland Clock Drawing Test, Informant Questionnaire on Cognitive Decline in the Elderly, Memory Alternation Test, MMSE, MoCA 8.1, and Qmci [ 61 , 87 ]. The Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease, MoCA 8.1, Qmci, and Memory Alternation Test are reported to have similar diagnostic accuracy [ 61 , 88 ]. The HCPs participating in this study reported using the MoCA 8.1 as their primary screening tool for MCI along with other assessments such as the MMSE and Trail Making Test parts A and B.

Recent research highlights the growing use of digital tools [ 51 , 89 , 90 ], mobile technology [ 91 , 92 ], virtual reality [ 93 , 94 ], and artificial intelligence [ 95 ] to improve early identification of MCI. Demeyere et al [ 51 ] developed the tablet-based, 10-item Oxford Cognitive Screen–Plus to detect slight changes in cognitive impairment across 5 domains of cognition (memory, attention, number, praxis, and language), which has been validated among neurologically healthy older adults. Statsenko et al [ 96 ] have explored improvement of the predictive capabilities of tests using artificial intelligence. Similarly, there is an emerging focus on the use of machine learning techniques to detect dementia leveraging routinely collected clinical data [ 97 , 98 ]. This progression signifies a shift toward more technologically advanced, efficient, and potentially more accurate diagnostic approaches in the detection of MCI.

Whatever the modality, screening tools should be quick to administer, demonstrate consistent results over time and between different evaluators, cover all major cognitive areas, and be straightforward to both administer and interpret [ 99 ]. However, highly sensitive tests such as those suggested for screening carry a significant risk of false-positive diagnoses [ 15 ]. Given the high potential for harm of false positives, it is important to validate the psychometric properties of screening tests across different populations and understand how factors such as age and education can influence the results [ 99 ].

Our study did not assess the face validity of the SCREEN, but participating occupational therapists were comfortable with the test regimen. Nonetheless, the research team noted the absence of verbal fluency and memory tests in the SCREEN, both of which McDonnell et al [ 100 ] identified as being more sensitive to the more commonly seen amnesic MCI. Two of the most sensitive tools for MCI screening, the MoCA 8.1 [ 76 ] and Qmci [ 25 ], assess memory, language, and verbal skills, and tests of verbal fluency and logical memory have been shown to be particularly sensitive to early cognitive changes [ 77 , 78 ].

The constructs included in the SCREEN ( Table 1 ) were selected based on a single non–peer-reviewed study [ 58 ] using BrainFx 360 and traumatic brain injury data (N=188) that identified the constructs as predictive of brain injury. The absence of tasks that measure verbal fluency or logical memory in the SCREEN appears to weaken claims of construct validity. The principal component analysis of the SCREEN assessment identified 2 components accounting for 52.12% of the total variance. The first component was strongly associated with abstract reasoning, constructive ability, and divided attention, whereas the second was primarily influenced by visual-spatial abilities. This indicates that constructs related to perception, attention, and memory are central to the SCREEN scores.

The SCREEN’s binary outcome (healthy or unhealthy) created by the research team was based on comparisons with the Qmci. However, the method of identifying areas of challenge in the SCREEN by comparing the individual’s mean score on each of the 7 tasks with the mean scores of a global or filtered cohort in the LBB introduces potential biases or errors. These could arise from a surge in additions to the LBB from patients with specific characteristics, self-selection of participants, poorly trained SCREEN administrators, inclusion of nonstandard test results, underuse of appropriate filters, and underreporting of clinical conditions or factors such as socioeconomic status that impact performance in standardized cognitive tests.

The proprietary method of analyzing and reporting SCREEN results complicates traditional sensitivity and specificity measurement. Our testing indicated a sensitivity of 63.25% and specificity of 74.07% for identifying healthy (those without MCI) and unhealthy (those with MCI) individuals. The SCREEN’s Cronbach α of .63, slightly below the threshold for clinical instruments, together with reliability scores that were lower than the ideal standards, suggests a higher-than-acceptable level of random measurement error in its constructs. The lower reliability may also stem from an inadequate sample size or a limited number of scale items.
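For readers less familiar with these measures, the following minimal Python sketch shows how sensitivity and specificity are typically derived from a 2×2 comparison with a reference standard. The function name and the counts are hypothetical and chosen for illustration only; they are not the study's data, and the sketch does not reproduce the SCREEN's proprietary analysis.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 counts.

    tp/fn: reference-positive cases flagged / missed by the index test.
    tn/fp: reference-negative cases cleared / wrongly flagged by the index test.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = tp / (tp + fn)  # share of reference-positive cases detected
    specificity = tn / (tn + fp)  # share of reference-negative cases cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (not the study's data).
sens, spec = sensitivity_specificity(tp=30, fn=10, tn=45, fp=15)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Both values depend on the reference standard used, which matters when that standard is itself imperfect.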

The SCREEN’s results are less favorable compared to those of other digital MCI screening tools that similarly enable evaluation of specific cognitive domains but also provide validated, norm-referenced cutoff scores and methods for cumulative scoring in clinical settings (Oxford Cognitive Screen–Plus) [ 51 ] or of validated MCI screening tools used in primary care (eg, MoCA 8.1, Qmci, and MMSE) [ 51 , 87 ]. The SCREEN’s unique scoring algorithm and the dynamic denominator in data analysis necessitate caution in comparing these results to those of other tools with fixed scoring algorithms and known sensitivities [ 101 , 102 ]. We found the SCREEN to have lower-than-expected internal reliability, suggesting significant random measurement error. Test-retest reliability was weak for the healthy or unhealthy outcome but stronger for overall activity scores between tests. The variability in identifying areas of challenge could relate to technological difficulties or variability from comparisons with a growing database of test results.
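As a reminder of what internal reliability measures, the sketch below computes Cronbach α with the standard formula α = k/(k−1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The function name and the scores are hypothetical; this illustrates the statistic in general and is not the analysis code used in this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent needs a score for every item")
    item_variances = [variance(col) for col in zip(*rows)]  # per-item sample variance
    total_variance = variance([sum(row) for row in rows])   # variance of summed scores
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 5 respondents x 3 items (illustration only).
scores = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(f"alpha = {cronbach_alpha(scores):.2f}")
```

Because α rises with the number of items and with the correlation between items, a scale with few tasks can yield a low value even when the individual tasks perform reasonably.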

Potential reasons for older adults’ poorer scores on timed tests include the impact of sensorimotor decline on touch screen sensation and reaction time [ 38 , 103 ], anxiety related to taking a computer-enabled test [ 104 - 106 ], or the anticipated consequences of a negative outcome [ 107 ]. However, these effects were unlikely to have influenced the results of this study. Practice effects were observed [ 29 , 108 ], but the SCREEN’s novelty suggests that familiarity is not gained through prior preparation or word of mouth as this sample was self-selected and not randomized. Future research might also explore the impact of digital literacy and cultural differences in the interpretation of software constructs or icons on MCI screening in a randomized, older adult sample.

Limitations

This study had methodological limitations that warrant attention. The small sample size and the demographic distribution of the 147 participants aged ≥55 years, with most (96/147, 65.3%) being female and well educated, limit the generalizability of the findings to different populations. The study’s design, aiming to explore the sensitivity of the SCREEN for early detection of MCI, necessitated the exclusion of individuals with a previous diagnosis of MCI or dementia. This exclusion criterion might have impacted the study’s ability to thoroughly assess the SCREEN’s effectiveness in a more varied clinical context. The requirement for participants to read and comprehend English introduced another limitation to our study. This criterion potentially limited the SCREEN tool’s applicability across diverse linguistic backgrounds as individuals with language-based impairments or those not proficient in English may face challenges in completing the assessment [ 51 ]. Such limitations could impact the generalizability of our findings to non–English-speaking populations or to those with language impairments, underscoring the need for further research to evaluate the SCREEN tool’s effectiveness in broader clinical and linguistic contexts.

Financial constraints played a role in limiting the study’s scope. Due to funding limitations, it was not possible to include specialist assessments and a battery of neuropsychiatric tests generally considered the gold standard to confirm or rule out an MCI diagnosis. Therefore, the study relied on differential verification through 2 imperfect reference standards: a comparison with the Qmci (the tool with the highest published sensitivity to MCI in 2019, when the study was designed) and the clinical judgment of the administering HCP, particularly in decisions regarding referrals for further clinical assessment. Furthermore, while an economic feasibility assessment was considered, the research team determined that it should follow, not precede, an evaluation of the SCREEN’s validity and reliability.

The proprietary nature of the algorithm used for scoring the SCREEN posed another challenge. Without access to this algorithm, the research team had to use a novel comparative statistical approach, coding patient results into a binary variable: healthy (SCREEN=no areas of challenge OR Qmci≥67 out of 100) or unhealthy (SCREEN=one or more areas of challenge OR Qmci<67 out of 100). This may have introduced a higher level of error into our statistical analysis. Furthermore, the process for determining areas of challenge on the SCREEN involves comparing a participant’s result to the existing SCREEN results in the LBB at the time of testing. By the end of this study, the LBB contained 632 SCREEN results for adults aged ≥55 years, with this study contributing 258 of those results. The remaining 366 original SCREEN results, 64% of which were completed by individuals who self-identified as having a preexisting diagnosis or conditions associated with cognitive impairment (eg, traumatic brain injury, concussion, or stroke), could have led to an overestimation of the means and SDs of the study participants’ results at the outset of the study.
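The binary coding described above can be sketched as follows. The Qmci cutoff of 67 out of 100 and the "no areas of challenge" rule come from the text; the function and variable names are hypothetical, and the example records are invented for illustration rather than drawn from the study data.

```python
QMCI_CUTOFF = 67  # from the text: Qmci >= 67 out of 100 is coded as healthy


def code_screen(areas_of_challenge):
    """Code a SCREEN result: no areas of challenge -> healthy, one or more -> unhealthy."""
    if areas_of_challenge < 0:
        raise ValueError("areas_of_challenge cannot be negative")
    return "healthy" if areas_of_challenge == 0 else "unhealthy"


def code_qmci(score):
    """Code a Qmci total score (0-100) against the cutoff."""
    if not 0 <= score <= 100:
        raise ValueError("Qmci score must be between 0 and 100")
    return "healthy" if score >= QMCI_CUTOFF else "unhealthy"


# Hypothetical participants: (SCREEN areas of challenge, Qmci score).
participants = [(0, 72), (2, 70), (1, 60), (0, 66)]
for areas, qmci in participants:
    screen_label, qmci_label = code_screen(areas), code_qmci(qmci)
    agreement = "agree" if screen_label == qmci_label else "disagree"
    print(f"SCREEN={screen_label:9s} Qmci={qmci_label:9s} -> {agreement}")
```

Collapsing both instruments to a single binary label makes them comparable but discards information about which task or domain drove the result.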

Unlike other cognitive screening tools, the SCREEN allows for filtering of results to compare different patient cohorts in the LBB using criteria such as age and education. However, at this stage of the LBB’s development, using such filters can significantly reduce the reliability of the results due to a smaller comparator population (ie, the denominator used to calculate the mean and SD). This, in turn, affects the significance of the results. Moreover, the constantly changing LBB data set makes it challenging to meaningfully compare an individual’s results over time as the evolving denominator affects the accuracy and relevance of these comparisons. Finally, the significant improvement in SCREEN scores between the first and second visits suggests the presence of practice effects, which could have influenced the reliability and validity of the findings.
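The effect of a smaller comparator population can be illustrated with the standard error of the mean, SD/√n. In this sketch, 632 is the number of LBB results reported above; the SD of 10 and the filtered cohort size of 40 are assumptions chosen only to show the direction and rough size of the effect, and the function name is hypothetical.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a cohort mean: SD divided by the square root of n."""
    if n < 1:
        raise ValueError("cohort must contain at least one result")
    return sd / math.sqrt(n)


ASSUMED_SD = 10.0  # hypothetical spread of task scores

full_cohort = standard_error_of_mean(ASSUMED_SD, n=632)  # all LBB results for adults aged >=55 years
filtered = standard_error_of_mean(ASSUMED_SD, n=40)      # hypothetical age- and education-filtered subset

print(f"SE with n=632: {full_cohort:.2f}")
print(f"SE with n=40:  {filtered:.2f}")
print(f"ratio: {filtered / full_cohort:.1f}x less precise")
```

Under these assumptions, the filtered cohort mean is several times less precise than the unfiltered one, so an individual's score is compared against a noisier benchmark.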

Conclusions

In a primary care setting, where MCI screening tools are essential and recommended for those with concerns [ 85 ], certain criteria are paramount: time efficiency, ease of administration, and robust psychometric properties [ 82 ]. Our analysis of the BrainFx SCREEN suggests that, despite its innovative approach and digital delivery, it currently falls short in meeting these criteria. The SCREEN’s comparatively longer administration time and lower-than-expected reliability scores suggest that it may not be the most effective tool for MCI screening of older adults in a primary care setting at this time.

It is important to note that, in the wake of the COVID-19 pandemic, and with an aging population living in community settings by design or necessity, there is growing interest in digital solutions, including web-based applications and platforms to both collect digital biomarkers and deliver cognitive training and other interventions [ 109 , 110 ]. However, new normative standards are required when adapting cognitive tests to digital formats [ 92 ] as the change in medium can significantly impact test performance and results interpretation. Therefore, we recommend caution when interpreting our study results and encourage continued research and refinement of tools such as the SCREEN. This ongoing process will help ensure that current and future MCI screening tools are effective, reliable, and relevant in meeting the needs of our aging population, particularly in primary care settings where early detection and intervention are key.

Acknowledgments

The researchers gratefully acknowledge the Ontario Centres of Excellence Health Technologies Fund for their financial support of this study; the executive directors and clinical leads in each of the Family Health Team study locations; the participants and their friends and families who took part in the study; and research assistants Sharmin Sharker, Kelly Zhu, and Muhammad Umair for their contributions to data management and statistical analysis.

Data Availability

The data sets generated and analyzed during this study are available from the corresponding author on reasonable request.

Authors' Contributions

JM contributed to the conceptualization, methodology, validation, formal analysis, data curation, writing—original draft, writing—review and editing, visualization, supervision, and funding acquisition. AML contributed to the conceptualization, methodology, validation, investigation, formal analysis, data curation, writing—original draft, writing—review and editing, visualization, and project administration. WP contributed to the validation, formal analysis, data curation, writing—original draft, writing—review and editing, and visualization. Finally, PH contributed to the conceptualization, methodology, writing—review and editing, supervision, and funding acquisition.

Conflicts of Interest

None declared.

  • Casagrande M, Marselli G, Agostini F, Forte G, Favieri F, Guarino A. The complex burden of determining prevalence rates of mild cognitive impairment: a systematic review. Front Psychiatry. 2022;13:960648. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Petersen RC, Caracciolo B, Brayne C, Gauthier S, Jelic V, Fratiglioni L. Mild cognitive impairment: a concept in evolution. J Intern Med. Mar 2014;275(3):214-228. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Knopman DS, Petersen RC. Mild cognitive impairment and mild dementia: a clinical perspective. Mayo Clin Proc. Oct 2014;89(10):1452-1459. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Anderson ND. State of the science on mild cognitive impairment (MCI). CNS Spectr. Feb 2019;24(1):78-87. [ CrossRef ] [ Medline ]
  • Tangalos EG, Petersen RC. Mild cognitive impairment in geriatrics. Clin Geriatr Med. Nov 2018;34(4):563-589. [ CrossRef ] [ Medline ]
  • Ng R, Maxwell C, Yates E, Nylen K, Antflick J, Jette N, et al. Brain disorders in Ontario: prevalence, incidence and costs from health administrative data. Institute for Clinical Evaluative Sciences. 2015. URL: https://www.ices.on.ca/publications/research-reports/brain-disorders-in-ontario-prevalence-incidence-and-costs-from-health-administrative-data/ [accessed 2024-04-01]
  • Centers for Disease Control and Prevention (CDC). Self-reported increased confusion or memory loss and associated functional difficulties among adults aged ≥ 60 years - 21 states, 2011. MMWR Morb Mortal Wkly Rep. May 10, 2013;62(18):347-350. [ FREE Full text ] [ Medline ]
  • Petersen RC, Lopez O, Armstrong MJ, Getchius TS, Ganguli M, Gloss D, et al. Practice guideline update summary: mild cognitive impairment: report of the guideline development, dissemination, and implementation subcommittee of the American Academy of Neurology. Neurology. Jan 16, 2018;90(3):126-135. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Song WX, Wu WW, Zhao YY, Xu HL, Chen GC, Jin SY, et al. Evidence from a meta-analysis and systematic review reveals the global prevalence of mild cognitive impairment. Front Aging Neurosci. 2023;15:1227112. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chen Y, Denny KG, Harvey D, Farias ST, Mungas D, DeCarli C, et al. Progression from normal cognition to mild cognitive impairment in a diverse clinic-based and community-based elderly cohort. Alzheimers Dement. Apr 2017;13(4):399-405. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Langa KM, Levine DA. The diagnosis and management of mild cognitive impairment: a clinical review. JAMA. Dec 17, 2014;312(23):2551-2561. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhang Y, Natale G, Clouston S. Incidence of mild cognitive impairment, conversion to probable dementia, and mortality. Am J Alzheimers Dis Other Demen. 2021;36:15333175211012235. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Prince M, Bryce R, Ferri CP. World Alzheimer report 2011: the benefits of early diagnosis and intervention. Alzheimer’s Disease International. 2011. URL: https://www.alzint.org/u/WorldAlzheimerReport2011.pdf [accessed 2024-04-01]
  • Patnode CD, Perdue LA, Rossom RC, Rushkin MC, Redmond N, Thomas RG, et al. Screening for cognitive impairment in older adults: updated evidence report and systematic review for the US preventive services task force. JAMA. Feb 25, 2020;323(8):764-785. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Canadian Task Force on Preventive Health Care, Pottie K, Rahal R, Jaramillo A, Birtwhistle R, Thombs BD, et al. Recommendations on screening for cognitive impairment in older adults. CMAJ. Jan 05, 2016;188(1):37-46. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Tahami Monfared AA, Phan NT, Pearson I, Mauskopf J, Cho M, Zhang Q, et al. A systematic review of clinical practice guidelines for Alzheimer's disease and strategies for future advancements. Neurol Ther. Aug 2023;12(4):1257-1284. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mattke S, Jun H, Chen E, Liu Y, Becker A, Wallick C. Expected and diagnosed rates of mild cognitive impairment and dementia in the U.S. medicare population: observational analysis. Alzheimers Res Ther. Jul 22, 2023;15(1):128. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Manly JJ, Tang MX, Schupf N, Stern Y, Vonsattel JP, Mayeux R. Frequency and course of mild cognitive impairment in a multiethnic community. Ann Neurol. Apr 2008;63(4):494-506. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Black CM, Ambegaonkar BM, Pike J, Jones E, Husbands J, Khandker RK. The diagnostic pathway from cognitive impairment to dementia in Japan: quantification using real-world data. Alzheimer Dis Assoc Disord. 2019;33(4):346-353. [ CrossRef ] [ Medline ]
  • Ritchie CW, Black CM, Khandker RK, Wood R, Jones E, Hu X, et al. Quantifying the diagnostic pathway for patients with cognitive impairment: real-world data from seven European and north American countries. J Alzheimers Dis. 2018;62(1):457-466. [ CrossRef ] [ Medline ]
  • Folstein MF, Folstein SE, McHugh PR. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. Nov 1975;12(3):189-198. [ CrossRef ] [ Medline ]
  • Tsoi KK, Chan JY, Hirai HW, Wong SY, Kwok TC. Cognitive tests to detect dementia: a systematic review and meta-analysis. JAMA Intern Med. Sep 2015;175(9):1450-1458. [ CrossRef ] [ Medline ]
  • Lopez MN, Charter RA, Mostafavi B, Nibut LP, Smith WE. Psychometric properties of the Folstein mini-mental state examination. Assessment. Jun 2005;12(2):137-144. [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Phillips NA, Bédirian V, Charbonneau S, Whitehead V, Collin I, et al. The Montreal cognitive assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc. Apr 2005;53(4):695-699. [ CrossRef ] [ Medline ]
  • O'Caoimh R, Timmons S, Molloy DW. Screening for mild cognitive impairment: comparison of "MCI specific" screening instruments. J Alzheimers Dis. 2016;51(2):619-629. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Trzepacz PT, Hochstetler H, Wang S, Walker B, Saykin AJ, Alzheimer’s Disease Neuroimaging Initiative. Relationship between the Montreal cognitive assessment and mini-mental state examination for assessment of mild cognitive impairment in older adults. BMC Geriatr. Sep 07, 2015;15:107. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Phillips N, Chertkow H. Normative data for the Montreal Cognitive Assessment (MoCA) in a population-based sample. Neurology. Mar 06, 2012;78(10):765-766. [ CrossRef ] [ Medline ]
  • Monroe T, Carter M. Using the Folstein Mini Mental State Exam (MMSE) to explore methodological issues in cognitive aging research. Eur J Ageing. Sep 2012;9(3):265-274. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Damian AM, Jacobson SA, Hentz JG, Belden CM, Shill HA, Sabbagh MN, et al. The Montreal cognitive assessment and the mini-mental state examination as screening instruments for cognitive impairment: item analyses and threshold scores. Dement Geriatr Cogn Disord. 2011;31(2):126-131. [ CrossRef ] [ Medline ]
  • Kaufer DI, Williams CS, Braaten AJ, Gill K, Zimmerman S, Sloane PD. Cognitive screening for dementia and mild cognitive impairment in assisted living: comparison of 3 tests. J Am Med Dir Assoc. Oct 2008;9(8):586-593. [ CrossRef ] [ Medline ]
  • Gagnon C, Saillant K, Olmand M, Gayda M, Nigam A, Bouabdallaoui N, et al. Performances on the Montreal cognitive assessment along the cardiovascular disease continuum. Arch Clin Neuropsychol. Jan 17, 2022;37(1):117-124. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cooley SA, Heaps JM, Bolzenius JD, Salminen LE, Baker LM, Scott SE, et al. Longitudinal change in performance on the Montreal cognitive assessment in older adults. Clin Neuropsychol. 2015;29(6):824-835. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • O'Caoimh R, Gao Y, McGlade C, Healy L, Gallagher P, Timmons S, et al. Comparison of the quick mild cognitive impairment (Qmci) screen and the SMMSE in screening for mild cognitive impairment. Age Ageing. Sep 2012;41(5):624-629. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • O'Caoimh R, Molloy DW. Comparing the diagnostic accuracy of two cognitive screening instruments in different dementia subtypes and clinical depression. Diagnostics (Basel). Aug 08, 2019;9(3):93. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Clarnette R, O'Caoimh R, Antony DN, Svendrovski A, Molloy DW. Comparison of the Quick Mild Cognitive Impairment (Qmci) screen to the Montreal Cognitive Assessment (MoCA) in an Australian geriatrics clinic. Int J Geriatr Psychiatry. Jun 2017;32(6):643-649. [ CrossRef ] [ Medline ]
  • Glynn K, Coen R, Lawlor BA. Is the Quick Mild Cognitive Impairment screen (QMCI) more accurate at detecting mild cognitive impairment than existing short cognitive screening tests? A systematic review of the current literature. Int J Geriatr Psychiatry. Dec 2019;34(12):1739-1746. [ CrossRef ] [ Medline ]
  • Lee MT, Chang WY, Jang Y. Psychometric and diagnostic properties of the Taiwan version of the quick mild cognitive impairment screen. PLoS One. 2018;13(12):e0207851. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wallace SE, Donoso Brown EV, Simpson RC, D'Acunto K, Kranjec A, Rodgers M, et al. A comparison of electronic and paper versions of the Montreal cognitive assessment. Alzheimer Dis Assoc Disord. 2019;33(3):272-278. [ CrossRef ] [ Medline ]
  • Gagnon C, Olmand M, Dupuy EG, Besnier F, Vincent T, Grégoire CA, et al. Videoconference version of the Montreal cognitive assessment: normative data for Quebec-French people aged 50 years and older. Aging Clin Exp Res. Jul 2022;34(7):1627-1633. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Friemel TN. The digital divide has grown old: determinants of a digital divide among seniors. New Media & Society. Jun 12, 2014;18(2):313-331. [ CrossRef ]
  • Ventola CL. Mobile devices and apps for health care professionals: uses and benefits. P T. May 2014;39(5):356-364. [ FREE Full text ] [ Medline ]
  • Searles C, Farnsworth JL, Jubenville C, Kang M, Ragan B. Test–retest reliability of the BrainFx 360® performance assessment. Athl Train Sports Health Care. Jul 2019;11(4):183-191. [ CrossRef ]
  • Jones C, Miguel-Cruz A, Brémault-Phillips S. Technology acceptance and usability of the BrainFx SCREEN in Canadian military members and veterans with posttraumatic stress disorder and mild traumatic brain injury: mixed methods UTAUT study. JMIR Rehabil Assist Technol. May 13, 2021;8(2):e26078. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McMurray J, Levy A, Holyoke P. Psychometric evaluation and workflow integration study of a tablet-based tool to detect mild cognitive impairment in older adults: protocol for a mixed methods study. JMIR Res Protoc. May 21, 2021;10(5):e25520. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wilansky P, Eklund JM, Milner T, Kreindler D, Cheung A, Kovacs T, et al. Cognitive behavior therapy for anxious and depressed youth: improving homework adherence through mobile technology. JMIR Res Protoc. Nov 10, 2016;5(4):e209. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ammenwerth E, Iller C, Mahler C. IT-adoption and the interaction of task, technology and individuals: a fit framework and a case study. BMC Med Inform Decis Mak. Jan 09, 2006;6:3. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Goodhue DL, Thompson RL. Task-technology fit and individual performance. MIS Q. Jun 1995;19(2):213-236. [ CrossRef ]
  • Beuscher L, Grando VT. Challenges in conducting qualitative research with individuals with dementia. Res Gerontol Nurs. Jan 2009;2(1):6-11. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Howe E. Informed consent, participation in research, and the Alzheimer's patient. Innov Clin Neurosci. May 2012;9(5-6):47-51. [ FREE Full text ] [ Medline ]
  • Thorogood A, Mäki-Petäjä-Leinonen A, Brodaty H, Dalpé G, Gastmans C, Gauthier S, et al. Global Alliance for Genomics and Health, Ageing and Dementia Task Team. Consent recommendations for research and international data sharing involving persons with dementia. Alzheimers Dement. Oct 2018;14(10):1334-1343. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Demeyere N, Haupt M, Webb SS, Strobel L, Milosevich ET, Moore MJ, et al. Introducing the tablet-based Oxford Cognitive Screen-Plus (OCS-Plus) as an assessment tool for subtle cognitive impairments. Sci Rep. Apr 12, 2021;11(1):8000. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Patel BB. Validation of Montreal cognitive assessment, MoCA, alternate French versions. Can J Neurol Sci. Sep 2016;43(5):665-671. [ CrossRef ] [ Medline ]
  • Mueller AE, Segal DL, Gavett B, Marty MA, Yochim B, June A, et al. Geriatric anxiety scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10). Int Psychogeriatr. Jul 2015;27(7):1099-1111. [ CrossRef ] [ Medline ]
  • Segal DL, June A, Payne M, Coolidge FL, Yochim B. Development and initial validation of a self-report assessment tool for anxiety among older adults: the Geriatric Anxiety Scale. J Anxiety Disord. Oct 2010;24(7):709-714. [ CrossRef ] [ Medline ]
  • Balsamo M, Cataldi F, Carlucci L, Fairfield B. Assessment of anxiety in older adults: a review of self-report measures. Clin Interv Aging. 2018;13:573-593. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gatti A, Gottschling J, Brugnera A, Adorni R, Zarbo C, Compare A, et al. An investigation of the psychometric properties of the Geriatric Anxiety Scale (GAS) in an Italian sample of community-dwelling older adults. Aging Ment Health. Sep 2018;22(9):1170-1178. [ CrossRef ] [ Medline ]
  • Yochim BP, Mueller AE, June A, Segal DL. Psychometric properties of the Geriatric Anxiety Scale: comparison to the beck anxiety inventory and geriatric anxiety inventory. Clin Gerontol. Dec 06, 2010;34(1):21-33. [ CrossRef ]
  • Recent concussion (< 6 months ago) analysis result. Daisy Intelligence. 2016. URL: https://www.daisyintelligence.com/retail-solutions/ [accessed 2024-04-01]
  • Molloy DW, O'Caoimh R. The Quick Guide: Scoring and Administration Instructions for The Quick Mild Cognitive Impairment (Qmci) Screen. Waterford, Ireland. Newgrange Press; 2017.
  • O'Caoimh R, Gao Y, Svendovski A, Gallagher P, Eustace J, Molloy DW. Comparing approaches to optimize cut-off scores for short cognitive screening instruments in mild cognitive impairment and dementia. J Alzheimers Dis. 2017;57(1):123-133. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Breton A, Casey D, Arnaoutoglou NA. Cognitive tests for the detection of mild cognitive impairment (MCI), the prodromal stage of dementia: meta-analysis of diagnostic accuracy studies. Int J Geriatr Psychiatry. Feb 2019;34(2):233-242. [ CrossRef ] [ Medline ]
  • Umemneku Chikere CM, Wilson K, Graziadio S, Vale L, Allen AJ. Diagnostic test evaluation methodology: a systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard - An update. PLoS One. 2019;14(10):e0223832. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Espinosa A, Alegret M, Boada M, Vinyes G, Valero S, Martínez-Lage P, et al. Ecological assessment of executive functions in mild cognitive impairment and mild Alzheimer's disease. J Int Neuropsychol Soc. Sep 2009;15(5):751-757. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hawkins DM, Garrett JA, Stephenson B. Some issues in resolution of diagnostic tests using an imperfect gold standard. Stat Med. Jul 15, 2001;20(13):1987-2001. [ CrossRef ] [ Medline ]
  • Hadgu A, Dendukuri N, Hilden J. Evaluation of nucleic acid amplification tests in the absence of a perfect gold-standard test: a review of the statistical and epidemiologic issues. Epidemiology. Sep 2005;16(5):604-612. [ CrossRef ] [ Medline ]
  • Marx RG, Menezes A, Horovitz L, Jones EC, Warren RF. A comparison of two time intervals for test-retest reliability of health status instruments. J Clin Epidemiol. Aug 2003;56(8):730-735. [ CrossRef ] [ Medline ]
  • Paiva CE, Barroso EM, Carneseca EC, de Pádua Souza C, Dos Santos FT, Mendoza López RV, et al. A critical analysis of test-retest reliability in instrument validation studies of cancer patients under palliative care: a systematic review. BMC Med Res Methodol. Jan 21, 2014;14:8. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Streiner DL, Kottner J. Recommendations for reporting the results of studies of instrument and scale development and testing. J Adv Nurs. Sep 2014;70(9):1970-1979. [ CrossRef ] [ Medline ]
  • Streiner DL. A checklist for evaluating the usefulness of rating scales. Can J Psychiatry. Mar 1993;38(2):140-148. [ CrossRef ] [ Medline ]
  • Peyre H, Leplège A, Coste J. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey. Qual Life Res. Mar 2011;20(2):287-300. [ CrossRef ] [ Medline ]
  • Nevado-Holgado AJ, Kim CH, Winchester L, Gallacher J, Lovestone S. Commonly prescribed drugs associate with cognitive function: a cross-sectional study in UK Biobank. BMJ Open. Nov 30, 2016;6(11):e012177. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Moore AR, O'Keeffe ST. Drug-induced cognitive impairment in the elderly. Drugs Aging. Jul 1999;15(1):15-28. [ CrossRef ] [ Medline ]
  • Rogers J, Wiese BS, Rabheru K. The older brain on drugs: substances that may cause cognitive impairment. Geriatr Aging. 2008;11(5):284-289. [ FREE Full text ]
  • Marvanova M. Drug-induced cognitive impairment: effect of cardiovascular agents. Ment Health Clin. Jul 2016;6(4):201-206. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Espeland MA, Rapp SR, Manson JE, Goveas JS, Shumaker SA, Hayden KM, et al. WHIMSY and WHIMS-ECHO Study Groups. Long-term effects on cognitive trajectories of postmenopausal hormone therapy in two age groups. J Gerontol A Biol Sci Med Sci. Jun 01, 2017;72(6):838-845. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Luis CA, Keegan AP, Mullan M. Cross validation of the Montreal cognitive assessment in community dwelling older adults residing in the Southeastern US. Int J Geriatr Psychiatry. Feb 2009;24(2):197-201. [ CrossRef ] [ Medline ]
  • Cunje A, Molloy DW, Standish TI, Lewis DL. Alternate forms of logical memory and verbal fluency tasks for repeated testing in early cognitive changes. Int Psychogeriatr. Feb 2007;19(1):65-75. [ CrossRef ] [ Medline ]
  • Molloy DW, Standish TI, Lewis DL. Screening for mild cognitive impairment: comparing the SMMSE and the ABCS. Can J Psychiatry. Jan 2005;50(1):52-58. [ CrossRef ] [ Medline ]
  • Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th edition. Oxford, UK. Oxford University Press; 2008.
  • Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. Jun 2016;15(2):155-163. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276-282. [ FREE Full text ] [ Medline ]
  • Zhuang L, Yang Y, Gao J. Cognitive assessment tools for mild cognitive impairment screening. J Neurol. May 2021;268(5):1615-1622. [ CrossRef ] [ Medline ]
  • Zhang J, Wang L, Deng X, Fei G, Jin L, Pan X, et al. Five-minute cognitive test as a new quick screening of cognitive impairment in the elderly. Aging Dis. Dec 2019;10(6):1258-1269. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Feldman HH, Jacova C, Robillard A, Garcia A, Chow T, Borrie M, et al. Diagnosis and treatment of dementia: 2. Diagnosis. CMAJ. Mar 25, 2008;178(7):825-836. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sabbagh MN, Boada M, Borson S, Chilukuri M, Dubois B, Ingram J, et al. Early detection of mild cognitive impairment (MCI) in primary care. J Prev Alzheimers Dis. 2020;7(3):165-170. [ CrossRef ] [ Medline ]
  • Milne A. Dementia screening and early diagnosis: the case for and against. Health Risk Soc. Mar 05, 2010;12(1):65-76. [ CrossRef ]
  • Screening tools to identify adults with cognitive impairment associated with dementia: diagnostic accuracy. Canadian Agency for Drugs and Technologies in Health. 2014. URL: https://www.cadth.ca/sites/default/files/pdf/htis/nov-2014/RB0752%20Cognitive%20Assessments%20for%20Dementia%20Final.pdf [accessed 2024-04-01]
  • Chehrehnegar N, Nejati V, Shati M, Rashedi V, Lotfi M, Adelirad F, et al. Early detection of cognitive disturbances in mild cognitive impairment: a systematic review of observational studies. Psychogeriatrics. Mar 2020;20(2):212-228. [ CrossRef ] [ Medline ]
  • Chan JY, Yau ST, Kwok TC, Tsoi KK. Diagnostic performance of digital cognitive tests for the identification of MCI and dementia: a systematic review. Ageing Res Rev. Dec 2021;72:101506. [ CrossRef ] [ Medline ]
  • Cubillos C, Rienzo A. Digital cognitive assessment tests for older adults: systematic literature review. JMIR Ment Health. Dec 08, 2023;10:e47487. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chen R, Foschini L, Kourtis L, Signorini A, Jankovic F, Pugh M, et al. Developing measures of cognitive impairment in the real world from consumer-grade multimodal sensor streams. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019. Presented at: KDD '19; August 4-8, 2019;2145; Anchorage, AK. URL: https://dl.acm.org/doi/10.1145/3292500.3330690 [ CrossRef ]
  • Koo BM, Vizer LM. Mobile technology for cognitive assessment of older adults: a scoping review. Innov Aging. Jan 2019;3(1):igy038. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zygouris S, Ntovas K, Giakoumis D, Votis K, Doumpoulakis S, Segkouli S, et al. A preliminary study on the feasibility of using a virtual reality cognitive training application for remote detection of mild cognitive impairment. J Alzheimers Dis. 2017;56(2):619-627. [ CrossRef ] [ Medline ]
  • Liu Q, Song H, Yan M, Ding Y, Wang Y, Chen L, et al. Virtual reality technology in the detection of mild cognitive impairment: a systematic review and meta-analysis. Ageing Res Rev. Jun 2023;87:101889. [ CrossRef ] [ Medline ]
  • Fayemiwo MA, Olowookere TA, Olaniyan OO, Ojewumi TO, Oyetade IS, Freeman S, et al. Immediate word recall in cognitive assessment can predict dementia using machine learning techniques. Alzheimers Res Ther. Jun 15, 2023;15(1):111. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Statsenko Y, Meribout S, Habuza T, Almansoori TM, van Gorkom KN, Gelovani JG, et al. Patterns of structure-function association in normal aging and in Alzheimer's disease: screening for mild cognitive impairment and dementia with ML regression and classification models. Front Aging Neurosci. 2022;14:943566. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Roebuck-Spencer TM, Glen T, Puente AE, Denney RL, Ruff RM, Hostetter G, et al. Cognitive screening tests versus comprehensive neuropsychological test batteries: a national academy of neuropsychology education paper†. Arch Clin Neuropsychol. Jun 01, 2017;32(4):491-498. [ CrossRef ] [ Medline ]
  • Jammeh EA, Carroll CB, Pearson SW, Escudero J, Anastasiou A, Zhao P, et al. Machine-learning based identification of undiagnosed dementia in primary care: a feasibility study. BJGP Open. Jul 2018;2(2):bjgpopen18X101589. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Riello M, Rusconi E, Treccani B. The role of brief global cognitive tests and neuropsychological expertise in the detection and differential diagnosis of dementia. Front Aging Neurosci. 2021;13:648310. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McDonnell M, Dill L, Panos S, Amano S, Brown W, Giurgius S, et al. Verbal fluency as a screening tool for mild cognitive impairment. Int Psychogeriatr. Sep 2020;32(9):1055-1062. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wojtowicz A, Larner AJ. Diagnostic test accuracy of cognitive screeners in older people. Prog Neurol Psychiatry. Mar 20, 2017;21(1):17-21. [ CrossRef ]
  • Larner AJ. Cognitive screening instruments for the diagnosis of mild cognitive impairment. Prog Neurol Psychiatry. Apr 07, 2016;20(2):21-26. [ CrossRef ]
  • Heintz BD, Keenan KG. Spiral tracing on a touchscreen is influenced by age, hand, implement, and friction. PLoS One. 2018;13(2):e0191309. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Laguna K, Babcock RL. Computer anxiety in young and older adults: implications for human-computer interactions in older populations. Comput Human Behav. Aug 1997;13(3):317-326. [ CrossRef ]
  • Wild KV, Mattek NC, Maxwell SA, Dodge HH, Jimison HB, Kaye JA. Computer-related self-efficacy and anxiety in older adults with and without mild cognitive impairment. Alzheimers Dement. Nov 2012;8(6):544-552. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wiechmann D, Ryan AM. Reactions to computerized testing in selection contexts. Int J Sel Assess. Jul 30, 2003;11(2-3):215-229. [ CrossRef ]
  • Gass CS, Curiel RE. Test anxiety in relation to measures of cognitive and intellectual functioning. Arch Clin Neuropsychol. Aug 2011;26(5):396-404. [ CrossRef ] [ Medline ]
  • Barbic D, Kim B, Salehmohamed Q, Kemplin K, Carpenter CR, Barbic SP. Diagnostic accuracy of the Ottawa 3DY and short blessed test to detect cognitive dysfunction in geriatric patients presenting to the emergency department. BMJ Open. Mar 16, 2018;8(3):e019652. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Owens AP, Ballard C, Beigi M, Kalafatis C, Brooker H, Lavelle G, et al. Implementing remote memory clinics to enhance clinical care during and after COVID-19. Front Psychiatry. 2020;11:579934. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Geddes MR, O'Connell ME, Fisk JD, Gauthier S, Camicioli R, Ismail Z, et al. Alzheimer Society of Canada Task Force on Dementia Care Best Practices for COVID‐19. Remote cognitive and behavioral assessment: report of the Alzheimer Society of Canada task force on dementia care best practices for COVID-19. Alzheimers Dement (Amst). 2020;12(1):e12111. [ FREE Full text ] [ CrossRef ] [ Medline ]

Edited by G Eysenbach, T de Azevedo Cardoso; submitted 29.01.24; peer-reviewed by J Gao, MJ Moore; comments to author 20.02.24; revised version received 05.03.24; accepted 19.03.24; published 19.04.24.

©Josephine McMurray, AnneMarie Levy, Wei Pang, Paul Holyoke. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 19.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Published on 19.4.2024 in Vol 8 (2024)

A Health Information Technology Protocol to Enhance Colorectal Cancer Screening

Research Letter

  • Adam Baus 1*, MA, MPH, PhD
  • Dannell D Boatman 2*, MS, EdD
  • Andrea Calkins 1*, MPH
  • Cecil Pollard 1*, MA
  • Mary Ellen Conn 2*, MS
  • Sujha Subramanian 3*, MA, PhD
  • Stephenie Kennedy-Rea 2*, MA, EdD

1 Department of Social and Behavioral Sciences, School of Public Health, West Virginia University, Morgantown, WV, United States

2 Cancer Prevention and Control, West Virginia University Cancer Institute, Morgantown, WV, United States

3 Implenomics, Dover, DE, United States

*all authors contributed equally

Corresponding Author:

Adam Baus, MA, MPH, PhD
Department of Social and Behavioral Sciences
School of Public Health
West Virginia University
64 Medical Center Drive
PO Box 9190
Morgantown, WV, 26506
United States
Phone: 1 304 293 1083
Fax: 1 304 293 6685
Email: [email protected]

This study addresses barriers to electronic health records–based colorectal cancer screening and follow-up in primary care through the development and implementation of a health information technology protocol.

Introduction

Cancer is a pressing global public health problem and the second leading cause of death in the United States, accounting for an estimated 1670 deaths daily [1]. Colorectal cancer (CRC) is the third most commonly diagnosed cancer, the second leading cause of cancer death worldwide [2], and the third most common cause of cancer-related deaths in the United States [3]. More effective use of health information technology (HIT), including electronic health records (EHRs), can aid in improving CRC screening and care [4]. Studies from as early as the 1990s have shown that EHRs and associated clinical decision support tools have promise in helping with patient care and population health needs [5]. However, barriers like clinician readiness [6] and clinical workflow integration [7] hinder EHRs’ full benefits. This study aims to address barriers to EHR-based CRC screening and follow-up through the development and implementation of a universally applicable EHR protocol tailored to identify and overcome practice workflow and EHR challenges.

Methods

This study used a mixed methods approach, involving quantitative and qualitative data collection techniques, conducted across 3 diverse health systems in West Virginia to develop and implement an EHR protocol for CRC screening and follow-up. These health systems were purposefully chosen to encompass diverse sizes, organizational structures, geographic locations, patient demographics, and EHR preferences, thereby supporting the generalizability of the study’s findings. These included a free and charitable clinic, a larger, urban, federally qualified health center, and a smaller, rural, federally qualified health center. Key stakeholders, including health care administrators, clinicians, and information technology personnel, were identified as potential participants. This study was conducted from April 2021 through April 2022. Implementation mapping methodology guided the assessment of current CRC screening practices and the development, implementation, and evaluation of the EHR protocol. Data collection tools were pilot tested in Health System A to assess their reliability, validity, and feasibility, then refined prior to full implementation in Health Systems B and C to ensure quality and effectiveness in data collection. Evaluation of the protocol’s acceptability, appropriateness, and feasibility was conducted using the Acceptability of Intervention Measure (AIM), Intervention Appropriateness Measure (IAM), and Feasibility of Intervention Measure (FIM). Technical issues during the study were resolved collaboratively by the research team and technical staff through troubleshooting, protocol adjustments, and ongoing support.

Ethical Considerations

This study received ethics approval from the West Virginia University Institutional Review Board (protocol number 2107363377).

Results

The development of the EHR protocol involved a collaborative process between the research team and key stakeholders from participating health systems. Initial assessments revealed common challenges in CRC screening and follow-up across the diverse settings, including issues related to data quality, workflow inefficiencies, and underutilization of EHR functionalities. Based on these findings, a draft protocol was formulated, emphasizing strategies to enhance EHR data quality and optimization specifically tailored to address the identified barriers. The protocol comprised three key components: (1) Quality Improvement Activities, guiding clinic staff through a Plan-Do-Study-Act cycle to identify and mitigate data entry errors; (2) EHR Optimization Factors, highlighting specific EHR features supporting CRC screening and follow-up when effectively used; and (3) Health Information Technology Assessment, facilitating structured discussions on EHR use roles, office workflows, knowledge, skills, abilities, challenges, and improvement opportunities.

The developed protocol was implemented in Health Systems B and C following its refinement based on feedback from the development site (Health System A). Implementation involved training sessions for clinic staff on protocol utilization and ongoing support from the research team. Eight staff members from the participating health systems completed the AIM, IAM, and FIM assessments, providing valuable insights into their perceptions of the protocol. The mean scores from AIM (mean 16.00, SD 4.24), IAM (mean 15.80, SD 4.54), and FIM (mean 16.80, SD 4.66) indicate favorable perceptions of protocol feasibility, acceptability, and appropriateness. Qualitative feedback from participants further supported the positive reception of the protocol, with respondents expressing satisfaction with its efficacy and intentions to integrate it into their clinical practices. All respondents indicated that they would use or would consider using the protocol within their clinics again. Open-ended responses included “very pleased with the protocol and leveraging EHR/staff/outreach” and “plan to now identify and track to completion of CRC testing.”
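
Summary statistics of the kind reported above could be derived from raw survey responses roughly as follows. This is a minimal sketch, not the authors' analysis code. It assumes each measure is scored by summing four items rated from 1 to 5 (a 4 to 20 range, which would be consistent with the reported means), and the ratings shown are hypothetical values invented purely for illustration. The function names `scale_score` and `summarize` are likewise illustrative.

```python
from statistics import mean, stdev


def scale_score(item_ratings):
    """Sum one respondent's item ratings into a total scale score.

    Assumes a 1-5 rating per item (an assumption, not stated in the study).
    """
    if any(r < 1 or r > 5 for r in item_ratings):
        raise ValueError("each rating must be between 1 and 5")
    return sum(item_ratings)


def summarize(responses):
    """Return (mean, sample SD) of total scores, rounded to 2 decimals."""
    totals = [scale_score(r) for r in responses]
    return round(mean(totals), 2), round(stdev(totals), 2)


# Hypothetical ratings (NOT study data): one row per respondent, four items each.
hypothetical_aim = [
    [5, 5, 5, 5],
    [4, 4, 4, 4],
    [3, 3, 3, 3],
    [4, 5, 4, 5],
]

aim_mean, aim_sd = summarize(hypothetical_aim)
print(f"AIM: mean {aim_mean:.2f}, SD {aim_sd:.2f}")
```

The same `summarize` call would apply unchanged to IAM and FIM responses, since under the stated assumption all three measures share the same scoring structure.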

Discussion

The results demonstrate the successful development and initial implementation of an EHR protocol aimed at enhancing CRC screening in primary care settings. The protocol’s favorable reception by clinic staff, as indicated by high scores on acceptability, appropriateness, and feasibility measures, suggests its potential effectiveness in addressing identified barriers. The diverse representation of health systems and EHR platforms involved in the study enhances the generalizability of findings. Limitations include the small sample size and the focus on a specific geographic region. Future research will assess the protocol’s performance across additional EHR systems and health care settings for enhanced scalability and further evaluate the protocol’s impact on CRC screening outcomes.

Acknowledgments

The authors acknowledge the funding and support from the Research Triangle Institute (grant 1-312-0216648-66244L).

Conflicts of Interest

None declared.

References

  1. Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J Clin. Jan 2022;72(1):7-33.
  2. Morgan E, Arnold M, Gini A, Lorenzoni V, Cabasag CJ, Laversanne M, et al. Global burden of colorectal cancer in 2020 and 2040: incidence and mortality estimates from GLOBOCAN. Gut. Feb 2023;72(2):338-344.
  3. Division of Cancer Prevention and Control. Colorectal cancer statistics. Centers for Disease Control and Prevention. 2023. URL: https://www.cdc.gov/cancer/colorectal/statistics/index.htm [accessed 2023-12-04]
  4. Baus A, Wright L, Kennedy-Rea S, Conn ME, Eason S, Boatman D, et al. Leveraging electronic health records data for enhanced colorectal cancer screening efforts. J Appalach Health. 2020;2(4):53-63.
  5. Atasoy H, Greenwood BN, McCullough JS. The digitization of patient care: a review of the effects of electronic health records on health care quality and utilization. Annu Rev Public Health. Apr 01, 2019;40(1):487-500.
  6. Bates DW. Physicians and ambulatory electronic health records. Health Aff (Millwood). Sep 2005;24(5):1180-1189.
  7. Hersh W, Weiner M, Embi P, Logan JR, Payne PRO, Bernstam EV, et al. Caveats for the use of operational electronic health record data in comparative effectiveness research. Med Care. Aug 2013;51(8 Suppl 3):S30-S37.

Edited by A Mavragani; submitted 08.12.23; peer-reviewed by Y Chu, A Banerjee; comments to author 05.02.24; revised version received 12.02.24; accepted 04.04.24; published 19.04.24.

©Adam Baus, Dannell D Boatman, Andrea Calkins, Cecil Pollard, Mary Ellen Conn, Sujha Subramanian, Stephenie Kennedy-Rea. Originally published in JMIR Formative Research (https://formative.jmir.org), 19.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.

Volume 18 Supplement 1

Integration of Refugees into National Health Systems: Enhancing Equity and Strengthening Sustainable Health Services for All

Publication of this supplement has been jointly supported by the Foreign, Commonwealth & Development Office (FCDO), the Medical Research Council (MRC), and Wellcome and Economic and Social Research Council (ESRC) under grant number MR/S013547/1. The articles have undergone the journal's standard peer review process for supplements. Supplement Editors were not involved in the peer review of any supplement articles they had co-authored. No other competing interests were declared.

Edited by Fadi El-Jardali (American University of Beirut, Lebanon), Sara Bennett (Johns Hopkins Bloomberg School of Public Health, USA), and Paul Spiegel (Johns Hopkins Bloomberg School of Public Health, USA).

The emergence and regression of political priority for refugee integration into the Jordanian health system: an analysis using the Kingdon’s multiple streams model

The prolonged presence of Syrian refugees in Jordan has highlighted the need for sustainable health service delivery models for refugees. In 2012, the Jordanian government adopted a policy that granted Syrian ...

How integration of refugees into national health systems became a global priority: a qualitative policy analysis

Despite a long history of political discourse around refugee integration, it wasn’t until 2016 that this issue emerged as a global political priority. Limited research has examined the evolution of policies of...

Annual Journal Metrics

2022 Citation Impact

  • 3.6 - 2-year Impact Factor
  • 3.9 - 5-year Impact Factor
  • 1.627 - SNIP (Source Normalized Impact per Paper)
  • 1.201 - SJR (SCImago Journal Rank)

2023 Speed

  • 16 days submission to first editorial decision for all manuscripts (Median)
  • 176 days submission to accept (Median)

2023 Usage

  • 864,532 downloads
  • 14,822 Altmetric mentions

Conflict and Health

ISSN: 1752-1505

COMMENTS

  1. The New England Journal of Medicine

    The New England Journal of Medicine (NEJM) is a weekly general medical journal that publishes new medical research and review articles, and editorial opinion on a wide variety of topics of ...

  2. Qualitative Health Research: Sage Journals

    Qualitative Health Research (QHR) is a peer-reviewed monthly journal that provides an international, interdisciplinary forum to enhance health care and further the development and understanding of qualitative research in health-care settings. QHR is an invaluable resource for researchers and academics, administrators and others in the health and social service professions, and graduates who ...

  3. Articles

    Correction: Sodium, potassium intake, and all-cause mortality: confusion and new findings. Donghao Liu, Yuqing Tian, Rui Wang, Tianyue Zhang, Shuhui Shen, Ping Zeng and Tong Zou. BMC Public Health 2024 24 :1078. Correction Published on: 18 April 2024. The original article was published in BMC Public Health 2024 24 :180.

  4. PubMed

    PubMed® comprises more than 37 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full text content from PubMed Central and publisher web sites.

  5. AJPH

    The mission of the Journal is to advance public health research, policy, practice, and education. ... The acceptance rate is 4% for research articles and analytic essays. Unique Online Features. The online Journal is a delivery channel of special importance to those in the research, academic, student, and practice communities. It is the ...

  6. Qualitative Health Research

    Sage publishes a diverse portfolio of fully Open Access journals in a variety of disciplines. EXPLORE GOLD OPEN ACCESS JOURNALS . Alternatively, you can explore our Disciplines Hubs, including: ... Qualitative Health Research ISSN: 1049-7323; Online ISSN: 1552-7557; About Sage; Contact us;

  7. SSM

    SSM - Qualitative Research in Health is a peer-reviewed, open access journal that publishes international and interdisciplinary qualitative research, methodological, and theoretical contributions related to medical care, illness, disease, health, and wellbeing from across the globe. SSM - Qualitative Research in Health is edited by Stefan Timmermans, a Senior Editor at Social Science & Medicine.

  8. Global Burden of Disease Study 2021 estimates: implications for health

    Over the past three decades, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) has produced several iterations of global estimates for various disease metrics.1 The latest iteration, GBD 2021, published in The Lancet as a series of Articles, includes estimates of the global disease burden including incidence, prevalence, and disability-adjusted life-years (DALYs) for 371 ...

  9. Qualitative Methods in Health Care Research

    The greatest strength of the qualitative research approach lies in the richness and depth of the healthcare exploration and description it makes. In health research, these methods are considered as the most humanistic and person-centered way of discovering and uncovering thoughts and actions of human beings. Table 1.

  10. Public Health

    Fundamentals of Public Health: Using Policy Tools to Improve Population Health — Combating the U.S. Opioid Crisis. C.L. Barry and B. Saloner. N Engl J Med 2021;385:2113-2116. Policy has been ...

  11. Oral Health for All

    Over the past 20 years, per-person dental care costs have increased by 30% in the United States; in 2018, Americans paid $55 billion in out-of-pocket dental expenses, which constituted more than ...

  12. ScienceDirect.com

    3.3 million articles on ScienceDirect are open access. Articles published open access are peer-reviewed and made freely available for everyone to read, download and reuse in line with the user license displayed on the article. ScienceDirect is the world's leading source for scientific, technical, and medical research.

  13. Recent quantitative research on determinants of health in high ...

    Background Identifying determinants of health and understanding their role in health production constitutes an important research theme. We aimed to document the state of recent multi-country research on this theme in the literature. Methods We followed the PRISMA-ScR guidelines to systematically identify, triage and review literature (January 2013—July 2019). We searched for studies that ...

  14. Promoting Health and Well-being in Healthy People 2030

    Implementation of Healthy People 2030 will be strengthened by engaging users from many sectors and ensuring the effective use and alignment of resources. Promoting the nation's health and well-being is a shared responsibility—at the national, state, territorial, tribal, and community levels. It requires involving the public, private, and not ...

  15. Home

    PubMed Central® (PMC) is a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM)

  16. Home page

    BMC Health Services Research is an open access, peer-reviewed journal that considers articles on all aspects of health services research. The journal has a special focus on digital health, governance, health policy, health system quality and safety, healthcare delivery and access to healthcare, healthcare financing and economics, implementing reform, and the health workforce.

  17. Research Methods in Medicine & Health Sciences: Sage Journals

    JOURNAL HOMEPAGE. Research Methods in Medicine & Health Sciences is a peer reviewed journal, publishing rigorous research on established "gold standard" methods and new cutting edge research methods in the health sciences and clinical medicine. View full journal description. This journal is a member of the Committee on Publication Ethics ...

  18. HSR

    Journal Issues Current Issue Past Issues Special Issues Methods Corner Article Collection Subscribe ... Special Issue Call for Abstracts The Role of Health Services Research in Advances in Cancer Prevention and Control, sponsored by Department of Public Health Sciences, University of Virginia School of Medicine. Submission deadline for ...

  19. Sleep is essential to health: an American Academy of Sleep Medicine

    INTRODUCTION. Sleep is vital for health and well-being in children, adolescents, and adults. 1-3 Healthy sleep is important for cognitive functioning, mood, mental health, and cardiovascular, cerebrovascular, and metabolic health. 4 Adequate quantity and quality of sleep also play a role in reducing the risk of accidents and injuries caused by sleepiness and fatigue, including workplace ...

  20. Large language models approach expert-level clinical knowledge and

    Introduction. Generative Pre-trained Transformer 3.5 (GPT-3.5) and 4 (GPT-4) are large language models (LLMs) trained on datasets containing hundreds of billions of words from articles, books, and other internet sources [1, 2]. ChatGPT is an online chatbot which uses GPT-3.5 or GPT-4 to provide bespoke responses to human users' queries []. LLMs have revolutionised the field of natural language ...

  21. Health Services Research

    Online publication from 2024. Health Services Research will be published in online-only format effective with the 2024 volume. This is a proactive move towards reducing the environmental impact caused by the production and distribution of printed journal copies and will allow the journal to invest in further innovation, digital development, and ...

  22. Vaccine adjuvants: current status, research and development, licensing

    Vaccines represent one of the most significant inventions in human history and have revolutionized global health. Generally, a vaccine functions by triggering the innate immune response and stimulating antigen-presenting cells, leading to a defensive adaptive immune response against a specific pathogen's antigen. A Journal of Materials Chemistry B Recent Review Articles Journal of Materials ...

  23. Mental Health Prevention and Promotion—A Narrative Review

    Scope of Mental Health Promotion and Prevention in the Current Situation. Literature provides considerable evidence on the effectiveness of various preventive mental health interventions targeting risk and protective factors for various mental illnesses (18, 36-42). There is also modest evidence of the effectiveness of programs focusing on early identification and intervention for severe ...

  24. Journal of Medical Internet Research

    Background: With the rapid aging of the global population, the prevalence of mild cognitive impairment (MCI) and dementia is anticipated to surge worldwide. MCI serves as an intermediary stage between normal aging and dementia, necessitating more sensitive and effective screening tools for early identification and intervention. The BrainFx SCREEN is a novel digital tool designed to assess ...

  25. Health: Sage Journals

    Health: is published six times per year and attempts in each number to offer a mix of articles that inform or that provoke debate. The readership of the journal is wide and drawn from different disciplines and from workers both inside and outside the … | View full journal description. This journal is a member of the Committee on Publication ...

  26. JMIR Formative Research

    JMIR Mental Health 982 articles JMIR Human Factors 666 articles ... Interactive Journal of Medical Research 362 articles JMIRx Med 359 articles JMIR Pediatrics and Parenting 345 articles JMIR Cancer 341 articles ...

  27. Integration of Refugees into National Health Systems: Enhancing Equity

    Publication of this supplement has been jointly supported by the Foreign, Commonwealth & Development Office (FCDO), the Medical Research Council (MRC), and Wellcome and Economic and Social Research Council (ESRC) under grant number MR/S013547/1. The articles have undergone the journal's standard peer review process for supplements.

  28. A growing understanding of the link between movement and health

    Since the pandemic, which accelerated the shift to a virtual existence, people are moving less than ever, Gibbs said. Just 1 in 4 men and 1 in 5 women and adolescents currently get the recommended amount of aerobic and muscle-strengthening exercise, the federal guidelines say. "We have engineered physical activity out of our lives," Gibbs said.