Systematic Reviews and Meta Analysis


Data Sources

Databases you will probably search.

No one database can cover the literature for any topic. For medical topics, a combination of PubMed (or another search of PubMed data) plus Embase, Web of Science, and Google Scholar has been shown to provide adequate recall (Syst Rev. 2017;6(1):245). For topics that reach beyond biomedicine, other databases need to be considered.

  • PubMed: PubMed is both the search platform provided by the National Center for Biotechnology Information and the database. PubMed includes MEDLINE (records indexed with MeSH terms) but also material in process, older records from before the inception of MEDLINE, and material from journals not included in MEDLINE. The PubMed database is available on independent platforms including Ovid SP, Web of Science, and several others.
  • Embase: Note that Embase requires users to either create an individual account (free) or log in with an institutional email address to enable the export of records. Before you start a session, "log in" at the upper right. You can either create an account or use your Harvard email (recommended). Embase includes material from second-tier European and Asian journals not included in MEDLINE, as well as conference abstracts. The Emtree controlled vocabulary is well developed. Embase records include more Emtree terms than MEDLINE records do MeSH terms. Hence, result sets can often be significantly larger in Embase, especially for drug-related searches.
  • Cochrane Central Register of Controlled Trials: Cochrane Central contains trials from both MEDLINE and Embase plus many trials from other, non-indexed sources; limited to randomized and non-randomized controlled trials. MeSH for MEDLINE records, but no other controlled vocabulary. To limit to results in Central, click the "Trials" limit to the left of your results.
  • Web of Science Core Collection (includes the Science Citation Index): Broad coverage of all sciences. Will cover some journals at the edge of the biomedical sciences missed by PubMed and Embase. Some meeting information. No controlled vocabulary. Alternatively, the Elsevier database Scopus can be used. Harvard does not license access to Scopus.
  • Google Scholar: Consider as a supplement to the literature databases. It can improve sensitivity because it searches the full text of articles. Screening the first 200-400 records in a search is recommended.
  • ClinicalTrials.gov: Registers trials that are recruiting, completed, or terminated. Some records include results. Searching here helps identify unpublished trials. See below for other registries.

These databases can be an effective complement to your search. They can be essential in their specialized topic areas.

  • BIOSIS Previews: Although it is primarily useful for biologists, it contains a lot of meetings and some medical journals. Controlled vocabulary is not suitable for medical searching.
  • CINAHL: Nursing and other health-related information; excellent source for issues in patient care. Well developed controlled vocabulary.
  • PsycINFO: Cognitive and behavioral therapies are well covered. Controlled vocabulary.
  • Google Scholar: Add as an additional source. Here are some search tips.
  • WHO Global Index Medicus: Search all WHO regional indexes, including the South-East Asia and Western Pacific regional databases.
  • Sociological Abstracts: The primary index for sociological literature. May be useful for community-related studies or interpersonal issues. Controlled vocabulary.
  • 3ie Impact Evaluation Repository: Investigating an economic or social intervention? The 3ie Impact Evaluation Repository is a curated database of evidence on what works in international development in low- and middle-income countries.
  • EconLit: Economics. Almost any social intervention and many medical ones get studied by economists.
  • RePEc IDEAS: A repository of economics literature. It includes bibliographic metadata from many archives.

Resources for Meetings and Other Grey Literature

Truly unbiased searches look for unpublished literature in a number of places, including meeting abstracts, white papers, and clinical trial registries, and by searching by hand.

  • GreyNet: GreyNet is an organization dedicated to promoting and facilitating the use of grey literature. Includes a listing of grey literature resources, GreySource. OpenGrey, a former multidisciplinary database of technical reports, meetings, dissertations, and official publications, is now archived in GreyNet.
  • Grey Literature Report: A bi-monthly publication of the New York Academy of Medicine, the GLR includes listings of recently published reports in health science and public health. The archives are tagged with MeSH terms and are searchable.
  • BIOSIS Previews: Meetings! BIOSIS Previews includes proceedings of many meetings that may not be electronically available elsewhere.
  • ProQuest Dissertations & Theses Global: A central authoritative source for locating doctoral dissertations and master's theses. Provides full text for most indexed dissertations from 1990-present. Includes theses and dissertations from the Harvard T.H. Chan School of Public Health, Harvard Medical School, and Harvard School of Dental Medicine.
  • greylitsearcher: A web-based tool for performing systematic and transparent searches of organizational websites.

Identifying sources for grey literature and being sure you've done enough is a challenge. The Canadian Agency for Drugs and Technologies in Health (CADTH) feels your pain and has produced a checklist that might help guide your grey research. The Grey Matters checklist provides an organized source of health technology assessment sites, regulatory agencies, trial registries, and other databases in a form that can help ensure the completeness of your search.

Clinical Trial Registries

  • ClinicalTrials.gov
  • European Union Clinical Trials Registry
  • ISRCTN registry
  • International Clinical Trials Registry Platform  (ICTRP)

When you search Cochrane Library/Trials, you will see results from both ClinicalTrials.gov and ICTRP.

More information about trial registries and solving the problems associated with searching them is available through this site: Medical and health-related trials registers and research registers, which is maintained by Julie Glanville and Carol Lefebvre and hosted by the York Health Economics Consortium.

  • Last Updated: Feb 26, 2024 3:17 PM
  • URL: https://guides.library.harvard.edu/meta-analysis

Northeastern University Library


Systematic Reviews and Evidence Syntheses : Databases

You will want to search at least three databases for your systematic review. Searching three databases alone does not meet the search standards for systematic reviews: you will also have to search the grey literature and complete additional hand searches. Which databases you should search is highly dependent on your systematic review topic, so it is recommended that you meet with a librarian.

Commonly Used Health Sciences Databases


Cochrane, which is considered the gold standard for clinical systematic reviews, recommends searching the following three databases, at a minimum: PubMed, Embase, and Cochrane Central Register of Controlled Trials (CENTRAL).

  • ERIC (Education Resources Information Center): Citations to education information, including scholarly articles, professional literature, education dissertations, and books, plus grey literature such as curriculum guides, conference proceedings, government publications, and white papers. Covers 1966 to the present. Sponsored by the U.S. Department of Education.

Looking to Find Systematic Reviews?

There are a number of places to look for systematic reviews, including within the commonly used databases listed on this page. Some other resources to consider are:

  • Systematic Review Repository - International Initiative for Impact Evaluation: The systematic review repository from the International Initiative for Impact Evaluation is an essential resource for policymakers and researchers who are looking for synthesized evidence on the effects of social and economic interventions in low- and middle-income countries.
  • Epistemonikos: Epistemonikos is a collaborative, multilingual database of health evidence. It is the largest source of systematic reviews relevant for health decision-making, and a large source of other types of scientific evidence. PLEASE NOTE: Epistemonikos is a database focused on systematic reviews. It pulls in systematic reviews from a number of different international sources, along with the studies included in those reviews. While you will find randomized controlled trials and other primary studies in this database, they are only added because of their association with a systematic review. Therefore, searching here for randomized controlled trials or other primary studies would NOT be considered a comprehensive search.
  • Last Updated: May 6, 2024 9:46 AM
  • URL: https://subjectguides.lib.neu.edu/systematicreview


Literature Search: Databases and Gray Literature

The literature search.

  • A systematic review search includes a search of databases, gray literature, personal communications, and a handsearch of high impact journals in the related field. See our list of recommended databases and gray literature sources on this page.
  • A comprehensive literature search cannot depend on a single database, nor on bibliographic databases only.
  • Inclusion of multiple databases helps avoid publication bias (geographic bias or bias against publication of negative results).
  • The Cochrane Collaboration recommends PubMed, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL) at a minimum.
  • NOTE: The Cochrane Collaboration and the IOM recommend that the literature search be conducted by librarians or persons with extensive literature search experience. Please contact the NIH Librarians for assistance with the literature search component of your systematic review.

Cochrane Library

A collection of six databases that contain different types of high-quality, independent evidence to inform healthcare decision-making. Search the Cochrane Central Register of Controlled Trials here.

Embase

European database of biomedical and pharmacologic literature.

PubMed

PubMed comprises more than 21 million citations for biomedical literature from MEDLINE, life science journals, and online books.

Scopus

Largest abstract and citation database of peer-reviewed literature and quality web sources. Contains conference papers.

Web of Science

World's leading citation databases. Covers over 12,000 of the highest impact journals worldwide, including Open Access journals and over 150,000 conference proceedings. Coverage in the sciences, social sciences, arts, and humanities, with coverage to 1900.

Subject Specific Databases

APA PsycINFO

Over 4.5 million abstracts of peer-reviewed literature in the behavioral and social sciences. Includes conference papers, book chapters, psychological tests, scales and measurement tools.

CINAHL Plus

Comprehensive journal index to nursing and allied health literature, includes books, nursing dissertations, conference proceedings, practice standards and book chapters.

LILACS

Latin American and Caribbean health sciences literature database

Gray Literature

  • Gray Literature is the term for information that falls outside the mainstream of published journal and monograph literature and is not controlled by commercial publishers
  • hard to find studies, reports, or dissertations
  • conference abstracts or papers
  • governmental or private sector research
  • clinical trials - ongoing or unpublished
  • experts and researchers in the field     
  • Library catalogs
  • Professional association websites
  • Google Scholar  - Search scholarly literature across many disciplines and sources, including theses, books, abstracts and articles.
  • Dissertation Abstracts - dissertation and theses database - NIH Library biomedical librarians can access and search for you.
  • NTIS  - central resource for government-funded scientific, technical, engineering, and business related information.
  • AHRQ  - agency for healthcare research and quality
  • Open Grey  - system for information on grey literature in Europe. Open access to 700,000 references to the grey literature.
  • World Health Organization  - providing leadership on global health matters, shaping the health research agenda, setting norms and standards, articulating evidence-based policy options, providing technical support to countries and monitoring and assessing health trends.
  • New York Academy of Medicine Grey Literature Report  - a bimonthly publication of The New York Academy of Medicine (NYAM) alerting readers to new gray literature publications in health services research and selected public health topics. NOTE: Discontinued as of Jan 2017, but resources are still accessible.
  • Gray Source Index
  • OpenDOAR - directory of academic repositories
  • International Clinical Trials Registry Platform - from the World Health Organization
  • Australian New Zealand Clinical Trials Registry
  • Brazilian Clinical Trials Registry
  • Chinese Clinical Trial Registry
  • ClinicalTrials.gov - U.S. and international federally and privately supported clinical trials registry and results database
  • Clinical Trials Registry - India
  • EU Clinical Trials Register
  • Japan Primary Registries Network  
  • Pan African Clinical Trials Registry

Systematic Reviews: Medical Literature Databases to search


How to document your literature search

You should always document how you searched each database: the keywords or index terms used, the date on which the search was performed, and how many results you retrieved. If you use RefWorks to deduplicate results, also record how many records were removed as duplicates and the final number of discrete studies that went into your first sift of study selection. Here is an example of how to document a literature search on an Excel spreadsheet; this example records a search of the hematology literature for articles about sickle cell disease. Here is another example of how to document a literature search, this time on one page of a Word document; this example records a search of the medical literature for a poster on Emergency Department throughput. The numbers recorded can then be used to populate the PRISMA flow diagram summarizing the literature search.
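The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not how RefWorks or any other tool deduplicates: the field names, the matching rule (DOI first, then a normalized title), the search strings, and the sample records with made-up DOIs are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class SearchLogEntry:
    """One row of a search log; the field names are illustrative."""
    database: str
    platform: str
    date_searched: date
    strategy: str
    results: int


def record_key(record: dict) -> str:
    """Build a matching key: the DOI if present, otherwise a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return "title:" + title


def deduplicate(records: list) -> tuple:
    """Keep the first record for each key; return (kept records, number removed)."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique, len(records) - len(unique)


# Hypothetical log entries and records; the DOIs are invented placeholders.
log = [
    SearchLogEntry("MEDLINE", "PubMed", date(2024, 5, 1), '"sickle cell"[tiab]', 2),
    SearchLogEntry("Scopus", "Scopus", date(2024, 5, 1), 'TITLE-ABS-KEY("sickle cell")', 2),
]
records = [
    {"title": "Hydroxyurea in sickle cell disease", "doi": "10.1000/example.1"},
    {"title": "Pain management in sickle cell disease", "doi": ""},
    {"title": "Hydroxyurea in Sickle Cell Disease.", "doi": "10.1000/EXAMPLE.1"},
    {"title": "Pain Management in Sickle-Cell Disease", "doi": None},
]

unique, removed = deduplicate(records)
identified = sum(entry.results for entry in log)
print(f"identified={identified} duplicates_removed={removed} screened={len(unique)}")
```

The three printed numbers correspond to the records identified, the duplicates removed, and the records screened, which are the counts a PRISMA flow diagram asks for at this stage. Real deduplication usually needs fuzzier matching than this sketch provides.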

In the final report, add as an appendix the full electronic search strategy for each database searched for the literature review, e.g., MEDLINE with MeSH terms, keywords, and limits.

In the final report in the methods section:

PRISMA checklist Item 7 information sources will be reported as:

  • What databases/websites you searched, the name of the database search platform, and the start/end dates the index covers if relevant, e.g., Ovid MEDLINE (1950-present), or just PubMed
  • Who developed & conducted the searches
  • Date each database/website was last searched
  • Supplementary sources: what other websites you searched, which journal titles were hand searched, whether reference lists were checked, which trial registries or regulatory agency websites were searched, and whether manufacturers or other authors were contacted to obtain unpublished or missing information on study methods or results.

PRISMA checklist Item 8 search will be reported as:

  • In text: describe the principal keywords used to search databases, websites & trials registers

What databases/indexes should you search?

At a minimum you need to search MEDLINE, EMBASE, and the Cochrane CENTRAL trials register. This is the recommendation of three medical and public health research organizations: the U.S. Agency for Healthcare Research and Quality (AHRQ), the U.K. Centre for Reviews and Dissemination (CRD), and the International Cochrane Collaboration (Source: Institute of Medicine (2011) Finding What Works in Healthcare: Standards for Systematic Reviews, Table E-1, page 267). Some databases have an alternate version, linked in parentheses below, that searches the same record sets; i.e., the content of MEDLINE is in PubMed and Scopus, while the content of EMBASE is in Scopus. You should reformat your search for each database as appropriate; contact your librarian if you want help on how to search each database.

Begin by searching:

1. MEDLINE (or PubMed)

2. EMBASE (or Scopus). Please note that Himmelfarb Library does not have a subscription to EMBASE. The content is in the Scopus database, which you can search using keywords, but it is not possible to perform an EMTREE thesaurus search in Scopus.

3. Cochrane Central Trials Register (or Cochrane Library). In addition, Cochrane researchers recommend that you search the ClinicalTrials.gov and ICTRP clinical trial registries because of the low sensitivity of the Cochrane CENTRAL index: according to Hunter et al (2022), "register records as they appear in CENTRAL are less comprehensive than the original register entry, and thus are at a greater risk than other systems of being missed in a search."

The Polyglot Search Translator, developed by the Institute for Evidence-Based Healthcare at Bond University, is a very useful tool for translating search strings from PubMed or MEDLINE via Ovid across multiple databases. Please note that Polyglot does not automatically map subject terms across databases (e.g., MeSH terms to Emtree terms), so you will need to manually edit the search syntax in a text editor to change to the actual subject terms used by another database.

The Yale MeSH Analyzer is another very useful tool. You can copy and paste in a list of up to 20 PMID numbers for records in the PubMed database; the Yale MeSH Analyzer will then display the MeSH (Medical Subject Headings) terms for those 20 articles as a table so you can identify and compare what MeSH headings they have in common. This can suggest additional search terms for your PubMed search.
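The comparison that such a table supports can be illustrated with a short Python sketch. It does not call the Yale tool or PubMed; the record labels and the headings assigned to them are typed in by hand as assumptions, purely to show the idea of counting shared headings.

```python
from collections import Counter

# Hypothetical records and hand-entered headings, for illustration only.
mesh_by_pmid = {
    "PMID-A": {"Anemia, Sickle Cell", "Hydroxyurea", "Child"},
    "PMID-B": {"Anemia, Sickle Cell", "Pain Management", "Child"},
    "PMID-C": {"Anemia, Sickle Cell", "Hydroxyurea", "Adult"},
}


def heading_counts(mesh_by_pmid: dict) -> Counter:
    """Count how many records carry each heading."""
    counts = Counter()
    for headings in mesh_by_pmid.values():
        counts.update(headings)
    return counts


counts = heading_counts(mesh_by_pmid)
total = len(mesh_by_pmid)

# Headings on every record are strong candidates for the search strategy;
# headings on several (but not all) records are worth a second look.
shared_by_all = sorted(h for h, n in counts.items() if n == total)
shared_by_some = sorted(h for h, n in counts.items() if 1 < n < total)

print("On every record:", shared_by_all)
print("On more than one record:", shared_by_some)
```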

The MedSyntax tool is another useful tool, for parsing out very long searches with many levels of brackets. This would be useful if you are trying to edit a pre-existing search strategy with many levels of parentheses.
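To give a feel for what untangling nested parentheses involves, here is a minimal Python sketch. It is not MedSyntax and makes no claim about how that tool works; the function names are hypothetical, and the sketch assumes that parentheses never appear inside quoted phrases.

```python
def check_balanced(query: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def indent_by_depth(query: str, indent: str = "  ") -> str:
    """Put each parenthesis on its own line and indent text by nesting depth."""
    lines = []
    depth = 0
    buffer = ""

    def flush():
        # Emit any text collected since the last parenthesis.
        nonlocal buffer
        text = buffer.strip()
        if text:
            lines.append(indent * depth + text)
        buffer = ""

    for ch in query:
        if ch == "(":
            flush()
            lines.append(indent * depth + "(")
            depth += 1
        elif ch == ")":
            flush()
            depth -= 1
            lines.append(indent * depth + ")")
        else:
            buffer += ch
    flush()
    return "\n".join(lines)


query = "(a OR b) AND (c OR (d AND e))"
pretty = indent_by_depth(query)
print(check_balanced(query))
print(pretty)
```

Laying a long strategy out this way makes it easier to see which operator applies at which level before editing it.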

Some sources for pre-existing database search filters or "hedges" include:

  • CADTH Search Filters Database
  • McMaster University Health Information Research Unit
  • University of York Centre for Reviews and Dissemination InterTASC Information Specialists' Sub-Group
  • InterTASC Population Specific search filters (particularly useful for identifying Latinx, Indigenous peoples, LGBTQ, Black & Minority ethnic)
  • CareSearch Palliative Care PubMed search filters (bereavement, dementia, heart failure, lung cancer, cost of care, and Palliative Care)
  • Low and Middle Income countries filter at https://epoc.cochrane.org/lmic-filters.
  • Search PubMed for another validated search filter using some variation of a search like this, possibly adding your discipline or search topic keywords: ("Databases, Bibliographic"[Mesh] OR "Search Engine"[Mesh]) AND ("Reproducibility of Results"[Mesh] OR "Sensitivity and Specificity"[Mesh] OR validat*) AND (filter OR hedge).
  • Search MEDLINE (or PubMed), preferably using a peer reviewed search strategy per protocol and apply any relevant methodology filters.
  • Search EMBASE (or Scopus) and the Cochrane Central trials register using appropriately reformatted search versions for those databases, and any other online resources. 
  • You should also search other subject specific databases that index the literature in your field. Use our Himmelfarb Library research guides to identify other subject specific databases.
  • Save citations in Covidence to deduplicate citations prior to screening.
  • After screening, export citations to a RefWorks database when you are ready to write up your manuscript. The Covidence and RefWorks databases should be shared with all members of the investigative team.
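The PubMed search for validated filters quoted in the list above can also be prepared programmatically. The following Python sketch only builds a request URL for the NCBI E-utilities ESearch endpoint and does not send it; the function name and the example topic term are assumptions added for illustration, and the search can just as well be pasted into the PubMed search box.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The filter-finding search quoted in the guide.
FILTER_QUERY = (
    '("Databases, Bibliographic"[Mesh] OR "Search Engine"[Mesh]) '
    'AND ("Reproducibility of Results"[Mesh] OR "Sensitivity and Specificity"[Mesh] OR validat*) '
    'AND (filter OR hedge)'
)


def build_filter_search_url(topic_terms=(), retmax=20):
    """Return an ESearch URL for the filter query, optionally narrowed by topic terms."""
    query = FILTER_QUERY
    if topic_terms:
        topic = " OR ".join(topic_terms)
        query = f"({query}) AND ({topic})"
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical topic keyword added to narrow the search.
url = build_filter_search_url(topic_terms=['"palliative care"[tiab]'])
print(url)
```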

Supplementary resources to search

Other members of your investigative team may have ideas about databases, websites, and journals they think you should search. Searching these sources is not required to perform a systematic review. You may need to reformat your search keywords.

Researchers at GW should check our subject research guides for suggestions, or check the libguides community for a guide on your subject.

In addition you may wish to search one or more of the following resources:

  • Google Scholar
  • BASE  academic search engine is useful for searching in University Institutional Repositories
  • Cochrane Database of Systematic Reviews  to search for a pre-existing systematic review on your topic
  • Epistemonikos database, has a matrix of evidence table so you can see what citations are shared in common across existing systematic reviews of the same topic. This feature might help identify sentinel or 'don't miss' articles.

You might also consider searching one or more of the following websites depending on your topic:

Clinical trial registers. For a systematic review, the Cochrane Collaboration recommends searching both ClinicalTrials.gov and the WHO ICTRP (see http://handbook.cochrane.org/ section 4.3):

  • ClinicalTrials.gov  - also contains study population characteristics and results data of FDA regulated drugs and medical devices in NIH funded studies produced after January 18, 2017.
  • WHO ICTRP  - trials register
  • TRIP - searchable index of clinical trials, guidelines, and regulatory guidance
  • CenterWatch
  • Current Controlled Trials
  • European Clinical Trials Register
  • ISRCTN Register
  • COMPARE - tracks outcome switching in clinical trials
  • OpenTrials - aims to match published trials with the underlying data where this is publicly available in an open source 
  • ECRI Guidelines Trust

Grey literature resources:

  • WONDER - CDC data and reports
  • FDSys - search federal government publications
  • Science.gov
  • NRR Archive
  • NIH Reporter
  • re3data registry of data repositories
  • Data Repositories (listed by the Simmons Open Access Directory)
  • OpenDOAR  search academic open access research repositories
  • f1000research search open access repositories of articles, slides, and research posters, in the life sciences, public health, education, and communication.
  • RAND Health Reports
  • National Academy of Medicine Publications
  • Kaiser Family Foundation 
  • Robert Wood Johnson Foundation health and medical care data archive
  • Milbank Memorial Fund reports and issue briefs
  • Also search the resources listed in the CADTH (2019) Grey Matters checklist.

Preprints 

  • See our Himmelfarb preprints guide page on finding preprints. A useful database for searching health sciences preprints is Europe PMC.

Dissertations and Theses:

  • Proquest Dissertations and Theses Online 
  • Networked Digital Library of Theses and Dissertations
  • Open Access Theses and Dissertations
  • WorldCat and change Content: from Any Content to Thesis/dissertations

Conference proceedings:

Most conference proceedings are difficult to find because they may or may not be published. Only select individual papers, rather than all of the presented items, may be made available in print as a book, journal, or series. Societies and associations may publish only abstracts, or extended abstracts, from a conference, often in an annual supplement to an issue of the journal of record of that professional society. Often posters are not published; if they are, they may be made available only to other conference registrants at that meeting or online. Authors may "publish" their conference papers or posters on personal or institutional websites. A limited set of conference proceedings databases includes the following:

  • BASE  academic search engine, has an Advanced Search feature with a Limit by Type to 'Conference Objects', this is useful for searching for conference posters and submissions stored in University Institutional Repositories.
  • Web of Science - click All Databases and select Core Collection - under More Settings limit to the Conference Proceedings Citation Index (CPCI) - searches a limited set of conferences on Science, Social Science and Humanities from 1990-present.
  • Scopus - Limit Document Type to Conference Paper or Conference Review.
  • Proquest  - Limit search results to conference papers &/or proceedings under Advanced Search.
  • BioMed Central Proceedings  - searches a limited set of biomedical conference proceedings, including bioinformatics, genetics, medical students, and data visualization.
  • F1000 Research - browse by subject and click the tabs for articles, posters, and slides - which searches a limited number of biology and medical society meetings/conferences. This is a voluntary self-archive repository.

Individual Journals 

  • You may choose to "hand search" select journals, where the research team reads the Table of Contents of each issue for a chosen period of time. You can look for the names of high impact journal titles in a particular field indexed in Journal Citation Reports (JCR). Please note that as of August 2021 ISI are linking to a new version of JCR in which the particularly helpful 'Browse by Category' link is not currently working, so I recommend you click the Products link in the top right corner and select Journal Citation Reports (Classic) to switch back to the old version and get that functionality back.
  • The AllTrials petition aims to motivate health care researchers to petition regulators and research bodies to require the results and data of all clinical trials be published.

Creative Commons License

  • Last Updated: May 8, 2024 11:07 AM
  • URL: https://guides.himmelfarb.gwu.edu/systematic_review

  • Open access
  • Published: 06 December 2017

Optimal database combinations for literature searches in systematic reviews: a prospective exploratory study

  • Wichor M. Bramer 1 ,
  • Melissa L. Rethlefsen 2 ,
  • Jos Kleijnen 3 , 4 &
  • Oscar H. Franco 5  

Systematic Reviews volume 6, Article number: 245 (2017)


Background

Within systematic reviews, when searching for relevant references, it is advisable to use multiple databases. However, searching databases is laborious and time-consuming, as the syntax of search strategies is database specific. We aimed to determine the optimal combination of databases needed to conduct efficient searches in systematic reviews and whether the current practice in published reviews is appropriate. While previous studies determined the coverage of databases, we analyzed the actual retrieval from the original searches for systematic reviews.

Methods

Since May 2013, the first author prospectively recorded results from systematic review searches that he performed at his institution. PubMed was used to identify systematic reviews published using our search strategy results. For each published systematic review, we extracted the references of the included studies. Using the prospectively recorded results and the studies included in the publications, we calculated recall, precision, and number needed to read for single databases and databases in combination. We assessed the frequency at which databases and combinations would achieve varying levels of recall (i.e., 95%). For a sample of 200 recently published systematic reviews, we calculated how many had used enough databases to ensure 95% recall.
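The three measures named above can be illustrated with a small Python sketch. It assumes the commonly used definitions (recall as the share of included references that a search retrieved, precision as the share of retrieved records that were included, and number needed to read as the reciprocal of precision); the reference identifiers are made up, and this is not the authors' own code.

```python
def retrieval_metrics(retrieved: set, included: set) -> dict:
    """Compute recall, precision, and number needed to read for one search."""
    hits = retrieved & included
    recall = len(hits) / len(included) if included else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Number needed to read: records screened per included reference found.
    nnr = 1 / precision if precision else float("inf")
    return {"recall": recall, "precision": precision, "nnr": nnr}


# Hypothetical identifiers: 10 records retrieved, 5 references included in the review.
retrieved = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
included = {1, 2, 3, 4, 11}

metrics = retrieval_metrics(retrieved, included)
print(metrics)
```

In this toy case four of the five included references were retrieved among ten records, so a screener reads 2.5 records for every included reference found.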

Results

A total of 58 published systematic reviews were included, totaling 1746 relevant references identified by our database searches, while 84 included references had been retrieved by other search methods. Sixteen percent of the included references (291 articles) were only found in a single database; Embase produced the most unique references (n = 132). The combination of Embase, MEDLINE, Web of Science Core Collection, and Google Scholar performed best, achieving an overall recall of 98.3% and 100% recall in 72% of systematic reviews. We estimate that 60% of published systematic reviews do not retrieve 95% of all available relevant references as many fail to search important databases. Other specialized databases, such as CINAHL or PsycINFO, add unique references to some reviews where the topic of the review is related to the focus of the database.
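The ideas of "unique references" and of recall for a combination of databases can be sketched with Python sets. The data below are invented and the function names are hypothetical; the sketch only shows the kind of calculation involved, under the assumption that each database's retrieved records are available as a set of identifiers.

```python
from itertools import combinations


def unique_contributions(found_by_db: dict, included: set) -> dict:
    """Included references that only one database retrieved."""
    unique = {}
    for db, found in found_by_db.items():
        others = set().union(*(f for name, f in found_by_db.items() if name != db))
        unique[db] = (found & included) - others
    return unique


def combination_recall(found_by_db: dict, included: set, size: int) -> dict:
    """Recall achieved by every combination of `size` databases."""
    results = {}
    for combo in combinations(sorted(found_by_db), size):
        found = set().union(*(found_by_db[db] for db in combo))
        results[combo] = len(found & included) / len(included)
    return results


# Hypothetical identifiers; reference 5 was found only by other search methods.
included = {1, 2, 3, 4, 5, 6}
found_by_db = {
    "Embase": {1, 2, 3, 4, 9},
    "MEDLINE": {1, 2},
    "Web of Science": {2, 3, 6, 8},
}

unique = unique_contributions(found_by_db, included)
pair_recall = combination_recall(found_by_db, included, size=2)
print(unique)
print(pair_recall)
```

In this toy data set, adding MEDLINE to Embase contributes nothing new, while adding Web of Science raises recall, which is the kind of difference the study set out to measure on real searches.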

Conclusions

Optimal searches in systematic reviews should search at least Embase, MEDLINE, Web of Science, and Google Scholar as a minimum requirement to guarantee adequate and efficient coverage.


Investigators and information specialists searching for relevant references for a systematic review (SR) are generally advised to search multiple databases and to use additional methods to be able to adequately identify all literature related to the topic of interest [ 1 , 2 , 3 , 4 , 5 , 6 ]. The Cochrane Handbook, for example, recommends the use of at least MEDLINE and Cochrane Central and, when available, Embase for identifying reports of randomized controlled trials [ 7 ]. There are disadvantages to using multiple databases. It is laborious for searchers to translate a search strategy into multiple interfaces and search syntaxes, as field codes and proximity operators differ between interfaces. Differences in thesaurus terms between databases add another significant burden for translation. Furthermore, it is time-consuming for reviewers who have to screen more, and likely irrelevant, titles and abstracts. Lastly, access to databases is often limited and only available on subscription basis.

Previous studies have investigated the added value of different databases on different topics [ 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. Some concluded that searching only one database can be sufficient, as searching other databases has no effect on the outcome [ 16 , 17 ]. Nevertheless, others have concluded that a single database is not sufficient to retrieve all references for systematic reviews [ 18 , 19 ]. Most articles on this topic draw their conclusions based on the coverage of databases [ 14 ]. A recent paper tried to find an acceptable number needed to read for adding an additional database; however, no firm conclusion could be drawn [ 20 ]. Moreover, whether an article is present in a database may not translate to being found by a search in that database. Because of this major limitation, the question of which databases are necessary to retrieve all relevant references for a systematic review remains unanswered. Therefore, we research the probability that single or various combinations of databases retrieve the most relevant references in a systematic review by studying actual retrieval in various databases.

The aim of our research is to determine the combination of databases needed for systematic review searches to provide efficient results (i.e., to minimize the burden for the investigators without reducing the validity of the research by missing relevant references). A secondary aim is to investigate the current practice of databases searched for published reviews. Are included references being missed because the review authors failed to search a certain database?

Development of search strategies

At Erasmus MC, search strategies for systematic reviews are often designed via a librarian-mediated search service. The information specialists of Erasmus MC developed an efficient method that helps them perform searches in many databases in a much shorter time than other methods. This method of literature searching and a pragmatic evaluation thereof are published in separate journal articles [ 21 , 22 ]. In short, the method consists of an efficient way to combine thesaurus terms and title/abstract terms into a single line search strategy. This search is then optimized. Articles that are indexed with a set of identified thesaurus terms, but do not contain the current search terms in title or abstract, are screened to discover potential new terms. New candidate terms are added to the basic search and evaluated. Once optimal recall is achieved, macros are used to translate the search syntaxes between databases, though manual adaptation of the thesaurus terms is still necessary.

Review projects at Erasmus MC cover a wide range of medical topics, from therapeutic effectiveness and diagnostic accuracy to ethics and public health. In general, searches are developed in MEDLINE in Ovid (Ovid MEDLINE® In-Process & Other Non-Indexed Citations, Ovid MEDLINE® Daily and Ovid MEDLINE®, from 1946); Embase.com (searching both Embase and MEDLINE records, with full coverage including Embase Classic); the Cochrane Central Register of Controlled Trials (CENTRAL) via the Wiley Interface; Web of Science Core Collection (hereafter called Web of Science); PubMed restricted to records in the subset “as supplied by publisher” to find references that are not yet indexed in MEDLINE (using the syntax publisher [sb]); and Google Scholar. In general, we use the first 200 references as sorted in the relevance ranking of Google Scholar. When the number of references from other databases was low, we expected the total number of potentially relevant references to be low. In this case, the number of hits from Google Scholar was limited to 100. When the overall number of hits was low, we additionally searched Scopus, and when appropriate for the topic, we included CINAHL (EBSCOhost), PsycINFO (Ovid), and SportDiscus (EBSCOhost) in our search.

Beginning in May 2013, the number of records retrieved from each search for each database was recorded at the moment of searching. The complete results from all databases used for each of the systematic reviews were imported into a unique EndNote library upon search completion and saved without deduplication for this research. The researchers that requested the search received a deduplicated EndNote file from which they selected the references relevant for inclusion in their systematic review. All searches in this study were developed and executed by W.M.B.

Determining relevant references of published reviews

We searched PubMed in July 2016 for all reviews published since 2014 where first authors were affiliated to Erasmus MC, Rotterdam, the Netherlands, and matched those with search registrations performed by the medical library of Erasmus MC. This search was used in earlier research [ 21 ]. Published reviews were included if the search strategies and results had been documented at the time of the last update and if, at minimum, the databases Embase, MEDLINE, Cochrane CENTRAL, Web of Science, and Google Scholar had been used in the review. From the published journal article, we extracted the list of final included references. We documented the department of the first author. To categorize the types of patient/population and intervention, we identified broad MeSH terms relating to the most important disease and intervention discussed in the article. We copied from the MeSH tree the top MeSH term directly below the disease category or, in the case of the intervention, directly below the therapeutics MeSH term. We selected the domain from a pre-defined set of broad domains, including therapy, etiology, epidemiology, diagnosis, management, and prognosis. Lastly, we checked whether the reviews described limiting their included references to a particular study design.

To identify whether our searches had found the included references, and if so, from which database(s) that citation was retrieved, each included reference was located in the original corresponding EndNote library using the first author name combined with the publication year as a search term for each specific relevant publication. If this resulted in extraneous results, the search was subsequently limited using a distinct part of the title or a second author name. Based on the record numbers of the search results in EndNote, we determined from which database these references came. If an included reference was not found in the EndNote file, we presumed the authors used an alternative method of identifying the reference (e.g., examining cited references, contacting prominent authors, or searching gray literature), and we did not include it in our analysis.
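The lookup described above can be sketched in a few lines. This is only an illustration: the authors worked manually in EndNote and used record numbers to identify the source database, whereas the sketch assumes each record already carries hypothetical `first_author`, `year`, `title`, and `database` fields. All records shown are made up.

```python
def find_source_databases(records, first_author, year, title_fragment=None):
    """Return the set of databases whose records match a reference.

    Matching starts with first author plus publication year. If that is
    too broad, a distinct part of the title can be passed to narrow it.
    """
    matches = [
        r for r in records
        if r["first_author"].lower() == first_author.lower() and r["year"] == year
    ]
    if title_fragment is not None:
        fragment = title_fragment.lower()
        matches = [r for r in matches if fragment in r["title"].lower()]
    return {r["database"] for r in matches}


# Made-up records standing in for one non-deduplicated library.
records = [
    {"first_author": "Smith", "year": 2012, "title": "Meditation for chronic pain", "database": "Embase"},
    {"first_author": "Smith", "year": 2012, "title": "Meditation for chronic pain", "database": "MEDLINE"},
    {"first_author": "Smith", "year": 2012, "title": "Hip fracture outcomes", "database": "Web of Science"},
    {"first_author": "Jones", "year": 2010, "title": "Relaxation therapy trial", "database": "Google Scholar"},
]

broad = find_source_databases(records, "Smith", 2012)
narrowed = find_source_databases(records, "Smith", 2012, "meditation")
missing = find_source_databases(records, "Brown", 2015)
```

An empty result, as for `missing`, corresponds to the case where the reference is presumed to have been identified by another method and is left out of the analysis.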

Data analysis

We determined the databases that contributed most to the reviews by the number of unique references retrieved by each database used in the reviews. Unique references were included articles that had been found by only one database search. Those databases that contributed the most unique included references were then considered candidate databases to determine the most optimal combination of databases in the further analyses.

In Excel, we calculated the performance of each individual database and various combinations. Performance was measured using recall, precision, and number needed to read. See Table  1 for definitions of these measures. These values were calculated both for all reviews combined and per individual review.
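Table 1 is not reproduced here, so the following sketch uses the conventional definitions of the three measures, which are assumed to match the ones the authors used. The counts in the example are hypothetical.

```python
def recall(included_retrieved: int, included_total: int) -> float:
    """Share of all included references that the search retrieved."""
    return included_retrieved / included_total


def precision(included_retrieved: int, total_results: int) -> float:
    """Share of retrieved results that were included references."""
    return included_retrieved / total_results


def number_needed_to_read(included_retrieved: int, total_results: int) -> float:
    """Results screened per included reference found (reciprocal of precision)."""
    return total_results / included_retrieved


# Hypothetical review: 20 included references overall, of which one
# database retrieved 18 among 1200 results.
example_recall = recall(18, 20)
example_precision = precision(18, 1200)
example_nnr = number_needed_to_read(18, 1200)
```

In this made-up case, recall is 90%, precision is 1.5%, and roughly 67 results must be screened for each included reference.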

Performance of a search can be expressed in different ways. Depending on the goal of the search, different measures may be optimized. In the case of a clinical question, precision is most important, as a practicing clinician does not have a lot of time to read through many articles in a clinical setting. When searching for a systematic review, recall is the most important aspect, as the researcher does not want to miss any relevant references. As our research is performed on systematic reviews, the main performance measure is recall.

We identified all included references that were uniquely identified by a single database. For the databases that retrieved the most unique included references, we calculated the number of references retrieved (after deduplication) and the number of included references that had been retrieved by all possible combinations of these databases, in total and per review. For all individual reviews, we determined the median recall, the minimum recall, and the percentage of reviews for which each single database or combination retrieved 100% recall.
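As a rough sketch of this per-review analysis (the authors worked in Excel), the code below enumerates every combination of candidate databases and summarizes recall across reviews. The data structure, a mapping from database name to the set of included reference IDs it retrieved, and the two toy reviews are assumptions for illustration only. Recall is computed over references retrieved by any database, in line with the analysis described here.

```python
from itertools import combinations
from statistics import median


def combination_recall(review, databases):
    """Recall of a database combination within one review."""
    all_found = set().union(*review.values())
    found = set().union(*(review.get(db, set()) for db in databases))
    return len(found) / len(all_found)


def summarize(reviews, candidates):
    """Median recall, minimum recall, and % of reviews with 100% recall per combination."""
    summary = {}
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            recalls = [combination_recall(review, combo) for review in reviews]
            summary[combo] = {
                "median_recall": median(recalls),
                "minimum_recall": min(recalls),
                "pct_full_recall": 100 * sum(r == 1.0 for r in recalls) / len(recalls),
            }
    return summary


# Two toy reviews with made-up reference IDs.
reviews = [
    {"Embase": {1, 2, 3}, "MEDLINE": {2, 3, 4}, "Web of Science": {3}, "Google Scholar": {5}},
    {"Embase": {10, 11}, "MEDLINE": {10, 11}, "Web of Science": {10}, "Google Scholar": set()},
]
candidates = ["Embase", "MEDLINE", "Web of Science", "Google Scholar"]
summary = summarize(reviews, candidates)
```

With four candidate databases, this yields 15 entries: the four single databases plus every combination of two, three, and four.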

For each review that we investigated, we determined what the recall was for all possible different database combinations of the most important databases. Based on these, we determined the percentage of reviews where that database combination had achieved 100% recall, more than 95%, more than 90%, and more than 80%. Based on the number of results per database both before and after deduplication as recorded at the time of searching, we calculated the ratio between the total number of results and the number of results for each database and combination.
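The threshold counts can be illustrated with a short sketch. The per-review recall values below are hypothetical, and the function name is an assumption. Thresholds below 100% are treated as strict ("more than"), following the wording above.

```python
def pct_reviews_reaching(recalls, threshold):
    """Percentage of reviews whose recall exceeds the threshold.

    A threshold of 1.0 is treated as "reached 100% recall".
    """
    if threshold >= 1.0:
        hits = sum(r >= 1.0 for r in recalls)
    else:
        hits = sum(r > threshold for r in recalls)
    return 100 * hits / len(recalls)


# Hypothetical recall values of one database combination in six reviews.
recalls = [1.0, 1.0, 0.96, 0.92, 0.85, 0.70]
profile = {t: pct_reviews_reaching(recalls, t) for t in (1.0, 0.95, 0.90, 0.80)}
```

For these made-up values, a third of the reviews reach 100% recall, half exceed 95%, two thirds exceed 90%, and five in six exceed 80%.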

Improvement of precision was calculated as the ratio between the original precision from the searches in all databases and the precision for each database and combination.

To compare our practice of database usage in systematic reviews against current practice as evidenced in the literature, we analyzed a set of 200 recent systematic reviews from PubMed. On 5 January 2017, we searched PubMed for articles with the phrase “systematic review” in the title. Starting with the most recent articles, we determined the databases searched either from the abstract or from the full text until we had data for 200 reviews. For the individual databases and combinations that were used in those reviews, we multiplied the frequency of occurrence in that set of 200 with the probability that the database or combination would lead to an acceptable recall (which we defined at 95%) that we had measured in our own data.

Our earlier research had resulted in 206 systematic reviews published between 2014 and July 2016, in which the first author was affiliated with Erasmus MC [ 21 ]. In 73 of these, the searches and results had been documented by the first author of this article at the time of the last search. Of those, 15 could not be included in this research, since they had not searched all databases we investigated here. Therefore, for this research, a total of 58 systematic reviews were analyzed. The references to these reviews can be found in Additional file 1. An overview of the broad topical categories covered in these reviews is given in Table 2. Many of the reviews were initiated by members of the departments of surgery and epidemiology. The reviews covered a wide variety of diseases, none of which was present in more than 12% of the reviews. The interventions were mostly from the chemicals and drugs category, or surgical procedures. Over a third of the reviews were therapeutic, while slightly under a quarter answered an etiological question. Most reviews did not limit to certain study designs, 9% limited to RCTs only, and another 9% limited to other study types.

Together, these reviews included a total of 1830 references. Of these, 84 references (4.6%) had not been retrieved by our database searches and were not included in our analysis, leaving in total 1746 references. In our analyses, we combined the results from MEDLINE in Ovid and PubMed (the subset as supplied by publisher) into one database labeled MEDLINE.

Unique references per database

A total of 292 (17%) references were found by only one database. Table  3 displays the number of unique results retrieved for each single database. Embase retrieved the most unique included references, followed by MEDLINE, Web of Science, and Google Scholar. Cochrane CENTRAL is absent from the table, as for the five reviews limited to randomized trials, it did not add any unique included references. Subject-specific databases such as CINAHL, PsycINFO, and SportDiscus only retrieved additional included references when the topic of the review was directly related to their special content, respectively nursing, psychiatry, and sports medicine.

Overall performance

The four databases that had retrieved the most unique references (Embase, MEDLINE, Web of Science, and Google Scholar) were investigated individually and in all possible combinations (see Table  4 ). Of the individual databases, Embase had the highest overall recall (85.9%). Of the combinations of two databases, Embase and MEDLINE had the best results (92.8%). Embase and MEDLINE combined with either Google Scholar or Web of Science scored similarly well on overall recall (95.9%). However, the combination with Google Scholar had a higher precision and higher median recall, a higher minimum recall, and a higher proportion of reviews that retrieved all included references. Using both Web of Science and Google Scholar in addition to MEDLINE and Embase increased the overall recall to 98.3%. The higher recall from adding extra databases came at a cost in number needed to read (NNR). Searching only Embase produced an NNR of 57 on average, whereas, for the optimal combination of four databases, the NNR was 73.
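To put the two NNR figures in perspective, they can be converted to precision. The sketch assumes NNR is the reciprocal of precision and treats the reported averages as if that relation held for them exactly, which is a simplification.

```python
def precision_from_nnr(nnr: float) -> float:
    """Precision implied by a number needed to read, assuming NNR = 1 / precision."""
    return 1 / nnr


embase_only = precision_from_nnr(57)    # about 0.0175, i.e. 1.75%
four_databases = precision_from_nnr(73)  # about 0.0137, i.e. 1.37%
extra_reads = 73 - 57                    # additional results screened per included reference
```

Under this assumption, moving from Embase alone to the four-database combination means screening about 16 more results for every included reference found.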

Probability of appropriate recall

We calculated the recall for individual databases and databases in all possible combination for all reviews included in the research. Figure  1 shows the percentages of reviews where a certain database combination led to a certain recall. For example, in 48% of all systematic reviews, the combination of Embase and MEDLINE (with or without Cochrane CENTRAL; Cochrane CENTRAL did not add unique relevant references) reaches a recall of at least 95%. In 72% of studied systematic reviews, the combination of Embase, MEDLINE, Web of Science, and Google Scholar retrieved all included references. In the top bar, we present the results of the complete database searches relative to the total number of included references. This shows that many database searches missed relevant references.

Percentage of systematic reviews for which a certain database combination reached a certain recall. The x-axis represents the percentage of reviews for which a specific combination of databases, as shown on the y-axis, reached a certain recall (represented with bar colors). Abbreviations: EM Embase, ML MEDLINE, WoS Web of Science, GS Google Scholar. Asterisk indicates that the recall of all databases has been calculated over all included references. The recall of the database combinations was calculated over all included references retrieved by any database.

Differences between domains of reviews

We analyzed whether the added value of Web of Science and Google Scholar was dependent on the domain of the review. For 55 reviews, we determined the domain. See Fig. 2 for the comparison of the recall of Embase, MEDLINE, and Cochrane CENTRAL per review for all identified domains. For all but one domain, the traditional combination of Embase, MEDLINE, and Cochrane CENTRAL did not retrieve enough included references. For four out of five systematic reviews that limited to randomized controlled trials (RCTs) only, the traditional combination retrieved 100% of all included references. However, for one review of this domain, the recall was 82%. Of the 11 references included in this review, one was found only in Google Scholar and one only in Web of Science.

Percentage of systematic reviews of a certain domain for which the combination Embase, MEDLINE and Cochrane CENTRAL reached a certain recall

Reduction in number of results

We calculated the ratio between the number of results found when searching all databases, including databases not included in our analyses, such as Scopus, PsycINFO, and CINAHL, and the number of results found searching a selection of databases. See Fig.  3 for the legend of the plots in Figs.  4 and 5 . Figure  4 shows the distribution of this value for individual reviews. The database combinations with the highest recall did not reduce the total number of results by large margins. Moreover, in combinations where the number of results was greatly reduced, the recall of included references was lower.

Legend of Figs. 4 and 5

The ratio between number of results per database combination and the total number of results for all databases

The ratio between precision per database combination and the total precision for all databases

Improvement of precision

To determine how searching multiple databases affected precision, we calculated for each combination the ratio between the original precision, observed when all databases were searched, and the precision calculated for different database combinations. Figure  5 shows the improvement of precision for 15 databases and database combinations. Because precision is defined as the number of relevant references divided by the number of total results, we see a strong correlation with the total number of results.

Status of current practice of database selection

From a set of 200 recent SRs identified via PubMed, we analyzed the databases that had been searched. Almost all reviews (97%) reported a search in MEDLINE. Other databases that we identified as essential for good recall were searched much less frequently; Embase was searched in 61% and Web of Science in 35%, and Google Scholar was only used in 10% of all reviews. For all individual databases or combinations of the four important databases from our research (MEDLINE, Embase, Web of Science, and Google Scholar), we multiplied the frequency of occurrence of that combination in the random set by the probability we found in our research that this combination would lead to an acceptable recall of 95%. The calculation is shown in Table 5. For example, around a third of the reviews (37%) relied on the combination of MEDLINE and Embase. Based on our findings, this combination achieves acceptable recall about half the time (47%). This implies that 17% of the reviews in the PubMed sample (37% × 47%) would have achieved an acceptable recall of 95% through this combination. The sum of all these values is the total probability of acceptable recall in the random sample. Based on these calculations, we estimate that the probability that this random set of reviews retrieved more than 95% of all possible included references was 40%. Using similar calculations, also shown in Table 5, we estimated the probability that 100% of relevant references were retrieved is 23%.
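The weighting behind this estimate is a sum of products, sketched below. Table 5 is not reproduced here, so only the one row quoted in the text (MEDLINE plus Embase, used by 37% of reviews, with a 47% probability of acceptable recall) is filled in. The function name and the row layout are assumptions for illustration.

```python
def expected_share(rows):
    """Sum of usage frequency times probability of acceptable recall.

    Each row is a (frequency, probability) pair for one database or
    combination of databases, both expressed as fractions.
    """
    return sum(frequency * probability for frequency, probability in rows)


# The single row quoted in the text: MEDLINE + Embase.
medline_embase = (0.37, 0.47)
contribution = expected_share([medline_embase])  # 0.1739, i.e. about 17%
```

Adding the remaining rows of Table 5 in the same way would give the overall estimate that the authors report as 40%.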

Our study shows that, to reach maximum recall, searches in systematic reviews ought to include a combination of databases. To ensure adequate performance in searches (i.e., recall, precision, and number needed to read), we find that literature searches for a systematic review should, at minimum, be performed in the combination of the following four databases: Embase, MEDLINE (including Epub ahead of print), Web of Science Core Collection, and Google Scholar. Using that combination, 93% of the systematic reviews in our study obtained levels of recall that could be considered acceptable (> 95%). Unique results from specialized databases that closely match systematic review topics, such as PsycINFO for reviews in the fields of behavioral sciences and mental health or CINAHL for reviews on the topics of nursing or allied health, indicate that specialized databases should be used additionally when appropriate.

We find that Embase is critical for acceptable recall in a review and should always be searched for medically oriented systematic reviews. However, Embase is only accessible via a paid subscription, which generally makes it challenging for review teams not affiliated with academic medical centers to access. The highest scoring database combination without Embase is a combination of MEDLINE, Web of Science, and Google Scholar, but that reaches satisfactory recall for only 39% of all investigated systematic reviews, while still requiring a paid subscription to Web of Science. Of the five reviews that included only RCTs, four reached 100% recall if MEDLINE, Web of Science, and Google Scholar combined were complemented with Cochrane CENTRAL.

The Cochrane Handbook recommends searching MEDLINE, Cochrane CENTRAL, and Embase for systematic reviews of RCTs. For reviews in our study that included RCTs only, this recommendation was indeed sufficient for four (80%) of the reviews. The one review where it was insufficient was about alternative medicine, specifically meditation and relaxation therapy, where one of the missed studies was published in the Indian Journal of Positive Psychology. The other study, from the Journal of Advanced Nursing, is indexed in MEDLINE and Embase but was only retrieved because of the addition of KeyWords Plus in Web of Science. We estimate that more than 50% of reviews that include more study types than RCTs would miss more than 5% of included references if only the traditional combination of MEDLINE, Embase, and Cochrane CENTRAL is searched.

We are aware that the Cochrane Handbook [ 7 ] recommends more than only these databases, but further recommendations focus on regional and specialized databases. Though we occasionally used the regional databases LILACS and SciELO in our reviews, they did not provide unique references in our study. Subject-specific databases like PsycINFO only added unique references to a small percentage of systematic reviews when they had been used for the search. The third key database we identified in this research, Web of Science, is only mentioned as a citation index in the Cochrane Handbook, not as a bibliographic database. To our surprise, Cochrane CENTRAL did not identify any unique included studies that had not been retrieved by the other databases, not even for the five reviews focusing entirely on RCTs. If Erasmus MC authors had conducted more reviews that included only RCTs, Cochrane CENTRAL might have added more unique references.

MEDLINE did find unique references that had not been found in Embase, although our searches in Embase included all MEDLINE records. This is likely caused by differences in the thesaurus terms that were added, but further analysis would be required to determine the reasons for not finding the MEDLINE records in Embase. Although Embase covers MEDLINE, it apparently does not index every article from MEDLINE. Thirty-seven references were found in MEDLINE (Ovid) but were not available in Embase.com. These are mostly unique PubMed references, which are not assigned MeSH terms, and are often freely available via PubMed Central.

Google Scholar adds relevant articles not found in the other databases, possibly because it indexes the full text of all articles. It therefore finds articles in which the topic of research is not mentioned in title, abstract, or thesaurus terms, but where the concepts are only discussed in the full text. Searching Google Scholar is challenging as it lacks basic functionality of traditional bibliographic databases, such as truncation (word stemming), proximity operators, the use of parentheses, and a search history. Additionally, search strategies are limited to a maximum of 256 characters, which means that creating a thorough search strategy can be laborious.

Whether Embase and Web of Science can be replaced by Scopus remains uncertain. We have not yet gathered enough data to be able to make a full comparison between Embase and Scopus. In 23 reviews included in this research, Scopus was searched. In 12 reviews (52%), Scopus retrieved 100% of all included references retrieved by Embase or Web of Science. In the other 48%, the recall by Scopus was suboptimal, on one occasion as low as 38%.

Unique references were found in 6% of the reviews in which we searched CINAHL and in 9% of the reviews in which we searched PsycINFO. For each of these two databases, unique relevant references were found in one case. In both these reviews, the topic was highly related to the topic of the database. Although we did not use these special topic databases in all of our reviews, given the low number of reviews where these databases added relevant references, and observing the special topics of those reviews, we suggest that these subject databases will only add value if the topic is related to the topic of the database.

Many articles written on this topic have calculated overall recall of several reviews, instead of the effects on all individual reviews. Researchers planning a systematic review generally perform one review, and they need to estimate the probability that they may miss relevant articles in their search. When looking at the overall recall, the combination of Embase and MEDLINE and either Google Scholar or Web of Science could be regarded as sufficient with 96% recall. This number, however, does not answer the question of a researcher performing a systematic review regarding which databases should be searched. A researcher wants to be able to estimate the chances that his or her current project will miss a relevant reference. When looking at individual reviews, the probability of missing more than 5% of included references found through database searching is 33% when Google Scholar is used together with Embase and MEDLINE and 30% for the Web of Science, Embase, and MEDLINE combination. What is considered acceptable recall for systematic review searches is open for debate and can differ between individuals and groups. Some reviewers might accept a potential loss of 5% of relevant references; others would want to pursue 100% recall, no matter what cost. Using the results in this research, review teams can decide, based on their idea of acceptable recall and the desired probability, which databases to include in their searches.

Strengths and limitations

We did not investigate whether the loss of certain references had resulted in changes to the conclusion of the reviews. Of course, the loss of a minor non-randomized included study that follows the systematic review’s conclusions would not be as problematic as losing a major included randomized controlled trial with contradictory results. However, the wide range of scope, topic, and criteria between systematic reviews and their related review types make it very hard to answer this question.

We found that two databases previously not recommended as essential for systematic review searching, Web of Science and Google Scholar, were key to improving recall in the reviews we investigated. Because this is a novel finding, we cannot conclude whether it is due to our dataset or to a generalizable principle. It is likely that topical differences in systematic reviews may impact whether databases such as Web of Science and Google Scholar add value to the review. One explanation for our finding may be that if the research question is very specific, the topic of research might not always be mentioned in the title and/or abstract. In that case, Google Scholar might add value by searching the full text of articles. If the research question is more interdisciplinary, a broader science database such as Web of Science is likely to add value. The topics of the reviews studied here may simply have fallen into those categories, though the diversity of the included reviews may point to a more universal applicability.

Although we searched PubMed as supplied by publisher separately from MEDLINE in Ovid, we combined the included references of these databases into one measurement in our analysis. Until 2016, the most complete MEDLINE selection in Ovid still lacked the electronic publications that were already available in PubMed. These could be retrieved by searching PubMed with the subset as supplied by publisher. Since the introduction of the more complete MEDLINE collection Epub Ahead of Print, In-Process & Other Non-Indexed Citations, and Ovid MEDLINE®, the need to separately search PubMed as supplied by publisher has disappeared. According to our data, PubMed’s “as supplied by publisher” subset retrieved 12 unique included references, and it was the most important addition in terms of relevant references to the four major databases. It is therefore important to search MEDLINE including the “Epub Ahead of Print, In-Process, and Other Non-Indexed Citations” references.

These results may not be generalizable to other studies for other reasons. The skills and experience of the searcher are one of the most important aspects in the effectiveness of systematic review search strategies [ 23 , 24 , 25 ]. The searcher in the case of all 58 systematic reviews is an experienced biomedical information specialist. Though we suspect that searchers who are not information specialists or librarians would have a higher possibility of less well-constructed searches and searches with lower recall, even highly trained searchers differ in their approaches to searching. For this study, we searched to achieve as high a recall as possible, though our search strategies, like any other search strategy, still missed some relevant references because relevant terms had not been used in the search. We are not implying that a combined search of the four recommended databases will never result in relevant references being missed, rather that failure to search any one of these four databases will likely lead to relevant references being missed. Our experience in this study shows that additional efforts, such as hand searching, reference checking, and contacting key players, should be made to retrieve extra possible includes.

Based on our calculations made by looking at random systematic reviews in PubMed, we estimate that 60% of these reviews are likely to have missed more than 5% of relevant references only because of the combinations of databases that were used. That is with the generous assumption that the searches in those databases had been designed sensitively enough. Even when taking into account that many searchers consider the use of Scopus as a replacement of Embase, plus taking into account the large overlap of Scopus and Web of Science, this estimate remains similar. Also, while the Scopus and Web of Science assumptions we made might be true for coverage, they are likely very different when looking at recall, as Scopus does not allow the use of the full features of a thesaurus. We see that reviewers rarely use Web of Science and especially Google Scholar in their searches, though they retrieve a great deal of unique references in our reviews. Systematic review searchers should consider using these databases if they are available to them, and if their institution lacks availability, they should ask other institutes to cooperate on their systematic review searches.

The major strength of our paper is that it is the first large-scale study we know of to assess database performance for systematic reviews using prospectively collected data. Prior research on database importance for systematic reviews has looked primarily at whether included references could have theoretically been found in a certain database, but most have been unable to ascertain whether the researchers actually found the articles in those databases [ 10 , 12 , 16 , 17 , 26 ]. Whether a reference is available in a database is important, but whether the article can be found in a precise search with reasonable recall is not only impacted by the database’s coverage. Our experience has shown us that it is also impacted by the ability of the searcher, the accuracy of indexing of the database, and the complexity of terminology in a particular field. Because these studies based on retrospective analysis of database coverage do not account for the searchers’ abilities, the actual findings from the searches performed, and the indexing for particular articles, their conclusions lack immediate translatability into practice. This research goes beyond retrospectively assessed coverage to investigate real search performance in databases. Many of the articles reporting on previous research concluded that one database was able to retrieve most included references. Halladay et al. [ 10 ] and van Enst et al. [ 16 ] concluded that databases other than MEDLINE/PubMed did not change the outcomes of the review, while Rice et al. [ 17 ] found the added value of other databases only for newer, non-indexed references. In addition, Michaleff et al. [ 26 ] found that Cochrane CENTRAL included 95% of all RCTs included in the reviews investigated. Our conclusion that Web of Science and Google Scholar are needed for completeness has not been shared by previous research. Most of the previous studies did not include these two databases in their research.

We recommend that, regardless of their topic, searches for biomedical systematic reviews should combine Embase, MEDLINE (including electronic publications ahead of print), Web of Science (Core Collection), and Google Scholar (the first 200 relevant references) at minimum. Special topics databases such as CINAHL and PsycINFO should be added if the topic of the review directly touches the primary focus of a specialized subject database, like CINAHL for focus on nursing and allied health or PsycINFO for behavioral sciences and mental health. For reviews where RCTs are the desired study design, Cochrane CENTRAL may be similarly useful. Ignoring one or more of the databases that we identified as the four key databases will result in more precise searches with a lower number of results, but the researchers should decide whether that is worth the increased probability of losing relevant references. This study also highlights once more that searching databases alone is, nevertheless, not enough to retrieve all relevant references.

Future research should continue to investigate recall of actual searches beyond coverage of databases and should consider focusing on the most optimal database combinations, not on single databases.

Levay P, Raynor M, Tuvey D. The contributions of MEDLINE, other bibliographic databases and various search techniques to NICE public health guidance. Evid Based Libr Inf Pract. 2015;10:50–68.

Stevinson C, Lawlor DA. Searching multiple databases for systematic reviews: added value or diminishing returns? Complement Ther Med. 2004;12:228–32.

Lawrence DW. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion? Inj Prev. 2008;14:401–4.

Lemeshow AR, Blum RE, Berlin JA, Stoto MA, Colditz GA. Searching one or two databases was insufficient for meta-analysis of observational studies. J Clin Epidemiol. 2005;58:867–73.

Zheng MH, Zhang X, Ye Q, Chen YP. Searching additional databases except PubMed are necessary for a systematic review. Stroke. 2008;39:e139. author reply e140

Beyer FR, Wright K. Can we prioritise which databases to search? A case study using a systematic review of frozen shoulder management. Health Inf Libr J. 2013;30:49–58.

Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions: The Cochrane Collaboration, London, United Kingdom. 2011.

Wright K, Golder S, Lewis-Light K. What value is the CINAHL database when searching for systematic reviews of qualitative studies? Syst Rev. 2015;4:104.

Wilkins T, Gillies RA, Davies K. EMBASE versus MEDLINE for family medicine searches: can MEDLINE searches find the forest or a tree? Can Fam Physician. 2005;51:848–9.

Halladay CW, Trikalinos TA, Schmid IT, Schmid CH, Dahabreh IJ. Using data sources beyond PubMed has a modest impact on the results of systematic reviews of therapeutic interventions. J Clin Epidemiol. 2015;68:1076–84.

Ahmadi M, Ershad-Sarabi R, Jamshidiorak R, Bahaodini K. Comparison of bibliographic databases in retrieving information on telemedicine. J Kerman Univ Med Sci. 2014;21:343–54.

Lorenzetti DL, Topfer L-A, Dennett L, Clement F. Value of databases other than MEDLINE for rapid health technology assessments. Int J Technol Assess Health Care. 2014;30:173–8.

Beckles Z, Glover S, Ashe J, Stockton S, Boynton J, Lai R, Alderson P. Searching CINAHL did not add value to clinical questions posed in NICE guidelines. J Clin Epidemiol. 2013;66:1051–7.

Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B. The contribution of databases to the results of systematic reviews: a cross-sectional study. BMC Med Res Methodol. 2016;16:1–13.

Aagaard T, Lund H, Juhl C. Optimizing literature search in systematic reviews—are MEDLINE, EMBASE and CENTRAL enough for identifying effect studies within the area of musculoskeletal disorders? BMC Med Res Methodol. 2016;16:161.

van Enst WA, Scholten RJ, Whiting P, Zwinderman AH, Hooft L. Meta-epidemiologic analysis indicates that MEDLINE searches are sufficient for diagnostic test accuracy systematic reviews. J Clin Epidemiol. 2014;67:1192–9.

Rice DB, Kloda LA, Levis B, Qi B, Kingsland E, Thombs BD. Are MEDLINE searches sufficient for systematic reviews and meta-analyses of the diagnostic accuracy of depression screening tools? A review of meta-analyses. J Psychosom Res. 2016;87:7–13.

Bramer WM, Giustini D, Kramer BM, Anderson PF. The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews. Syst Rev. 2013;2:115.

Bramer WM, Giustini D, Kramer BMR. Comparing the coverage, recall, and precision of searches for 120 systematic reviews in Embase, MEDLINE, and Google Scholar: a prospective study. Syst Rev. 2016;5:39.

Ross-White A, Godfrey C. Is there an optimum number needed to retrieve to justify inclusion of a database in a systematic review search? Health Inf Libr J. 2017;33:217–24.

Bramer WM, Rethlefsen ML, Mast F, Kleijnen J. A pragmatic evaluation of a new method for librarian-mediated literature searches for systematic reviews. Res Synth Methods. 2017. doi: 10.1002/jrsm.1279 .

Bramer WM, de Jonge GB, Rethlefsen ML, Mast F, Kleijnen J. A systematic approach to searching: how to perform high quality literature searches more efficiently. J Med Libr Assoc. 2018.

Rethlefsen ML, Farrell AM, Osterhaus Trzasko LC, Brigham TJ. Librarian co-authors correlated with higher quality reported search strategies in general internal medicine systematic reviews. J Clin Epidemiol. 2015;68:617–26.

McGowan J, Sampson M. Systematic reviews need systematic searchers. J Med Libr Assoc. 2005;93:74–80.

McKibbon KA, Haynes RB, Dilks CJW, Ramsden MF, Ryan NC, Baker L, Flemming T, Fitzgerald D. How good are clinical MEDLINE searches? A comparative study of clinical end-user and librarian searches. Comput Biomed Res. 1990;23:583–93.

Michaleff ZA, Costa LO, Moseley AM, Maher CG, Elkins MR, Herbert RD, Sherrington C. CENTRAL, PEDro, PubMed, and EMBASE are the most comprehensive databases indexing randomized controlled trials of physical therapy interventions. Phys Ther. 2011;91:190–7.

Acknowledgements

Not applicable

Melissa Rethlefsen receives funding in part from the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR001067. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Availability of data and materials

The datasets generated and/or analyzed during the current study are available from the corresponding author on a reasonable request.

Author information

Authors and affiliations.

Medical Library, Erasmus MC, Erasmus University Medical Centre Rotterdam, 3000 CS, Rotterdam, the Netherlands

Wichor M. Bramer

Spencer S. Eccles Health Sciences Library, University of Utah, Salt Lake City, Utah, USA

Melissa L. Rethlefsen

Kleijnen Systematic Reviews Ltd., York, UK

Jos Kleijnen

School for Public Health and Primary Care (CAPHRI), Maastricht University, Maastricht, the Netherlands

Department of Epidemiology, Erasmus MC, Erasmus University Medical Centre Rotterdam, Rotterdam, the Netherlands

Oscar H. Franco

Contributions

WB, JK, and OF designed the study. WB designed the searches used in this study and gathered the data. WB and ML analyzed the data. WB drafted the first manuscript, which was revised critically by the other authors. All authors have approved the final manuscript.

Corresponding author

Correspondence to Wichor M. Bramer .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

WB has received travel allowance from Embase for giving a presentation at a conference. The other authors declare no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1: Reviews included in the research. References to the systematic reviews published by Erasmus MC authors that were included in the research. (DOCX 19 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Bramer, W.M., Rethlefsen, M.L., Kleijnen, J. et al. Optimal database combinations for literature searches in systematic reviews: a prospective exploratory study. Syst Rev 6 , 245 (2017). https://doi.org/10.1186/s13643-017-0644-y

Received : 21 August 2017

Accepted : 24 November 2017

Published : 06 December 2017

DOI : https://doi.org/10.1186/s13643-017-0644-y

  • Databases, bibliographic
  • Review literature as topic
  • Sensitivity and specificity
  • Information storage and retrieval

Systematic Reviews

ISSN: 2046-4053

  • Data Descriptor
  • Open access
  • Published: 14 June 2022

A global database for conducting systematic reviews and meta-analyses in innovation and quality management

  • Tibor Csizmadia (ORCID: 0000-0001-6734-8894) &
  • Attila Imre Katona (ORCID: 0000-0001-7946-6265)

Scientific Data volume 9, Article number: 301 (2022)

  • Interdisciplinary studies

Innovation and quality management are two fundamental business orientations that complement each other in improving performance and are important drivers of long-term economic growth. These themes have generated widespread attention in the literature; however, most of these studies mainly focused on a narrow area and only in a short term. No systematic effort has been made to build an extended bibliometric database regarding these research areas, which can be immediately used to conduct literature reviews. This study presents a complete (from 1975–2021), up-to-date, preprocessed and geocoded bibliometric database combining published articles of the two themes. The data collection was performed following the PRISMA methodology. The database consists of seven data tables, including one core dataset with 59,231 records and six citation network-related tables, including latitude and longitude values of the affiliations. These data will benefit researchers conducting comparative and in-depth analyses, such as gaining an overview of relevant existing studies, identifying relevant trends and gaining opportunities for a variety of geographic analyses.

Background & summary.

In today’s growing global competition, organizations are obliged to promote innovation and improve quality simultaneously to create, defend and enhance their competitive advantage 1 , 2 , 3 . However, a traditional view holds that there is a trade-off between innovation and quality, such that improving one degrades the other 4 , 5 , 6 . Conversely, the modern view rejects this idea and suggests that innovation and quality can coexist, and that companies that achieve excellence in quality are expected to also excel in innovation 7 , 8 , 9 . Thus, innovation and quality management are two necessary business orientations that complement each other in improving performance and are important drivers of long-term economic growth 1 , 10 .

The significance of innovation research is based on its application across numerous disciplines, the international richness of these studies, and the variety of new ideas, practices, and technologies that have been examined in this field. Moreover, innovation frameworks and theories have become fundamental factors for solving human problems. For example, in a review of COVID-19 vaccine innovations, Vuong et al . 11 demonstrated how COVID-19 vaccines were developed and produced in a very short time. In addition, George et al . 12 explored how digital innovations are helping to tackle climate change and promote sustainable development. Moreover, Toebelmann and Wendler 13 demonstrated how environmental innovation contributes to reductions in carbon dioxide emissions in EU-27 countries.

Over the past decades, innovation, quality management and their relationship have generated widespread attention in the literature. However, most of these studies mainly focus attention on a narrow area in the short term 14 , and the limited number of studies that have studied the relationship between them shows controversial results 15 . Therefore, to obtain a comprehensive understanding of the context of innovation and quality management, synthesize the extant knowledge on these areas, address relevant gaps and stimulate future research, systematic reviews and meta-analyses need to be performed in the future 16 , 17 . However, data collection, preprocessing and data mining in systematic reviews and meta-analyses are time- and resource-intensive steps. In addition, most bibliometric platforms provide structured data only without geocoded affiliations, and therefore, it is difficult to analyze research hot spots or hubs. Nevertheless, an open scientific dataset including the aforementioned features and supported by program code can be a powerful tool to encourage open community dialog and support new scientific discoveries 18 .

To our knowledge, no systematic effort has been made to build an extended bibliometric dataset covering the two research areas that can be immediately used to conduct literature reviews. Nevertheless, in other research fields, scholars have shown that geocoded bibliometric databases are highly valuable 19 . Therefore, we provide an up-to-date, easily accessible, cleaned, preprocessed and geocoded bibliometric database combining the published articles of the two themes (innovation and quality management) as a large and comprehensive database.

Our work makes the following significant contributions.

The database provides a comprehensive overview from 1975 to 2021 of scientific literature in the areas of innovation and quality management.

Researchers and practitioners can benefit from this database by (1) gaining an overview of relevant existing studies in the fields of innovation and quality management, (2) identifying relevant (future) trends and (3) through geocoded records, gaining opportunities for a variety of geographic distance calculations, spatial clustering and visualizations. In addition, the database allows answering research questions such as “What kind of relationship can be observed between natural and social sciences in terms of innovation and quality management?”, “What spatial-temporal patterns can be observed in the dominant topics related to innovation and/or quality management?” and “How have the two research fields influenced each other?”

The data build a baseline for comparative as well as in-depth analyses such as (1) text mining approaches to reveal hidden topics, (2) time-series analyses, (3) citation network analyses, (4) geographical analysis and (5) spatial network analysis.

The worldwide coverage of the data enables scientific research on innovation and quality management, which supports identifying and analyzing research hot spots and revealing dominant topics from geographical point of view.

The database can be used to conduct research that supports decision-making in the field of quality management and innovation policy.

The rest of the paper is organized as follows. Section Methods describes the methods applied for data collection, preprocessing, data mining and citation network construction in detail, section Data Records presents the database structure and provides illustrative examples for using the data associated with text mining approaches, citation network analysis and spatial analysis, section Technical Validation shows the validation results and finally, section Usage Notes gives data application information.

Methods

Database construction was conducted in three phases: (1) collecting data from the Web of Science (WoS), (2) cleaning and data mining (preprocessing) and (3) constructing citation network nodes and edge tables. Fig. 1 shows the database construction framework.

Figure 1. Database construction framework.

Data collection

Data collection was performed following the PRISMA methodology proposed by 20 . This method provides guidance to scholars in conducting systematic literature reviews by following the four proposed steps: (1) identification, (2) screening, (3) eligibility and (4) included. The PRISMA methodology was selected as the framework of the data collection and filtering due to the following advantages: it provides a comprehensive and transparent process; it is applicable in any research field; and it strongly supports the reproducibility of the review. Furthermore, a process flow diagram (as also contained by PRISMA) helps readers to better understand the overall process and the boundaries of the study and can increase the quality of the literature review 21 . The search was conducted separately for the two areas of interest, and the results were combined to provide the entire bibliometric dataset. Fig.  2 shows the data collection process.

Figure 2. PRISMA flowchart.

A keyword search was used in both topics on the WoS platform. The search was conducted only on article titles to minimize the inclusion of nonrelevant papers that only mention the terms within the abstract related to innovation or quality management. The entire time range was analyzed (the first available year in WoS was 1975) until the date of data collection, which occurred on 22 September 2021. Regarding the language of the documents, English was considered as the lingua franca of scientific publication 22 . In the case of the innovation field (left side of Fig.  2 ), 61,930 records were found after the keyword search in the specified time interval. This number was reduced to 38,630 after applying the abovementioned filtering rules by excluding 23,570 papers. In the case of quality management (right side of Fig.  2 ), 36,390 papers were found by the keyword search, and this number was reduced during the screening phase to 20,517 by excluding 15,873 papers as they were documents with non-English language or were types other than articles. In this paper, step “Eligibility” is not relevant because the number of resulting data records does not make it possible to manually read and evaluate all the screened papers. After the application of the steps proposed by the PRISMA methodology, 59,231 screened papers were collected into the database.

Cleaning and data mining

The goal of this step was to extend the dataset with additional valuable variables such as the institute of the first author, country of the first author, ISO3 country code, COVID-19 content, geographical coordinates and topic indicator (innovation or quality management). The additional variables were provided based on the following:

Institute of the first author (Institute)

The column “Affiliations” provided by WoS was used to extract the first author’s affiliation. These data were stored initially as a continuous string including all the authors’ names and affiliations. A further problem was that authors sharing the same affiliation were handled as one entity within the string, and the affiliations did not follow the same format and structure in all cases. Due to this unstructured nature, text cleaning and text mining were needed to extract the required information. To return the first author’s affiliation, first, regular expressions were used to remove the unnecessary substrings; second, abbreviations were replaced by their full forms (such as “University” instead of “Univ.” or “Department” instead of “Dept.”). Finally, the cleaned string was tokenized to separate the specific parts of the affiliation, such as the institute name, city, street address, and country. These steps were performed using the Python programming language.
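The three cleaning steps can be sketched as follows. This is a minimal illustration rather than the authors' notebook code: the input layout (bracketed author groups, affiliations separated by semicolons, address parts separated by commas), the abbreviation table and the function name `first_author_affiliation` are all assumptions.

```python
import re

# Hypothetical abbreviation table; a real run would likely expand many more terms.
ABBREVIATIONS = {
    r"\bUniv\b\.?": "University",
    r"\bDept\b\.?": "Department",
}


def first_author_affiliation(raw):
    """Return the tokens of the first affiliation in a raw affiliation string."""
    # Step 1: drop bracketed author-name groups such as "[Smith, J.; Lee, K.]".
    cleaned = re.sub(r"\[[^\]]*\]", "", raw)
    # Keep only the first affiliation (assumed to be separated by semicolons).
    first = cleaned.split(";")[0]
    # Step 2: expand abbreviations to their full forms.
    for pattern, full in ABBREVIATIONS.items():
        first = re.sub(pattern, full, first)
    # Step 3: tokenize on commas into institute, department, city, country, ...
    return [token.strip() for token in first.split(",") if token.strip()]
```

Removing the bracketed author groups before splitting matters in this sketch, because the author lists themselves contain semicolons.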

Country-related columns (Country, ISO3)

Using the preprocessed column “ Institute ”, the country of the first author’s affiliation was extracted as part of the cleaned string. Not only were country names extracted, but their ISO3 codes were mapped. Since several statistical programs and packages (such as R) identify countries based on ISO codes, this step makes easy identification possible for statistical packages without the need for further mapping effort by the researcher.
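As a rough sketch of this mapping step, the country can be read from the last affiliation token and looked up in a name-to-ISO3 table. The lookup table below is a hypothetical excerpt, and the assumption that the country is always the last token is ours, not a statement about the authors' code.

```python
# Hypothetical excerpt of a name-to-ISO3 lookup; a full table would cover all countries.
ISO3_BY_COUNTRY = {
    "hungary": "HUN",
    "netherlands": "NLD",
    "usa": "USA",
    "united states": "USA",
    "england": "GBR",
}


def country_and_iso3(affiliation_tokens):
    """Take the last affiliation token as the country and look up its ISO3 code."""
    country = affiliation_tokens[-1].strip()
    iso3 = ISO3_BY_COUNTRY.get(country.lower())  # None when the name is unknown
    return country, iso3
```

Returning `None` for unknown names keeps unmatched rows visible so they can be checked by hand instead of being silently mislabeled.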

COVID-19 content (Covid19Content)

To highlight whether a paper was written in the context of COVID-19, a keyword search was used, and the value of the column was set to 1 if the title, the keywords or the abstract contained at least one of the following keywords: “ COVID ”, “ coronavirus ”, “ pandemic ”, or “ SARSCoV2 ”. Otherwise, its value was set to zero.
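A minimal sketch of this flag is shown below. The source does not say whether matching was case-sensitive or whether hyphens were stripped, so the case-insensitive substring matching used here is an assumption; with this version, a hyphenated spelling such as "SARS-CoV-2" is matched only through the other keywords.

```python
COVID_KEYWORDS = ("covid", "coronavirus", "pandemic", "sarscov2")


def covid19_content(title, keywords, abstract):
    """Return 1 if any keyword occurs in the title, keywords or abstract, else 0."""
    # Missing fields (None) are treated as empty strings.
    text = " ".join(part or "" for part in (title, keywords, abstract)).lower()
    return int(any(keyword in text for keyword in COVID_KEYWORDS))
```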

Geographical coordinates (Lat, Lon)

Latitude and longitude values related to the first author’s affiliation were retrieved by geocoding using the GeoPy Python package. Geocoding was performed using the extracted and tokenized column “Institute” as input values.
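The geocoding loop might look like the sketch below. The paper names GeoPy but not the geocoding service, so the Nominatim example in the docstring is an assumption; the function itself accepts any geocoder callable, and the name `geocode_institutes` is hypothetical.

```python
def geocode_institutes(institutes, geocode):
    """Map each institute string to (lat, lon), or (None, None) if not found.

    `geocode` is any callable that takes a query string and returns an object
    with `latitude` and `longitude` attributes, or None. With GeoPy, one
    option (an assumption, the paper does not name the service) would be:
        from geopy.geocoders import Nominatim
        geocode = Nominatim(user_agent="my-app-name").geocode
    """
    cache = {}
    for name in institutes:
        if name in cache:  # avoid repeated requests for the same institute
            continue
        location = geocode(name)
        if location is None:
            cache[name] = (None, None)
        else:
            cache[name] = (location.latitude, location.longitude)
    return cache
```

Caching by institute name keeps the number of requests proportional to the number of distinct institutes rather than the number of papers.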

Search category indicator (InnovQMCateg)

This column was manually added when combining the results from both searches as described by Fig.  2 . If a specific paper was collected exclusively by the innovation-related search, its value was set to “ innovation ”, and the category name “ quality ” indicates that the paper can be found exclusively in the quality management search results. Finally, the intersection was denoted by the category name “ both ”. In this case, duplicated records were removed from the data table.
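The labeling logic amounts to set operations on the two result lists. The sketch below assumes each record carries a unique identifier (for example a DOI or accession number); the function name `categorize` is hypothetical.

```python
def categorize(innovation_ids, quality_ids):
    """Label each unique record ID as 'innovation', 'quality' or 'both'."""
    innovation, quality = set(innovation_ids), set(quality_ids)
    labels = {}
    for record_id in innovation | quality:  # the union removes duplicates
        if record_id in innovation and record_id in quality:
            labels[record_id] = "both"
        elif record_id in innovation:
            labels[record_id] = "innovation"
        else:
            labels[record_id] = "quality"
    return labels
```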

Citation network construction

The node and edge tables were generated using the collected and further processed dataset from WoS. In the core dataset, the cited articles were stored in a single column in string format. To construct the edge list format from the string-type input variable, RegEx (regular expressions) commands were applied to find all the DOI numbers appearing within the long text. After extracting the cited DOI numbers, a list format was constructed. The edge list construction process can be described as follows:

1. Select paper i.
2. Extract all DOI numbers from the string of cited references using regular expressions.
3. For all the extracted DOI numbers, add DOI i – DOI j pairs to the edge list (where j is the j th element of the extracted cited DOIs for paper i ).
4. Iterate steps 1–3 through all the papers (DOIs) within the core dataset.

The construction was performed in the Python programming language using the re, NLTK, NumPy and pandas packages.
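The edge list procedure can be sketched with the standard `re` module alone. The DOI pattern and the layout of the cited-references string are assumptions made for illustration, and the function name `build_edge_list` is hypothetical.

```python
import re

# Assumed DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace or a separator character.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s;,\]]+")


def build_edge_list(papers):
    """Build (citing DOI, cited DOI) pairs from {doi: cited-references string}."""
    edges = []
    for citing_doi, cited_refs in papers.items():          # steps 1 and 4
        for cited_doi in DOI_PATTERN.findall(cited_refs):  # step 2
            edges.append((citing_doi, cited_doi))          # step 3
    return edges
```

Cited references without a DOI produce no edge in this sketch, which is consistent with the limitation noted later that missing DOI numbers can leave some citations undetected.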

Data Records

The database (see Excel files with this article) provides an overview from 1975 to 2021 of scientific literature in the areas of innovation and quality management. Table  1 shows the database structure.

The seven data tables are contained in four Excel files:

InnovQm.xlsx : bibliometric data for each article

cnetQM.xlsx : citation-related network data containing quality management

cnetIN.xlsx : citation-related network data containing innovation

cnetALL.xlsx : citation-related network data containing both fields

All of the four Excel files containing the data tables described in Table 1 are accessible at figshare 23 . The referred figshare database also contains an R notebook file (as well as HTML format) as a related manuscript file ( InnovQmAnalysis.Rmd ) including the visualizations and example analyses such as (1) geographical plotting, (2) co-occurrence network construction, (3) text cleaning and (4) visualization of citation network 23 .

The first file ( InnovQm.xlsx ) shows the bibliometric data for each article, which includes bibliographic information such as authorship, title, year of publication, journal, number of citations and affiliation-related data (see details in Table  2 ).

The InnovQm dataset can be used to conduct several types of analysis, such as text mining on textual fields of bibliometric data (e.g., columns Title, Abstract, Keywords, KeywordsPlus) and geographical analysis using geocoded data (e.g., columns Institute, Country, ISO3). As an example, Fig.  3a shows the 51,743 geocoded institutes as the first author’s affiliation and Fig.  3b depicts the frequency of COVID-19-related papers by country.

Figure 3. Number of located institutes and COVID-19-related papers in the database.

As a further example for text analysis, keyword co-occurrences can be visualized by networks, as shown in Fig.  4 . The analysis can be conducted by subsetting the dataset based on several factors. Fig.  4 visualizes the co-occurrence network by discipline (using column InnovQmCateg ) and geographical location (using column Country ).

Figure 4. Keyword co-occurrence networks.

The next three Excel files contain citation-related network data connected to quality management ( cnetQM.xlsx ), innovation ( cnetIN.xlsx ) and both fields ( cnetALL.xlsx ). Table  3 shows the structure of the node and edge tables.

The three node tables are structured by scientific paper, with one paper per row. The three edge tables contain the links between the scientific papers, indicating citation relationships. The tables also include attributes such as Publisher , Institute and Country , which can be used for developing different networks. The geocoded data allow us to generate spatial networks. For example, the network in Fig. 5, generated from the cnetALL.xlsx file, contains 59,178 nodes and 244,976 edges. Using the cnetALL.xlsx file, further networks such as those between authors or institutes can be developed. The variable Covid19Content is included in the node tables and provides the opportunity to analyze the subgraph of papers with COVID-19 content.

Figure 5. Citation network.

Fig.  5 shows the citation network of the two themes based on the research fields, where the different colors show different modules as a result of community detection. Labels show the top 3 most frequent WoS categories. As Fig.  5 represents, the different research fields (e.g., green technologies, chemistry, engineering) are well separated, and their relationship within the network can be discovered using the provided database. Further citation networks such as between authors, institutions, countries, research fields, and COVID-19-related topics can be developed. Moreover, each citation network can be constructed in different time intervals since the database covers a long range of publication years.

Technical Validation

In this section, we present the technical validation of the extracted variables resulting from the additional data mining. First, the geocodes of the affiliations were tested by taking a sample of 287 papers from the dataset. The adequate sample size was determined based on the following equation 24 :

$$n = \left( \frac{Z_{\alpha/2}\,\sigma}{e} \right)^{2}$$

where n is the sample size, Z_{α/2} is the Z score of the selected confidence level, σ is the standard deviation of the population and e is the margin of error during the estimation of the population mean from the sample. The estimation was performed separately for latitude and longitude values, and the greater of the two sample sizes was used for the sampling strategy:

$$n = \max\{\, n_{\mathrm{lat}},\ n_{\mathrm{lon}} \,\}$$

During the calculation, the confidence level was set to 95%, which indicates 1.96 as the Z score value. To represent an overall 5% error interval, e was set to ±2.5% of the given latitude/longitude ranges. After the substitution of the adequate values, the sample size can be determined as n  =  max {185.80, 286.79}, indicating that 287 samples should be taken. The sample was randomly taken from the dataset, and Fig.  6 illustrates the geographical locations and distribution of the sample compared to the population.
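The calculation can be reproduced with a few lines of Python. The sketch assumes the standard sample-size formula for estimating a mean, n = (Z·σ/e)², which matches the symbol definitions above; the standard deviations used by the authors are not given in the text, so only the final rounding step is checked against the reported numbers. Both function names are hypothetical.

```python
import math


def sample_size(z, sigma, e):
    """Sample size for estimating a mean: n = (z * sigma / e) ** 2."""
    return (z * sigma / e) ** 2


def required_samples(n_lat, n_lon):
    """Take the larger of the two estimates and round up to whole papers."""
    return math.ceil(max(n_lat, n_lon))
```

With the two values reported above, `required_samples(185.80, 286.79)` gives the 287 papers that were sampled.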

Figure 6. Sample properties.

As the histograms in Fig.  6a show, the sample covers a good range of geographical locations, including all continents. It is also observable that the sample well describes the mean and range of the population (see Fig.  6b ).

Fig.  7a,c show the relationship between the automatically extracted (by the Python code) and the manually searched coordinates. The red reference line includes those points where the results of the manual search and data extraction were equal. The subfigures show a good fitting and strong relationship with a 0.99 (Pearson) correlation coefficient in both cases. Fig.  7b,d show the distribution of the errors (the difference between the expected value and the extracted value). The mean absolute deviation (MAD) is 0.08 in the case of latitudes, and its value is 0.07 in the case of longitudes. Two cases were found with higher differences in terms of longitudes, but after manual checking, we concluded that they did not cause significant differences in the geographical pattern. Based on the above, it can be concluded that the performance of geocoding is acceptable.
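Both validation measures are straightforward to compute for any pair of coordinate lists. In the sketch below, MAD is interpreted as the mean of the absolute differences between the manually searched and the extracted values; this reading follows the description of the errors above but is an assumption, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_deviation(expected, extracted):
    """Mean of |expected - extracted|, the error measure reported as MAD."""
    return sum(abs(a - b) for a, b in zip(expected, extracted)) / len(expected)
```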

Figure 7. Validation of the geographical coordinates.

The institution names, country names, country codes and validity of the cited references were checked in relation to the selected 287 papers. During the examination, manual checking was performed, and no mistakes were found. The existence of citations was validated using Google Scholar.

Usage Notes

Our database can be used to conduct literature reviews and meta-analyses in the field of innovation and/or quality management. It is also possible to analyze the relationship between the two research fields. Text mining methods can be applied, such as topic mining, word clouds, and co-occurrence networks, to reveal the trends in latent topics. Using citation networks, the most important hubs can be identified with the application of a wide range of network centrality metrics. It is also possible to construct multilayer networks (such as applied by Gadár et al . 25 ) to analyze the relationship between the two fields, and adding publication time as a dimension enables us to model the evolution of these research fields. Due to the geocoded variables, spatial analysis can also be conducted, even with spatial citation networks, to identify geographical patterns.
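As one simple example of a centrality metric, hubs can be ranked by in-degree, that is, by how often a paper is cited within the network. The sketch assumes the edge table has been loaded as (citing, cited) pairs; the actual column names of the edge tables are not reproduced here, and the function name `top_cited` is hypothetical.

```python
from collections import Counter


def top_cited(edges, k=3):
    """Rank papers by in-degree, i.e. how often they appear as the cited DOI."""
    in_degree = Counter(cited for _citing, cited in edges)
    return in_degree.most_common(k)
```

Other metrics such as betweenness or PageRank would need a graph library, but the same edge list can serve as their input.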

Limitations

It is necessary to note that the proposed database has some limitations.

The last step of the PRISMA methodology (“Eligibility”) could not be performed due to the size of the data; therefore, the PRISMA methodology was only partly followed.

Global bibliographic databases can have some shortcomings. In the Web of Science dataset used, some DOI numbers are missing, which can cause some citations to go undetected.

As Vuong et al. 26 pointed out, bibliographic datasets can include different name versions for the same author. Although this problem does not affect the examples provided in this paper, users should take it into account when conducting author-level analyses.

Finally, despite being indexed in global bibliometric databases, in some cases, published papers can be retracted 27 , and this may not be reflected in the developed datasets.

Code availability

Data preprocessing tasks were performed in the Python programming language. A Jupyter notebook (Geocode_cnet.ipynb) containing the data preprocessing steps is provided alongside the paper 23 . Example figures (geographical distribution, co-occurrence networks, citation network) were constructed in the R programming language. The R script is provided so that researchers can reproduce the figures and modify them to suit their needs.

Bourke, J. & Roper, S. Innovation, quality management and learning: Short-term and longer-term effects. Res. Policy 46 , 1505–1518, https://doi.org/10.1016/j.respol.2017.07.005 (2017).

Pekovic, S. & Galia, F. From quality to innovation: Evidence from two french employer surveys. Technovation 29 , 829–842, https://doi.org/10.1016/j.technovation.2009.08.002 (2009).

Martinez-Costa, M. & Martinez-Lorente, A. R. Does quality management foster or hinder innovation? an empirical study of spanish companies. Total. Qual. Manag. 19 , 209–221, https://doi.org/10.1080/14783360701600639 (2008).

Flynn, B. B. The relationship between quality management practices, infrastructure and fast product innovation. Benchmarking for Qual. Manag. & Technol. 1 , 48–64, https://doi.org/10.1108/14635779410056886 (1994).

Li, D., Zhao, Y., Zhang, L., Chen, X. & Cao, C. Impact of quality management on green innovation. J. Clean. Prod. 170 , 462–470, https://doi.org/10.1016/j.jclepro.2017.09.158 (2018).

Zeng, J., Phan, C. A. & Matsui, Y. The impact of hard and soft quality management on quality and innovation performance: An empirical study. Int. J. Prod. Econ. 162 , 216–226, https://doi.org/10.1016/j.ijpe.2014.07.006 (2015).

Kafetzopoulos, D., Gotzamani, K. & Gkana, V. Relationship between quality management, innovation and competitiveness. evidence from greek companies. J. Manuf. Technol. Manag. 26 , 1177–1200, https://doi.org/10.1108/JMTM-02-2015-0007 (2015).

Kim, D.-Y., Kumar, V. & Kumar, U. Relationship between quality management practices and innovation. J. Oper. Manag. 30 , 295–315, https://doi.org/10.1016/j.jom.2012.02.003 (2012).

Terziovski, M. & Guerrero, J.-L. Iso 9000 quality system certification and its impact on product and process innovation performance. Int. J. Prod. Econ. 158 , 197–207, https://doi.org/10.1016/j.ijpe.2014.08.011 (2014).

Fiss, P. C. Building better causal theories: A fuzzy set approach to typologies in organization research. Acad. Manag. J. 54 , 393–420, https://doi.org/10.5465/amj.2011.60263120 (2011).

Vuong, Q.-H. et al . Covid-19 vaccines production and societal immunization under the serendipity-mindsponge-3d knowledge management theory and conceptual framework. Humanit. Soc. Sci. Commun. 9 , 1–12, https://doi.org/10.1057/s41599-022-01034-6 (2022).

George, G., Merrill, R. K. & Schillebeeckx, S. J. Digital sustainability and entrepreneurship: How digital innovations are helping tackle climate change and sustainable development. Entrepreneurship Theory Pract. 45 , 999–1027, https://doi.org/10.1177/1042258719899425 (2021).

Toebelmann, D. & Wendler, T. The impact of environmental innovation on carbon dioxide emissions. J. Clean. Prod. 244 , 118787, https://doi.org/10.1016/j.jclepro.2019.118787 (2020).

Blind, K., Ramel, F. & Rochell, C. The influence of standards and patents on long-term economic growth. The J. Technol. Transf . 1–21, https://doi.org/10.1007/s10961-021-09868-z (2021).

El Manzani, Y., Sidmou, M. L. & Cegarra, J.-J. Does ISO 9001 quality management system support product innovation? an analysis from the sociotechnical systems theory. Int. J. Qual. & Reliab. Manag. 36 , 951–982, https://doi.org/10.1108/IJQRM-09-2017-0174 (2019).

Riillo, C. A. F. Quality management and innovation: a review of quantitative studies. Int. J. Prod. Qual. Manag. 14 , 441–456, https://doi.org/10.1504/ijpqm.2014.065557 (2014).

Segarra-Cipres, M., Escrig-Tena, A. B. & Garcia-Juan, B. The link between quality management and innovation performance: a content analysis of survey-based research. Total. Qual. Manag. & Bus. Excell. 31 , 1–22, https://doi.org/10.1080/14783363.2017.1401460 (2020).

Vuong, Q. H. Open data, open review and open dialogue in making social sciences plausible. Nature: Sci. Data Updat . http://blogs.nature.com/scientificdata/2017/12/12/authors-corner-open-data-open-review-and-open-dialogue-in-making-social-sciences-plausible/ (2017).

Su, Y., Gabrielle, B. & Makowski, D. A global dataset for crop production under conventional tillage and no tillage systems. Sci. Data 8 , 1–17 (2021).

Moher, D. et al . Preferred reporting items for systematic reviews and meta-analyses: the prisma statement. Int J Surg 8 , 336–341, https://doi.org/10.1371/journal.pmed.1000097 (2010).

Vu-Ngoc, H. et al . Quality of flow diagram in systematic review and/or meta-analysis. PloS one 13 , e0195955, https://doi.org/10.1371/journal.pone.0195955 (2018).

Melitz, J. English as a lingua franca: Facts, benefits and costs. The World Econ. 41 , 1750–1774, https://doi.org/10.1111/twec.12643 (2018).

A global database for conducting systematic reviews and meta-analyses in innovation and quality management. figshare https://doi.org/10.6084/m9.figshare.c.5704030.v1 (2022).

Smith, M. Sampling considerations in evaluating cooperative extension programs. (1983).

Gadar, L., Kosztyan, Z. T., Telcs, A. & Abonyi, J. A multilayer and spatial description of the erasmus mobility network. Sci. Data 7 , 1–11 (2020).

Vuong, Q.-H. et al . An open database of productivity in vietnam’s social sciences and humanities for public use. Sci. Data 5 , 1–15, https://doi.org/10.1038/sdata.2018.188 (2018).

Vuong, Q.-H. Reform retractions to make them more transparent. Nature 582 , 149, https://doi.org/10.1038/d41586-020-01694-x (2020).

Acknowledgements

This work has been implemented by the TKP2020-NKA-10 project with the support provided by the Ministry for Innovation and Technology of Hungary from the National Research, Development and Innovation Fund, financed under the 2020 Thematic Excellence Programme funding scheme and the ÚNKP-20-4 New National Excellence Program of the Ministry for Innovation and Technology from the source of the National Research, Development and Innovation Fund.

Open access funding provided by University of Pannonia.

Author information

These authors contributed equally: Tibor Csizmadia, Attila Imre Katona.

Authors and Affiliations

University of Pannonia, Department of Management, Veszprém, 8200, Hungary

Tibor Csizmadia

University of Pannonia, Department of Quantitative Methods, Veszprém, 8200, Hungary

Attila Imre Katona

Contributions

T.C. and A.I.K. contributed to conceiving and designing the study, collecting and preprocessing the data, conducting analysis leading to the usage notes, and coauthoring the manuscript. Both authors reviewed and approved the manuscript.

Corresponding author

Correspondence to Tibor Csizmadia .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Csizmadia, T., Katona, A.I. A global database for conducting systematic reviews and meta-analyses in innovation and quality management. Sci Data 9 , 301 (2022). https://doi.org/10.1038/s41597-022-01427-x

Received : 18 January 2022

Accepted : 26 May 2022

Published : 14 June 2022

DOI : https://doi.org/10.1038/s41597-022-01427-x

PolyU Library

Systematic Search for Systematic Review

Tutorials - Databases Searching Help

PubMed Help

  • PubMed Tutorial   (NLM)
  • Conducting a literature search using PubMed   (Medical College of Wisconsin Libraries)

CINAHL Help

  • CINAHL Databases - Basic Searching Tutorial   (EBSCO)
  • CINAHL Databases - Advanced Searching Tutorial   (EBSCO)
  • Using the CINAHL/MeSH Headings Feature in EBSCOhost  (EBSCO)
  • CINAHL Tutorials   (Yale University Cushing/Whitney Medical Library)

Embase Help

  • Basic Searching in Embase   (University of Michigan Library)
  • Embase Advanced Search   (UIC Library)
  • Embase PICO Search   (UIC Library)

Cochrane Help

  • Cochrane Search   (Pace University Library)
  • Cochrane Library - Advanced Search   (Bridgewater Library)
  • The Cochrane Training channel on YouTube

PsycINFO Help

  • Basic Searching in PsycINFO   (USC Libraries)
  • Searching PsycINFO with Subject Terms   (USC Libraries)
  • PsycINFO Searching Tips   (USC Libraries)

Copyright Disclaimer

Creative Commons License

Except where otherwise noted, the content of this guide is licensed under a  CC BY-NC 4.0 License .

Databases and Journals in Nursing and Health Science

A systematic review (SR) is a review of evidence-based studies that aims to help clinicians and researchers find the best available evidence on a specific problem. SRs are most often conducted in nursing and healthcare.

SR requires an exhaustive and systematic search of literature to ensure that all relevant evidence is included. A very important step for a systematic search is to select the databases you want to search within. Note that the databases you select and the search strategies should be described in your review as well. 

  • Core Databases
  • Other Databases
  • Grey Literature
  • Evidence-based Journals

For reviews in nursing and health science areas, here is a list of core databases to start from.

  • PubMed PubMed has been available since 1996 and is provided by the U.S. National Library of Medicine (NLM). It has more than 34 million references, comprising the Medline database plus material provided via PubMed Central (PMC), the NCBI Bookshelf, and other records such as in-process articles and some older materials. Medline, the largest biomedical bibliographic database, alone provides more than 29 million references to biomedical and life sciences journal articles (from more than 5,200 journals) dating back to 1946.

      Medline is also accessible on its own via other platforms:

  • Medline (1946+) via EbscoHost
  • CINAHL Complete (via EbscoHost) CINAHL (Cumulative Index of Nursing and Allied Health) Complete covers over 6 million records from nearly 5,500 academic journals and magazines in all areas of nursing & allied health literature. CINAHL Complete can be accessed via Ebscohost platform, which allows you to exclude records found in Medline.
  • Embase Embase covers over 32 million records from almost 8,300 journals in biomedical literature. Embase platform also allows you to exclude records found in Medline. Embase supports searching in PICO format.
  • Cochrane Library Cochrane Library provides high quality evidence-based literature to support health-care decision-making. It includes 6 databases, covering more than 1 million records. The most commonly used databases are: (1) Cochrane Database of Systematic Reviews (CDSR), which includes Cochrane reviews and systematic reviews, and (2) Cochrane Central Register of Controlled Trials (CENTRAL), which includes clinical trials (RCT and CCT). Cochrane Library is also available via Wiley platform. Cochrane reviews can also be searched through PubMed.
  • PsycINFO (via ProQuest) PsycINFO covers over 4.5 million records from more than 2,500 journals in psychology and related disciplines. PsycINFO can be accessed via ProQuest.

You may also consider including the following databases depending on your research topic.

  • PEDro (Physiotherapy Evidence Database) Free database of over 35,000 randomised trials, systematic reviews and clinical practice guidelines in physiotherapy
  • PROSPERO - International Prospective Register of Systematic Reviews
  • ERIC ERIC (Educational Resources Information Center) is the largest education database. It contains more than 1.6 million records of journal articles, books, conference papers, dissertations and theses, research reports, curriculum and teaching guides, etc., and covers literature on nursing education.
  • Web of Science Multidisciplinary citation database. Web of Science Core Collection covers 20,000+ journals, including 8,850+ from Science Citation Index Expanded (SCIE), 3,200+ from Social Science Citation Index (SSCI) and 1,700+ from Arts & Humanities Citation Index (AHCI).
  • Science Citation Index Expanded (SCIE)
  • Social Sciences Citation Index (SSCI)
  • Scopus Multidisciplinary citation database. Covers 22,800+ active journal titles, including 8,600+ in Social Sciences, 7,100+ in Health Sciences, 7,400+ in Physical Sciences, and 4,600+ in Life Sciences.

See all available databases by subject:

  • Databases in Nursing
  • Databases in Rehabilitation Sciences
  • Databases in Medicine & Health Science

Grey literature "is often used to refer to reports published outside of traditional commercial publishing" (Cochrane Handbook for Systematic Reviews of Interventions, chapter 4). Examples of grey literature include:

  • conference abstracts, presentations, proceedings;
  • regulatory data;
  • unpublished trial data;
  • government publications;
  • reports (such as white papers, working papers, internal documentation);
  • dissertations/theses;
  • patents; and
  • policies & procedures etc.

Searching the grey literature is important, because not all evidence is (commercially) published in journal articles delivered by major databases. It is worth noting that the producing bodies of grey literature are essential sources of quality information beyond the control of commercial publishers, as 'publishing' is normally not the primary activity of those bodies.

Theses & dissertations:

  • ProQuest Dissertations & Theses A comprehensive collection of PhD dissertations and Master theses from around the world. Covers Health & Medicine, etc.
  • See All Databases in Dissertations & Theses

Clinical trials: 

(Learn more about clinical trials from the PhRMA.org )

  • CENTRAL (part of Cochrane Library) Cochrane Central Register of Controlled Trials (CENTRAL) covers all clinical trials (Randomized Controlled Trials and Controlled Clinical Trials) in Medline.
  • ClinicalTrials.gov ClinicalTrials.gov is a resource provided by the U.S. National Library of Medicine. It is a database of privately and publicly funded clinical studies conducted around the world.
  • ICTRP Search Portal The ICTRP Search Portal, provided by WHO since Aug 2005, aims to provide a single point of access to information about ongoing and completed clinical trials. It provides a searchable database containing the trial registration data sets made available by data providers around the world meeting criteria for content and quality control.
  • ISRCTN registry ISRCTN supports transparency in clinical research, helps reduce selective reporting of results and ensures an unbiased and complete evidence base. The ISRCTN registry is a primary clinical trial registry recognised by WHO and ICMJE that accepts all clinical research studies (whether proposed, ongoing or completed), providing content validation and curation and the unique identification number necessary for publication. All study records in the database are freely accessible and searchable.

Online Resources for Grey Literature discoveries in the health sector:

  • MedNar : a free, medical-focused deep web search engine.
  • OpenGrey : a multidisciplinary European database covering science, technology, biomedical science, economics, social science and humanities. Records are in English.
  • Grey Matters: a practical tool for searching health-related grey literature : a checklist provided by the Canadian Agency for Drugs and Technologies in Health (CADTH) that points users to a rich list of online resources for locating medical grey literature and for recording the findings from each source.
  • Global Index Medicus (GIM) : collated and aggregated by WHO Regional Office Libraries, it provides worldwide access to biomedical and public health literature produced by and within low- and middle-income countries.
  • WHO Library Database : provides access to knowledge, including governing documents, reports and technical documentation, from WHO as well as from other sources of scientific literature produced around the world.
  • NY Academy of Medicine Grey Literature Report (not updated since 2017) : a searchable collection of reports compiled by The New York Academy of Medicine (NYAM) to alert readers to new grey literature publications in health services research and selected public health topics.

WorldWideScience.org : "... enables anyone with internet access to launch a single-query search of national scientific databases and portals in more than 70 countries, covering all of the world's inhabited continents and over three-quarters of the world's population. … It provides simultaneous access to "deep web" scientific databases, which are typically not searchable by commercial search engines." (abstracted from Wikipedia at https://en.wikipedia.org/wiki/WorldWideScience)

Some evidence-based journals may not be indexed in the core databases you usually search. These journals may also contain valuable work that can contribute to your systematic review, so consider covering them in your search as well.

  • Evidence-Based Medicine
  • Annals of Internal Medicine
  • Evidence-Based Mental Health
  • Evidence-Based Nursing
  • Cancer Treatment Reviews
  • Evidence-Based Child Health: A Cochrane Review Journal
  • Critical Pathways in Cardiology
  • Evidence-based Complementary and Alternative Medicine
  • International Journal of Evidence-Based Healthcare
  • Evidence-Based Healthcare and Public Health

Coverage of Databases

Different databases have different coverage in journal titles. It is worthwhile to examine the coverage of databases you selected, because this determines your search results and will have a direct impact on your review. Here is an overview of the differences in coverage across the core databases. 

  • Title Coverage of Medline vs. CINAHL vs. Embase vs. PsycINFO vs. Cochrane
  • Title Coverage of Medline vs. CINAHL vs. Embase vs. PsycINFO vs. Cochrane vs. Web of Science vs. Scopus
  • Comparison - Medline, Embase, CINAHL and PsycINFO
  • Sources & References

Title Coverage:   Medline vs. CINAHL Complete vs. Embase vs. Cochrane Library  vs.  PsycINFO

  • For a systematic review in the nursing and health science fields, Medline, CINAHL (Complete), Embase and Cochrane Library are the core databases for literature searching.
  • If your research topic involves psychological problems (e.g., cognitive behavior), include PsycINFO and Web of Science (SSCI) too in your literature search.
  • Databases may have overlaps in content, and each database also covers some exclusive (unique) titles. Figure 1 illustrates the title coverage across these five databases.
  • PubMed covers all records from Medline, plus some other materials (about 15%).
  • Embase  includes almost all of Medline journals, while more than 2,900 journals in Embase are not covered by Medline .
  • Medline has a better coverage of US journals, while Embase has a better coverage of European journals.
  • Around one third of Medline's journals (~1,800) are also covered by CINAHL Complete.
  • Around half of PsycINFO's journals (~1,200) are unique, i.e. NOT covered by Medline, Embase or CINAHL Complete.
  • Cochrane Database of Systematic Reviews ( CDSR ) is fully indexed by Medline . 
  • Cochrane Central Register of Controlled Trials ( CENTRAL ) contains nearly 530,000 citations, of which 310,000 (~58%) are from  Medline , 50,000 (~10%) are from  Embase , and the remaining 170,000 (~32%) are from other sources ( Cochrane Handbook , 2011). CENTRAL  covers all Randomized Controlled Trials (RCTs) and Controlled Clinical Trials (CCTs)  indexed in Medline  ( Source ).
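The CENTRAL percentages quoted above follow directly from the citation counts. The short sketch below recomputes them; the counts are the approximate figures given in the text, while the variable names are our own. Note that the Embase share works out to roughly 9.4%, which the guide rounds to about 10%.

```python
# Approximate citation counts quoted in the text (Cochrane Handbook, 2011).
central_sources = {
    "Medline": 310_000,
    "Embase": 50_000,
    "Other sources": 170_000,
}

total = sum(central_sources.values())

# Share of each source as a percentage of all CENTRAL citations.
shares = {name: 100 * count / total for name, count in central_sources.items()}

print(f"Total citations: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```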

Title Coverage:   Medline vs. CINAHL Complete vs. Embase vs. Cochrane Library  vs.  PsycINFO  vs.  Web of Science   vs.  Scopus

Web of Science and Scopus are two large multidisciplinary citation databases. You may not need to include them in a systematic review, but it is good to know to what extent the core databases are covered by these two. Figure 2 gives some indication of the title coverage.

  • There are still many journals outside the "core collection" of databases in nursing and health science. You may also need to hand search selected journals to make sure your literature search is exhaustive.
  • Web of Science (SSCI) has over 2,400 titles in the Social Science area, and around 70% of these titles are NOT covered by Medline, CINAHL Complete, Embase and PsycINFO. So if your research topic is in the social sciences, it is necessary to include Web of Science SSCI in your search.
  • Web of Science (SSCI) has around 660 titles in the Psychiatry and Psychology areas, of which around 50 are NOT covered by PsycINFO. So if your research topic concerns psychology, do include Web of Science (SSCI) in your search.
  • Scopus covers (almost) all titles in Medline, PsycINFO and Web of Science, so in most cases it is not necessary to search Scopus for your systematic review. CINAHL Complete has approximately 2,000 titles that are NOT covered by Scopus; more than half of these are magazines, trade publications, etc.
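Overlap and unique-coverage figures like those above can be derived from the databases' title lists with simple set operations. The sketch below uses invented journal names and hypothetical helper functions; real title lists would come from each vendor's coverage file and would usually need normalising first (for example, matching on ISSN rather than on title strings).

```python
# Hypothetical title lists, for illustration only.
titles = {
    "Medline": {"Journal A", "Journal B", "Journal C"},
    "CINAHL Complete": {"Journal B", "Journal D"},
    "PsycINFO": {"Journal C", "Journal E", "Journal F"},
}

def unique_titles(name, coverage):
    """Titles indexed by `name` and by no other database in `coverage`."""
    others = set().union(*(t for db, t in coverage.items() if db != name))
    return coverage[name] - others

def overlap(first, second, coverage):
    """Titles indexed by both databases."""
    return coverage[first] & coverage[second]

psycinfo_only = unique_titles("PsycINFO", titles)
shared = overlap("Medline", "CINAHL Complete", titles)

print(f"Unique to PsycINFO: {sorted(psycinfo_only)}")
print(f"Medline and CINAHL Complete share: {sorted(shared)}")
print(f"Share of Medline titles also in CINAHL: {len(shared) / len(titles['Medline']):.0%}")
```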

Here are some sources and references you can refer to if you need additional information.

  • Medline Fact Sheet : https://www.nlm.nih.gov/bsd/medline.html
  • Medline, PubMed, and PMC (PubMed Central) : How are they different?  https://www.nlm.nih.gov/bsd/difference.html
  • Medline Title List :  https://www.ncbi.nlm.nih.gov/nlmcatalog/?term=currentlyindexed  (retrieved on 20 Dec 2016)
  • CINAHL Complete Database Coverage List https://www.ebsco.com/m/ee/Marketing/titleLists/ccm-coverage.htm
  • CINAHL Complete Fact Sheet : https://www.ebscohost.com/nursing/products/cinahl-databases/cinahl-complete  (updated: 2018)
  • Embase Coverage and Content : https://www.elsevier.com/solutions/embase-biomedical-research/embase-coverage-and-content  (updated: May 2018)
  • PsycINFO Fact Sheet :  http://www.apa.org/pubs/databases/psycinfo/
  • PsycINFO Journal Coverage List : http://www.apa.org/pubs/databases/psycinfo/coverage.aspx  (updated: May 2018)
  • Cochrane Library - Databases Covered :  http://www.cochranelibrary.com/about/about-the-cochrane-library.html
  • Web of Science Master Journal List :  http://mjl.clarivate.com/  (updated: Jul 2017)
  • Scopus Content Coverage :  https://www.elsevier.com/solutions/scopus/content  (updated: Aug 2018)

References:

  • Chapman, D. (2009). Health-Related Databases.  Journal of the Canadian Academy of Child and Adolescent Psychiatry , 18(2), 148–149. Available: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2687480/
  • Higgins, J. P. T., & Green, S. (Eds.). (2011). Cochrane Handbook for Systematic Reviews of Interventions (Version 5.1.0 ed.): The Cochrane Collaboration. Available:  http://handbook.cochrane.org/

The Venn diagrams are generated by  Venn Diagram Maker Online , based on data extracted from the title lists given above (data accurate as of Dec 2016).

  • Last Updated: Apr 22, 2024 10:53 AM
  • URL: https://libguides.lb.polyu.edu.hk/syst_review
  • Open access
  • Published: 14 August 2018

Defining the process to literature searching in systematic reviews: a literature review of guidance and supporting studies

  • Chris Cooper   ORCID: orcid.org/0000-0003-0864-5607 1 ,
  • Andrew Booth 2 ,
  • Jo Varley-Campbell 1 ,
  • Nicky Britten 3 &
  • Ruth Garside 4  

BMC Medical Research Methodology volume 18, Article number: 85 (2018)

Systematic literature searching is recognised as a critical component of the systematic review process. It involves a systematic search for studies and aims for a transparent report of study identification, leaving readers clear about what was done to identify studies, and how the findings of the review are situated in the relevant evidence.

Information specialists and review teams appear to work from a shared and tacit model of the literature search process. How this tacit model has developed and evolved is unclear, and it has not been explicitly examined before.

The purpose of this review is to determine if a shared model of the literature searching process can be detected across systematic review guidance documents and, if so, how this process is reported in the guidance and supported by published studies.

A literature review.

Two types of literature were reviewed: guidance and published studies. Nine guidance documents were identified, including: The Cochrane and Campbell Handbooks. Published studies were identified through ‘pearl growing’, citation chasing, a search of PubMed using the systematic review methods filter, and the authors’ topic knowledge.

The relevant sections within each guidance document were then read and re-read, with the aim of determining key methodological stages. Methodological stages were identified and defined. This data was reviewed to identify agreements and areas of unique guidance between guidance documents. Consensus across multiple guidance documents was used to inform selection of ‘key stages’ in the process of literature searching.

Eight key stages were determined relating specifically to literature searching in systematic reviews. They were: who should literature search, aims and purpose of literature searching, preparation, the search strategy, searching databases, supplementary searching, managing references and reporting the search process.

Conclusions

Eight key stages to the process of literature searching in systematic reviews were identified. These key stages are consistently reported in the nine guidance documents, suggesting consensus on the key stages of literature searching, and therefore the process of literature searching as a whole, in systematic reviews. Further research to determine the suitability of using the same process of literature searching for all types of systematic review is indicated.

Systematic literature searching is recognised as a critical component of the systematic review process. It involves a systematic search for studies and aims for a transparent report of study identification, leaving review stakeholders clear about what was done to identify studies, and how the findings of the review are situated in the relevant evidence.

Information specialists and review teams appear to work from a shared and tacit model of the literature search process. How this tacit model has developed and evolved is unclear, and it has not been explicitly examined before. This is in contrast to the information science literature, which has developed information processing models as an explicit basis for dialogue and empirical testing. Without an explicit model, research in the process of systematic literature searching will remain immature and potentially uneven, and the development of shared information models will be assumed but never articulated.

One way of developing such a conceptual model is by formally examining the implicit “programme theory” as embodied in key methodological texts. The aim of this review is therefore to determine if a shared model of the literature searching process in systematic reviews can be detected across guidance documents and, if so, how this process is reported and supported.

Identifying guidance

Key texts (henceforth referred to as “guidance”) were identified based upon their accessibility to, and prominence within, United Kingdom systematic reviewing practice. The United Kingdom occupies a prominent position in the science of health information retrieval, as quantified by such objective measures as the authorship of papers, the number of Cochrane groups based in the UK, membership and leadership of groups such as the Cochrane Information Retrieval Methods Group, the HTA-I Information Specialists’ Group and historic association with such centres as the UK Cochrane Centre, the NHS Centre for Reviews and Dissemination, the Centre for Evidence Based Medicine and the National Institute for Clinical Excellence (NICE). Coupled with the linguistic dominance of English within medical and health science and the science of systematic reviews more generally, this offers a justification for a purposive sample that favours UK, European and Australian guidance documents.

Nine guidance documents were identified. These documents provide guidance for different types of reviews, namely: reviews of interventions, reviews of health technologies, reviews of qualitative research studies, reviews of social science topics, and reviews to inform guidance.

Whilst these guidance documents occasionally offer additional guidance on other types of systematic reviews, we have focused on the core and stated aims of these documents as they relate to literature searching. Table  1 sets out: the guidance document, the version audited, their core stated focus, and a bibliographical pointer to the main guidance relating to literature searching.

Once a list of key guidance documents was determined, it was checked by six senior information professionals based in the UK for relevance to current literature searching in systematic reviews.

Identifying supporting studies

In addition to identifying guidance, the authors sought to populate an evidence base of supporting studies (henceforth referred to as “studies”) that contribute to existing search practice. Studies were first identified by the authors from their knowledge of this topic area and, subsequently, through systematic citation chasing of key studies (‘pearls’ [ 1 ]) located within each key stage of the search process. These studies are identified in Additional file 1: Appendix Table 1. Citation chasing was conducted by analysing the bibliography of references for each study (backwards citation chasing) and through Google Scholar (forward citation chasing). A search of PubMed using the systematic review methods filter was undertaken in August 2017 (see Additional file 1). The search terms used were: (literature search*[Title/Abstract]) AND sysrev_methods[sb]. The search returned 586 results, which were sifted for relevance to the key stages in Fig. 1 by CC.
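As an illustration of how the PubMed search reported above could be re-run programmatically, the following Python sketch builds a request URL for the NCBI E-utilities esearch endpoint. The query string is taken from the text; the helper name build_esearch_url, the JSON output format and the retmax value are assumptions made for illustration and are not part of the authors' reported method. The sketch only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query exactly as reported in the text.
QUERY = "(literature search*[Title/Abstract]) AND sysrev_methods[sb]"


def build_esearch_url(term, retmax=0):
    """Return an esearch URL for a PubMed query (hypothetical helper).

    retmax caps the number of record IDs returned in the response.
    """
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


url = build_esearch_url(QUERY)

# Round-trip check: decoding the URL recovers the original query string.
decoded = parse_qs(urlparse(url).query)
print(decoded["term"][0])
```

A count obtained this way today would not be expected to match the 586 results reported for August 2017, since the database has grown since then.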

Fig. 1 The key stages of literature search guidance as identified from nine key texts

Extracting the data

To reveal the implicit process of literature searching within each guidance document, the relevant sections (chapters) on literature searching were read and re-read, with the aim of determining key methodological stages. We defined a key methodological stage as a distinct step in the overall process for which specific guidance is reported, and action is taken, that collectively would result in a completed literature search.

The chapter or section sub-heading for each methodological stage was extracted into a table using the exact language as reported in each guidance document. The lead author (CC) then read and re-read these data, and the paragraphs of the document to which the headings referred, summarising section details. This table was then reviewed, using comparison and contrast to identify agreements and areas of unique guidance. Consensus across multiple guidelines was used to inform selection of ‘key stages’ in the process of literature searching.

Having determined the key stages to literature searching, we then read and re-read the sections relating to literature searching again, extracting specific detail relating to the methodological process of literature searching within each key stage. Again, the guidance was then read and re-read, first on a document-by-document-basis and, secondly, across all the documents above, to identify both commonalities and areas of unique guidance.

Results and discussion

Our findings.

We were able to identify consensus across the guidance on literature searching for systematic reviews suggesting a shared implicit model within the information retrieval community. Whilst the structure of the guidance varies between documents, the same key stages are reported, even where the core focus of each document is different. We were able to identify specific areas of unique guidance, where a document reported guidance not summarised in other documents, together with areas of consensus across guidance.

Unique guidance

Only one document provided guidance on the topic of when to stop searching [ 2 ]. This guidance from 2005 anticipates a topic of increasing importance with the current interest in time-limited (i.e. “rapid”) reviews. Quality assurance (or peer review) of literature searches was only covered in two guidance documents [ 3 , 4 ]. This topic has emerged as increasingly important as indicated by the development of the PRESS instrument [ 5 ]. Text mining was discussed in four guidance documents [ 4 , 6 , 7 , 8 ] where the automation of some manual review work may offer efficiencies in literature searching [ 8 ].

Agreement between guidance: Defining the key stages of literature searching

Where there was agreement on the process, we determined that this constituted a key stage in the process of literature searching to inform systematic reviews.

From the guidance, we determined eight key stages that relate specifically to literature searching in systematic reviews. These are summarised in Fig. 1. The data extraction table that informs Fig. 1 is reported in Table 2. Table 2 reports the areas of common agreement and demonstrates that the language used to describe key stages and processes varies significantly between guidance documents.

For each key stage, we set out the specific guidance, followed by discussion on how this guidance is situated within the wider literature.

Key stage one: Deciding who should undertake the literature search

The guidance.

Eight documents provided guidance on who should undertake literature searching in systematic reviews [ 2 , 4 , 6 , 7 , 8 , 9 , 10 , 11 ]. The guidance affirms that people with relevant expertise of literature searching should ‘ideally’ be included within the review team [ 6 ]. Information specialists (or information scientists), librarians or trial search co-ordinators (TSCs) are indicated as appropriate researchers in six guidance documents [ 2 , 7 , 8 , 9 , 10 , 11 ].

How the guidance corresponds to the published studies

The guidance is consistent with studies that call for the involvement of information specialists and librarians in systematic reviews [ 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 ] and which demonstrate how their training as ‘expert searchers’ and ‘analysers and organisers of data’ can be put to good use [ 13 ] in a variety of roles [ 12 , 16 , 20 , 21 , 24 , 25 , 26 ]. These arguments make sense in the context of the aims and purposes of literature searching in systematic reviews, explored below. The need for ‘thorough’ and ‘replicable’ literature searches was fundamental to the guidance and recurs in key stage two. Studies have found poor reporting, and a lack of replicable literature searches, to be a weakness in systematic reviews [ 17 , 18 , 27 , 28 ] and they argue that involvement of information specialists/librarians would be associated with better reporting and better quality literature searching. Indeed, Meert et al. demonstrated that involving a librarian as a co-author of a systematic review correlated with a higher score in the literature searching component of a systematic review [ 29 ]. As ‘new styles’ of rapid and scoping reviews emerge, where decisions on how to search are more iterative and creative, a clear role for information specialists and librarians emerges here too [ 30 ].

Knowing where to search for studies was noted as important in the guidance, with no agreement as to the appropriate number of databases to be searched [ 2 , 6 ]. Database (and resource selection more broadly) is acknowledged as a relevant key skill of information specialists and librarians [ 12 , 15 , 16 , 31 ].

Whilst arguments for including information specialists and librarians in the process of systematic review might be considered self-evident, Koffel and Rethlefsen have questioned if the necessary involvement is actually happening [ 31 ].

Key stage two: Determining the aim and purpose of a literature search

The aim: Five of the nine guidance documents use adjectives such as ‘thorough’, ‘comprehensive’, ‘transparent’ and ‘reproducible’ to define the aim of literature searching [ 6 , 7 , 8 , 9 , 10 ]. Analogous phrases were present in a further three guidance documents, namely: ‘to identify the best available evidence’ [ 4 ] or ‘the aim of the literature search is not to retrieve everything. It is to retrieve everything of relevance’ [ 2 ] or ‘A systematic literature search aims to identify all publications relevant to the particular research question’ [ 3 ]. The Joanna Briggs Institute reviewers’ manual was the only guidance document where a clear statement on the aim of literature searching could not be identified. The purpose of literature searching was defined in three guidance documents, namely to minimise bias in the resultant review [ 6 , 8 , 10 ]. Accordingly, eight of nine documents clearly asserted that thorough and comprehensive literature searches are required as a potential mechanism for minimising bias.

The need for thorough and comprehensive literature searches appears as uniform within the eight guidance documents that describe approaches to literature searching in systematic reviews of effectiveness. Reviews of effectiveness (of intervention or cost), accuracy and prognosis, require thorough and comprehensive literature searches to transparently produce a reliable estimate of intervention effect. The belief that all relevant studies have been ‘comprehensively’ identified, and that this process has been ‘transparently’ reported, increases confidence in the estimate of effect and the conclusions that can be drawn [ 32 ]. The supporting literature exploring the need for comprehensive literature searches focuses almost exclusively on reviews of intervention effectiveness and meta-analysis. Different ‘styles’ of review may have different standards however; the alternative, offered by purposive sampling, has been suggested in the specific context of qualitative evidence syntheses [ 33 ].

What is a comprehensive literature search?

Whilst the guidance calls for thorough and comprehensive literature searches, it lacks clarity on what constitutes a thorough and comprehensive literature search, beyond the implication that all of the literature search methods in Table 2 should be used to identify studies. Egger et al. [ 34 ], in an empirical study evaluating the importance of comprehensive literature searches for trials in systematic reviews, defined a comprehensive search for trials as:

a search not restricted to English language;

where Cochrane CENTRAL or at least two other electronic databases had been searched (such as MEDLINE or EMBASE); and

at least one of the following search methods has been used to identify unpublished trials: searches for (i) conference abstracts, (ii) theses, (iii) trials registers; and (iv) contacts with experts in the field [ 34 ].
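The three criteria above can be read as a simple checklist. The following Python sketch encodes them as a function; the class name SearchReport, its field names and the labels used for the supplementary methods are hypothetical and chosen for illustration only, and the sketch is an interpretation of the definition quoted above rather than a tool used by Egger et al.

```python
from dataclasses import dataclass, field

# Methods for identifying unpublished trials named in the definition above
# (labels are assumptions made for this sketch).
UNPUBLISHED_METHODS = {
    "conference abstracts",
    "theses",
    "trials registers",
    "expert contact",
}


@dataclass
class SearchReport:
    english_only: bool                              # was the search limited to English?
    databases: set = field(default_factory=set)     # bibliographic databases searched
    supplementary: set = field(default_factory=set) # supplementary methods used


def is_comprehensive(report):
    """Apply the three criteria: language, databases, unpublished trials."""
    no_language_limit = not report.english_only
    others = report.databases - {"CENTRAL"}
    enough_databases = "CENTRAL" in report.databases or len(others) >= 2
    sought_unpublished = bool(report.supplementary & UNPUBLISHED_METHODS)
    return no_language_limit and enough_databases and sought_unpublished


example = SearchReport(False, {"MEDLINE", "EMBASE"}, {"theses"})
print(is_comprehensive(example))
```

A search limited to English, or one that used a single non-CENTRAL database, would fail the check regardless of the supplementary methods used.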

Tricco et al. (2008) used a similar threshold of bibliographic database searching AND a supplementary search method in a review when examining the risk of bias in systematic reviews. Their criteria were: one database (limited using the Cochrane Highly Sensitive Search Strategy (HSSS)) and handsearching [ 35 ].

Together with the guidance, this would suggest that comprehensive literature searching requires the use of BOTH bibliographic database searching AND supplementary search methods.

Comprehensiveness in literature searching, in the sense of how much searching should be undertaken, remains unclear. Egger et al. recommend that ‘investigators should consider the type of literature search and degree of comprehension that is appropriate for the review in question, taking into account budget and time constraints’ [ 34 ]. This view tallies with the Cochrane Handbook, which stipulates clearly that study identification should be undertaken ‘within resource limits’ [ 9 ]. This would suggest that the limits to comprehensiveness are recognised, but it raises questions about how those limits are decided and reported [ 36 ].

What is the point of comprehensive literature searching?

The purpose of thorough and comprehensive literature searches is to avoid missing key studies and to minimise bias [ 6 , 8 , 10 , 34 , 37 , 38 , 39 ] since a systematic review based only on published (or easily accessible) studies may have an exaggerated effect size [ 35 ]. Felson (1992) sets out potential biases that could affect the estimate of effect in a meta-analysis [ 40 ] and Tricco et al. summarise the evidence concerning bias and confounding in systematic reviews [ 35 ]. Egger et al. point to non-publication of studies, publication bias, language bias and MEDLINE bias, as key biases [ 34 , 35 , 40 , 41 , 42 , 43 , 44 , 45 , 46 ]. Comprehensive searches are not the sole factor to mitigate these biases but their contribution is thought to be significant [ 2 , 32 , 34 ]. Fehrmann (2011) suggests that ‘the search process being described in detail’, together with the application of standard comprehensive search techniques, increases confidence in the search results [ 32 ].

Does comprehensive literature searching work?

Egger et al., and other study authors, have demonstrated a change in the estimate of intervention effectiveness where relevant studies were excluded from meta-analysis [ 34 , 47 ]. This would suggest that missing studies in literature searching alters the reliability of effectiveness estimates. This is an argument for comprehensive literature searching. Conversely, Egger et al. found that ‘comprehensive’ searches still missed studies and that comprehensive searches could, in fact, introduce bias into a review rather than preventing it, through the identification of low quality studies then being included in the meta-analysis [ 34 ]. Studies query if identifying and including low quality or grey literature studies changes the estimate of effect [ 43 , 48 ] and question if time is better invested updating systematic reviews rather than searching for unpublished studies [ 49 ], or mapping studies for review as opposed to aiming for high sensitivity in literature searching [ 50 ].

Aim and purpose beyond reviews of effectiveness

The need for comprehensive literature searches is less certain in reviews of qualitative studies, and for reviews where a comprehensive identification of studies is difficult to achieve (for example, in public health) [ 33 , 51 , 52 , 53 , 54 , 55 ]. Literature searching for qualitative studies, and in public health topics, typically generates a greater number of studies to sift than in reviews of effectiveness [ 39 ] and demonstrating the ‘value’ of studies identified or missed is harder [ 56 ], since the study data do not typically support meta-analysis. Nussbaumer-Streit et al. (2016) have registered a review protocol to assess whether abbreviated literature searches (as opposed to comprehensive literature searches) have an impact on conclusions across multiple bodies of evidence, not only on effect estimates [ 57 ], which may develop this understanding. It may be that decision makers and users of systematic reviews are willing to trade the certainty from a comprehensive literature search and systematic review in exchange for different approaches to evidence synthesis [ 58 ], and that comprehensive literature searches are not necessarily a marker of literature search quality, as previously thought [ 36 ]. Different approaches to literature searching [ 37 , 38 , 59 , 60 , 61 , 62 ] and developing the concept of when to stop searching are important areas for further study [ 36 , 59 ].

The study by Nussbaumer-Streit et al. has been published since the submission of this literature review [ 63 ]. Nussbaumer-Streit et al. (2018) conclude that abbreviated literature searches are viable options for rapid evidence syntheses, if decision-makers are willing to trade the certainty from a comprehensive literature search and systematic review, but that decision-making which demands detailed scrutiny should still be based on comprehensive literature searches [ 63 ].

Key stage three: Preparing for the literature search

Six documents provided guidance on preparing for a literature search [ 2 , 3 , 6 , 7 , 9 , 10 ]. The Cochrane Handbook clearly stated that Cochrane authors (i.e. researchers) should seek advice from a trial search co-ordinator (i.e. a person with specific skills in literature searching) ‘before’ starting a literature search [ 9 ].

Two key tasks were perceptible in preparing for a literature search [ 2 , 6 , 7 , 10 , 11 ]. First, to determine if there are any existing or on-going reviews, or if a new review is justified [ 6 , 11 ]; and, secondly, to develop an initial literature search strategy to estimate the volume of relevant literature (and quality of a small sample of relevant studies [ 10 ]) and indicate the resources required for literature searching and the review of the studies that follows [ 7 , 10 ].

Three documents summarised guidance on where to search to determine if a new review was justified [ 2 , 6 , 11 ]. These focused on searching databases of systematic reviews (The Cochrane Database of Systematic Reviews (CDSR) and the Database of Abstracts of Reviews of Effects (DARE)), institutional registries (including PROSPERO), and MEDLINE [ 6 , 11 ]. It is worth noting, however, that as of 2015, DARE (and NHS EED) are no longer being updated and so the relevance of these resources will diminish over time [ 64 ]. One guidance document, ‘Systematic reviews in the Social Sciences’, noted, however, that databases are not the only source of information and that unpublished reports, conference proceedings and grey literature may also be required, depending on the nature of the review question [ 2 ].

Two documents reported clearly that this preparation (or ‘scoping’) exercise should be undertaken before the actual search strategy is developed [ 7 , 10 ].

The guidance offers the best available source on preparing the literature search, since published studies do not typically report how their scoping informed the development of their search strategies or how their search approaches were developed. Text mining has been proposed as a technique to develop search strategies in the scoping stages of a review although this work is still exploratory [ 65 ]. ‘Clustering documents’ and word frequency analysis have also been tested to identify search terms and studies for review [ 66 , 67 ]. Preparing for literature searches and scoping constitutes an area for future research.
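To make the idea of word frequency analysis concrete, the following Python sketch counts how many titles in a small scoping sample contain each word and returns the most frequent words as candidate search terms. It is a minimal illustration only: the function name candidate_terms, the stopword list and the sample titles are all hypothetical, and the cited studies may have used different and more sophisticated methods.

```python
import re
from collections import Counter

# A deliberately small stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def candidate_terms(titles, top_n=5):
    """Rank words by the number of titles they appear in."""
    counts = Counter()
    for title in titles:
        # Use a set so each word is counted once per title.
        words = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(words - STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical titles from a scoping search.
titles = [
    "Exercise therapy for chronic low back pain",
    "Yoga for chronic back pain in adults",
    "Exercise and education for back pain",
]
print(candidate_terms(titles, top_n=4))
```

Words that appear in most of the known relevant titles are natural candidates for free-text terms, although a searcher would still need to judge each one and look up the corresponding indexing terms.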

Key stage four: Designing the search strategy

The Population, Intervention, Comparator, Outcome (PICO) structure was the commonly reported structure promoted to design a literature search strategy. Five documents suggested that the eligibility criteria or review question will determine which concepts of PICO will be populated to develop the search strategy [ 1 , 4 , 7 , 8 , 9 ]. The NICE handbook promoted multiple structures, namely PICO, SPICE (Setting, Perspective, Intervention, Comparison, Evaluation) and multi-stranded approaches [ 4 ].

With the exclusion of The Joanna Briggs Institute reviewers’ manual, the guidance offered detail on selecting key search terms, synonyms, Boolean language, selecting database indexing terms and combining search terms. The CEE handbook suggested that ‘search terms may be compiled with the help of the commissioning organisation and stakeholders’ [ 10 ].
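The combination of synonyms and Boolean operators described in the guidance can be sketched as follows: synonyms within a concept are joined with OR, and the populated concepts are then joined with AND. The function names, the example terms and the generic syntax are assumptions made for illustration; real strategies also use database-specific indexing terms, truncation and field tags, which this sketch omits.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR, quoting phrases."""
    quoted = ['"%s"' % t if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_strategy(concepts):
    """AND together one OR-block per populated concept.

    Concepts with no terms are skipped, reflecting the point that not
    every element of PICO needs to be populated for a given question.
    """
    blocks = [or_block(terms) for terms in concepts.values() if terms]
    return " AND ".join(blocks)


# Hypothetical review question; comparator and outcome handling is illustrative.
concepts = {
    "population": ["adults", "older people"],
    "intervention": ["exercise", "physical activity"],
    "comparator": [],
    "outcome": ["falls"],
}
print(build_strategy(concepts))
```

Leaving a concept empty broadens the search, so the decision about which concepts to populate is a trade-off between sensitivity and the number of records to sift.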

The use of limits, such as language or date limits, was discussed in all documents [ 2 , 3 , 4 , 6 , 7 , 8 , 9 , 10 , 11 ].

Search strategy structure

The guidance typically relates to reviews of intervention effectiveness so PICO, with its focus on intervention and comparator, is the dominant model used to structure literature search strategies [ 68 ]. PICOs, where the S denotes study design, is also commonly used in effectiveness reviews [ 6 , 68 ]. As the NICE handbook notes, alternative models to structure literature search strategies have been developed and tested. Booth provides an overview on formulating questions for evidence based practice [ 69 ] and has developed a number of alternatives to the PICO structure, namely: BeHEMoTh (Behaviour of interest; Health context; Exclusions; Models or Theories) for use when systematically identifying theory [ 55 ]; SPICE (Setting, Perspective, Intervention, Comparison, Evaluation) for identification of social science and evaluation studies [ 69 ] and, working with Cooke and colleagues, SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) [ 70 ]. SPIDER has been compared to PICO and PICOs in a study by Methley et al. [ 68 ].

The NICE handbook also suggests the use of multi-stranded approaches to developing literature search strategies [ 4 ]. Glanville developed this idea in a study by Whiting et al. [ 71 ] and a worked example of this approach is included in the development of a search filter by Cooper et al. [ 72 ].

Writing search strategies: Conceptual and objective approaches

Hausner et al. [ 73 ] provide guidance on writing literature search strategies, delineating between conceptually and objectively derived approaches. The conceptual approach, advocated by and explained in the guidance documents, relies on the expertise of the literature searcher to identify key search terms and then develop key terms to include synonyms and controlled syntax. Hausner and colleagues set out the objective approach [ 73 ] and describe what may be done to validate it [ 74 ].

The use of limits

The guidance documents offer direction on the use of limits within a literature search. Limits can be used to focus literature searching to specific study designs or by other markers (such as by date) which limits the number of studies returned by a literature search. The use of limits should be described and the implications explored [ 34 ] since limiting literature searching can introduce bias (explored above). Craven et al. have suggested the use of a supporting narrative to explain decisions made in the process of developing literature searches and this advice would usefully capture decisions on the use of search limits [ 75 ].

Key stage five: Determining the process of literature searching and deciding where to search (bibliographic database searching)

Table 2 summarises the process of literature searching as reported in each guidance document. Searching bibliographic databases was consistently reported as the ‘first step’ to literature searching in all nine guidance documents.

Three documents reported specific guidance on where to search, in each case specific to the type of review their guidance informed, and as a minimum requirement [ 4 , 9 , 11 ]. Seven of the key guidance documents suggest that the selection of bibliographic databases depends on the topic of review [ 2 , 3 , 4 , 6 , 7 , 8 , 10 ], with two documents noting the absence of an agreed standard on what constitutes an acceptable number of databases searched [ 2 , 6 ].

The guidance documents summarise ‘how to’ search bibliographic databases in detail and this guidance is further contextualised above in terms of developing the search strategy. The documents provide guidance on selecting bibliographic databases, in some cases stating acceptable minima (e.g. the Cochrane Handbook states Cochrane CENTRAL, MEDLINE and EMBASE), and in other cases simply listing bibliographic databases available to search. Studies have explored the value in searching specific bibliographic databases, with Wright et al. (2015) noting the contribution of CINAHL in identifying qualitative studies [ 76 ], Beckles et al. (2013) questioning the contribution of CINAHL to identifying clinical studies for guideline development [ 77 ], and Cooper et al. (2015) exploring the role of UK-focused bibliographic databases to identify UK-relevant studies [ 78 ]. The host of the database (e.g. OVID or ProQuest) has been shown to alter the search returns offered. Younger and Boddy report differing search returns from the same database (AMED) where the ‘host’ was different [ 79 ].

The average number of bibliographic databases searched in systematic reviews has risen in the period 1994–2014 (from 1 to 4) [ 80 ] but there remains (as attested to by the guidance) no consensus on what constitutes an acceptable number of databases searched [ 48 ]. This is perhaps because the number of databases searched is the wrong question: researchers should be focused on which databases were searched and why, and which databases were not searched and why. The discussion should re-orientate to the differential value of sources, but researchers need to think about how to report this in studies to allow findings to be generalised. Bethel (2017) has proposed ‘search summaries’, completed by the literature searcher, to record where included studies were identified, whether from database (and which databases specifically) or supplementary search methods [ 81 ]. Search summaries document both yield and accuracy of searches, which could prospectively inform resource use and decisions to search or not to search specific databases in topic areas. The prospective use of such data presupposes, however, that past searches are a potential predictor of future search performance (i.e. that each topic is to be considered representative and not unique). In offering a body of practice, these data would be of greater practical use than current studies, which are considered as little more than individual case studies [ 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 ].
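A search summary of the kind described above can be sketched as a small table computed from two inputs: which included studies each source found, and how many records each source returned. The Python sketch below is an illustration under stated assumptions; the function name search_summary, the use of precision (included studies found divided by records retrieved) as the accuracy measure, and all of the example data are hypothetical and are not taken from Bethel's proposal.

```python
def search_summary(found, retrieved):
    """Summarise each source's contribution to the set of included studies.

    found:     source name -> set of included-study IDs located by that source
    retrieved: source name -> total number of records the source returned
    """
    summary = {}
    for source, studies in found.items():
        # Included studies located by any other source.
        elsewhere = set().union(*(s for name, s in found.items() if name != source))
        total = retrieved[source]
        summary[source] = {
            "included_found": len(studies),
            "unique": len(studies - elsewhere),  # found by this source only
            "precision": len(studies) / total if total else 0.0,
        }
    return summary


# Hypothetical data for one review.
found = {
    "MEDLINE": {"s1", "s2", "s3"},
    "Embase": {"s2", "s3", "s4"},
    "citation chasing": {"s5"},
}
retrieved = {"MEDLINE": 300, "Embase": 400, "citation chasing": 50}

for source, row in search_summary(found, retrieved).items():
    print(source, row)
```

The count of unique studies speaks to the question raised in the text of which sources to search and why: a source that contributes no unique included studies across many comparable reviews is a candidate for lower priority, subject to the caveat that past performance may not predict future topics.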

When to search databases is another question posed in the literature. Beyer et al. report that databases can be prioritised for literature searching which, whilst not addressing the question of which databases to search, may at least bring clarity as to which databases to search first [ 91 ]. Paradoxically, this links to studies that suggest PubMed should be searched in addition to MEDLINE (OVID interface) since this improves the currency of systematic reviews [ 92 , 93 ]. Cooper et al. (2017) have tested the idea of database searching not as a primary search method (as suggested in the guidance) but as a supplementary search method in order to manage the volume of studies identified for an environmental effectiveness systematic review. Their case study compared the effectiveness of database searching versus a protocol using supplementary search methods and found that the latter identified more relevant studies for review than searching bibliographic databases [ 94 ].

Key stage six: Determining the process of literature searching and deciding where to search (supplementary search methods)

Table 2 also summarises the process of literature searching which follows bibliographic database searching. As Table 2 sets out, guidance that supplementary literature search methods should be used in systematic reviews recurs across documents, but the order in which these methods are used, and the extent to which they are used, varies. We noted inconsistency in the labelling of supplementary search methods between guidance documents.

Rather than focus on the guidance on how to use the methods (which has been summarised in a recent review [ 95 ]), we focus on the aim or purpose of supplementary search methods.

The Cochrane Handbook reported that ‘efforts’ to identify unpublished studies should be made [ 9 ]. Four guidance documents [ 2 , 3 , 6 , 9 ] acknowledged that searching beyond bibliographic databases was necessary since ‘databases are not the only source of literature’ [ 2 ]. Only one document reported any guidance on determining when to use supplementary methods. The IQWiG handbook reported that the use of handsearching (in their example) could be determined on a ‘case-by-case basis’ which implies that the use of these methods is optional rather than mandatory. This is in contrast to the guidance (above) on bibliographic database searching.

The issue for supplementary search methods is similar in many ways to the issue of searching bibliographic databases: demonstrating value. The purpose and contribution of supplementary search methods in systematic reviews is increasingly acknowledged [ 37 , 61 , 62 , 96 , 97 , 98 , 99 , 100 , 101 ] but the value of these search methods in identifying studies and data remains unclear. In a recently published review, Cooper et al. (2017) reviewed the literature on supplementary search methods looking to determine the advantages, disadvantages and resource implications of using supplementary search methods [ 95 ]. This review also summarises the key guidance and empirical studies and seeks to address the question of when to use these search methods and when not to [ 95 ]. The guidance is limited in this regard and, as Table 2 demonstrates, offers conflicting advice on the order of searching, and the extent to which these search methods should be used in systematic reviews.

Key stage seven: Managing the references

Five of the documents provided guidance on managing references, for example downloading, de-duplicating and managing the output of literature searches [ 2 , 4 , 6 , 8 , 10 ]. This guidance typically itemised available bibliographic management tools rather than offering guidance on how to use them specifically [ 2 , 4 , 6 , 8 ]. The CEE handbook provided guidance on importing data where no direct export option is available (e.g. web-searching) [ 10 ].

The literature on using bibliographic management tools is not large relative to the number of ‘how to’ videos on platforms such as YouTube (see for example [ 102 ]). These YouTube videos confirm the overall lack of ‘how to’ guidance identified in this study and offer useful instruction on managing references. Bramer et al. set out methods for de-duplicating data and reviewing references in EndNote [ 103 , 104 ] and Gall tests the direct search function within EndNote to access databases such as PubMed, finding a number of limitations [ 105 ]. Coar et al. and Ahmed et al. consider the role of the free, open-source tool Zotero [ 106 , 107 ]. Managing references is a key administrative function in the process of review, particularly for documenting searches in line with PRISMA guidance.
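The general idea of de-duplicating records exported from several databases can be sketched in a few lines of Python. This is a simplified illustration, not the EndNote method described by Bramer et al.: the matching rule (DOI where present, otherwise normalised title plus year), the function names and the record fields are all assumptions made for this sketch.

```python
import re


def record_key(record):
    """Build a matching key: DOI if present, else normalised title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lower-case the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))


def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical exports from two databases.
records = [
    {"title": "Exercise for back pain", "year": 2015, "doi": "10.1000/ABC"},
    {"title": "Exercise for back pain.", "year": 2015, "doi": "10.1000/abc"},
    {"title": "Yoga: a trial", "year": 2012},
    {"title": "Yoga - a trial", "year": 2012},
]
print(len(deduplicate(records)))
```

A rule this simple will miss some duplicates, for example where one export carries a DOI and the other does not, so the number of duplicates removed should still be recorded and a manual check retained.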

Key stage eight: Documenting the search

The Cochrane Handbook was the only guidance document to recommend a specific reporting guideline: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 9 ]. Six documents provided guidance on reporting the process of literature searching with specific criteria to report [ 3 , 4 , 6 , 8 , 9 , 10 ]. There was consensus on reporting: the databases searched (and the host searched by), the search strategies used, and any use of limits (e.g. date, language, search filters); the CRD handbook called for these limits to be justified [ 6 ]. Three guidance documents reported that the number of studies identified should be recorded [ 3 , 6 , 10 ]. The number of duplicates identified [ 10 ], the screening decisions [ 3 ], a comprehensive list of grey literature sources searched (and full detail for other supplementary search methods) [ 8 ], and an annotation of search terms tested but not used [ 4 ] were identified as unique items in four documents.

The Cochrane Handbook was the only guidance document to note that the full search strategies for each database should be included in the Additional file 1 of the review [ 9 ].
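The consensus reporting items described above (database, host, search strategy, limits with their justification, and number of records identified) lend themselves to a structured record kept for each database searched. The Python sketch below shows one possible shape for such a record together with a check for missing items; the class name SearchRecord, the field names and the example values are hypothetical and do not come from any of the guidance documents or from PRISMA.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRecord:
    database: str
    host: str
    strategy: str
    limits: dict = field(default_factory=dict)  # limit -> justification
    records_identified: int = 0


def missing_items(record):
    """List empty reporting items and any limits lacking a justification."""
    gaps = [
        name
        for name in ("database", "host", "strategy")
        if not getattr(record, name).strip()
    ]
    gaps += [
        "justification for limit: " + limit
        for limit, reason in record.limits.items()
        if not reason.strip()
    ]
    return gaps


# Hypothetical entry with the host and a justification left blank.
entry = SearchRecord(
    database="MEDLINE",
    host="",
    strategy='(adults OR "older people") AND (falls)',
    limits={"language": "", "date": "guideline published in 2005"},
    records_identified=120,
)
print(missing_items(entry))
```

Keeping one such record per source at the time of searching makes it easier to supply the full strategies and counts when the review is written up.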

All guidance documents should ultimately deliver completed systematic reviews that fulfil the requirements of the PRISMA reporting guidelines [ 108 ]. The guidance broadly requires the reporting of data that corresponds with the requirements of the PRISMA statement although documents typically ask for diverse and additional items [ 108 ]. In 2008, Sampson et al. observed a lack of consensus on reporting search methods in systematic reviews [ 109 ] and this remains the case as of 2017, as evidenced in the guidance documents, and in spite of the publication of the PRISMA guidelines in 2009 [ 110 ]. It is unclear why the collective guidance does not more explicitly endorse adherence to the PRISMA guidance.

Reporting of literature searching is a key area in systematic reviews since it sets out clearly what was done and how the conclusions of the review can be believed [ 52 , 109 ]. Despite strong endorsement in the guidance documents, specifically supported in PRISMA guidance, and other related reporting standards too (such as ENTREQ for qualitative evidence synthesis, STROBE for reviews of observational studies), authors still highlight the prevalence of poor standards of literature search reporting [ 31 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 ]. To explore issues experienced by authors in reporting literature searches, and look at uptake of PRISMA, Rader et al. surveyed over 260 review authors to determine common problems and their work summarises the practical aspects of reporting literature searching [ 120 ]. Atkinson et al. have also analysed reporting standards for literature searching, summarising recommendations and gaps for reporting search strategies [ 121 ].

One area that is less well covered by the guidance, but nevertheless appears in this literature, is the quality appraisal or peer review of literature search strategies. The PRESS checklist is the most prominent and it aims to provide evidence-based guidelines for the peer review of electronic search strategies [ 5 , 122 , 123 ]. A corresponding guideline for documentation of supplementary search methods does not yet exist although this idea is currently being explored.

How the reporting of the literature searching process corresponds to critical appraisal tools is an area for further research. In the survey undertaken by Rader et al. (2014), 86% of survey respondents (153/178) identified a need for further guidance on what aspects of the literature search process to report [ 120 ]. The PRISMA statement offers a brief summary of what to report but little practical guidance on how to report it [ 108 ]. Critical appraisal tools for systematic reviews, such as AMSTAR 2 (Shea et al. [ 124 ]) and ROBIS (Whiting et al. [ 125 ]), can usefully be read alongside PRISMA guidance, since they offer greater detail on how the reporting of the literature search will be appraised and, therefore, they offer a proxy on what to report [ 124 , 125 ]. Further research, in the form of a study comparing PRISMA with quality appraisal checklists for systematic reviews, would seem to begin addressing the call, identified by Rader et al., for further guidance on what to report [ 120 ].

Limitations

Other handbooks exist.

A potential limitation of this literature review is the focus on guidance produced in Europe (the UK specifically) and Australia. We justify our selection of the nine guidance documents reviewed in this literature review in section “ Identifying guidance ”. In brief, these nine guidance documents were selected as the most relevant health care guidance that informs UK systematic reviewing practice, given that the UK occupies a prominent position in the science of health information retrieval. We acknowledge the existence of other guidance documents, such as those from North America (e.g. the Agency for Healthcare Research and Quality (AHRQ) [ 126 ], The Institute of Medicine [ 127 ] and the guidance and resources produced by the Canadian Agency for Drugs and Technologies in Health (CADTH) [ 128 ]). We comment further on this directly below.

The handbooks are potentially linked to one another

What is not clear is the extent to which the guidance documents inter-relate or provide guidance uniquely. The Cochrane Handbook, first published in 1994, is notably a key source of reference in guidance and systematic reviews beyond Cochrane reviews. It is not clear to what extent broadening the sample of guidance handbooks to include North American handbooks, and guidance handbooks from other relevant countries too, would alter the findings of this literature review or develop further support for the process model. Since we cannot be clear, we raise this as a potential limitation of this literature review. On our initial review of a sample of North American, and other, guidance documents (before selecting the guidance documents considered in this review), however, we do not consider that the inclusion of these further handbooks would alter significantly the findings of this literature review.

This is a literature review

A further limitation of this review is that the review of published studies is not a systematic review of the evidence for each key stage. It is possible that other relevant studies could help contribute to the exploration and development of the key stages identified in this review.

This literature review would appear to demonstrate the existence of a shared model of the literature searching process in systematic reviews. We call this model ‘the conventional approach’, since it appears to be common convention in nine different guidance documents.

The findings reported above reveal eight key stages in the process of literature searching for systematic reviews. These key stages are consistently reported in the nine guidance documents which suggests consensus on the key stages of literature searching, and therefore the process of literature searching as a whole, in systematic reviews.

In Table 2 , we demonstrate consensus regarding the application of literature search methods. All guidance documents distinguish between primary and supplementary search methods. Bibliographic database searching is consistently the first method of literature searching referenced in each guidance document. Whilst the guidance uniformly supports the use of supplementary search methods, there is little evidence for a consistent process with diverse guidance across documents. This may reflect differences in the core focus across each document, linked to differences in identifying effectiveness studies or qualitative studies, for instance.

Eight of the nine guidance documents reported on the aims of literature searching. The shared understanding was that literature searching should be thorough and comprehensive in its aim and that this process should be reported transparently so that it could be reproduced. Whilst only three documents explicitly link this understanding to minimising bias, it is clear that comprehensive literature searching is implicitly linked to ‘not missing relevant studies’ which is approximately the same point.

Defining the key stages in this review helps categorise the scholarship available, and it prioritises areas for development or further study. The supporting studies on preparing for literature searching (key stage three, ‘preparation’) were, for example, comparatively few, and yet this key stage represents a decisive moment in literature searching for systematic reviews. It is where search strategy structure is determined, search terms are chosen or discarded, and the resources to be searched are selected. Information specialists, librarians and researchers are well placed to develop these and other areas within the key stages we identify.

This review calls for further research to determine the suitability of using the conventional approach. The publication dates of the guidance documents which underpin the conventional approach may raise questions as to whether the process which they each report remains valid for current systematic literature searching. In addition, it may be useful to test whether it is desirable to use the same process model of literature searching for qualitative evidence synthesis as that for reviews of intervention effectiveness, which this literature review demonstrates is presently recommended best practice.

Abbreviations

BeHEMoTh: Behaviour of interest; Health context; Exclusions; Models or Theories

CDSR: Cochrane Database of Systematic Reviews

CENTRAL: The Cochrane Central Register of Controlled Trials

DARE: Database of Abstracts of Reviews of Effects

ENTREQ: Enhancing transparency in reporting the synthesis of qualitative research

IQWiG: Institute for Quality and Efficiency in Healthcare

NICE: National Institute for Clinical Excellence

PICO: Population, Intervention, Comparator, Outcome

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

SPICE: Setting, Perspective, Intervention, Comparison, Evaluation

SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research type

STROBE: STrengthening the Reporting of OBservational studies in Epidemiology

TSC: Trial Search Co-ordinators

References

Booth A. Unpacking your literature search toolbox: on search styles and tactics. Health Information & Libraries Journal. 2008;25(4):313–7.

Petticrew M, Roberts H. Systematic reviews in the social sciences: a practical guide. Oxford: Blackwell Publishing Ltd; 2006.

Institute for Quality and Efficiency in Health Care (IQWiG). IQWiG Methods Resources. 7 Information retrieval. 2014. Available from: https://www.ncbi.nlm.nih.gov/books/NBK385787/ .

NICE: National Institute for Health and Care Excellence. Developing NICE guidelines: the manual 2014. Available from: https://www.nice.org.uk/media/default/about/what-we-do/our-programmes/developing-nice-guidelines-the-manual.pdf .

Sampson M, McGowan J, Lefebvre C, Moher D, Grimshaw J. Peer Review of Electronic Search Strategies: PRESS; 2008.

Centre for Reviews & Dissemination. Systematic reviews – CRD’s guidance for undertaking reviews in healthcare. York: Centre for Reviews and Dissemination, University of York; 2009.

EUnetHTA: European Network for Health Technology Assessment. Process of information retrieval for systematic reviews and health technology assessments on clinical effectiveness. 2016. Available from: http://www.eunethta.eu/sites/default/files/Guideline_Information_Retrieval_V1-1.pdf .

Kugley S, Wade A, Thomas J, Mahood Q, Jørgensen AMK, Hammerstrøm K, Sathe N. Searching for studies: a guide to information retrieval for Campbell systematic reviews. Oslo: Campbell Collaboration; 2017. Available from: https://www.campbellcollaboration.org/library/searching-for-studies-information-retrieval-guide-campbell-reviews.html

Lefebvre C, Manheimer E, Glanville J. Chapter 6: searching for studies. In: Higgins JPT, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions; 2011.

Collaboration for Environmental Evidence. Guidelines for Systematic Review and Evidence Synthesis in Environmental Management. Environmental Evidence; 2013. Available from: http://www.environmentalevidence.org/wp-content/uploads/2017/01/Review-guidelines-version-4.2-final-update.pdf .

The Joanna Briggs Institute. Joanna Briggs Institute reviewers’ manual. 2014 ed. The Joanna Briggs Institute; 2014. Available from: https://joannabriggs.org/assets/docs/sumari/ReviewersManual-2014.pdf

Beverley CA, Booth A, Bath PA. The role of the information specialist in the systematic review process: a health information case study. Health Inf Libr J. 2003;20(2):65–74.

Harris MR. The librarian's roles in the systematic review process: a case study. Journal of the Medical Library Association. 2005;93(1):81–7.

Koffel JB. Use of recommended search strategies in systematic reviews and the impact of librarian involvement: a cross-sectional survey of recent authors. PLoS One. 2015;10(5):e0125931.

Li L, Tian J, Tian H, Moher D, Liang F, Jiang T, et al. Network meta-analyses could be improved by searching more sources and by involving a librarian. J Clin Epidemiol. 2014;67(9):1001–7.

McGowan J, Sampson M. Systematic reviews need systematic searchers. J Med Libr Assoc. 2005;93(1):74–80.

Rethlefsen ML, Farrell AM, Osterhaus Trzasko LC, Brigham TJ. Librarian co-authors correlated with higher quality reported search strategies in general internal medicine systematic reviews. J Clin Epidemiol. 2015;68(6):617–26.

Weller AC. Mounting evidence that librarians are essential for comprehensive literature searches for meta-analyses and Cochrane reports. J Med Libr Assoc. 2004;92(2):163–4.

Swinkels A, Briddon J, Hall J. Two physiotherapists, one librarian and a systematic literature review: collaboration in action. Health Info Libr J. 2006;23(4):248–56.

Foster M. An overview of the role of librarians in systematic reviews: from expert search to project manager. EAHIL. 2015;11(3):3–7.

Lawson L. OPERATING OUTSIDE LIBRARY WALLS 2004.

Vassar M, Yerokhin V, Sinnett PM, Weiher M, Muckelrath H, Carr B, et al. Database selection in systematic reviews: an insight through clinical neurology. Health Inf Libr J. 2017;34(2):156–64.

Townsend WA, Anderson PF, Ginier EC, MacEachern MP, Saylor KM, Shipman BL, et al. A competency framework for librarians involved in systematic reviews. Journal of the Medical Library Association : JMLA. 2017;105(3):268–75.

Cooper ID, Crum JA. New activities and changing roles of health sciences librarians: a systematic review, 1990-2012. Journal of the Medical Library Association : JMLA. 2013;101(4):268–77.

Crum JA, Cooper ID. Emerging roles for biomedical librarians: a survey of current practice, challenges, and changes. Journal of the Medical Library Association : JMLA. 2013;101(4):278–86.

Dudden RF, Protzko SL. The systematic review team: contributions of the health sciences librarian. Med Ref Serv Q. 2011;30(3):301–15.

Golder S, Loke Y, McIntosh HM. Poor reporting and inadequate searches were apparent in systematic reviews of adverse effects. J Clin Epidemiol. 2008;61(5):440–8.

Maggio LA, Tannery NH, Kanter SL. Reproducibility of literature search reporting in medical education reviews. Academic medicine : journal of the Association of American Medical Colleges. 2011;86(8):1049–54.

Meert D, Torabi N, Costella J. Impact of librarians on reporting of the literature searching component of pediatric systematic reviews. Journal of the Medical Library Association : JMLA. 2016;104(4):267–77.

Morris M, Boruff JT, Gore GC. Scoping reviews: establishing the role of the librarian. Journal of the Medical Library Association : JMLA. 2016;104(4):346–54.

Koffel JB, Rethlefsen ML. Reproducibility of search strategies is poor in systematic reviews published in high-impact pediatrics, cardiology and surgery journals: a cross-sectional study. PLoS One. 2016;11(9):e0163309.

Fehrmann P, Thomas J. Comprehensive computer searches and reporting in systematic reviews. Research Synthesis Methods. 2011;2(1):15–32.

Booth A. Searching for qualitative research for inclusion in systematic reviews: a structured methodological review. Systematic Reviews. 2016;5(1):74.

Egger M, Juni P, Bartlett C, Holenstein F, Sterne J. How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health technology assessment (Winchester, England). 2003;7(1):1–76.

Tricco AC, Tetzlaff J, Sampson M, Fergusson D, Cogo E, Horsley T, et al. Few systematic reviews exist documenting the extent of bias: a systematic review. J Clin Epidemiol. 2008;61(5):422–34.

Booth A. How much searching is enough? Comprehensive versus optimal retrieval for technology assessments. Int J Technol Assess Health Care. 2010;26(4):431–5.

Papaioannou D, Sutton A, Carroll C, Booth A, Wong R. Literature searching for social science systematic reviews: consideration of a range of search techniques. Health Inf Libr J. 2010;27(2):114–22.

Petticrew M. Time to rethink the systematic review catechism? Moving from ‘what works’ to ‘what happens’. Systematic Reviews. 2015;4(1):36.

Betrán AP, Say L, Gülmezoglu AM, Allen T, Hampson L. Effectiveness of different databases in identifying studies for systematic reviews: experience from the WHO systematic review of maternal morbidity and mortality. BMC Med Res Methodol. 2005;5

Felson DT. Bias in meta-analytic research. J Clin Epidemiol. 1992;45(8):885–92.

Franco A, Malhotra N, Simonovits G. Publication bias in the social sciences: unlocking the file drawer. Science. 2014;345(6203):1502–5.

Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B. Grey literature in systematic reviews: a cross-sectional study of the contribution of non-English reports, unpublished studies and dissertations to the results of meta-analyses in child-relevant reviews. BMC Med Res Methodol. 2017;17(1):64.

Schmucker CM, Blümle A, Schell LK, Schwarzer G, Oeller P, Cabrera L, et al. Systematic review finds that study data not published in full text articles have unclear impact on meta-analyses results in medical research. PLoS One. 2017;12(4):e0176210.

Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet (London, England). 1997;350(9074):326–9.

Moher D, Pham B, Lawson ML, Klassen TP. The inclusion of reports of randomised trials published in languages other than English in systematic reviews. Health technology assessment (Winchester, England). 2003;7(41):1–90.

Pham B, Klassen TP, Lawson ML, Moher D. Language of publication restrictions in systematic reviews gave different results depending on whether the intervention was conventional or complementary. J Clin Epidemiol. 2005;58(8):769–76.

Mills EJ, Kanters S, Thorlund K, Chaimani A, Veroniki A-A, Ioannidis JPA. The effects of excluding treatments from network meta-analyses: survey. BMJ : British Medical Journal. 2013;347

Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B. The contribution of databases to the results of systematic reviews: a cross-sectional study. BMC Med Res Methodol. 2016;16(1):127.

van Driel ML, De Sutter A, De Maeseneer J, Christiaens T. Searching for unpublished trials in Cochrane reviews may not be worth the effort. J Clin Epidemiol. 2009;62(8):838–44.e3.

Buchberger B, Krabbe L, Lux B, Mattivi JT. Evidence mapping for decision making: feasibility versus accuracy - when to abandon high sensitivity in electronic searches. German medical science : GMS e-journal. 2016;14:Doc09.

Lorenc T, Pearson M, Jamal F, Cooper C, Garside R. The role of systematic reviews of qualitative evidence in evaluating interventions: a case study. Research Synthesis Methods. 2012;3(1):1–10.

Gough D. Weight of evidence: a framework for the appraisal of the quality and relevance of evidence. Res Pap Educ. 2007;22(2):213–28.

Barroso J, Gollop CJ, Sandelowski M, Meynell J, Pearce PF, Collins LJ. The challenges of searching for and retrieving qualitative studies. West J Nurs Res. 2003;25(2):153–78.

Britten N, Garside R, Pope C, Frost J, Cooper C. Asking more of qualitative synthesis: a response to Sally Thorne. Qual Health Res. 2017;27(9):1370–6.

Booth A, Carroll C. Systematic searching for theory to inform systematic reviews: is it feasible? Is it desirable? Health Info Libr J. 2015;32(3):220–35.

Kwon Y, Powelson SE, Wong H, Ghali WA, Conly JM. An assessment of the efficacy of searching in biomedical databases beyond MEDLINE in identifying studies for a systematic review on ward closures as an infection control intervention to control outbreaks. Syst Rev. 2014;3:135.

Nussbaumer-Streit B, Klerings I, Wagner G, Titscher V, Gartlehner G. Assessing the validity of abbreviated literature searches for rapid reviews: protocol of a non-inferiority and meta-epidemiologic study. Systematic Reviews. 2016;5:197.

Wagner G, Nussbaumer-Streit B, Greimel J, Ciapponi A, Gartlehner G. Trading certainty for speed - how much uncertainty are decisionmakers and guideline developers willing to accept when using rapid reviews: an international survey. BMC Med Res Methodol. 2017;17(1):121.

Ogilvie D, Hamilton V, Egan M, Petticrew M. Systematic reviews of health effects of social interventions: 1. Finding the evidence: how far should you go? J Epidemiol Community Health. 2005;59(9):804–8.

Royle P, Milne R. Literature searching for randomized controlled trials used in Cochrane reviews: rapid versus exhaustive searches. Int J Technol Assess Health Care. 2003;19(4):591–603.

Pearson M, Moxham T, Ashton K. Effectiveness of search strategies for qualitative research about barriers and facilitators of program delivery. Eval Health Prof. 2011;34(3):297–308.

Levay P, Raynor M, Tuvey D. The Contributions of MEDLINE, Other Bibliographic Databases and Various Search Techniques to NICE Public Health Guidance. 2015. 2015;10(1):19.

Nussbaumer-Streit B, Klerings I, Wagner G, Heise TL, Dobrescu AI, Armijo-Olivo S, et al. Abbreviated literature searches were viable alternatives to comprehensive searches: a meta-epidemiological study. J Clin Epidemiol. 2018;102:1–11.

Briscoe S, Cooper C, Glanville J, Lefebvre C. The loss of the NHS EED and DARE databases and the effect on evidence synthesis and evaluation. Res Synth Methods. 2017;8(3):256–7.

Stansfield C, O'Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: a discussion of some methods and challenges. Research Synthesis Methods.

Petrova M, Sutcliffe P, Fulford KW, Dale J. Search terms and a validated brief search filter to retrieve publications on health-related values in Medline: a word frequency analysis study. Journal of the American Medical Informatics Association : JAMIA. 2012;19(3):479–88.

Stansfield C, Thomas J, Kavanagh J. 'Clustering' documents automatically to support scoping reviews of research: a case study. Res Synth Methods. 2013;4(3):230–41.

Methley AM, Campbell S, Chew-Graham C, McNally R, Cheraghi-Sohi S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv Res. 2014;14:579.

Booth A. Clear and present questions: formulating questions for evidence based practice. Library Hi Tech. 2006;24(3):355–68.

Cooke A, Smith D, Booth A. Beyond PICO: the SPIDER tool for qualitative evidence synthesis. Qual Health Res. 2012;22(10):1435–43.

Whiting P, Westwood M, Bojke L, Palmer S, Richardson G, Cooper J, et al. Clinical effectiveness and cost-effectiveness of tests for the diagnosis and investigation of urinary tract infection in children: a systematic review and economic model. Health technology assessment (Winchester, England). 2006;10(36):iii-iv, xi-xiii, 1–154.

Cooper C, Levay P, Lorenc T, Craig GM. A population search filter for hard-to-reach populations increased search efficiency for a systematic review. J Clin Epidemiol. 2014;67(5):554–9.

Hausner E, Waffenschmidt S, Kaiser T, Simon M. Routine development of objectively derived search strategies. Systematic Reviews. 2012;1(1):19.

Hausner E, Guddat C, Hermanns T, Lampert U, Waffenschmidt S. Prospective comparison of search strategies for systematic reviews: an objective approach yielded higher sensitivity than a conceptual one. J Clin Epidemiol. 2016;77:118–24.

Craven J, Levay P. Recording database searches for systematic reviews - what is the value of adding a narrative to peer-review checklists? A case study of NICE interventional procedures guidance. Evid Based Libr Inf Pract. 2011;6(4):72–87.

Wright K, Golder S, Lewis-Light K. What value is the CINAHL database when searching for systematic reviews of qualitative studies? Syst Rev. 2015;4:104.

Beckles Z, Glover S, Ashe J, Stockton S, Boynton J, Lai R, et al. Searching CINAHL did not add value to clinical questions posed in NICE guidelines. J Clin Epidemiol. 2013;66(9):1051–7.

Cooper C, Rogers M, Bethel A, Briscoe S, Lowe J. A mapping review of the literature on UK-focused health and social care databases. Health Inf Libr J. 2015;32(1):5–22.

Younger P, Boddy K. When is a search not a search? A comparison of searching the AMED complementary health database via EBSCOhost, OVID and DIALOG. Health Inf Libr J. 2009;26(2):126–35.

Lam MT, McDiarmid M. Increasing number of databases searched in systematic reviews and meta-analyses between 1994 and 2014. Journal of the Medical Library Association : JMLA. 2016;104(4):284–9.

Bethel A, editor Search summary tables for systematic reviews: results and findings. HLC Conference 2017a.

Aagaard T, Lund H, Juhl C. Optimizing literature search in systematic reviews - are MEDLINE, EMBASE and CENTRAL enough for identifying effect studies within the area of musculoskeletal disorders? BMC Med Res Methodol. 2016;16(1):161.

Adams CE, Frederick K. An investigation of the adequacy of MEDLINE searches for randomized controlled trials (RCTs) of the effects of mental health care. Psychol Med. 1994;24(3):741–8.

Kelly L, St Pierre-Hansen N. So many databases, such little clarity: searching the literature for the topic aboriginal. Canadian family physician Medecin de famille canadien. 2008;54(11):1572–3.

Lawrence DW. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion? Injury Prevention. 2008;14(6):401–4.

Lemeshow AR, Blum RE, Berlin JA, Stoto MA, Colditz GA. Searching one or two databases was insufficient for meta-analysis of observational studies. J Clin Epidemiol. 2005;58(9):867–73.

Sampson M, Barrowman NJ, Moher D, Klassen TP, Pham B, Platt R, et al. Should meta-analysts search Embase in addition to Medline? J Clin Epidemiol. 2003;56(10):943–55.

Stevinson C, Lawlor DA. Searching multiple databases for systematic reviews: added value or diminishing returns? Complementary Therapies in Medicine. 2004;12(4):228–32.

Suarez-Almazor ME, Belseck E, Homik J, Dorgan M, Ramos-Remus C. Identifying clinical trials in the medical literature with electronic databases: MEDLINE alone is not enough. Control Clin Trials. 2000;21(5):476–87.

Taylor B, Wylie E, Dempster M, Donnelly M. Systematically retrieving research: a case study evaluating seven databases. Res Soc Work Pract. 2007;17(6):697–706.

Beyer FR, Wright K. Can we prioritise which databases to search? A case study using a systematic review of frozen shoulder management. Health Info Libr J. 2013;30(1):49–58.

Duffy S, de Kock S, Misso K, Noake C, Ross J, Stirk L. Supplementary searches of PubMed to improve currency of MEDLINE and MEDLINE in-process searches via Ovid. Journal of the Medical Library Association : JMLA. 2016;104(4):309–12.

Katchamart W, Faulkner A, Feldman B, Tomlinson G, Bombardier C. PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews. J Clin Epidemiol. 2011;64(7):805–7.

Cooper C, Lovell R, Husk K, Booth A, Garside R. Supplementary search methods were more effective and offered better value than bibliographic database searching: a case study from public health and environmental enhancement (in Press). Research Synthesis Methods. 2017;

Cooper C, Booth, A., Britten, N., Garside, R. A comparison of results of empirical studies of supplementary search techniques and recommendations in review methodology handbooks: A methodological review. (In Press). BMC Systematic Reviews. 2017.

Greenhalgh T, Peacock R. Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. BMJ (Clinical research ed). 2005;331(7524):1064–5.

Hinde S, Spackman E. Bidirectional citation searching to completion: an exploration of literature searching methods. PharmacoEconomics. 2015;33(1):5–11.

Levay P, Ainsworth N, Kettle R, Morgan A. Identifying evidence for public health guidance: a comparison of citation searching with Web of Science and Google Scholar. Res Synth Methods. 2016;7(1):34–45.

McManus RJ, Wilson S, Delaney BC, Fitzmaurice DA, Hyde CJ, Tobias RS, et al. Review of the usefulness of contacting other experts when conducting a literature search for systematic reviews. BMJ (Clinical research ed). 1998;317(7172):1562–3.

Westphal A, Kriston L, Holzel LP, Harter M, von Wolff A. Efficiency and contribution of strategies for finding randomized controlled trials: a case study from a systematic review on therapeutic interventions of chronic depression. Journal of public health research. 2014;3(2):177.

Matthews EJ, Edwards AG, Barker J, Bloor M, Covey J, Hood K, et al. Efficient literature searching in diffuse topics: lessons from a systematic review of research on communicating risk to patients in primary care. Health Libr Rev. 1999;16(2):112–20.

Bethel A. EndNote Training (YouTube Videos). 2017b. Available from: http://medicine.exeter.ac.uk/esmi/workstreams/informationscience/is_resources,_guidance_&_advice/ .

Bramer WM, Giustini D, de Jonge GB, Holland L, Bekhuis T. De-duplication of database search results for systematic reviews in EndNote. Journal of the Medical Library Association : JMLA. 2016;104(3):240–3.

Bramer WM, Milic J, Mast F. Reviewing retrieved references for inclusion in systematic reviews using EndNote. Journal of the Medical Library Association : JMLA. 2017;105(1):84–7.

Gall C, Brahmi FA. Retrieval comparison of EndNote to search MEDLINE (Ovid and PubMed) versus searching them directly. Medical reference services quarterly. 2004;23(3):25–32.

Ahmed KK, Al Dhubaib BE. Zotero: a bibliographic assistant to researcher. J Pharmacol Pharmacother. 2011;2(4):303–5.

Coar JT, Sewell JP. Zotero: harnessing the power of a personal bibliographic manager. Nurse Educ. 2010;35(5):205–7.

Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

Sampson M, McGowan J, Tetzlaff J, Cogo E, Moher D. No consensus exists on search reporting methods for systematic reviews. J Clin Epidemiol. 2008;61(8):748–54.

Toews LC. Compliance of systematic reviews in veterinary journals with preferred reporting items for systematic reviews and meta-analysis (PRISMA) literature search reporting guidelines. Journal of the Medical Library Association : JMLA. 2017;105(3):233–9.

Booth A. "Brimful of STARLITE": toward standards for reporting literature searches. Journal of the Medical Library Association : JMLA. 2006;94(4):421–9, e205.

Faggion CM Jr, Wu YC, Tu YK, Wasiak J. Quality of search strategies reported in systematic reviews published in stereotactic radiosurgery. Br J Radiol. 2016;89(1062):20150878.

Mullins MM, DeLuca JB, Crepaz N, Lyles CM. Reporting quality of search methods in systematic reviews of HIV behavioral interventions (2000–2010): are the searches clearly explained, systematic and reproducible? Research Synthesis Methods. 2014;5(2):116–30.

Yoshii A, Plaut DA, McGraw KA, Anderson MJ, Wellik KE. Analysis of the reporting of search strategies in Cochrane systematic reviews. Journal of the Medical Library Association : JMLA. 2009;97(1):21–9.

Bigna JJ, Um LN, Nansseu JR. A comparison of quality of abstracts of systematic reviews including meta-analysis of randomized controlled trials in high-impact general medicine journals before and after the publication of PRISMA extension for abstracts: a systematic review and meta-analysis. Syst Rev. 2016;5(1):174.

Akhigbe T, Zolnourian A, Bulters D. Compliance of systematic reviews articles in brain arteriovenous malformation with PRISMA statement guidelines: review of literature. Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia. 2017;39:45–8.

Tao KM, Li XQ, Zhou QH, Moher D, Ling CQ, Yu WF. From QUOROM to PRISMA: a survey of high-impact medical journals' instructions to authors and a review of systematic reviews in anesthesia literature. PLoS One. 2011;6(11):e27611.

Wasiak J, Tyack Z, Ware R, Goodwin N, Faggion CM Jr. Poor methodological quality and reporting standards of systematic reviews in burn care management. International Wound Journal. 2016.

Tam WW, Lo KK, Khalechelvam P. Endorsement of PRISMA statement and quality of systematic reviews and meta-analyses published in nursing journals: a cross-sectional study. BMJ Open. 2017;7(2):e013905.

Rader T, Mann M, Stansfield C, Cooper C, Sampson M. Methods for documenting systematic review searches: a discussion of common issues. Res Synth Methods. 2014;5(2):98–115.

Atkinson KM, Koenka AC, Sanchez CE, Moshontz H, Cooper H. Reporting standards for literature searches and report inclusion criteria: making research syntheses more transparent and easy to replicate. Res Synth Methods. 2015;6(1):87–95.

McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40–6.

Sampson M, McGowan J, Cogo E, Grimshaw J, Moher D, Lefebvre C. An evidence-based practice guideline for the peer review of electronic search strategies. J Clin Epidemiol. 2009;62(9):944–52.

Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ (Clinical research ed). 2017;358.

Whiting P, Savović J, Higgins JPT, Caldwell DM, Reeves BC, Shea B, et al. ROBIS: a new tool to assess risk of bias in systematic reviews was developed. J Clin Epidemiol. 2016;69:225–34.

Relevo R, Balshem H. Finding evidence for comparing medical interventions: AHRQ and the effective health care program. J Clin Epidemiol. 2011;64(11):1168–77.

Institute of Medicine. Standards for Systematic Reviews. 2011. Available from: http://www.nationalacademies.org/hmd/Reports/2011/Finding-What-Works-in-Health-Care-Standards-for-Systematic-Reviews/Standards.aspx

CADTH: Resources 2018.

Acknowledgements

CC acknowledges the supervision offered by Professor Chris Hyde.

This publication forms a part of CC’s PhD. CC’s PhD was funded through the National Institute for Health Research (NIHR) Health Technology Assessment (HTA) Programme (Project Number 16/54/11). The open access fee for this publication was paid for by Exeter Medical School.

RG and NB were partially supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care South West Peninsula.

The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information

Authors and affiliations.

Institute of Health Research, University of Exeter Medical School, Exeter, UK

Chris Cooper & Jo Varley-Campbell

HEDS, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK

Andrew Booth

Nicky Britten

European Centre for Environment and Human Health, University of Exeter Medical School, Truro, UK

Ruth Garside

Contributions

CC conceived the idea for this study and wrote the first draft of the manuscript. CC discussed this publication in PhD supervision with AB and separately with JVC. CC revised the publication with input and comments from AB, JVC, RG and NB. All authors revised the manuscript prior to submission. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chris Cooper .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:.

Appendix tables and PubMed search strategy. Key studies used for pearl growing per key stage, working data extraction tables and the PubMed search strategy. (DOCX 30 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Cooper, C., Booth, A., Varley-Campbell, J. et al. Defining the process to literature searching in systematic reviews: a literature review of guidance and supporting studies. BMC Med Res Methodol 18 , 85 (2018). https://doi.org/10.1186/s12874-018-0545-3

Received : 20 September 2017

Accepted : 06 August 2018

Published : 14 August 2018

DOI : https://doi.org/10.1186/s12874-018-0545-3

  • Literature Search Process
  • Citation Chasing
  • Tacit Models
  • Unique Guidance
  • Information Specialists

BMC Medical Research Methodology

ISSN: 1471-2288

systematic literature review databases

Systematic Reviews

Literature Searching

When searching for literature relevant to your research question, you'll need to determine what terms to search with and where to search.

Comprehensive Searching

One of the most important parts of searching is documenting the search. Systematic review authors are required to report their search strategies.

Documentation is also necessary for refining previous searches. Use a Word document (or similar) to keep track of search terms.
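The same record-keeping can be done in a structured file instead of a Word document. The sketch below is one possible layout, not a reporting standard: the field names (`database`, `platform`, `date_searched`, `query`, `hits`) and the example entry are illustrative assumptions.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SearchLogEntry:
    """One line of a search log; all field names are illustrative."""
    database: str       # e.g. "MEDLINE"
    platform: str       # e.g. "Ovid"
    date_searched: str  # ISO date, e.g. "2024-02-09"
    query: str          # the exact search string as run
    hits: int           # number of records retrieved


def write_log(entries, handle):
    """Write log entries as CSV to any text file handle."""
    names = [f.name for f in fields(SearchLogEntry)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Hypothetical example entry; the query and hit count are made up.
example = SearchLogEntry("MEDLINE", "Ovid", "2024-02-09",
                         "(randomised or randomized).ti,ab.", 1234)
buffer = io.StringIO()
write_log([example], buffer)
log_text = buffer.getvalue()
```

Keeping one row per database and per search run makes it easier to rerun or refine a search later and to report the strategy in full.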

When searching, consider:

  • Types of studies
  • Subject headings (e.g. MeSH in PubMed, CINAHL Subject Headings, and the APA Thesaurus)
  • Spelling variants (e.g. color vs. colour, randomised vs. randomized)
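Spelling variants and synonyms are usually combined with OR inside one concept, and the concepts are then combined with AND. The sketch below shows that pattern in a minimal form; the helper names `or_block` and `and_blocks` are hypothetical, and the resulting string still has to be adapted to each database's own syntax.

```python
def or_block(terms):
    """Join synonyms and spelling variants into one parenthesised OR group.

    Multi-word phrases are wrapped in double quotes; duplicates are dropped
    (ignoring case) while the original order is kept.
    """
    seen = []
    for term in terms:
        term = term.strip()
        if term and term.lower() not in (s.lower() for s in seen):
            seen.append(term)
    quoted = [f'"{t}"' if " " in t else t for t in seen]
    return "(" + " OR ".join(quoted) + ")"


def and_blocks(*blocks):
    """Combine concept blocks with AND."""
    return " AND ".join(blocks)


design = or_block(["randomised", "randomized", "Randomized"])
topic = or_block(["colour", "color", "colour vision"])
query = and_blocks(design, topic)
```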

Additional tips:

  • Refer to relevant resources for finding new keywords or search terms
  • Snowball citations - follow cited works in studies yielded by a search to find additional references
  • Include grey literature and nontraditional searches
  • Hand search for articles (see Cochrane for information on hand searching)
  • Review international studies, and note planning strategies relating to translation
  • Search Like a Pro - tips from Covidence
  • Consult a librarian for assistance

Selecting Databases

Knowing where to look for studies is key to a successful review. Below is a link to Duquesne's database list. Which databases to search for a systematic review depends heavily on the field of the research question.

It's important to revise the search strategy and terms for every database used! Keywords and subject headings in one database may not be the same in another. If you have further questions, feel free to meet with a librarian to discuss where to search.

Translate Your Search Terms with Polyglot

When conducting a literature search, it's important to change search terms based on the classification systems and subject headings within individual databases. For example, CINAHL subject headings may use different terminology when compared to PubMed's MeSH.

Bond University offers Polyglot, a tool for converting search terms from one database into searchable materials for other databases. Part of the University's systematic review accelerator, Polyglot offers search translations for many databases, including CINAHL, PubMed, Embase, APA PsycInfo, Scopus, and more!

  • Systematic Review Accelerator - Polyglot
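To give a feel for what translating a search involves, here is a toy sketch that rewrites two PubMed field tags in Ovid syntax. It is not how Polyglot works internally; the function name `pubmed_to_ovid` is hypothetical, only quoted terms tagged `[tiab]` or `[Mesh]` are handled, and any translated search should still be checked by hand.

```python
import re

# A quoted term followed by a PubMed field tag, e.g. "heart attack"[tiab]
# or "Myocardial Infarction"[Mesh]. Only these two tags are handled.
_TAGGED = re.compile(r'"([^"]+)"\[(tiab|mesh)\]', re.IGNORECASE)


def pubmed_to_ovid(query):
    """Rewrite quoted PubMed terms tagged [tiab] or [Mesh] in Ovid syntax."""
    def convert(match):
        term, tag = match.group(1), match.group(2).lower()
        if tag == "mesh":
            # PubMed explodes MeSH headings by default; "exp" does so in Ovid.
            return f"exp {term}/"
        return f'"{term}".ti,ab.'
    return _TAGGED.sub(convert, query)


pubmed = '"Myocardial Infarction"[Mesh] OR "heart attack"[tiab]'
ovid = pubmed_to_ovid(pubmed)
```

Subject headings themselves can also differ between vocabularies (MeSH, Emtree, CINAHL), which a syntax rewrite like this does not address.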

Grey Literature

Searching grey (or gray) literature, that is, literature that has not been formally published, is necessary for preventing selection bias.

Learn more about what grey literature entails and where to search for it on Gumberg's Grey Literature library guide.

More Places to Search

  • Clinical trials

Search the National Institutes of Health's clinical trials registry (ClinicalTrials.gov) and York Health Economics Consortium's Finding Clinical Trials.

  • Examine reference lists

When relevant studies are found, look at the works cited to find more studies to include in the review. This is called snowballing.

  • Citation searching

Pace University has a great  guide to help with citation searching .

  • Perform an  Internet search

Using a variety of search engines, search for the topic. This may lead to individuals or organizations that have done studies pertaining to the research question. Contact them to ask whether they have any unpublished studies to include.

  • Last Updated: Feb 9, 2024 4:57 PM
  • URL: https://guides.library.duq.edu/systematicreviews

Systematic Reviews in the Engineering Literature: A Scoping Review

Efficacy of psilocybin for treating symptoms of depression: systematic review and meta-analysis

Linked editorial.

Psilocybin for depression

This article has a correction. Please see:

  • EXPRESSION OF CONCERN: Efficacy of psilocybin for treating symptoms of depression: systematic review and meta-analysis - May 04, 2024
  • Athina-Marina Metaxa , masters graduate researcher 1 ,
  • Mike Clarke , professor 2
  • 1 Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford OX2 6GG, UK
  • 2 Northern Ireland Methodology Hub, Centre for Public Health, ICS-A Royal Hospitals, Belfast, Ireland, UK
  • Correspondence to: A-M Metaxa athina.metaxa{at}hmc.ox.ac.uk (or @Athina_Metaxa12 on X)
  • Accepted 6 March 2024

Objective To determine the efficacy of psilocybin as an antidepressant compared with placebo or non-psychoactive drugs.

Design Systematic review and meta-analysis.

Data sources Five electronic databases of published literature (Cochrane Central Register of Controlled Trials, Medline, Embase, Science Citation Index and Conference Proceedings Citation Index, and PsycInfo) and four databases of unpublished and international literature (ClinicalTrials.gov, WHO International Clinical Trials Registry Platform, ProQuest Dissertations and Theses Global, and PsycEXTRA), and handsearching of reference lists, conference proceedings, and abstracts.

Data synthesis and study quality Information on potential treatment effect moderators was extracted, including depression type (primary or secondary), previous use of psychedelics, psilocybin dosage, type of outcome measure (clinician rated or self-reported), and personal characteristics (eg, age, sex). Data were synthesised using a random effects meta-analysis model, and observed heterogeneity and the effect of covariates were investigated with subgroup analyses and metaregression. Hedges’ g was used as a measure of treatment effect size, to account for small sample effects and substantial differences between the included studies’ sample sizes. Study quality was appraised using Cochrane’s Risk of Bias 2 tool, and the quality of the aggregated evidence was evaluated using GRADE guidelines.

Eligibility criteria Randomised trials in which psilocybin was administered as a standalone treatment for adults with clinically significant symptoms of depression and change in symptoms was measured using a validated clinician rated or self-report scale. Studies with directive psychotherapy were included if the psychotherapeutic component was present in both experimental and control conditions. Participants with depression regardless of comorbidities (eg, cancer) were eligible.

Results Meta-analysis on 436 participants (228 female participants), average age 36-60 years, from seven of the nine included studies showed a significant benefit of psilocybin (Hedges’ g=1.64, 95% confidence interval (CI) 0.55 to 2.73, P<0.001) on change in depression scores compared with comparator treatment. Subgroup analyses and metaregressions indicated that having secondary depression (Hedges’ g=3.25, 95% CI 0.97 to 5.53), being assessed with self-report depression scales such as the Beck depression inventory (3.25, 0.97 to 5.53), and older age and previous use of psychedelics (metaregression coefficient 0.16, 95% CI 0.08 to 0.24 and 4.2, 1.5 to 6.9, respectively) were correlated with greater improvements in symptoms. All studies had a low risk of bias, but the change from baseline metric was associated with high heterogeneity and a statistically significant risk of small study bias, resulting in a low certainty of evidence rating.

Conclusion Treatment effects of psilocybin were significantly larger among patients with secondary depression, when self-report scales were used to measure symptoms of depression, and when participants had previously used psychedelics. Further research is thus required to delineate the influence of expectancy effects, moderating factors, and treatment delivery on the efficacy of psilocybin as an antidepressant.

Systematic review registration PROSPERO CRD42023388065.

Introduction

Depression affects an estimated 300 million people around the world, an increase of nearly 20% over the past decade. 1 Worldwide, depression is also the leading cause of disability. 2

Drugs for depression are widely available but these seem to have limited efficacy, can have serious adverse effects, and are associated with low patient adherence. 3 4 Importantly, the treatment effects of antidepressant drugs do not appear until 4-7 weeks after the start of treatment, and remission of symptoms can take months. 4 5 Additionally, the likelihood of relapse is high, with 40-60% of people with depression experiencing a further depressive episode, and the chance of relapse increasing with each subsequent episode. 6 7

Since the early 2000s, the naturally occurring serotonergic hallucinogen psilocybin, found in several species of mushrooms, has been widely discussed as a potential treatment for depression. 8 9 Psilocybin’s mechanism of action differs from that of classic selective serotonin reuptake inhibitors (SSRIs) and might improve the treatment response rate, decrease time to improvement of symptoms, and prevent relapse post-remission. Moreover, more recent assessments of harm have consistently reported that psilocybin generally has low addictive potential and toxicity and that it can be administered safely under clinical supervision. 10

The renewed interest in psilocybin’s antidepressive effects led to several clinical trials on treatment resistant depression, 11 12 major depressive disorder, 13 and depression related to physical illness. 14 15 16 17 These trials mostly reported positive efficacy findings, showing reductions in symptoms of depression within a few hours to a few days after one dose or two doses of psilocybin. 11 12 13 16 17 18 These studies reported only minimal adverse effects, however, and drug harm assessments in healthy volunteers indicated that psilocybin does not induce physiological toxicity, is not addictive, and does not lead to withdrawal. 19 20 Nevertheless, these findings should be interpreted with caution owing to the small sample sizes and open label design of some of these studies. 11 21

Several systematic reviews and meta-analyses since the early 2000s have investigated the use of psilocybin to treat symptoms of depression. Most found encouraging results, but as well as people with depression some included healthy volunteers, 22 and most combined data from studies of multiple serotonergic psychedelics, 23 24 25 even though each compound has unique neurobiological effects and mechanisms of action. 26 27 28 Furthermore, many systematic reviews included non-randomised studies and studies in which psilocybin was tested in conjunction with psychotherapeutic interventions, 25 29 30 31 32 which made it difficult to distinguish psilocybin’s treatment effects. Most systematic reviews and meta-analyses did not consider the impact of factors that could act as moderators to psilocybin’s effects, such as type of depression (primary or secondary), previous use of psychedelics, psilocybin dosage, type of outcome measure (clinician rated or self-reported), and personal characteristics (eg, age, sex). 25 26 29 30 31 32 Lastly, systematic reviews did not consider grey literature, 33 34 which might have led to a substantial overestimation of psilocybin’s efficacy as a treatment for depression. In this review we focused on randomised trials that contained an unconfounded evaluation of psilocybin in adults with symptoms of depression, regardless of country and language of publication.

In this systematic review and meta-analysis of indexed and non-indexed randomised trials we investigated the efficacy of psilocybin to treat symptoms of depression compared with placebo or non-psychoactive drugs. The protocol was registered in the International Prospective Register of Systematic Reviews (see supplementary Appendix A). The study overall did not deviate from the pre-registered protocol; one clarification was made to highlight that any non-psychedelic comparator was eligible for inclusion, including placebo, niacin, micro doses of psychedelics, and drugs that are considered the standard of care in depression (eg, SSRIs).

Inclusion and exclusion criteria

Double blind and open label randomised trials with a crossover or parallel design were eligible for inclusion. We considered only studies in humans and with a control condition, which could include any type of non-active comparator, such as placebo, niacin, or micro doses of psychedelics.

Eligible studies were those that included adults (≥18 years) with clinically significant symptoms of depression, evaluated using a clinically validated tool for depression and mood disorder outcomes. Such tools included the Beck depression inventory, Hamilton depression rating scale, Montgomery-Åsberg depression rating scale, profile of mood states, and quick inventory of depressive symptomatology. Studies of participants with symptoms of depression and comorbidities (eg, cancer) were also eligible. We excluded studies of healthy participants (without depressive symptomatology).

Eligible studies investigated the effect of psilocybin as a standalone treatment on symptoms of depression. Studies with an active psilocybin condition that involved micro dosing (ie, psilocybin <100 μg/kg, according to the commonly accepted convention 22 35 ) were excluded. We included studies with directive psychotherapy if the psychotherapeutic component was present in both the experimental and the control conditions, so that the effects of psilocybin could be distinguished from those of psychotherapy. Studies involving group therapy were also excluded. Any non-psychedelic comparator was eligible for inclusion, including placebo, niacin, and micro doses of psychedelics.

Changes in symptoms, measured by validated clinician rated or self-report scales, such as the Beck depression inventory, Hamilton depression rating scale, Montgomery-Åsberg depression rating scale, profile of mood states, and quick inventory of depressive symptomatology were considered. We excluded outcomes that were measured less than three hours after psilocybin had been administered because any reported changes could be attributed to the transient cognitive and affective effects of the substance being administered. Aside from this, outcomes were included irrespective of the time point at which measurements were taken.

Search strategy

We searched major electronic databases and trial registries of psychological and medical research, with no limits on the publication date. Databases were the Cochrane Central Register of Controlled Trials via the Cochrane Library, Embase via Ovid, Medline via Ovid, Science Citation Index and Conference Proceedings Citation Index-Science via Web of Science, and PsycInfo via Ovid. A search through multiple databases was necessary because each database includes unique journals. Supplementary Appendix B shows the search syntax used for the Cochrane Central Register of Controlled Trials, which was slightly modified to comply with the syntactic rules of the other databases.

Unpublished and grey literature were sought through registries of past and ongoing trials, databases of conference proceedings, government reports, theses, dissertations, and grant registries (eg, ClinicalTrials.gov, WHO International Clinical Trials Registry Platform, ProQuest Dissertations and Theses Global, and PsycEXTRA). The references and bibliographies of eligible studies were checked for relevant publications. The original search was done in January 2023 and an updated search was performed on 10 August 2023.

Data collection, extraction, and management

The results of the literature search were imported to the Endnote X9 reference management software, and the references were imported to the Covidence platform after removal of duplicates. Two reviewers (AM and DT) independently screened the title and abstract of each reference and then screened the full text of potentially eligible references. Any disagreements about eligibility were resolved through discussion. If information was insufficient to determine eligibility, the study’s authors were contacted. The reviewers were not blinded to the studies’ authors, institutions, or journal of publication.
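Duplicate removal across databases is normally handled by the reference manager. As a rough illustration of the idea only, and not of how EndNote or Covidence do it, the sketch below drops records that share a DOI or a normalised title. The record layout (dicts with `doi` and `title` keys) is an assumption made for the example.

```python
import re


def _title_key(title):
    """Lowercase a title and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", title.lower())


def deduplicate(records):
    """Keep the first record seen for each DOI or normalised title.

    `records` is a list of dicts with optional "doi" and "title" keys.
    Records with neither key are always kept.
    """
    seen, unique = set(), []
    for record in records:
        keys = set()
        doi = (record.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        title = _title_key(record.get("title") or "")
        if title:
            keys.add(("title", title))
        if keys & seen:
            continue  # matches an earlier record on DOI or title
        seen |= keys
        unique.append(record)
    return unique
```

Exact matching on a normalised title is deliberately conservative; it will miss duplicates whose titles differ in wording, so a manual check is still needed.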

The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram shows the study selection process and reasons for excluding studies that were considered eligible for full text screening. 36

Critical appraisal of individual studies and of aggregated evidence

The methodological quality of eligible studies was assessed using the Cochrane Risk of Bias 2 tool (RoB 2) for assessing risk of bias in randomised trials. 37 In addition to the criteria specified by RoB 2, we considered the potential impact of industry funding and conflicts of interest. The overall methodological quality of the aggregated evidence was evaluated using GRADE (Grading of Recommendations, Assessment, Development and Evaluation). 38

If we found evidence of heterogeneity among the trials, then small study biases, such as publication bias, were assessed using a funnel plot and asymmetry tests (eg, Egger’s test). 39
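For readers unfamiliar with Egger's test, it can be sketched as an ordinary least squares regression of each study's standardised effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The function below is a minimal illustration with a hypothetical name, not the authors' Stata code, and it stops at the t statistic because the P value needs a t distribution with k - 2 degrees of freedom from a statistics library.

```python
import math


def egger_intercept(effects, standard_errors):
    """Egger's regression of standardised effect on precision (OLS).

    Returns (intercept, standard error of the intercept, t statistic).
    The t statistic has k - 2 degrees of freedom.
    """
    y = [e / s for e, s in zip(effects, standard_errors)]
    x = [1.0 / s for s in standard_errors]
    k = len(y)
    if k < 3:
        raise ValueError("need at least three studies")
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2
                      for xi, yi in zip(x, y))
    sigma2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / k + mean_x ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept
```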

We used a template for data extraction (see supplementary Appendix C) and summarised the extracted data in tabular form, outlining personal characteristics (age, sex, previous use of psychedelics), methodology (study design, dosage), and outcome related characteristics (mean change from baseline score on a depression questionnaire, response rates, and remission rates) of the included studies. Response conventionally refers to a 50% decrease in symptom severity based on scores on a depression rating scale, whereas remission scores are specific to a questionnaire (eg, score of ≤5 on the quick inventory of depressive symptomatology, score of ≤10 on the Montgomery-Åsberg depression rating scale, 50% or greater reduction in symptoms, score of ≤7 on the Hamilton depression rating scale, or score of ≤12 on the Beck depression inventory). Across depression scales, higher scores signify more severe symptoms of depression.

Continuous data synthesis

From each study we extracted the baseline and post-intervention means and standard deviations (SDs) of the scores between comparison groups for the depression questionnaires and calculated the mean differences and SDs of change. If means and SDs were not available for the included studies, we extracted the values from available graphs and charts using the Web Plot Digitizer application ( https://automeris.io/WebPlotDigitizer/ ). If it was not possible to calculate SDs from the graphs or charts, we generated values by converting standard errors (SEs) or confidence intervals (CIs), depending on availability, using formulas in the Cochrane Handbook (section 7.7.3.2). 40
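The two conversions mentioned above can be written out as follows. This is a sketch of the standard relationships (SD = SE × √n, and SE recovered from the width of a confidence interval), with hypothetical function names; the default critical value of 1.96 assumes a 95% interval and a reasonably large group, and a t distribution value should be passed for small groups.

```python
import math


def sd_from_se(se, n):
    """SD of a group from the standard error of its mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, critical_value=1.96):
    """SD from a confidence interval around a single group mean.

    The default critical value 1.96 is the large-sample value for a 95% CI;
    for small groups a t distribution value should be passed instead.
    """
    se = (upper - lower) / (2.0 * critical_value)
    return sd_from_se(se, n)
```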

Standardised mean differences were calculated for each study. We chose these rather than weighted mean differences because, although all the studies measured depression as the primary outcome, they did so with different questionnaires that score depression based on slightly different items. 41 If we had used weighted mean differences, any variability among studies would be assumed to reflect actual methodological or population differences and not differences in how the outcome was measured, which could be misleading. 40

The Hedges’ g effect size estimate was used because it tends to produce less biased results for studies with smaller samples (<20 participants) and when sample sizes differ substantially between studies, in contrast with Cohen’s d. 42 According to the Cochrane Handbook, the Hedges’ g effect size measure is synonymous with the standardised mean difference, 40 and the terms may be used interchangeably. Thus, a Hedges’ g of 0.2, 0.5, 0.8, or 1.2 corresponds to a small, medium, large, or very large effect, respectively. 40
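As an illustration of how Hedges' g relates to Cohen's d, the sketch below computes d from two group means with a pooled standard deviation and then applies the usual small-sample correction factor J = 1 - 3/(4df - 1). The function name and argument order are assumptions made for the example; the inputs could equally be mean changes from baseline and their SDs.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Return (g, variance of g) for two independent groups."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd            # Cohen's d
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample factor J
    g = correction * d
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return g, correction ** 2 * var_d
```

Because J is always below 1, g is slightly smaller in magnitude than d, and the difference shrinks as the groups grow.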

Owing to variation in the participants’ personal characteristics, psilocybin dosage, type of depression investigated (primary or secondary), and type of comparators, we used a random effects model with a Hartung-Knapp-Sidik-Jonkman modification. 43 This model also allowed for heterogeneity and within study variability to be incorporated into the weighting of the results of the included studies. 44 Lastly, this model could help to generalise the findings beyond the studies and patient populations included, making the meta-analysis more clinically useful. 45 We chose the Hartung-Knapp-Sidik-Jonkman adjustment in favour of more widely used random effects models (eg, DerSimonian and Laird) because it allows for better control of type 1 errors, especially for studies with smaller samples, and provides a better estimation of between study variance by accounting for small sample sizes. 46 47
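A minimal sketch of a random effects pooling step with the Hartung-Knapp-Sidik-Jonkman standard error is given below. It is not the authors' Stata procedure: the function name is hypothetical, and the DerSimonian and Laird estimator is used for the between study variance purely as a simplifying assumption, since the estimator paired with the adjustment is not stated here.

```python
import math


def random_effects_hksj(effects, variances):
    """Pool effects with DerSimonian-Laird tau^2 and an HKSJ standard error.

    Returns (pooled effect, tau^2, HKSJ standard error). A confidence
    interval is pooled +/- t * SE, with t taken from a t distribution
    with k - 1 degrees of freedom.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random effects weights add tau^2 to each within study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    weighted_ss = sum(wi * (yi - pooled) ** 2
                      for wi, yi in zip(w_star, effects))
    hksj_var = weighted_ss / ((k - 1) * sum(w_star))
    return pooled, tau2, math.sqrt(hksj_var)
```

The conventional DerSimonian and Laird standard error would instead be the square root of 1 divided by the sum of the random effects weights, used with a normal rather than a t critical value.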

For studies in which multiple treatment groups were compared with a single placebo group, we split the placebo group to avoid multiplicity. 48 Similarly, if studies included multiple primary outcomes (eg, change in depression at three weeks and at six weeks), we split the treatment groups to account for overlapping participants. 40

Prediction intervals (PIs) were calculated and reported to show the expected effect range of a similar future study, in a different setting. In a random effects model, within study measures of variability, such as CIs, can only show the range in which the average effect size could lie, but they are not informative about the range of potential treatment effects given the heterogeneity between studies. 49 Thus, we used PIs as an indication of variation between studies.
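For orientation, a commonly used form of the prediction interval is shown below. The exact formula the authors applied is not stated here, so this is an assumption: \hat{\mu} is the pooled effect, \hat{\tau}^{2} the estimated between study variance, k the number of studies, and t the quantile of a t distribution with k - 2 degrees of freedom.

```latex
\hat{\mu} \;\pm\; t_{k-2,\,1-\alpha/2}\,
\sqrt{\hat{\tau}^{2} + \operatorname{SE}(\hat{\mu})^{2}}
```

The interval is wider than the confidence interval whenever \hat{\tau}^{2} > 0, because it adds the between study variance to the uncertainty of the pooled estimate.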

Heterogeneity and sensitivity analysis

Statistical heterogeneity was tested using the χ 2 test (significance level P<0.1) and I 2 statistic, and heterogeneity among included studies was evaluated visually and displayed graphically using a forest plot. If substantial or considerable heterogeneity was found (I 2 ≥50% or P<0.1), 50 we considered the study design and characteristics of the included studies. Sources of heterogeneity were explored by subgroup analysis, and the potential effects on the results are discussed.
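The two heterogeneity statistics can be sketched as follows: Cochran's Q is the weighted sum of squared deviations from the inverse variance pooled effect, and I² is the share of Q in excess of its degrees of freedom. The function name is hypothetical, and the P value for Q (from a χ² distribution with k - 1 degrees of freedom) is left to a statistics library.

```python
def cochran_q_i2(effects, variances):
    """Return (Q, degrees of freedom, I^2 in percent)."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2
```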

Planned sensitivity analyses to assess the effect of unpublished studies and studies at high risk of bias were not done because all included studies had been published and none were assessed as high risk of bias. Exclusion sensitivity plots were used to display graphically the impact of individual studies and to determine which studies had a particularly large influence on the results of the meta-analysis. All sensitivity analyses were carried out with Stata 16 software.
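The idea behind an exclusion sensitivity plot is a leave-one-out analysis: the pooled estimate is recomputed with each study omitted in turn. The sketch below illustrates this with a simple inverse variance pool as the default; both function names are hypothetical, and any other pooling function taking effects and variances could be passed in.

```python
def inverse_variance_pool(effects, variances):
    """Inverse variance weighted mean of the study effects."""
    w = [1.0 / v for v in variances]
    return sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)


def leave_one_out(effects, variances, pool=inverse_variance_pool):
    """Re-pool the data k times, omitting one study each time.

    `effects` and `variances` are lists of equal length.
    Returns a list of (index of omitted study, pooled effect without it).
    """
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results.append((i, pool(rest_effects, rest_variances)))
    return results
```

A study whose omission shifts the pooled effect much more than the others is a candidate for closer inspection.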

Subgroup analysis

To reduce the risk of errors caused by multiplicity and to avoid data fishing, we planned subgroup analyses a priori and limited them to: (1) patient characteristics, including age and sex; (2) comorbidities, such as a serious physical condition (previous research indicates that the effects of psilocybin may be less strong for such participants, compared with participants with no comorbidities) 33 ; (3) number of doses and amount of psilocybin administered, because some previous meta-analyses found that a higher number of doses and a higher dose of psilocybin both predicted a greater reduction in symptoms of depression, 34 whereas others reported the opposite 33 ; (4) psilocybin administered alongside psychotherapeutic guidance or as a standalone treatment; (5) severity of depressive symptoms (clinical v subclinical symptomatology); (6) clinician versus patient rated scales; and (7) high versus low quality studies, as determined by RoB 2 assessment scores.

Metaregression

Given that enough studies were identified (≥10 distinct observations according to the Cochrane Handbook’s suggestion 40 ), we performed metaregression to investigate whether covariates, or potential effect modifiers, explained any of the statistical heterogeneity. The metaregression analysis was carried out using Stata 16 software.

Random effects metaregression analyses were used to determine whether continuous variables such as participants’ age, percentage of female participants, and percentage of participants who had previously used psychedelics modified the effect estimate, all of which have been implicated in differentially affecting the efficacy of psychedelics in modifying mood. 51 We chose this approach in favour of converting these continuous variables into categorical variables and conducting subgroup analyses for two primary reasons; firstly, the loss of any data and subsequent loss of statistical power would increase the risk of spurious significant associations, 51 and, secondly, no cut-offs have been agreed for these factors in literature on psychedelic interventions for mood disorders, 52 making any such divisions arbitrary and difficult to reconcile with the findings of other studies. The analyses were based on within study averages, in the absence of individual data points for each participant, with the potential for the results to be affected by aggregate bias, compromising their validity and generalisability. 53 Furthermore, a group level analysis may not be able to detect distinct interactions between the effect modifiers and participant subgroups, resulting in ecological bias. 54 As a result, this analysis should be considered exploratory.
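In its simplest form, a random effects metaregression on one study-level covariate is a weighted least squares fit in which each study is weighted by the inverse of its within study variance plus the between study variance. The sketch below shows only the point estimates; the function name is hypothetical, `tau2` is assumed to be supplied from a separate estimation step, and standard errors and tests are omitted.

```python
def meta_regression(effects, variances, covariate, tau2=0.0):
    """Weighted least squares of effect size on one study-level covariate.

    Weights are 1 / (variance + tau2). Returns (intercept, slope).
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Because the covariate is a study average (for example mean age), the slope describes differences between studies and may not hold for individual participants, which is the ecological bias noted above.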

Sensitivity analysis

A sensitivity analysis was performed to determine whether the choice of analysis method affected the primary findings of the meta-analysis. Specifically, we reanalysed the data on change in depression score using a random effects DerSimonian and Laird model without the Hartung-Knapp-Sidik-Jonkman modification and compared the results with those of the originally used model. This comparison is particularly important in the presence of substantial heterogeneity and the potential of small study effects to influence the intervention effect estimate. 55

Patient and public involvement

Research on novel depression treatments is of great interest to both patients and the public. Although patients and members of the public were not directly involved in the planning or writing of this manuscript owing to a lack of available funding for recruitment and researcher training, patients and members of the public read the manuscript after submission.

Figure 1 presents the flow of studies through the systematic review and meta-analysis. 56 A total of 4884 titles were retrieved from the five databases of published literature, and a further 368 titles were identified from the databases of unpublished and international literature in February 2023. After the removal of duplicate records, we screened the abstracts and titles of 875 reports. A further 12 studies were added after handsearching of reference lists and conference proceedings and abstracts. Overall, nine studies totalling 436 participants were eligible. The average age of the participants ranged from 36-60 years. During an updated search on 10 August 2023, no further studies were identified.

Fig 1

Flow of studies in systematic review and meta-analysis

After screening of the title and abstract, 61 titles remained for full text review. Native speakers helped to translate papers in languages other than English. The most common reasons for exclusion were the inclusion of healthy volunteers, absence of control groups, and use of a survey based design rather than an experimental design. After full text screening, nine studies were eligible for inclusion, and 15 clinical trials prospectively registered or underway as of August 2023 were noted for potential future inclusion in an update of this review (see supplementary Appendix D).

We sent requests for further information to the authors of studies by Griffiths et al, 57 Barrett, 58 and Benville et al, 59 because these studies appeared to meet the inclusion criteria but were only provided as summary abstracts online. A potentially eligible poster presentation from the 58th annual meeting of the American College of Neuropsychopharmacology was identified, but the lead author (Griffiths) clarified that all information from the presentation was included in the studies by Davis et al 13 and Gukasyan et al, 60 both of which we had already deemed ineligible.

Barrett 58 reported the effects of psilocybin on the cognitive flexibility and verbal reasoning of a subset of patients with major depressive disorder from Griffiths et al’s trial, 61 compared with a waitlist group, but when contacted, Barrett explained that the results were published in the study by Doss et al, 62 which we had already screened and judged ineligible (see supplementary Appendix E). Benville et al’s study 59 presented a follow-up of Ross et al’s study 17 on a subset of patients with cancer and high suicidal ideation and desire for hastened death at baseline. Measures of antidepressant effects of psilocybin treatment compared with niacin were taken before and after treatment crossover, but detailed results are not reported. Table 1 describes the characteristics of the included studies and table 2 lists the main findings of the studies.

Characteristics of included studies

Main findings of included studies

Side effects and adverse events

Side effects reported in the included studies were minor and transient (eg, short term increases in blood pressure, headache, and anxiety), and none were coded as serious. Carhart-Harris et al noted one instance of abnormal dreams and insomnia. 63 This side effect profile is consistent with findings from other meta-analyses. 30 68 Owing to the different scales and methods used to catalogue side effects and adverse events across trials, it was not possible to combine these data quantitatively (see supplementary Appendix F).

Risk of bias

The Cochrane RoB 2 tools were used to evaluate the included studies ( table 3 ). RoB 2 for randomised trials was used for the five reports of parallel randomised trials (Carhart-Harris et al 63 and its secondary analysis Barba et al, 64 Goodwin et al 18 and its secondary analysis Goodwin et al, 65 and von Rotz et al 66 ) and RoB 2 for crossover trials was used for the four reports of crossover randomised trials (Griffiths et al, 14 Grob et al, 15 and Ross et al 17 and its follow-up Ross et al 67 ). Supplementary Appendix G provides a detailed explanation of the assessment of the included studies.

Summary risk of bias assessment of included studies, based on domains in Cochrane Risk of Bias 2 tool

Quality of included studies

Confidence in the quality of the evidence for the meta-analysis was assessed using GRADE, 38 through the GRADEpro GDT software program. Figure 2 shows the results of this assessment, along with our summary of findings.

Fig 2

GRADE assessment outputs for outcomes investigated in meta-analysis (change in depression scores and response and remission rates). The risk in the intervention group (and its 95% CI) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). BDI=Beck depression inventory; CI=confidence interval; GRADE=Grading of Recommendations, Assessment, Development and Evaluation; HADS-D=hospital anxiety and depression scale; HAM-D=Hamilton depression rating scale; MADRS=Montgomery-Åsberg depression rating scale; QIDS=quick inventory of depressive symptomatology; RCT=randomised controlled trial; SD=standard deviation

Meta-analyses

Continuous data, change in depression scores —Using a Hartung-Knapp-Sidik-Jonkman modified random effects meta-analysis, change in depression scores was significantly greater after treatment with psilocybin compared with active placebo. The overall Hedges’ g (1.64, 95% CI 0.55 to 2.73) indicated a large effect size favouring psilocybin ( fig 3 ). PIs were, however, wide and crossed the line of no difference (95% CI −1.72 to 5.03), indicating that there could be settings or populations in which psilocybin intervention would be less efficacious.

Fig 3

Forest plot for overall change in depression scores from before to after treatment. CI=confidence interval; DL=DerSimonian and Laird; HKSJ=Hartung-Knapp-Sidik-Jonkman
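To make the distinction between a confidence interval and a prediction interval concrete, here is a minimal sketch of the commonly used prediction interval formula, pooled estimate ± t × sqrt(tau² + SE²), with t taken from a t distribution with k − 2 degrees of freedom. The function name and all numeric inputs are illustrative assumptions and are not the values from this review.

```python
import math


def prediction_interval(pooled, se, tau2, t_crit):
    """Approximate 95% prediction interval for the effect in a new study."""
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return pooled - half_width, pooled + half_width


# Hypothetical inputs: pooled effect 1.0, standard error 0.3, tau2 0.16.
# t_crit=2.365 is the 97.5th percentile of t with k - 2 = 7 degrees of
# freedom, which would apply to a meta-analysis of k = 9 studies.
low, high = prediction_interval(1.0, 0.3, 0.16, t_crit=2.365)
print(f"prediction interval: {low:.2f} to {high:.2f}")
```

Because tau² is added to the squared standard error, the prediction interval is wider than the confidence interval whenever between study variance is present, which is how a significant pooled effect can coexist with a prediction interval that crosses the line of no difference.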

Exploring publication bias in continuous data —We used Egger’s test and a funnel plot to examine the possibility of small study biases, such as publication bias. Statistical significance of Egger’s test for small study effects, along with the asymmetry in the funnel plot ( fig 4 ), indicates the presence of bias against smaller studies with non-significant results, suggesting that the pooled intervention effect estimate is likely to be overestimated. 69 An alternative explanation, however, is that smaller studies conducted at the early stages of a new psychotherapeutic intervention tend to include more high risk or responsive participants, and psychotherapeutic interventions tend to be delivered more effectively in smaller trials; both of these factors can exaggerate treatment effects, resulting in funnel plot asymmetry. 70 Also, because of the relatively small number of included studies and the considerable heterogeneity observed, test power may be insufficient to distinguish real asymmetry from chance. 71 Thus, this analysis should be considered exploratory.

Fig 4

Funnel plot assessing publication bias among studies measuring change in depression scores from before to after treatment. CI=confidence interval; θ IV =estimated effect size under inverse variance random effects model
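Egger's test, as used above, regresses each study's standardised effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a plain least squares version written for illustration only. The function name and the data are hypothetical, and the P value step is left out because it needs a t distribution that the standard library does not provide.

```python
import math


def egger_regression(effects, std_errors):
    """OLS of standardised effect (y / se) on precision (1 / se)."""
    x = [1.0 / s for s in std_errors]
    y = [e / s for e, s in zip(effects, std_errors)]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical data in which smaller studies (larger SE) report larger effects.
std_errors = [0.1, 0.2, 0.3, 0.4, 0.5]
effects = [0.5 + 1.0 * s for s in std_errors]
intercept, slope, se_intercept = egger_regression(effects, std_errors)
print(f"Egger intercept: {intercept:.3f}, slope: {slope:.3f}")
```

In practice the intercept divided by its standard error is compared against a t distribution with n − 2 degrees of freedom; with few studies that test has little power, as the text notes.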

Dichotomous data

We extracted response and remission rates for each group when reported directly, or imputed information when presented graphically. Two studies did not measure response or remission and thus did not contribute data for this part of the analysis. 15 18 The random effects model with a Hartung-Knapp-Sidik-Jonkman modification was used to allow for heterogeneity to be incorporated into the weighting of the included studies’ results, and to provide a better estimation of between study variance accounting for small sample sizes.

Response rate —Overall, the likelihood of psilocybin intervention leading to treatment response was about two times greater (risk ratio 2.02, 95% CI 1.33 to 3.07) than with placebo. Despite the use of different scales to measure response, the heterogeneity between studies was not significant (I 2 =25.7%, P=0.23). PIs were, however, wide and crossed the line of no difference (−0.94 to 3.88), indicating that there could be settings or populations in which psilocybin intervention would be less efficacious.

Remission rate —Overall, the likelihood of psilocybin intervention leading to remission of depression was nearly three times greater than with placebo (risk ratio 2.71, 95% CI 1.75 to 4.20). Despite the use of different scales to measure response, no statistical heterogeneity was found between studies (I 2 =0.0%, P=0.53). PIs were, however, wide and crossed the line of no difference (0.87 to 2.32), indicating that there could be settings or populations in which psilocybin intervention would be less efficacious.
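For readers less familiar with risk ratios, the sketch below shows how a single study's risk ratio and its 95% confidence interval are usually derived from event counts, using the standard error of the log risk ratio. The function name and the counts are hypothetical; they are not taken from any included trial, and the pooled values reported above come from a random effects model rather than from this single study calculation.

```python
import math
from statistics import NormalDist


def risk_ratio(events_treat, n_treat, events_ctrl, n_ctrl, level=0.95):
    """Risk ratio with a Wald confidence interval on the log scale."""
    rr = (events_treat / n_treat) / (events_ctrl / n_ctrl)
    se_log = math.sqrt(1 / events_treat - 1 / n_treat
                       + 1 / events_ctrl - 1 / n_ctrl)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 20 of 50 responders with treatment, 10 of 50 with control.
rr, lower, upper = risk_ratio(20, 50, 10, 50)
print(f"risk ratio {rr:.2f} ({lower:.2f} to {upper:.2f})")
```

A risk ratio of 1 is the line of no difference, so an interval that excludes 1 indicates a statistically significant difference at the chosen level.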

Exploring publication bias in response and remission rates data —We used Egger’s test and a funnel plot to examine whether response and remission estimates were affected by small study biases. The result for Egger’s test was non-significant (P>0.05) for both response and remission estimates, and no substantial asymmetry was observed in the funnel plots, providing no indication for the presence of bias against smaller studies with non-significant results.

Heterogeneity: subgroup analyses and metaregression

Heterogeneity was considerable across studies exploring changes in depression scores (I 2 =89.7%, P<0.005), triggering subgroup analyses to explore contributory factors. Table 4 and table 5 present the results of the heterogeneity analyses (subgroup analyses and metaregression, respectively). Also see supplementary Appendix H for a more detailed description and graphical representation of these results.
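The I² statistic quoted above is derived from Cochran's Q. A minimal sketch of that calculation, with a hypothetical function name and made-up inputs, could look like this:

```python
def cochran_q_and_i2(effects, variances):
    """Cochran's Q and the I2 statistic (as a percentage)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical effect sizes and variances for three studies.
q, i2 = cochran_q_and_i2([0.2, 0.8, 1.4], [0.04, 0.04, 0.04])
print(f"Q = {q:.1f}, I2 = {i2:.1f}%")
```

I² expresses the share of total variation across studies that exceeds what sampling error alone would produce, so it is truncated at zero when Q falls below its degrees of freedom.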

Subgroup analyses to explore potential causes of heterogeneity among included studies

Metaregression analyses to explore potential causes of heterogeneity among included studies

Cumulative meta-analyses

We used cumulative meta-analyses to investigate how the overall estimates of the outcomes of interest changed as each study was added in chronological order 72 ; change in depression scores and likelihood of treatment response both increased as the percentage of participants with past use of psychedelics increased across studies, as expected based on the metaregression analysis (see supplementary Appendix I). No other significant time related patterns were found.
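A cumulative meta-analysis of this kind can be sketched as a loop that re-pools the evidence each time the next study, in order of publication year, is added. The example below is a simplification made for illustration: it uses fixed effect inverse variance pooling in place of the random effects model used in the review, and the function names and study tuples are hypothetical.

```python
def pooled_estimate(effects, variances):
    """Inverse variance weighted mean (fixed effect, for simplicity)."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def cumulative_meta_analysis(studies):
    """studies: iterable of (year, effect, variance) tuples."""
    ordered = sorted(studies, key=lambda s: s[0])
    results = []
    for i in range(1, len(ordered) + 1):
        subset = ordered[:i]
        estimate = pooled_estimate([s[1] for s in subset],
                                   [s[2] for s in subset])
        results.append((ordered[i - 1][0], estimate))
    return results


# Hypothetical studies given out of chronological order.
studies = [(2016, 1.0, 0.1), (2011, 0.5, 0.1), (2021, 1.5, 0.1)]
for year, estimate in cumulative_meta_analysis(studies):
    print(year, round(estimate, 3))
```

Plotting the running estimate against year, or against another ordering variable, shows whether the pooled effect drifts as evidence accumulates.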

We reanalysed the data for change in depression scores using a random effects DerSimonian and Laird model without the Hartung-Knapp-Sidik-Jonkman modification and compared the results with those of the original model. All comparisons found to be significant using the DerSimonian and Laird model with the Hartung-Knapp-Sidik-Jonkman adjustment were also significant without the Hartung-Knapp-Sidik-Jonkman adjustment, and confidence intervals were only slightly narrower. Thus, small study effects do not appear to have played a major role in the treatment effect estimate.

Additionally, to estimate the accuracy and robustness of the estimated treatment effect, we excluded studies from the meta-analysis one by one; no important differences in the treatment effect, significance, and heterogeneity levels were observed after the exclusion of any study (see supplementary Appendix J).
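The one by one exclusion described above is often called a leave-one-out analysis. A minimal sketch, assuming fixed effect inverse variance pooling as a stand-in for the review's random effects model and using hypothetical names and data, is:

```python
def pooled_estimate(effects, variances):
    """Inverse variance weighted mean (fixed effect, for simplicity)."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results.append(pooled_estimate(kept_effects, kept_variances))
    return results


# Hypothetical effect sizes and variances for three studies.
omitted = leave_one_out([0.5, 1.0, 1.5], [0.1, 0.1, 0.1])
print([round(value, 3) for value in omitted])
```

If no single omission moves the estimate materially, the pooled result is unlikely to be driven by one influential study.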

In our meta-analysis we found that psilocybin use showed a significant benefit on change in depression scores compared with placebo. This is consistent with other recent meta-analyses and trials of psilocybin as a standalone treatment for depression 73 74 or in combination with psychological support. 24 25 29 30 31 32 68 75 This review adds to those findings by exploring the considerable heterogeneity across the studies, with subsequent subgroup analyses showing that the type of depression (primary or secondary) and the depression scale used (Montgomery-Åsberg depression rating scale, quick inventory of depressive symptomatology, or Beck depression inventory) had a significant differential effect on the outcome. High between study heterogeneity has been identified by some other meta-analyses of psilocybin (eg, Goldberg et al 29 ), with a higher treatment effect in studies with patients with comorbid life threatening conditions compared with patients with primary depression. 22 Although possible explanations, including personal factors (eg, patients with life threatening conditions being older) or depression related factors (eg, secondary depression being more severe than primary depression) could be considered, these hypotheses are not supported by baseline data (ie, patients with secondary depression do not differ substantially in age or symptom severity from patients with primary depression). The differential effects from assessment scales used have not been examined in other meta-analyses of psilocybin, but this review’s finding that studies using the Beck depression inventory showed a higher treatment effect than those using the Montgomery-Åsberg depression rating scale and quick inventory of depressive symptomatology is consistent with studies in the psychological literature that have shown larger treatment effects when self-report scales are used (eg, Beck depression inventory). 76 77 This finding may be because clinicians tend to overestimate the severity of depression symptoms at baseline assessments, leading to less pronounced differences between before and after treatment identified in clinician assessed scales (eg, Montgomery-Åsberg depression rating scale, quick inventory of depressive symptomatology). 78

Metaregression analyses further showed that a higher average age and a higher percentage of participants with past use of psychedelics both correlated with a greater improvement in depression scores with psilocybin use and explained a substantial amount of between study variability. However, the cumulative meta-analysis showed that the effects of age might be largely an artefact of the inclusion of one specific study, and alternative explanations are worth considering. For instance, Studerus et al 79 identified participants’ age as the only personal variable significantly associated with psilocybin response, with older participants reporting a higher “blissful state” experience. This might be because of older people’s increased experience in managing negative emotions and the decrease in 5-hydroxytryptamine type 2A receptor density associated with older age. 80 Furthermore, Rootman et al 81 reported that the cognitive performance of older participants (>55 years) improved significantly more than that of younger participants after micro dosing with psilocybin. Therefore, the higher decrease in depressive symptoms associated with older age could be attributed to a decrease in cognitive difficulties experienced by older participants.

Interestingly, a clear pattern emerged for past use of psychedelics—the higher the proportion of study participants who had used psychedelics in the past, the higher the post-psilocybin treatment effect observed. Past use of psychedelics has been proposed to create an expectancy bias among participants and amplify the positive effects of psilocybin 82 83 84 ; however, this important finding has not been examined in other meta-analyses and may highlight the role of expectancy in psilocybin research.

Limitations of this study

Generalisability of the findings of this meta-analysis was limited by the lack of racial and ethnic diversity in the included studies—more than 90% of participants were white across all included trials, resulting in a homogeneous sample that is not representative of the general population. Moreover, it was not possible to distinguish between subgroups of participants who had never used psilocybin and those who had taken psilocybin more than a year before the start of the trial, as these data were not provided in the included studies. Such a distinction would be important, as the effects of psilocybin on mood may wane within a year after being administered. 21 85 Also, how psychological support was conceptualised was inconsistent within studies of psilocybin interventions; many studies failed to clearly describe the type of psychological support participants received, and others used methods ranging from directive guidance throughout the treatment session to passive encouragement or reassurance (eg, Griffiths et al, 14 Carhart-Harris et al 63 ). The included studies also did not gather evidence on participants’ previous experiences with treatment approaches, which could influence their response to the trials’ intervention. Thus, differences between participant subgroups related to past use of psilocybin or psychotherapy may be substantial and could help interpret this study’s findings more accurately. Lastly, the use of graphical extraction software to estimate the findings of studies where exact numerical data were not available (eg, Goodwin et al, 18 Grob et al 15 ), may have affected the robustness of the analyses.

A common limitation in studies of psilocybin is the likelihood of expectancy effects augmenting the treatment effect observed. Although some studies used low dose psychedelics as comparators to deal with this problem (eg, Carhart-Harris et al, 63 Goodwin et al, 18 Griffiths et al 14 ) or used a niacin placebo that can induce effects similar to those of psilocybin (eg, Grob et al, 15 Ross et al 17 ), the extent to which these methods were effective in blinding participants is not known. Other studies have, however, reported that participants can accurately identify the study groups to which they had been assigned 70-85% of the time, 84 86 indicating a high likelihood of insufficient blinding. This is especially likely for studies in which a high proportion of participants had previously used psilocybin and other hallucinogens, making the identification of the drug’s acute effects easier (eg, Griffiths et al, 14 Grob et al, 15 Ross et al 17 ). Patients also have expectations related to the outcome of their treatment, expecting psilocybin to improve their symptoms of depression, and these positive expectancies are strong predictors of actual treatment effects. 87 88 Importantly, the effect of outcome expectations on treatment effect is particularly strong when patient reported measures are used as primary outcomes, 89 which was the case in several of the included studies (eg, Griffiths et al, 14 Grob et al, 15 Ross et al 17 ). Unfortunately, none of the included studies recorded expectations before treatment, so it is not possible to determine the extent to which this factor affected the findings.

Implications for clinical practice

Although this review’s findings are encouraging for psilocybin’s potential as an effective antidepressant, a few areas about its applicability in clinical practice remain unexplored. Firstly, it is unclear whether the protocols for psilocybin interventions in clinical trials can be reliably and safely implemented in clinical practice. In clinical trials, patients receive psilocybin in a non-traditional medical setting, such as a specially designed living room, while they may be listening to curated calming music and are isolated from most external stimuli by wearing eyeshades and external noise-cancelling earphones. A trained therapist closely supervises these sessions, and the patient usually receives one or more preparatory sessions before the treatment commences. Standardising an intervention setting with so many variables is unlikely to be achievable in routine practice, and consensus is considerably lacking on the psychotherapeutic training and accreditations needed for a therapist to deliver such treatment. 90 The combination of these elements makes this a relatively complex and expensive intervention, which could make it challenging to gain approval from regulatory agencies and to gain reimbursement from insurance companies and others. Within publicly funded healthcare systems, the high cost of treatment may make psilocybin treatment inaccessible. The high cost associated with the intervention also increases the risk that unregulated clinics may attempt to cut costs by making alterations to the protocol and the therapeutic process, 91 92 which could have detrimental effects for patients. 92 93 94 Thus, avoiding the conflation of medical and commercial interests is a primary concern that needs to be dealt with before psilocybin enters mainstream practice.

Implications for future research

More large scale randomised trials with long follow-up are needed to fully understand psilocybin’s treatment potential, and future studies should aim to recruit a more diverse population. Another factor that would make clinical trials more representative of routine practice would be to recruit patients who are currently using or have used commonly prescribed serotonergic antidepressants. Clinical trials tend to exclude such participants because many antidepressants that act on the serotonin system modulate the 5-hydroxytryptamine type 2A receptor that psilocybin primarily acts upon, with prolonged use of tricyclic antidepressants associated with more intense psychedelic experiences and use of monoamine oxidase inhibitors or SSRIs inducing weaker responses to psychedelics. 95 96 97 Investigating psilocybin in such patients would, however, provide valuable insight on how psilocybin interacts with commonly prescribed drugs for depression and would help inform clinical practice.

Minimising the influence of expectancy effects is another core problem for future studies. One strategy would be to include expectancy measures and explore the level of expectancy as a covariate in statistical analysis. Researchers should also test the effectiveness of condition masking. Another proposed solution would be to adopt a 2×2 balanced placebo design, where both the drug (psilocybin or placebo) and the instructions given to participants (told they have received psilocybin or told they have received placebo) are crossed. 98 Alternatively, clinical trials could adopt a three arm design that includes both an inactive placebo (eg, saline) and active placebo (eg, niacin, lower psilocybin dose), 98 allowing for the effects of psilocybin to be separated from those of the placebo.

Overall, future studies should explore psilocybin’s exact mechanism of treatment effectiveness and outline how its physiological effects, mystical experiences, dosage, treatment setting, psychological support, and relationship with the therapist all interact to produce a synergistic antidepressant effect. Although this may be difficult to achieve using an explanatory randomised trial design, pragmatic clinical trial designs may be better suited to psilocybin research, as their primary objective is to achieve high external validity and generalisability. Such studies may include multiple alternative treatments rather than simply an active and placebo treatment comparison (eg, psilocybin v SSRI v serotonin-noradrenaline reuptake inhibitor), and participants would be recruited from broader clinical populations. 99 100 Although such studies are usually conducted after a drug’s launch, 100 earlier use of such designs could help assess the clinical effectiveness of psilocybin more robustly and broaden patient access to a novel type of antidepressant treatment.

Conclusions

This review’s findings on psilocybin’s efficacy in reducing symptoms of depression are encouraging for its use in clinical practice as a drug intervention for patients with primary or secondary depression, particularly when combined with psychological support and administered in a supervised clinical environment. However, the highly standardised treatment setting, high cost, and lack of regulatory guidelines and legal safeguards associated with psilocybin treatment need to be dealt with before it can be established in clinical practice.

What is already known on this topic

Recent research on treatments for depression has focused on psychedelic agents that could have strong antidepressant effects without the drawbacks of classic antidepressants, psilocybin being one such substance

Over the past decade, several clinical trials, meta-analyses, and systematic reviews have investigated the use of psilocybin for symptoms of depression, and most have found that psilocybin can have antidepressant effects

Studies published to date have not investigated factors that may moderate psilocybin’s effects, including type of depression, past use of psychedelics, dosage, outcome measures, and publication biases

What this study adds

This review showed a significantly greater efficacy of psilocybin among patients with secondary depression, patients with past use of psychedelics, older patients, and studies using self-report measures for symptoms of depression

Efficacy did not appear to be homogeneous across patient types—for example, those with depression and a life threatening illness appeared to benefit more from treatment

Further research is needed to clarify the factors that maximise psilocybin’s treatment potential for symptoms of depression

Ethics statements

Ethical approval.

This study was approved by the ethics committee of the University of Oxford Nuffield Department of Medicine, which waived the need for ethical approval and the need to obtain consent for the collection, analysis, and publication of the retrospectively obtained anonymised data for this non-interventional study.

Data availability statement

The relevant aggregated data and statistical code will be made available on reasonable request to the corresponding author.

Acknowledgments

We thank DT who acted as an independent secondary reviewer during the study selection and data review process.

Contributors: AMM contributed to the design and implementation of the research, analysis of the results, and writing of the manuscript. MC was involved in planning and supervising the work and contributed to the writing of the manuscript. AMM and MC are the guarantors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding: None received.

Competing interests: All authors have completed the ICMJE uniform disclosure form at https://www.icmje.org/disclosure-of-interest/ and declare: no support from any organisation for the submitted work; AMM is employed by IDEA Pharma, which does consultancy work for pharmaceutical companies developing drugs for physical and mental health conditions; MC was the supervisor for AMM’s University of Oxford MSc dissertation, which forms the basis for this paper; no other relationships or activities that could appear to have influenced the submitted work.

Transparency: The corresponding author (AMM) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as registered have been explained.

Dissemination to participants and related patient and public communities: To disseminate our findings and increase the impact of our research, we plan on writing several social media posts and blog posts outlining the main conclusions of our paper. These will include blog posts on the websites of the University of Oxford’s Department of Primary Care Health Sciences and Department for Continuing Education, as well as print publications, which are likely to reach a wider audience. Furthermore, we plan to present our findings and discuss them with the public in local mental health related events and conferences, which are routinely attended by patient groups and advocacy organisations.

Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ .

  • World Health Organization. Depressive Disorder (Depression); 2023. https://www.who.int/news-room/fact-sheets/detail/depression
  • Borenstein M, Hedges L, Rothstein H. Meta-analysis: Fixed effect vs. random effects. Meta-analysis.com. 2007;1-62.
  • Higgins JP, Green S. Identifying and measuring heterogeneity. Cochrane handbook for systematic reviews of interventions. 2011;5(0).
  • Iyengar S, Greenhouse J. Sensitivity analysis and diagnostics. Handbook of research synthesis and meta-analysis. Russell Sage Foundation, 2009:417-33.
  • Griffiths R, Barrett F, Johnson M, Mary C, Patrick F, Alan D. Psilocybin-Assisted Treatment of Major Depressive Disorder: Results From a Randomized Trial. Proceedings of the ACNP 58th Annual Meeting: Poster Session II. In Neuropsychopharmacology. 2019;44:230-384.
  • Barrett F. ACNP 58th Annual Meeting: Panels, Mini-Panels and Study Groups. [Abstract.] Neuropsychopharmacology 2019;44:1-77. doi: 10.1038/s41386-019-0544-z

  • Open access
  • Published: 07 May 2024

Spiritual nursing education programme for nursing students in Korea: a systematic review and meta-analysis

  • Hyun-Jin Cho 1 ,
  • Kyoungrim Kang 2 &
  • Kyo-Yeon Park 1  

BMC Nursing volume 23, Article number: 310 (2024)

This study conducts a systematic review and meta-analysis to understand the characteristics and contents of studies on spiritual nursing education programmes and their effects.

The literature search covered five South Korean databases (RISS, KISS, DBpia, Science ON, and KmBase) for studies published up to September 30, 2021. Nine studies were included in the final review, six of which were meta-analysed using RevMan 5.4.1. The programmes targeted nursing students and nurses in the RN-BSN course and employed methods such as lecturing, discussions, and case presentations. The contents focused on self-spirituality awareness, spirituality-related concepts, understanding others’ spirituality, and the process and application of spiritual nursing.

The meta-analysis revealed statistically significant effects on spiritual nursing competencies, spirituality, spiritual well-being, existential well-being, and spiritual needs, except self-esteem. Spiritual nursing education was effective in enhancing spiritual nursing competencies.

The study confirmed that spiritual nursing education effectively improves spiritual nursing competency, indicating a need for increased focus and administrative and financial support for such education in schools and hospitals. Furthermore, future studies should employ randomised experimental designs to examine the effects of online education programmes with short training time on clinical nurses in hospitals.


Introduction

Spiritual nursing involves providing care that recognises and responds to the spiritual needs of people in specific situations such as birth, trauma, illness, and loss [ 1 ]. It can provide answers to fundamental questions such as meaning of life, suffering, distress, and death through the person’s inner healing resources [ 2 ]. The physical, mental, social, and spiritual aspects of human beings dynamically interact with one another, with the spiritual aspect actively integrating and regulating all aspects [ 3 ]. The goal to enhance understanding and management of human health and disease has led to a growing interest in the spiritual aspect [ 4 ]. Spiritual nursing, as a trend, is reinforced as an essential obligation in modern nursing [ 5 ], which pursues holistic health management [ 6 ]. Therefore, spiritual nursing is considered an important core concept of holistic nursing [ 7 ].

Despite the importance of the spiritual aspect in health, most nurses have never received training in spiritual nursing and have little experience in utilizing it [ 8 ]. Nursing college students lack clinical practicum opportunities in spiritual nursing, which hinders them from gaining experience in this area [ 9 ]. Additionally, nurses often avoid spiritual nursing because they perceive it as a religious concept or as unscientific owing to the abstract nature of the spiritual aspect and their lack of knowledge [ 10 ]. Moreover, spiritual nursing can be challenging owing to a lack of time, prioritisation of physical nursing tasks, and inadequate staff training [ 11 , 12 ].

Spiritual nursing education has been proposed as a solution to these problems, leading to the development of spiritual nursing education programmes [ 11 ]. Nurses must learn about spirituality and spiritual nursing to enhance patients’ quality of life, health, well-being, coping mechanisms, and decision-making [ 13 ]. Spiritual nursing education serves as a means to integrate spiritual aspects into comprehensive patient care [ 14 ]. Previous studies have shown that spiritual nursing education increases nurses’ ability to assess spiritual needs, enhances their competency in spiritual care, and improves their performance in spiritual nursing [ 11 ]. Moreover, nurses’ positive attitudes towards spiritual nursing influence their intention to engage in spiritual nursing and provide spiritual care [ 4 ]. Therefore, spiritual nursing education should aim to improve nurses’ attitude towards spiritual nursing and promote the application of spiritual nursing [ 15 ].

Spiritual nursing education programmes apply various curricula, content, delivery, and evaluation methods [ 16 ]. The educational content varies and may include the definition of spirituality or spiritual nursing, personal spiritual awareness, understanding of spiritual anguish, communication skills, comparative religious studies, and spiritual nursing ethics. Similarly, the educational methods employed comprise lectures, online education, simulation, role-plays, videos, group discussions, individual reflections, and practice [ 11 , 13 , 14 ]. This diversity arises from the lack of consensus on the meaning of spirituality and the unclear content and evaluation methods for spiritual nursing education programmes [ 1 ]. However, it also offers the opportunity to learn different approaches to caring for individuals from different social, cultural, and spiritual backgrounds [ 13 ]. Furthermore, diverse education programmes can serve as a foundation for applying the most effective approach in situations requiring spiritual nursing education. Therefore, a systematic review can help identify strategies for integrating spirituality and spiritual nursing by examining the contents and the teaching and evaluation methods of spiritual nursing education programmes.

Although several studies abroad conducted systematic reviews of spiritual nursing education programmes [ 11 , 13 , 14 ], only one phenomenological study [ 17 ] examined the experiences that Korean nursing students could potentially gain through spiritual nursing practicum and only one randomised controlled trial (RCT) study [ 18 ] investigated the effectiveness of spirituality training programmes focusing solely on spirituality on Korean nurses. Thus, a systematic review of spiritual nursing education programmes in Korea is necessary. In particular, Christianity began as a medical mission in Korea, establishing Korea’s first modern hospital, and subsequently expanded [ 19 ]. Consequently, nurses confused spiritual nursing with medical mission and perceived spiritual nursing as a compulsion to a certain religion [ 20 ], or Christian evangelism [ 21 ]. Moreover, the Korean culture values dignity and the views of others rather than the personal factors of self-satisfaction. The pursuit of spirituality also values harmony with the absolute, others, ancestors, and society [ 3 ]. Therefore, a spiritual nursing education programme that considers Korea’s cultural practices is necessary [ 22 ] to implement spiritual nursing in the nursing field.

Thus, using previous studies’ results related to the spiritual nursing education programmes in Korea, this study aims to systematically and scientifically integrate the contents, methods, and effects of these programmes. It seeks to develop and apply an effective programme for evidence-based practice, providing evidence for enhancing spiritual nursing competency and guiding future research directions.

Study design

This study is a systematic review that aims to understand the characteristics and effects of spiritual nursing education programmes on nurses and nursing students in Korea. The study used the systematic review manual of the National Evidence-based Healthcare Collaborating Agency [ 23 ]. The protocol for this review was registered in the International Prospective Register of Systematic Reviews (PROSPERO, ID: CRD42022326776).

Search strategy and study selection

Studies published up to September 30, 2021, were examined using electronic databases. Earlier studies [ 10 , 12 , 13 ] published in 2015, 2016, and 2021 conducted systematic literature reviews of spiritual nursing education programmes using foreign databases and did not search the domestic literature suitable for the eligibility criteria of this study. Therefore, this study utilized the five most used databases in Korea: the Research Information Sharing Service (RISS), Korean Studies Information Service System (KISS), DataBase Periodical Information Academic (DBpia), Science ON (formerly National Discovery for Science Library [NDSL]), and KmBase. There was no restriction on the search period, and all documents corresponding to related subject words were searched until the search date (September 30, 2021). To increase sensitivity of the literature search, grey literature was manually searched using Google Scholar. Furthermore, additional literature was searched by reviewing the reference lists of studies obtained through the database search. ‘Nurse’, ‘nursing student’, ‘spiritual nursing’, ‘education’, and ‘programme’ were used as literature search terms. Three researchers independently performed the literature selection process. Intervention studies on the effectiveness of spiritual nursing education programmes for nurses or nursing students were included, while review articles, conference abstracts, or unpublished manuscripts were excluded. The full inclusion and exclusion criteria are presented in Table  1 .

Data extraction and quality assessment

Relevant data were extracted using a standardised data collection form, which included information on authors, publication year, research design, study subjects (number, grade), programme characteristics (training place, type, session/time/period/evaluation time, education methods, conceptual framework, and contents), measurement tools, variable measurement results, limitations, and suggestions (Tables 2 and 3 ). The Cochrane Risk of Bias tool (RoB 1) was used to evaluate the quality of the randomised study [ 24 ]. RoB 1 assesses seven domains: random sequence generation, allocation concealment, blinding of study participants and researchers, blinding of outcome assessment, incomplete outcome data, selective reporting, and other biases. Each domain was rated as having low, high, or uncertain risk of bias. Non-randomised studies were evaluated using the Risk of Bias Assessment Tool for Non-randomised Studies (RoBANS) [ 25 ] developed by the National Evidence-based Healthcare Collaborating Agency [ 23 ]. RoBANS assesses six domains: subject group selection, confounding variables, intervention (exposure) measurement, blinding of outcome evaluation, incomplete data, and selective outcome reporting. Each domain was evaluated as having low, high, or uncertain risk of bias. In this study, the risk of bias was considered low for non-randomised studies if the subjects were similar in the experimental and control groups and were prospectively and continuously recruited. The risk of bias was also considered low if the intervention was made after confirmation of exposure to spiritual nursing-related education, a questionnaire with confirmed reliability and validity was used, blinding of the outcome evaluator was reported, dropouts and reasons were reported or missing values did not affect the outcome, or the outcome value for the pre-defined outcome variable was reported.
Three researchers independently evaluated each piece of literature, and any disagreements were resolved through discussion.

Data synthesis and meta-analysis

For studies where quantitative synthesis was possible, we conducted a meta-analysis using Cochrane’s RevMan 5.4.1 programme. This was performed when the same outcome variables could be analysed, or when pre- and post-mean and standard deviation values for the outcome variables were available. Subgroup meta-analysis was performed when at least two studies had the same outcome variables. In calculating the effect size, the outcome variables of each synthesised study were analysed as continuous variables, using the mean and standard deviation. The Standardised Mean Difference (SMD) was selected as the effect size measure for each outcome variable. The statistical significance level for effect size was set at 0.05, and the confidence interval was set at 95%. Heterogeneity between studies was first assessed visually, by checking the overlap of the confidence intervals and effect estimates in the forest plot. It was then evaluated quantitatively using Cochran’s chi-square (Q) test and Higgins’ I² statistic. The I² value is 0% when there is no heterogeneity, 30-60% when there is moderate heterogeneity, and more than 75% when there is large heterogeneity [ 23 ].
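As an illustration of the quantities named above, the following minimal sketch computes an SMD with the Hedges small-sample correction, together with Cochran's Q and I². The function names and the variance approximation are assumptions chosen for illustration; the review itself used RevMan, whose internal formulas may differ in detail.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """SMD between two groups with Hedges' small-sample correction.

    Illustrative helper (not part of RevMan); the variance is one
    common large-sample approximation.
    """
    # Pooled standard deviation of the two groups
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled
    # Correction factor J shrinks d slightly for small samples
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d

def cochran_q_and_i2(effects, variances):
    """Cochran's Q and Higgins' I² (in percent) for a set of study effects."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of Q in excess of its degrees of freedom, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

With two hypothetical studies whose effects are 0.0 and 1.0 and whose variances are both 0.1, `cochran_q_and_i2` returns Q = 5 and I² = 80%, which the thresholds above would class as large heterogeneity.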

Analysis was conducted using a random-effects model, which adjusts weights to account for intersubject variation and heterogeneity between the studies used in meta-analysis. Given the diversity in samples, intervention methods, intervention period, and measurement tools across studies, the random-effects model was used when the heterogeneity was I² = 50% or higher. When inputting data, if the outcome variables were measured twice, only the value calculated immediately after training was included. If the standard deviation for the difference before and after education was missing, the correlation coefficient calculated in another study was used and the missing standard deviation was replaced using the correlation coefficient [ 23 ].
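The two steps described here can be sketched as follows. The imputation uses the standard relation between pre, post, and change-score standard deviations for a given pre-post correlation r. The pooling uses the DerSimonian-Laird estimator, which is assumed here as the random-effects method because the source does not name one. Function names are illustrative.

```python
import math

def sd_change(sd_pre, sd_post, r):
    """Impute the SD of the pre-post change from an assumed correlation r."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau².

    Assumption: DerSimonian-Laird moment estimator of the
    between-study variance tau².
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau² to each study's own variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2
```

For example, with hypothetical pre and post SDs of 3 and 4 and a borrowed correlation of 0.5, the imputed change-score SD is the square root of 13, about 3.61. A higher assumed correlation gives a smaller change-score SD, so the choice of r directly affects study weights.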

Study selection and characteristics

Literature was obtained using electronic databases, and the review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting guidelines. The database search identified 612 studies: 425 from RISS, 19 from KISS, 95 from DBpia, 64 from Science ON, and nine from KmBase. After removing 318 duplicate papers using RefWorks, we reviewed the titles and abstracts of 294 papers based on the inclusion and exclusion criteria; 280 studies that did not fit the study purpose were excluded. After a detailed review of the full texts of the remaining 14 studies, one study with unverifiable original text, two studies focusing solely on spirituality rather than spiritual nursing, and three theses duplicating academic papers were excluded. Eight studies were thus included. Additionally, two studies found in the reference lists of the selected studies were reviewed, with one meeting the inclusion criteria. The original text of the other study could not be confirmed. Finally, nine articles were selected for the systematic review (see Additional file 1 ). Among these, six studies suitable for quantitative synthesis were meta-analysed. Disagreements among the three researchers were resolved through discussion. The exclusion of literature at each selection stage was recorded, and the document selection process was described using the PRISMA 2020 flow chart [ 26 ] (Fig. 1 ).

Fig. 1 Flow diagram of study selection process
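The counts reported for the selection process can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Records identified per database, as reported in the text
records_by_database = {"RISS": 425, "KISS": 19, "DBpia": 95, "Science ON": 64, "KmBase": 9}
identified = sum(records_by_database.values())

duplicates_removed = 318
screened = identified - duplicates_removed

excluded_on_title_abstract = 280
full_text_assessed = screened - excluded_on_title_abstract

# Reasons for exclusion at the full-text stage
full_text_excluded = {
    "unverifiable original text": 1,
    "spirituality only, not spiritual nursing": 2,
    "thesis duplicating an academic paper": 3,
}
included_from_databases = full_text_assessed - sum(full_text_excluded.values())

# One of the two studies found in reference lists met the criteria
included_from_reference_lists = 1
included_in_review = included_from_databases + included_from_reference_lists

print(identified, screened, full_text_assessed, included_from_databases, included_in_review)
```

The stages are internally consistent: 612 records identified, 294 screened, 14 assessed in full text, 8 included from the databases, and 9 included overall.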

Characteristics of selected literature

All nine studies were published in journals in Korea. Two studies each were published in 2016 and 2019, and one study each was published in 1999, 2011, 2014, 2018, and 2021. The study designs were as follows: three were single-group pre-post quasi-experimental studies, three were quasi-experimental studies with non-equivalent control groups, two were non-synchronised pre-post quasi-experimental studies with non-equivalent control groups, and one was a randomised controlled pre-post experimental study. Seven studies focused on nursing students, while two targeted nurses in a bachelor’s degree programme (Registered Nurse-Bachelor of Science in Nursing, RN-BSN). Among the nine studies, six included 80 to 99 participants, and three included 60 to 79 participants. Four studies targeted junior students, two targeted senior students, two targeted students in the first and second semesters of the RN-BSN course, and one targeted sophomore students. All training sessions were conducted in university classrooms. Five programmes utilised non-formal education, while four were operated through regular education courses. Session length was 120 min in four studies, 60 min in two studies, 90 min in two studies, and 150 min in one study. As for the number of sessions, three studies had seven sessions each, while the remaining six studies had eight, nine, 10, 12, 14, and 16 sessions, respectively.

Characteristics of spiritual nursing education programmes

1) Teaching methods and content of spiritual nursing education programmes.

All nine studies in the review used lectures as a teaching method. Eight studies incorporated presentations, seven used discussions, three employed case studies, two featured practice sessions, and two included role-play. Additionally, relaxation techniques, action learning, and various tests were used. As for evaluation methods, four studies used reports and two used written exams. All nine studies delivered lectures using PowerPoint (PPT) as an educational medium, and seven of them also used various videos. Furthermore, one study presented photos, and another utilized sound equipment for relaxation. Six studies employed a conceptual framework, with three utilizing the Actioning Spirituality and Spiritual Care Education and Training in nursing (ASSET) model, two employing the textbook ‘Spiritual Nursing Module: Completion of Holistic Care’, one using the Analysis, Design, Development, Implementation, Evaluation (ADDIE) model, one applying the Psychological Empowerment Theory, and one implementing the Rogers Human-Centred Theory. Three studies did not mention any conceptual framework.

The content of spiritual nursing education was categorised into self-awareness, concepts related to spirituality, understanding of others, spiritual nursing process, and spiritual nursing applications. As for self-awareness, eight studies focused on examining the ego state to revise self-image through positive objective self-recognition in the first and second sessions of the programme. Each study included self-reflection in the third and ninth sessions. Four studies used tests such as ego-gram, MBTI, and enneagram tests, while five applied a holistic understanding of humans. The spiritual nursing process (assessment, diagnosis, planning, intervention, and evaluation) was included in all nine studies. Spiritual nursing intervention methods included therapeutic use of self, such as being present together, attentive listening, touch, massage, lettering, poetry, laughter therapy, music therapy, occupational therapy, horticultural therapy, walking, meditation, use of the Bible, prayer, hymns, support for religious activities, spiritual counselling, clergy referral, and offering support groups. The spiritual nursing process was covered in sessions 3 to 11 of the programme for training in spiritual nursing practice.

Eight studies applied spiritual nursing in various contexts: they targeted general, surgical, cancer, elderly, paediatric, and end-of-life patients, employed spiritual nursing in clinical situations, and explored spiritual nursing case studies and presentations. One study conducted a practicum at facilities for single mothers, the disabled, orphanages, and hospitals. Another study focused on nurses in bachelor’s degree programmes, asking them to apply their spiritual nursing education in practice by selecting patients themselves. The application of spiritual nursing was typically introduced in the latter part of the programme for practical use in clinical settings. Additionally, some studies included Nightingale’s nursing philosophy as educational content, while others explored the meaning of life and death.

2) Effects of spiritual nursing education programmes.

The effectiveness of spiritual nursing education programmes was confirmed using measures such as spiritual care competency (eight studies), spiritual well-being (six studies), spirituality (four studies), spiritual needs (three studies), existential well-being (two studies), self-esteem (two studies), self-identity (one study), life satisfaction (one study), empathy (one study), communication ability (one study), and attitude towards death (one study). All surveys were conducted using self-report questionnaires. Measurements were taken before and after training in eight studies, and before, after, and five weeks after training in one study.

In the eight studies that measured spiritual care competency as a major variable, overall spiritual care competency increased significantly after the spiritual nursing education programme. However, for the subdomains of spiritual care competency, one study did not demonstrate any significant difference in ‘professionalisation and improving the quality of spiritual care’ and ‘referral to professionals’, while another study did not show a significant difference in ‘communication’. Three studies did not specify subdomains, and some studies measured only the ‘assessment and implementation of spiritual care’, ‘professionalisation and improving the quality of spiritual care’, and ‘personal support and patient counselling’ among the elements of spiritual care competency.

Among the six studies focusing on spiritual well-being, four showed a significant increase in spiritual well-being, one found no significant difference in religious well-being in the subdomain, and another showed no significant difference in existential well-being. Regarding the four studies that measured spirituality as a major variable, three showed a significant increase in the degree of spirituality, while one showed no significant change in transcendence in the subdomain. In studies measuring spiritual need as the main variable, all three studies showed significant differences, although one reported a significant decrease in spiritual need. Two studies reported a significant increase in spiritual need, with one showing no significant difference in the subdomains of ‘love and peace’ and ‘the meaning and purposes of life’.

Both studies that measured existential well-being as the main variable showed a significant increase in its degree. However, only one of the two studies measuring self-esteem as a major variable showed a significant increase in the degree of self-esteem. The studies measuring self-identity, life satisfaction, and empathy as the main variables showed significant increases, while one study measuring communication ability showed no significant difference. One study measured attitude towards death as a major variable. A significant increase was found in the need for prolonging the life of terminally ill patients and for an organisation dedicated to protection facilities, dedicated personnel, and the elderly problem. However, no significant differences were found in response to dying patient’s needs, death notice, attitude towards the dying, general attitude towards death, and dying patient’s family problems before and after education.

Evaluation of the quality of included literature

The quality evaluation of a randomised study using Cochrane’s RoB tool showed a low risk of bias. Random numbers were assigned using a random numbering programme to ‘generate random assignment order’, and the allocation table was covered in an opaque envelope to ‘hide allocation order’. For the items ‘blind for research participants and researcher’ and ‘blind for outcome evaluation’, the assignment table was managed by an assistant involved in the curriculum, thus preventing exposure of the order to the experimental and control groups until the start of spiritual education. As for ‘insufficient result data’, dropouts occurred in both groups for similar causes, and the risk of bias was rated as low in ‘selective reporting’ and ‘other bias’.

Of the eight non-randomised studies evaluated using the RoBANS tool, for the ‘subject group selection’ item, four studies confirmed that the experimental and control groups were the same or from the same period, and as the subjects did not receive spiritual nursing education at the time of study participation, it resulted in a low risk of bias. However, in one study, it was unclear whether there was an intervention in the study participants during the study participation. Additionally, in three studies, participants were not continuously recruited, leading to a high risk of bias. In terms of the ‘confounder variable’ items, five studies identified and appropriately considered the confounding variables, resulting in a low risk of bias, but the risk was uncertain in three studies. All eight studies used self-response for the ‘intervention exposure measurement’ item. Seven studies used the tools verified in previous studies, suggesting low risk of bias due to tool reliability. One study developed a tool by extracting from the literature, but as its validity and reliability were not presented, the risk of bias was high. The information reported in all eight studies was insufficient for the item ‘blindness for outcome evaluation’. ‘Incomplete outcome data’ showed low risk of bias in seven studies, except for one study with large difference in missing values between the groups. One study was evaluated as having high risk of bias for the ‘selective outcome report’ item owing to undefined results, while seven studies had low risk of bias by including all expected results (Fig.  2 ).

Fig. 2 Risk of bias graph for non-randomized controlled studies
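The domain-level judgements described above for the eight non-randomised studies can be tabulated as the percentages that a risk of bias graph displays. The counts below are a reading of the text, with "insufficient information" treated as unclear risk. The domain labels and the helper name are illustrative.

```python
# RoBANS judgements per domain across the eight non-randomised studies,
# transcribed from the narrative (assumption: insufficient reporting = "unclear")
robans = {
    "Selection of participants": {"low": 4, "unclear": 1, "high": 3},
    "Confounding variables": {"low": 5, "unclear": 3, "high": 0},
    "Measurement of exposure": {"low": 7, "unclear": 0, "high": 1},
    "Blinding of outcome assessment": {"low": 0, "unclear": 8, "high": 0},
    "Incomplete outcome data": {"low": 7, "unclear": 0, "high": 1},
    "Selective outcome reporting": {"low": 7, "unclear": 0, "high": 1},
}

def to_percentages(counts):
    """Convert judgement counts for one domain into percentages of studies."""
    total = sum(counts.values())
    return {level: 100 * n / total for level, n in counts.items()}

for domain, counts in robans.items():
    pct = to_percentages(counts)
    print(f"{domain}: low {pct['low']:.1f}%, unclear {pct['unclear']:.1f}%, high {pct['high']:.1f}%")
```

Each domain sums to eight studies, which is a quick consistency check on the narrative before a graph is drawn from it.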

Outcome variables and effect size of the included literature

Of the nine studies, six compared a spiritual nursing education programme with a control group that did not receive the intervention. These studies presented the pre- and post-mean and standard deviation of the outcome variables, enabling effect size analysis (Fig. 3 ).

Fig. 3 Forest plot of meta analysis on effects of spiritual care education

The six studies measuring spiritual care competency as a major variable were heterogeneous (Higgins I² = 94%). Spiritual care competency in the experimental group was 1.56 standard deviations higher than in the control group (n = 460, SMD = 1.56, 95% CI 0.70 to 2.43), a statistically significant difference (Z = 3.56, p = .0004).

The four studies measuring spirituality as a major variable were heterogeneous (Higgins I² = 90%). Spirituality in the experimental group was 0.82 standard deviations higher than in the control group (n = 317, SMD = 0.82, 95% CI 0.08 to 1.55), a statistically significant difference (Z = 2.18, p = .03).

The four studies measuring spiritual well-being as a major variable were heterogeneous (Higgins I² = 85%). Spiritual well-being in the experimental group was 0.65 standard deviations higher than in the control group (n = 317, SMD = 0.65, 95% CI 0.07 to 1.24), a statistically significant difference (Z = 2.18, p = .03).

The two studies measuring existential well-being as a major variable were heterogeneous (Higgins I² = 72%). Existential well-being in the experimental group was 0.76 standard deviations higher than in the control group (n = 143, SMD = 0.76, 95% CI 0.11 to 1.40), a statistically significant difference (Z = 2.31, p = .02).

The two studies measuring spiritual needs as a major variable were homogeneous (Higgins I² = 0%). Spiritual need in the experimental group was 0.51 standard deviations higher than in the control group (n = 153, SMD = 0.51, 95% CI 0.19 to 0.83), a statistically significant difference (Z = 3.11, p = .002).

The two studies measuring self-esteem as a major variable were heterogeneous (Higgins I² = 94%). Self-esteem in the experimental group was 1.06 standard deviations higher than in the control group (n = 143, SMD = 1.06, 95% CI −0.43 to 2.55), but the difference was not statistically significant (Z = 1.40, p = .16).
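The reported Z statistics and p-values can be approximately recovered from each pooled SMD and its 95% confidence interval, assuming the interval is symmetric on the SMD scale and based on the normal distribution. The sketch below uses only the values reported above; the function name is illustrative, and small discrepancies are expected because the published figures are rounded to two decimals.

```python
import math

def z_and_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate SE, Z, and the two-sided p-value from a 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# (SMD, CI lower, CI upper) as reported for each outcome
reported = {
    "spiritual care competency": (1.56, 0.70, 2.43),
    "spirituality": (0.82, 0.08, 1.55),
    "spiritual well-being": (0.65, 0.07, 1.24),
    "existential well-being": (0.76, 0.11, 1.40),
    "spiritual needs": (0.51, 0.19, 0.83),
    "self-esteem": (1.06, -0.43, 2.55),
}

for outcome, (smd, lo, hi) in reported.items():
    se, z, p = z_and_p_from_ci(smd, lo, hi)
    print(f"{outcome}: SE = {se:.3f}, Z = {z:.2f}, p = {p:.4f}")
```

The self-esteem interval crosses zero, so its back-calculated p-value exceeds 0.05, which matches the non-significant result reported for that outcome.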

The systematic review and meta-analysis aimed to confirm the characteristics and contents of spiritual nursing education programme research, identify the effect of the spiritual nursing education programme, develop a spiritual nursing education programme by integrating education content and methods, and provide evidence for interventions strengthening the spiritual care competency.

This study confirmed that although interest and attempts in interventional research in spiritual nursing education are increasing, research on spiritual nursing education programmes in Korea is still not actively conducted. These results are similar to those of previous systematic reviews in other countries that showed increasing focus on spiritual nursing education for healthcare professionals [ 11 , 14 ]. However, compared to the increase in observational studies on spiritual nursing [ 22 ], intervention studies on spiritual nursing education were insufficient and required comprehensive implementation.

As for the study participants, seven studies were conducted on nursing students, including those in the RN-BSN course. Nursing students are easier to recruit than nurses, as education for students is implemented in a classroom setting, eliminating the need for an intervention environment. However, in a systematic review of spiritual nursing education, Jones and Paal [ 11 ] identified 13 studies on nursing students and 14 studies on nurses conducted over the past ten years. Unlike in Korea, researchers in other countries actively conduct studies on intervention in educational programmes targeting nurses. Six studies included in this review also suggested that research on clinical nurses be conducted [ 6 , 8 , 10 , 27 , 28 , 29 ]. It is crucial to explore whether spiritual nursing education programmes for nurses can affect the provision of spiritual nursing intervention to patients. Therefore, developing and actively applying practical spiritual nursing education programmes at the hospital level is necessary to enable the practice of spiritual nursing.

In terms of study design, eight out of nine studies were quasi-experimental ones, with three [ 27 , 28 , 30 ] among them using a one-group pre-post design. Difficulties could arise in designing a randomised controlled experimental study because students could not choose a class in regular course education, and time and psychological constraints pose challenges for students in special lecture-type education [ 31 ]. Limited availability of randomised controlled experimental studies led to the inclusion of quasi-experimental studies, which is a limitation of this study. Future studies should consider an RCT design to prevent subject selection bias and to clearly measure the effect of spiritual nursing education. Additionally, only one study performed a follow-up evaluation after five weeks of education [ 29 ], suggesting the need to consider a follow-up period to confirm continuity of the study results.

The duration of training sessions ranged from 60 to 150 min, with 120 min being the most common. This duration was likely based on the form of regular education or special lectures for nursing students and organised according to the average credits per class for bachelor’s degrees. Previous systematic reviews conducted abroad reported varying teaching hours, ranging from 30-minute sessions to all-day lectures [ 11 ]. This may be because, as found in a study on nurses, training was provided to them during lunch breaks or in the form of workshops, depending on the situation [ 32 ]. Previous studies have reported that spiritual nursing education was effective even in a short curriculum [ 11 ]. Therefore, when planning future research for three-shift nurses, it is necessary to consider short training durations so that nurses can secure sufficient time for education.

In most studies, educational programmes were conducted in the form of general lectures, which is the most common educational method. However, research on nurses must address the shortcomings of face-to-face education considering the difficulty in adjusting nurses’ schedules owing to shift work. As previous international studies have shown the effectiveness of online spiritual nursing education [ 33 , 34 , 35 , 36 ], it is necessary to consider online non-face-to-face education. Additionally, field practice for spiritual nursing application was conducted in only one study [ 28 ], while two studies suggested research to combine theory and practice [ 10 , 37 ]. As field practice prepares nurses to function in real situations [ 38 ], more studies should include field practice. Furthermore, for research targeting nursing students, considering that practicum is currently limited due to COVID-19, education through simulation must be applied as it is as effective as actual clinical practicum in increasing confidence in nursing performance [ 39 ].

Regarding the content of spiritual nursing education, most programmes were based on the ASSET model, consisting of self-awareness, spirituality, understanding of others, and the spiritual nursing process and application. Most studies approached spirituality from a cross-religious perspective, with spiritual nursing intervention providing education focused on existential and religious well-being. This result is consistent with previous systematic review studies [ 11 ], where spirituality was taught not as part of religion, but comprehensively. Many people in Korea believe that spiritual nursing is related to a specific religion [ 20 ]. However, since Korea has various religions such as shamanism, Buddhism, Confucianism, and Christianity [ 3 ], spirituality was approached from various perspectives. Moreover, most of the spiritual nursing education programmes in Korea showed a holistic rather than religious approach [ 8 ]. This trend is also reflected in the definition of the multidimensional domain of spirituality and spiritual nursing adopted by the European Education Project for the Development of Standards for Spiritual Nursing Education [ 1 ]. Therefore, future studies should approach spirituality and spiritual nursing as multidimensional educational content meant to provide meaning and purpose to life beyond religion.

Studies in Korea focused on teaching communication skills to broaden the understanding of the spiritual needs of others and training on enhancing empathy and providing hope. The effectiveness of spiritual nursing education was measured by its relationship with the subject as a spiritual nursing provider, including communication, empathy, spiritual needs, and spiritual care competency. In contrast, previous international systematic reviews [ 11 , 13 , 14 ] focused more on recognising one’s own spirituality, training on self-reflection to broaden the understanding of individual spirituality, and measuring the changes in viewpoint, knowledge, and attitude towards individual spirituality and spiritual nursing. This difference may be attributed to the socio-cultural characteristics of Korea, which value harmony with family and community rather than individuals [ 3 ]. Thus, the educational content on spirituality and spiritual nursing in Korea focuses more on understanding others and performing spiritual nursing for the spiritual well-being of patients rather than on self-awareness or personal spirituality. Particularly, therapeutic communication skills are required to broaden the understanding of the spiritual needs of others, assess their spiritual needs, and support patients with a professional competency that improves spiritual nursing quality [ 8 ]. Therefore, a spiritual nursing education applicable to nursing practice, such as scenarios related to therapeutic communication and communication skills for certain situations, is necessary [ 8 ].

This systematic review found that spiritual nursing education programmes improve spiritual care competency. This finding is consistent with the results of previous systematic reviews from other countries [ 11 , 13 , 14 ] and confirms the need for spiritual nursing education to address the challenges faced in related practice. All eight studies in this review used tools developed by Van Leeuwen and Tiesinga [ 40 ], but one study did not show any significant difference in ‘professionalisation and improving the quality of spiritual care’ and ‘referral to professionals’ from among the subfields of spiritual care competency [ 37 ]. Multidisciplinary collaboration between nurses and hospital clergy is necessary for a religious approach to the transcendental relationship of spirituality in performing spiritual nursing. As professional referrals are an important area of spiritual care competency that addresses the spiritual needs of the subject beyond the role of a nurse [ 40 ], a specific multidisciplinary approach in spiritual nursing education is necessary.

Studies measuring the effect of spiritual nursing education on spiritual well-being and spirituality showed different results, especially when checking the subdomains. No significant differences were found between the ‘religious well-being’ subdomains of spiritual well-being and the ‘transcendence’ subdomains of spirituality. This could be due to difficulties in reflecting the level of spirituality and spiritual well-being of subjects with no religion when measuring the ‘religious well-being’ subdomain of spiritual well-being or in the ‘transcendence’ subdomain of spirituality. Therefore, studies should revise and supplement the tools for measuring spirituality and spiritual well-being by reflecting the meaning of the changing concepts.

Some studies examined the effects of spiritual nursing education on spiritual needs. One study reported a decrease in spiritual needs after implementation of spiritual nursing education, suggesting ways to meet the spiritual needs of nurses through self-awareness and application of learnings [ 28 ]. However, two studies reported an increase in spiritual needs and interpreted it as heightened sensitivity to one’s own spiritual needs resulting from the education, which would help in understanding the spiritual needs of patients [ 6 , 12 ]. These contradictory findings indicate that the reliability and validity of the measurement tool should be checked to confirm the effectiveness of spiritual nursing education and that clear standards are needed for interpreting results.

Studies in Korea used eleven variables to confirm the effectiveness of the spiritual nursing education programme. Systematic reviews [ 11 , 14 ] from other countries reported various measures, such as the Spirituality and Spiritual Care Rating Scale, Spiritual Transcendence Scale, Spiritual Perspective Scale, Spiritual Care Inventory, and Spiritual Care in Practice Survey, to confirm the effectiveness of spiritual nursing education. To meet the spiritual needs of nursing students, self-awareness to understand one’s spirituality is crucial [ 11 ]. Future studies examining the effectiveness of spiritual nursing education should consider other variables, such as nurses’ perceptions of spirituality and spiritual nursing [ 41 ] and the effect of spiritual nursing performance on nurses [ 42 ]. The studies included in this review are limited by subjective evaluation, because all of them measured the effectiveness of spiritual nursing education with self-reported questionnaires. Prior studies from other countries showed that the objectivity of the effect measurement of spiritual nursing education increased after providing spiritual nursing education for nurses [ 43 ]. One study measured the effect objectively by counting spiritual nursing interventions before and after the education [ 44 ]. For objective evaluation, checklists with objectively stated criteria can be completed not only by participants themselves but also by head nurses or patients. Other methods can also be considered, such as standardised measurements of the subject’s spiritual nursing intervention skill level, number of executions, performance accuracy, and knowledge level [ 45 ]. Therefore, future studies should confirm the effectiveness of spiritual nursing education by using more objective, reliable, and valid measurement methods and tools.

The meta-analysis showed that the spiritual nursing education programme increased spiritual nursing competency, spirituality, spiritual well-being, existential well-being, and spiritual needs. Understanding one’s spirituality helps in understanding the spiritual needs of others by reflecting on one’s own spiritual beliefs in the process of identifying individual spirituality and spiritual needs through self-awareness at the beginning of the spiritual nursing education programme. Spiritual nursing education [ 14 ] aims to develop sensitivity to spiritual nursing, clarify the importance of spirituality and spiritual nursing in healthcare, and present a spiritual nursing intervention method. Thus, it can affect the acknowledgement of individual spirituality and the integration of spirituality in clinical practice and communication with patients [ 14 ].

However, because few randomised trials were included, the results carry a risk of randomisation-related bias, and the effect estimates of the outcome variables may be overestimated. Moreover, the study was limited in explaining the effect: only six studies were included in the meta-analysis, most results showed large heterogeneity, and moderation effect analysis was not conducted because fewer than ten studies were selected.
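
To make the terms in this limitation concrete, the sketch below shows one common way a standardised mean difference can be pooled and its heterogeneity summarised. It is a minimal illustration only: the review does not state which estimator or software was used, so the choice of Hedges' g with a DerSimonian-Laird random-effects model is an assumption, the function names are hypothetical, and the study figures are invented placeholders rather than data from the included studies.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return Hedges' g (bias-corrected SMD) and its variance for two groups."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects(effects, variances):
    """DerSimonian-Laird pooling: returns (pooled effect, SE, Q, I-squared in %)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, q, i2


# Hypothetical post-test scores (mean, SD, n) for intervention vs control groups.
studies = [
    (3.9, 0.5, 30, 3.4, 0.6, 30),
    (4.1, 0.4, 25, 3.9, 0.5, 24),
    (3.8, 0.6, 40, 3.0, 0.6, 38),
]
pairs = [hedges_g(*s) for s in studies]
pooled, se, q, i2 = random_effects([g for g, _ in pairs], [v for _, v in pairs])
print(f"pooled g = {pooled:.2f}, 95% CI half-width = {1.96 * se:.2f}, I2 = {i2:.0f}%")
```

With only a handful of studies, the between-study variance is estimated imprecisely, which is one reason subgroup or moderator analyses are usually avoided below roughly ten studies.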

Nevertheless, this study can be meaningful in a few aspects. We tried to comprehensively and scientifically synthesise individual study results confirming the effectiveness of spiritual nursing education programmes for nurses and nursing students in Korea. In particular, we aim to contribute to the planning of the future directions of spiritual nursing education intervention research by providing the content and teaching methods of programmes. Furthermore, diverse outcome variables were explored and integrated to estimate the significance of the effects of spiritual nursing education programmes.

Conclusions

This study examined the teaching methods and contents of spiritual nursing education programmes for nurses and nursing students and confirmed their effectiveness. The teaching methods included lectures, discussions, and case presentations, while the contents included self-spiritual awareness, spirituality-related concepts, understanding the spirituality of others, and the spiritual nursing process and application. The included studies mainly used variables related to spiritual care competency to confirm the effects of the education programmes. Spiritual nursing education increased spiritual care competency and individual spirituality. The meta-analysis showed statistically significant effects on spiritual nursing competency, spirituality, spiritual well-being, existential well-being, and spiritual needs, but not on self-esteem. This study’s findings on the characteristics of spiritual nursing education programmes in Korea can help develop and apply programmes for nursing students and nurses. Given the improvement in spiritual nursing competency, more attention and administrative and financial support should be given to spiritual nursing education programmes in schools and hospitals. To further advance this science, more randomised experimental studies on the effectiveness of spiritual nursing education for clinical nurses are necessary. Furthermore, future studies should examine whether short online training is effective and verify its continued effects through long-term follow-up. We also recommend developing and applying spiritual nursing education programmes that reflect Korean practices, such as spiritual nursing interventions that address spiritual needs arising from relationships with others and promote existential well-being full of meaning and purpose in life.

Data availability

The supplementary material used for this study can be found in Additional file 1 . The datasets used or analysed for the current study are available from the corresponding author on reasonable request.

Abbreviations

Analysis Design Development Implementation Evaluation

Actioning, Spirituality and Spiritual Care Education and Training in nursing

Cochrane’s Risk of Bias

DataBase Periodical Information Academic

International Council of Nurses

Korean Studies Information Service System

National Discovery for Science Library

Participants, Intervention, Comparison, Outcome, and Study Design

Preferred Reporting Items for Systematic Reviews and Meta-Analyses

Randomised Controlled Trial

Research Information Sharing Service

Registered Nurse-Bachelor of Science in Nursing

Risk of Bias Assessment Tool for Non-randomised Studies

Standardised Mean Difference

Van Leeuwen R, Attard J, Ross L, Boughey A, Giske T, Kleiven T, et al. The development of a consensus-based spiritual care education standard for undergraduate nursing and midwifery students: an educational mixed methods study. J Adv Nurs. 2021;77:973–86.

Ramezani M, Ahmadi F, Mohammadi E, Kazemnejad A. Spiritual care in nursing: a concept analysis. Int Nurs Rev. 2014;61:211–9.

Choi GH. Korean spiritual health: a concept analysis [dissertation]. Chuncheon: The Kangwon National University; 2018. pp. 1-166.

Lee S, Kim MK, Hong E-Y, Lee JJ, Kim HJ, Kim HS, et al. Structural equation modeling on spiritual nursing care of clinical nurses based on the theory of planned behavior. Korean J Adult Nurs. 2022;34:27–38.

Kim HS, Park HS, Chung HS, Kim MK, Park EY, Kim DY. Analysis of the present condition of spiritual nursing diagnosis using clinical big data. Evid Nurs. 2021;9:10–21.

Jeong JO, Jo HS, Kim S. Effect of the spiritual care module education program for nurses. J Korean Acad Soc Nurs Educ. 2016;22:51–62.

Cooper KL, Chang E, Sheehan A, Johnson A. The impact of spiritual care education upon preparing undergraduate nursing students to provide spiritual care. Nurse Educ Today. 2013;33:1057–61.

Choi SK, Kim J, Kim S. Development and effectiveness of a spiritual care education program for nurses. J Converg Inf Technol. 2019;9:67–77.

Hong S. Factors influencing spiritual care practices of clinical nurses. JLCCI. 2020;20:677–92.

Kim J, Cha NH. Effect of a spiritual care empowerment program on psychological empowerment of nursing students. J East-West Nurs Res. 2019;25:117–27.

Jones KF, Paal P, Symons X, Best MC. The content, teaching methods and effectiveness of spiritual care training for healthcare professionals: a mixed-methods systematic review. J Pain Symptom Manage. 2021;62:e261–78.

Yoon MO, Sim JH. The effects of spiritual nursing care education of christian university nursing students. Theol Soc. 2018;32:221–55.

Mthembu TG, Wegner L, Roman NV. Teaching spirituality and spiritual care in health sciences education: a systematic review. AJPHES. 2016;22:1036–57.

Paal P, Helo Y, Frick E. Spiritual care training provided to healthcare professionals: a systematic review. J Pastoral Care Counsel. 2015;69:19–30.

Semerci R, Uysal N, Bağçivan G, Doğan N, Akgün Kostak M, Tayaz E, et al. Oncology nurses’ spiritual care competence and perspective about spiritual care services. Turk J Oncol. 2021;36:222–30.

Lewinson LP, McSherry W, Kevern P. Spirituality in pre-registration nurse education and practice: a review of the literature. Nurse Educ Today. 2015;35:806–14.

So WS, Shin HS. From burden to spiritual growth: Korean students’ experience in a spiritual care practicum. J Christ Nurs. 2011;28:228–34.

Yong J, Kim J, Park J, Seo I, Swinton J. Effects of a spirituality training program on the spiritual and psychosocial well-being of hospital middle manager nurses in Korea. J Contin Educ Nurs. 2011;42:280–8.

Park YS. Starting with ‘Korean society and christianity’. Essence Phenom. 2013;33:32–44.

Kwon HJ. Perceptions of spiritual nursing care nurses and nursing students. J Korean Acad Nurs. 1989;19:233–9.

Kwon SH, Tae YS. Christian nursing students’ experience of spiritual nursing practice. J Qual Res. 2013;14:92–104.

Cho HM, Jang YN, Choi HJ. Analysis of research trends about spiritual care in Korea- focusing on the christian perspective. Faith Scholarsh. 2019;24:199–219.

Kim SY, Park DA, Seo HJ, Shin SS, Lee SJ, Lee M, et al. Health technology assessment methodology: systematic review. Seoul: National Evidence-based Healthcare Collaborating Agency; 2020.

Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions. London: The Cochrane Collaboration; c2011.

Kim SY, Park JE, Lee YJ, Seo H-J, Sheen S-S, Hahn S, et al. Testing a tool for assessing the risk of bias for nonrandomized studies showed moderate reliability and promising validity. J Clin Epidemiol. 2013;66:408–14.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

Kim J, Park K. The influences of spiritual care nursing education towards death and dying. J Korean Public Health Nurs. 1999;13:114–27.

Hong S. Effects of a spiritual care education program based on the action learning on spiritual needs, spiritual well-being and spiritual care competence of nursing students. Jour KoCon a. 2016;16:285–94.

Lim H-J, Park K. The effectiveness of a spiritual education for spiritual care competence reinforcement of nursing students. J Digit Converg. 2021;19:261–74.

Choi EJ. Effects of the spiritual care education on spiritual well-being and spiritual care competence in nursing student. J Wholist Nurs Sci. 2014;7:143–50.

Kim H, Kim E. A systematic review of infection management education program for nursing students. JLCCI. 2020;20:1359–75.

Lion A, Szilagyi C, Varner Perez S, Koch S, Oyedele O, Slaven J, et al. Interprofessional spiritual care education in pediatric hematology-oncology. Pediatr Blood Cancer. 2021;69:1–10.

Damsma-Bakker A, Van Leeuwen R. An online competency-based spiritual care education tool for oncology nurses. Semin Oncol Nurs. 2021;37:1–6.

Petersen CL, Callahan MF, McCarthy DO, Hughes RG, White-Traut R, Bansal NK. An online educational program improves pediatric oncology nurses’ knowledge, attitudes, and spiritual care competence. J Pediatr Oncol Nurs. 2017;34:130–9.

Rawlings MA, Gonzalez-Castaneda R, Valdovinos IC, Shepard Payne J, Ho Yu C. Spiritually responsive SBIRT in social work education. J Soc Work Pract Addict. 2019;19:57–77.

Pearce MJ, Pargament KI, Oxhandler HK, Vieten C, Wong S. Novel online training program improves spiritual competencies in mental health care. Spiritual Clin Pract. 2020;7:145–61.

Chung MJ, Eun Y. Development and effectiveness of a spiritual care education program for nursing students-based on the ASSET model. J Korean Acad Nurs. 2011;41:673–83.

Orique SB, Phillips LJ. The effectiveness of simulation on recognizing and managing clinical deterioration: meta-analyses. West J Nurs Res. 2018;40:582–609.

Tawalbeh LI, Tubaishat A. Effect of simulation on knowledge of advanced cardiac life support, knowledge retention, and confidence of nursing students in Jordan. J Nurs Educ. 2014;53:38–44.

Van Leeuwen R, Tiesinga LJ, Middel B, Post D, Jochemsen H. The validity and reliability of an instrument to assess nursing competencies in spiritual care. J Clin Nurs. 2009;18:2857–69.

McSherry W, Draper P, Kendrick D. The construct validity of a rating scale designed to assess spirituality and spiritual care. Int J Nurs Stud. 2002;39:723–34.

Burkhart L, Schmidt L, Hogan N. Development and psychometric testing of the Spiritual Care Inventory instrument. J Adv Nurs. 2011;67:2463–72.

Vlasblom JP, Van der Steen JT, Knol DL, Jochemsen H. Effects of a spiritual care training for nurses. Nurse Educ Today. 2011;31:790–6.

Beese RJ, Ringdahl D. Enhancing spiritually based care through gratitude practices: a health-care improvement project. Creat Nurs. 2018;24:42–51.

Kim SH, Kim HJ. Simulation-based disaster nursing education program for nursing student: a systematic review. J Korean Soc Simul Nurs. 2021;9:69–87.

Acknowledgements

The authors thank EDITAGE for their English language editing.

The authors received no financial support for the research, authorship, and/or publication of this article.

Author information

Authors and affiliations.

College of Nursing, Pusan National University, 49 Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do, 50612, South Korea

Hyun-Jin Cho & Kyo-Yeon Park

College of Nursing, Research Institute of Nursing Science, Pusan National University, 49 Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do, 50612, South Korea

Kyoungrim Kang

Contributions

HJC and KK were involved in the conception and design of the study. All authors discussed inclusion and exclusion criteria. HJC and KYP searched databases, screened articles, and all authors discussed inclusion of articles. All authors were involved in data collection, and data analysis plans, as well as drafting the manuscript. HJC performed meta-analysis, and KK supervised the research. All authors were involved in the critical review of the article, writing, drafting, and editing the final document for publication. All authors have read and agreed with the final version of the manuscript.

Corresponding author

Correspondence to Kyoungrim Kang .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Cho, HJ., Kang, K. & Park, KY. Spiritual nursing education programme for nursing students in Korea: a systematic review and meta-analysis. BMC Nurs 23, 310 (2024). https://doi.org/10.1186/s12912-024-01961-6

Received: 02 March 2023

Accepted: 22 April 2024

Published: 07 May 2024

DOI: https://doi.org/10.1186/s12912-024-01961-6

  • Meta-analysis
  • Spiritual nursing
  • Systematic review

BMC Nursing

ISSN: 1472-6955

Best practices for the dissemination and implementation of neuromuscular training injury prevention warm-ups in youth team sport: a systematic review

  • Destiny Lutz 1 (http://orcid.org/0009-0005-0529-0398)
  • Carla van den Berg 1 (http://orcid.org/0000-0001-6429-4333)
  • Anu M Räisänen 1,2 (http://orcid.org/0000-0003-3056-8169)
  • Isla J Shill 1,3
  • Jemma Kim 4,5
  • Kenzie Vaandering 1
  • Alix Hayden 6
  • Kati Pasanen 1,7,8,9 (http://orcid.org/0000-0002-0427-2877)
  • Kathryn J Schneider 1,3,8,9,10 (http://orcid.org/0000-0002-5951-5899)
  • Carolyn A Emery 1,3,8,9,11,12,13 (http://orcid.org/0000-0002-9499-6691)
  • Oluwatoyosi B A Owoeye 1,4 (http://orcid.org/0000-0002-5984-9821)
  • 1 Sport Injury Prevention Research Centre, Faculty of Kinesiology, University of Calgary, Calgary, Alberta, Canada
  • 2 Department of Physical Therapy Education - Oregon, Western University of Health Sciences College of Health Sciences - Northwest, Lebanon, Oregon, USA
  • 3 Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta, Canada
  • 4 Department of Physical Therapy & Athletic Training, Doisy College of Health Sciences, Saint Louis University, Saint Louis, Missouri, USA
  • 5 Interdisciplinary Program in Biomechanics and Movement Science, University of Delaware College of Health Sciences, Newark, Delaware, USA
  • 6 Libraries and Cultural Resources, University of Calgary, Calgary, Alberta, Canada
  • 7 Tampere Research Center for Sports Medicine, Ukk Instituutti, Tampere, Finland
  • 8 McCaig Institute for Bone and Joint Health, University of Calgary, Calgary, Alberta, Canada
  • 9 Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
  • 10 Sport Medicine Centre, University of Calgary, Calgary, Alberta, Canada
  • 11 O'Brien Institute for Public Health, University of Calgary, Calgary, Alberta, Canada
  • 12 Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • 13 Department of Paediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  • Correspondence to Ms Destiny Lutz, Sport Injury Prevention Research Centre, Faculty of Kinesiology, University of Calgary, Calgary, Alberta, Canada; destiny.lutz{at}ucalgary.ca

Objective To evaluate best practices for neuromuscular training (NMT) injury prevention warm-up programme dissemination and implementation (D&I) in youth team sports, including characteristics, contextual predictors and D&I strategy effectiveness.

Design Systematic review.

Data sources Seven databases were searched.

Eligibility The literature search followed Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Inclusion criteria: participation in a team sport, ≥70% youth participants (<19 years), D&I outcomes with/without NMT-related D&I strategies. The risk of bias was assessed using the Downs & Black checklist.

Results Of 8334 identified papers, 68 were included. Sport participants included boys, girls and coaches. Top sports were soccer, basketball and rugby. Study designs included randomised controlled trials (RCTs) (29.4%), cross-sectional (23.5%) and quasi-experimental studies (13.2%). The median Downs & Black score was 14/33. Injury prevention effectiveness (vs efficacy) was rarely (8.3%) prioritised across the RCTs evaluating NMT programmes. Two RCTs (2.9%) used Type 2/3 hybrid approaches to investigate D&I strategies. 19 studies (31.6%) used D&I frameworks/models. Top barriers were time restrictions, lack of buy-in/support and limited benefit awareness. Top facilitators were comprehensive workshops and resource accessibility. Common D&I strategies included Workshops with supplementary Resources (WR; n=24) and Workshops with Resources plus in-season Personnel support (WRP; n=14). WR (70%) and WRP (64%) were similar in potential D&I effect. WR and WRP had similar injury reduction (36–72%) with higher adherence showing greater effectiveness.

Conclusions Workshops including supplementary resources supported the success of NMT programme implementation; however, few studies examined effectiveness. High-quality D&I studies are needed to optimise the translation of NMT programmes into routine practice in youth sport.

Data availability statement

Data are available in a public, open access repository. Not Applicable.

https://doi.org/10.1136/bjsports-2023-106906

WHAT IS ALREADY KNOWN ON THE TOPIC

Neuromuscular training (NMT) injury prevention warm-up programmes are effective at reducing injury rates in youth sports. However, for proper dissemination and implementation (D&I) by multiple stakeholders, barriers such as low adoption, poor adherence and lack of time must be addressed.

WHAT THIS STUDY ADDS

There are limited high-quality research studies to facilitate the widespread adoption of, and improved adherence to, NMT programmes. Few studies used D&I theories, frameworks or models. Programme flexibility is a common barrier to implementation; adaptation of NMT programmes to fit local contexts is imperative. Comprehensive workshops and supplementary resources currently support the success of NMT programme implementation.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

Promotion of NMT programmes as the standard of practice is essential to increase practical D&I of these programmes, and thus reduce the burden of youth sport injuries. This work provides some directions for stakeholders, including researchers, implementation support practitioners and youth sport policymakers, on current best practices for the delivery of NMT programmes in local youth sport settings. This work also provides the evidence base for more translational research efforts in youth sport injury prevention, a much-needed next step to optimise the translation of NMT programmes into youth sport practice.

Introduction

Youth (<19 years) sport participation provides numerous benefits, positively impacting physical and mental health. 1 Youth sport participation rates are high, with up to 90% of youth participating in sport globally. 2–5 However, with increased sport participation comes increased injury risk. One-in-three youth sustain a sport-related injury each year, leading to a significant public health burden with high healthcare costs. 3 6–8 Sport-related injuries may also result in long-term health consequences (eg, poor mental health, reduced physical activity, post-traumatic osteoarthritis). 7–9 Implementing injury prevention strategies is critical to mitigate the injury risk associated with youth sport participation.

Based on randomised controlled trials (RCTs) and systematic reviews, neuromuscular training (NMT) injury prevention warm-up programmes in youth team sport reduce injury rates by up to 60% and decrease the costs associated with injury. 10–21 NMT programmes include exercises that can be categorised across aerobic, balance, strength and agility components 22 23 and typically take 10–15 min. 24 25 Originally implemented with the intention of reducing non-contact lower extremity injury risk, 26–28 NMT programmes have since been evaluated across numerous sports, age groups and levels of play and are associated with lower rates of lower extremity and overall injury than standard of practice warm-ups. 12 20 21 25 In youth team sports, a protective effect has been demonstrated in soccer, handball, basketball, netball, rugby and floorball. 11 16 29–31 The International Olympic Committee Consensus Statement on Youth Athletic Development recommends multifaceted NMT warm-up programmes in youth sport. 32

Despite being a primary injury prevention strategy across youth sports, NMT programme adoption remains low. 33–38 For evidence-informed interventions to be successful and have a practical impact, pragmatic approaches derived from dissemination and implementation (D&I) science are necessary across multiple socioecological levels including organisation, coach and player. 36 Dissemination is defined as ‘the active process of spreading evidence-based interventions to a target population through determined channels and using planned strategies’. Implementation is ‘the active process of using strategies across multiple levels of change to translate evidence-based interventions into practice and prompt corresponding behaviour change in a target population’. 36

The aim of this systematic review was to evaluate current best practices for the D&I of NMT programmes in youth team sport. The specific objectives were to: (1) describe the characteristics of identified D&I-related studies (studies with at least one D&I outcome directly or indirectly assessed as a primary, secondary or tertiary outcome); (2) evaluate factors associated with the D&I of NMT warm-up programmes across socioecological levels, including barriers and facilitators; (3) examine the effect of D&I strategies in delivering NMT warm-ups across multiple socioecological levels; and (4) examine the influence of D&I strategies on injury rates. Our protocol was registered in PROSPERO (CRD42021271734), and the review is reported in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines ( supplemental appendix S1 ).

Search strategy and data sources.

A comprehensive search was developed with a librarian (KAH) in MEDLINE, incorporating four main concepts: child/youth, injury prevention, implementation/compliance/adherence and sports. The author team reviewed the final search strategy, which was then piloted against known key studies to ensure that the search was capturing relevant studies. Finally, the MEDLINE search was translated to the other databases. Searches were conducted on 25 August 2021 (updated 16–18 August 2022 and 5 September 2023). Search strategies are available in Supplemental Appendix S2 . Studies were identified by searching seven databases: MEDLINE(R) and EPUB Ahead of Print, In-Process & Other Non-Indexed Citations and Daily, Embase, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews (all Ovid); CINAHL Plus with Full Text, SPORTDiscus with Full Text (EBSCO) and ProQuest Dissertations & Theses Global.
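The way the four concepts combine can be illustrated with a minimal sketch: synonyms within a concept are joined with OR, and the concepts are joined with AND. The terms below are hypothetical placeholders rather than the authors' actual strategy (which is in Supplemental Appendix S2), and the function name `build_boolean_query` is an assumption made for illustration.

```python
from typing import Dict, List


def build_boolean_query(concepts: Dict[str, List[str]]) -> str:
    """OR the synonyms within each concept, then AND the concepts together."""
    blocks = []
    for name, terms in concepts.items():
        if not terms:
            raise ValueError(f"Concept '{name}' has no search terms")
        blocks.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(blocks)


# Placeholder terms only; the real strategy is in Supplemental Appendix S2.
concepts = {
    "child/youth": ["child*", "adolescen*", "youth"],
    "injury prevention": ['"injury prevention"', '"neuromuscular training"'],
    "implementation/compliance/adherence": ["implement*", "complian*", "adheren*"],
    "sports": ["sport*", "athlet*"],
}
query = build_boolean_query(concepts)
print(query)
```

A real database translation would also need platform-specific field tags and controlled vocabulary (for example MeSH or Emtree), which this sketch leaves out.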

Study selection and eligibility

All database search results were uploaded and duplicates were removed in Covidence (Veritas Health Innovation, Melbourne, Australia). Records were independently reviewed by authors in pairs (DL/IJS, CV/JK, KV/DL), starting with a screening of 50 randomly selected citations to assess inter-rater agreement with a threshold set at 90%. Each pair of reviewers performed title/abstract screening and full-text screening independently, providing reasons for exclusion at the full-text stage ( figure 1 ). Any disagreements for exclusion, where a consensus could not be reached within pairs, were resolved by a senior author (OBAO). A secondary evaluation of included manuscripts was performed by senior authors (OBAO and CAE) to ensure appropriate inclusion. Study inclusion criteria were: (1) participation in a team sport (male and female); (2) a minimum of 70% of participants being youth (<19 years) or coaches of these youth teams; (3) reported dissemination and/or implementation outcomes (eg, self-efficacy, adherence, intention); (4) reported D&I strategies related to NMT warm-up programmes (ie, NMT delivery strategies, where applicable, eg, in RCTs). Exclusion criteria were: (1) studies evaluating rehabilitation programmes, non-team-based or physical education programmes; (2) non-peer-reviewed; (3) not English. The screening process was reported using the PRISMA flow diagram. 39
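The pilot screening check can be sketched as follows. The text does not say which agreement statistic was used, so this sketch assumes simple percent agreement (not Cohen's kappa); the reviewer decisions are invented for illustration only.

```python
from typing import Sequence


def percent_agreement(reviewer_a: Sequence[str], reviewer_b: Sequence[str]) -> float:
    """Percentage of citations on which two reviewers made the same decision."""
    if len(reviewer_a) != len(reviewer_b):
        raise ValueError("Both reviewers must screen the same citations")
    if not reviewer_a:
        raise ValueError("At least one citation is required")
    matches = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
    return 100.0 * matches / len(reviewer_a)


def meets_threshold(agreement: float, threshold: float = 90.0) -> bool:
    """True when agreement reaches the pre-set threshold (90% in the review)."""
    return agreement >= threshold


# Hypothetical pilot of 50 citations; the reviewers differ on two of them.
pilot_a = ["include"] * 10 + ["exclude"] * 40
pilot_b = ["include"] * 8 + ["exclude"] * 42
agreement = percent_agreement(pilot_a, pilot_b)
print(agreement, meets_threshold(agreement))
```

Percent agreement does not correct for chance agreement; a kappa statistic would be an alternative if that correction matters.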

Figure 1. Study identification: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram.

Risk of bias

To assess the risk of bias, three sets of paired reviewers independently used the Downs & Black (D&B) quality assessment tool. 40 The tool consists of a 27-item checklist (total score/33). A third senior reviewer (OBAO or CAE or AMR) resolved any disagreements. The rating of evidence and strength of recommendations were assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) guidelines. 41–43

Frameworks/models

The proportion of studies that used D&I research theories/frameworks/models, including behaviour change frameworks/models, was examined to identify commonly used frameworks/models.

Efficacy-effectiveness orientation in RCTs

We assessed the components of 12 RCTs using the Rating Included Trials on the Efficacy-Effectiveness Spectrum (RITES) tool, as adapted by Maddox et al. 44 RITES scores RCTs in systematic reviews based on a continuum of efficacy-effectiveness across four domains: participant characteristics, trial setting, flexibility of intervention(s) and clinical relevance of experimental and comparison intervention(s) ( online supplemental table S1 ). 45 We modified the Likert grading system to classify studies depending on whether their emphasis was more on efficacy, more on effectiveness or balanced between the two. Given that different aspects of each trial may fall in different places along the efficacy-effectiveness continuum, each RITES domain is scored independently and a composite score is not applicable. To minimise subjectivity, the RITES evaluation for included RCTs was completed by two reviewers (AMR and OBAO). Any disagreements were resolved through discussion to reach a consensus.

Study typologies and assessment of study relevance to D&I

The level of relevance of individual studies (RCTs and quasi-experiments) to D&I was determined based on the implementation-effectiveness hybrid taxonomy: Type 1 (primarily focused on clinical/intervention outcomes), Type 2 (balanced focus on both clinical/intervention outcomes and D&I outcomes) and Type 3 (primarily or ‘fully’ (our adaptation) focused on D&I outcomes) studies. 36 46 For ease of interpretation of results, studies were rated considering three broad traditional research design categories (ie, hierarchy of evidence): RCTs, quasi-experimental studies and observational studies, the last including cohort, cross-sectional, pre-experimental, qualitative, mixed-methods and ecological studies. Observational studies were categorised as ‘fully focused’ observational-implementation (if only D&I outcomes were evaluated) or ‘partially focused’ observational-implementation (if a combination of clinical and D&I outcomes was evaluated) D&I studies. 47 RCTs and quasi-experimental studies with Type 2 or Type 3 hybrid approaches were indicated as ‘highly relevant’ towards informing D&I best practices. Furthermore, observational-implementation studies that are fully focused on D&I were also indicated as ‘highly relevant’.
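The relevance rule described above can be written as a small decision function. This is a minimal sketch of the stated rule only; the function name, argument names and design labels are assumptions for illustration and do not come from the review.

```python
from typing import Optional

# Design labels are illustrative; the review groups designs into these three categories.
EXPERIMENTAL_DESIGNS = {"rct", "quasi-experimental"}


def is_highly_relevant(design: str,
                       hybrid_type: Optional[int] = None,
                       fully_focused: bool = False) -> bool:
    """Apply the 'highly relevant' rule.

    RCTs and quasi-experimental studies qualify with a Type 2 or Type 3 hybrid
    approach; observational studies qualify when fully focused on D&I outcomes.
    """
    design = design.lower()
    if design in EXPERIMENTAL_DESIGNS:
        if hybrid_type not in (1, 2, 3):
            raise ValueError("Experimental designs need a hybrid type of 1, 2 or 3")
        return hybrid_type in (2, 3)
    if design == "observational":
        return fully_focused
    raise ValueError(f"Unknown design category: {design}")


print(is_highly_relevant("RCT", hybrid_type=1))                  # Type 1 hybrid
print(is_highly_relevant("quasi-experimental", hybrid_type=3))   # Type 3 hybrid
print(is_highly_relevant("observational", fully_focused=True))   # fully focused
```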

Data extraction

The extracted data included: study design, author, journal, year, population (eg, 13–17 years old female soccer players), participant demographics, D&I intervention strategies (eg, workshops, supplementary resources), D&I framework/model, control group strategies, D&I outcomes (eg, adoption, adherence, intention, fidelity, self-efficacy) and injury outcomes. Study design classification was completed based on data extracted and the process taken by authors, 48 which may have differed from the original classification. Furthermore, prospective and retrospective cohort studies were consolidated into ‘cohort’ to improve readability. D&I outcomes indicated as compliance were included in the appropriate adherence category as defined in Owoeye et al and described as ‘adherence-related’ outcomes, to maintain unified language across results; the full list is provided in online supplemental table S2 . 36 49–51 Based on the dose-response thresholds reported for NMT programmes within current literature, measures of adherence were used to indicate potential D&I effect ( online supplemental table S5 ). 24 36 52–54 Studies with cumulative utilisation (sessions completed/total possible) of ≥70%, utilisation frequency of ≥1.5 sessions/week or a significant association between D&I exposures and outcomes were defined as moderate-to-highly relevant and identified as having a potential D&I effect (ie, yes). Studies presenting cumulative utilisation <70%, utilisation frequency <1.5 sessions/week or no association between D&I exposures and outcomes were defined as low-to-no relevance (ie, no). Studies reporting both utilisation frequency and cumulative utilisation had to reach both dose-response thresholds to be considered as having a potential D&I effect. D&I barriers and facilitators, factors influencing injury prevention implementation success and the identification of any frameworks used were also extracted and categorised into themes.
Measures of potential effect for these results were summarised using OR, proportions and mean differences in D&I outcomes (eg, adoption, adherence). Injury-specific results were reported as incidence rate ratios, risk ratios, ORs or prevalence. D&I strategies were classified into various categories, including workshops, supplementary resources, personnel support, supervision and combinations of these strategies.
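The adherence thresholds described above can be sketched in code. The text does not spell out how a significant exposure-outcome association interacts with the utilisation measures, so this sketch assumes that such an association is sufficient on its own; the function and parameter names are illustrative only.

```python
from typing import Optional

CUMULATIVE_MIN_PCT = 70.0   # cumulative utilisation threshold (% of possible sessions)
FREQUENCY_MIN = 1.5         # utilisation frequency threshold (sessions/week)


def cumulative_utilisation(sessions_completed: int, sessions_possible: int) -> float:
    """Sessions completed as a percentage of the total possible sessions."""
    if sessions_possible <= 0:
        raise ValueError("sessions_possible must be positive")
    return 100.0 * sessions_completed / sessions_possible


def potential_di_effect(cumulative_pct: Optional[float] = None,
                        sessions_per_week: Optional[float] = None,
                        significant_association: Optional[bool] = None) -> bool:
    """Return True ('yes') when a study meets the potential D&I effect criteria.

    Every reported utilisation measure must reach its threshold, so a study
    reporting both measures must pass both. Assumption: a significant
    association between D&I exposure and outcome qualifies by itself.
    """
    checks = []
    if cumulative_pct is not None:
        checks.append(cumulative_pct >= CUMULATIVE_MIN_PCT)
    if sessions_per_week is not None:
        checks.append(sessions_per_week >= FREQUENCY_MIN)
    if not checks and significant_association is None:
        raise ValueError("No adherence measure or association was reported")
    if significant_association:
        return True
    return bool(checks) and all(checks)


# Hypothetical study: high cumulative utilisation but low weekly frequency.
print(potential_di_effect(cumulative_pct=85.6, sessions_per_week=1.4))
```

Under these assumptions the hypothetical study above is classified as 'no', because the weekly frequency falls below 1.5 sessions/week even though cumulative utilisation exceeds 70%.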

Equity, diversity and inclusion statement

Our author team comprises student and senior researchers across various disciplines with representation from low-to-middle-income countries. A variety of demographic, socioeconomic and cultural backgrounds were included in our study populations.

PRISMA flow, characteristics of included studies and risk of bias assessment

A total of 68 relevant studies were included from our initial and updated search yield of 9021 studies ( figure 1 ). Across included studies, 13 included only male youth participants, 13 included only female youth participants, 26 included both and 16 reported coach-focused findings. Sports represented were soccer (n=33), rugby (n=8), basketball (n=7), multisport (n=7), handball (n=5), floorball (n=3), field hockey (n=3), volleyball (n=1) and futsal (n=1).

Details of study characteristics and risk of bias are presented in online supplemental table S3 . D&B scores ranged from 4/33 to 24/33 (median=14/33) across a variety of study designs, including 20 RCTs, 16 cross-sectional, 9 quasi-experimental, 8 cohort, 6 qualitative, 3 ecological, 3 mixed-methods and 3 pre-experimental studies. The D&B scores for the two studies most relevant to D&I were 21/33 for the RCT Type 2 hybrid study and 17/33 for the RCT Type 3 hybrid study. Using the GRADE guidelines for the process of rating the quality of evidence available and interpreting the quality assessment, the strength of recommendations was ‘low’ given the multiplicity of designs. 42 43

Characteristics of current D&I-related studies

Twenty-three studies (33.8%) reported using a D&I/behaviour change framework/model. D&I frameworks included the Reach, Effectiveness, Adoption, Implementation and Maintenance (RE-AIM) Framework (n=7), Consolidated Framework for Implementation Research (n=1), Precede-Proceed Model (n=1), Translating Research into Injury Prevention Practice (n=1), Promoting Action on Research Implementation in Health Services (n=1) and the Adherence Optimisation Framework (n=1). Behaviour change models included the Health Action Process Approach (HAPA) (n=8), Theory of Planned Behaviour (n=1) and the Health Belief Model (n=1).

Assessment of study relevance to D&I

Two RCTs of 68 included D&I-related studies (2.9%) were identified as highly relevant to D&I best practices (ie, Type 2 or 3 hybrid approach). 55 56 Eighteen (27.9%) RCTs reported a secondary analysis of D&I strategies and were classified as Type 1 hybrids. 12 16 19 30 53 57–69 Five (8.3%) quasi-experimental studies used a Type 2 or Type 3 hybrid approach 22 70–73 ; the remaining studies (n=4; 5%) were classified as quasi-experimental Type 1 hybrids. 74–77 Many observational studies (n=17; 26.7%) 78–94 were highly relevant based on being fully-focused observational-implementation studies; 5 (6.7%) were partially-focused observational-implementation studies. 52 95–98 The remaining observational studies (n=17; 23.3%) were observational-implementation studies 35 99–114 reporting D&I outcomes from a qualitative lens using interviews and surveys.

The RITES scores for the 14 D&I-related RCTs that examined injuries as primary outcome and D&I outcomes as secondary (Type 1 hybrid approach) are presented in table 1 . Almost all (13 of 14; 92.9%) of the RCTs focused mainly on intervention efficacy (as opposed to effectiveness) regarding the flexibility of NMT warm-up programmes. Cumulatively, effectiveness was rarely (7.1%) prioritised as a primary focus across all the 56 possible ratings of the RITES domains for all 14 studies. 50% of the domain ratings demonstrated efficacy as a priority and 42.9% of the ratings were indicated for a balance between efficacy and effectiveness.
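The arithmetic behind these percentages can be checked with a short sketch. The denominator (14 trials × 4 domains = 56 ratings) is stated in the text; the per-category counts of 4, 28 and 24 are not reported directly and are back-calculated here from the percentages, so they should be read as implied values rather than extracted data.

```python
N_TRIALS = 14    # Type 1 hybrid RCTs scored with RITES
N_DOMAINS = 4    # participants, setting, flexibility, clinical relevance
TOTAL_RATINGS = N_TRIALS * N_DOMAINS  # 56 possible domain ratings


def share(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts implied by the reported percentages (an assumption, not source data).
implied_counts = {"effectiveness": 4, "efficacy": 28, "balanced": 24}
assert sum(implied_counts.values()) == TOTAL_RATINGS

percentages = {label: share(n, TOTAL_RATINGS) for label, n in implied_counts.items()}
flexibility_efficacy = share(13, N_TRIALS)  # 13 of 14 trials, flexibility domain

print(percentages)
print(flexibility_efficacy)
```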

Table 1. RITES domain scores for included trials

Contextual predictors of NMT warm-up programme implementation

Fifty studies (73.5% of total) identified ≥1 barrier or facilitator within their findings, with 10 (14.7% of total) specifically examining barriers/facilitators as their main objective. The full list is provided in online supplemental table S4 . The most common barriers identified were time restrictions (n=30), 30 35 59 62 69 70 73 74 78 79 81 82 84 87–91 93 96 98 101 102 105 107 108 112 114 reduced buy-in/support (n=8) 62 75 84 87 105 110–112 and limited awareness of preventative effects of programmes (n=8). 74 84 103 104 107 109 113 Facilitators included comprehensive workshops from trained instructors (n=11), 53 71 78–80 84 90 96 99 100 112 accessibility of supplementary resources (n=10) 82 84 87 89 90 105 114 and uptake/support from multiple stakeholders (n=7). 56 67 84 101 103 105 112 Moreover, suggestions from multiple socioecological levels indicated that increased programme education and support, increased sport-specific activities and improved awareness of preventive effects influence NMT implementation success. 36 88 89 115 116 Figure 2 , adapted from Basow et al , 117 illustrates the contextual factors reported in the literature. This evidence-informed model shows the important barriers and facilitators that influence the end-user implementation of NMT warm-up programmes across the three key socioecological levels of change.

Figure 2. Contextual predictors of NMT implementation across multiple socioecological levels (adapted from Basow et al (2021) 116 ). Notes: SE, self-efficacy; NMT, neuromuscular training. Bold represents top barrier(s)/facilitator(s).

51 (75%) studies used implementation strategies for NMT warm-up programmes. The most frequently used strategies were Workshops with supplementary Resources (WR; n=24), followed by Workshops with supplementary Resources, plus in-season Personnel support (WRP; n=14). Three studies employed both WR and WRP strategies. Other methods for implementation included only workshops (n=9), only supplementary resources (n=4), supplementary resources and personnel support (n=2), workshops with personnel support (n=1) and supervision (n=1). Note, some studies are duplicated throughout the table when multiple D&I strategies are compared. 22 53 56 86

The key D&I concepts that were reported within the included studies were adherence or adherence-related (eg, self-efficacy, translation and perception). Specific outcomes within these concepts were further examined from the individual study results. We did not have enough evidence to present a meta-analysis of the effect of D&I strategies on D&I outcomes. Therefore, online supplemental table S5 presents a qualitative summary of the relationships between reported D&I exposure and D&I outcomes. 40 studies reported adherence-related outcomes, of which 32 (80%) were indicated to have potential D&I effect. Studies using WRP (n=14) reported completing between 1.4 and 2.6 sessions/week and cumulative utilisation of 39–85.6%; 9 of these 14 studies have potential D&I effect. Studies using WR (n=24) presented utilisation frequency ranging from 0.8 to 3.2 sessions/week and cumulative utilisation of 55–98% of sessions; 16 of these 24 have potential D&I effect. In studies evaluating workshops only (n=9; 22%), frequency utilisation was reported between 1 and 2 sessions/week across eight of the nine studies and one study had 52% cumulative utilisation; two have potential D&I effect.

Effects of D&I strategies on injury outcomes

Three RCTs specifically examined the effects of the D&I strategies used to deliver NMT programmes on injury outcomes ( table 2 ). Two studies that compared both WR and WRP to supplementary resources only found no significant differences between strategies, 53 56 but reported injury rates that were 56% and 72% lower, respectively, in the highest adherence groups. Another study comparing WR and WRP to a standard of practice warm-up found a 36% reduction of ankle and knee injuries when using WR and a 38% reduction in ankle and knee injuries without supervision. 22 There were no significant differences in injury rates between groups.

Table 2. Injury outcomes by D&I strategies and adherence

This study evaluated current literature to inform evidence-based best practices for the D&I of NMT programmes in youth team sport. To our knowledge, this is the first systematic review evaluating the D&I of NMT programmes in youth sport. To improve the practical implementation of NMT warm-ups, factors associated with implementation success and current best practices for delivering context-specific NMT programmes need to be evaluated. 118 In this review, we found that few D&I-related studies use D&I or behaviour change frameworks, theories or models to guide their research questions. We also found that the number of RCTs examining the effectiveness of D&I strategies for NMT programme delivery is limited. Common barriers to NMT implementation include programme flexibility and time restrictions, and the use of coach workshops and supplementary resources is currently the primary strategy in NMT programme D&I facilitation.

One-third of the included studies used a D&I framework or behaviour change model in their research work. The HAPA and RE-AIM models were the most frequently used. These models are a conceptual and organised combination of theories required to direct the design, evaluation and translation of evidence-based interventions (NMT programmes) and the context in which they are being implemented. 36 71 119 It is imperative for D&I studies to use these frameworks/models to fully understand specific implementation processes and contexts. Future D&I studies should consider using appropriate frameworks or models, including adaptations and combination of models to guide their specific aims.

Relevance to D&I

Across the relevant literature, a variety of designs and levels of evidence were included.

Of 68 studies, 7 (10.3%) were found to be ‘highly relevant’ toward informing D&I best practice (2 (2.9%) RCTs, 5 (7.4%) quasi-experimental). Other ‘relevant’ studies evaluated implementation as secondary objectives (Type 1 hybrid designs) and/or were of lower level of evidence. 33 observational studies were ‘highly relevant’ to D&I, assessing D&I outcomes and barriers and facilitators from a qualitative lens. While these studies are important for understanding D&I context, more high-quality and highly relevant studies such as RCTs and quasi-experimental designs using the Type 3 hybrid approach, or non-hybrid approach focused on solely evaluating the effectiveness of D&I strategies, are needed to advance the widespread adoption and continued use of NMT programmes in youth team sport.

Effectiveness versus efficacy

Effectiveness is indicative of an evidence-informed intervention’s readiness for practical implementation. 36 Findings from our RITES scores evaluation indicate that the majority of the RCTs had a primary focus on efficacy and not effectiveness. Although many RCT studies had a fair balance between efficacy and effectiveness for participant characteristics, trial settings and clinical relevance domains (≥50% of RCTs), there is a lack of flexibility in the development and evaluation of the evidence supporting current NMT warm-up programmes. These disparities regarding practical implementation have implications for D&I research and practice in this field. Current NMT programmes may need to be modified or adapted to the local context and evaluated further to improve implementation in youth sport settings.

Contextual considerations

In our Adapted Socioecological Model ( figure 2 ), we demonstrate that the utilisation of NMT programmes by individual players within youth team sport can depend on their coach adopting and implementing the warm-up, which may also be dependent on larger organisational systems. Barriers related to end-users’ success in wide-spread adoption and long-term maintenance can be moderated; however, researchers and implementers have to be intentional about tackling these recognised barriers and associated challenges 25 87 104 115 ; integrating the facilitators of successful implementation intending to reduce and address these obstacles is essential. The barriers and facilitators identified in this systematic review provide insight into the combination of D&I strategies that should be formulated and tested by D&I researchers in the sports injury prevention field.

Within the current review, lack of time, whether it be learning, instructing and/or practicing the programme, is a common barrier that plays a significant role in implementation. A recent narrative review focused solely on the barriers and facilitators associated with exercise-based warm-up programmes showed similar conclusions regarding time restrictions. 115 Collective themes within this literature for players, coaches and organisations found that reduced buy-in and support at different levels impacted the adoption of NMT warm-up programmes. The lack of awareness and knowledge of the injury prevention benefits of NMT warm-up programmes also presented major barriers to buy-in, leading to reduced implementation success. Future interventions should ensure that education about evidence-informed injury prevention outcomes associated with programme adherence is integrated within their D&I strategies.

D&I science is a growing field of study. A variety of D&I outcomes were identified such as self-efficacy, intention, reach, outcome expectancy and most commonly, adherence or adherence-related outcomes. These outcomes were evaluated using different D&I strategies for NMT warm-up programmes. The most commonly reported strategies were Workshops with supplementary Resources with/without in-season Personnel support. Evaluation of D&I outcomes showed that adherence or adherence-related outcomes were most frequently reported across studies. Various measures of adherence as defined by Owoeye et al (2020) were identified, including cumulative utilisation, utilisation frequency, utilisation fidelity, duration fidelity and exercise fidelity. 36

Adherence remains the most common D&I outcome in the sport injury prevention literature. 36 120 In this review, we defined adherence and adherence-related thresholds for a moderate-to-high dose-response to be ≥70% cumulative utilisation and/or ≥1.5 sessions/week to achieve the desired protective effects. This was done with consideration of pragmatism and a practical balance between programme efficacy and effectiveness given the existing literature. 24 91 Of the 40 studies with adherence or adherence-related outcomes, 32 (80%) had a potential D&I effect based on a moderate-to-high adherence or adherence-related outcome level. WR and WRP were the most common D&I strategies for delivering NMT warm-up programmes. While there are several areas for improvement for the practical D&I of NMT warm-up programmes in youth sport settings, the use of comprehensive workshops and supplementary resources at various levels, particularly with coaches, appears to be the current best practice. However, only two ‘highly relevant’ D&I studies (RCTs) from the current systematic review presented conclusions based on the effectiveness of D&I strategies and outcomes specifically.

Many studies (n=26/68; 38.2%) included both male and female participants; however, no sex differences were described. When examining D&I outcomes, only 7/26 (26.9%) of the studies including both male and female youth players had moderate-to-high adherence. In total, 84.6% of the female-only (11/13) and 72.7% of the male-only studies (8/11) reported moderate-to-high adherence levels. These findings suggest that adherence to and implementation of NMT programmes may need greater attention in the male youth team sport setting than in the female youth sport context.

The preliminary evidence from Type 2 and 3 hybrid designs suggests that WR strategies are effective for injury prevention and are associated with more moderate-to-high adherence levels. Given that most studies use some form of WR, adding in-season personnel support does not appear to increase the protective effect and may be less sustainable, since resources, time and support are significant barriers to the D&I of these programmes.

Additionally, greater implementation and programme buy-in were found in studies where uptake of these NMT programmes was supported across multiple stakeholders, particularly at the organisation level. 19 67 90 103 112 Catering to programme deliverers (coaches, organisations, parents) and evaluating their awareness, perception and self-efficacy may help further inform our understanding of D&I and how we can best work to promote programme uptake further.

D&I strategies and injury outcomes

The findings from this systematic review suggest that while various D&I intervention strategies are effective at reducing injuries in youth team sports, the ranges of injury rate ratios are similar across studies employing different strategies (32–88% lower injury rates across WR strategy studies and 41–77% lower injury rates across WRP strategy studies). 22 53 56 Although this was not the proposed evaluation of these studies, our findings demonstrate that the use of workshops may influence D&I success and the availability of supplementary resources alone may not be efficacious. Future evaluation of the influence of delivery strategies should be considered.

Future directions

Using facilitators to reduce barrier burden.

Regarding NMT strategy evaluation, our findings show that most of the current programmes focus on efficacy over effectiveness, particularly in the aspect of intervention flexibility; this suggests a need for the adaptation of NMT programmes to fit local contexts. NMT programme developers should consider more enjoyable and user-friendly exercises that include sport-specific activities (eg, ball work, partner drills, tags). Increasing variations also improves player buy-in and increases intrinsic motivation. At a coaching level, workshops on NMT programmes should include evidence-informed education on the injury prevention benefits and should incorporate content addressing coach self-efficacy to enhance implementation quality. 16 100 121 An ongoing pragmatic evaluation of NMT programme effectiveness is warranted as they undergo adaptation to local contexts.

Organisations have expressed limited knowledge and education for implementation as a significant barrier to successful NMT programme use. 90 99 101 105 112 115 122 Implementers should look to provide accessible resources and encourage further support from multiple stakeholders, including the governing bodies. This could lead to policy changes within the club and result in greater uptake of these programmes long-term. Collaborations among stakeholders (researchers, youth sport administrators, coaches and players) in programme development, evaluation and D&I are necessary to improve efforts for impactful practical translation of programmes.

Research recommendations

The support for NMT programmes within youth sport is extensive. 28 Although these programmes have been shown to be effective for injury prevention in many sports, 10 11 sport representation across D&I studies in our review was limited. Scaling up of NMT programmes and supporting continued research into other sports is vital for increased context-specific D&I of these programmes to reduce the overall burden of youth sport injuries.

Compliance and adherence were often used interchangeably, despite having distinct definitions. Although their mathematical calculations are similar, these two constructs are contextually different. Compliance refers to individuals conforming to prescribed recommendations in controlled intervention settings, 123 while adherence refers to the agreement of an individual’s behaviour to recommended evidence-based interventions in uncontrolled settings. 36 Standardised definitions should be considered more frequently by researchers to build on current knowledge and inform future D&I research.

Using D&I frameworks/models can improve NMT programme implementation success in a practical setting. 71 124 Application of D&I frameworks/models, including behaviour change models, 124 is limited in injury prevention and this is reflected in the current systematic review. Future studies should use D&I frameworks/models to help guide the implementation of these NMT programmes. In doing so, researchers can gain a better understanding of the contextual and behaviour change aspects related to youth sport injury prevention. 115

Limitations

Given the broad nature of our research question, specific results were required for inclusion. Despite being specific to our objectives, our limitations set for participant age range, team sport settings and English language studies only, may have resulted in missing other studies that evaluated D&I interventions and outcomes related to NMT programmes.

Due to the heterogeneous nature of studies, meta-analysis was not possible for any of our objectives. Inclusion of various study designs, although comprehensive, impeded this process and resulted in inconsistent injury and adherence definitions across our population of interest. Furthermore, the subjective nature of many qualitative studies included may have resulted in variability within the data extracted. With the varied definitions used for each specific outcome and design, we looked to consolidate the terminology used into more succinct and unified language and we encourage this to be employed by researchers.

Methodological flaws existed in the included studies that may warrant caution about the interpretation of our conclusions. For example, many of the included studies did not include power calculations or reported low power, increasing the chance of Type 2 error. Further, many studies did not consider confounding or effect modification in their analyses or failed to report the validity of measurement tools used for injury data collection. We also acknowledge that publication bias may have favoured the inclusion of studies demonstrating significant findings (eg, effectiveness, efficacy). By considering quality assessment as an objective, we aimed to account for these limitations.

There was limited evidence supporting the effect of D&I strategies on D&I-specific outcomes. Only two high-level evidence studies (RCTs) in this review directly examined the effect of D&I strategies on D&I outcomes. 55 56 D&I-related outcomes were evaluated as secondary objectives in other high-level evidence studies; therefore, we could only examine the relationship between D&I strategy and outcome to assess whether the strategy used resulted in moderate-to-high adherence levels, given our pre-established thresholds.

Conclusions

This systematic review demonstrates that: (1) few D&I-related studies are based on D&I or behaviour change theories, frameworks or models; (2) few RCTs have examined the effectiveness of D&I strategies for delivering NMT programmes; (3) programme flexibility and time restrictions are the most common barriers to implementation; and (4) a combination of coach workshops and supplementary resources is currently the primary strategy facilitating NMT programme D&I, although its effectiveness has been evaluated in only a few studies. This systematic review provides foundational evidence to facilitate evidence-informed knowledge translation practices in youth sport injury prevention. Transitioning to more high-quality D&I research (RCTs and quasi-experimental designs) that leverages current knowledge of barriers and facilitators, incorporates Type 2 or Type 3 hybrid approaches and uses behaviour change frameworks is an important next step to optimise the translation of NMT programmes into routine practice in youth team sport settings.

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

Supplementary materials

Supplementary data.

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

  • Data supplement 1
  • Data supplement 2

X @carlavdb_, @amraisanen, @KatiPasanen, @Kat_Schneider7, @CarolynAEmery, @owoeye_oba

Contributors DL, CE and OBAO contributed to development of study proposal and design. DL, CvdB, AMR, IJS, KV, JK, AH, CE and OBAO conducted search, study selection and screening, data extraction and synthesis and quality assessment. DL led the writing of the manuscript and was the guarantor for the project. All authors contributed to drafting and revising the final manuscript. All authors approved the submitted version of the manuscript.

Funding This study was funded by the Canadian Institutes of Health Research Foundation Grant Program (PI CAE).

Competing interests OBAO is a Deputy Editor for the British Journal of Sports Medicine. CE, KJS and KP are Associate Editors for the British Journal of Sports Medicine.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Systematic Review Article: Association between low birth weight and impaired glucose tolerance in children: a systematic review and meta-analysis

  • 1 Department of Pediatrics, Heping Hospital Affiliated to Changzhi Medical College, Changzhi, Shanxi, China
  • 2 Department of Gastroenterology, Guangzhou Red Cross Hospital, Guangzhou, Guangdong, China
  • 3 Department of Otolaryngology, Heping Hospital Affiliated to Changzhi Medical College, Changzhi, Shanxi, China
  • 4 Department of Nursing, Heping Hospital Affiliated to Changzhi Medical College, Changzhi, Shanxi, China

Background: A potential association between the onset of diabetes and normal birth weight (NBW) has been discovered. Diverse conclusions and study methodologies exist regarding the connection between low birth weight (LBW) and impaired glucose tolerance in children, underscoring the need for further robust research. Our institution is embarking on this study to thoroughly examine the association between LBW and impaired glucose tolerance in children.

Methods: We conducted searches on Cochrane Library, ScienceDirect, EMBASE, PubMed, China National Knowledge Infrastructure (CNKI), Chinese Biomedical Literature data (CBM) online database, VIP full-text Database, and Wanfang Database to identify correlation analyses or case-control studies investigating the relationship between LBW and abnormal glucose tolerance in children. The search spanned from January 2010 to September 2023. The quality of observational studies was evaluated using the Newcastle–Ottawa Scale (NOS) tool. Data synthesis was performed using the statistical software RevMan 5.3 for meta-analysis.

Results: Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we included 10 clinical control studies comprising a total of 2,971 cases. There was no significant difference in blood glucose levels among LBW, NBW and high birth weight (HBW) infants ( P > 0.05). There was no significant difference in insulin levels between LBW infants and NBW infants ( P > 0.05). The HOMA-IR of LBW infants was significantly higher than that of NBW infants ( P < 0.05). LBW was positively correlated with abnormal glucose tolerance relative to NBW and HBW infants [Fisher's Z = 0.42, 95% CI = (0.09, 0.75), P = 0.01].

Conclusion: LBW is associated with an increased risk of abnormal glucose tolerance, as indicated by elevated HOMA-IR level in LBW infants compared to NBW and HBW pediatric population. Further research is needed to confirm and expand upon these findings to better understand the complex relationship between LBW and impaired glucose tolerance in children.

1 Introduction

In China, the prevalence of diabetes has surged, with over 30 million individuals affected, marking a substantial rise from 0.8% in 1980 to 3.5% in 2000 ( 1 , 2 ). A study conducted from 2015 to 2017 found that the overall prevalence of diabetes among adults in China is 12.8%, comprising a newly diagnosed diabetes prevalence of 6.8% and a self-reported diabetes prevalence of 6.0% ( 3 – 6 ). The rising incidence of diabetes has led to an increased prevalence of the condition among young adults, and reports indicate that diabetes can manifest in individuals as young as 13 years old ( 7 , 8 ). Complications that accompany diabetes, such as hyperlipidemia and hypertension, have drawn increasing attention in relation to the onset, progression, outcomes, and management of the disease. Diabetic complications most commonly take the form of macrovascular and microvascular disease, and abnormal blood lipid metabolism is involved throughout the course of the disease. A randomized controlled trial has demonstrated the intricate interplay between blood glucose and blood lipids in individuals with diabetes ( 9 – 11 ).

Given the rising incidence and prevalence of type 2 diabetes among children and adolescents, this issue may emerge as a significant public health concern impacting both developed and developing nations. Consequently, from a population standpoint, it is imperative to identify potential risk factors and susceptible groups that could benefit from screening and preventive measures ( 12 – 14 ). So far, scholars have explored the etiology of diabetes from various perspectives, including pathology, genetics, genomics, and social factors. The development of diabetes has been linked to abnormal birth weight ( 15 , 16 ). High birth weight, which often stems from fetal overnutrition, maternal diabetes, and other maternal health conditions, can significantly predispose individuals to obesity and diabetes in adulthood, typically around the age of 18. This association may be attributed to genetic polymorphisms and the onset of insulin resistance ( 17 ). Additionally, abnormal insulin secretion during the fetal period, which affects fetal growth and development, may contribute to the prevalence of infants with low birth weight (LBW) and heighten the risk of diabetes in adulthood ( 18 , 19 ).

At present, numerous investigations have explored the link between LBW and impaired glucose tolerance in children. However, these studies yield varying conclusions and employ different designs, which limits their applicability. The findings of a single study on the correlation between LBW and impaired glucose tolerance in children may lack conviction without robust scientific support. Additional rigorous research is therefore warranted to evaluate this relationship comprehensively. Consequently, a thorough, quantitative, and systematic meta-analysis of independent studies with similar objectives was conducted to investigate the association between LBW and impaired glucose tolerance in children. This analysis aims to provide valuable insights to inform further exploration of the underlying causes of type 2 diabetes and to enhance eugenic strategies.

2.1 Database and literature search

A computer-based search was carried out across multiple databases, including the Cochrane Library, ScienceDirect, EMBASE, Wanfang Database, the Chinese Biomedical Literature Data (CBM), VIP Full-text Database, and China National Knowledge Infrastructure (CNKI). This extensive search strategy encompassed a wide range of sources, including degree papers, conference papers, Chinese and foreign periodicals, and news articles, and was supplemented by manual searches.

The main aim was to collect pertinent data regarding the association between LBW and impaired glucose tolerance in children. The literature retrieval process utilized a combination of free-text and subject-specific keywords. Key search terms such as “newborn,” “low birth weight,” and “impaired glucose tolerance” were employed, with the search period spanning from January 2010 onwards. This comprehensive strategy aimed to encompass the latest and most relevant research findings in the field.

2.2 Inclusion criteria and exclusion criteria

2.2.1 Literature inclusion criteria

(1) Observational studies that were published in full-text format.

(2) Inclusion of newborns with birth weight of less than 1,500 g.

(3) Assessment of the correlation between LBW and impaired glucose tolerance in children.

(4) Adjustment or control for potential confounding factors, with reporting of relative risk factors or comparison of blood glucose, insulin, and Homeostasis Model Assessment for Insulin Resistance (HOMA-IR) indices with those of normal newborns and high-birth-weight newborns. Based on previous literature ( 20 ), children were classified into LBW (<2,500 g), normal birth weight (NBW; 2,500–3,999 g), and high birth weight (HBW; ≥4,000 g). Impaired glucose tolerance was defined as a 2-h plasma glucose concentration (2hPG) of 140–199 mg/dl ( 21 ).
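
The cut-offs in criterion (4) can be written down as a small classifier. This is only an illustrative sketch: the function names are hypothetical, and treating any 2hPG value from 140 up to (but not including) 200 mg/dl as impaired glucose tolerance is an assumption about how the 140–199 mg/dl range handles fractional readings.

```python
from typing import Literal

BirthWeightClass = Literal["LBW", "NBW", "HBW"]


def classify_birth_weight(grams: float) -> BirthWeightClass:
    """Classify a birth weight (in grams) using the cut-offs quoted in the text."""
    if grams < 0:
        raise ValueError("birth weight must be non-negative")
    if grams < 2500:
        return "LBW"  # low birth weight: < 2,500 g
    if grams < 4000:
        return "NBW"  # normal birth weight: 2,500-3,999 g
    return "HBW"      # high birth weight: >= 4,000 g


def has_impaired_glucose_tolerance(two_hour_pg_mg_dl: float) -> bool:
    """True when 2-h plasma glucose lies in the 140-199 mg/dl band.

    Assumption: readings between 199 and 200 mg/dl are counted as inside the band.
    """
    return 140 <= two_hour_pg_mg_dl < 200


print(classify_birth_weight(2400), has_impaired_glucose_tolerance(150))
```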

2.2.2 Literature exclusion criteria

(1) Studies with incomplete and unusable data.

(2) Duplicate research content, with preference given to the most recent study.

(3) Reviews, editorials, preclinical studies, and literature that did not directly relate to the specific purpose of the current meta-analysis.

(4) Clinical case reports, which were not considered in this meta-analysis.

2.3 Study selection and data extraction

The process of screening the literature and extracting data followed a rigorous and systematic approach.

2.3.1 Independent screening

Two researchers conducted separate reviews of the selected literature and extracted relevant information.

2.3.2 Quality evaluation

These researchers also assessed the quality of the included studies.

2.3.3 Cross-check

To ensure accuracy and consistency, the results of the independent screenings and data extractions were cross-checked. Any discrepancies were addressed through discussion and consensus. In instances of unresolved discrepancies, a third researcher was consulted to provide adjudication.

2.3.4 Software utilization

NoteExpress document management software and Excel office software were employed for data management and extraction, facilitating efficient organization and analysis of the research data.

2.3.5 Data completeness

In cases where the literature lacked necessary information, the authors of the respective articles were contacted to request supplementary data.

The information retrieved from the data comprised: (1) the authors’ names, the publishing year and the country of the institute; (2) the characteristics of the study design; (3) the characteristics of participants, including health status, sample size and average age; (4) the number of normal weight, overweight and LBW newborns; and (5) confounding factors adjusted or controlled when reporting correlations.

2.4 Quality assessment

For assessing the quality of observational studies in this meta-analysis, the Newcastle-Ottawa Scale (NOS) tool was utilized. Studies with a NOS score of ≥6 were categorized as medium to high quality, whereas those with an NOS score <6 were classified as low quality.

2.5 Statistical analysis

RevMan 5.3 software, developed by the Cochrane Collaboration, was used to conduct the meta-analyses. The means and standard deviations of blood glucose levels, insulin levels, and HOMA-IR in each group were entered into RevMan 5.3 for analysis. The weighted mean difference (WMD) was used as the effect size, and 95% confidence intervals (CI) were calculated. Heterogeneity was evaluated using the χ² test and the I² statistic, which quantifies the proportion of total variation across studies attributable to heterogeneity. A P-value below 0.05 was deemed statistically significant ( 22 , 23 ).
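
To make the effect-size step concrete, the sketch below shows how a mean difference and its 95% CI can be pooled across studies by inverse-variance weighting. It is not the RevMan implementation: it uses a plain fixed-effect model (a random-effects model would add a between-study variance term to each weight), the names `Arm`, `mean_difference` and `pool_fixed` are hypothetical, and the two "studies" are made-up numbers chosen only to keep the arithmetic easy to follow.

```python
import math
from dataclasses import dataclass


@dataclass
class Arm:
    mean: float
    sd: float
    n: int


def mean_difference(a: Arm, b: Arm) -> tuple[float, float]:
    """Return (mean difference a - b, standard error of that difference)."""
    md = a.mean - b.mean
    se = math.sqrt(a.sd ** 2 / a.n + b.sd ** 2 / b.n)
    return md, se


def pool_fixed(effects: list[tuple[float, float]]) -> tuple[float, float, float]:
    """Inverse-variance fixed-effect pooled estimate with a 95% CI."""
    weights = [1.0 / se ** 2 for _, se in effects]
    pooled = sum(w * md for w, (md, _) in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled


# Hypothetical data: two identical studies, LBW arm vs. NBW arm.
effects = [
    mean_difference(Arm(5.0, 1.0, 100), Arm(4.8, 1.0, 100)),
    mean_difference(Arm(5.0, 1.0, 100), Arm(4.8, 1.0, 100)),
]
pooled, ci_low, ci_high = pool_fixed(effects)
print(f"WMD = {pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

Because each study receives a weight of 1/SE², larger and more precise studies dominate the pooled estimate, which is why the text reports a weighted rather than a simple mean difference.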

3 Results and analysis

3.1 Literature retrieval results and basic characteristics of the included studies

In adherence to the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) guidelines, the study initiated with a computer-based database search, resulting in the retrieval of 742 studies. After eliminating duplicate studies, 561 unique studies remained. These papers were then subject to preliminary screening, during which 308 studies were reviewed.

After the initial screening, 142 studies met the inclusion criteria for further assessment, while irrelevant studies, reviews, case reports, and uncontrolled documents were excluded. Subsequently, the full texts of the selected literature underwent thorough examination, with papers containing incomplete data or lacking key outcome indicators being excluded. Ultimately, the study integrated data from 10 clinical control studies, comprising a total of 2,971 samples. This meticulous selection process ensured that the included studies were pertinent, met the required criteria, and enhanced the robustness of the meta-analysis. Figure 1 illustrates the flow chart detailing the literature screening process, while Table 1 presents the fundamental characteristics identified in the literature.
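
The screening flow can be tallied from the counts given above. The sketch assumes the four figures (742, 561, 142 and 10) are consecutive stages of one flow; the 308 studies mentioned for the preliminary screening are left out because the text does not say how that figure relates to the others. The function name `stage_losses` and the stage labels are hypothetical.

```python
def stage_losses(counts: list[int]) -> list[int]:
    """Records dropped between consecutive stages of a screening flow."""
    if any(later > earlier for earlier, later in zip(counts, counts[1:])):
        raise ValueError("counts must not increase from one stage to the next")
    return [earlier - later for earlier, later in zip(counts, counts[1:])]


# Counts as reported in the text; labels are illustrative.
stages = {
    "retrieved from databases": 742,
    "after removing duplicates": 561,
    "kept for full-text assessment": 142,
    "included in the meta-analysis": 10,
}
losses = stage_losses(list(stages.values()))
for label, lost in zip(list(stages)[1:], losses):
    print(f"{lost} records dropped before stage: {label}")
```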

Figure 1 . Illustration of literature screening.

Table 1 . Basic characteristics of literature.

3.2 Methodological quality of the included studies

All included studies described their intervention methods and observation indicators in detail, but none described in detail the blinding procedures or the number of, and reasons for, losses to follow-up or withdrawals. On the NOS, studies scoring <6 were rated as low quality and studies scoring ≥6 as high quality ( Table 2 ).

Table 2 . Literature quality.

3.3 Meta-analysis results

3.3.1 Blood glucose level

The blood glucose levels of each group were compared by meta-analysis. The heterogeneity tests gave the following results: LBW vs. NBW: Chi² = 25.86, df = 4, P < 0.0001, I² = 85%; LBW vs. HBW: Chi² = 0.31, df = 1, P = 0.58, I² = 0%. As shown in Figures 2 and 3, there was no statistically significant difference in blood glucose levels between LBW infants and either normal-weight or overweight infants ( P > 0.05).
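
The I² values quoted above follow directly from the Chi² statistic (Cochran's Q) and its degrees of freedom via the standard Higgins formula I² = max(0, (Q − df) / Q) × 100%. The short check below applies that formula to the two reported blood glucose comparisons; the function name `i_squared` is a hypothetical helper, not part of RevMan.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins' I^2 in percent, from Cochran's Q (the Chi^2 value) and its df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Chi^2 and df values as reported for the blood glucose comparisons.
reported = {
    "LBW vs. NBW": (25.86, 4),
    "LBW vs. HBW": (0.31, 1),
}
for label, (q, df) in reported.items():
    print(f"{label}: I^2 = {i_squared(q, df):.0f}%")
# prints:
# LBW vs. NBW: I^2 = 85%
# LBW vs. HBW: I^2 = 0%
```

When Q is smaller than its degrees of freedom, as in the LBW vs. HBW comparison, the formula is truncated at zero, which is why that comparison reports I² = 0%.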

Figure 2 . Forest plot comparing blood glucose levels between normal-weight and low birth weight children.

Figure 3 . Forest plot comparing blood glucose levels between overweight and low birth weight children.

3.3.2 Insulin level

A meta-analysis of insulin levels was performed for each comparison. In the comparison between LBW and NBW, the Chi² statistic was 6.85 with four degrees of freedom, giving a P-value of 0.14 and an I² of 42%, which indicates moderate heterogeneity among the studies. In the comparison between LBW and HBW, the Chi² statistic was 11.78 with one degree of freedom, giving a P-value of 0.0006 and an I² of 92%, which indicates a high level of heterogeneity. According to the random-effects model ( Figure 4 ), there was no significant difference in insulin levels between LBW infants and normal-weight children ( P > 0.05).

Figure 4 . Forest plot comparing insulin levels between normal-weight and low birth weight children.

3.3.3 HOMA-IR

In the comparison between LBW and normal birth weight (NBW) children ( Figure 5 ), the Chi² statistic was 6.85 with four degrees of freedom, yielding a P-value of 0.14 and an I² of 42%, indicating a moderate level of heterogeneity among the studies.

Figure 5 . Forest plot comparing HOMA-IR between normal-weight and low birth weight children.

In the comparison of LBW with HBW children ( Figure 6 ), the Chi² statistic was 11.78 with one degree of freedom, yielding a P-value of 0.0006 and an I² of 92%, suggesting a high level of heterogeneity among the studies for this comparison. The meta-analysis findings reveal that LBW infants have significantly higher HOMA-IR values than NBW children ( P < 0.05). Nonetheless, in comparing LBW to HBW children, the high level of heterogeneity underscores the need for caution in interpreting the results. This heterogeneity indicates significant variability among the included studies in this comparison, potentially influencing the overall findings.

Figure 6 . Forest plot comparing HOMA-IR between overweight and low birth weight children.

3.3.4 Analysis of correlation between low birth weight and HOMA-IR

This analysis drew on the 10 included clinical controlled studies, comprising a total of 2,971 samples, and examined the association between LBW and HOMA-IR by meta-analysis. The heterogeneity test indicated significant heterogeneity (Chi² = 912.67, df = 7, P < 0.00001, I² = 99%), suggesting substantial variation among the included studies; a random-effects model was therefore used ( Figure 7 ). LBW was positively correlated with abnormal glucose tolerance relative to normal and overweight children [Fisher's Z = 0.42, 95% CI: 0.09–0.75, P = 0.01].
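
Fisher's Z is the variance-stabilising transform of a correlation coefficient, z = atanh(r), so the pooled value reported above can be mapped back to the correlation scale with r = tanh(z). The sketch below does this for the point estimate and both CI limits; the helper name `fisher_z_to_r` is hypothetical, and the back-transformed figures are offered as a reading aid, not as results reported by the authors.

```python
import math


def fisher_z_to_r(z: float) -> float:
    """Back-transform a Fisher's Z value to a correlation coefficient."""
    return math.tanh(z)


# Pooled Fisher's Z and its 95% CI limits as reported in the text.
z, z_low, z_high = 0.42, 0.09, 0.75
r, r_low, r_high = (fisher_z_to_r(v) for v in (z, z_low, z_high))
print(f"r = {r:.2f} (95% CI {r_low:.2f} to {r_high:.2f})")
# prints: r = 0.40 (95% CI 0.09 to 0.64)
```

On this scale the pooled estimate corresponds to a moderate positive correlation, with a wide interval that is consistent with the very high heterogeneity (I² = 99%).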

Figure 7 . Forest plot of the correlation between low birth weight and HOMA-IR.

3.3.5 Publication bias analysis

Funnel plots were created from the blood glucose, insulin level, HOMA-IR and correlation analysis results of each group, and publication bias was examined ( Supplementary Figures S1–S4 ). The majority of funnel plots appeared symmetrical, although a small proportion showed asymmetry, suggesting that some publication bias may be present in the included literature. This asymmetry could also be linked to the heterogeneity observed in the study.

4 Analysis and discussion

Previous research has shown a link between diabetes and LBW ( 33 ). The “fetal origins hypothesis,” proposed in the 1990s, suggests that the conditions experienced during intrauterine development significantly influence the risk of developing diseases in adulthood. According to this hypothesis, individuals born with LBW are at a considerably higher risk of developing type 2 diabetes later in life ( 34 ). Preterm delivery or intrauterine growth restriction is the most common cause of LBW ( 35 ). Sixty-three percent of LBW infants are born prematurely, while the remaining cases are attributed to intrauterine dysplasia. It is noteworthy that nearly all very low birth weight infants are born prematurely, some of them extremely so, with gestational ages of less than 25 weeks. Stunted development in utero impairs the development and function of the pancreas in LBW infants, leading to problems with lipid and glucose metabolism and hypertension in adulthood ( 36 , 37 ). Genetic research indicates that variations in susceptibility genes associated with type 2 diabetes may also be linked to LBW. This suggests a potential genetic predisposition for both lower birth weight and an increased risk of type 2 diabetes later in life. These findings underscore the intricate interplay between genetic factors and health outcomes across the lifespan ( 38 ). Individuals with a low birth weight or low childhood weight tend to gain weight rapidly in adulthood (after 18 years of age) because of dietary changes, which significantly increases the risk of developing diabetes and other related metabolic disorders. Reduced birth weight has been associated with the upregulation of certain genes, commonly known as “thrifty genes.” These genes might be involved in metabolic adaptations to prenatal undernutrition. Furthermore, there is evidence connecting LBW to a higher risk of developing several disorders, including diabetes, in adulthood, suggesting that early life factors, including birth weight, can influence gene expression and contribute to the later-life development of chronic illnesses.

Recent evidence indicates that LBW infants are prone to developing obesity, insulin resistance, hypertension, and vascular diseases in adulthood. Additionally, the incidence and mortality rates of other conditions such as enterocolitis, late-onset septicemia, and intraventricular hemorrhage are elevated in this population ( 39 ). The prevalence of diabetes and hypertension among individuals born with LBW rises significantly in adulthood. A survey has shown that the incidence of type 2 diabetes and birth weight are correlated in a U-shaped manner, and that the number of diabetes cases complicated by hypertension is significantly increased among those born with LBW. Diabetes is also associated with high birth weight, while hypertension is notably more prevalent among high birth weight infants. It is hypothesized that hypertension in high birth weight infants and LBW infants may arise from distinct metabolic phenotypes or similar environmental factors. Moreover, LBW infants exhibit a significantly higher prevalence of hyperlipidemia compared to those with normal birth weight ( 40 ). A previous study of 300 high birth weight infants found that the detection rates of overweight and obesity in the macrosomia group (13.10% vs. 2.86%) were higher than those in the control group (9.69% vs. 1.61%) ( 41 ), suggesting that the risk of insulin resistance and abnormal lipid metabolism is greater in infants with abnormal birth weight than in those with normal birth weight. The Chinese multi-provincial Study on Risk Factors of Cardiovascular Diseases (CMCS) has suggested that the proportion of diabetic patients with abnormal blood lipid metabolism is considerably higher, and that the proportion of diabetic patients with atherosclerosis risk factors such as coronary heart disease, cerebral infarction and venous thrombosis is also significantly higher than that of non-diabetic patients.

A growing body of evidence suggests that LBW in newborns is related to abnormal glucose tolerance in children. In this study, the blood glucose and insulin levels of LBW, normal, and overweight newborns were compared by meta-analysis. The findings indicated no significant difference in blood glucose levels between LBW newborns and overweight or normal newborns. Meta-analysis of HOMA-IR values in each group showed that the HOMA-IR values of LBW infants were significantly higher, suggesting a correlation between LBW and HOMA-IR. A meta-analysis of the correlation between LBW and HOMA-IR under a random-effects model showed that LBW was positively correlated with abnormal glucose tolerance relative to normal and overweight children [Fisher's Z = 0.42, P = 0.01, 95% CI = (0.09, 0.75)]. An analysis of existing research in this domain indicates a connection between abnormal glucose tolerance and atypical birth weight in LBW infants. This association cannot be attributed solely to factors related to the fetus itself, prenatal malnutrition, or the intrauterine environment; it also involves the pregnant woman's health, as well as lifestyle choices and dietary habits during adulthood. Additionally, genetic modifications resulting from certain factors in adulthood may also influence this intricate relationship. Understanding these multifaceted connections is crucial for comprehensively addressing and managing health risks associated with abnormal glucose tolerance and birth weight.

However, the study has certain limitations that warrant consideration:

(1) Stringent Criteria for Inclusion and Exclusion: The study employed rigorous criteria for inclusion and exclusion, leading to a relatively small number of included studies. Furthermore, detailed subgroup analysis was not conducted on studies displaying heterogeneity. This limited the diversity of the included literature and may affect the generalizability of the findings.

(2) Inconsistent Treatment Protocols and Outcome Measures: Variability in the treatment protocols and outcome indicators across the included studies may introduce heterogeneity and affect the reliability of the outcomes. For example, insulin level is influenced by age and gender ( 42 ), so these factors may have influenced the results of this study. To bolster the robustness of the findings, further research is needed, including high-quality correlation studies and case-control studies. These endeavors will provide a deeper understanding of the relationship between abnormal glucose tolerance and birth weight, thus advancing our knowledge in this critical area of study.

5 Conclusion

It has been shown that LBW in babies is associated with poor glucose tolerance in the pediatric population and a higher chance of type 2 diabetes in adults. This underscores the significance of preventive measures to manage birth weight abnormalities. Dietary and exercise management during the perinatal and developmental stages is crucial for mitigating the risk of diabetes. These insights underscore the necessity of early interventions and a comprehensive healthcare approach to mitigate the enduring adverse impacts of low birth weight on health outcomes.

Data availability statement

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Author contributions

JM: Data curation, Formal Analysis, Writing – original draft. YW: Conceptualization, Writing – review & editing. MM: Data curation, Methodology, Writing – original draft. ZL: Conceptualization, Formal Analysis, Methodology, Writing – original draft.

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fped.2024.1362076/full#supplementary-material

Supplementary Figure S1 Funnel plot based on blood glucose level. Note: ( A ) LBW compared with NBW; ( B ) LBW compared with HBW.

Supplementary Figure S2 Funnel chart based on insulin level.

Supplementary Figure S3 Funnel plot based on HOMA-IR. Note: ( C ) LBW compared with NBW; ( D ) LBW compared with HBW.

Supplementary Figure S4 Funnel chart based on the results of correlation analysis.

Keywords: newborn, low birth weight, abnormal glucose tolerance, diabetes, meta-analysis

Citation: Ma J, Wang Y, Mo M and Lian Z (2024) Association between low birth weight and impaired glucose tolerance in children: a systematic review and meta-analysis. Front. Pediatr. 12:1362076. doi: 10.3389/fped.2024.1362076

Received: 27 December 2023; Accepted: 23 April 2024; Published: 9 May 2024.

© 2024 Ma, Wang, Mo and Lian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zerong Lian
