Submitted for publication January 8, 2018. Accepted for publication November 29, 2018.
John T. Ratelle, Adam P. Sawatsky, Thomas J. Beckman; Quantitative Research Methods in Medical Education. Anesthesiology 2019; 131:23–35. doi: https://doi.org/10.1097/ALN.0000000000002727
There has been a dramatic growth of scholarly articles in medical education in recent years. Evaluating medical education research requires specific orientation to issues related to format and content. Our goal is to review the quantitative aspects of research in medical education so that clinicians may understand these articles with respect to framing the study, recognizing methodologic issues, and utilizing instruments for evaluating the quality of medical education research. This review can be used both as a tool when appraising medical education research articles and as a primer for clinicians interested in pursuing scholarship in medical education.
Image: J. P. Rathmell and Terri Navarette.
There has been an explosion of research in the field of medical education. A search of PubMed demonstrates that more than 40,000 articles have been indexed under the medical subject heading “Medical Education” since 2010, which is more than the total number of articles indexed under this heading in the 1980s and 1990s combined. Keeping up to date requires that practicing clinicians have the skills to interpret and appraise the quality of research articles, especially when serving as editors, reviewers, and consumers of the literature.
While medical education shares many characteristics with other biomedical fields, it also differs in important ways. We recognize that practicing clinicians may not be familiar with the nuances of education research and how to assess its quality. Therefore, our purpose is to provide a review of quantitative research methodologies in medical education. Specifically, we describe a structure that can be used when conducting or evaluating medical education research articles.
Clarifying the Research Purpose

Clarifying the research purpose is an essential first step when reading or conducting scholarship in medical education. 1 Medical education research can serve a variety of purposes, from advancing the science of learning to improving the outcomes of medical trainees and the patients they care for. However, a well-designed study has limited value if it addresses vague, redundant, or unimportant medical education research questions. When reading or planning a study, consider the following questions:
- What is the research topic and why is it important? What is unknown about the research topic? Why is further research necessary?
- What is the conceptual framework being used to approach the study?
- What is the statement of study intent?
- What are the research methodology and study design? Are they appropriate for the study objective(s)?
- Which threats to internal validity are most relevant for the study?
- What is the outcome and how was it measured?
- Can the results be trusted? What is the validity and reliability of the measurements?
- How were research subjects selected? Is the research sample representative of the target population?
- Was the data analysis appropriate for the study design and type of data?
- What is the effect size? Do the results have educational significance?
Fortunately, there are steps to ensure that the purpose of a research study is clear and logical. Table 1 2–5 outlines these steps, which are described in detail in the following sections. We describe these elements not as a simple “checklist,” but as an advance organizer that can be used to understand a medical education research study. These steps can also be used by clinician educators who are new to the field of education research and who wish to conduct scholarship in medical education.
Table 1. Steps in Clarifying the Purpose of a Research Study in Medical Education
Literature Review and Problem Statement
A literature review is the first step in clarifying the purpose of a medical education research article. 2 , 5 , 6 When conducting scholarship in medical education, a literature review helps researchers develop an understanding of their topic of interest. This understanding includes both existing knowledge about the topic as well as key gaps in the literature, which aids the researcher in refining their study question. Additionally, a literature review helps researchers identify conceptual frameworks that have been used to approach the research topic. 2
When reading scholarship in medical education, a successful literature review provides background information so that even someone unfamiliar with the research topic can understand the rationale for the study. Located in the introduction of the manuscript, the literature review guides the reader through what is already known in a manner that highlights the importance of the research topic. The literature review should also identify key gaps in the literature so the reader can understand the need for further research. This gap description includes an explicit problem statement that summarizes the important issues and provides a reason for the study. 2 , 4 The following is one example of a problem statement:
“Identifying gaps in the competency of anesthesia residents in time for intervention is critical to patient safety and an effective learning system… [However], few available instruments relate to complex behavioral performance or provide descriptors…that could inform subsequent feedback, individualized teaching, remediation, and curriculum revision.” 7
This problem statement articulates the research topic (identifying resident performance gaps), why it is important (to intervene for the sake of learning and patient safety), and current gaps in the literature (few tools are available to assess resident performance). The researchers have now underscored why further research is needed and have helped readers anticipate the overarching goals of their study (to develop an instrument to measure anesthesiology resident performance). 4
The Conceptual Framework
Following the literature review and articulation of the problem statement, the next step in clarifying the research purpose is to select a conceptual framework that can be applied to the research topic. Conceptual frameworks are “ways of thinking about a problem or a study, or ways of representing how complex things work.” 3 Just as clinical trials are informed by basic science research in the laboratory, conceptual frameworks often serve as the “basic science” that informs scholarship in medical education. At a fundamental level, conceptual frameworks provide a structured approach to solving the problem identified in the problem statement.
Conceptual frameworks may take the form of theories, principles, or models that help to explain the research problem by identifying its essential variables or elements. Alternatively, conceptual frameworks may represent evidence-based best practices that researchers can apply to an issue identified in the problem statement. 3 Importantly, there is no single best conceptual framework for a particular research topic, although the choice of a conceptual framework is often informed by the literature review and knowing which conceptual frameworks have been used in similar research. 8 For further information on selecting a conceptual framework for research in medical education, we direct readers to the work of Bordage 3 and Irby et al. 9
To illustrate how different conceptual frameworks can be applied to a research problem, suppose you encounter a study to reduce the frequency of communication errors among anesthesiology residents during day-to-night handoff. Table 2 10,11 identifies two different conceptual frameworks researchers might use to approach the task.

Table 2. Conceptual Frameworks to Address the Issue of Handoff Errors in the Intensive Care Unit

The first framework, cognitive load theory, has been proposed as a conceptual framework to identify potential variables that may lead to handoff errors. 12 Specifically, cognitive load theory identifies three factors that affect short-term memory and thus may lead to communication errors:
- Intrinsic load: Inherent complexity or difficulty of the information the resident is trying to learn (e.g., complex patients).
- Extraneous load: Distractions or demands on short-term memory that are not related to the information the resident is trying to learn (e.g., background noise, interruptions).
- Germane load: Effort or mental strategies used by the resident to organize and understand the information he/she is trying to learn (e.g., teach-back, note taking).
Using cognitive load theory as a conceptual framework, researchers may design an intervention to reduce extraneous load and help the resident remember the overnight to-do’s. An example might be dedicated, pager-free handoff times where distractions are minimized.
The second framework identified in table 2 , the I-PASS (Illness severity, Patient summary, Action list, Situational awareness and contingency planning, and Synthesis by receiver) handoff mnemonic, 11 is an evidence-based best practice that, when incorporated as part of a handoff bundle, has been shown to reduce handoff errors on pediatric wards. 13 Researchers choosing this conceptual framework may adapt some or all of the I-PASS elements for resident handoffs in the intensive care unit.
Note that both of the conceptual frameworks outlined above provide researchers with a structured approach to addressing the issue of handoff errors; one is not necessarily better than the other. Indeed, it is possible for researchers to use both frameworks when designing their study. Ultimately, we provide this example to demonstrate the necessity of selecting conceptual frameworks to clarify the research purpose. 3 , 8 Readers should look for conceptual frameworks in the introduction section and should be wary of their omission, as commonly seen in less well-developed medical education research articles. 14
Statement of Study Intent
After reviewing the literature, articulating the problem statement, and selecting a conceptual framework to address the research topic, the final step in clarifying the research purpose is the statement of study intent. The statement of study intent is arguably the most important element of framing the study because it makes the research purpose explicit. 2 Consider the following example:
“This study aimed to test the hypothesis that the introduction of the BASIC Examination was associated with an accelerated knowledge acquisition during residency training, as measured by increments in annual ITE scores.” 15
This statement of study intent succinctly identifies several key study elements, including the population (anesthesiology residents), the intervention/independent variable (introduction of the BASIC Examination), the outcome/dependent variable (knowledge acquisition, as measured by In-Training Examination [ITE] scores), and the hypothesized relationship between the independent and dependent variables (the authors hypothesize a positive association between the introduction of the BASIC Examination and the speed of knowledge acquisition). 6,14
The statement of study intent will sometimes manifest as a research objective rather than a hypothesis or question. In such instances, there may not be explicit independent and dependent variables, but the study population and research aim should be clearly identified. The following is an example:
“In this report, we present the results of 3 [years] of course data with respect to the practice improvements proposed by participating anesthesiologists and their success in implementing those plans. Specifically, our primary aim is to assess the frequency and type of improvements that were completed and any factors that influence completion.” 16
The statement of study intent is the logical culmination of the literature review, problem statement, and conceptual framework, and is a transition point between the Introduction and Methods sections of a medical education research report. Nonetheless, a systematic review of experimental research in medical education demonstrated that statements of study intent are absent in the majority of articles. 14 When reading a medical education research article where the statement of study intent is absent, it may be necessary to infer the research aim by gathering information from the Introduction and Methods sections. In these cases, it can be useful to identify the following key elements 6 , 14 , 17 :
- Population of interest/type of learner (e.g., pain medicine fellows or anesthesiology residents)
- Independent/predictor variable (e.g., educational intervention or characteristic of the learners)
- Dependent/outcome variable (e.g., intubation skills or knowledge of anesthetic agents)
- Relationship between the variables (e.g., “improve” or “mitigate”)
Occasionally, it may be difficult to differentiate the independent study variable from the dependent study variable. 17 For example, consider a study aiming to measure the relationship between burnout and personal debt among anesthesiology residents. Do the researchers believe burnout might lead to high personal debt, or that high personal debt may lead to burnout? This “chicken or egg” conundrum reinforces the importance of the conceptual framework, which, if present, should serve as an explanation or rationale for the predicted relationship between the study variables.
Research Methodology

Research methodology is the “…design or plan that shapes the methods to be used in a study.” 1 Essentially, methodology is the general strategy for answering a research question, whereas methods are the specific steps and techniques used to collect data and implement that strategy. Our objective here is to provide an overview of quantitative methodologies (i.e., approaches) in medical education research.
The choice of research methodology is made by balancing the approach that best answers the research question against the feasibility of completing the study. There is no perfect methodology because each has its own potential caveats, flaws and/or sources of bias. Before delving into an overview of the methodologies, it is important to highlight common sources of bias in education research. We use the term internal validity to describe the degree to which the findings of a research study represent “the truth,” as opposed to some alternative hypothesis or variables. 18 Table 3 18–20 provides a list of common threats to internal validity in medical education research, along with tactics to mitigate these threats.
Table 3. Threats to Internal Validity and Strategies to Mitigate Their Effects
Experimental Research
The fundamental tenet of experimental research is the manipulation of an independent or experimental variable to measure its effect on a dependent or outcome variable.
True Experiment
True experimental study designs minimize threats to internal validity by randomizing study subjects to experimental and control groups. By ensuring that any differences between groups, beyond the intervention or variable of interest, are purely due to chance, researchers reduce the internal validity threats related to subject characteristics, time-related maturation, and regression to the mean. 18,19
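Randomization is also simple to carry out in software. The following sketch (ours, not from the article; written in Python with a hypothetical resident roster) shows one way subjects might be randomly allocated to experimental and control groups:

```python
import random

def randomize(subjects, seed=42):
    """Randomly split a roster of study subjects into two equal-sized groups.

    With true randomization, any remaining between-group differences in subject
    characteristics should be attributable to chance alone.
    """
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible for auditing
    shuffled = list(subjects)  # copy so the original roster is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical roster of 20 first-year residents.
residents = [f"resident_{i:02d}" for i in range(1, 21)]
experimental, control = randomize(residents)
print("Experimental:", experimental)
print("Control:", control)
```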
Quasi-experiment
There are many instances in medical education where randomization may not be feasible or ethical. For instance, researchers wanting to test the effect of a new curriculum among medical students may not be able to randomize learners due to competing curricular obligations and schedules. In these cases, researchers may be forced to assign subjects to experimental and control groups based upon some other criterion beyond randomization, such as different classrooms or different sections of the same course. This process, called quasi-randomization, does not inherently lead to internal validity threats, as long as research investigators are mindful of measuring and controlling for extraneous variables between study groups. 19
Single-group Methodologies
Not all experimental study designs compare two or more groups. A common experimental study design in medical education research is the single-group pretest–posttest design, which compares a group of learners before and after the implementation of an intervention. 21 In essence, a single-group pre–post design compares an experimental group (i.e., postintervention) to a “no-intervention” control group (i.e., preintervention). 19 This study design is problematic for several reasons. Consider the following hypothetical example: A research article reports the effects of a year-long intubation curriculum for first-year anesthesiology residents. All residents participate in monthly, half-day workshops over the course of an academic year. The article reports a positive effect on residents’ skills as demonstrated by a significant improvement in intubation success rates at the end of the year when compared to the beginning.
This study does little to advance the science of learning among anesthesiology residents. While this hypothetical report demonstrates an improvement in residents’ intubation success before versus after the intervention, it does not tell us why the workshop worked, how it compares to other educational interventions, or how it fits into the broader picture of anesthesia training.
Single-group pre–post study designs open themselves to a myriad of threats to internal validity. 20 In our hypothetical example, the improvement in residents’ intubation skills may have been due to other educational experiences (i.e., implementation threat) and/or improvement in manual dexterity that occurred naturally with time (i.e., maturation threat), rather than the airway curriculum. Consequently, single-group pre–post studies should be interpreted with caution. 18
Repeated testing, before and after the intervention, is one strategy that can be used to reduce some of the inherent limitations of the single-group study design. Repeated pretesting can mitigate the effect of regression toward the mean, a statistical phenomenon whereby low pretest scores tend to move closer to the mean on subsequent testing (regardless of intervention). 20 Likewise, repeated posttesting at multiple time intervals can provide potentially useful information about the short- and long-term effects of an intervention (e.g., the “durability” of the gain in knowledge, skill, or attitude).
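Regression toward the mean is easy to see in a short simulation (all numbers below are invented for illustration): learners flagged by a low pretest score improve on an equally noisy retest even though no intervention occurred, simply because measurement noise does not repeat.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
true_ability = rng.normal(70, 8, n)             # each learner's stable underlying skill
pretest  = true_ability + rng.normal(0, 6, n)   # observed score = ability + measurement noise
posttest = true_ability + rng.normal(0, 6, n)   # retest with fresh, independent noise

low = pretest < np.percentile(pretest, 25)      # learners flagged by a low pretest score
print(f"Low scorers, pretest mean:  {pretest[low].mean():.1f}")
print(f"Low scorers, posttest mean: {posttest[low].mean():.1f}  (no intervention applied)")
```

The flagged group's mean rises on the posttest purely as a statistical artifact, which is exactly the effect a single-group pre–post study can mistake for a curriculum benefit.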
Observational Research
Unlike experimental studies, observational research does not involve manipulation of any variables. These studies often involve measuring associations, developing psychometric instruments, or conducting surveys.
Association Research
Association research seeks to identify relationships between two or more variables within a group or groups (correlational research), or similarities/differences between two or more existing groups (causal–comparative research). For example, correlational research might seek to measure the relationship between burnout and educational debt among anesthesiology residents, while causal–comparative research may seek to measure differences in educational debt and/or burnout between anesthesiology and surgery residents. Notably, association research may identify relationships between variables, but does not necessarily support a causal relationship between them.
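As an illustrative sketch (simulated data; not drawn from any cited study), the two forms of association research correspond to two familiar analyses, neither of which establishes causation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Correlational question: relationship between two variables within one group.
debt = rng.normal(200, 50, 120)                      # hypothetical educational debt ($1,000s)
burnout = 20 + 0.05 * debt + rng.normal(0, 4, 120)   # hypothetical burnout inventory score
r, p = stats.pearsonr(debt, burnout)
print(f"Correlational: r = {r:.2f}, P = {p:.3f}")

# Causal-comparative question: difference between two existing groups.
anesthesia_debt = rng.normal(190, 45, 60)   # hypothetical debt, anesthesiology residents
surgery_debt = rng.normal(210, 45, 60)      # hypothetical debt, surgery residents
t, p = stats.ttest_ind(anesthesia_debt, surgery_debt)
print(f"Causal-comparative: t = {t:.2f}, P = {p:.3f}")
```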
Psychometric and Survey Research
Psychometric instruments measure a psychologic or cognitive construct such as knowledge, satisfaction, beliefs, and symptoms. Surveys are one type of psychometric instrument, but many other types exist, such as evaluations of direct observation, written examinations, or screening tools. 22 Psychometric instruments are ubiquitous in medical education research and can be used to describe a trait within a study population ( e.g. , rates of depression among medical students) or to measure associations between study variables ( e.g. , association between depression and board scores among medical students).
Psychometric and survey research studies are prone to the internal validity threats listed in table 3, particularly those relating to mortality, location, and instrumentation. 18 Additionally, readers must ensure that the instrument scores can be trusted to truly represent the construct being measured. For example, suppose you encounter a research article demonstrating a positive association between attending physician teaching effectiveness, as measured by a survey of medical students, and the frequency with which the attending physician provides coffee and doughnuts on rounds. Can we be confident that this survey administered to medical students is truly measuring teaching effectiveness? Or is it simply measuring the attending physician’s “likability”? Issues related to measurement and the trustworthiness of data are described in detail in the following section on measurement and the related issues of validity and reliability.
Measurement

Measurement refers to “the assigning of numbers to individuals in a systematic way as a means of representing properties of the individuals.” 23 Research data can only be trusted insofar as we trust the measurement used to obtain the data. Measurement is of particular importance in medical education research because many of the constructs being measured (e.g., knowledge, skill, attitudes) are abstract and subject to measurement error. 24 This section highlights two specific issues related to the trustworthiness of data: the validity and reliability of measurements.
Validity

Validity regarding the scores of a measurement instrument “refers to the degree to which evidence and theory support the interpretations of the [instrument’s results] for the proposed use of the [instrument].” 25 In essence, do we believe the results obtained from a measurement really represent what we were trying to measure? Note that validity evidence for the scores of a measurement instrument is separate from the internal validity of a research study. Several frameworks for validity evidence exist. Table 4 2,22,26 presents the most commonly used framework, developed by Messick, 27 which identifies sources of validity evidence for the target construct from five main categories: content, response process, internal structure, relations to other variables, and consequences.
Table 4. Sources of Validity Evidence for Measurement Instruments
Reliability
Reliability refers to the consistency of scores for a measurement instrument. 22,25,28 For an instrument to be reliable, we would anticipate that two individuals rating the same object of measurement in a specific context would provide the same scores. 25 Further, if the scores for an instrument are reliable between raters of the same object of measurement, then we can extrapolate that any difference in scores between two objects represents a true difference across the sample, and is not due to random variation in measurement. 29 Reliability can be demonstrated through a variety of methods, such as internal consistency (e.g., Cronbach’s alpha), temporal stability (e.g., test–retest reliability), interrater agreement (e.g., intraclass correlation coefficient), and generalizability theory (e.g., generalizability coefficient). 22,29
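To make one of these statistics concrete, the following is a minimal sketch that computes Cronbach's alpha directly from its textbook definition, using an invented matrix of respondent-by-item scores:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency reliability for a (respondents x items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item across respondents
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item instrument completed by six respondents (1-5 ratings).
scores = np.array([
    [4, 4, 5, 4, 4],
    [2, 3, 2, 2, 3],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 4, 3],
    [1, 2, 1, 2, 2],
    [4, 5, 4, 4, 5],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```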
Example of a Validity and Reliability Argument
This section provides an illustration of validity and reliability in medical education. We use the signaling questions outlined in table 4 to make a validity and reliability argument for the Harvard Assessment of Anesthesia Resident Performance (HARP) instrument. 7 The HARP was developed by Blum et al. to measure the performance of anesthesia trainees that is required to provide safe anesthetic care to patients. According to the authors, the HARP is designed to be used “…as part of a multiscenario, simulation-based assessment” of resident performance. 7
Content Validity: Does the Instrument’s Content Represent the Construct Being Measured?
To demonstrate content validity, instrument developers should describe the construct being measured and how the instrument was developed, and justify their approach. 25 The HARP is intended to measure resident performance in the critical domains required to provide safe anesthetic care. As such, investigators note that the HARP items were created through a two-step process. First, the instrument’s developers interviewed anesthesiologists with experience in resident education to identify the key traits needed for successful completion of anesthesia residency training. Second, the authors used a modified Delphi process to synthesize the responses into five key behaviors: (1) formulate a clear anesthetic plan, (2) modify the plan under changing conditions, (3) communicate effectively, (4) identify performance improvement opportunities, and (5) recognize one’s limits. 7 , 30
Response Process Validity: Are Raters Interpreting the Instrument Items as Intended?
In the case of the HARP, the developers included a scoring rubric with behavioral anchors to ensure that faculty raters could clearly identify how resident performance in each domain should be scored. 7
Internal Structure Validity: Do Instrument Items Measuring Similar Constructs Yield Homogenous Results? Do Instrument Items Measuring Different Constructs Yield Heterogeneous Results?
Item correlations for the HARP demonstrated a high degree of correlation between some items (e.g., formulating a plan and modifying the plan under changing conditions) and a lower degree of correlation between other items (e.g., formulating a plan and identifying performance improvement opportunities). 30 This finding is expected, since the items within the HARP are designed to assess separate performance domains, and we would expect residents’ functioning to vary across domains.
Relationship to Other Variables’ Validity: Do Instrument Scores Correlate with Other Measures of Similar or Different Constructs as Expected?
As it applies to the HARP, one would expect that the performance of anesthesia residents will improve over the course of training. Indeed, HARP scores were found to be generally higher among third-year residents compared to first-year residents. 30
Consequence Validity: Are Instrument Results Being Used as Intended? Are There Unintended or Negative Uses of the Instrument Results?
While investigators did not intentionally seek out consequence validity evidence for the HARP, unanticipated consequences of HARP scores were identified by the authors as follows:
“Data indicated that CA-3s had a lower percentage of worrisome scores (rating 2 or lower) than CA-1s… However, it is concerning that any CA-3s had any worrisome scores…low performance of some CA-3 residents, albeit in the simulated environment, suggests opportunities for training improvement.” 30
That is, using the HARP to measure the performance of CA-3 anesthesia residents had the unintended consequence of identifying the need for improvement in resident training.
Reliability: Are the Instrument’s Scores Reproducible and Consistent between Raters?
The HARP was applied by two raters for every resident in the study across seven different simulation scenarios. The investigators conducted a generalizability study of HARP scores to estimate the variance in assessment scores that was due to the resident, the rater, and the scenario. They found little variance was due to the rater ( i.e. , scores were consistent between raters), indicating a high level of reliability. 7
Sampling

Sampling refers to the selection of research subjects (i.e., the sample) from a larger group of eligible individuals (i.e., the population). 31 Effective sampling leads to the inclusion of research subjects who represent the larger population of interest. Alternatively, ineffective sampling may lead to the selection of research subjects who are significantly different from the target population. Imagine that researchers want to explore the relationship between burnout and educational debt among pain medicine specialists. The researchers distribute a survey to 1,000 pain medicine specialists (the population), but only 300 individuals complete the survey (the sample). This result is problematic because the characteristics of the individuals who completed the survey may be fundamentally different from those of the entire population of pain medicine specialists. It is possible that the 300 study subjects were experiencing more burnout and/or debt, and thus were more motivated to complete the survey. Alternatively, the 700 nonresponders might have been too busy to respond and even more burned out than the 300 responders, which would suggest that the true effect was even larger than the study observed.
When evaluating a medical education research article, it is important to identify the sampling technique the researchers employed, how it might have influenced the results, and whether the results apply to the target population. 24
Sampling Techniques
Sampling techniques generally fall into two categories: probability- or nonprobability-based. Probability-based sampling ensures that each individual within the target population has an equal opportunity of being selected as a research subject. Most commonly, this is done through random sampling, which should lead to a sample of research subjects that is similar to the target population. If significant differences between sample and population exist, those differences should be due to random chance, rather than systematic bias. The difference between data from a random sample and that from the population is referred to as sampling error. 24
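A brief simulation (with an invented population) illustrates what sampling error looks like in practice: the means of repeated random samples scatter narrowly around the true population mean, and that chance scatter, unlike systematic bias, shrinks as sample size grows.

```python
import numpy as np

rng = np.random.default_rng(7)
population = rng.normal(55, 10, 10_000)  # hypothetical burnout scores for the full population

# Draw 1,000 random samples of 300 subjects each and record the sample means.
sample_means = [rng.choice(population, size=300, replace=False).mean() for _ in range(1_000)]
print(f"Population mean:           {population.mean():.2f}")
print(f"Random-sample means range: {min(sample_means):.2f} to {max(sample_means):.2f}")
```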
Nonprobability-based sampling involves selecting research participants such that inclusion of some individuals may be more likely than the inclusion of others. 31 Convenience sampling is one such example and involves selection of research subjects based upon ease or opportuneness. Convenience sampling is common in medical education research but, as outlined in the example at the beginning of this section, it can lead to sampling bias. 24 When evaluating an article that uses nonprobability-based sampling, it is important to look for the participation/response rate. In general, a participation rate of less than 75% should be viewed with skepticism. 21 Additionally, it is important to determine whether characteristics of participants and nonparticipants were reported and whether significant differences between the two groups exist.
Data Analysis and Interpretation

Interpreting medical education research requires a basic understanding of common ways in which quantitative data are analyzed and displayed. In this section, we highlight two broad topics that are of particular importance when evaluating research articles.
The Nature of the Measurement Variable
Measurement variables in quantitative research generally fall into three categories: nominal, ordinal, or interval. 24 Nominal variables (sometimes called categorical variables) involve data that can be placed into discrete categories without a specific order or structure. Examples include sex (male or female) and professional degree (M.D., D.O., M.B.B.S., etc.), where there is no clear hierarchical order to the categories. Ordinal variables can be ranked according to some criterion, but the spacing between categories may not be equal. Examples of ordinal variables include measurements of satisfaction (satisfied vs. unsatisfied), agreement (disagree vs. agree), and educational experience (medical student, resident, fellow). As it applies to educational experience, it is noteworthy that even though education can be quantified in years, the spacing between years (i.e., educational “growth”) remains unequal. For instance, the difference in performance between second- and third-year medical students is dramatically different from that between third- and fourth-year medical students. Interval variables can also be ranked according to some criterion, but, unlike ordinal variables, the spacing between variable categories is equal. Examples of interval variables include test scores and salary. However, the conceptual boundaries between these measurement variables are not always clear, as in the case where ordinal scales can be assumed to have the properties of an interval scale, so long as the data’s distribution is not substantially skewed. 32
Understanding the nature of the measurement variable is important when evaluating how the data are analyzed and reported. Medical education research commonly uses measurement instruments with items that are rated on Likert-type scales, whereby the respondent is asked to assess their level of agreement with a given statement. The response is often translated into a corresponding number (e.g., 1 = strongly disagree, 3 = neutral, 5 = strongly agree). Notably, scores from Likert-type scales are sometimes not normally distributed (i.e., are skewed toward one end of the scale), indicating that the spacing between scores is unequal and the variable is ordinal in nature. In these cases, it is recommended to report results as frequencies or medians, rather than means and SDs. 33
Consider an article evaluating medical students’ satisfaction with a new curriculum. Researchers measure satisfaction using a Likert-type scale (1 = very unsatisfied, 2 = unsatisfied, 3 = neutral, 4 = satisfied, 5 = very satisfied). A total of 20 medical students evaluate the curriculum, 10 of whom rate their satisfaction as “satisfied,” and 10 of whom rate it as “very satisfied.” In this case, it does not make much sense to report an average score of 4.5; it makes more sense to report results in terms of frequency (e.g., half of the students were “very satisfied” with the curriculum, and half were “satisfied”).
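The sketch below works through this same hypothetical example, showing how a frequency summary conveys the ordinal ratings more faithfully than the mean:

```python
from collections import Counter
from statistics import mean

labels = {1: "very unsatisfied", 2: "unsatisfied", 3: "neutral",
          4: "satisfied", 5: "very satisfied"}
ratings = [4] * 10 + [5] * 10   # the 20 hypothetical student ratings from the example above

print(f"Mean = {mean(ratings):.1f}  (a score no student actually gave)")
for score, count in sorted(Counter(ratings).items()):
    print(f"{labels[score]:>16}: {count}/{len(ratings)} students")
```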
Effect Size and CIs
In medical education, as in other research disciplines, it is common to report statistically significant results (i.e., small P values) in order to increase the likelihood of publication. 34,35 However, a significant P value does not in itself represent the educational impact of the study results. A statement like “Intervention x was associated with a significant improvement in learners’ intubation skill compared to education intervention y (P < 0.05)” tells us that there was a less than 5% chance that the difference in improvement between interventions x and y was due to chance. Yet that does not mean that the study intervention necessarily caused the nonchance results, or indicate whether the between-group difference is educationally significant. Therefore, readers should consider looking beyond the P value to effect size and/or CI when interpreting the study results. 36,37
Effect size is “the magnitude of the difference between two groups,” which helps to quantify the educational significance of the research results. 37 Common measures of effect size include Cohen’s d (standardized difference between two means), risk ratio (comparison of binary outcomes between two groups), and Pearson’s r correlation (linear relationship between two continuous variables). 37 CIs represent “a range of values around a sample mean or proportion” and are a measure of precision. 31 While effect size and CIs give more useful information than simple statistical significance, they are commonly omitted from medical education research articles. 35 In such instances, readers should be wary of overinterpreting a P value in isolation. For further information on effect size and CIs, we direct readers to the work of Sullivan and Feinn 37 and Hulley et al. 31
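As a worked sketch (with simulated scores standing in for real study data), the following computes Cohen's d and a 95% CI for the difference between two hypothetical groups:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_x = rng.normal(78, 10, 40)   # hypothetical intubation-skill scores, intervention x
group_y = rng.normal(72, 10, 40)   # hypothetical scores, intervention y

# Cohen's d: difference in means divided by the pooled standard deviation.
nx, ny = len(group_x), len(group_y)
pooled_sd = np.sqrt(((nx - 1) * group_x.var(ddof=1) +
                     (ny - 1) * group_y.var(ddof=1)) / (nx + ny - 2))
d = (group_x.mean() - group_y.mean()) / pooled_sd

# 95% CI for the difference in means, based on the t distribution.
diff = group_x.mean() - group_y.mean()
se = pooled_sd * np.sqrt(1 / nx + 1 / ny)
low, high = stats.t.interval(0.95, nx + ny - 2, loc=diff, scale=se)

print(f"Cohen's d = {d:.2f}; difference = {diff:.1f} points, 95% CI ({low:.1f}, {high:.1f})")
```

A reader given d and the CI can judge both how large the difference is and how precisely it was estimated, which a bare P value cannot convey.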
Tools for Evaluating the Quality of Medical Education Research

In this final section, we identify instruments that can be used to evaluate the quality of quantitative medical education research articles. To this point, we have focused on framing the study, research methodologies, and potential pitfalls to consider when appraising a specific article. This is important because how a study is framed and the choice of methodology require some subjective interpretation. Fortunately, there are several instruments available for evaluating medical education research methods, providing a structured approach to the evaluation process.
The Medical Education Research Study Quality Instrument (MERSQI) 21 and the Newcastle Ottawa Scale-Education (NOS-E) 38 are two commonly used instruments, both of which have an extensive body of validity evidence to support the interpretation of their scores. Table 5 21,39 provides more detail regarding the MERSQI, which includes evaluation of study design, sampling, data type, validity, data analysis, and outcomes. We have found that applying the MERSQI to manuscripts, articles, and protocols has intrinsic educational value, because this practice familiarizes MERSQI users with fundamental principles of medical education research. One aspect of the MERSQI that deserves special mention is the section on evaluating outcomes based on Kirkpatrick’s widely recognized hierarchy of reaction, learning, behavior, and results (table 5; figure). 40 Validity evidence for MERSQI scores includes operational definitions to improve response process; excellent reliability and internal consistency; high correlation with other measures of study quality, likelihood of publication, and citation rate; and an association between MERSQI score and the likelihood of study funding. 21,41 Additionally, consequence validity for MERSQI scores has been demonstrated by the instrument’s utility for identifying and disseminating high-quality research in medical education. 42
Figure. Kirkpatrick’s hierarchy of outcomes as applied to education research. Reaction = Level 1, Learning = Level 2, Behavior = Level 3, Results = Level 4. Outcomes become more meaningful, yet more difficult to achieve, when progressing from Level 1 through Level 4. Adapted with permission from Beckman and Cook, 2007. 2
Table 5. The Medical Education Research Study Quality Instrument for Evaluating the Quality of Medical Education Research
The NOS-E is a newer tool for evaluating the quality of medical education research. It was developed as a modification of the Newcastle-Ottawa Scale 43 for appraising the quality of nonrandomized studies. The NOS-E includes items focusing on the representativeness of the experimental group, selection and compatibility of the control group, missing data/study retention, and blinding of outcome assessors. 38,39 Additional validity evidence for NOS-E scores includes operational definitions to improve response process, excellent reliability and internal consistency, and correlation with other measures of study quality. 39 Notably, the complete NOS-E, along with its scoring rubric, can be found in the article by Cook and Reed. 39
A recent comparison of the MERSQI and NOS-E found acceptable interrater reliability and good correlation between the two instruments. 39 However, notable differences exist between the MERSQI and NOS-E. Specifically, the MERSQI may be applied to a broad range of study designs, including experimental and cross-sectional research. Additionally, the MERSQI addresses issues related to measurement validity and data analysis, and places emphasis on educational outcomes. On the other hand, the NOS-E focuses specifically on experimental study designs, and on issues related to sampling techniques and outcome assessment. 39 Ultimately, the MERSQI and NOS-E are complementary tools that may be used together when evaluating the quality of medical education research.
Conclusions
This article provides an overview of quantitative research in medical education, underscores the main components of education research, and provides a general framework for evaluating research quality. We highlighted the importance of framing a study with respect to purpose, conceptual framework, and statement of study intent. We reviewed the most common research methodologies, along with threats to the validity of a study and its measurement instruments. Finally, we identified two complementary instruments, the MERSQI and NOS-E, for evaluating the quality of a medical education research study.
Bordage G: Conceptual frameworks to illuminate and magnify. Medical Education 2009; 43:312–9.
Cook DA, Beckman TJ: Current concepts in validity and reliability for psychometric instruments: Theory and application. American Journal of Medicine 2006; 119:166.e7–16.
Fraenkel JR, Wallen NE, Hyun HH: How to Design and Evaluate Research in Education, 9th edition. New York, McGraw-Hill Education, 2015.
Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB: Designing Clinical Research, 4th edition. Philadelphia, Lippincott Williams & Wilkins, 2011.
Irby BJ, Brown G, Lara-Alecio R, Jackson S: The Handbook of Educational Theories. Charlotte, Information Age Publishing, 2015.
Standards for Educational and Psychological Testing. American Educational Research Association and American Psychological Association, 2014.
Swanwick T: Understanding Medical Education: Evidence, Theory and Practice, 2nd edition. Wiley-Blackwell, 2013.
Sullivan GM, Artino AR Jr: Analyzing and interpreting data from Likert-type scales. Journal of Graduate Medical Education 2013; 5:541–2.
Sullivan GM, Feinn R: Using effect size—or why the P value is not enough. Journal of Graduate Medical Education 2012; 4:279–82.
Tavakol M, Sandars J: Quantitative and qualitative methods in medical education research: AMEE Guide No 90: Part II. Medical Teacher 2014; 36:838–48.
Support was provided solely from institutional and/or departmental sources.
The authors declare no competing interests.
Appraising Quantitative Research in Health Education: Guidelines for Public Health Educators
Leonard Jack Jr., PhD, MSc, CHES; Sandra C. Hayes, MPH; Jeanfreau G. Scharalda, DNS, FNP; Barbara Stetson, PhD; Nkenge H. Jones-Jack, MPH; Matthew Valliere, MPA; William R. Kirchain, PharmD, CDE; Michael Fagen, PhD, MPH; Cris LeBlanc
Many practicing health educators do not feel they possess the skills necessary to critically appraise quantitative research. This publication is designed to provide practicing health educators with basic tools that facilitate a better understanding of quantitative research. This article describes the major components of quantitative research: the title, introduction, methods, analyses, results, and discussion sections. Readers will be introduced to information on the various types of study designs and seven key questions health educators can use to facilitate the appraisal process. Upon reading, health educators will be in a better position to determine whether research studies are well designed and executed.
Keywords: health education, quantitative research, study designs, research methods
Appraising the Quality of Quantitative Research in Health Education
Practicing health educators often find themselves with little time to read published research in great detail. Some health educators with limited time to read scientific papers may get frustrated as they get bogged down trying to understand research terminology, methods, and approaches. The purpose of appraising a scientific publication is to assess whether the study’s research questions (hypotheses), methods, and results (findings) are sufficiently valid to produce useful information (Fowkes and Fulton, 1991; Donnelly, 2004; Greenhalgh and Taylor, 1997; Johnson and Onwuegbuzie, 2004; Greenhalgh, 1997; Yin, 2003; and Hennekens and Buring, 1987). Having the ability to deconstruct and reconstruct scientific publications is a critical skill in a results-oriented environment linked to increasing demands and expectations for improved program outcomes and strong justifications for program focus and direction. Health educators must not rely solely on the opinions of researchers, but, rather, increase their confidence in their own abilities to discern the quality of published scientific research. Health educators with little experience reading and appraising scientific publications may find this task less difficult if they: 1) become more familiar with the key components of a research publication, and 2) utilize the questions presented in this article to critically appraise the strengths and weaknesses of published research.
Key Components of a Scientific Research Publication
The key components of a research publication should provide the information needed to assess the strengths and weaknesses of the research. Key components typically include the publication title, abstract, introduction, research methods used to address the research question(s) or hypothesis, statistical analysis, results, and the researcher’s interpretation and conclusion or recommended use of results to inform future research or practice. A brief description of these components follows:
Publication Title
A general heading or description should provide immediate insight into the intent of the research. Titles may include information regarding the focus of the research, population or target audience being studied, and study design.
Abstract

An abstract provides the reader with a brief description of the overall research, how it was done, statistical techniques employed, key results, and relevant implications or recommendations.
Introduction
This section elaborates on the content mentioned in the abstract and provides a better idea of what to anticipate in the manuscript. The introduction provides a succinct presentation of previously published literature, thus offering a purpose (rationale) for the study.
Methods

This component of the publication provides critical information on the type of research methods used to conduct the study. Common examples of study designs used to conduct quantitative research include the cross-sectional study, cohort study, case-control study, and controlled trial. The methods section should contain information on the inclusion and exclusion criteria used to identify participants in the study.
Statistical Analysis

Quantitative data contain information that is quantifiable, often gathered through surveys and analyzed using statistical tests to determine whether the results happened by chance. Two types of statistical analyses are used: descriptive and inferential (Johnson and Onwuegbuzie, 2004). Descriptive statistics are used to describe the basic features of the study data and provide simple summaries about the sample and measures. With inferential statistics, researchers are trying to reach conclusions that extend beyond the immediate data alone. Thus, they use inferential statistics to make inferences from the data to more general conditions.
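A minimal sketch (invented scores) of the distinction: the first summary merely describes the sample, while the t test draws an inference about the wider population the sample came from.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
scores = rng.normal(72, 9, 50)   # hypothetical knowledge-test scores for one study sample

# Descriptive statistics: summarize the sample itself.
print(f"n = {len(scores)}, mean = {scores.mean():.1f}, SD = {scores.std(ddof=1):.1f}")

# Inferential statistics: test whether the population this sample was drawn
# from differs from a benchmark score of 70.
t, p = stats.ttest_1samp(scores, popmean=70)
print(f"One-sample t test vs. 70: t = {t:.2f}, P = {p:.3f}")
```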
Results

This section presents the reader with the researcher’s data and the results of the statistical analyses described in the methods section. Thus, this section must align closely with the methods section.
Discussion (Conclusion)
This section should explain what the data mean, thereby summarizing the main results and findings for the reader. Important limitations (such as the use of a non-random sample, the absence of a control group, and a short duration of the intervention) should be discussed. Researchers should discuss how each limitation can impact the applicability and use of study results. This section also presents recommendations on ways the study can help advance future health education and practice.
Critically Appraising the Strengths and Weaknesses of Published Research
During careful reading of the analysis, results, and discussion (conclusion) sections, what key questions might you ask yourself in order to critically appraise the strengths and weaknesses of the research? Based on a careful review of the literature (Greenhalgh and Taylor, 1997; Greenhalgh, 1997; and Hennekens and Buring, 1987) and our research experiences, we have identified seven key questions around which to guide your assessment of quantitative research.
1) Is a study design identified and appropriately applied?
Study designs refer to the methodology used to investigate a particular health phenomenon. Becoming familiar with the various study designs will help prepare you to critically assess whether a design was applied appropriately to answer the research questions (or hypotheses). As mentioned previously, common examples of study designs frequently used to conduct quantitative research include the cross-sectional study, cohort study, case-control study, and controlled trial. A brief description of each can be found in Table 1.
Table 1. Definitions of Study Designs
2) Is the study sample representative of the group from which it is drawn?
The study sample must be representative of the group from which it is drawn. The study sample must therefore be typical of the wider target audience to whom the research might apply. Addressing whether the study sample is representative of the group from which it is drawn will require the researcher to take into consideration the sampling method and sample size.
Sampling Method
Many sampling methods are used individually or in combination. Keep in mind that sampling methods are divided into two categories: probability sampling and non-probability sampling ( Last, 2001 ). Probability sampling (also called random sampling) is any sampling scheme in which the probability of choosing each individual is the same (or at least known, so it can be readjusted mathematically to be equal). Non-probability sampling is any sampling scheme in which the probability of an individual being chosen is unknown. Typically, researchers should offer a rationale for utilizing non-probability sampling, and when utilized, be aware of its limitations. For example, use of a convenience sample (choosing individuals in an unstructured manner) can be justified when collecting pilot data around which future studies employing more rigorous sampling methods will be utilized.
Sample Size
Established statistical theories and formulas are used to generate sample size calculations: the recommended number of individuals necessary to have sufficient power to detect meaningful results at a certain level of statistical significance. In the methods section, look for a statement or two confirming whether steps were taken to obtain the appropriate sample size.
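As one concrete illustration, here is a minimal sketch of the standard normal-approximation formula for comparing two group means; real studies may use different formulas depending on the design and outcome type:

```python
import math
from scipy import stats

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Sample size per group to detect a true difference in means `delta`, given a
    common standard deviation `sd`, two-sided significance level `alpha`, and power:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = stats.norm.ppf(power)            # 0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

# Hypothetical planning numbers: detect a 5-point difference when SD is 10.
print(n_per_group(delta=5, sd=10))   # about 63 participants per group
```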
3) In research studies using a control group, is this group adequate for the purpose of the study?
Source of Controls
In case-control and cohort studies, the source of controls should be such that the distribution of characteristics not under investigation is similar to that in the cases or study cohort.
In case-control studies, both cases and controls are often matched on certain characteristics such as age, sex, income, and race. The criteria used for including and excluding study participants must be adequately described and examined carefully. Inclusion and exclusion criteria may include: ethnicity, age of diagnosis, length of time living with a health condition, geographic location, and presence or absence of complications. You should critically assess whether matching across these characteristics actually occurred.
4) What is the validity of measurements and outcomes identified in the study?
Validity is the extent to which a measurement captures what it claims to measure. This might take the form of questions contained on a survey, questionnaire, or instrument. Researchers should address one or more of the following types of validity: face, content, criterion-related, and construct (Last, 2001; Trochim and Donnelly, 2008).
Face validity
Face validity assures that, upon examination, the variable of interest can measure what it intends to measure. If the researcher has chosen to study a variable that has not been studied before, he/she usually will need to start with face validity.
Content validity
Content validity involves comparing the content of the measurement technique to the known literature on the topic and validating the fact that the tool (e.g., survey, questionnaire) does represent the literature accurately.
Criterion-related validity
Criterion-related validity involves making sure that the measures within a survey, when tested, prove effective in predicting criteria or indicators of a construct.
Construct validity
Construct validity deals with the validation of the construct that underlies the research. Here, researchers test the theory that underlies the hypothesis or research question.
5) To what extent is a common source of bias, lack of blinding, taken into account?

During data collection, a common source of bias is that subjects and/or those collecting the data are not blind to the purpose of the research. This can be the result of researchers going the extra mile to make sure those in the experimental group benefit from the intervention (Fowkes and Fulton, 1991). Inadequate blinding can be a problem in studies utilizing all types of study designs. While total blinding is not always possible, it is essential to appraise whether steps were taken to ensure blinding where feasible.
6) To what extent is the study considered complete with regard to drop outs and missing data?
Regardless of the study design employed, one must assess not only the proportion of drop outs in each group, but also why participants dropped out. This may point to possible bias, as well as indicate what efforts were taken to retain participants in the study.
Missing data
Despite the fact that missing data are a part of almost all research, they should still be appraised. There are several reasons why data may be missing. The nature and extent of the missing data should be explained.
7) To what extent are study results influenced by factors that negatively impact their credibility?
Contamination
In research studies comparing the effectiveness of a structured intervention, contamination occurs when the control group makes changes based on learning what those participating in the intervention are doing. Despite the fact that researchers typically do not report the extent to which contamination occurs, you should nevertheless try to assess whether contamination negatively impacted the credibility of study results.
Confounding factors
A confounding factor is a variable that is related to one or more of the measurements (measures or variables) defined in a study. A confounding factor may mask an actual association or falsely demonstrate an apparent association between the study variables where no real association exists. If confounding factors are not measured and considered, study results may be biased and compromised.
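A small simulation (all variables invented) shows the second failure mode: a confounder can manufacture an apparent association between two variables that have no direct relationship, and adjusting for it makes the spurious association vanish.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n = 500
age = rng.uniform(30, 70, n)                            # the confounder
exercise = 10 - 0.1 * age + rng.normal(0, 1, n)         # older subjects exercise less
blood_pressure = 90 + 0.8 * age + rng.normal(0, 5, n)   # BP rises with age; exercise has no effect here

# Crude analysis: exercise appears strongly "protective," but only through age.
r, p = stats.pearsonr(exercise, blood_pressure)
print(f"Crude correlation (exercise vs. BP): r = {r:.2f}, P = {p:.3g}")

def residuals(y, x):
    """Remove x's linear effect from y, leaving what x cannot explain."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Adjusted analysis: correlate the parts of each variable not explained by age.
r_adj, p_adj = stats.pearsonr(residuals(exercise, age), residuals(blood_pressure, age))
print(f"Age-adjusted (partial) correlation:  r = {r_adj:.2f}, P = {p_adj:.3g}")
```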
The guidelines and questions presented in this article are by no means exhaustive. However, when applied, they can help health education practitioners obtain a deeper understanding of the quality of published research. While no study is 100% perfect, we do encourage health education practitioners to pause before taking researchers at their word that study results are both accurate and impressive. If you find yourself answering ‘no’ to a majority of the key questions provided, then it is probably safe to say that, from your perspective, the quality of the research is questionable.
Over time, as you repeatedly apply the guidelines presented in this article, you will become more confident and interested in reading research publications from beginning to end. While this article is geared to health educators, it can help anyone interested in learning how to appraise published research. Table 2 lists additional reading resources that can help improve one’s understanding and knowledge of quantitative research. This article and the reading resources identified in Table 2 can serve as useful tools to frame informative conversations with your peers regarding the strengths and weaknesses of published quantitative research in health education.
Publications on How to Read, Write and Appraise Quantitative Research
Contributor Information
Leonard Jack, Jr., Email: [email protected], Associate Dean for Research and Endowed Chair of Minority Health Disparities, College of Pharmacy, Xavier University of Louisiana, 1 Drexel Drive, New Orleans, Louisiana 70125; Telephone: 504-520-5345; Fax: 504-520-7971.
Sandra C. Hayes, Email: [email protected], Central Mississippi Area Health Education Center, 350 West Woodrow Wilson, Suite 3320, Jackson, MS 39213; Telephone: 601-987-0272; Fax: 601-815-5388.
Jeanfreau G. Scharalda, Email: [email protected], Louisiana State University Health Sciences Center School of Nursing, 1900 Gravier Street, New Orleans, Louisiana 70112; Telephone: 504-568-4140; Fax: 504-568-5853.
Barbara Stetson, Email: [email protected], Department of Psychological and Brain Sciences, 317 Life Sciences Building, University of Louisville, Louisville, KY 40292; Telephone: 502-852-2540; Fax: 502-852-8904.
Nkenge H. Jones-Jack, Email: [email protected], Epidemiologist & Evaluation Consultant, Metairie, Louisiana 70002. Telephone: 678-524-1147; Fax: 504-267-4080.
Matthew Valliere, Email: [email protected], Chronic Disease Prevention and Control, Bureau of Primary Care and Rural Health, Office of the Secretary, 628 North 4th Street, Baton Rouge, LA 70821-3118; Telephone: 225-342-2655; Fax: 225-342-2652.
William R. Kirchain, Email: [email protected], Division of Clinical and Administrative Sciences, College of Pharmacy, Xavier University of Louisiana, 1 Drexel Drive, Room 121, New Orleans, Louisiana 70125; Telephone: 504-520-5395; Fax: 504-520-7971.
Michael Fagen, Email: [email protected], Co-Associate Editor for the Evaluation and Practice section of Health Promotion Practice , Department of Community Health Sciences, School of Public Health, University of Illinois at Chicago, 1603 W. Taylor St., M/C 923, Chicago, IL 60608-1260, Telephone: 312-355-0647; Fax: 312-996-3551.
Cris LeBlanc, Centers of Excellence Scholar, College of Pharmacy, Xavier University of Louisiana, 1 Drexel Drive, New Orleans, Louisiana 70125; Telephone: 504-520-5345; Fax: 504-520-7971.
- Fowkes FG, Fulton PM. Critical appraisal of published research: introductory guidelines. British Medical Journal. 1991;302:1136–1140. doi:10.1136/bmj.302.6785.1136.
- Donnelly RA. The Complete Idiot's Guide to Statistics. New York, NY: Alpha Books; 2004. pp. 6–7.
- Greenhalgh T, Taylor R. How to read a paper: Papers that go beyond numbers (qualitative research). British Medical Journal. 1997;315:740–743. doi:10.1136/bmj.315.7110.740.
- Greenhalgh T. How to read a paper: Assessing the methodological quality of published papers. British Medical Journal. 1997;315:305–308. doi:10.1136/bmj.315.7103.305.
- Johnson RB, Onwuegbuzie AJ. Mixed methods research: A research paradigm whose time has come. Educational Researcher. 2004;33:14–26.
- Hennekens CH, Buring JE. Epidemiology in Medicine. Boston, MA: Little, Brown and Company; 1987. pp. 106–108.
- Last JM. A Dictionary of Epidemiology. 4th ed. New York, NY: Oxford University Press; 2001.
- Trochim WM, Donnelly J. Research Methods Knowledge Base. 3rd ed. Mason, OH: Atomic Dog; 2008. pp. 6–8.
Quantitative research
Affiliation.
- 1 Faculty of Health and Social Care, University of Hull, Hull, England.
- PMID: 25828021
- DOI: 10.7748/ns.29.31.44.e8681
This article describes the basic tenets of quantitative research. The concepts of dependent and independent variables are addressed and the concept of measurement and its associated issues, such as error, reliability and validity, are explored. Experiments and surveys – the principal research designs in quantitative research – are described and key features explained. The importance of the double-blind randomised controlled trial is emphasised, alongside the importance of longitudinal surveys, as opposed to cross-sectional surveys. Essential features of data storage are covered, with an emphasis on safe, anonymous storage. Finally, the article explores the analysis of quantitative data, considering what may be analysed and the main uses of statistics in analysis.
Keywords: Experiments; measurement; nursing research; quantitative research; reliability; surveys; validity.
MeSH terms
- Biomedical Research / methods*
- Double-Blind Method
- Evaluation Studies as Topic
- Longitudinal Studies
- Randomized Controlled Trials as Topic
- United Kingdom
- How to appraise quantitative research
This article has a correction. Please see:
- Correction: How to appraise quantitative research - April 01, 2019
- Xabi Cathala 1,
- Calvin Moorley 2
- 1 Institute of Vocational Learning, School of Health and Social Care, London South Bank University, London, UK
- 2 Nursing Research and Diversity in Care, School of Health and Social Care, London South Bank University, London, UK
- Correspondence to Mr Xabi Cathala, Institute of Vocational Learning, School of Health and Social Care, London South Bank University, London, UK; cathalax{at}lsbu.ac.uk and Dr Calvin Moorley, Nursing Research and Diversity in Care, School of Health and Social Care, London South Bank University, London SE1 0AA, UK; Moorleyc{at}lsbu.ac.uk
https://doi.org/10.1136/eb-2018-102996
Introduction
Some nurses feel that they lack the necessary skills to read a research paper and decide whether to implement its findings in their practice. This is particularly the case with quantitative research, which often contains the results of statistical testing. However, nurses have a professional responsibility to critique research to improve their practice, care and patient safety. 1 This article provides a step-by-step guide on how to critically appraise a quantitative paper.
Title, keywords and the authors
The authors’ names may not mean much, but knowing the following will be helpful:
Their position, for example, academic, researcher or healthcare practitioner.
Their qualifications, both professional (eg, nurse or physiotherapist) and academic (eg, degree, masters, doctorate).
This can indicate how the research has been conducted and the authors’ competence on the subject. Basically, do you want to read a paper on quantum physics written by a plumber?
The abstract
The abstract is a summary of the article and should contain:
Introduction.
Research question/hypothesis.
Methods including sample design, tests used and the statistical analysis (of course! Remember we love numbers).
Main findings.
Conclusion.
The subheadings in the abstract will vary depending on the journal. An abstract should not usually be more than 300 words but this varies depending on specific journal requirements. If the above information is contained in the abstract, it can give you an idea about whether the study is relevant to your area of practice. However, before deciding if the results of a research paper are relevant to your practice, it is important to review the overall quality of the article. This can only be done by reading and critically appraising the entire article.
The introduction
The introduction should state the research question and the hypothesis to be tested. Example: the effect of paracetamol on levels of pain.
My hypothesis is that A has an effect on B, for example, paracetamol has an effect on levels of pain.
My null hypothesis is that A has no effect on B, for example, paracetamol has no effect on pain.
My study will test the null hypothesis. If the null hypothesis is not rejected, there is insufficient evidence that A has an effect on B; in this example, we cannot conclude that paracetamol affects the level of pain. If the null hypothesis is rejected, the data support the hypothesis that A has an effect on B; that is, paracetamol has an effect on the level of pain.
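To make this concrete, the sketch below runs the paracetamol example as a two-sample t-test in Python. The pain scores are simulated purely for illustration; they are not data from the article.

```python
# Simulated two-group comparison for the paracetamol example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pain_paracetamol = rng.normal(4.0, 1.5, 40)  # hypothetical 0-10 pain scores
pain_placebo = rng.normal(5.0, 1.5, 40)

t_stat, p_value = stats.ttest_ind(pain_paracetamol, pain_placebo)
if p_value < 0.05:
    print("Reject the null hypothesis: evidence of an effect on pain.")
else:
    print("Fail to reject the null: insufficient evidence of an effect.")
```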
Background/literature review
The literature review should include reference to recent and relevant research in the area. It should summarise what is already known about the topic and why the research study is needed and state what the study will contribute to new knowledge. 5 The literature review should be up to date, usually 5–8 years, but it will depend on the topic and sometimes it is acceptable to include older (seminal) studies.
Methodology
In quantitative studies, the data analysis varies with the type of design used; descriptive, correlational and experimental studies all differ. A descriptive study describes the pattern of a topic in relation to one or more variables. 6 A correlational study examines the link (correlation) between two variables 7 and how one variable responds to a change in another. In experimental studies, the researchers manipulate variables and examine outcomes; 8 the sample is commonly assigned to different groups (known as randomisation) to determine the causal effect of a condition (the independent variable) on a certain outcome. This is a common method used in clinical trials.
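As a toy illustration of the correlational design, the sketch below computes a Pearson correlation between two invented variables; the data are made up solely to show the mechanics.

```python
# Toy correlational analysis on invented data.
from scipy import stats

exercise_hours = [0, 1, 2, 3, 4, 5, 6, 7]
resting_heart_rate = [78, 75, 74, 72, 70, 69, 66, 64]

r, p = stats.pearsonr(exercise_hours, resting_heart_rate)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")  # r near -1: strong negative link
```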
There should be sufficient detail provided in the methods section for you to replicate the study (should you want to). To enable you to do this, the following sections are normally included:
Overview and rationale for the methodology.
Participants or sample.
Data collection tools.
Methods of data analysis.
Ethical issues.
Data collection should be clearly explained and the article should discuss how this process was undertaken. Data collection should be systematic, objective, precise, repeatable, valid and reliable. Any tool (eg, a questionnaire) used for data collection should have been piloted (or pretested and/or adjusted) to ensure the quality, validity and reliability of the tool. 9 The participants (the sample) and any randomisation technique used should be identified. The sample size is central in quantitative research, as the findings should be generalisable to the wider population. 10 The data analysis can be done manually, or more complex analyses can be performed using computer software, sometimes with the advice of a statistician. From this analysis, results such as the mode, mean, median, p value and CI are presented in numerical format.
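The basic summaries just mentioned can be produced with nothing more than the Python standard library; the sample values below are hypothetical.

```python
# Basic numerical summaries on a hypothetical sample.
import statistics

values = [2, 3, 3, 4, 5, 5, 5, 6, 7, 8]
print("mean:", statistics.mean(values))
print("median:", statistics.median(values))
print("mode:", statistics.mode(values))
print("standard deviation:", round(statistics.stdev(values), 2))
```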
Results
The author(s) should present the results clearly. These may be presented in graphs, charts or tables alongside some text. You should perform your own critique of the data analysis process; just because a paper has been published does not mean it is perfect. Your findings may differ from the authors'. Through critical analysis, the reader may find an error in the study process that the authors have not seen or highlighted, and such errors can change the study result or turn a study you thought was strong into a weak one. To help you critique a quantitative research paper, some guidance on understanding statistical terminology is provided in table 1.
Some basic guidance for understanding statistics
Quantitative studies examine the relationship between variables, and the p value expresses this objectively. 11 By convention, if the p value is less than 0.05, the null hypothesis is rejected and the study reports a statistically significant difference. If the p value is 0.05 or greater, the null hypothesis is not rejected and the study reports no significant difference. Note that failing to reject the null hypothesis does not prove it true; it simply means there is insufficient evidence of a difference.
A confidence interval (CI) is a range of values, usually reported at the 95% level, within which the true population value is expected to lie. 12 The confidence level is not calculated from the p value; rather, a 95% CI corresponds to a significance threshold of 0.05. If a 95% CI for a difference between groups excludes the null value (0 for a difference, 1 for a ratio), the result is statistically significant at the 5% level. Together, p values and CIs indicate the precision and robustness of a result.
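To underline that a CI comes from the sample estimate and its standard error, not from the p value, here is a minimal Python sketch computing a 95% CI for a mean; the measurements are invented.

```python
# 95% confidence interval for a mean, from invented measurements.
import numpy as np
from scipy import stats

sample = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9])
mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
low, high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```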
Discussion, recommendations and conclusion
The final section of the paper is where the authors discuss their results and link them to other literature in the area (some of which may have been included in the literature review at the start of the paper). This reminds the reader of what is already known, what the study has found and what new information it adds. The discussion should demonstrate how the authors interpreted their results and how they contribute to new knowledge in the area. Implications for practice and future research should also be highlighted in this section of the paper.
A few other areas you may find helpful are:
Limitations of the study.
Conflicts of interest.
Table 2 provides a useful tool to help you apply the learning in this paper to the critiquing of quantitative research papers.
Quantitative paper appraisal checklist
- 1. ↵ Nursing and Midwifery Council. The code: standards of conduct, performance and ethics for nurses and midwives. 2015. https://www.nmc.org.uk/globalassets/sitedocuments/nmc-publications/nmc-code.pdf (accessed 21 Aug 2018).
Competing interests None declared.
Patient consent Not required.
Provenance and peer review Commissioned; internally peer reviewed.
Correction notice This article has been updated since its original publication to update p values from 0.5 to 0.05 throughout.
Linked Articles
- Correction: How to appraise quantitative research. Evidence-Based Nursing 2019;22:62. Published Online First: 31 Jan 2019. doi:10.1136/eb-2018-102996corr1
Quantitative Methods in Global Health Research
- Living reference work entry
- First Online: 24 November 2020
Jamalludin Ab Rahman
Quantitative research is the foundation for evidence-based global health practice and interventions. Preparing health research starts with a clear research question to initiate the study, careful planning using sound methodology, and the development and management of the capacity and resources to complete the whole research cycle. Good planning also helps to ensure valid research outcomes. Quantitative research emphasizes a clear target population, proper sampling techniques, adequate sample size, detailed planning for data collection, and proper statistical analysis. This chapter provides an overview of quantitative research methods, explains relevant study designs, and presents considerations on all aspects of the research cycle along four phases: initiation, planning, data collection, and reporting.
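As one concrete example of the planning the abstract describes, the sketch below implements the standard normal-approximation formula for the per-group sample size needed to compare two proportions. The proportions, power, and significance level are illustrative assumptions, not values from the chapter.

```python
# Per-group sample size for comparing two proportions
# (two-sided test, normal approximation).
from math import ceil, sqrt
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)   # critical value for two-sided alpha
    z_b = norm.ppf(power)           # value for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Detecting a rise from 20% to 30% with 80% power at alpha = 0.05.
print(n_per_group(0.20, 0.30))
```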
Author information
Authors and affiliations.
Kulliyyah of Medicine, International Islamic University Malaysia, Kuantan, Pahang, Malaysia
Jamalludin Ab Rahman
Corresponding author
Correspondence to Jamalludin Ab Rahman.
Editor information
Editors and affiliations.
Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, VIC, Australia
Robin Haring
Global Health Programme, Graduate Institute of International and Development Studies, Geneva, Switzerland
Ilona Kickbusch
Charité-Universitätsmedizin Berlin, Berlin, Germany
Detlev Ganten
World Health Organization Regional Office for Africa, Brazzaville, Republic of the Congo
Matshidiso Moeti
Copyright information
© 2021 The Editors and the World Health Organization
About this entry
Cite this entry.
Ab Rahman, J. (2021). Quantitative Methods in Global Health Research. In: Haring, R., Kickbusch, I., Ganten, D., Moeti, M. (eds) Handbook of Global Health. Springer, Cham. https://doi.org/10.1007/978-3-030-05325-3_9-1
Download citation
DOI: https://doi.org/10.1007/978-3-030-05325-3_9-1
Received: 24 August 2020
Accepted: 24 August 2020
Published: 24 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05325-3
Online ISBN: 978-3-030-05325-3
eBook Packages: Springer Reference Biomedicine and Life Sciences; Reference Module Biomedical and Life Sciences