Sample Size in Qualitative Interview Studies: Guided by Information Power

Affiliations

  • 1 University of Copenhagen, Copenhagen, Denmark.
  • 2 Uni Research Health, Bergen, Norway.
  • 3 University of Bergen, Bergen, Norway.
  • PMID: 26613970
  • DOI: 10.1177/1049732315617444

Sample sizes must be determined in qualitative studies just as in quantitative studies, but not by the same means. The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds that is relevant for the actual study, the fewer participants are needed. We suggest that the size of a sample with sufficient information power depends on (a) the aim of the study, (b) sample specificity, (c) use of established theory, (d) quality of dialogue, and (e) analysis strategy. We present a model in which these elements of information and their relevant dimensions are related to information power. Application of this model in the planning and during data collection of a qualitative study is discussed.

Keywords: information power; methodology; participants; qualitative; sample size; saturation.

Sample size for qualitative research

Qualitative Market Research

ISSN : 1352-2752

Article publication date: 12 September 2016

Purpose

Qualitative researchers have been criticised for not justifying sample size decisions in their research. This short paper addresses the issue of which sample sizes are appropriate and valid within different approaches to qualitative research.

Design/methodology/approach

The sparse literature on sample sizes in qualitative research is reviewed and discussed. The examination is informed by the author's experience, as an editor, of assessing reviewer comments relating to sample size in qualitative research, and by the author's own experience of undertaking commercial and academic qualitative research over the last 31 years.

Findings

In qualitative research, the determination of sample size is contextual and partially dependent upon the scientific paradigm under which investigation takes place. For example, qualitative research oriented towards positivism will require larger samples than in-depth qualitative research, so that a representative picture of the whole population under review can be gained. Nonetheless, the paper also concludes that samples of a single case can be highly informative and meaningful, as demonstrated in examples from management and medical research. Unique examples of research using a single sample or case, but involving new areas or findings that are potentially highly relevant, can be worthy of publication. Theoretical saturation can also be a useful guide in designing qualitative research, with practical research illustrating that data saturation may occur with samples of 12 among a relatively homogeneous population.

Practical implications

Sample sizes as low as one can be justified. Researchers and reviewers may find the discussion in this paper to be a useful guide to determining and critiquing sample size in qualitative research.

Originality/value

Sample size in qualitative research is always mentioned by reviewers of qualitative papers, but discussion tends to be simplistic and relatively uninformed. The current paper draws attention to how sample sizes, at both ends of the size continuum, can be justified by researchers. This will also aid reviewers in commenting on the appropriateness of sample sizes in qualitative research.

  • Qualitative research
  • Qualitative methodology
  • Case studies
  • Sample size

Boddy, C.R. (2016), "Sample size for qualitative research", Qualitative Market Research, Vol. 19 No. 4, pp. 426-432. https://doi.org/10.1108/QMR-06-2016-0053


Copyright © 2016, Emerald Group Publishing Limited


Indian J Psychol Med, Vol. 42, No. 1, Jan-Feb 2020

Sample Size and its Importance in Research

Chittaranjan Andrade

Clinical Psychopharmacology Unit, Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India

The sample size for a study needs to be estimated at the time the study is proposed; too large a sample is unnecessary and unethical, and too small a sample is unscientific and also unethical. The necessary sample size can be calculated, using statistical software, based on certain assumptions. If no assumptions can be made, then an arbitrary sample size is set for a pilot study. This article discusses sample size and how it relates to matters such as ethics, statistical power, the primary and secondary hypotheses in a study, and findings from larger vs. smaller samples.

Studies are conducted on samples because it is usually impossible to study the entire population. Conclusions drawn from samples are intended to be generalized to the population, and sometimes to the future as well. The sample must therefore be representative of the population. This is best ensured by the use of proper methods of sampling. The sample must also be adequate in size – in fact, no more and no less.

SAMPLE SIZE AND ETHICS

A sample that is larger than necessary will better represent the population and will hence provide more accurate results. However, beyond a certain point, the increase in accuracy will be small and hence not worth the effort and expense involved in recruiting the extra patients. Furthermore, an overly large sample would inconvenience more patients than might be necessary for the study objectives; this is unethical. In contrast, a sample that is smaller than necessary would have insufficient statistical power to answer the primary research question, and a statistically nonsignificant result could merely be because of inadequate sample size (Type 2 or false negative error). Thus, a small sample could result in the patients in the study being inconvenienced with no benefit to future patients or to science. This is also unethical.

In this regard, inconvenience to patients refers to the time that they spend in clinical assessments and to the psychological and physical discomfort that they experience in assessments such as interviews, blood sampling, and other procedures.

ESTIMATING SAMPLE SIZE

So how large should a sample be? In hypothesis testing studies, this is mathematically calculated, conventionally, as the sample size necessary to be 80% certain of identifying a statistically significant outcome should the hypothesis be true for the population, with P for statistical significance set at 0.05. Some investigators power their studies for 90% instead of 80%, and some set the threshold for significance at 0.01 rather than 0.05. Both choices are uncommon because the necessary sample size becomes large, and the study becomes more expensive and more difficult to conduct. Many investigators increase the sample size by 10%, or by whatever proportion they can justify, to compensate for expected dropout, incomplete records, biological specimens that do not meet laboratory requirements for testing, and other study-related problems.

Sample size calculations require assumptions about expected means and standard deviations, or event risks, in different groups, or about expected effect sizes. For example, a study may be powered to detect an effect size of 0.5, or a response rate of 60% with drug vs. 40% with placebo.[ 1 ] When no guesstimates or expectations are possible, pilot studies are conducted on a sample that is arbitrary in size but that might be considered reasonable for the field.

The sample size may need to be larger in multicenter studies because of statistical noise (due to variations in patient characteristics, nonspecific treatment characteristics, rating practices, environments, etc. between study centers).[ 2 ] Sample size calculations can be performed manually or using statistical software; online calculators that provide free service can easily be identified by search engines. G*Power is an example of a free, downloadable program for sample size estimation. The manual and tutorial for G*Power can also be downloaded.
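As a minimal illustration of the arithmetic behind such calculations, the normal-approximation formula for a two-sided, two-sample comparison of means can be sketched in a few lines. This is a simplified sketch only (the function name and defaults are illustrative); dedicated tools such as G*Power handle the full range of designs and tests.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means, where effect_size is Cohen's d."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the significance threshold
    z_beta = z(power)           # quantile corresponding to the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

# A medium effect (d = 0.5) at alpha = 0.05 and 80% power:
n = n_per_group(0.5)         # about 63 participants per group
n_inflated = ceil(n * 1.10)  # plus 10% to compensate for expected dropout
```

Raising power to 90%, or tightening the significance threshold to 0.01, inflates the required sample size substantially, which is why those choices are uncommon, as noted above.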

PRIMARY AND SECONDARY ANALYSES

The sample size is calculated for the primary hypothesis of the study. What is the difference between the primary hypothesis, primary outcome and primary outcome measure? As an example, the primary outcome may be a reduction in the severity of depression, the primary outcome measure may be the Montgomery-Asberg Depression Rating Scale (MADRS) and the primary hypothesis may be that reduction in MADRS scores is greater with the drug than with placebo. The primary hypothesis is tested in the primary analysis.

Studies almost always have many hypotheses; for example, that the study drug will outperform placebo on measures of depression, suicidality, anxiety, disability and quality of life. The sample size necessary for adequate statistical power to test each of these hypotheses will be different. Because a study can have only one sample size, it can be powered for only one outcome, the primary outcome. Therefore, the study would be either overpowered or underpowered for the other outcomes. These outcomes are therefore called secondary outcomes, and are associated with secondary hypotheses, and are tested in secondary analyses. Secondary analyses are generally considered exploratory because when many hypotheses in a study are each tested at a P < 0.05 level for significance, some may emerge statistically significant by chance (Type 1 or false positive errors).[ 3 ]

INTERPRETING RESULTS

Here is an interesting question. A test of the primary hypothesis yielded a P value of 0.07. Might we conclude that our sample was underpowered for the study and that, had our sample been larger, we would have identified a significant result? No! The reason is that larger samples will more accurately represent the population value, whereas smaller samples could be off the mark in either direction – towards or away from the population value. In this context, readers should also note that no matter how small the P value for an estimate is, the population value of that estimate remains the same.[ 4 ]
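The point that larger samples land closer to the population value, in either direction, can be demonstrated with a small simulation. The population parameters below are illustrative assumptions, not data from any study.

```python
import random
import statistics

def spread_of_sample_means(pop_mean, pop_sd, n, draws=2000, seed=42):
    """Empirical SD of sample means of size n drawn from a normal
    population, i.e. an estimate of the standard error of the mean."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(draws)
    ]
    return statistics.stdev(means)

# Sample means of n = 100 cluster far more tightly around the
# population mean than sample means of n = 10:
spread_small = spread_of_sample_means(50, 10, n=10)
spread_large = spread_of_sample_means(50, 10, n=100)
```

The spread shrinks roughly as 1/sqrt(n): a larger sample narrows the estimate around the population value, but it does not pull the estimate towards statistical significance.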

On a parting note, it is unlikely that population values will be null. That is, for example, that the response rate to the drug will be exactly the same as that to placebo, or that the correlation between height and age at onset of schizophrenia will be zero. If the sample size is large enough, even such small differences between groups, or trivial correlations, would be detected as being statistically significant. This does not mean that the findings are clinically significant.

Financial support and sponsorship

Conflicts of interest

There are no conflicts of interest.

  • Research article
  • Open access
  • Published: 21 November 2018

Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period

  • Konstantina Vasileiou (ORCID: orcid.org/0000-0001-5047-3920) 1,
  • Julie Barnett 1,
  • Susan Thorpe 2 &
  • Terry Young 3

BMC Medical Research Methodology, Volume 18, Article number: 148 (2018)


Choosing a suitable sample size in qualitative research is an area of conceptual debate and practical uncertainty. That sample size principles, guidelines and tools have been developed to enable researchers to set, and justify the acceptability of, their sample size is an indication that the issue constitutes an important marker of the quality of qualitative research. Nevertheless, research shows that sample size sufficiency reporting is often poor, if not absent, across a range of disciplinary fields.

A systematic analysis of single-interview-per-participant designs within three health-related journals from the disciplines of psychology, sociology and medicine, over a 15-year period, was conducted to examine whether and how sample sizes were justified and how sample size was characterised and discussed by authors. Data pertinent to sample size were extracted and analysed using qualitative and quantitative analytic techniques.

Our findings demonstrate that provision of sample size justifications in qualitative health research is limited; is not contingent on the number of interviews; and relates to the journal of publication. Defence of sample size was most frequently supported across all three journals with reference to the principle of saturation and to pragmatic considerations. Qualitative sample sizes were predominantly – and often without justification – characterised as insufficient (i.e., ‘small’) and discussed in the context of study limitations. Sample size insufficiency was seen to threaten the validity and generalizability of studies’ results, with the latter being frequently conceived in nomothetic terms.

Conclusions

We recommend, firstly, that qualitative health researchers be more transparent about evaluations of their sample size sufficiency, situating these within broader and more encompassing assessments of data adequacy. Secondly, we invite researchers to consider critically how saturation parameters found in prior methodological studies and sample size community norms might best inform, and apply to, their own project, and we encourage appraisal of data adequacy with reference to features that are intrinsic to the study at hand. Finally, those reviewing papers have a vital role in supporting and encouraging transparent study-specific reporting.


Sample adequacy in qualitative inquiry pertains to the appropriateness of the sample composition and size. It is an important consideration in evaluations of the quality and trustworthiness of much qualitative research [ 1 ] and is implicated – particularly for research that is situated within a post-positivist tradition and retains a degree of commitment to realist ontological premises – in appraisals of validity and generalizability [ 2 , 3 , 4 , 5 ].

Samples in qualitative research tend to be small in order to support the depth of case-oriented analysis that is fundamental to this mode of inquiry [ 5 ]. Additionally, qualitative samples are purposive, that is, selected by virtue of their capacity to provide richly-textured information, relevant to the phenomenon under investigation. As a result, purposive sampling [ 6 , 7 ] – as opposed to probability sampling employed in quantitative research – selects ‘information-rich’ cases [ 8 ]. Indeed, recent research demonstrates the greater efficiency of purposive sampling compared to random sampling in qualitative studies [ 9 ], supporting related assertions long put forward by qualitative methodologists.

Sample size in qualitative research has been the subject of enduring discussions [ 4 , 10 , 11 ]. Whilst the quantitative research community has established relatively straightforward statistics-based rules to set sample sizes precisely, the intricacies of qualitative sample size determination and assessment arise from the methodological, theoretical, epistemological, and ideological pluralism that characterises qualitative inquiry (for a discussion focused on the discipline of psychology see [ 12 ]). This mitigates against clear-cut guidelines, invariably applied. Despite these challenges, various conceptual developments have sought to address this issue, with guidance and principles [ 4 , 10 , 11 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 ], and more recently, an evidence-based approach to sample size determination seeks to ground the discussion empirically [ 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 ].

Focusing on single-interview-per-participant qualitative designs, the present study aims to further contribute to the dialogue of sample size in qualitative research by offering empirical evidence around justification practices associated with sample size. We next review the existing conceptual and empirical literature on sample size determination.

Sample size in qualitative research: Conceptual developments and empirical investigations

Qualitative research experts argue that there is no straightforward answer to the question of 'how many' and that sample size is contingent on a number of factors relating to epistemological, methodological and practical issues [ 36 ]. Sandelowski [ 4 ] recommends that qualitative sample sizes be large enough to allow the unfolding of a 'new and richly textured understanding' of the phenomenon under study, but small enough so that the 'deep, case-oriented analysis' (p. 183) of qualitative data is not precluded. Morse [ 11 ] posits that the more useable data are collected from each person, the fewer participants are needed. She invites researchers to take into account parameters such as the scope of the study, the nature of the topic (i.e. complexity, accessibility), the quality of data, and the study design. Indeed, the level of structure of questions in qualitative interviewing has been found to influence the richness of the data generated [ 37 ] and so requires attention; empirical research shows that open questions asked later in the interview tend to produce richer data [ 37 ].

Beyond such guidance, specific numerical recommendations have also been proffered, often based on experts’ experience of qualitative research. For example, Green and Thorogood [ 38 ] maintain that the experience of most qualitative researchers conducting an interview-based study with a fairly specific research question is that little new information is generated after interviewing 20 people or so belonging to one analytically relevant participant ‘category’ (pp. 102–104). Ritchie et al. [ 39 ] suggest that studies employing individual interviews conduct no more than 50 interviews so that researchers are able to manage the complexity of the analytic task. Similarly, Britten [ 40 ] notes that large interview studies will often comprise of 50 to 60 people. Experts have also offered numerical guidelines tailored to different theoretical and methodological traditions and specific research approaches, e.g. grounded theory, phenomenology [ 11 , 41 ]. More recently, a quantitative tool was proposed [ 42 ] to support a priori sample size determination based on estimates of the prevalence of themes in the population. Nevertheless, this more formulaic approach raised criticisms relating to assumptions about the conceptual [ 43 ] and ontological status of ‘themes’ [ 44 ] and the linearity ascribed to the processes of sampling, data collection and data analysis [ 45 ].

In terms of principles, Lincoln and Guba [ 17 ] proposed that sample size determination be guided by the criterion of informational redundancy, that is, sampling can be terminated when no new information is elicited by sampling more units. Following the logic of informational comprehensiveness, Malterud et al. [ 18 ] introduced the concept of information power as a pragmatic guiding principle, suggesting that the more information power the sample provides, the smaller the sample size needs to be, and vice versa.

Undoubtedly, the most widely used principle for determining sample size and evaluating its sufficiency is that of saturation . The notion of saturation originates in grounded theory [ 15 ] – a qualitative methodological approach explicitly concerned with empirically-derived theory development – and is inextricably linked to theoretical sampling. Theoretical sampling describes an iterative process of data collection, data analysis and theory development whereby data collection is governed by emerging theory rather than predefined characteristics of the population. Grounded theory saturation (often called theoretical saturation) concerns the theoretical categories – as opposed to data – that are being developed and becomes evident when ‘gathering fresh data no longer sparks new theoretical insights, nor reveals new properties of your core theoretical categories’ [ 46 p. 113]. Saturation in grounded theory, therefore, does not equate to the more common focus on data repetition and moves beyond a singular focus on sample size as the justification of sampling adequacy [ 46 , 47 ]. Sample size in grounded theory cannot be determined a priori as it is contingent on the evolving theoretical categories.

Saturation – often under the terms of ‘data’ or ‘thematic’ saturation – has diffused into several qualitative communities beyond its origins in grounded theory. Alongside the expansion of its meaning, being variously equated with ‘no new data’, ‘no new themes’, and ‘no new codes’, saturation has emerged as the ‘gold standard’ in qualitative inquiry [ 2 , 26 ]. Nevertheless, and as Morse [ 48 ] asserts, whilst saturation is the most frequently invoked ‘guarantee of qualitative rigor’, ‘it is the one we know least about’ (p. 587). Certainly researchers caution that saturation is less applicable to, or appropriate for, particular types of qualitative research (e.g. conversation analysis, [ 49 ]; phenomenological research, [ 50 ]) whilst others reject the concept altogether [ 19 , 51 ].

Methodological studies in this area aim to provide guidance about saturation and to develop practical processes that 'operationalise' and evidence saturation. Guest, Bunce, and Johnson [ 26 ] analysed 60 interviews and found that saturation of themes was reached by the twelfth interview. They noted that their sample was relatively homogeneous and their research aims focused, so studies of more heterogeneous samples and with a broader scope would be likely to need a larger sample to achieve saturation. Extending the enquiry to multi-site, cross-cultural research, Hagaman and Wutich [ 28 ] showed that sample sizes of 20 to 40 interviews were required to achieve data saturation of meta-themes that cut across research sites. In a theory-driven content analysis, Francis et al. [ 25 ] reached data saturation at the 17th interview for all their pre-determined theoretical constructs. The authors further proposed two main principles upon which specification of saturation should be based: (a) researchers should a priori specify an initial analysis sample (e.g. 10 interviews) to be used for the first round of analysis, and (b) a stopping criterion, that is, a number of further interviews (e.g. 3) whose analysis yields no new themes or ideas. For greater transparency, Francis et al. [ 25 ] recommend that researchers present cumulative frequency graphs supporting their judgment that saturation was achieved. A comparative method for themes saturation (CoMeTS) has also been suggested [ 23 ], whereby the findings of each new interview are compared with those that have already emerged; if a new interview yields no new theme, the 'saturated terrain' is assumed to have been established. Because the order in which interviews are analysed can influence saturation thresholds depending on the richness of the data, Constantinou et al. [ 23 ] recommend reordering and re-analysing interviews to confirm saturation.

Hennink, Kaiser and Marconi's [ 29 ] methodological study sheds further light on the problem of specifying and demonstrating saturation. Their analysis of interview data showed that code saturation (i.e. the point at which no additional issues are identified) was achieved at 9 interviews, but meaning saturation (i.e. the point at which no further dimensions, nuances, or insights of issues are identified) required 16–24 interviews. Although breadth can be achieved relatively soon, especially for high-prevalence and concrete codes, depth requires additional data, especially for codes of a more conceptual nature.
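Francis et al.'s two principles (an initial analysis sample plus a stopping criterion) lend themselves to a simple algorithmic sketch. The function below is a hypothetical illustration, not a tool proposed by those authors: it takes the set of themes coded in each successive interview and reports the interview at which saturation would be declared.

```python
def saturation_point(themes_per_interview, initial_sample=10, stopping_criterion=3):
    """Return the 1-based index of the interview at which saturation is
    declared, or None if it is never reached. Saturation is declared when
    `stopping_criterion` consecutive interviews, conducted after the
    initial analysis sample, contribute no theme not already coded."""
    seen = set()
    run_without_new = 0
    for i, themes in enumerate(themes_per_interview, start=1):
        new = set(themes) - seen
        seen |= new
        if i <= initial_sample:
            continue  # the initial sample only builds the theme inventory
        run_without_new = 0 if new else run_without_new + 1
        if run_without_new >= stopping_criterion:
            return i
    return None
```

As Constantinou et al. observe, the answer depends on interview order, so re-running such a check on reordered interviews is one cheap way to probe how robust a declared saturation point is.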

Critiquing the concept of saturation, Nelson [ 19 ] proposes five conceptual depth criteria in grounded theory projects to assess the robustness of the developing theory: (a) theoretical concepts should be supported by a wide range of evidence drawn from the data; (b) be demonstrably part of a network of inter-connected concepts; (c) demonstrate subtlety; (d) resonate with existing literature; and (e) can be successfully submitted to tests of external validity.

Other work has sought to examine practices of sample size reporting and sufficiency assessment across a range of disciplinary fields and research domains, from nutrition [ 34 ] and health education [ 32 ], to education and the health sciences [ 22 , 27 ], information systems [ 30 ], organisation and workplace studies [ 33 ], human computer interaction [ 21 ], and accounting studies [ 24 ]. Others investigated PhD qualitative studies [ 31 ] and grounded theory studies [ 35 ]. Incomplete and imprecise sample size reporting is commonly pinpointed by these investigations whilst assessment and justifications of sample size sufficiency are even more sporadic.

Sobal [ 34 ] examined the sample size of qualitative studies published in the Journal of Nutrition Education over a period of 30 years. Studies that employed individual interviews ( n  = 30) had an average sample size of 45 individuals and none of these explicitly reported whether their sample size sought and/or attained saturation. A minority of articles discussed how sample-related limitations (with the latter most often concerning the type of sample, rather than the size) limited generalizability. A further systematic analysis [ 32 ] of health education research over 20 years demonstrated that interview-based studies averaged 104 participants (range 2 to 720 interviewees). However, 40% did not report the number of participants. An examination of 83 qualitative interview studies in leading information systems journals [ 30 ] indicated little defence of sample sizes on the basis of recommendations by qualitative methodologists, prior relevant work, or the criterion of saturation. Rather, sample size seemed to correlate with factors such as the journal of publication or the region of study (US vs Europe vs Asia). These results led the authors to call for more rigor in determining and reporting sample size in qualitative information systems research and to recommend optimal sample size ranges for grounded theory (i.e. 20–30 interviews) and single case (i.e. 15–30 interviews) projects.

Similarly, fewer than 10% of articles in organisation and workplace studies provided a sample size justification relating to existing recommendations by methodologists, prior relevant work, or saturation [ 33 ], whilst only 17% of focus groups studies in health-related journals provided an explanation of sample size (i.e. number of focus groups), with saturation being the most frequently invoked argument, followed by published sample size recommendations and practical reasons [ 22 ]. The notion of saturation was also invoked by 11 out of the 51 most highly cited studies that Guetterman [ 27 ] reviewed in the fields of education and health sciences, of which six were grounded theory studies, four phenomenological and one a narrative inquiry. Finally, analysing 641 interview-based articles in accounting, Dai et al. [ 24 ] called for more rigor since a significant minority of studies did not report precise sample size.

Despite increasing attention to rigor in qualitative research (e.g. [ 52 ]) and more extensive methodological and analytical disclosures that seek to validate qualitative work [ 24 ], sample size reporting and sufficiency assessment remain inconsistent and partial, if not absent, across a range of research domains.

Objectives of the present study

The present study sought to enrich existing systematic analyses of the customs and practices of sample size reporting and justification by focusing on qualitative research relating to health. Additionally, this study attempted to expand previous empirical investigations by examining how qualitative sample sizes are characterised and discussed in academic narratives. Qualitative health research is an inter-disciplinary field that, owing to its affiliation with the medical sciences, often faces views and positions reflective of a quantitative ethos. Thus qualitative health research constitutes an emblematic case that may help to unfold underlying philosophical and methodological differences across the scientific community that are crystallised in considerations of sample size. The present research, therefore, incorporates a comparative element on the basis of three different disciplines engaging with qualitative health research: medicine, psychology, and sociology. We chose to focus our analysis on single-interview-per-participant designs as this not only represents a popular and widespread methodological choice in qualitative health research, but is also the method where consideration of sample size – defined as the number of interviewees – is particularly salient.

Study design

A structured search for articles reporting cross-sectional, interview-based qualitative studies was carried out and eligible reports were systematically reviewed and analysed employing both quantitative and qualitative analytic techniques.

We selected journals which (a) follow a peer review process, (b) are considered high quality and influential in their field as reflected in journal metrics, and (c) are receptive to, and publish, qualitative research (Additional File  1 presents the journals’ editorial positions in relation to qualitative research and sample considerations where available). Three health-related journals were chosen, each representing a different disciplinary field; the British Medical Journal (BMJ) representing medicine, the British Journal of Health Psychology (BJHP) representing psychology, and the Sociology of Health & Illness (SHI) representing sociology.

Search strategy to identify studies

Employing the search function of each individual journal, we used the terms ‘interview*’ AND ‘qualitative’ and limited the results to articles published between 1 January 2003 and 22 September 2017 (i.e. a 15-year review period).

Eligibility criteria

To be eligible for inclusion in the review, the article had to report a cross-sectional study design. Longitudinal studies were thus excluded whilst studies conducted within a broader research programme (e.g. interview studies nested in a trial, as part of a broader ethnography, as part of a longitudinal research) were included if they reported only single-time qualitative interviews. The method of data collection had to be individual, synchronous qualitative interviews (i.e. group interviews, structured interviews and e-mail interviews over a period of time were excluded), and the data had to be analysed qualitatively (i.e. studies that quantified their qualitative data were excluded). Mixed method studies and articles reporting more than one qualitative method of data collection (e.g. individual interviews and focus groups) were excluded. Figure  1 , a PRISMA flow diagram [ 53 ], shows the number of: articles obtained from the searches and screened; papers assessed for eligibility; and articles included in the review (Additional File  2 provides the full list of articles included in the review and their unique identifying code – e.g. BMJ01, BJHP02, SHI03). One review author (KV) assessed the eligibility of all papers identified from the searches. When in doubt, discussions about retaining or excluding articles were held between KV and JB in regular meetings, and decisions were jointly made.

Figure 1. PRISMA flow diagram

Data extraction and analysis

A data extraction form was developed (see Additional File  3 ) recording three areas of information: (a) information about the article (e.g. authors, title, journal, year of publication etc.); (b) information about the aims of the study, the sample size and any justification for this, the participant characteristics, the sampling technique and any sample-related observations or comments made by the authors; and (c) information about the method or technique(s) of data analysis, the number of researchers involved in the analysis, the potential use of software, and any discussion around epistemological considerations. The Abstract, Methods and Discussion (and/or Conclusion) sections of each article were examined by one author (KV) who extracted all the relevant information. This was directly copied from the articles and, when appropriate, comments, notes and initial thoughts were written down.

To examine the kinds of sample size justifications provided by articles, an inductive content analysis [ 54 ] was initially conducted. On the basis of this analysis, the categories that expressed qualitatively different sample size justifications were developed.

We also extracted or coded quantitative data regarding the following aspects:

- Journal and year of publication
- Number of interviews
- Number of participants
- Presence of sample size justification(s) (Yes/No)
- Presence of a particular sample size justification category (Yes/No)
- Number of sample size justifications provided

Descriptive and inferential statistical analyses were used to explore these data.
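The quantitative coding described above lends itself to a simple tabulation. A minimal, stdlib-only sketch of how such extracted records could be tallied; the record structure, field names and example values here are ours for illustration, not the review’s actual data:

```python
from collections import Counter
from statistics import median

# Hypothetical extraction records, one dict per reviewed article
# (journal codes and numbers are illustrative only)
records = [
    {"journal": "BMJ", "n_interviews": 21, "justifications": ["saturation"]},
    {"journal": "BJHP", "n_interviews": 13,
     "justifications": ["saturation", "sample size guidelines"]},
    {"journal": "SHI", "n_interviews": 67, "justifications": []},
]

justified = [r for r in records if r["justifications"]]    # Yes/No presence coding
per_journal = Counter(r["journal"] for r in justified)     # tally by journal
med = median(r["n_interviews"] for r in records)           # a descriptive statistic

print(f"{len(justified)}/{len(records)} articles gave a justification")
print("median number of interviews:", med)
print("justifying articles per journal:", dict(per_journal))
```

Coding each variable per article in this flat form is what makes the descriptive summaries and the inferential tests reported below straightforward to compute.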

A thematic analysis [ 55 ] was then performed on all scientific narratives that discussed or commented on the sample size of the study. These narratives were evident both in papers that justified their sample size and those that did not. To identify these narratives, in addition to the methods sections, the discussion sections of the reviewed articles were also examined and relevant data were extracted and analysed.

In total, 214 articles – 21 in the BMJ, 53 in the BJHP and 140 in the SHI – were eligible for inclusion in the review. Table  1 provides basic information about the sample sizes – measured in number of interviews – of the studies reviewed across the three journals. Figure  2 depicts the number of eligible articles published each year per journal.

Figure 2. The publication of qualitative studies in the BMJ was significantly reduced from 2012 onwards; this appears to coincide with the launch of BMJ Open, to which qualitative studies were possibly directed.

Pairwise comparisons following a significant Kruskal-Wallis test indicated that the studies published in the BJHP had significantly (p < .001) smaller sample sizes than those published in either the BMJ or the SHI. Sample sizes of BMJ and SHI articles did not differ significantly from each other.
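The omnibus test used here can be sketched as follows. This is a minimal, hand-rolled Kruskal-Wallis H statistic (no tie correction, so it assumes tie-free data), run on invented per-journal interview counts rather than the reviewed studies’ actual data:

```python
from itertools import chain

def kruskal_h(*groups):
    """Kruskal-Wallis H statistic; assumes no tied values (no tie correction)."""
    pooled = sorted(chain.from_iterable(groups))
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # ranks 1..N
    n = len(pooled)
    return 12 / (n * (n + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

# Invented interview counts per journal (illustration only);
# the BJHP group is systematically smaller, mirroring the review's finding
bmj = [18, 25, 30, 33, 40]
bjhp = [8, 10, 12, 14, 15]
shi = [22, 28, 35, 38, 45]

h = kruskal_h(bmj, bjhp, shi)
# Compare H against the chi-squared critical value with k-1 = 2 df
# (5.99 at alpha = .05); a significant omnibus result is then followed
# up with pairwise Mann-Whitney comparisons, as the review did.
print(round(h, 2))  # prints 9.62
```

Exact p-values would require a chi-squared survival function (e.g. from `scipy.stats`); comparing H to the tabulated critical value suffices for the sketch.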

Sample size justifications: Results from the quantitative and qualitative content analysis

Ten (47.6%) of the 21 BMJ studies, 26 (49.1%) of the 53 BJHP papers and 24 (17.1%) of the 140 SHI articles provided some sort of sample size justification. As shown in Table  2 , the majority of articles which justified their sample size provided one justification (70% of articles); fourteen studies (25%) provided two distinct justifications; one study (1.7%) gave three justifications and two studies (3.3%) expressed four distinct justifications.

There was no association between the number of interviews (i.e. sample size) conducted and the provision of a justification (r_pb = .054, p = .433). Within journals, Mann-Whitney tests indicated that sample sizes of ‘justifying’ and ‘non-justifying’ articles in the BMJ and SHI did not differ significantly from each other. In the BJHP, ‘justifying’ articles (mean rank = 31.3) had significantly larger sample sizes than ‘non-justifying’ studies (mean rank = 22.7; U = 237.000, p < .05).
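Both statistics invoked here are simple to compute by hand. A stdlib-only sketch using invented data (the pairing of justification status with interview counts is illustrative, not the review’s data); the example is constructed so the correlation is zero, echoing the “no association” finding:

```python
def point_biserial(binary, values):
    """Pearson correlation between a 0/1 indicator and a continuous variable."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5  # population SD
    g1 = [v for b, v in zip(binary, values) if b]
    g0 = [v for b, v in zip(binary, values) if not b]
    p = len(g1) / n
    return (sum(g1) / len(g1) - sum(g0) / len(g0)) / sd * (p * (1 - p)) ** 0.5

def mann_whitney_u(g1, g0):
    """U statistic for group 1: count of pairs g1 wins; ties count one half."""
    return sum((a > b) + 0.5 * (a == b) for a in g1 for b in g0)

# Invented data: 1 = article justified its sample size, paired with interviews
justified = [1, 1, 0, 0]
n_interviews = [20, 30, 10, 40]

r_pb = point_biserial(justified, n_interviews)       # 0.0: no association here
u = mann_whitney_u([20, 30], [10, 40])               # U for the 'justifying' group
print(round(r_pb, 3), u)
```

In practice the p-values reported in the text would come from the corresponding sampling distributions (e.g. via `scipy.stats.pointbiserialr` and `scipy.stats.mannwhitneyu`).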

There was a significant association between the journal a paper was published in and the provision of a justification (χ2(2) = 23.83, p < .001). BJHP studies provided a sample size justification significantly more often than would be expected (z = 2.9); SHI studies did so significantly less often (z = −2.4). If an article was published in the BJHP, the odds of providing a justification were 4.8 times higher than if it was published in the SHI. Similarly, if published in the BMJ, the odds of a study justifying its sample size were 4.5 times higher than in the SHI.
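These figures can be checked directly from the counts reported above (10/21 BMJ, 26/53 BJHP, 24/140 SHI). A stdlib-only sketch: the chi-squared statistic and the simple standardised residuals reproduce the reported values, while the raw 2×2 odds ratios come out near, though not identical to, the reported 4.8 and 4.5 (the small gap may reflect rounding or a regression-based estimate in the original analysis):

```python
# Observed counts: rows = journal, cols = (justified, not justified)
table = {"BMJ": (10, 11), "BJHP": (26, 27), "SHI": (24, 116)}

n = sum(a + b for a, b in table.values())
col_yes = sum(a for a, _ in table.values())
col_no = sum(b for _, b in table.values())

chi2 = 0.0
resid = {}
for journal, (yes, no) in table.items():
    row = yes + no
    e_yes, e_no = row * col_yes / n, row * col_no / n   # expected counts
    chi2 += (yes - e_yes) ** 2 / e_yes + (no - e_no) ** 2 / e_no
    resid[journal] = (yes - e_yes) / e_yes ** 0.5       # standardised residual

def odds_ratio(j1, j2):
    (a, b), (c, d) = table[j1], table[j2]
    return (a / b) / (c / d)

print(round(chi2, 2))                                    # prints 23.83
print(round(resid["BJHP"], 1), round(resid["SHI"], 1))   # prints 2.9 -2.4
# Raw odds ratios BJHP-vs-SHI and BMJ-vs-SHI: about 4.65 and 4.39,
# close to the reported 4.8 and 4.5
print(round(odds_ratio("BJHP", "SHI"), 2), round(odds_ratio("BMJ", "SHI"), 2))
```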

The qualitative content analysis of the scientific narratives identified eleven different sample size justifications. These are described below and illustrated with excerpts from relevant articles. By way of a summary, the frequency with which these were deployed across the three journals is indicated in Table  3 .

Saturation

Saturation was the most commonly invoked principle (55.4% of all justifications), deployed by studies across all three journals to justify the sufficiency of their sample size. In the BMJ, two studies claimed that they achieved data saturation (BMJ17; BMJ18) and one article referred descriptively to achieving saturation without explicitly using the term (BMJ13). Interestingly, BMJ13 included data in the analysis beyond the point of saturation in search of ‘unusual/deviant observations’ and with a view to establishing the consistency of its findings.

Thirty three women were approached to take part in the interview study. Twenty seven agreed and 21 (aged 21–64, median 40) were interviewed before data saturation was reached (one tape failure meant that 20 interviews were available for analysis). (BMJ17).

No new topics were identified following analysis of approximately two thirds of the interviews; however, all interviews were coded in order to develop a better understanding of how characteristic the views and reported behaviours were, and also to collect further examples of unusual/deviant observations. (BMJ13).

Two articles reported pre-determining their sample size with a view to achieving data saturation (BMJ08 – see extract in section In line with existing research ; BMJ15 – see extract in section Pragmatic considerations ) without further specifying whether this was achieved. One paper claimed theoretical saturation (BMJ06), conceived as the point at which there were “no further recurring themes emerging from the analysis”, whilst another study argued that although the analytic categories were highly saturated, it was not possible to determine whether theoretical saturation had been achieved (BMJ04). One article (BMJ18) cited a reference to support its position on saturation.

In the BJHP, six articles claimed that they achieved data saturation (BJHP21; BJHP32; BJHP39; BJHP48; BJHP49; BJHP52) and one article stated that, given their sample size and the guidelines for achieving data saturation, it anticipated that saturation would be attained (BJHP50).

Recruitment continued until data saturation was reached, defined as the point at which no new themes emerged. (BJHP48).

It has previously been recommended that qualitative studies require a minimum sample size of at least 12 to reach data saturation (Clarke & Braun, 2013; Fugard & Potts, 2014; Guest, Bunce, & Johnson, 2006) Therefore, a sample of 13 was deemed sufficient for the qualitative analysis and scale of this study. (BJHP50).

Two studies argued that they achieved thematic saturation (BJHP28 – see extract in section Sample size guidelines ; BJHP31) and one (BJHP30) article, explicitly concerned with theory development and deploying theoretical sampling, claimed both theoretical and data saturation.

The final sample size was determined by thematic saturation, the point at which new data appears to no longer contribute to the findings due to repetition of themes and comments by participants (Morse, 1995). At this point, data generation was terminated. (BJHP31).

Five studies argued that they achieved (BJHP05; BJHP33; BJHP40; BJHP13 – see extract in section Pragmatic considerations ) or anticipated (BJHP46) saturation without any further specification of the term. BJHP17 referred descriptively to a state of achieved saturation without specifically using the term. Saturation of coding , but not saturation of themes, was claimed to have been reached by one article (BJHP18). Two articles explicitly stated that they did not achieve saturation, instead presenting a level of theme completeness (BJHP27) or the replication of themes (BJHP53) as arguments for the sufficiency of their sample size.

Furthermore, data collection ceased on pragmatic grounds rather than at the point when saturation point was reached. Despite this, although nuances within sub-themes were still emerging towards the end of data analysis, the themes themselves were being replicated indicating a level of completeness. (BJHP27).

Finally, one article criticised and explicitly renounced the notion of data saturation claiming that, on the contrary, the criterion of theoretical sufficiency determined its sample size (BJHP16).

According to the original Grounded Theory texts, data collection should continue until there are no new discoveries ( i.e. , ‘data saturation’; Glaser & Strauss, 1967). However, recent revisions of this process have discussed how it is rare that data collection is an exhaustive process and researchers should rely on how well their data are able to create a sufficient theoretical account or ‘theoretical sufficiency’ (Dey, 1999). For this study, it was decided that theoretical sufficiency would guide recruitment, rather than looking for data saturation. (BJHP16).

Ten out of the 20 BJHP articles that employed the argument of saturation used one or more citations relating to this principle.

In the SHI, one article (SHI01) claimed that it achieved category saturation based on authors’ judgment.

This number was not fixed in advance, but was guided by the sampling strategy and the judgement, based on the analysis of the data, of the point at which ‘category saturation’ was achieved. (SHI01).

Three articles described a state of achieved saturation without using the term or specifying what sort of saturation they had achieved (i.e. data, theoretical, thematic saturation) (SHI04; SHI13; SHI30) whilst another four articles explicitly stated that they achieved saturation (SHI100; SHI125; SHI136; SHI137). Two papers stated that they achieved data saturation (SHI73 – see extract in section Sample size guidelines ; SHI113), two claimed theoretical saturation (SHI78; SHI115) and two referred to achieving thematic saturation (SHI87; SHI139) or to saturated themes (SHI29; SHI50).

Recruitment and analysis ceased once theoretical saturation was reached in the categories described below (Lincoln and Guba 1985). (SHI115).

The respondents’ quotes drawn on below were chosen as representative, and illustrate saturated themes. (SHI50).

One article stated that thematic saturation was anticipated with its sample size (SHI94). Briefly referring to the difficulty in pinpointing achievement of theoretical saturation, SHI32 (see extract in section Richness and volume of data ) defended the sufficiency of its sample size on the basis of “the high degree of consensus [that] had begun to emerge among those interviewed”, suggesting that information from interviews was being replicated. Finally, SHI112 (see extract in section Further sampling to check findings consistency ) argued that it achieved saturation of discursive patterns . Seven of the 19 SHI articles cited references to support their position on saturation (see Additional File  4 for the full list of citations used by articles to support their position on saturation across the three journals).

Overall, it is clear that the concept of saturation encompassed a wide range of variants expressed in terms such as saturation, data saturation, thematic saturation, theoretical saturation, category saturation, saturation of coding, saturation of discursive themes, and theme completeness. It is noteworthy, however, that although these various claims were sometimes supported with references to the literature, they were not evidenced in relation to the study at hand.

Pragmatic considerations

The determination of sample size on the basis of pragmatic considerations was the second most frequently invoked argument (9.6% of all justifications), appearing in all three journals. In the BMJ, one article (BMJ15) appealed to pragmatic reasons, relating to time constraints and the difficulty of accessing certain study populations, to justify the determination of its sample size.

On the basis of the researchers’ previous experience and the literature, [30, 31] we estimated that recruitment of 15–20 patients at each site would achieve data saturation when data from each site were analysed separately. We set a target of seven to 10 caregivers per site because of time constraints and the anticipated difficulty of accessing caregivers at some home based care services. This gave a target sample of 75–100 patients and 35–50 caregivers overall. (BMJ15).

In the BJHP, four articles mentioned pragmatic considerations relating to time or financial constraints (BJHP27 – see extract in section Saturation ; BJHP53), the participant response rate (BJHP13), and the fixed (and thus limited) size of the participant pool from which interviewees were sampled (BJHP18).

We had aimed to continue interviewing until we had reached saturation, a point whereby further data collection would yield no further themes. In practice, the number of individuals volunteering to participate dictated when recruitment into the study ceased (15 young people, 15 parents). Nonetheless, by the last few interviews, significant repetition of concepts was occurring, suggesting ample sampling. (BJHP13).

Finally, three SHI articles explained their sample size with reference to practical aspects: time constraints and project manageability (SHI56), limited availability of respondents and project resources (SHI131), and time constraints (SHI113).

The size of the sample was largely determined by the availability of respondents and resources to complete the study. Its composition reflected, as far as practicable, our interest in how contextual factors (for example, gender relations and ethnicity) mediated the illness experience. (SHI131).

Qualities of the analysis

This sample size justification (8.4% of all justifications) was mainly employed by BJHP articles and referred to an intensive, idiographic and/or latently focused analysis, i.e. one that moved beyond description. More specifically, six articles defended their sample size on the basis of an intensive analysis of transcripts and/or the idiographic focus of the study or analysis. Four of these papers (BJHP02; BJHP19; BJHP24; BJHP47) adopted an Interpretative Phenomenological Analysis (IPA) approach.

The current study employed a sample of 10 in keeping with the aim of exploring each participant’s account (Smith et al. , 1999). (BJHP19).

BJHP47 explicitly renounced the notion of saturation within an IPA approach. The other two BJHP articles conducted thematic analysis (BJHP34; BJHP38). The level of analysis – i.e. latent as opposed to a more superficial descriptive analysis – was also invoked as a justification by BJHP38, alongside the argument of an intensive analysis of individual transcripts.

The resulting sample size was at the lower end of the range of sample sizes employed in thematic analysis (Braun & Clarke, 2013). This was in order to enable significant reflection, dialogue, and time on each transcript and was in line with the more latent level of analysis employed, to identify underlying ideas, rather than a more superficial descriptive analysis (Braun & Clarke, 2006). (BJHP38).

Finally, one BMJ paper (BMJ21) defended its sample size with reference to the complexity of the analytic task.

We stopped recruitment when we reached 30–35 interviews, owing to the depth and duration of interviews, richness of data, and complexity of the analytical task. (BMJ21).

Meet sampling requirements

Meeting sampling requirements (7.2% of all justifications) was another argument employed by two BMJ and four SHI articles to explain their sample size. Achieving maximum variation sampling in terms of specific interviewee characteristics determined and explained the sample size of two BMJ studies (BMJ02; BMJ16 – see extract in section Meet research design requirements ).

Recruitment continued until sampling frame requirements were met for diversity in age, sex, ethnicity, frequency of attendance, and health status. (BMJ02).

Regarding the SHI articles, two papers explained their numbers on the basis of their sampling strategy (SHI01 – see extract in section Saturation ; SHI23), whilst sampling requirements that would help attain sample heterogeneity in terms of a particular characteristic of interest were cited by one paper (SHI127).

The combination of matching the recruitment sites for the quantitative research and the additional purposive criteria led to 104 phase 2 interviews (Internet (OLC): 21; Internet (FTF): 20; Gyms (FTF): 23; HIV testing (FTF): 20; HIV treatment (FTF): 20). (SHI23).

Of the fifty interviews conducted, thirty were translated from Spanish into English. These thirty, from which we draw our findings, were chosen for translation based on heterogeneity in depressive symptomology and educational attainment. (SHI127).

Finally, the pre-determination of sample size on the basis of sampling requirements was stated by one article though this was not used to justify the number of interviews (SHI10).

Sample size guidelines

Five BJHP articles (BJHP28; BJHP38 – see extract in section Qualities of the analysis ; BJHP46; BJHP47; BJHP50 – see extract in section Saturation ) and one SHI paper (SHI73) relied on citing existing sample size guidelines or norms within research traditions to determine and subsequently defend their sample size (7.2% of all justifications).

Sample size guidelines suggested a range between 20 and 30 interviews to be adequate (Creswell, 1998). Interviewer and note taker agreed that thematic saturation, the point at which no new concepts emerge from subsequent interviews (Patton, 2002), was achieved following completion of 20 interviews. (BJHP28).

Interviewing continued until we deemed data saturation to have been reached (the point at which no new themes were emerging). Researchers have proposed 30 as an approximate or working number of interviews at which one could expect to be reaching theoretical saturation when using a semi-structured interview approach (Morse 2000), although this can vary depending on the heterogeneity of respondents interviewed and complexity of the issues explored. (SHI73).

In line with existing research

Sample sizes of published literature in the area of the subject matter under investigation (3.5% of all justifications) were used by two BMJ articles as guidance and a precedent for determining and defending their own sample size (BMJ08; BMJ15 – see extract in section Pragmatic considerations ).

We drew participants from a list of prisoners who were scheduled for release each week, sampling them until we reached the target of 35 cases, with a view to achieving data saturation within the scope of the study and sufficient follow-up interviews and in line with recent studies [8–10]. (BMJ08).

Similarly, BJHP38 (see extract in section Qualities of the analysis ) claimed that its sample size was within the range of sample sizes of published studies that use its analytic approach.

Richness and volume of data

BMJ21 (see extract in section Qualities of the analysis ) and SHI32 referred to the richness, detailed nature, and volume of data collected (2.3% of all justifications) to justify the sufficiency of their sample size.

Although there were more potential interviewees from those contacted by postcode selection, it was decided to stop recruitment after the 10th interview and focus on analysis of this sample. The material collected was considerable and, given the focused nature of the study, extremely detailed. Moreover, a high degree of consensus had begun to emerge among those interviewed, and while it is always difficult to judge at what point ‘theoretical saturation’ has been reached, or how many interviews would be required to uncover exception(s), it was felt the number was sufficient to satisfy the aims of this small in-depth investigation (Strauss and Corbin 1990). (SHI32).

Meet research design requirements

Determining the sample size so that it was in line with, and served the requirements of, the research design the study adopted (2.3% of all justifications) was another justification, used by two BMJ papers (BMJ16; BMJ08 – see extract in section In line with existing research ).

We aimed for diverse, maximum variation samples [20] totalling 80 respondents from different social backgrounds and ethnic groups and those bereaved due to different types of suicide and traumatic death. We could have interviewed a smaller sample at different points in time (a qualitative longitudinal study) but chose instead to seek a broad range of experiences by interviewing those bereaved many years ago and others bereaved more recently; those bereaved in different circumstances and with different relations to the deceased; and people who lived in different parts of the UK; with different support systems and coroners’ procedures (see Tables 1 and 2 for more details). (BMJ16).

Researchers’ previous experience

The researchers’ previous experience (possibly referring to experience with qualitative research) was invoked by BMJ15 (see extract in section Pragmatic considerations ) as a justification for the determination of sample size.

Nature of study

One BJHP paper argued that the sample size was appropriate for the exploratory nature of the study (BJHP38).

A sample of eight participants was deemed appropriate because of the exploratory nature of this research and the focus on identifying underlying ideas about the topic. (BJHP38).

Further sampling to check findings consistency

Finally, SHI112 argued that once it had achieved saturation of discursive patterns, further sampling was decided and conducted to check for consistency of the findings.

Within each of the age-stratified groups, interviews were randomly sampled until saturation of discursive patterns was achieved. This resulted in a sample of 67 interviews. Once this sample had been analysed, one further interview from each age-stratified group was randomly chosen to check for consistency of the findings. Using this approach it was possible to more carefully explore children’s discourse about the ‘I’, agency, relationality and power in the thematic areas, revealing the subtle discursive variations described in this article. (SHI112).

Thematic analysis of passages discussing sample size

This analysis resulted in two overarching thematic areas; the first concerned the variation in the characterisation of sample size sufficiency, and the second related to the perceived threats deriving from sample size insufficiency.

Characterisations of sample size sufficiency

The analysis showed that there were three main characterisations of sample size in the articles that provided relevant comments and discussion: (a) the vast majority of these qualitative studies ( n  = 42) considered their sample size ‘small’, and this was seen and discussed as a limitation (only two articles viewed their small sample size as desirable and appropriate); (b) a minority of articles ( n  = 4) proclaimed that their achieved sample size was ‘sufficient’; and (c) a small group of studies ( n  = 5) characterised their sample size as ‘large’. Whilst achieving a ‘large’ sample size was sometimes viewed positively because it led to richer results, there were also occasions when a large sample size was problematic rather than desirable.

‘Small’ but why and for whom?

A number of articles which characterised their sample size as ‘small’ did so against an implicit or explicit quantitative framework of reference. Interestingly, three studies that claimed to have achieved data saturation or ‘theoretical sufficiency’ with their sample size, discussed or noted as a limitation in their discussion their ‘small’ sample size, raising the question of why, or for whom, the sample size was considered small given that the qualitative criterion of saturation had been satisfied.

The current study has a number of limitations. The sample size was small (n = 11) and, however, large enough for no new themes to emerge. (BJHP39).

The study has two principal limitations. The first of these relates to the small number of respondents who took part in the study. (SHI73).

Other articles appeared to accept and acknowledge that their sample was flawed because of its small size (as well as other compositional ‘deficits’, e.g. non-representativeness, biases, self-selection) or anticipated that they might be criticised for their small sample size. It seemed that the imagined audience – perhaps a reviewer or reader – was one inclined to hold the tenets of quantitative research, and certainly one to whom it was important to signal recognition that small samples were likely to be problematic. That one’s sample might be thought small was often construed as a limitation, couched in a discourse of regret or apology.

Very occasionally, the articulation of the small size as a limitation was explicitly aligned against an espoused positivist framework and quantitative research.

This study has some limitations. Firstly, the 100 incidents sample represents a small number of the total number of serious incidents that occurs every year. 26 We sent out a nationwide invitation and do not know why more people did not volunteer for the study. Our lack of epidemiological knowledge about healthcare incidents, however, means that determining an appropriate sample size continues to be difficult. (BMJ20).

Indicative of an apparent oscillation of qualitative researchers between the different requirements and protocols demarcating the quantitative and qualitative worlds, there were a few instances of articles which briefly recognised their ‘small’ sample size as a limitation, but then defended their study on more qualitative grounds, such as their ability and success at capturing the complexity of experience and delving into the idiographic, and at generating particularly rich data.

This research, while limited in size, has sought to capture some of the complexity attached to men’s attitudes and experiences concerning incomes and material circumstances. (SHI35).

Our numbers are small because negotiating access to social networks was slow and labour intensive, but our methods generated exceptionally rich data. (BMJ21).

This study could be criticised for using a small and unrepresentative sample. Given that older adults have been ignored in the research concerning suntanning, fair-skinned older adults are the most likely to experience skin cancer, and women privilege appearance over health when it comes to sunbathing practices, our study offers depth and richness of data in a demographic group much in need of research attention. (SHI57).

‘Good enough’ sample sizes

Only four articles expressed some degree of confidence that their achieved sample size was sufficient. For example, SHI139, in line with the justification of thematic saturation that it offered, expressed trust in the sufficiency of its sample size despite the poor response rate. Similarly, BJHP04, which did not provide a sample size justification, argued that it targeted a larger sample size in order to eventually recruit a sufficient number of interviewees, due to an anticipated low response rate.

Twenty-three people with type I diabetes from the target population of 133 ( i.e. 17.3%) consented to participate but four did not then respond to further contacts (total N = 19). The relatively low response rate was anticipated, due to the busy life-styles of young people in the age range, the geographical constraints, and the time required to participate in a semi-structured interview, so a larger target sample allowed a sufficient number of participants to be recruited. (BJHP04).

Two other articles (BJHP35; SHI32) linked the claimed sufficiency to the scope (i.e. ‘small, in-depth investigation’), aims and nature (i.e. ‘exploratory’) of their studies, thus anchoring their numbers to the particular context of their research. Nevertheless, claims of sample size sufficiency were sometimes undermined when they were juxtaposed with an acknowledgement that a larger sample size would be more scientifically productive.

Although our sample size was sufficient for this exploratory study, a more diverse sample including participants with lower socioeconomic status and more ethnic variation would be informative. A larger sample could also ensure inclusion of a more representative range of apps operating on a wider range of platforms. (BJHP35).

‘Large’ sample sizes - Promise or peril?

Three articles (BMJ13; BJHP05; BJHP48), which all provided the justification of saturation, characterised their sample size as ‘large’ and narrated this oversufficiency in positive terms, as it allowed richer data and findings and enhanced the potential for generalisation. The type of generalisation aspired to (BJHP48) was not, however, further specified.

This study used rich data provided by a relatively large sample of expert informants on an important but under-researched topic. (BMJ13).

Qualitative research provides a unique opportunity to understand a clinical problem from the patient’s perspective. This study had a large diverse sample, recruited through a range of locations and used in-depth interviews which enhance the richness and generalizability of the results. (BJHP48).

And whilst a ‘large’ sample size was endorsed and valued by some qualitative researchers, within the psychological tradition of IPA, a ‘large’ sample size was counter-normative and therefore needed to be justified. Four BJHP studies, all adopting IPA, expressed the appropriateness or desirability of ‘small’ sample sizes (BJHP41; BJHP45) or hastened to explain why they included a larger than typical sample size (BJHP32; BJHP47). For example, BJHP32 below provides a rationale for how an IPA study can accommodate a large sample size and how this was indeed suitable for the purposes of the particular research. To strengthen the explanation for choosing a non-normative sample size, previous IPA research citing a similar sample size approach is used as a precedent.

Small scale IPA studies allow in-depth analysis which would not be possible with larger samples (Smith et al. , 2009). (BJHP41).

Although IPA generally involves intense scrutiny of a small number of transcripts, it was decided to recruit a larger diverse sample as this is the first qualitative study of this population in the United Kingdom (as far as we know) and we wanted to gain an overview. Indeed, Smith, Flowers, and Larkin (2009) agree that IPA is suitable for larger groups. However, the emphasis changes from an in-depth individualistic analysis to one in which common themes from shared experiences of a group of people can be elicited and used to understand the network of relationships between themes that emerge from the interviews. This large-scale format of IPA has been used by other researchers in the field of false-positive research. Baillie, Smith, Hewison, and Mason (2000) conducted an IPA study, with 24 participants, of ultrasound screening for chromosomal abnormality; they found that this larger number of participants enabled them to produce a more refined and cohesive account. (BJHP32).

The IPA articles found in the BJHP were the only instances where a ‘small’ sample size was advocated and a ‘large’ sample size problematized and defended. These IPA studies illustrate that the characterisation of sample size sufficiency can be a function of researchers’ theoretical and epistemological commitments rather than the result of an ‘objective’ sample size assessment.

Threats from sample size insufficiency

As shown above, the majority of articles that commented on their sample size simultaneously characterized it as small and problematic. On those occasions when authors did not simply cite their ‘small’ sample size as a study limitation but went on to provide an account of how and why a small sample size was problematic, two important scientific qualities of the research seemed to be threatened: the generalizability and validity of results.

Generalizability

Those who characterised their sample as ‘small’ connected this to the limited potential for generalisation of the results. Other features related to the sample – often some kind of compositional particularity – were also linked to limited potential for generalisation. Though articles did not always explicitly state what form of generalisation they referred to (see BJHP09), generalisation was mostly conceived in nomothetic terms, that is, it concerned the potential to draw inferences from the sample to the broader study population (‘representational generalisation’ – see BJHP31) and less often to other populations or cultures.

It must be noted that samples are small and whilst in both groups the majority of those women eligible participated, generalizability cannot be assumed. (BJHP09). The study’s limitations should be acknowledged: Data are presented from interviews with a relatively small group of participants, and thus, the views are not necessarily generalizable to all patients and clinicians. In particular, patients were only recruited from secondary care services where COFP diagnoses are typically confirmed. The sample therefore is unlikely to represent the full spectrum of patients, particularly those who are not referred to, or who have been discharged from dental services. (BJHP31).

Without explicitly using the term generalisation, two SHI articles noted how their ‘small’ sample size imposed limits on ‘the extent that we can extrapolate from these participants’ accounts’ (SHI114) or to the possibility ‘to draw far-reaching conclusions from the results’ (SHI124).

Interestingly, only a minority of articles alluded to, or invoked, a type of generalisation that is aligned with qualitative research, that is, idiographic generalisation (i.e. generalisation that can be made from and about cases [5]). These articles, all published in the discipline of sociology, defended their findings in terms of the possibility of drawing logical and conceptual inferences to other contexts and of generating understanding that has the potential to advance knowledge, despite their ‘small’ size. One article (SHI139) clearly contrasted nomothetic (statistical) generalisation to idiographic generalisation, arguing that the lack of statistical generalizability does not nullify the ability of qualitative research to still be relevant beyond the sample studied.

Further, these data do not need to be statistically generalisable for us to draw inferences that may advance medicalisation analyses (Charmaz 2014). These data may be seen as an opportunity to generate further hypotheses and are a unique application of the medicalisation framework. (SHI139). Although a small-scale qualitative study related to school counselling, this analysis can be usefully regarded as a case study of the successful utilisation of mental health-related resources by adolescents. As many of the issues explored are of relevance to mental health stigma more generally, it may also provide insights into adult engagement in services. It shows how a sociological analysis, which uses positioning theory to examine how people negotiate, partially accept and simultaneously resist stigmatisation in relation to mental health concerns, can contribute to an elucidation of the social processes and narrative constructions which may maintain as well as bridge the mental health service gap. (SHI103).

Only one article (SHI30) used the term transferability to argue for the potential of wider relevance of the results which was thought to be more the product of the composition of the sample (i.e. diverse sample), rather than the sample size.

Internal validity

The second major concern that arose from a ‘small’ sample size pertained to the internal validity of findings (here the term is used to denote the ‘truth’ or credibility of research findings). Authors expressed uncertainty about the degree of confidence in particular aspects or patterns of their results, primarily those that concerned some form of differentiation on the basis of relevant participant characteristics.

The information source preferred seemed to vary according to parents’ education; however, the sample size is too small to draw conclusions about such patterns. (SHI80). Although our numbers were too small to demonstrate gender differences with any certainty, it does seem that the biomedical and erotic scripts may be more common in the accounts of men and the relational script more common in the accounts of women. (SHI81).

In other instances, articles expressed uncertainty about whether their results accounted for the full spectrum and variation of the phenomenon under investigation. In other words, a ‘small’ sample size (alongside compositional ‘deficits’, such as a sample that was not statistically representative) was seen to threaten the ‘content validity’ of the results, which in turn led to constructions of the study conclusions as tentative.

Data collection ceased on pragmatic grounds rather than when no new information appeared to be obtained (i.e., saturation point). As such, care should be taken not to overstate the findings. Whilst the themes from the initial interviews seemed to be replicated in the later interviews, further interviews may have identified additional themes or provided more nuanced explanations. (BJHP53). …it should be acknowledged that this study was based on a small sample of self-selected couples in enduring marriages who were not broadly representative of the population. Thus, participants may not be representative of couples that experience postnatal PTSD. It is therefore unlikely that all the key themes have been identified and explored. For example, couples who were excluded from the study because the male partner declined to participate may have been experiencing greater interpersonal difficulties. (BJHP03).

In other instances, articles attempted to preserve a degree of credibility of their results, despite the recognition that the sample size was ‘small’. Clarity and sharpness of emerging themes and alignment with previous relevant work were the arguments employed to warrant the validity of the results.

This study focused on British Chinese carers of patients with affective disorders, using a qualitative methodology to synthesise the sociocultural representations of illness within this community. Despite the small sample size, clear themes emerged from the narratives that were sufficient for this exploratory investigation. (SHI98).

Discussion

The present study sought to examine how qualitative sample sizes in health-related research are characterised and justified. In line with previous studies [22, 30, 33, 34], the findings demonstrate that reporting of sample size sufficiency is limited; just over 50% of articles in the BMJ and BJHP, and 82% in the SHI, did not provide any sample size justification. Providing a sample size justification was not related to the number of interviews conducted, but it was associated with the journal in which the article was published, indicating the influence of disciplinary or publishing norms, also reported in prior research [30]. This lack of transparency about sample size sufficiency is problematic given that most qualitative researchers would agree that it is an important marker of quality [56, 57]. Moreover, with the rise of qualitative research in the social sciences, poor reporting obstructs efforts to synthesise existing evidence and assess its quality [58, 59].

When authors justified their sample size, our findings indicate that sufficiency was mostly appraised with reference to features that were intrinsic to the study, in agreement with general advice on sample size determination [4, 11, 36]. The principle of saturation was the most commonly invoked argument [22], accounting for 55% of all justifications. A wide range of variants of saturation was evident, corroborating the proliferation of the meaning of the term [49] and reflecting different underlying conceptualisations or models of saturation [20]. Nevertheless, claims of saturation were never substantiated in relation to procedures conducted in the study itself, endorsing similar observations in the literature [25, 30, 47]. Claims of saturation were sometimes supported with citations of other literature, suggesting a removal of the concept away from the characteristics of the study at hand. Pragmatic considerations, such as resource constraints or participant response rate and availability, were the second most frequently used argument, accounting for approximately 10% of justifications; another 23% of justifications also represented intrinsic-to-the-study characteristics (i.e. qualities of the analysis, meeting sampling or research design requirements, richness and volume of the data obtained, nature of the study, further sampling to check consistency of findings).

Only 12% of sample size justifications pertained to arguments that were external to the study at hand, in the form of existing sample size guidelines and prior research that sets precedents. Whilst community norms and prior research can establish useful rules of thumb for estimating sample sizes [60] – and reveal what sizes are more likely to be acceptable within research communities – researchers should avoid adopting these norms uncritically, especially when such guidelines [e.g. 30, 35] might be based on research that does not provide adequate evidence of sample size sufficiency. Similarly, whilst methodological research that seeks to demonstrate the achievement of saturation is invaluable, since it explicates the parameters upon which saturation is contingent and indicates when a research project is likely to require a smaller or a larger sample [e.g. 29], specific numbers at which saturation was achieved within these projects cannot be routinely extrapolated to other projects. We concur with existing views [11, 36] that the characteristics of the study at hand – such as the epistemological and theoretical approach, the nature of the phenomenon under investigation, the aims and scope of the study, the quality and richness of data, and the researcher’s experience and skill in conducting qualitative research – should be the primary guide in determining sample size and assessing its sufficiency.

Moreover, although numbers in qualitative research are not unimportant [61], sample size should not be considered alone but be embedded in the more encompassing examination of data adequacy [56, 57]. Erickson’s [62] dimensions of ‘evidentiary adequacy’ are useful here. He explains the concept in terms of adequate amounts of evidence, adequate variety in kinds of evidence, adequate interpretive status of evidence, adequate disconfirming evidence, and adequate discrepant case analysis. Not all dimensions will be relevant to every qualitative research design, but together they illustrate the thickness of the concept of data adequacy, taking it beyond sample size.

The present research also demonstrated that sample sizes were commonly seen as ‘small’ and insufficient and discussed as a limitation. Often unjustified (and in two cases incongruent with the articles’ own claims of saturation), these characterisations imply that sample size in qualitative health research is often adversely judged (or expected to be judged) against an implicit, yet omnipresent, quasi-quantitative standpoint. Indeed, there were a few instances in our data where authors appeared, possibly in response to reviewers, to resist some form of quantification of their results. This implicit reference point became more apparent when authors discussed the threats deriving from an insufficient sample size. Whilst the concerns about internal validity might be legitimate to the extent that qualitative research projects, which are broadly related to realism, are set to examine phenomena in sufficient breadth and depth, the concerns around generalizability revealed a conceptualisation that is not compatible with purposive sampling. The limited potential for generalisation, as a result of a small sample size, was often discussed in nomothetic, statistical terms. Only occasionally was analytic or idiographic generalisation invoked to warrant the value of the study’s findings [5, 17].

Strengths and limitations of the present study

We note, first, the limited number of health-related journals reviewed, so that only a ‘snapshot’ of qualitative health research has been captured. Examining additional disciplines (e.g. nursing sciences) as well as inter-disciplinary journals would add to the findings of this analysis. Nevertheless, our study is the first to provide some comparative insights on the basis of disciplines that are differently attached to the legacy of positivism, and it analysed literature published over a lengthy period of time (15 years). Guetterman [27] also examined health-related literature, but that analysis was restricted to the 26 most highly cited articles published over a period of five years, whilst Carlsen and Glenton’s [22] study concentrated on focus group health research. Moreover, although it was our intention to examine sample size justification in relation to the epistemological and theoretical positions of articles, this proved challenging, largely because of the absence of relevant information and the difficulty of clearly discerning articles’ positions [63] and classifying them under specific approaches (e.g. studies often combined elements from different theoretical and epistemological traditions). We believe that such an analysis would yield useful insights, as it links the methodological issue of sample size to the broader philosophical stance of the research. Despite these limitations, the analysis of the characterisation of sample size and of the threats seen to accrue from insufficient sample size enriches our understanding of sample size (in)sufficiency argumentation by linking it to other features of the research. As the peer-review process becomes increasingly public, future research could usefully examine how reporting around sample size sufficiency and data adequacy might be influenced by the interactions between authors and reviewers.

Conclusions

The past decade has seen a growing appetite in qualitative research for an evidence-based approach to sample size determination and to evaluations of the sufficiency of sample size. Despite the conceptual and methodological developments in the area, the findings of the present study confirm previous studies in concluding that appraisals of sample size sufficiency are either absent or poorly substantiated. To ensure and maintain high quality research that will encourage greater appreciation of qualitative work in health-related sciences [64], we argue that qualitative researchers should be more transparent and thorough in their evaluation of sample size as part of their appraisal of data adequacy. We would encourage the practice of appraising sample size sufficiency with close reference to the study at hand and would thus caution against responding to the growing methodological research in this area with a decontextualised application of sample size numerical guidelines, norms and principles. Although researchers might find sample size community norms serve as useful rules of thumb, we recommend that methodological knowledge be used to critically consider how saturation and other parameters that affect sample size sufficiency pertain to the specifics of the particular project. Those reviewing papers have a vital role in encouraging transparent study-specific reporting. The review process should support authors to exercise nuanced judgments in decisions about sample size determination in the context of the range of factors that influence sample size sufficiency and the specifics of a particular study. In light of the growing methodological evidence in the area, transparent presentation of such evidence-based judgement is crucial and in time should surely obviate the seemingly routine practice of citing the ‘small’ size of qualitative samples among the study limitations.

A non-parametric test of difference for independent samples was performed since the variable number of interviews violated assumptions of normality according to the standardized scores of skewness and kurtosis (BMJ: z skewness = 3.23, z kurtosis = 1.52; BJHP: z skewness = 4.73, z kurtosis = 4.85; SHI: z skewness = 12.04, z kurtosis = 21.72) and the Shapiro-Wilk test of normality (p < .001).
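The normality screening described above can be sketched in code. This is a hypothetical reconstruction: the interview-count data below are simulated (the paper's raw per-article counts are not reproduced here), and the Kruskal-Wallis test is shown as one common non-parametric test of difference for independent samples, since the excerpt does not name the specific test used.

```python
# Hedged sketch of the normality check and non-parametric comparison.
# Data are simulated; Kruskal-Wallis stands in for the unnamed test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Assumed example data: number of interviews per article in each journal.
bmj = rng.poisson(20, size=30) + 1.0
bjhp = rng.poisson(15, size=40) + 1.0
shi = rng.poisson(25, size=50) + 1.0

def standardized_skew_kurtosis(x):
    """Skewness and excess kurtosis divided by their standard errors
    (the 'standardized scores' referred to in the text)."""
    n = len(x)
    se_skew = np.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2.0 * se_skew * np.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return stats.skew(x) / se_skew, stats.kurtosis(x) / se_kurt

for name, sample in [("BMJ", bmj), ("BJHP", bjhp), ("SHI", shi)]:
    z_s, z_k = standardized_skew_kurtosis(sample)
    w, p_sw = stats.shapiro(sample)  # Shapiro-Wilk test of normality
    print(f"{name}: z_skew={z_s:.2f}, z_kurt={z_k:.2f}, Shapiro-Wilk p={p_sw:.3f}")

# Non-parametric test of difference across the three independent samples.
h, p = stats.kruskal(bmj, bjhp, shi)
print(f"Kruskal-Wallis H={h:.2f}, p={p:.4f}")
```

A |z| above roughly 1.96 for skewness or kurtosis, or a Shapiro-Wilk p below .05, is conventionally taken as evidence against normality, motivating the switch to a rank-based test.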

Abbreviations

BJHP: British Journal of Health Psychology

BMJ: British Medical Journal

IPA: Interpretative Phenomenological Analysis

SHI: Sociology of Health & Illness

Spencer L, Ritchie J, Lewis J, Dillon L. Quality in qualitative evaluation: a framework for assessing research evidence. National Centre for Social Research 2003 https://www.heacademy.ac.uk/system/files/166_policy_hub_a_quality_framework.pdf Accessed 11 May 2018.

Fusch PI, Ness LR. Are we there yet? Data saturation in qualitative research. Qual Rep. 2015;20(9):1408–16.


Robinson OC. Sampling in interview-based qualitative research: a theoretical and practical guide. Qual Res Psychol. 2014;11(1):25–41.


Sandelowski M. Sample size in qualitative research. Res Nurs Health. 1995;18(2):179–83.


Sandelowski M. One is the liveliest number: the case orientation of qualitative research. Res Nurs Health. 1996;19(6):525–9.

Luborsky MR, Rubinstein RL. Sampling in qualitative research: rationale, issues, and methods. Res Aging. 1995;17(1):89–113.

Marshall MN. Sampling for qualitative research. Fam Pract. 1996;13(6):522–6.

Patton MQ. Qualitative evaluation and research methods. 2nd ed. Newbury Park, CA: Sage; 1990.

van Rijnsoever FJ. (I Can’t get no) saturation: a simulation and guidelines for sample sizes in qualitative research. PLoS One. 2017;12(7):e0181689.

Morse JM. The significance of saturation. Qual Health Res. 1995;5(2):147–9.

Morse JM. Determining sample size. Qual Health Res. 2000;10(1):3–5.

Gergen KJ, Josselson R, Freeman M. The promises of qualitative inquiry. Am Psychol. 2015;70(1):1–9.

Borsci S, Macredie RD, Barnett J, Martin J, Kuljis J, Young T. Reviewing and extending the five-user assumption: a grounded procedure for interaction evaluation. ACM Trans Comput Hum Interact. 2013;20(5):29.

Borsci S, Macredie RD, Martin JL, Young T. How many testers are needed to assure the usability of medical devices? Expert Rev Med Devices. 2014;11(5):513–25.

Glaser BG, Strauss AL. The discovery of grounded theory: strategies for qualitative research. Chicago, IL: Aldine; 1967.

Kerr C, Nixon A, Wild D. Assessing and demonstrating data saturation in qualitative inquiry supporting patient-reported outcomes research. Expert Rev Pharmacoecon Outcomes Res. 2010;10(3):269–81.

Lincoln YS, Guba EG. Naturalistic inquiry. London: Sage; 1985.


Malterud K, Siersma VD, Guassora AD. Sample size in qualitative interview studies: guided by information power. Qual Health Res. 2015;26:1753–60.

Nelson J. Using conceptual depth criteria: addressing the challenge of reaching saturation in qualitative research. Qual Res. 2017;17(5):554–70.

Saunders B, Sim J, Kingstone T, Baker S, Waterfield J, Bartlam B, et al. Saturation in qualitative research: exploring its conceptualization and operationalization. Qual Quant. 2017. https://doi.org/10.1007/s11135-017-0574-8 .

Caine K. Local standards for sample size at CHI. In Proceedings of the 2016 CHI conference on human factors in computing systems. 2016;981–992. ACM.

Carlsen B, Glenton C. What about N? A methodological study of sample-size reporting in focus group studies. BMC Med Res Methodol. 2011;11(1):26.

Constantinou CS, Georgiou M, Perdikogianni M. A comparative method for themes saturation (CoMeTS) in qualitative interviews. Qual Res. 2017;17(5):571–88.

Dai NT, Free C, Gendron Y. Interview-based research in accounting 2000–2014: a review. November 2016. https://ssrn.com/abstract=2711022 or https://doi.org/10.2139/ssrn.2711022 . Accessed 17 May 2018.

Francis JJ, Johnston M, Robertson C, Glidewell L, Entwistle V, Eccles MP, et al. What is an adequate sample size? Operationalising data saturation for theory-based interview studies. Psychol Health. 2010;25(10):1229–45.

Guest G, Bunce A, Johnson L. How many interviews are enough? An experiment with data saturation and variability. Field Methods. 2006;18(1):59–82.

Guetterman TC. Descriptions of sampling practices within five approaches to qualitative research in education and the health sciences. Forum Qual Soc Res. 2015;16(2):25. http://nbn-resolving.de/urn:nbn:de:0114-fqs1502256 . Accessed 17 May 2018.

Hagaman AK, Wutich A. How many interviews are enough to identify metathemes in multisited and cross-cultural research? Another perspective on guest, bunce, and Johnson’s (2006) landmark study. Field Methods. 2017;29(1):23–41.

Hennink MM, Kaiser BN, Marconi VC. Code saturation versus meaning saturation: how many interviews are enough? Qual Health Res. 2017;27(4):591–608.

Marshall B, Cardon P, Poddar A, Fontenot R. Does sample size matter in qualitative research?: a review of qualitative interviews in IS research. J Comput Inform Syst. 2013;54(1):11–22.

Mason M. Sample size and saturation in PhD studies using qualitative interviews. Forum Qual Soc Res 2010;11(3):8. http://nbn-resolving.de/urn:nbn:de:0114-fqs100387 . Accessed 17 May 2018.

Safman RM, Sobal J. Qualitative sample extensiveness in health education research. Health Educ Behav. 2004;31(1):9–21.

Saunders MN, Townsend K. Reporting and justifying the number of interview participants in organization and workplace research. Br J Manag. 2016;27(4):836–52.

Sobal J. Sample extensiveness in qualitative nutrition education research. J Nutr Educ. 2001;33(4):184–92.

Thomson SB. Sample size and grounded theory. JOAAG. 2010;5(1). http://www.joaag.com/uploads/5_1__Research_Note_1_Thomson.pdf . Accessed 17 May 2018.

Baker SE, Edwards R. How many qualitative interviews is enough?: expert voices and early career reflections on sampling and cases in qualitative research. National Centre for Research Methods Review Paper. 2012; http://eprints.ncrm.ac.uk/2273/4/how_many_interviews.pdf . Accessed 17 May 2018.

Ogden J, Cornwell D. The role of topic, interviewee, and question in predicting rich interview data in the field of health research. Sociol Health Illn. 2010;32(7):1059–71.

Green J, Thorogood N. Qualitative methods for health research. London: Sage; 2004.

Ritchie J, Lewis J, Elam G. Designing and selecting samples. In: Ritchie J, Lewis J, editors. Qualitative research practice: a guide for social science students and researchers. London: Sage; 2003. p. 77–108.

Britten N. Qualitative research: qualitative interviews in medical research. BMJ. 1995;311(6999):251–3.

Creswell JW. Qualitative inquiry and research design: choosing among five approaches. 2nd ed. London: Sage; 2007.

Fugard AJ, Potts HW. Supporting thinking on sample sizes for thematic analyses: a quantitative tool. Int J Soc Res Methodol. 2015;18(6):669–84.

Emmel N. Themes, variables, and the limits to calculating sample size in qualitative research: a response to Fugard and Potts. Int J Soc Res Methodol. 2015;18(6):685–6.

Braun V, Clarke V. (Mis) conceptualising themes, thematic analysis, and other problems with Fugard and Potts’ (2015) sample-size tool for thematic analysis. Int J Soc Res Methodol. 2016;19(6):739–43.

Hammersley M. Sampling and thematic analysis: a response to Fugard and Potts. Int J Soc Res Methodol. 2015;18(6):687–8.

Charmaz K. Constructing grounded theory: a practical guide through qualitative analysis. London: Sage; 2006.

Bowen GA. Naturalistic inquiry and the saturation concept: a research note. Qual Res. 2008;8(1):137–52.

Morse JM. Data were saturated. Qual Health Res. 2015;25(5):587–8.

O’Reilly M, Parker N. ‘Unsatisfactory saturation’: a critical exploration of the notion of saturated sample sizes in qualitative research. Qual Res. 2013;13(2):190–7.

Manen M, Higgins I, Riet P. A conversation with max van Manen on phenomenology in its original sense. Nurs Health Sci. 2016;18(1):4–7.

Dey I. Grounding grounded theory. San Francisco, CA: Academic Press; 1999.

Hays DG, Wood C, Dahl H, Kirk-Jenkins A. Methodological rigor in journal of counseling & development qualitative research articles: a 15-year review. J Couns Dev. 2016;94(2):172–83.

Moher D, Liberati A, Tetzlaff J, Altman DG, Prisma Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009; 6(7): e1000097.

Hsieh HF, Shannon SE. Three approaches to qualitative content analysis. Qual Health Res. 2005;15(9):1277–88.

Boyatzis RE. Transforming qualitative information: thematic analysis and code development. Thousand Oaks, CA: Sage; 1998.

Levitt HM, Motulsky SL, Wertz FJ, Morrow SL, Ponterotto JG. Recommendations for designing and reviewing qualitative research in psychology: promoting methodological integrity. Qual Psychol. 2017;4(1):2–22.

Morrow SL. Quality and trustworthiness in qualitative research in counseling psychology. J Couns Psychol. 2005;52(2):250–60.

Barroso J, Sandelowski M. Sample reporting in qualitative studies of women with HIV infection. Field Methods. 2003;15(4):386–404.

Glenton C, Carlsen B, Lewin S, Munthe-Kaas H, Colvin CJ, Tunçalp Ö, et al. Applying GRADE-CERQual to qualitative evidence synthesis findings—paper 5: how to assess adequacy of data. Implement Sci. 2018;13(Suppl 1):14.

Onwuegbuzie AJ, Leech NL. A call for qualitative power analyses. Qual Quant. 2007;41(1):105–21.

Sandelowski M. Real qualitative researchers do not count: the use of numbers in qualitative research. Res Nurs Health. 2001;24(3):230–40.

Erickson F. Qualitative methods in research on teaching. In: Wittrock M, editor. Handbook of research on teaching. 3rd ed. New York: Macmillan; 1986. p. 119–61.

Bradbury-Jones C, Taylor J, Herber O. How theory is used and articulated in qualitative research: development of a new typology. Soc Sci Med. 2014;120:135–41.

Greenhalgh T, Annandale E, Ashcroft R, Barlow J, Black N, Bleakley A, et al. An open letter to the BMJ editors on qualitative research. BMJ. 2016;352:i563.


Acknowledgments

We would like to thank Dr. Paula Smith and Katharine Lee for their comments on a previous draft of this paper as well as Natalie Ann Mitchell and Meron Teferra for assisting us with data extraction.

This research was initially conceived of and partly conducted with financial support from the Multidisciplinary Assessment of Technology Centre for Healthcare (MATCH) programme (EP/F063822/1 and EP/G012393/1). The research continued and was completed independent of any support. The funding body did not have any role in the study design, the collection, analysis and interpretation of the data, in the writing of the paper, and in the decision to submit the manuscript for publication. The views expressed are those of the authors alone.

Availability of data and materials

Supporting data can be accessed in the original publications. Additional File 2 lists all eligible studies that were included in the present analysis.

Author information

Authors and affiliations

Department of Psychology, University of Bath, Building 10 West, Claverton Down, Bath, BA2 7AY, UK

Konstantina Vasileiou & Julie Barnett

School of Psychology, Newcastle University, Ridley Building 1, Queen Victoria Road, Newcastle upon Tyne, NE1 7RU, UK

Susan Thorpe

Department of Computer Science, Brunel University London, Wilfred Brown Building 108, Uxbridge, UB8 3PH, UK

Terry Young


Contributions

JB and TY conceived the study; KV, JB, and TY designed the study; KV identified the articles and extracted the data; KV and JB assessed eligibility of articles; KV, JB, ST, and TY contributed to the analysis of the data, discussed the findings and early drafts of the paper; KV developed the final manuscript; KV, JB, ST, and TY read and approved the manuscript.

Corresponding author

Correspondence to Konstantina Vasileiou .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

Terry Young is an academic who undertakes research and occasional consultancy in the areas of health technology assessment, information systems, and service design. He is unaware of any direct conflict of interest with respect to this paper. All other authors have no competing interests to declare.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional Files

Additional File 1:

Editorial positions on qualitative research and sample considerations (where available). (DOCX 12 kb)

Additional File 2:

List of eligible articles included in the review (N = 214). (DOCX 38 kb)

Additional File 3:

Data Extraction Form. (DOCX 15 kb)

Additional File 4:

Citations used by articles to support their position on saturation. (DOCX 14 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Vasileiou, K., Barnett, J., Thorpe, S. et al. Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period. BMC Med Res Methodol 18 , 148 (2018). https://doi.org/10.1186/s12874-018-0594-7


Received : 22 May 2018

Accepted : 29 October 2018

Published : 21 November 2018

DOI : https://doi.org/10.1186/s12874-018-0594-7

Keywords

  • Sample size
  • Sample size justification
  • Sample size characterisation
  • Data adequacy
  • Qualitative health research
  • Qualitative interviews
  • Systematic analysis

BMC Medical Research Methodology

ISSN: 1471-2288


Qualitative Research: Definition, Methodology, Limitation & Examples

Qualitative research is a method focused on understanding human behavior and experiences through non-numerical data. Examples of qualitative research include:

  • One-on-one interviews,
  • Focus groups, Ethnographic research,
  • Case studies,
  • Record keeping,
  • Qualitative observations

In this article, we’ll provide tips and tricks on how to use qualitative research to better understand your audience through real world examples and improve your ROI. We’ll also learn the difference between qualitative and quantitative data.


Marketers often seek to understand their customers deeply. Qualitative research methods such as face-to-face interviews, focus groups, and qualitative observations can provide valuable insights into your products, your market, and your customers’ opinions and motivations. Understanding these nuances can significantly enhance marketing strategies and overall customer satisfaction.

What is Qualitative Research

Qualitative research is a market research method that focuses on obtaining data through open-ended and conversational communication. This method focuses on the “why” rather than the “what” people think about you. Thus, qualitative research seeks to uncover the underlying motivations, attitudes, and beliefs that drive people’s actions. 

Let’s say you have an online shop catering to a general audience. You do a demographic analysis and you find out that most of your customers are male. Naturally, you will want to find out why women are not buying from you. And that’s what qualitative research will help you find out.

In the case of your online shop, qualitative research would involve reaching out to female non-customers through methods such as in-depth interviews or focus groups. These interactions provide a platform for women to express their thoughts, feelings, and concerns regarding your products or brand. Through qualitative analysis, you can uncover valuable insights into factors such as product preferences, user experience, brand perception, and barriers to purchase.

Types of Qualitative Research Methods

Qualitative research methods are designed in a manner that helps reveal the behavior and perception of a target audience regarding a particular topic.

The most frequently used qualitative analysis methods are one-on-one interviews, focus groups, ethnographic research, case study research, record keeping, and qualitative observation.

1. One-on-one interviews

Conducting one-on-one interviews is one of the most common qualitative research methods. One of the advantages of this method is that it provides a great opportunity to gather precise data about what people think and their motivations.

Spending time talking to customers not only helps marketers understand who their clients are, but also helps with customer care: clients love hearing from brands. This strengthens the relationship between a brand and its clients and paves the way for customer testimonials.

  • A company might conduct interviews to understand why a product failed to meet sales expectations.
  • A researcher might use interviews to gather personal stories about experiences with healthcare.

These interviews can be performed face-to-face or over the phone and usually last from half an hour to over two hours.

When a one-on-one interview is conducted face-to-face, it also gives the marketer the opportunity to read the respondent’s body language and match it against their verbal responses.

2. Focus groups

Focus groups gather a small number of people, usually between five and eight participants, to discuss and provide feedback on a particular subject. Group size should reflect the stakes of the topic and the participants’ familiarity with it: for less critical topics, or when participants have little experience, a group of about ten can be effective, whereas for more critical topics, or when participants are more knowledgeable, a smaller group of five to six allows deeper discussion.

The main goal of a focus group is to find answers to the “why”, “what”, and “how” questions. This method is highly effective in exploring people’s feelings and ideas in a social setting, where group dynamics can bring out insights that might not emerge in one-on-one situations.

  • A focus group could be used to test reactions to a new product concept.
  • Marketers might use focus groups to see how different demographic groups react to an advertising campaign.

One advantage of focus groups is that the marketer doesn’t necessarily have to interact with the group in person: nowadays focus groups can be run online, with qualitative surveys delivered across various devices.

Focus groups are an expensive option compared to the other qualitative research methods, which is why they are typically used to explain complex processes.

3. Ethnographic research

Ethnographic research is the most in-depth observational method that studies individuals in their naturally occurring environment.

This method aims at understanding the cultures, challenges, motivations, and settings that occur.

  • A study of workplace culture within a tech startup.
  • Observational research in a remote village to understand local traditions.

Ethnographic research requires the marketer to adapt to the target audiences’ environments (a different organization, a different city, or even a remote location), which is why geographical constraints can be an issue while collecting data.

This type of research can last from a few days to a few years. It is challenging and time-consuming, and it depends heavily on the marketer’s expertise in observing, analyzing, and drawing inferences from the data.

4. Case study research

The case study method has grown into a valuable qualitative research method. This type of research method is usually used in education or social sciences. It involves a comprehensive examination of a single instance or event, providing detailed insights into complex issues in real-life contexts.  

  • Analyzing a single school’s innovative teaching method.
  • A detailed study of a patient’s medical treatment over several years.

Case study research may seem difficult to carry out, but it is actually one of the simpler ways of conducting research: it involves a deep dive into a single case, with a thorough understanding of how the data were collected and what can be inferred from them.

5. Record keeping

Record keeping is similar to going to the library: you go over books or any other reference material to collect relevant data. This method uses already existing reliable documents and similar sources of information as a data source.

  • Historical research using old newspapers and letters.
  • A study on policy changes over the years by examining government records.

This method is useful for constructing a historical context around a research topic or verifying other findings with documented evidence.

6. Qualitative observation

Qualitative observation is a method that uses subjective methodologies to gather systematic information or data. This method draws on the five senses: sight, smell, touch, taste, and hearing.

  • Sight : Observing the way customers visually interact with product displays in a store to understand their browsing behaviors and preferences.
  • Smell : Noting reactions of consumers to different scents in a fragrance shop to study the impact of olfactory elements on product preference.
  • Touch : Watching how individuals interact with different materials in a clothing store to assess the importance of texture in fabric selection.
  • Taste : Evaluating reactions of participants in a taste test to identify flavor profiles that appeal to different demographic groups.
  • Hearing : Documenting responses to changes in background music within a retail environment to determine its effect on shopping behavior and mood.

Below we are also providing real-life examples of qualitative research that demonstrate practical applications across various contexts:

Qualitative Research Real World Examples

Let’s explore some examples of how qualitative research can be applied in different contexts.

1. Online grocery shop with a predominantly male audience

Method used: one-on-one interviews.

Let’s go back to one of the previous examples. You have an online grocery shop. By nature, it addresses a general audience, but after you do a demographic analysis you find out that most of your customers are male.

One good method to determine why women are not buying from you is to hold one-on-one interviews with potential customers in the category.

Interviewing a sample of potential female customers should reveal why they don’t find your store appealing. The reasons could range from not stocking enough products for women to perhaps the store’s emphasis on heavy-duty tools and automotive products, for example. These insights can guide adjustments in inventory and marketing strategies.

2. Software company launching a new product

Method used: focus groups.

Focus groups are great for establishing product-market fit.

Let’s assume you are a software company that wants to launch a new product and you hold a focus group with 12 people. Although getting their feedback regarding users’ experience with the product is a good thing, this sample is too small to define how the entire market will react to your product.

So what you can do instead is hold multiple focus groups across 20 different geographic regions, each hosting a group of 12 for every market segment; you can even segment your audience by age. This is a better way to establish credibility in the feedback you receive.

3. Alan Peshkin’s “God’s Choice: The Total World of a Fundamentalist Christian School”

Method used: ethnographic research.

Moving from a fictional example to a real-life one, let’s analyze Alan Peshkin’s 1986 book “God’s Choice: The Total World of a Fundamentalist Christian School”.

Peshkin studied the culture of Bethany Baptist Academy by interviewing the students, parents, teachers, and members of the community alike, and spending eighteen months observing them to provide a comprehensive and in-depth analysis of Christian schooling as an alternative to public education.

The study highlights the school’s unified purpose, rigorous academic environment, and strong community support while also pointing out its lack of cultural diversity and openness to differing viewpoints. These insights are crucial for understanding how such educational settings operate and what they offer to students.

Even after discovering all this, Peshkin still presented the school in a positive light and stated that public schools have much to learn from such schools.

Peshkin’s in-depth research represents a qualitative study that uses observations and unstructured interviews, without any assumptions or hypotheses. He utilizes descriptive or non-quantifiable data on Bethany Baptist Academy specifically, without attempting to generalize the findings to other Christian schools.

4. Understanding buyers’ trends

Method used: record keeping.

Another way marketers can use qualitative research is to understand buyers’ trends. To do this, marketers need to look at historical data for both their company and their industry and identify where buyers are purchasing items in higher volumes.

For example, electronics distributors know that the holiday season is a peak market for sales while life insurance agents find that spring and summer wedding months are good seasons for targeting new clients.

5. Determining products/services missing from the market

Conducting your own research isn’t always necessary. If there are significant breakthroughs in your industry, you can use industry data and adapt it to your marketing needs.

The influx of hacking and hijacking of cloud-based information has made Internet security a topic of many industry reports lately. A software company could use these reports to better understand the problems its clients are facing.

As a result, the company can provide solutions prospects already know they need.


Qualitative Research Approaches

Once the marketer has decided that their research questions will provide data that is qualitative in nature, the next step is to choose the appropriate qualitative approach.

The approach chosen will take into account the purpose of the research, the role of the researcher, the data collected, the method of data analysis , and how the results will be presented. The most common approaches include:

  • Narrative : Focuses on individual life stories to understand personal experiences and journeys. It examines how people structure their stories and the themes within them to explore human existence. For example, a narrative study might look at cancer survivors to understand their resilience and coping strategies.
  • Phenomenology : Attempts to understand or explain life experiences or phenomena. It aims to reveal the depth of human consciousness and perception, such as by studying the daily lives of those with chronic illnesses.
  • Grounded theory : Investigates a process, action, or interaction with the goal of developing a theory “grounded” in observations and empirical data.
  • Ethnography : Describes and interprets an ethnic, cultural, or social group.
  • Case study : Examines episodic events in a definable framework, develops in-depth analyses of single or multiple cases, and generally explains “how”. An example might be studying a community health program to evaluate its success and impact.

How to Analyze Qualitative Data

Analyzing qualitative data involves interpreting non-numerical data to uncover patterns, themes, and deeper insights. This process is typically more subjective and requires a systematic approach to ensure reliability and validity. 

1. Data Collection

Ensure that your data collection methods (e.g., interviews, focus groups, observations) are well-documented and comprehensive. This step is crucial because the quality and depth of the data collected will significantly influence the analysis.

2. Data Preparation

Once collected, the data needs to be organized. Transcribe audio and video recordings, and gather all notes and documents. Ensure that all data is anonymized to protect participant confidentiality where necessary.

3. Familiarization

Immerse yourself in the data by reading through the materials multiple times. This helps you get a general sense of the information and begin identifying patterns or recurring themes.

4. Coding

Develop a coding system to tag data with labels that summarize and account for each piece of information. Codes can be words, phrases, or acronyms that represent how these segments relate to your research questions.

  • Descriptive Coding : Summarize the primary topic of the data.
  • In Vivo Coding : Use language and terms used by the participants themselves.
  • Process Coding : Use gerunds (“-ing” words) to label the processes at play.
  • Emotion Coding : Identify and record the emotions conveyed or experienced.

5. Thematic Development

Group codes into themes that represent larger patterns in the data. These themes should relate directly to the research questions and form a coherent narrative about the findings.
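As a rough illustration of the coding and thematic-development steps above, here is a minimal Python sketch that tags interview excerpts with codes and then groups the codes into analyst-defined themes. The excerpts, codes, and themes are invented for demonstration; in practice coding is an interpretive, iterative process rather than a mechanical lookup.

```python
from collections import defaultdict

# Step 4 (coding): each excerpt is tagged with one or more codes.
# All excerpts and code labels below are hypothetical examples.
coded_excerpts = [
    ("I just don't find anything for me in the store", ["product_range"]),
    ("The checkout felt confusing on my phone", ["usability", "mobile"]),
    ("I stopped browsing after the first page", ["usability"]),
    ("Nothing in the ads spoke to me", ["brand_perception"]),
]

# Step 5 (thematic development): the analyst groups codes under broader themes.
code_to_theme = {
    "product_range": "Assortment fit",
    "usability": "Shopping experience",
    "mobile": "Shopping experience",
    "brand_perception": "Brand image",
}

# Collect the excerpts supporting each theme (a set avoids double-counting
# an excerpt whose codes map to the same theme).
themes = defaultdict(set)
for excerpt, codes in coded_excerpts:
    for code in codes:
        themes[code_to_theme[code]].add(excerpt)

for theme in sorted(themes):
    print(f"{theme}: {len(themes[theme])} excerpt(s)")
```

In a real analysis the code-to-theme mapping emerges from repeated passes over the data and team discussion; software only keeps the bookkeeping consistent.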

6. Interpreting the Data

Interpret the data by constructing a logical narrative. This involves piecing together the themes to explain larger insights about the data. Link the results back to your research objectives and existing literature to bolster your interpretations.

7. Validation

Check the reliability and validity of your findings by reviewing if the interpretations are supported by the data. This may involve revisiting the data multiple times or discussing the findings with colleagues or participants for validation.

8. Reporting

Finally, present the findings in a clear and organized manner. Use direct quotes and detailed descriptions to illustrate the themes and insights. The report should communicate the narrative you’ve built from your data, clearly linking your findings to your research questions.

Limitations of qualitative research

The disadvantages of qualitative research are distinctive: the data collector’s techniques and their own subjective observations can alter the information in subtle ways. With that in mind, these are qualitative research’s main limitations:

1. It’s a time-consuming process

The main drawback of a qualitative study is that the process is time-consuming and can stretch over several weeks or months. Interpretations are also limited: personal experience and knowledge inevitably influence observations and conclusions.

In addition, because this process relies on personal interaction for data collection, discussions often tend to drift away from the main issue being studied.

2. You can’t verify the results of qualitative research

Because qualitative research is open-ended, participants have more control over the content of the data collected. So the marketer is not able to verify the results objectively against the scenarios stated by the respondents. For example, in a focus group discussing a new product, participants might express their feelings about the design and functionality. However, these opinions are influenced by individual tastes and experiences, making it difficult to ascertain a universally applicable conclusion from these discussions.

3. It’s a labor-intensive approach

Qualitative research requires a labor-intensive analysis process such as categorization, recording, etc. Similarly, qualitative research requires well-experienced marketers to obtain the needed data from a group of respondents.

4. It’s difficult to investigate causality

Qualitative research requires thoughtful planning to ensure the obtained results are accurate. There is no way to analyze qualitative data mathematically, so this type of research rests more on opinion and judgment than on measurable results, which makes causal relationships hard to establish. Because all qualitative studies are unique, they are also difficult to replicate.

5. Qualitative research is not statistically representative

Because qualitative research is a perspective-based method of research, the responses given are not measured numerically.

Comparisons can be made across participants, and this can point toward replication, but for the most part, circumstances that need statistical representation require quantitative data, and that is not part of the qualitative research process.

While doing a qualitative study, it’s important to cross-reference the data obtained with quantitative data. By continuously surveying prospects and customers, marketers can build a stronger database of useful information.

Quantitative vs. Qualitative Research


Quantitative and qualitative research are two distinct methodologies used in the field of market research, each offering unique insights and approaches to understanding consumer behavior and preferences.

As we already defined, qualitative analysis seeks to explore the deeper meanings, perceptions, and motivations behind human behavior through non-numerical data. On the other hand, quantitative research focuses on collecting and analyzing numerical data to identify patterns, trends, and statistical relationships.  

Let’s explore their key differences: 

Nature of Data:

  • Quantitative research : Involves numerical data that can be measured and analyzed statistically.
  • Qualitative research : Focuses on non-numerical data, such as words, images, and observations, to capture subjective experiences and meanings.

Research Questions:

  • Quantitative research : Typically addresses questions related to “how many,” “how much,” or “to what extent,” aiming to quantify relationships and patterns.
  • Qualitative research: Explores questions related to “why” and “how,” aiming to understand the underlying motivations, beliefs, and perceptions of individuals.

Data Collection Methods:

  • Quantitative research : Relies on structured surveys, experiments, or observations with predefined variables and measures.
  • Qualitative research : Utilizes open-ended interviews, focus groups, participant observations, and textual analysis to gather rich, contextually nuanced data.

Analysis Techniques:

  • Quantitative research: Involves statistical analysis to identify correlations, associations, or differences between variables.
  • Qualitative research: Employs thematic analysis, coding, and interpretation to uncover patterns, themes, and insights within qualitative data.


  • Last modified: January 3, 2023
  • Conversion Rate Optimization , User Research

Valentin Radu

  • Open access
  • Published: 31 May 2024

Exploring potential EQ-5D bolt-on dimensions with a qualitative approach: an interview study in Hong Kong SAR, China

  • Clement Cheuk Wai Ng,
  • Annie Wai Ling Cheung &
  • Eliza Lai Yi Wong

Health and Quality of Life Outcomes, volume 22, Article number: 42 (2024)

The introduction of bolt-on dimensions in EQ-5D instruments is becoming increasingly common, but most bolt-on studies have targeted diseased populations and obtained bolt-on items from other existing Health-related Quality of Life (HRQoL) instruments. As a qualitative approach offers important evidence to support the consistency and design of potential bolt-on items, this paper studies the Hong Kong SAR community’s perception of the current EQ-5D-5L instrument and identifies potential bolt-ons via a qualitative approach.

A representative sample mix was recruited based on the age group, gender, and education level composition of the Hong Kong SAR community by quota sampling. Semi-structured interviews were conducted and the interviews were transcribed and coded to identify emergent and recurrent themes.

Thirty interviews were conducted, and the majority of the interviewees considered the EQ-5D-5L insufficiently comprehensive to illustrate their HRQoL. While some key HRQoL aspects included in the EQ-5D matched the community’s HRQoL perception, respondents raised concerns about potential overlap among the existing HRQoL dimensions, the optimal number of attributes, and the appropriateness of the EQ-VAS. Among the potential bolt-on dimensions that emerged, ‘Sleep’, ‘Interpersonal Relationship’, and ‘Satisfaction’ were the key candidates identified and emphasized in the interviews.

Conclusions

The qualitative findings of the study illustrate the possible gap between EQ-5D-5L measurements and community HRQoL perception, while supporting the development of EQ-5D bolt-on dimensions in the target community with content and face validity.

With the rising global predominance of chronic disease conditions, a surging number of people dwell in communities with impaired physical, psychological, and social functioning but prolonged life expectancy [ 1 ]. As the concept of health-related quality of life (HRQoL) enables us to assess the patient-perceived health impact of medical conditions [ 2 ], HRQoL has become a popular patient-reported outcome (PRO) in recent years to supplement objective indicators to improve health outcomes [ 3 ]. In addition, the application of HRQoL may facilitate economic evaluations and decision-making at both the clinical and policy levels [ 4 ].

One of the most widely used generic HRQoL instruments is the EQ-5D. Given its generic and brief nature, the EQ-5D can be administered at low cost, while the utility derived from it can facilitate economic evaluations. However, the EQ-5D-5L is often prone to a high ceiling effect [ 5 ], and some significant HRQoL aspects might not be captured by the current dimensions [ 6 ]. Recent research considers the inclusion of bolt-on dimensions a ‘promising solution’ to such limitations [ 7 ]. A growing number of publications have explored the addition of EQ-5D bolt-ons, which aim to enrich the health state classification and to enhance the responsiveness and content validity of the EQ-5D [ 8 , 9 , 10 ].

As described by the EQ-5D bolt-on development guidelines, qualitative evidence of bolt-on development can support the design of clear and concise HRQoL descriptors [ 8 ]. However, a systematic review reported that many EQ-5D bolt-on studies obtained their bolt-on items directly from other existing generic HRQoL instruments such as the WHOQOL-BREF, Assessment of Quality of Life, and Health Utility Index, but only one project reported the identification and development of the EQ-5D bolt-on dimension via qualitative research [ 9 ]. While two other recent studies had attempted to develop EQ-5D disease-specific bolt-on qualitatively [ 11 , 12 ], the majority of HRQoL instrument development research still adopted a quantitative approach, despite the qualitative scope providing another important perspective in HRQoL research [ 13 ].

In addition to the development strategy, most bolt-on studies focused on disease-specific populations [ 9 ], and scarce research had investigated the HRQoL construct perceived by the general population, especially in non-Western cultures [ 14 , 15 ]. However, it is equally important to understand the possible discrepancy between HRQoL measurements and the community’s HRQoL perception [ 16 ], when healthcare services and systems are designed for both the ill and healthy people to sustain health.

The community-specific bolt-on item may vary according to community characteristics. The general public in the United Kingdom regarded sensory deprivation and mental health items as significant dimensions not covered by the EQ-5D [ 17 ], while respondents from New Zealand raised examples such as fitness, happiness, mental health, and cognition as areas not illustrated by the EQ-5D [ 18 ]. In particular, given its European development background, the current EQ-5D instrument may not illustrate HRQoL perception in Asian communities well [ 16 ], and some Asian communities, such as China, Korea, Malaysia, and Thailand, have been attempting to develop their own culture-specific bolt-ons [ 19 , 20 , 21 , 22 , 23 ]. In the Hong Kong context, a ceiling effect could be observed, as 46% of respondents reported the best possible health state (‘11111’) on the EQ-5D-5L [ 24 ]. A local telephone survey further revealed that the current EQ-5D-5L may not be sensitive enough to reflect changes in patient-reported HRQoL, while the majority welcomed the inclusion of appetite, hearing, vision, and energy/sleeping quality to support better HRQoL illustration [ 25 ]. To understand and interpret the community’s HRQoL perception and screen for appropriate bolt-on dimensions, this study aimed to identify relevant EQ-5D bolt-on candidates with a qualitative approach in Hong Kong SAR, China.

Study design and participants

A qualitative individual interview study was conducted. Details of the study design, analysis, and findings were checked against the Consolidated Criteria for Reporting Qualitative Research (COREQ) 32-item checklist [ 26 ].

To ensure the heterogeneity of the sample population, quota sampling was applied with reference to local census data on age group, gender, and highest education level attained. Hong Kong Cantonese-speaking residents aged ≥ 18 were invited for the interview.

Sample recruitment was promoted openly in the community center, hospital, and campus areas, while some cases were approached by referral to fill in the designated quota. The number of total interviews conducted was determined by data saturation when no additional important themes were revealed from the new interviews.
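Allocating interview quotas in proportion to census shares, as the recruitment described above, can be sketched in a few lines of Python. The age-group shares and the largest-remainder rounding below are illustrative assumptions for demonstration, not the study’s actual census figures or procedure.

```python
import math

TOTAL_N = 30  # total interviews to allocate

# Hypothetical census shares for one stratification variable (age group).
census_age_shares = {"18-34": 0.28, "35-54": 0.36, "55+": 0.36}

# Largest-remainder method: floor each raw quota, then distribute the
# remaining slots to the groups with the largest fractional remainders,
# so the integer quotas sum exactly to TOTAL_N.
raw = {g: TOTAL_N * p for g, p in census_age_shares.items()}
quotas = {g: math.floor(x) for g, x in raw.items()}
shortfall = TOTAL_N - sum(quotas.values())
for g in sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)[:shortfall]:
    quotas[g] += 1

print(quotas)  # integer quotas per age group, summing to TOTAL_N
```

The same allocation would be repeated (or crossed) for gender and education level to fill the full quota grid; saturation then decides when to stop adding interviews within those quotas.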

Data collection process

The interview focused on the EQ-5D-5L, as the EQ-5D-5L offers improved sensitivity and discriminatory power [ 5 , 27 , 28 , 29 ] and had been used for local healthcare service benchmarking [ 30 , 31 ]. Before the interview, after providing written consent, participants completed the EQ-5D-5L(HK) survey, which allowed them to construct a general picture of the interview topic.

The interview comprised three sections: (i) perception of ‘health’ and ‘HRQoL’; (ii) perception of and comments on the current EQ-5D-5L(HK) instrument; and (iii) exploration of potential bolt-on(s) for the EQ-5D-5L(HK). The first two interviews were regarded as pilot tests to determine the appropriateness of the interview guide and data collection process. Participants were invited to explain their perceptions of ‘health’ and ‘HRQoL’ in the first section. In the second section, the interviewees were asked to comment on the validity of the current EQ-5D-5L(HK), and they were also encouraged to freely suggest any potential bolt-on that might enrich the current EQ-5D-5L. In the final section, the interviewees were invited to identify any other potential EQ-5D-5L bolt-on dimensions from two lists of candidates drawn from the past literature. To minimize confirmation bias, the two lists were placed at the end of the section and presented to the interviewees only after they had shared their personal views and suggestions on potential bolt-ons for the current EQ-5D-5L. The first, 8-item list was prepared based on a factor analysis study of EQ-5D bolt-on dimensions and the local telephone survey [ 25 , 32 ], while dimensions not covered by the EQ-5D or the first list were extracted from the locally validated WHOQOL-BREF(HK) as the second, 12-item list [ 33 , 34 ]. For each bolt-on candidate proposed, the interviewees were first invited to express their interpretation of the bolt-on and then to explain its potential contribution if the dimension were appended to the EQ-5D-5L(HK). Trained interviewers would cue the respondents to elaborate on which EQ-5D-5L(HK) gap or limitation might be addressed by the bolt-on candidates, and how the bolt-on would fit the community’s HRQoL perception. Ranking exercises were conducted, in which interviewees ranked the bolt-on dimensions from most to least useful within their respective lists and explained their choices.

Participants’ sociodemographic characteristics were collected at the end of the interview. Each interview lasted approximately 45–60 minutes, and the interviews were audio-recorded with the participants’ consent.

Data analysis

Audiotapes of the interviews were transcribed verbatim, and the data analysis was handled with Dedoose [ 35 ]. Applying thematic analysis with reference to the interview guide, transcripts were coded twice by independent reviewers to identify emergent and recurrent themes. The reviewers cross-checked their codebooks before restructuring the codes into master themes and subthemes. Disagreements between codes were resolved by discussion within the research team, and interview excerpts were quoted to illustrate the respective themes. The results of the ranking exercise were summarised by (i) the frequency with which each dimension was chosen among the top three in its list; (ii) the summation of its ranked votes; and (iii) the relative ranking reflected by (ii), with a lower rank total indicating a higher priority among respondents.
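The ranking summary just described can be sketched as follows: count how often each candidate appears in a respondent’s top three, and sum its ranks, with a lower total indicating higher priority. The candidate names and rankings below are invented for illustration, not the study’s data.

```python
from collections import Counter

# Each respondent ranks candidates from most useful (position 1) to least.
# These rankings are hypothetical examples.
rankings = [
    ["Sleep", "Satisfaction", "Relationships", "Vision"],
    ["Sleep", "Relationships", "Vision", "Satisfaction"],
    ["Relationships", "Sleep", "Satisfaction", "Vision"],
]

# (i) frequency of appearing in a respondent's top three.
top3 = Counter(c for r in rankings for c in r[:3])

# (ii) summation of ranked votes (rank positions).
rank_sum = Counter()
for r in rankings:
    for position, candidate in enumerate(r, start=1):
        rank_sum[candidate] += position

# (iii) relative ranking: ascending rank total = descending priority.
priority = sorted(rank_sum, key=rank_sum.get)
print(priority)  # → ['Sleep', 'Relationships', 'Satisfaction', 'Vision']
```

This is ordinary rank aggregation bookkeeping; the interpretive weight in the study comes from the interviewees’ explanations of their rankings, not the arithmetic.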

Sample characteristics

Thirty face-to-face interviews were conducted between March and August 2021. The sample mix was similar to the Hong Kong demographic composition in terms of age group, gender, and highest education attained [36]. Consistent with local statistics, approximately one-third of the sample had chronic conditions [37]. Sample characteristics are summarized in Table 1.

On the EQ-5D-5L, 40% of respondents reported some degree of Anxiety/Depression (AD) problem, while most reported no problems in the other EQ-5D dimensions. The mean EQ-5D-5L utility and EQ-VAS were 0.902 and 78.5 respectively, slightly lower than the population norms in 2015 [38]. EQ-5D-5L responses of the sample are tabulated in Table 2.
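For context on the utility figure above, an EQ-5D-5L response is conventionally reported as a five-digit profile (one level, 1–5, per dimension) and mapped to a utility by subtracting country-specific decrements from full health (1.0). A minimal sketch with hypothetical decrements, not the Hong Kong value set:

```python
# Hypothetical per-level utility decrements for the five EQ-5D-5L
# dimensions (level 1 = no problems, level 5 = extreme problems).
# These numbers are illustrative, NOT an actual value set.
DECREMENTS = {
    "MO": [0.0, 0.05, 0.10, 0.20, 0.30],  # Mobility
    "SC": [0.0, 0.04, 0.08, 0.15, 0.25],  # Self-Care
    "UA": [0.0, 0.04, 0.08, 0.15, 0.25],  # Usual Activities
    "PD": [0.0, 0.06, 0.12, 0.25, 0.35],  # Pain/Discomfort
    "AD": [0.0, 0.06, 0.12, 0.25, 0.35],  # Anxiety/Depression
}

def utility(profile: str) -> float:
    """Map a five-digit EQ-5D-5L profile (e.g. '11121') to a utility."""
    dims = ["MO", "SC", "UA", "PD", "AD"]
    levels = [int(c) for c in profile]
    return round(1.0 - sum(DECREMENTS[d][lv - 1] for d, lv in zip(dims, levels)), 3)

print(utility("11111"))  # full health -> 1.0
print(utility("11121"))  # slight pain/discomfort -> 0.94
```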

Perceptions of ‘Health’ and ‘HRQoL’

Interviewees described ‘health’ as a combination of physical, mental and social well-being, though they often emphasized physical and mental health in their elaborations.

“… I will look into physical and mental health first. Social well-being is… a factor affecting the two (physical and mental health), that’s what I think.” H014. “In general, …divided into physical and mental aspects. In physical (health), there is no disease or pain, and you have a routine daily lifestyle… and pattern. On the mental (health) side, you should feel motivated and anxious… not exactly anxiety-free but shouldn’t pressure you too much.” H023.

For HRQoL, respondents repeatedly reported ‘pain-free and disease-free’ (n = 18), ‘eat well’ (n = 13), ‘walk well’ (n = 12), ‘move well’ (n = 10), and ‘sleep well’ (n = 8) as common HRQoL targets of the community, some of which were highly similar to EQ-5D-5L dimensions such as Mobility (MO) and Usual Activities (UA). Interviewees further emphasized how unhealthy habits common in the Hong Kong community, such as an “irregular sleep schedule” and “working overtime”, may deteriorate HRQoL.

“(If you) walk well, move well, eat well, sleep well, and everything (HRQoL) would be fine, you may even include exercise too.” H007. “You don’t truly have to consider social well-being, or anxiety and mental health-related factors. Naturally your mood, your own self, physical (health), you would be relaxed or feel better if you eat well, sleep well, walk well and move well” H030.

The validity of the current EQ-5D-5L

Most participants considered the 5-level design appropriate for illustrating changes in HRQoL measurements and health state classifications. However, the majority of participants (66.7%, n  = 20) stated that the five dimensions were not comprehensive enough to describe their HRQoL.

Many interviewees were comfortable with the current length of the EQ-5D-5L and considered 5–10 questions adequate, while a couple of respondents suggested 20. However, 19 participants raised concerns about overlap between the original EQ-5D dimensions, and some questioned whether such a design might cause confusion and redundancy.

“I would prefer more (question items)… as I paid emphasis on mental health. Within the five (EQ-5D) dimensions, seems only anxiety/depression may address psychological health, and I think we need more items (for mental health).” H018. “This instrument focuses on physical health, and lacks coverage in mental and social well- being.” H023. “Usual activities and mobility … actually these three (self-care) are slightly duplicated… Your movements, maybe daily moving, standing and sitting…, in these three (EQ-5D dimensions) maybe… I can’t tell if there is more than enough information or a duplication.” H030.

Several participants treated the EQ-VAS as a wrap-up that they tended to answer by ‘gut feeling’ and intuition. Some participants preferred reserving room at both ends of the scale for unforeseeable circumstances such as accidents and undiagnosed conditions. Respondents considered a wide range of factors when answering the EQ-VAS, and many of the factors highlighted were not necessarily HRQoL-specific, such as the weather of the day, unforeseeable circumstances, or other sociodemographic factors.

“I think the scoring (VAS) is not accurate and not very convincing… I could be very well in terms of biological indicators today… but with any incidents today… (I could get) very upset… such as the death of a family member… but with the scale alone… it would be hard to tell whether you experienced difficulty in biologically, or physiologically… it is just too general.” H018. “Nice weather would make a difference (in VAS), like extra points will be added for finer weather… and the next would be having a smooth day at work, so you won’t be working late, or be too late when you have your rest.” H022.

Exploration of potential EQ-5D-5L bolt-on dimensions

Four interviewees independently proposed ‘exercise’ as a possible supplement to the EQ-5D-5L, but on further probing considered ‘exercise’ a proxy of MO and UA. Similarly, three participants suggested ‘job’ as a bolt-on candidate but later considered ‘job’ already covered by a descriptor in UA. Otherwise, the bolt-on candidates discussed by the respondents could be represented by the dimensions included in the two lists of potential EQ-5D-5L bolt-ons. Many interviewees found the list based on the EQ-5D bolt-on factor analysis more relevant and compatible, and considered dimensions from the WHOQOL-BREF(HK) vague and distant from the original EQ-5D.

“(adding items from the WHOQOL-BREF HK) doesn’t have much impact… some are repeated… or subdividing extra details from the existing list. Some (items from the WHOQOL-BREF HK) are meaningless to the current instrument (EQ-5D-5L) too. As for myself… these items couldn’t apply to someone like me, it wouldn’t be useful asking me these (as bolt-ons)” H028.

The results of the ranking exercise for the potential bolt-on dimensions in the EQ-5D list are reported in Table 3. ‘Sleep’ ranked highest, followed by ‘Interpersonal Relationships’, ‘Energy’, ‘Satisfaction’, ‘Appetite’, ‘Speech/Cognition’, ‘Vision’ and ‘Hearing’ in descending order.

Among all potential bolt-ons that emerged, almost all interviewees (n = 29) discussed ‘Sleep’ (SL) as an EQ-5D-5L bolt-on candidate, and SL recorded the highest coding frequency. Interviewees often drew a cyclic relationship between SL and ‘energy’ and proposed combining the two, consistent with Finch et al. [32]. Respondents favoured SL as a bolt-on candidate to enrich the mental health description in the EQ-5D-5L(HK), considering SL an important HRQoL dimension that illustrates the daily recovery process and functional status not covered by the current EQ-5D-5L(HK).

“I believe sleep is an important criterion for the Hong Kong people, because… everyone knows Hong Kong people are fast-paced, everything seems tense. In my opinion, when I glanced through the list and came across ‘sleep’, I would think yes, if we include sleep (into EQ-5D), given that…I can go to work, can take of myself, I don’t feel discomfort, or any depression, but actually… in fact, I heard a lot of friends not sleeping well including myself, so sleep is significant (to HRQoL).” H011. “Sleeping Quality may reflect your mental health, on top of our answers… to (EQ-5D-5 L) anxiety/depression. These are one of the direct HRQoL indicators, even if your body… had no problem in your basic mobility, sleeping quality describes more than the physical but an overall condition. ” H023.

‘Interpersonal Relationship’ (IR) was discussed by 24 interviewees as another potential EQ-5D bolt-on dimension to fill the ‘social well-being’ gap perceived in the EQ-5D-5L. Interviewees highlighted that the original five dimensions placed heavy emphasis on the individual but little on how social engagement in daily life may enhance HRQoL. An IR bolt-on would take account of the gregarious nature of humans and serve as an appropriate EQ-5D-5L bolt-on to reflect mental and social well-being.

“the (EQ-5D-5L) five dimensions lack items on social or interpersonal relationships, social well-being… and it seems the most important (bolt-on). The appendments enriches (EQ-5D-5L), as it provides a whole new perspective.” H013. “Interpersonal relationship discussed how man-to-man interact, these (EQ-5D-5L) questions are often individual-based, and if an individual has a normal social life, I would think… directly affect oneself, partially on the mental health dimensions…” H017.

After considering SL and ‘energy’ together, ‘satisfaction’ (SF) was the next most prevalent bolt-on, discussed by 19 interviewees as a way to enhance coverage of mental and social HRQoL. Respondents emphasized that the understanding of SF could differ widely between individuals and that these variations would strongly shape the community’s HRQoL perception. Many interviewees preferred not to fix a single definition, allowing SF to serve as a malleable bolt-on candidate covering intangible gaps uncaptured by the EQ-5D-5L instrument.

“Satisfaction refers to how you perceive yourself as satisfied with different aspects such as your surroundings, family or money. Because I believe if someone is not feeling satisfied, he or she cannot not be considered as truly healthy both physically and mentally.” H010. “Everyone’s definition of satisfaction differs, but if you look into the (EQ-5D-5L) five dimensions, four of them are very straightforward. For example, mobility, being able to self-care or not, or experiencing any pain. In fact, these (EQ-5D dimensions) are very direct, but for satisfaction, everyone may perhaps perceive it differently, and how it affects one’s emotion.” H013. “This (EQ-5D-5L) tool has limited coverage on emotions, and the ‘Satisfaction’ dimension appears to me as enhancing the description of emotions and mental well-being, and would be a great enrichment to the (EQ-5D-5L) instrument at first impression.” H017.

Discussion

This study reported the Hong Kong SAR community’s perception of the EQ-5D-5L extensively. Although the EuroQol Group did not intend the original EQ-5D descriptive system to cover all dimensions of health [39, 40], the sample population welcomed the addition of bolt-on dimensions to describe their HRQoL more comprehensively. The findings from the qualitative interviews and the ranking exercise supported SL, IR, and SF as key potential EQ-5D-5L bolt-on dimensions for the Hong Kong SAR community. As the Hong Kong SAR-specific bolt-on candidates differed from those of other or neighbouring communities, the findings highlight that appropriate EQ-5D-5L bolt-on dimensions may vary with community culture or characteristics, and that bolt-on development should be investigated and validated within the context of the target population. The qualitative approach with semi-structured interviews allowed researchers to uncover suitable EQ-5D-5L bolt-on candidates with direct reference to the community’s perception of the HRQoL construct, and provides evidence for future item design with the content and face validity gained from the target population [41, 42, 43]. Psychometric properties of the shortlisted EQ-5D-5L bolt-on items should be examined quantitatively in future studies, and valuation research may be conducted if appropriate bolt-on dimension(s) are validated. Although the conventional EQ-VT valuation protocol may not be directly applicable to bolt-on valuation, a few publications have attempted to explore bolt-on valuation to facilitate economic evaluations [44, 45, 46].

The Hong Kong SAR community emphasized both physical and mental health, and some local HRQoL targets could be covered by the original EQ-5D-5L. However, other HRQoL targets such as ‘eat well’ and ‘sleep well’ may not be described distinctly by the current EQ-5D system. Careful interpretation is needed to determine whether these uncaptured HRQoL aspects should be designed as EQ-5D bolt-on dimensions or supplemented as extra example descriptors in the existing dimensions of the locally adapted EQ-5D-5L. As the EQ-5D-5L was designed as a generic instrument, amendments to its current form may not be preferred. In the Thai example, ‘activities related to knee bending’ was regarded as a subtheme of UA but was concurrently tested as a culture-specific EQ-5D bolt-on [19, 20]. In contrast, this study discussed the bolt-on candidates in depth with direct reference to the EQ-5D-5L design, so the findings go beyond ‘key HRQoL aspects’ to identify explicit ‘EQ-5D-5L bolt-on candidates’, as addressed in the second and third interview sections.

The study revealed the sample population’s acceptance of increasing the number of relevant HRQoL attributes in the EQ-5D-5L. However, unassisted respondents could face difficulty in information processing and engagement if the instrument is too complex, and the ideal length for self-reported instruments has previously been recommended as seven attributes [47]. Though further support could be provided if necessary [48], the final number of attributes in an HRQoL instrument should be carefully considered. Depending on the target population and desired outcome, researchers should weigh the advantages of using the EQ-5D and relevant bolt-on(s) and choose the optimal tool of appropriate length and comprehensiveness for their research.

This study provided two lists of potential EQ-5D bolt-on items for discussion, derived from an EQ-5D-themed factor analysis study and from the WHOQOL-BREF(HK) instrument [32, 34], and respondents reported a clear preference for the former, finding the dimensions from the factor analysis more relevant and compatible with the EQ-5D-5L instrument. However, most previous studies directly adopted and tested bolt-on(s) obtained from existing HRQoL instruments [9], and the findings of this study hint that such a bolt-on development strategy may not be ideal. While such an approach could be a convenient way to test HRQoL areas not covered by the current EQ-5D, the choice of dimension and item design may need further verification before adoption in the target populations.

Currently, the statistical method for bolt-on validation has yet to be standardized, but many bolt-on publications have referenced EQ-VAS responses to illustrate the improvement from introducing bolt-on dimensions [20, 23, 44, 49, 50, 51, 52, 53]. Yet our study revealed that many respondents considered non-HRQoL factors such as weather and unforeseeable circumstances in their EQ-VAS responses. As the EQ-VAS often measures a much broader concept than the EQ-5D [54, 55], the EQ-VAS may not be a ‘gold standard’ for evaluating the impact of a bolt-on. This further implies that a standardized protocol for EQ-5D bolt-on development and validation is needed, in which the performance of a bolt-on could be reviewed against a range of criteria such as content validity, internal consistency, construct validity, ceiling effects, and responsiveness [8, 56].

The choice and performance of EQ-5D bolt-on candidates may vary across sample populations with different backgrounds. For example, sleeping quality was previously reported as an important HRQoL aspect in China [14], and sleep was considered a significant EQ-5D bolt-on candidate in Korea [23], where poor sleeping quality was shown to be associated with deteriorated EQ-5D utility [57]. In contrast, limited benefit of introducing a ‘sleep’ dimension to the EQ-5D-3L was observed in England [50]. It is therefore crucial that bolt-on development be validated in the target population, and further research on psychometric impacts and dimension structure should be conducted on top of face validation before applying a bolt-on dimension in the target community [8]. Though most EQ-5D bolt-on projects have focused on disease-specific contexts, this project demonstrates the exploration of EQ-5D-5L bolt-ons in a community setting, and the discrepancy observed between the EQ-5D-5L and the community’s HRQoL perception suggests that further HRQoL instrument evaluation in other community settings could improve the accuracy and comprehensiveness of HRQoL measurements.

This study has its limitations. First, despite obtaining face validity from the sample for combining SL and ‘energy’, merging two HRQoL aspects may conceal possible interactions in measurement; further research is necessary to study the effect of such a merge. Second, the dimensions obtained from the factor analysis study were not translated by professional translators. Though the translated terminology was pilot-tested in a telephone survey [25], a full translation may be preferred to generate results comparable with future EQ-5D bolt-on studies.

This study serves as an early attempt to develop community-specific EQ-5D-5L bolt-ons, and the qualitative approach provides in-depth evidence that SL, IR, and SF are potential EQ-5D-5L bolt-on dimensions in Hong Kong SAR. In contrast to the conventional ‘top-down’ approach of bolt-on candidates proposed by healthcare professionals [14], this study elicited EQ-5D bolt-on candidates directly from the perspective of the general population, which may further enhance the validity of the EQ-5D-5L as a PRO instrument. The psychometric benefit of introducing these community-specific EQ-5D-5L bolt-ons should be investigated in future quantitative research.

Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Megari K. Quality of life in Chronic Disease patients. Health Psychol Res. 2013;1(3):e27.

Van Wilder L, et al. Living with a chronic disease: insights from patients with a low socioeconomic status. BMC Fam Pract. 2021;22(1):233.

Kaplan RM, Hays RD. Health-Related Quality of Life Measurement in Public Health. Annu Rev Public Health. 2022;43:355–73.

Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med. 1993;118(8):622–9.

Feng YS, et al. Psychometric properties of the EQ-5D-5L: a systematic review of the literature. Qual Life Res. 2021;30(3):647–73.

Brazier J, et al. A review of generic preference-based measures for use in cost-effectiveness models. PharmacoEconomics. 2017;35(Suppl 1):21–31.

Longworth L, et al. Use of generic and condition-specific measures of health-related quality of life in NICE decision-making: a systematic review, statistical modelling and survey. Health Technol Assess. 2014;18(9):1–224.

Mulhern BJ, et al. Criteria for developing, assessing and selecting candidate EQ-5D bolt-ons. Qual Life Res. 2022;31(10):3041–8.

Geraerds A, et al. Methods used to identify, test, and assess impact on preferences of Bolt-Ons: a systematic review. Value Health. 2021;24(6):901–16.

Kangwanrattanakul K, Phimarn W. A systematic review of the development and testing of additional dimensions for the EQ-5D descriptive system. Expert Rev Pharmacoecon Outcomes Res. 2019;19(4):431–43.

Sampson C, et al. Candidate bolt-ons for cognition and vision: qualitative findings from a development programme. In: EuroQol Plenary Meeting 2020; 2020.

Rencz F, et al. A qualitative investigation of the relevance of skin irritation and self-confidence bolt-ons and their conceptual overlap with the EQ-5D in patients with psoriasis. Qual Life Res. 2022;31(10):3049–60.

Haraldstad K, et al. A systematic review of quality of life research in medicine and health sciences. Qual Life Res. 2019;28(10):2641–50.

Mao Z, et al. The unfolding method to explore Health-Related Quality of Life constructs in a Chinese General Population. Value Health. 2021;24(6):846–54.

Mao Z, et al. Similarities and differences in Health-Related Quality-of-life concepts between the East and the West: a qualitative analysis of the content of Health-Related Quality-of-life measures. Value Health Reg Issues. 2021;24:96–106.

Ock M, et al. Perceptions of the General Public About Health-Related Quality of Life and the EQ-5D questionnaire: a qualitative study in Korea. J Prev Med Public Health. 2022;55(3):213–25.

Shah KK, et al. Views of the UK General Public on Important Aspects of Health not captured by EQ-5D. Patient. 2017;10(6):701–9.

Devlin NJ, Hansen P, Selai C. Understanding health state valuations: a qualitative analysis of respondents’ comments. Qual Life Res. 2004;13(7):1265–77.

Kangwanrattanakul K, et al. Adding two culture-specific ‘bolt-on’ dimensions on the Thai version of EQ-5D-5L: an exploratory study in patients with diabetes. Expert Rev Pharmacoecon Outcomes Res. 2019;19(3):321–9.

Kangwanrattanakul K, et al. Exploration of a cultural-adaptation of the EQ-5D for Thai population: a bolt-on experiment. Qual Life Res. 2019;28(5):1207–15.

Mao Z, et al. Developing and testing culturally relevant bolt-on items for EQ-5D-5L in Chinese populations: a mixed-methods study protocol. BMJ Open. 2024;14(1):e081140.

CJ, T.A.S.A.L. Exploration of EQ-5D-5L Bolt-On items among Malaysian Population. Malaysian J Pharm, 2017. 3(1).

Kim SH, et al. Exploratory Study of Dimensions of Health-Related Quality of Life in the General Population of South Korea. J Prev Med Public Health. 2017;50(6):361–8.

Wong EL, et al. Normative Profile of Health-Related Quality of Life for Hong Kong General Population using preference-based instrument EQ-5D-5L. Value Health. 2019;22(8):916–24.

Ng CCW, Wong ELY. Comparison of Health-related Quality of Life Derived by EQ-5D in Hong Kong Year 2014 and 2020. In: Hong Kong College of Community Medicine Annual Scientific Meeting 2021; Hong Kong.

Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007;19(6):349–57.

Herdman M, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res. 2011;20(10):1727–36.

Janssen MF, et al. Measurement properties of the EQ-5D-5L compared to the EQ-5D-3L across eight patient groups: a multi-country study. Qual Life Res. 2013;22(7):1717–27.

Kangwanrattanakul K, Parmontree P. Psychometric properties comparison between EQ-5D-5L and EQ-5D-3L in the general Thai population. Qual Life Res. 2020;29(12):3407–17.

Hospital Authority. Hospital Authority launches a new round of patient experience survey. Hong Kong SAR; 2021.

The Jockey Club School of Public Health and Primary Care. 2019 Patient Experience Survey – Inpatient Service in Hong Kong Hospital Authority. Hong Kong: The Chinese University of Hong Kong; 2020. p. 246.

Finch AP, et al. An Exploratory Study on Using Principal-Component Analysis and Confirmatory Factor Analysis to identify Bolt-On dimensions: the EQ-5D case study. Value Health. 2017;20(10):1362–75.

World Health Organization. WHOQOL: measuring quality of life. World Health Organization: Geneva; 1997.

Leung KF, et al. Development and validation of the interview version of the Hong Kong Chinese WHOQOL-BREF. Qual Life Res. 2005;14(5):1413–9.

Dedoose: web application for managing, analyzing, and presenting qualitative and mixed method research data. Los Angeles, CA: SocioCultural Research Consultants, LLC; 2021.

Census and Statistics Department, The Government of the Hong Kong SAR. Interactive Data Dissemination Service. Hong Kong SAR; 2017.

Census and Statistics Department, Hong Kong SAR. Thematic Household Survey Report No. 74; 2021.

Wong ELY, et al. Assessing the Use of a Feedback Module to Model EQ-5D-5L Health States values in Hong Kong. Patient. 2018;11(2):235–47.

Brooks R. EuroQol: the current state of play. Health Policy. 1996;37(1):53–72.

Brazier JE, et al. Future directions in valuing benefits for estimating QALYs: is time up for the EQ-5D? Value Health. 2019;22(1):62–8.

Al-Janabi H, Flynn TN, Coast J. Development of a self-report measure of capability wellbeing for adults: the ICECAP-A. Qual Life Res. 2012;21(1):167–76.

Stevens K. Developing a descriptive system for a new preference-based measure of health-related quality of life for children. Qual Life Res. 2009;18(8):1105–13.

Stevens K, Palfreyman S. The use of qualitative methods in developing the descriptive systems of preference-based measures of health-related quality of life for use in economic evaluation. Value Health. 2012;15(8):991–8.

Chen G, Olsen JA. Filling the psycho-social gap in the EQ-5D: the empirical support for four bolt-on dimensions. Qual Life Res. 2020;29(11):3119–29.

Finch AP, Brazier J, Mukuria C. Selecting bolt-on dimensions for the EQ-5D: testing the impact of hearing, Sleep, Cognition, Energy, and relationships on preferences using pairwise choices. Med Decis Mak. 2021;41(1):89–99.

Kharroubi SA, et al. Modelling a preference-based index for EQ-5D-3L and EQ-5D-3L + sleep using a bayesian framework. Qual Life Res. 2020;29(6):1495–507.

Miller GA. The magical number seven plus or minus two: some limits on our capacity for processing information. Psychol Rev. 1956;63(2):81–97.

Baddeley A. The magical number seven: still magic after all these years? Psychol Rev. 1994;101(2):353–6.

Perneger TV, Courvoisier DS. Exploration of health dimensions to be included in multi-attribute health-utility assessment. Int J Qual Health Care. 2011;23(1):52–9.

Yang Y, Brazier J, Tsuchiya A. Effect of adding a sleep dimension to the EQ-5D descriptive system: a bolt-on experiment. Med Decis Mak. 2014;34(1):42–53.

Jelsma J, Maart S. Should additional domains be added to the EQ-5D health-related quality of life instrument for community-based studies? An analytical descriptive study. Popul Health Metr. 2015;13:13.

Geraerds A, et al. The added value of the EQ-5D with a cognition dimension in injury patients with and without traumatic brain injury. Qual Life Res. 2019;28(7):1931–9.

Ophuis RH, et al. Health-related quality of life in injury patients: the added value of extending the EQ-5D-3L with a cognitive dimension. Qual Life Res. 2019;28(7):1941–9.

Feng Y, Parkin D, Devlin NJ. Assessing the performance of the EQ-VAS in the NHS PROMs programme. Qual Life Res. 2014;23(3):977–89.

Whynes DK; TOMBOLA Group. Correspondence between EQ-5D health state classifications and EQ VAS scores. Health Qual Life Outcomes. 2008;6:94.

Terwee CB, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34–42.

Lee S, Kim JH, Chung JH. The association between sleep quality and quality of life: a population-based study. Sleep Med. 2021;84:121–6.

Acknowledgements

We would like to thank the interviewees for sharing their insights and constructive comments on the research topic.

No funding.

Author information

Authors and affiliations.

The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China

Clement Cheuk Wai Ng, Annie Wai Ling Cheung & Eliza Lai Yi Wong

Centre for Health Systems and Policy Research, The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China

Annie Wai Ling Cheung & Eliza Lai Yi Wong

Rm418, School of Public Health Building, Prince of Wales Hospital, Sha Tin, New Territories, Hong Kong SAR, China

Eliza Lai Yi Wong

Contributions

The research project and manuscript were prepared by CCWN and ELYW. Data collection was handled by CCWN and AWLC. All authors contributed to sample recruitment and data analysis, and approved the final manuscript.

Corresponding author

Correspondence to Eliza Lai Yi Wong .

Ethics declarations

Ethics approval and consent to participate.

This project sought ethics approval from The Chinese University of Hong Kong Survey and Behavioral Research Ethics Committee (SBRE-19-702). Written informed consent was obtained from the participants before the data collection.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article.

Cheuk Wai Ng, C., Wai Ling Cheung, A. & Lai Yi Wong, E. Exploring potential EQ-5D bolt-on dimensions with a qualitative approach: an interview study in Hong Kong SAR, China. Health Qual Life Outcomes 22 , 42 (2024). https://doi.org/10.1186/s12955-024-02259-6

Download citation

Received : 15 January 2024

Accepted : 14 May 2024

Published : 31 May 2024

DOI : https://doi.org/10.1186/s12955-024-02259-6


  • Health-related quality of life
  • Qualitative research

Health and Quality of Life Outcomes

ISSN: 1477-7525

  • Open access
  • Published: 29 May 2024

Paramedics’ experiences of barriers to, and enablers of, responding to suspected or confirmed COVID-19 cases: a qualitative study

  • Ursula Howarth 1 ,
  • Peta-Anne Zimmerman 2 , 3 , 4 , 5 ,
  • Thea F. van de Mortel 2 &
  • Nigel Barr 6  

BMC Health Services Research volume  24 , Article number:  678 ( 2024 ) Cite this article


Paramedics’ work, even pre-pandemic, can be confronting and dangerous. As pandemics add extra stressors, this study explored paramedics’ lived experience of the barriers to, and enablers of, responding to suspected or confirmed Coronavirus Disease 2019 (COVID-19) cases.

This exploratory-descriptive qualitative study used semi-structured interviews to investigate Queensland metropolitan paramedics’ experiences of responding to cases during the COVID-19 pandemic. Interview transcripts were analysed using thematic analysis. Registered paramedics were recruited by criterion sampling of staff who had experienced the COVID-19 pandemic as active officers.

Nine registered paramedics participated. Five themes emerged: communication, fear and risk, work-related protective factors, leadership, and change. Unique barriers included impacts on effective communication due to the mobile nature of paramedicine, inconsistent policies/procedures between different healthcare facilities, dispatch of incorrect information to paramedics, assisting people to navigate the changing healthcare system, and wearing personal protective equipment in hot, humid environments. A lower perceived risk from COVID-19, and increased empathy after recovering from COVID-19 were unique enablers.

Conclusions

This study uncovered barriers and enablers to attending suspected or confirmed COVID-19 cases unique to paramedicine, often stemming from the mobile nature of prehospital care, and identifies the need for further research in paramedicine post-pandemic to better understand how paramedics can be supported during public health emergencies to ensure uninterrupted ambulance service delivery.

Peer Review reports

Introduction

The COVID-19 pandemic disrupted healthcare globally and significantly impacted lives, including those of paramedics who perform essential frontline health care [ 1 ]. In Australia, emergency ambulance services are run/contracted by the state/territory and most qualified paramedics have a paramedicine diploma or degree and can provide advanced life support [ 2 ].

Prior to the COVID-19 pandemic, lessons learnt from other healthcare settings about processes of care and behaviours during disaster and emergency responses were applied to the prehospital environment [ 3 , 4 ]. A recent review [ 5 ] found only nine studies that included the paramedic experience of the COVID-19 pandemic, with various foci including leadership strategies, psychological/social wellbeing or resilience, attitudes and stressors, and knowledge and preparedness. Although the review included two Australian studies [ 6 , 7 ], none focused specifically on paramedics’ experiences of attending suspected or confirmed COVID-19 cases or on the barriers to, and enablers of, responding to those cases. Exploring paramedics’ experience of responding under COVID-19-specific conditions may provide insights into how to increase paramedics’ willingness to respond during future public health emergencies and so ensure uninterrupted ambulance service access and delivery.

This research sought to understand paramedics’ lived experience during the COVID-19 pandemic. The research question was ‘What were Queensland metropolitan paramedics’ experiences of barriers to, and enablers of, attending suspected or confirmed COVID-19 cases?’

Study design

An exploratory-descriptive qualitative approach [ 8 ] was applied to understand the experience of paramedics during the COVID-19 pandemic. A constructivist paradigm was chosen to explore paramedics’ experiences because it assumes there are multiple subjective realities, insider knowledge can be valuable, there is a holistic emphasis on the experience being investigated, and rich data are obtained whilst addressing context and processes [ 8 , 9 ].

Participant selection and setting

Registered paramedics from metropolitan south-east Queensland, Australia were invited to participate (few COVID-19 cases were occurring elsewhere at the time). Advanced Care Paramedics (ACP) and Critical Care Paramedics (CCP) in patient-facing roles with at least one year of operational experience during the COVID-19 pandemic were included. Patient Transport officers, doctors or paramedics working in supervisory roles were excluded. Criterion sampling [ 10 ] was applied to find participants with diverse education levels, age, gender and experience.

Recruitment and data collection

The primary researcher held a management position at the time, creating a potential power imbalance, and their previous experience in operational paramedic roles made it likely they would know participants. Consequently, they had no direct contact with participants. A research assistant (RA) was engaged to protect participant confidentiality and to help participants feel safe to express themselves freely. The RA, who held a health science doctoral qualification, invited expressions of interest via an email containing an information sheet, sent by the ambulance research department. Thirty-four responses were received. After an initial screen against the inclusion criteria, the RA sent a de-identified list to the primary researcher, who authorised eleven invitations, sent in June 2022, that maximised sample diversity. After eight interviews, no new codes were generated; one more participant was interviewed to confirm this. Participants were asked four open-ended interview questions on their experiences of responding to patients during the COVID-19 pandemic and on the barriers and enablers to responding to these patients. The interview was piloted with a paramedic who was not part of the study; no changes to the questions were required. The RA conducted, audio-recorded and transcribed the interviews (approximately 30 minutes in duration) in July 2022.

Data analysis

The research team included the primary researcher and three doctorally qualified academics, one of whom was also a Registered Paramedic. Trustworthiness and rigour during data collection and analysis were addressed using the Lincoln-Guba framework, which underpins credibility, dependability, confirmability, and transferability [ 11 ]. During the interview and analysis phases, this included utilising an RA, member checking at the end of each interview, and researcher reflection on their own biases and preconceptions after each transcript was reviewed. Researcher discussion supported rigour by identifying preconceptions the primary researcher might hold that could influence data analysis [ 12 ]. Further member checking of transcripts was not deemed necessary due to the clarity of the participants’ comments.

Thematic analysis was conducted using the six-phase process outlined by Braun and Clarke [ 13 ]. An inductive method was used, as the analysis was driven by the data and each participant’s language and concepts [ 14 ]; this aligns with the exploratory-descriptive qualitative approach, which focused on investigating the essence of the paramedics’ experiences during COVID-19 while remaining open to emerging themes. The transcripts were analysed by UH, and all researchers discussed the coding and agreed on the themes. This discussion was informed by a range of illustrative quotes that exemplified each code.

Ethics approval was obtained from Royal Brisbane and Women’s Hospital Human Research Ethics Committee (Ref. no:84446) and Griffith University Human Research Ethics Committee (Ref. no:2021/819). The ambulance service approved paramedic recruitment. Participants gave informed consent.

Results

Nine Registered Paramedics, four female and five male, aged 27–52 years (median 42; IQR = 32, 43), with 3–24 years of experience (median 8; IQR = 5, 15.5) were interviewed. Eight were ACPs, one was a CCP; all had a Bachelor of Paramedicine and two had paramedicine-related Master’s degrees. The analysis generated 26 codes and five themes: communication, fear and risk, leadership, work-related protective factors, and change.
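The demographics above report each distribution as a median with the interquartile range given as the pair (Q1, Q3). As a minimal sketch of how such summary statistics are produced, the snippet below uses Python's standard library on a hypothetical set of nine ages; the individual participant ages were not published, so these values are illustrative only, chosen to match the reported range and median.

```python
import statistics

# Hypothetical ages for nine participants (illustrative only; the actual
# participant-level data were not published). Chosen to match the reported
# range (27-52) and median (42).
ages = [27, 31, 32, 38, 42, 43, 43, 47, 52]

# Median: the middle value of the sorted list (5th of 9 here).
median_age = statistics.median(ages)

# statistics.quantiles returns the three quartile cut points (Q1, Q2, Q3);
# papers often report the interquartile range as the pair (Q1, Q3).
q1, q2, q3 = statistics.quantiles(ages, n=4)

print(f"median {median_age}; IQR = {q1}, {q3}")
```

Note that `statistics.quantiles` defaults to the "exclusive" interpolation method, so the exact Q1/Q3 values can differ slightly from those produced by other statistical packages on the same data.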

Communication

This theme included the codes: organisational communication, media, public health messages, and interagency communication (Table  1 ). Participants perceived that communication, whether from the ambulance service, media, or formal health channels, substantially impacted paramedics during the pandemic. Communication ranged from helpful and trust-building to unclear, overwhelming, confusing, and frustrating.

Fear and risk

The fear and risk theme included the codes: paramedic safety prioritised, physical risk to paramedic, healthcare barriers, unnecessary risk, fear of unknown, and having contracted COVID-19 (Table  2 ). Most participants indicated that fear and risk influenced their personal and professional lives, with a flow-on effect on patient care. Whilst mostly seen as a barrier to responding to cases, fear and risk also led to more empathetic approaches to patient care and to adherence to effective infection prevention and control practices.

Leadership

The leadership theme included the codes: organisational leadership, and lack of trust in organisation and government through the pandemic (Table  3 ). Some participants commented on the challenge of leading through a pandemic and appreciated open information-sharing, while others mistrusted decision-making and indicated the need for a consistent, visible leader.

Work-related protective factors

Work-related protective factors covered emotional, physical, or financial support, including vaccines, leave entitlements, personal protective equipment (PPE), secure employment and camaraderie (Table  4 ). However, wearing PPE in hot, humid environments, and difficulty accessing entitlement information, caused frustration and distress.

Change

The theme of change included the codes: adapting to their role and expectations, effect on personal life, emotional/mental health, evolution of pandemic normalised responding to cases, workload, and public reaction (Table  5 ). Paramedics reported many of these issues as barriers early in the pandemic, but adapted as the community became highly vaccinated, their exposure to COVID-19 cases increased, and the disease became more endemic, normalising responding to cases. Paramedics were often patients’ first point of contact for navigating the healthcare system, e.g., when patients called the ambulance service because they did not know what to do.

Discussion

Barriers to, and enablers of, Queensland metropolitan paramedics responding to suspected or confirmed COVID-19 cases were identified. Some barriers had previously been reported in studies of other healthcare workers, including communication issues, change in work practices, increased burnout, psychological distress, fear of infection to self and loved ones, lack of PPE and vaccines, and unpreparedness [ 15 , 16 , 17 , 18 ]. Barriers unique to the prehospital environment included ineffective communication due to the mobile nature of paramedicine, inconsistent policies/procedures between different facilities, dispatch of incorrect information, assisting people to navigate the changing healthcare system, and wearing PPE in hot, humid environments.

Communication difficulties related to the mobile nature of paramedicine

Communication issues can arise in everyday work at the best of times, but effective communication during a global infectious disease outbreak is particularly challenging due to mass media coverage, public concern, and uncertainty about the disease [ 19 ]. Email-based communication is not always received, and communication failure can occur due to one-time message delivery and communication fatigue [ 20 ]. In addition, media coverage and widespread mis/disinformation created communication challenges [ 21 ].

Overwhelming, changing information during an outbreak is not unusual [ 7 ]. What was unique to the paramedic experience was the impact of the mobile nature of prehospital care. Attending multiple healthcare facilities per shift meant paramedics were exposed to multiple interpretations of pandemic guidance and local practices. Inconsistencies and lack of communication regarding different procedures caused frustration, delays, and unnecessary exposure to infectious patients. This experience was confirmed in recent studies [ 5 , 7 , 22 , 23 ].

One paramedic [ 22 ] attended a case where four paramedics on scene had four different oxygenation strategies, due to frequent guideline changes and the timing of accessing updates, highlighting the need for better communication strategies as an outbreak evolves.

Increased safety risks due to receiving incorrect information from the ambulance service dispatch

Another unique communication barrier related to case dispatch. Paramedics rely on receiving correct information prior to arriving on scene to assess and mitigate risk based on what is known about the case. Miscommunication arose when the dispatcher either misunderstood information or received incorrect information from the person requiring assistance, increasing paramedics’ stress. Whilst case dispatch errors can occur outside of pandemic situations, the pandemic itself added an extra layer of stress in relation to paramedic safety. More stringent organisational procedures and public education are required to prevent this.

Paramedics assisted patients to navigate the new healthcare rules

The pandemic disrupted the way healthcare was delivered and/or accessed by both health professionals and consumers [ 17 , 24 , 25 ]. Paramedics were affected by increased hospital waiting times, and the move to telehealth changed the types of cases they were called to [ 7 ]. Paramedics often had to navigate patients through the healthcare system to access the most appropriate help in addition to the many changes they were experiencing in their workplace and community. This indicates the need for further investigation into how paramedics can effectively assist patients when there are so many changes occurring during a pandemic, often with limited information.

Wearing PPE in hot, humid environments caused discomfort and fatigue

Globally, healthcare workers felt the adverse effects of wearing PPE more frequently and for longer periods [ 26 ]; however, the prehospital environment created additional challenges for paramedics working in hot, humid conditions. While there is limited literature specifically on paramedics and heat-related illness when wearing PPE, guidance from the African Ebola outbreak by the Centers for Disease Control and Prevention [ 27 ] noted that PPE impairs the body’s ability to shed heat through sweat production, traps excess heat and moisture, increases the physical effort required to perform duties, and prevents the wearer from drinking, all of which increase the risk of heat-related illness [ 27 , 28 ]. Other common risk factors in prehospital environments include direct sun exposure, physical exertion, dehydration, and indoor heat sources at patients’ homes. Clinicians must balance the impermeable layer of PPE that protects against viral contamination against the heat stress it causes the wearer [ 29 ]. While personal cooling garments are available, their effectiveness in decreasing PPE-related heat stress has not been studied [ 28 ].

Healthcare workers experiencing PPE-related discomfort are at increased risk of self-contamination when doffing PPE [ 30 ]; they may also have trouble completing procedures, and may experience facial injuries, skin conditions, and decreased well-being and job satisfaction. These issues are particularly relevant for paramedics in hot, humid parts of Australia. Paramedic-specific research is required to better support paramedics working in these environments in full PPE.

After contracting COVID-19, participants’ perceptions of risk reduced and empathy towards COVID-19-positive patients increased

One enabler, a decreased perception of risk and associated anxiety and increased empathy for COVID-19 cases after contracting COVID-19 oneself, has not been previously reported, possibly because paramedics are accustomed to experiencing risk in their work [ 31 , 32 ].

This exploration of paramedics’ experiences of barriers to, and enablers of, responding to suspected or confirmed COVID-19 cases uncovered challenges unique to the prehospital field that can potentially impact service delivery. Paramedicine is often the ‘forgotten profession’ overshadowed by community and acute care, and emergency department issues [ 31 ]. While studies based on a hypothetical public health emergency and willingness to respond are helpful, there are limitations compared to exploring this phenomenon during an actual public health emergency [ 33 ].

Limitations

Paramedics in non-metropolitan areas were not recruited and may have provided new insights into responding to cases in a geographically diverse state that includes logistical and resourcing challenges common in rural/remote areas. Given the specific recruitment for this study, the findings may not be transferable to other prehospital settings. Culture and personal beliefs and how these may have affected paramedics’ experience of working during a pandemic were not explored.

Recommendations

Further research is required on methods to improve communication to paramedics, particularly cross-facility communication, and on how to flag critical information changes so they are implemented as quickly and consistently as possible. Strategies to mitigate the effects of PPE worn for extended periods in hot, humid conditions should also be explored. In the meantime, supervisors should prioritise regular rehydration, breaks, and welfare checks. Research on barriers and enablers during a public health emergency from the perspective of managers, executive leadership and other ambulance service providers would provide a deeper understanding of the issues.

The value of this research is that it captures Queensland metropolitan paramedics’ experience of working through the most significant public health emergency of our generation. This study uncovered barriers and enablers, unique to paramedicine and stemming from the mobile nature of prehospital care, to responding to COVID-19 cases and thus to ambulance service delivery. It is vital that we support healthcare workers to maintain their physical and mental health and willingly provide essential services, and that the healthcare system is ready to provide a cohesive response to public health emergencies across all sectors. This study highlights the importance of further research into paramedics in their roles.

Data availability

The datasets generated and analysed during the current study are not publicly available to protect the confidentiality of participants but are available from the corresponding author on reasonable request.

Abbreviations

ACP: Advanced care paramedic

CCP: Critical care paramedic

COVID-19: Coronavirus disease 2019

PPE: Personal protective equipment

RA: Research assistant

Piotrowski A, Makarowski R, Predoiu R, Predoiu A, Boe O. Resilience and subjectively experienced stress among paramedics prior to and during the COVID-19 pandemic. Front Psychol 2021;12.

Ambulance Victoria. Types of Paramedics. Victoria: Ambulance Victoria. www.ambulance.vic.gov.au/paramedics/typesof-paramed Accessed 15 October 2023.

Carter H, Thompson J. Defining the paramedic process. Aust J Prim Health. 2015;21(1):22–6.

Watt K, Tippett VC, Raven S, Jamrozik K, Koory M, Archer F, et al. Attitudes to living and working in pandemic conditions among emergency prehospital medical care personnel. Prehosp Disaster Med. 2010;25(1):13–9.

Howarth U, Zimmerman P-A, van de Mortel T, Barr N. Barriers to, and enablers of, paramedics responding to suspected or confirmed COVID-19 cases: an integrative review. Australas Emerg Care. 2022;26(1):66–74.

Li C, Sotomayor-Castillo C, Nahidi S, Kuznetsov S, Considine J, Curtis K, et al. Emergency clinicians’ knowledge, preparedness and experience of managing COVID-19 during the 2020 global pandemic in Australian healthcare settings. Australas Emerg Care. 2021;24:186–96.

Petrie K, Smallwood N, Pascoe A, Willis K. Mental health symptoms and workplace challenges among Australian paramedics during the COVID-19 pandemic. Int J Environ Res Public Health. 2022;19(2):1004.

Hunter DJ, McCallum J, Howes D. Defining exploratory-descriptive qualitative (EDQ) research and considering its application to healthcare. J Nurs H Care 2019; 4(1).

Polit D, Beck C. Trustworthiness and rigor in qualitative research. In: Polit D, Beck C, editors. Nursing Research. Generating and assessing evidence for nursing practice. 11th ed. Philadelphia: Wolters Kluwer; 2020. pp. 1948–2967.

Moser A, Korstjens I. Practical guidance to qualitative research. Part 3. Sampling, data collection and analysis. Eur J Gen Pract. 2018;24(1):9–18.

Stahl NA, King JR. Expanding approaches for research: Understanding and using trustworthiness in qualitative research. J Dev Educ. 2020 Fall;44(1):26–29.

Sundler AJ, Lindberg E, Nilsson C, Palmer L. Qualitative thematic analysis based on descriptive phenomenology. Nurs Open. 2019;6(3):733–9.

Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.

Clarke V, Braun V. Thematic analysis. In: Michalos A, Editor, editors. Encyclopedia of quality of life and well-being research. Netherlands: Springer; 2014. pp. 6626–8.

Murray EJ, Mason M, Sparke V, Zimmerman P-E. Factors influencing health care workers’ willingness to respond to duty during infectious disease outbreaks and bioterrorist events: an integrative review. Prehosp Disaster Med. 2021;36(3):321–37.

Pilbeam C, Tonkin-Crine S, Martindale A, Atkinson P, Mabelson H, Lant S, et al. How do healthcare workers ‘do’ guidelines? Exploring how policy decisions impacted UK healthcare workers during the first phase of the COVID-19 pandemic. Qual Health Res. 2022;32(5):729–43.

Smallwood N, Karimi L, Pascoe A, Bismark M, Putland M, Johnson D, et al. Coping strategies adopted by Australian frontline health workers to address psychological distress during the COVID-19 pandemic. Gen Hosp Psychiatry. 2021;72:124–30.

Stuijfzand S, DeForges C, Sandoz V, Sajin C-T, Jacque C, Elmers A, et al. Psychological impact of an epidemic/pandemic on the mental health of healthcare professionals: a rapid review. BMC Public Health. 2020;20(1):1–18.

Huang C, Chou T, Liu JS. The development of pandemic outbreak communication: a literature review from the response enactment perspective. Know Manag Res Pract. 2021;19(4):525–35.

Germaine P, Catanzano T, Patel A, Mohan A, Patel K, Pryluck D, et al. Communication strategies and our learners. Curr Probl Diagn Radiol. 2021;50(3):297–300.

Mukhtar S. Psychological health during the coronavirus disease 2019 pandemic outbreak. Int J Soc Psychiatry. 2020;66(5):512–6.

Boechler L, Cameron C, Smith C, Ford-Jones P, Southers P. Impactful approaches to leadership on the front lines of the COVID-19 pandemic: lived experiences of Canadian paramedics. Healthc Q. 2021;24(3):42–7.

Oliphant A, Faulds C, Nouvet E. At the front-line: Ontario paramedics’ experiences of occupational safety, risk and communication during the 2020 COVID-19 pandemic. Int J Emerg Serv. 2022;11(2):207–21.

Bernacki K, Keister A, Sapiro N, Joo J, Mattle L. Impact of COVID-19 on patient and healthcare professional attitudes, beliefs, and behaviours toward the healthcare system and on the dynamics of the healthcare pathway. BMC Health Serv Res. 2021;21:1309.

Mitton JA, de Hernandez BU, Pasupuleti N, Hurley K, John R, Cole A. Disruptive resilience: harnessing leadership to build a more equitable health care system after COVID-19. Popul Health Manag. 2021;24(6):646–7.

Unoki T, Sakuramoto H, Sato R, Ouchi A, Kuribara T, Furumaya T, et al. Adverse effects of personal protective equipment among intensive care unit healthcare professionals during the COVID-19 pandemic: a scoping review. SAGE Open Nurs. 2021;6:1–14.

Centers for Disease Control and Prevention. Limiting heat burden while wearing Personal Protective Equipment (PPE). Centers for Disease Control and Prevention 2014. https://www.cdc.gov/niosh/topics/ebola/pdfs/limiting-heat-burden-while-wearing-ppe-training-slides-healthcare-workers-site-coordinators.pdf Accessed 15 October 2023.

Tumram NK. Personal protective equipment and personal cooling garments to reduce heat-related stress and injuries. Med Leg J. 2020;88(1 suppl):43–36.

Coca A, Quinn T, Kim J-H, Wu T, Powell J, Roberge R, et al. Physiological evaluation of personal protective ensembles recommended for use in West Africa. Disaster Med Public Health Prep. 2017;11(5):580–6.

Davey SL, Lee BJ, Robbins T, Randeva H, Thake C. Heat stress and PPE during COVID-19: impact of healthcare workers’ performance, safety and well-being in NHS settings. J Hosp Infect. 2021;108:185–8.

Lawn S, Roberts L, Willis E, Couzner L, Mohammadi L, Gobi E. The effects of emergency medical service work on the psychological, physical, and social well-being of ambulance personnel: a systematic review of qualitative research. BMC Psychiatry. 2020;20(1):348.

Maguire BJ. Violence against ambulance personnel: a retrospective cohort study of national data from safe work Australia. Public Health Res Pract. 2018;28(1):e28011805.

Gee S, Skovdal M. The role of risk perception in willingness to respond to the 2014–2016 west African Ebola outbreak: a qualitative study of international health care workers. Glob Health Res Policy. 2017;2(1):1–10.

Acknowledgements

We thank the Queensland Ambulance Service for facilitating paramedic recruitment, Dr. Megan Rattray for her research assistance, and the participants for their insights.

Funding

Nil to declare.

Author information

Authors and affiliations

Queensland Ambulance Service, GPO Box 1425, Brisbane, QLD, 4001, Australia

Ursula Howarth

School of Nursing & Midwifery, Griffith University, Parklands Drive, Southport, QLD, 4222, Australia

Peta-Anne Zimmerman & Thea F. van de Mortel

Collaborative for the Advancement for Infection Prevention and Control, Gold Coast, QLD, Australia

Peta-Anne Zimmerman

Gold Coast Hospital and Health Service, Southport, QLD, Australia

Menzies Health Institute, Southport, QLD, Australia

University of Sunshine Coast School of Health, Locked Bag 4, Maroochydore, DC, QLD, 4558, Australia

Contributions

U.H. conceptualised the study and collected the data. U.H., P.Z, T.M. and N.B. analysed the data. U.H. drafted the manuscript. All authors revised and approved the manuscript.

Corresponding author

Correspondence to Thea F. van de Mortel .

Ethics declarations

Ethics approval and consent to participate

Ethics approval was obtained from Royal Brisbane and Women’s Hospital Human Research Ethics Committee (Ref. no:84446) and Griffith University Human Research Ethics Committee (Ref. no:2021/819). The ambulance service approved paramedic recruitment. Participants gave written informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Howarth, U., Zimmerman, PA., van de Mortel, T.F. et al. Paramedics’ experiences of barriers to, and enablers of, responding to suspected or confirmed COVID-19 cases: a qualitative study. BMC Health Serv Res 24 , 678 (2024). https://doi.org/10.1186/s12913-024-11120-x

Received : 07 November 2023

Accepted : 19 May 2024

Published : 29 May 2024

DOI : https://doi.org/10.1186/s12913-024-11120-x

  • Emergency medical technician
  • Ambulance service

BMC Health Services Research

ISSN: 1472-6963

sample size for qualitative research body

IMAGES

  1. Qualitative research sample design and sample size: resolving and

    sample size for qualitative research body

  2. Minimum sample size recommendations for most common quantitative and

    sample size for qualitative research body

  3. PPT

    sample size for qualitative research body

  4. How to Determine the Sample Size in your Research

    sample size for qualitative research body

  5. Sample Size For Qualitative Research: Clive Roland Boddy

    sample size for qualitative research body

  6. Developing questionnaire survey and calculating sample size

    sample size for qualitative research body

VIDEO

  1. How to calculate/determine the Sample size for difference in proportion/percentage between 2 groups?

  2. Morphologic Changes in Neutrophils (nuclear and cytoplasmic changes)

  3. Describing the Sample Size and Sampling Procedure

  4. Qualitative analysis of given sample of Paracetamol #like #share #subscribe #hardwork

  5. Saturation Point in Qualitative Research

  6. Kinds of Research Design/All About Research Design

COMMENTS

  1. Big enough? Sampling in qualitative inquiry

    So there was no uniform answer to the question and the ranges varied according to methodology. In fact, Shaw and Holland (2014) claim, sample size will largely depend on the method. (p. 87), "In truth," they write, "many decisions about sample size are made on the basis of resources, purpose of the research" among other factors. (p. 87).

  2. Sample sizes for saturation in qualitative research: A systematic

    Our results can be used to demonstrate that 'small' sample sizes are effective for qualitative research and to show why they are effective - because they are able to reach saturation, the long-held benchmark for an adequate sample size in qualitative research. Furthermore, our results show what a 'small' sample actually is, by ...

  3. Series: Practical guidance to qualitative research. Part 3: Sampling

    In quantitative research, by contrast, the sample size is determined by a power calculation. The usually small sample size in qualitative research depends on the information richness of the data, the variety of participants (or other units), the broadness of the research question and the phenomenon, the data collection method (e.g., individual ...

  4. Characterising and justifying sample size sufficiency in interview

    Sample size in qualitative research has been the subject of enduring discussions [4, 10, 11]. ... The funding body did not have any role in the study design, the collection, analysis and interpretation of the data, in the writing of the paper, and in the decision to submit the manuscript for publication. ...

  5. Sample Size in Qualitative Interview Studies: Guided by Information

    The prevailing concept for sample size in qualitative studies is "saturation." Saturation is closely tied to a specific methodology, and the term is inconsistently applied. We propose the concept "information power" to guide adequate sample size for qualitative studies. Information power indicates that the more information the sample holds ...

  6. Sample size for qualitative research

    Sample size in qualitative research is always mentioned by reviewers of qualitative papers but discussion tends to be simplistic and relatively uninformed. The current paper draws attention to how sample sizes, at both ends of the size continuum, can be justified by researchers. This will also aid reviewers in their making of comments about the ...

  7. Sample size in qualitative research

    Determining adequate sample size in qualitative research is ultimately a matter of judgment and experience in evaluating the quality of the information collected against the uses to which it will be put, the particular research method and purposeful sampling strategy employed, and the research product intended. ©1995 John Wiley & Sons, Inc. ...

  8. Determining the Sample Size in Qualitative Research

    finds a variation of the sample size from 1 to 95 (averages being 31 in the first case and 28 in the second). The research region - one of the cultural factors - plays a significant role in ...

  9. Sample Sizes in Qualitative UX Research: A Definitive Guide

    A formula for determining qualitative sample size. In 2013, Research by Design published a whitepaper by Donna Bonde which included research-backed guidelines for qualitative sampling in a market research context. Victor Yocco, writing in 2017, drew on these guidelines to create a formula for determining qualitative sample sizes.

  10. Determining Appropriate Sample Size for Qualitative Interviews: Code

    This body of work has advanced the evidence base for sample size estimation in qualitative inquiry during the design phase of a study, prior to data collection, but it does not provide qualitative ...

  11. Sample size for qualitative research.

    Purpose: Qualitative researchers have been criticised for not justifying sample size decisions in their research. This short paper addresses the issue of which sample sizes are appropriate and valid within different approaches to qualitative research. Design/methodology/approach: The sparse literature on sample sizes in qualitative research is reviewed and discussed. This examination is ...

  12. (PDF) Qualitative Research Designs, Sample Size and Saturation: Is

    Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period. BMC Medical Research Methodology. 18(1 ...

  13. Sample Size and its Importance in Research

    The sample size for a study needs to be estimated at the time the study is proposed; too large a sample is unnecessary and unethical, and too small a sample is unscientific and also unethical. The necessary sample size can be calculated, using statistical software, based on certain assumptions. If no assumptions can be made, then an arbitrary ...
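    The power-calculation approach this snippet refers to can be sketched with the standard normal-approximation formula for comparing two group means. This is a simplified illustration, not the exact routine any particular statistical package implements; the function name and its default parameters are assumptions made for the example:

    ```python
    from math import ceil
    from statistics import NormalDist

    def two_group_sample_size(d, alpha=0.05, power=0.80):
        """Per-group n for a two-sample comparison of means, using the
        normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
        where d is the standardised effect size (Cohen's d)."""
        z = NormalDist()
        z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
        z_beta = z.inv_cdf(power)            # quantile for desired power
        return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

    # A medium effect (d = 0.5) at alpha = 0.05 and 80% power
    print(two_group_sample_size(0.5))  # -> 63 per group
    ```

    Note that exact t-based calculations (as produced by dedicated software) give slightly larger values than this normal approximation, which is why the assumptions behind any such calculation should be stated.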

  14. Characterising and justifying sample size sufficiency in interview

    Sample adequacy in qualitative inquiry pertains to the appropriateness of the sample composition and size. It is an important consideration in evaluations of the quality and trustworthiness of much qualitative research [] and is implicated - particularly for research that is situated within a post-positivist tradition and retains a degree of commitment to realist ontological premises - in ...

  15. PDF Sim J Can sample size in qualitative research be determined a priori

    determining sample size a priori is inherently problematic in qualitative research, given that sample size is often adaptive and emergent, and - particularly if based on a grounded theory approach - adopts the principle of saturation. Saturation is operationalized in different ways.

  16. Sample size for qualitative research

    Purpose Qualitative researchers have been criticised for not justifying sample size decisions in their research. This short paper addresses the issue of which sample sizes are appropriate and valid within different approaches to qualitative research. Design/methodology/approach The sparse literature on sample sizes in qualitative research is reviewed and discussed. This examination is informed ...

  17. Qualitative Sample Size Calculator

    What is a good sample size for a qualitative research study? Our sample size calculator will work out the answer based on your project's scope, participant characteristics, researcher expertise, and methodology. Just answer 4 quick questions to get a super actionable, data-backed recommendation for your next study.

  18. Sample size for qualitative research

    Marshall, Cardon, Poddar and Fontenot considered 81 qualitative studies and concluded that scant attention was paid to estimating or justifying sample sizes. The question of what sample size is needed for qualitative research is frequently asked by individual researchers (Dworkin, 2012) but not frequently discussed in the literature ...

  19. Sample size for qualitative research

    Sample size for qualitative research. September 2016. Qualitative Market Research An International Journal 19 (4):426-432. DOI: 10.1108/QMR-06-2016-0053. Authors: C.R. Boddy. Anglia Ruskin ...

  21. How to Choose a Sample Size in Qualitative Research

    Below are four points to keep in mind when thinking about sample size: Quality over quantity. Qualitative market research aims to tease out insights from a specific demographic, whether they are ...

  22. Determining Sample Size in BI Research

    Determining the correct sample size for qualitative research involves considering the purpose of the study, the diversity of the population, and the resources available. Unlike quantitative ...

  23. Qualitative Research: Definition, Methodology, Limitation, Examples

    Find out what qualitative research is, with its limitations, types, and examples. ... it also gives the marketer the opportunity to read the body language of the respondent and match the responses. 2. Focus groups ... Sample size: large (quantitative) vs. small (qualitative); researcher role: objective observer ...

  24. (PDF) Characterising and justifying sample size sufficiency in

    Background: Choosing a suitable sample size in qualitative research is an area of conceptual debate and practical uncertainty. That sample size principles, guidelines and tools have been developed ...

  25. Exploring potential EQ-5D bolt-on dimensions with a qualitative

    The introduction of bolt-on dimensions in EQ-5D instruments is becoming common, but most bolt-on studies have targeted diseased populations and obtained bolt-on items from other existing Health-related Quality of Life (HRQoL) instruments. As the qualitative approach offers important evidence to support the consistency and design of potential bolt-on items, this paper studies the Hong Kong SAR ...

  26. Qualitative Significance as First-Class Evidence in the Design and

    Anticipating a possibly small effect size (Cohen's f = 0.15, with power (1 − β) = 0.80 and α = 0.05), a minimum sample size of 492 women (123 in each group) was required. Due to the possibility of loss during the follow-up phase, the sample was increased by 10%, thus requiring a minimum of 540 women (135 in each group).
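    The attrition adjustment in this snippet (123 per group, inflated by 10% to 135, giving 540 across four groups) can be reproduced with a small helper. The function name and the choice to round the inflated per-group size to the nearest whole participant are assumptions made for illustration:

    ```python
    def inflate_for_attrition(n_per_group, n_groups, loss_rate=0.10):
        """Add an expected-attrition margin to each group's sample size
        and return (inflated per-group n, inflated total n)."""
        inflated = round(n_per_group * (1 + loss_rate))
        return inflated, inflated * n_groups

    per_group, total = inflate_for_attrition(123, 4)
    print(per_group, total)  # -> 135 540
    ```

    Applying the margin per group and then summing, rather than inflating the overall total (492 × 1.1 ≈ 541), reproduces the figures reported in the snippet.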

  27. Paramedics' experiences of barriers to, and enablers of, responding to

    Study design. An exploratory-descriptive qualitative approach [] was applied to understand the experience of paramedics during the COVID-19 pandemic. A constructivist paradigm was chosen to explore paramedics' experiences because it assumes there are multiple subjective realities, insider knowledge can be valuable, there is a holistic emphasis on the experience being investigated, and rich ...