Purpose and Limitations of Random Assignment

In an experimental study, random assignment is the process by which participants are assigned, with equal probability, to either a treatment or a control group. The goal is to ensure an unbiased assignment of participants to treatment options.
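In code, the simplest version of this process can be sketched as follows (a minimal Python illustration; the participant IDs and group labels are placeholders, not part of any particular study):

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them evenly into
    a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Twenty hypothetical participants, numbered 1 to 20.
groups = randomly_assign(range(1, 21), seed=42)
print(len(groups["treatment"]), len(groups["control"]))  # 10 10
```

Because the shuffle is random, every participant has the same 50% chance of ending up in either group, which is exactly the unbiased-assignment property described above.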

Random assignment is considered the gold standard for achieving comparability across study groups, and therefore is the best method for inferring a causal relationship between a treatment (or intervention or risk factor) and an outcome.

Representation of random assignment in an experimental study

Random assignment of participants produces groups that are comparable in the participants’ initial characteristics, so that any difference detected at the end of the study between the treatment and the control group can be attributed to the treatment alone.

How does random assignment produce comparable groups?

1. Random assignment prevents selection bias

Randomization works by removing both the researcher’s and the participants’ influence on the treatment allocation. Because the allocation is done at random, i.e. in a non-predictable way, it can no longer be biased.

This is in contrast with the real world, where for example, the sickest people are more likely to receive the treatment.

2. Random assignment prevents confounding

A confounding variable is one that is associated with both the intervention and the outcome, and thus can affect the outcome in 2 ways:

Causal diagram representing how confounding works

Either directly:

Direct influence of confounding on the outcome

Or indirectly through the treatment:

Indirect influence of confounding on the outcome

This indirect relationship between the confounding variable and the outcome can cause the treatment to appear to have an influence on the outcome while in reality the treatment is just a mediator of that effect (as it happens to be on the causal pathway between the confounder and the outcome).

Random assignment eliminates the influence of confounding variables on the treatment, since it distributes them at random between the study groups, thereby ruling out this alternative path or explanation of the outcome.

How random assignment protects from confounding

3. Random assignment also eliminates other threats to internal validity

By distributing all threats (known and unknown) at random between study groups, participants in both the treatment and the control group become equally subject to the effect of any threat to validity. Therefore, comparing the outcome between the 2 groups will bypass the effect of these threats and will only reflect the effect of the treatment on the outcome.

These threats include:

  • History: This is any event that co-occurs with the treatment and can affect the outcome.
  • Maturation: This is the effect of time on the study participants (e.g. participants becoming wiser, hungrier, or more stressed with time) which might influence the outcome.
  • Regression to the mean: This happens when participants’ scores on a pre-treatment measurement are extreme (e.g. exceptionally good); their post-treatment scores will naturally regress toward the mean — in simple terms, regression happens because an exceptional performance is hard to maintain. This effect can bias the study since it represents an alternative explanation of the outcome.
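Regression to the mean is easy to demonstrate with a small simulation (a sketch; the score model — a stable true score plus independent measurement noise, with illustrative means and standard deviations — is an assumption chosen purely for illustration):

```python
import random

rng = random.Random(0)

# Each participant has a stable "true" score; every measurement adds
# independent noise (this score model is an assumption for illustration).
true_scores = [rng.gauss(50, 10) for _ in range(10_000)]
pre = [t + rng.gauss(0, 10) for t in true_scores]
post = [t + rng.gauss(0, 10) for t in true_scores]

# Select participants whose pre-treatment score was exceptionally high.
top = [i for i, score in enumerate(pre) if score > 70]
mean_pre = sum(pre[i] for i in top) / len(top)
mean_post = sum(post[i] for i in top) / len(top)

# With no treatment at all, this group's post scores drift back toward
# the overall mean of 50 -- pure regression to the mean.
print(round(mean_pre, 1), round(mean_post, 1))
```

The selected group’s second measurement is noticeably closer to the population mean even though nothing was done to them, which is why an untreated comparison group is needed to separate this effect from a real treatment effect.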

Note that randomization does not prevent these effects from happening, it just allows us to control them by reducing their risk of being associated with the treatment.

What if random assignment produced unequal groups?

Question: What should you do if, after randomly assigning participants, the 2 groups still differ in participants’ characteristics? More precisely, what if randomization accidentally did not balance, between the 2 groups, risk factors that could serve as alternative explanations of the outcome? (For example, one group includes more male, sicker, or older participants than the other group.)

Short answer: This is perfectly normal, since randomization only ensures an unbiased assignment of participants to groups, i.e. it produces comparable groups, but it does not guarantee the equality of these groups.

A more complete answer: Randomization will not and cannot create 2 groups that are equal on each and every characteristic, because randomization always involves an element of chance. If you want 2 perfectly equal groups, you would have to match them manually, as is done in a matched pairs design (for more information see my article on matched pairs design).

This is similar to throwing a die: if you throw it 10 times, the observed proportion of a specific outcome will generally not be exactly 1/6. But it will approach 1/6 if you repeat the experiment a very large number of times and calculate the average proportion of times that outcome turned up.

So randomization will not produce perfectly equal groups for each specific study, especially if the study has a small sample size. But do not forget that scientific evidence is a long and continuous process, and the groups will tend to be equal in the long run when a meta-analysis aggregates the results of a large number of randomized studies.
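The effect of sample size on chance imbalance can be illustrated with a quick simulation (a sketch; the binary characteristic and its 50% prevalence are assumptions chosen for illustration):

```python
import random

rng = random.Random(1)

def average_imbalance(n_per_group, n_trials=2000):
    """Mean absolute difference between two randomized groups in the
    proportion of a binary characteristic (e.g. male sex, prevalence 50%)."""
    total = 0.0
    for _ in range(n_trials):
        a = sum(rng.random() < 0.5 for _ in range(n_per_group)) / n_per_group
        b = sum(rng.random() < 0.5 for _ in range(n_per_group)) / n_per_group
        total += abs(a - b)
    return total / n_trials

small, large = average_imbalance(10), average_imbalance(1000)
# Chance imbalance shrinks roughly as 1 / sqrt(sample size).
print(round(small, 3), round(large, 3))
```

With 10 participants per group, the two groups typically differ by well over ten percentage points on this characteristic; with 1000 per group, the typical difference is around two percentage points, which is the sense in which randomization balances groups only on average and in large samples.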

So for each individual study, differences between the treatment and control group will exist and will influence the study results. This means that the results of a randomized trial will sometimes be wrong, and this is absolutely okay.

BOTTOM LINE:

Although the results of a particular randomized study are unbiased, they will still be affected by sampling error due to chance. The real benefit of random assignment emerges when data are aggregated across many studies in a meta-analysis.

Limitations of random assignment

Randomized designs can suffer from:

1. Ethical issues:

Randomization is ethical only if the researcher has no evidence that one treatment is superior to the other.

Also, it would be unethical to randomly assign participants to harmful exposures such as smoking or dangerous chemicals.

2. Low external validity:

With random assignment, external validity (i.e. the generalizability of the study results) is compromised because the results of a study that uses random assignment represent what would happen under “ideal” experimental conditions, which is in general very different from what happens at the population level.

In the real world, people who take the treatment might be very different from those who don’t – so the assignment of participants is not a random event, but rather under the influence of all sorts of external factors.

External validity can also be jeopardized in cases where not all participants are eligible or willing to accept the terms of the study.

3. Higher cost of implementation:

An experimental design with random assignment is typically more expensive than observational studies where the investigator’s role is just to observe events without intervening.

Experimental designs also typically take a lot of time to implement, and therefore are less practical when a quick answer is needed.

4. Impracticality when answering non-causal questions:

A randomized trial is our best bet when the question is to find the causal effect of a treatment or a risk factor.

Sometimes however, the researcher is just interested in predicting the probability of an event or a disease given some risk factors. In this case, the causal relationship between these variables is not important, making observational designs more suitable for such problems.

5. Impracticality when studying the effect of variables that cannot be manipulated:

The usual objective of studying the effects of risk factors is to propose recommendations that involve changing the level of exposure to these factors.

However, some risk factors cannot be manipulated, and so it does not make any sense to study them in a randomized trial. For example it would be impossible to randomly assign participants to age categories, gender, or genetic factors.

6. Difficulty controlling participants:

These difficulties include:

  • Participants refusing to receive the assigned treatment.
  • Participants not adhering to recommendations.
  • Differential loss to follow-up between those who receive the treatment and those who don’t.

All of these issues might occur in a randomized trial, but might not affect an observational study.


Further reading

  • Posttest-Only Control Group Design
  • Pretest-Posttest Control Group Design
  • Randomized Block Design


6.2 Experimental Design

Learning Objectives

  • Explain the difference between between-subjects and within-subjects experiments, list some of the pros and cons of each approach, and decide which approach to use to answer a particular research question.
  • Define random assignment, distinguish it from random sampling, explain its purpose in experimental research, and use some simple strategies to implement it.
  • Define what a control condition is, explain its purpose in research on treatment effectiveness, and describe some alternative types of control conditions.
  • Define several types of carryover effect, give examples of each, and explain how counterbalancing helps to deal with them.

In this section, we look at some different ways to design an experiment. The primary distinction we will make is between approaches in which each participant experiences one level of the independent variable and approaches in which each participant experiences all levels of the independent variable. The former are called between-subjects experiments and the latter are called within-subjects experiments.

Between-Subjects Experiments

In a between-subjects experiment, each participant is tested in only one condition. For example, a researcher with a sample of 100 college students might assign half of them to write about a traumatic event and the other half to write about a neutral event. Or a researcher with a sample of 60 people with severe agoraphobia (fear of open spaces) might assign 20 of them to receive each of three different treatments for that disorder. It is essential in a between-subjects experiment that the researcher assign participants to conditions so that the different groups are, on average, highly similar to each other. Those in a trauma condition and a neutral condition, for example, should include a similar proportion of men and women, and they should have similar average intelligence quotients (IQs), similar average levels of motivation, similar average numbers of health problems, and so on. This is a matter of controlling these extraneous participant variables across conditions so that they do not become confounding variables.

Random Assignment

The primary way that researchers accomplish this kind of control of extraneous variables across conditions is called random assignment, which means using a random process to decide which participants are tested in which conditions. Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population, and it is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other fields too.

In its strictest sense, random assignment should meet two criteria. One is that each participant has an equal chance of being assigned to each condition (e.g., a 50% chance of being assigned to each of two conditions). The second is that each participant is assigned to a condition independently of other participants. Thus one way to assign participants to two conditions would be to flip a coin for each one. If the coin lands heads, the participant is assigned to Condition A, and if it lands tails, the participant is assigned to Condition B. For three conditions, one could use a computer to generate a random integer from 1 to 3 for each participant. If the integer is 1, the participant is assigned to Condition A; if it is 2, the participant is assigned to Condition B; and if it is 3, the participant is assigned to Condition C. In practice, a full sequence of conditions—one for each participant expected to be in the experiment—is usually created ahead of time, and each new participant is assigned to the next condition in the sequence as he or she is tested. When the procedure is computerized, the computer program often handles the random assignment.
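The coin-flip and random-integer procedures just described can be sketched in a few lines of Python (the condition labels and the sequence length are placeholders for illustration):

```python
import random

rng = random.Random(2024)

def assign_two_conditions(n_participants):
    """Flip a fair coin independently for each participant:
    heads -> Condition A, tails -> Condition B."""
    return ["A" if rng.random() < 0.5 else "B" for _ in range(n_participants)]

def assign_three_conditions(n_participants):
    """Generate a random integer from 1 to 3 for each participant."""
    labels = {1: "A", 2: "B", 3: "C"}
    return [labels[rng.randint(1, 3)] for _ in range(n_participants)]

# The full sequence is usually generated ahead of time; each new
# participant simply takes the next condition in the sequence.
sequence = assign_three_conditions(30)
print(sequence[:5])
```

Note that both procedures satisfy the two strict criteria: every participant has an equal chance of each condition, and each assignment is made independently of the others.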

One problem with coin flipping and other strict procedures for random assignment is that they are likely to result in unequal sample sizes in the different conditions. Unequal sample sizes are generally not a serious problem, and you should never throw away data you have already collected to achieve equal sample sizes. However, for a fixed number of participants, it is statistically most efficient to divide them into equal-sized groups. It is standard practice, therefore, to use a kind of modified random assignment that keeps the number of participants in each group as similar as possible. One approach is block randomization. In block randomization, all the conditions occur once in the sequence before any of them is repeated. Then they all occur again before any of them is repeated again. Within each of these “blocks,” the conditions occur in a random order. Again, the sequence of conditions is usually generated before any participants are tested, and each new participant is assigned to the next condition in the sequence. Table 6.2 “Block Randomization Sequence for Assigning Nine Participants to Three Conditions” shows such a sequence for assigning nine participants to three conditions. The Research Randomizer website (http://www.randomizer.org) will generate block randomization sequences for any number of participants and conditions. Again, when the procedure is computerized, the computer program often handles the block randomization.

Table 6.2 Block Randomization Sequence for Assigning Nine Participants to Three Conditions
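Block randomization as described above can be sketched as follows (a minimal illustration; the condition labels are placeholders):

```python
import random

def block_randomization(conditions, n_participants, seed=None):
    """Build an assignment sequence in which every condition occurs once,
    in random order, within each block before any condition repeats."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        block = list(conditions)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]

# Nine participants, three conditions: three blocks containing A, B, C each.
seq = block_randomization(["A", "B", "C"], 9, seed=7)
print(seq)
```

Because every block contains each condition exactly once, the group sizes can never differ by more than one participant, no matter where recruitment stops.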

Random assignment is not guaranteed to control all extraneous variables across conditions. It is always possible that just by chance, the participants in one condition might turn out to be substantially older, less tired, more motivated, or less depressed on average than the participants in another condition. However, there are some reasons that this is not a major concern. One is that random assignment works better than one might expect, especially for large samples. Another is that the inferential statistics that researchers use to decide whether a difference between groups reflects a difference in the population take the “fallibility” of random assignment into account. Yet another reason is that even if random assignment does result in a confounding variable and therefore produces misleading results, this is likely to be detected when the experiment is replicated. The upshot is that random assignment to conditions—although not infallible in terms of controlling extraneous variables—is always considered a strength of a research design.

Treatment and Control Conditions

Between-subjects experiments are often used to determine whether a treatment works. In psychological research, a treatment is any intervention meant to change people’s behavior for the better. This includes psychotherapies and medical treatments for psychological disorders but also interventions designed to improve learning, promote conservation, reduce prejudice, and so on. To determine whether a treatment works, participants are randomly assigned to either a treatment condition, in which they receive the treatment, or a control condition, in which they do not receive the treatment. If participants in the treatment condition end up better off than participants in the control condition—for example, they are less depressed, learn faster, conserve more, express less prejudice—then the researcher can conclude that the treatment works. In research on the effectiveness of psychotherapies and medical treatments, this type of experiment is often called a randomized clinical trial.

There are different types of control conditions. In a no-treatment control condition, participants receive no treatment whatsoever. One problem with this approach, however, is the existence of placebo effects. A placebo is a simulated treatment that lacks any active ingredient or element that should make it effective, and a placebo effect is a positive effect of such a treatment. Many folk remedies that seem to work—such as eating chicken soup for a cold or placing soap under the bedsheets to stop nighttime leg cramps—are probably nothing more than placebos. Although placebo effects are not well understood, they are probably driven primarily by people’s expectations that they will improve. Having the expectation to improve can result in reduced stress, anxiety, and depression, which can alter perceptions and even improve immune system functioning (Price, Finniss, & Benedetti, 2008).

Placebo effects are interesting in their own right (see Note 6.28 “The Powerful Placebo”), but they also pose a serious problem for researchers who want to determine whether a treatment works. Figure 6.2 “Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions” shows some hypothetical results in which participants in a treatment condition improved more on average than participants in a no-treatment control condition. If these conditions (the two leftmost bars in Figure 6.2 “Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions”) were the only conditions in this experiment, however, one could not conclude that the treatment worked. It could be instead that participants in the treatment group improved more because they expected to improve, while those in the no-treatment control condition did not.

Figure 6.2 Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions


Fortunately, there are several solutions to this problem. One is to include a placebo control condition, in which participants receive a placebo that looks much like the treatment but lacks the active ingredient or element thought to be responsible for the treatment’s effectiveness. When participants in a treatment condition take a pill, for example, then those in a placebo control condition would take an identical-looking pill that lacks the active ingredient in the treatment (a “sugar pill”). In research on psychotherapy effectiveness, the placebo might involve going to a psychotherapist and talking in an unstructured way about one’s problems. The idea is that if participants in both the treatment and the placebo control groups expect to improve, then any improvement in the treatment group over and above that in the placebo control group must have been caused by the treatment and not by participants’ expectations. This is what is shown by a comparison of the two outer bars in Figure 6.2 “Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions”.

Of course, the principle of informed consent requires that participants be told that they will be assigned to either a treatment or a placebo control condition—even though they cannot be told which until the experiment ends. In many cases the participants who had been in the control condition are then offered an opportunity to have the real treatment. An alternative approach is to use a waitlist control condition, in which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it. This allows researchers to compare participants who have received the treatment with participants who are not currently receiving it but who still expect to improve (eventually). A final solution to the problem of placebo effects is to leave out the control condition completely and compare any new treatment with the best available alternative treatment. For example, a new treatment for simple phobia could be compared with standard exposure therapy. Because participants in both conditions receive a treatment, their expectations about improvement should be similar. This approach also makes sense because once there is an effective treatment, the interesting question about a new treatment is not simply “Does it work?” but “Does it work better than what is already available?”

The Powerful Placebo

Many people are not surprised that placebos can have a positive effect on disorders that seem fundamentally psychological, including depression, anxiety, and insomnia. However, placebos can also have a positive effect on disorders that most people think of as fundamentally physiological. These include asthma, ulcers, and warts (Shapiro & Shapiro, 1999). There is even evidence that placebo surgery—also called “sham surgery”—can be as effective as actual surgery.

Medical researcher J. Bruce Moseley and his colleagues conducted a study on the effectiveness of two arthroscopic surgery procedures for osteoarthritis of the knee (Moseley et al., 2002). The control participants in this study were prepped for surgery, received a tranquilizer, and even received three small incisions in their knees. But they did not receive the actual arthroscopic surgical procedure. The surprising result was that all participants improved in terms of both knee pain and function, and the sham surgery group improved just as much as the treatment groups. According to the researchers, “This study provides strong evidence that arthroscopic lavage with or without débridement [the surgical procedures used] is not better than and appears to be equivalent to a placebo procedure in improving knee pain and self-reported function” (p. 85).


Research has shown that patients with osteoarthritis of the knee who receive a “sham surgery” experience reductions in pain and improvement in knee function similar to those of patients who receive a real surgery.

Army Medicine – Surgery – CC BY 2.0.

Within-Subjects Experiments

In a within-subjects experiment, each participant is tested under all conditions. Consider an experiment on the effect of a defendant’s physical attractiveness on judgments of his guilt. Again, in a between-subjects experiment, one group of participants would be shown an attractive defendant and asked to judge his guilt, and another group of participants would be shown an unattractive defendant and asked to judge his guilt. In a within-subjects experiment, however, the same group of participants would judge the guilt of both an attractive and an unattractive defendant.

The primary advantage of this approach is that it provides maximum control of extraneous participant variables. Participants in all conditions have the same mean IQ, same socioeconomic status, same number of siblings, and so on—because they are the very same people. Within-subjects experiments also make it possible to use statistical procedures that remove the effect of these extraneous participant variables on the dependent variable and therefore make the data less “noisy” and the effect of the independent variable easier to detect. We will look more closely at this idea later in the book.

Carryover Effects and Counterbalancing

The primary disadvantage of within-subjects designs is that they can result in carryover effects. A carryover effect is an effect of being tested in one condition on participants’ behavior in later conditions. One type of carryover effect is a practice effect, where participants perform a task better in later conditions because they have had a chance to practice it. Another type is a fatigue effect, where participants perform a task worse in later conditions because they become tired or bored. Being tested in one condition can also change how participants perceive stimuli or interpret their task in later conditions. This is called a context effect. For example, an average-looking defendant might be judged more harshly when participants have just judged an attractive defendant than when they have just judged an unattractive defendant. Within-subjects experiments also make it easier for participants to guess the hypothesis. For example, a participant who is asked to judge the guilt of an attractive defendant and then is asked to judge the guilt of an unattractive defendant is likely to guess that the hypothesis is that defendant attractiveness affects judgments of guilt. This could lead the participant to judge the unattractive defendant more harshly because he thinks this is what he is expected to do. Or it could make participants judge the two defendants similarly in an effort to be “fair.”

Carryover effects can be interesting in their own right. (Does the attractiveness of one person depend on the attractiveness of other people that we have seen recently?) But when they are not the focus of the research, carryover effects can be problematic. Imagine, for example, that participants judge the guilt of an attractive defendant and then judge the guilt of an unattractive defendant. If they judge the unattractive defendant more harshly, this might be because of his unattractiveness. But it could be instead that they judge him more harshly because they are becoming bored or tired. In other words, the order of the conditions is a confounding variable. The attractive condition is always the first condition and the unattractive condition the second. Thus any difference between the conditions in terms of the dependent variable could be caused by the order of the conditions and not the independent variable itself.

There is a solution to the problem of order effects, however, that can be used in many situations. It is counterbalancing, which means testing different participants in different orders. For example, some participants would be tested in the attractive defendant condition followed by the unattractive defendant condition, and others would be tested in the unattractive condition followed by the attractive condition. With three conditions, there would be six different orders (ABC, ACB, BAC, BCA, CAB, and CBA), so some participants would be tested in each of the six orders. With counterbalancing, participants are assigned to orders randomly, using the techniques we have already discussed. Thus random assignment plays an important role in within-subjects designs just as in between-subjects designs. Here, instead of participants being randomly assigned to conditions, they are randomly assigned to different orders of conditions. In fact, it can safely be said that if a study does not involve random assignment in one form or another, it is not an experiment.
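Counterbalancing by random assignment to orders can be sketched as follows (a minimal illustration; the condition labels and number of participants are placeholders):

```python
import itertools
import random

def counterbalanced_orders(conditions, n_participants, seed=None):
    """Randomly assign each participant to one of all possible orders
    of the conditions (6 orders for 3 conditions: ABC, ACB, ..., CBA)."""
    rng = random.Random(seed)
    orders = list(itertools.permutations(conditions))
    return [rng.choice(orders) for _ in range(n_participants)]

assignments = counterbalanced_orders(["A", "B", "C"], 12, seed=3)
print(assignments[0])  # one participant's order of conditions, e.g. ('B', 'A', 'C')
```

Each participant still experiences every condition; only the order varies, so order is no longer perfectly confounded with condition.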

There are two ways to think about what counterbalancing accomplishes. One is that it controls the order of conditions so that it is no longer a confounding variable. Instead of the attractive condition always being first and the unattractive condition always being second, the attractive condition comes first for some participants and second for others. Likewise, the unattractive condition comes first for some participants and second for others. Thus any overall difference in the dependent variable between the two conditions cannot have been caused by the order of conditions. A second way to think about what counterbalancing accomplishes is that if there are carryover effects, it makes it possible to detect them. One can analyze the data separately for each order to see whether it had an effect.

When 9 Is “Larger” Than 221

Researcher Michael Birnbaum has argued that the lack of context provided by between-subjects designs is often a bigger problem than the context effects created by within-subjects designs. To demonstrate this, he asked one group of participants to rate how large the number 9 was on a 1-to-10 rating scale and another group to rate how large the number 221 was on the same 1-to-10 rating scale (Birnbaum, 1999). Participants in this between-subjects design gave the number 9 a mean rating of 5.13 and the number 221 a mean rating of 3.10. In other words, they rated 9 as larger than 221! According to Birnbaum, this is because participants spontaneously compared 9 with other one-digit numbers (in which case it is relatively large) and compared 221 with other three-digit numbers (in which case it is relatively small).

Simultaneous Within-Subjects Designs

So far, we have discussed an approach to within-subjects designs in which participants are tested in one condition at a time. There is another approach, however, that is often used when participants make multiple responses in each condition. Imagine, for example, that participants judge the guilt of 10 attractive defendants and 10 unattractive defendants. Instead of having people make judgments about all 10 defendants of one type followed by all 10 defendants of the other type, the researcher could present all 20 defendants in a sequence that mixed the two types. The researcher could then compute each participant’s mean rating for each type of defendant. Or imagine an experiment designed to see whether people with social anxiety disorder remember negative adjectives (e.g., “stupid,” “incompetent”) better than positive ones (e.g., “happy,” “productive”). The researcher could have participants study a single list that includes both kinds of words and then have them try to recall as many words as possible. The researcher could then count the number of each type of word that was recalled. There are many ways to determine the order in which the stimuli are presented, but one common way is to generate a different random order for each participant.
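Generating a different mixed presentation order for each participant can be sketched as follows (the stimulus labels are hypothetical placeholders for the 10 attractive and 10 unattractive defendants mentioned above):

```python
import random

def mixed_presentation_order(stimuli_by_type, seed=None):
    """Interleave all stimuli into one random presentation sequence;
    in practice a fresh random order is generated for each participant."""
    rng = random.Random(seed)
    combined = [s for stimuli in stimuli_by_type.values() for s in stimuli]
    rng.shuffle(combined)
    return combined

# Hypothetical stimulus labels for illustration.
stimuli = {
    "attractive": [f"attractive_{i}" for i in range(10)],
    "unattractive": [f"unattractive_{i}" for i in range(10)],
}
order = mixed_presentation_order(stimuli, seed=11)
print(order[:4])
```

The researcher would then compute each participant’s mean response separately for each stimulus type, as the text describes.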


Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

5.2 Experimental Design

Learning Objectives

  • Explain the difference between between-subjects and within-subjects experiments, list some of the pros and cons of each approach, and decide which approach to use to answer a particular research question.
  • Define random assignment, distinguish it from random sampling, explain its purpose in experimental research, and use some simple strategies to implement it.
  • Define several types of carryover effect, give examples of each, and explain how counterbalancing helps to deal with them.

In this section, we look at some different ways to design an experiment. The primary distinction we will make is between approaches in which each participant experiences one level of the independent variable and approaches in which each participant experiences all levels of the independent variable. The former are called between-subjects experiments and the latter are called within-subjects experiments.

Between-Subjects Experiments

In a between-subjects experiment, each participant is tested in only one condition. For example, a researcher with a sample of 100 university students might assign half of them to write about a traumatic event and the other half to write about a neutral event. Or a researcher with a sample of 60 people with severe agoraphobia (fear of open spaces) might assign 20 of them to receive each of three different treatments for that disorder. It is essential in a between-subjects experiment that the researcher assigns participants to conditions so that the different groups are, on average, highly similar to each other. Those in a trauma condition and a neutral condition, for example, should include a similar proportion of men and women, and they should have similar average intelligence quotients (IQs), similar average levels of motivation, similar average numbers of health problems, and so on. This matching is a matter of controlling these extraneous participant variables across conditions so that they do not become confounding variables.

Random Assignment

The primary way that researchers accomplish this kind of control of extraneous variables across conditions is called random assignment, which means using a random process to decide which participants are tested in which conditions. Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population, and it is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other fields too.

In its strictest sense, random assignment should meet two criteria. One is that each participant has an equal chance of being assigned to each condition (e.g., a 50% chance of being assigned to each of two conditions). The second is that each participant is assigned to a condition independently of other participants. Thus one way to assign participants to two conditions would be to flip a coin for each one. If the coin lands heads, the participant is assigned to Condition A, and if it lands tails, the participant is assigned to Condition B. For three conditions, one could use a computer to generate a random integer from 1 to 3 for each participant. If the integer is 1, the participant is assigned to Condition A; if it is 2, the participant is assigned to Condition B; and if it is 3, the participant is assigned to Condition C. In practice, a full sequence of conditions—one for each participant expected to be in the experiment—is usually created ahead of time, and each new participant is assigned to the next condition in the sequence as he or she is tested. When the procedure is computerized, the computer program often handles the random assignment.
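In code, the strict procedure amounts to an independent, equal-probability draw for each participant, the computerized equivalent of a coin flip. A minimal Python sketch (an illustration, not code from the text):

```python
import random

def simple_random_assignment(n_participants, conditions):
    """Assign each participant to a condition independently and with
    equal probability -- random assignment in its strictest sense."""
    return [random.choice(conditions) for _ in range(n_participants)]

# Two conditions: each participant gets a 50% chance of A or B,
# independently of every other participant.
assignments = simple_random_assignment(10, ["A", "B"])
```

With three conditions, the same call with `["A", "B", "C"]` plays the role of the random integer from 1 to 3 described above.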

One problem with coin flipping and other strict procedures for random assignment is that they are likely to result in unequal sample sizes in the different conditions. Unequal sample sizes are generally not a serious problem, and you should never throw away data you have already collected to achieve equal sample sizes. However, for a fixed number of participants, it is statistically most efficient to divide them into equal-sized groups. It is standard practice, therefore, to use a kind of modified random assignment that keeps the number of participants in each group as similar as possible. One approach is block randomization. In block randomization, all the conditions occur once in the sequence before any of them is repeated. Then they all occur again before any of them is repeated again. Within each of these “blocks,” the conditions occur in a random order. Again, the sequence of conditions is usually generated before any participants are tested, and each new participant is assigned to the next condition in the sequence. Table 5.2 shows such a sequence for assigning nine participants to three conditions. The Research Randomizer website (http://www.randomizer.org) will generate block randomization sequences for any number of participants and conditions. Again, when the procedure is computerized, the computer program often handles the block randomization.
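A block randomization sequence of the kind described (nine participants, three conditions) can be generated with a short Python sketch; this is an illustration, not code from the text:

```python
import random

def block_randomization(n_participants, conditions):
    """Generate an assignment sequence in which every condition occurs
    once, in random order, within each block before any condition repeats."""
    sequence = []
    while len(sequence) < n_participants:
        block = conditions[:]   # one block = one copy of each condition
        random.shuffle(block)   # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]

# Nine participants, three conditions: three blocks, each containing A, B, C.
seq = block_randomization(9, ["A", "B", "C"])
```

Each new participant would simply be assigned to the next condition in `seq`, which guarantees equal group sizes of three.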

Random assignment is not guaranteed to control all extraneous variables across conditions. The process is random, so it is always possible that just by chance, the participants in one condition might turn out to be substantially older, less tired, more motivated, or less depressed on average than the participants in another condition. However, there are some reasons that this possibility is not a major concern. One is that random assignment works better than one might expect, especially for large samples. Another is that the inferential statistics that researchers use to decide whether a difference between groups reflects a difference in the population take the “fallibility” of random assignment into account. Yet another reason is that even if random assignment does result in a confounding variable and therefore produces misleading results, this confound is likely to be detected when the experiment is replicated. The upshot is that random assignment to conditions—although not infallible in terms of controlling extraneous variables—is always considered a strength of a research design.

Matched Groups

An alternative to simple random assignment of participants to conditions is the use of a matched-groups design. Using this design, participants in the various conditions are matched on the dependent variable or on some extraneous variable(s) prior to the manipulation of the independent variable. This guarantees that these variables will not be confounded across the experimental conditions. For instance, if we want to determine whether expressive writing affects people’s health, then we could start by measuring various health-related variables in our prospective research participants. We could then use that information to rank-order participants according to how healthy or unhealthy they are. Next, the two healthiest participants would be randomly assigned to complete different conditions (one would be randomly assigned to the traumatic experiences writing condition and the other to the neutral writing condition). The next two healthiest participants would then be randomly assigned to complete different conditions, and so on until the two least healthy participants. This method would ensure that participants in the traumatic experiences writing condition are matched to participants in the neutral writing condition with respect to health at the beginning of the study. If a difference in health were detected across the two conditions at the end of the experiment, then we would know that it is due to the writing manipulation and not to pre-existing differences in health.
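The rank-and-pair procedure above can be sketched in Python. The participant names, health scores, and condition labels below are invented for illustration, and the sketch assumes an even number of participants:

```python
import random

def matched_groups(scores, conditions=("writing-trauma", "writing-neutral")):
    """Rank participants on a matching variable (here, a health score),
    then randomly split each successive pair between the two conditions."""
    ranked = sorted(scores, key=lambda p: p[1], reverse=True)
    assignment = {}
    for i in range(0, len(ranked), 2):
        pair = [name for name, _ in ranked[i:i + 2]]
        random.shuffle(pair)  # random assignment within each matched pair
        for name, cond in zip(pair, conditions):
            assignment[name] = cond
    return assignment

# Hypothetical participants with baseline health scores.
participants = [("p1", 88), ("p2", 85), ("p3", 70), ("p4", 69)]
groups = matched_groups(participants)
```

Because each matched pair is split between conditions, the two groups start out equated on the matching variable.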

Within-Subjects Experiments

In a within-subjects experiment, each participant is tested under all conditions. Consider an experiment on the effect of a defendant’s physical attractiveness on judgments of his guilt. Again, in a between-subjects experiment, one group of participants would be shown an attractive defendant and asked to judge his guilt, and another group of participants would be shown an unattractive defendant and asked to judge his guilt. In a within-subjects experiment, however, the same group of participants would judge the guilt of both an attractive and an unattractive defendant.

The primary advantage of this approach is that it provides maximum control of extraneous participant variables. Participants in all conditions have the same mean IQ, same socioeconomic status, same number of siblings, and so on—because they are the very same people. Within-subjects experiments also make it possible to use statistical procedures that remove the effect of these extraneous participant variables on the dependent variable and therefore make the data less “noisy” and the effect of the independent variable easier to detect. We will look more closely at this idea later in the book. However, not all experiments can use a within-subjects design, nor would it be desirable to do so.

One disadvantage of within-subjects experiments is that they make it easier for participants to guess the hypothesis. For example, a participant who is asked to judge the guilt of an attractive defendant and then is asked to judge the guilt of an unattractive defendant is likely to guess that the hypothesis is that defendant attractiveness affects judgments of guilt. This knowledge could lead the participant to judge the unattractive defendant more harshly because he thinks this is what he is expected to do. Or it could make participants judge the two defendants similarly in an effort to be “fair.”

Carryover Effects and Counterbalancing

The primary disadvantage of within-subjects designs is that they can result in order effects. An order effect occurs when participants’ responses in the various conditions are affected by the order of conditions to which they were exposed. One type of order effect is a carryover effect. A carryover effect is an effect of being tested in one condition on participants’ behavior in later conditions. One type of carryover effect is a practice effect, where participants perform a task better in later conditions because they have had a chance to practice it. Another type is a fatigue effect, where participants perform a task worse in later conditions because they become tired or bored. Being tested in one condition can also change how participants perceive stimuli or interpret their task in later conditions. This type of effect is called a context effect (or contrast effect). For example, an average-looking defendant might be judged more harshly when participants have just judged an attractive defendant than when they have just judged an unattractive defendant.

Carryover effects can be interesting in their own right. (Does the attractiveness of one person depend on the attractiveness of other people that we have seen recently?) But when they are not the focus of the research, carryover effects can be problematic. Imagine, for example, that participants judge the guilt of an attractive defendant and then judge the guilt of an unattractive defendant. If they judge the unattractive defendant more harshly, this might be because of his unattractiveness. But it could be instead that they judge him more harshly because they are becoming bored or tired. In other words, the order of the conditions is a confounding variable. The attractive condition is always the first condition and the unattractive condition the second. Thus any difference between the conditions in terms of the dependent variable could be caused by the order of the conditions and not the independent variable itself.

There is a solution to the problem of order effects, however, that can be used in many situations. It is counterbalancing, which means testing different participants in different orders. The best method of counterbalancing is complete counterbalancing, in which an equal number of participants complete each possible order of conditions. For example, half of the participants would be tested in the attractive defendant condition followed by the unattractive defendant condition, and the other half would be tested in the unattractive condition followed by the attractive condition. With three conditions, there would be six different orders (ABC, ACB, BAC, BCA, CAB, and CBA), so some participants would be tested in each of the six orders. With four conditions, there would be 24 different orders; with five conditions there would be 120 possible orders. With counterbalancing, participants are assigned to orders randomly, using the techniques we have already discussed. Thus, random assignment plays an important role in within-subjects designs just as in between-subjects designs. Here, instead of being randomly assigned to conditions, participants are randomly assigned to different orders of conditions. In fact, it can safely be said that if a study does not involve random assignment in one form or another, it is not an experiment.
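Complete counterbalancing can be sketched with Python’s itertools.permutations. This is an illustration (not code from the text), and it assumes the number of participants divides evenly among the possible orders:

```python
import itertools
import random

def complete_counterbalancing(conditions, n_participants):
    """Enumerate every possible order of the conditions, then randomly
    assign participants so that each order is used an equal number of times."""
    orders = list(itertools.permutations(conditions))
    if n_participants % len(orders) != 0:
        raise ValueError("participants must divide evenly among the orders")
    pool = orders * (n_participants // len(orders))
    random.shuffle(pool)  # random assignment of participants to orders
    return pool

# Three conditions -> 3! = 6 orders; with 12 participants each order is used twice.
schedule = complete_counterbalancing(["A", "B", "C"], 12)
```

Participant i would then be tested in the order `schedule[i]`.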

A more efficient way of counterbalancing is through a Latin square design, which balances orders using a square grid with one row per order and one column per ordinal position. For example, if you have four treatments, you need four orders. Like a Sudoku puzzle, no treatment can repeat in a row or column. For four versions of four treatments, one such Latin square design would look like:

A B D C
B C A D
C D B A
D A C B

You can see in the diagram above that the square has been constructed to ensure that each condition appears at each ordinal position (A appears first once, second once, third once, and fourth once) and each condition precedes and follows each other condition one time. A Latin square for an experiment with 6 conditions would be 6 x 6 in dimension, one for an experiment with 8 conditions would be 8 x 8 in dimension, and so on. So while complete counterbalancing of 6 conditions would require 720 orders, a Latin square would require only 6 orders.
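A balanced Latin square of this kind can also be constructed programmatically. The Python sketch below (illustrative, not from the text, and valid only for an even number of conditions) uses a standard “zigzag” first row and then shifts it row by row:

```python
def balanced_latin_square(conditions):
    """Balanced Latin square for an even number of conditions: each
    condition appears once in every ordinal position, and each condition
    immediately follows every other condition exactly once."""
    n = len(conditions)
    # Zigzag first row of indices: 0, 1, n-1, 2, n-2, ...
    first, lo, hi = [0], 1, n - 1
    while lo <= hi:
        first.append(lo)
        if lo != hi:
            first.append(hi)
        lo, hi = lo + 1, hi - 1
    # Each subsequent row shifts every entry of the first row by one.
    return [[conditions[(c + k) % n] for c in first] for k in range(n)]

square = balanced_latin_square(["A", "B", "C", "D"])
```

For four conditions this yields four orders with the properties described above: every condition in every ordinal position once, and all twelve ordered adjacencies occurring exactly once.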

Finally, when the number of conditions is large, experiments can use random counterbalancing, in which the order of the conditions is randomly determined for each participant. Using this technique, a random order of the conditions is generated independently for each participant, which amounts to selecting one of the possible orders at random. This is not as powerful a technique as complete counterbalancing or partial counterbalancing using a Latin squares design. Use of random counterbalancing will result in more random error, but if order effects are likely to be small and the number of conditions is large, this is an option available to researchers.
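Random counterbalancing amounts to shuffling the list of conditions independently for each participant, as in this illustrative Python sketch:

```python
import random

def random_counterbalancing(conditions, n_participants):
    """Independently shuffle the condition order for each participant.
    Simple when conditions are many, but balance is not guaranteed."""
    orders = []
    for _ in range(n_participants):
        order = conditions[:]
        random.shuffle(order)  # a fresh random order per participant
        orders.append(order)
    return orders

orders = random_counterbalancing(["A", "B", "C", "D", "E"], 20)
```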

There are two ways to think about what counterbalancing accomplishes. One is that it controls the order of conditions so that it is no longer a confounding variable. Instead of the attractive condition always being first and the unattractive condition always being second, the attractive condition comes first for some participants and second for others. Likewise, the unattractive condition comes first for some participants and second for others. Thus any overall difference in the dependent variable between the two conditions cannot have been caused by the order of conditions. A second way to think about what counterbalancing accomplishes is that if there are carryover effects, it makes it possible to detect them. One can analyze the data separately for each order to see whether it had an effect.

When 9 Is “Larger” Than 221

Researcher Michael Birnbaum has argued that the lack of context provided by between-subjects designs is often a bigger problem than the context effects created by within-subjects designs. To demonstrate this problem, he asked participants to rate how large a number was on a scale of 1-to-10, where 1 was “very very small” and 10 was “very very large”. One group of participants was asked to rate the number 9 and another group was asked to rate the number 221 (Birnbaum, 1999) [1]. Participants in this between-subjects design gave the number 9 a mean rating of 5.13 and the number 221 a mean rating of 3.10. In other words, they rated 9 as larger than 221! According to Birnbaum, this difference is because participants spontaneously compared 9 with other one-digit numbers (in which case it is relatively large) and compared 221 with other three-digit numbers (in which case it is relatively small).

Simultaneous Within-Subjects Designs

So far, we have discussed an approach to within-subjects designs in which participants are tested in one condition at a time. There is another approach, however, that is often used when participants make multiple responses in each condition. Imagine, for example, that participants judge the guilt of 10 attractive defendants and 10 unattractive defendants. Instead of having people make judgments about all 10 defendants of one type followed by all 10 defendants of the other type, the researcher could present all 20 defendants in a sequence that mixed the two types. The researcher could then compute each participant’s mean rating for each type of defendant. Or imagine an experiment designed to see whether people with social anxiety disorder remember negative adjectives (e.g., “stupid,” “incompetent”) better than positive ones (e.g., “happy,” “productive”). The researcher could have participants study a single list that includes both kinds of words and then have them try to recall as many words as possible. The researcher could then count the number of each type of word that was recalled. 
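The mix-the-types-then-average idea can be sketched in Python: present both kinds of trials in one shuffled sequence, then compute the participant’s mean rating per type. The guilt ratings below are invented purely for illustration:

```python
import random
from statistics import mean

# Hypothetical guilt ratings (1-7) for four attractive and four
# unattractive defendants, presented in a single mixed sequence.
trials = [("attractive", r) for r in (2, 3, 3, 4)] + \
         [("unattractive", r) for r in (5, 4, 6, 5)]
random.shuffle(trials)  # intermix the two defendant types during testing

# One mean per defendant type for this participant.
means = {
    kind: mean(r for k, r in trials if k == kind)
    for kind in ("attractive", "unattractive")
}
```

Shuffling changes only the presentation order, not the per-type means, which is exactly why the mixed sequence still supports a within-subjects comparison.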

Between-Subjects or Within-Subjects?

Almost every experiment can be conducted using either a between-subjects design or a within-subjects design. This possibility means that researchers must choose between the two approaches based on their relative merits for the particular situation.

Between-subjects experiments have the advantage of being conceptually simpler and requiring less testing time per participant. They also avoid carryover effects without the need for counterbalancing. Within-subjects experiments have the advantage of controlling extraneous participant variables, which generally reduces noise in the data and makes it easier to detect a relationship between the independent and dependent variables.

A good rule of thumb, then, is that if it is possible to conduct a within-subjects experiment (with proper counterbalancing) in the time that is available per participant—and you have no serious concerns about carryover effects—this design is probably the best option. If a within-subjects design would be difficult or impossible to carry out, then you should consider a between-subjects design instead. For example, if you were testing participants in a doctor’s waiting room or shoppers in line at a grocery store, you might not have enough time to test each participant in all conditions and therefore would opt for a between-subjects design. Or imagine you were trying to reduce people’s level of prejudice by having them interact with someone of another race. A within-subjects design with counterbalancing would require testing some participants in the treatment condition first and then in a control condition. But if the treatment works and reduces people’s level of prejudice, then they would no longer be suitable for testing in the control condition. This difficulty is true for many designs that involve a treatment meant to produce long-term change in participants’ behavior (e.g., studies testing the effectiveness of psychotherapy). Clearly, a between-subjects design would be necessary here.

Remember also that using one type of design does not preclude using the other type in a different study. There is no reason that a researcher could not use both a between-subjects design and a within-subjects design to answer the same research question. In fact, professional researchers often do exactly this.

Key Takeaways

  • Experiments can be conducted using either between-subjects or within-subjects designs. Deciding which to use in a particular situation requires careful consideration of the pros and cons of each approach.
  • Random assignment to conditions in between-subjects experiments or counterbalancing of orders of conditions in within-subjects experiments is a fundamental element of experimental research. The purpose of these techniques is to control extraneous variables so that they do not become confounding variables.

Discussion: For each of the following topics, list the pros and cons of a between-subjects and within-subjects design and decide which would be better.

  • You want to test the relative effectiveness of two training programs for running a marathon.
  • Using photographs of people as stimuli, you want to see if smiling people are perceived as more intelligent than people who are not smiling.
  • In a field experiment, you want to see if the way a panhandler is dressed (neatly vs. sloppily) affects whether or not passersby give him any money.
  • You want to see if concrete nouns (e.g., dog) are recalled better than abstract nouns (e.g., truth).
  • Birnbaum, M. H. (1999). How to show that 9 > 221: Collect judgments in a between-subjects design. Psychological Methods, 4(3), 243–249.


Frequently asked questions

What is random assignment?

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomisation. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Frequently asked questions: Methodology

Quantitative observations involve measuring or counting something and expressing the result in numerical form, while qualitative observations involve describing something in non-numerical terms, such as its appearance, texture, or color.

To make quantitative observations, you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Scope of research is determined at the beginning of your research process, prior to the data collection stage. Sometimes called “scope of study,” your scope delineates what will and will not be covered in your project. It helps you focus your work and your time, ensuring that you’ll be able to achieve your goals and outcomes.

Defining a scope can be very useful in any research project, from a research proposal to a thesis or dissertation. A scope is needed for all types of research: quantitative, qualitative, and mixed methods.

To define your scope of research, consider the following:

  • Budget constraints or any specifics of grant funding
  • Your proposed timeline and duration
  • Specifics about your population of study, your proposed sample size, and the research methodology you’ll pursue
  • Any inclusion and exclusion criteria
  • Any anticipated control, extraneous, or confounding variables that could bias your research if not accounted for properly

Inclusion and exclusion criteria are predominantly used in non-probability sampling. In purposive sampling and snowball sampling, restrictions apply as to who can be included in the sample.

Inclusion and exclusion criteria are typically presented and discussed in the methodology section of your thesis or dissertation.

The purpose of theory-testing mode is to find evidence in order to disprove, refine, or support a theory. As such, generalisability is not the aim of theory-testing mode.

Due to this, the priority of researchers in theory-testing mode is to eliminate alternative causes for relationships between variables. In other words, they prioritise internal validity over external validity, including ecological validity.

Convergent validity shows how much a measure of one construct aligns with other measures of the same or related constructs.

On the other hand, concurrent validity is about how a measure matches up to some known criterion or gold standard, which can be another measure.

Although both types of validity are established by calculating the association or correlation between a test score and another variable, they represent distinct validation methods.

Validity tells you how accurately a method measures what it was designed to measure. There are 4 main types of validity:

  • Construct validity: Does the test measure the construct it was designed to measure?
  • Face validity: Does the test appear to be suitable for its objectives?
  • Content validity: Does the test cover all relevant parts of the construct it aims to measure?
  • Criterion validity: Do the results accurately measure the concrete outcome they are designed to measure?

Criterion validity evaluates how well a test measures the outcome it was designed to measure. An outcome can be, for example, the onset of a disease.

Criterion validity consists of two subtypes depending on the time at which the two measures (the criterion and your test) are obtained:

  • Concurrent validity is a validation strategy where the scores of a test and the criterion are obtained at the same time
  • Predictive validity is a validation strategy where the criterion variables are measured after the scores of the test

Attrition refers to participants leaving a study. It always happens to some extent – for example, in randomised controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased.

Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity. Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity.

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analysing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Construct validity refers to how well a test measures the concept (or construct) it was designed to measure. Assessing construct validity is especially important when you’re researching concepts that can’t be quantified and/or are intangible, like introversion. To ensure construct validity, your test should be based on known indicators of introversion (operationalisation).

On the other hand, content validity assesses how well the test represents all aspects of the construct. If some aspects are missing or irrelevant parts are included, the test has low content validity.

  • Convergent validity indicates whether two tests that measure the same or similar constructs are indeed related
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related

Construct validity has convergent and discriminant subtypes, which together help determine whether a test measures the intended construct.

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalysing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Snowball sampling is a non-probability sampling method. Unlike in probability sampling (which involves some form of random selection), in snowball sampling the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.

This means that you cannot use inferential statistics and make generalisations – often the goal of quantitative research . As such, a snowball sample is not representative of the target population, and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones. 

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extra-marital affairs)

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.
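As a minimal sketch of what "equal probability" means in practice, Python's standard library can draw a simple random sample from a sampling frame (the population list and seed below are hypothetical):

```python
import random

# Hypothetical sampling frame: a list of every member of the population
population = [f"person_{i}" for i in range(1000)]

rng = random.Random(42)                 # seeded only so the sketch is reproducible
sample = rng.sample(population, k=50)   # each member has an equal chance of selection
```

`random.sample` draws without replacement, so no member can appear in the sample twice.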

On the other hand, convenience sampling involves selecting whoever happens to be easiest to reach, which means that not everyone has an equal chance of being selected: inclusion depends on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection , using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.
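The quota-filling procedure described above can be sketched in a few lines of Python; the subgroup labels, quotas, and stream of conveniently available people are all hypothetical:

```python
def quota_sample(stream, quotas):
    """Walk through conveniently available people and keep each one
    only while their subgroup's quota is still unfilled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person, group in stream:
        if counts.get(group, 0) < quotas.get(group, 0):
            sample.append(person)
            counts[group] += 1
        if all(counts[g] >= quotas[g] for g in quotas):
            break   # every quota is filled; stop recruiting
    return sample

# Hypothetical quotas: two under-30s and one over-30 in a sample of three
stream = [("Ana", "under30"), ("Ben", "over30"), ("Cam", "under30"),
          ("Dee", "over30"), ("Eli", "under30")]
sample = quota_sample(stream, {"under30": 2, "over30": 1})
```

Note that selection within each subgroup is still non-random: whoever turns up first fills the quota.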

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .

When your population is large in size, geographically dispersed, or difficult to contact, it’s necessary to use a sampling method .

This allows you to gather information from a smaller part of the population, i.e. the sample, and make accurate statements by using statistical analysis. A few sampling methods include simple random sampling , convenience sampling , and snowball sampling .

The two main types of social desirability bias are:

  • Self-deceptive enhancement (self-deception): The tendency to see oneself in a favorable light without realizing it.
  • Impression management (other-deception): The tendency to inflate one’s abilities or achievements in order to make a good impression on other people.

Response bias refers to conditions or factors that take place during the process of responding to surveys, affecting the responses. One type of response bias is social desirability bias .

Demand characteristics are aspects of experiments that may give away the research objective to participants. Social desirability bias occurs when participants automatically try to respond in ways that make them seem likeable in a study, even if it means misrepresenting how they truly feel.

Participants may use demand characteristics to infer social norms or experimenter expectancies and act in socially desirable ways, so you should try to control for demand characteristics wherever possible.

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information – for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Peer review is a process of evaluating submissions to an academic journal. Utilising rigorous criteria, a panel of reviewers in the same subject area decide whether to accept each submission for publication.

For this reason, academic journals are often considered among the most credible sources you can use in a research project – provided that the journal itself is trustworthy and well regarded.

In general, the peer review process follows the following steps:

  • First, the author submits the manuscript to the editor.
  • The editor then decides either to reject the manuscript and send it back to the author, or to send it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits, and resubmit it to the editor for publication.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field.

It acts as a first defence, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to the stringent process they go through before publication.

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure.

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analysing the data.

Blinding is important to reduce bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behaviour in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or when the data collection process is challenging in some way.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
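The lottery method described above can be sketched in Python. The participant IDs and the seed below are hypothetical; in a real study you would not fix the seed in advance:

```python
import random

def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle participants, then deal them into groups round-robin,
    so each participant has the same chance of landing in any group."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy; leave the original order untouched
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for i, person in enumerate(shuffled):
        groups[i % n_groups].append(person)
    return groups

# Hypothetical sample of six participants split into two equal groups
control, treatment = randomly_assign(["P1", "P2", "P3", "P4", "P5", "P6"], seed=42)
```

Because the shuffle, not the researcher, decides group membership, the allocation cannot be predicted or biased.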

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalisability of your results, while random assignment improves the internal validity of your study.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardisation and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.
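A tiny, self-contained sketch of those post-collection steps follows; the records, the duplicate rule, and the grams-to-kilograms standardisation threshold are all hypothetical:

```python
# Hypothetical raw records with a duplicate, a missing value,
# and a weight likely entered in grams instead of kilograms
raw = [
    {"id": 1, "weight_kg": 70.5},
    {"id": 1, "weight_kg": 70.5},   # duplicate entry
    {"id": 2, "weight_kg": None},   # missing value
    {"id": 3, "weight_kg": 7500},   # probably grams, not kilograms
]

clean, seen = [], set()
for row in raw:
    if row["id"] in seen or row["weight_kg"] is None:
        continue                    # drop duplicates and missing values
    seen.add(row["id"])
    w = row["weight_kg"]
    if w > 1000:                    # standardise units: grams -> kilograms
        w = w / 1000
    clean.append({"id": row["id"], "weight_kg": w})
```

Real projects usually log or impute missing values rather than silently dropping them, but the flow, validate, standardise, deduplicate, is the same.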

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyse, detect, modify, or remove ‘dirty’ data to make your dataset ‘clean’. Data cleaning is also called data cleansing or data scrubbing.

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimise or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These erroneous conclusions can have serious practical consequences, because they lead to misplaced investments or missed opportunities.

Observer bias occurs when a researcher’s expectations, opinions, or prejudices influence what they perceive or record in a study. It usually affects studies when observers are aware of the research aims or hypotheses. This type of research bias is also called detection bias or ascertainment bias .

The observer-expectancy effect occurs when researchers influence the results of their own study through interactions with participants.

Researchers’ own beliefs and expectations about the study results may unintentionally influence participants through demand characteristics .

You can use several tactics to minimise observer bias .

  • Use masking (blinding) to hide the purpose of your study from all observers.
  • Triangulate your data with different data collection methods or sources.
  • Use multiple observers and ensure inter-rater reliability.
  • Train your observers to make sure data is consistently recorded between them.
  • Standardise your observation procedures to make sure they are structured and clear.

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviours of your research subjects in real-world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as ‘people watching’ with a purpose.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

You can organise the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may introduce bias. Randomisation can minimise bias from order effects.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or by post. All questions are standardised so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment
  • Random assignment of participants to ensure the groups are equivalent

Depending on your study topic, there are various other methods of controlling variables .

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

A true experiment (aka a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analysing data from people using questionnaires.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviours. It is made up of four or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey, you present participants with Likert-type questions or statements and a continuum of response options, usually five or seven, to capture their degree of agreement.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.
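Combining item responses into an overall scale score can be sketched as below. The four items, the five-point scale, and the reverse-coded item are hypothetical; flipping negatively worded items before summing is a common scoring convention:

```python
def score_likert(responses, reverse_items=frozenset(), n_points=5):
    """Sum item responses into an overall scale score,
    flipping any reverse-worded items first."""
    return sum(
        (n_points + 1 - r) if i in reverse_items else r
        for i, r in enumerate(responses)
    )

# Hypothetical 4-item scale (1-5); the item at index 2 is negatively worded
score = score_likert([4, 5, 2, 4], reverse_items={2})
```

A response of 2 on the reverse-coded item counts as 4 toward the total, so higher scores consistently mean more of the measured attitude.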

The type of data determines what statistical tests you should use to analyse your data.

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess. It should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data are available for analysis; other times your research question may only require a cross-sectional study to answer it.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyse behaviour over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal studies are better for establishing the correct sequence of events, identifying changes over time, and providing insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.
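As a sketch of how Pearson's r follows from its definition (covariance divided by the product of the standard deviations), here is a minimal implementation; the hours-vs-score data are hypothetical:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 65, 71, 80]
r = pearson_r(hours, scores)   # close to +1: strong positive linear relationship
```

In practice you would use a library routine (e.g., Python's `statistics.correlation`), but the arithmetic is the same.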

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups . Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with ‘yes’ or ‘no’ (questions that start with ‘why’ or ‘how’ are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

Social desirability bias is the tendency for interview participants to give responses that will be viewed favourably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias in research can also occur in observations if the participants know they’re being observed. They might alter their behaviour accordingly.

A focus group is a research method that brings together a small group of people to answer questions in a moderated setting. The group is chosen due to predefined demographic traits, and the questions are designed to shed light on a topic of interest. It is one of four types of interviews .

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order.
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualise your initial thoughts and hypotheses
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing carefully constructed, high-quality interview questions.

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. They are often quantitative in nature. Structured interviews are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, so you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyse your data quickly and efficiently
  • Your research question depends on strong parity between participants, with environmental conditions held constant

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods ).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

If something is a mediating variable :

  • It’s caused by the independent variable
  • It influences the dependent variable
  • When it’s taken into account, the statistical correlation between the independent and dependent variables becomes weaker than when it isn’t considered, because the mediator accounts for part (or all) of their relationship

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g., the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g., water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

You want to find out how blood sugar levels are affected by drinking diet cola and regular cola, so you conduct an experiment .

  • The type of cola – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of cola.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control, and randomisation.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomisation , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
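As a rough sketch of how randomisation works, the following snippet (all data hypothetical, Python standard library only) randomly splits a participant pool and compares the mean of a potential confounder, age, across the two groups:

```python
import random
import statistics

random.seed(42)

# Hypothetical participant pool: age is a potential confounder.
participants = [{"id": i, "age": random.randint(20, 70)} for i in range(1000)]

# Randomisation: shuffle the pool, then split it into treatment and control.
random.shuffle(participants)
treatment = participants[:500]
control = participants[500:]

mean_t = statistics.mean(p["age"] for p in treatment)
mean_c = statistics.mean(p["age"] for p in control)

# With a sufficiently large sample, the confounder ends up balanced:
# its mean is nearly identical in both groups.
print(abs(mean_t - mean_c))
```

With large samples, the two group means differ by far less than the spread of ages, so age can no longer serve as an alternative explanation for a treatment effect.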

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalisation .

In statistics, ordinal and nominal variables are both considered categorical variables .

Even though ordinal data can sometimes be numerical, not all mathematical operations can be performed on them.

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

‘Controlling for a variable’ means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
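The logic of statistical control can be sketched with simulated data: regressing the confounder out of both variables (the partialling-out idea that underlies regression adjustment) removes its contribution to their correlation. Everything below, including the data-generating process, is hypothetical:

```python
import random

random.seed(0)

def slope_intercept(x, y):
    # Ordinary least squares for a single predictor.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, my - b * mx

def residuals(x, y):
    # What is left of y after removing the linear effect of x.
    b, a = slope_intercept(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def corr(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

# Hypothetical data: z is the control variable and drives both x and y;
# x has no direct effect on y at all.
z = [random.gauss(0, 1) for _ in range(2000)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [2 * zi + random.gauss(0, 1) for zi in z]

raw = corr(x, y)                                      # inflated by z
controlled = corr(residuals(z, x), residuals(z, y))   # z partialled out

print(round(raw, 2), round(controlled, 2))
```

The raw correlation is substantial even though x has no effect on y; after controlling for z, the correlation collapses towards zero.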

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

There are 4 main types of extraneous variables :

  • Demand characteristics : Environmental cues that encourage participants to conform to researchers’ expectations
  • Experimenter effects : Unintentional actions by researchers that influence study outcomes
  • Situational variables : Environmental variables that alter participants’ behaviours
  • Participant variables : Any characteristic or aspect of a participant’s background that could affect study results

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

The term ‘explanatory variable’ is sometimes preferred over ‘independent variable’ because, in real-world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so ‘explanatory variables’ is a more appropriate term.

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called ‘independent’ because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation)

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it ‘depends’ on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalisation : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalisation: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity, alongside face validity, content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity: The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Attrition bias can skew your sample so that your final sample differs significantly from your original sample. Your sample is biased because some groups from your population are underrepresented.

With a biased final sample, you may not be able to generalise your findings to the original population that you sampled from, so your external validity is compromised.

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment interaction, and situation effect.

The two types of external validity are population validity (whether you can generalise to other groups of people) and ecological validity (whether you can generalise to other situations and settings).

The external validity of a study is the extent to which you can generalise your findings to different groups of people, situations, and measures.

Attrition bias is a threat to internal validity . In experiments, differential rates of attrition between treatment and control groups can skew results.

This bias can affect the relationship between your independent and dependent variables . It can make variables appear to be correlated when they are not, or vice versa.

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction, and attrition .

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .
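These two definitions can be illustrated with a quick simulation (hypothetical income data):

```python
import random
import statistics

random.seed(7)

# Hypothetical population of 100,000 incomes.
population = [random.gauss(50_000, 12_000) for _ in range(100_000)]
parameter = statistics.mean(population)   # population parameter

# Draw a simple random sample and compute the same measure on it.
sample = random.sample(population, 500)
statistic = statistics.mean(sample)       # sample statistic

# Sampling error: the gap between the statistic and the parameter.
sampling_error = statistic - parameter
print(round(sampling_error, 2))
```

The sampling error here is small relative to the spread of incomes, and it shrinks, on average, as the sample size grows.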

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

There are three key steps in systematic sampling :

  • Define and list your population, ensuring that it is not arranged in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k, by dividing your population size by your target sample size.
  • Choose every k th member of the population as your sample.
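The three steps above can be sketched in Python (hypothetical population list; starting at a random position within the first interval is a common refinement):

```python
import random

random.seed(3)

# Step 1: define and list the population (assumed not to be ordered
# in any cyclical or periodic way).
population = [f"person_{i}" for i in range(1000)]

# Step 2: decide on a sample size and calculate the interval k.
target_sample_size = 100
k = len(population) // target_sample_size   # k = 10

# Step 3: pick a random start within the first interval, then take
# every k-th member of the population.
start = random.randrange(k)
sample = population[start::k]

print(len(sample))  # 100
```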

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 × 5 = 15 subgroups.

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method .
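A minimal sketch of these two steps, using a hypothetical population and simple random sampling within each stratum:

```python
import random
from collections import defaultdict

random.seed(5)

# Hypothetical population with a stratification characteristic.
population = [
    {"id": i, "education": random.choice(["school", "bachelor", "master"])}
    for i in range(3000)
]

# Step 1: divide subjects into strata by the shared characteristic.
strata = defaultdict(list)
for person in population:
    strata[person["education"]].append(person)

# Step 2: randomly sample within each stratum (here, 10% of each),
# which guarantees every subgroup is represented.
sample = []
for members in strata.values():
    sample.extend(random.sample(members, len(members) // 10))

print(len(strata), len(sample))
```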

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

In multistage sampling , you can use probability or non-probability sampling methods.

For a probability sample, you have to use probability sampling at every stage. You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.
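Single-stage cluster sampling, the simplest of the three, can be sketched as follows (hypothetical schools and pupils):

```python
import random

random.seed(11)

# Hypothetical population grouped into 20 school "clusters"
# of 50 pupils each.
clusters = {
    f"school_{c}": [f"pupil_{c}_{i}" for i in range(50)]
    for c in range(20)
}

# Single-stage cluster sampling: randomly select 4 clusters and
# collect data from every unit inside them.
chosen = random.sample(list(clusters), 4)
sample = [pupil for school in chosen for pupil in clusters[school]]

print(len(chosen), len(sample))  # 4 200
```

For double-stage sampling, you would additionally call `random.sample` on each selected cluster instead of taking all of its pupils.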

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey is an example of simple random sampling . In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data are then collected from as large a percentage as possible of this random subset.
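A minimal sketch with a hypothetical sampling frame; `random.sample` draws without replacement, so every member has an equal chance of selection:

```python
import random

random.seed(2)

# Hypothetical sampling frame: a list of every member of the population.
population = list(range(10_000))

# Simple random sampling: draw a subset uniformly at random.
sample = random.sample(population, 350)

print(len(sample), len(set(sample)))  # 350 350  (no duplicates)
```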

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from county to city to neighbourhood) to create a sample that’s less expensive and time-consuming to collect data from.
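A rough sketch of three such stages (hypothetical counties, cities, and residents, with simple random sampling at each stage):

```python
import random

random.seed(13)

# Hypothetical hierarchy: counties -> cities -> residents.
counties = {
    f"county_{c}": {
        f"city_{c}_{t}": [f"resident_{c}_{t}_{r}" for r in range(100)]
        for t in range(5)
    }
    for c in range(10)
}

# Stage 1: randomly select counties.
stage1 = random.sample(list(counties), 3)
# Stage 2: within each chosen county, randomly select cities.
stage2 = {co: random.sample(list(counties[co]), 2) for co in stage1}
# Stage 3: within each chosen city, randomly select residents.
sample = []
for co, cities in stage2.items():
    for city in cities:
        sample.extend(random.sample(counties[co][city], 10))

print(len(sample))  # 3 counties x 2 cities x 10 residents = 60
```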

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling , and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
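Combining levels into conditions is a Cartesian product. A sketch using the cola example with a second, hypothetical independent variable added:

```python
from itertools import product

# Hypothetical 2 x 3 factorial design with two independent variables.
cola_type = ["diet", "regular"]
dose = ["small", "medium", "large"]

# Each level of one IV is combined with each level of the other.
conditions = list(product(cola_type, dose))

print(len(conditions))  # 2 x 3 = 6 conditions
```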

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity, as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference between this and a true experiment is that the groups are not randomly assigned.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word ‘between’ means that you’re comparing different conditions between groups, while the word ‘within’ means you’re comparing different conditions within the same group.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Triangulation can help:

  • Reduce bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labour-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analysing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Experimental designs are a set of procedures that you plan in order to examine the relationship between variables that interest you.

To design a successful experiment, first identify:

  • A testable hypothesis
  • One or more independent variables that you will manipulate
  • One or more dependent variables that you will measure

When designing the experiment, first decide:

  • How your variable(s) will be manipulated
  • How you will control for any potential confounding or lurking variables
  • How many subjects you will include
  • How you will assign treatments to your subjects

Exploratory research explores the main aspects of a new or barely researched question.

Explanatory research explains the causes and effects of an already widely researched question.

The key difference between observational studies and experiments is that, done correctly, an observational study will never influence the responses or behaviours of participants. Experimental designs will have a treatment condition applied to at least a portion of participants.

An observational study could be a good fit for your research if your research question is based on things you observe. If you have ethical, logistical, or practical concerns that make an experimental design challenging, consider an observational study. Remember that in an observational study, it is critical that there be no interference or manipulation of the research subjects. Since it’s not an experiment, there are no control or treatment groups either.

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analysed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analysed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualise your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analysed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalise the variables that you want to measure.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
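One simple way to calculate such a chance probability is a permutation test, sketched below with simulated blood sugar data (all numbers hypothetical):

```python
import random
import statistics

random.seed(17)

# Hypothetical experiment: blood sugar after diet vs regular cola.
diet = [random.gauss(95, 10) for _ in range(50)]
regular = [random.gauss(110, 10) for _ in range(50)]

observed = statistics.mean(regular) - statistics.mean(diet)

# Permutation test: how often does a difference at least this large
# arise by chance when the group labels are shuffled at random?
pooled = diet + regular
count = 0
n_perm = 2000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[50:]) - statistics.mean(pooled[:50])
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / n_perm
print(round(observed, 1), p_value)  # p is effectively zero here
```

A tiny p-value means the observed difference is very unlikely to have arisen by chance alone, so the null hypothesis of no effect can be rejected.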

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organisation to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organise your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyse data (e.g. experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Ask our team

Want to contact us directly? No problem. We are always here for you.

Support team - Nina

Our support team is here to help you daily via chat, WhatsApp, email, or phone between 9:00 a.m. to 11:00 p.m. CET.

Our APA experts default to APA 7 for editing and formatting. For the Citation Editing Service you are able to choose between APA 6 and 7.

Yes, if your document is longer than 20,000 words, you will get a sample of approximately 2,000 words. This sample edit gives you a first impression of the editor’s editing style and a chance to ask questions and give feedback.

How does the sample edit work?

You will receive the sample edit within 24 hours after placing your order. You then have 24 hours to let us know if you’re happy with the sample or if there’s something you would like the editor to do differently.

Read more about how the sample edit works

Yes, you can upload your document in sections.

We try our best to ensure that the same editor checks all the different sections of your document. When you upload a new file, our system recognizes you as a returning customer, and we immediately contact the editor who helped you before.

However, we cannot guarantee that the same editor will be available. Your chances are higher if

  • You send us your text as soon as possible and
  • You can be flexible about the deadline.

Please note that the shorter your deadline is, the lower the chance that your previous editor is available.

If your previous editor isn’t available, then we will inform you immediately and look for another qualified editor. Fear not! Every Scribbr editor follows the  Scribbr Improvement Model  and will deliver high-quality work.

Yes, our editors also work during the weekends and holidays.

Because we have many editors available, we can check your document 24 hours per day and 7 days per week, all year round.

If you choose a 72 hour deadline and upload your document on a Thursday evening, you’ll have your thesis back by Sunday evening!

Yes! Our editors are all native speakers, and they have lots of experience editing texts written by ESL students. They will make sure your grammar is perfect and point out any sentences that are difficult to understand. They’ll also notice your most common mistakes, and give you personal feedback to improve your writing in English.

Every Scribbr order comes with our award-winning Proofreading & Editing service , which combines two important stages of the revision process.

For a more comprehensive edit, you can add a Structure Check or Clarity Check to your order. With these building blocks, you can customize the kind of feedback you receive.

You might be familiar with a different set of editing terms. To help you understand what you can expect at Scribbr, we created this table:

View an example

When you place an order, you can specify your field of study and we’ll match you with an editor who has familiarity with this area.

However, our editors are language specialists, not academic experts in your field. Your editor’s job is not to comment on the content of your dissertation, but to improve your language and help you express your ideas as clearly and fluently as possible.

This means that your editor will understand your text well enough to give feedback on its clarity, logic and structure, but not on the accuracy or originality of its content.

Good academic writing should be understandable to a non-expert reader, and we believe that academic editing is a discipline in itself. The research, ideas and arguments are all yours – we’re here to make sure they shine!

After your document has been edited, you will receive an email with a link to download the document.

The editor has made changes to your document using ‘Track Changes’ in Word. This means that you only have to accept or ignore the changes that are made in the text one by one.

It is also possible to accept all changes at once. However, we strongly advise you not to do so for the following reasons:

  • You can learn a lot by looking at the mistakes you made.
  • The editors don’t only change the text – they also place comments when sentences or sometimes even entire paragraphs are unclear. You should read through these comments and take into account your editor’s tips and suggestions.
  • With a final read-through, you can make sure you’re 100% happy with your text before you submit!

You choose the turnaround time when ordering. We can return your dissertation within 24 hours , 3 days or 1 week . These timescales include weekends and holidays. As soon as you’ve paid, the deadline is set, and we guarantee to meet it! We’ll notify you by text and email when your editor has completed the job.

Very large orders might not be possible to complete in 24 hours. On average, our editors can complete around 13,000 words in a day while maintaining our high quality standards. If your order is longer than this and urgent, contact us to discuss possibilities.

Always leave yourself enough time to check through the document and accept the changes before your submission deadline.

Scribbr is specialised in editing study related documents. We check:

  • Graduation projects
  • Dissertations
  • Admissions essays
  • College essays
  • Application essays
  • Personal statements
  • Process reports
  • Reflections
  • Internship reports
  • Academic papers
  • Research proposals
  • Prospectuses

Calculate the costs

The fastest turnaround time is 24 hours.

You can upload your document at any time and choose between four deadlines:

At Scribbr, we promise to make every customer 100% happy with the service we offer. Our philosophy: Your complaint is always justified – no denial, no doubts.

Our customer support team is here to find the solution that helps you the most, whether that’s a free new edit or a refund for the service.

Yes, in the order process you can indicate your preference for American, British, or Australian English .

If you don’t choose one, your editor will follow the style of English you currently use. If your editor has any questions about this, we will contact you.

Protection of Random Assignment

  • First Online: 14 October 2021


  • Lynda H. Powell,
  • Peter G. Kaufmann &
  • Kenneth E. Freedland


The existence of an alternative explanation for the benefit of a treatment is a confounder: a nuisance “passenger” variable that rides along with treatment and undermines the ability to make causal inferences. This chapter focuses on why random assignment is so powerful and should be protected. It presents a history of attempts to answer the question of whether or not a treatment works, and the arrival at random assignment as the best way to make causal inferences about the benefits of a treatment. It defines confounding as an error of interpretation and explains the essential role of protecting the random assignment in avoiding it. It then illustrates ways to protect random assignment in the design, conduct, and analyses of a trial, with particular attention to the central role of identifying a patient-centered target population, recruiting it, retaining it, and ensuring that all randomized participants are included in the evaluation of trial results.
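The mechanics of random assignment itself are simple; what the chapter stresses is protecting it. As a minimal sketch (not the authors' protocol; the participant labels are hypothetical), assignment can be done by shuffling the roster and splitting it evenly:

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the roster and split it evenly into treatment and control.

    Because the shuffle is random, neither the researcher nor the
    participants can influence who ends up in which arm.
    """
    rng = random.Random(seed)         # seed only for reproducible demos
    shuffled = list(participants)     # copy so the input list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

treatment, control = randomly_assign(["p1", "p2", "p3", "p4", "p5", "p6"], seed=42)
```

In a real trial, the allocation sequence would also be concealed from the people enrolling participants; that concealment is part of what "protecting" the random assignment means.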

“Daniel and his three companions were young Israelites who were taken to serve in the palace of the king of Babylon because they were of noble royal family, without physical defect, handsome, versed in wisdom, and competent. Daniel determined he would not defile himself with the King’s food or wine. He asked the overseer: ‘Please test us for 10 days and let us be given some vegetables to eat and water to drink. Then let our appearance be compared to the appearance of youths who are eating the King’s choice food.’ At the end of 10 days, their appearance seemed better and they were fatter than any of the youths who had been eating the King’s food. So the overseer let them continue to eat vegetables and drink water instead of what the king provided.” Bible, Old Testament, Book of Daniel 1:16



Author information

Authors and Affiliations

Department of Preventive Medicine, Rush University Medical Center, Chicago, IL, USA

Lynda H. Powell

College of Nursing, Villanova University, Villanova, PA, USA

Peter G. Kaufmann

Department of Psychiatry, Washington University in St. Louis, St. Louis, MO, USA

Kenneth E. Freedland


Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Powell, L.H., Kaufmann, P.G., Freedland, K.E. (2021). Protection of Random Assignment. In: Behavioral Clinical Trials for Chronic Diseases. Springer, Cham. https://doi.org/10.1007/978-3-030-39330-4_8


Print ISBN: 978-3-030-39328-1

Online ISBN: 978-3-030-39330-4


What Is Simple Random Sampling?


Simple Random Sampling: Definition, Advantages, and Disadvantages


Simple random sampling is a technique in which a researcher selects a random subset of people from a larger group or population. In simple random sampling, each member of the group has an equal chance of getting selected. The method is commonly used in statistics to obtain a sample that is representative of the larger population.

Statistics is a branch of applied mathematics that helps us learn about large datasets by studying smaller events or objects. Put simply, you can make inferences about a large population by examining a smaller sample. Statistical analysis is commonly used to identify trends in many different areas, including business and finance. Individuals can use findings from statistical research to make better decisions about their money, businesses, and investments.

The simple random sampling method allows researchers to statistically measure a subset of individuals selected from a larger group or population to approximate a response from the entire group. This research method has both benefits and drawbacks. We highlight these pros and cons in this article, along with an overview of simple random sampling.

Key Takeaways

  • A simple random sample is one of the methods researchers use to choose a sample from a larger population.
  • The method works only if every subject in the population has an equal chance of being chosen.
  • Researchers choose simple random sampling to make generalizations about a population.
  • Major advantages include its simplicity and lack of bias.
  • Disadvantages include the difficulty of gaining access to a list of the full population, the time and cost involved, and the fact that bias can still occur under certain circumstances.

Simple Random Sample: An Overview

As noted above, simple random sampling involves choosing a smaller subset of a larger population. The selection is done at random, so that every member of the population has an equal chance of being chosen. Researchers tend to choose this method of sampling when they want to make generalizations about the larger population.

Simple random sampling can be conducted by using:

  • The lottery method. This method involves assigning a number to each member of the dataset, then choosing a prescribed set of numbers from those members at random.
  • Technology. Using software programs like Excel makes it easier to conduct random sampling. Researchers just have to make sure that all the formulas and inputs are correctly laid out.

For simple random sampling to work, researchers must know the total population size. They must also be able to remove all hints of bias as simple random sampling is meant to be a completely unbiased approach to garner responses from a large group.
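The lottery method described above can be sketched with Python's standard library; the population list here is hypothetical, and `random.sample` gives every member the same chance of being drawn:

```python
import random

# Hypothetical full list of the population (knowing it is a requirement
# of simple random sampling).
population = [f"member_{i}" for i in range(1, 101)]

# Draw a simple random sample of 10: each member is equally likely to be
# chosen, and no member can appear twice.
sample = random.sample(population, k=10)
```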

Keep in mind that there is room for error with random sampling; this is noted by adding a plus-or-minus margin of error to the results. The only way to eliminate sampling error entirely is to study the whole population, which is rarely practical.
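That plus-or-minus figure is usually reported as a margin of error. A common back-of-the-envelope version for a sampled proportion, assuming a 95% confidence level (z ≈ 1.96), is:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated
    from n randomly sampled respondents (95% confidence by default)."""
    return z * math.sqrt(p * (1 - p) / n)

# A 52% result from 1,000 respondents carries roughly a 3.1-point margin.
moe = margin_of_error(0.52, 1000)
```

Quadrupling the sample size only halves the margin, which is why driving the error to zero would require studying everyone.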

To ensure bias does not occur, researchers must acquire responses from an adequate number of respondents, which may not be possible due to time or budget constraints.

Advantages of a Simple Random Sample

Simple random sampling may be simple to perform (as the name suggests), but it isn't used all that often. That doesn't mean it shouldn't be: as long as it is done properly, this sampling method has certain distinct advantages.

Lack of Bias

The use of simple random sampling removes all hints of bias, or at least it should. Because individuals who make up the subset of the larger group are chosen at random, each individual in the large population set has the same probability of being selected. In most cases, this creates a balanced subset that carries the greatest potential for representing the larger group as a whole.

Here's a simple way to show how a researcher can remove bias when conducting simple random sampling. Let's say there are 100 bingo balls in a bowl, from which the researcher must choose 10. In order to remove any bias, the individual must close their eyes or look away when choosing the balls.

Simplicity

As its name implies, producing a simple random sample is much less complicated than other methods: individuals in the subset are selected at random, and there are no additional steps. No special skills are required, yet the method can produce a fairly reliable outcome. This is in contrast to other sampling methods such as stratified random sampling, which involves dividing the larger group into smaller subgroups, called strata, whose members share certain attributes.
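The contrast with stratified sampling can be made concrete. In this sketch the population and its strata are hypothetical; simple random sampling is a single draw, while stratified sampling adds a grouping step and samples each stratum in proportion to its size:

```python
import random
from collections import defaultdict

# Hypothetical population tagged by stratum: 80 undergraduates, 20 graduates.
population = [("undergrad", f"u{i}") for i in range(80)] + \
             [("grad", f"g{i}") for i in range(20)]

# Simple random sampling: one draw from the whole population, no extra steps.
simple = random.sample(population, k=10)

# Stratified random sampling: group by stratum, then draw proportionally
# (8 undergraduates and 2 graduates for a sample of 10).
strata = defaultdict(list)
for stratum, member in population:
    strata[stratum].append((stratum, member))

stratified = []
for members in strata.values():
    k = round(10 * len(members) / len(population))
    stratified.extend(random.sample(members, k))
```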

Less Knowledge Required

We've already established that simple random sampling is a very simple sampling method to execute. But there's also another, similar benefit: It requires little to no special knowledge. This means that the individual conducting the research doesn't need to have any information or knowledge about the larger population in order to effectively do their job.

Be sure that the sample subset from the larger group is inclusive enough. A sample that doesn't adequately reflect the population as a whole will result in a skewed result.

Disadvantages of a Simple Random Sample

Although there are distinct advantages to using a simple random sample, it does come with inherent drawbacks. These disadvantages include the time needed to gather the full list of a specific population, the capital necessary to retrieve and contact that list, and the bias that could occur when the sample set is not large enough to adequately represent the full population. We go into more detail below.

Difficulty Accessing Lists of the Full Population

An accurate statistical measure of a large population can only be obtained in simple random sampling when a full list of the entire population to be studied is available. Think of a list of students at a university or a group of employees at a specific company.

The problem lies in the accessibility of these lists: getting access to the whole list can present challenges. Some universities or colleges may not want to provide a complete list of students or faculty for research. Similarly, specific companies may not be willing or able to hand over information about employee groups due to privacy policies.

Time Consuming

When a full list of a larger population is not available, individuals attempting to conduct simple random sampling must gather information from other sources. If publicly available, smaller subset lists can be used to recreate a full list of a larger population, but this strategy takes time to complete.

Organizations that keep data on students, employees, and individual consumers often impose lengthy retrieval processes that can stall a researcher's ability to obtain the most accurate information on the entire population set.

Costs

In addition to the time it takes to gather information from various sources, the process may cost a company or individual a substantial amount of capital. Retrieving a full list of a population or smaller subset lists from a third-party data provider may require payment each time data is provided.

If the sample is not large enough to represent the views of the entire population during the first round of simple random sampling, purchasing additional lists or databases to avoid a sampling error can be prohibitive.

Sample Selection Bias

Although simple random sampling is intended to be an unbiased approach to surveying, sample selection bias can occur. When a sample set of the larger population is not inclusive enough, representation of the full population is skewed and requires additional sampling techniques.

Data Quality Is Reliant on Researcher Quality

The success of any sampling method relies on the researcher's willingness to thoroughly do their job. Someone who isn't willing to follow the rules, or who deviates from the task at hand, won't produce a reliable result. For instance, there may be issues if a researcher doesn't ask the appropriate questions or asks the wrong ones. This could create implicit bias, resulting in a skewed study.

What Is Simple Random Sampling?

The term simple random sampling refers to a smaller section of a larger population. There is an equal chance that each member of this section will be chosen. For this reason, simple random sampling is meant to be unbiased in its representation of the larger group. There is normally room for error with this method, indicated by a plus or minus margin. This is known as a sampling error.

How Is Simple Random Sampling Conducted?

Simple random sampling involves the study of a larger population by taking a smaller subset. This subgroup is chosen at random and studied to get the desired result. In order for this sampling method to work, the researcher must know the size of the larger population. The selection of the subset must be unbiased.

What Are the 4 Types of Random Sampling?

There are four types of random sampling. Simple random sampling involves an unbiased study of a smaller subset of a larger population. Stratified random sampling uses smaller groups derived from a larger population that are based on shared characteristics and attributes. Systematic sampling selects specific members of a larger dataset, starting from a random starting point and proceeding at a fixed, periodic interval. The final type of random sampling is cluster sampling, which takes members of a dataset and places them into clusters based on shared characteristics. Researchers then randomly select clusters to study.
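The four sampling types can be contrasted with a short sketch. The toy population, the department labels, and the sample sizes below are all made up for illustration; they are not from the original text.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Toy population: 20 people, each tagged with a department ("A" or "B").
population = [{"id": i, "dept": "A" if i < 12 else "B"} for i in range(20)]

# 1. Simple random sampling: every member has the same chance.
simple = random.sample(population, k=6)

# 2. Stratified random sampling: split into strata by shared attribute,
#    then sample randomly within each stratum.
strata = {"A": [p for p in population if p["dept"] == "A"],
          "B": [p for p in population if p["dept"] == "B"]}
stratified = [p for members in strata.values()
              for p in random.sample(members, k=3)]

# 3. Systematic sampling: random starting point, then every k-th member.
interval = 5
start = random.randrange(interval)
systematic = population[start::interval]

# 4. Cluster sampling: divide into clusters, randomly pick whole clusters.
clusters = [population[i:i + 5] for i in range(0, 20, 5)]
chosen_clusters = random.sample(clusters, k=2)
cluster_sample = [p for cluster in chosen_clusters for p in cluster]
```

Note how only the first method treats every individual identically; the other three impose structure (strata, intervals, or clusters) before randomness is applied.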

When Is It Best to Use Simple Random Sampling?

It's always a good idea to use simple random sampling when you have smaller data sets to study. This allows you to produce better results that are more representative of the overall population. Keep in mind that this method requires each member of the larger population to be identified and selected individually, which can often be challenging and time consuming.

Studying large populations can be very difficult. Getting information from each individual member can be costly and time-consuming. That's why researchers turn to random sampling to help reach the conclusions they need to make key decisions, whether that means helping provide the services that residents need, making better business decisions, or executing changes in an investor's portfolio.

Simple random sampling is relatively easy to conduct as long as you remove any and all hints of bias. Doing so means you must have information about each member of the larger population at your disposal before you conduct your research. This can be relatively simple and require very little knowledge. But keep in mind that the process can be costly, and it may be hard to get access to information about all of the members of the population.



The Random Selection Experiment Method

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."




When researchers need to select a representative sample from a larger population, they often utilize a method known as random selection. In this selection process, each member of a group stands an equal chance of being chosen as a participant in the study.

Random Selection vs. Random Assignment

How does random selection differ from  random assignment ? Random selection refers to how the sample is drawn from the population as a whole, whereas random assignment refers to how the participants are then assigned to either the experimental or control groups.

It is possible to have both random selection and random assignment in an experiment.

Imagine that you use random selection to draw 500 people from a population to participate in your study. You then use random assignment to assign 250 of your participants to a control group (the group that does not receive the treatment or independent variable) and you assign 250 of the participants to the experimental group (the group that receives the treatment or independent variable).
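The 500-participant example can be sketched directly: draw the sample from the population (random selection), then shuffle and split it into two equal groups (random assignment). The population of 10,000 stand-in members below is an assumption for illustration.

```python
import random

# Random selection: draw 500 people from a (hypothetical) population
# of 10,000; each member has an equal chance of being selected.
population = list(range(10_000))
participants = random.sample(population, k=500)

# Random assignment: shuffle the selected participants, then split
# them into a control group and an experimental group of 250 each.
shuffled = participants[:]
random.shuffle(shuffled)
control = shuffled[:250]        # does not receive the treatment
experimental = shuffled[250:]   # receives the treatment
```

The two steps are independent: the first determines who is in the study at all, the second determines which condition each participant experiences.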

Why do researchers utilize random selection? The purpose is to increase the generalizability of the results.

By drawing a random sample from a larger population, the goal is that the sample will be representative of the larger group and less likely to be subject to bias.

Factors Involved

Imagine a researcher is selecting people to participate in a study. To pick participants, they may choose people using a technique that is the statistical equivalent of a coin toss.

They may begin by using random selection to pick geographic regions from which to draw participants. They may then use the same selection process to pick cities, neighborhoods, households, age ranges, and individual participants.

Another important thing to remember is that larger sample sizes tend to be more representative. Even random selection can lead to a biased or limited sample if the sample size is small.

When the sample size is small, an unusual participant can have an undue influence over the sample as a whole. Using a larger sample size tends to dilute the effects of unusual participants and prevent them from skewing the results.
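A quick simulation makes this concrete. The population, its mean of 50, and the sample sizes below are made-up numbers for illustration: repeatedly drawing small and large random samples from the same population shows how much more the small-sample means bounce around.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical population with mean 50 and standard deviation 10.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Estimate the population mean 500 times with samples of 5,
# and 500 times with samples of 500.
small_means = [statistics.mean(random.sample(population, 5))
               for _ in range(500)]
large_means = [statistics.mean(random.sample(population, 500))
               for _ in range(500)]

# The small-sample estimates vary far more from draw to draw:
# one unusual member can pull a 5-person mean well off target.
print(statistics.stdev(small_means), statistics.stdev(large_means))
```

The spread of the small-sample means is roughly ten times that of the large-sample means, which is exactly the "dilution" effect described above.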




IResearchNet

Quasi-Experimental Design

Quasi-Experimental Design Definition

A quasi-experimental design resembles a true experiment in that it tests the effect of an intervention or treatment on an outcome, but it lacks random assignment of participants to conditions.

Example of a Quasi-Experimental Design

Quasi-experimental designs are most often used in natural (nonlaboratory) settings over longer periods and usually include an intervention or treatment. Consider, for example, a study of the effect of a motivation intervention on class attendance and enjoyment in students. When an intact group such as a classroom is singled out for an intervention, randomly assigning each person to experimental conditions is not possible. Rather, the researcher gives one classroom the motivational intervention (intervention group) and the other classroom receives no intervention (comparison group). The researcher uses two classrooms that are as similar as possible in background (e.g., same age, racial composition) and that have comparable experiences within the class (e.g., type of class, meeting time) except for the intervention. In addition, the researcher gives participants in both conditions (comparison and motivation intervention) pretest questionnaires to assess attendance, enjoyment, and other related variables before the intervention. After the intervention is administered, the researcher measures attendance and enjoyment of the class. The researcher can then determine if students in the motivation intervention group enjoyed and attended class more than the students in the comparison group did.

Interpreting Results from a Quasi-Experimental Design

How should results from this hypothetical study be interpreted? When interpreting the results of quasi-experimental designs that lack random assignment of participants to conditions, investigators must be cautious in drawing conclusions about causality because of potential confounds in the setting. In the previous hypothetical example, the course material in the intervention group might have become more engaging, or the comparison group might have started covering a more mundane topic; either change could have led to the differences in class enjoyment and attendance. However, if the intervention group and comparison group had similar pretest scores and comparable classroom experiences, then changes on posttest scores suggest that the motivation intervention influenced class attendance and enjoyment.

The Pros and Cons of Using Quasi-Experimental Designs

Quasi-experiments are most useful when conducting research in settings where random assignment is not possible because of ethical considerations or constraining situational factors. In consequence, such designs are more prevalent in studies conducted in natural settings, thereby increasing the real-world applicability of the findings. Such studies are not, however, true experiments, and thus the lack of control over assignment of participants to conditions renders causal conclusions suspect.


What random assignment does and does not do


Random assignment of patients to comparison groups stochastically tends, with increasing sample size or number of experiment replications, to minimize the confounding of treatment outcome differences by the effects of differences among these groups in unknown/unmeasured patient characteristics. To what degree such confounding is actually avoided we cannot know unless we have validly measured these patient variables, but completely avoiding it is quite unlikely. Even if this confounding were completely avoided, confounding by unmeasured Patient Variable x Treatment Variable interactions remains a possibility. And the causal power of the confounding variables is no less important for internal validity than the degree of confounding.

Copyright 2003 Wiley Periodicals, Inc. J Clin Psychol.


Vittana.org

14 Advantages and Disadvantages of a Randomized Controlled Trial

A randomized controlled trial is a study in which people are allocated by chance alone to receive one of several unique clinical interventions. One of these is the standard of comparison, traditionally referred to as the control group. The control can involve a standard practice, a placebo, or no intervention at all. The results from this group are then compared to the other treatments to determine whether positive results occur.

The individuals who take part in a randomized controlled trial are called subjects or participants. Researchers use this method to compare outcomes after every participant receives a specific intervention, which makes it a quantitative approach. It uses controlled experiments in which investigators can study two or more interventions in a series of people who receive them in a randomized order.

When we look at the randomized controlled trial advantages and disadvantages, it is clear to see that this tool is a simple approach for researchers to use. It is also popular because it can provide a powerful database of results.

List of the Advantages of Randomized Controlled Trials

1. Randomization prevents the deliberate manipulation of results. A randomized controlled trial works to prevent skewing or the deliberate manipulation of results by researchers or participants. Because each subject gets assigned to a specific group randomly, the removal of choice works to get rid of selection bias. This advantage prevents scientists from subconsciously or deliberately assigning patients to the active-treatment group because they seem likely to benefit from it, which would make the treatment seem more beneficial than it really is.

The opposite could also occur. Researchers who want to demonstrate the potential danger of a specific treatment could assign participants who have a higher risk of complications to the active group.

2. It provides immediate comparative results that researchers can use. Randomized controlled trial processes use one treatment option that gets directly compared to another one as part of the project. This method makes it much easier to establish which one has a superior outcome. That means the study design can make causal inferences so that we have access to the strongest empirical evidence of efficacy or an inability to perform as intended.

Although the placebo effect can sometimes minimize the outcomes of this advantage, researchers can know immediately if something seems to work or know for sure that a new idea needs to get pursued.

3. This method minimizes confounding factors. The act of randomization in a controlled trial setting minimizes the confounding that would otherwise arise from an unequal distribution of prognostic factors. It makes the groups comparable with respect to both known and unknown factors, including those investigators only discover during the research process.

Even a blocked randomization effort can make groups comparable when they fall within known confounding factors.

4. It offers a higher level of statistical reliability. When a study is properly randomized, the statistical tests of significance are readily interpretable for investigators. That means the results generated from these efforts have a greater level of reliability when compared to other research methods. An adequately powered randomized controlled trial also reduces the risk of type one and type two errors.

That means it can avoid situations where the null hypothesis is incorrectly rejected (a type one error) or incorrectly retained (a type two error).

5. Investigators can use multiple methods of randomization. Randomized controlled trials can get designed through the use of several different structures. Depending on the results that researchers hope to achieve, this method can use a stratified approach, random clusters, or crossover clusters. Some situations might call for complete randomization of the participants into investigatory groups. There’s also the step-wedge and block methods that work well in specific areas of study.

6. It allows researchers to have control of the exposure event. Even though the population groups in a randomized controlled trial get masked, researchers still get the advantage of having control over the exposure. That means investigators can work with the amount of a treatment option, the timing of its application, and the duration of the study. That means several different groups can get studied at once to determine if changes in frequency can provide more results than a standard dose alone or nothing at all.

That’s why a properly designed study is well-regarded as being a true measure of efficacy. It provides researchers with high levels of internal validity.

List of the Disadvantages of Randomized Controlled Trials

1. The logistics of a randomized controlled trial can be demanding. Researchers and participants may need to endure a long trial run to ensure that there is enough data for comparison. This disadvantage can make an idea lose relevance, because practice can move away from the idea being studied by the time the trial reaches the point where publication is possible. Because validity requires multiple sites and groups for scientists to manage, the power calculation might demand a sample size beyond the resources of the investigators.

If a research project cannot manage the logistics of a randomized controlled trial, then the data these efforts produce will be questionable at best.

2. Some randomization efforts may be predictable. Some randomized controlled trials use block methods to select participants for a research project. When this option is the primary selection option for investigators at the start of an effort, then the allocation of subjects can become predictable. That means this disadvantage can result in a higher level of selection bias in the data when the study groups are eventually unmasked on the way toward publication.

If this disadvantage occurs within the context of a randomized controlled trial, then it provides a limited amount of external validity. This issue can continue to grow when artificial environments get introduced in the research setting.

3. There can be some applicability issues with randomized controlled trials. Trials that test for efficacy may not be widely applicable when using the randomized control method. When an effort to test for effectiveness is large, then it becomes more expensive, and that issue can also impact the results achieved. The results from this work might not always mimic real-life treatment situations, such as inclusion and exclusion criteria, so what happens in the context of the investigation may not provide relevant data. Highly controlled settings are not always the best solution to use where there’s a need to find specific results.

4. This research method comes with some ethical limitations. Randomization requires a certain level of clinical equipoise: investigators cannot ethically assign patients to treatment options at random unless each option has an equal level of support within the clinical community. That's why the issue of informed consent is a significant disadvantage when using this approach for a clinical trial. Researchers must use generalities instead of specifics when talking to patients about what their participation will look like in the coming days.

When subjects hear generalities, they are more likely to assume that they are receiving the designated treatment. That's why the placebo effect can be so disruptive to randomized controlled trials. If someone feels like their treatment is working as intended, that belief alone can create an internal psychological effect that produces a similar result.

5. Some research cannot be ethically performed using this model. The traditional argument against the use of randomized controlled trials involves the effects of a parachute on the survival rate of skydivers. You cannot place some individuals in a position where their life is immediately threatened by the structure of the research that investigators want to pursue. That means there are certain limitations with this approach that can limit the amount of usable data we can collect.

Because of this disadvantage, there is limited scope to the research that benefits from a randomized controlled trial. Most of the efforts involve preventive or therapeutic treatments.

6. The data does not show you the critical information needed for results. When researchers use a randomized controlled trial, the outcome shows that the treatment option being studied either has potential or it does not. The results will not give professionals the critical information that is needed to benefit individual human lives. You will not know which patients are going to experience the most advantages when they receive the intended product.

Randomized controlled trials must be very large to achieve statistical significance because of this disadvantage. It is the only way to account for heterogeneity within the participant groups. That means the final data that investigators receive tends to have centralized tendencies, so it is not going to be representative of the final population group at the individualized level.

7. Participants who participate in this research aren’t always reflective of their demographic. Randomized controlled trials don’t take into account the history of each participant. The subjects who tend to seek out treatments in this manner are often those who have tried everything else and are desperate for a positive outcome. That makes it challenging for investigators to generalize the results so that they can apply to everyone because the population tends to be people who always enroll in the studies instead of being people who never do.

8. It is an expensive proposition. Even if the randomized controlled trial only uses one baseline group and a single treatment demographic, the length of the research requires a significant investment. It is one of the most expensive methods of collecting data in terms of time and money. You’re looking for relative risk within a population group and odds ratios that can determine specific outcomes. If the investigators inadvertently study two population groups, then the information may not be useful.

An example of this issue is a 2015 study on the use of a hamstring exercise with male amateur soccer players. Over the course of 12 months, the randomized controlled trial from van der Horst et al. showed a statistically significant reduction in the incidence of hamstring injuries for players proactively using the exercise. The average age was 24.5, but the standard deviation was almost four years. Someone at 20 can react differently than another individual at the age of 28.

Randomized controlled trials are considered the gold standard for current research methods. The findings from this work have a higher level of statistical reliability because of the comparative processes that investigators follow. Although there can be logistical issues that impact the results in adverse ways, the outcomes are generally accepted as a useful finding.

There are ethical limitations and applicability issues that must come under consideration for anyone thinking about the design of a trial such as this. Informed consent is often impossible because the placebo used to create comparative results can mimic the desired outcome. That’s why subjects often get told that they might receive the treatment or the placebo. If someone is in the control group without treatment, even the fact that they’re in a research study can produce positive results.

That’s why we must approach the advantages and disadvantages of a randomized controlled trial with caution. It may be the best option for publication, but this pursuit might not be the best investigative choice in every situation.


6.2 Experimental Design

Learning Objectives

  • Explain the difference between between-subjects and within-subjects experiments, list some of the pros and cons of each approach, and decide which approach to use to answer a particular research question.
  • Define random assignment, distinguish it from random sampling, explain its purpose in experimental research, and use some simple strategies to implement it.
  • Define what a control condition is, explain its purpose in research on treatment effectiveness, and describe some alternative types of control conditions.
  • Define several types of carryover effect, give examples of each, and explain how counterbalancing helps to deal with them.

In this section, we look at some different ways to design an experiment. The primary distinction we will make is between approaches in which each participant experiences one level of the independent variable and approaches in which each participant experiences all levels of the independent variable. The former are called between-subjects experiments and the latter are called within-subjects experiments.

Between-Subjects Experiments

In a between-subjects experiment, each participant is tested in only one condition. For example, a researcher with a sample of 100 college students might assign half of them to write about a traumatic event and the other half to write about a neutral event. Or a researcher with a sample of 60 people with severe agoraphobia (fear of open spaces) might assign 20 of them to receive each of three different treatments for that disorder. It is essential in a between-subjects experiment that the researcher assign participants to conditions so that the different groups are, on average, highly similar to each other. Those in a trauma condition and a neutral condition, for example, should include a similar proportion of men and women, and they should have similar average intelligence quotients (IQs), similar average levels of motivation, similar average numbers of health problems, and so on. This is a matter of controlling these extraneous participant variables across conditions so that they do not become confounding variables.

Random Assignment

The primary way that researchers accomplish this kind of control of extraneous variables across conditions is called random assignment: the assignment of participants to different conditions according to a random procedure, such as flipping a coin, rolling a die, or using a random number generator. Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population, and it is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other fields too.

In its strictest sense, random assignment should meet two criteria. One is that each participant has an equal chance of being assigned to each condition (e.g., a 50% chance of being assigned to each of two conditions). The second is that each participant is assigned to a condition independently of other participants. Thus one way to assign participants to two conditions would be to flip a coin for each one. If the coin lands heads, the participant is assigned to Condition A, and if it lands tails, the participant is assigned to Condition B. For three conditions, one could use a computer to generate a random integer from 1 to 3 for each participant. If the integer is 1, the participant is assigned to Condition A; if it is 2, the participant is assigned to Condition B; and if it is 3, the participant is assigned to Condition C. In practice, a full sequence of conditions—one for each participant expected to be in the experiment—is usually created ahead of time, and each new participant is assigned to the next condition in the sequence as he or she is tested. When the procedure is computerized, the computer program often handles the random assignment.
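The coin-flip and random-integer procedures described above amount to choosing a condition independently for each participant. A minimal sketch, with a hypothetical helper name `assign_condition`:

```python
import random

def assign_condition(conditions):
    """Assign one participant to a condition: each condition has an
    equal chance, and each assignment is independent of the others."""
    return random.choice(conditions)

# Two conditions: statistically equivalent to flipping a coin
# for each participant (heads -> A, tails -> B).
two_way = [assign_condition(["A", "B"]) for _ in range(20)]

# Three conditions: equivalent to generating a random integer
# from 1 to 3 for each participant.
three_way = [assign_condition(["A", "B", "C"]) for _ in range(20)]
```

Because each call is independent, this strict procedure satisfies both criteria, but it does not guarantee equal group sizes, which is the problem the next paragraph addresses.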

One problem with coin flipping and other strict procedures for random assignment is that they are likely to result in unequal sample sizes in the different conditions. Unequal sample sizes are generally not a serious problem, and you should never throw away data you have already collected to achieve equal sample sizes. However, for a fixed number of participants, it is statistically most efficient to divide them into equal-sized groups. It is standard practice, therefore, to use a kind of modified random assignment that keeps the number of participants in each group as similar as possible. One approach is block randomization, a method of randomly assigning participants that guarantees that the condition sample sizes are equal or almost equal: a random procedure assigns the first k participants to the k conditions, then the next k participants to the k conditions, and so on until all the participants have been assigned. In block randomization, all the conditions occur once in the sequence before any of them is repeated. Then they all occur again before any of them is repeated again. Within each of these “blocks,” the conditions occur in a random order. Again, the sequence of conditions is usually generated before any participants are tested, and each new participant is assigned to the next condition in the sequence. Table 6.2 "Block Randomization Sequence for Assigning Nine Participants to Three Conditions" shows such a sequence for assigning nine participants to three conditions. The Research Randomizer website ( http://www.randomizer.org ) will generate block randomization sequences for any number of participants and conditions. Again, when the procedure is computerized, the computer program often handles the block randomization.

Table 6.2 Block Randomization Sequence for Assigning Nine Participants to Three Conditions
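A sequence like the one in Table 6.2 can be generated with a short sketch. The helper name `block_randomize` is hypothetical; the logic follows the description above: shuffle the conditions within each block, then concatenate the blocks.

```python
import random

def block_randomize(conditions, n_participants):
    """Generate an assignment sequence in which every condition occurs
    once, in random order, within each block before any condition is
    repeated. Condition counts stay equal or nearly equal."""
    sequence = []
    while len(sequence) < n_participants:
        block = list(conditions)
        random.shuffle(block)   # random order within this block
        sequence.extend(block)
    return sequence[:n_participants]

# Nine participants, three conditions: three blocks of A, B, C.
sequence = block_randomize(["A", "B", "C"], 9)
print(sequence)
```

Each new participant would simply be assigned the next entry in `sequence`, exactly as the text describes for a pre-generated sequence of conditions.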

Random assignment is not guaranteed to control all extraneous variables across conditions. It is always possible that just by chance, the participants in one condition might turn out to be substantially older, less tired, more motivated, or less depressed on average than the participants in another condition. However, there are some reasons that this is not a major concern. One is that random assignment works better than one might expect, especially for large samples. Another is that the inferential statistics that researchers use to decide whether a difference between groups reflects a difference in the population take the “fallibility” of random assignment into account. Yet another reason is that even if random assignment does result in a confounding variable and therefore produces misleading results, this is likely to be detected when the experiment is replicated. The upshot is that random assignment to conditions—although not infallible in terms of controlling extraneous variables—is always considered a strength of a research design.
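
The claim that random assignment "works better than one might expect, especially for large samples" can be checked with a small simulation (illustrative only; the age distribution here is an arbitrary choice of ours):

```python
import random
import statistics

def mean_age_gap(n_per_group, trials=200):
    """Average absolute difference in mean age between two randomly
    assigned groups, over many simulated experiments."""
    gaps = []
    for _ in range(trials):
        ages = [random.gauss(30, 10) for _ in range(2 * n_per_group)]
        random.shuffle(ages)  # random assignment to the two groups
        group_a, group_b = ages[:n_per_group], ages[n_per_group:]
        gaps.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(gaps)

# The expected imbalance shrinks as samples grow (roughly with 1/sqrt(n)).
gap_small_n = mean_age_gap(5)
gap_large_n = mean_age_gap(200)
```

With only 5 participants per group the groups often differ in mean age by several years; with 200 per group the typical gap is a fraction of a year.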

Treatment and Control Conditions

Between-subjects experiments are often used to determine whether a treatment works. In psychological research, a treatment is any intervention meant to change people’s behavior for the better. This includes psychotherapies and medical treatments for psychological disorders but also interventions designed to improve learning, promote conservation, reduce prejudice, and so on. To determine whether a treatment works, participants are randomly assigned to either a treatment condition, in which they receive the treatment, or a control condition, in which they do not receive the treatment. If participants in the treatment condition end up better off than participants in the control condition—for example, they are less depressed, learn faster, conserve more, express less prejudice—then the researcher can conclude that the treatment works. In research on the effectiveness of psychotherapies and medical treatments, this type of experiment is often called a randomized clinical trial.

There are different types of control conditions. In a no-treatment control condition, participants receive no treatment whatsoever—not even a placebo. One problem with this approach, however, is the existence of placebo effects. A placebo is a simulated treatment that lacks any active ingredient or element that should make it effective, and a placebo effect is a positive effect of such a treatment. Many folk remedies that seem to work—such as eating chicken soup for a cold or placing soap under the bedsheets to stop nighttime leg cramps—are probably nothing more than placebos. Although placebo effects are not well understood, they are probably driven primarily by people’s expectations that they will improve. Having the expectation to improve can result in reduced stress, anxiety, and depression, which can alter perceptions and even improve immune system functioning (Price, Finniss, & Benedetti, 2008). Price, D. D., Finniss, D. G., & Benedetti, F. (2008). A comprehensive review of the placebo effect: Recent advances and current thought. Annual Review of Psychology, 59, 565–590.

Placebo effects are interesting in their own right (see Note 6.28 "The Powerful Placebo" ), but they also pose a serious problem for researchers who want to determine whether a treatment works. Figure 6.2 "Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions" shows some hypothetical results in which participants in a treatment condition improved more on average than participants in a no-treatment control condition. If these conditions (the two leftmost bars in Figure 6.2 "Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions" ) were the only conditions in this experiment, however, one could not conclude that the treatment worked. It could be instead that participants in the treatment group improved more because they expected to improve, while those in the no-treatment control condition did not.

Figure 6.2 Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions

Fortunately, there are several solutions to this problem. One is to include a placebo control condition, in which participants receive a placebo that looks much like the treatment but lacks the active ingredient or element thought to be responsible for the treatment’s effectiveness. When participants in a treatment condition take a pill, for example, then those in a placebo control condition would take an identical-looking pill that lacks the active ingredient in the treatment (a “sugar pill”). In research on psychotherapy effectiveness, the placebo might involve going to a psychotherapist and talking in an unstructured way about one’s problems. The idea is that if participants in both the treatment and the placebo control groups expect to improve, then any improvement in the treatment group over and above that in the placebo control group must have been caused by the treatment and not by participants’ expectations. This is what is shown by a comparison of the two outer bars in Figure 6.2 "Hypothetical Results From a Study Including Treatment, No-Treatment, and Placebo Conditions" .

Of course, the principle of informed consent requires that participants be told that they will be assigned to either a treatment or a placebo control condition—even though they cannot be told which until the experiment ends. In many cases the participants who had been in the control condition are then offered an opportunity to have the real treatment. An alternative approach is to use a waitlist control condition, in which participants are told that they will receive the treatment but must wait until the participants in the treatment condition have already received it. This allows researchers to compare participants who have received the treatment with participants who are not currently receiving it but who still expect to improve (eventually). A final solution to the problem of placebo effects is to leave out the control condition completely and compare any new treatment with the best available alternative treatment. For example, a new treatment for simple phobia could be compared with standard exposure therapy. Because participants in both conditions receive a treatment, their expectations about improvement should be similar. This approach also makes sense because once there is an effective treatment, the interesting question about a new treatment is not simply “Does it work?” but “Does it work better than what is already available?”

The Powerful Placebo

Many people are not surprised that placebos can have a positive effect on disorders that seem fundamentally psychological, including depression, anxiety, and insomnia. However, placebos can also have a positive effect on disorders that most people think of as fundamentally physiological. These include asthma, ulcers, and warts (Shapiro & Shapiro, 1999). Shapiro, A. K., & Shapiro, E. (1999). The powerful placebo: From ancient priest to modern physician . Baltimore, MD: Johns Hopkins University Press. There is even evidence that placebo surgery—also called “sham surgery”—can be as effective as actual surgery.

Medical researcher J. Bruce Moseley and his colleagues conducted a study on the effectiveness of two arthroscopic surgery procedures for osteoarthritis of the knee (Moseley et al., 2002). Moseley, J. B., O’Malley, K., Petersen, N. J., Menke, T. J., Brody, B. A., Kuykendall, D. H., … Wray, N. P. (2002). A controlled trial of arthroscopic surgery for osteoarthritis of the knee. The New England Journal of Medicine, 347 , 81–88. The control participants in this study were prepped for surgery, received a tranquilizer, and even received three small incisions in their knees. But they did not receive the actual arthroscopic surgical procedure. The surprising result was that all participants improved in terms of both knee pain and function, and the sham surgery group improved just as much as the treatment groups. According to the researchers, “This study provides strong evidence that arthroscopic lavage with or without débridement [the surgical procedures used] is not better than and appears to be equivalent to a placebo procedure in improving knee pain and self-reported function” (p. 85).

Within-Subjects Experiments

In a within-subjects experiment, each participant is tested under all conditions. Consider an experiment on the effect of a defendant’s physical attractiveness on judgments of his guilt. Again, in a between-subjects experiment, one group of participants would be shown an attractive defendant and asked to judge his guilt, and another group of participants would be shown an unattractive defendant and asked to judge his guilt. In a within-subjects experiment, however, the same group of participants would judge the guilt of both an attractive and an unattractive defendant.

The primary advantage of this approach is that it provides maximum control of extraneous participant variables. Participants in all conditions have the same mean IQ, same socioeconomic status, same number of siblings, and so on—because they are the very same people. Within-subjects experiments also make it possible to use statistical procedures that remove the effect of these extraneous participant variables on the dependent variable and therefore make the data less “noisy” and the effect of the independent variable easier to detect. We will look more closely at this idea later in the book.

Carryover Effects and Counterbalancing

The primary disadvantage of within-subjects designs is that they can result in carryover effects. A carryover effect is an effect of being tested in one condition on participants’ behavior in later conditions. One type of carryover effect is a practice effect, where participants perform a task better in later conditions because they have had a chance to practice it. Another type is a fatigue effect, where participants perform a task worse in later conditions because they become tired or bored. Being tested in one condition can also change how participants perceive stimuli or interpret their task in later conditions. This is called a context effect—an unintended effect of the context in which a response is made. For example, an average-looking defendant might be judged more harshly when participants have just judged an attractive defendant than when they have just judged an unattractive defendant. Within-subjects experiments also make it easier for participants to guess the hypothesis. For example, a participant who is asked to judge the guilt of an attractive defendant and then is asked to judge the guilt of an unattractive defendant is likely to guess that the hypothesis is that defendant attractiveness affects judgments of guilt. This could lead the participant to judge the unattractive defendant more harshly because he thinks this is what he is expected to do. Or it could make participants judge the two defendants similarly in an effort to be “fair.”

Carryover effects can be interesting in their own right. (Does the attractiveness of one person depend on the attractiveness of other people that we have seen recently?) But when they are not the focus of the research, carryover effects can be problematic. Imagine, for example, that participants judge the guilt of an attractive defendant and then judge the guilt of an unattractive defendant. If they judge the unattractive defendant more harshly, this might be because of his unattractiveness. But it could be instead that they judge him more harshly because they are becoming bored or tired. In other words, the order of the conditions is a confounding variable. The attractive condition is always the first condition and the unattractive condition the second. Thus any difference between the conditions in terms of the dependent variable could be caused by the order of the conditions and not the independent variable itself.

There is a solution to the problem of order effects, however, that can be used in many situations. It is counterbalancing, which means testing different participants in different orders. For example, some participants would be tested in the attractive defendant condition followed by the unattractive defendant condition, and others would be tested in the unattractive condition followed by the attractive condition. With three conditions, there would be six different orders (ABC, ACB, BAC, BCA, CAB, and CBA), so some participants would be tested in each of the six orders. With counterbalancing, participants are assigned to orders randomly, using the techniques we have already discussed. Thus random assignment plays an important role in within-subjects designs just as in between-subjects designs. Here, instead of being randomly assigned to conditions, participants are randomly assigned to different orders of conditions. In fact, it can safely be said that if a study does not involve random assignment in one form or another, it is not an experiment.
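
Full counterbalancing with random assignment to orders, as described above, can be sketched like this (a hypothetical helper; for three conditions it enumerates the six orders ABC, ACB, BAC, BCA, CAB, and CBA):

```python
import itertools
import random

def assign_to_orders(conditions, n_participants):
    """Enumerate every possible order of the conditions and randomly
    assign each participant to one of those orders."""
    orders = list(itertools.permutations(conditions))
    return [random.choice(orders) for _ in range(n_participants)]

participant_orders = assign_to_orders(["A", "B", "C"], 12)
```

In practice one might also use block randomization over the orders so that each order is used equally often.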

There are two ways to think about what counterbalancing accomplishes. One is that it controls the order of conditions so that it is no longer a confounding variable. Instead of the attractive condition always being first and the unattractive condition always being second, the attractive condition comes first for some participants and second for others. Likewise, the unattractive condition comes first for some participants and second for others. Thus any overall difference in the dependent variable between the two conditions cannot have been caused by the order of conditions. A second way to think about what counterbalancing accomplishes is that if there are carryover effects, it makes it possible to detect them. One can analyze the data separately for each order to see whether it had an effect.

When 9 Is “Larger” Than 221

Researcher Michael Birnbaum has argued that the lack of context provided by between-subjects designs is often a bigger problem than the context effects created by within-subjects designs. To demonstrate this, he asked one group of participants to rate how large the number 9 was on a 1-to-10 rating scale and another group to rate how large the number 221 was on the same 1-to-10 rating scale (Birnbaum, 1999). Birnbaum, M. H. (1999). How to show that 9 > 221: Collect judgments in a between-subjects design. Psychological Methods, 4 , 243–249. Participants in this between-subjects design gave the number 9 a mean rating of 5.13 and the number 221 a mean rating of 3.10. In other words, they rated 9 as larger than 221! According to Birnbaum, this is because participants spontaneously compared 9 with other one-digit numbers (in which case it is relatively large) and compared 221 with other three-digit numbers (in which case it is relatively small).

Simultaneous Within-Subjects Designs

So far, we have discussed an approach to within-subjects designs in which participants are tested in one condition at a time. There is another approach, however, that is often used when participants make multiple responses in each condition. Imagine, for example, that participants judge the guilt of 10 attractive defendants and 10 unattractive defendants. Instead of having people make judgments about all 10 defendants of one type followed by all 10 defendants of the other type, the researcher could present all 20 defendants in a sequence that mixed the two types. The researcher could then compute each participant’s mean rating for each type of defendant. Or imagine an experiment designed to see whether people with social anxiety disorder remember negative adjectives (e.g., “stupid,” “incompetent”) better than positive ones (e.g., “happy,” “productive”). The researcher could have participants study a single list that includes both kinds of words and then have them try to recall as many words as possible. The researcher could then count the number of each type of word that was recalled. There are many ways to determine the order in which the stimuli are presented, but one common way is to generate a different random order for each participant.
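
Mixing the two types of stimuli into a single randomized sequence for each participant, as in the defendant example above, might look like this (the stimulus labels are illustrative placeholders):

```python
import random

def mixed_stimulus_sequence(stimuli_by_condition):
    """Combine the stimuli from all conditions and shuffle them,
    producing a different random order for each participant."""
    combined = [s for group in stimuli_by_condition.values() for s in group]
    random.shuffle(combined)
    return combined

# Hypothetical stimuli: 10 attractive and 10 unattractive defendants.
stimuli = {
    "attractive": [f"attr_{i}" for i in range(10)],
    "unattractive": [f"unattr_{i}" for i in range(10)],
}
sequence = mixed_stimulus_sequence(stimuli)
```

Calling the function once per participant gives each person a different random order, as the text recommends.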

Between-Subjects or Within-Subjects?

Almost every experiment can be conducted using either a between-subjects design or a within-subjects design. This means that researchers must choose between the two approaches based on their relative merits for the particular situation.

Between-subjects experiments have the advantage of being conceptually simpler and requiring less testing time per participant. They also avoid carryover effects without the need for counterbalancing. Within-subjects experiments have the advantage of controlling extraneous participant variables, which generally reduces noise in the data and makes it easier to detect a relationship between the independent and dependent variables.

A good rule of thumb, then, is that if it is possible to conduct a within-subjects experiment (with proper counterbalancing) in the time that is available per participant—and you have no serious concerns about carryover effects—this is probably the best option. If a within-subjects design would be difficult or impossible to carry out, then you should consider a between-subjects design instead. For example, if you were testing participants in a doctor’s waiting room or shoppers in line at a grocery store, you might not have enough time to test each participant in all conditions and therefore would opt for a between-subjects design. Or imagine you were trying to reduce people’s level of prejudice by having them interact with someone of another race. A within-subjects design with counterbalancing would require testing some participants in the treatment condition first and then in a control condition. But if the treatment works and reduces people’s level of prejudice, then they would no longer be suitable for testing in the control condition. This is true for many designs that involve a treatment meant to produce long-term change in participants’ behavior (e.g., studies testing the effectiveness of psychotherapy). Clearly, a between-subjects design would be necessary here.

Remember also that using one type of design does not preclude using the other type in a different study. There is no reason that a researcher could not use both a between-subjects design and a within-subjects design to answer the same research question. In fact, professional researchers often do exactly this.

Key Takeaways

  • Experiments can be conducted using either between-subjects or within-subjects designs. Deciding which to use in a particular situation requires careful consideration of the pros and cons of each approach.
  • Random assignment to conditions in between-subjects experiments or to orders of conditions in within-subjects experiments is a fundamental element of experimental research. Its purpose is to control extraneous variables so that they do not become confounding variables.
  • Experimental research on the effectiveness of a treatment requires both a treatment condition and a control condition, which can be a no-treatment control condition, a placebo control condition, or a waitlist control condition. Experimental treatments can also be compared with the best available alternative.

Discussion: For each of the following topics, list the pros and cons of a between-subjects and within-subjects design and decide which would be better.

  • You want to test the relative effectiveness of two training programs for running a marathon.
  • Using photographs of people as stimuli, you want to see if smiling people are perceived as more intelligent than people who are not smiling.
  • In a field experiment, you want to see if the way a panhandler is dressed (neatly vs. sloppily) affects whether or not passersby give him any money.
  • You want to see if concrete nouns (e.g., dog ) are recalled better than abstract nouns (e.g., truth ).
  • Discussion: Imagine that an experiment shows that participants who receive psychodynamic therapy for a dog phobia improve more than participants in a no-treatment control group. Explain a fundamental problem with this research design and at least two ways that it might be corrected.

Rethinking the pros and cons of randomized controlled trials and observational studies in the era of big data and advanced methods: a panel discussion

Pamela Fernainy (1, 2), Alan A. Cohen (3, 4, 5, 6, 7), Eleanor Murray (8), Elena Losina (9), Francois Lamontagne (10), and Nadia Sourial

1 Department of Health Management, Evaluation and Policy, School of Public Health, University of Montreal, Montreal, QC, Canada
2 Research Centre of the Centre Hospitalier de L’Université de Montréal (CHUM), Montreal, QC, Canada
3 Department of Family and Emergency Medicine, Faculty of Medicine and Health Sciences, University of Sherbrooke, Montreal, QC, Canada
4 CHUS Research Centre, Montreal, QC, Canada
5 Centre de Recherche Sur Le Vieillissement, Montreal, QC, Canada
6 Butler Columbia Aging Center, New York, NY, USA
7 Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA
8 School of Public Health, Boston University, Boston, MA, USA
9 Department of Orthopedic Surgery, Harvard Medical School, Cambridge, MA, USA
10 Département de Médecine, University of Sherbrooke, Montreal, QC, Canada

Associated data: not applicable.

Randomized controlled trials (RCTs) have traditionally been considered the gold standard for medical evidence. However, in light of emerging methodologies in data science, many experts question the role of RCTs. Within this context, experts in the USA and Canada came together to debate whether the primacy of RCTs as the gold standard for medical evidence still holds in light of recent methodological advances in data science and in the era of big data. This manuscript aims to raise awareness of the pros and cons of RCTs and observational studies in order to help guide clinicians, researchers, students, and decision-makers in making informed decisions on the quality of medical evidence to support their work. In particular, new and underappreciated advantages and disadvantages of both designs are contrasted. Innovations taking place in both of these research methodologies, which can blur the lines between the two, are also discussed. Finally, practical guidance for clinicians and future directions in assessing the quality of evidence is offered.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12919-023-00285-8.

Randomized controlled trials (RCTs) have traditionally been considered the gold standard for medical evidence because of their ability to eliminate bias due to confounding and to thereby ensure internal validity [ 1 ]. However, the primacy of RCTs is far from universally accepted by methodological experts. This is particularly true in the era of big data and in light of emerging methodologies in data science, machine learning, causal inference methods, and other research methods, which may shift how researchers view the relative quality of evidence from observational studies compared to RCTs. In this context, on February 24, 2022, a debate took place to discuss the pros and cons of randomized controlled trials and observational studies. This debate was intended to reach a wide audience at all levels of training and expertise, and welcomed clinicians, researchers, students, and decision-makers seeking to better navigate the complex landscape of health evidence in a fast-changing world. The webinar announcement was shared through multiple research centers and the social networks of the panelists. A broad range of attendees participated (total of 267 attendees: 35% researchers, 28% students, 16% clinicians, 5% managers, and 15% other), with varying levels of methodological expertise (26% minimal, 56% moderate, and 18% advanced). The panel was composed of clinicians and researchers with methodological expertise in experimental and observational studies from the USA and Canada (authors AAC, EM, EL, FL, and NS). This article seeks to summarize areas of agreement and disagreement among discussion panelists, highlight methodological innovations, and guide researchers, students, decision-makers, and clinicians in making informed decisions on the quality of medical evidence. The debate can be viewed at https://www.youtube.com/watch?v=VNc30fab9nM&t=17s . A lay infographic of the key points of the debate is also available (Appendix A ).

In general, RCTs are studies where investigators randomly assign subjects to different treatment groups (intervention or control group) to examine the effect of an intervention on relevant outcomes [ 2 ]. In large samples, random assignment generally results in balance between both observed (measured) and unobserved (unmeasured) group characteristics [ 1 ]. In observational studies, investigators observe the effects of exposures on outcomes using either existing data such as electronic health records (EHRs) [ 3 ], health administrative data, or collected data such as through population-based surveys [ 4 ]. Thus, in observational studies, the investigator does not play a role in the assignment of an exposure to the study subjects [ 5 ].

Pros and cons of RCTs and observational studies

By and large, RCTs are well suited to establishing the efficacy of medical interventions, and can accordingly advance knowledge that is important to the work of clinicians and the subsequent improvement of patients’ well-being. Besides being prescriptive and intuitive, the key feature of RCTs is the control for confounding due to the random assignment of the exposure of interest. Under ideal conditions, this design ensures high internal validity and can provide an unbiased causal effect of the exposure on the outcome [ 6 ]. Consequently, RCTs are helpful to physicians who prescribe medications, and studies of medications as interventions lend themselves well to this design. Conversely, the lack of random assignment in observational studies is a key disadvantage, opening up the possibility of bias due to confounding and requiring researchers to employ more sophisticated methods when attempting to control for this important source of bias [ 7 ]. For instance, when considering the effect of alcohol consumption on lung cancer, factors such as smoking should be considered, as smoking has been linked to both alcohol consumption and lung cancer and can therefore confound the effect of interest if not controlled. Yet, in reality, the generalizability of RCTs may also be threatened by selection bias [ 8 ] or particularities of the study population. Furthermore, randomization of the exposure only protects against confounding at baseline [ 9 ]. Confounding might occur during the course of the study, due to loss to follow-up, non-compliance, and missing data [ 10 , 11 ]. These post-randomization biases are often overlooked, and the benefits of randomization at baseline may give researchers and clinicians a false sense of security.

Conversely, in observational studies, researchers are keenly aware of the threat to validity due to bias and must often consider and implement methods at the design, analysis, and interpretation stages to account for it [ 12 ]. An advantage of observational studies is that they allow researchers to examine the effect of natural experiments, including the effect of interventions under real-world conditions [ 13 , 14 ]. This is particularly relevant when the study system is formally complex, such as for physiological and biochemical regulatory networks, healthcare systems, infectious diseases, and social networks. In this case, results may be highly contingent on many factors, for example, when assessing COVID-19 public health measures during the pandemic, determining the impact of lifestyle, or a patient belonging to an interprofessional primary care team. In these contexts, observational studies may provide better external validity than RCTs, which typically occur under well-controlled and, by the same token, often less realistic conditions. Observational studies are also preferred when RCTs are too costly, not feasible, time-intensive, or unethical to conduct [ 13 ]. For example, an RCT studying the development of melanoma would require a long follow-up period and may not be feasible. Among researchers, there is overall agreement that low-quality RCTs might not be generally superior to observational studies, but disagreement remains as to whether high-quality RCTs, as a rule, provide a higher standard of evidence [ 13 ]. For panelists, this disagreement stemmed partly from the relative weights they accorded to internal versus external validity. While no panelist felt that observational studies were systematically better than RCTs, there was disagreement as to whether the notion that RCTs are a gold standard is helpful or harmful. Still, despite this disagreement, methodological advances are opening the door to promising opportunities. Table 1 provides a succinct summary of several pros and cons of RCTs and observational studies.

Table 1

Pros and cons of randomized controlled trials and observational studies

Innovations and opportunities in RCTs and observational studies

Recent innovations in RCTs have facilitated or improved the results of this research method and can result in trials that are more flexible, efficient, or ethical [ 15 ]. New designs being considered in RCTs include, but are not limited to, adaptive trials, sequential trials, and platform trials. Adaptive trials, for instance, include scheduled interim looks at the data during the trial. This leads to predetermined changes based on the analyses of accumulating data, all the while maintaining trial validity and integrity [ 15 ]. Sequential trials are an approach to clinical trials during which subjects are serially recruited and study results are continuously analyzed [ 16 ]. Once enough data enabling a decision regarding treatment effectiveness is collected, the trial is stopped [ 17 ]. Platform trials focus on an entire disease or syndrome to compare multiple interventions and add or drop interventions over time [ 18 ]. Also, the development of EHRs and an expanded access to routinely-collected clinical data has resulted in RCTs being conducted within the context of EHR-based clinical trials. EHRs have the potential to advance clinical health research by facilitating RCTs in real-world settings. Many RCTs have leveraged EHRs to recruit patients or assess clinical outcomes with minimal patient contact [ 19 ]. Such approaches are considered a particularly innovative convergence of observational and experimental data, which blurs the line between these two methodologies going forward.

Innovations are also taking place in observational studies. The last two decades have seen the use of novel methods, such as causal inference, to analyze observational data as hypothetical RCTs, which have generated results similar to those of randomized trials [13]. Causal inference in observational studies refers to an intellectual discipline that allows researchers to draw causal conclusions from data by attending to assumptions, study design, and estimation strategies [20]. Through their well-defined frameworks and assumptions, causal inference methods have the advantage of requiring researchers to be explicit in defining the intervention, exposure, and confounders, for example through the use of directed acyclic graphs (DAGs) [21], and they have helped to overcome concerns about bias in the analysis of observational studies [10]. Moreover, large observational studies have recently become more popular in the era of big data because of their ability to leverage and analyze multiple sources of observational data [22], such as population databases, social media, and digital health tools [23]. Another innovation is the E-value, "the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the treatment and the outcome to fully explain away a specific treatment-outcome association, conditional on the measured covariates" [24]. The E-value is an intuitive metric for gauging how robust a study's results are to unmeasured confounding. A summary of these methods and their applications can be seen in Table 2.
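
For the point estimate, the E-value quoted above has a closed form on the risk-ratio scale, E = RR + sqrt(RR × (RR − 1)) for RR ≥ 1, with RR replaced by its reciprocal when it is below 1 (the expression given by VanderWeele and Ding). A minimal sketch, with a hypothetical helper name:

```python
import math

def e_value(rr):
    """E-value for an observed risk ratio (point estimate).

    Risk ratios below 1 are inverted first, since the formula is stated
    for RR >= 1; an E-value of 1 means even a trivial unmeasured
    confounder could explain the association away.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    rr = rr if rr >= 1 else 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))

# An observed RR of 3.9 would need an unmeasured confounder associated
# with both treatment and outcome at RR ~= 7.26 to be fully explained away.
print(round(e_value(3.9), 2))  # 7.26
```

In practice the same formula is also applied to the confidence limit closest to the null, which indicates how much unmeasured confounding would be needed to move the interval to include no effect.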

Table 2

Innovations in randomized controlled trials and observational studies

Despite these salient advances, challenges and future considerations remain for both observational and experimental research methodologies (see Appendix A). One concern is how to apply innovations to new contexts, different topics, and novel areas of research. For example, causal inference methods are widely used in pharmacoepidemiology but have so far rarely been used in other fields such as primary care [44]. One solution could be to encourage the use of these novel techniques by developing guidelines, sensitizing medical students to these methods by including them in the curriculum, or assembling more impartial and open-minded journal review boards. Such measures could facilitate cross-fertilization of methods across disciplines and foster their use in more studies.

When considering RCTs and observational studies, several key take-home messages can be drawn:

  • No study is designed to answer all questions, and consequently, neither RCTs nor observational studies can answer all research questions at all times. Rather, the research question and context should drive the choice of method to be used.
  • Both observational studies and RCTs face methodological challenges and are subject to bias. While any single study is flawed, the hope is that the body of evidence, taken together, will show consistency in the effect of the exposure. Furthermore, triangulating evidence from observational and experimental approaches can furnish a stronger basis for causal inference and a better understanding of the phenomenon under study [10].
  • Recent methodological innovations in health research represent a paradigm shift in how studies should be planned and conducted [44]. More knowledge translation is needed to disseminate these innovations across the different health research fields.

Finally, both RCTs and observational studies can generate evidence that improves health and clinical care for patients, the desired effect and general aim of all researchers, decision-makers, and physicians using these study methods. However, whether RCTs are necessary for establishing the highest level of evidence remains an area of substantial disagreement, and it will be important to continue these discussions going forward.

Acknowledgements

Lise Gauvin, Department of Social and Preventive Medicine, School of Public Health, University of Montreal, Research Centre of the Centre Hospitalier de l’Université de Montréal (CRCHUM).

Hosting research centres: CRCHUM, Research Centre of the University of Sherbrooke and the University of Sherbrooke Research Center on Aging.

Abbreviations

RCT: Randomized controlled trial; EHR: Electronic health record; DAG: Directed acyclic graph

Authors’ contributions

PF contributed to the conception of the paper and drafted the work. AAC contributed to conception and revision of the manuscript. EM contributed to conception and revision of the manuscript. EL contributed to conception and revision of the manuscript. FL contributed to conception and revision of the manuscript. NS was responsible for conception and revision of the manuscript and substantially revised the work. All authors read and approved the submitted manuscript.

Funding

This work was funded by a Canadian Institutes of Health Research grant (#178264).

Availability of data and materials

Declarations

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
