Quasi-Experimental Research Design – Types, Methods

Quasi-Experimental Design

Quasi-experimental design is a research method that seeks to evaluate the causal relationships between variables, but without the full control over the independent variable(s) that is available in a true experimental design.

In a quasi-experimental design, the researcher uses an existing group of participants that is not randomly assigned to the experimental and control groups. Instead, the groups are selected based on pre-existing characteristics or conditions, such as age, gender, or the presence of a certain medical condition.

Types of Quasi-Experimental Design

There are several types of quasi-experimental designs that researchers use to study causal relationships between variables. Here are some of the most common types:

Non-Equivalent Control Group Design

This design involves selecting two groups of participants that are as similar as possible on characteristics relevant to the outcome. One group receives the treatment or intervention being studied, while the other group does not. The two groups are then compared to see whether there are any significant differences in the outcomes.

Interrupted Time-Series Design

This design involves collecting data on the dependent variable(s) over a period of time, both before and after an intervention or event. The researcher can then determine whether there was a significant change in the dependent variable(s) following the intervention or event.

Pretest-Posttest Design

This design involves measuring the dependent variable(s) before and after an intervention or event, but without a control group. This design can be useful for determining whether the intervention or event had an effect, but it does not allow for control over other factors that may have influenced the outcomes.

Regression Discontinuity Design

This design involves selecting participants based on a specific cutoff point on a continuous variable, such as a test score. Participants on either side of the cutoff point are then compared to determine whether the intervention or event had an effect.

Natural Experiments

This design involves studying the effects of an intervention or event that occurs naturally, without the researcher’s intervention. For example, a researcher might study the effects of a new law or policy that affects certain groups of people. This design is useful when true experiments are not feasible or ethical.

Data Analysis Methods

Here are some data analysis methods that are commonly used in quasi-experimental designs:

Descriptive Statistics

This method involves summarizing the data collected during a study using measures such as mean, median, mode, range, and standard deviation. Descriptive statistics can help researchers identify trends or patterns in the data, and can also be useful for identifying outliers or anomalies.
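To make this concrete, here is a minimal sketch (in Python, using pandas) of how descriptive statistics might be computed for the two groups in a quasi-experiment. The column names and values are purely illustrative.

```python
import pandas as pd

# Hypothetical outcome scores for a treatment and a comparison group.
df = pd.DataFrame({
    "group": ["treatment"] * 5 + ["control"] * 5,
    "score": [72, 85, 78, 90, 66, 70, 75, 68, 80, 64],
})

# Summarize each group: central tendency, spread, and range.
summary = df.groupby("group")["score"].agg(["mean", "median", "std", "min", "max"])
summary["range"] = summary["max"] - summary["min"]
print(summary)
```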

Inferential Statistics

This method involves using statistical tests to determine whether the results of a study are statistically significant. Inferential statistics can help researchers make generalizations about a population based on the sample data collected during the study. Common statistical tests used in quasi-experimental designs include t-tests, ANOVA, and regression analysis.
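For illustration, here is a hedged sketch of an independent-samples t-test with SciPy on made-up data. Welch's variant (`equal_var=False`) is often a safer default for nonequivalent groups, whose variances may differ.

```python
from scipy import stats

# Hypothetical outcome scores for the two (nonequivalent) groups.
treatment = [72, 85, 78, 90, 66]
control = [70, 75, 68, 80, 64]

# Welch's t-test: does not assume equal variances across groups.
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```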

Propensity Score Matching

This method is used to reduce bias in quasi-experimental designs by matching participants in the intervention group with participants in the control group who have similar characteristics. This can help to reduce the impact of confounding variables that may affect the study’s results.
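The sketch below illustrates the general idea with scikit-learn on simulated data: estimate each participant's probability of receiving the treatment from observed covariates, then match each treated unit to the nearest control on that score. The covariates and model here are assumptions for illustration; a real propensity-score analysis would add balance checks, calipers, and overlap diagnostics.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(20, 65, n),
    "baseline_score": rng.normal(50, 10, n),
})
# Simulate selection bias: treatment uptake depends on the covariates.
logits = 0.05 * (df["age"] - 40) + 0.03 * (df["baseline_score"] - 50)
df["treated"] = rng.random(n) < 1 / (1 + np.exp(-logits))

# Step 1: estimate propensity scores from the covariates.
covariates = df[["age", "baseline_score"]]
ps_model = LogisticRegression().fit(covariates, df["treated"])
df["pscore"] = ps_model.predict_proba(covariates)[:, 1]

# Step 2: match each treated unit to the nearest control on the propensity score.
treated, control = df[df["treated"]], df[~df["treated"]]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_controls = control.iloc[idx.ravel()]
print(len(treated), "treated units matched to", len(matched_controls), "controls")
```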

Difference-in-differences Analysis

This method compares the change in outcomes over time between a group that received an intervention and a group that did not; the difference between those two changes estimates the intervention's impact on the target population.
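Here is a minimal difference-in-differences sketch with statsmodels, on an invented toy dataset: regressing the outcome on group, period, and their interaction, the coefficient on the interaction term is the DiD estimate.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: two groups (treated vs. not) observed before and after an intervention.
df = pd.DataFrame({
    "treated": [0, 0, 0, 0, 1, 1, 1, 1],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],
    "outcome": [10, 11, 12, 13, 9, 10, 15, 16],
})

# The treated:post coefficient is the difference-in-differences estimate.
model = smf.ols("outcome ~ treated * post", data=df).fit()
print(model.params["treated:post"])
```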

Interrupted Time Series Analysis

This method is used to examine the impact of an intervention or treatment over time by comparing data collected before and after the intervention or treatment. This method can help researchers determine whether an intervention had a significant impact on the target population.
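One common analysis here is segmented regression, sketched below on simulated data: the model separates the pre-existing time trend from the immediate level change at the intervention and any change in slope afterward. Variable names and numbers are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
weeks = np.arange(24)
intervention = (weeks >= 12).astype(int)           # 0 before week 12, 1 after
time_since = np.where(weeks >= 12, weeks - 12, 0)  # counts up after the interruption

# Simulate a gentle upward trend with a sudden drop at the intervention.
outcome = 20 + 0.2 * weeks - 5 * intervention + rng.normal(0, 1, weeks.size)

df = pd.DataFrame({"outcome": outcome, "time": weeks,
                   "intervention": intervention, "time_since": time_since})
model = smf.ols("outcome ~ time + intervention + time_since", data=df).fit()
print(model.params)  # 'intervention' = level change; 'time_since' = slope change
```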

Regression Discontinuity Analysis

This method is used to compare the outcomes of participants who fall on either side of a predetermined cutoff point. This method can help researchers determine whether an intervention had a significant impact on the target population.
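Below is a hedged sketch of a sharp regression discontinuity analysis on simulated data: restrict to observations near the cutoff and fit a local linear model with separate slopes on each side; the coefficient on the treatment indicator estimates the jump at the cutoff. A real analysis would also involve careful bandwidth selection and robustness checks.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
score = rng.uniform(0, 100, 500)           # the running variable (e.g., a test score)
above = (score >= 50).astype(int)          # treatment assigned at the cutoff of 50
outcome = 10 + 0.1 * score + 4 * above + rng.normal(0, 2, score.size)

df = pd.DataFrame({"outcome": outcome, "score": score - 50, "above": above})
local = df[df["score"].abs() <= 10]        # keep only observations near the cutoff

# Separate slopes on each side; 'above' estimates the discontinuity at the cutoff.
model = smf.ols("outcome ~ score * above", data=local).fit()
print(model.params["above"])
```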

Steps in Quasi-Experimental Design

Here are the general steps involved in conducting a quasi-experimental design:

  • Identify the research question: Determine the research question and the variables that will be investigated.
  • Choose the design: Choose the appropriate quasi-experimental design to address the research question. Examples include the pretest-posttest design, non-equivalent control group design, regression discontinuity design, and interrupted time series design.
  • Select the participants: Select the participants who will be included in the study. Participants should be selected based on specific criteria relevant to the research question.
  • Measure the variables: Measure the variables that are relevant to the research question. This may involve using surveys, questionnaires, tests, or other measures.
  • Implement the intervention or treatment: Deliver the intervention or treatment to the participants in the intervention group. This may involve training, education, counseling, or other interventions.
  • Collect data: Collect data on the dependent variable(s) before and after the intervention. Data collection may also include collecting data on other variables that may impact the dependent variable(s).
  • Analyze the data: Analyze the data collected to determine whether the intervention had a significant impact on the dependent variable(s).
  • Draw conclusions: Draw conclusions about the relationship between the independent and dependent variables. If the results suggest a causal relationship, then appropriate recommendations may be made based on the findings.

Quasi-Experimental Design Examples

Here are some examples of real-world quasi-experimental designs:

  • Evaluating the impact of a new teaching method: In this study, a group of students are taught using a new teaching method, while another group is taught using the traditional method. The test scores of both groups are compared before and after the intervention to determine whether the new teaching method had a significant impact on student performance.
  • Assessing the effectiveness of a public health campaign: In this study, a public health campaign is launched to promote healthy eating habits among a targeted population. The behavior of the population is compared before and after the campaign to determine whether the intervention had a significant impact on the target behavior.
  • Examining the impact of a new medication: In this study, a group of patients is given a new medication, while another group is given a placebo. The outcomes of both groups are compared to determine whether the new medication had a significant impact on the targeted health condition.
  • Evaluating the effectiveness of a job training program: In this study, a group of unemployed individuals is enrolled in a job training program, while another group is not enrolled in any program. The employment rates of both groups are compared before and after the intervention to determine whether the training program had a significant impact on the employment rates of the participants.
  • Assessing the impact of a new policy: In this study, a new policy is implemented in a particular area, while another area does not have the new policy. The outcomes of both areas are compared before and after the intervention to determine whether the new policy had a significant impact on the targeted behavior or outcome.

Applications of Quasi-Experimental Design

Here are some applications of quasi-experimental design:

  • Educational research: Quasi-experimental designs are used to evaluate the effectiveness of educational interventions, such as new teaching methods, technology-based learning, or educational policies.
  • Health research: Quasi-experimental designs are used to evaluate the effectiveness of health interventions, such as new medications, public health campaigns, or health policies.
  • Social science research: Quasi-experimental designs are used to investigate the impact of social interventions, such as job training programs, welfare policies, or criminal justice programs.
  • Business research: Quasi-experimental designs are used to evaluate the impact of business interventions, such as marketing campaigns, new products, or pricing strategies.
  • Environmental research: Quasi-experimental designs are used to evaluate the impact of environmental interventions, such as conservation programs, pollution control policies, or renewable energy initiatives.

When to use Quasi-Experimental Design

Here are some situations where quasi-experimental designs may be appropriate:

  • When the research question involves investigating the effectiveness of an intervention, policy, or program: In situations where it is not feasible or ethical to randomly assign participants to intervention and control groups, quasi-experimental designs can be used to evaluate the impact of the intervention on the targeted outcome.
  • When the sample size is small: In situations where the sample size is small, it may be difficult to randomly assign participants to intervention and control groups. Quasi-experimental designs can be used to investigate the impact of an intervention without requiring a large sample size.
  • When the research question involves investigating a naturally occurring event: In some situations, researchers may be interested in investigating the impact of a naturally occurring event, such as a natural disaster or a major policy change. Quasi-experimental designs can be used to evaluate the impact of the event on the targeted outcome.
  • When the research question involves investigating a long-term intervention: In situations where the intervention or program is long-term, it may be difficult to randomly assign participants to intervention and control groups for the entire duration of the intervention. Quasi-experimental designs can be used to evaluate the impact of the intervention over time.
  • When the research question involves investigating the impact of a variable that cannot be manipulated: In some situations, it may not be possible or ethical to manipulate a variable of interest. Quasi-experimental designs can be used to investigate the relationship between the variable and the targeted outcome.

Purpose of Quasi-Experimental Design

The purpose of quasi-experimental design is to investigate the causal relationship between two or more variables when it is not feasible or ethical to conduct a randomized controlled trial (RCT). Quasi-experimental designs approximate the RCT by constructing intervention and control groups that resemble randomized groups as closely as possible.

The key purpose of quasi-experimental design is to evaluate the impact of an intervention, policy, or program on a targeted outcome while controlling for potential confounding factors that may affect the outcome. Quasi-experimental designs aim to answer questions such as: Did the intervention cause the change in the outcome? Would the outcome have changed without the intervention? And was the intervention effective in achieving its intended goals?

Quasi-experimental designs are useful in situations where randomized controlled trials are not feasible or ethical. They provide researchers with an alternative method to evaluate the effectiveness of interventions, policies, and programs in real-life settings. Quasi-experimental designs can also help inform policy and practice by providing valuable insights into the causal relationships between variables.

Overall, the purpose of quasi-experimental design is to provide a rigorous method for evaluating the impact of interventions, policies, and programs while controlling for potential confounding factors that may affect the outcome.

Advantages of Quasi-Experimental Design

Quasi-experimental designs have several advantages over other research designs, such as:

  • Greater external validity: Quasi-experimental designs are more likely to have greater external validity than laboratory experiments because they are conducted in naturalistic settings. This means that the results are more likely to generalize to real-world situations.
  • Ethical considerations: Quasi-experimental designs often involve naturally occurring events, such as natural disasters or policy changes. This means that researchers do not need to manipulate variables, which can raise ethical concerns.
  • More practical: Quasi-experimental designs are often more practical than experimental designs because they are less expensive and easier to conduct. They can also be used to evaluate programs or policies that have already been implemented, which can save time and resources.
  • No random assignment: Quasi-experimental designs do not require random assignment, which can be difficult or impossible in some cases, such as when studying the effects of a natural disaster. Researchers can still draw cautious causal inferences, provided they use statistical techniques to control for potential confounding variables.
  • Greater generalizability: Quasi-experimental designs are often more generalizable than experimental designs because they include a wider range of participants and conditions. This can make the results more applicable to different populations and settings.

Limitations of Quasi-Experimental Design

There are several limitations associated with quasi-experimental designs, which include:

  • Lack of Randomization: Quasi-experimental designs do not involve randomization of participants into groups, which means that the groups being studied may differ in important ways that could affect the outcome of the study. This can lead to problems with internal validity and limit the ability to make causal inferences.
  • Selection Bias: Quasi-experimental designs may suffer from selection bias because participants are not randomly assigned to groups. Participants may self-select into groups or be assigned based on pre-existing characteristics, which may introduce bias into the study.
  • History and Maturation: Quasi-experimental designs are susceptible to history and maturation effects, where the passage of time or other events may influence the outcome of the study.
  • Lack of Control: Quasi-experimental designs may lack control over extraneous variables that could influence the outcome of the study. This can limit the ability to draw causal inferences from the study.
  • Limited Generalizability: Quasi-experimental designs may have limited generalizability because the results may only apply to the specific population and context being studied.


Quasi-experimental Research: What It Is, Types & Examples


Much like a true experiment, quasi-experimental research tries to demonstrate a cause-and-effect link between a dependent and an independent variable. Unlike a true experiment, however, a quasi-experiment does not depend on random assignment: the subjects are sorted into groups based on non-random criteria.

What is Quasi-Experimental Research?

“Quasi” means “resembling.” In quasi-experimental research, the independent variable is manipulated, but individuals are not randomly allocated to conditions or orders of conditions. As a result, quasi-experimental research is research that appears to be experimental but is not.

Quasi-experimental research avoids the directionality problem, since the independent variable is manipulated before the dependent variable is measured. However, because individuals are not assigned at random, there are likely to be additional differences across conditions.

As a result, in terms of internal validity, quasi-experiments fall somewhere between correlational research and true experiments.

The key component of a true experiment is randomly assigned groups. This means that each person has an equal chance of being assigned to the experimental group or the control group.

Simply put, a quasi-experiment is not a real experiment: it lacks the randomly assigned groups that are the defining component of a true experiment. Why is it so crucial to have randomly allocated groups, given that they constitute the main distinction between quasi-experimental and true experimental research?

Let’s use an example to illustrate our point. Assume we want to discover how a new psychological therapy affects depressed patients. In a true experiment, you would randomly split the psychiatric ward into two groups, with half receiving the new psychotherapy and the other half receiving standard depression treatment.

The physicians would then compare the outcomes of the new treatment with those of standard treatment to see whether it is more effective. However, doctors may be unwilling to run this true experiment, since they may believe it is unethical to treat one group while leaving another untreated.

A quasi-experimental study is useful in this case. Instead of allocating these patients at random, you identify pre-existing psychotherapy groups in the hospitals. Some therapists will be eager to undertake the new treatment, while others will prefer to stick to the standard approach.

These pre-existing groups can be used to compare the symptom development of individuals who received the novel therapy with those who received the normal course of treatment, even though the groups weren’t chosen at random.

If the groups are otherwise comparable and plausible alternative explanations can be ruled out, you can be reasonably confident that any differences are attributable to the treatment rather than to extraneous variables.

As we mentioned before, quasi-experimental research entails manipulating an independent variable without randomly assigning people to conditions or orders of conditions. Non-equivalent group designs, pretest-posttest designs, and regression discontinuity designs are a few of the essential types.

What are quasi-experimental research designs?

Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn’t give full control over the independent variable(s) like true experimental designs do.

In a quasi-experimental design, the researcher manipulates or observes an independent variable, but participants are not put into groups at random. Instead, people are put into groups based on characteristics they already have in common, such as their age, gender, or prior exposure to a certain stimulus.

Because the assignments are not random, it is harder to draw conclusions about cause and effect than in a real experiment. However, quasi-experimental designs are still useful when randomization is not possible or ethical.

The true experimental design may be impossible to accomplish or just too expensive, especially for researchers with few resources. Quasi-experimental designs enable you to investigate an issue by utilizing data that has already been paid for or gathered by others (often the government). 

Because they involve real-world settings rather than artificial laboratory ones, quasi-experiments tend to have higher external validity than most true experiments, and because they allow some control over confounding variables, they tend to have higher internal validity than other non-experimental research (though lower than true experiments).

Is quasi-experimental research quantitative or qualitative?

Quasi-experimental research is a quantitative research method. It involves numerical data collection and statistical analysis. Quasi-experimental research compares groups with different circumstances or treatments to find cause-and-effect links. 

It draws statistical conclusions from quantitative data. Qualitative data can enhance quasi-experimental research by revealing participants’ experiences and opinions, but quantitative data is the method’s foundation.

Quasi-experimental research types

There are many different sorts of quasi-experimental designs. Three of the most popular varieties are non-equivalent group designs, regression discontinuity designs, and natural experiments.

Natural Experiments

One well-known natural experiment is the 2008 Oregon Health Study, described in more detail later in this guide. Oregon wanted to allow more low-income people to participate in Medicaid. However, because the state couldn’t afford to cover everyone who qualified for the program, it had to use a random lottery to distribute slots.

Researchers were able to investigate the program’s impact by using enrolled people as a treatment group and those who qualified but did not win the lottery as a control group.

How QuestionPro helps in quasi-experimental research

QuestionPro can be a useful tool in quasi-experimental research because it includes features that can assist you in designing and analyzing your research study. Here are some ways in which QuestionPro can help in quasi-experimental research:

  • Design surveys
  • Randomize participants
  • Collect data over time
  • Analyze data
  • Collaborate with your team

With QuestionPro, you have access to a mature market research platform that helps you collect and analyze the insights that matter most. By using InsightsHub, the unified hub for data management, you can organize, explore, search, and discover your research data in one organized repository.



7.3 Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history . Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of maturation . Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
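A small simulation makes the point concrete, under the simple assumption that each test score is a stable "true score" plus independent measurement noise: students selected for extremely low first-test scores score noticeably higher on the retest with no intervention at all.

```python
import numpy as np

rng = np.random.default_rng(3)
true_score = rng.normal(70, 5, 10_000)          # stable underlying ability
test1 = true_score + rng.normal(0, 10, 10_000)  # first test = ability + noise
test2 = true_score + rng.normal(0, 10, 10_000)  # retest = ability + fresh noise

low_scorers = test1 < 50                        # selected for their extreme scores
print("Mean on test 1:", test1[low_scorers].mean())  # far below 70
print("Mean on test 2:", test2[low_scorers].mean())  # regresses back toward 70
```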

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952). But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate without receiving psychotherapy. This suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here:

http://psychclassics.yorku.ca/Eysenck/psychotherapy.htm

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980). They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Figure: Hans Eysenck. In a classic 1952 article, researcher Hans Eysenck pointed out the shortcomings of the simple pretest-posttest design for evaluating the effectiveness of psychotherapy. (Wikimedia Commons – CC BY-SA 3.0.)

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979). Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Figure 7.5 A Hypothetical Interrupted Time-Series Design


The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
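A quick simulation of this point, with invented numbers: a flat series of weekly absence counts containing only random variation can easily show an apparent "drop" (or rise) between two adjacent weeks, which the full series reveals to be noise.

```python
import numpy as np

rng = np.random.default_rng(4)
absences = rng.poisson(8, 14)  # 14 weeks of counts; no real change occurs at week 7

# A lone pretest-posttest comparison may differ just by chance...
print("Week 7 vs. Week 8:", absences[6], "vs.", absences[7])

# ...while the full series shows the weekly means are essentially stable.
print("Mean, weeks 1-7: ", absences[:7].mean())
print("Mean, weeks 8-14:", absences[7:].mean())
```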

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest designs, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Discussion: Imagine that a group of obese children is recruited for a study in which their weight is measured, then they participate for 3 months in a program that encourages them to be more active, and finally their weight is measured again. Explain how each of the following might affect the results:

  • regression to the mean
  • spontaneous remission

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.

Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.

Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.

Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

The use and interpretation of quasi-experimental design


What is a quasi-experimental design?

Quasi-experimental design is commonly used in medical informatics (a field that uses digital information to ensure better patient care), where researchers generally use it to evaluate the effectiveness of a treatment – perhaps a type of antibiotic or psychotherapy, or an educational or policy intervention.

Even though quasi-experimental design has been used for some time, relatively little is known about it. Read on to learn the ins and outs of this research design.


When to use a quasi-experimental design

A quasi-experimental design is used when it's not logistically feasible or ethical to conduct randomized, controlled trials. As its name suggests, a quasi-experimental design is almost a true experiment. However, researchers don't randomly select elements or participants in this type of research.

Researchers prefer to apply quasi-experimental design when there are ethical or practical concerns. Let's look at these two reasons more closely.

Ethical reasons

In some situations, the use of randomly assigned elements can be unethical. For instance, providing public healthcare to one group and withholding it from another in research is unethical. A quasi-experimental design can instead examine the relationship between pre-existing groups, avoiding the ethical problem.

Practical reasons

Randomized controlled trials may not always be the best approach in research. For instance, it's impractical to trawl through a very large pool of participants without using a particular attribute to guide your data collection.

Recruiting participants and properly designing a data-collection attribute to make the research a true experiment requires a lot of time and effort, and can be expensive if you don’t have a large funding stream.

A quasi-experimental design allows researchers to take advantage of previously collected data and use it in their study.

Examples of quasi-experimental designs

Quasi-experimental research design is common in medical research, but any researcher can use it when a study raises practical or ethical concerns. Here are a few examples of quasi-experimental designs used by different researchers:

Example 1: Determining the effectiveness of math apps in supplementing math classes

A school wanted to supplement its math classes with a math app. To select the best app, the school decided to run demo tests of two apps before choosing the one it would purchase.

Scope of the research

Since every grade had two math teachers, each teacher used one of the two apps for three months. They then gave the students the same math exams and compared the results to determine which app was most effective.

Reasons why this is a quasi-experimental study

This simple study is a quasi-experiment since the school didn't randomly assign its students to the applications. They used a pre-existing class structure to conduct the study since it was impractical to randomly assign the students to each app.

Example 2: Determining the effectiveness of teaching modern leadership techniques in start-up businesses

A hypothetical quasi-experimental study was conducted in an economically developing country in a mid-sized city.

Five start-ups in the textile industry and five in the tech industry participated in the study. The leaders attended a six-week workshop on leadership style, team management, and employee motivation.

After a year, the researchers assessed the performance of each start-up company to determine growth. The results indicated that the tech start-ups were further along in their growth than the textile companies.

The basis of quasi-experimental research is a non-randomized subject-selection process. This study didn't use specific criteria to determine which start-up companies should participate. The results may therefore seem straightforward, but several factors apart from the variables the researchers used may have determined the growth of a specific company.

Example 3: A study to determine the effects of policy reforms and of luring foreign investment on small businesses in two mid-size cities

In a study to determine the economic impact of government reforms in an economically developing country, the government decided to test whether creating reforms directed at small businesses or luring foreign investments would spur the most economic development.

The government selected two cities with similar population demographics and sizes. In one of the cities, they implemented specific policies that would directly impact small businesses, and in the other, they implemented policies to attract foreign investment.

After five years, they collected end-of-year economic growth data from both cities. They looked at elements like local GDP growth, unemployment rates, and housing sales.

The study used a non-randomized selection process to determine which cities would participate in the research. Variables that the researchers could not control, and that may play a crucial role in each city's growth, were left out. The comparison relied on pre-existing populations in each city rather than randomly formed groups.

Advantages of a quasi-experimental design

Some advantages of quasi-experimental designs are:

  • Researchers can manipulate variables to help them meet their study objectives.
  • It offers high external validity, making it suitable for real-world applications, particularly in social science research.
  • It is easier to integrate with other research designs, especially true experimental research, which cuts down on the time needed to determine outcomes.

Disadvantages of a quasi-experimental design

Despite the pros that come with a quasi-experimental design, it has several disadvantages, including the following:

  • It has lower internal validity, since researchers do not have full control over the comparison and intervention groups or over time periods; differences in the people, places, or times involved can make it challenging to determine whether all relevant variables were accounted for and which ones drove the results.
  • There is a risk of inaccurate data, since the design often borrows information from other studies.
  • There is a possibility of bias, since researchers select the baseline elements and eligibility.

What are the different quasi-experimental study designs?

There are three distinct types of quasi-experimental designs: nonequivalent group designs, regression discontinuity designs, and natural experiments.

Nonequivalent group design

This design is a hybrid of experimental and quasi-experimental methods, used to leverage the best qualities of the two. Like a true experiment, it compares a treatment group with a comparison group; unlike a true experiment, it uses pre-existing groups believed to be comparable rather than randomization, whose absence is the defining element of quasi-experimental design.

Researchers try to minimize the influence of confounding variables during the grouping process, which makes the groups more comparable.

Example of a nonequivalent group design

A small study was conducted to determine whether after-school programs result in better grades. Researchers selected two existing groups of students: one that would take part in the new program and one that would not. They then compared the results of the two groups.

Regression discontinuity

This type of quasi-experimental research design estimates the impact of a specific treatment or intervention. It uses a criterion known as a "cutoff" that assigns treatment according to eligibility.

Researchers typically assign participants above the cutoff to the treatment group. Because participants just above and just below the cutoff are very similar, the distinction between the two groups near the cutoff is negligible.

Example of regression discontinuity

Students must achieve a minimum score to be enrolled in certain US high schools. Because the cutoff score is arbitrary, students who just miss it and students who just clear it are essentially similar, so any later disparity between their outcomes can plausibly be attributed to the difference in the schools these students attend.

Researchers can then examine the long-term outcomes of these two groups of students to estimate the effect of attending those schools. This information can be used to inform enrollment policy for these high schools.

Natural experiment

This research design arises where nature, or an external event or situation, effectively assigns subjects to treatment and comparison groups at random, rather than the researcher doing so. However, even with this random assignment, the design cannot be called a true experiment, since the assignment is observed rather than controlled: researchers exploit these natural events despite having no control over the independent variables.

Example of the natural experiment approach

An example of a natural experiment is the 2008 Oregon Health Study.

Oregon intended to allow more low-income people to participate in Medicaid.

Since they couldn't afford to cover every person who qualified for the program, the state used a random lottery to allocate program slots.

Researchers assessed the program's effectiveness by treating the lottery winners who enrolled as the treatment group and those who didn't win the lottery as the control group.

Differences between quasi-experiments and true experiments

There are several differences between a quasi-experiment and a true experiment:

  • Participants in true experiments are randomly assigned to the treatment or control group, while participants in a quasi-experiment are not assigned randomly.
  • In a quasi-experimental design, the control and treatment groups may differ in unknown or unknowable ways beyond the experimental treatment itself. The researcher should therefore try as much as possible to control for these differences.
  • Quasi-experimental designs leave open several "competing hypotheses," which compete with the experimental manipulation as explanations for the observed results.
  • Quasi-experiments tend to have lower internal validity (the degree of confidence in the research outcomes) than true experiments, but they may offer higher external validity (whether findings can be extended to other contexts), as they involve real-world interventions rather than controlled interventions in artificial laboratory settings.

Despite the distinct differences between true and quasi-experimental research designs, the two methodologies share the following aspects:

  • Both subject participants to some form of treatment or condition.
  • Researchers measure some of the outcomes of interest.
  • Researchers can test whether differences in outcomes are associated with the treatment.

An example comparing a true experiment and a quasi-experiment

Imagine you wanted to study the effects of junk food on obese people. Here's how you would do this as a true experiment and a quasi-experiment:

How to carry out a true experiment

In a true experiment, some participants would eat junk foods, while the rest would be in the control group, adhering to a regular diet. At the end of the study, you would record the health and discomfort of each group.

This kind of experiment would raise ethical concerns since the participants assigned to the treatment group are required to eat junk food against their will throughout the experiment. This calls for a quasi-experimental design.

How to carry out a quasi-experiment

In quasi-experimental research, you would start by finding out which participants want to eat junk food and which prefer to stick to a regular diet. This allows you to form the two groups based on the subjects' own choices.

In this case, you didn't force participants into a particular group, so the ethical concern is avoided, though you must interpret the results with this self-selection in mind.

When is a quasi-experimental design used?

Quasi-experimental designs are used when researchers cannot, or should not for ethical or practical reasons, use randomization when evaluating an intervention.

What are the characteristics of quasi-experimental designs?

Some of the characteristics of a quasi-experimental design are:

  • Researchers don't randomly assign participants to groups, but study their existing characteristics and assign them accordingly.
  • Researchers study the participants with pre- and post-testing to determine the progress of the groups.
  • Quasi-experimental design is ethical, since it doesn't involve offering or withholding treatment at random.

Quasi-experimental design encompasses a broad range of non-randomized intervention studies. This design is employed when it is not ethical or logistically feasible to conduct randomized controlled trials. Researchers typically employ it when evaluating policy or educational interventions, or in medical or therapy scenarios.

How do you analyze data in a quasi-experimental design?

You can use two-group tests, time-series analysis, and regression analysis to analyze data in a quasi-experiment design. Each option has specific assumptions, strengths, limitations, and data requirements.


Research Methodologies Guide


Quasi-Experimental Design


Quasi-Experimental Design is a unique research methodology because it is characterized by what it lacks. For example, Abraham & MacDonald (2011) state:

"Quasi-experimental research is similar to experimental research in that there is manipulation of an independent variable. It differs from experimental research because either there is no control group, no random selection, no random assignment, and/or no active manipulation."

This type of research is often performed in cases where a control group cannot be created or random selection cannot be performed. This is often the case in certain medical and psychological studies. 

For more information on quasi-experimental design, review the resources below: 

Where to Start

Below are listed a few tools and online guides that can help you start your Quasi-experimental research. These include free online resources and resources available only through ISU Library.

  • Quasi-Experimental Research Designs by Bruce A. Thyer This pocket guide describes the logic, design, and conduct of the range of quasi-experimental designs, encompassing pre-experiments, quasi-experiments making use of a control or comparison group, and time-series designs. An introductory chapter describes the valuable role these types of studies have played in social work, from the 1930s to the present. Subsequent chapters delve into each design type's major features, the kinds of questions it is capable of answering, and its strengths and limitations.
  • Experimental and Quasi-Experimental Designs for Research by Donald T. Campbell; Julian C. Stanley. Call Number: Q175 C152e. Written in 1967 but still heavily used today, this book examines research designs for experimental and quasi-experimental research, with examples and judgments about each design's validity.

Online Resources

  • Quasi-Experimental Design From the Web Center for Social Research Methods, this is a very good overview of quasi-experimental design.
  • Experimental and Quasi-Experimental Research From Colorado State University.
  • Quasi-experimental design--Wikipedia, the free encyclopedia Wikipedia can be a useful place to start your research; check the citations at the bottom of the article for more information.

Quasi-Experimental Design: Definition, Types, Examples

Ever wondered how researchers uncover cause-and-effect relationships in the real world, where controlled experiments are often elusive? Quasi-experimental design holds the key. In this guide, we'll unravel the intricacies of quasi-experimental design, shedding light on its definition, purpose, and applications across various domains. Whether you're a student, a professional, or simply curious about the methods behind meaningful research, join us as we delve into the world of quasi-experimental design, making complex concepts sound simple and embarking on a journey of knowledge and discovery.

What is Quasi-Experimental Design?

Quasi-experimental design is a research methodology used to study the effects of independent variables on dependent variables when full experimental control is not possible or ethical. It falls between controlled experiments, where variables are tightly controlled, and purely observational studies, where researchers have little control over variables. Quasi-experimental design mimics some aspects of experimental research but lacks randomization.

The primary purpose of quasi-experimental design is to investigate cause-and-effect relationships between variables in real-world settings. Researchers use this approach to answer research questions, test hypotheses, and explore the impact of interventions or treatments when they cannot employ traditional experimental methods. Quasi-experimental studies aim to maximize internal validity and make meaningful inferences while acknowledging practical constraints and ethical considerations.

Quasi-Experimental vs. Experimental Design

It's essential to understand the distinctions between Quasi-Experimental and Experimental Design to appreciate the unique characteristics of each approach:

  • Randomization:  In Experimental Design, random assignment of participants to groups is a defining feature. Quasi-experimental design, on the other hand, lacks randomization due to practical constraints or ethical considerations.
  • Control Groups:  Experimental Design typically includes control groups that are subjected to no treatment or a placebo. The quasi-experimental design may have comparison groups but lacks the same level of control.
  • Manipulation of IV:  Experimental Design involves the intentional manipulation of the independent variable. Quasi-experimental design often deals with naturally occurring independent variables.
  • Causal Inference:  Experimental Design allows for stronger causal inferences due to randomization and control. Quasi-experimental design permits causal inferences but with some limitations.

When to Use Quasi-Experimental Design?

A quasi-experimental design is particularly valuable in several situations:

  • Ethical Constraints:  When manipulating the independent variable is ethically unacceptable or impractical, quasi-experimental design offers an alternative: studying naturally occurring variables.
  • Real-World Settings:  When researchers want to study phenomena in real-world contexts, quasi-experimental design allows them to do so without artificial laboratory settings.
  • Limited Resources:  In cases where resources are limited and conducting a controlled experiment is cost-prohibitive, quasi-experimental design can provide valuable insights.
  • Policy and Program Evaluation:  Quasi-experimental design is commonly used in evaluating the effectiveness of policies, interventions, or programs that cannot be randomly assigned to participants.

Importance of Quasi-Experimental Design in Research

Quasi-experimental design plays a vital role in research for several reasons:

  • Addressing Real-World Complexities:  It allows researchers to tackle complex real-world issues where controlled experiments are not feasible. This bridges the gap between controlled experiments and purely observational studies.
  • Ethical Research:  It provides an ethical approach when manipulating variables or assigning treatments could harm participants or violate ethical standards.
  • Policy and Practice Implications:  Quasi-experimental studies generate findings with direct applications in policy-making and practical solutions in fields such as education, healthcare, and social sciences.
  • Enhanced External Validity:  Findings from Quasi-Experimental research often have high external validity, making them more applicable to broader populations and contexts.

By embracing the challenges and opportunities of quasi-experimental design, researchers can contribute valuable insights to their respective fields and drive positive changes in the real world.

Key Concepts in Quasi-Experimental Design

In quasi-experimental design, it's essential to grasp the fundamental concepts underpinning this research methodology. Let's explore these key concepts in detail.

Independent Variable

The independent variable (IV) is the factor you aim to study or manipulate in your research. Unlike controlled experiments, where you can directly manipulate the IV, quasi-experimental design often deals with naturally occurring variables. For example, if you're investigating the impact of a new teaching method on student performance, the teaching method is your independent variable.

Dependent Variable

The dependent variable (DV) is the outcome or response you measure to assess the effects of changes in the independent variable. Continuing with the teaching method example, the dependent variable would be the students' academic performance, typically measured using test scores, grades, or other relevant metrics.

Control Groups vs. Comparison Groups

While quasi-experimental design lacks the luxury of randomly assigning participants to control and experimental groups, you can still establish comparison groups to make meaningful inferences. Control groups consist of individuals who do not receive the treatment, while comparison groups are exposed to different levels or variations of the treatment. These groups help researchers gauge the effect of the independent variable.

Pre-Test and Post-Test Measures

In quasi-experimental design, it's common practice to collect data both before and after implementing the independent variable. The initial data (pre-test) serves as a baseline, allowing you to measure changes over time (post-test). This approach helps assess the impact of the independent variable more accurately. For instance, if you're studying the effectiveness of a new drug, you'd measure patients' health before administering the drug (pre-test) and afterward (post-test).
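
As a minimal sketch of how pre- and post-test measures are compared within one group, the following hypothetical Python example runs a paired-samples t-test; the baseline values and the size of the change are invented for illustration.

```python
# Sketch: pre- to post-test change within a single group (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

pre = rng.normal(loc=60, scale=8, size=40)         # baseline (pre-test)
post = pre + rng.normal(loc=5, scale=4, size=40)   # after the intervention

# Paired test: each participant serves as their own comparison.
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"mean change = {(post - pre).mean():.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```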

Threats to Internal Validity

Internal validity is crucial for establishing a cause-and-effect relationship between the independent and dependent variables. However, in a quasi-experimental design, several threats can compromise internal validity. These threats include:

  • Selection Bias:  When non-randomized groups differ systematically in ways that affect the study's outcome.
  • History Effects:  External events or changes over time that influence the results.
  • Maturation Effects:  Natural changes or developments that occur within participants during the study.
  • Regression to the Mean:  The tendency for extreme scores on a variable to move closer to the mean upon retesting.
  • Attrition and Mortality:  The loss of participants over time, potentially skewing the results.
  • Testing Effects:  The mere act of testing or assessing participants can impact their subsequent performance.

Understanding these threats is essential for designing and conducting Quasi-Experimental studies that yield valid and reliable results.

Randomization and Non-Randomization

In traditional experimental designs, randomization is a powerful tool for ensuring that groups are equivalent at the outset of a study. However, quasi-experimental design often involves non-randomization due to the nature of the research. This means that participants are not randomly assigned to treatment and control groups. Instead, researchers must employ various techniques to minimize biases and ensure that the groups are as similar as possible.

For example, if you are conducting a study on the effects of a new teaching method in a real classroom setting, you cannot randomly assign students to the treatment and control groups. Instead, you might use statistical methods to match students based on relevant characteristics such as prior academic performance or socioeconomic status. This matching process helps control for potential confounding variables, increasing the validity of your study.
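
A matching step like the one just described can be sketched in a few lines of Python. Everything below is hypothetical: the variable `prior_score` and the group sizes are invented, and real studies typically match on several covariates at once.

```python
# Sketch: nearest-neighbor matching on a single covariate, without
# replacement (hypothetical data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
treated = pd.DataFrame({"prior_score": rng.normal(70, 10, 30)})
controls = pd.DataFrame({"prior_score": rng.normal(65, 10, 100)})

matched_rows = []
available = controls.copy()
for score in treated["prior_score"]:
    # Pick the remaining control whose prior score is closest.
    idx = (available["prior_score"] - score).abs().idxmin()
    matched_rows.append(available.loc[idx])
    available = available.drop(idx)   # each control is used at most once

matched_controls = pd.DataFrame(matched_rows)
print(matched_controls["prior_score"].describe())
```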

Types of Quasi-Experimental Designs

In quasi-experimental design, researchers employ various approaches to investigate causal relationships and study the effects of independent variables when complete experimental control is challenging. Let's explore these types of quasi-experimental designs.

One-Group Posttest-Only Design

The One-Group Posttest-Only Design is one of the simplest forms of quasi-experimental design. In this design, a single group is exposed to the independent variable, and data is collected only after the intervention has taken place. Unlike controlled experiments, there is no comparison group. This design is useful when researchers cannot administer a pre-test or when it is logistically difficult to do so.

Example: Suppose you want to assess the effectiveness of a new time management seminar. You offer the seminar to a group of employees and measure their productivity levels immediately afterward to determine if there's an observable impact.

One-Group Pretest-Posttest Design

Similar to the One-Group Posttest-Only Design, this approach includes a pre-test measure in addition to the post-test. Researchers collect data both before and after the intervention. By comparing the pre-test and post-test results within the same group, you can gain a better understanding of the changes that occur due to the independent variable.

Example: If you're studying the impact of a stress management program on participants' stress levels, you would measure their stress levels before the program (pre-test) and after completing the program (post-test) to assess any changes.

Non-Equivalent Groups Design

The Non-Equivalent Groups Design involves multiple groups, but they are not randomly assigned. Instead, researchers must carefully match or control for relevant variables to minimize biases. This design is particularly useful when random assignment is not possible or ethical.

Example: Imagine you're examining the effectiveness of two teaching methods in two different schools. You can't randomly assign students to the schools, but you can carefully match them based on factors like age, prior academic performance, and socioeconomic status to create equivalent groups.

Time Series Design

Time Series Design is an approach where data is collected at multiple time points before and after the intervention. This design allows researchers to analyze trends and patterns over time, providing valuable insights into the sustained effects of the independent variable.

Example: If you're studying the impact of a new marketing campaign on product sales, you would collect sales data at regular intervals (e.g., monthly) before and after the campaign's launch to observe any long-term trends.

Regression Discontinuity Design

Regression Discontinuity Design is employed when participants are assigned to different groups based on a specific cutoff score or threshold. This design is often used in educational and policy research to assess the effects of interventions near a cutoff point.

Example: Suppose you're evaluating the impact of a scholarship program on students' academic performance. Students who score at or above a certain GPA threshold receive the scholarship, while those just below it do not. Comparing students on either side of the threshold helps assess the program's effectiveness at the cutoff point.
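
To make the logic concrete, here is a hypothetical Python sketch of a sharp regression discontinuity analysis around a GPA cutoff; the cutoff, the simulated jump of 0.3, and all data are invented for illustration.

```python
# Sketch: sharp regression discontinuity around a cutoff (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
gpa = rng.uniform(2.0, 4.0, 500)
cutoff = 3.0
treated = (gpa >= cutoff).astype(float)   # scholarship assigned above cutoff
centered = gpa - cutoff

# Hypothetical outcome: a baseline trend plus a 0.3-point jump at the cutoff.
outcome = 1.0 + 0.5 * centered + 0.3 * treated + rng.normal(0, 0.2, 500)

# Fit separate slopes on each side; the `treated` coefficient estimates
# the effect at the threshold.
X = sm.add_constant(np.column_stack([centered, treated, centered * treated]))
model = sm.OLS(outcome, X).fit()
print(model.params)   # params[2] ~ treatment effect at the cutoff
```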

Propensity Score Matching

Propensity Score Matching is a technique used to create comparable treatment and control groups in non-randomized studies. Researchers calculate propensity scores based on participants' characteristics and match individuals in the treatment group to those in the control group with similar scores.

Example: If you're studying the effects of a new medication on patient outcomes, you would use propensity scores to match patients who received the medication with those who did not but have similar health profiles.
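
A bare-bones version of this procedure can be sketched in Python as below; the covariates (`age`, `severity`), the assignment model, and the data are all hypothetical, and a real analysis would also check covariate balance after matching.

```python
# Sketch: propensity score matching (hypothetical data). A logistic model
# predicts treatment from covariates; each treated unit is matched to the
# control with the nearest propensity score (with replacement).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 400
age = rng.normal(50, 10, n)
severity = rng.normal(0, 1, n)
X = np.column_stack([age, severity])

# Treatment probability depends on the covariates (non-random assignment).
p_treat = 1 / (1 + np.exp(-(0.03 * (age - 50) + 0.8 * severity)))
treated = rng.binomial(1, p_treat)

scores = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

treated_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
matches = {
    i: control_idx[np.argmin(np.abs(scores[control_idx] - scores[i]))]
    for i in treated_idx
}
print(f"matched {len(matches)} treated units to controls")
```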

Interrupted Time Series Design

The Interrupted Time Series Design involves collecting data at multiple time points before and after the introduction of an intervention. However, in this design, the intervention occurs at a specific point in time, allowing researchers to assess its immediate impact.

Example: Let's say you're analyzing the effects of a new traffic management system on traffic accidents. You collect accident data before and after the system's implementation to observe any abrupt changes right after its introduction.
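
A common way to analyze such a series is segmented regression, sketched below on hypothetical monthly accident counts; the series length, intervention month, and effect size are invented for illustration.

```python
# Sketch: interrupted time series via segmented regression (hypothetical
# monthly accident counts). The `after` term captures the immediate level
# shift at the intervention; the interaction captures any slope change.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
months = np.arange(48)
intervention_month = 24
after = (months >= intervention_month).astype(float)

# Hypothetical series: slow upward drift, then a drop when the new
# traffic system is introduced.
accidents = 100 + 0.5 * months - 15 * after + rng.normal(0, 3, 48)

X = sm.add_constant(np.column_stack([
    months,                                   # pre-existing trend
    after,                                    # level change at intervention
    after * (months - intervention_month),    # trend change after intervention
]))
fit = sm.OLS(accidents, X).fit()
print(fit.params)  # params[2] ~ immediate level change
```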

Each of these quasi-experimental designs offers unique advantages and is best suited to specific research questions and scenarios. Choosing the right design is crucial for conducting robust and informative studies.

Advantages and Disadvantages of Quasi-Experimental Design

Quasi-experimental design offers a valuable research approach, but like any methodology, it comes with its own set of advantages and disadvantages. Let's explore these in detail.

Quasi-Experimental Design Advantages

Quasi-experimental design presents several advantages that make it a valuable tool in research:

  • Real-World Applicability:  Quasi-experimental studies often take place in real-world settings, making the findings more applicable to practical situations. Researchers can examine the effects of interventions or variables in the context where they naturally occur.
  • Ethical Considerations:  In situations where manipulating the independent variable in a controlled experiment would be unethical, quasi-experimental design provides an ethical alternative. For example, it would be unethical to assign participants to smoke for a study on the health effects of smoking, but you can study naturally occurring groups of smokers and non-smokers.
  • Cost-Efficiency:  Conducting Quasi-Experimental research is often more cost-effective than conducting controlled experiments. The absence of controlled environments and extensive manipulations can save both time and resources.

These advantages make quasi-experimental design an attractive choice for researchers facing practical or ethical constraints in their studies.

Quasi-Experimental Design Disadvantages

However, quasi-experimental design also comes with its share of challenges and disadvantages:

  • Limited Control:  Unlike controlled experiments, where researchers have full control over variables, quasi-experimental design lacks the same level of control. This limited control can result in confounding variables that make it difficult to establish causality.
  • Threats to Internal Validity:  Various threats to internal validity, such as selection bias, history effects, and maturation effects, can compromise the accuracy of causal inferences. Researchers must carefully address these threats to ensure the validity of their findings.
  • Causality Inference Challenges:  Establishing causality can be challenging in quasi-experimental design due to the absence of randomization and control. While you can make strong arguments for causality, it may not be as conclusive as in controlled experiments.
  • Potential Confounding Variables:  In a quasi-experimental design, it's often challenging to control for all possible confounding variables that may affect the dependent variable. This can lead to uncertainty in attributing changes solely to the independent variable.

Despite these disadvantages, quasi-experimental design remains a valuable research tool when used judiciously and with a keen awareness of its limitations. Researchers should carefully consider their research questions and the practical constraints they face before choosing this approach.

How to Conduct a Quasi-Experimental Study?

Conducting a Quasi-Experimental study requires careful planning and execution to ensure the validity of your research. Let's dive into the essential steps you need to follow when conducting such a study.

1. Define Research Questions and Objectives

The first step in any research endeavor is clearly defining your research questions and objectives. This involves identifying the independent variable (IV) and the dependent variable (DV) you want to study. What is the specific relationship you want to explore, and what do you aim to achieve with your research?

  • Specify Your Research Questions:  Start by formulating precise research questions that your study aims to answer. These questions should be clear, focused, and relevant to your field of study.
  • Identify the Independent Variable:  Define the variable you intend to manipulate or study in your research. Understand its significance in your study's context.
  • Determine the Dependent Variable:  Identify the outcome or response variable that will be affected by changes in the independent variable.
  • Establish Hypotheses (If Applicable):  If you have specific hypotheses about the relationship between the IV and DV, state them clearly. Hypotheses provide a framework for testing your research questions.

2. Select the Appropriate Quasi-Experimental Design

Choosing the right quasi-experimental design is crucial for achieving your research objectives. Select a design that aligns with your research questions and the available data. Consider factors such as the feasibility of implementing the design and the ethical considerations involved.

  • Evaluate Your Research Goals:  Assess your research questions and objectives to determine which type of quasi-experimental design is most suitable. Each design has its strengths and limitations, so choose one that aligns with your goals.
  • Consider Ethical Constraints:  Take into account any ethical concerns related to your research. Depending on your study's context, some designs may be more ethically sound than others.
  • Assess Data Availability:  Ensure you have access to the necessary data for your chosen design. Some designs may require extensive historical data, while others may rely on data collected during the study.

3. Identify and Recruit Participants

Selecting the right participants is a critical aspect of Quasi-Experimental research. The participants should represent the population you want to make inferences about, and you must address ethical considerations, including informed consent.

  • Define Your Target Population:  Determine the population that your study aims to generalize to. Your sample should be representative of this population.
  • Recruitment Process:  Develop a plan for recruiting participants. Depending on your design, you may need to reach out to specific groups or institutions.
  • Informed Consent:  Ensure that you obtain informed consent from participants. Clearly explain the nature of the study, potential risks, and their rights as participants.

4. Collect Data

Data collection is a crucial step in Quasi-Experimental research. You must adhere to a consistent and systematic process to gather relevant information before and after the intervention or treatment.

  • Pre-Test Measures:  If applicable, collect data before introducing the independent variable. Ensure that the pre-test measures are standardized and reliable.
  • Post-Test Measures:  After the intervention, collect post-test data using the same measures as the pre-test. This allows you to assess changes over time.
  • Maintain Data Consistency:  Ensure that data collection procedures are consistent across all participants and time points to minimize biases.

5. Analyze Data

Once you've collected your data, it's time to analyze it using appropriate statistical techniques. The choice of analysis depends on your research questions and the type of data you've gathered.

  • Statistical Analysis:  Use statistical software to analyze your data. Common techniques include t-tests, analysis of variance (ANOVA), regression analysis, and more, depending on the design and variables (a brief sketch follows this list).
  • Control for Confounding Variables:  Be aware of potential confounding variables and include them in your analysis as covariates to ensure accurate results.
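
As a minimal illustration of the techniques named above, the sketch below runs a one-way ANOVA across three hypothetical groups; the group means and sizes are invented.

```python
# Sketch: one-way ANOVA across three hypothetical groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
g1 = rng.normal(70, 8, 40)
g2 = rng.normal(74, 8, 40)
g3 = rng.normal(69, 8, 40)

f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```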

6. Interpret Results

With the analysis complete, you can interpret the results to draw meaningful conclusions about the relationship between the independent and dependent variables.

  • Examine Effect Sizes:  Assess the magnitude of the observed effects to determine their practical significance.
  • Consider Significance Levels:  Determine whether the observed results are statistically significant. Understand the p-values and their implications.
  • Compare Findings to Hypotheses:  Evaluate whether your findings support or reject your hypotheses and research questions.

7. Draw Conclusions

Based on your analysis and interpretation of the results, draw conclusions about the research questions and objectives you set out to address.

  • Causal Inferences:  Discuss the extent to which your study allows for causal inferences. Be transparent about the limitations and potential alternative explanations for your findings.
  • Implications and Applications:  Consider the practical implications of your research. How do your findings contribute to existing knowledge, and how can they be applied in real-world contexts?
  • Future Research:  Identify areas for future research and potential improvements in study design. Highlight any limitations or constraints that may have affected your study's outcomes.

By following these steps meticulously, you can conduct a rigorous and informative Quasi-Experimental study that advances knowledge in your field of research.

Quasi-Experimental Design Examples

Quasi-experimental design finds applications in a wide range of research domains, including business-related and market research scenarios. Below, we delve into some detailed examples of how this research methodology is employed in practice:

Example 1: Assessing the Impact of a New Marketing Strategy

Suppose a company wants to evaluate the effectiveness of a new marketing strategy aimed at boosting sales. Conducting a controlled experiment may not be feasible due to the company's existing customer base and the challenge of randomly assigning customers to different marketing approaches. In this scenario, a quasi-experimental design can be employed.

  • Independent Variable:  The new marketing strategy.
  • Dependent Variable:  Sales revenue.
  • Design:  The company could implement the new strategy for one group of customers while maintaining the existing strategy for another group. Both groups are selected based on similar demographics and purchase history, reducing selection bias. Pre-implementation data (sales records) can serve as the baseline, and post-implementation data can be collected to assess the strategy's impact.
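
One simple way to analyze this design is a difference-in-differences comparison, sketched below with hypothetical sales figures; the group sizes and the simulated effect are invented for illustration.

```python
# Sketch: difference-in-differences on hypothetical pre/post sales data.
import numpy as np

rng = np.random.default_rng(6)
pre_new, post_new = rng.normal(100, 10, 200), rng.normal(112, 10, 200)   # new strategy
pre_old, post_old = rng.normal(100, 10, 200), rng.normal(104, 10, 200)   # existing strategy

# Subtracting the comparison group's change removes trends shared by both groups.
did = (post_new.mean() - pre_new.mean()) - (post_old.mean() - pre_old.mean())
print(f"difference-in-differences estimate: {did:.1f}")
```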

Example 2: Evaluating the Effectiveness of Employee Training Programs

In the context of human resources and employee development, organizations often seek to evaluate the impact of training programs. A randomized controlled trial (RCT) with random assignment may not be practical or ethical, as some employees may need specific training more than others. Instead, a quasi-experimental design can be employed.

  • Independent Variable:  Employee training programs.
  • Dependent Variable:  Employee performance metrics, such as productivity or quality of work.
  • Design:  The organization can offer training programs to employees who express interest or demonstrate specific needs, creating a self-selected treatment group. A comparable control group can consist of employees with similar job roles and qualifications who did not receive the training. Pre-training performance metrics can serve as the baseline, and post-training data can be collected to assess the impact of the training programs.

Example 3: Analyzing the Effects of a Tax Policy Change

In economics and public policy, researchers often examine the effects of tax policy changes on economic behavior. Conducting a controlled experiment in such cases is practically impossible. Therefore, a quasi-experimental design is commonly employed.

  • Independent Variable:  Tax policy changes (e.g., tax rate adjustments).
  • Dependent Variable:  Economic indicators, such as consumer spending or business investments.
  • Design:  Researchers can analyze data from different regions or jurisdictions where tax policy changes have been implemented. One region could represent the treatment group (with tax policy changes), while a similar region with no tax policy changes serves as the control group. By comparing economic data before and after the policy change in both groups, researchers can assess the impact of the tax policy changes.

These examples illustrate how quasi-experimental design can be applied in various research contexts, providing valuable insights into the effects of independent variables in real-world scenarios where controlled experiments are not feasible or ethical. By carefully selecting comparison groups and controlling for potential biases, researchers can draw meaningful conclusions and inform decision-making processes.

How to Publish Quasi-Experimental Research?

Publishing your Quasi-Experimental research findings is a crucial step in contributing to the academic community's knowledge. We'll explore the essential aspects of reporting and publishing your Quasi-Experimental research effectively.

Structuring Your Research Paper

When preparing your research paper, it's essential to adhere to a well-structured format to ensure clarity and comprehensibility. Here are key elements to include:

Title and Abstract

  • Title:  Craft a concise and informative title that reflects the essence of your study. It should capture the main research question or hypothesis.
  • Abstract:  Summarize your research in a structured abstract, including the purpose, methods, results, and conclusions. Ensure it provides a clear overview of your study.

Introduction

  • Background and Rationale:  Provide context for your study by discussing the research gap or problem your study addresses. Explain why your research is relevant and essential.
  • Research Questions or Hypotheses:  Clearly state your research questions or hypotheses and their significance.

Literature Review

  • Review of Related Work:  Discuss relevant literature that supports your research. Highlight studies with similar methodologies or findings and explain how your research fits within this context.

Methodology

  • Participants:  Describe your study's participants, including their characteristics and how you recruited them.
  • Quasi-Experimental Design:  Explain your chosen design in detail, including the independent and dependent variables, procedures, and any control measures taken.
  • Data Collection:  Detail the data collection methods, instruments used, and any pre-test or post-test measures.
  • Data Analysis:  Describe the statistical techniques employed, including any control for confounding variables.

Results

  • Presentation of Findings:  Present your results clearly, using tables, graphs, and descriptive statistics where appropriate. Include p-values and effect sizes, if applicable.
  • Interpretation of Results:  Discuss the implications of your findings and how they relate to your research questions or hypotheses.

Discussion

  • Interpretation and Implications:  Analyze your results in the context of existing literature and theories. Discuss the practical implications of your findings.
  • Limitations:  Address the limitations of your study, including potential biases or threats to internal validity.
  • Future Research:  Suggest areas for future research and how your study contributes to the field.

Ethical Considerations in Reporting

Ethical reporting is paramount in Quasi-Experimental research. Ensure that you adhere to ethical standards, including:

  • Informed Consent:  Clearly state that informed consent was obtained from all participants, and describe the informed consent process.
  • Protection of Participants:  Explain how you protected the rights and well-being of your participants throughout the study.
  • Confidentiality:  Detail how you maintained privacy and anonymity, especially when presenting individual data.
  • Disclosure of Conflicts of Interest:  Declare any potential conflicts of interest that could influence the interpretation of your findings.

Common Pitfalls to Avoid

When reporting your Quasi-Experimental research, watch out for common pitfalls that can diminish the quality and impact of your work:

  • Overgeneralization:  Be cautious not to overgeneralize your findings. Clearly state the limits of your study and the populations to which your results can be applied.
  • Misinterpretation of Causality:  Clearly articulate the limitations in inferring causality in Quasi-Experimental research. Avoid making strong causal claims unless supported by solid evidence.
  • Ignoring Ethical Concerns:  Ethical considerations are paramount. Failing to report on informed consent, ethical oversight, and participant protection can undermine the credibility of your study.

Guidelines for Transparent Reporting

To enhance the transparency and reproducibility of your Quasi-Experimental research, consider adhering to established reporting guidelines, such as:

  • CONSORT Statement:  If your study involves interventions or treatments, follow the CONSORT guidelines for transparent reporting of randomized controlled trials.
  • STROBE Statement:  For observational studies, the STROBE statement provides guidance on reporting essential elements.
  • PRISMA Statement:  If your research involves systematic reviews or meta-analyses, adhere to the PRISMA guidelines.
  • Transparent Reporting of Evaluations with Non-Randomized Designs (TREND):  TREND guidelines offer specific recommendations for transparently reporting non-randomized designs, including Quasi-Experimental research.

By following these reporting guidelines and maintaining the highest ethical standards, you can contribute to the advancement of knowledge in your field and ensure the credibility and impact of your Quasi-Experimental research findings.

Quasi-Experimental Design Challenges

Conducting a Quasi-Experimental study can be fraught with challenges that may impact the validity and reliability of your findings. We'll take a look at some common challenges and provide strategies on how you can address them effectively.

Selection Bias

Challenge:  Selection bias occurs when non-randomized groups differ systematically in ways that affect the study's outcome. This bias can undermine the validity of your research, as it implies that the groups are not equivalent at the outset of the study.

Addressing Selection Bias:

  • Matching:  Employ matching techniques to create comparable treatment and control groups. Match participants based on relevant characteristics, such as age, gender, or prior performance, to balance the groups.
  • Statistical Controls:  Use statistical controls to account for differences between groups. Include covariates in your analysis to adjust for potential biases.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess how vulnerable your results are to selection bias. Explore different scenarios to understand the impact of potential bias on your conclusions.

History Effects

Challenge:  History effects refer to external events or changes over time that influence the study's results. These external factors can confound your research by introducing variables you did not account for.

Addressing History Effects:

  • Collect Historical Data:  Gather extensive historical data to understand trends and patterns that might affect your study. By having a comprehensive historical context, you can better identify and account for historical effects.
  • Control Groups:  Include control groups whenever possible. By comparing the treatment group's results to those of a control group, you can account for external influences that affect both groups equally.
  • Time Series Analysis:  If applicable, use time series analysis to detect and account for temporal trends. This method helps differentiate between the effects of the independent variable and external events.

Maturation Effects

Challenge:  Maturation effects occur when participants naturally change or develop throughout the study, independent of the intervention. These changes can confound your results, making it challenging to attribute observed effects solely to the independent variable.

Addressing Maturation Effects:

  • Randomization:  If possible, use randomization to distribute maturation effects evenly across treatment and control groups. Random assignment minimizes the impact of maturation as a confounding variable.
  • Matched Pairs:  If randomization is not feasible, employ matched pairs or statistical controls to ensure that both groups experience similar maturation effects.
  • Shorter Time Frames:  Limit the duration of your study to reduce the likelihood of significant maturation effects. Shorter studies are less susceptible to long-term maturation.

Regression to the Mean

Challenge:  Regression to the mean is the tendency for extreme scores on a variable to move closer to the mean upon retesting. This can create the illusion of an intervention's effectiveness when, in reality, it's a natural statistical phenomenon.

Addressing Regression to the Mean:

  • Use Control Groups:  Include control groups in your study to provide a baseline for comparison. This helps differentiate genuine intervention effects from regression to the mean.
  • Multiple Data Points:  Collect numerous data points to identify patterns and trends. If extreme scores regress to the mean in subsequent measurements, it may be indicative of regression to the mean rather than a true intervention effect.
  • Statistical Analysis:  Employ statistical techniques that account for regression to the mean when analyzing your data. Techniques like analysis of covariance (ANCOVA) can help control for baseline differences.
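
As a minimal sketch of the ANCOVA-style adjustment mentioned above, the hypothetical Python example below models post-test scores on treatment while controlling for pre-test scores; all values are invented for illustration.

```python
# Sketch: ANCOVA-style adjustment via OLS (hypothetical data). Modeling the
# post-test on treatment plus the pre-test guards against baseline
# differences and regression to the mean.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 120
pre = rng.normal(50, 10, n)
treat = rng.binomial(1, 0.5, n)
post = 10 + 0.8 * pre + 4.0 * treat + rng.normal(0, 5, n)

X = sm.add_constant(np.column_stack([treat, pre]))
fit = sm.OLS(post, X).fit()
print(fit.params)  # params[1] ~ treatment effect adjusted for baseline
```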

Attrition and Mortality

Challenge:  Attrition refers to the loss of participants over the course of your study, while mortality is the permanent loss of participants. High attrition rates can introduce biases and affect the representativeness of your sample.

Addressing Attrition and Mortality:

  • Careful Participant Selection:  Select participants who are likely to remain engaged throughout the study. Consider factors that may lead to attrition, such as participant motivation and commitment.
  • Incentives:  Provide incentives or compensation to participants to encourage their continued participation.
  • Follow-Up Strategies:  Implement effective follow-up strategies to reduce attrition. Regular communication and reminders can help keep participants engaged.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess the impact of attrition and mortality on your results. Compare the characteristics of participants who dropped out with those who completed the study.

Testing Effects

Challenge:  Testing effects occur when the mere act of testing or assessing participants affects their subsequent performance. This phenomenon can lead to changes in the dependent variable that are unrelated to the independent variable.

Addressing Testing Effects:

  • Counterbalance Testing:  If possible, counterbalance the order of tests or assessments between treatment and control groups. This helps distribute the testing effects evenly across groups.
  • Control Groups:  Include control groups subjected to the same testing or assessment procedures as the treatment group. By comparing the two groups, you can determine whether testing effects have influenced the results.
  • Minimize Testing Frequency:  Limit the frequency of testing or assessments to reduce the likelihood of testing effects. Conducting fewer assessments can mitigate the impact of repeated testing on participants.

By proactively addressing these common challenges, you can enhance the validity and reliability of your Quasi-Experimental study, making your findings more robust and trustworthy.

Quasi-experimental design is a powerful tool that helps researchers investigate cause-and-effect relationships in real-world situations where strict control is not always possible. By understanding the key concepts, types of designs, and how to address challenges, you can conduct robust research and contribute valuable insights to your field. Remember, quasi-experimental design bridges the gap between controlled experiments and purely observational studies, making it an essential approach in various fields, from business and market research to public policy and beyond. So, whether you're a researcher, student, or decision-maker, the knowledge of quasi-experimental design empowers you to make informed choices and drive positive changes in the world.

How to Supercharge Quasi-Experimental Design with Real-Time Insights?

Introducing Appinio, the real-time market research platform that transforms the world of quasi-experimental design. Imagine having the power to conduct your own market research in minutes, obtaining actionable insights that fuel your data-driven decisions. Appinio takes care of the research and tech complexities, freeing you to focus on what truly matters for your business.

Here's why Appinio stands out:

  • Lightning-Fast Insights:  From formulating questions to uncovering insights, Appinio delivers results in minutes, ensuring you get the answers you need when you need them.
  • No Research Degree Required:  Our intuitive platform is designed for everyone, eliminating the need for a PhD in research. Anyone can dive in and start harnessing the power of real-time consumer insights.
  • Global Reach, Local Expertise:  With access to over 90 countries and the ability to define precise target groups based on 1200+ characteristics, you can conduct Quasi-Experimental research on a global scale while maintaining a local touch.

Chapter 5: Experimental and Quasi-Experimental Designs

Case Study: The Impact of Teen Court

Research Study

An Experimental Evaluation of Teen Courts 1

Research Question

Is teen court more effective at reducing recidivism and improving attitudes than traditional juvenile justice processing?

Methodology

Researchers randomly assigned 168 juvenile offenders ages 11 to 17 from four different counties in Maryland to either teen court as experimental group members or to traditional juvenile justice processing as control group members. (Note: Discussion on the technical aspects of experimental designs, including random assignment, is found in detail later in this chapter.) Of the 168 offenders, 83 were assigned to teen court and 85 were assigned to regular juvenile justice processing through random assignment. Of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study.

Upon assignment to teen court or regular juvenile justice processing, all offenders entered their respective sanction. Approximately four months later, offenders in both the experimental group (teen court) and the control group (regular juvenile justice processing) were asked to complete a post-test survey inquiring about a variety of behaviors (frequency of drug use, delinquent behavior, variety of drug use) and attitudinal measures (social skills, rebelliousness, neighborhood attachment, belief in conventional rules, and positive self-concept). The study researchers also collected official re-arrest data for 18 months starting at the time of offender referral to juvenile justice authorities.

Teen court participants self-reported higher levels of delinquency than those processed through regular juvenile justice processing. According to official re-arrests, teen court youth were re-arrested at a higher rate and incurred a higher average number of total arrests than the control group. Teen court offenders also reported significantly lower scores on survey items designed to measure their “belief in conventional rules” compared to offenders processed through regular juvenile justice avenues. Other attitudinal and opinion measures did not differ significantly between the experimental and control group members based on their post-test responses. In sum, those youth randomly assigned to teen court fared worse than control group members who were not randomly assigned to teen court.

Limitations with the Study Procedure

Limitations are inherent in any research study and those research efforts that utilize experimental designs are no exception. It is important to consider the potential impact that a limitation of the study procedure could have on the results of the study.

In the current study, one potential limitation is that teen courts from four different counties in Maryland were utilized. Because of the diversity in teen court sites, it is possible that there were differences in procedure between the four teen courts and such differences could have impacted the outcomes of this study. For example, perhaps staff members at one teen court were more punishment-oriented than staff members at the other county teen courts. This philosophical difference may have affected treatment delivery and hence experimental group members’ belief in conventional attitudes and recidivism. Although the researchers monitored each teen court to help ensure treatment consistency between study sites, it is possible that differences existed in the day-to-day operation of the teen courts that may have affected participant outcomes. This same limitation might also apply to control group members who were sanctioned with regular juvenile justice processing in four different counties.

A researcher must also consider the potential for differences between the experimental and control group members. Although the offenders were randomly assigned to the experimental or control group, and the assumption is that the groups were equivalent to each other prior to program participation, the researchers in this study were only able to compare the experimental and control groups on four variables: age, school grade, gender, and race. It is possible that the experimental and control group members differed by chance on one or more factors not measured or available to the researchers. For example, perhaps a large number of teen court members experienced problems at home that can explain their more dismal post-test results compared to control group members without such problems. A larger sample of juvenile offenders would likely have helped to minimize any differences between the experimental and control group members. The collection of additional information from study participants would have also allowed researchers to be more confident that the experimental and control group members were equivalent on key pieces of information that could have influenced recidivism and participant attitudes.

Finally, while 168 juvenile offenders were randomly assigned to either the experimental or control group, not all offenders agreed to participate in the evaluation. Remember that of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study. While this limitation is unavoidable, it still could have influenced the study. Perhaps those 27 offenders who declined to participate in the teen court group differed significantly from the 56 who agreed to participate. If so, it is possible that the differences among those two groups could have impacted the results of the study. For example, perhaps the 27 youths who were randomly assigned to teen court but did not agree to be a part of the study were some of the least risky of potential teen court participants—less serious histories, better attitudes to begin with, and so on. In this case, perhaps the most risky teen court participants agreed to be a part of the study, and as a result of being more risky, this led to more dismal delinquency outcomes compared to the control group at the end of each respective program. Because parental consent was required for the study authors to be able to compare those who declined to participate in the study to those who agreed, it is unknown if the participants and nonparticipants differed significantly on any variables among either the experimental or control group. Moreover, of the resulting 107 offenders who took part in the study, only 75 offenders accurately completed the post-test survey measuring offending and attitudinal outcomes.

Again, despite the experimental nature of this study, such limitations could have impacted the study results and must be considered.

Impact on Criminal Justice

Teen courts are generally designed to deal with nonserious first time offenders before they escalate to more serious and chronic delinquency. Innovative programs such as “Scared Straight” and juvenile boot camps have inspired an increase in teen court programs across the country, although there is little evidence regarding their effectiveness compared to traditional sanctions for youthful offenders. This study provides more specific evidence as to the effectiveness of teen courts relative to normal juvenile justice processing. Researchers learned that teen court participants fared worse than those in the control group. The potential labeling effects of teen court, including stigma among peers, especially where the offense may have been very minor, may be more harmful than doing less or nothing. The real impact of this study lies in the recognition that teen courts and similar sanctions for minor offenders may do more harm than good.

One important impact of this study is that it utilized an experimental design to evaluate the effectiveness of a teen court compared to traditional juvenile justice processing. Despite the study’s limitations, by using an experimental design it improved upon previous teen court evaluations by attempting to ensure any results were in fact due to the treatment, not some difference between the experimental and control group. This study also utilized both official and self-report measures of delinquency, in addition to self-report measures on such factors as self-concept and belief in conventional rules, which have been generally absent from teen court evaluations. The study authors also attempted to gauge the comparability of the experimental and control groups on factors such as age, gender, and race to help make sure study outcomes were attributable to the program, not the participants.

In This Chapter You Will Learn

The four components of experimental and quasi-experimental research designs and their function in answering a research question

The differences between experimental and quasi-experimental designs

The importance of randomization in an experimental design

The types of questions that can be answered with an experimental or quasi-experimental research design

About the three factors required for a causal relationship

That a relationship between two or more variables may appear causal, but may in fact be spurious, or explained by another factor

That experimental designs are relatively rare in criminal justice and why

About common threats to internal validity or alternative explanations to what may appear to be a causal relationship between variables

Why experimental designs are superior to quasi-experimental designs for eliminating or reducing the potential of alternative explanations

Introduction

The teen court evaluation that began this chapter is an example of an experimental design. The researchers of the study wanted to determine whether teen court was more effective at reducing recidivism and improving attitudes compared to regular juvenile justice case processing. In short, the researchers were interested in the relationship between variables —the relationship of teen court to future delinquency and other outcomes. When researchers are interested in whether a program, policy, practice, treatment, or other intervention impacts some outcome, they often utilize a specific type of research method/design called experimental design. Although there are many types of experimental designs, the foundation for all of them is the classic experimental design. This research design, and some typical variations of this experimental design, are the focus of this chapter.

Although the classic experiment may be appropriate to answer a particular research question, there are barriers that may prevent researchers from using this or another type of experimental design. In these situations, researchers may turn to quasi-experimental designs. Quasi-experiments include a group of research designs that are missing a key element found in the classic experiment and other experimental designs (hence the term “quasi” experiment). Despite this missing part, quasi-experiments are similar in structure to experimental designs and are used to answer similar types of research questions. This chapter will also focus on quasi-experiments and how they are similar to and different from experimental designs.

Uncovering the relationship between variables, such as the impact of teen court on future delinquency, is important in criminal justice and criminology, just as it is in other scientific disciplines such as education, biology, and medicine. Indeed, whereas criminal justice researchers may be interested in whether a teen court reduces recidivism or improves attitudes, medical field researchers may be concerned with whether a new drug reduces cholesterol, or an education researcher may be focused on whether a new teaching style leads to greater academic gains. Across these disciplines and topics of interest, the experimental design is appropriate. In fact, experimental designs are used in all scientific disciplines; the only thing that changes is the topic. Specific to criminal justice, below is a brief sampling of the types of questions that can be addressed using an experimental design:

Does participation in a correctional boot camp reduce recidivism?

What is the impact of an in-cell integration policy on inmate-on-inmate assaults in prisons?

Does police officer presence in schools reduce bullying?

Do inmates who participate in faith-based programming while in prison have a lower recidivism rate upon their release from prison?

Do police sobriety checkpoints reduce drunken driving fatalities?

What is the impact of a no-smoking policy in prisons on inmate-on-inmate assaults?

Does participation in a domestic violence intervention program reduce repeat domestic violence arrests?

A focus on the classic experimental design will demonstrate the usefulness of this research design for addressing criminal justice questions interested in cause and effect relationships. Particular attention is paid to the classic experimental design because it serves as the foundation for all other experimental and quasi-experimental designs, some of which are covered in this chapter. As a result, a clear understanding of the components, organization, and logic of the classic experimental design will facilitate an understanding of other experimental and quasi-experimental designs examined in this chapter. It will also allow the reader to better understand the results produced from those various designs, and importantly, what those results mean. It is a truism that the results of a research study are only as “good” as the design or method used to produce them. Therefore, understanding the various experimental and quasi-experimental designs is the key to becoming an informed consumer of research.

The Challenge of Establishing Cause and Effect

Researchers interested in explaining the relationship between variables, such as whether a treatment program impacts recidivism, are interested in causation or causal relationships. In a simple example, a causal relationship exists when X (independent variable) causes Y (dependent variable), and there are no other factors (Z) that can explain that relationship. For example, offenders who participated in a domestic violence intervention program (X–domestic violence intervention program) experienced fewer re-arrests (Y–re-arrests) than those who did not participate in the domestic violence program, and no other factor other than participation in the domestic violence program can explain these results. The classic experimental design is superior to other research designs in uncovering a causal relationship, if one exists. Before a causal relationship can be established, however, there are three conditions that must be met (see Figure 5.1). 2

FIGURE 5.1 | The Cause and Effect Relationship

Timing

The first condition for a causal relationship is timing. For a causal relationship to exist, it must be shown that the independent variable or cause (X) preceded the dependent variable or outcome (Y) in time. A decrease in domestic violence re-arrests (Y) cannot occur before participation in a domestic violence reduction program (X), if the domestic violence program is proposed to be the cause of fewer re-arrests. Ensuring that cause comes before effect is not sufficient to establish that a causal relationship exists, but it is one requirement that must be met for a causal relationship.

Association

In addition to timing, there must also be an observable association between X and Y, the second necessary condition for a causal relationship. Association is also commonly referred to as covariance or correlation. When an association or correlation exists, this means there is some pattern of relationship between X and Y—as X changes by increasing or decreasing, Y also changes by increasing or decreasing. Here, the notion of X and Y increasing or decreasing can mean an actual increase/decrease in the quantity of some factor, such as an increase/decrease in the number of prison terms or days in a program or re-arrests. It can also refer to an increase/decrease in a particular category, for example, from nonparticipation in a program to participation in a program. For instance, subjects who participated in a domestic violence reduction program (X) incurred fewer domestic violence re-arrests (Y) than those who did not participate in the program. In this example, X and Y are associated—as X changes or increases from nonparticipation to participation in the domestic violence program, Y or the number of re-arrests for domestic violence decreases.

Associations between X and Y can occur in two different directions: positive or negative. A positive association means that as X increases, Y increases, or, as X decreases, Y decreases. A negative association means that as X increases, Y decreases, or, as X decreases, Y increases. In the example above, the association is negative—participation in the domestic violence program was associated with a reduction in re-arrests. This is also sometimes called an inverse relationship.

Elimination of Alternative Explanations Although participation in a domestic violence program may be associated with a reduction in re-arrests, this does not mean for certain that participation in the program was the cause of reduced re-arrests. Just as timing by itself does not imply a causal relationship, association by itself does not imply a causal relationship. For example, instead of the program being the cause of a reduction in re-arrests, perhaps several of the program participants died shortly after completion of the domestic violence program and thus were not able to engage in domestic violence (and their deaths were unknown to the researcher tracking re-arrests). Perhaps a number of the program participants moved out of state and domestic violence re-arrests occurred but were not able to be uncovered by the researcher. Perhaps those in the domestic violence program experienced some other event, such as the trauma of a natural disaster, and that experience led to a reduction in domestic violence, an event not connected to the domestic violence program. If any of these situations occurred, it might appear that the domestic violence program led to fewer re-arrests. However, the observed reduction in re-arrests can actually be attributed to a factor unrelated to the domestic violence program.

The previous discussion leads to the third and final necessary consideration in determining a causal relationship— elimination of alternative explanations. This means that the researcher must rule out any other potential explanation of the results, except for the experimental condition such as a program, policy, or practice. Accounting for or ruling out alternative explanations is much more difficult than ensuring timing and association. Ruling out all alternative explanations is difficult because there are so many potential other explanations that can wholly or partly explain the findings of a research study. This is especially true in the social sciences, where researchers are often interested in relationships explaining human behavior. Because of this difficulty, associations by themselves are sometimes mistaken as causal relationships when in fact they are spurious. A spurious relationship is one where it appears that X and Y are causally related, but the relationship is actually explained by something other than the independent variable, or X.

One only needs to go so far as the daily newspaper to find headlines and stories of mere associations being mistaken, assumed, or represented as causal relationships. For example, a newspaper headline recently proclaimed “Churchgoers live longer.” 3 An uninformed consumer may interpret this headline as evidence of a causal relationship—that going to church by itself will lead to a longer life—but the astute consumer would note possible alternative explanations. For example, people who go to church may live longer because they tend to live healthier lifestyles and tend to avoid risky situations. These are two plausible alternative explanations independent of simply going to church. In another example, researchers David Kalist and Daniel Lee explored the relationship between first names and delinquent behavior in their manuscript titled “First Names and Crime: Does Unpopularity Spell Trouble?” 4 Kalist and Lee (2009) found that unpopular names are associated with juvenile delinquency. In other words, those individuals with the most unpopular names were more likely to be delinquent than those with more popular names. According to the authors, it is not necessarily someone’s name that leads to delinquent behavior; rather, the most unpopular names also tend to be correlated with individuals who come from disadvantaged home environments and experience low socio-economic status. As the authors rightly note, these alternative explanations help to explain the link between someone’s name and delinquent behavior—a link that is not causal.

A frequently cited example provides more insight into the claim that an association by itself is not sufficient to prove causality. In certain cities in the United States, for example, as ice cream sales increase on a particular day or in a particular month, so does the incidence of certain forms of crime. If this association were represented as a causal statement, it would be that ice cream sales cause crime. There is an association, no doubt, and let us assume that ice cream sales rose before the increase in crime (timing). Surely, however, this relationship between ice cream sales and crime is spurious. The alternative explanation is that ice cream sales and crime are associated in certain parts of the country because of the weather. Ice cream sales tend to increase in warmer temperatures, and it just so happens that certain forms of crime tend to increase in warmer temperatures as well. This coincidence or association does not mean a causal relationship exists. Additionally, this does not mean that warm temperatures cause crime either; there are plenty of other potential explanations for the association between warmer temperatures and certain forms of crime. 6 For another example of a study subject to alternative explanations, read the June 2011 news article titled “Less Crime in U.S. Thanks to Videogames.” 7 Based on your reading, what are some other potential explanations for the crime drop other than videogames?

The preceding examples demonstrate how timing and association can be present, but the final needed condition for a causal relationship is that all alternative explanations are ruled out. While this task is difficult, the classic experimental design helps to ensure these additional explanatory factors are minimized. When other designs are used, such as quasi-experimental designs, the chance that alternative explanations emerge is greater. This potential should become clearer as we explore the organization and logic of the classic experimental design.

CLASSICS IN CJ RESEARCH

Minneapolis Domestic Violence Experiment

The Minneapolis Domestic Violence Experiment (MDVE) 5

Which police action (arrest, separation, or mediation) is most effective at deterring future misdemeanor domestic violence?

The experiment began on March 17, 1981, and continued until August 1, 1982. The experiment was conducted in two of Minneapolis’s four police precincts—the two with the highest number of domestic violence reports and arrests. A total of 314 reports of misdemeanor domestic violence were handled by the police during this time frame.

This study utilized an experimental design with the random assignment of police actions. Each police officer involved in the study was given a pad of report forms. Upon a misdemeanor domestic violence call, the officer’s action (arrest, separation, or mediation) was predetermined by the order and color of report forms in the officer’s notebook. Colored report forms were randomly ordered in the officer’s notebook, and the color of the top form determined the officer’s response once at the scene. For example, after receiving a call for domestic violence, an officer would turn to his or her report pad to determine the action. If the top form was pink, the action was arrest. If on the next call the top form was a different color, an action other than arrest would occur. All colored report forms were randomly ordered through a lottery assignment method. The result is that all police officer responses to misdemeanor domestic violence calls were randomly assigned. To ensure the lottery procedure was properly carried out, research staff participated in ride-alongs with officers to verify that officers did not skip forms or deviate from their randomly assigned order. Research staff also made sure the reports were received in the order they were randomly assigned in the pad of report forms.
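The lottery logic of the MDVE can be illustrated with a short simulation. The Python sketch below is a minimal, hypothetical illustration—the color-to-action mapping, form counts, and function names are invented for this example and are not taken from the study itself. It simply shows how randomly ordering a pad of colored forms randomizes the sequence of police responses.

```python
import random

# Hypothetical color-to-action mapping, invented for illustration.
ACTIONS = {"pink": "arrest", "blue": "separation", "yellow": "mediation"}

def build_report_pad(forms_per_action=10, seed=None):
    """Randomly order a pad of colored report forms (a lottery assignment)."""
    rng = random.Random(seed)
    pad = [color for color in ACTIONS for _ in range(forms_per_action)]
    rng.shuffle(pad)  # random ordering determines the response to each call
    return pad

pad = build_report_pad(seed=42)
# On each misdemeanor domestic violence call, the officer takes the top form:
for call_number, color in enumerate(pad[:5], start=1):
    print(f"Call {call_number}: {color} form -> {ACTIONS[color]}")
```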

To examine the relationship of different officer responses to future domestic violence, the researchers examined official arrests of the suspects in a 6-month follow-up period. For example, the researchers examined those initially arrested for misdemeanor domestic violence and how many were subsequently arrested for domestic violence within a 6-month time frame. They followed the same procedure for the police actions of separation and mediation. The researchers also interviewed the victim(s) of each incident and asked if a repeat domestic violence incident occurred with the same suspect in the 6-month follow-up period. This allowed researchers to examine domestic violence offenses that may have occurred but did not come to the official attention of police. The researchers then compared official arrests for domestic violence to self-reported domestic violence after the experiment.

Suspects arrested for misdemeanor domestic violence, as opposed to situations where separation or mediation was used, were significantly less likely to engage in repeat domestic violence as measured by official arrest records and victim interviews during the 6-month follow-up period. According to official police records, 10% of those initially arrested engaged in repeat domestic violence in the follow-up period, 19% of those who initially received mediation engaged in repeat domestic violence, and 24% of those who randomly received separation engaged in repeat domestic violence. According to victim interviews, 19% of those initially arrested engaged in repeat domestic violence, compared to 37% for separation and 33% for mediation. The general conclusion of the experiment was that arrest was preferable to separation or mediation in deterring repeat domestic violence across both official police records and victim interviews.

A few issues that affected the random assignment procedure occurred throughout the study. First, some officers did not follow the randomly assigned action (arrest, separation, or mediation) as a result of other circumstances that occurred at the scene. For example, if the randomly assigned action was separation, but the suspect assaulted the police officer during the call, the officer might arrest the suspect. Second, some officers simply ignored the assigned action if they felt a particular call for domestic violence required another action. For example, if the action was mediation as indicated by the randomly assigned report form, but the officer felt the suspect should be arrested, he or she may have simply ignored the randomly assigned response and substituted his or her own. Third, some officers forgot their report pads and did not know the randomly assigned course of action to take upon a call of domestic violence. Fourth and finally, the police chief also allowed officers to deviate from the randomly assigned action in certain circumstances. In all of these situations, the random assignment procedures broke down.

The results of the MDVE had a rapid and widespread impact on law enforcement practice throughout the United States. Just two years after the release of the study, a 1986 telephone survey of 176 urban police departments serving cities with populations of 100,000 or more found that 46 percent of the departments preferred to make arrests in cases of minor domestic violence, largely due to the effectiveness of this practice in the Minneapolis Domestic Violence Experiment. 8

In an attempt to replicate the findings of the Minneapolis Domestic Violence Experiment, the National Institute of Justice sponsored the Spouse Assault Replication Program. Replication studies were conducted in Omaha, Charlotte, Milwaukee, Miami, and Colorado Springs from 1986–1991. In three of the five replications, offenders randomly assigned to the arrest group had higher levels of continued domestic violence in comparison to other police actions during domestic violence situations. 9 Therefore, rather than providing results that were consistent with the Minneapolis Domestic Violence Experiment, the results from the five replication experiments produced inconsistent findings about whether arrest deters domestic violence. 10

Despite the findings of the replications, the push to arrest domestic violence offenders has continued in law enforcement. Today many police departments require officers to make arrests in domestic violence situations. In agencies that do not mandate arrest, department policy typically states a strong preference toward arrest. State legislatures have also enacted laws impacting police actions regarding domestic violence. Twenty-one states have mandatory arrest laws while eight have pro-arrest statutes for domestic violence. 11

The Classic Experimental Design

Table 5.1 provides an illustration of the classic experimental design. 12 It is important to become familiar with the specific notation and organization of the classic experiment before a full discussion of its components and their purpose.

Major Components of the Classic Experimental Design

The classic experimental design has four major components:

1. Treatment

2. Experimental Group and Control Group

3. Pre-Test and Post-Test

4. Random Assignment

Treatment The first component of the classic experimental design is the treatment, and it is denoted by X in the classic experimental design. The treatment can be a number of things—a program, a new drug, or the implementation of a new policy. In a classic experimental design, the primary goal is to determine what effect, if any, a particular treatment had on some outcome. In this way, the treatment can also be considered the independent variable.

TABLE 5.1 | The Classic Experimental Design

Experimental Group:   R   O1   X   O2

Control Group:        R   O1        O2

Experimental Group = Group that receives the treatment

Control Group = Group that does not receive the treatment

R = Random assignment

O1 = Observation before the treatment, or the pre-test

X = Treatment, or the independent variable

O2 = Observation after the treatment, or the post-test

Experimental and Control Groups The second component of the classic experiment is an experimental group and a control group. The experimental group receives the treatment, and the control group does not receive the treatment. There will always be at least one group that receives the treatment in experimental and quasi-experimental designs. In some cases, experiments may have multiple experimental groups receiving multiple treatments.

Pre-Test and Post-Test The third component of the classic experiment is a pre-test and a post-test. A pre-test is a measure of the dependent variable or outcome before the treatment. The post-test is a measure of the dependent variable after the treatment is administered. It is important to note that the post-test is defined based on the stated goals of the program. For example, if the stated goal of a particular program is to reduce re-arrests, the post-test will be a measure of re-arrests after the program. The dependent variable also defines the pre-test. For example, if a researcher wanted to examine the impact of a domestic violence reduction program (treatment or X) on the goal of reducing re-arrests (dependent variable or Y), the pre-test would be the number of domestic violence arrests incurred before the program. A program may have numerous goals, and each can define a post-test and, hence, a pre-test. For example, perhaps a goal of the domestic violence program is also that participants learn of different pro-social ways to handle domestic conflicts other than resorting to violence. If researchers wanted to examine this goal, the post-test might be subjects’ level of knowledge about pro-social ways to handle domestic conflicts other than violence. The pre-test would then be subjects’ level of knowledge about these pro-social alternatives to violence before they received the treatment program.

Although all designs have a post-test, it is not always the case that designs have a pre-test. This is because researchers may not have access to, or be able to collect, the information that would constitute the pre-test. For example, researchers may not be able to determine subjects’ level of knowledge about alternatives to domestic violence before the intervention program if the subjects are already enrolled in the domestic violence intervention program. In other cases, there may be financial barriers to collecting pre-test information. In the teen court evaluation that started this chapter, for example, researchers were not able to collect pre-test information on study participants due to the financial strain it would have placed on the agencies involved in the study. 13 There are a number of potential reasons why a pre-test might not be available in a research study. The defining feature, however, is that the pre-test is determined by the post-test.

Random Assignment The fourth component of the classic experiment is random assignment. Random assignment refers to a process whereby members of the experimental group and control group are assigned to the two groups through a random and unbiased process. Random assignment should not be mistaken for random selection as discussed in Chapter 3. Random selection refers to selecting a smaller but representative sample from a larger population. For example, a researcher may randomly select a sample from a larger city population for the purposes of sending sample members a mail survey to determine their attitudes on crime. The goal of random selection in this example is to make sure the sample, although smaller in size than the population, accurately represents the larger population.

Random assignment, on the other hand, refers to the process of assigning subjects to either the experimental or control group with the goal that the groups are similar or equivalent to each other in every way (see Figure 5.2). The exception to this rule is that one group gets the treatment and the other does not (see discussion below on why equivalence is so important). Although the concept of random is similar in each, the goals are different between random selection and random assignment. 14 Experimental designs all feature random assignment, but this is not true of other research designs, in particular quasi-experimental designs.

FIGURE 5.2 | Random Assignment


The classic experimental design is the foundation for all other experimental and quasi-experimental designs because it retains all of the major components discussed above. As mentioned, sometimes designs do not have a pre-test, a control group, or random assignment. Because the pre-test, control group, and random assignment are so critical to the goal of uncovering a causal relationship, if one exists, we explore them further below.

The Logic of the Classic Experimental Design

Consider a research study using the classic experimental design where the goal is to determine if a domestic violence treatment program has any effect on re-arrests for domestic violence. The randomly assigned experimental and control groups are composed of persons who had previously been arrested for domestic violence. The pre-test is a measure of the number of domestic violence arrests before the program. This is because the goal of the study is to determine whether re-arrests change after the treatment. The post-test is the number of re-arrests following the treatment program.

Once randomly assigned, the experimental group members receive the domestic violence program, and the control group members do not. After the program, the researcher will compare the pre-test arrests for domestic violence of the experimental group to post-test arrests for domestic violence to determine if arrests increased, decreased, or remained constant since the start of the program. The researcher will also compare the post-test re-arrests for domestic violence between the experimental and control groups. With this example, we explore the usefulness of the classic experimental design, and the contribution of the pre-test, random assignment, and the control group to the goal of determining whether a domestic violence program reduces re-arrests.
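The comparisons just described can be made concrete with a short sketch. The Python example below is a minimal illustration only; the arrest counts are hypothetical, and a real evaluation would apply significance tests rather than inspecting raw mean differences.

```python
from statistics import mean

# Hypothetical pre-test and post-test arrest counts, invented for illustration.
experimental_pre  = [3, 2, 4, 1, 2, 3]   # arrests before the program
experimental_post = [1, 0, 2, 0, 1, 1]   # re-arrests after the program
control_pre       = [2, 3, 3, 2, 1, 4]
control_post      = [2, 2, 3, 1, 1, 3]

# Within-group change: did re-arrests increase, decrease, or remain constant?
exp_change = mean(experimental_post) - mean(experimental_pre)
ctl_change = mean(control_post) - mean(control_pre)

# Between-group comparison at the post-test.
post_difference = mean(experimental_post) - mean(control_post)

print(f"Experimental group change: {exp_change:+.2f}")
print(f"Control group change:      {ctl_change:+.2f}")
print(f"Post-test difference (experimental - control): {post_difference:+.2f}")
```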

The Pre-Test As a component of the classic experiment, the pre-test allows an examination of change in the dependent variable from before the domestic violence program to after the domestic violence program. In short, a pre-test allows the researcher to determine if re-arrests increased, decreased, or remained the same following the domestic violence program. Without a pre-test, researchers would not be able to determine the extent of change, if any, from before to after the program for either the experimental or control group.

Although the pre-test is a measure of the dependent variable before the treatment, it can also be thought of as a measure whereby the researcher can compare the experimental group to the control group before the treatment is administered. For example, the pre-test helps researchers to make sure both groups are similar or equivalent on previous arrests for domestic violence. The importance of equivalence between the experimental and control groups on previous arrests is discussed below with random assignment.

Random Assignment Random assignment helps to ensure that the experimental and control groups are equivalent before the introduction of the treatment. This is perhaps one of the most critical aspects of the classic experiment and all experimental designs. Although the experimental and control groups will be made up of different people with different characteristics, assigning them to groups via a random assignment process helps to ensure that any differences or bias between the groups is eliminated or minimized. By minimizing bias, we mean that the groups will balance each other out on all factors except the treatment. If they are balanced out on all factors prior to the administration of the treatment, any differences between the groups at the post-test must be due to the treatment—the only factor that differs between the experimental group and the control group. According to Shadish, Cook, and Campbell: “If implemented correctly, random assignment creates two or more groups of units that are probabilistically similar to each other on the average. Hence, any outcome differences that are observed between those groups at the end of a study are likely to be due to treatment, not to differences between the groups that already existed at the start of the study.” 15 Considered in another way, if the experimental and control group differed significantly on any relevant factor other than the treatment, the researcher would not know if the results observed at the post-test are attributable to the treatment or to the differences between the groups.

Consider an example where 500 domestic abusers were randomly assigned to the experimental group and 500 were randomly assigned to the control group. Because they were randomly assigned, we would likely find both more and less frequent domestic violence arrestees in both groups, older and younger arrestees in both groups, and so on. If random assignment was implemented correctly, it would be highly unlikely that all of the experimental group members were the most serious or frequent arrestees and all of the control group members were less serious and/or less frequent arrestees. While there are no guarantees, we know the chance of this happening is extremely small with random assignment because it is based on known probability theory. Thus, except for a chance occurrence, random assignment will result in equivalence between the experimental and control group in much the same way that flipping a coin multiple times will result in heads approximately 50% of the time and tails approximately 50% of the time. Over 1,000 tosses of a coin, for example, should result in roughly 500 heads and 500 tails. While there is a chance that flipping a coin 1,000 times will result in heads 1,000 times, or some other major imbalance between heads and tails, this potential is small and would only occur by chance.

The same logic from above also applies with randomly assigning people to groups, and this can even be done by flipping a coin. By assigning people to groups through a random and unbiased process, like flipping a coin, only by chance (or researcher error) will one group have more of one characteristic than another, on average. If there are no major (also called statistically significant) differences between the experimental and control group before the treatment, the most plausible explanation for the results at the post-test is the treatment.
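This coin-flip logic is easy to simulate. The sketch below is a hypothetical illustration—the pool size and the single measured trait (age) are invented—showing that assigning 1,000 subjects by a fair coin flip produces groups of roughly equal size whose average characteristics nearly balance.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the illustration is reproducible

# A hypothetical pool of 1,000 subjects with one measured trait (age).
subjects = [{"id": i, "age": random.randint(18, 65)} for i in range(1000)]

experimental, control = [], []
for subject in subjects:
    # A fair coin flip assigns each subject to a group.
    (experimental if random.random() < 0.5 else control).append(subject)

print(f"Experimental n = {len(experimental)}, Control n = {len(control)}")
print(f"Mean age: experimental = {mean(s['age'] for s in experimental):.1f}, "
      f"control = {mean(s['age'] for s in control):.1f}")
# Group sizes land near 500/500 and mean ages nearly balance;
# only a chance occurrence would produce a large imbalance.
```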

As mentioned, it is possible by some chance occurrence that the experimental and control group members are significantly different on some characteristic prior to administration of the treatment. To confirm that the groups are in fact similar after they have been randomly assigned, the researcher can examine the pre-test if one is present. If the researcher has additional information on subjects before the treatment is administered, such as age, or any other factor that might influence post-test results at the end of the study, he or she can also compare the experimental and control group on those measures to confirm that the groups are equivalent. Thus, a researcher can confirm that the experimental and control groups are equivalent on information known to the researcher.
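A balance check of this kind is straightforward to carry out. The sketch below is a hypothetical illustration using scipy's independent-samples t-test; the pre-test arrest counts are invented, and the 0.05 threshold is simply the conventional significance level.

```python
from scipy import stats

# Hypothetical pre-test arrest counts for two randomly assigned groups.
experimental_pre = [3, 2, 4, 1, 2, 3, 2, 5, 1, 3]
control_pre      = [2, 3, 3, 2, 1, 4, 3, 2, 4, 2]

# An independent-samples t-test asks whether the groups differ on the
# pre-test by more than chance alone would suggest.
t_stat, p_value = stats.ttest_ind(experimental_pre, control_pre)

if p_value > 0.05:
    print("No significant pre-test difference; the groups appear equivalent.")
else:
    print("Groups differ on the pre-test; a chance imbalance may have occurred.")
```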

Being able to compare the groups on known measures is an important way to ensure the random assignment process “worked.” Perhaps most important, however, is that randomization also helps to ensure similarity across unknown variables between the experimental and control group. Because random assignment is based on known probability theory, all potential differences between the groups that could impact the post-test—known or unknown—are likely to balance out. Without random assignment, it is likely that the experimental and control group would differ on important but unknown factors, and such differences could emerge as alternative explanations for the results. For example, if a researcher did not utilize random assignment and instead took the first 500 domestic abusers from an ordered list and assigned them to the experimental group and the last 500 domestic abusers and assigned them to the control group, one of the groups could be “lopsided” or imbalanced on some important characteristic that could impact the outcome of the study. With random assignment, there is a much higher likelihood that these important characteristics will balance out between the experimental and control groups because no individual has a different chance of being placed into one group versus the other. The probability of one or more characteristics being concentrated in one group and not the other is extremely small with random assignment.

To further illustrate the importance of random assignment to group equivalence, suppose the first 500 domestic violence abusers who were assigned to the experimental group from the ordered list had significantly fewer domestic violence arrests before the program than the last 500 domestic violence abusers on the list. Perhaps this is because the ordered list was organized from least to most chronic domestic abusers. In this instance, the control group would be lopsided concerning number of pre-program domestic violence arrests—they would be more chronic than the experimental group. The arrest imbalance then could potentially explain the post-test results following the domestic violence program. For example, the “less risky” offenders in the experimental group might be less likely to be re-arrested regardless of their participation in the domestic violence program, especially compared to the more chronic domestic abusers in the control group. Because of imbalances between the experimental and control group on arrests before the program was implemented, it would not be known for certain whether an observed reduction in re-arrests after the program for the experimental group was due to the program or the natural result of having less risky offenders in the experimental group. In this instance, the results might be taken to suggest that the program significantly reduces re-arrests. This conclusion might be spurious, however, for the association may simply be due to the fact that the offenders in the experimental group were much different (less frequent offenders) than the control group. Here, the program may have had no effect—the experimental group members may have performed the same regardless of the treatment because they were low-level offenders.

The example above suggests that differences between the experimental and control groups based on previous arrest records could have a major impact on the results of a study. Such differences can arise with the lack of random assignment. If subjects were randomly assigned to the experimental and control group, however, there would be a much higher probability that less frequent and more frequent domestic violence arrestees would have been found in both the experimental and control groups and the differences would have balanced out between the groups—leaving any differences between the groups at the post-test attributable to the treatment only.

In summary, random assignment helps to ensure that the experimental and control group members are balanced or equivalent on all factors that could impact the dependent variable or post-test—known or unknown. The only factor they are not balanced or equal on is the treatment. As such, random assignment helps to isolate the impact of the treatment, if any, on the post-test because it increases confidence that the only difference between the groups should be that one group gets the treatment and the other does not. If that is the only difference between the groups, any change in the dependent variable between the experimental and control group must be attributed to the treatment and not an alternative explanation, such as significant arrest history imbalance between the groups (refer to Figure 5.2). This logic also suggests that if the experimental group and control group are imbalanced on any factor that may be relevant to the outcome, that factor then becomes a potential alternative explanation for the results—an explanation that reduces the researcher’s ability to isolate the real impact of the treatment.

WHAT RESEARCH SHOWS: IMPACTING CRIMINAL JUSTICE OPERATIONS

Scared Straight

The 1978 documentary Scared Straight introduced to the public the “Lifer’s Program” at Rahway State Prison in New Jersey. This program sought to decrease juvenile delinquency by bringing at-risk and delinquent juveniles into the prison where they would be “scared straight” by inmates serving life sentences. Participants in the program were talked to and yelled at by the inmates in an effort to scare them. It was believed that the fear felt by the participants would lead to a discontinuation of their problematic behavior so that they would not end up in prison themselves. Although originally touted as a success based on anecdotal evidence, subsequent evaluations of the program and others like it proved otherwise.

Using a classic experimental design, Finckenauer evaluated the original “Lifer’s Program” at Rahway State Prison. 16 Participating juveniles were randomly assigned to the experimental group or the control group. Results of the evaluation were not positive. Post-test measures revealed that juveniles who were assigned to the experimental group and participated in the program were actually more seriously delinquent afterwards than those who did not participate in the program. Also using an experimental design with random assignment, Yarborough evaluated the “Juvenile Offenders Learn Truth” (JOLT) program at the State Prison of Southern Michigan at Jackson. 17 This program was similar to that of the “Lifer’s Program” only with fewer obscenities used by the inmates. Post-test measurements were taken at two intervals, 3 and 6 months after program completion. Again, results were not positive. Findings revealed no significant differences between those juveniles who attended the program and those who did not.

Other experiments conducted on Scared Straight-like programs further revealed their inability to deter juveniles from future criminality. 18 Despite the intuitive popularity of these programs, these evaluations proved that such programs were not successful. In fact, it is postulated that these programs may have actually done more harm than good.

The Control Group The presence of an equivalent control group (created through random assignment) also gives the researcher more confidence that the findings at the post-test are due to the treatment and not some other alternative explanation. This logic is perhaps best demonstrated by considering how the interpretation of results is affected without a control group. Absent an equivalent control group, it cannot be known whether the results of the study are due to the program or some other factor. This is because the control group provides a baseline of comparison, or a “control.” For example, without a control group, the researcher may find that domestic violence arrests declined from pre-test to post-test. But the researcher would not be able to definitively attribute that finding to the program without a control group. Perhaps the single experimental group incurred fewer arrests because its members matured over their time in the program, regardless of participation in the domestic violence program. Having a randomly assigned control group would allow this consideration to be eliminated, because the equivalent control group would also have naturally matured if that were the case.

Because the control group is meant to be similar to the experimental group on all factors with the exception that the experimental group receives the treatment, the logic is that any differences between the experimental and control group after the treatment must then be attributable only to the treatment itself—everything else occurs equally in both the experimental and control groups and thus cannot be the cause of results. The bottom line is that a control group allows the researcher more confidence to attribute any change in the dependent variable from pre- to post-test and between the experimental and control groups to the treatment—and not another alternative explanation. Absent a control group, the researcher would have much less confidence in the results.

Knowledge about the major components of the classic experimental design and how they contribute to an understanding of cause and effect serves as an important foundation for studying different types of experimental and quasi-experimental designs and their organization. A useful way to become familiar with the components of the experimental design and their important role is to consider the impact on the interpretation of results when one or more components are lacking. For example, what if a design lacked a pre-test? How could this impact the interpretation of post-test results and knowledge about the comparability of the experimental and control group? What if a design lacked random assignment? What are some potential problems that could occur and how could those potential problems impact interpretation of results? What if a design lacked a control group? How does the absence of an equivalent control group affect a researcher’s ability to determine the unique effects of the treatment on the outcomes being measured? The ability to discuss the contribution of a pre-test, random assignment, and a control group—and what is the impact when one or more of those components is absent from a research design—is the key to understanding both experimental and quasi-experimental designs that will be discussed in the remainder of this chapter. As designs lose these important parts and transform from a classic experiment to another experimental design or to a quasi-experiment, they become less useful in isolating the impact that a treatment has on the dependent variable and allow more room for alternative explanations of the results.

One more important point must be made before further delving into experimental and quasi-experimental designs. This point is that rarely, if ever, will the average consumer of research be exposed to the symbols or specific language of the classic experiment, or other experimental and quasi-experimental designs examined in this chapter. In fact, it is unlikely that the average consumer will ever be exposed to the terms pre-test, post-test, experimental group, or random assignment in the popular media, among other terms related to experimental and quasi-experimental designs. Yet, consumers are exposed to research results produced from these and other research designs every day. For example, if a national news organization or your regional newspaper reported a story about the effectiveness of a new drug to reduce cholesterol or the effects of different diets on weight loss, it is doubtful that the results would be reported as produced through a classic experimental design that used a control group and random assignment. Rather, these media outlets would use generally nonscientific terminology such as “results of an experiment showed” or “results of a scientific experiment indicated” or “results showed that subjects who received the new drug had greater cholesterol reductions than those who did not receive the new drug.” Even students who regularly search and read academic articles for use in course papers and other projects will rarely come across such design notation in the research studies they utilize. Depiction of the classic experimental design, including a discussion of its components and their function, simply illustrates the organization and notation of the classic experimental design. Unfortunately, the average consumer has to read between the lines to determine what type of design was used to produce the reported results. Understanding the key components of the classic experimental design allows educated consumers of research to read between those lines.

RESEARCH IN THE NEWS

“Swearing Makes Pain More Tolerable” 19

In 2009, Richard Stephens, John Atkins, and Andrew Kingston of the School of Psychology at Keele University conducted a study with 67 undergraduate students to determine if swearing affects an individual’s response to pain. Researchers asked participants to immerse their hand in a container filled with ice-cold water and repeat a preferred swear word. The researchers then asked the same participants to immerse their hand in ice-cold water while repeating a word used to describe a table (a non-swear word). The results showed that swearing increased pain tolerance compared to the non-swearing condition. Participants who used a swear word were able to hold their hand in ice-cold water longer than when they did not swear. Swearing also decreased participants’ perception of pain.
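The within-subject comparison at the heart of this study can be sketched in a few lines. The submersion times below are hypothetical, invented for illustration rather than taken from the study's data, and scipy's paired t-test is used because each participant is measured under both conditions.

```python
from scipy import stats

# Hypothetical submersion times (seconds) for the same eight participants
# under each condition; these values are invented for illustration.
swearing     = [94, 120, 75, 88, 102, 110, 69, 97]
neutral_word = [70,  95, 60, 72,  85,  90, 55, 80]

# A paired t-test compares the two conditions within subjects, so each
# participant serves as his or her own control.
t_stat, p_value = stats.ttest_rel(swearing, neutral_word)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```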

1. This study is an example of a repeated measures design. In this form of experimental design, study participants are exposed to an experimental condition (swearing with hand in ice-cold water) and a control condition (non-swearing with hand in ice-cold water) while repeated outcome measures are taken with each condition, for example, the length of time a participant was able to keep his or her hand submerged in ice-cold water. Conduct an Internet search for “repeated measures design” and explore the various ways such a study could be conducted, including the potential benefits and drawbacks to this design.

2. After researching repeated measures designs, devise a hypothetical repeated measures study of your own.

3. Retrieve and read the full research study “Swearing as a Response to Pain” by Stephens, Atkins, and Kingston while paying attention to the design and methods (full citation information for this study is listed below). Has your opinion of the study results changed after reading the full study? Why or why not?

Full Study Source: Stephens, R., Atkins, J., and Kingston, A. (2009). “Swearing as a response to pain.” NeuroReport 20, 1056–1060.

Variations on the Experimental Design

The classic experimental design is the foundation upon which all experimental and quasi-experimental designs are based. As such, it can be modified in numerous ways to fit the goals (or constraints) of a particular research study. Below are two variations of the experimental design. Again, knowledge about the major components of the classic experiment, how they contribute to an explanation of results, and what the impact is when one or more components are missing provides an understanding of all other experimental designs.

Post-Test Only Experimental Design

The post-test only experimental design could be used to examine the impact of a treatment program on school disciplinary infractions as measured or operationalized by referrals to the principal’s office (see Table 5.2). In this design, the researcher randomly assigns a group of discipline problem students to the experimental group and control group by flipping a coin—heads to the experimental group and tails to the control group. The experimental group then enters the 3-month treatment program. After the program, the researcher compares the number of referrals to the principal’s office between the experimental and control groups over some period of time, for example, discipline referrals at 6 months after the program. The researcher finds that the experimental group has a much lower number of referrals to the principal’s office in the 6-month follow-up period than the control group.

TABLE 5.2 | Post-Test Only Experimental Design

Experimental Group:   R   X   O2

Control Group:        R       O2

Several issues arise in this example study. The researcher would not know if discipline problems decreased, increased, or stayed the same from before to after the treatment program because the researcher did not have a count of disciplinary referrals prior to the treatment program (i.e., a pre-test). Although the groups were randomly assigned and are presumed equivalent, the absence of a pre-test means the researcher cannot confirm that the experimental and control groups were equivalent before the treatment was administered, particularly on the number of referrals to the principal’s office. The groups could have differed by a chance occurrence even with random assignment, and any such differences between the groups could potentially explain the post-test difference in the number of referrals to the principal’s office. For example, if the control group included much more serious or frequent discipline problem students than the experimental group by chance, this difference might explain the lower number of referrals for the experimental group, not that the treatment produced this result.

Experimental Design with Two Treatments and a Control Group

This design could be used to determine the impact of boot camp versus juvenile detention on post-release recidivism (see Table 5.3). Recidivism in this study is operationalized as re-arrest for delinquent behavior. First, a population of known juvenile delinquents is randomly assigned to either boot camp, juvenile detention, or a control condition where they receive no sanction. To accomplish random assignment to groups, the researcher places the names of all youth into a hat and assigns the groups in order. For example, the first name pulled goes into experimental group 1, the next into experimental group 2, the next into the control group, and so on. Once randomly assigned, the experimental group youth receive either boot camp or juvenile detention for a period of 3 months, whereas members of the control group are released on their own recognizance to their parents. At the end of the experiment, the researcher compares the re-arrest activity of boot camp participants, juvenile detention participants, and control group members during a 6-month follow-up period.

TABLE 5.3 | Experimental Design with Two Treatments and a Control Group

Experimental Group 1:   R   O1   X1   O2

Experimental Group 2:   R   O1   X2   O2

Control Group:          R   O1        O2
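The hat-draw procedure described above amounts to shuffling the roster and dealing names to the three groups in rotating order. The sketch below is a hypothetical illustration—the roster size and group labels are invented—and a seeded software shuffle accomplishes the same unbiased assignment as drawing names from a well-mixed hat.

```python
import random

# Hypothetical roster of known juvenile delinquents.
youth = [f"youth_{i}" for i in range(1, 301)]

rng = random.Random(7)
rng.shuffle(youth)  # the software equivalent of mixing names in a hat

groups = {"boot_camp": [], "juvenile_detention": [], "control": []}
labels = list(groups)
for position, name in enumerate(youth):
    # Deal the shuffled names to the three groups in rotating order.
    groups[labels[position % 3]].append(name)

for label, members in groups.items():
    print(f"{label}: n = {len(members)}")  # 100 youth per group
```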

This design has several advantages. First, it includes all major components of the classic experimental design, and simply adds an additional treatment for comparison purposes. Random assignment was utilized, and this means that the groups have a higher probability of being equivalent on all factors that could impact the post-test. Thus, random assignment in this example helps to ensure the only differences between the groups are the treatment conditions. Without random assignment, there is a greater chance that one group of youth was somehow different, and this difference could impact the post-test. For example, if the boot camp youth were much less serious and frequent delinquents than the juvenile detention youth or control group youth, the results might erroneously show that the boot camp reduced recidivism when in fact the youth in boot camp may have been the “best risks”—unlikely to get re-arrested with or without boot camp. The pre-test in the example above allows the researcher to determine change in re-arrests from pre-test to post-test. Thus, the researcher can determine if delinquent behavior, as measured by re-arrest, increased, decreased, or remained constant from pre- to post-test. The pre-test also allows the researcher to confirm that the random assignment process resulted in equivalent groups based on the pre-test. Finally, the presence of a control group allows the researcher to have more confidence that any differences in the post-test are due to the treatment. For example, if the control group had more re-arrests than the boot camp or juvenile detention experimental groups 6 months after their release from those programs, the researcher would have more confidence that the programs produced fewer re-arrests because the control group members were the same as the experimental groups; the only difference was that they did not receive a treatment.

The one key feature of experimental designs is that they all retain random assignment. This is why they are considered “experimental” designs. Sometimes, however, experimental designs lack a pre-test. Knowledge of the usefulness of a pre-test demonstrates the potential problems with those designs where it is missing. For example, in the post-test only experimental design, a researcher would not be able to make a determination of change in the dependent variable from pre- to post-test. Perhaps most importantly, the researcher would not be able to confirm that the experimental and control groups were in fact equivalent on a pre-test measure before the introduction of the treatment. Even though both groups were randomly assigned, and probability theory suggests they should be equivalent, without a pre-test measure the researcher could not confirm similarity because differences could occur by chance even with random assignment. If there were any differences at the post-test between the experimental group and control group, the results might be due to some explanation other than the treatment, namely that the groups differed prior to the administration of the treatment. The same limitation could apply in any form of experimental design that does not utilize a pre-test for confirmation purposes.

Understanding the contribution of a pre-test to an experimental design shows that it is a critical component. It provides a measure of change and also gives the researcher more confidence that the observed results are due to the treatment, and not some difference between the experimental and control groups. Despite the usefulness of a pre-test, however, perhaps the most critical ingredient of any experimental design is random assignment. It is important to note that all experimental designs retain random assignment.

Experimental Designs Are Rare in Criminal Justice and Criminology

The classic experiment is the foundation for other types of experimental and quasi-experimental designs. The unfortunate reality, however, is that the classic experiment, or other experimental designs, are few and far between in criminal justice. 20 Recall that one of the major components of an experimental design is random assignment. Achieving random assignment is often a barrier to experimental research in criminal justice. Achieving random assignment might, for example, require the approval of the chief (or city council or both) of a major metropolitan police agency to allow researchers to randomly assign patrol officers to certain areas of a city and/or randomly assign police officer actions. Recall the MDVE. This experiment required the full cooperation of the chief of police and other decision-makers to allow researchers to randomly assign police actions. In another example, achieving random assignment might require a judge to randomly assign a group of youthful offenders to a certain juvenile court sanction (experimental group), and another group of similar youthful offenders to no sanction or an alternative sanction as a control group. 21 In sum, random assignment typically requires the cooperation of a number of individuals and sometimes that cooperation is difficult to obtain.

Even when random assignment can be accomplished, sometimes it is not implemented correctly and the random assignment procedure breaks down. This is another barrier to conducting experimental research. For example, in the MDVE, researchers randomly assigned officer responses, but the officers did not always follow the assigned course of action. Moreover, some believe that the random assignment of criminal justice programs, sentences, or randomly assigning officer responses may be unethical in certain circumstances, and even a violation of the rights of citizens. For example, some believe it is unfair when random assignment results in some delinquents being sentenced to boot camp while others get assigned to a control group without any sanction at all or a less restrictive sanction than boot camp. In the MDVE, some believe it is unfair that some suspects were arrested and received an official record whereas others were not arrested for the same type of behavior. In other cases, subjects in the experimental group may receive some benefit from the treatment that is essentially denied to the control group for a period of time and this can become an issue as well.

There are other important reasons why random assignment is difficult to accomplish. Random assignment may, for example, involve a disruption of the normal procedures of agencies and their officers. In the MDVE, officers had to adjust their normal and established routine, and this was a barrier at times in that study. Shadish, Cook, and Campbell also note that random assignment may not always be feasible or desirable when quick answers are needed. 22 This is because experimental designs sometimes take a long time to produce results. In addition to the time required in planning and organizing the experiment, and treatment delivery, researchers may need several months if not years to collect and analyze the data before they have answers. This is particularly important because time is often of the essence in criminal justice research, especially in research efforts testing the effect of some policy or program where it is not feasible to wait years for answers. Waiting for the results of an experimental design means that many policy-makers may make decisions without the results.

Quasi-Experimental Designs

In general terms, quasi-experiments include a group of designs that lack random assignment. Quasi-experiments may also lack other parts, such as a pre-test or a control group, just like some experimental designs. The absence of random assignment, however, is the ingredient that transforms an otherwise experimental design into a quasi-experiment. Lacking random assignment is a major disadvantage because it increases the chances that the experimental and control groups differ on relevant factors before the treatment—both known and unknown—differences that may then emerge as alternative explanations of the outcomes.

Just like experimental designs, quasi-experimental designs can be organized in many different ways. This section will discuss three types of quasi-experiments: nonequivalent group design, one-group longitudinal design, and two-group longitudinal design.

Nonequivalent Group Design

The nonequivalent group design is perhaps the most common type of quasi-experiment. 23 Notice that it is very similar to the classic experimental design with the exception that it lacks random assignment (see Table 5.4). Additionally, what was labeled the experimental group in an experimental design is sometimes called the treatment group in the nonequivalent group design. What was labeled the control group in the experimental design is sometimes called the comparison group in the nonequivalent group design. This terminological distinction is an indicator that the groups were not created through random assignment.

TABLE 5.4 | Nonequivalent Group Design

Treatment Group:    NR   O1   X   O2

Comparison Group:   NR   O1        O2

NR = Not randomly assigned

One of the main problems with the nonequivalent group design is that it lacks random assignment, and without random assignment, there is a greater chance that the treatment and comparison groups may be different in some way that can impact study results. Take, for example, a nonequivalent group design where a researcher is interested in whether an aggression-reduction treatment program can reduce inmate-on-inmate assaults in a prison setting. Assume that the researcher asked for inmates who had previously been involved in assaultive activity to volunteer for the aggression-reduction program. Suppose the researcher placed the first 50 volunteers into the treatment group and the next 50 volunteers into the comparison group. Note that this method of assignment is not random but rather first come, first served.

Because the study utilized volunteers and there was no random assignment, it is possible that the first 50 volunteers placed into the treatment group differed significantly from the last 50 volunteers who were placed in the comparison group. This can lead to alternative explanations for the results. For example, if the treatment group was much younger than the comparison group, the researcher may find at the end of the program that the treatment group still maintained a higher rate of infractions than the comparison group—even after the aggression-reduction program! The conclusion might be that the aggression program actually increased the level of violence among the treatment group. This conclusion would likely be spurious and may be due to the age differential between the treatment and comparison groups. Indeed, research has revealed that younger inmates are significantly more likely to engage in prison assaults than older inmates. The fact that the treatment group incurred more assaults than the comparison group after the aggression-reduction program may only relate to the age differential between the groups, not that the program had no effect or that it somehow may have increased aggression. The previous example highlights the importance of random assignment and the potential problems that can occur in its absence.

Although researchers who utilize a quasi-experimental design are not able to randomly assign their subjects to groups, they can employ other techniques in an attempt to make the groups as equivalent as possible on known or measured factors before the treatment is given. In the example above, it is likely that the researcher would have known the age of inmates, their prior assault record, and various other pieces of information (e.g., previous prison stays). Through a technique called matching, the researcher could make sure the treatment and comparison groups were “matched” on these important factors before administering the aggression reduction program to the treatment group. This type of matching can be done individual to individual (e.g., subject #1 in treatment group is matched to a selected subject #1 in comparison group on age, previous arrests, gender), or aggregately, such that the comparison group is similar to the treatment group overall (e.g., average ages between groups are similar, equal proportions of males and females). Knowledge of these and other important variables, for example, would allow the researcher to make sure that the treatment group did not have heavy concentrations of younger or more frequent or serious offenders than the comparison group—factors that are related to assaultive activity independent of the treatment program. In short, matching allows the researcher some control over who goes into the treatment and comparison groups so as to balance these groups on important factors absent random assignment. If unbalanced on one or more factors, these factors could emerge as alternative explanations of the results. Figure 5.3 demonstrates the logic of matching both at the individual and aggregate level in a quasi-experimental design.
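To make the individual matching idea concrete, the sketch below pairs each treatment subject with a comparison subject who has the same prior assault record and a similar age. Everything here is hypothetical—the subjects, the fields, and the two-year age tolerance are invented—and a real study might match on many more factors.

```python
# A minimal sketch of individual (one-to-one) matching on known factors.
# All subjects, fields, and the age tolerance are hypothetical.
treatment = [
    {"id": "T1", "age": 22, "prior_assaults": 3},
    {"id": "T2", "age": 35, "prior_assaults": 1},
]
comparison_pool = [
    {"id": "C1", "age": 23, "prior_assaults": 3},
    {"id": "C2", "age": 36, "prior_assaults": 1},
    {"id": "C3", "age": 50, "prior_assaults": 0},
]

def find_match(subject, pool, max_age_gap=2):
    """Return an unused comparison subject with the same prior assault
    count and an age within max_age_gap years, or None if no match."""
    for candidate in pool:
        if (candidate["prior_assaults"] == subject["prior_assaults"]
                and abs(candidate["age"] - subject["age"]) <= max_age_gap):
            pool.remove(candidate)  # each comparison subject is used once
            return candidate
    return None  # unmatched subjects may be dropped from the study

pairs = [(t["id"], match["id"]) for t in treatment
         if (match := find_match(t, comparison_pool)) is not None]
print(pairs)  # [('T1', 'C1'), ('T2', 'C2')]
```

Aggregate matching follows the same spirit but checks only that group-level summaries (for example, mean age or the proportion of frequent offenders) are similar between the treatment and comparison groups.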

Matching is an important part of the nonequivalent group design. By matching, the researcher can approximate equivalence between the groups on important variables that may influence the post-test. However, it is important to note that a researcher can only match subjects on factors that they have information about—a researcher cannot match the treatment and comparison group members on factors that are unmeasured or otherwise unknown but which may still impact outcomes. For example, if the researcher has no knowledge about the number of previous incarcerations, the researcher cannot match the treatment and comparison groups on this factor. Matching also requires that the information used for matching is valid and reliable, which is not always the case. Agency records, for example, are notorious for inconsistencies, errors, omissions, and for being dated, but are often utilized for matching purposes. Asking survey questions to generate information for matching (for example, how many times have you been incarcerated?) can also be problematic because some respondents may lie, forget, or exaggerate their behavior or experiences.

In addition to the above considerations, the more factors a researcher wishes to match the group members on, the more difficult it becomes to find appropriate matches. Matching on prior arrests or age is less complex than matching on several additional pieces of information. Finally, matching is never considered superior to random assignment when the goal is to construct equivalent groups. This is because random assignment offers a much higher likelihood of equivalence on factors both known and unknown to the researcher. Thus, the results produced from a nonequivalent group design, even with matching, are at a greater risk of alternative explanations than an experimental design that features random assignment.

FIGURE 5.3 | (a) Individual Matching (b) Aggregate Matching

The previous discussion is not to suggest that the nonequivalent group design cannot be useful in answering important research questions. Rather, it is to suggest that the nonequivalent group design, and hence any quasi-experiment, is more susceptible to alternative explanations than the classic experimental design because of the absence of random assignment. As a result, a researcher must be prepared to rule out potential alternative explanations. Quasi-experimental designs that lack a pre-test or a comparison group are even less desirable than the nonequivalent group design and are subject to additional alternative explanations because of these missing parts. Although the quasi-experiment may be all that is available and still can serve as an important design in evaluating the impact of a particular treatment, it is not preferable to the classic experiment. Researchers (and consumers) must be attuned to the potential issues of this design so as to make informed conclusions about the results produced from such research studies.

The Effects of Red Light Camera (RLC) Enforcement

On March 15, 2009, an article appeared in the Santa Cruz Sentinel entitled “Tickets in the Mail: Red-Light Cameras Questioned.” The article stated “while studies show fewer T-bone crashes at lights with cameras and fewer drivers running red lights, the number of rear-end crashes increases.” 24 The study mentioned in the newspaper, which showed fewer drivers running red lights with cameras, was conducted by Richard Retting, Susan Ferguson, and Charles Farmer of the Insurance Institute for Highway Safety (IIHS). 25 They completed a quasi-experimental study in Philadelphia to determine the impact of red light cameras (RLC) on red light violations. In the study, the researchers selected nine intersections—six experimental sites that utilized RLCs and three comparison sites that did not. The six experimental sites were located in Philadelphia, Pennsylvania, and the three comparison sites were located in Atlantic County, New Jersey. The researchers chose the comparison sites based on their proximity to Philadelphia, the ability to collect data using the same methods as at the experimental intersections (e.g., the use of cameras for viewing red light traffic), and the fact that police officials in Atlantic County had offered assistance in selecting and monitoring the intersections.

The authors collected three phases of information in the RLC study at the experimental and comparison sites:

Phase 1 Data Collection: Baseline (pre-test) data collection at the experimental and comparison sites consisting of the number of vehicles passing through each intersection, the number of red light violations, and the rate of red light violations per 10,000 vehicles.

Phase 2 Data Collection: Number of vehicles traveling through experimental and comparison intersections, number of red light violations after a 1-second yellow light increase at the experimental sites (treatment 1), number of red light violations at comparison sites without a 1-second yellow light increase, and red light violations per 10,000 vehicles at both experimental and comparison sites.

Phase 3 Data Collection: Red light violations after a 1-second yellow light increase and RLC enforcement at the experimental sites (treatment 2), red light violations at comparison sites without a 1-second yellow increase or RLC enforcement, number of vehicles passing through the experimental and comparison intersections, and the rate of red light violations per 10,000 vehicles.
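For readers unfamiliar with the rate metric used in all three phases, violations per 10,000 vehicles simply scales the raw violation count by traffic volume so that intersections with different volumes can be compared. A minimal illustration, using invented counts:

```python
# Hypothetical illustration of the RLC study's rate metric:
# red light violations per 10,000 vehicles. Counts are invented.
violations = 180      # invented count of red light violations
vehicles = 250_000    # invented count of vehicles through the intersection

rate = violations / vehicles * 10_000
print(f"{rate:.1f} violations per 10,000 vehicles")  # 7.2
```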

The researchers operationalized “red light violations” as instances in which a vehicle entered the intersection one-half of a second or more after the onset of the red signal, where the vehicle’s rear tires were positioned behind the crosswalk or stop line prior to entering on red. Vehicles already in the intersection at the onset of the red light, or those making a right turn on red (with or without stopping), were not counted as red light violations.

The researchers collected video data at each of the experimental and comparison sites during Phases 1–3. This allowed them to examine red light violations before, during, and after the implementation of red light enforcement and yellow light time increases. Based on an analysis of these data, the researchers found that the implementation of a 1-second yellow light increase led to reductions in the rate of red light violations from Phase 1 to Phase 2 at all of the experimental sites. In 2 of 3 comparison sites, the rate of red light violations also decreased, despite no yellow light increase. From Phase 2 to Phase 3 (the enforcement of red light camera violations in addition to the 1-second yellow light increase at experimental sites), the authors noted decreases in the rate of red light violations at all experimental sites, and decreases at 2 of 3 comparison sites without red light enforcement in effect.

Concluding their study, the researchers noted that the study “found large and highly significant incremental reductions in red light running associated with increased yellow signal timing followed by the introduction of red light cameras.” Despite these findings, the researchers noted a number of potential factors to consider in light of the findings: the follow-up time periods utilized when counting red light violations before and after the treatment conditions were instituted; publicity about red light camera enforcement; and the size of fines associated with red light camera enforcement (the fine in Philadelphia was $100, higher than in many other cities), among others.

After reading about the study used in the newspaper article, has your impression of the newspaper headline and quote changed?

For more information and research on the effect of RLCs, visit the Insurance Institute for Highway Safety at http://www.iihs.org/research/topics/rlr.html.

One-Group Longitudinal Design

Like experimental designs, quasi-experimental designs come in a variety of forms. The second quasi-experimental design (above) is the one-group longitudinal design (also called a simple interrupted time series design). 26 An examination of this design shows that it lacks both random assignment and a comparison group (see Table 5.5). A major difference between this design and others we have covered is that it includes multiple pre-test and post-test observations.

TABLE 5.5 | One-Group Longitudinal Design

The one-group longitudinal design is useful when researchers are interested in exploring longer-term patterns. Indeed, the term longitudinal generally means “over time”—repeated measurements of the pre-test and post-test over time. This is different from cross-sectional designs, which examine the pre-test and post-test at only one point in time (e.g., at a single point before the application of the treatment and at a single point after the treatment). For example, the nonequivalent group design and the classic experimental design previously examined are both cross-sectional because pre-tests and post-tests are measured at one point in time (e.g., at a point 6 months after the treatment). Yet, these designs could easily be considered longitudinal if researchers took repeated measures of the pre-test and post-test.

The organization of the one-group longitudinal design is to examine a baseline of several pre-test observations, introduce a treatment or intervention, and then examine the post-test at several different time intervals. As organized, this design is useful for gauging the impact that a particular program, policy, or law has, if any, and how long the treatment impact lasts. Consider an example whereby a researcher is interested in gauging the impact of a tobacco ban on inmate-on-inmate assaults in a prison setting. This is an important question, for recent years have witnessed correctional systems banning all tobacco products from prison facilities. Correctional administrators predicted that there would be a major increase of inmate-on-inmate violence once the bans took effect. The one-group longitudinal design would be one appropriate design to examine the impact of banning tobacco on inmate assaults.

To construct this study using the one-group longitudinal design, the researcher would first examine the rate of inmate-on-inmate assaults in the prison system (or at an individual prison, a particular cellblock, or whatever the unit of analysis) prior to the removal of tobacco. This is the pre-test, or a baseline of assault activity before the ban goes into effect. In the design presented above, perhaps the researcher would measure the level of assaults in the four months preceding the tobacco ban. When establishing a pre-test baseline in a longitudinal design, the general rule is that the more time utilized, both in overall time and in number of measurement intervals, the better. For example, the rate of assaults in the preceding month is not as useful as an entire year of data on inmate assaults prior to the tobacco ban. Next, once the tobacco ban is implemented, the researcher would measure the rate of inmate assaults in the coming months to determine what impact the ban had on inmate-on-inmate assaults. This is shown in Table 5.5 as the multiple post-test measures of assaults. Assaults may increase, decrease, or remain constant from the pre-test baseline over the term of the post-test.
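One common way to analyze data from a design like this is segmented regression, which estimates whether the level of the outcome jumps when the intervention begins. The sketch below is a minimal, hypothetical illustration with invented monthly assault rates; it is not drawn from any actual study.

```python
# Minimal sketch of a simple interrupted time series analysis using
# segmented regression. All numbers are invented; 'assault_rate'
# stands in for monthly assaults per 1,000 inmates.
import numpy as np
import statsmodels.api as sm

# 8 months of data: 4 pre-ban, 4 post-ban
assault_rate = np.array([5.1, 4.9, 5.3, 5.0,   # baseline (pre-test)
                         6.2, 6.0, 6.4, 6.1])  # after the tobacco ban
month = np.arange(8)
post = (month >= 4).astype(int)  # 1 once the ban is in effect

# Level-change model: rate = b0 + b1*month + b2*post
X = sm.add_constant(np.column_stack([month, post]))
fit = sm.OLS(assault_rate, X).fit()
print(fit.params)  # b2 estimates the jump in assaults at the ban
```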

If assaults increased at the same time as the ban went into effect, the researcher might conclude that the increase was due only to the tobacco ban. But could there be alternative explanations? The answer to this question is yes—there may be other plausible explanations for the increase even with several months of pre-test data. Unfortunately, without a comparison group there is no way for the researcher to be certain whether the increase in assaults was due to the tobacco ban or to some other factor that spurred the increase and happened at the same time as the ban. What if assaults decreased after the tobacco ban went into effect? In this scenario, because there is no comparison group, the researcher would still not know if the results would have happened anyway without the tobacco ban. In these instances, the lack of a comparison group prevents the researcher from confidently attributing the results to the tobacco ban, and interpretation is subject to numerous alternative explanations.

Two-Group Longitudinal Design

A remedy for the previous situation would be to introduce a comparison group (see Table 5.6). Prior to the full tobacco ban, suppose prison administrators conducted a pilot program at one prison to provide insight as to what would happen once the tobacco ban went into effect systemwide. To conduct this pilot, the researcher identified one prison. At this prison, the researcher identified two different cellblocks, C-Block and D-Block. C-Block constitutes the treatment group, or the cellblock of inmates who will have their tobacco taken away. D-Block is the comparison group—inmates in this cellblock will retain their tobacco privileges during the course of the study and during a determined follow-up period to measure post-test assaults (e.g., 12 months). This is a two-group longitudinal design (also sometimes called a multiple interrupted time series design), and adding a comparison group makes this design superior to the one-group longitudinal design.

TABLE 5.6 | Two-Group Longitudinal Design

The usefulness of adding a comparison group is that the researcher can have more confidence that the results at the post-test are due to the tobacco ban and not some alternative explanation. This is because any difference in assaults at the post-test between the treatment and comparison groups should be attributable to the only difference between them, the tobacco ban. For this interpretation to hold, however, the researcher must be sure that C-Block and D-Block are similar or equivalent on all factors that might influence the post-test. There are many potential factors that should be considered. For example, the researcher will want to make sure that the same types of inmates are housed in both cellblocks. If a chronically assaultive group of inmates is housed in C-Block but not in D-Block, this differential, not the treatment, could explain the results.

The researcher might also want to make sure equitable numbers of tobacco and non-tobacco users are found in each cellblock. If very few inmates in C-Block are smokers, the real effect of removing tobacco may be hidden. The researcher might also examine other areas where potential differences might arise—for example, whether both cellblocks are staffed with equal numbers of officers and whether officers in each cellblock tend to resolve inmate disputes similarly—along with other potential issues that could influence post-test measures of assaults. Equivalence could also be assessed by comparing the groups on additional evidence before the ban takes effect: number of prior prison sentences, time served in prison, age, seriousness of conviction crime, and other factors that might relate to assaultive behavior, regardless of the tobacco ban. Moreover, the researcher should ensure that inmates in C-Block do not know that their D-Block counterparts are still allowed tobacco during the pilot study, and vice versa. If either group knows the pilot program is an experiment, its members might act differently than normal, and this could become an alternative explanation of the results. Additionally, the researcher might also try to make sure that C-Block inmates are completely tobacco free after the ban goes into effect—that they do not hoard, smuggle, or receive tobacco from officers or other inmates during the tobacco ban in or outside of the cellblock. If these and other important differences are accounted for at the individual and cellblock level, the researcher will have more confidence that any differences in assaults at the post-test between the treatment and comparison groups are related to the tobacco ban, and not some other difference between the two groups or the two cellblocks.

The addition of a comparison group aids in the ability of the researcher to isolate the true impact of a tobacco ban on inmate-on-inmate assaults. All factors that influence the treatment group should also influence the comparison group because the groups are made up of equivalent individuals in equivalent circumstances, with the exception of the tobacco ban. If this is the only difference, the results can be attributed to the ban. Although the addition of the comparison group in the two-group longitudinal design provides more confidence that the findings are attributed to the tobacco ban, the fact that this design lacks randomization means that alternative explanations cannot be completely ruled out—but they can be minimized. This example also suggests that the quasi-experiment in this instance may actually be preferable to an experimental design—noting the realities of prison administration. For example, prison inmates are not typically randomly assigned to different cellblocks by prison officers. Moreover, it is highly unlikely that a prison would have two open cellblocks waiting for a researcher to randomly assign incoming inmates to the prison for a tobacco ban study. Therefore, it is likely there would be differences among the groups in the quasi-experiment.

Fortunately, if differences between the groups are present, the researcher can attempt to determine their potential impact before interpreting the results. The researcher can also use statistical models after the ban takes effect to determine the impact of any differences between the groups on the post-test. The two-group longitudinal quasi-experiment just discussed could also take the form of an experimental design if random assignment could somehow be accomplished. The previous discussion, however, provides one situation where an experimental design might be appropriate and desired for a particular research question but would not be realistic considering the many barriers.
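One widely used way to analyze a two-group longitudinal design of this kind is a difference-in-differences comparison, in which the comparison group's pre-to-post change estimates what would have happened to the treatment group without the intervention. A minimal sketch with invented assault counts:

```python
# Difference-in-differences sketch for the tobacco ban pilot.
# Monthly assault counts for both cellblocks are invented.
import numpy as np

c_block_pre,  c_block_post = np.array([12, 14, 13, 11]), np.array([18, 17, 19, 16])  # treatment
d_block_pre,  d_block_post = np.array([10, 12, 11, 13]), np.array([11, 12, 10, 13])  # comparison

treat_change   = c_block_post.mean() - c_block_pre.mean()   # 17.5 - 12.5 = 5.0
control_change = d_block_post.mean() - d_block_pre.mean()   # 11.5 - 11.5 = 0.0

# The comparison group's change estimates what would have happened
# without the ban; the difference of the differences is the ban's effect.
print(treat_change - control_change)  # 5.0
```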

The Threat of Alternative Explanations

Alternative explanations are those factors that could explain the post-test results, other than the treatment. Throughout this chapter, we have noted the potential for alternative explanations and have given several examples of explanations other than the treatment. It is important to know that potential alternative explanations can arise in any research design discussed in this chapter. However, alternative explanations often arise because some design part is missing, for example, random assignment, a pre-test, or a control or comparison group. This is especially true in criminal justice where researchers often conduct field studies and have less control over their study conditions than do researchers who conduct experiments under highly controlled laboratory conditions. A prime example of this is the tobacco ban study, where it would be difficult for researchers to ensure that C-Block inmates, the treatment group, were completely tobacco free during the course of the study.

Alternative explanations are typically referred to as threats to internal validity. In this context, if an experiment is internally valid, it means that alternative explanations have been ruled out and the treatment is the only factor that produced the results. If a study is not internally valid, this means that alternative explanations for the results exist or potentially exist. In this section, we focus on some common alternative explanations that may arise in experimental and quasi-experimental designs. 27

Selection Bias

One of the more common alternative explanations that may occur is selection bias. Selection bias generally indicates that the treatment group (or experimental group) is somehow different from the comparison group (or control group) on a factor that could influence the post-test results. Selection bias is more often a threat in quasi-experimental designs than experimental designs due to the lack of random assignment. Suppose in our study of the prison tobacco ban, members of C-Block were substantially younger than members of D-Block, the comparison group. Such an imbalance between the groups would mean the researcher would not know if the differences in assaults are real (meaning the result of the tobacco ban) or a result of the age differential. Recall that research shows that younger inmates are more assaultive than older inmates and so we would expect more assaults among the younger offenders independent of the tobacco ban.

In a quasi-experiment, selection bias is perhaps the most prevalent type of alternative explanation and can seriously compromise results. Indeed, many of the examples above have referred to potential situations where the groups are imbalanced or not equivalent on some important factor. Although selection bias is a common threat in quasi-experimental designs because of the lack of random assignment, and can be a threat in experimental designs because the groups could differ by chance alone or because the practice of randomization was not maintained throughout the study (see Classics in CJ Research—MDVE above), a researcher may be able to detect such differentials. For example, the researcher could detect such differences by comparing the groups on the pre-test or other types of information before the start of the study. If differences were found, the researcher could take measures to correct them. The researcher could also use a statistical model that accounts or controls for differences between the groups and isolates the impact of the treatment, if any. A full discussion of such models is beyond the scope of this text, but they offer a potential way to deal with selection bias and to estimate its impact on study results. The researcher could also, if possible, attempt to re-match the groups in a quasi-experiment or randomly assign the groups a second time in an experimental design to ensure equivalence. At the least, the researcher could acknowledge the group differences and discuss their potential impact on the results. Without a pre-test or other pre-study information on study participants, however, such differences might go undetected, and it would therefore be more difficult to determine how the differences, as a result of selection bias, influenced the results.
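As one hypothetical illustration of the kind of statistical control mentioned above, a researcher might regress the outcome on a treatment indicator plus the imbalanced covariate. The sketch below uses invented data and the statsmodels library; it is meant only to show the logic of covariate adjustment, not a complete analysis.

```python
# Hedged sketch of statistically controlling for a group imbalance
# (here, age) when estimating a treatment effect. Data are invented.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "assaults": [4, 3, 5, 2, 1, 2, 1, 0],
    "treated":  [1, 1, 1, 1, 0, 0, 0, 0],   # 1 = tobacco ban cellblock
    "age":      [21, 24, 22, 30, 35, 40, 38, 45],
})

# The coefficient on 'treated' is the ban's association with assaults
# after adjusting for age, the suspected source of selection bias.
fit = smf.ols("assaults ~ treated + age", data=df).fit()
print(fit.params["treated"])
```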

History

Another potential alternative explanation is history. History refers to any event experienced differently by the treatment and comparison groups in the time between the pre-test and the post-test that could impact results. Suppose during the course of the tobacco ban study several riots occurred on D-Block, the comparison group. Because of the riots, prison officers “locked down” this cellblock numerous times. Because D-Block inmates were locked down at various times, this could have affected their ability to otherwise engage in inmate assaults. At the end of the study, the assaults in D-Block might have decreased from their pre-test levels because of the lockdowns, whereas in C-Block assaults may have occurred at their normal pace because there was not a lockdown, or perhaps even increased from the pre-test because tobacco was also taken away. Even if the tobacco ban had no effect and assaults remained constant in C-Block from pre- to post-test, the lockdown in D-Block might make it appear that the tobacco ban led to increased assaults in C-Block. Thus, the researcher would not know if the post-test results for the C-Block treatment group were attributable to the tobacco ban or the simple fact that D-Block inmates were locked down and their assault activity was artificially reduced. In this instance, the comparison group becomes much less useful because the lockdown created a historical factor that imbalanced the groups during the treatment phase and nullified the comparison.

Maturation

Another potential alternative explanation is maturation. Maturation refers to the natural biological, psychological, or emotional processes we all experience as time passes—aging, becoming more or less intelligent, becoming bored, and so on. For example, if a researcher was interested in the effect of a boot camp on recidivism for juvenile offenders, it is possible that over the course of the boot camp program the delinquents naturally matured as they aged and this produced the reduction in recidivism—not that the boot camp somehow led to this reduction. This threat is particularly applicable in situations that deal with populations that rapidly change over a relatively short period of time or when a treatment lasts a considerable period of time. However, this threat could be eliminated with a comparison group that is similar to the treatment group. This is because the maturation effects would occur in both groups and the effect of the boot camp, if any, could be isolated. This assumes, however, that the groups are matched and equitable on factors subject to the maturation process, such as age. If not, such differentials could be an alternative explanation of results. For example, if the treatment and comparison groups differ by age, on average, this could mean that one group changes or matures at a different rate than the other group. This differential rate of change or maturation as a result of the age differential could explain the results, not the treatment. This example demonstrates how selection bias and maturation can interact at the same time as alternative explanations. This example also suggests the importance of an equivalent control or comparison group to eliminate or minimize the impact of maturation as an alternative explanation.

Attrition or Subject Mortality

Attrition or subject mortality is another typical alternative explanation. Attrition refers to differential loss in the number or type of subjects between the treatment and comparison groups and can occur in both experimental and quasi-experimental designs. Suppose we wanted to conduct a study to determine which of this textbook’s authors is the best research methods professor. Let’s assume that we have an experimental design where students were randomly assigned to professor 1, professor 2, or professor 3. By randomly assigning students to each respective professor, there is a greater probability that the groups are equivalent and thus differ in only one respect—the professor they receive and his or her particular teaching and delivery style. This is the treatment. Let’s also assume that the professors will administer the same tests and use the same textbook. After the group members are randomly assigned, a pre-treatment evaluation shows the groups are in fact equivalent on all important known factors that could influence post-test scores, such as grade point average, age, time in school, and exposure to research methods concepts. Additionally, all groups scored comparably on a pre-test of knowledge about research methods, so there is more confidence that the groups are in fact equivalent.

At the conclusion of the study, we find that professor 2’s group has the lowest final test scores of the three. However, because professor 2 is such an outstanding professor, the results appear odd. At first glance, the researcher thinks the results could have been influenced by students dropping out of the class. For example, perhaps several of professor 2’s students dropped the course but none did from the classes of professor 1 or 3. It is revealed, however, that an equal number of students dropped out of all three courses before the post-test and, therefore, this could not be the reason for the low scores in professor 2’s course. Upon further investigation, however, the researcher finds that although an equal number of students dropped out of each class, the dropouts in professor 2’s class were some of his best students. In contrast, those who dropped out of professor 1’s and professor 3’s courses were some of their poorest students. In this example, professor 2 appears to be the least effective teacher. However, this result appears to be due to the fact that his best students dropped out, and this highly influenced the final test average for his group. Although there was not a differential loss of subjects in terms of numbers (which can also be an attrition issue), there was differential loss in the types of students. This differential loss, not the teaching style, is an alternative explanation of the results.

Testing or Testing Bias

Another potential alternative explanation is testing or testing bias. Suppose that after the pre-test of research methods knowledge, professor 1 and professor 3 reviewed the test with their students and gave them the correct answers. Professor 2 did not. The fact that professor 1’s and professor 3’s groups did better on the post-test final exam may be explained by the finding that students in those groups remembered the answers to the pre-test, were thus biased at the pre-test, and this artificially inflated their post-test scores. Testing bias can explain the results because students in groups 1 and 3 may have simply remembered the answers from the pre-test review. In fact, the students in professor 1’s and 3’s courses may have scored high on the post-test without ever having been exposed to the treatment because they were biased at the pre-test.

Instrumentation

Another alternative explanation that can arise is instrumentation. Instrumentation refers to changes in the measuring instrument from pre- to post-test. Using the previous example, suppose professors 1 and 3 did not give the same final exam as professor 2. For example, professors 1 and 3 changed the final exam while professor 2 kept the final exam the same as the pre-test. Because professors 1 and 3 changed the exam, and perhaps made it easier or somehow different from the pre-test exam, results that showed lower scores for professor 2’s students may be related only to instrumentation changes from pre- to post-test. Obviously, to limit the influence of instrumentation, researchers should make sure that instruments remain consistent from pre- to post-test.

Reactivity

A final alternative explanation is reactivity. Reactivity occurs when members of the treatment or experimental group change their behavior simply as a result of being part of a study. This is akin to the finding that people tend to change their behavior when they are being watched or are aware they are being studied. If members of the experiment know they are part of an experiment and are being studied and watched, it is possible that their behavior will change independent of the treatment. If this occurs, the researcher will not know if the behavior change is the result of the treatment, or simply a result of being part of a study. For example, suppose a researcher wants to determine if a boot camp program impacts the recidivism of delinquent offenders. Members of the experimental group are sentenced to boot camp and members of the control group are released on their own recognizance to their parents. Because members of the experimental group know they are part of the experiment, and hence being watched closely after they exit boot camp, they may artificially change their behavior and avoid trouble. Their change of behavior may be totally unrelated to boot camp, but rather, to their knowledge of being part of an experiment.

Other Potential Alternative Explanations

The above discussion provided some typical alternative explanations that may arise with the designs discussed in this chapter. There are, however, other potential alternative explanations that may arise. These alternative explanations arise only when a control or comparison group is present.

One such alternative explanation is diffusion of treatment. Diffusion of treatment occurs when the control or comparison group learns about the treatment its members are being denied and attempts to mimic the behavior of the treatment group. If the control group is successful in mimicking the experimental group, for example, the results at the end of the study may show similarity in outcomes between groups and cause the researcher to conclude that the program had no effect. In fact, however, the finding of no effect can be explained by the comparison group mimicking the treatment group. 28 In reality, there may be no effect of the treatment, but the researcher would not know this for sure because the control group effectively transformed into another experimental group—there is then no baseline of comparison. Consider a study where a researcher wants to determine the impact of a training program on class behavior and participation. In this study, the experimental group is exposed to several sessions of training on how to act appropriately in class and how to engage in class participation. The control group does not receive such training, but they are aware that they are part of an experiment. Suppose after a few class sessions the control group starts to mimic the behavior of the experimental group, acting the same way and participating in class the same way. At the conclusion of the study, the researcher might determine that the program had no impact because the comparison group, which did not receive the new program, showed similar progress.

In a related explanation, sometimes the comparison or control group learns about the experiment and attempts to compete with the experimental or treatment group. This alternative explanation is called compensatory rivalry. For example, suppose a police chief wants to determine if a new training program will increase the endurance of SWAT team officers. The chief randomly assigns SWAT members to either an experimental or control group. The experimental group will receive the new endurance training program and the control group will receive the normal program that has been used for years. During the course of the study, suppose the control group learns that the treatment group is receiving the new endurance program and starts to compete with the experimental group. Perhaps the control group runs five more miles per day and works out an extra hour in the weight room, in addition to their normal endurance program. At the end of the study, and due to the control group’s extra and competing effort, the results might show no effect of the new endurance program, and at worst, experimental group members may show a decline in endurance compared to the control group. The rivalry or competing behavior actually explains the results, not that the new endurance program has no effect or a damaging effect. Although the new endurance program may in reality have no effect, this cannot be known because of the actions of the control group, who learned about the treatment and competed with the experimental group.

Closely related to compensatory rivalry is the alternative explanation of comparison or control group demoralization. 29 In this instance, instead of competing with the experimental or treatment group, the control or comparison group simply gives up and changes its normal behavior. Using the SWAT example, perhaps the control group simply quits its normal endurance program when members learn about the treatment group receiving the new endurance program. At the post-test, their endurance will likely drop considerably compared to the treatment group. Because of this, the new endurance program might emerge as a shining success. In reality, however, the researcher will not know if any changes in endurance between the experimental and control groups are a result of the new endurance program or of the control group giving up. Because the control group gave up, there is no longer a comparison group of equivalent others, and the change in endurance among the treatment group members could be attributed to a number of alternative explanations—for example, maturation. If the comparison group behaves normally, the researcher will be able to exclude maturation as a potential explanation, because any maturation effects will occur in both groups.

The previous discussion suggests that when the control or comparison group learns about the experiment and the treatment they are denied, potential alternative explanations can arise. Perhaps the best remedy to protect from the alternative explanations just discussed is to make sure the treatment and comparison groups do not have contact with one another. In laboratory experiments this can be ensured, but sometimes this is a problem in criminal justice studies, which are often conducted in the field.

The previous discussion also suggests that there are numerous alternative explanations that can impact the interpretation of results from a study. A careful researcher would know that alternative explanations must be ruled out before reaching a definitive conclusion about the impact of a particular program. The researcher must be attuned to these potential alternative explanations because they can influence results and how results are interpreted. Moreover, the discussion shows that several alternative explanations can occur at the same time. For example, it is possible that selection bias, maturation, attrition, and compensatory rivalry all emerge as alternative explanations in the same study. Knowing about these potential alternative explanations and how they can impact the results of a study is what distinguishes a consumer of research from an educated consumer of research.

Chapter Summary

The primary focus of this chapter was the classic experimental design, the foundation for other types of experimental and quasi-experimental designs. The classic experimental design is perhaps the most useful design when exploring causal relationships. Often, however, researchers cannot employ the classic experimental design to answer a research question. In fact, the classic experimental design is rare in criminal justice and criminology because it is often difficult to ensure random assignment for a variety of reasons. In circumstances where an experimental design is appropriate but not feasible, researchers may turn to one of many quasi-experimental designs. The most important difference between the two is that quasi-experimental designs do not feature random assignment. This can create potential problems for researchers. The main problem is that there is a greater chance the treatment and comparison groups may differ on important characteristics that could influence the results of a study. Although researchers can attempt to prevent imbalances between the groups by matching them on important known characteristics, it is still much more difficult to establish equivalence than it is in the classic experiment. As such, it becomes more difficult to determine what impact a treatment had, if any, as one moves from an experimental to a quasi-experimental design.

Perhaps the most important lesson to be learned in this chapter is that to be an educated consumer of research results requires an understanding of the type of design that produced the results. There are numerous ways experimental and quasi-experimental designs can be structured. This is why much attention was paid to the classic experimental design. In reality, all experimental and quasi-experimental designs are variations of the classic experiment in some way—adding or deleting certain components. If the components and organization and logic of the classic experimental design are understood, consumers of research will have a better understanding of the results produced from any sort of research design. For example, what problems in interpretation arise when a design lacks a pre-test, a control group, or random assignment? Having an answer to this question is a good start toward being an informed consumer of research results produced through experimental and quasi-experimental designs.

Critical Thinking Questions

1. Why is randomization/random assignment preferable to matching? Provide several reasons with explanation.

2. What are some potential reasons a researcher would not be able to utilize random assignment?

3. What is a major limitation of matching?

4. What is the difference between a longitudinal study and a cross-sectional study?

5. Describe a hypothetical study where maturation, and not the treatment, could explain the outcomes of the research.

association (or covariance or correlation): One of three conditions that must be met for establishing cause and effect, or a causal relationship. Association refers to the condition that X and Y must be related for a causal relationship to exist. Association is also referred to as covariance or correlation. Although two variables may be associated (or covary or be correlated), this does not automatically imply that they are causally related

attrition or subject mortality: A threat to internal validity, it refers to the differential loss of subjects between the experimental (treatment) and control (comparison) groups during the course of a study

cause and effect relationship: A cause and effect relationship occurs when one variable causes another, and no other explanation for that relationship exists

classic experimental design or experimental design: A design in a research study that features random assignment to an experimental or control group. Experimental designs can vary tremendously, but a constant feature is random assignment, experimental and control groups, and a post-test. For example, a classic experimental design features random assignment, a treatment, experimental and control groups, and pre- and post-tests

comparison group: The group in a quasi-experimental design that does not receive the treatment. In an experimental design, the comparison group is referred to as the control group

compensatory rivalry: A threat to internal validity, it occurs when the control or comparison group attempts to compete with the experimental or treatment group

control group: In an experimental design, the control group does not receive the treatment. The control group serves as a baseline of comparison to the experimental group. It serves as an example of what happens when a group equivalent to the experimental group does not receive the treatment

cross-sectional designs: A measurement of the pre-test and post-test at one point in time (e.g., six months before and six months after the program)

demoralization: A threat to internal validity closely associated with compensatory rivalry, it occurs when the control or comparison group gives up and changes their normal behavior. While in compensatory rivalry the group members compete, in demoralization, they simply quit. Neither is a normal behavioral reaction

dependent variable: Also known as the outcome in a research study. A post-test is a measure of the dependent variable

diffusion of treatment: A threat to internal validity, it occurs when the control or comparison group members learn that they are not getting the treatment and attempt to mimic the behavior of the experimental or treatment group. This mimicking may make it seem as if the treatment is having no effect, when in fact it may be

elimination of alternative explanations: One of three conditions that must be met for establishing cause and effect. Elimination of alternative explanations means that the researcher has ruled out other explanations for an observed relationship between X and Y

experimental group: In an experimental design, the experimental group receives the treatment

history: A threat to internal validity, it refers to any event experienced differently by the treatment and comparison groups—an event that could explain the results other than the supposed cause

independent variable: Also called the cause

instrumentation: A threat to internal validity, it refers to changes in the measuring instrument from pre- to post-test

longitudinal: Refers to repeated measurements of the pre-test and post-test over time, typically for the same group of individuals. This is the opposite of cross-sectional

matching: A process sometimes utilized in some quasi-experimental designs that feature treatment and comparison groups. Matching is a process whereby the researcher attempts to ensure equivalence between the treatment and comparison groups on known information, in the absence of the ability to randomly assign the groups

maturation: A threat to internal validity, maturation refers to the natural biological, psychological, or emotional processes as time passes

negative association: Refers to a negative association between two variables. A negative association is demonstrated when X increases and Y decreases, or X decreases and Y increases. Also known as an inverse relationship—the variables moving in opposite directions

operationalized or operationalization: Refers to the process of assigning a working definition to a concept. For example, the concept of intelligence can be operationalized or defined as grade point average or score on a standardized exam, among others

pilot program or test: Refers to a smaller test study or pilot to work out problems before a larger study and to anticipate changes needed for a larger study. Similar to a test run

positive association: Refers to a positive association between two variables. A positive association means as X increases, Y increases, or as X decreases, Y decreases

post-test: The post-test is a measure of the dependent variable after the treatment has been administered

pre-test: The pre-test is a measure of the dependent variable or outcome before a treatment is administered

quasi-experiment: A quasi-experiment refers to any number of research design configurations that resemble an experimental design but primarily lack random assignment. In the absence of random assignment, quasi-experimental designs feature matching to attempt equivalence

random assignment: Refers to a process whereby members of the experimental group and control group are assigned to each group through a random and unbiased process

random selection: Refers to selecting a smaller but representative subset from a population. Not to be confused with random assignment

reactivity: A threat to internal validity, it occurs when members of the experimental (treatment) or control (comparison) group change their behavior unnaturally as a result of being part of a study

selection bias: A threat to internal validity, selection bias occurs when the experimental (treatment) group and control (comparison) group are not equivalent. The difference between the groups can be a threat to internal validity, or, an alternative explanation to the findings

spurious: A spurious relationship is one where X and Y appear to be causally related, but in fact the relationship is actually explained by a variable or factor other than X

testing or testing bias: A threat to internal validity, it refers to the potential of study members being biased prior to a treatment, and this bias, rather than the treatment, may explain study results

threat to internal validity: Also known as alternative explanation to a relationship between X and Y. Threats to internal validity are factors that explain Y, or the dependent variable, and are not X, or the independent variable

timing: One of three conditions that must be met for establishing cause and effect. Timing refers to the condition that X must come before Y in time for X to be a cause of Y. While timing is necessary for a causal relationship, it is not sufficient, and considerations of association and eliminating other alternative explanations must be met

treatment: A component of a research design, it is typically denoted by the letter X. In a research study on the impact of teen court on juvenile recidivism, teen court is the treatment. In a classic experimental design, the treatment is given only to the experimental group, not the control group

treatment group: The group in a quasi-experimental design that receives the treatment. In an experimental design, this group is called the experimental group

unit of analysis: Refers to the focus of a research study as being individuals, groups, or other units of analysis, such as prisons or police agencies, and so on

variable(s): A variable is a concept that has been given a working definition and can take on different values. For example, intelligence can be defined as a person’s grade point average and can range from low to high or can be defined numerically by different values such as 3.5 or 4.0

1 Povitsky, W., N. Connell, D. Wilson, & D. Gottfredson. (2008). “An experimental evaluation of teen courts.” Journal of Experimental Criminology, 4, 137–163.

2 Hirschi, T., and H. Selvin (1966). “False criteria of causality in delinquency.” Social Problems, 13, 254–268.

3 Robert Roy Britt, “Churchgoers Live Longer.” April 3, 2006. http://www.livescience.com/health/060403_church_good.html. Retrieved on September 30, 2008.

4 Kalist, D., and D. Yee (2009). “First names and crime: Does unpopularity spell trouble?” Social Science Quarterly, 90 (1), 39–48.

5 Sherman, L. (1992). Policing domestic violence. New York: The Free Press.

6 For historical and interesting reading on the effects of weather on crime and other disorder, see Dexter, E. (1899). “Influence of weather upon crime.” Popular Science Monthly, 55, 653–660 in Horton, D. (2000). Pioneering Perspectives in Criminology. Incline Village, NV: Copperhouse.

7 http://www.escapistmagazine.com/news/view/111191-Less-Crime-in-U-S-Thanks-to-Videogames, retrieved on September 13, 2011. This news article was in response to a study titled “Understanding the effects of violent videogames on violent crime.” See Cunningham, Scott, Benjamin Engelstätter, and Ward (April 7, 2011). Available at SSRN: http://ssrn.com/abstract=1804959.

8 Cohn, E. G. (1987). “Changing the domestic violence policies of urban police departments: Impact of the Minneapolis experiment.” Response, 10 (4), 22–24.

9 Schmidt, Janell D., & Lawrence W. Sherman (1993). “Does arrest deter domestic violence?” American Behavioral Scientist, 36 (5), 601–610.

10 Maxwell, Christopher D., Joel H. Garner, & Jeffrey A. Fagan. (2001). The effects of arrest on intimate partner violence: New evidence for the spouse assault replication program. Washington, D.C.: National Institute of Justice.

11 Miller, N. (2005). What does research and evaluation say about domestic violence laws? A compendium of justice system laws and related research assessments. Alexandria, VA: Institute for Law and Justice.

12 The sections on experimental and quasi-experimental designs rely heavily on the seminal work of Campbell and Stanley (Campbell, D. T., & J. C. Stanley. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand McNally) and more recently, Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin.

13 Povitsky et al. (2008). p. 146, note 9.

14 Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin Company.

15 Ibid., 15.

16 Finckenauer, James O. (1982). Scared straight! and the panacea phenomenon. Englewood Cliffs, N.J.: Prentice Hall.

17 Yarborough, J.C. (1979). Evaluation of JOLT (Juvenile Offenders Learn Truth) as a deterrence program. Lansing, MI: Michigan Department of Corrections.

18 Petrosino, Anthony, Carolyn Turpin-Petrosino, & James O. Finckenauer. (2000). “Well-meaning programs can have harmful effects! Lessons from experiments of programs such as Scared Straight.” Crime and Delinquency, 46, 354–379.

19 “Swearing makes pain more tolerable,” retrieved at http://www.livescience.com/health/090712-swearing-pain.html (July 13, 2009). Also see “Bleep! My finger! Why swearing helps ease pain” by Tiffany Sharples, retrieved at http://www.time.com/time/health/article/0,8599,1910691,00.html?xid=rss-health (July 16, 2009).

20 For an excellent discussion of the value of controlled experiments and why they are so rare in the social sciences, see Sherman, L. (1992). Policing domestic violence. New York: The Free Press, 55–74.

21 For discussion, see Weisburd, D., T. Einat, & M. Kowalski. (2008). “The miracle of the cells: An experimental study of interventions to increase payment of court-ordered financial obligations.” Criminology and Public Policy, 7, 9–36.

22 Shadish, Cook, & Campbell. (2002).

24 Kelly, Cathy. (March 15, 2009). “Tickets in the mail: Red-light cameras questioned.” Santa Cruz Sentinel.

25 Retting, Richard, Susan Ferguson, & Charles Farmer. (January 2007). “Reducing red light running through longer yellow signal timing and red light camera enforcement: Results of a field investigation.” Arlington, VA: Insurance Institute for Highway Safety.

26 Shadish, Cook, & Campbell. (2002).

27 See Shadish, Cook, & Campbell. (2002), pp. 54–61 for an excellent discussion of threats to internal validity. Also see Chapter 2 for an extended discussion of all forms of validity considered in research design.

28 Trochim, W. (2001). The research methods knowledge base, 2nd ed. Cincinnati, OH: Atomic Dog.

Applied Research Methods in Criminal Justice and Criminology by University of North Texas is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.


Chapter 7: Nonexperimental Research

Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix  quasi  means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). [1] Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A  nonequivalent groups design , then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This design would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.
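
To make this concrete, here is a minimal sketch, using entirely hypothetical pretest scores, of how a researcher might check baseline equivalence between two intact classes before the study begins. The class sizes, score distributions, and the choice of Welch’s t-test are illustrative assumptions, not details from the example above.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
class_a = rng.normal(loc=52, scale=10, size=28)   # hypothetical pretest scores, class A
class_b = rng.normal(loc=50, scale=10, size=27)   # hypothetical pretest scores, class B

# Welch's t-test: do the intact groups already differ before the treatment?
t_stat, p_value = stats.ttest_ind(class_a, class_b, equal_var=False)
print(f"pretest means: {class_a.mean():.1f} vs {class_b.mean():.1f}")
print(f"Welch t = {t_stat:.2f}, p = {p_value:.3f}")
# A sizable, significant pretest gap warns that posttest differences may
# reflect pre-existing nonequivalence rather than the teaching method.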

Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners, and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001).[2] Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
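
The selection problem described above is easy to demonstrate numerically. The following small simulation uses invented ability and measurement-noise parameters: students are selected because of extremely low first-test scores and then retested, with no treatment effect built in at all.

import numpy as np

rng = np.random.default_rng(1)
true_ability = rng.normal(50, 10, size=1000)
test1 = true_ability + rng.normal(0, 8, size=1000)   # observed score = ability + luck
test2 = true_ability + rng.normal(0, 8, size=1000)   # retest: fresh luck, no treatment

selected = test1 < np.percentile(test1, 10)          # pick the bottom 10% on test 1
print(f"selected group, test 1 mean: {test1[selected].mean():.1f}")
print(f"selected group, test 2 mean: {test2[selected].mean():.1f}")
# The retest mean rises even though nothing was done: the extreme first
# scores were partly bad luck, and luck does not repeat systematically.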

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952).[3] But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate without receiving psychotherapy. This parallel suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here: Classics in the History of Psychology.

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980).[4] They found that, overall, psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this one is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979).[5] Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.3 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.3 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.3 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

[Figure 7.3 appears here; an image description is provided below.]
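
One standard way to analyze data like these is segmented regression, which estimates the baseline trend along with any immediate level change and slope change at the interruption. The sketch below is illustrative only: the weekly absence counts are hypothetical numbers patterned after the top panel of Figure 7.3, and the model specification is a common convention rather than anything prescribed in this chapter.

import numpy as np
import statsmodels.api as sm

weeks = np.arange(1, 15)                          # 14 weekly measurements
treated = (weeks >= 8).astype(float)              # attendance-taking begins at week 8
weeks_since = np.where(weeks >= 8, weeks - 7, 0)  # time elapsed since the interruption

# Hypothetical counts: a high, flat baseline with a sustained drop after week 7
absences = np.array([6, 5, 7, 6, 8, 5, 6, 2, 1, 3, 2, 1, 2, 2], dtype=float)

X = sm.add_constant(np.column_stack([weeks, treated, weeks_since]))
fit = sm.OLS(absences, X).fit()
print(fit.params)  # [intercept, baseline trend, level change, slope change]
# A large negative "level change" alongside a stable baseline trend is the
# pattern the top panel of Figure 7.3 depicts.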

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this change in attitude could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
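
The logic of this comparison can be expressed as a simple difference in differences: subtract each group’s pretest mean from its posttest mean, then compare the two changes. The sketch below uses invented attitude scores and effect sizes purely to illustrate the arithmetic.

import numpy as np

rng = np.random.default_rng(2)
pre_treat  = rng.normal(50, 10, size=100)                 # treatment school, pretest
post_treat = pre_treat + 8 + rng.normal(0, 5, size=100)   # program effect + shared history
pre_ctrl   = rng.normal(50, 10, size=100)                 # control school, pretest
post_ctrl  = pre_ctrl + 3 + rng.normal(0, 5, size=100)    # shared history/maturation only

change_treat = post_treat.mean() - pre_treat.mean()
change_ctrl  = post_ctrl.mean() - pre_ctrl.mean()
print(f"treatment group change:    {change_treat:.1f}")
print(f"control group change:      {change_ctrl:.1f}")
print(f"difference in differences: {change_treat - change_ctrl:.1f}")
# History and maturation move both groups; the program's contribution is
# roughly the amount by which the treatment change exceeds the control change.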

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the most important types are nonequivalent groups designs, pretest-posttest designs, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Image Descriptions

Figure 7.3 image description: Two line graphs charting the number of absences per week over 14 weeks. The first 7 weeks are without treatment and the last 7 weeks are with treatment. In the first line graph, there are between 4 and 8 absences each week. After the treatment, the absences drop to 0 to 3 each week, which suggests the treatment worked. In the second line graph, there is no noticeable change in the number of absences per week after the treatment, which suggests the treatment did not work.

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.
  • Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.
  • Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.
  • Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Nonequivalent groups design: A between-subjects design in which participants have not been randomly assigned to conditions.

Pretest-posttest design: A design in which the dependent variable is measured once before the treatment is implemented and once after it is implemented.

History: A category of alternative explanations for differences between scores, such as events that happened between the pretest and posttest that are unrelated to the study.

Maturation: A category of alternative explanations referring to ways participants might have changed between the pretest and posttest that were going to happen anyway because they are growing and learning.

Regression to the mean: The statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion.

Spontaneous remission: The tendency for many medical and psychological problems to improve over time without any form of treatment.

Interrupted time-series design: A design based on a set of measurements taken at intervals over a period of time that is interrupted by a treatment.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.


14.3 Quasi-experimental designs

Learning Objectives

Learners will be able to…

  • Describe a quasi-experimental design in social work research
  • Understand the different types of quasi-experimental designs
  • Determine what kinds of research questions quasi-experimental designs are suited for
  • Discuss advantages and disadvantages of quasi-experimental designs

Quasi-experimental designs are a lot more common in social work research than true experimental designs. Although quasi-experiments don’t do as good a job of mitigating threats to internal validity, they still allow us to establish temporality, which is a criterion for establishing nomothetic causality. The prefix quasi means “resembling,” so quasi-experimental research is research that resembles experimental research, but is not true experimental research. Nonetheless, given proper attention, quasi-experiments can still provide rigorous and useful results.

The primary difference between quasi-experimental research and true experimental research is that quasi-experimental research does not involve random assignment to control and experimental groups. Instead, we talk about comparison groups in quasi-experimental research. Because of this, these types of experiments don’t control for extraneous variables as well as true experiments do, which leaves larger threats to internal validity.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. Realistically, our example of the CBT-social anxiety project is likely to be a quasi-experiment, based on the resources and participant pool we’re likely to have available. There are different kinds of quasi-experiments, and we will discuss the main types below: nonequivalent comparison group designs, static-group designs, ex post facto comparison group designs, and time series designs.

Nonequivalent comparison group design

This type of design looks very similar to the classical experimental design that we discussed in section 14.2. But instead of random assignment to control and experimental groups, researchers use other methods to construct their comparison and experimental groups. Researchers using this design will try to select a comparison group that’s as similar to the experimental group as possible on the factors relevant to the study.

A diagram of this design will also look very similar to the pretest/post-test design, but you’ll notice we’ve removed the “R” from our groups, since they are not randomly assigned (Figure 14.6).

[Figure 14.6: Diagram of the nonequivalent comparison group design]

This kind of design provides weaker evidence that the intervention itself leads to a change in outcome. Nonetheless, we are still able to establish time order using this method, and can thereby show an association between the intervention and the outcome. Like true experimental designs, this type of quasi-experimental design is useful for explanatory research questions.

What might this look like in a practice setting? Let’s say you’re working at an agency that provides CBT and other types of interventions, and you have identified a group of clients who are seeking help for social anxiety, as in our earlier example. Once you’ve obtained consent from your clients, you can create a comparison group using one of the matching methods we just discussed. If the group is small, you might match using individual matching, but if it’s larger, you’ll probably sort people by demographics to try to get similar population profiles. (You can do aggregate matching more easily when your agency has some kind of electronic records or database, but it’s still possible to do manually.)
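
As a toy illustration of aggregate matching, the sketch below draws a comparison group from a hypothetical agency pool so that its demographic profile mirrors the experimental group’s, without pairing specific individuals. The column names, records, and stratified sampling approach are all assumptions invented for the example.

import pandas as pd

experimental = pd.DataFrame({  # hypothetical clients receiving the intervention
    "gender":   ["F", "F", "M", "F", "M", "F"],
    "age_band": ["18-25", "26-40", "18-25", "26-40", "26-40", "18-25"],
})
pool = pd.DataFrame({          # hypothetical clients not receiving it
    "gender":   ["F", "M", "F", "M", "F", "F", "M", "F", "M", "F", "F", "M"],
    "age_band": ["18-25", "18-25", "26-40", "26-40", "18-25", "26-40",
                 "18-25", "18-25", "26-40", "26-40", "18-25", "26-40"],
})

# How many comparison cases to draw from each demographic stratum
target = experimental.value_counts(["gender", "age_band"])
parts = [
    pool[(pool.gender == g) & (pool.age_band == a)].sample(n, random_state=0)
    for (g, a), n in target.items()
]
comparison = pd.concat(parts)
print(comparison.value_counts(["gender", "age_band"]))
# The comparison group now has the same gender-by-age profile as the
# experimental group, even though no individual cases were paired.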

Static-group design

Another type of quasi-experimental research is the static-group design. In this type of research, there are both comparison and experimental groups, which are not randomly assigned. There is no pretest, only a post-test, and the comparison group has to be constructed by the researcher. Sometimes researchers will use matching techniques to construct the groups, but often the groups are constructed by convenience, based on who happens to be served at an agency.

Ex post facto comparison group design

Ex post facto (Latin for “after the fact”) designs are extremely similar to nonequivalent comparison group designs. There are still comparison and experimental groups, pretest and post-test measurements, and an intervention. But in ex post facto designs, participants are assigned to the comparison and experimental groups once the intervention has already happened. This type of design often occurs when interventions are already up and running at an agency and the agency wants to assess effectiveness based on people who have already completed treatment.

In most clinical agency environments, social workers conduct both initial and exit assessments, so some kind of pretest and post-test measures are usually available. We also typically collect demographic information about our clients, which could allow us to use some kind of matching to construct comparison and experimental groups.

In terms of internal validity and establishing causality, ex post facto designs are a bit of a mixed bag. The ability to establish causality depends partially on the ability to construct comparison and experimental groups that are demographically similar, so we can control for these extraneous variables.

Propensity Score Matching

There are more advanced ways to match participants in the experimental and comparison groups based on statistical analyses. Researchers using a quasi-experimental design may consider using a matching algorithm to select people for the experimental and comparison groups based on their similarity on key variables (or “covariates”). This allows the assignment to be considered “as good as random” after conditioning on the covariates.

Propensity Score Matching (PSM; Rosenbaum & Rubin, 1983)[1] is one such algorithm, in which the probability of being assigned to the treatment group is modeled as a function of several covariates using logistic regression. However, to use Propensity Score Matching, researchers need a relatively large initial sample because the technique reduces the final sample during the statistical matching process. The need for a large sample means Propensity Score Matching may not be feasible for all projects.
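
Below is a simplified sketch of one common PSM workflow, offered as an illustration rather than the canonical procedure from Rosenbaum and Rubin: propensity scores are estimated with logistic regression on simulated covariates, and each treated case is then matched, with replacement, to the control case with the nearest score. The data, coefficients, and one-to-one nearest-neighbor choice are all invented for the example.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
n = 500
X = rng.normal(size=(n, 3))                      # simulated covariates (e.g., age, severity)
logit = X @ np.array([0.8, 0.5, -0.3])           # invented treatment-selection process
treated = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Step 1: model the probability of treatment from the covariates
scores = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: match each treated case to the control with the nearest score
treat_idx = np.flatnonzero(treated == 1)
ctrl_idx = np.flatnonzero(treated == 0)
nn = NearestNeighbors(n_neighbors=1).fit(scores[ctrl_idx].reshape(-1, 1))
_, match = nn.kneighbors(scores[treat_idx].reshape(-1, 1))
matched_ctrl = ctrl_idx[match.ravel()]
print(f"{len(treat_idx)} treated cases matched to {len(set(matched_ctrl))} distinct controls")
# The matched comparison group now resembles the treatment group on the
# covariates, which is why PSM needs a large pool of potential controls.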

Time series design

Another type of quasi-experimental design is a time series design. Unlike other types of experimental design, time series designs do not have a comparison group. A time series is a set of measurements taken at intervals over a period of time (Figure 14.7). A proper time series design should include at least three pre- and post-intervention measurement points. While there are a few types of time series designs, we’re going to focus on the most common: the interrupted time series design.

[Figure 14.7: Diagram of the interrupted time series design]

But why use this method? Here’s an example. Let’s think about elementary student behavior throughout the school year. As anyone with children or who is a teacher knows, kids get very excited and animated around holidays, days off, or even just on a Friday afternoon. This fact might mean that around those times of year, there are more reports of disruptive behavior in classrooms. What if we took our one and only measurement in mid-December? It’s possible we’d see a higher-than-average rate of disruptive behavior reports, which could bias our results if our next measurement falls at a time of year when students are in a different, less excitable frame of mind. When we take multiple measurements throughout the first half of the school year, we can establish a more accurate baseline for the rate of these reports by looking at the trend over time.

We may want to test the effect of extended recess times in elementary school on reports of disruptive behavior in classrooms. When students come back after the winter break, the school extends recess by 10 minutes each day (the intervention), and the researchers start tracking the monthly reports of disruptive behavior again. These reports could be subject to the same fluctuations as the pre-intervention reports, and so we once again take multiple measurements over time to try to control for those fluctuations.

This method improves the extent to which we can establish causality because we are accounting for a major extraneous variable in the equation—the passage of time. On its own, it does not allow us to account for other extraneous variables, but it does establish time order and association between the intervention and the trend in reports of disruptive behavior. Finding a stable condition before the treatment that changes after the treatment is evidence for causality between treatment and outcome.
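
A tiny numerical illustration of the baseline point, using invented monthly counts: judged against a single December measurement, an ordinary January looks like a large improvement, while a multi-point baseline tells a truer story.

import numpy as np

# Hypothetical monthly reports of disruptive behavior, September through January
reports = np.array([12, 11, 13, 21, 12])   # December (index 3) spikes around the holidays

print(f"baseline from a single December measurement: {reports[3]}")
print(f"baseline averaged over all five months:      {reports.mean():.1f}")
# Against the December-only baseline, an unchanged month of about 12 reports
# would look like a big drop; against the multi-point baseline it would not.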

Quasi-experimental designs are common in social work intervention research because, when designed correctly, they balance the intense resource needs of true experiments with the realities of research in practice. They still offer researchers tools to gather robust evidence about whether interventions are having positive effects for clients.

Key Takeaways

  • Quasi-experimental designs are similar to true experiments, but do not require random assignment to experimental and control groups.
  • In quasi-experimental projects, the group not receiving the treatment is called the comparison group, not the control group.
  • Nonequivalent comparison group design is nearly identical to the pretest/post-test experimental design, but participants are not randomly assigned to the experimental and comparison groups. As a result, this design provides slightly less robust evidence for causality.
  • Time series design does not have a control or experimental group, and instead compares the condition of participants before and after the intervention by measuring relevant factors at multiple points in time. This allows researchers to mitigate the error introduced by the passage of time.
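
  • Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.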

Aggregate matching: In nonequivalent comparison group designs, the process in which researchers match the population profile of the comparison and experimental groups.

Comparison group: The group of participants in our study who do not receive the intervention we are researching, in experiments without random assignment.

Control group: The group of participants in our study who do not receive the intervention we are researching, in experiments with random assignment.

Individual matching: In nonequivalent comparison group designs, the process by which researchers match individual cases in the experimental group to similar cases in the comparison group.

Internal validity: The ability to say that one variable “causes” something to happen to another variable; very important to assess in studies that examine causation, such as experimental or quasi-experimental designs.
a paradigm based on the idea that social context and interaction frame our realities

The extent to which different observers are consistent in their assessment or rating of a particular characteristic or item.

the various aspects or dimensions that come together in forming our identity

A level of measurement that is continuous, can be rank ordered, is exhaustive and mutually exclusive, and for which the distance between attributes is known to be equal. But for which there is no zero point.

An interview guide is a document that outlines the flow of information during your interview, including a greeting and introduction to orient your participant to the topic, your questions and any probes, and any debriefing statement you might include. If you are part of a research team, your interview guide may also include instructions for the interviewer if certain things are brought up in the interview or as general guidance.

A questionnaire that is read to respondents

any possible changes in interviewee responses based on how or when the researcher presents question-and-answer options

A form of data gathering where researchers ask individual participants to respond to a series of (mostly open-ended) questions.

Type of reliability in which a rater rates something the same way on two different occasions.

a statistic ranging from 0 to 1 that measures how much outcomes (1) within a cluster are likely to be similar or (2) between different clusters are likely to be different

a “gut feeling” about what to do based on previous experience or knowledge

yer gut feelin'

occurs when two variables change in opposite directions - one goes up, the other goes down and vice versa; also called negative association

when the order in which the items are presented affects people’s responses

An iterative approach means that after planning and once we begin collecting data, we begin analyzing as data as it is coming in.  This early analysis of our (incomplete) data, then impacts our planning, ongoing data gathering and future analysis as it progresses.

a nonlinear process in which the original product is revised over and over again to improve it

One of the three ethical principles in the Belmont Report. States that benefits and burdens of research should be distributed fairly.

Someone who is especially knowledgeable about a topic being studied.

the words or phrases in your search query

when a participant's answer to a question is altered due to the way in which a question is written. In essence, the question leads the participant to answer in a specific way.

The level that describes how data for variables are recorded. The level of measurement defines the type of operations can be conducted with your data. There are four levels: nominal, ordinal, interval, and ratio.

measuring people’s attitude toward something by assessing their level of agreement with several statements about it

A research process where you create a plan, you gather your data, you analyze your data and each step is completed before you proceed to the next.

a statistical technique that can be used to predict how an independent variable affects a dependent variable in the context of other variables.

A science that deals with the principles and criteria of validity of inference and demonstration: the science of the formal principles of reasoning.

A graphic depiction (road map) that presents the shared relationships among the resources, activities, outputs, outcomes, and impact for your program

Researcher collects data from participants at multiple points over an extended period of time using a questionnaire.

examining social structures and institutions

The strength of a correlation, determined by the absolute value of a correlation coefficient

a type of survey question that lists a set of questions for which the response options are all the same in a grid layout

A purposive sampling strategy where you choose cases because they represent a range of very different perspectives on a topic

Also called the average, the mean is calculated by adding all your cases and dividing the total by the number of cases.

One number that can give you an idea about the distribution of your data.

The process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena under investigation in a research study.

The differerence between that value that we get when we measure something and the true value

Instrument or tool that operationalizes (measures) the concept that you are studying.

The value in the middle when all our values are placed in numerical order. Also called the 50th percentile.

Variables that refer to the mechanisms by which an independent variable might affect a dependent variable.

Member checking involves taking your results back to participants to see if we "got it right" in our analysis. While our findings bring together many different peoples' data into one set of findings, participants should still be able to recognize their input and feel like their ideas and experiences have been captured adequately.

approach to recruitment where participants are members of an organization or social group with identified membership

Memoing is the act of recording your thoughts, reactions, quandaries as you are reviewing the data you are gathering.

A written agreement between parties that want to participate in a collaborative project.

level of interaction or activity that exists between groups and within communities

a study that combines raw data from multiple quantitative studies and analyzes the pooled data using statistics

a study that combines primary data from multiple qualitative sources and analyzes the pooled data

an explanation of why you chose the specific design of your study; why do your chosen methods fit with the aim of your research

A description of how research is conducted.

level of interaction or activity that exists at the smallest level, usually among individuals

Usually unintentional. A very broad category that covers things such as not using the proper statistics for analysis, injecting bias into your study and into the interpretation of results, or being careless with your research methodology.

when researchers use both quantitative and qualitative methods in a project

The most commonly occurring value of a variable.

A variable that affects the strength and/or direction of the relationship between the independent and dependent variables.

concepts that are comprised of multiple elements

An empirical structure for measuring items or indicators of the multiple dimensions of a concept.

A group of statistical techniques that examines the relationship between at least three variables

Mutually exclusive categories are options for closed-ended questions that do not overlap, so people only fit into one category or another, not both.

Those stories that we compose as human beings that allow us to make meaning of our experiences and the world around us

US legislation passed in 1974, which created the National Commission for the Protection of Human Subjects in Biomedical and Behavioral Research, which went on to produce The Belmont Report.

collecting data in the field where it naturally/normally occurs

Making qualitative observations that attempt to capture the subjects of the observation as unobtrusively as possible and with limited structure to the observation.

Including data that contrasts, contradicts, or challenges the majority of evidence that we have found or expect to find

occurs when two variables change in opposite directions - one goes up, the other goes down and vice versa

ensuring that we have correctly captured and reflected an accurate understanding in our findings by clarifying and verifying our findings with our participants

The idea that qualitative researchers attempt to limit or at the very least account for their own biases, motivations, interests and opinions during the research process.

The lowest level of measurement; categories cannot be mathematically ranked, though they are exhaustive and mutually exclusive

causal explanations that can be universally applied to groups, such as scientific laws or universal truths

provides a more general, sweeping explanation that is universally true for all people

sampling approaches for which a person’s likelihood of being selected for membership in the sample is unknown

Referring to data analysis that doesn't examine how variables relate to each other.

If the majority of targeted respondents fail to respond to a survey, a legitimate concern is whether non-respondents are failing to respond for a systematic reason. This may raise questions about the validity of the study's results, especially regarding the representativeness of the sample.

The bias that occurs when those who respond to your request to participate in a study are different from those who do not respond to your request to participate in a study.

an association between two variables that is NOT caused by a third variable

the assumption that no relationship exists between the variables in question

The Nuremberg Code is a 10-point set of research principles designed to guide doctors and scientists who conduct research on human subjects, crafted in response to the atrocities committed during the Holocaust.

a single truth, observed without bias, that is universally applicable

Observation is a tool for data gathering where researchers rely on their own senses (e.g. sight, sound) to gather information on a topic.

In measurement, conditions that are easy to identify and verify through direct observation.

The rows in your data set. In social work, these are often your study participants (people), but can be anything from census tracts to black bears to trains.

including more than one member of your research team to aid in analyzing the data

The federal government agency that oversees IRBs.

a statistical procedure to compare the means of a variable across three or more groups
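
As a brief illustration (added here, not from the original glossary), a one-way ANOVA can be run in a few lines with SciPy; the three groups of scores below are hypothetical:

    # Minimal sketch: one-way ANOVA comparing the means of three groups.
    # All scores are hypothetical.
    from scipy import stats

    group_a = [23, 25, 28, 30, 27]
    group_b = [31, 33, 29, 35, 34]
    group_c = [22, 24, 21, 26, 23]

    f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
    # A small p-value suggests at least one group mean differs from the others.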

assumptions about what is real and true

journal articles that are made freely available by the publisher

An initial phase of coding that involves reviewing the data to determine the preliminary ideas that seem important and potential labels that reflect their significance.

sharing one's data and methods for the purposes of replication, verifiability, and collaboration of findings

Questions for which the researcher does not include response options, allowing for respondents to answer the question in their own words

According to the APA Dictionary of Psychology, an operational definition is "a description of something in terms of the operations (procedures, actions, or processes) by which it could be observed and measured. For example, the operational definition of anxiety could be in terms of a test score, withdrawal from a situation, or activation of the sympathetic nervous system. The process of creating an operational definition is known as operationalization."

process by which researchers spell out precisely how a concept will be measured in their study

Oral histories are a type of qualitative research design that offers a detailed accounting of a person's life, an event, or an experience. These stories are aimed at answering a specific research question.

verbal presentation of research findings to a conference audience

Level of measurement that follows nominal level. Has mutually exclusive categories and a hierarchy (rank order), but we cannot calculate a mathematical distance between attributes.

Extreme values in your data.

summarizes the incompatibility between a particular set of data and a proposed model for the data, usually the null hypothesis. The lower the p-value, the more inconsistent the data are with the null hypothesis, indicating that the relationship is statistically significant.
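
To make this concrete (an illustrative sketch added here, not part of the original glossary), here is how a p-value is commonly obtained from an independent-samples t-test in Python; the two samples are hypothetical:

    # Minimal sketch: compute a p-value via an independent-samples t-test.
    # The two samples are hypothetical.
    from scipy import stats

    treatment = [14, 15, 15, 16, 18, 17]
    control = [12, 13, 12, 14, 13, 12]

    t_stat, p_value = stats.ttest_ind(treatment, control)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # By convention, p < 0.05 is often read as statistically significant,
    # i.e., the data are quite inconsistent with the null hypothesis.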

group presentations that feature experts on a given issue, with time for audience question-and-answer

A type of longitudinal design where the researchers gather data at multiple points in time and the same people participate in the survey each time it is administered.

Those who are asked to contribute data in a research study; sometimes called respondents or subjects.

An approach to research that more intentionally attempts to involve community members throughout the research process compared to more traditional research methods. In addition, participatory approaches often seek some concrete, tangible change for the benefit of the community (often defined by the community).

when a publisher prevents access to reading content unless the user pays money

A qualitative research tool for enhancing rigor by partnering with a peer researcher who is not connected with your project (and is therefore more objective) to discuss project details, your decisions, and perhaps your reflexive journal, as a means of helping to reduce researcher bias and maintain consistency and transparency in the research process.

a formal process in which other esteemed researchers and experts ensure your work meets the standards and expectations of the professional field

trade publications, magazines, and newspapers

the tendency for a pattern to occur at regular intervals

A qualitative research design that aims to capture and describe the lived experience of some event or "phenomenon" for a group of people.

Photovoice is a technique that merges pictures with narrative (word or voice data) that helps interpret the meaning or significance of the visual artifact. It is often used as a tool in CBPR.

Testing out your research materials in advance on people who are not included as participants in your study.

as a criterion for a causal relationship, the relationship must make logical sense and seem possible

A purposive sampling strategy that focuses on selecting cases that are important in representing a contemporary politicized issue.

the larger group of people you want to be able to make conclusions about based on the conclusions you draw from the people in your sample

A statement about the researcher's worldview and life experiences, specifically with respect to the research topic they are studying. It helps to demonstrate the subjective connection(s) the researcher has to the topic and is a way to encourage transparency in research.

a paradigm guided by the principles of objectivity, knowability, and deductive logic

A measure of a participant's condition after an intervention or, if they are part of the control/comparison group, at the end of an experiment.

an experimental design in which participants are randomly assigned to control and treatment groups, one group receives an intervention, and both groups receive only a post-test assessment

presentations that use a poster to visually represent the elements of the study

the probability that you will detect a significant relationship between variables when one is truly present in your sample

describe “how things are done” or comment on pressing issues in practice (Wallace & Wray, 2016, p. 20)

How well your findings can be translated and used in the "real world." For example, you may have a statistically significant correlation; however, the relationship may be very weak. This limits your ability to use these data for real-world change.

improvements in cognitive assessments due to exposure to the instrument

knowledge gained through “learning by doing” that guides social work intervention and increases over time

a research paradigm that suspends questions of philosophical ‘truth’ and focuses more on how different philosophies, theories, and methods can be used strategically to resolve a problem or question within the researcher's unique context

A type of criterion validity that examines how well your tool predicts a future criterion.

A measure of a participant's condition before they receive an intervention or treatment.

a type of experimental design in which participants are randomly assigned to control and experimental groups, one group receives an intervention, and both groups receive pre- and post-test assessments

Data you have collected yourself.

in a literature review, a source that describes primary data collected and analyzed by the author, rather than only reviewing what other researchers have found

This means that one scientist could repeat another’s study with relative ease. By replicating a study, we may become more (or less) confident in the original study’s findings.

a type of cluster sampling, in which clusters are given different chances of being selected based on their size so that each element across all of the clusters has an equal chance of being selected
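
As a rough sketch of this idea (illustrative only; the cluster names and sizes are hypothetical), clusters can be drawn with probability proportionate to size using Python's standard library:

    # Minimal sketch: probability-proportionate-to-size (PPS) cluster selection.
    # Cluster names and sizes are hypothetical.
    import random

    cluster_sizes = {"School A": 1200, "School B": 300, "School C": 500}

    # Weight each cluster's chance of selection by its size, so a student in a
    # large school and a student in a small school have roughly equal chances.
    names = list(cluster_sizes)
    sizes = list(cluster_sizes.values())
    chosen = random.choices(names, weights=sizes, k=2)  # drawn with replacement
    print(chosen)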

sampling approaches for which a person’s likelihood of being selected from the sampling frame is known

Probes are brief prompts or follow-up questions that are used in qualitative interviewing to help draw out additional information on a particular question or idea.

An analysis of how well a program runs

the "uptake of formal and informal learning opportunities that deepen and extend...professional competence, including knowledge, beliefs, motivation, and self-regulatory skills" (Richter, Kunter, Klusmann, Lüdtke, & Baumert, 2014)

The systematic process by which we determine if social programs are meeting their goals, how well the program runs, whether the program had the desired effect, and whether the program has merit according to stakeholders (including in terms of the monetary costs and benefits)

As researchers, this means we are extensively spending time with participants or are in the community we are studying.

In prospective studies, individuals are followed over time and data about them is collected as their characteristics or circumstances change.

a person who completes a survey on behalf of another person

Fake names assigned in research to protect the identity of participants.

claims about the world that appear scientific but are incompatible with the values and practices of science

The science of measurement. Involves using theory to assess measurement procedures and tools.

approach to recruitment where participants are sought in public spaces

In a purposive sample, participants are intentionally or hand-selected because of their specific expertise or experience.

data derived from analysis of texts. Usually, this is word data (like a conversation or journal entry) but can also include performances, pictures, and other means of expressing ideas.

qualitative methods interpret language and behavior to understand the world from the perspectives of other people

Research that involves the use of data that represents human expression through words, pictures, movies, performance and other artifacts.

numerical data

when a researcher administers a questionnaire verbally to participants

quantitative methods examine numerical data to precisely describe and predict elements of the social world

a subtype of experimental design that is similar to a true experiment, but does not have randomly assigned control and treatment groups

Research methods using this approach aim to question, challenge and/or reject knowledge that is commonly accepted and privileged in society and elevate and empower knowledge and perspectives that are often perceived as non-normative.

search terms used in a database to find sources of information, like articles or webpages

A research instrument consisting of a set of questions (items) intended to capture responses from participants in a standardized manner

A quota sample involves the researcher identifying subgroups within a population that they want to make sure to include in their sample, and then setting a quota or target number to recruit from each of these subgroups.

using a random process to decide which participants are tested in which conditions

Unpredictable error that does not result in scores that are consistently higher or lower on a given measure but are nevertheless inaccurate.

Errors that lack any perceptible pattern.

an experiment that involves random assignment to a control and experimental group to evaluate the impact of an intervention or stimulus

An approach to sampling where all elements or people in a sampling frame have an equal chance of being selected for inclusion in a study's sample.

The difference between the highest and lowest scores in the distribution.

An ordered set of responses that participants must choose from.

The highest level of measurement. Denoted by mutually exclusive categories, a hierarchy (order), values can be added, subtracted, multiplied, and divided, and the presence of an absolute zero.

unprocessed data that researchers can analyze using quantitative and qualitative methods (e.g., responses to a survey or interview transcripts)

When respondents have difficulty providing accurate answers to questions due to the passage of time.

Concept advanced by Albert Bandura that human behavior both shapes and is shaped by the environment.

The act of putting the deconstructed qualitative data back together during the analysis process in the search for meaning and, ultimately, the results of the study.

the process by which the researcher informs potential participants about the study and attempts to get them to participate

A research journal that helps the researcher to reflect on and consider their thoughts and reactions to the research process and how it may be shaping the study

How we understand and account for our influence, as researchers, on the research process.

the process of considering something abstract to be a concrete object or thing; the fallacy of reification is assuming that abstract concepts exist in some concrete, tangible way

The degree to which an instrument reflects the true score rather than error. In statistical terms, reliability is the portion of observed variability in the sample that is accounted for by the true variability, not by error. Note: Reliability is necessary, but not sufficient, for measurement validity.

a sample that looks like the population from which it was selected in all respects that are potentially relevant to the study

How closely your sample resembles the population from which it was drawn.

a systematic investigation, including development, testing, and evaluation, designed to develop or contribute to generalizable knowledge

These are sites where contributing researchers can house data that other researchers can view and request permission to use

the methods researchers use to examine empirical data

a set of common philosophical (ontological, epistemological, and axiological) assumptions that inform research (e.g., Post-positivism, Constructivism, Pragmatic, Critical)

a document produced by researchers that reviews the literature relevant to their topic and describes the methods they will use to conduct their study

The details/steps outlining how a study will be carried out.

The unintended influence that the researcher may have on the research process.

One of the three ethical principles espoused in the Belmont Report. Treating people as autonomous beings who have the right to make their own decisions. Acknowledging participants' personal dignity.

the answers researchers provide to participants to choose from when completing a questionnaire

Similar to other longitudinal studies, these surveys deal with changes over time, but like a cross-sectional study, they are administered only once. In a retrospective survey, participants are asked to report events from the past.

journal articles that summarize the findings of other researchers and establish the state of the literature in a given topic area

Rigor is the process through which we demonstrate, to the best of our ability, that our research is empirically sound and reflects a scientific approach to knowledge building.

facilitated discussions on a topic, often to generate new ideas

the group of people you successfully recruit from your sampling frame to participate in your study

The number of cases found in your final sample.

Sampling bias is present when our sampling process results in a sample that does not represent our population in some way.

the set of all possible samples you could possibly draw for your study

The difference in the statistical characteristics of the population (i.e., the population parameters) and those in the sample (i.e., the sample statistics); the error caused by observing characteristics of a sample rather than the entire population.

the list of people from which a researcher will draw her sample

used in systematic random sampling; the distance between the elements in the sampling frame selected for the sample; determined by dividing the total sampling frame by the desired sample size

The point where gathering more data doesn't offer any new ideas or perspectives on the issue you are studying.  Reaching saturation is an indication that we can stop qualitative data collection.

A graphical representation of data where the y-axis (the vertical one along the side) is your variable's value and the x-axis (the horizontal one along the bottom) represents the individual instance in your data.

Visual representations of the relationship between two interval/ratio variables that usually use dots to represent data points

a way of knowing that attempts to systematically collect and categorize facts or truths

Data someone else has collected that you have permission to use in your research.

analyzing data that has been collected by another person or research group

interpret, discuss, and summarize primary sources

the degree to which the people in a sample differ from the overall population

Selective or theoretical coding is part of a qualitative analysis process that seeks to determine how important concepts and their relationships to each other come together, providing a theory that describes the focus of the study. It often results in an overarching or unifying idea tying these concepts together.

A questionnaire that is distributed to participants (in person, by mail, virtually) to complete independently.

a participant answers questions about themselves

Composite (multi-item) scales in which respondents are asked to indicate their opinions or feelings toward a single statement using different pairs of adjectives framed as polar opposites.

An interview that has a general framework for the questions that will be asked, but there is more flexibility to pursue related topics that are brought up by participants than is found in a structured interview approach.

“a classic work of research literature that is more than 5 years old and is marked by its uniqueness and contribution to professional knowledge” (Houser, 2018, p. 112)

in mixed methods research, this refers to the order each method is used

the words used to identify the organization and structure of your literature review to your reader

selecting elements from a list using randomly generated numbers

A distribution where cases are clustered on one or the other side of the median.

For a snowball sample, a few initial participants are recruited, and then we rely on those initial (and successive) participants to help identify additional people to recruit. We thus rely on participants' connections and knowledge of the population to aid our recruitment.

When a participant answers in a way that they believe is socially the most acceptable answer.

Social desirability bias occurs when we create questions that lead respondents to answer in ways that don't reflect their genuine thoughts or feelings to avoid being perceived negatively.

the science of humanity, social interactions, and social structures

A reliability evaluation that examines the internal consistency of a measurement tool. This process involves comparing one half of a tool to the other half of the same tool and evaluating the results.

when an association between two variables appears to be causal but can in fact be explained by influence of a third variable

The people and organizations that have some interest in, or will be affected by, our program.

The ability to reject the null hypothesis when it is false (i.e., to actually find what you are seeking).

"Assuming that the null hypothesis is true and the study is repeated an infinite number times by drawing random samples from the same populations(s), less than 5% of these results will be more extreme than the current result" (Cassidy et al., 2019, p. 233).

the characteristic by which the sample is divided in stratified random sampling

dividing the study population into subgroups based on a characteristic (or strata) and then drawing a sample from each subgroup
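
For illustration (a minimal sketch with hypothetical strata, added here rather than taken from the original glossary), stratified random sampling can be done with the Python standard library:

    # Minimal sketch: stratified random sampling.
    # The population and strata are hypothetical.
    import random
    from collections import defaultdict

    population = [("p1", "urban"), ("p2", "urban"), ("p3", "rural"),
                  ("p4", "rural"), ("p5", "urban"), ("p6", "rural")]

    strata = defaultdict(list)
    for person, region in population:
        strata[region].append(person)   # divide the population into strata

    # Draw a simple random sample of 2 from each stratum.
    sample = [p for members in strata.values() for p in random.sample(members, 2)]
    print(sample)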

Interview that uses a very prescribed or structured approach, with a rigid set of questions that are asked very consistently each time, with little to no deviation

Numbers or a series of numbers, symbols and letters assigned in research to both organize data as it is collected, as well as protecting the identity of participants.

the subset of the target population available for study

one truth among many, bound within a social and cultural context

The use of questionnaires to gather data from multiple participants.

A distribution with a roughly equal number of cases on either side of the median.

(also known as bias) refers to when a measure consistently outputs incorrect data, usually in one direction and due to an identifiable process

Errors that are generally predictable.

a probability sampling approach that begins by selecting a random start on a sampling frame and then selects every kth element from your sampling frame for the sample
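
As a quick illustration (the frame and sizes here are hypothetical, and this sketch is an addition to the glossary entry above), systematic random sampling looks like this in Python:

    # Minimal sketch: systematic random sampling from a sampling frame.
    import random

    frame = [f"id_{i}" for i in range(1, 101)]  # sampling frame of 100 people
    desired_n = 10

    k = len(frame) // desired_n    # sampling interval: 100 / 10 = 10
    start = random.randrange(k)    # random start between 0 and k - 1
    sample = frame[start::k]       # every kth element after the random start
    print(k, sample)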

journal articles that identify, appraise, and synthesize all relevant studies on a particular topic (Uman, 2011, p.57)

a quick, condensed summary of the report’s key findings arranged by row and column

knowledge that is difficult to express in words and may be conveyed more through intuition or feelings

the group of people whom your study addresses

approach to recruitment where participants are based on some personal characteristic or group association

as a criteria for causal relationship, the cause must come before the effect

any findings that follow from constructivist studies are not inherently applicable to other people or situations, as their realities may be quite different

review primary and secondary sources

The extent to which scores obtained on a scale or other measure are consistent across time

The measurement error related to how a test is given; the conditions of the testing, including environmental conditions; and acclimation to the test itself

The Belmont Report is a document outlining basic ethical principles for research on human subjects in the United States and is the foundation of work conducted by IRBs in carrying out their task of overseeing protection of human subjects in research (National Commission for the Protection of Human Subjects in Biomedical and Behavioral Research, 1979).

published works that document a scholarly conversation on a specific topic within and between disciplines

Thematic analysis is an approach to qualitative analysis, in which the researcher attempts to identify themes or patterns across their data to better understand the topic being studied.

A visual representation of how each individual category fits with the others when using thematic analysis to analyze your qualitative data.

a network of linked concepts that together provide a rationale for a research project or analysis; theoretical frameworks are based in theory and empirical literature

a set of concepts and relationships scientists use to explain the social world

A thick description is a very complete, detailed, and illustrative account of the subject that is being described.

Biases or circumstances that can reduce or limit the internal validity of a study

circumstances or events that may affect the outcome of an experiment, resulting in changes in the research participants that are not a result of the intervention, treatment, or experimental condition being tested

A demonstration that a change occurred after an intervention. An important criterion for establishing causality.

a set of measurements taken at intervals over a period of time

periodicals directed to members of a specific profession which often include information about industry trends and practical information for people working in the field

To type out the text of a recorded interview or focus group.

The process of research is recorded and described in such a way that the steps the researcher took throughout the research process are clear.

ensuring that everyone receives the same, or as close to the same, treatment as possible

The stage in single subjects research design in which the treatment or intervention is delivered

A type of longitudinal survey where the researchers gather data at multiple times, but each time they ask different people from the group they are studying because their concern is capturing the sentiment of the group, not the individual people they survey.

Triangulation of data refers to the use of multiple types, measures or sources of data in a research project to increase the confidence that we have in our findings.

An experimental design in which one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different treatment levels (random assignment), and the results of the treatments on outcomes (dependent variables) are observed

Trustworthiness is a quality reflected by qualitative research that is conducted in a credible way; a way that should produce confidence in its findings.

Data that accurately portrays information that was shared in or by the original source.

The level of confidence that research is obtained through a systematic and scientific process and that findings can be clearly connected to the data they are based on (and not some fabrication or falsification of that data).

a statistical procedure to compare the means of a variable across groups using multiple independent variables to distinguish among groups

A purposive sampling strategy where you select cases that represent the most common/ a commonly held perspective.

concepts that are expected to have a single underlying dimension

A distribution with one distinct peak when represented on a histogram.

A rating scale where the magnitude of a single trait is being tested

entity that a researcher wants to say something about at the end of her study (individual, group, or organization)

the entities that a researcher actually observes, measures, or collects in the course of trying to learn something about her unit of analysis (individuals, groups, or organizations)

discrete segments of data

Univariate data analysis is a quantitative method in which a variable is examined individually to determine its distribution.

Interviews that contain very open-ended talking prompts that we want participants to respond to, with much flexibility to follow the conversation where it leads.

The extent to which the scores from a measure represent the variable they are intended to.

The extent to which the levels of a variable vary around their central tendency (the mean, median, or mode).

“a logical grouping of attributes that can be observed and measured and is expected to vary from person to person in a population” (Gillespie & Wagner, 2018, p. 9)

The name of your variable.

People who are at risk of undue influence or coercion. Examples are children, prisoners, parolees, and persons with impaired mental capabilities. Additional groups may be vulnerable if they are deemed to be unable to give consent.

According to the APA Dictionary of Psychology: an experimental design in which the treatment or other intervention is removed during one or more periods. A typical withdrawal design consists of three phases: an initial condition for obtaining a baseline, a condition in which the treatment is applied, and another baseline condition in which the treatment has been withdrawn. Often, the baseline condition is represented by the letter A and the treatment condition by the letter B, such that this type of withdrawal design is known as an A-B-A design. A fourth phase of reapplying the intervention may be added, as well as a fifth phase of removing the intervention, to determine whether the effect of the intervention can be reproduced.

interactive presentations which go hands-on with audience members to teach them new skills

TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

  • Think back to the experiment you considered for your research project in Section 14.3. Now that you know more about quasi-experimental designs, do you still think it's a true experiment? Why or why not?
  • What should you consider when deciding whether an experimental or quasi-experimental design would be more feasible or fit your research question better?

TRACK 2 (IF YOU AREN'T CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

Imagine you are interested in studying child welfare practice. You are interested in learning more about community-based programs aimed to prevent child maltreatment and to prevent out-of-home placement for children.

  • Now that you know more about quasi-experimental designs, do you still think the research design you proposed in the previous section is a true experiment? Why or why not?

Doctoral Research Methods in Social Work Copyright © by Mavs Open Press. All Rights Reserved.

Quasi-Experimental Design: Explanation, Methods and FAQs

  • September 27, 2021

As you strive to uncover causal (cause-and-effect) relationships between variables, you may often encounter ethical or practical constraints while conducting controlled experiments. Quasi-experimental design steps in as a powerful alternative that helps you overcome these challenges and offer valuable insights. 

In this blog, we’ll look into its characteristics, examples, types, and how it differs from true-experimental research design. The purpose of this blog is to understand how this research methodology bridges the gap between a fully controlled experiment and a purely observational study.

What Is Quasi-Experimental Design?

A quasi-experimental design is not all that different from an experimental design: both aim to demonstrate a cause-effect relationship between the independent and dependent variables. So how is a quasi-experimental design different?

Well, unlike an experimental design, quasi-experiments do not include random assignment of participants; instead, participants are placed in the experimental groups based on some other criteria. Let us take a deeper look into how quasi-experimental design works.

Experimental design has three characteristics:

  • Manipulation
  • Randomization
  • Control

1. Manipulation

Manipulation simply means evaluating the effect of the independent variable on the dependent variable.

Example: A chocolate and a crying child.

Here, the independent variable is giving the child a chocolate.

And the dependent variable is the child's crying.

So manipulation means observing the effect of the independent variable (the chocolate) on the dependent variable (the crying child). In short, you apply an outside change and watch how the dependent variable responds: after getting the chocolate (independent variable), the child stops crying (dependent variable).

2. Randomization

Randomization means selection by chance, without any predetermined plan. Example: a lottery system. The lottery numbers are drawn at random, so everyone who buys a ticket has an equal chance of winning. Likewise, in an experiment, you select the sample without any plan, and everyone has an equal chance of getting into any one of the experimental groups.
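
As a rough illustration of random assignment (the participant names are hypothetical, and this sketch is an addition to the article, not taken from it), a few lines of Python are enough to shuffle a participant list and split it into two groups:

    # Minimal sketch: random assignment of participants to two groups.
    # Participant names are hypothetical.
    import random

    participants = ["Ana", "Ben", "Chloe", "Dev", "Ed", "Fay"]
    random.shuffle(participants)          # every ordering is equally likely

    midpoint = len(participants) // 2
    treatment_group = participants[:midpoint]
    control_group = participants[midpoint:]
    print(treatment_group, control_group)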

3. Control

Control means using a control group in the experiment. In this group, researchers keep the independent variable constant. This control group is then compared to a treatment group, where the researchers have changed the independent variable. For obvious reasons, researchers are more interested in the treatment group, as that is where a change in the dependent variable may occur.

Example: You want to find out if the workers work more efficiently if there is a pay rise. 

Here, you will put certain workers in the treatment group and certain in the control group.

Treatment group: You pay the workers more.

Control group: You don't pay the workers any extra, and things remain the same.

By comparing these two groups, you understand that the workers who got paid more worked more efficiently than the workers who didn’t. 

As for the quasi-experimental design, the manipulation characteristic of the true experiment remains the same. But the randomization and control characteristics may appear only one at a time, or not at all.

Hence, these experiments are conducted where random assignment is difficult or even impossible. The quasi-experiment does not include random assignment; the independent variable is simply manipulated before the measurement of the dependent variable.

Differences Between Quasi-Experiments and True Experiments

Is the above description overwhelming? Don't worry. Here is the straightforward difference between quasi-experiments and true experiments, just so you can understand how the two vary from each other.

Example of a True Experiment vs a Quasi-Experiment

All the explaining isn't always enough, right? So we have it covered with clear examples that will help you set quasi-experiments and true experiments apart from each other.

Let us say you want to study the effect of junk food on obese people.

Example of a true experiment

While starting the true experiment, you assign some participants to the treatment group, where they are fed only junk food, while the other half of the participants go to the control group, where they keep their regular, ongoing diet (standard course).

You decide to take the participants' reports every day after their meals to note down their health and any discomfort.

However, some participants assigned to the treatment group may not want to change their diet to complete junk food for personal reasons. In this case, you cannot conduct a true experiment against their will. This is where the quasi-experiment comes in.

Example of a quasi-experiment

While talking to the participants, you find out that some of them want to try the junk food diet while others don't want to experiment with their diet and choose to stick with their regular one.

You can now assign participants to already existing groups according to their choices, and study how the regular consumption of junk food affects the participants in that group.

Here, you did not randomly assign participants to groups, so you can be less confident that any difference you observe is due solely to the experiment.

Advantages and Disadvantages of Using a Quasi-Experimental Design

The advantages of a quasi-experimental design include:

  • Quasi-experimental designs can be well suited to determining what works for the broader population; this is also known as external validity.
  • It gives researchers some power over the variables by being able to control them.
  • The quasi-experimental method can be combined with other experimental methods too.
  • It provides transferability to a greater extent.
  • It is an intuitive process that is well shaped by the researchers.
  • It involves real-world problems and solutions, not artificial ones.
  • It offers some control over the third variable, known as the confounding variable, which influences the cause-effect relationship.

The disadvantages of a quasi-experimental design include:

  • It has lower internal validity than true experiments.
  • Without randomization, you cannot tell for sure that the confounding (third) variable has been eliminated.
  • It has scope for human error.
  • It can allow the researcher's personal bias to get involved.
  • Human responses are difficult to measure, so there is a chance that the results are produced artificially.
  • Using old or outdated data can be incorrect and inadequate for the study.

Types of Quasi-Experimental Design

Amongst the various types of quasi-experimental design, let us first get to know the two main types:

  • Non-equivalent group design (NEGD)
  • Regression discontinuity design

1. Non-Equivalent Group Design (NEGD)

You can picture the non-equivalent group design as a mixture of true experimental and quasi-experimental design, because it uses qualities of both. Like a true experiment, NEGD compares a treatment group and a control group that we believe are similar. But, like other quasi-experiments, it lacks randomization: the groups are pre-existing rather than randomly assigned.

While grouping, researchers try to ensure that the groups are not influenced by third variables or confounding variables, so that the groups are as similar as possible. In a political study, for example, we might select groups that are as alike as we can find.

Let us understand it with an example:

Take the previous example where you studied whether the workers work more efficiently if there is a pay rise. 

You give a pre-test to the workers in one company while their pay is normal. Then you put them in the treatment group, where they work while their pay is increased. After the experiment, you give them a post-test about their experience and attitude towards their work.

Later, you give the same pre-test to the workers from a similar company and put them in a control group where their pay is not raised, and then conduct a post-test. 

Hence the name non-equivalent design: it reminds us that the groups are not equivalent and were not assigned through random practice.
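
To make the comparison concrete (a minimal sketch with hypothetical scores, added here rather than taken from the article), a common way to analyze an NEGD study is to compare each group's pre-to-post gain rather than the raw post-test scores:

    # Minimal sketch: compare pre-to-post gains in an NEGD study.
    # All scores are hypothetical.
    treatment_pre, treatment_post = [60, 65, 58, 62], [72, 75, 70, 74]
    control_pre, control_post = [61, 63, 59, 60], [64, 66, 61, 63]

    def mean(values):
        return sum(values) / len(values)

    gain_treatment = mean(treatment_post) - mean(treatment_pre)
    gain_control = mean(control_post) - mean(control_pre)

    # Comparing gains partially adjusts for baseline differences between
    # the (non-randomized) groups.
    print(f"difference in gains: {gain_treatment - gain_control:.1f}")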

2. Regression Discontinuity Design (RDD)

Regression discontinuity design, or RDD, is a quasi-experimental technique that estimates the influence of a treatment or intervention. It does so by using a mechanism that assigns the treatment based on eligibility, known as a "cut-off".

So participants above the cut-off get to be in the treatment group, and those below the cut-off don't. The difference between participants just above and just below the cut-off, however, is negligible.

Let’s take a look at an example:

A school wants to grant a $50 scholarship to students depending on an independent test measuring their academic ability and household circumstances.

Those who pass the test will get the scholarship. However, the students just below the cut-off and those just above it can be considered similar; the differences in their scores occurred essentially at random. Hence you can keep studying both groups to estimate the long-term outcome.
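
As a sketch of this logic (the scores, cut-off, and bandwidth are hypothetical, and this code is an addition to the article, not the article's own method), an RDD comparison of students just above and just below the cut-off might look like this:

    # Minimal sketch: compare outcomes near a regression-discontinuity cut-off.
    students = [  # (test_score, later_outcome) pairs, all hypothetical
        (45, 58), (47, 59), (48, 61), (49, 60),
        (50, 67), (51, 68), (52, 70), (54, 71),
    ]
    CUTOFF, BANDWIDTH = 50, 3  # scholarship awarded at score >= 50

    near = [(s, y) for s, y in students if abs(s - CUTOFF) <= BANDWIDTH]
    above = [y for s, y in near if s >= CUTOFF]  # received the scholarship
    below = [y for s, y in near if s < CUTOFF]   # did not

    effect = sum(above) / len(above) - sum(below) / len(below)
    print(f"estimated effect near the cut-off: {effect:.1f}")

Restricting the comparison to a narrow bandwidth around the cut-off is what makes the two groups plausibly similar.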

Other Quasi-Experimental Designs

Apart from the above-mentioned types, there are other equally important quasi-experimental designs that have different applications depending on their characteristics and their respective design notations.

Let’s take a look at all of them in detail:

1. The Proxy Pre-Test Design

The proxy pre-test design works the same as a typical pre-test and post-test design, except that the pre-test here is conducted AFTER the treatment is given. Confused? How is it a pre-test if it is conducted afterward? Well, the keyword here is "proxy". Proxy variables tell us where the groups would have been at the pre-test.

You ask the group, after their program, how they would have answered the same questions before their treatment. However, this technique is not very reliable, as we cannot expect participants to remember how they felt a long time ago, and we surely cannot tell if they are faking their answers.

This design is generally not recommended, but you can use it under unavoidable circumstances, such as when the treatment has already begun and you could not administer a pre-test.

In such cases, this approach helps rather than depending entirely on the post-test.

Example: You want to study the workers' performance after a pay rise, but you were called in after the program had already started. In that case, you take the post-test and study a proxy variable, such as productivity records from before and after the program.

2. The Separate Pre-Post Samples Design

This technique also works on the pre-test and post-test design. The difference is that the participants you used for the pre-test won't be the same ones for the post-test.

Example: You want to study client satisfaction at two similar companies. You take one for the treatment and the other for the control. Let's say you conduct a pre-test in both companies at the same time and then begin your experiment.

After a while, when the program is complete, you conduct a post-test. Now, the set of clients you bring in for this test is going to be different from the pre-test ones, because the client base changes over the course of the period.

In this case, you cannot derive one-to-one results, but you can tell the average client satisfaction in both companies. 

3. The Double Pre-Test Design

The double pre-test design is a very robust quasi-experimental design, built to rule out the internal validity problem we had with the non-equivalent design. It adds two pre-tests before the program. If the two groups are progressing at different paces, that difference will show up between pre-test 1 and pre-test 2.

Thanks to the two pre-tests, you can examine the null case: if the groups were changing at the same rate before the program, a pre-test to post-test difference is less likely to be due to pre-existing trends and more plausibly due to the treatment.

4. The Switching Replications Design

In the switching replications design, as the name suggests, the roles of the groups are switched. It follows the same treatment-control group pattern, except it has two phases.

Phase 1:  Both the groups are pre-tested, then they undergo their respective program. Later they are post-tested.

Phase 2:  In this phase, the original treatment group becomes a control group, and the original control group becomes a treatment group.

The main benefit of this design is that it proves strong not only for internal validity but for external validity as well. Because the program is implemented twice in parallel, all the participants eventually experience it, which makes the design ethically strong too.

5. The Non-Equivalent Dependent Variables (NEDV) Design

The NEDV design, in its simplest form, is not the most reliable one and does not work wonders for internal validity either. But then what is the use of NEDV?

Well, sometimes the treatment group may be affected by some external factor. Hence, two measures are taken at both pre-test and post-test: one for the outcome the treatment targets and one for that external variable.

Wait, how about we take an example to understand this?

Let us say you start a program to test history teaching techniques. You design a standardized history test (the treatment outcome) and also measure students' interest in historical movies (the external variable). Later, in the post-tests, you find that along with the history scores, students' interest in historical movies has also increased, suggesting that the movies, and not only the teaching technique, have influenced students to study the subject.

6. The Regression Point Displacement (RPD) Design

The RPD design is used when measures for already existing comparison groups are available and can be compared with the treatment group. The treatment group is the only group the researcher works with directly, and both pre-tests and post-tests are conducted.

This method is especially beneficial for larger units such as communities or companies: RPD works by comparing a single program unit with a larger set of comparison units.

Let us make it clear with an example:

Consider a community-based COVID awareness program. The initiative is started in one particular town of a vast district. The representatives track active cases in that town and use the remaining towns as a comparison. Rather than reporting only the average of the other towns' COVID cases, they plot each town's count.

When to Use Quasi-Experimental Design?

All that studying, but shouldn't you also know when to use quasi-experiments? Now that we are near the end, let us discuss when to use quasi-experiments and for what reasons.

1. For ethical reasons

Remember when we discussed the "willingness" of obese people to participate in the experiment? That is when ethics start to matter. You cannot keep putting random participants under treatments as you would in a true experiment.

This is especially true when the treatment directly affects the participants' lives. One of the best-known examples is the Oregon Health Study, in which health insurance was given to certain people while others went without it.

2. For practical reasons

True experiments, despite having higher internal validity, can be expensive. They also require enough participants to justify the effort. In a quasi-experiment, by contrast, you can use data that has already been gathered.

The data may have been collected and paid for by some large entity, say the government, and you use it to study your questions.

Well, that concludes our guide.

Quasi-experimental design is a unique approach that allows you to uncover causal relationships between variables when controlled experiments are not feasible or ethical. While it may not possess the level of control and randomization of a true experiment, quasi-experimental research design enables you to make meaningful contributions by providing valuable insights to various fields.

Let us say you want to study the effect of eating cheese on bad breath. You have the people without bad breath take the treatment, while the other half, who already have bad breath, form the control group. After the post-test, you discover that the participants in the treatment group have started to develop bad breath.

Quasi-experimental designs are used to evaluate interventions without using randomization. They also interpret problems using pre-intervention and post-intervention measurements along with non-random assignment.

A true experiment uses random assignment of participants, while a quasi-experiment does not. This allows quasi-experiments to be widely used where randomization would raise ethical problems.

Quasi-experiments allot participants to groups based on existing characteristics or criteria, unlike true experiments, where everyone has an equal chance of ending up in any of the groups.

Quasi-experiments also make use of pre-test and post-test measurements, which opens the door to before-after comparisons.

The quasi-experimental design does not randomly assign participants to groups; rather, it studies their existing characteristics and groups them accordingly.

It studies participants before and after the program, known as the pre-test and post-test, which helps to gauge each group's progress.

Quasi-experiments are also often more ethical, owing to their non-randomized design.

A quasi-experimental design mostly resembles a true experimental design, just minus one key component: random assignment.

Two prime quasi-experimental methods include:

  • Nonequivalent Groups Design
  • Regression-Discontinuity Design

Some other, rather equally important Quasi Designs are:

  • The Proxy Pretest Design
  • The Separate Pre-Post Samples Design
  • The Double Pretest Design
  • The Switching Replications Design
  • The Nonequivalent Dependent Variables (NEDV) Design
  • The Regression Point Displacement (RPD) Design

Published on 19.4.2024 in Vol 26 (2024)

Psychometric Evaluation of a Tablet-Based Tool to Detect Mild Cognitive Impairment in Older Adults: Mixed Methods Study

Authors of this article:


Original Paper

  • Josephine McMurray 1, 2 *, MBA, PhD
  • AnneMarie Levy 1 *, MSc, PhD
  • Wei Pang 1, 3 *, BTM
  • Paul Holyoke 4, PhD

1 Lazaridis School of Business & Economics, Wilfrid Laurier University, Brantford, ON, Canada

2 Health Studies, Faculty of Human and Social Sciences, Wilfrid Laurier University, Brantford, ON, Canada

3 Biomedical Informatics & Data Science, Yale University, New Haven, CT, United States

4 SE Research Centre, Markham, ON, Canada

*these authors contributed equally

Corresponding Author:

Josephine McMurray, MBA, PhD

Lazaridis School of Business & Economics

Wilfrid Laurier University

73 George St

Brantford, ON, N3T3Y3

Phone: 1 548 889 4492

Email: [email protected]

Background: With the rapid aging of the global population, the prevalence of mild cognitive impairment (MCI) and dementia is anticipated to surge worldwide. MCI serves as an intermediary stage between normal aging and dementia, necessitating more sensitive and effective screening tools for early identification and intervention. The BrainFx SCREEN is a novel digital tool designed to assess cognitive impairment. This study evaluated its efficacy as a screening tool for MCI in primary care settings, particularly in the context of an aging population and the growing integration of digital health solutions.

Objective: The primary objective was to assess the validity, reliability, and applicability of the BrainFx SCREEN (hereafter, the SCREEN) for MCI screening in a primary care context. We conducted an exploratory study comparing the SCREEN with an established screening tool, the Quick Mild Cognitive Impairment (Qmci) screen.

Methods: A concurrent mixed methods, prospective study using a quasi-experimental design was conducted with 147 participants from 5 primary care Family Health Teams (FHTs; characterized by multidisciplinary practice and capitated funding) across southwestern Ontario, Canada. Participants included health care practitioners, patients, and FHT administrative executives. Individuals aged ≥55 years with no history of MCI or diagnosis of dementia rostered in a participating FHT were eligible to participate. Participants were screened using both the SCREEN and Qmci. The study also incorporated the Geriatric Anxiety Scale–10 to assess general anxiety levels at each cognitive screening. The SCREEN’s scoring was compared against that of the Qmci and the clinical judgment of health care professionals. Statistical analyses included sensitivity, specificity, internal consistency, and test-retest reliability assessments.

Results: The study found that the SCREEN’s longer administration time and complex scoring algorithm, which is proprietary and unavailable for independent analysis, presented challenges. Its internal consistency, indicated by a Cronbach α of 0.63, was below the acceptable threshold. The test-retest reliability also showed limitations, with moderate intraclass correlation coefficient (0.54) and inadequate κ (0.15) values. Sensitivity and specificity were consistent (63.25% and 74.07%, respectively) between cross-tabulation and discrepant analysis. In addition, the study faced limitations due to its demographic skew (96/147, 65.3% female, well-educated participants), the absence of a comprehensive gold standard for MCI diagnosis, and financial constraints limiting the inclusion of confirmatory neuropsychological testing.

Conclusions: The SCREEN, in its current form, does not meet the necessary criteria for an optimal MCI screening tool in primary care settings, primarily due to its longer administration time and lower reliability. As the number of digital health technologies increases and evolves, further testing and refinement of tools such as the SCREEN are essential to ensure their efficacy and reliability in real-world clinical settings. This study advocates for continued research in this rapidly advancing field to better serve the aging population.

International Registered Report Identifier (IRRID): RR2-10.2196/25520

Introduction

Mild cognitive impairment (MCI) is a syndrome characterized by a slight but noticeable and measurable deterioration in cognitive abilities, predominantly memory and thinking skills, that is greater than expected for an individual’s age and educational level [ 1 , 2 ]. The functional impairments associated with MCI are subtle and often impair instrumental activities of daily living (ADL). Instrumental ADL include everyday tasks such as managing finances, cooking, shopping, or taking regularly prescribed medications and are considered more complex than ADL such as bathing, dressing, and toileting [ 3 , 4 ]. In cases in which memory impairment is the primary indicator of the disease, MCI is classified as amnesic MCI and when significant impairment of non–memory-related cognitive domains such as visual-spatial or executive functioning is dominant, MCI is classified as nonamnesic [ 5 ].

Cognitive decline, more so than cancer and cardiovascular disease, poses a substantial threat to an individual’s ability to live independently or at home with family caregivers [ 6 ]. The Centers for Disease Control and Prevention reports that 1 in 8 adults aged ≥60 years experiences memory loss and confusion, with 35% reporting functional difficulties with basic ADL [ 7 ]. The American Academy of Neurology estimates that the prevalence of MCI ranges from 13.4% to 42% in people aged ≥65 years [ 8 ], and a 2023 meta-analysis that included 233 studies and 676,974 participants aged ≥50 years estimated that the overall global prevalence of MCI is 19.7% [ 9 ]. Once diagnosed, the prognosis for MCI is variable, whereby the impairment may be reversible; the rate of decline may plateau; or it may progressively worsen and, in some cases, may be a prodromal stage to dementia [ 10 - 12 ]. While estimates vary based on sample (community vs clinical), annual rates of conversion from MCI to dementia range from 5% to 24% [ 11 , 12 ], and those who present with multiple domains of cognitive impairment are at higher risk of conversion [ 5 ].

The risk of developing MCI rises with age, and while there are no drug treatments for MCI, nonpharmacologic interventions may improve cognitive function, alleviate the burden on caregivers, and potentially delay institutionalization should MCI progress to dementia [ 13 ]. To overcome the challenges of early diagnosis, which currently depends on self-detection, family observation, or health care provider (HCP) recognition of symptoms, screening high-risk groups for MCI or dementia is suggested as a solution [ 13 ]. However, the Canadian Task Force on Preventive Health Care recommends against screening adults aged ≥65 years due to a lack of meaningful evidence from randomized controlled trials and the high false-positive rate [ 14 - 16 ]. The main objective of a screening test is to reduce morbidity or mortality in at-risk populations through early detection and intervention, with the anticipated benefits outweighing potential harms. Using brief screening tools in primary care might improve MCI case detection, allowing patients and families to address reversible causes, make lifestyle changes, and access disease-modifying treatments [ 17 ].

There is no agreement among experts as to which tests or groups of tests are most predictive of MCI [ 16 ], and the gold standard approach uses a combination of positive results from neuropsychological assessments, laboratory tests, and neuroimaging to infer a diagnosis [ 8 , 18 ]. The clinical heterogeneity of MCI complicates its diagnosis because it influences not only memory and thinking abilities but also mood, behavior, emotional regulation, and sensorimotor abilities, and patients may present with any combination of symptoms with varying rates of onset and decline [ 4 , 8 ]. For this reason, a collaborative approach between general practitioners and specialists (eg, geriatricians and neurologists) is often required to be confident in the diagnosis of MCI [ 8 , 19 , 20 ].

In Canada, diagnosis often begins with screening for cognitive impairment followed by referral for additional testing; this process takes, on average, 5 months [ 20 ]. The current usual practice screening tools for MCI are the Mini-Mental State Examination (MMSE) [ 21 , 22 ] and the Montreal Cognitive Assessment (MoCA) 8.1 [ 3 ]. Both are paper-and-pencil screens administered in 10 to 15 minutes, scored out of 30, and validated as MCI screening tools across diverse clinical samples [ 23 , 24 ]. Universally, the MMSE is most often used to screen for MCI [ 20 , 25 ] and consists of 20 items that measure orientation, immediate and delayed recall, attention and calculation, visual-spatial skills, verbal fluency, and writing. The MoCA 8.1 was developed to improve on the MMSE’s ability to detect early signs of MCI, placing greater emphasis on evaluating executive function as well as language, memory, visual-spatial skills, abstraction, attention, concentration, and orientation across 30 items [ 24 , 26 ]. Scores of <24 on the MMSE or ≤25 on the MoCA 8.1 signal probable MCI [ 21 , 27 ]. Lower cutoff scores for both screens have been recommended to address evidence that they lack specificity to detect mild and early cases of MCI [ 4 , 28 - 31 ]. The clinical efficacy of both screens for tracking change in cognition over time is limited as they are also subject to practice effects with repeated administration [ 32 ].

Novel screening tools, including the Quick Mild Cognitive Impairment (Qmci) screen, have been developed with the goal of improving the accuracy of detecting MCI [ 33 , 34 ]. The Qmci is a sensitive and specific tool that differentiates normal cognition from MCI and dementia and is more accurate at differentiating MCI from controls than either the MoCA 8.1 (Qmci area under the curve=0.97 vs MoCA 8.1 area under the curve=0.92) [ 25 , 35 ] or the Short MMSE [ 33 , 36 ]. It also demonstrates high test-retest reliability (intraclass correlation coefficient [ICC]=0.88) [ 37 ] and is clinically useful as a rapid screen for MCI as the Qmci mean is 4.5 (SD 1.3) minutes versus 9.5 (SD 2.8) minutes for the MoCA 8.1 [ 25 ].

The COVID-19 pandemic and the necessary shift to virtual health care accelerated the use of digital assessment tools, including MCI screening tools such as the electronic MoCA 8.1 [ 38 , 39 ], and the increased use and adoption of technology (eg, smartphones and tablets) by older adults suggests that a lack of proficiency with technology may not be a barrier to the use of such assessment tools [ 40 , 41 ]. BrainFx is a for-profit firm that creates proprietary software designed to assess cognition and changes in neurofunction that may be caused by neurodegenerative diseases (eg, MCI or dementia), stroke, concussions, or mental illness using ecologically relevant tasks (eg, prioritizing daily schedules and route finding on a map) [ 42 ]. Their assessments are administered via a tablet and stylus. The BrainFx 360 performance assessment (referred to hereafter as the 360) is a 90-minute digitally administered test that was designed to assess cognitive, physical, and psychosocial areas of neurofunction across 26 cognitive domains using 49 tasks that are timed and scored [ 42 ]. The BrainFx SCREEN (referred to hereafter as the SCREEN) is a short digital version of the 360 that includes 7 of the cognitive domains included in the 360, is estimated to take approximately 10 to 15 minutes to complete, and was designed to screen for early detection of cognitive impairment [ 43 , 44 ]. Upon completion of any BrainFx assessment, the results of the 360 or SCREEN are added to the BrainFx Living Brain Bank (LBB), which is an electronic database that stores all completed 360 and SCREEN assessments and is maintained by BrainFx. An electronic report is generated by BrainFx comparing an individual’s results to those of others collected and stored in the LBB. Normative data from the LBB are used to evaluate and compare an individual’s results.

The 360 has been used in clinical settings to assess neurofunction among youth [ 45 ] and anecdotally in other rehabilitation settings (T Milner, personal communication, May 2018). To date, research on the 360 indicates that it has been validated in healthy young adults (mean age 22.9, SD 2.4 years) and that the overall test-retest reliability of the tool is high (ICC=0.85) [ 42 ]. However, only 2 of the 7 tasks selected to be included in the SCREEN produced reliability coefficients of >0.70 (visual-spatial and problem-solving abilities) [ 42 ]. Jones et al [ 43 ] explored the acceptability and perceived usability of the SCREEN with a small sample (N=21) of Canadian Armed Forces veterans living with posttraumatic stress disorder. A structural equation model based on the Unified Theory of Acceptance and Use of Technology suggested that behavioral intent to use the SCREEN was predicted by facilitating conditions such as guidance during the test and appropriate resources to complete the test [ 43 ]. However, the validity, reliability, and sensitivity of the SCREEN for detecting cognitive impairment have not been tested.

McMurray et al [ 44 ] designed a protocol to assess the validity, reliability, and sensitivity of the SCREEN for detecting early signs of MCI in asymptomatic adults aged ≥55 years in a primary care setting (5 Family Health Teams [FHTs]). The protocol also used a series of semistructured interviews and surveys guided by the fit between individuals, task, technology, and environment framework [ 46 ], a health-specific model derived from the Task-Technology Fit model by Goodhue and Thompson [ 47 ], to explore the SCREEN’s acceptability and use by HCPs and patients in primary care settings (manuscript in preparation). This study is a psychometric evaluation of the SCREEN’s validity, reliability, and sensitivity for detecting MCI in asymptomatic adults aged ≥55 years in primary care settings.

Study Location, Design, and Data Collection

This was a concurrent, mixed methods, prospective study using a quasi-experimental design. Participants were recruited from 5 primary care FHTs (characterized by multidisciplinary practice and capitated funding) across southwestern Ontario, Canada. FHTs that used a registered occupational therapist on staff were eligible to participate in the study, and participating FHTs received a nominal compensatory payment for the time the HCPs spent in training; collecting data for the study; administering the SCREEN, Qmci, and Geriatric Anxiety Scale–10 (GAS-10); and communicating with the research team. A multipronged recruitment approach was used [ 44 ]. A designated occupational therapist at each location was provided with training and equipment to recruit participants, administer assessment tools, and submit collected data to the research team.

The research protocol describing the methods of both the quantitative and qualitative arms of the study is published elsewhere [ 44 ].

Ethical Considerations

This study was approved by the Wilfrid Laurier University Research Ethics Board (ORE 5820) and was reviewed and approved by each FHT. Participants (HCPs, patients, and administrative executives) read and signed an information and informed consent package in advance of taking part in the study. We complied with recommendations for obtaining informed consent and conducting qualitative interviews with persons with dementia when recruiting patients who may be affected by neurocognitive diseases [ 48 - 50 ]. In addition, at the end of each SCREEN assessment, patients were required to provide their consent (electronic signature) to contribute their anonymized scores to the database of SCREEN results maintained by BrainFx. Upon enrolling in the study, participants were assigned a unique identification number that was used in place of their name on all study documentation to anonymize the data and preserve their confidentiality. A master list matching participant names with their unique identification number was stored in a password-protected file by the administering HCP and principal investigator on the research team. The FHTs received a nominal compensatory payment to account for their HCPs’ time spent administering the SCREEN, collecting data for the study, and communicating with the research team. However, the individual HCPs who volunteered to participate and the patient participants were not financially compensated for taking part in the study.

Participants

Patients rostered with a participating FHT who were aged ≥55 years and had no history of MCI or diagnosis of dementia were eligible to participate; excluding those with an existing diagnosis was intended to better capture the population at risk of early signs of cognitive impairment [ 51 , 52 ]. It was necessary for the participants to be rostered with the FHTs to ensure that the HCPs could access their electronic medical record to confirm eligibility and record the testing sessions and results and to ensure that there was a responsible physician for referral if indicated. As the SCREEN is administered using a tablet, participants had to be able to read and think in English and discern color, have adequate hearing and vision to interact with the administering HCP, read 12-point font on the tablet, and have adequate hand and arm function to manipulate and hold the tablet. The exclusion criteria used in the study included colorblindness and any disability that might impair the individual’s ability to hold and interact with the tablet. Prospective participants were also excluded based on a diagnosis of conditions that may result in MCI or dementia-like symptoms, including major depression that required hospitalization, psychiatric disorders (eg, schizophrenia and bipolar disorder), psychopathology, epilepsy, substance use disorders, or sleep apnea (without the use of a continuous positive airway pressure machine) [ 52 ]. Patients were required to complete a minimum of 2 screening sessions spaced 3 months apart to participate in the study and, depending on when they enrolled to participate, could complete a maximum of 4 screening sessions over a year.

Data Collection Instruments

GAS-10 Instrument

A standardized protocol was used to collect demographic data, randomly administer the SCREEN and the Qmci (a validated screening tool for MCI), and administer the GAS-10 immediately before and after the completion of the first MCI screen at each visit [ 44 ]. This was to assess participants’ general anxiety as it related to screening for cognitive impairment at the time of the assessment, any change in subjective ratings after completion of the first MCI screen, and change in anxiety between appointments. The GAS-10 is a 10-item, self-report screen for anxiety in older adults [ 53 ] developed for rapid screening of anxiety in clinical settings (the GAS-10 is the short form of the full 30-item Geriatric Anxiety Scale [GAS]) [ 54 ]. While 3 subscales are identified, the GAS is reported to be a unidimensional scale that assesses general anxiety [ 55 , 56 ]. Validation of the GAS-10 suggests that it is optimal for assessing average to moderate levels of anxiety in older adults, with subscale scores that are highly and positively correlated with the GAS and high internal consistency [ 53 ]. Participants were asked to use a 4-point Likert scale (0= not at all , 1= sometimes , 2= most of the time , and 3= all of the time ) to rate how often they had experienced each symptom over the previous week, including on the day the test was administered [ 54 ]. The GAS-10 has a maximum score of 30, with higher scores indicating higher levels of anxiety [ 53 , 54 , 57 ].
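As a rough illustration of the scoring just described, the following sketch sums 10 hypothetical item ratings into a GAS-10 total out of 30; the response values are invented for illustration.

```python
# Hypothetical responses from one participant to the 10 GAS-10 items,
# each rated on the 4-point scale described above (0-3).
ratings = [0, 1, 2, 1, 0, 3, 1, 0, 2, 1]

assert len(ratings) == 10 and all(0 <= r <= 3 for r in ratings)
gas10_total = sum(ratings)  # maximum possible score is 30
print(f"GAS-10 total: {gas10_total}/30 (higher = more anxiety)")
```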

HCPs completed the required training to become certified BrainFx SCREEN administrators before the start of the study. To this end, HCPs completed a web-based training program (developed and administered through the BrainFx website) that included 3 self-directed training modules. For the purpose of the study, they also participated in 1 half-day in-person training session conducted by a certified BrainFx administrator (T Milner, BrainFx chief executive officer) at one of the participating FHT locations. The SCREEN (version 0.5; beta) was administered on a tablet (ASUS ZenPad 10.1” IPS WXGA display, 1920 × 1200, powered by a quad-core 1.5 GHz, 64-bit MediaTek MTK 8163A processor with 2 GB RAM and 16-GB storage). The tablet came with a tablet stand for optional use and a dedicated stylus that is recommended for completion of a subset of questions. At the start of the study, HCPs were provided with identical tablets preloaded with the SCREEN software for use in the study. The 7 tasks on the SCREEN are summarized in Table 1 and were taken directly from the 360 based on a clustering and regression analysis of LBB records in 2016 (N=188) [ 58 ]. A detailed description of the study and SCREEN administration procedures was published by McMurray et al [ 44 ].

An activity score is generated for each of the 7 tasks on the SCREEN. It is computed based on a combination of the accuracy of the participant’s response and the processing speed (time in seconds) that it takes to complete the task. The relative contribution of accuracy and processing speed to the final activity score for each task is proprietary to BrainFx and unknown to the research team. The participant’s activity score is compared to the mean activity score for the same task at the time of testing in the LBB. The mean activity score from the LBB may be based on the global reference population (ie, all available SCREEN results in the LBB), or the administering HCP may select a specific reference population by filtering according to factors including but not limited to age, sex, or diagnosis. If the participant’s activity score is >1 SD below the LBB activity score mean for that task, it is labeled as an area of challenge . Each of the 7 tasks on the SCREEN are evaluated independently of each other, producing a report with 7 activity scores showing the participant’s score, the LBB mean score, and the SD. The report also provides an overall performance and processing speed score. The overall performance score is an average of all 7 activity scores; however, the way in which the overall processing speed score is generated remains proprietary to BrainFx and unknown to the research team. Both the overall performance and processing speed scores are similarly evaluated against the LBB and identified as an area of challenge using the criteria described previously. For the purpose of this study, participants’ mean activity scores on the SCREEN were compared to the results of people aged ≥55 years in the LBB.
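The following sketch illustrates the area-of-challenge rule described above (a task is flagged when the activity score falls more than 1 SD below the LBB mean for that task). Because the weighting of accuracy and processing speed inside each activity score is proprietary to BrainFx, the scores and reference statistics here are hypothetical.

```python
# Hypothetical activity scores and reference statistics; the real
# accuracy/speed weighting inside each score is proprietary to BrainFx.
tasks = {
    # task: (participant activity score, LBB mean, LBB SD)
    "visual-spatial": (62.0, 70.0, 9.0),
    "problem-solving": (55.0, 71.0, 10.0),
}

for task, (score, lbb_mean, lbb_sd) in tasks.items():
    flagged = score < lbb_mean - lbb_sd  # >1 SD below the LBB mean
    print(f"{task}: score={score}, area of challenge={flagged}")

# The overall performance score reported by the SCREEN is the average of
# the 7 task activity scores (here, averaged over 2 tasks for brevity).
overall = sum(s for s, _, _ in tasks.values()) / len(tasks)
print(f"overall performance: {overall:.1f}")
```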

The Qmci evaluated 6 cognitive domains: orientation (10 points), registration (5 points), clock drawing (15 points), delayed recall (20 points), verbal fluency (20 points), and logical memory (30 points) [ 59 ]. Administering HCPs scored the test manually, with each subtest’s points contributing to the overall score out of 100 points, and the cutoff score to distinguish normal cognition from MCI was ≤67/100 [ 60 ]. Cutoffs to account for age and education have been validated and are recommended as the Qmci is sensitive to these factors [ 60 ]. A 2019 meta-analysis of the diagnostic accuracy of MCI screening tools reported that the sensitivity and specificity of the Qmci for distinguishing MCI from normal cognition is similar to usual standard-of-care tools (eg, the MoCA, Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease battery total score, and Sunderland Clock Drawing Test) [ 61 ]. The Qmci has also been translated into >15 different languages and has undergone psychometric evaluation across a subset of these languages. While not as broadly adopted as the MoCA 8.1 in Canada, its psychometric properties, administration time, and availability for use suggested that the Qmci was an optimal assessment tool for MCI screening in FHT settings during the study.
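A minimal sketch of the Qmci scoring arithmetic described above follows; the subtest scores are hypothetical, and the ≤67/100 cutoff is the one reported in this section.

```python
# Hypothetical subtest scores; maximums follow the domain breakdown above.
qmci_max = {"orientation": 10, "registration": 5, "clock drawing": 15,
            "delayed recall": 20, "verbal fluency": 20, "logical memory": 30}
scores = {"orientation": 9, "registration": 5, "clock drawing": 12,
          "delayed recall": 14, "verbal fluency": 13, "logical memory": 20}

assert all(0 <= scores[k] <= qmci_max[k] for k in qmci_max)
total = sum(scores.values())  # out of 100
label = "possible MCI" if total <= 67 else "normal cognition"
print(f"Qmci total: {total}/100 -> {label}")
```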

Psychometric Evaluation

To date, the only published psychometric evaluation of any BrainFx tool is by Searles et al [ 42 ] in Athletic Training & Sports Health Care ; it assessed the test-retest reliability of the 360 in 15 healthy adults between the ages of 20 and 25 years. This study evaluated the psychometric properties of the SCREEN and included a statistical analysis of the tool’s internal consistency, construct validity, test-retest reliability, and sensitivity and specificity. McMurray et al [ 44 ] provide a detailed description of the data collection procedures for administration of the SCREEN and Qmci completed by participants at each visit.

Validity Testing

Face validity was outside the scope of this study but was implied, and assumptions are reported in the Results section. Construct validity, whether the 7 activities that make up the SCREEN were representative of MCI, was assessed through comparison with a substantive body of literature in the domain and through principal component analysis using varimax rotation. Criterion validity measures how closely the SCREEN results corresponded to the results of the Qmci (used here as an “imperfect gold standard” for identifying MCI in older adults) [ 62 ]. A BrainFx representative hypothesized that the ecological validity of the SCREEN questions (ie, using tasks that reflect real-world activities to detect early signs of cognitive impairment) [ 63 ] makes it a more sensitive tool than other screens (T Milner, personal communication, May 2018) and allows HCPs to equate activity scores on the SCREEN with real-world functional abilities. Criterion validity was explored first using cross-tabulations to calculate the sensitivity and specificity of the SCREEN compared to those of the Qmci. Conventional screens such as the Qmci are scored by taking the sum of correct responses on the screen and a cutoff score derived from normative data to distinguish normal cognition from MCI. The SCREEN used a different method of scoring whereby each of the 7 tasks was scored and evaluated independently of each other and there were no recommended guidelines for distinguishing normal cognition from MCI based on the aggregate areas of challenge identified by the SCREEN. Therefore, to compare the sensitivity and specificity of the SCREEN against those of the Qmci, the results of both screens were coded into a binary format as 1=healthy and 2=unhealthy, where healthy denoted no areas of challenge identified through the SCREEN and a Qmci score of ≥67. Conversely, unhealthy denoted one or more areas of challenge identified through the SCREEN and a Qmci score of <67.

Criterion validity was further explored using discrepant analysis via a resolver test [ 44 ]. Following the administration of the SCREEN and Qmci, screen results were evaluated by the administering HCP. HCPs were instructed to refer the participant for follow-up with their primary care physician if the Qmci result was <67 regardless of whether any areas of challenge were identified on the SCREEN. However, HCPs could use their clinical judgment to refer a participant for physician follow-up based on the results of the SCREEN or the Qmci, and all the referral decisions were charted on the participant’s electronic medical record following each visit and screening. In discrepant analysis, the results of the imperfect gold standard [ 64 ], as was the role of the Qmci in this study, were compared with the SCREEN results. A resolver test (classified as whether the HCP referred the patient to a physician for follow-up based on their performance on the SCREEN and the Qmci) was used on discordant results [ 64 , 65 ] to determine sensitivity and specificity. To this end, a new variable, Referral to a Physician for Cognitive Impairment , was coded as the true status (1=no referral; 2=referral was made) and compared to the Qmci as the imperfect gold standard (1=healthy; 2=unhealthy).

Reliability Testing

The reliability of a screening instrument is its ability to consistently measure an attribute and how well its component measures fit together conceptually. Internal consistency identifies whether the items in a multi-item scale are measuring the same underlying construct; the internal consistency of the SCREEN was assessed using the Cronbach α. Test-retest reliability refers to the ability of a measurement instrument to reproduce results over ≥2 occasions (assuming the underlying conditions have not changed) and was assessed using paired t tests (2-tailed), ICC, and the κ coefficient. In this study, participants completed both the SCREEN and the Qmci in the same sitting in a random sequence on at least 2 different occasions spaced 3 months apart (administration procedures are described elsewhere) [ 44 ]. In some instances, the screens were administered to the same participant on 4 separate occasions spaced 3 months apart each, and this provided up to 3 separate opportunities to conduct test-retest reliability analyses and investigate the effects of repeated practice. There are no clear guidelines on the optimal time between tests [ 66 , 67 ]; however, Streiner and Kottner [ 68 ] and Streiner [ 69 ] recommend longer periods between tests (eg, at least 10-14 days) to avoid recall bias, and greater practice effects have been experienced with shorter test-retest intervals [ 32 ].

Analysis of the quantitative data was completed using Stata (version 17.0; StataCorp). Assumptions of normality were not violated, so parametric tests were used. Collected data were reported using frequencies and percentages and compared using the chi-square or Fisher exact test as necessary. Continuous data were analyzed for central tendency and variability; categorical data were presented as proportions. Normality was tested using the Shapiro-Wilk test, and nonparametric data were tested using the Mann-Whitney U test. A P value of less than .05 was considered statistically significant, with 95% CIs provided where appropriate. We powered the exploratory analysis to validate the SCREEN using an estimated effect size of 12%—understanding that Canadian prevalence rates of MCI were not available [ 1 ]—and determined that the study required at least 162 participants. For test-retest reliability, using 90% power and a 5% type-I error rate, a minimum of 67 test results was required.

The time taken for participants to complete the SCREEN was recorded by the HCPs at the time of testing; there were 6 missing HCP records of time to complete the SCREEN. For these 6 cases of missing data, we imputed the mean time to complete the SCREEN by all participants who were tested by that HCP and used this to populate the missing cells [ 70 ]. There were 3 cases of missing data related to the SCREEN reports. More specifically, the SCREEN report generated by BrainFx did not include 1 or 2 data points each for the route finding, divided attention, and prioritizing tasks. The clinical notes provided by the HCP at the time of SCREEN administration did not indicate that the participant had not completed those questions, and it was not possible to determine the root cause of the missing data in report generation according to BrainFx (M Milner, personal communication, July 7, 2020). For continuous variables in analyses such as exploratory factor analysis, Cronbach α, and t test, missing values were imputed using the mean. However, for the coded healthy and unhealthy categorical variables, values were not imputed.
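For illustration, the group-mean imputation step described above might look like the following pandas sketch; the HCP labels and completion times are hypothetical.

```python
# Hypothetical completion times with gaps, imputed with the mean time
# recorded by the same administering HCP.
import pandas as pd

df = pd.DataFrame({
    "hcp": ["A", "A", "A", "B", "B", "B"],
    "screen_minutes": [14.0, None, 16.0, 12.0, 13.0, None],
})

df["screen_minutes"] = df.groupby("hcp")["screen_minutes"].transform(
    lambda s: s.fillna(s.mean())
)
print(df)
```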

Data collection began in January 2019 and was to conclude on May 31, 2020. However, the emergence of the global COVID-19 pandemic resulted in the FHTs and Wilfrid Laurier University prohibiting all in-person research starting on March 16, 2020.

Participant Demographics

A total of 154 participants were recruited for the study, and 20 (13%) withdrew following their first visit to the FHT. The data of 65% (13/20) of the participants who withdrew were included in the final analysis, and the data of the remaining 35% (7/20) were removed, either due to their explicit request (3/7, 43%) or because technical issues at the time of testing rendered their data unusable (4/7, 57%). These technical issues were related to software issues (eg, any instance in which the patient or HCP interacted with the SCREEN software and followed the instructions provided, the software did not work as expected [ie, objects did not move where they were dragged or tapping on objects failed to highlight the object], and the question could not be completed). After attrition, a total of 147 individuals aged ≥55 years with no previous diagnosis of MCI or dementia participated in the study ( Table 2 ). Of the 147 participants, 71 (48.3%) took part in only 1 round of screening on visit 1 (due to COVID-19 restrictions imposed on in-person research that prevented a second visit). The remaining 51.7% (76/147) of the participants took part in ≥2 rounds of screening across multiple visits (76/147, 51.7% participated in 2 rounds; 22/147, 15% participated in 3 rounds; and 13/147, 8.8% participated in 4 rounds of screening).

The sample population was 65.3% (96/147) female (mean 70.2, SD 7.9 years) and 34.7% (51/147) male (mean 72.5, SD 8.1 years), with age ranging from 55 to 88 years; 65.3% (96/147) achieved the equivalent of or higher than a college diploma or certificate ( Table 2 ); and 32.7% (48/147) self-reported living with one or more chronic medical conditions ( Table 3 ). At the time of screening, 73.5% (108/147) of participants were also taking medications with side effects that may include impairments to memory and thinking abilities [ 71 - 75 ]; therefore, medication use was accounted for in a subset of the analyses. Finally, 84.4% (124/147) of participants self-reported regularly using technology (eg, smartphone, laptop, or tablet) with high proficiency. A random sequence generator was used to determine the order for administering the MCI screens; the SCREEN was administered first 51.9% (134/258) of the time.

Construct Validity

Construct validity was assessed through a review of relevant peer-reviewed literature that compared constructs included in the SCREEN with those identified in the literature as 2 of the most sensitive tools for MCI screening: the MoCA 8.1 [ 76 ] and the Qmci [ 25 ]. Memory, language, and verbal skills are assessed in the MoCA and Qmci but are absent from the SCREEN. Tests of verbal fluency and logical memory have been shown to be particularly sensitive to early cognitive changes [ 77 , 78 ] but are similarly absent from the SCREEN.

Exploratory factor analysis was performed to examine the SCREEN’s ability to reliably measure risk of MCI. The Kaiser-Meyer-Olkin measure yielded a value of 0.79, exceeding the commonly accepted threshold of 0.70, indicating that the sample was adequate for factor analysis. The Bartlett test of sphericity returned a chi-square value of χ²(21)=167.1 (P<.001), confirming the presence of correlations among variables suitable for factor analysis. A principal component analysis revealed 2 components with eigenvalues of >1, cumulatively accounting for 52.12% of the variance, with the first factor alone explaining 37.8%. After the varimax rotation, the 2 factors exhibited distinct patterns of loadings, with the visual-spatial ability factor loading predominantly on the second factor. The SCREEN tasks, except for visual-spatial ability, loaded substantially on the factors (>0.5), suggesting that the SCREEN possesses good convergent validity for assessing the risk of MCI.
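For readers who want to reproduce this style of analysis, the sketch below runs the same sequence of checks (KMO, Bartlett test, then a 2-factor varimax solution) on simulated stand-ins for the 7 SCREEN task scores, using the third-party factor_analyzer package; it is illustrative only and will not reproduce the study’s values.

```python
# Simulated stand-ins for 147 participants x 7 task scores with a built-in
# 2-factor structure; requires the third-party factor_analyzer package.
import numpy as np
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity, calculate_kmo)

rng = np.random.default_rng(0)
latent = rng.normal(size=(147, 2))
weights = rng.uniform(0.4, 0.8, size=(2, 7))
scores = latent @ weights + rng.normal(scale=0.5, size=(147, 7))

chi_sq, p_value = calculate_bartlett_sphericity(scores)
_, kmo_total = calculate_kmo(scores)
print(f"KMO={kmo_total:.2f}; Bartlett chi2(21)={chi_sq:.1f}, p={p_value:.3g}")

fa = FactorAnalyzer(n_factors=2, rotation="varimax")
fa.fit(scores)
print("rotated loadings:\n", fa.loadings_.round(2))
```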

Criterion Validity

The coding of SCREEN scores into a binary healthy and unhealthy outcome standardized the dependent variable to allow for criterion testing. Criterion validity was assessed using cross-tabulations and the analysis of confusion matrices and provided insights into the sensitivity and specificity of the SCREEN when compared to the Qmci. Of the 144 cases considered, 20 (13.9%) were true negatives, and 74 (51.4%) were true positives. The SCREEN’s sensitivity, which reflects its capacity to accurately identify healthy individuals (true positives), was 63.25% (74 correct identifications/117 actual positives). The specificity of the test, indicating its ability to accurately identify unhealthy individuals (true negatives), was 74.07% (20 correct identifications/27 actual negatives). Sensitivity and specificity were then derived using discrepant analysis and the resolver test described previously (whether the HCP referred the participant to a physician following the screens). The results were identical: the estimated SCREEN sensitivity was 63.3% (74/117), and the estimated specificity was 74% (20/27).
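The sensitivity and specificity arithmetic reported above can be reproduced directly from the cross-tabulation counts:

```python
# Counts from the cross-tabulation reported above: 74 true positives of
# 117 actual positives and 20 true negatives of 27 actual negatives,
# where "positive" corresponds to the study's healthy coding.
true_pos, actual_pos = 74, 117
true_neg, actual_neg = 20, 27

sensitivity = true_pos / actual_pos  # 74/117 = 63.25%
specificity = true_neg / actual_neg  # 20/27 = 74.07%
print(f"sensitivity={sensitivity:.2%}, specificity={specificity:.2%}")
```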

Internal Reliability

A Cronbach α of at least 0.70 is considered acceptable, and at least 0.90 is required for clinical instruments [ 79 ]. The estimate of internal consistency for the SCREEN (N=147) was Cronbach α=0.63.
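A minimal sketch of the Cronbach α computation on an items-by-participants matrix follows; the simulated scores are stand-ins for the 7 SCREEN task scores and will not reproduce the reported value of 0.63.

```python
# Cronbach's alpha from an items-by-participants matrix; the simulated
# scores below are stand-ins for the 7 SCREEN task scores.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = participants, columns = scale items."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

rng = np.random.default_rng(1)
shared = rng.normal(size=(147, 1))                     # common signal
items = shared + rng.normal(scale=1.2, size=(147, 7))  # 7 noisy items
print(f"Cronbach alpha = {cronbach_alpha(items):.2f}")
```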

Test-Retest Reliability

Test-retest reliability analyses were conducted using ICC for the SCREEN activity scores and the κ coefficient for the healthy and unhealthy classifications. Guidelines for interpretation of the ICC suggest that anything <0.5 indicates poor reliability and anything between 0.5 and 0.75 suggests moderate reliability [ 80 ]; the ICC for the SCREEN activity scores was 0.54. With respect to the κ coefficient, a κ value of <0.2 is considered to have no level of agreement, a κ value of 0.21 to 0.39 is considered minimal, a κ value of 0.4 to 0.59 is considered weak agreement, and anything >0.8 suggests strong to almost perfect agreement [ 81 ]. The κ coefficient for healthy and unhealthy classifications was 0.15.
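The sketch below illustrates both test-retest statistics on hypothetical two-visit data, using the third-party pingouin package for the ICC and scikit-learn for the κ coefficient; the cutoff used to create the binary labels is a toy assumption.

```python
# Hypothetical two-visit data: ICC via the third-party pingouin package,
# kappa for binary classifications via scikit-learn. The <65 cutoff used
# to create the labels is a toy assumption.
import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(2)
n = 76
visit1 = rng.normal(70, 10, n)
visit2 = visit1 + rng.normal(2, 8, n)  # retest noise plus a practice gain

long = pd.DataFrame({
    "subject": np.tile(np.arange(n), 2),
    "visit": np.repeat(["v1", "v2"], n),
    "score": np.concatenate([visit1, visit2]),
})
icc = pg.intraclass_corr(data=long, targets="subject",
                         raters="visit", ratings="score")
print(icc[["Type", "ICC"]])

kappa = cohen_kappa_score((visit1 < 65).astype(int), (visit2 < 65).astype(int))
print(f"kappa = {kappa:.2f}")
```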

Analysis of the Factors Impacting Healthy and Unhealthy Results

The Spearman rank correlation was used to assess the relationships between participants’ overall activity score on the SCREEN and their total time to complete the SCREEN; age, sex, and self-reported levels of education; technology use; medication use; amount of sleep; and level of anxiety (as measured using the GAS-10) at the time of SCREEN administration. Lower overall activity scores were moderately correlated with being older (rs(142)=–0.57; P<.001) and increased total time to complete the SCREEN (rs(142)=0.49; P<.001). There was also a moderate inverse relationship between overall activity score and total time to complete the SCREEN (rs(142)=–0.67; P<.001) whereby better performance was associated with quicker task completion. There were weak positive associations between overall activity score and increased technology use (rs(142)=0.34; P<.001) and higher level of education (rs(142)=0.21; P=.01).
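A minimal sketch of these Spearman rank correlations on hypothetical data follows; the simulated relationships loosely mirror the directions reported above.

```python
# Hypothetical data whose relationships loosely mirror the reported
# directions (older age and longer completion time -> lower scores).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
age = rng.uniform(55, 88, 144)
score = 120 - 0.8 * age + rng.normal(0, 8, 144)
minutes = 25 - 0.1 * score + rng.normal(0, 2, 144)

rho_age, p_age = stats.spearmanr(score, age)
rho_time, p_time = stats.spearmanr(score, minutes)
print(f"score vs age: rho={rho_age:.2f} (p={p_age:.3g})")
print(f"score vs completion time: rho={rho_time:.2f} (p={p_time:.3g})")
```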

A logistic regression model was used to predict the SCREEN result using data from 144 observations. The model’s predictors explain approximately 21.33% of the variance in the outcome variable. The likelihood ratio test indicates that the model provides a significantly better fit to the data than a model without predictors ( P <.001).

The SCREEN outcome variable ( healthy vs unhealthy ) was associated with the predictor variables sex and total time to complete the SCREEN. More specifically, female participants were more likely to obtain healthy SCREEN outcomes ( P =.007; 95% CI 0.32-2.05). For all participants, the longer it took to complete the SCREEN, the less likely they were to achieve a healthy SCREEN outcome ( P =.002; 95% CI –0.33 to –0.07). Age ( P =.25; 95% CI –0.09 to 0.02), medication use ( P =.96; 95% CI –0.9 to 0.94), technology use ( P =.44; 95% CI –0.28 to 0.65), level of education ( P =.14; 95% CI –0.09 to 0.64), level of anxiety ( P =.26; 95% CI –1.13 to 0.3), and hours of sleep ( P =.08; 95% CI –0.06 to 0.93) were not significant.
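A sketch of a comparable logistic regression using statsmodels follows; the data-generating process, predictor set, and coefficients are hypothetical and chosen only to mirror the direction of the reported associations.

```python
# Hypothetical data and coefficients chosen to mirror the direction of
# the reported associations; not the study's actual model or data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 144
female = rng.integers(0, 2, n)
minutes = rng.normal(15, 4, n)
log_odds = 0.8 * female - 0.2 * (minutes - 15)
healthy = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

X = sm.add_constant(np.column_stack([female, minutes]))
result = sm.Logit(healthy, X).fit(disp=0)
print(result.summary(xname=["const", "female", "minutes"]))
```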

Impact of Practice Effects

The SCREEN was administered approximately 3 months apart, and separate, paired-sample t tests were performed to compare SCREEN outcomes between visits 1 and 2 (76/147, 51.7%; Table 4 ), visits 2 and 3 (22/147, 15%), and visits 3 and 4 (13/147, 8.8%). The declining number of participants across later visits was partially attributable to the early shutdown of data collection due to the COVID-19 pandemic, and therefore, comparisons between visits 2 and 3 or visits 3 and 4 were not reported. Compared to participants’ SCREEN performance on visit 1, their overall mean activity score and overall processing time improved on their second administration of the SCREEN (score: t(75)=–2.86 and P=.005; processing time: t(75)=–2.98 and P=.004). Even though the 7 task-specific activity scores on the SCREEN also increased between visits 1 and 2, these improvements were not significant, indicating that the difference in overall activity scores was cumulative and not attributable to a specific task ( Table 4 ).
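A minimal sketch of the paired-sample comparison on hypothetical visit 1 and visit 2 scores follows; the simulated practice gain is an assumption for illustration.

```python
# Hypothetical visit 1 and visit 2 overall activity scores with a small
# simulated practice gain.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
visit1 = rng.normal(70, 10, 76)
visit2 = visit1 + rng.normal(2.5, 7, 76)

t_stat, p_val = stats.ttest_rel(visit2, visit1)
print(f"paired t(75) = {t_stat:.2f}, p = {p_val:.3f}")
```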

Principal Findings

Our study aimed to evaluate the effectiveness and reliability of the BrainFx SCREEN in detecting MCI in primary care settings. The research took place during the COVID-19 pandemic, which influenced the study’s execution and timeline. Despite these challenges, the findings offer valuable insights into cognitive impairment screening.

Brief MCI screening tools help time-strapped primary care physicians determine whether referral for a definitive battery of more time-consuming and expensive tests is warranted. These tools must optimize and balance the need for time efficiency while also being psychometrically valid and easily administered [ 82 ]. The importance of brevity is determined by a number of factors, including the clinical setting. Screens that can be completed in approximately 5 minutes or less [ 13 ] are recommended for faster-paced clinical settings (eg, emergency rooms and preoperative screens), whereas those that can be completed in 5 to 10 minutes are better suited to primary care settings [ 82 - 84 ]. Identifying affordable, psychometrically tested screening tests for MCI that integrate into clinical workflows and are easy to consistently administer and complete may help with the following:

  • Initiating appropriate diagnostic tests for signs and symptoms at an earlier stage
  • Normalizing and destigmatizing cognitive testing for older adults
  • Expediting referrals
  • Allowing for timely access to programs and services that can support aging in place or delay institutionalization
  • Reducing risk
  • Improving the psychosocial well-being of patients and their care partners by increasing access to information and resources that aid with future planning and decision-making [ 85 , 86 ]

Various cognitive tests are commonly used for detecting MCI. These include the Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease, Sunderland Clock Drawing Test, Informant Questionnaire on Cognitive Decline in the Elderly, Memory Alternation Test, MMSE, MoCA 8.1, and Qmci [ 61 , 87 ]. The Addenbrooke Cognitive Examination–Revised, Consortium to Establish a Registry for Alzheimer’s Disease, MoCA 8.1, Qmci, and Memory Alternation Test are reported to have similar diagnostic accuracy [ 61 , 88 ]. The HCPs participating in this study reported using the MoCA 8.1 as their primary screening tool for MCI along with other assessments such as the MMSE and Trail Making Test parts A and B.

Recent research highlights the growing use of digital tools [ 51 , 89 , 90 ], mobile technology [ 91 , 92 ], virtual reality [ 93 , 94 ], and artificial intelligence [ 95 ] to improve early identification of MCI. Demeyere et al [ 51 ] developed the tablet-based, 10-item Oxford Cognitive Screen–Plus to detect slight changes in cognitive impairment across 5 domains of cognition (memory, attention, number, praxis, and language), which has been validated among neurologically healthy older adults. Statsenko et al [ 96 ] have explored improvement of the predictive capabilities of tests using artificial intelligence. Similarly, there is an emerging focus on the use of machine learning techniques to detect dementia leveraging routinely collected clinical data [ 97 , 98 ]. This progression signifies a shift toward more technologically advanced, efficient, and potentially more accurate diagnostic approaches in the detection of MCI.

Whatever the modality, screening tools should be quick to administer, demonstrate consistent results over time and between different evaluators, cover all major cognitive areas, and be straightforward to both administer and interpret [ 99 ]. However, highly sensitive tests such as those suggested for screening carry a significant risk of false-positive diagnoses [ 15 ]. Given the high potential for harm of false positives, it is important to validate the psychometric properties of screening tests across different populations and understand how factors such as age and education can influence the results [ 99 ].

Our study did not assess the face validity of the SCREEN, but participating occupational therapists were comfortable with the test regimen. Nonetheless, the research team noted the absence of verbal fluency and memory tests in the SCREEN, both of which McDonnell et al [ 100 ] identified as being more sensitive to the more commonly seen amnesic MCI. Two of the most sensitive tools for MCI screening, the MoCA 8.1 [ 76 ] and Qmci [ 25 ], assess memory, language, and verbal skills, and tests of verbal fluency and logical memory have been shown to be particularly sensitive to early cognitive changes [ 77 , 78 ].

The constructs included in the SCREEN ( Table 1 ) were selected based on a single non–peer-reviewed study [ 58 ] using the 360 and traumatic brain injury data (N=188) that identified the constructs as predictive of brain injury. The absence of tasks that measure verbal fluency or logical memory in the SCREEN appears to weaken claims of construct validity. The principal component analysis of the SCREEN assessment identified 2 components accounting for 52.12% of the total variance. The first component was strongly associated with abstract reasoning, constructive ability, and divided attention, whereas the second was primarily influenced by visual-spatial abilities. This indicates that constructs related to perception, attention, and memory are central to the SCREEN scores.

The SCREEN’s binary outcome (healthy or unhealthy) created by the research team was based on comparisons with the Qmci. However, the method of identifying areas of challenge in the SCREEN by comparing the individual’s mean score on each of the 7 tasks with the mean scores of a global or filtered cohort in the LBB introduces potential biases or errors. These could arise from a surge in additions to the LBB from patients with specific characteristics, self-selection of participants, poorly trained SCREEN administrators, inclusion of nonstandard test results, underuse of appropriate filters, and underreporting of clinical conditions or factors such as socioeconomic status that impact performance in standardized cognitive tests.

The proprietary method of analyzing and reporting SCREEN results complicates traditional sensitivity and specificity measurement. Our testing indicated a sensitivity of 63.25% and specificity of 74.07% for identifying healthy (those without MCI) and unhealthy (those with MCI) individuals. The SCREEN’s Cronbach α of 0.63, slightly below the commonly accepted threshold of 0.70, and reliability scores that were lower than the ideal standards suggest a higher-than-acceptable level of random measurement error in its constructs. The lower reliability may also stem from an inadequate sample size or a limited number of scale items.

The SCREEN’s results are less favorable compared to those of other digital MCI screening tools that similarly enable evaluation of specific cognitive domains but also provide validated, norm-referenced cutoff scores and methods for cumulative scoring in clinical settings (Oxford Cognitive Screen–Plus) [ 51 ] or of validated MCI screening tools used in primary care (eg, MoCA 8.1, Qmci, and MMSE) [ 51 , 87 ]. The SCREEN’s unique scoring algorithm and the dynamic denominator in data analysis necessitate caution in comparing these results to those of other tools with fixed scoring algorithms and known sensitivities [ 101 , 102 ]. We found the SCREEN to have lower-than-expected internal reliability, suggesting significant random measurement error. Test-retest reliability was weak for the healthy or unhealthy outcome but stronger for overall activity scores between tests. The variability in identifying areas of challenge could relate to technological difficulties or variability from comparisons with a growing database of test results.

Potential reasons for older adults’ poorer scores on timed tests include the impact of sensorimotor decline on touch screen sensation and reaction time [ 38 , 103 ], anxiety related to taking a computer-enabled test [ 104 - 106 ], or the anticipated consequences of a negative outcome [ 107 ]. However, these effects were unlikely to have influenced the results of this study. Practice effects were observed [ 29 , 108 ], but the SCREEN’s novelty suggests that familiarity is not gained through prior preparation or word of mouth as this sample was self-selected and not randomized. Future research might also explore the impact of digital literacy and cultural differences in the interpretation of software constructs or icons on MCI screening in a randomized, older adult sample.

Limitations

This study had methodological limitations that warrant attention. The small sample size and the demographic distribution of the 147 participants aged ≥55 years, with most (96/147, 65.3%) being female and well educated, limit the generalizability of the findings to different populations. The study’s design, aiming to explore the sensitivity of the SCREEN for early detection of MCI, necessitated the exclusion of individuals with a previous diagnosis of MCI or dementia. This exclusion criterion might have impacted the study’s ability to thoroughly assess the SCREEN’s effectiveness in a more varied clinical context. The requirement for participants to read and comprehend English introduced another limitation to our study. This criterion potentially limited the SCREEN tool’s applicability across diverse linguistic backgrounds as individuals with language-based impairments or those not proficient in English may face challenges in completing the assessment [ 51 ]. Such limitations could impact the generalizability of our findings to non–English-speaking populations or to those with language impairments, underscoring the need for further research to evaluate the SCREEN tool’s effectiveness in broader clinical and linguistic contexts.

Financial constraints played a role in limiting the study’s scope. Due to funding limitations, it was not possible to include specialist assessments and a battery of neuropsychiatric tests generally considered the gold standard to confirm or rule out an MCI diagnosis. Therefore, the study relied on differential verification through 2 imperfect reference standards: a comparison with the Qmci (the tool with the highest published sensitivity to MCI in 2019, when the study was designed) and the clinical judgment of the administering HCP, particularly in decisions regarding referrals for further clinical assessment. Furthermore, while an economic feasibility assessment was considered, the research team determined that it should follow, not precede, an evaluation of the SCREEN’s validity and reliability.

The proprietary nature of the algorithm used for scoring the SCREEN posed another challenge. Without access to this algorithm, the research team had to use a novel comparative statistical approach, coding patient results into a binary variable: healthy (SCREEN=no areas of challenge OR Qmci≥67 out of 100) or unhealthy (SCREEN=one or more areas of challenge OR Qmci<67 out of 100). This may have introduced a higher level of error into our statistical analysis. Furthermore, the process for determining areas of challenge on the SCREEN involves comparing a participant’s result to the existing SCREEN results in the LBB at the time of testing. By the end of this study, the LBB contained 632 SCREEN results for adults aged ≥55 years, with this study contributing 258 of those results. The remaining 366 original SCREEN results, 64% of which were completed by individuals who self-identified as having a preexisting diagnosis or conditions associated with cognitive impairment (eg, traumatic brain injury, concussion, or stroke), could have led to an overestimation of the means and SDs of the study participants’ results at the outset of the study.

Unlike other cognitive screening tools, the SCREEN allows for filtering of results to compare different patient cohorts in the LBB using criteria such as age and education. However, at this stage of the LBB’s development, using such filters can significantly reduce the reliability of the results due to a smaller comparator population (ie, the denominator used to calculate the mean and SD). This, in turn, affects the significance of the results. Moreover, the constantly changing LBB data set makes it challenging to meaningfully compare an individual’s results over time as the evolving denominator affects the accuracy and relevance of these comparisons. Finally, the significant improvement in SCREEN scores between the first and second visits suggests the presence of practice effects, which could have influenced the reliability and validity of the findings.

Conclusions

In a primary care setting, where MCI screening tools are essential and recommended for those with concerns [ 85 ], certain criteria are paramount: time efficiency, ease of administration, and robust psychometric properties [ 82 ]. Our analysis of the BrainFx SCREEN suggests that, despite its innovative approach and digital delivery, it currently falls short in meeting these criteria. The SCREEN’s comparatively longer administration time and lower-than-expected reliability scores suggest that it may not be the most effective tool for MCI screening of older adults in a primary care setting at this time.

It is important to note that, in the wake of the COVID-19 pandemic, and with an aging population living and aging by design or necessity in a community setting, there is growing interest in digital solutions, including web-based applications and platforms to both collect digital biomarkers and deliver cognitive training and other interventions [ 109 , 110 ]. However, new normative standards are required when adapting cognitive tests to digital formats [ 92 ] as the change in medium can significantly impact test performance and results interpretation. Therefore, we recommend caution when interpreting our study results and encourage continued research and refinement of tools such as the SCREEN. This ongoing process will ensure that current and future MCI screening tools are effective, reliable, and relevant in meeting the needs of our aging population, particularly in primary care settings where early detection and intervention are key.

Acknowledgments

The researchers gratefully acknowledge the Ontario Centres of Excellence Health Technologies Fund for their financial support of this study; the executive directors and clinical leads in each of the Family Health Team study locations; the participants and their friends and families who took part in the study; and research assistants Sharmin Sharker, Kelly Zhu, and Muhammad Umair for their contributions to data management and statistical analysis.

Data Availability

The data sets generated during and analyzed during this study are available from the corresponding author on reasonable request.

Authors' Contributions

JM contributed to the conceptualization, methodology, validation, formal analysis, data curation, writing—original draft, writing—review and editing, visualization, supervision, and funding acquisition. AML contributed to the conceptualization, methodology, validation, investigation, formal analysis, data curation, writing—original draft, writing—review and editing, visualization, and project administration. WP contributed to the validation, formal analysis, data curation, writing—original draft, writing—review and editing, and visualization. Finally, PH contributed to conceptualization, methodology, writing—review and editing, supervision, and funding acquisition.

Conflicts of Interest

None declared.

  • Casagrande M, Marselli G, Agostini F, Forte G, Favieri F, Guarino A. The complex burden of determining prevalence rates of mild cognitive impairment: a systematic review. Front Psychiatry. 2022;13:960648. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Petersen RC, Caracciolo B, Brayne C, Gauthier S, Jelic V, Fratiglioni L. Mild cognitive impairment: a concept in evolution. J Intern Med. Mar 2014;275(3):214-228. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Knopman DS, Petersen RC. Mild cognitive impairment and mild dementia: a clinical perspective. Mayo Clin Proc. Oct 2014;89(10):1452-1459. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Anderson ND. State of the science on mild cognitive impairment (MCI). CNS Spectr. Feb 2019;24(1):78-87. [ CrossRef ] [ Medline ]
  • Tangalos EG, Petersen RC. Mild cognitive impairment in geriatrics. Clin Geriatr Med. Nov 2018;34(4):563-589. [ CrossRef ] [ Medline ]
  • Ng R, Maxwell C, Yates E, Nylen K, Antflick J, Jette N, et al. Brain disorders in Ontario: prevalence, incidence and costs from health administrative data. Institute for Clinical Evaluative Sciences. 2015. URL: https://www.ices.on.ca/publications/research-reports/brain-disorders-in-ontario-prevalence-incidence-and-costs-from-health-administrative-data/ [accessed 2024-04-01]
  • Centers for Disease Control and Prevention (CDC). Self-reported increased confusion or memory loss and associated functional difficulties among adults aged ≥ 60 years - 21 states, 2011. MMWR Morb Mortal Wkly Rep. May 10, 2013;62(18):347-350. [ FREE Full text ] [ Medline ]
  • Petersen RC, Lopez O, Armstrong MJ, Getchius TS, Ganguli M, Gloss D, et al. Practice guideline update summary: mild cognitive impairment: report of the guideline development, dissemination, and implementation subcommittee of the American Academy of Neurology. Neurology. Jan 16, 2018;90(3):126-135. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Song WX, Wu WW, Zhao YY, Xu HL, Chen GC, Jin SY, et al. Evidence from a meta-analysis and systematic review reveals the global prevalence of mild cognitive impairment. Front Aging Neurosci. 2023;15:1227112. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chen Y, Denny KG, Harvey D, Farias ST, Mungas D, DeCarli C, et al. Progression from normal cognition to mild cognitive impairment in a diverse clinic-based and community-based elderly cohort. Alzheimers Dement. Apr 2017;13(4):399-405. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Langa KM, Levine DA. The diagnosis and management of mild cognitive impairment: a clinical review. JAMA. Dec 17, 2014;312(23):2551-2561. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhang Y, Natale G, Clouston S. Incidence of mild cognitive impairment, conversion to probable dementia, and mortality. Am J Alzheimers Dis Other Demen. 2021;36:15333175211012235. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Prince M, Bryce R, Ferri CP. World Alzheimer report 2011: the benefits of early diagnosis and intervention. Alzheimer’s Disease International. 2011. URL: https://www.alzint.org/u/WorldAlzheimerReport2011.pdf [accessed 2024-04-01]
  • Patnode CD, Perdue LA, Rossom RC, Rushkin MC, Redmond N, Thomas RG, et al. Screening for cognitive impairment in older adults: updated evidence report and systematic review for the US preventive services task force. JAMA. Feb 25, 2020;323(8):764-785. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Canadian Task Force on Preventive Health Care, Pottie K, Rahal R, Jaramillo A, Birtwhistle R, Thombs BD, et al. Recommendations on screening for cognitive impairment in older adults. CMAJ. Jan 05, 2016;188(1):37-46. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Tahami Monfared AA, Phan NT, Pearson I, Mauskopf J, Cho M, Zhang Q, et al. A systematic review of clinical practice guidelines for Alzheimer's disease and strategies for future advancements. Neurol Ther. Aug 2023;12(4):1257-1284. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mattke S, Jun H, Chen E, Liu Y, Becker A, Wallick C. Expected and diagnosed rates of mild cognitive impairment and dementia in the U.S. medicare population: observational analysis. Alzheimers Res Ther. Jul 22, 2023;15(1):128. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Manly JJ, Tang MX, Schupf N, Stern Y, Vonsattel JP, Mayeux R. Frequency and course of mild cognitive impairment in a multiethnic community. Ann Neurol. Apr 2008;63(4):494-506. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Black CM, Ambegaonkar BM, Pike J, Jones E, Husbands J, Khandker RK. The diagnostic pathway from cognitive impairment to dementia in Japan: quantification using real-world data. Alzheimer Dis Assoc Disord. 2019;33(4):346-353. [ CrossRef ] [ Medline ]
  • Ritchie CW, Black CM, Khandker RK, Wood R, Jones E, Hu X, et al. Quantifying the diagnostic pathway for patients with cognitive impairment: real-world data from seven European and North American countries. J Alzheimers Dis. 2018;62(1):457-466. [ CrossRef ] [ Medline ]
  • Folstein MF, Folstein SE, McHugh PR. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. Nov 1975;12(3):189-198. [ CrossRef ] [ Medline ]
  • Tsoi KK, Chan JY, Hirai HW, Wong SY, Kwok TC. Cognitive tests to detect dementia: a systematic review and meta-analysis. JAMA Intern Med. Sep 2015;175(9):1450-1458. [ CrossRef ] [ Medline ]
  • Lopez MN, Charter RA, Mostafavi B, Nibut LP, Smith WE. Psychometric properties of the Folstein mini-mental state examination. Assessment. Jun 2005;12(2):137-144. [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Phillips NA, Bédirian V, Charbonneau S, Whitehead V, Collin I, et al. The Montreal cognitive assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc. Apr 2005;53(4):695-699. [ CrossRef ] [ Medline ]
  • O'Caoimh R, Timmons S, Molloy DW. Screening for mild cognitive impairment: comparison of "MCI specific" screening instruments. J Alzheimers Dis. 2016;51(2):619-629. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Trzepacz PT, Hochstetler H, Wang S, Walker B, Saykin AJ, Alzheimer’s Disease Neuroimaging Initiative. Relationship between the Montreal cognitive assessment and mini-mental state examination for assessment of mild cognitive impairment in older adults. BMC Geriatr. Sep 07, 2015;15:107. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Phillips N, Chertkow H. Normative data for the Montreal Cognitive Assessment (MoCA) in a population-based sample. Neurology. Mar 06, 2012;78(10):765-766. [ CrossRef ] [ Medline ]
  • Monroe T, Carter M. Using the Folstein Mini Mental State Exam (MMSE) to explore methodological issues in cognitive aging research. Eur J Ageing. Sep 2012;9(3):265-274. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Damian AM, Jacobson SA, Hentz JG, Belden CM, Shill HA, Sabbagh MN, et al. The Montreal cognitive assessment and the mini-mental state examination as screening instruments for cognitive impairment: item analyses and threshold scores. Dement Geriatr Cogn Disord. 2011;31(2):126-131. [ CrossRef ] [ Medline ]
  • Kaufer DI, Williams CS, Braaten AJ, Gill K, Zimmerman S, Sloane PD. Cognitive screening for dementia and mild cognitive impairment in assisted living: comparison of 3 tests. J Am Med Dir Assoc. Oct 2008;9(8):586-593. [ CrossRef ] [ Medline ]
  • Gagnon C, Saillant K, Olmand M, Gayda M, Nigam A, Bouabdallaoui N, et al. Performances on the Montreal cognitive assessment along the cardiovascular disease continuum. Arch Clin Neuropsychol. Jan 17, 2022;37(1):117-124. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cooley SA, Heaps JM, Bolzenius JD, Salminen LE, Baker LM, Scott SE, et al. Longitudinal change in performance on the Montreal cognitive assessment in older adults. Clin Neuropsychol. 2015;29(6):824-835. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • O'Caoimh R, Gao Y, McGlade C, Healy L, Gallagher P, Timmons S, et al. Comparison of the quick mild cognitive impairment (Qmci) screen and the SMMSE in screening for mild cognitive impairment. Age Ageing. Sep 2012;41(5):624-629. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • O'Caoimh R, Molloy DW. Comparing the diagnostic accuracy of two cognitive screening instruments in different dementia subtypes and clinical depression. Diagnostics (Basel). Aug 08, 2019;9(3):93. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Clarnette R, O'Caoimh R, Antony DN, Svendrovski A, Molloy DW. Comparison of the Quick Mild Cognitive Impairment (Qmci) screen to the Montreal Cognitive Assessment (MoCA) in an Australian geriatrics clinic. Int J Geriatr Psychiatry. Jun 2017;32(6):643-649. [ CrossRef ] [ Medline ]
  • Glynn K, Coen R, Lawlor BA. Is the Quick Mild Cognitive Impairment screen (QMCI) more accurate at detecting mild cognitive impairment than existing short cognitive screening tests? A systematic review of the current literature. Int J Geriatr Psychiatry. Dec 2019;34(12):1739-1746. [ CrossRef ] [ Medline ]
  • Lee MT, Chang WY, Jang Y. Psychometric and diagnostic properties of the Taiwan version of the quick mild cognitive impairment screen. PLoS One. 2018;13(12):e0207851. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wallace SE, Donoso Brown EV, Simpson RC, D'Acunto K, Kranjec A, Rodgers M, et al. A comparison of electronic and paper versions of the Montreal cognitive assessment. Alzheimer Dis Assoc Disord. 2019;33(3):272-278. [ CrossRef ] [ Medline ]
  • Gagnon C, Olmand M, Dupuy EG, Besnier F, Vincent T, Grégoire CA, et al. Videoconference version of the Montreal cognitive assessment: normative data for Quebec-French people aged 50 years and older. Aging Clin Exp Res. Jul 2022;34(7):1627-1633. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Friemel TN. The digital divide has grown old: determinants of a digital divide among seniors. New Media & Society. Jun 12, 2014;18(2):313-331. [ CrossRef ]
  • Ventola CL. Mobile devices and apps for health care professionals: uses and benefits. P T. May 2014;39(5):356-364. [ FREE Full text ] [ Medline ]
  • Searles C, Farnsworth JL, Jubenville C, Kang M, Ragan B. Test–retest reliability of the BrainFx 360® performance assessment. Athl Train Sports Health Care. Jul 2019;11(4):183-191. [ CrossRef ]
  • Jones C, Miguel-Cruz A, Brémault-Phillips S. Technology acceptance and usability of the BrainFx SCREEN in Canadian military members and veterans with posttraumatic stress disorder and mild traumatic brain injury: mixed methods UTAUT study. JMIR Rehabil Assist Technol. May 13, 2021;8(2):e26078. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McMurray J, Levy A, Holyoke P. Psychometric evaluation and workflow integration study of a tablet-based tool to detect mild cognitive impairment in older adults: protocol for a mixed methods study. JMIR Res Protoc. May 21, 2021;10(5):e25520. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wilansky P, Eklund JM, Milner T, Kreindler D, Cheung A, Kovacs T, et al. Cognitive behavior therapy for anxious and depressed youth: improving homework adherence through mobile technology. JMIR Res Protoc. Nov 10, 2016;5(4):e209. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ammenwerth E, Iller C, Mahler C. IT-adoption and the interaction of task, technology and individuals: a fit framework and a case study. BMC Med Inform Decis Mak. Jan 09, 2006;6:3. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Goodhue DL, Thompson RL. Task-technology fit and individual performance. MIS Q. Jun 1995;19(2):213-236. [ CrossRef ]
  • Beuscher L, Grando VT. Challenges in conducting qualitative research with individuals with dementia. Res Gerontol Nurs. Jan 2009;2(1):6-11. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Howe E. Informed consent, participation in research, and the Alzheimer's patient. Innov Clin Neurosci. May 2012;9(5-6):47-51. [ FREE Full text ] [ Medline ]
  • Thorogood A, Mäki-Petäjä-Leinonen A, Brodaty H, Dalpé G, Gastmans C, Gauthier S, et al; Global Alliance for Genomics and Health, Ageing and Dementia Task Team. Consent recommendations for research and international data sharing involving persons with dementia. Alzheimers Dement. Oct 2018;14(10):1334-1343. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Demeyere N, Haupt M, Webb SS, Strobel L, Milosevich ET, Moore MJ, et al. Introducing the tablet-based Oxford Cognitive Screen-Plus (OCS-Plus) as an assessment tool for subtle cognitive impairments. Sci Rep. Apr 12, 2021;11(1):8000. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nasreddine ZS, Patel BB. Validation of Montreal cognitive assessment, MoCA, alternate French versions. Can J Neurol Sci. Sep 2016;43(5):665-671. [ CrossRef ] [ Medline ]
  • Mueller AE, Segal DL, Gavett B, Marty MA, Yochim B, June A, et al. Geriatric anxiety scale: item response theory analysis, differential item functioning, and creation of a ten-item short form (GAS-10). Int Psychogeriatr. Jul 2015;27(7):1099-1111. [ CrossRef ] [ Medline ]
  • Segal DL, June A, Payne M, Coolidge FL, Yochim B. Development and initial validation of a self-report assessment tool for anxiety among older adults: the Geriatric Anxiety Scale. J Anxiety Disord. Oct 2010;24(7):709-714. [ CrossRef ] [ Medline ]
  • Balsamo M, Cataldi F, Carlucci L, Fairfield B. Assessment of anxiety in older adults: a review of self-report measures. Clin Interv Aging. 2018;13:573-593. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gatti A, Gottschling J, Brugnera A, Adorni R, Zarbo C, Compare A, et al. An investigation of the psychometric properties of the Geriatric Anxiety Scale (GAS) in an Italian sample of community-dwelling older adults. Aging Ment Health. Sep 2018;22(9):1170-1178. [ CrossRef ] [ Medline ]
  • Yochim BP, Mueller AE, June A, Segal DL. Psychometric properties of the Geriatric Anxiety Scale: comparison to the beck anxiety inventory and geriatric anxiety inventory. Clin Gerontol. Dec 06, 2010;34(1):21-33. [ CrossRef ]
  • Recent concussion (< 6 months ago) analysis result. Daisy Intelligence. 2016. URL: https://www.daisyintelligence.com/retail-solutions/ [accessed 2024-04-01]
  • Molloy DW, O'Caoimh R. The Quick Guide: Scoring and Administration Instructions for The Quick Mild Cognitive Impairment (Qmci) Screen. Waterford, Ireland. Newgrange Press; 2017.
  • O'Caoimh R, Gao Y, Svendovski A, Gallagher P, Eustace J, Molloy DW. Comparing approaches to optimize cut-off scores for short cognitive screening instruments in mild cognitive impairment and dementia. J Alzheimers Dis. 2017;57(1):123-133. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Breton A, Casey D, Arnaoutoglou NA. Cognitive tests for the detection of mild cognitive impairment (MCI), the prodromal stage of dementia: meta-analysis of diagnostic accuracy studies. Int J Geriatr Psychiatry. Feb 2019;34(2):233-242. [ CrossRef ] [ Medline ]
  • Umemneku Chikere CM, Wilson K, Graziadio S, Vale L, Allen AJ. Diagnostic test evaluation methodology: a systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard - An update. PLoS One. 2019;14(10):e0223832. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Espinosa A, Alegret M, Boada M, Vinyes G, Valero S, Martínez-Lage P, et al. Ecological assessment of executive functions in mild cognitive impairment and mild Alzheimer's disease. J Int Neuropsychol Soc. Sep 2009;15(5):751-757. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hawkins DM, Garrett JA, Stephenson B. Some issues in resolution of diagnostic tests using an imperfect gold standard. Stat Med. Jul 15, 2001;20(13):1987-2001. [ CrossRef ] [ Medline ]
  • Hadgu A, Dendukuri N, Hilden J. Evaluation of nucleic acid amplification tests in the absence of a perfect gold-standard test: a review of the statistical and epidemiologic issues. Epidemiology. Sep 2005;16(5):604-612. [ CrossRef ] [ Medline ]
  • Marx RG, Menezes A, Horovitz L, Jones EC, Warren RF. A comparison of two time intervals for test-retest reliability of health status instruments. J Clin Epidemiol. Aug 2003;56(8):730-735. [ CrossRef ] [ Medline ]
  • Paiva CE, Barroso EM, Carneseca EC, de Pádua Souza C, Dos Santos FT, Mendoza López RV, et al. A critical analysis of test-retest reliability in instrument validation studies of cancer patients under palliative care: a systematic review. BMC Med Res Methodol. Jan 21, 2014;14:8. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Streiner DL, Kottner J. Recommendations for reporting the results of studies of instrument and scale development and testing. J Adv Nurs. Sep 2014;70(9):1970-1979. [ CrossRef ] [ Medline ]
  • Streiner DL. A checklist for evaluating the usefulness of rating scales. Can J Psychiatry. Mar 1993;38(2):140-148. [ CrossRef ] [ Medline ]
  • Peyre H, Leplège A, Coste J. Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey. Qual Life Res. Mar 2011;20(2):287-300. [ CrossRef ] [ Medline ]
  • Nevado-Holgado AJ, Kim CH, Winchester L, Gallacher J, Lovestone S. Commonly prescribed drugs associate with cognitive function: a cross-sectional study in UK Biobank. BMJ Open. Nov 30, 2016;6(11):e012177. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Moore AR, O'Keeffe ST. Drug-induced cognitive impairment in the elderly. Drugs Aging. Jul 1999;15(1):15-28. [ CrossRef ] [ Medline ]
  • Rogers J, Wiese BS, Rabheru K. The older brain on drugs: substances that may cause cognitive impairment. Geriatr Aging. 2008;11(5):284-289. [ FREE Full text ]
  • Marvanova M. Drug-induced cognitive impairment: effect of cardiovascular agents. Ment Health Clin. Jul 2016;6(4):201-206. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Espeland MA, Rapp SR, Manson JE, Goveas JS, Shumaker SA, Hayden KM, et al; WHIMSY and WHIMS-ECHO Study Groups. Long-term effects on cognitive trajectories of postmenopausal hormone therapy in two age groups. J Gerontol A Biol Sci Med Sci. Jun 01, 2017;72(6):838-845. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Luis CA, Keegan AP, Mullan M. Cross validation of the Montreal cognitive assessment in community dwelling older adults residing in the Southeastern US. Int J Geriatr Psychiatry. Feb 2009;24(2):197-201. [ CrossRef ] [ Medline ]
  • Cunje A, Molloy DW, Standish TI, Lewis DL. Alternate forms of logical memory and verbal fluency tasks for repeated testing in early cognitive changes. Int Psychogeriatr. Feb 2007;19(1):65-75. [ CrossRef ] [ Medline ]
  • Molloy DW, Standish TI, Lewis DL. Screening for mild cognitive impairment: comparing the SMMSE and the ABCS. Can J Psychiatry. Jan 2005;50(1):52-58. [ CrossRef ] [ Medline ]
  • Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th edition. Oxford, UK. Oxford University Press; 2008.
  • Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. Jun 2016;15(2):155-163. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276-282. [ FREE Full text ] [ Medline ]
  • Zhuang L, Yang Y, Gao J. Cognitive assessment tools for mild cognitive impairment screening. J Neurol. May 2021;268(5):1615-1622. [ CrossRef ] [ Medline ]
  • Zhang J, Wang L, Deng X, Fei G, Jin L, Pan X, et al. Five-minute cognitive test as a new quick screening of cognitive impairment in the elderly. Aging Dis. Dec 2019;10(6):1258-1269. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Feldman HH, Jacova C, Robillard A, Garcia A, Chow T, Borrie M, et al. Diagnosis and treatment of dementia: 2. Diagnosis. CMAJ. Mar 25, 2008;178(7):825-836. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sabbagh MN, Boada M, Borson S, Chilukuri M, Dubois B, Ingram J, et al. Early detection of mild cognitive impairment (MCI) in primary care. J Prev Alzheimers Dis. 2020;7(3):165-170. [ CrossRef ] [ Medline ]
  • Milne A. Dementia screening and early diagnosis: the case for and against. Health Risk Soc. Mar 05, 2010;12(1):65-76. [ CrossRef ]
  • Screening tools to identify adults with cognitive impairment associated with dementia: diagnostic accuracy. Canadian Agency for Drugs and Technologies in Health. 2014. URL: https://www.cadth.ca/sites/default/files/pdf/htis/nov-2014/RB0752%20Cognitive%20Assessments%20for%20Dementia%20Final.pdf [accessed 2024-04-01]
  • Chehrehnegar N, Nejati V, Shati M, Rashedi V, Lotfi M, Adelirad F, et al. Early detection of cognitive disturbances in mild cognitive impairment: a systematic review of observational studies. Psychogeriatrics. Mar 2020;20(2):212-228. [ CrossRef ] [ Medline ]
  • Chan JY, Yau ST, Kwok TC, Tsoi KK. Diagnostic performance of digital cognitive tests for the identification of MCI and dementia: a systematic review. Ageing Res Rev. Dec 2021;72:101506. [ CrossRef ] [ Medline ]
  • Cubillos C, Rienzo A. Digital cognitive assessment tests for older adults: systematic literature review. JMIR Ment Health. Dec 08, 2023;10:e47487. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chen R, Foschini L, Kourtis L, Signorini A, Jankovic F, Pugh M, et al. Developing measures of cognitive impairment in the real world from consumer-grade multimodal sensor streams. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019. Presented at: KDD '19; August 4-8, 2019; Anchorage, AK. p. 2145. URL: https://dl.acm.org/doi/10.1145/3292500.3330690 [ CrossRef ]
  • Koo BM, Vizer LM. Mobile technology for cognitive assessment of older adults: a scoping review. Innov Aging. Jan 2019;3(1):igy038. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zygouris S, Ntovas K, Giakoumis D, Votis K, Doumpoulakis S, Segkouli S, et al. A preliminary study on the feasibility of using a virtual reality cognitive training application for remote detection of mild cognitive impairment. J Alzheimers Dis. 2017;56(2):619-627. [ CrossRef ] [ Medline ]
  • Liu Q, Song H, Yan M, Ding Y, Wang Y, Chen L, et al. Virtual reality technology in the detection of mild cognitive impairment: a systematic review and meta-analysis. Ageing Res Rev. Jun 2023;87:101889. [ CrossRef ] [ Medline ]
  • Fayemiwo MA, Olowookere TA, Olaniyan OO, Ojewumi TO, Oyetade IS, Freeman S, et al. Immediate word recall in cognitive assessment can predict dementia using machine learning techniques. Alzheimers Res Ther. Jun 15, 2023;15(1):111. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Statsenko Y, Meribout S, Habuza T, Almansoori TM, van Gorkom KN, Gelovani JG, et al. Patterns of structure-function association in normal aging and in Alzheimer's disease: screening for mild cognitive impairment and dementia with ML regression and classification models. Front Aging Neurosci. 2022;14:943566. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Roebuck-Spencer TM, Glen T, Puente AE, Denney RL, Ruff RM, Hostetter G, et al. Cognitive screening tests versus comprehensive neuropsychological test batteries: a national academy of neuropsychology education paper. Arch Clin Neuropsychol. Jun 01, 2017;32(4):491-498. [ CrossRef ] [ Medline ]
  • Jammeh EA, Carroll CB, Pearson SW, Escudero J, Anastasiou A, Zhao P, et al. Machine-learning based identification of undiagnosed dementia in primary care: a feasibility study. BJGP Open. Jul 2018;2(2):bjgpopen18X101589. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Riello M, Rusconi E, Treccani B. The role of brief global cognitive tests and neuropsychological expertise in the detection and differential diagnosis of dementia. Front Aging Neurosci. 2021;13:648310. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • McDonnell M, Dill L, Panos S, Amano S, Brown W, Giurgius S, et al. Verbal fluency as a screening tool for mild cognitive impairment. Int Psychogeriatr. Sep 2020;32(9):1055-1062. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wojtowicz A, Larner AJ. Diagnostic test accuracy of cognitive screeners in older people. Prog Neurol Psychiatry. Mar 20, 2017;21(1):17-21. [ CrossRef ]
  • Larner AJ. Cognitive screening instruments for the diagnosis of mild cognitive impairment. Prog Neurol Psychiatry. Apr 07, 2016;20(2):21-26. [ CrossRef ]
  • Heintz BD, Keenan KG. Spiral tracing on a touchscreen is influenced by age, hand, implement, and friction. PLoS One. 2018;13(2):e0191309. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Laguna K, Babcock RL. Computer anxiety in young and older adults: implications for human-computer interactions in older populations. Comput Human Behav. Aug 1997;13(3):317-326. [ CrossRef ]
  • Wild KV, Mattek NC, Maxwell SA, Dodge HH, Jimison HB, Kaye JA. Computer-related self-efficacy and anxiety in older adults with and without mild cognitive impairment. Alzheimers Dement. Nov 2012;8(6):544-552. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wiechmann D, Ryan AM. Reactions to computerized testing in selection contexts. Int J Sel Assess. Jul 30, 2003;11(2-3):215-229. [ CrossRef ]
  • Gass CS, Curiel RE. Test anxiety in relation to measures of cognitive and intellectual functioning. Arch Clin Neuropsychol. Aug 2011;26(5):396-404. [ CrossRef ] [ Medline ]
  • Barbic D, Kim B, Salehmohamed Q, Kemplin K, Carpenter CR, Barbic SP. Diagnostic accuracy of the Ottawa 3DY and short blessed test to detect cognitive dysfunction in geriatric patients presenting to the emergency department. BMJ Open. Mar 16, 2018;8(3):e019652. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Owens AP, Ballard C, Beigi M, Kalafatis C, Brooker H, Lavelle G, et al. Implementing remote memory clinics to enhance clinical care during and after COVID-19. Front Psychiatry. 2020;11:579934. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Geddes MR, O'Connell ME, Fisk JD, Gauthier S, Camicioli R, Ismail Z, et al. Alzheimer Society of Canada Task Force on Dementia Care Best Practices for COVID‐19. Remote cognitive and behavioral assessment: report of the Alzheimer Society of Canada task force on dementia care best practices for COVID-19. Alzheimers Dement (Amst). 2020;12(1):e12111. [ FREE Full text ] [ CrossRef ] [ Medline ]

Edited by G Eysenbach, T de Azevedo Cardoso; submitted 29.01.24; peer-reviewed by J Gao, MJ Moore; comments to author 20.02.24; revised version received 05.03.24; accepted 19.03.24; published 19.04.24.

©Josephine McMurray, AnneMarie Levy, Wei Pang, Paul Holyoke. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 19.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.


The Limitations of Quasi-Experimental Studies, and Methods for Data Analysis When a Quasi-Experimental Research Design Is Unavoidable

Chittaranjan Andrade

Dept. of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India.

A quasi-experimental (QE) study is one that compares outcomes between intervention groups where, for reasons related to ethics or feasibility, participants are not randomized to their respective interventions; an example is the historical comparison of pregnancy outcomes in women who did versus did not receive antidepressant medication during pregnancy. QE designs are also sometimes used in noninterventional research; an example is the comparison of neuropsychological test performance between first-degree relatives of schizophrenia patients and healthy controls. In QE studies, groups may differ systematically in several ways at baseline; when these differences influence the outcome of interest, comparing outcomes between groups using univariable methods can generate misleading results. Multivariable regression is therefore suggested as a better approach to data analysis; because the effects of confounding variables can be adjusted for in multivariable regression, the unique effect of the grouping variable can be better understood. However, although multivariable regression is better than univariable analyses, there are inevitably inadequately measured, unmeasured, and unknown confounds that may limit the validity of the conclusions drawn. Investigators should therefore employ QE designs sparingly, and only if no other option is available to answer an important research question.

If we wish to study how antidepressant drug treatment affects outcomes in pregnancy, we should ideally randomize depressed pregnant women to receive an antidepressant drug or placebo; this is a randomized controlled trial (RCT) research design. However, because ethics committees are unlikely to approve such RCTs, researchers can only examine pregnancy outcomes (prospectively or retrospectively) in women who did versus did not receive antidepressant drugs; this is a quasi-experimental (QE) research design. A QE study is one that compares outcomes between intervention groups where, for reasons related to ethics or feasibility, participants are not randomized to their respective interventions.

QE studies are problematic because, when participants are not randomized to intervention versus control groups, systematic biases may influence group membership. For example, women who are prescribed and who accept antidepressant medications during pregnancy are likely to be more severely ill than those who are not prescribed, or who do not accept, antidepressant medications during pregnancy. So, if adverse pregnancy outcomes are more common in the antidepressant group, they may be consequences of the genetic, physiological, and/or behavioral features that characterize severe depression rather than of the antidepressant treatment itself.

A statistical approach to dealing with such confounds is to perform a regression analysis in which pregnancy outcome is the dependent variable, and antidepressant treatment, age, sex, socioeconomic status, medical history, family history, smoking history, drinking history, history of use of other substances, nutrition, history of infection during pregnancy, and dozens of other important variables that can influence pregnancy outcomes are the independent variables. In such a regression, antidepressant treatment is the independent variable of interest, and the remaining independent variables are confounders that are adjusted for so that the unique effect of antidepressant treatment on pregnancy outcomes can be better identified. Propensity score matching refines this approach to analysis [1].
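The following sketch illustrates the kind of multivariable regression described above, using Python's statsmodels formula API. It is not the analysis from any cited study: the DataFrame, the column names (adverse_outcome, antidepressant, age, smoking, ses), and the simulated effect sizes are all hypothetical placeholders, and the confounder list is abbreviated for brevity.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data; in a real analysis, df would hold one row
# per pregnancy with the measured covariates.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "antidepressant": rng.integers(0, 2, n),
    "age": rng.normal(30, 5, n),
    "smoking": rng.integers(0, 2, n),
    "ses": rng.integers(1, 4, n),  # socioeconomic status, 1 (low) to 3 (high)
})
# The outcome here is driven by the confounders, not by the drug,
# to mimic confounding by indication.
logit_p = -2.0 + 0.04 * (df["age"] - 30) + 0.8 * df["smoking"]
df["adverse_outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Antidepressant exposure is the variable of interest; the remaining
# terms are confounders adjusted for in the regression.
model = smf.logit(
    "adverse_outcome ~ antidepressant + age + C(smoking) + C(ses)", data=df
).fit()
print(model.summary())
print("Adjusted odds ratio:", np.exp(model.params["antidepressant"]))
```

The exponentiated coefficient on the exposure term is its adjusted odds ratio, net of the measured confounders entered into the model.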

Many investigators use QE designs to answer their research questions, though not necessarily as an "experiment" with an intervention. For example, Thomas et al [2] compared psychosocial dysfunction and family burden between outpatients diagnosed with schizophrenia and those diagnosed with obsessive-compulsive disorder (OCD). Obviously, it is not feasible to randomize patients to have schizophrenia or OCD. So, in their analysis, Thomas et al [2] first examined whether the two groups were comparable on important sociodemographic and clinical variables. They found that the groups did not differ on, for example, age, family income, and duration of illness (although here, as in other QE studies, these baseline comparisons would almost certainly have been underpowered); however, the schizophrenia group was overrepresented for males and for a history of substance abuse. In further analysis, Thomas et al [2] used t tests to compare dysfunction and burden between the two groups; they found that both dysfunction and burden were greater in schizophrenia than in OCD.

Now, because patients had not been randomized to their respective diagnoses, it is obvious that the groups could have differed in many ways, not in diagnosis alone. So, separate regressions should have been conducted, as sketched below, one with dysfunction and one with burden as the dependent variable, and with diagnosis, age, sex, socioeconomic status, duration of illness, history of substance abuse, and other such variables as the independent variables. Such an analysis would allow the investigators to understand not only the unique impact of the diagnosis but also the impact of the other sociodemographic and clinical variables on dysfunction and burden.
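A minimal sketch of these suggested regressions follows, again with simulated placeholder data (all column names and effects are hypothetical, not taken from the cited study):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data; a real analysis would use the study's own
# measured variables, including socioeconomic status and others.
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "diagnosis": rng.choice(["OCD", "schizophrenia"], n),
    "age": rng.normal(35, 10, n),
    "sex": rng.choice(["F", "M"], n),
    "duration_of_illness": rng.exponential(8, n),
    "substance_abuse": rng.integers(0, 2, n),
})
sz = (df["diagnosis"] == "schizophrenia").astype(float)
df["dysfunction"] = 40 + 5 * sz + 0.2 * df["age"] + rng.normal(0, 8, n)
df["burden"] = 30 + 4 * sz + 0.5 * df["duration_of_illness"] + rng.normal(0, 6, n)

confounders = "age + C(sex) + duration_of_illness + C(substance_abuse)"
for outcome in ("dysfunction", "burden"):
    fit = smf.ols(f"{outcome} ~ C(diagnosis) + {confounders}", data=df).fit()
    # The coefficient on C(diagnosis)[T.schizophrenia] estimates the
    # group difference after adjusting for the listed confounders.
    print(outcome, fit.params.filter(like="diagnosis").round(2).to_dict())
```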

Note that inadequately measured, unmeasured, and unknown confounds would still have plagued the results. For example, in this study [2], severity of illness was an unmeasured confound. What if the authors had, by chance, sampled more severely ill schizophrenia patients and less severely ill OCD patients? Then, illness severity rather than clinical diagnosis would have explained the greater dysfunction and burden observed in the schizophrenia group. Had they obtained a global rating of illness severity, they could have included it as an additional, important independent variable in the regression.

In another study with a QE design, Harave et al [3], like Thomas et al [2], used univariable tests to compare neurocognitive functioning between unaffected first-degree relatives of schizophrenia patients and healthy controls. More correctly, because there are likely to be systematic differences between schizophrenia relatives and healthy controls, they should have performed multivariable regressions, as in the sketch above, with the neurocognitive measures as the dependent variables and with group and confounders as independent variables. Confounders that could have been considered include age, sex, education, family income, a measure of stress, and histories of smoking, drinking, and other substance use, all of which can directly or indirectly influence neurocognitive performance.

This multivariable regression approach to data analysis in QE designs requires the a priori identification and measurement of all important confounding variables. In such analyses, the sample size for a continuous dependent variable should ideally be at least 10–15 times the number of independent variables [4]. Given that the number of confounding variables to be included is likely to be large, a very large sample becomes necessary. Additionally, because studies are never perfect, it is impossible to adjust for inadequately measured, unmeasured, and unknown confounds (although adjusting for whatever is known and measured is better than making no adjustments at all). All said and done, the QE research design is best avoided because it is flawed and because even the best statistical approaches to data analysis are imperfect. The QE design should be considered only when no other options are available. Readers are referred to Harris et al [5] for a further discussion of QE studies.
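The rule of thumb lends itself to a back-of-the-envelope check; the short sketch below simply multiplies the predictor count by the recommended ratio (the 12-predictor example is illustrative).

```python
def min_sample_size(n_predictors: int, per_predictor: int = 15) -> int:
    """Smallest n satisfying the rule of thumb of 10-15 observations
    per independent variable for a continuous dependent variable."""
    return per_predictor * n_predictors

# e.g., a group indicator plus 11 measured confounders = 12 predictors
print(min_sample_size(12, per_predictor=10))  # 120 at the lenient end
print(min_sample_size(12, per_predictor=15))  # 180 at the conservative end
```

Even a modest confounder list therefore implies a sample in the hundreds, before any allowance for missing data.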

Declaration of Conflicting Interests: The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author received no financial support for the research, authorship, and/or publication of this article.
