helpful professor logo

15 Famous Experiments and Case Studies in Psychology

psychology theories, explained below

Psychology has seen thousands upon thousands of research studies over the years. Most of these studies have helped shape our current understanding of human thoughts, behavior, and feelings.

The psychology case studies in this list are considered classic examples of psychological case studies and experiments, which are still being taught in introductory psychology courses up to this day.

Some studies, however, were downright shocking and controversial that you’d probably wonder why such studies were conducted back in the day. Imagine participating in an experiment for a small reward or extra class credit, only to be left scarred for life. These kinds of studies, however, paved the way for a more ethical approach to studying psychology and implementation of research standards such as the use of debriefing in psychology research .

Case Study vs. Experiment

Before we dive into the list of the most famous studies in psychology, let us first review the difference between case studies and experiments.

  • It is an in-depth study and analysis of an individual, group, community, or phenomenon. The results of a case study cannot be applied to the whole population, but they can provide insights for further studies.
  • It often uses qualitative research methods such as observations, surveys, and interviews.
  • It is often conducted in real-life settings rather than in controlled environments.
  • An experiment is a type of study done on a sample or group of random participants, the results of which can be generalized to the whole population.
  • It often uses quantitative research methods that rely on numbers and statistics.
  • It is conducted in controlled environments, wherein some things or situations are manipulated.

See Also: Experimental vs Observational Studies

Famous Experiments in Psychology

1. the marshmallow experiment.

Psychologist Walter Mischel conducted the marshmallow experiment at Stanford University in the 1960s to early 1970s. It was a simple test that aimed to define the connection between delayed gratification and success in life.

The instructions were fairly straightforward: children ages 4-6 were presented a piece of marshmallow on a table and they were told that they would receive a second piece if they could wait for 15 minutes without eating the first marshmallow.

About one-third of the 600 participants succeeded in delaying gratification to receive the second marshmallow. Mischel and his team followed up on these participants in the 1990s, learning that those who had the willpower to wait for a larger reward experienced more success in life in terms of SAT scores and other metrics.

This case study also supported self-control theory , a theory in criminology that holds that people with greater self-control are less likely to end up in trouble with the law!

The classic marshmallow experiment, however, was debunked in a 2018 replication study done by Tyler Watts and colleagues.

This more recent experiment had a larger group of participants (900) and a better representation of the general population when it comes to race and ethnicity. In this study, the researchers found out that the ability to wait for a second marshmallow does not depend on willpower alone but more so on the economic background and social status of the participants.

2. The Bystander Effect

In 1694, Kitty Genovese was murdered in the neighborhood of Kew Gardens, New York. It was told that there were up to 38 witnesses and onlookers in the vicinity of the crime scene, but nobody did anything to stop the murder or call for help.

Such tragedy was the catalyst that inspired social psychologists Bibb Latane and John Darley to formulate the phenomenon called bystander effect or bystander apathy .

Subsequent investigations showed that this story was exaggerated and inaccurate, as there were actually only about a dozen witnesses, at least two of whom called the police. But the case of Kitty Genovese led to various studies that aim to shed light on the bystander phenomenon.

Latane and Darley tested bystander intervention in an experimental study . Participants were asked to answer a questionnaire inside a room, and they would either be alone or with two other participants (who were actually actors or confederates in the study). Smoke would then come out from under the door. The reaction time of participants was tested — how long would it take them to report the smoke to the authorities or the experimenters?

The results showed that participants who were alone in the room reported the smoke faster than participants who were with two passive others. The study suggests that the more onlookers are present in an emergency situation, the less likely someone would step up to help, a social phenomenon now popularly called the bystander effect.

3. Asch Conformity Study

Have you ever made a decision against your better judgment just to fit in with your friends or family? The Asch Conformity Studies will help you understand this kind of situation better.

In this experiment, a group of participants were shown three numbered lines of different lengths and asked to identify the longest of them all. However, only one true participant was present in every group and the rest were actors, most of whom told the wrong answer.

Results showed that the participants went for the wrong answer, even though they knew which line was the longest one in the first place. When the participants were asked why they identified the wrong one, they said that they didn’t want to be branded as strange or peculiar.

This study goes to show that there are situations in life when people prefer fitting in than being right. It also tells that there is power in numbers — a group’s decision can overwhelm a person and make them doubt their judgment.

4. The Bobo Doll Experiment

The Bobo Doll Experiment was conducted by Dr. Albert Bandura, the proponent of social learning theory .

Back in the 1960s, the Nature vs. Nurture debate was a popular topic among psychologists. Bandura contributed to this discussion by proposing that human behavior is mostly influenced by environmental rather than genetic factors.

In the Bobo Doll Experiment, children were divided into three groups: one group was shown a video in which an adult acted aggressively toward the Bobo Doll, the second group was shown a video in which an adult play with the Bobo Doll, and the third group served as the control group where no video was shown.

The children were then led to a room with different kinds of toys, including the Bobo Doll they’ve seen in the video. Results showed that children tend to imitate the adults in the video. Those who were presented the aggressive model acted aggressively toward the Bobo Doll while those who were presented the passive model showed less aggression.

While the Bobo Doll Experiment can no longer be replicated because of ethical concerns, it has laid out the foundations of social learning theory and helped us understand the degree of influence adult behavior has on children.

5. Blue Eye / Brown Eye Experiment

Following the assassination of Martin Luther King Jr. in 1968, third-grade teacher Jane Elliott conducted an experiment in her class. Although not a formal experiment in controlled settings, A Class Divided is a good example of a social experiment to help children understand the concept of racism and discrimination.

The class was divided into two groups: blue-eyed children and brown-eyed children. For one day, Elliott gave preferential treatment to her blue-eyed students, giving them more attention and pampering them with rewards. The next day, it was the brown-eyed students’ turn to receive extra favors and privileges.

As a result, whichever group of students was given preferential treatment performed exceptionally well in class, had higher quiz scores, and recited more frequently; students who were discriminated against felt humiliated, answered poorly in tests, and became uncertain with their answers in class.

This study is now widely taught in sociocultural psychology classes.

6. Stanford Prison Experiment

One of the most controversial and widely-cited studies in psychology is the Stanford Prison Experiment , conducted by Philip Zimbardo at the basement of the Stanford psychology building in 1971. The hypothesis was that abusive behavior in prisons is influenced by the personality traits of the prisoners and prison guards.

The participants in the experiment were college students who were randomly assigned as either a prisoner or a prison guard. The prison guards were then told to run the simulated prison for two weeks. However, the experiment had to be stopped in just 6 days.

The prison guards abused their authority and harassed the prisoners through verbal and physical means. The prisoners, on the other hand, showed submissive behavior. Zimbardo decided to stop the experiment because the prisoners were showing signs of emotional and physical breakdown.

Although the experiment wasn’t completed, the results strongly showed that people can easily get into a social role when others expect them to, especially when it’s highly stereotyped .

7. The Halo Effect

Have you ever wondered why toothpastes and other dental products are endorsed in advertisements by celebrities more often than dentists? The Halo Effect is one of the reasons!

The Halo Effect shows how one favorable attribute of a person can gain them positive perceptions in other attributes. In the case of product advertisements, attractive celebrities are also perceived as intelligent and knowledgeable of a certain subject matter even though they’re not technically experts.

The Halo Effect originated in a classic study done by Edward Thorndike in the early 1900s. He asked military commanding officers to rate their subordinates based on different qualities, such as physical appearance, leadership, dependability, and intelligence.

The results showed that high ratings of a particular quality influences the ratings of other qualities, producing a halo effect of overall high ratings. The opposite also applied, which means that a negative rating in one quality also correlated to negative ratings in other qualities.

Experiments on the Halo Effect came in various formats as well, supporting Thorndike’s original theory. This phenomenon suggests that our perception of other people’s overall personality is hugely influenced by a quality that we focus on.

8. Cognitive Dissonance

There are experiences in our lives when our beliefs and behaviors do not align with each other and we try to justify them in our minds. This is cognitive dissonance , which was studied in an experiment by Leon Festinger and James Carlsmith back in 1959.

In this experiment, participants had to go through a series of boring and repetitive tasks, such as spending an hour turning pegs in a wooden knob. After completing the tasks, they were then paid either $1 or $20 to tell the next participants that the tasks were extremely fun and enjoyable. Afterwards, participants were asked to rate the experiment. Those who were given $1 rated the experiment as more interesting and fun than those who received $20.

The results showed that those who received a smaller incentive to lie experienced cognitive dissonance — $1 wasn’t enough incentive for that one hour of painstakingly boring activity, so the participants had to justify that they had fun anyway.

Famous Case Studies in Psychology

9. little albert.

In 1920, behaviourist theorists John Watson and Rosalie Rayner experimented on a 9-month-old baby to test the effects of classical conditioning in instilling fear in humans.

This was such a controversial study that it gained popularity in psychology textbooks and syllabi because it is a classic example of unethical research studies done in the name of science.

In one of the experiments, Little Albert was presented with a harmless stimulus or object, a white rat, which he wasn’t scared of at first. But every time Little Albert would see the white rat, the researchers would play a scary sound of hammer and steel. After about 6 pairings, Little Albert learned to fear the rat even without the scary sound.

Little Albert developed signs of fear to different objects presented to him through classical conditioning . He even generalized his fear to other stimuli not present in the course of the experiment.

10. Phineas Gage

Phineas Gage is such a celebrity in Psych 101 classes, even though the way he rose to popularity began with a tragic accident. He was a resident of Central Vermont and worked in the construction of a new railway line in the mid-1800s. One day, an explosive went off prematurely, sending a tamping iron straight into his face and through his brain.

Gage survived the accident, fortunately, something that is considered a feat even up to this day. He managed to find a job as a stagecoach after the accident. However, his family and friends reported that his personality changed so much that “he was no longer Gage” (Harlow, 1868).

New evidence on the case of Phineas Gage has since come to light, thanks to modern scientific studies and medical tests. However, there are still plenty of mysteries revolving around his brain damage and subsequent recovery.

11. Anna O.

Anna O., a social worker and feminist of German Jewish descent, was one of the first patients to receive psychoanalytic treatment.

Her real name was Bertha Pappenheim and she inspired much of Sigmund Freud’s works and books on psychoanalytic theory, although they hadn’t met in person. Their connection was through Joseph Breuer, Freud’s mentor when he was still starting his clinical practice.

Anna O. suffered from paralysis, personality changes, hallucinations, and rambling speech, but her doctors could not find the cause. Joseph Breuer was then called to her house for intervention and he performed psychoanalysis, also called the “talking cure”, on her.

Breuer would tell Anna O. to say anything that came to her mind, such as her thoughts, feelings, and childhood experiences. It was noted that her symptoms subsided by talking things out.

However, Breuer later referred Anna O. to the Bellevue Sanatorium, where she recovered and set out to be a renowned writer and advocate of women and children.

12. Patient HM

H.M., or Henry Gustav Molaison, was a severe amnesiac who had been the subject of countless psychological and neurological studies.

Henry was 27 when he underwent brain surgery to cure the epilepsy that he had been experiencing since childhood. In an unfortunate turn of events, he lost his memory because of the surgery and his brain also became unable to store long-term memories.

He was then regarded as someone living solely in the present, forgetting an experience as soon as it happened and only remembering bits and pieces of his past. Over the years, his amnesia and the structure of his brain had helped neuropsychologists learn more about cognitive functions .

Suzanne Corkin, a researcher, writer, and good friend of H.M., recently published a book about his life. Entitled Permanent Present Tense , this book is both a memoir and a case study following the struggles and joys of Henry Gustav Molaison.

13. Chris Sizemore

Chris Sizemore gained celebrity status in the psychology community when she was diagnosed with multiple personality disorder, now known as dissociative identity disorder.

Sizemore has several alter egos, which included Eve Black, Eve White, and Jane. Various papers about her stated that these alter egos were formed as a coping mechanism against the traumatic experiences she underwent in her childhood.

Sizemore said that although she has succeeded in unifying her alter egos into one dominant personality, there were periods in the past experienced by only one of her alter egos. For example, her husband married her Eve White alter ego and not her.

Her story inspired her psychiatrists to write a book about her, entitled The Three Faces of Eve , which was then turned into a 1957 movie of the same title.

14. David Reimer

When David was just 8 months old, he lost his penis because of a botched circumcision operation.

Psychologist John Money then advised Reimer’s parents to raise him as a girl instead, naming him Brenda. His gender reassignment was supported by subsequent surgery and hormonal therapy.

Money described Reimer’s gender reassignment as a success, but problems started to arise as Reimer was growing up. His boyishness was not completely subdued by the hormonal therapy. When he was 14 years old, he learned about the secrets of his past and he underwent gender reassignment to become male again.

Reimer became an advocate for children undergoing the same difficult situation he had been. His life story ended when he was 38 as he took his own life.

15. Kim Peek

Kim Peek was the inspiration behind Rain Man , an Oscar-winning movie about an autistic savant character played by Dustin Hoffman.

The movie was released in 1988, a time when autism wasn’t widely known and acknowledged yet. So it was an eye-opener for many people who watched the film.

In reality, Kim Peek was a non-autistic savant. He was exceptionally intelligent despite the brain abnormalities he was born with. He was like a walking encyclopedia, knowledgeable about travel routes, US zip codes, historical facts, and classical music. He also read and memorized approximately 12,000 books in his lifetime.

This list of experiments and case studies in psychology is just the tip of the iceberg! There are still countless interesting psychology studies that you can explore if you want to learn more about human behavior and dynamics.

You can also conduct your own mini-experiment or participate in a study conducted in your school or neighborhood. Just remember that there are ethical standards to follow so as not to repeat the lasting physical and emotional harm done to Little Albert or the Stanford Prison Experiment participants.

Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological Monographs: General and Applied, 70 (9), 1–70. https://doi.org/10.1037/h0093718

Bandura, A., Ross, D., & Ross, S. A. (1961). Transmission of aggression through imitation of aggressive models. The Journal of Abnormal and Social Psychology, 63 (3), 575–582. https://doi.org/10.1037/h0045925

Elliott, J., Yale University., WGBH (Television station : Boston, Mass.), & PBS DVD (Firm). (2003). A class divided. New Haven, Conn.: Yale University Films.

Festinger, L., & Carlsmith, J. M. (1959). Cognitive consequences of forced compliance. The Journal of Abnormal and Social Psychology, 58 (2), 203–210. https://doi.org/10.1037/h0041593

Haney, C., Banks, W. C., & Zimbardo, P. G. (1973). A study of prisoners and guards in a simulated prison. Naval Research Review , 30 , 4-17.

Latane, B., & Darley, J. M. (1968). Group inhibition of bystander intervention in emergencies. Journal of Personality and Social Psychology, 10 (3), 215–221. https://doi.org/10.1037/h0026570

Mischel, W. (2014). The Marshmallow Test: Mastering self-control. Little, Brown and Co.

Thorndike, E. (1920) A Constant Error in Psychological Ratings. Journal of Applied Psychology , 4 , 25-29. http://dx.doi.org/10.1037/h0071663

Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of experimental psychology , 3 (1), 1.

Chris

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education. [Image Descriptor: Photo of Chris]

  • Chris Drew (PhD) https://helpfulprofessor.com/author/chris-drew-phd/ 15 Animism Examples
  • Chris Drew (PhD) https://helpfulprofessor.com/author/chris-drew-phd/ 10 Magical Thinking Examples
  • Chris Drew (PhD) https://helpfulprofessor.com/author/chris-drew-phd/ Social-Emotional Learning (Definition, Examples, Pros & Cons)
  • Chris Drew (PhD) https://helpfulprofessor.com/author/chris-drew-phd/ What is Educational Psychology?

Leave a Comment Cancel Reply

Your email address will not be published. Required fields are marked *

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • HHS Author Manuscripts

Logo of nihpa

Single-Case Experimental Designs: A Systematic Review of Published Research and Current Standards

Justin d. smith.

Child and Family Center, University of Oregon

This article systematically reviews the research design and methodological characteristics of single-case experimental design (SCED) research published in peer-reviewed journals between 2000 and 2010. SCEDs provide researchers with a flexible and viable alternative to group designs with large sample sizes. However, methodological challenges have precluded widespread implementation and acceptance of the SCED as a viable complementary methodology to the predominant group design. This article includes a description of the research design, measurement, and analysis domains distinctive to the SCED; a discussion of the results within the framework of contemporary standards and guidelines in the field; and a presentation of updated benchmarks for key characteristics (e.g., baseline sampling, method of analysis), and overall, it provides researchers and reviewers with a resource for conducting and evaluating SCED research. The results of the systematic review of 409 studies suggest that recently published SCED research is largely in accordance with contemporary criteria for experimental quality. Analytic method emerged as an area of discord. Comparison of the findings of this review with historical estimates of the use of statistical analysis indicates an upward trend, but visual analysis remains the most common analytic method and also garners the most support amongst those entities providing SCED standards. Although consensus exists along key dimensions of single-case research design and researchers appear to be practicing within these parameters, there remains a need for further evaluation of assessment and sampling techniques and data analytic methods.

The single-case experiment has a storied history in psychology dating back to the field’s founders: Fechner (1889) , Watson (1925) , and Skinner (1938) . It has been used to inform and develop theory, examine interpersonal processes, study the behavior of organisms, establish the effectiveness of psychological interventions, and address a host of other research questions (for a review, see Morgan & Morgan, 2001 ). In recent years the single-case experimental design (SCED) has been represented in the literature more often than in past decades, as is evidenced by recent reviews ( Hammond & Gast, 2010 ; Shadish & Sullivan, 2011 ), but it still languishes behind the more prominent group design in nearly all subfields of psychology. Group designs are often professed to be superior because they minimize, although do not necessarily eliminate, the major internal validity threats to drawing scientifically valid inferences from the results ( Shadish, Cook, & Campbell, 2002 ). SCEDs provide a rigorous, methodologically sound alternative method of evaluation (e.g., Barlow, Nock, & Hersen, 2008 ; Horner et al., 2005 ; Kazdin, 2010 ; Kratochwill & Levin, 2010 ; Shadish et al., 2002 ) but are often overlooked as a true experimental methodology capable of eliciting legitimate inferences (e.g., Barlow et al., 2008 ; Kazdin, 2010 ). Despite a shift in the zeitgeist from single-case experiments to group designs more than a half century ago, recent and rapid methodological advancements suggest that SCEDs are poised for resurgence.

Single case refers to the participant or cluster of participants (e.g., a classroom, hospital, or neighborhood) under investigation. In contrast to an experimental group design in which one group is compared with another, participants in a single-subject experiment research provide their own control data for the purpose of comparison in a within-subject rather than a between-subjects design. SCEDs typically involve a comparison between two experimental time periods, known as phases. This approach typically includes collecting a representative baseline phase to serve as a comparison with subsequent phases. In studies examining single subjects that are actually groups (i.e., classroom, school), there are additional threats to internal validity of the results, as noted by Kratochwill and Levin (2010) , which include setting or site effects.

The central goal of the SCED is to determine whether a causal or functional relationship exists between a researcher-manipulated independent variable (IV) and a meaningful change in the dependent variable (DV). SCEDs generally involve repeated, systematic assessment of one or more IVs and DVs over time. The DV is measured repeatedly across and within all conditions or phases of the IV. Experimental control in SCEDs includes replication of the effect either within or between participants ( Horner et al., 2005 ). Randomization is another way in which threats to internal validity can be experimentally controlled. Kratochwill and Levin (2010) recently provided multiple suggestions for adding a randomization component to SCEDs to improve the methodological rigor and internal validity of the findings.

Examination of the effectiveness of interventions is perhaps the area in which SCEDs are most well represented ( Morgan & Morgan, 2001 ). Researchers in behavioral medicine and in clinical, health, educational, school, sport, rehabilitation, and counseling psychology often use SCEDs because they are particularly well suited to examining the processes and outcomes of psychological and behavioral interventions (e.g., Borckardt et al., 2008 ; Kazdin, 2010 ; Robey, Schultz, Crawford, & Sinner, 1999 ). Skepticism about the clinical utility of the randomized controlled trial (e.g., Jacobsen & Christensen, 1996 ; Wachtel, 2010 ; Westen & Bradley, 2005 ; Westen, Novotny, & Thompson-Brenner, 2004 ) has renewed researchers’ interest in SCEDs as a means to assess intervention outcomes (e.g., Borckardt et al., 2008 ; Dattilio, Edwards, & Fishman, 2010 ; Horner et al., 2005 ; Kratochwill, 2007 ; Kratochwill & Levin, 2010 ). Although SCEDs are relatively well represented in the intervention literature, it is by no means their sole home: Examples appear in nearly every subfield of psychology (e.g., Bolger, Davis, & Rafaeli, 2003 ; Piasecki, Hufford, Solham, & Trull, 2007 ; Reis & Gable, 2000 ; Shiffman, Stone, & Hufford, 2008 ; Soliday, Moore, & Lande, 2002 ). Aside from the current preference for group-based research designs, several methodological challenges have repressed the proliferation of the SCED.

Methodological Complexity

SCEDs undeniably present researchers with a complex array of methodological and research design challenges, such as establishing a representative baseline, managing the nonindependence of sequential observations (i.e., autocorrelation, serial dependence), interpreting single-subject effect sizes, analyzing the short data streams seen in many applications, and appropriately addressing the matter of missing observations. In the field of intervention research for example, Hser et al. (2001) noted that studies using SCEDs are “rare” because of the minimum number of observations that are necessary (e.g., 3–5 data points in each phase) and the complexity of available data analysis approaches. Advances in longitudinal person-based trajectory analysis (e.g., Nagin, 1999 ), structural equation modeling techniques (e.g., Lubke & Muthén, 2005 ), time-series forecasting (e.g., autoregressive integrated moving averages; Box & Jenkins, 1970 ), and statistical programs designed specifically for SCEDs (e.g., Simulation Modeling Analysis; Borckardt, 2006 ) have provided researchers with robust means of analysis, but they might not be feasible methods for the average psychological scientist.

Application of the SCED has also expanded. Today, researchers use variants of the SCED to examine complex psychological processes and the relationship between daily and momentary events in peoples’ lives and their psychological correlates. Research in nearly all subfields of psychology has begun to use daily diary and ecological momentary assessment (EMA) methods in the context of the SCED, opening the door to understanding increasingly complex psychological phenomena (see Bolger et al., 2003 ; Shiffman et al., 2008 ). In contrast to the carefully controlled laboratory experiment that dominated research in the first half of the twentieth century (e.g., Skinner, 1938 ; Watson, 1925 ), contemporary proponents advocate application of the SCED in naturalistic studies to increase the ecological validity of empirical findings (e.g., Bloom, Fisher, & Orme, 2003 ; Borckardt et al., 2008 ; Dattilio et al., 2010 ; Jacobsen & Christensen, 1996 ; Kazdin, 2008 ; Morgan & Morgan, 2001 ; Westen & Bradley, 2005 ; Westen et al., 2004 ). Recent advancements and expanded application of SCEDs indicate a need for updated design and reporting standards.

Many current benchmarks in the literature concerning key parameters of the SCED were established well before current advancements and innovations, such as the suggested minimum number of data points in the baseline phase(s), which remains a disputed area of SCED research (e.g., Center, Skiba, & Casey, 1986 ; Huitema, 1985 ; R. R. Jones, Vaught, & Weinrott, 1977 ; Sharpley, 1987 ). This article comprises (a) an examination of contemporary SCED methodological and reporting standards; (b) a systematic review of select design, measurement, and statistical characteristics of published SCED research during the past decade; and (c) a broad discussion of the critical aspects of this research to inform methodological improvements and study reporting standards. The reader will garner a fundamental understanding of what constitutes appropriate methodological soundness in single-case experimental research according to the established standards in the field, which can be used to guide the design of future studies, improve the presentation of publishable empirical findings, and inform the peer-review process. The discussion begins with the basic characteristics of the SCED, including an introduction to time-series, daily diary, and EMA strategies, and describes how current reporting and design standards apply to each of these areas of single-case research. Interweaved within this presentation are the results of a systematic review of SCED research published between 2000 and 2010 in peer-reviewed outlets and a discussion of the way in which these findings support, or differ from, existing design and reporting standards and published SCED benchmarks.

Review of Current SCED Guidelines and Reporting Standards

In contrast to experimental group comparison studies, which conform to generally well agreed upon methodological design and reporting guidelines, such as the CONSORT ( Moher, Schulz, Altman, & the CONSORT Group, 2001 ) and TREND ( Des Jarlais, Lyles, & Crepaz, 2004 ) statements for randomized and nonrandomized trials, respectively, there is comparatively much less consensus when it comes to the SCED. Until fairly recently, design and reporting guidelines for single-case experiments were almost entirely absent in the literature and were typically determined by the preferences of a research subspecialty or a particular journal’s editorial board. Factions still exist within the larger field of psychology, as can be seen in the collection of standards presented in this article, particularly in regard to data analytic methods of SCEDs, but fortunately there is budding agreement about certain design and measurement characteristics. A number of task forces, professional groups, and independent experts in the field have recently put forth guidelines; each has a relatively distinct purpose, which likely accounts for some of the discrepancies between them. In what is to be a central theme of this article, researchers are ultimately responsible for thoughtfully and synergistically combining research design, measurement, and analysis aspects of a study.

This review presents the more prominent, comprehensive, and recently established SCED standards. Six sources are discussed: (1) Single-Case Design Technical Documentation from the What Works Clearinghouse (WWC; Kratochwill et al., 2010 ); (2) the APA Division 12 Task Force on Psychological Interventions, with contributions from the Division 12 Task Force on Promotion and Dissemination of Psychological Procedures and the APA Task Force for Psychological Intervention Guidelines (DIV12; presented in Chambless & Hollon, 1998 ; Chambless & Ollendick, 2001 ), adopted and expanded by APA Division 53, the Society for Clinical Child and Adolescent Psychology ( Weisz & Hawley, 1998 , 1999 ); (3) the APA Division 16 Task Force on Evidence-Based Interventions in School Psychology (DIV16; Members of the Task Force on Evidence-Based Interventions in School Psychology. Chair: T. R. Kratochwill, 2003); (4) the National Reading Panel (NRP; National Institute of Child Health and Human Development, 2000 ); (5) the Single-Case Experimental Design Scale ( Tate et al., 2008 ); and (6) the reporting guidelines for EMA put forth by Stone & Shiffman (2002) . Although the specific purposes of each source differ somewhat, the overall aim is to provide researchers and reviewers with agreed-upon criteria to be used in the conduct and evaluation of SCED research. The standards provided by WWC, DIV12, DIV16, and the NRP represent the efforts of task forces. The Tate et al. scale was selected for inclusion in this review because it represents perhaps the only psychometrically validated tool for assessing the rigor of SCED methodology. Stone and Shiffman’s (2002) standards were intended specifically for EMA methods, but many of their criteria also apply to time-series, daily diary, and other repeated-measurement and sampling methods, making them pertinent to this article. The design, measurement, and analysis standards are presented in the later sections of this article and notable concurrences, discrepancies, strengths, and deficiencies are summarized.

Systematic Review Search Procedures and Selection Criteria

Search strategy.

A comprehensive search strategy of SCEDs was performed to identify studies published in peer-reviewed journals meeting a priori search and inclusion criteria. First, a computer-based PsycINFO search of articles published between 2000 and 2010 (search conducted in July 2011) was conducted that used the following primary key terms and phrases that appeared anywhere in the article (asterisks denote that any characters/letters can follow the last character of the search term): alternating treatment design, changing criterion design, experimental case*, multiple baseline design, replicated single-case design, simultaneous treatment design, time-series design. The search was limited to studies published in the English language and those appearing in peer-reviewed journals within the specified publication year range. Additional limiters of the type of article were also used in PsycINFO to increase specificity: The search was limited to include methodologies indexed as either quantitative study OR treatment outcome/randomized clinical trial and NOT field study OR interview OR focus group OR literature review OR systematic review OR mathematical model OR qualitative study.

Study selection

The author used a three-phase study selection, screening, and coding procedure to select the highest number of applicable studies. Phase 1 consisted of the initial systematic review conducted using PsycINFO, which resulted in 571 articles. In Phase 2, titles and abstracts were screened: Articles appearing to use a SCED were retained (451) for Phase 3, in which the author and a trained research assistant read each full-text article and entered the characteristics of interest into a database. At each phase of the screening process, studies that did not use a SCED or that either self-identified as, or were determined to be, quasi-experimental were dropped. Of the 571 original studies, 82 studies were determined to be quasi-experimental. The definition of a quasi-experimental design used in the screening procedure conforms to the descriptions provided by Kazdin (2010) and Shadish et al. (2002) regarding the necessary components of an experimental design. For example, reversal designs require a minimum of four phases (e.g., ABAB), and multiple baseline designs must demonstrate replication of the effect across at least three conditions (e.g., subjects, settings, behaviors). Sixteen studies were unavailable in full text in English, and five could not be obtained in full text and were thus dropped. The remaining articles that were not retained for review (59) were determined not to be SCED studies meeting our inclusion criteria, but had been identified in our PsycINFO search using the specified keyword and methodology terms. For this review, 409 studies were selected. The sources of the 409 reviewed studies are summarized in Table 1 . A complete bibliography of the 571 studies appearing in the initial search, with the included studies marked, is available online as an Appendix or from the author.

Journal Sources of Studies Included in the Systematic Review (N = 409)

Note: Each of the following journal titles contributed 1 study unless otherwise noted in parentheses: Augmentative and Alternative Communication; Acta Colombiana de Psicología; Acta Comportamentalia; Adapted Physical Activity Quarterly (2); Addiction Research and Theory; Advances in Speech Language Pathology; American Annals of the Deaf; American Journal of Education; American Journal of Occupational Therapy; American Journal of Speech-Language Pathology; The American Journal on Addictions; American Journal on Mental Retardation; Applied Ergonomics; Applied Psychophysiology and Biofeedback; Australian Journal of Guidance & Counseling; Australian Psychologist; Autism; The Behavior Analyst; The Behavior Analyst Today; Behavior Analysis in Practice (2); Behavior and Social Issues (2); Behaviour Change (2); Behavioural and Cognitive Psychotherapy; Behaviour Research and Therapy (3); Brain and Language (2); Brain Injury (2); Canadian Journal of Occupational Therapy (2); Canadian Journal of School Psychology; Career Development for Exceptional Individuals; Chinese Mental Health Journal; Clinical Linguistics and Phonetics; Clinical Psychology & Psychotherapy; Cognitive and Behavioral Practice; Cognitive Computation; Cognitive Therapy and Research; Communication Disorders Quarterly; Developmental Medicine & Child Neurology (2); Developmental Neurorehabilitation (2); Disability and Rehabilitation: An International, Multidisciplinary Journal (3); Disability and Rehabilitation: Assistive Technology; Down Syndrome: Research & Practice; Drug and Alcohol Dependence (2); Early Childhood Education Journal (2); Early Childhood Services: An Interdisciplinary Journal of Effectiveness; Educational Psychology (2); Education and Training in Autism and Developmental Disabilities; Electronic Journal of Research in Educational Psychology; Environment and Behavior (2); European Eating Disorders Review; European Journal of Sport Science; European Review of Applied Psychology; Exceptional Children; Exceptionality; Experimental and Clinical Psychopharmacology; Family & Community Health: The Journal of Health Promotion & Maintenance; Headache: The Journal of Head and Face Pain; International Journal of Behavioral Consultation and Therapy (2); International Journal of Disability; Development and Education (2); International Journal of Drug Policy; International Journal of Psychology; International Journal of Speech-Language Pathology; International Psychogeriatrics; Japanese Journal of Behavior Analysis (3); Japanese Journal of Special Education; Journal of Applied Research in Intellectual Disabilities (2); Journal of Applied Sport Psychology (3); Journal of Attention Disorders (2); Journal of Behavior Therapy and Experimental Psychiatry; Journal of Child Psychology and Psychiatry; Journal of Clinical Psychology in Medical Settings; Journal of Clinical Sport Psychology; Journal of Cognitive Psychotherapy; Journal of Consulting and Clinical Psychology (2); Journal of Deaf Studies and Deaf Education; Journal of Educational & Psychological Consultation (2); Journal of Evidence-Based Practices for Schools (2); Journal of the Experimental Analysis of Behavior (2); Journal of General Internal Medicine; Journal of Intellectual and Developmental Disabilities; Journal of Intellectual Disability Research (2); Journal of Medical Speech-Language Pathology; Journal of Neurology, Neurosurgery & Psychiatry; Journal of Paediatrics and Child Health; Journal of Prevention and Intervention in the Community; Journal of Safety Research; Journal of School Psychology (3); The Journal of Socio-Economics; The Journal of Special Education; Journal of Speech, Language, and Hearing Research (2); Journal of Sport Behavior; Journal of Substance Abuse Treatment; Journal of the International Neuropsychological Society; Journal of Traumatic Stress; The Journals of Gerontology: Series B: Psychological Sciences and Social Sciences; Language, Speech, and Hearing Services in Schools; Learning Disabilities Research & Practice (2); Learning Disability Quarterly (2); Music Therapy Perspectives; Neurorehabilitation and Neural Repair; Neuropsychological Rehabilitation (2); Pain; Physical Education and Sport Pedagogy (2); Preventive Medicine: An International Journal Devoted to Practice and Theory; Psychological Assessment; Psychological Medicine: A Journal of Research in Psychiatry and the Allied Sciences; The Psychological Record; Reading and Writing; Remedial and Special Education (3); Research and Practice for Persons with Severe Disabilities (2); Restorative Neurology and Neuroscience; School Psychology International; Seminars in Speech and Language; Sleep and Hypnosis; School Psychology Quarterly; Social Work in Health Care; The Sport Psychologist (3); Therapeutic Recreation Journal (2); The Volta Review; Work: Journal of Prevention, Assessment & Rehabilitation.

Coding criteria amplifications

A comprehensive description of the coding criteria for each category in this review is available from the author by request. The primary coding criteria are described here and in later sections of this article.

  • Research design was classified into one of the types discussed later in the section titled Predominant Single-Case Experimental Designs on the basis of the authors’ stated design type. Secondary research designs were then coded when applicable (i.e., mixed designs). Distinctions between primary and secondary research designs were made based on the authors’ description of their study. For example, if an author described the study as a “multiple baseline design with time-series measurement,” the primary research design would be coded as being multiple baseline, and time-series would be coded as the secondary research design.
  • Observer ratings were coded as present when observational coding procedures were described and/or the results of a test of interobserver agreement were reported.
  • Interrater reliability for observer ratings was coded as present in any case in which percent agreement, alpha, kappa, or another appropriate statistic was reported, regardless of the amount of the total data that were examined for agreement.
  • Daily diary, daily self-report, and EMA codes were given when authors explicitly described these procedures in the text by name. Coders did not infer the use of these measurement strategies.
  • The number of baseline observations was either taken directly from the figures provided in text or was simply counted in graphical displays of the data when this was determined to be a reliable approach. In some cases, it was not possible to reliably determine the number of baseline data points from the graphical display of data, in which case, the “unavailable” code was assigned. Similarly, the “unavailable” code was assigned when the number of observations was either unreported or ambiguous, or only a range was provided and thus no mean could be determined. Similarly, the mean number of baseline observations was calculated for each study prior to further descriptive statistical analyses because a number of studies reported means only.
  • The coding of the analytic method used in the reviewed studies is discussed later in the section titled Discussion of Review Results and Coding of Analytic Methods .

Results of the Systematic Review

Descriptive statistics of the design, measurement, and analysis characteristics of the reviewed studies are presented in Table 2 . The results and their implications are discussed in the relevant sections throughout the remainder of the article.

Descriptive Statistics of Reviewed SCED Characteristics

Note. % refers to the proportion of reviewed studies that satisfied criteria for this code: For example, the percent of studies reporting observer ratings.

Discussion of the Systematic Review Results in Context

The SCED is a very flexible methodology and has many variants. Those mentioned here are the building blocks from which other designs are then derived. For those readers interested in the nuances of each design, Barlow et al., (2008) ; Franklin, Allison, and Gorman (1997) ; Kazdin (2010) ; and Kratochwill and Levin (1992) , among others, provide cogent, in-depth discussions. Identifying the appropriate SCED depends upon many factors, including the specifics of the IV, the setting in which the study will be conducted, participant characteristics, the desired or hypothesized outcomes, and the research question(s). Similarly, the researcher’s selection of measurement and analysis techniques is determined by these factors.

Predominant Single-Case Experimental Designs

Alternating/simultaneous designs (6%; primary design of the studies reviewed).

Alternating and simultaneous designs involve an iterative manipulation of the IV(s) across different phases to show that changes in the DV vary systematically as a function of manipulating the IV(s). In these multielement designs, the researcher has the option to alternate the introduction of two or more IVs or present two or more IVs at the same time. In the alternating variation, the researcher is able to determine the relative impact of two different IVs on the DV, when all other conditions are held constant. Another variation of this design is to alternate IVs across various conditions that could be related to the DV (e.g., class period, interventionist). Similarly, the simultaneous design would occur when the IVs were presented at the same time within the same phase of the study.

Changing criterion design (4%)

Changing criterion designs are used to demonstrate a gradual change in the DV over the course of the phase involving the active manipulation of the IV. Criteria indicating that a change has occurred happen in a step-wise manner, in which the criterion shifts as the participant responds to the presence of the manipulated IV. The changing criterion design is particularly useful in applied intervention research for a number of reasons. The IV is continuous and never withdrawn, unlike the strategy used in a reversal design. This is particularly important in situations where removal of a psychological intervention would be either detrimental or dangerous to the participant, or would be otherwise unfeasible or unethical. The multiple baseline design also does not withdraw intervention, but it requires replicating the effects of the intervention across participants, settings, or situations. A changing criterion design can be accomplished with one participant in one setting without withholding or withdrawing treatment.

Multiple baseline/combined series design (69%)

The multiple baseline or combined series design can be used to test within-subject change across conditions and often involves multiple participants in a replication context. The multiple baseline design is quite simple in many ways, essentially consisting of a number of repeated, miniature AB experiments or variations thereof. Introduction of the IV is staggered temporally across multiple participants or across multiple within-subject conditions, which allows the researcher to demonstrate that changes in the DV reliably occur only when the IV is introduced, thus controlling for the effects of extraneous factors. Multiple baseline designs can be used both within and across units (i.e., persons or groups of persons). When the baseline phase of each subject begins simultaneously, it is called a concurrent multiple baseline design. In a nonconcurrent variation, baseline periods across subjects begin at different points in time. The multiple baseline design is useful in many settings in which withdrawal of the IV would not be appropriate or when introduction of the IV is hypothesized to result in permanent change that would not reverse when the IV is withdrawn. The major drawback of this design is that the IV must be initially withheld for a period of time to ensure different starting points across the different units in the baseline phase. Depending upon the nature of the research questions, withholding an IV, such as a treatment, could be potentially detrimental to participants.

Reversal designs (17%)

Reversal designs are also known as introduction and withdrawal and are denoted as ABAB designs in their simplest form. As the name suggests, the reversal design involves collecting a baseline measure of the DV (the first A phase), introducing the IV (the first B phase), removing the IV while continuing to assess the DV (the second A phase), and then reintroducing the IV (the second B phase). This pattern can be repeated as many times as is necessary to demonstrate an effect or otherwise address the research question. Reversal designs are useful when the manipulation is hypothesized to result in changes in the DV that are expected to reverse or discontinue when the manipulation is not present. Maintenance of an effect is often necessary to uphold the findings of reversal designs. The demonstration of an effect is evident in reversal designs when improvement occurs during the first manipulation phase, compared to the first baseline phase, then reverts to or approaches original baseline levels during the second baseline phase when the manipulation has been withdrawn, and then improves again when the manipulation in then reinstated. This pattern of reversal, when the manipulation is introduced and then withdrawn, is essential to attributing changes in the DV to the IV. However, maintenance of the effects in a reversal design, in which the DV is hypothesized to reverse when the IV is withdrawn, is not incompatible ( Kazdin, 2010 ). Maintenance is demonstrated by repeating introduction–withdrawal segments until improvement in the DV becomes permanent even when the IV is withdrawn. There is not always a need to demonstrate maintenance in all applications, nor is it always possible or desirable, but it is paramount in the learning and intervention research contexts.

Mixed designs (10%)

Mixed designs include a combination of more than one SCED (e.g., a reversal design embedded within a multiple baseline) or an SCED embedded within a group design (i.e., a randomized controlled trial comparing two groups of multiple baseline experiments). Mixed designs afford the researcher even greater flexibility in designing a study to address complex psychological hypotheses, but also capitalize on the strengths of the various designs. See Kazdin (2010) for a discussion of the variations and utility of mixed designs.

Related Nonexperimental Designs

Quasi-experimental designs.

In contrast to the designs previously described, all of which constitute “true experiments” ( Kazdin, 2010 ; Shadish et al., 2002 ), in quasi-experimental designs the conditions of a true experiment (e.g., active manipulation of the IV, replication of the effect) are approximated and are not readily under the control of the researcher. Because the focus of this article is on experimental designs, quasi-experiments are not discussed in detail; instead the reader is referred to Kazdin (2010) and Shadish et al. (2002) .

Ecological and naturalistic single-case designs

For a single-case design to be experimental, there must be active manipulation of the IV, but in some applications, such as those that might be used in social and personality psychology, the researcher might be interested in measuring naturally occurring phenomena and examining their temporal relationships. Thus, the researcher will not use a manipulation. An example of this type of research might be a study about the temporal relationship between alcohol consumption and depressed mood, which can be measured reliably using EMA methods. Psychotherapy process researchers also use this type of design to assess dyadic relationship dynamics between therapists and clients (e.g., Tschacher & Ramseyer, 2009 ).

Research Design Standards

Each of the reviewed standards provides some degree of direction regarding acceptable research designs. The WWC provides the most detailed and specific requirements regarding design characteristics. Those guidelines presented in Tables 3 , ​ ,4, 4 , and ​ and5 5 are consistent with the methodological rigor necessary to meet the WWC distinction “meets standards.” The WWC also provides less-stringent standards for a “meets standards with reservations” distinction. When minimum criteria in the design, measurement, or analysis sections of a study are not met, it is rated “does not meet standards” ( Kratochwill et al., 2010 ). Many SCEDs are acceptable within the standards of DIV12, DIV16, NRP, and in the Tate et al. SCED scale. DIV12 specifies that replication occurs across a minimum of three successive cases, which differs from the WWC specifications, which allow for three replications within a single-subject design but does not necessarily need to be across multiple subjects. DIV16 does not require, but seems to prefer, a multiple baseline design with a between-subject replication. Tate et al. state that the “design allows for the examination of cause and effect relationships to demonstrate efficacy” (p. 400, 2008). Determining whether or not a design meets this requirement is left up to the evaluator, who might then refer to one of the other standards or another source for direction.

Research Design Standards and Guidelines

Measurement and Assessment Standards and Guidelines

Analysis Standards and Guidelines

The Stone and Shiffman (2002) standards for EMA are concerned almost entirely with the reporting of measurement characteristics and less so with research design. One way in which these standards differ from those of other sources is in the active manipulation of the IV. Many research questions in EMA, daily diary, and time-series designs are concerned with naturally occurring phenomena, and a researcher manipulation would run counter to this aim. The EMA standards become important when selecting an appropriate measurement strategy within the SCED. In EMA applications, as is also true in some other time-series and daily diary designs, researcher manipulation occurs as a function of the sampling interval in which DVs of interest are measured according to fixed time schedules (e.g., reporting occurs at the end of each day), random time schedules (e.g., the data collection device prompts the participant to respond at random intervals throughout the day), or on an event-based schedule (e.g., reporting occurs after a specified event takes place).

Measurement

The basic measurement requirement of the SCED is a repeated assessment of the DV across each phase of the design in order to draw valid inferences regarding the effect of the IV on the DV. In other applications, such as those used by personality and social psychology researchers to study various human phenomena ( Bolger et al., 2003 ; Reis & Gable, 2000 ), sampling strategies vary widely depending on the topic area under investigation. Regardless of the research area, SCEDs are most typically concerned with within-person change and processes and involve a time-based strategy, most commonly to assess global daily averages or peak daily levels of the DV. Many sampling strategies, such as time-series, in which reporting occurs at uniform intervals or on event-based, fixed, or variable schedules, are also appropriate measurement methods and are common in psychological research (see Bolger et al., 2003 ).

Repeated-measurement methods permit the natural, even spontaneous, reporting of information ( Reis, 1994 ), which reduces the biases of retrospection by minimizing the amount of time elapsed between an experience and the account of this experience ( Bolger et al., 2003 ). Shiffman et al. (2008) aptly noted that the majority of research in the field of psychology relies heavily on retrospective assessment measures, even though retrospective reports have been found to be susceptible to state-congruent recall (e.g., Bower, 1981 ) and a tendency to report peak levels of the experience instead of giving credence to temporal fluctuations ( Redelmeier & Kahneman, 1996 ; Stone, Broderick, Kaell, Deles-Paul, & Porter, 2000 ). Furthermore, Shiffman et al. (1997) demonstrated that subjective aggregate accounts were a poor fit to daily reported experiences, which can be attributed to reductions in measurement error resulting in increased validity and reliability of the daily reports.

The necessity of measuring at least one DV repeatedly means that the selected assessment method, instrument, and/or construct must be sensitive to change over time and be capable of reliably and validly capturing change. Horner et al. (2005) discusses the important features of outcome measures selected for use in these types of designs. Kazdin (2010) suggests that measures be dimensional, which can more readily detect effects than categorical and binary measures. Although using an established measure or scale, such as the Outcome Questionnaire System ( M. J. Lambert, Hansen, & Harmon, 2010 ), provides empirically validated items for assessing various outcomes, most measure validation studies conducted on this type of instrument involve between-subject designs, which is no guarantee that these measures are reliable and valid for assessing within-person variability. Borsboom, Mellenbergh, and van Heerden (2003) suggest that researchers adapting validated measures should consider whether the items they propose using have a factor structure within subjects similar to that obtained between subjects. This is one of the reasons that SCEDs often use observational assessments from multiple sources and report the interrater reliability of the measure. Self-report measures are acceptable practice in some circles, but generally additional assessment methods or informants are necessary to uphold the highest methodological standards. The results of this review indicate that the majority of studies include observational measurement (76.0%). Within those studies, nearly all (97.1%) reported interrater reliability procedures and results. The results within each design were similar, with the exception of time-series designs, which used observer ratings in only half of the reviewed studies.

Time-series

Time-series designs are defined by repeated measurement of variables of interest over a period of time ( Box & Jenkins, 1970 ). Time-series measurement most often occurs in uniform intervals; however, this is no longer a constraint of time-series designs (see Harvey, 2001 ). Although uniform interval reporting is not necessary in SCED research, repeated measures often occur at uniform intervals, such as once each day or each week, which constitutes a time-series design. The time-series design has been used in various basic science applications ( Scollon, Kim-Pietro, & Diener, 2003 ) across nearly all subspecialties in psychology (e.g., Bolger et al., 2003 ; Piasecki et al., 2007 ; for a review, see Reis & Gable, 2000 ; Soliday et al., 2002 ). The basic time-series formula for a two-phase (AB) data stream is presented in Equation 1 . In this formula α represents the step function of the data stream; S represents the change between the first and second phases, which is also the intercept in a two-phase data stream and a step function being 0 at times i = 1, 2, 3…n1 and 1 at times i = n1+1, n1+2, n1+3…n; n 1 is the number of observations in the baseline phase; n is the total number of data points in the data stream; i represents time; and ε i = ρε i −1 + e i , which indicates the relationship between the autoregressive function (ρ) and the distribution of the data in the stream.

Time-series formulas become increasingly complex when seasonality and autoregressive processes are modeled in the analytic procedures, but these are rarely of concern for short time-series data streams in SCEDs. For a detailed description of other time-series design and analysis issues, see Borckardt et al. (2008) , Box and Jenkins (1970) , Crosbie (1993) , R. R. Jones et al. (1977) , and Velicer and Fava (2003) .

Time-series and other repeated-measures methodologies also enable examination of temporal effects. Borckardt et al. (2008) and others have noted that time-series designs have the potential to reveal how change occurs, not simply if it occurs. This distinction is what most interested Skinner (1938) , but it often falls below the purview of today’s researchers in favor of group designs, which Skinner felt obscured the process of change. In intervention and psychopathology research, time-series designs can assess mediators of change ( Doss & Atkins, 2006 ), treatment processes ( Stout, 2007 ; Tschacher & Ramseyer, 2009 ), and the relationship between psychological symptoms (e.g., Alloy, Just, & Panzarella, 1997 ; Hanson & Chen, 2010 ; Oslin, Cary, Slaymaker, Colleran, & Blow, 2009 ), and might be capable of revealing mechanisms of change ( Kazdin, 2007 , 2009 , 2010 ). Between- and within-subject SCED designs with repeated measurements enable researchers to examine similarities and differences in the course of change, both during and as a result of manipulating an IV. Temporal effects have been largely overlooked in many areas of psychological science ( Bolger et al., 2003 ): Examining temporal relationships is sorely needed to further our understanding of the etiology and amplification of numerous psychological phenomena.

Time-series studies were very infrequently found in this literature search (2%). Time-series studies traditionally occur in subfields of psychology in which single-case research is not often used (e.g., personality, physiological/biological). Recent advances in methods for collecting and analyzing time-series data (e.g., Borckardt et al., 2008 ) could expand the use of time-series methodology in the SCED community. One problem with drawing firm conclusions from this particular review finding is a semantic factor: Time-series is a specific term reserved for measurement occurring at a uniform interval. However, SCED research appears to not yet have adopted this language when referring to data collected in this fashion. When time-series data analytic methods are not used, the matter of measurement interval is of less importance and might not need to be specified or described as a time-series. An interesting extension of this work would be to examine SCED research that used time-series measurement strategies but did not label it as such. This is important because then it could be determined how many SCEDs could be analyzed with time-series statistical methods.

Daily diary and ecological momentary assessment methods

EMA and daily diary approaches represent methodological procedures for collecting repeated measurements in time-series and non-time-series experiments, which are also known as experience sampling. Presenting an in-depth discussion of the nuances of these sampling techniques is well beyond the scope of this paper. The reader is referred to the following review articles: daily diary ( Bolger et al., 2003 ; Reis & Gable, 2000 ; Thiele, Laireiter, & Baumann, 2002 ), and EMA ( Shiffman et al., 2008 ). Experience sampling in psychology has burgeoned in the past two decades as technological advances have permitted more precise and immediate reporting by participants (e.g., Internet-based, two-way pagers, cellular telephones, handheld computers) than do paper and pencil methods (for reviews see Barrett & Barrett, 2001 ; Shiffman & Stone, 1998 ). Both methods have practical limitations and advantages. For example, electronic methods are more costly and may exclude certain subjects from participating in the study, either because they do not have access to the necessary technology or they do not have the familiarity or savvy to successfully complete reporting. Electronic data collection methods enable the researcher to prompt responses at random or predetermined intervals and also accurately assess compliance. Paper and pencil methods have been criticized for their inability to reliably track respondents’ compliance: Palermo, Valenzuela, and Stork (2004) found better compliance with electronic diaries than with paper and pencil. On the other hand, Green, Rafaeli, Bolger, Shrout, & Reis (2006) demonstrated the psychometric data structure equivalence between these two methods, suggesting that the data collected in either method will yield similar statistical results given comparable compliance rates.

Daily diary/daily self-report and EMA measurement were somewhat rarely represented in this review, occurring in only 6.1% of the total studies. EMA methods had been used in only one of the reviewed studies. The recent proliferation of EMA and daily diary studies in psychology reported by others ( Bolger et al., 2003 ; Piasecki et al., 2007 ; Shiffman et al., 2008 ) suggests that these methods have not yet reached SCED researchers, which could in part have resulted from the long-held supremacy of observational measurement in fields that commonly practice single-case research.

Measurement Standards

As was previously mentioned, measurement in SCEDs requires the reliable assessment of change over time. As illustrated in Table 4 , DIV16 and the NRP explicitly require that reliability of all measures be reported. DIV12 provides little direction in the selection of the measurement instrument, except to require that three or more clinically important behaviors with relative independence be assessed. Similarly, the only item concerned with measurement on the Tate et al. scale specifies assessing behaviors consistent with the target of the intervention. The WWC and the Tate et al. scale require at least two independent assessors of the DV and that interrater reliability meeting minimum established thresholds be reported. Furthermore, WWC requires that interrater reliability be assessed on at least 20% of the data in each phase and in each condition. DIV16 expects that assessment of the outcome measures will be multisource and multimethod, when applicable. The interval of measurement is not specified by any of the reviewed sources. The WWC and the Tate et al. scale require that DVs be measured repeatedly across phases (e.g., baseline and treatment), which is a typical requirement of a SCED. The NRP asks that the time points at which DV measurement occurred be reported.

The baseline measurement represents one of the most crucial design elements of the SCED. Because subjects provide their own data for comparison, gathering a representative, stable sampling of behavior before manipulating the IV is essential to accurately inferring an effect. Some researchers have reported the typical length of the baseline period to range from 3 to 12 observations in intervention research applications (e.g., Center et al., 1986 ; Huitema, 1985 ; R. R. Jones et al., 1977 ; Sharpley, 1987 ); Huitema’s (1985) review of 881 experiments published in the Journal of Applied Behavior Analysis resulted in a modal number of three to four baseline points. Center et al. (1986) suggested five as the minimum number of baseline measurements needed to accurately estimate autocorrelation. Longer baseline periods suggest a greater likelihood of a representative measurement of the DVs, which has been found to increase the validity of the effects and reduce bias resulting from autocorrelation ( Huitema & McKean, 1994 ). The results of this review are largely consistent with those of previous researchers: The mean number of baseline observations was found to be 10.22 ( SD = 9.59), and 6 was the modal number of observations. Baseline data were available in 77.8% of the reviewed studies. Although the baseline assessment has tremendous bearing on the results of a SCED study, it was often difficult to locate the exact number of data points. Similarly, the number of data points assessed across all phases of the study were not easily identified.

The WWC, DIV12, and DIV16 agree that a minimum of three data points during the baseline is necessary. However, to receive the highest rating by the WWC, five data points are necessary in each phase, including the baseline and any subsequent withdrawal baselines as would occur in a reversal design. DIV16 explicitly states that more than three points are preferred and further stipulates that the baseline must demonstrate stability (i.e., limited variability), absence of overlap between the baseline and other phases, absence of a trend, and that the level of the baseline measurement is severe enough to warrant intervention; each of these aspects of the data is important in inferential accuracy. Detrending techniques can be used to address baseline data trend. The integration option in ARIMA-based modeling and the empirical mode decomposition method ( Wu, Huang, Long, & Peng, 2007 ) are two sophisticated detrending techniques. In regression-based analytic methods, detrending can be accomplished by simply regressing each variable in the model on time (i.e., the residuals become the detrended series), which is analogous to adding a linear, exponential, or quadratic term to the regression equation.

NRP does not provide a minimum for data points, nor does the Tate et al. scale, which requires only a sufficient sampling of baseline behavior. Although the mean and modal number of baseline observations is well within these parameters, seven (1.7%) studies reported mean baselines of less than three data points.

Establishing a uniform minimum number of required baseline observations would provide researchers and reviewers with only a starting guide. The baseline phase is important in SCED research because it establishes a trend that can then be compared with that of subsequent phases. Although a minimum number of observations might be required to meet standards, many more might be necessary to establish a trend when there is variability and trends in the direction of the expected effect. The selected data analytic approach also has some bearing on the number of necessary baseline observations. This is discussed further in the Analysis section.

Reporting of repeated measurements

Stone and Shiffman (2002) provide a comprehensive set of guidelines for the reporting of EMA data, which can also be applied to other repeated-measurement strategies. Because the application of EMA is widespread and not confined to specific research designs, Stone and Shiffman intentionally place few restraints on researchers regarding selection of the DV and the reporter, which is determined by the research question under investigation. The methods of measurement, however, are specified in detail: Descriptions of prompting, recording of responses, participant-initiated entries, and the data acquisition interface (e.g., paper and pencil diary, PDA, cellular telephone) ought to be provided with sufficient detail for replication. Because EMA specifically, and time-series/daily diary methods similarly, are primarily concerned with the interval of assessment, Stone and Shiffman suggest reporting the density and schedule of assessment. The approach is generally determined by the nature of the research question and pragmatic considerations, such as access to electronic data collection devices at certain times of the day and participant burden. Compliance and missing data concerns are present in any longitudinal research design, but they are of particular importance in repeated-measurement applications with frequent measurement. When the research question pertains to temporal effects, compliance becomes paramount, and timely, immediate responding is necessary. For this reason, compliance decisions, rates of missing data, and missing data management techniques must be reported. The effect of missing data in time-series data streams has been the topic of recent research in the social sciences (e.g., Smith, Borckardt, & Nash, in press ; Velicer & Colby, 2005a , 2005b ). The results and implications of these and other missing data studies are discussed in the next section.

Analysis of SCED Data

Visual analysis.

Experts in the field generally agree about the majority of critical single-case experiment design and measurement characteristics. Analysis, on the other hand, is an area of significant disagreement, yet it has also received extensive recent attention and advancement. Debate regarding the appropriateness and accuracy of various methods for analyzing SCED data, the interpretation of single-case effect sizes, and other concerns vital to the validity of SCED results has been ongoing for decades, and no clear consensus has been reached. Visual analysis, following systematic procedures such as those provided by Franklin, Gorman, Beasley, and Allison (1997) and Parsonson and Baer (1978) , remains the standard by which SCED data are most commonly analyzed ( Parker, Cryer, & Byrns, 2006 ). Visual analysis can arguably be applied to all SCEDs. However, a number of baseline data characteristics must be met for effects obtained through visual analysis to be valid and reliable. The baseline phase must be relatively stable; free of significant trend, particularly in the hypothesized direction of the effect; have minimal overlap of data with subsequent phases; and have a sufficient sampling of behavior to be considered representative ( Franklin, Gorman, et al., 1997 ; Parsonson & Baer, 1978 ). The effect of baseline trend on visual analysis, and a technique to control baseline trend, are offered by Parker et al. (2006) . Kazdin (2010) suggests using statistical analysis when a trend or significant variability appears in the baseline phase, two conditions that ought to preclude the use of visual analysis techniques. Visual analysis methods are especially adept at determining intervention effects and can be of particular relevance in real-world applications (e.g., Borckardt et al., 2008 ; Kratochwill, Levin, Horner, & Swoboda, 2011 ).

However, visual analysis has its detractors. It has been shown to be inconsistent, can be affected by autocorrelation, and results in overestimation of effect (e.g., Matyas & Greenwood, 1990 ). Visual analysis as a means of estimating an effect precludes the results of SCED research from being included in meta-analysis, and also makes it very difficult to compare results to the effect sizes generated by other statistical methods. Yet, visual analysis proliferates in large part because SCED researchers are familiar with these methods and are not only generally unfamiliar with statistical approaches, but lack agreement about their appropriateness. Still, top experts in single-case analysis champion the use of statistical methods alongside visual analysis whenever it is appropriate to do so ( Kratochwill et al., 2011 ).

Statistical analysis

Statistical analysis of SCED data consists generally of an attempt to address one or more of three broad research questions: (1) Does introduction/manipulation of the IV result in statistically significant change in the level of the DV (level-change or phase-effect analysis)? (2) Does introduction/manipulation of the IV result in statistically significant change in the slope of the DV over time (slope-change analysis)? and (3) Do meaningful relationships exist between the trajectory of the DV and other potential covariates? Level- and slope-change analyses are relevant to intervention effectiveness studies and other research questions in which the IV is expected to result in changes in the DV in a particular direction. Visual analysis methods are most adept at addressing research questions pertaining to changes in level and slope (Questions 1 and 2), most often using some form of graphical representation and standardized computation of a mean level or trend line within and between each phase of interest (e.g., Horner & Spaulding, 2010 ; Kratochwill et al., 2011 ; Matyas & Greenwood, 1990 ). Research questions in other areas of psychological science might address the relationship between DVs or the slopes of DVs (Question 3). A number of sophisticated modeling approaches (e.g., cross-lag, multilevel, panel, growth mixture, latent class analysis) may be used for this type of question, and some are discussed in greater detail later in this section. However, a discussion about the nuances of this type of analysis and all their possible methods is well beyond the scope of this article.

The statistical analysis of SCEDs is a contentious issue in the field. Not only is there no agreed-upon statistical method, but the practice of statistical analysis in the context of the SCED is viewed by some as unnecessary (see Shadish, Rindskopf, & Hedges, 2008 ). Traditional trends in the prevalence of statistical analysis usage by SCED researchers are revealing: Busk & Marascuilo (1992) found that only 10% of the published single-case studies they reviewed used statistical analysis; Brossart, Parker, Olson, & Mahadevan (2006) estimated that this figure had roughly doubled by 2006. A range of concerns regarding single-case effect size calculation and interpretation is discussed in significant detail elsewhere (e.g., Campbell, 2004 ; Cohen, 1994 ; Ferron & Sentovich, 2002 ; Ferron & Ware, 1995 ; Kirk, 1996 ; Manolov & Solanas, 2008 ; Olive & Smith, 2005 ; Parker & Brossart, 2003 ; Robey et al., 1999 ; Smith et al., in press ; Velicer & Fava, 2003 ). One concern is the lack of a clearly superior method across datasets. Although statistical methods for analyzing SCEDs abound, few studies have examined their comparative performance with the same dataset. The most recent studies of this kind, performed by Brossart et al. (2006) , Campbell (2004) , Parker and Brossart (2003) , and Parker and Vannest (2009) , found that the more promising available statistical analysis methods yielded moderately different results on the same data series, which led them to conclude that each available method is equipped to adequately address only a relatively narrow spectrum of data. Given these findings, analysts need to select an appropriate model for the research questions and data structure, being mindful of how modeling results can be influenced by extraneous factors.

The current standards unfortunately provide little guidance in the way of statistical analysis options. This article presents an admittedly cursory introduction to available statistical methods; many others are not covered in this review. The following articles provide more in-depth discussion and description of other methods: Barlow et al. (2008) ; Franklin et al., (1997) ; Kazdin (2010) ; and Kratochwill and Levin (1992 , 2010 ). Shadish et al. (2008) summarize more recently developed methods. Similarly, a Special Issue of Evidence-Based Communication Assessment and Intervention (2008, Volume 2) provides articles and discussion of the more promising statistical methods for SCED analysis. An introduction to autocorrelation and its implications for statistical analysis is necessary before specific analytic methods can be discussed. It is also pertinent at this time to discuss the implications of missing data.

Autocorrelation

Many repeated measurements within a single subject or unit create a situation that most psychological researchers are unaccustomed to dealing with: autocorrelated data, which is the nonindependence of sequential observations, also known as serial dependence. Basic and advanced discussions of autocorrelation in single-subject data can be found in Borckardt et al. (2008) , Huitema (1985) , and Marshall (1980) , and discussions of autocorrelation in multilevel models can be found in Snijders and Bosker (1999) and Diggle and Liang (2001) . Along with trend and seasonal variation, autocorrelation is one example of the internal structure of repeated measurements. In the social sciences, autocorrelated data occur most naturally in the fields of physiological psychology, econometrics, and finance, where each phase of interest has potentially hundreds or even thousands of observations that are tightly packed across time (e.g., electroencephalography actuarial data, financial market indices). Applied SCED research in most areas of psychology is more likely to have measurement intervals of day, week, or hour.

Autocorrelation is a direct result of the repeated-measurement requirements of the SCED, but its effect is most noticeable and problematic when one is attempting to analyze these data. Many commonly used data analytic approaches, such as analysis of variance, assume independence of observations and can produce spurious results when the data are nonindependent. Even statistically insignificant autocorrelation estimates are generally viewed as sufficient to cause inferential bias when conventional statistics are used (e.g., Busk & Marascuilo, 1988 ; R. R. Jones et al., 1977 ; Matyas & Greenwood, 1990 ). The effect of autocorrelation on statistical inference in single-case applications has also been known for quite some time (e.g., R. R. Jones et al., 1977 ; Kanfer, 1970 ; Kazdin, 1981 ; Marshall, 1980 ). The findings of recent simulation studies of single-subject data streams indicate that autocorrelation is a nontrivial matter. For example, Manolov and Solanas (2008) determined that calculated effect sizes were linearly related to the autocorrelation of the data stream, and Smith et al. (in press) demonstrated that autocorrelation estimates in the vicinity of 0.80 negatively affect the ability to correctly infer a significant level-change effect using a standardized mean differences method. Huitema and colleagues (e.g., Huitema, 1985 ; Huitema & McKean, 1994 ) argued that autocorrelation is rarely a concern in applied research. Huitema’s methods and conclusions have been questioned and opposing data have been published (e.g., Allison & Gorman, 1993 ; Matyas & Greenwood, 1990 ; Robey et al., 1999 ), resulting in abandonment of the position that autocorrelation can be conscionably ignored without compromising the validity of the statistical procedures. Procedures for removing autocorrelation in the data stream prior to calculating effect sizes are offered as one option: One of the more promising analysis methods, autoregressive integrated moving averages (discussed later in this article), was specifically designed to remove the internal structure of time-series data, such as autocorrelation, trend, and seasonality ( Box & Jenkins, 1970 ; Tiao & Box, 1981 ).

Missing observations

Another concern inherent in repeated-measures designs is missing data. Daily diary and EMA methods are intended to reduce the risk of retrospection error by eliciting accurate, real-time information ( Bolger et al., 2003 ). However, these methods are subject to missing data as a result of honest forgetfulness, not possessing the diary collection tool at the specified time of collection, and intentional or systematic noncompliance. With paper and pencil diaries and some electronic methods, subjects might be able to complete missed entries retrospectively, defeating the temporal benefits of these assessment strategies ( Bolger et al., 2003 ). Methods of managing noncompliance through the study design and measurement methods include training the subject to use the data collection device appropriately, using technology to prompt responding and track the time of response, and providing incentives to participants for timely compliance (for additional discussion of this topic, see Bolger et al., 2003 ; Shiffman & Stone, 1998 ).

Even when efforts are made to maximize compliance during the conduct of the research, the problem of missing data is often unavoidable. Numerous approaches exist for handling missing observations in group multivariate designs (e.g., Horton & Kleinman, 2007 ; Ibrahim, Chen, Lipsitz, & Herring, 2005 ). Ragunathan (2004) and others concluded that full information and raw data maximum likelihood methods are preferable. Velicer and Colby (2005a , 2005b ) established the superiority of maximum likelihood methods over listwise deletion, mean of adjacent observations, and series mean substitution in the estimation of various critical time-series data parameters. Smith et al. (in press) extended these findings regarding the effect of missing data on inferential precision. They found that managing missing data with the EM procedure ( Dempster, Laird, & Rubin, 1977 ), a maximum likelihood algorithm, did not affect one’s ability to correctly infer a significant effect. However, lag-1 autocorrelation estimates in the vicinity of 0.80 resulted in insufficient power sensitivity (< 0.80), regardless of the proportion of missing data (10%, 20%, 30%, or 40%). 1 Although maximum likelihood methods have garnered some empirical support, methodological strategies that minimize missing data, particularly systematically missing data, are paramount to post-hoc statistical remedies.

Nonnormal distribution of data

In addition to the autocorrelated nature of SCED data, typical measurement methods also present analytic challenges. Many statistical methods, particularly those involving model finding, assume that the data are normally distributed. This is often not satisfied in SCED research when measurements involve count data, observer-rated behaviors, and other, similar metrics that result in skewed distributions. Techniques are available to manage nonnormal distributions in regression-based analysis, such as zero-inflated Poisson regression ( D. Lambert, 1992 ) and negative binomial regression ( Gardner, Mulvey, & Shaw, 1995 ), but many other statistical analysis methods do not include these sophisticated techniques. A skewed data distribution is perhaps one of the reasons Kazdin (2010) suggests not using count, categorical, or ordinal measurement methods.

Available statistical analysis methods

Following is a basic introduction to the more promising and prevalent analytic methods for SCED research. Because there is little consensus regarding the superiority of any single method, the burden unfortunately falls on the researcher to select a method capable of addressing the research question and handling the data involved in the study. Some indications and contraindications are provided for each method presented here.

Multilevel and structural equation modeling

Multilevel modeling (MLM; e.g., Schmidt, Perels, & Schmitz, 2010 ) techniques represent the state of the art among parametric approaches to SCED analysis, particularly when synthesizing SCED results ( Shadish et al., 2008 ). MLM and related latent growth curve and factor mixture methods in structural equation modeling (SEM; e.g., Lubke & Muthén, 2005 ; B. O. Muthén & Curran, 1997 ) are particularly effective for evaluating trajectories and slopes in longitudinal data and relating changes to potential covariates. MLM and related hierarchical linear models (HLM) can also illuminate the relationship between the trajectories of different variables under investigation and clarify whether or not these relationships differ amongst the subjects in the study. Time-series and cross-lag analyses can also be used in MLM and SEM ( Chow, Ho, Hamaker, & Dolan, 2010 ; du Toit & Browne, 2007 ). However, they generally require sophisticated model-fitting techniques, making them difficult for many social scientists to implement. The structure (autocorrelation) and trend of the data can also complicate many MLM methods. The common, short data streams in SCED research and the small number of subjects also present problems to MLM and SEM approaches, which were developed for data with significantly greater numbers of observations when the number of subjects is fewer, and for a greater number of participants for model-fitting purposes, particularly when there are fewer data points. Still, MLM and related techniques arguably represent the most promising analytic methods.

A number of software options 2 exist for SEM. Popular statistical packages in the social sciences provide SEM options, such as PROC CALIS in SAS ( SAS Institute Inc., 2008 ), the AMOS module ( Arbuckle, 2006 ) of SPSS ( SPSS Statistics, 2011 ), and the sempackage for R ( R Development Core Team, 2005 ), the use of which is described by Fox ( Fox, 2006 ). A number of stand-alone software options are also available for SEM applications, including Mplus ( L. K. Muthén & Muthén, 2010 ) and Stata ( StataCorp., 2011 ). Each of these programs also provides options for estimating multilevel/hierarchical models (for a review of using these programs for MLM analysis see Albright & Marinova, 2010 ). Hierarchical linear and nonlinear modeling can also be accomplished using the HLM 7 program ( Raudenbush, Bryk, & Congdon, 2011 ).

Autoregressive moving averages (ARMA; e.g., Browne & Nesselroade, 2005 ; Liu & Hudack, 1995 ; Tiao & Box, 1981 )

Two primary points have been raised regarding ARMA modeling: length of the data stream and feasibility of the modeling technique. ARMA models generally require 30–50 observations in each phase when analyzing a single-subject experiment (e.g., Borckardt et al., 2008 ; Box & Jenkins, 1970 ), which is often difficult to satisfy in applied psychological research applications. However, ARMA models in an SEM framework, such as those described by du Toit & Browne (2001) , are well suited for longitudinal panel data with few observations and many subjects. Autoregressive SEM models are also applicable under similar conditions. Model-fitting options are available in SPSS, R, and SAS via PROC ARMA.

ARMA modeling also requires considerable training in the method and rather advanced knowledge about statistical methods (e.g., Kratochwill & Levin, 1992 ). However, Brossart et al. (2006) point out that ARMA-based approaches can produce excellent results when there is no “model finding” and a simple lag-1 model, with no differencing and no moving average, is used. This approach can be taken for many SCED applications when phase- or slope-change analyses are of interest with a single, or very few, subjects. As already mentioned, this method is particularly useful when one is seeking to account for autocorrelation or other over-time variations that are not directly related to the experimental or intervention effect of interest (i.e., detrending). ARMA and other time-series analysis methods require missing data to be managed prior to analysis by means of options such as full information maximum likelihood estimation, multiple imputation, or the Kalman filter (see Box & Jenkins, 1970 ; Hamilton, 1994 ; Shumway & Stoffer, 1982 ) because listwise deletion has been shown to result in inaccurate time-series parameter estimates ( Velicer & Colby, 2005a ).

Standardized mean differences

Standardized mean differences approaches include the common Cohen’s d , Glass’s Delta, and Hedge’s g that are used in the analysis of group designs. The computational properties of mean differences approaches to SCEDs are identical to those used for group comparisons, except that the results represent within-case variation instead of the variation between groups, which suggests that the obtained effect sizes are not interpretively equivalent. The advantage of the mean differences approach is its simplicity of calculation and also its familiarity to social scientists. The primary drawback of these approaches is that they were not developed to contend with autocorrelated data. However, Manolov and Solanas (2008) reported that autocorrelation least affected effect sizes calculated using standardized mean differences approaches. To the applied-research scientist this likely represents the most accessible analytic approach, because statistical software is not required to calculate these effect sizes. The resultant effect sizes of single subject standardized mean differences analysis must be interpreted cautiously because their relation to standard effect size benchmarks, such as those provided by Cohen (1988) , is unknown. Standardized mean differences approaches are appropriate only when examining significant differences between phases of the study and cannot illuminate trajectories or relationships between variables.

Other analytic approaches

Researchers have offered other analytic methods to deal with the characteristics of SCED data. A number of methods for analyzing N -of-1 experiments have been developed. Borckardt’s Simulation Modeling Analysis (2006) program provides a method for analyzing level- and slope-change in short (<30 observations per phase; see Borckardt et al., 2008 ), autocorrelated data streams that is statistically sophisticated, yet accessible and freely available to typical psychological scientists and clinicians. A replicated single-case time-series design conducted by Smith, Handler, & Nash (2010) provides an example of SMA application. The Singwin Package, described in Bloom et al., (2003) , is a another easy-to-use parametric approach for analyzing single-case experiments. A number of nonparametric approaches have also been developed that emerged from the visual analysis tradition: Some examples include percent nonoverlapping data ( Scruggs, Mastropieri, & Casto, 1987 ) and nonoverlap of all pairs ( Parker & Vannest, 2009 ); however, these methods have come under scrutiny, and Wolery, Busick, Reichow, and Barton (2010) have suggested abandoning them altogether. Each of these methods appears to be well suited for managing specific data characteristics, but they should not be used to analyze data streams beyond their intended purpose until additional empirical research is conducted.

Combining SCED Results

Beyond the issue of single-case analysis is the matter of integrating and meta-analyzing the results of single-case experiments. SCEDs have been given short shrift in the majority of meta-analytic literature ( Littell, Corcoran, & Pillai, 2008 ; Shadish et al., 2008 ), with only a few exceptions ( Carr et al., 1999 ; Horner & Spaulding, 2010 ). Currently, few proven methods exist for integrating the results of multiple single-case experiments. Allison and Gorman (1993) and Shadish et al. (2008) present the problems associated with meta-analyzing single-case effect sizes, and W. P. Jones (2003) , Manolov and Solanas (2008) , Scruggs and Mastropieri (1998) , and Shadish et al. (2008) offer four different potential statistical solutions for this problem, none of which appear to have received consensus amongst researchers. The ability to synthesize and compare single-case effect sizes, particularly effect sizes garnered through group design research, is undoubtedly necessary to increase SCED proliferation.

Discussion of Review Results and Coding of Analytic Methods

The coding criteria for this review were quite stringent in terms of what was considered to be either visual or statistical analysis. For visual analysis to be coded as present, it was necessary for the authors to self-identify as having used a visual analysis method. In many cases, it could likely be inferred that visual analysis had been used, but it was often not specified. Similarly, statistical analysis was reserved for analytic methods that produced an effect. 3 Analyses that involved comparing magnitude of change using raw count data or percentages were not considered rigorous enough. These two narrow definitions of visual and statistical analysis contributed to the high rate of unreported analytic method, shown in Table 1 (52.3%). A better representation of the use of visual and statistical analysis would likely be the percentage of studies within those that reported a method of analysis. Under these parameters, 41.5% used visual analysis and 31.3% used statistical analysis. Included in these figures are studies that included both visual and statistical methods (11%). These findings are slightly higher than those estimated by Brossart et al. (2006) , who estimated statistical analysis is used in about 20% of SCED studies. Visual analysis continues to undoubtedly be the most prevalent method, but there appears to be a trend for increased use of statistical approaches, which is likely to only gain momentum as innovations continue.

Analysis Standards

The standards selected for inclusion in this review offer minimal direction in the way of analyzing the results of SCED research. Table 5 summarizes analysis-related information provided by the six reviewed sources for SCED standards. Visual analysis is acceptable to DV12 and DIV16, along with unspecified statistical approaches. In the WWC standards, visual analysis is the acceptable method of determining an intervention effect, with statistical analyses and randomization tests permissible as a complementary or supporting method to the results of visual analysis methods. However, the authors of the WWC standards state, “As the field reaches greater consensus about appropriate statistical analyses and quantitative effect-size measures, new standards for effect demonstration will need to be developed” ( Kratochwill et al., 2010 , p.16). The NRP and DIV12 seem to prefer statistical methods when they are warranted. The Tate at al. scale accepts only statistical analysis with the reporting of an effect size. Only the WWC and DIV16 provide guidance in the use of statistical analysis procedures: The WWC “recommends” nonparametric and parametric approaches, multilevel modeling, and regression when statistical analysis is used. DIV16 refers the reader to Wilkinson and the Task Force on Statistical Inference of the APA Board of Scientific Affairs (1999) for direction in this matter. Statistical analysis of daily diary and EMA methods is similarly unsettled. Stone and Shiffman (2002) ask for a detailed description of the statistical procedures used, in order for the approach to be replicated and evaluated. They provide direction for analyzing aggregated and disaggregated data. They also aptly note that because many different modes of analysis exist, researchers must carefully match the analytic approach to the hypotheses being pursued.

Limitations and Future Directions

This review has a number of limitations that leave the door open for future study of SCED methodology. Publication bias is a concern in any systematic review. This is particularly true for this review because the search was limited to articles published in peer-reviewed journals. This strategy was chosen in order to inform changes in the practice of reporting and of reviewing, but it also is likely to have inflated the findings regarding the methodological rigor of the reviewed works. Inclusion of book chapters, unpublished studies, and dissertations would likely have yielded somewhat different results.

A second concern is the stringent coding criteria in regard to the analytic methods and the broad categorization into visual and statistical analytic approaches. The selection of an appropriate method for analyzing SCED data is perhaps the murkiest area of this type of research. Future reviews that evaluate the appropriateness of selected analytic strategies and provide specific decision-making guidelines for researchers would be a very useful contribution to the literature. Although six sources of standards apply to SCED research reviewed in this article, five of them were developed almost exclusively to inform psychological and behavioral intervention research. The principles of SCED research remain the same in different contexts, but there is a need for non–intervention scientists to weigh in on these standards.

Finally, this article provides a first step in the synthesis of the available SCED reporting guidelines. However, it does not resolve disagreements, nor does it purport to be a definitive source. In the future, an entity with the authority to construct such a document ought to convene and establish a foundational, adaptable, and agreed-upon set of guidelines that cuts across subspecialties but is applicable to many, if not all, areas of psychological research, which is perhaps an idealistic goal. Certain preferences will undoubtedly continue to dictate what constitutes acceptable practice in each subspecialty of psychology, but uniformity along critical dimensions will help advance SCED research.

Conclusions

The first decade of the twenty-first century has seen an upwelling of SCED research across nearly all areas of psychology. This article contributes updated benchmarks in terms of the frequency with which SCED design and methodology characteristics are used, including the number of baseline observations, assessment and measurement practices, and data analytic approaches, most of which are largely consistent with previously reported benchmarks. However, this review is much broader than those of previous research teams and also breaks down the characteristics of single-case research by the predominant design. With the recent SCED proliferation came a number of standards for the conduct and reporting of such research. This article also provides a much-needed synthesis of recent SCED standards that can inform the work of researchers, reviewers, and funding agencies conducting and evaluating single-case research, which reveals many areas of consensus as well as areas of significant disagreement. It appears that the question of where to go next is very relevant at this point in time. The majority of the research design and measurement characteristics of the SCED are reasonably well established, and the results of this review suggest general practice that is in accord with existing standards and guidelines, at least in regard to published peer-reviewed works. In general, the published literature appears to be meeting the basic design and measurement requirement to ensure adequate internal validity of SCED studies.

Consensus regarding the superiority of any one analytic method stands out as an area of divergence. Judging by the current literature and lack of consensus, researchers will need to carefully select a method that matches the research design, hypotheses, and intended conclusions of the study, while also considering the most up-to-date empirical support for the chosen analytic method, whether it be visual or statistical. In some cases the number of observations and subjects in the study will dictate which analytic methods can and cannot be used. In the case of the true N -of-1 experiment, there are relatively few sound analytic methods, and even fewer that are robust with shorter data streams (see Borckardt et al., 2008 ). As the number of observations and subjects increases, sophisticated modeling techniques, such as MLM, SEM, and ARMA, become applicable. Trends in the data and autocorrelation further obfuscate the development of a clear statistical analysis selection algorithm, which currently does not exist. Autocorrelation was rarely addressed or discussed in the articles reviewed, except when the selected statistical analysis dictated consideration. Given the empirical evidence regarding the effect of autocorrelation on visual and statistical analysis, researchers need to address this more explicitly. Missing-data considerations are similarly left out when they are unnecessary for analytic purposes. As newly devised statistical analysis approaches mature and are compared with one another for appropriateness in specific SCED applications, guidelines for statistical analysis will necessarily be revised. Similarly, empirically derived guidance, in the form of a decision tree, must be developed to ensure application of appropriate methods based on characteristics of the data and the research questions being addressed. Researchers could also benefit from tutorials and comparative reviews of different software packages: This is a needed area of future research. Powerful and reliable statistical analyses help move the SCED up the ladder of experimental designs and attenuate the view that the method applies primarily to pilot studies and idiosyncratic research questions and situations.

Another potential future advancement of SCED research comes in the area of measurement. Currently, SCED research gives significant weight to observer ratings and seems to discourage other forms of data collection methods. This is likely due to the origins of the SCED in behavioral assessment and applied behavior analysis, which remains a present-day stronghold. The dearth of EMA and diary-like sampling procedures within the SCED research reviewed, yet their ever-growing prevalence in the larger psychological research arena, highlights an area for potential expansion. Observational measurement, although reliable and valid in many contexts, is time and resource intensive and not feasible in all areas in which psychologists conduct research. It seems that numerous untapped research questions are stifled because of this measurement constraint. SCED researchers developing updated standards in the future should include guidelines for the appropriate measurement requirement of non-observer-reported data. For example, the results of this review indicate that reporting of repeated measurements, particularly the high-density type found in diary and EMA sampling strategies, ought to be more clearly spelled out, with specific attention paid to autocorrelation and trend in the data streams. In the event that SCED researchers adopt self-reported assessment strategies as viable alternatives to observation, a set of standards explicitly identifying the necessary psychometric properties of the measures and specific items used would be in order.

Along similar lines, SCED researchers could take a page from other areas of psychology that champion multimethod and multisource evaluation of primary outcomes. In this way, the long-standing tradition of observational assessment and the cutting-edge technological methods of EMA and daily diary could be married with the goal of strengthening conclusions drawn from SCED research and enhancing the validity of self-reported outcome assessment. The results of this review indicate that they rarely intersect today, and I urge SCED researchers to adopt other methods of assessment informed by time-series, daily diary, and EMA methods. The EMA standards could serve as a jumping-off point for refined measurement and assessment reporting standards in the context of multimethod SCED research.

One limitation of the current SCED standards is their relatively limited scope. To clarify, with the exception of the Stone & Shiffman EMA reporting guidelines, the other five sources of standards were developed in the context of designing and evaluating intervention research. Although this is likely to remain its patent emphasis, SCEDs are capable of addressing other pertinent research questions in the psychological sciences, and the current standards truly only roughly approximate salient crosscutting SCED characteristics. I propose developing broad SCED guidelines that address the specific design, measurement, and analysis issues in a manner that allows it to be useful across applications, as opposed to focusing solely on intervention effects. To accomplish this task, methodology experts across subspecialties in psychology would need to convene. Admittedly this is no small task.

Perhaps funding agencies will also recognize the fiscal and practical advantages of SCED research in certain areas of psychology. One example is in the field of intervention effectiveness, efficacy, and implementation research. A few exemplary studies using robust forms of SCED methodology are needed in the literature. Case-based methodologies will never supplant the group design as the gold standard in experimental applications, nor should that be the goal. Instead, SCEDs provide a viable and valid alternative experimental methodology that could stimulate new areas of research and answer questions that group designs cannot. With the astonishing number of studies emerging every year that use single-case designs and explore the methodological aspects of the design, we are poised to witness and be a part of an upsurge in the sophisticated application of the SCED. When federal grant-awarding agencies and journal editors begin to use formal standards while making funding and publication decisions, the field will benefit.

Last, for the practice of SCED research to continue and mature, graduate training programs must provide students with instruction in all areas of the SCED. This is particularly true of statistical analysis techniques that are not often taught in departments of psychology and education, where the vast majority of SCED studies seem to be conducted. It is quite the conundrum that the best available statistical analytic methods are often cited as being inaccessible to social science researchers who conduct this type of research. This need not be the case. To move the field forward, emerging scientists must be able to apply the most state-of-the-art research designs, measurement techniques, and analytic methods.

Acknowledgments

Research support for the author was provided by research training grant MH20012 from the National Institute of Mental Health, awarded to Elizabeth A. Stormshak. The author gratefully acknowledges Robert Horner and Laura Lee McIntyre, University of Oregon; Michael Nash, University of Tennessee; John Ferron, University of South Florida; the Action Editor, Lisa Harlow, and the anonymous reviewers for their thoughtful suggestions and guidance in shaping this article; Cheryl Mikkola for her editorial support; and Victoria Mollison for her assistance in the systematic review process.

Appendix. Results of Systematic Review Search and Studies Included in the Review

Psycinfo search conducted july 2011.

  • Alternating treatment design
  • Changing criterion design
  • Experimental case*
  • Multiple baseline design
  • Replicated single-case design
  • Simultaneous treatment design
  • Time-series design
  • Quantitative study OR treatment outcome/randomized clinical trial
  • NOT field study OR interview OR focus group OR literature review OR systematic review OR mathematical model OR qualitative study
  • Publication range: 2000–2010
  • Published in peer-reviewed journals
  • Available in the English Language

Bibliography

(* indicates inclusion in study: N = 409)

1 Autocorrelation estimates in this range can be caused by trends in the data streams, which creates complications in terms of detecting level-change effects. The Smith et al. (in press) study used a Monte Carlo simulation to control for trends in the data streams, but trends are likely to exist in real-world data with high lag-1 autocorrelation estimates.

2 The author makes no endorsement regarding the superiority of any statistical program or package over another by their mention or exclusion in this article. The author also has no conflicts of interest in this regard.

3 However, it should be noted that it was often very difficult to locate an actual effect size reported in studies that used statistical analysis. Although this issue would likely have added little to this review, it does inhibit the inclusion of the results in meta-analysis.

  • Albright JJ, Marinova DM. Estimating multilevel modelsuUsing SPSS, Stata, and SAS. Indiana University; 2010. Retrieved from http://www.iub.edu/%7Estatmath/stat/all/hlm/hlm.pdf . [ Google Scholar ]
  • Allison DB, Gorman BS. Calculating effect sizes for meta-analysis: The case of the single case. Behavior Research and Therapy. 1993; 31 (6):621–631. doi: 10.1016/0005-7967(93)90115-B. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Alloy LB, Just N, Panzarella C. Attributional style, daily life events, and hopelessness depression: Subtype validation by prospective variability and specificity of symptoms. Cognitive Therapy Research. 1997; 21 :321–344. doi: 10.1023/A:1021878516875. [ CrossRef ] [ Google Scholar ]
  • Arbuckle JL. Amos (Version 7.0) Chicago, IL: SPSS, Inc; 2006. [ Google Scholar ]
  • Barlow DH, Nock MK, Hersen M. Single case research designs: Strategies for studying behavior change. 3. New York, NY: Allyn and Bacon; 2008. [ Google Scholar ]
  • Barrett LF, Barrett DJ. An introduction to computerized experience sampling in psychology. Social Science Computer Review. 2001; 19 (2):175–185. doi: 10.1177/089443930101900204. [ CrossRef ] [ Google Scholar ]
  • Bloom M, Fisher J, Orme JG. Evaluating practice: Guidelines for the accountable professional. 4. Boston, MA: Allyn & Bacon; 2003. [ Google Scholar ]
  • Bolger N, Davis A, Rafaeli E. Diary methods: Capturing life as it is lived. Annual Review of Psychology. 2003; 54 :579–616. doi: 10.1146/annurev.psych.54.101601.145030. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Borckardt JJ. Simulation Modeling Analysis: Time series analysis program for short time series data streams (Version 8.3.3) Charleston, SC: Medical University of South Carolina; 2006. [ Google Scholar ]
  • Borckardt JJ, Nash MR, Murphy MD, Moore M, Shaw D, O’Neil P. Clinical practice as natural laboratory for psychotherapy research. American Psychologist. 2008; 63 :1–19. doi: 10.1037/0003-066X.63.2.77. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Borsboom D, Mellenbergh GJ, van Heerden J. The theoretical status of latent variables. Psychological Review. 2003; 110 (2):203–219. doi: 10.1037/0033-295X.110.2.203. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Bower GH. Mood and memory. American Psychologist. 1981; 36 (2):129–148. doi: 10.1037/0003-066x.36.2.129. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Box GEP, Jenkins GM. Time-series analysis: Forecasting and control. San Francisco, CA: Holden-Day; 1970. [ Google Scholar ]
  • Brossart DF, Parker RI, Olson EA, Mahadevan L. The relationship between visual analysis and five statistical analyses in a simple AB single-case research design. Behavior Modification. 2006; 30 (5):531–563. doi: 10.1177/0145445503261167. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Browne MW, Nesselroade JR. Representing psychological processes with dynamic factor models: Some promising uses and extensions of autoregressive moving average time series models. In: Maydeu-Olivares A, McArdle JJ, editors. Contemporary psychometrics: A festschrift for Roderick P McDonald. Mahwah, NJ: Lawrence Erlbaum Associates Publishers; 2005. pp. 415–452. [ Google Scholar ]
  • Busk PL, Marascuilo LA. Statistical analysis in single-case research: Issues, procedures, and recommendations, with applications to multiple behaviors. In: Kratochwill TR, Levin JR, editors. Single-case research design and analysis: New directions for psychology and education. Hillsdale, NJ, England: Lawrence Erlbaum Associates, Inc; 1992. pp. 159–185. [ Google Scholar ]
  • Busk PL, Marascuilo RC. Autocorrelation in single-subject research: A counterargument to the myth of no autocorrelation. Behavioral Assessment. 1988; 10 :229–242. [ Google Scholar ]
  • Campbell JM. Statistical comparison of four effect sizes for single-subject designs. Behavior Modification. 2004; 28 (2):234–246. doi: 10.1177/0145445503259264. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Carr EG, Horner RH, Turnbull AP, Marquis JG, Magito McLaughlin D, McAtee ML, Doolabh A. Positive behavior support for people with developmental disabilities: A research synthesis. Washington, DC: American Association on Mental Retardation; 1999. [ Google Scholar ]
  • Center BA, Skiba RJ, Casey A. A methodology for the quantitative synthesis of intra-subject design research. Journal of Educational Science. 1986; 19 :387–400. doi: 10.1177/002246698501900404. [ CrossRef ] [ Google Scholar ]
  • Chambless DL, Hollon SD. Defining empirically supported therapies. Journal of Consulting and Clinical Psychology. 1998; 66 (1):7–18. doi: 10.1037/0022-006X.66.1.7. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Chambless DL, Ollendick TH. Empirically supported psychological interventions: Controversies and evidence. Annual Review of Psychology. 2001; 52 :685–716. doi: 10.1146/annurev.psych.52.1.685. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Chow S-M, Ho M-hR, Hamaker EL, Dolan CV. Equivalence and differences between structural equation modeling and state-space modeling techniques. Structural Equation Modeling. 2010; 17 (2):303–332. doi: 10.1080/10705511003661553. [ CrossRef ] [ Google Scholar ]
  • Cohen J. Statistical power analysis for the bahavioral sciences. 2. Hillsdale, NJ: Erlbaum; 1988. [ Google Scholar ]
  • Cohen J. The earth is round (p < .05) American Psychologist. 1994; 49 :997–1003. doi: 10.1037/0003-066X.49.12.997. [ CrossRef ] [ Google Scholar ]
  • Crosbie J. Interrupted time-series analysis with brief single-subject data. Journal of Consulting and Clinical Psychology. 1993; 61 (6):966–974. doi: 10.1037/0022-006X.61.6.966. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Dattilio FM, Edwards JA, Fishman DB. Case studies within a mixed methods paradigm: Toward a resolution of the alienation between researcher and practitioner in psychotherapy research. Psychotherapy: Theory, Research, Practice, Training. 2010; 47 (4):427–441. doi: 10.1037/a0021181. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Dempster A, Laird N, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B. 1977; 39 (1):1–38. [ Google Scholar ]
  • Des Jarlais DC, Lyles C, Crepaz N. Improving the reporting quality of nonrandomized evaluations of behavioral and public health interventions: the TREND statement. American Journal of Public Health. 2004; 94 (3):361–366. doi: 10.2105/ajph.94.3.361. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Diggle P, Liang KY. Analyses of longitudinal data. New York: Oxford University Press; 2001. [ Google Scholar ]
  • Doss BD, Atkins DC. Investigating treatment mediators when simple random assignment to a control group is not possible. Clinical Psychology: Science and Practice. 2006; 13 (4):321–336. doi: 10.1111/j.1468-2850.2006.00045.x. [ CrossRef ] [ Google Scholar ]
  • du Toit SHC, Browne MW. The covariance structure of a vector ARMA time series. In: Cudeck R, du Toit SHC, Sörbom D, editors. Structural equation modeling: Present and future. Lincolnwood, IL: Scientific Software International; 2001. pp. 279–314. [ Google Scholar ]
  • du Toit SHC, Browne MW. Structural equation modeling of multivariate time series. Multivariate Behavioral Research. 2007; 42 :67–101. doi: 10.1080/00273170701340953. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Fechner GT. Elemente der psychophysik [Elements of psychophysics] Leipzig, Germany: Breitkopf & Hartel; 1889. [ Google Scholar ]
  • Ferron J, Sentovich C. Statistical power of randomization tests used with multiple-baseline designs. The Journal of Experimental Education. 2002; 70 :165–178. doi: 10.1080/00220970209599504. [ CrossRef ] [ Google Scholar ]
  • Ferron J, Ware W. Analyzing single-case data: The power of randomization tests. The Journal of Experimental Education. 1995; 63 :167–178. [ Google Scholar ]
  • Fox J. TEACHER’S CORNER: Structural equation modeling with the sem package in R. Structural Equation Modeling: A Multidisciplinary Journal. 2006; 13 (3):465–486. doi: 10.1207/s15328007sem1303_7. [ CrossRef ] [ Google Scholar ]
  • Franklin RD, Allison DB, Gorman BS, editors. Design and analysis of single-case research. Mahwah, NJ: Lawrence Erlbaum Associates; 1997. [ Google Scholar ]
  • Franklin RD, Gorman BS, Beasley TM, Allison DB. Graphical display and visual analysis. In: Franklin RD, Allison DB, Gorman BS, editors. Design and analysis of single-case research. Mahway, NJ: Lawrence Erlbaum Associates, Publishers; 1997. pp. 119–158. [ Google Scholar ]
  • Gardner W, Mulvey EP, Shaw EC. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin. 1995; 118 (3):392–404. doi: 10.1037/0033-2909.118.3.392. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Green AS, Rafaeli E, Bolger N, Shrout PE, Reis HT. Paper or plastic? Data equivalence in paper and electronic diaries. Psychological Methods. 2006; 11 (1):87–105. doi: 10.1037/1082-989X.11.1.87. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hamilton JD. Time series analysis. Princeton, NJ: Princeton University Press; 1994. [ Google Scholar ]
  • Hammond D, Gast DL. Descriptive analysis of single-subject research designs: 1983–2007. Education and Training in Autism and Developmental Disabilities. 2010; 45 :187–202. [ Google Scholar ]
  • Hanson MD, Chen E. Daily stress, cortisol, and sleep: The moderating role of childhood psychosocial environments. Health Psychology. 2010; 29 (4):394–402. doi: 10.1037/a0019879. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Harvey AC. Forecasting, structural time series models and the Kalman filter. Cambridge, MA: Cambridge University Press; 2001. [ Google Scholar ]
  • Horner RH, Carr EG, Halle J, McGee G, Odom S, Wolery M. The use of single-subject research to identify evidence-based practice in special education. Exceptional Children. 2005; 71 :165–179. [ Google Scholar ]
  • Horner RH, Spaulding S. Single-case research designs. In: Salkind NJ, editor. Encyclopedia of research design. Thousand Oaks, CA: Sage Publications; 2010. [ Google Scholar ]
  • Horton NJ, Kleinman KP. Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models. The American Statistician. 2007; 61 (1):79–90. doi: 10.1198/000313007X172556. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hser Y, Shen H, Chou C, Messer SC, Anglin MD. Analytic approaches for assessing long-term treatment effects. Evaluation Review. 2001; 25 (2):233–262. doi: 10.1177/0193841X0102500206. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Huitema BE. Autocorrelation in applied behavior analysis: A myth. Behavioral Assessment. 1985; 7 (2):107–118. [ Google Scholar ]
  • Huitema BE, McKean JW. Reduced bias autocorrelation estimation: Three jackknife methods. Educational and Psychological Measurement. 1994; 54 (3):654–665. doi: 10.1177/0013164494054003008. [ CrossRef ] [ Google Scholar ]
  • Ibrahim JG, Chen M-H, Lipsitz SR, Herring AH. Missing-data methods for generalized linear models: A comparative review. Journal of the American Statistical Association. 2005; 100 (469):332–346. doi: 10.1198/016214504000001844. [ CrossRef ] [ Google Scholar ]
  • Institute of Medicine. Reducing risks for mental disorders: Frontiers for preventive intervention research. Washington, DC: National Academy Press; 1994. [ PubMed ] [ Google Scholar ]
  • Jacobsen NS, Christensen A. Studying the effectiveness of psychotherapy: How well can clinical trials do the job? American Psychologist. 1996; 51 :1031–1039. doi: 10.1037/0003-066X.51.10.1031. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jones RR, Vaught RS, Weinrott MR. Time-series analysis in operant research. Journal of Behavior Analysis. 1977; 10 (1):151–166. doi: 10.1901/jaba.1977.10-151. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jones WP. Single-case time series with Bayesian analysis: A practitioner’s guide. Measurement and Evaluation in Counseling and Development. 2003; 36 (28–39) [ Google Scholar ]
  • Kanfer H. Self-monitoring: Methodological limitations and clinical applications. Journal of Consulting and Clinical Psychology. 1970; 35 (2):148–152. doi: 10.1037/h0029874. [ CrossRef ] [ Google Scholar ]
  • Kazdin AE. Drawing valid inferences from case studies. Journal of Consulting and Clinical Psychology. 1981; 49 (2):183–192. doi: 10.1037/0022-006X.49.2.183. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kazdin AE. Mediators and mechanisms of change in psychotherapy research. Annual Review of Clinical Psychology. 2007; 3 :1–27. doi: 10.1146/annurev.clinpsy.3.022806.091432. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kazdin AE. Evidence-based treatment and practice: New opportunities to bridge clinical research and practice, enhance the knowledge base, and improve patient care. American Psychologist. 2008; 63 (3):146–159. doi: 10.1037/0003-066X.63.3.146. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kazdin AE. Understanding how and why psychotherapy leads to change. Psychotherapy Research. 2009; 19 (4):418–428. doi: 10.1080/10503300802448899. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kazdin AE. Single-case research designs: Methods for clinical and applied settings. 2. New York, NY: Oxford University Press; 2010. [ Google Scholar ]
  • Kirk RE. Practical significance: A concept whose time has come. Educational and Psychological Measurement. 1996; 56 :746–759. doi: 10.1177/0013164496056005002. [ CrossRef ] [ Google Scholar ]
  • Kratochwill TR. Preparing psychologists for evidence-based school practice: Lessons learned and challenges ahead. American Psychologist. 2007; 62 :829–843. doi: 10.1037/0003-066X.62.8.829. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kratochwill TR, Hitchcock J, Horner RH, Levin JR, Odom SL, Rindskopf DM, Shadish WR. Single-case designs technical documentation. 2010 Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf . Retrieved from http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf .
  • Kratochwill TR, Levin JR. Single-case research design and analysis: New directions for psychology and education. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc; 1992. [ Google Scholar ]
  • Kratochwill TR, Levin JR. Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods. 2010; 15 (2):124–144. doi: 10.1037/a0017736. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kratochwill TR, Levin JR, Horner RH, Swoboda C. Visual analysis of single-case intervention research: Conceptual and methodological considerations (WCER Working Paper No. 2011-6) 2011 Retrieved from University of Wisconsin–Madison, Wisconsin Center for Education Research website: http://www.wcer.wisc.edu/publications/workingPapers/papers.php .
  • Lambert D. Zero-inflated poisson regression, with an application to defects in manufacturing. Technometrics. 1992; 34 (1):1–14. [ Google Scholar ]
  • Lambert MJ, Hansen NB, Harmon SC. Developing and Delivering Practice-Based Evidence. John Wiley & Sons, Ltd; 2010. Outcome Questionnaire System (The OQ System): Development and practical applications in healthcare settings; pp. 139–154. [ Google Scholar ]
  • Littell JH, Corcoran J, Pillai VK. Systematic reviews and meta-analysis. New York: Oxford University Press; 2008. [ Google Scholar ]
  • Liu LM, Hudack GB. The SCA statistical system. Vector ARMA modeling of multiple time series. Oak Brook, IL: Scientific Computing Associates Corporation; 1995. [ Google Scholar ]
  • Lubke GH, Muthén BO. Investigating population heterogeneity with factor mixture models. Psychological Methods. 2005; 10 (1):21–39. doi: 10.1037/1082-989x.10.1.21. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Manolov R, Solanas A. Comparing N = 1 effect sizes in presence of autocorrelation. Behavior Modification. 2008; 32 (6):860–875. doi: 10.1177/0145445508318866. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Marshall RJ. Autocorrelation estimation of time series with randomly missing observations. Biometrika. 1980; 67 (3):567–570. doi: 10.1093/biomet/67.3.567. [ CrossRef ] [ Google Scholar ]
  • Matyas TA, Greenwood KM. Visual analysis of single-case time series: Effects of variability, serial dependence, and magnitude of intervention effects. Journal of Applied Behavior Analysis. 1990; 23 (3):341–351. doi: 10.1901/jaba.1990.23-341. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kratochwill TR, Chair Members of the Task Force on Evidence-Based Interventions in School Psychology. Procedural and coding manual for review of evidence-based interventions. 2003 Retrieved July 18, 2011 from http://www.sp-ebi.org/documents/_workingfiles/EBImanual1.pdf .
  • Moher D, Schulz KF, Altman DF the CONSORT Group. The CONSORT statement: Revised recommendations for improving the quality of reports of parallel-group randomized trials. Journal of the American Medical Association. 2001; 285 :1987–1991. doi: 10.1001/jama.285.15.1987. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Morgan DL, Morgan RK. Single-participant research design: Bringing science to managed care. American Psychologist. 2001; 56 (2):119–127. doi: 10.1037/0003-066X.56.2.119. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Muthén BO, Curran PJ. General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods. 1997; 2 (4):371–402. doi: 10.1037/1082-989x.2.4.371. [ CrossRef ] [ Google Scholar ]
  • Muthén LK, Muthén BO. Mplus (Version 6.11) Los Angeles, CA: Muthén & Muthén; 2010. [ Google Scholar ]
  • Nagin DS. Analyzing developmental trajectories: A semiparametric, group-based approach. Psychological Methods. 1999; 4 (2):139–157. doi: 10.1037/1082-989x.4.2.139. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • National Institute of Child Health and Human Development. Report of the National Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction (NIH Publication No. 00-4769) Washington, DC: U.S. Government Printing Office; 2000. [ Google Scholar ]
  • Olive ML, Smith BW. Effect size calculations and single subject designs. Educational Psychology. 2005; 25 (2–3):313–324. doi: 10.1080/0144341042000301238. [ CrossRef ] [ Google Scholar ]
  • Oslin DW, Cary M, Slaymaker V, Colleran C, Blow FC. Daily ratings measures of alcohol craving during an inpatient stay define subtypes of alcohol addiction that predict subsequent risk for resumption of drinking. Drug and Alcohol Dependence. 2009; 103 (3):131–136. doi: 10.1016/J.Drugalcdep.2009.03.009. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Palermo TP, Valenzuela D, Stork PP. A randomized trial of electronic versus paper pain diaries in children: Impact on compliance, accuracy, and acceptability. Pain. 2004; 107 (3):213–219. doi: 10.1016/j.pain.2003.10.005. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Parker RI, Brossart DF. Evaluating single-case research data: A comparison of seven statistical methods. Behavior Therapy. 2003; 34 (2):189–211. doi: 10.1016/S0005-7894(03)80013-8. [ CrossRef ] [ Google Scholar ]
  • Parker RI, Cryer J, Byrns G. Controlling baseline trend in single case research. School Psychology Quarterly. 2006; 21 (4):418–440. doi: 10.1037/h0084131. [ CrossRef ] [ Google Scholar ]
  • Parker RI, Vannest K. An improved effect size for single-case research: Nonoverlap of all pairs. Behavior Therapy. 2009; 40 (4):357–367. doi: 10.1016/j.beth.2008.10.006. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Parsonson BS, Baer DM. The analysis and presentation of graphic data. In: Kratochwill TR, editor. Single subject research. New York, NY: Academic Press; 1978. pp. 101–166. [ Google Scholar ]
  • Parsonson BS, Baer DM. The visual analysis of data, and current research into the stimuli controlling it. In: Kratochwill TR, Levin JR, editors. Single-case research design and analysis: New directions for psychology and education. Hillsdale, NJ; England: Lawrence Erlbaum Associates, Inc; 1992. pp. 15–40. [ Google Scholar ]
  • Piasecki TM, Hufford MR, Solham M, Trull TJ. Assessing clients in their natural environments with electronic diaries: Rationale, benefits, limitations, and barriers. Psychological Assessment. 2007; 19 (1):25–43. doi: 10.1037/1040-3590.19.1.25. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2005. [ Google Scholar ]
  • Ragunathan TE. What do we do with missing data? Some options for analysis of incomplete data. Annual Review of Public Health. 2004; 25 :99–117. doi: 10.1146/annurev.publhealth.25.102802.124410. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Raudenbush SW, Bryk AS, Congdon R. HLM 7 Hierarchical Linear and Nonlinear Modeling. Scientific Software International, Inc; 2011. [ Google Scholar ]
  • Redelmeier DA, Kahneman D. Patients’ memories of painful medical treatments: Real-time and retrospective evaluations of two minimally invasive procedures. Pain. 1996; 66 (1):3–8. doi: 10.1016/0304-3959(96)02994-6. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Reis HT. Domains of experience: Investigating relationship processes from three perspectives. In: Erber R, Gilmore R, editors. Theoretical frameworks in personal relationships. Mahwah, NJ: Erlbaum; 1994. pp. 87–110. [ Google Scholar ]
  • Reis HT, Gable SL. Event sampling and other methods for studying everyday experience. In: Reis HT, Judd CM, editors. Handbook of research methods in social and personality psychology. New York, NY: Cambridge University Press; 2000. pp. 190–222. [ Google Scholar ]
  • Robey RR, Schultz MC, Crawford AB, Sinner CA. Single-subject clinical-outcome research: Designs, data, effect sizes, and analyses. Aphasiology. 1999; 13 (6):445–473. doi: 10.1080/026870399402028. [ CrossRef ] [ Google Scholar ]
  • Rossi PH, Freeman HE. Evaluation: A systematic approach. 5. Thousand Oaks, CA: Sage; 1993. [ Google Scholar ]
  • SAS Institute Inc. The SAS system for Windows, Version 9. Cary, NC: SAS Institute Inc; 2008. [ Google Scholar ]
  • Schmidt M, Perels F, Schmitz B. How to perform idiographic and a combination of idiographic and nomothetic approaches: A comparison of time series analyses and hierarchical linear modeling. Journal of Psychology. 2010; 218 (3):166–174. doi: 10.1027/0044-3409/a000026. [ CrossRef ] [ Google Scholar ]
  • Scollon CN, Kim-Pietro C, Diener E. Experience sampling: Promises and pitfalls, strengths and weaknesses. Assessing Well-Being. 2003; 4 :5–35. doi: 10.1007/978-90-481-2354-4_8. [ CrossRef ] [ Google Scholar ]
  • Scruggs TE, Mastropieri MA. Summarizing single-subject research: Issues and applications. Behavior Modification. 1998; 22 (3):221–242. doi: 10.1177/01454455980223001. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Scruggs TE, Mastropieri MA, Casto G. The quantitative synthesis of single-subject research. Remedial and Special Education. 1987; 8 (2):24–33. doi: 10.1177/074193258700800206. [ CrossRef ] [ Google Scholar ]
  • Shadish WR, Cook TD, Campbell DT. Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin; 2002. [ Google Scholar ]
  • Shadish WR, Rindskopf DM, Hedges LV. The state of the science in the meta-analysis of single-case experimental designs. Evidence-Based Communication Assessment and Intervention. 2008; 3 :188–196. doi: 10.1080/17489530802581603. [ CrossRef ] [ Google Scholar ]
  • Shadish WR, Sullivan KJ. Characteristics of single-case designs used to assess treatment effects in 2008. Behavior Research Methods. 2011; 43 :971–980. doi: 10.3758/s13428-011-0111-y. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sharpley CF. Time-series analysis of behavioural data: An update. Behaviour Change. 1987; 4 :40–45. [ Google Scholar ]
  • Shiffman S, Hufford M, Hickcox M, Paty JA, Gnys M, Kassel JD. Remember that? A comparison of real-time versus retrospective recall of smoking lapses. Journal of Consulting and Clinical Psychology. 1997; 65 :292–300. doi: 10.1037/0022-006X.65.2.292.a. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shiffman S, Stone AA. Ecological momentary assessment: A new tool for behavioral medicine research. In: Krantz DS, Baum A, editors. Technology and methods in behavioral medicine. Mahwah, NJ: Erlbaum; 1998. pp. 117–131. [ Google Scholar ]
  • Shiffman S, Stone AA, Hufford MR. Ecological momentary assessment. Annual Review of Clinical Psychology. 2008; 4 :1–32. doi: 10.1146/annurev.clinpsy.3.022806.091415. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shumway RH, Stoffer DS. An approach to time series smoothing and forecasting using the EM Algorithm. Journal of Time Series Analysis. 1982; 3 (4):253–264. doi: 10.1111/j.1467-9892.1982.tb00349.x. [ CrossRef ] [ Google Scholar ]
  • Skinner BF. The behavior of organisms. New York, NY: Appleton-Century-Crofts; 1938. [ Google Scholar ]
  • Smith JD, Borckardt JJ, Nash MR. Inferential precision in single-case time-series datastreams: How well does the EM Procedure perform when missing observations occur in autocorrelated data? Behavior Therapy. doi: 10.1016/j.beth.2011.10.001. (in press) [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Smith JD, Handler L, Nash MR. Therapeutic Assessment for preadolescent boys with oppositional-defiant disorder: A replicated single-case time-series design. Psychological Assessment. 2010; 22 (3):593–602. doi: 10.1037/a0019697. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Snijders TAB, Bosker RJ. Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage; 1999. [ Google Scholar ]
  • Soliday E, Moore KJ, Lande MB. Daily reports and pooled time series analysis: Pediatric psychology applications. Journal of Pediatric Psychology. 2002; 27 (1):67–76. doi: 10.1093/jpepsy/27.1.67. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • SPSS Statistics. Chicago, IL: SPSS Inc; 2011. (Version 20.0.0) [ Google Scholar ]
  • StataCorp. Stata Statistical Software: Release 12. College Station, TX: StataCorp LP; 2011. [ Google Scholar ]
  • Stone AA, Broderick JE, Kaell AT, Deles-Paul PAEG, Porter LE. Does the peak-end phenomenon observed in laboratory pain studies apply to real-world pain in rheumatoid arthritics? Journal of Pain. 2000; 1 :212–217. doi: 10.1054/jpai.2000.7568. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stone AA, Shiffman S. Capturing momentary, self-report data: A proposal for reporting guidelines. Annals of Behavioral Medicine. 2002; 24 :236–243. doi: 10.1207/S15324796ABM2403_09. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stout RL. Advancing the analysis of treatment process. Addiction. 2007; 102 :1539–1545. doi: 10.1111/j.1360-0443.2007.01880.x. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Tate RL, McDonald S, Perdices M, Togher L, Schultz R, Savage S. Rating the methodological quality of single-subject designs and N-of-1 trials: Introducing the Single-Case Experimental Design (SCED) Scale. Neuropsychological Rehabilitation. 2008; 18 (4):385–401. doi: 10.1080/09602010802009201. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Thiele C, Laireiter A-R, Baumann U. Diaries in clinical psychology and psychotherapy: A selective review. Clinical Psychology & Psychotherapy. 2002; 9 (1):1–37. doi: 10.1002/cpp.302. [ CrossRef ] [ Google Scholar ]
  • Tiao GC, Box GEP. Modeling multiple time series with applications. Journal of the American Statistical Association. 1981; 76 :802–816. [ Google Scholar ]
  • Tschacher W, Ramseyer F. Modeling psychotherapy process by time-series panel analysis (TSPA) Psychotherapy Research. 2009; 19 (4):469–481. doi: 10.1080/10503300802654496. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Velicer WF, Colby SM. A comparison of missing-data procedures for ARIMA time-series analysis. Educational and Psychological Measurement. 2005a; 65 (4):596–615. doi: 10.1177/0013164404272502. [ CrossRef ] [ Google Scholar ]
  • Velicer WF, Colby SM. Missing data and the general transformation approach to time series analysis. In: Maydeu-Olivares A, McArdle JJ, editors. Contemporary psychometrics. A festschrift to Roderick P McDonald. Hillsdale, NJ: Lawrence Erlbaum; 2005b. pp. 509–535. [ Google Scholar ]
  • Velicer WF, Fava JL. Time series analysis. In: Schinka J, Velicer WF, Weiner IB, editors. Research methods in psychology. Vol. 2. New York, NY: John Wiley & Sons; 2003. [ Google Scholar ]
  • Wachtel PL. Beyond “ESTs”: Problematic assumptions in the pursuit of evidence-based practice. Psychoanalytic Psychology. 2010; 27 (3):251–272. doi: 10.1037/a0020532. [ CrossRef ] [ Google Scholar ]
  • Watson JB. Behaviorism. New York, NY: Norton; 1925. [ Google Scholar ]
  • Weisz JR, Hawley KM. Finding, evaluating, refining, and applying empirically supported treatments for children and adolescents. Journal of Clinical Child Psychology. 1998; 27 :206–216. doi: 10.1207/s15374424jccp2702_7. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Weisz JR, Hawley KM. Procedural and coding manual for identification of beneficial treatments. Washinton, DC: American Psychological Association, Society for Clinical Psychology, Division 12, Committee on Science and Practice; 1999. [ Google Scholar ]
  • Westen D, Bradley R. Empirically supported complexity. Current Directions in Psychological Science. 2005; 14 :266–271. doi: 10.1111/j.0963-7214.2005.00378.x. [ CrossRef ] [ Google Scholar ]
  • Westen D, Novotny CM, Thompson-Brenner HK. The empirical status of empirically supported psychotherapies: Assumptions, findings, and reporting controlled clinical trials. Psychological Bulletin. 2004; 130 :631–663. doi: 10.1037/0033-2909.130.4.631. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wilkinson L The Task Force on Statistical Inference. Statistical methods in psychology journals: Guidelines and explanations. American Psychologist. 1999; 54 :694–704. doi: 10.1037/0003-066X.54.8.594. [ CrossRef ] [ Google Scholar ]
  • Wolery M, Busick M, Reichow B, Barton EE. Comparison of overlap methods for quantitatively synthesizing single-subject data. The Journal of Special Education. 2010; 44 (1):18–28. doi: 10.1177/0022466908328009. [ CrossRef ] [ Google Scholar ]
  • Wu Z, Huang NE, Long SR, Peng C-K. On the trend, detrending, and variability of nonlinear and nonstationary time series. Proceedings of the National Academy of Sciences. 2007; 104 (38):14889–14894. doi: 10.1073/pnas.0701020104. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Perspective
  • Published: 22 November 2022

Single case studies are a powerful tool for developing, testing and extending theories

  • Lyndsey Nickels   ORCID: orcid.org/0000-0002-0311-3524 1 , 2 ,
  • Simon Fischer-Baum   ORCID: orcid.org/0000-0002-6067-0538 3 &
  • Wendy Best   ORCID: orcid.org/0000-0001-8375-5916 4  

Nature Reviews Psychology volume  1 ,  pages 733–747 ( 2022 ) Cite this article

669 Accesses

5 Citations

26 Altmetric

Metrics details

  • Neurological disorders

Psychology embraces a diverse range of methodologies. However, most rely on averaging group data to draw conclusions. In this Perspective, we argue that single case methodology is a valuable tool for developing and extending psychological theories. We stress the importance of single case and case series research, drawing on classic and contemporary cases in which cognitive and perceptual deficits provide insights into typical cognitive processes in domains such as memory, delusions, reading and face perception. We unpack the key features of single case methodology, describe its strengths, its value in adjudicating between theories, and outline its benefits for a better understanding of deficits and hence more appropriate interventions. The unique insights that single case studies have provided illustrate the value of in-depth investigation within an individual. Single case methodology has an important place in the psychologist’s toolkit and it should be valued as a primary research tool.

This is a preview of subscription content, access via your institution

Access options

Subscribe to this journal

Receive 12 digital issues and online access to articles

55,14 € per year

only 4,60 € per issue

Buy this article

  • Purchase on Springer Link
  • Instant access to full article PDF

Prices may be subject to local taxes which are calculated during checkout

case study experimental psychology

Similar content being viewed by others

case study experimental psychology

Comparing meta-analyses and preregistered multiple-laboratory replication projects

case study experimental psychology

The fundamental importance of method to theory

case study experimental psychology

A critical evaluation of the p-factor literature

Corkin, S. Permanent Present Tense: The Unforgettable Life Of The Amnesic Patient, H. M . Vol. XIX, 364 (Basic Books, 2013).

Lilienfeld, S. O. Psychology: From Inquiry To Understanding (Pearson, 2019).

Schacter, D. L., Gilbert, D. T., Nock, M. K. & Wegner, D. M. Psychology (Worth Publishers, 2019).

Eysenck, M. W. & Brysbaert, M. Fundamentals Of Cognition (Routledge, 2018).

Squire, L. R. Memory and brain systems: 1969–2009. J. Neurosci. 29 , 12711–12716 (2009).

Article   PubMed   PubMed Central   Google Scholar  

Corkin, S. What’s new with the amnesic patient H.M.? Nat. Rev. Neurosci. 3 , 153–160 (2002).

Article   PubMed   Google Scholar  

Schubert, T. M. et al. Lack of awareness despite complex visual processing: evidence from event-related potentials in a case of selective metamorphopsia. Proc. Natl Acad. Sci. USA 117 , 16055–16064 (2020).

Behrmann, M. & Plaut, D. C. Bilateral hemispheric processing of words and faces: evidence from word impairments in prosopagnosia and face impairments in pure alexia. Cereb. Cortex 24 , 1102–1118 (2014).

Plaut, D. C. & Behrmann, M. Complementary neural representations for faces and words: a computational exploration. Cogn. Neuropsychol. 28 , 251–275 (2011).

Haxby, J. V. et al. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293 , 2425–2430 (2001).

Hirshorn, E. A. et al. Decoding and disrupting left midfusiform gyrus activity during word reading. Proc. Natl Acad. Sci. USA 113 , 8162–8167 (2016).

Kosakowski, H. L. et al. Selective responses to faces, scenes, and bodies in the ventral visual pathway of infants. Curr. Biol. 32 , 265–274.e5 (2022).

Harlow, J. Passage of an iron rod through the head. Boston Med. Surgical J . https://doi.org/10.1176/jnp.11.2.281 (1848).

Broca, P. Remarks on the seat of the faculty of articulated language, following an observation of aphemia (loss of speech). Bull. Soc. Anat. 6 , 330–357 (1861).

Google Scholar  

Dejerine, J. Contribution A L’étude Anatomo-pathologique Et Clinique Des Différentes Variétés De Cécité Verbale: I. Cécité Verbale Avec Agraphie Ou Troubles Très Marqués De L’écriture; II. Cécité Verbale Pure Avec Intégrité De L’écriture Spontanée Et Sous Dictée (Société de Biologie, 1892).

Liepmann, H. Das Krankheitsbild der Apraxie (“motorischen Asymbolie”) auf Grund eines Falles von einseitiger Apraxie (Fortsetzung). Eur. Neurol. 8 , 102–116 (1900).

Article   Google Scholar  

Basso, A., Spinnler, H., Vallar, G. & Zanobio, M. E. Left hemisphere damage and selective impairment of auditory verbal short-term memory. A case study. Neuropsychologia 20 , 263–274 (1982).

Humphreys, G. W. & Riddoch, M. J. The fractionation of visual agnosia. In Visual Object Processing: A Cognitive Neuropsychological Approach 281–306 (Lawrence Erlbaum, 1987).

Whitworth, A., Webster, J. & Howard, D. A Cognitive Neuropsychological Approach To Assessment And Intervention In Aphasia (Psychology Press, 2014).

Caramazza, A. On drawing inferences about the structure of normal cognitive systems from the analysis of patterns of impaired performance: the case for single-patient studies. Brain Cogn. 5 , 41–66 (1986).

Caramazza, A. & McCloskey, M. The case for single-patient studies. Cogn. Neuropsychol. 5 , 517–527 (1988).

Shallice, T. Cognitive neuropsychology and its vicissitudes: the fate of Caramazza’s axioms. Cogn. Neuropsychol. 32 , 385–411 (2015).

Shallice, T. From Neuropsychology To Mental Structure (Cambridge Univ. Press, 1988).

Coltheart, M. Assumptions and methods in cognitive neuropscyhology. In The Handbook Of Cognitive Neuropsychology: What Deficits Reveal About The Human Mind (ed. Rapp, B.) 3–22 (Psychology Press, 2001).

McCloskey, M. & Chaisilprungraung, T. The value of cognitive neuropsychology: the case of vision research. Cogn. Neuropsychol. 34 , 412–419 (2017).

McCloskey, M. The future of cognitive neuropsychology. In The Handbook Of Cognitive Neuropsychology: What Deficits Reveal About The Human Mind (ed. Rapp, B.) 593–610 (Psychology Press, 2001).

Lashley, K. S. In search of the engram. In Physiological Mechanisms in Animal Behavior 454–482 (Academic Press, 1950).

Squire, L. R. & Wixted, J. T. The cognitive neuroscience of human memory since H.M. Annu. Rev. Neurosci. 34 , 259–288 (2011).

Stone, G. O., Vanhoy, M. & Orden, G. C. V. Perception is a two-way street: feedforward and feedback phonology in visual word recognition. J. Mem. Lang. 36 , 337–359 (1997).

Perfetti, C. A. The psycholinguistics of spelling and reading. In Learning To Spell: Research, Theory, And Practice Across Languages 21–38 (Lawrence Erlbaum, 1997).

Nickels, L. The autocue? self-generated phonemic cues in the treatment of a disorder of reading and naming. Cogn. Neuropsychol. 9 , 155–182 (1992).

Rapp, B., Benzing, L. & Caramazza, A. The autonomy of lexical orthography. Cogn. Neuropsychol. 14 , 71–104 (1997).

Bonin, P., Roux, S. & Barry, C. Translating nonverbal pictures into verbal word names. Understanding lexical access and retrieval. In Past, Present, And Future Contributions Of Cognitive Writing Research To Cognitive Psychology 315–522 (Psychology Press, 2011).

Bonin, P., Fayol, M. & Gombert, J.-E. Role of phonological and orthographic codes in picture naming and writing: an interference paradigm study. Cah. Psychol. Cogn./Current Psychol. Cogn. 16 , 299–324 (1997).

Bonin, P., Fayol, M. & Peereman, R. Masked form priming in writing words from pictures: evidence for direct retrieval of orthographic codes. Acta Psychol. 99 , 311–328 (1998).

Bentin, S., Allison, T., Puce, A., Perez, E. & McCarthy, G. Electrophysiological studies of face perception in humans. J. Cogn. Neurosci. 8 , 551–565 (1996).

Jeffreys, D. A. Evoked potential studies of face and object processing. Vis. Cogn. 3 , 1–38 (1996).

Laganaro, M., Morand, S., Michel, C. M., Spinelli, L. & Schnider, A. ERP correlates of word production before and after stroke in an aphasic patient. J. Cogn. Neurosci. 23 , 374–381 (2011).

Indefrey, P. & Levelt, W. J. M. The spatial and temporal signatures of word production components. Cognition 92 , 101–144 (2004).

Valente, A., Burki, A. & Laganaro, M. ERP correlates of word production predictors in picture naming: a trial by trial multiple regression analysis from stimulus onset to response. Front. Neurosci. 8 , 390 (2014).

Kittredge, A. K., Dell, G. S., Verkuilen, J. & Schwartz, M. F. Where is the effect of frequency in word production? Insights from aphasic picture-naming errors. Cogn. Neuropsychol. 25 , 463–492 (2008).

Domdei, N. et al. Ultra-high contrast retinal display system for single photoreceptor psychophysics. Biomed. Opt. Express 9 , 157 (2018).

Poldrack, R. A. et al. Long-term neural and physiological phenotyping of a single human. Nat. Commun. 6 , 8885 (2015).

Coltheart, M. The assumptions of cognitive neuropsychology: reflections on Caramazza (1984, 1986). Cogn. Neuropsychol. 34 , 397–402 (2017).

Badecker, W. & Caramazza, A. A final brief in the case against agrammatism: the role of theory in the selection of data. Cognition 24 , 277–282 (1986).

Fischer-Baum, S. Making sense of deviance: Identifying dissociating cases within the case series approach. Cogn. Neuropsychol. 30 , 597–617 (2013).

Nickels, L., Howard, D. & Best, W. On the use of different methodologies in cognitive neuropsychology: drink deep and from several sources. Cogn. Neuropsychol. 28 , 475–485 (2011).

Dell, G. S. & Schwartz, M. F. Who’s in and who’s out? Inclusion criteria, model evaluation, and the treatment of exceptions in case series. Cogn. Neuropsychol. 28 , 515–520 (2011).

Schwartz, M. F. & Dell, G. S. Case series investigations in cognitive neuropsychology. Cogn. Neuropsychol. 27 , 477–494 (2010).

Cohen, J. A power primer. Psychol. Bull. 112 , 155–159 (1992).

Martin, R. C. & Allen, C. Case studies in neuropsychology. In APA Handbook Of Research Methods In Psychology Vol. 2 Research Designs: Quantitative, Qualitative, Neuropsychological, And Biological (eds Cooper, H. et al.) 633–646 (American Psychological Association, 2012).

Leivada, E., Westergaard, M., Duñabeitia, J. A. & Rothman, J. On the phantom-like appearance of bilingualism effects on neurocognition: (how) should we proceed? Bilingualism 24 , 197–210 (2021).

Arnett, J. J. The neglected 95%: why American psychology needs to become less American. Am. Psychol. 63 , 602–614 (2008).

Stolz, J. A., Besner, D. & Carr, T. H. Implications of measures of reliability for theories of priming: activity in semantic memory is inherently noisy and uncoordinated. Vis. Cogn. 12 , 284–336 (2005).

Cipora, K. et al. A minority pulls the sample mean: on the individual prevalence of robust group-level cognitive phenomena — the instance of the SNARC effect. Preprint at psyArXiv https://doi.org/10.31234/osf.io/bwyr3 (2019).

Andrews, S., Lo, S. & Xia, V. Individual differences in automatic semantic priming. J. Exp. Psychol. Hum. Percept. Perform. 43 , 1025–1039 (2017).

Tan, L. C. & Yap, M. J. Are individual differences in masked repetition and semantic priming reliable? Vis. Cogn. 24 , 182–200 (2016).

Olsson-Collentine, A., Wicherts, J. M. & van Assen, M. A. L. M. Heterogeneity in direct replications in psychology and its association with effect size. Psychol. Bull. 146 , 922–940 (2020).

Gratton, C. & Braga, R. M. Editorial overview: deep imaging of the individual brain: past, practice, and promise. Curr. Opin. Behav. Sci. 40 , iii–vi (2021).

Fedorenko, E. The early origins and the growing popularity of the individual-subject analytic approach in human neuroscience. Curr. Opin. Behav. Sci. 40 , 105–112 (2021).

Xue, A. et al. The detailed organization of the human cerebellum estimated by intrinsic functional connectivity within the individual. J. Neurophysiol. 125 , 358–384 (2021).

Petit, S. et al. Toward an individualized neural assessment of receptive language in children. J. Speech Lang. Hear. Res. 63 , 2361–2385 (2020).

Jung, K.-H. et al. Heterogeneity of cerebral white matter lesions and clinical correlates in older adults. Stroke 52 , 620–630 (2021).

Falcon, M. I., Jirsa, V. & Solodkin, A. A new neuroinformatics approach to personalized medicine in neurology: the virtual brain. Curr. Opin. Neurol. 29 , 429–436 (2016).

Duncan, G. J., Engel, M., Claessens, A. & Dowsett, C. J. Replication and robustness in developmental research. Dev. Psychol. 50 , 2417–2425 (2014).

Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349 , aac4716 (2015).

Tackett, J. L., Brandes, C. M., King, K. M. & Markon, K. E. Psychology’s replication crisis and clinical psychological science. Annu. Rev. Clin. Psychol. 15 , 579–604 (2019).

Munafò, M. R. et al. A manifesto for reproducible science. Nat. Hum. Behav. 1 , 0021 (2017).

Oldfield, R. C. & Wingfield, A. The time it takes to name an object. Nature 202 , 1031–1032 (1964).

Oldfield, R. C. & Wingfield, A. Response latencies in naming objects. Q. J. Exp. Psychol. 17 , 273–281 (1965).

Brysbaert, M. How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. J. Cogn. 2 , 16 (2019).

Brysbaert, M. Power considerations in bilingualism research: time to step up our game. Bilingualism https://doi.org/10.1017/S1366728920000437 (2020).

Machery, E. What is a replication? Phil. Sci. 87 , 545–567 (2020).

Nosek, B. A. & Errington, T. M. What is replication? PLoS Biol. 18 , e3000691 (2020).

Li, X., Huang, L., Yao, P. & Hyönä, J. Universal and specific reading mechanisms across different writing systems. Nat. Rev. Psychol. 1 , 133–144 (2022).

Rapp, B. (Ed.) The Handbook Of Cognitive Neuropsychology: What Deficits Reveal About The Human Mind (Psychology Press, 2001).

Code, C. et al. Classic Cases In Neuropsychology (Psychology Press, 1996).

Patterson, K., Marshall, J. C. & Coltheart, M. Surface Dyslexia: Neuropsychological And Cognitive Studies Of Phonological Reading (Routledge, 2017).

Marshall, J. C. & Newcombe, F. Patterns of paralexia: a psycholinguistic approach. J. Psycholinguist. Res. 2 , 175–199 (1973).

Castles, A. & Coltheart, M. Varieties of developmental dyslexia. Cognition 47 , 149–180 (1993).

Khentov-Kraus, L. & Friedmann, N. Vowel letter dyslexia. Cogn. Neuropsychol. 35 , 223–270 (2018).

Winskel, H. Orthographic and phonological parafoveal processing of consonants, vowels, and tones when reading Thai. Appl. Psycholinguist. 32 , 739–759 (2011).

Hepner, C., McCloskey, M. & Rapp, B. Do reading and spelling share orthographic representations? Evidence from developmental dysgraphia. Cogn. Neuropsychol. 34 , 119–143 (2017).

Hanley, J. R. & Sotiropoulos, A. Developmental surface dysgraphia without surface dyslexia. Cogn. Neuropsychol. 35 , 333–341 (2018).

Zihl, J. & Heywood, C. A. The contribution of single case studies to the neuroscience of vision: single case studies in vision neuroscience. Psych. J. 5 , 5–17 (2016).

Bouvier, S. E. & Engel, S. A. Behavioral deficits and cortical damage loci in cerebral achromatopsia. Cereb. Cortex 16 , 183–191 (2006).

Zihl, J. & Heywood, C. A. The contribution of LM to the neuroscience of movement vision. Front. Integr. Neurosci. 9 , 6 (2015).

Dotan, D. & Friedmann, N. Separate mechanisms for number reading and word reading: evidence from selective impairments. Cortex 114 , 176–192 (2019).

McCloskey, M. & Schubert, T. Shared versus separate processes for letter and digit identification. Cogn. Neuropsychol. 31 , 437–460 (2014).

Fayol, M. & Seron, X. On numerical representations. Insights from experimental, neuropsychological, and developmental research. In Handbook of Mathematical Cognition (ed. Campbell, J.) 3–23 (Psychological Press, 2005).

Bornstein, B. & Kidron, D. P. Prosopagnosia. J. Neurol. Neurosurg. Psychiat. 22 , 124–131 (1959).

Kühn, C. D., Gerlach, C., Andersen, K. B., Poulsen, M. & Starrfelt, R. Face recognition in developmental dyslexia: evidence for dissociation between faces and words. Cogn. Neuropsychol. 38 , 107–115 (2021).

Barton, J. J. S., Albonico, A., Susilo, T., Duchaine, B. & Corrow, S. L. Object recognition in acquired and developmental prosopagnosia. Cogn. Neuropsychol. 36 , 54–84 (2019).

Renault, B., Signoret, J.-L., Debruille, B., Breton, F. & Bolgert, F. Brain potentials reveal covert facial recognition in prosopagnosia. Neuropsychologia 27 , 905–912 (1989).

Bauer, R. M. Autonomic recognition of names and faces in prosopagnosia: a neuropsychological application of the guilty knowledge test. Neuropsychologia 22 , 457–469 (1984).

Haan, E. H. F., de, Young, A. & Newcombe, F. Face recognition without awareness. Cogn. Neuropsychol. 4 , 385–415 (1987).

Ellis, H. D. & Lewis, M. B. Capgras delusion: a window on face recognition. Trends Cogn. Sci. 5 , 149–156 (2001).

Ellis, H. D., Young, A. W., Quayle, A. H. & De Pauw, K. W. Reduced autonomic responses to faces in Capgras delusion. Proc. R. Soc. Lond. B 264 , 1085–1092 (1997).

Collins, M. N., Hawthorne, M. E., Gribbin, N. & Jacobson, R. Capgras’ syndrome with organic disorders. Postgrad. Med. J. 66 , 1064–1067 (1990).

Enoch, D., Puri, B. K. & Ball, H. Uncommon Psychiatric Syndromes 5th edn (Routledge, 2020).

Tranel, D., Damasio, H. & Damasio, A. R. Double dissociation between overt and covert face recognition. J. Cogn. Neurosci. 7 , 425–432 (1995).

Brighetti, G., Bonifacci, P., Borlimi, R. & Ottaviani, C. “Far from the heart far from the eye”: evidence from the Capgras delusion. Cogn. Neuropsychiat. 12 , 189–197 (2007).

Coltheart, M., Langdon, R. & McKay, R. Delusional belief. Annu. Rev. Psychol. 62 , 271–298 (2011).

Coltheart, M. Cognitive neuropsychiatry and delusional belief. Q. J. Exp. Psychol. 60 , 1041–1062 (2007).

Coltheart, M. & Davies, M. How unexpected observations lead to new beliefs: a Peircean pathway. Conscious. Cogn. 87 , 103037 (2021).

Coltheart, M. & Davies, M. Failure of hypothesis evaluation as a factor in delusional belief. Cogn. Neuropsychiat. 26 , 213–230 (2021).

McCloskey, M. et al. A developmental deficit in localizing objects from vision. Psychol. Sci. 6 , 112–117 (1995).

McCloskey, M., Valtonen, J. & Cohen Sherman, J. Representing orientation: a coordinate-system hypothesis and evidence from developmental deficits. Cogn. Neuropsychol. 23 , 680–713 (2006).

McCloskey, M. Spatial representations and multiple-visual-systems hypotheses: evidence from a developmental deficit in visual location and orientation processing. Cortex 40 , 677–694 (2004).

Gregory, E. & McCloskey, M. Mirror-image confusions: implications for representation and processing of object orientation. Cognition 116 , 110–129 (2010).

Gregory, E., Landau, B. & McCloskey, M. Representation of object orientation in children: evidence from mirror-image confusions. Vis. Cogn. 19 , 1035–1062 (2011).

Laine, M. & Martin, N. Cognitive neuropsychology has been, is, and will be significant to aphasiology. Aphasiology 26 , 1362–1376 (2012).

Howard, D. & Patterson, K. The Pyramids And Palm Trees Test: A Test Of Semantic Access From Words And Pictures (Thames Valley Test Co., 1992).

Kay, J., Lesser, R. & Coltheart, M. PALPA: Psycholinguistic Assessments Of Language Processing In Aphasia. 2: Picture & Word Semantics, Sentence Comprehension (Erlbaum, 2001).

Franklin, S. Dissociations in auditory word comprehension; evidence from nine fluent aphasic patients. Aphasiology 3 , 189–207 (1989).

Howard, D., Swinburn, K. & Porter, G. Putting the CAT out: what the comprehensive aphasia test has to offer. Aphasiology 24 , 56–74 (2010).

Conti-Ramsden, G., Crutchley, A. & Botting, N. The extent to which psychometric tests differentiate subgroups of children with SLI. J. Speech Lang. Hear. Res. 40 , 765–777 (1997).

Bishop, D. V. M. & McArthur, G. M. Individual differences in auditory processing in specific language impairment: a follow-up study using event-related potentials and behavioural thresholds. Cortex 41 , 327–341 (2005).

Bishop, D. V. M., Snowling, M. J., Thompson, P. A. & Greenhalgh, T., and the CATALISE-2 consortium. Phase 2 of CATALISE: a multinational and multidisciplinary Delphi consensus study of problems with language development: terminology. J. Child. Psychol. Psychiat. 58 , 1068–1080 (2017).

Wilson, A. J. et al. Principles underlying the design of ‘the number race’, an adaptive computer game for remediation of dyscalculia. Behav. Brain Funct. 2 , 19 (2006).

Basso, A. & Marangolo, P. Cognitive neuropsychological rehabilitation: the emperor’s new clothes? Neuropsychol. Rehabil. 10 , 219–229 (2000).

Murad, M. H., Asi, N., Alsawas, M. & Alahdab, F. New evidence pyramid. Evidence-based Med. 21 , 125–127 (2016).

Greenhalgh, T., Howick, J. & Maskrey, N., for the Evidence Based Medicine Renaissance Group. Evidence based medicine: a movement in crisis? Br. Med. J. 348 , g3725–g3725 (2014).

Best, W., Ping Sze, W., Edmundson, A. & Nickels, L. What counts as evidence? Swimming against the tide: valuing both clinically informed experimentally controlled case series and randomized controlled trials in intervention research. Evidence-based Commun. Assess. Interv. 13 , 107–135 (2019).

Best, W. et al. Understanding differing outcomes from semantic and phonological interventions with children with word-finding difficulties: a group and case series study. Cortex 134 , 145–161 (2021).

OCEBM Levels of Evidence Working Group. The Oxford Levels of Evidence 2. CEBM https://www.cebm.ox.ac.uk/resources/levels-of-evidence/ocebm-levels-of-evidence (2011).

Holler, D. E., Behrmann, M. & Snow, J. C. Real-world size coding of solid objects, but not 2-D or 3-D images, in visual agnosia patients with bilateral ventral lesions. Cortex 119 , 555–568 (2019).

Duchaine, B. C., Yovel, G., Butterworth, E. J. & Nakayama, K. Prosopagnosia as an impairment to face-specific mechanisms: elimination of the alternative hypotheses in a developmental case. Cogn. Neuropsychol. 23 , 714–747 (2006).

Hartley, T. et al. The hippocampus is required for short-term topographical memory in humans. Hippocampus 17 , 34–48 (2007).

Pishnamazi, M. et al. Attentional bias towards and away from fearful faces is modulated by developmental amygdala damage. Cortex 81 , 24–34 (2016).

Rapp, B., Fischer-Baum, S. & Miozzo, M. Modality and morphology: what we write may not be what we say. Psychol. Sci. 26 , 892–902 (2015).

Yong, K. X. X., Warren, J. D., Warrington, E. K. & Crutch, S. J. Intact reading in patients with profound early visual dysfunction. Cortex 49 , 2294–2306 (2013).

Rockland, K. S. & Van Hoesen, G. W. Direct temporal–occipital feedback connections to striate cortex (V1) in the macaque monkey. Cereb. Cortex 4 , 300–313 (1994).

Haynes, J.-D., Driver, J. & Rees, G. Visibility reflects dynamic changes of effective connectivity between V1 and fusiform cortex. Neuron 46 , 811–821 (2005).

Tanaka, K. Mechanisms of visual object recognition: monkey and human studies. Curr. Opin. Neurobiol. 7 , 523–529 (1997).

Fischer-Baum, S., McCloskey, M. & Rapp, B. Representation of letter position in spelling: evidence from acquired dysgraphia. Cognition 115 , 466–490 (2010).

Houghton, G. The problem of serial order: a neural network model of sequence learning and recall. In Current Research In Natural Language Generation (eds Dale, R., Mellish, C. & Zock, M.) 287–319 (Academic Press, 1990).

Fieder, N., Nickels, L., Biedermann, B. & Best, W. From “some butter” to “a butter”: an investigation of mass and count representation and processing. Cogn. Neuropsychol. 31 , 313–349 (2014).

Fieder, N., Nickels, L., Biedermann, B. & Best, W. How ‘some garlic’ becomes ‘a garlic’ or ‘some onion’: mass and count processing in aphasia. Neuropsychologia 75 , 626–645 (2015).

Schröder, A., Burchert, F. & Stadie, N. Training-induced improvement of noncanonical sentence production does not generalize to comprehension: evidence for modality-specific processes. Cogn. Neuropsychol. 32 , 195–220 (2015).

Stadie, N. et al. Unambiguous generalization effects after treatment of non-canonical sentence production in German agrammatism. Brain Lang. 104 , 211–229 (2008).

Schapiro, A. C., Gregory, E., Landau, B., McCloskey, M. & Turk-Browne, N. B. The necessity of the medial temporal lobe for statistical learning. J. Cogn. Neurosci. 26 , 1736–1747 (2014).

Schapiro, A. C., Kustner, L. V. & Turk-Browne, N. B. Shaping of object representations in the human medial temporal lobe based on temporal regularities. Curr. Biol. 22 , 1622–1627 (2012).

Baddeley, A., Vargha-Khadem, F. & Mishkin, M. Preserved recognition in a case of developmental amnesia: implications for the acaquisition of semantic memory? J. Cogn. Neurosci. 13 , 357–369 (2001).

Snyder, J. J. & Chatterjee, A. Spatial-temporal anisometries following right parietal damage. Neuropsychologia 42 , 1703–1708 (2004).

Ashkenazi, S., Henik, A., Ifergane, G. & Shelef, I. Basic numerical processing in left intraparietal sulcus (IPS) acalculia. Cortex 44 , 439–448 (2008).

Lebrun, M.-A., Moreau, P., McNally-Gagnon, A., Mignault Goulet, G. & Peretz, I. Congenital amusia in childhood: a case study. Cortex 48 , 683–688 (2012).

Vannuscorps, G., Andres, M. & Pillon, A. When does action comprehension need motor involvement? Evidence from upper limb aplasia. Cogn. Neuropsychol. 30 , 253–283 (2013).

Jeannerod, M. Neural simulation of action: a unifying mechanism for motor cognition. NeuroImage 14 , S103–S109 (2001).

Blakemore, S.-J. & Decety, J. From the perception of action to the understanding of intention. Nat. Rev. Neurosci. 2 , 561–567 (2001).

Rizzolatti, G. & Craighero, L. The mirror-neuron system. Annu. Rev. Neurosci. 27 , 169–192 (2004).

Forde, E. M. E., Humphreys, G. W. & Remoundou, M. Disordered knowledge of action order in action disorganisation syndrome. Neurocase 10 , 19–28 (2004).

Mazzi, C. & Savazzi, S. The glamor of old-style single-case studies in the neuroimaging era: insights from a patient with hemianopia. Front. Psychol. 10 , 965 (2019).

Coltheart, M. What has functional neuroimaging told us about the mind (so far)? (Position Paper Presented to the European Cognitive Neuropsychology Workshop, Bressanone, 2005). Cortex 42 , 323–331 (2006).

Page, M. P. A. What can’t functional neuroimaging tell the cognitive psychologist? Cortex 42 , 428–443 (2006).

Blank, I. A., Kiran, S. & Fedorenko, E. Can neuroimaging help aphasia researchers? Addressing generalizability, variability, and interpretability. Cogn. Neuropsychol. 34 , 377–393 (2017).

Niv, Y. The primacy of behavioral research for understanding the brain. Behav. Neurosci. 135 , 601–609 (2021).

Crawford, J. R. & Howell, D. C. Comparing an individual’s test score against norms derived from small samples. Clin. Neuropsychol. 12 , 482–486 (1998).

Crawford, J. R., Garthwaite, P. H. & Ryan, K. Comparing a single case to a control sample: testing for neuropsychological deficits and dissociations in the presence of covariates. Cortex 47 , 1166–1178 (2011).

McIntosh, R. D. & Rittmo, J. Ö. Power calculations in single-case neuropsychology: a practical primer. Cortex 135 , 146–158 (2021).

Patterson, K. & Plaut, D. C. “Shallow draughts intoxicate the brain”: lessons from cognitive science for cognitive neuropsychology. Top. Cogn. Sci. 1 , 39–58 (2009).

Lambon Ralph, M. A., Patterson, K. & Plaut, D. C. Finite case series or infinite single-case studies? Comments on “Case series investigations in cognitive neuropsychology” by Schwartz and Dell (2010). Cogn. Neuropsychol. 28 , 466–474 (2011).

Horien, C., Shen, X., Scheinost, D. & Constable, R. T. The individual functional connectome is unique and stable over months to years. NeuroImage 189 , 676–687 (2019).

Epelbaum, S. et al. Pure alexia as a disconnection syndrome: new diffusion imaging evidence for an old concept. Cortex 44 , 962–974 (2008).

Fischer-Baum, S. & Campana, G. Neuroplasticity and the logic of cognitive neuropsychology. Cogn. Neuropsychol. 34 , 403–411 (2017).

Paul, S., Baca, E. & Fischer-Baum, S. Cerebellar contributions to orthographic working memory: a single case cognitive neuropsychological investigation. Neuropsychologia 171 , 108242 (2022).

Feinstein, J. S., Adolphs, R., Damasio, A. & Tranel, D. The human amygdala and the induction and experience of fear. Curr. Biol. 21 , 34–38 (2011).

Crawford, J., Garthwaite, P. & Gray, C. Wanted: fully operational definitions of dissociations in single-case studies. Cortex 39 , 357–370 (2003).

McIntosh, R. D. Simple dissociations for a higher-powered neuropsychology. Cortex 103 , 256–265 (2018).

McIntosh, R. D. & Brooks, J. L. Current tests and trends in single-case neuropsychology. Cortex 47 , 1151–1159 (2011).

Best, W., Schröder, A. & Herbert, R. An investigation of a relative impairment in naming non-living items: theoretical and methodological implications. J. Neurolinguistics 19 , 96–123 (2006).

Franklin, S., Howard, D. & Patterson, K. Abstract word anomia. Cogn. Neuropsychol. 12 , 549–566 (1995).

Coltheart, M., Patterson, K. E. & Marshall, J. C. Deep Dyslexia (Routledge, 1980).

Nickels, L., Kohnen, S. & Biedermann, B. An untapped resource: treatment as a tool for revealing the nature of cognitive processes. Cogn. Neuropsychol. 27 , 539–562 (2010).

Download references

Acknowledgements

The authors thank all of those pioneers of and advocates for single case study research who have mentored, inspired and encouraged us over the years, and the many other colleagues with whom we have discussed these issues.

Author information

Authors and affiliations.

School of Psychological Sciences & Macquarie University Centre for Reading, Macquarie University, Sydney, New South Wales, Australia

Lyndsey Nickels

NHMRC Centre of Research Excellence in Aphasia Recovery and Rehabilitation, Australia

Psychological Sciences, Rice University, Houston, TX, USA

Simon Fischer-Baum

Psychology and Language Sciences, University College London, London, UK

You can also search for this author in PubMed   Google Scholar

Contributions

L.N. led and was primarily responsible for the structuring and writing of the manuscript. All authors contributed to all aspects of the article.

Corresponding author

Correspondence to Lyndsey Nickels .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Reviews Psychology thanks Yanchao Bi, Rob McIntosh, and the other, anonymous, reviewer for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article.

Nickels, L., Fischer-Baum, S. & Best, W. Single case studies are a powerful tool for developing, testing and extending theories. Nat Rev Psychol 1 , 733–747 (2022). https://doi.org/10.1038/s44159-022-00127-y

Download citation

Accepted : 13 October 2022

Published : 22 November 2022

Issue Date : December 2022

DOI : https://doi.org/10.1038/s44159-022-00127-y

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

case study experimental psychology

Logo for BCcampus Open Publishing

Want to create or adapt books like this? Learn more about how Pressbooks supports open publishing practices.

Chapter 3. Psychological Science

3.2 Psychologists Use Descriptive, Correlational, and Experimental Research Designs to Understand Behaviour

Learning objectives.

  • Differentiate the goals of descriptive, correlational, and experimental research designs and explain the advantages and disadvantages of each.
  • Explain the goals of descriptive research and the statistical techniques used to interpret it.
  • Summarize the uses of correlational research and describe why correlational research cannot be used to infer causality.
  • Review the procedures of experimental research and explain how it can be used to draw causal inferences.

Psychologists agree that if their ideas and theories about human behaviour are to be taken seriously, they must be backed up by data. However, the research of different psychologists is designed with different goals in mind, and the different goals require different approaches. These varying approaches, summarized in Table 3.2, are known as research designs . A research design  is the specific method a researcher uses to collect, analyze, and interpret data . Psychologists use three major types of research designs in their research, and each provides an essential avenue for scientific investigation. Descriptive research  is research designed to provide a snapshot of the current state of affairs . Correlational research  is research designed to discover relationships among variables and to allow the prediction of future events from present knowledge . Experimental research  is research in which initial equivalence among research participants in more than one group is created, followed by a manipulation of a given experience for these groups and a measurement of the influence of the manipulation . Each of the three research designs varies according to its strengths and limitations, and it is important to understand how each differs.

Descriptive Research: Assessing the Current State of Affairs

Descriptive research is designed to create a snapshot of the current thoughts, feelings, or behaviour of individuals. This section reviews three types of descriptive research : case studies , surveys , and naturalistic observation (Figure 3.4).

Sometimes the data in a descriptive research project are based on only a small set of individuals, often only one person or a single small group. These research designs are known as case studies — descriptive records of one or more individual’s experiences and behaviour . Sometimes case studies involve ordinary individuals, as when developmental psychologist Jean Piaget used his observation of his own children to develop his stage theory of cognitive development. More frequently, case studies are conducted on individuals who have unusual or abnormal experiences or characteristics or who find themselves in particularly difficult or stressful situations. The assumption is that by carefully studying individuals who are socially marginal, who are experiencing unusual situations, or who are going through a difficult phase in their lives, we can learn something about human nature.

Sigmund Freud was a master of using the psychological difficulties of individuals to draw conclusions about basic psychological processes. Freud wrote case studies of some of his most interesting patients and used these careful examinations to develop his important theories of personality. One classic example is Freud’s description of “Little Hans,” a child whose fear of horses the psychoanalyst interpreted in terms of repressed sexual impulses and the Oedipus complex (Freud, 1909/1964).

Another well-known case study is Phineas Gage, a man whose thoughts and emotions were extensively studied by cognitive psychologists after a railroad spike was blasted through his skull in an accident. Although there are questions about the interpretation of this case study (Kotowicz, 2007), it did provide early evidence that the brain’s frontal lobe is involved in emotion and morality (Damasio et al., 2005). An interesting example of a case study in clinical psychology is described by Rokeach (1964), who investigated in detail the beliefs of and interactions among three patients with schizophrenia, all of whom were convinced they were Jesus Christ.

In other cases the data from descriptive research projects come in the form of a survey — a measure administered through either an interview or a written questionnaire to get a picture of the beliefs or behaviours of a sample of people of interest . The people chosen to participate in the research (known as the sample) are selected to be representative of all the people that the researcher wishes to know about (the population). In election polls, for instance, a sample is taken from the population of all “likely voters” in the upcoming elections.

The results of surveys may sometimes be rather mundane, such as “Nine out of 10 doctors prefer Tymenocin” or “The median income in the city of Hamilton is $46,712.” Yet other times (particularly in discussions of social behaviour), the results can be shocking: “More than 40,000 people are killed by gunfire in the United States every year” or “More than 60% of women between the ages of 50 and 60 suffer from depression.” Descriptive research is frequently used by psychologists to get an estimate of the prevalence (or incidence ) of psychological disorders.

A final type of descriptive research — known as naturalistic observation — is research based on the observation of everyday events . For instance, a developmental psychologist who watches children on a playground and describes what they say to each other while they play is conducting descriptive research, as is a biopsychologist who observes animals in their natural habitats. One example of observational research involves a systematic procedure known as the strange situation , used to get a picture of how adults and young children interact. The data that are collected in the strange situation are systematically coded in a coding sheet such as that shown in Table 3.3.

The results of descriptive research projects are analyzed using descriptive statistics — numbers that summarize the distribution of scores on a measured variable . Most variables have distributions similar to that shown in Figure 3.5 where most of the scores are located near the centre of the distribution, and the distribution is symmetrical and bell-shaped. A data distribution that is shaped like a bell is known as a normal distribution .

A distribution can be described in terms of its central tendency — that is, the point in the distribution around which the data are centred — and its dispersion, or spread . The arithmetic average, or arithmetic mean , symbolized by the letter M , is the most commonly used measure of central tendency . It is computed by calculating the sum of all the scores of the variable and dividing this sum by the number of participants in the distribution (denoted by the letter N ). In the data presented in Figure 3.5 the mean height of the students is 67.12 inches (170.5 cm). The sample mean is usually indicated by the letter M .

In some cases, however, the data distribution is not symmetrical. This occurs when there are one or more extreme scores (known as outliers ) at one end of the distribution. Consider, for instance, the variable of family income (see Figure 3.6), which includes an outlier (a value of $3,800,000). In this case the mean is not a good measure of central tendency. Although it appears from Figure 3.6 that the central tendency of the family income variable should be around $70,000, the mean family income is actually $223,960. The single very extreme income has a disproportionate impact on the mean, resulting in a value that does not well represent the central tendency.

The median is used as an alternative measure of central tendency when distributions are not symmetrical. The median  is the score in the center of the distribution, meaning that 50% of the scores are greater than the median and 50% of the scores are less than the median . In our case, the median household income ($73,000) is a much better indication of central tendency than is the mean household income ($223,960).

A final measure of central tendency, known as the mode , represents the value that occurs most frequently in the distribution . You can see from Figure 3.6 that the mode for the family income variable is $93,000 (it occurs four times).

In addition to summarizing the central tendency of a distribution, descriptive statistics convey information about how the scores of the variable are spread around the central tendency. Dispersion refers to the extent to which the scores are all tightly clustered around the central tendency , as seen in Figure 3.7.

Or they may be more spread out away from it, as seen in Figure 3.8.

One simple measure of dispersion is to find the largest (the maximum ) and the smallest (the minimum ) observed values of the variable and to compute the range of the variable as the maximum observed score minus the minimum observed score. You can check that the range of the height variable in Figure 3.5 is 72 – 62 = 10. The standard deviation , symbolized as s , is the most commonly used measure of dispersion . Distributions with a larger standard deviation have more spread. The standard deviation of the height variable is s = 2.74, and the standard deviation of the family income variable is s = $745,337.

An advantage of descriptive research is that it attempts to capture the complexity of everyday behaviour. Case studies provide detailed information about a single person or a small group of people, surveys capture the thoughts or reported behaviours of a large population of people, and naturalistic observation objectively records the behaviour of people or animals as it occurs naturally. Thus descriptive research is used to provide a relatively complete understanding of what is currently happening.

Despite these advantages, descriptive research has a distinct disadvantage in that, although it allows us to get an idea of what is currently happening, it is usually limited to static pictures. Although descriptions of particular experiences may be interesting, they are not always transferable to other individuals in other situations, nor do they tell us exactly why specific behaviours or events occurred. For instance, descriptions of individuals who have suffered a stressful event, such as a war or an earthquake, can be used to understand the individuals’ reactions to the event but cannot tell us anything about the long-term effects of the stress. And because there is no comparison group that did not experience the stressful situation, we cannot know what these individuals would be like if they hadn’t had the stressful experience.

Correlational Research: Seeking Relationships among Variables

In contrast to descriptive research, which is designed primarily to provide static pictures, correlational research involves the measurement of two or more relevant variables and an assessment of the relationship between or among those variables. For instance, the variables of height and weight are systematically related (correlated) because taller people generally weigh more than shorter people. In the same way, study time and memory errors are also related, because the more time a person is given to study a list of words, the fewer errors he or she will make. When there are two variables in the research design, one of them is called the predictor variable and the other the outcome variable . The research design can be visualized as shown in Figure 3.9, where the curved arrow represents the expected correlation between these two variables.

One way of organizing the data from a correlational study with two variables is to graph the values of each of the measured variables using a scatter plot . As you can see in Figure 3.10 a scatter plot  is a visual image of the relationship between two variables . A point is plotted for each individual at the intersection of his or her scores for the two variables. When the association between the variables on the scatter plot can be easily approximated with a straight line , as in parts (a) and (b) of Figure 3.10 the variables are said to have a linear relationship .

When the straight line indicates that individuals who have above-average values for one variable also tend to have above-average values for the other variable , as in part (a), the relationship is said to be positive linear . Examples of positive linear relationships include those between height and weight, between education and income, and between age and mathematical abilities in children. In each case, people who score higher on one of the variables also tend to score higher on the other variable. Negative linear relationships , in contrast, as shown in part (b), occur when above-average values for one variable tend to be associated with below-average values for the other variable. Examples of negative linear relationships include those between the age of a child and the number of diapers the child uses, and between practice on and errors made on a learning task. In these cases, people who score higher on one of the variables tend to score lower on the other variable.

Relationships between variables that cannot be described with a straight line are known as nonlinear relationships . Part (c) of Figure 3.10 shows a common pattern in which the distribution of the points is essentially random. In this case there is no relationship at all between the two variables, and they are said to be independent . Parts (d) and (e) of Figure 3.10 show patterns of association in which, although there is an association, the points are not well described by a single straight line. For instance, part (d) shows the type of relationship that frequently occurs between anxiety and performance. Increases in anxiety from low to moderate levels are associated with performance increases, whereas increases in anxiety from moderate to high levels are associated with decreases in performance. Relationships that change in direction and thus are not described by a single straight line are called curvilinear relationships .

The most common statistical measure of the strength of linear relationships among variables is the Pearson correlation coefficient , which is symbolized by the letter r . The value of the correlation coefficient ranges from r = –1.00 to r = +1.00. The direction of the linear relationship is indicated by the sign of the correlation coefficient. Positive values of r (such as r = .54 or r = .67) indicate that the relationship is positive linear (i.e., the pattern of the dots on the scatter plot runs from the lower left to the upper right), whereas negative values of r (such as r = –.30 or r = –.72) indicate negative linear relationships (i.e., the dots run from the upper left to the lower right). The strength of the linear relationship is indexed by the distance of the correlation coefficient from zero (its absolute value). For instance, r = –.54 is a stronger relationship than r = .30, and r = .72 is a stronger relationship than r = –.57. Because the Pearson correlation coefficient only measures linear relationships, variables that have curvilinear relationships are not well described by r , and the observed correlation will be close to zero.

It is also possible to study relationships among more than two measures at the same time. A research design in which more than one predictor variable is used to predict a single outcome variable is analyzed through multiple regression (Aiken & West, 1991).  Multiple regression  is a statistical technique, based on correlation coefficients among variables, that allows predicting a single outcome variable from more than one predictor variable . For instance, Figure 3.11 shows a multiple regression analysis in which three predictor variables (Salary, job satisfaction, and years employed) are used to predict a single outcome (job performance). The use of multiple regression analysis shows an important advantage of correlational research designs — they can be used to make predictions about a person’s likely score on an outcome variable (e.g., job performance) based on knowledge of other variables.

An important limitation of correlational research designs is that they cannot be used to draw conclusions about the causal relationships among the measured variables. Consider, for instance, a researcher who has hypothesized that viewing violent behaviour will cause increased aggressive play in children. He has collected, from a sample of Grade 4 children, a measure of how many violent television shows each child views during the week, as well as a measure of how aggressively each child plays on the school playground. From his collected data, the researcher discovers a positive correlation between the two measured variables.

Although this positive correlation appears to support the researcher’s hypothesis, it cannot be taken to indicate that viewing violent television causes aggressive behaviour. Although the researcher is tempted to assume that viewing violent television causes aggressive play, there are other possibilities. One alternative possibility is that the causal direction is exactly opposite from what has been hypothesized. Perhaps children who have behaved aggressively at school develop residual excitement that leads them to want to watch violent television shows at home (Figure 3.13):

Although this possibility may seem less likely, there is no way to rule out the possibility of such reverse causation on the basis of this observed correlation. It is also possible that both causal directions are operating and that the two variables cause each other (Figure 3.14).

Still another possible explanation for the observed correlation is that it has been produced by the presence of a common-causal variable (also known as a third variable ). A common-causal variable  is a variable that is not part of the research hypothesis but that causes both the predictor and the outcome variable and thus produces the observed correlation between them . In our example, a potential common-causal variable is the discipline style of the children’s parents. Parents who use a harsh and punitive discipline style may produce children who like to watch violent television and who also behave aggressively in comparison to children whose parents use less harsh discipline (Figure 3.15)

In this case, television viewing and aggressive play would be positively correlated (as indicated by the curved arrow between them), even though neither one caused the other but they were both caused by the discipline style of the parents (the straight arrows). When the predictor and outcome variables are both caused by a common-causal variable, the observed relationship between them is said to be spurious . A spurious relationship  is a relationship between two variables in which a common-causal variable produces and “explains away” the relationship . If effects of the common-causal variable were taken away, or controlled for, the relationship between the predictor and outcome variables would disappear. In the example, the relationship between aggression and television viewing might be spurious because by controlling for the effect of the parents’ disciplining style, the relationship between television viewing and aggressive behaviour might go away.

Common-causal variables in correlational research designs can be thought of as mystery variables because, as they have not been measured, their presence and identity are usually unknown to the researcher. Since it is not possible to measure every variable that could cause both the predictor and outcome variables, the existence of an unknown common-causal variable is always a possibility. For this reason, we are left with the basic limitation of correlational research: correlation does not demonstrate causation. It is important that when you read about correlational research projects, you keep in mind the possibility of spurious relationships, and be sure to interpret the findings appropriately. Although correlational research is sometimes reported as demonstrating causality without any mention being made of the possibility of reverse causation or common-causal variables, informed consumers of research, like you, are aware of these interpretational problems.

In sum, correlational research designs have both strengths and limitations. One strength is that they can be used when experimental research is not possible because the predictor variables cannot be manipulated. Correlational designs also have the advantage of allowing the researcher to study behaviour as it occurs in everyday life. And we can also use correlational designs to make predictions — for instance, to predict from the scores on their battery of tests the success of job trainees during a training session. But we cannot use such correlational information to determine whether the training caused better job performance. For that, researchers rely on experiments.

Experimental Research: Understanding the Causes of Behaviour

The goal of experimental research design is to provide more definitive conclusions about the causal relationships among the variables in the research hypothesis than is available from correlational designs. In an experimental research design, the variables of interest are called the independent variable (or variables ) and the dependent variable . The independent variable  in an experiment is the causing variable that is created (manipulated) by the experimenter . The dependent variable  in an experiment is a measured variable that is expected to be influenced by the experimental manipulation . The research hypothesis suggests that the manipulated independent variable or variables will cause changes in the measured dependent variables. We can diagram the research hypothesis by using an arrow that points in one direction. This demonstrates the expected direction of causality (Figure 3.16):

Research Focus: Video Games and Aggression

Consider an experiment conducted by Anderson and Dill (2000). The study was designed to test the hypothesis that viewing violent video games would increase aggressive behaviour. In this research, male and female undergraduates from Iowa State University were given a chance to play with either a violent video game (Wolfenstein 3D) or a nonviolent video game (Myst). During the experimental session, the participants played their assigned video games for 15 minutes. Then, after the play, each participant played a competitive game with an opponent in which the participant could deliver blasts of white noise through the earphones of the opponent. The operational definition of the dependent variable (aggressive behaviour) was the level and duration of noise delivered to the opponent. The design of the experiment is shown in Figure 3.17

Two advantages of the experimental research design are (a) the assurance that the independent variable (also known as the experimental manipulation ) occurs prior to the measured dependent variable, and (b) the creation of initial equivalence between the conditions of the experiment (in this case by using random assignment to conditions).

Experimental designs have two very nice features. For one, they guarantee that the independent variable occurs prior to the measurement of the dependent variable. This eliminates the possibility of reverse causation. Second, the influence of common-causal variables is controlled, and thus eliminated, by creating initial equivalence among the participants in each of the experimental conditions before the manipulation occurs.

The most common method of creating equivalence among the experimental conditions is through random assignment to conditions, a procedure in which the condition that each participant is assigned to is determined through a random process, such as drawing numbers out of an envelope or using a random number table . Anderson and Dill first randomly assigned about 100 participants to each of their two groups (Group A and Group B). Because they used random assignment to conditions, they could be confident that, before the experimental manipulation occurred, the students in Group A were, on average, equivalent to the students in Group B on every possible variable, including variables that are likely to be related to aggression, such as parental discipline style, peer relationships, hormone levels, diet — and in fact everything else.

Then, after they had created initial equivalence, Anderson and Dill created the experimental manipulation — they had the participants in Group A play the violent game and the participants in Group B play the nonviolent game. Then they compared the dependent variable (the white noise blasts) between the two groups, finding that the students who had viewed the violent video game gave significantly longer noise blasts than did the students who had played the nonviolent game.

Anderson and Dill had from the outset created initial equivalence between the groups. This initial equivalence allowed them to observe differences in the white noise levels between the two groups after the experimental manipulation, leading to the conclusion that it was the independent variable (and not some other variable) that caused these differences. The idea is that the only thing that was different between the students in the two groups was the video game they had played.

Despite the advantage of determining causation, experiments do have limitations. One is that they are often conducted in laboratory situations rather than in the everyday lives of people. Therefore, we do not know whether results that we find in a laboratory setting will necessarily hold up in everyday life. Second, and more important, is that some of the most interesting and key social variables cannot be experimentally manipulated. If we want to study the influence of the size of a mob on the destructiveness of its behaviour, or to compare the personality characteristics of people who join suicide cults with those of people who do not join such cults, these relationships must be assessed using correlational designs, because it is simply not possible to experimentally manipulate these variables.

Key Takeaways

  • Descriptive, correlational, and experimental research designs are used to collect and analyze data.
  • Descriptive designs include case studies, surveys, and naturalistic observation. The goal of these designs is to get a picture of the current thoughts, feelings, or behaviours in a given group of people. Descriptive research is summarized using descriptive statistics.
  • Correlational research designs measure two or more relevant variables and assess a relationship between or among them. The variables may be presented on a scatter plot to visually show the relationships. The Pearson Correlation Coefficient ( r ) is a measure of the strength of linear relationship between two variables.
  • Common-causal variables may cause both the predictor and outcome variable in a correlational design, producing a spurious relationship. The possibility of common-causal variables makes it impossible to draw causal conclusions from correlational research designs.
  • Experimental research involves the manipulation of an independent variable and the measurement of a dependent variable. Random assignment to conditions is normally used to create initial equivalence between the groups, allowing researchers to draw causal conclusions.

Exercises and Critical Thinking

  • There is a negative correlation between the row that a student sits in in a large class (when the rows are numbered from front to back) and his or her final grade in the class. Do you think this represents a causal relationship or a spurious relationship, and why?
  • Think of two variables (other than those mentioned in this book) that are likely to be correlated, but in which the correlation is probably spurious. What is the likely common-causal variable that is producing the relationship?
  • Imagine a researcher wants to test the hypothesis that participating in psychotherapy will cause a decrease in reported anxiety. Describe the type of research design the investigator might use to draw this conclusion. What would be the independent and dependent variables in the research?

Image Attributions

Figure 3.4: “ Reading newspaper ” by Alaskan Dude (http://commons.wikimedia.org/wiki/File:Reading_newspaper.jpg) is licensed under CC BY 2.0

Aiken, L., & West, S. (1991).  Multiple regression: Testing and interpreting interactions . Newbury Park, CA: Sage.

Ainsworth, M. S., Blehar, M. C., Waters, E., & Wall, S. (1978).  Patterns of attachment: A psychological study of the strange situation . Hillsdale, NJ: Lawrence Erlbaum Associates.

Anderson, C. A., & Dill, K. E. (2000). Video games and aggressive thoughts, feelings, and behavior in the laboratory and in life.  Journal of Personality and Social Psychology, 78 (4), 772–790.

Damasio, H., Grabowski, T., Frank, R., Galaburda, A. M., Damasio, A. R., Cacioppo, J. T., & Berntson, G. G. (2005). The return of Phineas Gage: Clues about the brain from the skull of a famous patient. In  Social neuroscience: Key readings.  (pp. 21–28). New York, NY: Psychology Press.

Freud, S. (1909/1964). Analysis of phobia in a five-year-old boy. In E. A. Southwell & M. Merbaum (Eds.),  Personality: Readings in theory and research  (pp. 3–32). Belmont, CA: Wadsworth. (Original work published 1909).

Kotowicz, Z. (2007). The strange case of Phineas Gage.  History of the Human Sciences, 20 (1), 115–131.

Rokeach, M. (1964).  The three Christs of Ypsilanti: A psychological study . New York, NY: Knopf.

Stangor, C. (2011). Research methods for the behavioural sciences (4th ed.). Mountain View, CA: Cengage.

Long Descriptions

Figure 3.6 long description: There are 25 families. 24 families have an income between $44,000 and $111,000 and one family has an income of $3,800,000. The mean income is $223,960 while the median income is $73,000. [Return to Figure 3.6]

Figure 3.10 long description: Types of scatter plots.

  • Positive linear, r=positive .82. The plots on the graph form a rough line that runs from lower left to upper right.
  • Negative linear, r=negative .70. The plots on the graph form a rough line that runs from upper left to lower right.
  • Independent, r=0.00. The plots on the graph are spread out around the centre.
  • Curvilinear, r=0.00. The plots of the graph form a rough line that goes up and then down like a hill.
  • Curvilinear, r=0.00. The plots on the graph for a rough line that goes down and then up like a ditch.

[Return to Figure 3.10]

Introduction to Psychology - 1st Canadian Edition Copyright © 2014 by Jennifer Walinga and Charles Stangor is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book

case study experimental psychology

Explore Psychology

What Is a Case Study in Psychology?

Categories Research Methods

A case study is a research method used in psychology to investigate a particular individual, group, or situation in depth . It involves a detailed analysis of the subject, gathering information from various sources such as interviews, observations, and documents.

In a case study, researchers aim to understand the complexities and nuances of the subject under investigation. They explore the individual’s thoughts, feelings, behaviors, and experiences to gain insights into specific psychological phenomena. 

This type of research can provide great detail regarding a particular case, allowing researchers to examine rare or unique situations that may not be easily replicated in a laboratory setting. They offer a holistic view of the subject, considering various factors influencing their behavior or mental processes. 

By examining individual cases, researchers can generate hypotheses, develop theories, and contribute to the existing body of knowledge in psychology. Case studies are often utilized in clinical psychology, where they can provide valuable insights into the diagnosis, treatment, and outcomes of specific psychological disorders. 

Case studies offer a comprehensive and in-depth understanding of complex psychological phenomena, providing researchers with valuable information to inform theory, practice, and future research.

Table of Contents

Examples of Case Studies in Psychology

Case studies in psychology provide real-life examples that illustrate psychological concepts and theories. They offer a detailed analysis of specific individuals, groups, or situations, allowing researchers to understand psychological phenomena better. Here are a few examples of case studies in psychology: 

Phineas Gage

This famous case study explores the effects of a traumatic brain injury on personality and behavior. A railroad construction worker, Phineas Gage survived a severe brain injury that dramatically changed his personality.

This case study helped researchers understand the role of the frontal lobe in personality and social behavior. 

Little Albert

Conducted by behaviorist John B. Watson, the Little Albert case study aimed to demonstrate classical conditioning. In this study, a young boy named Albert was conditioned to fear a white rat by pairing it with a loud noise.

This case study provided insights into the process of fear conditioning and the impact of early experiences on behavior. 

Genie’s case study focused on a girl who experienced extreme social isolation and deprivation during her childhood. This study shed light on the critical period for language development and the effects of severe neglect on cognitive and social functioning. 

These case studies highlight the value of in-depth analysis and provide researchers with valuable insights into various psychological phenomena. By examining specific cases, psychologists can uncover unique aspects of human behavior and contribute to the field’s knowledge and understanding.

Types of Case Studies in Psychology

Psychology case studies come in various forms, each serving a specific purpose in research and analysis. Understanding the different types of case studies can help researchers choose the most appropriate approach. 

Descriptive Case Studies

These studies aim to describe a particular individual, group, or situation. Researchers use descriptive case studies to explore and document specific characteristics, behaviors, or experiences.

For example, a descriptive case study may examine the life and experiences of a person with a rare psychological disorder. 

Exploratory Case Studies

Exploratory case studies are conducted when there is limited existing knowledge or understanding of a particular phenomenon. Researchers use these studies to gather preliminary information and generate hypotheses for further investigation.

Exploratory case studies often involve in-depth interviews, observations, and analysis of existing data. 

Explanatory Case Studies

These studies aim to explain the causal relationship between variables or events. Researchers use these studies to understand why certain outcomes occur and to identify the underlying mechanisms or processes.

Explanatory case studies often involve comparing multiple cases to identify common patterns or factors. 

Instrumental Case Studies

Instrumental case studies focus on using a particular case to gain insights into a broader issue or theory. Researchers select cases that are representative or critical in understanding the phenomenon of interest.

Instrumental case studies help researchers develop or refine theories and contribute to the general knowledge in the field. 

By utilizing different types of case studies, psychologists can explore various aspects of human behavior and gain a deeper understanding of psychological phenomena. Each type of case study offers unique advantages and contributes to the overall body of knowledge in psychology.

How to Collect Data for a Case Study

There are a variety of ways that researchers gather the data they need for a case study. Some sources include:

  • Directly observing the subject
  • Collecting information from archival records
  • Conducting interviews
  • Examining artifacts related to the subject
  • Examining documents that provide information about the subject

The way that this information is collected depends on the nature of the study itself

Prospective Research

In a prospective study, researchers observe the individual or group in question. These observations typically occur over a period of time and may be used to track the progress or progression of a phenomenon or treatment.

Retrospective Research

A retrospective case study involves looking back on a phenomenon. Researchers typically look at the outcome and then gather data to help them understand how the individual or group reached that point.

Benefits of a Case Study

Case studies offer several benefits in the field of psychology. They provide researchers with a unique opportunity to delve deep into specific individuals, groups, or situations, allowing for a comprehensive understanding of complex phenomena.

Case studies offer valuable insights that can inform theory development and practical applications by examining real-life examples. 

Complex Data

One of the key benefits of case studies is their ability to provide complex and detailed data. Researchers can gather in-depth information through various methods such as interviews, observations, and analysis of existing records.

This depth of data allows for a thorough exploration of the factors influencing behavior and the underlying mechanisms at play. 

Unique Data

Additionally, case studies allow researchers to study rare or unique cases that may not be easily replicated in experimental settings. This enables the examination of phenomena that are difficult to study through other psychology research methods . 

By focusing on specific cases, researchers can uncover patterns, identify causal relationships, and generate hypotheses for further investigation.

General Knowledge

Case studies can also contribute to the general knowledge of psychology by providing real-world examples that can be used to support or challenge existing theories. They offer a bridge between theory and practice, allowing researchers to apply theoretical concepts to real-life situations and vice versa. 

Case studies offer a range of benefits in psychology, including providing rich and detailed data, studying unique cases, and contributing to theory development. These benefits make case studies valuable in understanding human behavior and psychological phenomena.

Limitations of a Case Study

While case studies offer numerous benefits in the field of psychology, they also have certain limitations that researchers need to consider. Understanding these limitations is crucial for interpreting the findings and generalizing the results. 

Lack of Generalizability

One limitation of case studies is the issue of generalizability. Since case studies focus on specific individuals, groups, and situations, applying the findings to a larger population can be challenging. The unique characteristics and circumstances of the case may not be representative of the broader population, making it difficult to draw universal conclusions. 

Researcher bias is another possible limitation. The researcher’s subjective interpretation and personal beliefs can influence the data collection, analysis, and interpretation process. This bias can affect the objectivity and reliability of the findings, raising questions about the study’s validity. 

Case studies are often time-consuming and resource-intensive. They require extensive data collection, analysis, and interpretation, which can be lengthy. This can limit the number of cases that can be studied and may result in a smaller sample size, reducing the study’s statistical power. 

Case studies are retrospective in nature, relying on past events and experiences. This reliance on memory and self-reporting can introduce recall bias and inaccuracies in the data. Participants may forget or misinterpret certain details, leading to incomplete or unreliable information.

Despite these limitations, case studies remain a valuable research tool in psychology. By acknowledging and addressing these limitations, researchers can enhance the validity and reliability of their findings, contributing to a more comprehensive understanding of human behavior and psychological phenomena. 

While case studies have limitations, they remain valuable when researchers acknowledge and address these concerns, leading to more reliable and valid findings in psychology.

Alpi, K. M., & Evans, J. J. (2019). Distinguishing case study as a research method from case reports as a publication type. Journal of the Medical Library Association , 107(1). https://doi.org/10.5195/jmla.2019.615

Crowe, S., Cresswell, K., Robertson, A., Huby, G., Avery, A., & Sheikh, A. (2011). The case study approach. BMC Medical Research Methodology , 11(1), 100. https://doi.org/10.1186/1471-2288-11-100

Paparini, S., Green, J., Papoutsi, C., Murdoch, J., Petticrew, M., Greenhalgh, T., Hanckel, B., & Shaw, S. (2020). Case study research for better evaluations of complex interventions: Rationale and challenges. BMC Medicine , 18(1), 301. https://doi.org/10.1186/s12916-020-01777-6

Willemsen, J. (2023). What is preventing psychotherapy case studies from having a greater impact on evidence-based practice, and how to address the challenges? Frontiers in Psychiatry , 13, 1101090. https://doi.org/10.3389/fpsyt.2022.1101090

Yin, Robert K. Case Study Research and Applications: Design and Methods . United States, SAGE Publications, 2017.

Experimental Psychology

  • Living reference work entry
  • First Online: 18 April 2024
  • Cite this living reference work entry

case study experimental psychology

  • Chen Shuyong 2 &
  • He Dongjun 3  

Experimental psychology refers to the branch of psychology concerning the basic process and design method of experimental research and its application in the field of basic research and practice of psychology.

Brief History

Before the nineteenth century, the problems of psychology were mostly discussed in the field of philosophy, using the method of speculation and empirical generalization. There was a prevailing belief that the experimental method was not applicable for the study of psychological phenomena. After the Renaissance, the materialist philosophical trend of ideas and the development of natural science in Europe gave birth to the experimental psychology at the end of the nineteenth century. The former includes John Locke’s materialist empiricism, David Hartley’s associationism, and Julien Offray de La Mettrie’s mechanical materialism. The latter includes the research on nerve conduction in physiology, the debate on brain function localization and the establishment of the...

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Further Reading

Kantowitz BH, Roediger HL, Elmes DG (2015) Experimental psychology, 10th edn. Cengage Learning, Boston

Google Scholar  

Zhang X-M, Hua S (2014) Experimental psychology. Beijing Normal University Publishing Group, Beijing

Download references

Author information

Authors and affiliations.

School of Psychological and Cognitive Sciences, Peking University, Beijing, China

Chen Shuyong

School of Psychology, Chengdu Medical University, Chengdu, China

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to He Dongjun .

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this entry

Cite this entry.

Shuyong, C., Dongjun, H. (2024). Experimental Psychology. In: The ECPH Encyclopedia of Psychology. Springer, Singapore. https://doi.org/10.1007/978-981-99-6000-2_753-1

Download citation

DOI : https://doi.org/10.1007/978-981-99-6000-2_753-1

Received : 23 March 2024

Accepted : 25 March 2024

Published : 18 April 2024

Publisher Name : Springer, Singapore

Print ISBN : 978-981-99-6000-2

Online ISBN : 978-981-99-6000-2

eBook Packages : Springer Reference Behavioral Science and Psychology Reference Module Humanities and Social Sciences Reference Module Business, Economics and Social Sciences

  • Publish with us

Policies and ethics

  • Find a journal
  • Track your research
  • The 25 Most Influential Psychological Experiments in History

Most Influential Psychological Experiments in History

While each year thousands and thousands of studies are completed in the many specialty areas of psychology, there are a handful that, over the years, have had a lasting impact in the psychological community as a whole. Some of these were dutifully conducted, keeping within the confines of ethical and practical guidelines. Others pushed the boundaries of human behavior during their psychological experiments and created controversies that still linger to this day. And still others were not designed to be true psychological experiments, but ended up as beacons to the psychological community in proving or disproving theories.

This is a list of the 25 most influential psychological experiments still being taught to psychology students of today.

1. A Class Divided

Study conducted by: jane elliott.

Study Conducted in 1968 in an Iowa classroom

A Class Divided Study Conducted By: Jane Elliott

Experiment Details: Jane Elliott’s famous experiment was inspired by the assassination of Dr. Martin Luther King Jr. and the inspirational life that he led. The third grade teacher developed an exercise, or better yet, a psychological experiment, to help her Caucasian students understand the effects of racism and prejudice.

Elliott divided her class into two separate groups: blue-eyed students and brown-eyed students. On the first day, she labeled the blue-eyed group as the superior group and from that point forward they had extra privileges, leaving the brown-eyed children to represent the minority group. She discouraged the groups from interacting and singled out individual students to stress the negative characteristics of the children in the minority group. What this exercise showed was that the children’s behavior changed almost instantaneously. The group of blue-eyed students performed better academically and even began bullying their brown-eyed classmates. The brown-eyed group experienced lower self-confidence and worse academic performance. The next day, she reversed the roles of the two groups and the blue-eyed students became the minority group.

At the end of the experiment, the children were so relieved that they were reported to have embraced one another and agreed that people should not be judged based on outward appearances. This exercise has since been repeated many times with similar outcomes.

For more information click here

2. Asch Conformity Study

Study conducted by: dr. solomon asch.

Study Conducted in 1951 at Swarthmore College

Asch Conformity Study

Experiment Details: Dr. Solomon Asch conducted a groundbreaking study that was designed to evaluate a person’s likelihood to conform to a standard when there is pressure to do so.

A group of participants were shown pictures with lines of various lengths and were then asked a simple question: Which line is longest? The tricky part of this study was that in each group only one person was a true participant. The others were actors with a script. Most of the actors were instructed to give the wrong answer. Strangely, the one true participant almost always agreed with the majority, even though they knew they were giving the wrong answer.

The results of this study are important when we study social interactions among individuals in groups. This study is a famous example of the temptation many of us experience to conform to a standard during group situations and it showed that people often care more about being the same as others than they do about being right. It is still recognized as one of the most influential psychological experiments for understanding human behavior.

3. Bobo Doll Experiment

Study conducted by: dr. alburt bandura.

Study Conducted between 1961-1963 at Stanford University

Bobo Doll Experiment

In his groundbreaking study he separated participants into three groups:

  • one was exposed to a video of an adult showing aggressive behavior towards a Bobo doll
  • another was exposed to video of a passive adult playing with the Bobo doll
  • the third formed a control group

Children watched their assigned video and then were sent to a room with the same doll they had seen in the video (with the exception of those in the control group). What the researcher found was that children exposed to the aggressive model were more likely to exhibit aggressive behavior towards the doll themselves. The other groups showed little imitative aggressive behavior. For those children exposed to the aggressive model, the number of derivative physical aggressions shown by the boys was 38.2 and 12.7 for the girls.

The study also showed that boys exhibited more aggression when exposed to aggressive male models than boys exposed to aggressive female models. When exposed to aggressive male models, the number of aggressive instances exhibited by boys averaged 104. This is compared to 48.4 aggressive instances exhibited by boys who were exposed to aggressive female models.

While the results for the girls show similar findings, the results were less drastic. When exposed to aggressive female models, the number of aggressive instances exhibited by girls averaged 57.7. This is compared to 36.3 aggressive instances exhibited by girls who were exposed to aggressive male models. The results concerning gender differences strongly supported Bandura’s secondary prediction that children will be more strongly influenced by same-sex models. The Bobo Doll Experiment showed a groundbreaking way to study human behavior and it’s influences.

4. Car Crash Experiment

Study conducted by: elizabeth loftus and john palmer.

Study Conducted in 1974 at The University of California in Irvine

Car Crash Experiment

The participants watched slides of a car accident and were asked to describe what had happened as if they were eyewitnesses to the scene. The participants were put into two groups and each group was questioned using different wording such as “how fast was the car driving at the time of impact?” versus “how fast was the car going when it smashed into the other car?” The experimenters found that the use of different verbs affected the participants’ memories of the accident, showing that memory can be easily distorted.

This research suggests that memory can be easily manipulated by questioning technique. This means that information gathered after the event can merge with original memory causing incorrect recall or reconstructive memory. The addition of false details to a memory of an event is now referred to as confabulation. This concept has very important implications for the questions used in police interviews of eyewitnesses.

5. Cognitive Dissonance Experiment

Study conducted by: leon festinger and james carlsmith.

Study Conducted in 1957 at Stanford University

Experiment Details: The concept of cognitive dissonance refers to a situation involving conflicting:

This conflict produces an inherent feeling of discomfort leading to a change in one of the attitudes, beliefs or behaviors to minimize or eliminate the discomfort and restore balance.

Cognitive dissonance was first investigated by Leon Festinger, after an observational study of a cult that believed that the earth was going to be destroyed by a flood. Out of this study was born an intriguing experiment conducted by Festinger and Carlsmith where participants were asked to perform a series of dull tasks (such as turning pegs in a peg board for an hour). Participant’s initial attitudes toward this task were highly negative.

They were then paid either $1 or $20 to tell a participant waiting in the lobby that the tasks were really interesting. Almost all of the participants agreed to walk into the waiting room and persuade the next participant that the boring experiment would be fun. When the participants were later asked to evaluate the experiment, the participants who were paid only $1 rated the tedious task as more fun and enjoyable than the participants who were paid $20 to lie.

Being paid only $1 is not sufficient incentive for lying and so those who were paid $1 experienced dissonance. They could only overcome that cognitive dissonance by coming to believe that the tasks really were interesting and enjoyable. Being paid $20 provides a reason for turning pegs and there is therefore no dissonance.

6. Fantz’s Looking Chamber

Study conducted by: robert l. fantz.

Study Conducted in 1961 at the University of Illinois

Experiment Details: The study conducted by Robert L. Fantz is among the simplest, yet most important in the field of infant development and vision. In 1961, when this experiment was conducted, there very few ways to study what was going on in the mind of an infant. Fantz realized that the best way was to simply watch the actions and reactions of infants. He understood the fundamental factor that if there is something of interest near humans, they generally look at it.

To test this concept, Fantz set up a display board with two pictures attached. On one was a bulls-eye. On the other was the sketch of a human face. This board was hung in a chamber where a baby could lie safely underneath and see both images. Then, from behind the board, invisible to the baby, he peeked through a hole to watch what the baby looked at. This study showed that a two-month old baby looked twice as much at the human face as it did at the bulls-eye. This suggests that human babies have some powers of pattern and form selection. Before this experiment it was thought that babies looked out onto a chaotic world of which they could make little sense.

7. Hawthorne Effect

Study conducted by: henry a. landsberger.

Study Conducted in 1955 at Hawthorne Works in Chicago, Illinois

Hawthorne Effect

Landsberger performed the study by analyzing data from experiments conducted between 1924 and 1932, by Elton Mayo, at the Hawthorne Works near Chicago. The company had commissioned studies to evaluate whether the level of light in a building changed the productivity of the workers. What Mayo found was that the level of light made no difference in productivity. The workers increased their output whenever the amount of light was switched from a low level to a high level, or vice versa.

The researchers noticed a tendency that the workers’ level of efficiency increased when any variable was manipulated. The study showed that the output changed simply because the workers were aware that they were under observation. The conclusion was that the workers felt important because they were pleased to be singled out. They increased productivity as a result. Being singled out was the factor dictating increased productivity, not the changing lighting levels, or any of the other factors that they experimented upon.

The Hawthorne Effect has become one of the hardest inbuilt biases to eliminate or factor into the design of any experiment in psychology and beyond.

8. Kitty Genovese Case

Study conducted by: new york police force.

Study Conducted in 1964 in New York City

Experiment Details: The murder case of Kitty Genovese was never intended to be a psychological experiment, however it ended up having serious implications for the field.

According to a New York Times article, almost 40 neighbors witnessed Kitty Genovese being savagely attacked and murdered in Queens, New York in 1964. Not one neighbor called the police for help. Some reports state that the attacker briefly left the scene and later returned to “finish off” his victim. It was later uncovered that many of these facts were exaggerated. (There were more likely only a dozen witnesses and records show that some calls to police were made).

What this case later become famous for is the “Bystander Effect,” which states that the more bystanders that are present in a social situation, the less likely it is that anyone will step in and help. This effect has led to changes in medicine, psychology and many other areas. One famous example is the way CPR is taught to new learners. All students in CPR courses learn that they must assign one bystander the job of alerting authorities which minimizes the chances of no one calling for assistance.

9. Learned Helplessness Experiment

Study conducted by: martin seligman.

Study Conducted in 1967 at the University of Pennsylvania

Learned Helplessness Experiment

Seligman’s experiment involved the ringing of a bell and then the administration of a light shock to a dog. After a number of pairings, the dog reacted to the shock even before it happened. As soon as the dog heard the bell, he reacted as though he’d already been shocked.

During the course of this study something unexpected happened. Each dog was placed in a large crate that was divided down the middle with a low fence. The dog could see and jump over the fence easily. The floor on one side of the fence was electrified, but not on the other side of the fence. Seligman placed each dog on the electrified side and administered a light shock. He expected the dog to jump to the non-shocking side of the fence. In an unexpected turn, the dogs simply laid down.

The hypothesis was that as the dogs learned from the first part of the experiment that there was nothing they could do to avoid the shocks, they gave up in the second part of the experiment. To prove this hypothesis the experimenters brought in a new set of animals and found that dogs with no history in the experiment would jump over the fence.

This condition was described as learned helplessness. A human or animal does not attempt to get out of a negative situation because the past has taught them that they are helpless.

10. Little Albert Experiment

Study conducted by: john b. watson and rosalie rayner.

Study Conducted in 1920 at Johns Hopkins University

Little Albert Experiment

The experiment began by placing a white rat in front of the infant, who initially had no fear of the animal. Watson then produced a loud sound by striking a steel bar with a hammer every time little Albert was presented with the rat. After several pairings (the noise and the presentation of the white rat), the boy began to cry and exhibit signs of fear every time the rat appeared in the room. Watson also created similar conditioned reflexes with other common animals and objects (rabbits, Santa beard, etc.) until Albert feared them all.

This study proved that classical conditioning works on humans. One of its most important implications is that adult fears are often connected to early childhood experiences.

11. Magical Number Seven

Study conducted by: george a. miller.

Study Conducted in 1956 at Princeton University

Experiment Details:   Frequently referred to as “ Miller’s Law,” the Magical Number Seven experiment purports that the number of objects an average human can hold in working memory is 7 ± 2. This means that the human memory capacity typically includes strings of words or concepts ranging from 5-9. This information on the limits to the capacity for processing information became one of the most highly cited papers in psychology.

The Magical Number Seven Experiment was published in 1956 by cognitive psychologist George A. Miller of Princeton University’s Department of Psychology in Psychological Review .  In the article, Miller discussed a concurrence between the limits of one-dimensional absolute judgment and the limits of short-term memory.

In a one-dimensional absolute-judgment task, a person is presented with a number of stimuli that vary on one dimension (such as 10 different tones varying only in pitch). The person responds to each stimulus with a corresponding response (learned before).

Performance is almost perfect up to five or six different stimuli but declines as the number of different stimuli is increased. This means that a human’s maximum performance on one-dimensional absolute judgment can be described as an information store with the maximum capacity of approximately 2 to 3 bits of information There is the ability to distinguish between four and eight alternatives.

12. Pavlov’s Dog Experiment

Study conducted by: ivan pavlov.

Study Conducted in the 1890s at the Military Medical Academy in St. Petersburg, Russia

Pavlov’s Dog Experiment

Pavlov began with the simple idea that there are some things that a dog does not need to learn. He observed that dogs do not learn to salivate when they see food. This reflex is “hard wired” into the dog. This is an unconditioned response (a stimulus-response connection that required no learning).

Pavlov outlined that there are unconditioned responses in the animal by presenting a dog with a bowl of food and then measuring its salivary secretions. In the experiment, Pavlov used a bell as his neutral stimulus. Whenever he gave food to his dogs, he also rang a bell. After a number of repeats of this procedure, he tried the bell on its own. What he found was that the bell on its own now caused an increase in salivation. The dog had learned to associate the bell and the food. This learning created a new behavior. The dog salivated when he heard the bell. Because this response was learned (or conditioned), it is called a conditioned response. The neutral stimulus has become a conditioned stimulus.

This theory came to be known as classical conditioning.

13. Robbers Cave Experiment

Study conducted by: muzafer and carolyn sherif.

Study Conducted in 1954 at the University of Oklahoma

Experiment Details: This experiment, which studied group conflict, is considered by most to be outside the lines of what is considered ethically sound.

In 1954 researchers at the University of Oklahoma assigned 22 eleven- and twelve-year-old boys from similar backgrounds into two groups. The two groups were taken to separate areas of a summer camp facility where they were able to bond as social units. The groups were housed in separate cabins and neither group knew of the other’s existence for an entire week. The boys bonded with their cabin mates during that time. Once the two groups were allowed to have contact, they showed definite signs of prejudice and hostility toward each other even though they had only been given a very short time to develop their social group. To increase the conflict between the groups, the experimenters had them compete against each other in a series of activities. This created even more hostility and eventually the groups refused to eat in the same room. The final phase of the experiment involved turning the rival groups into friends. The fun activities the experimenters had planned like shooting firecrackers and watching movies did not initially work, so they created teamwork exercises where the two groups were forced to collaborate. At the end of the experiment, the boys decided to ride the same bus home, demonstrating that conflict can be resolved and prejudice overcome through cooperation.

Many critics have compared this study to Golding’s Lord of the Flies novel as a classic example of prejudice and conflict resolution.

14. Ross’ False Consensus Effect Study

Study conducted by: lee ross.

Study Conducted in 1977 at Stanford University

Experiment Details: In 1977, a social psychology professor at Stanford University named Lee Ross conducted an experiment that, in lay terms, focuses on how people can incorrectly conclude that others think the same way they do, or form a “false consensus” about the beliefs and preferences of others. Ross conducted the study in order to outline how the “false consensus effect” functions in humans.

Featured Programs

In the first part of the study, participants were asked to read about situations in which a conflict occurred and then were told two alternative ways of responding to the situation. They were asked to do three things:

  • Guess which option other people would choose
  • Say which option they themselves would choose
  • Describe the attributes of the person who would likely choose each of the two options

What the study showed was that most of the subjects believed that other people would do the same as them, regardless of which of the two responses they actually chose themselves. This phenomenon is referred to as the false consensus effect, where an individual thinks that other people think the same way they do when they may not. The second observation coming from this important study is that when participants were asked to describe the attributes of the people who will likely make the choice opposite of their own, they made bold and sometimes negative predictions about the personalities of those who did not share their choice.

15. The Schacter and Singer Experiment on Emotion

Study conducted by: stanley schachter and jerome e. singer.

Study Conducted in 1962 at Columbia University

Experiment Details: In 1962 Schachter and Singer conducted a ground breaking experiment to prove their theory of emotion.

In the study, a group of 184 male participants were injected with epinephrine, a hormone that induces arousal including increased heartbeat, trembling, and rapid breathing. The research participants were told that they were being injected with a new medication to test their eyesight. The first group of participants was informed the possible side effects that the injection might cause while the second group of participants were not. The participants were then placed in a room with someone they thought was another participant, but was actually a confederate in the experiment. The confederate acted in one of two ways: euphoric or angry. Participants who had not been informed about the effects of the injection were more likely to feel either happier or angrier than those who had been informed.

What Schachter and Singer were trying to understand was the ways in which cognition or thoughts influence human emotion. Their study illustrates the importance of how people interpret their physiological states, which form an important component of your emotions. Though their cognitive theory of emotional arousal dominated the field for two decades, it has been criticized for two main reasons: the size of the effect seen in the experiment was not that significant and other researchers had difficulties repeating the experiment.

16. Selective Attention / Invisible Gorilla Experiment

Study conducted by: daniel simons and christopher chabris.

Study Conducted in 1999 at Harvard University

Experiment Details: In 1999 Simons and Chabris conducted their famous awareness test at Harvard University.

Participants in the study were asked to watch a video and count how many passes occurred between basketball players on the white team. The video moves at a moderate pace and keeping track of the passes is a relatively easy task. What most people fail to notice amidst their counting is that in the middle of the test, a man in a gorilla suit walked onto the court and stood in the center before walking off-screen.

The study found that the majority of the subjects did not notice the gorilla at all, proving that humans often overestimate their ability to effectively multi-task. What the study set out to prove is that when people are asked to attend to one task, they focus so strongly on that element that they may miss other important details.

17. Stanford Prison Study

Study conducted by philip zimbardo.

Study Conducted in 1971 at Stanford University

Stanford Prison Study

The Stanford Prison Experiment was designed to study behavior of “normal” individuals when assigned a role of prisoner or guard. College students were recruited to participate. They were assigned roles of “guard” or “inmate.”  Zimbardo played the role of the warden. The basement of the psychology building was the set of the prison. Great care was taken to make it look and feel as realistic as possible.

The prison guards were told to run a prison for two weeks. They were told not to physically harm any of the inmates during the study. After a few days, the prison guards became very abusive verbally towards the inmates. Many of the prisoners became submissive to those in authority roles. The Stanford Prison Experiment inevitably had to be cancelled because some of the participants displayed troubling signs of breaking down mentally.

Although the experiment was conducted very unethically, many psychologists believe that the findings showed how much human behavior is situational. People will conform to certain roles if the conditions are right. The Stanford Prison Experiment remains one of the most famous psychology experiments of all time.

18. Stanley Milgram Experiment

Study conducted by stanley milgram.

Study Conducted in 1961 at Stanford University

Experiment Details: This 1961 study was conducted by Yale University psychologist Stanley Milgram. It was designed to measure people’s willingness to obey authority figures when instructed to perform acts that conflicted with their morals. The study was based on the premise that humans will inherently take direction from authority figures from very early in life.

Participants were told they were participating in a study on memory. They were asked to watch another person (an actor) do a memory test. They were instructed to press a button that gave an electric shock each time the person got a wrong answer. (The actor did not actually receive the shocks, but pretended they did).

Participants were told to play the role of “teacher” and administer electric shocks to “the learner,” every time they answered a question incorrectly. The experimenters asked the participants to keep increasing the shocks. Most of them obeyed even though the individual completing the memory test appeared to be in great pain. Despite these protests, many participants continued the experiment when the authority figure urged them to. They increased the voltage after each wrong answer until some eventually administered what would be lethal electric shocks.

This experiment showed that humans are conditioned to obey authority and will usually do so even if it goes against their natural morals or common sense.

19. Surrogate Mother Experiment

Study conducted by: harry harlow.

Study Conducted from 1957-1963 at the University of Wisconsin

Experiment Details: In a series of controversial experiments during the late 1950s and early 1960s, Harry Harlow studied the importance of a mother’s love for healthy childhood development.

In order to do this he separated infant rhesus monkeys from their mothers a few hours after birth and left them to be raised by two “surrogate mothers.” One of the surrogates was made of wire with an attached bottle for food. The other was made of soft terrycloth but lacked food. The researcher found that the baby monkeys spent much more time with the cloth mother than the wire mother, thereby proving that affection plays a greater role than sustenance when it comes to childhood development. They also found that the monkeys that spent more time cuddling the soft mother grew up to healthier.

This experiment showed that love, as demonstrated by physical body contact, is a more important aspect of the parent-child bond than the provision of basic needs. These findings also had implications in the attachment between fathers and their infants when the mother is the source of nourishment.

20. The Good Samaritan Experiment

Study conducted by: john darley and daniel batson.

Study Conducted in 1973 at The Princeton Theological Seminary (Researchers were from Princeton University)

Experiment Details: In 1973, an experiment was created by John Darley and Daniel Batson, to investigate the potential causes that underlie altruistic behavior. The researchers set out three hypotheses they wanted to test:

  • People thinking about religion and higher principles would be no more inclined to show helping behavior than laymen.
  • People in a rush would be much less likely to show helping behavior.
  • People who are religious for personal gain would be less likely to help than people who are religious because they want to gain some spiritual and personal insights into the meaning of life.

Student participants were given some religious teaching and instruction. They were then were told to travel from one building to the next. Between the two buildings was a man lying injured and appearing to be in dire need of assistance. The first variable being tested was the degree of urgency impressed upon the subjects, with some being told not to rush and others being informed that speed was of the essence.

The results of the experiment were intriguing, with the haste of the subject proving to be the overriding factor. When the subject was in no hurry, nearly two-thirds of people stopped to lend assistance. When the subject was in a rush, this dropped to one in ten.

People who were on the way to deliver a speech about helping others were nearly twice as likely to help as those delivering other sermons,. This showed that the thoughts of the individual were a factor in determining helping behavior. Religious beliefs did not appear to make much difference on the results. Being religious for personal gain, or as part of a spiritual quest, did not appear to make much of an impact on the amount of helping behavior shown.

21. The Halo Effect Experiment

Study conducted by: richard e. nisbett and timothy decamp wilson.

Study Conducted in 1977 at the University of Michigan

Experiment Details: The Halo Effect states that people generally assume that people who are physically attractive are more likely to:

  • be intelligent
  • be friendly
  • display good judgment

To prove their theory, Nisbett and DeCamp Wilson created a study to prove that people have little awareness of the nature of the Halo Effect. They’re not aware that it influences:

  • their personal judgments
  • the production of a more complex social behavior

In the experiment, college students were the research participants. They were asked to evaluate a psychology instructor as they view him in a videotaped interview. The students were randomly assigned to one of two groups. Each group was shown one of two different interviews with the same instructor. The instructor is a native French-speaking Belgian who spoke English with a noticeable accent. In the first video, the instructor presented himself as someone:

  • respectful of his students’ intelligence and motives
  • flexible in his approach to teaching
  • enthusiastic about his subject matter

In the second interview, he presented himself as much more unlikable. He was cold and distrustful toward the students and was quite rigid in his teaching style.

After watching the videos, the subjects were asked to rate the lecturer on:

  • physical appearance

His mannerisms and accent were kept the same in both versions of videos. The subjects were asked to rate the professor on an 8-point scale ranging from “like extremely” to “dislike extremely.” Subjects were also told that the researchers were interested in knowing “how much their liking for the teacher influenced the ratings they just made.” Other subjects were asked to identify how much the characteristics they just rated influenced their liking of the teacher.

After responding to the questionnaire, the respondents were puzzled about their reactions to the videotapes and to the questionnaire items. The students had no idea why they gave one lecturer higher ratings. Most said that how much they liked the lecturer had not affected their evaluation of his individual characteristics at all.

The interesting thing about this study is that people can understand the phenomenon, but they are unaware when it is occurring. Without realizing it, humans make judgments. Even when it is pointed out, they may still deny that it is a product of the halo effect phenomenon.

22. The Marshmallow Test

Study conducted by: walter mischel.

Study Conducted in 1972 at Stanford University

The Marshmallow Test

In his 1972 Marshmallow Experiment, children ages four to six were taken into a room where a marshmallow was placed in front of them on a table. Before leaving each of the children alone in the room, the experimenter informed them that they would receive a second marshmallow if the first one was still on the table after they returned in 15 minutes. The examiner recorded how long each child resisted eating the marshmallow and noted whether it correlated with the child’s success in adulthood. A small number of the 600 children ate the marshmallow immediately and one-third delayed gratification long enough to receive the second marshmallow.

In follow-up studies, Mischel found that those who deferred gratification were significantly more competent and received higher SAT scores than their peers. This characteristic likely remains with a person for life. While this study seems simplistic, the findings outline some of the foundational differences in individual traits that can predict success.

23. The Monster Study

Study conducted by: wendell johnson.

Study Conducted in 1939 at the University of Iowa

Experiment Details: The Monster Study received this negative title due to the unethical methods that were used to determine the effects of positive and negative speech therapy on children.

Wendell Johnson of the University of Iowa selected 22 orphaned children, some with stutters and some without. The children were in two groups. The group of children with stutters was placed in positive speech therapy, where they were praised for their fluency. The non-stutterers were placed in negative speech therapy, where they were disparaged for every mistake in grammar that they made.

As a result of the experiment, some of the children who received negative speech therapy suffered psychological effects and retained speech problems for the rest of their lives. They were examples of the significance of positive reinforcement in education.

The initial goal of the study was to investigate positive and negative speech therapy. However, the implication spanned much further into methods of teaching for young children.

24. Violinist at the Metro Experiment

Study conducted by: staff at the washington post.

Study Conducted in 2007 at a Washington D.C. Metro Train Station

Grammy-winning musician, Joshua Bell

During the study, pedestrians rushed by without realizing that the musician playing at the entrance to the metro stop was Grammy-winning musician, Joshua Bell. Two days before playing in the subway, he sold out at a theater in Boston where the seats average $100. He played one of the most intricate pieces ever written with a violin worth 3.5 million dollars. In the 45 minutes the musician played his violin, only 6 people stopped and stayed for a while. Around 20 gave him money, but continued to walk their normal pace. He collected $32.

The study and the subsequent article organized by the Washington Post was part of a social experiment looking at:

  • the priorities of people

Gene Weingarten wrote about the social experiment: “In a banal setting at an inconvenient time, would beauty transcend?” Later he won a Pulitzer Prize for his story. Some of the questions the article addresses are:

  • Do we perceive beauty?
  • Do we stop to appreciate it?
  • Do we recognize the talent in an unexpected context?

As it turns out, many of us are not nearly as perceptive to our environment as we might like to think.

25. Visual Cliff Experiment

Study conducted by: eleanor gibson and richard walk.

Study Conducted in 1959 at Cornell University

Experiment Details: In 1959, psychologists Eleanor Gibson and Richard Walk set out to study depth perception in infants. They wanted to know if depth perception is a learned behavior or if it is something that we are born with. To study this, Gibson and Walk conducted the visual cliff experiment.

They studied 36 infants between the ages of six and 14 months, all of whom could crawl. The infants were placed one at a time on a visual cliff. A visual cliff was created using a large glass table that was raised about a foot off the floor. Half of the glass table had a checker pattern underneath in order to create the appearance of a ‘shallow side.’

In order to create a ‘deep side,’ a checker pattern was created on the floor; this side is the visual cliff. The placement of the checker pattern on the floor creates the illusion of a sudden drop-off. Researchers placed a foot-wide centerboard between the shallow side and the deep side. Gibson and Walk found the following:

  • Nine of the infants did not move off the centerboard.
  • All of the 27 infants who did move crossed into the shallow side when their mothers called them from the shallow side.
  • Three of the infants crawled off the visual cliff toward their mother when called from the deep side.
  • When called from the deep side, the remaining 24 children either crawled to the shallow side or cried because they could not cross the visual cliff and make it to their mother.

What this study helped demonstrate is that depth perception is likely an inborn train in humans.

Among these experiments and psychological tests, we see boundaries pushed and theories taking on a life of their own. It is through the endless stream of psychological experimentation that we can see simple hypotheses become guiding theories for those in this field. The greater field of psychology became a formal field of experimental study in 1879, when Wilhelm Wundt established the first laboratory dedicated solely to psychological research in Leipzig, Germany. Wundt was the first person to refer to himself as a psychologist. Since 1879, psychology has grown into a massive collection of:

  • methods of practice

It’s also a specialty area in the field of healthcare. None of this would have been possible without these and many other important psychological experiments that have stood the test of time.

  • 20 Most Unethical Experiments in Psychology
  • What Careers are in Experimental Psychology?
  • 10 Things to Know About the Psychology of Psychotherapy

About Education: Psychology

Explorable.com

Mental Floss.com

About the Author

After earning a Bachelor of Arts in Psychology from Rutgers University and then a Master of Science in Clinical and Forensic Psychology from Drexel University, Kristen began a career as a therapist at two prisons in Philadelphia. At the same time she volunteered as a rape crisis counselor, also in Philadelphia. After a few years in the field she accepted a teaching position at a local college where she currently teaches online psychology courses. Kristen began writing in college and still enjoys her work as a writer, editor, professor and mother.

  • 5 Best Online Ph.D. Marriage and Family Counseling Programs
  • Top 5 Online Doctorate in Educational Psychology
  • 5 Best Online Ph.D. in Industrial and Organizational Psychology Programs
  • Top 10 Online Master’s in Forensic Psychology
  • 10 Most Affordable Counseling Psychology Online Programs
  • 10 Most Affordable Online Industrial Organizational Psychology Programs
  • 10 Most Affordable Online Developmental Psychology Online Programs
  • 15 Most Affordable Online Sport Psychology Programs
  • 10 Most Affordable School Psychology Online Degree Programs
  • Top 50 Online Psychology Master’s Degree Programs
  • Top 25 Online Master’s in Educational Psychology
  • Top 25 Online Master’s in Industrial/Organizational Psychology
  • Top 10 Most Affordable Online Master’s in Clinical Psychology Degree Programs
  • Top 6 Most Affordable Online PhD/PsyD Programs in Clinical Psychology
  • 50 Great Small Colleges for a Bachelor’s in Psychology
  • 50 Most Innovative University Psychology Departments
  • The 30 Most Influential Cognitive Psychologists Alive Today
  • Top 30 Affordable Online Psychology Degree Programs
  • 30 Most Influential Neuroscientists
  • Top 40 Websites for Psychology Students and Professionals
  • Top 30 Psychology Blogs
  • 25 Celebrities With Animal Phobias
  • Your Phobias Illustrated (Infographic)
  • 15 Inspiring TED Talks on Overcoming Challenges
  • 10 Fascinating Facts About the Psychology of Color
  • 15 Scariest Mental Disorders of All Time
  • 15 Things to Know About Mental Disorders in Animals
  • 13 Most Deranged Serial Killers of All Time

Online Psychology Degree Guide

Site Information

  • About Online Psychology Degree Guide
  • Bipolar Disorder
  • Therapy Center
  • When To See a Therapist
  • Types of Therapy
  • Best Online Therapy
  • Best Couples Therapy
  • Best Family Therapy
  • Managing Stress
  • Sleep and Dreaming
  • Understanding Emotions
  • Self-Improvement
  • Healthy Relationships
  • Student Resources
  • Personality Types
  • Guided Meditations
  • Verywell Mind Insights
  • 2024 Verywell Mind 25
  • Mental Health in the Classroom
  • Editorial Process
  • Meet Our Review Board
  • Crisis Support

Classic Psychology Experiments

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

case study experimental psychology

Emily is a board-certified science editor who has worked with top digital publishing brands like Voices for Biodiversity, Study.com, GoodTherapy, Vox, and Verywell.

case study experimental psychology

The history of psychology is filled with fascinating studies and classic psychology experiments that helped change the way we think about ourselves and human behavior. Sometimes the results of these experiments were so surprising they challenged conventional wisdom about the human mind and actions. In other cases, these experiments were also quite controversial.

Some of the most famous examples include Milgram's obedience experiment and Zimbardo's prison experiment. Explore some of these classic psychology experiments to learn more about some of the best-known research in psychology history.

Harlow’s Rhesus Monkey Experiments

In a series of controversial experiments conducted in the late 1950s and early 1960s, psychologist Harry Harlow demonstrated the powerful effects of love on normal development. By showing the devastating effects of deprivation on young rhesus monkeys , Harlow revealed the importance of love for healthy childhood development.

His experiments were often unethical and shockingly cruel, yet they uncovered fundamental truths that have heavily influenced our understanding of child development.

In one famous version of the experiments, infant monkeys were separated from their mothers immediately after birth and placed in an environment where they had access to either a wire monkey "mother" or a version of the faux-mother covered in a soft-terry cloth. While the wire mother provided food, the cloth mother provided only softness and comfort.

Harlow found that while the infant monkeys would go to the wire mother for food, they vastly preferred the company of the soft and comforting cloth mother. The study demonstrated that maternal bonds   were about much more than simply providing nourishment and that comfort and security played a major role in the formation of attachments .

Pavlov’s Classical Conditioning Experiments

The concept of classical conditioning is studied by every entry-level psychology student, so it may be surprising to learn that the man who first noted this phenomenon was not a psychologist at all. Pavlov was actually studying the digestive systems of dogs when he noticed that his subjects began to salivate whenever they saw his lab assistant.

What he soon discovered through his experiments was that certain responses (drooling) could be conditioned by associating a previously neutral stimulus (metronome or buzzer) with a stimulus that naturally and automatically triggers a response (food). Pavlov's experiments with dogs established classical conditioning.

The Asch Conformity Experiments

Researchers have long been interested in the degree to which people follow or rebel against social norms. During the 1950s, psychologist Solomon Asch conducted a series of experiments designed to demonstrate the powers of conformity in groups.  

The study revealed that people are surprisingly susceptible to going along with the group, even when they know the group is wrong.​ In Asch's studies, students were told that they were taking a vision test and were asked to identify which of three lines was the same length as a target line.

When asked alone, the students were highly accurate in their assessments. In other trials, confederate participants intentionally picked the incorrect line. As a result, many of the real participants gave the same answer as the other students, demonstrating how conformity could be both a powerful and subtle influence on human behavior.

Skinner's Operant Conditioning Experiments

Skinner studied how behavior can be reinforced to be repeated or weakened to be extinguished. He designed the Skinner Box where an animal, often a rodent, would be given a food pellet or an electric shock. A rat would learn that pressing a level delivered a food pellet. Or the rat would learn to press the lever in order to halt electric shocks.

Then, the animal may learn to associate a light or sound with being able to get the reward or halt negative stimuli by pressing the lever. Furthermore, he studied whether continuous, fixed ratio, fixed interval , variable ratio, and variable interval reinforcement led to faster response or learning.

Milgram’s Obedience Experiments

In Milgram's experiment , participants were asked to deliver electrical shocks to a "learner" whenever an incorrect answer was given. In reality, the learner was actually a confederate in the experiment who pretended to be shocked. The purpose of the experiment was to determine how far people were willing to go in order to obey the commands of an authority figure.

Milgram  found that 65% of participants were willing to deliver the maximum level of shocks   despite the fact that the learner seemed to be in serious distress or even unconscious.

Why This Experiment Is Notable

Milgram's experiment is one of the most controversial in psychology history. Many participants experienced considerable distress as a result of their participation and in many cases were never debriefed after the conclusion of the experiment. The experiment played a role in the development of ethical guidelines for the use of human participants in psychology experiments.

The Stanford Prison Experiment

Philip Zimbardo's famous experiment cast regular students in the roles of prisoners and prison guards. While the study was originally slated to last 2 weeks, it had to be halted after just 6 days because the guards became abusive and the prisoners began to show signs of extreme stress and anxiety.

Zimbardo's famous study was referred to after the abuses in Abu Ghraib came to light. Many experts believe that such group behaviors are heavily influenced by the power of the situation and the behavioral expectations placed on people cast in different roles.

It is worth noting criticisms of Zimbardo's experiment, however. While the general recollection of the experiment is that the guards became excessively abusive on their own as a natural response to their role, the reality is that they were explicitly instructed to mistreat the prisoners, potentially detracting from the conclusions of the study.

Van rosmalen L, Van der veer R, Van der horst FCP. The nature of love: Harlow, Bowlby and Bettelheim on affectionless mothers. Hist Psychiatry. 2020. doi:10.1177/0957154X19898997

Gantt WH . Ivan Pavlov . Encyclopaedia Brittanica .

Jeon, HL. The environmental factor within the Solomon Asch Line Test . International Journal of Social Science and Humanity. 2014;4(4):264-268. doi:10.7763/IJSSH.2014.V4.360 

Koren M. B.F. Skinner: The man who taught pigeons to play ping-pong and rats to pull levers . Smithsonian Magazine .

B.F. Skinner Foundation. A brief survey of operant behavior .

Gonzalez-franco M, Slater M, Birney ME, Swapp D, Haslam SA, Reicher SD. Participant concerns for the Learner in a Virtual Reality replication of the Milgram obedience study. PLoS ONE. 2018;13(12):e0209704. doi:10.1371/journal.pone.0209704

Zimbardo PG. Philip G. Zimbardo on his career and the Stanford Prison Experiment's 40th anniversary. Interview by Scott Drury, Scott A. Hutchens, Duane E. Shuttlesworth, and Carole L. White. Hist Psychol. 2012;15(2):161-170. doi:10.1037/a0025884

Le texier T. Debunking the Stanford Prison Experiment. Am Psychol. 2019;74(7):823-839. doi:10.1037/amp0000401

Perry G. Deception and illusion in Milgram's accounts of the Obedience Experiments . Theoretical & Applied Ethics . 2013;2(2):79-92.

Specter M. Drool: How Everyone Gets Pavlov Wrong . The New Yorker. 2014; November 24.

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Frank T. McAndrew Ph.D.

How to Get Started on Your First Psychology Experiment

Acquiring even a little expertise in advance makes science research easier..

Updated May 16, 2024 | Reviewed by Ray Parker

  • Why Education Is Important
  • Find a Child Therapist
  • Students often struggle at the beginning of research projects—knowing how to begin.
  • Research projects can sometimes be inspired by everyday life or personal concerns.
  • Becoming something of an "expert" on a topic in advance makes designing a study go more smoothly.

ARENA Creative/Shutterstock

One of the most rewarding and frustrating parts of my long career as a psychology professor at a small liberal arts college has been guiding students through the senior capstone research experience required near the end of their college years. Each psychology major must conduct an independent experiment in which they collect data to test a hypothesis, analyze the data, write a research paper, and present their results at a college poster session or at a professional conference.

The rewarding part of the process is clear: The students' pride at seeing their poster on display and maybe even getting their name on an article in a professional journal allows us professors to get a glimpse of students being happy and excited—for a change. I also derive great satisfaction from watching a student discover that he or she has an aptitude for research and perhaps start shifting their career plans accordingly.

The frustrating part comes at the beginning of the research process when students are attempting to find a topic to work on. There is a lot of floundering around as students get stuck by doing something that seems to make sense: They begin by trying to “think up a study.”

The problem is that even if the student's research interest is driven by some very personal topic that is deeply relevant to their own life, they simply do not yet know enough to know where to begin. They do not know what has already been done by others, nor do they know how researchers typically attack that topic.

Students also tend to think in terms of mission statements (I want to cure eating disorders) rather than in terms of research questions (Why are people of some ages or genders more susceptible to eating disorders than others?).

Needless to say, attempting to solve a serious, long-standing societal problem in a few weeks while conducting one’s first psychology experiment can be a showstopper.

Even a Little Bit of Expertise Can Go a Long Way

My usual approach to helping students get past this floundering stage is to tell them to try to avoid thinking up a study altogether. Instead, I tell them to conceive of their mission as becoming an “expert” on some topic that they find interesting. They begin by reading journal articles, writing summaries of these articles, and talking to me about them. As the student learns more about the topic, our conversations become more sophisticated and interesting. Researchable questions begin to emerge, and soon, the student is ready to start writing a literature review that will sharpen the focus of their research question.

In short, even a little bit of expertise on a subject makes it infinitely easier to craft an experiment on that topic because the research done by others provides a framework into which the student can fit his or her own work.

This was a lesson I learned early in my career when I was working on my own undergraduate capstone experience. Faced with the necessity of coming up with a research topic and lacking any urgent personal issues that I was trying to resolve, I fell back on what little psychological expertise I had already accumulated.

In a previous psychology course, I had written a literature review on why some information fails to move from short-term memory into long-term memory. The journal articles that I had read for this paper relied primarily on laboratory studies with mice, and the debate that was going on between researchers who had produced different results in their labs revolved around subtle differences in the way that mice were released into the experimental apparatus in the studies.

Because I already had done some homework on this, I had a ready-made research question available: What if the experimental task was set up so that the researcher had no influence on how the mouse entered the apparatus at all? I was able to design a simple animal memory experiment that fit very nicely into the psychological literature that was already out there, and this prevented a lot of angst.

Please note that my undergraduate research project was guided by the “expertise” that I had already acquired rather than by a burning desire to solve some sort of personal or social problem. I guarantee that I had not been walking around as an undergraduate student worrying about why mice forget things, but I was nonetheless able to complete a fun and interesting study.

case study experimental psychology

My first experiment may not have changed the world, but it successfully launched my research career, and I fondly remember it as I work with my students 50 years later.

Frank T. McAndrew Ph.D.

Frank McAndrew, Ph.D., is the Cornelia H. Dudley Professor of Psychology at Knox College.

  • Find a Therapist
  • Find a Treatment Center
  • Find a Psychiatrist
  • Find a Support Group
  • Find Online Therapy
  • International
  • New Zealand
  • South Africa
  • Switzerland
  • Asperger's
  • Bipolar Disorder
  • Chronic Pain
  • Eating Disorders
  • Passive Aggression
  • Personality
  • Goal Setting
  • Positive Psychology
  • Stopping Smoking
  • Low Sexual Desire
  • Relationships
  • Child Development
  • Therapy Center NEW
  • Diagnosis Dictionary
  • Types of Therapy

May 2024 magazine cover

At any moment, someone’s aggravating behavior or our own bad luck can set us off on an emotional spiral that threatens to derail our entire day. Here’s how we can face our triggers with less reactivity so that we can get on with our lives.

  • Emotional Intelligence
  • Gaslighting
  • Affective Forecasting
  • Neuroscience

Case Study vs. Experiment

What's the difference.

Case studies and experiments are both research methods used in various fields to gather data and draw conclusions. However, they differ in their approach and purpose. A case study involves in-depth analysis of a particular individual, group, or situation, aiming to provide a detailed understanding of a specific phenomenon. On the other hand, an experiment involves manipulating variables and observing the effects on a sample population, aiming to establish cause-and-effect relationships. While case studies provide rich qualitative data, experiments provide quantitative data that can be statistically analyzed. Ultimately, the choice between these methods depends on the research question and the desired outcomes.

Further Detail

Introduction.

When conducting research, there are various methods available to gather data and analyze phenomena. Two commonly used approaches are case study and experiment. While both methods aim to provide insights and answers to research questions, they differ in their design, implementation, and the type of data they generate. In this article, we will explore the attributes of case study and experiment, highlighting their strengths and limitations.

A case study is an in-depth investigation of a particular individual, group, or phenomenon. It involves collecting and analyzing detailed information from multiple sources, such as interviews, observations, documents, and archival records. Case studies are often used in social sciences, psychology, and business research to gain a deep understanding of complex and unique situations.

One of the key attributes of a case study is its ability to provide rich and detailed data. Researchers can gather a wide range of information, allowing for a comprehensive analysis of the case. This depth of data enables researchers to explore complex relationships, identify patterns, and generate new hypotheses.

Furthermore, case studies are particularly useful when studying rare or unique phenomena. Since they focus on specific cases, they can provide valuable insights into situations that are not easily replicated or observed in controlled experiments. This attribute makes case studies highly relevant in fields where generalizability is not the primary goal.

However, it is important to note that case studies have limitations. Due to their qualitative nature, the findings may lack generalizability to broader populations or contexts. The small sample size and the subjective interpretation of data can also introduce bias. Additionally, case studies are time-consuming and resource-intensive, requiring extensive data collection and analysis.

An experiment is a research method that involves manipulating variables and measuring their effects on outcomes. It aims to establish cause-and-effect relationships by controlling and manipulating independent variables while keeping other factors constant. Experiments are commonly used in natural sciences, psychology, and medicine to test hypotheses and determine the impact of specific interventions or treatments.

One of the key attributes of an experiment is its ability to establish causal relationships. By controlling variables and randomly assigning participants to different conditions, researchers can confidently attribute any observed effects to the manipulated variables. This attribute allows for strong internal validity, making experiments a powerful tool for drawing causal conclusions.

Moreover, experiments often provide quantitative data, allowing for statistical analysis and objective comparisons. This attribute enhances the precision and replicability of findings, enabling researchers to draw more robust conclusions. The ability to replicate experiments also contributes to the cumulative nature of scientific knowledge.

However, experiments also have limitations. They are often conducted in controlled laboratory settings, which may limit the generalizability of findings to real-world contexts. Ethical considerations may also restrict the manipulation of certain variables or the use of certain interventions. Additionally, experiments can be time-consuming and costly, especially when involving large sample sizes or long-term follow-ups.

While case studies and experiments have distinct attributes, they can complement each other in research. Case studies provide in-depth insights and a rich understanding of complex phenomena, while experiments offer controlled conditions and the ability to establish causal relationships. By combining these methods, researchers can gain a more comprehensive understanding of the research question at hand.

When deciding between case study and experiment, researchers should consider the nature of their research question, the available resources, and the desired level of control and generalizability. Case studies are particularly suitable when exploring unique or rare phenomena, aiming for depth rather than breadth, and when resources allow for extensive data collection and analysis. On the other hand, experiments are ideal for establishing causal relationships, testing specific hypotheses, and when control over variables is crucial.

In conclusion, case study and experiment are two valuable research methods with their own attributes and limitations. Both approaches contribute to the advancement of knowledge in various fields, and their selection depends on the research question, available resources, and desired outcomes. By understanding the strengths and weaknesses of each method, researchers can make informed decisions and conduct rigorous and impactful research.

Comparisons may contain inaccurate information about people, places, or facts. Please report any issues.

Research Methods In Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Learn about our Editorial Process

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.

research methods3

Hypotheses are statements about the prediction of the results, that can be verified or disproved by some investigation.

There are four types of hypotheses :
  • Null Hypotheses (H0 ) – these predict that no difference will be found in the results between the conditions. Typically these are written ‘There will be no difference…’
  • Alternative Hypotheses (Ha or H1) – these predict that there will be a significant difference in the results between the two conditions. This is also known as the experimental hypothesis.
  • One-tailed (directional) hypotheses – these state the specific direction the researcher expects the results to move in, e.g. higher, lower, more, less. In a correlation study, the predicted direction of the correlation can be either positive or negative.
  • Two-tailed (non-directional) hypotheses – these state that a difference will be found between the conditions of the independent variable but does not state the direction of a difference or relationship. Typically these are always written ‘There will be a difference ….’

All research has an alternative hypothesis (either a one-tailed or two-tailed) and a corresponding null hypothesis.

Once the research is conducted and results are found, psychologists must accept one hypothesis and reject the other. 

So, if a difference is found, the Psychologist would accept the alternative hypothesis and reject the null.  The opposite applies if no difference is found.

Sampling techniques

Sampling is the process of selecting a representative group from the population under study.

Sample Target Population

A sample is the participants you select from a target population (the group you are interested in) to make generalizations about.

Representative means the extent to which a sample mirrors a researcher’s target population and reflects its characteristics.

Generalisability means the extent to which their findings can be applied to the larger population of which their sample was a part.

  • Volunteer sample : where participants pick themselves through newspaper adverts, noticeboards or online.
  • Opportunity sampling : also known as convenience sampling , uses people who are available at the time the study is carried out and willing to take part. It is based on convenience.
  • Random sampling : when every person in the target population has an equal chance of being selected. An example of random sampling would be picking names out of a hat.
  • Systematic sampling : when a system is used to select participants. Picking every Nth person from all possible participants. N = the number of people in the research population / the number of people needed for the sample.
  • Stratified sampling : when you identify the subgroups and select participants in proportion to their occurrences.
  • Snowball sampling : when researchers find a few participants, and then ask them to find participants themselves and so on.
  • Quota sampling : when researchers will be told to ensure the sample fits certain quotas, for example they might be told to find 90 participants, with 30 of them being unemployed.

Experiments always have an independent and dependent variable .

  • The independent variable is the one the experimenter manipulates (the thing that changes between the conditions the participants are placed into). It is assumed to have a direct effect on the dependent variable.
  • The dependent variable is the thing being measured, or the results of the experiment.

variables

Operationalization of variables means making them measurable/quantifiable. We must use operationalization to ensure that variables are in a form that can be easily tested.

For instance, we can’t really measure ‘happiness’, but we can measure how many times a person smiles within a two-hour period. 

By operationalizing variables, we make it easy for someone else to replicate our research. Remember, this is important because we can check if our findings are reliable.

Extraneous variables are all variables which are not independent variable but could affect the results of the experiment.

It can be a natural characteristic of the participant, such as intelligence levels, gender, or age for example, or it could be a situational feature of the environment such as lighting or noise.

Demand characteristics are a type of extraneous variable that occurs if the participants work out the aims of the research study, they may begin to behave in a certain way.

For example, in Milgram’s research , critics argued that participants worked out that the shocks were not real and they administered them as they thought this was what was required of them. 

Extraneous variables must be controlled so that they do not affect (confound) the results.

Randomly allocating participants to their conditions or using a matched pairs experimental design can help to reduce participant variables. 

Situational variables are controlled by using standardized procedures, ensuring every participant in a given condition is treated in the same way

Experimental Design

Experimental design refers to how participants are allocated to each condition of the independent variable, such as a control or experimental group.
  • Independent design ( between-groups design ): each participant is selected for only one group. With the independent design, the most common way of deciding which participants go into which group is by means of randomization. 
  • Matched participants design : each participant is selected for only one group, but the participants in the two groups are matched for some relevant factor or factors (e.g. ability; sex; age).
  • Repeated measures design ( within groups) : each participant appears in both groups, so that there are exactly the same participants in each group.
  • The main problem with the repeated measures design is that there may well be order effects. Their experiences during the experiment may change the participants in various ways.
  • They may perform better when they appear in the second group because they have gained useful information about the experiment or about the task. On the other hand, they may perform less well on the second occasion because of tiredness or boredom.
  • Counterbalancing is the best way of preventing order effects from disrupting the findings of an experiment, and involves ensuring that each condition is equally likely to be used first and second by the participants.

If we wish to compare two groups with respect to a given independent variable, it is essential to make sure that the two groups do not differ in any other important way. 

Experimental Methods

All experimental methods involve an iv (independent variable) and dv (dependent variable)..

  • Field experiments are conducted in the everyday (natural) environment of the participants. The experimenter still manipulates the IV, but in a real-life setting. It may be possible to control extraneous variables, though such control is more difficult than in a lab experiment.
  • Natural experiments are when a naturally occurring IV is investigated that isn’t deliberately manipulated, it exists anyway. Participants are not randomly allocated, and the natural event may only occur rarely.

Case studies are in-depth investigations of a person, group, event, or community. It uses information from a range of sources, such as from the person concerned and also from their family and friends.

Many techniques may be used such as interviews, psychological tests, observations and experiments. Case studies are generally longitudinal: in other words, they follow the individual or group over an extended period of time. 

Case studies are widely used in psychology and among the best-known ones carried out were by Sigmund Freud . He conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

Case studies provide rich qualitative data and have high levels of ecological validity. However, it is difficult to generalize from individual cases as each one has unique characteristics.

Correlational Studies

Correlation means association; it is a measure of the extent to which two variables are related. One of the variables can be regarded as the predictor variable with the other one as the outcome variable.

Correlational studies typically involve obtaining two different measures from a group of participants, and then assessing the degree of association between the measures. 

The predictor variable can be seen as occurring before the outcome variable in some sense. It is called the predictor variable, because it forms the basis for predicting the value of the outcome variable.

Relationships between variables can be displayed on a graph or as a numerical score called a correlation coefficient.

types of correlation. Scatter plot. Positive negative and no correlation

  • If an increase in one variable tends to be associated with an increase in the other, then this is known as a positive correlation .
  • If an increase in one variable tends to be associated with a decrease in the other, then this is known as a negative correlation .
  • A zero correlation occurs when there is no relationship between variables.

After looking at the scattergraph, if we want to be sure that a significant relationship does exist between the two variables, a statistical test of correlation can be conducted, such as Spearman’s rho.

The test will give us a score, called a correlation coefficient . This is a value between 0 and 1, and the closer to 1 the score is, the stronger the relationship between the variables. This value can be both positive e.g. 0.63, or negative -0.63.

Types of correlation. Strong, weak, and perfect positive correlation, strong, weak, and perfect negative correlation, no correlation. Graphs or charts ...

A correlation between variables, however, does not automatically mean that the change in one variable is the cause of the change in the values of the other variable. A correlation only shows if there is a relationship between variables.

Correlation does not always prove causation, as a third variable may be involved. 

causation correlation

Interview Methods

Interviews are commonly divided into two types: structured and unstructured.

A fixed, predetermined set of questions is put to every participant in the same order and in the same way. 

Responses are recorded on a questionnaire, and the researcher presets the order and wording of questions, and sometimes the range of alternative answers.

The interviewer stays within their role and maintains social distance from the interviewee.

There are no set questions, and the participant can raise whatever topics he/she feels are relevant and ask them in their own way. Questions are posed about participants’ answers to the subject

Unstructured interviews are most useful in qualitative research to analyze attitudes and values.

Though they rarely provide a valid basis for generalization, their main advantage is that they enable the researcher to probe social actors’ subjective point of view. 

Questionnaire Method

Questionnaires can be thought of as a kind of written interview. They can be carried out face to face, by telephone, or post.

The choice of questions is important because of the need to avoid bias or ambiguity in the questions, ‘leading’ the respondent or causing offense.

  • Open questions are designed to encourage a full, meaningful answer using the subject’s own knowledge and feelings. They provide insights into feelings, opinions, and understanding. Example: “How do you feel about that situation?”
  • Closed questions can be answered with a simple “yes” or “no” or specific information, limiting the depth of response. They are useful for gathering specific facts or confirming details. Example: “Do you feel anxious in crowds?”

Its other practical advantages are that it is cheaper than face-to-face interviews and can be used to contact many respondents scattered over a wide area relatively quickly.

Observations

There are different types of observation methods :
  • Covert observation is where the researcher doesn’t tell the participants they are being observed until after the study is complete. There could be ethical problems or deception and consent with this particular observation method.
  • Overt observation is where a researcher tells the participants they are being observed and what they are being observed for.
  • Controlled : behavior is observed under controlled laboratory conditions (e.g., Bandura’s Bobo doll study).
  • Natural : Here, spontaneous behavior is recorded in a natural setting.
  • Participant : Here, the observer has direct contact with the group of people they are observing. The researcher becomes a member of the group they are researching.  
  • Non-participant (aka “fly on the wall): The researcher does not have direct contact with the people being observed. The observation of participants’ behavior is from a distance

Pilot Study

A pilot  study is a small scale preliminary study conducted in order to evaluate the feasibility of the key s teps in a future, full-scale project.

A pilot study is an initial run-through of the procedures to be used in an investigation; it involves selecting a few people and trying out the study on them. It is possible to save time, and in some cases, money, by identifying any flaws in the procedures designed by the researcher.

A pilot study can help the researcher spot any ambiguities (i.e. unusual things) or confusion in the information given to participants or problems with the task devised.

Sometimes the task is too hard, and the researcher may get a floor effect, because none of the participants can score at all or can complete the task – all performances are low.

The opposite effect is a ceiling effect, when the task is so easy that all achieve virtually full marks or top performances and are “hitting the ceiling”.

Research Design

In cross-sectional research , a researcher compares multiple segments of the population at the same time

Sometimes, we want to see how people change over time, as in studies of human development and lifespan. Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time.

In cohort studies , the participants must share a common factor or characteristic such as age, demographic, or occupation. A cohort study is a type of longitudinal study in which researchers monitor and observe a chosen population over an extended period.

Triangulation means using more than one research method to improve the study’s validity.

Reliability

Reliability is a measure of consistency, if a particular measurement is repeated and the same result is obtained then it is described as being reliable.

  • Test-retest reliability :  assessing the same person on two different occasions which shows the extent to which the test produces the same answers.
  • Inter-observer reliability : the extent to which there is an agreement between two or more observers.

Meta-Analysis

A meta-analysis is a systematic review that involves identifying an aim and then searching for research studies that have addressed similar aims/hypotheses.

This is done by looking through various databases, and then decisions are made about what studies are to be included/excluded.

Strengths: Increases the conclusions’ validity as they’re based on a wider range.

Weaknesses: Research designs in studies can vary, so they are not truly comparable.

Peer Review

A researcher submits an article to a journal. The choice of the journal may be determined by the journal’s audience or prestige.

The journal selects two or more appropriate experts (psychologists working in a similar field) to peer review the article without payment. The peer reviewers assess: the methods and designs used, originality of the findings, the validity of the original research findings and its content, structure and language.

Feedback from the reviewer determines whether the article is accepted. The article may be: Accepted as it is, accepted with revisions, sent back to the author to revise and re-submit or rejected without the possibility of submission.

The editor makes the final decision whether to accept or reject the research report based on the reviewers comments/ recommendations.

Peer review is important because it prevent faulty data from entering the public domain, it provides a way of checking the validity of findings and the quality of the methodology and is used to assess the research rating of university departments.

Peer reviews may be an ideal, whereas in practice there are lots of problems. For example, it slows publication down and may prevent unusual, new work being published. Some reviewers might use it as an opportunity to prevent competing researchers from publishing work.

Some people doubt whether peer review can really prevent the publication of fraudulent research.

The advent of the internet means that a lot of research and academic comment is being published without official peer reviews than before, though systems are evolving on the internet where everyone really has a chance to offer their opinions and police the quality of research.

Types of Data

  • Quantitative data is numerical data e.g. reaction time or number of mistakes. It represents how much or how long, how many there are of something. A tally of behavioral categories and closed questions in a questionnaire collect quantitative data.
  • Qualitative data is virtually any type of information that can be observed and recorded that is not numerical in nature and can be in the form of written or verbal communication. Open questions in questionnaires and accounts from observational studies collect qualitative data.
  • Primary data is first-hand data collected for the purpose of the investigation.
  • Secondary data is information that has been collected by someone other than the person who is conducting the research e.g. taken from journals, books or articles.

Validity means how well a piece of research actually measures what it sets out to, or how well it reflects the reality it claims to represent.

Validity is whether the observed effect is genuine and represents what is actually out there in the world.

  • Concurrent validity is the extent to which a psychological measure relates to an existing similar measure and obtains close results. For example, a new intelligence test compared to an established test.
  • Face validity : does the test measure what it’s supposed to measure ‘on the face of it’. This is done by ‘eyeballing’ the measuring or by passing it to an expert to check.
  • Ecological validit y is the extent to which findings from a research study can be generalized to other settings / real life.
  • Temporal validity is the extent to which findings from a research study can be generalized to other historical times.

Features of Science

  • Paradigm – A set of shared assumptions and agreed methods within a scientific discipline.
  • Paradigm shift – The result of the scientific revolution: a significant change in the dominant unifying theory within a scientific discipline.
  • Objectivity – When all sources of personal bias are minimised so not to distort or influence the research process.
  • Empirical method – Scientific approaches that are based on the gathering of evidence through direct observation and experience.
  • Replicability – The extent to which scientific procedures and findings can be repeated by other researchers.
  • Falsifiability – The principle that a theory cannot be considered scientific unless it admits the possibility of being proved untrue.

Statistical Testing

A significant result is one where there is a low probability that chance factors were responsible for any observed difference, correlation, or association in the variables tested.

If our test is significant, we can reject our null hypothesis and accept our alternative hypothesis.

If our test is not significant, we can accept our null hypothesis and reject our alternative hypothesis. A null hypothesis is a statement of no effect.

In Psychology, we use p < 0.05 (as it strikes a balance between making a type I and II error) but p < 0.01 is used in tests that could cause harm like introducing a new drug.

A type I error is when the null hypothesis is rejected when it should have been accepted (happens when a lenient significance level is used, an error of optimism).

A type II error is when the null hypothesis is accepted when it should have been rejected (happens when a stringent significance level is used, an error of pessimism).

Ethical Issues

  • Informed consent is when participants are able to make an informed judgment about whether to take part. It causes them to guess the aims of the study and change their behavior.
  • To deal with it, we can gain presumptive consent or ask them to formally indicate their agreement to participate but it may invalidate the purpose of the study and it is not guaranteed that the participants would understand.
  • Deception should only be used when it is approved by an ethics committee, as it involves deliberately misleading or withholding information. Participants should be fully debriefed after the study but debriefing can’t turn the clock back.
  • All participants should be informed at the beginning that they have the right to withdraw if they ever feel distressed or uncomfortable.
  • It causes bias as the ones that stayed are obedient and some may not withdraw as they may have been given incentives or feel like they’re spoiling the study. Researchers can offer the right to withdraw data after participation.
  • Participants should all have protection from harm . The researcher should avoid risks greater than those experienced in everyday life and they should stop the study if any harm is suspected. However, the harm may not be apparent at the time of the study.
  • Confidentiality concerns the communication of personal information. The researchers should not record any names but use numbers or false names though it may not be possible as it is sometimes possible to work out who the researchers were.

Print Friendly, PDF & Email

Related Articles

What Is a Focus Group?

Research Methodology

What Is a Focus Group?

Cross-Cultural Research Methodology In Psychology

Cross-Cultural Research Methodology In Psychology

A-level Psychology AQA Revision Notes

A-Level Psychology

A-level Psychology AQA Revision Notes

What Is Internal Validity In Research?

What Is Internal Validity In Research?

What Is Face Validity In Research? Importance & How To Measure

Research Methodology , Statistics

What Is Face Validity In Research? Importance & How To Measure

Criterion Validity: Definition & Examples

Criterion Validity: Definition & Examples

IMAGES

  1. Case Study: Definition, Examples, Types, and How to Write

    case study experimental psychology

  2. FREE 10+ Psychology Case Study Samples & Templates in MS Word

    case study experimental psychology

  3. A Case Study in Psychology

    case study experimental psychology

  4. Classic Case Studies in Psychology

    case study experimental psychology

  5. Experimental Psychology: 10 Examples & Definition (2023)

    case study experimental psychology

  6. 12+ Case Study Examples

    case study experimental psychology

VIDEO

  1. Pepsico inc. Case study

  2. Episode 2: Short History of Experimental Psychology

  3. METHODS OF PSYCHOLOGY

  4. Cognitive Psychology Research Methods Experiments & Case Studies

  5. The Stanford Prison Experiment (Most disturbing Experiment) 😔

  6. 17. Experimental Design Case Study: Accounting For Initial Biases (LE: Module 3, Part 2)

COMMENTS

  1. 15 Famous Experiments and Case Studies in Psychology

    6. Stanford Prison Experiment. One of the most controversial and widely-cited studies in psychology is the Stanford Prison Experiment, conducted by Philip Zimbardo at the basement of the Stanford psychology building in 1971. The hypothesis was that abusive behavior in prisons is influenced by the personality traits of the prisoners and prison ...

  2. Case Study Research Method in Psychology

    Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews). The case study research method originated in clinical medicine (the case history, i.e., the patient's personal history). In psychology, case studies are ...

  3. Single-Case Experimental Designs: A Systematic Review of Published

    The single-case experiment has a storied history in psychology dating back to the field's founders: Fechner (1889), Watson (1925), and Skinner (1938).It has been used to inform and develop theory, examine interpersonal processes, study the behavior of organisms, establish the effectiveness of psychological interventions, and address a host of other research questions (for a review, see ...

  4. Patient H.M. Case Study In Psychology: Henry Gustav Molaison

    At the time, Scoville's experimental procedure had previously only been performed on patients with psychosis, so H.M. was the first epileptic patient and showed no sign of mental illness. In the original case study of H.M., which is discussed in further detail below, nine of Scoville's patients from this experimental surgery were described.

  5. Journal of Experimental Psychology: General

    The Journal of Experimental Psychology: General ® publishes articles describing empirical work that is of broad interest or bridges the traditional interests of two or more communities of psychology. The work may touch on issues dealt with in JEP: Learning, Memory, and Cognition, JEP: Human Perception and Performance, JEP: Animal Behavior Processes, or JEP: Applied, but may also concern ...

  6. 6

    Summary. The case study approach has a rich history in psychology as a method for observing the ways in which individuals may demonstrate abnormal thinking and behavior, for collecting evidence concerning the circumstances and consequences surrounding such disorders, and for providing data to generate and test models of human behavior (see Yin ...

  7. Experimental Psychology Studies Humans and Animals

    Experimental psychologists are interested in exploring theoretical questions, often by creating a hypothesis and then setting out to prove or disprove it through experimentation. They study a wide range of behavioral topics among humans and animals, including sensation, perception, attention, memory, cognition and emotion.

  8. Single case studies are a powerful tool for developing ...

    The majority of methods in psychology rely on averaging group data to draw conclusions. In this Perspective, Nickels et al. argue that single case methodology is a valuable tool for developing and ...

  9. Experimental Method In Psychology

    There are three types of experiments you need to know: 1. Lab Experiment. A laboratory experiment in psychology is a research method in which the experimenter manipulates one or more independent variables and measures the effects on the dependent variable under controlled conditions. A laboratory experiment is conducted under highly controlled ...

  10. How Does Experimental Psychology Study Behavior?

    The experimental method in psychology helps us learn more about how people think and why they behave the way they do. Experimental psychologists can research a variety of topics using many different experimental methods. Each one contributes to what we know about the mind and human behavior. 4 Sources.

  11. Experimental psychology

    Experimental psychology refers to work done by those who apply experimental methods to psychological study and the underlying processes. Experimental psychologists employ human participants and animal subjects to study a great many topics, including (among others) sensation, perception, memory, cognition, learning, motivation, emotion; developmental processes, social psychology, and the neural ...

  12. Case Study: Definition, Examples, Types, and How to Write

    A case study is an in-depth study of one person, group, or event. In a case study, nearly every aspect of the subject's life and history is analyzed to seek patterns and causes of behavior. Case studies can be used in many different fields, including psychology, medicine, education, anthropology, political science, and social work.

  13. How the Experimental Method Works in Psychology

    The experimental method involves manipulating one variable to determine if this causes changes in another variable. This method relies on controlled research methods and random assignment of study subjects to test a hypothesis. For example, researchers may want to learn how different visual patterns may impact our perception.

  14. 3.2 Psychologists Use Descriptive, Correlational, and Experimental

    Experimental research is research in which initial equivalence among research participants in more than one group is created, ... An interesting example of a case study in clinical psychology is described by Rokeach (1964), who investigated in detail the beliefs of and interactions among three patients with schizophrenia, all of whom were ...

  15. What Is a Case Study in Psychology?

    A case study is a research method used in psychology to investigate a particular individual, group, or situation in depth. It involves a detailed analysis of the subject, gathering information from various sources such as interviews, observations, and documents. In a case study, researchers aim to understand the complexities and nuances of the ...

  16. The Family of Single-Case Experimental Designs

    Abstract. Single-case experimental designs (SCEDs) represent a family of research designs that use experimental methods to study the effects of treatments on outcomes. The fundamental unit of analysis is the single case—which can be an individual, clinic, or community—ideally with replications of effects within and/or between cases.

  17. Little Albert Experiment (Watson & Rayner)

    The Little Albert experiment was a controversial psychology experiment by John B. Watson and his graduate student, Rosalie Rayner, at Johns Hopkins University. The experiment was performed in 1920 and was a case study aimed at testing the principles of classical conditioning. Watson and Raynor presented Little Albert (a nine-month-old boy) with ...

  18. Experimental Psychology

    Chinese psychologists realize that under the principle of guiding psychology by Marxist-Leninist philosophy, experimental research in psychology must be strengthened in order to further develop psychological theories and enable psychology to play its due role in socialist construction. With the development of education and science, universities ...

  19. The 25 Most Influential Psychological Experiments in History

    1. A Class Divided Study Conducted By: Jane Elliott. Study Conducted in 1968 in an Iowa classroom. Experiment Details: Jane Elliott's famous experiment was inspired by the assassination of Dr. Martin Luther King Jr. and the inspirational life that he led. The third grade teacher developed an exercise, or better yet, a psychological experiment, to help her Caucasian students understand the ...

  20. 6 Classic Psychology Experiments

    The history of psychology is filled with fascinating studies and classic psychology experiments that helped change the way we think about ourselves and human behavior. Sometimes the results of these experiments were so surprising they challenged conventional wisdom about the human mind and actions. In other cases, these experiments were also ...

  21. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  22. Experimental Design: Types, Examples & Methods

    Three types of experimental designs are commonly used: 1. Independent Measures. Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

  23. How to Get Started on Your First Psychology Experiment

    The frustrating part comes at the beginning of the research process when students are attempting to find a topic to work on. There is a lot of floundering around as students get stuck doing ...

  24. Case Study vs. Experiment

    A case study involves in-depth analysis of a particular individual, group, or situation, aiming to provide a detailed understanding of a specific phenomenon. On the other hand, an experiment involves manipulating variables and observing the effects on a sample population, aiming to establish cause-and-effect relationships.

  25. Research Methods In Psychology

    Olivia Guy-Evans, MSc. Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.