The Oxford Handbook of Counseling Psychology

9 Methodologies in Counseling Psychology

Nancy E. Betz, Department of Psychology, The Ohio State University, Columbus, Ohio

Ruth E. Fassinger, College of Graduate and Professional Studies, John F. Kennedy University, Pleasant Hill, CA

  • Published: 18 September 2012

This chapter reviews quantitative and qualitative methodologies most frequently used in counseling psychology research. We begin with a review of the paradigmatic bases and epistemological stances of quantitative and qualitative research, followed by overviews of both approaches to empirical research in counseling psychology. In these overviews, our goal is to provide a broad conceptual understanding of the “why” of these methods. Among the quantitative methods receiving attention are analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA), factor analysis, structural equation modeling, and discriminant analysis. We include discussion of such qualitative methods as grounded theory, narratology, phenomenology, ethnography, and participatory action research. Important general issues in designing qualitative studies are also discussed. The chapter concludes with a discussion of mixed methods in research and the importance of knowledge of both major approaches in order to maximally utilize findings from our rich and diverse counseling psychology literature.

All research in counseling psychology is based on the fundamental principle that we learn about people—their thoughts, feelings, and behaviors—by “observing” them in some systematic manner. Certainly, we learn much about people from the processes of informal observation, as might be the case when we watch someone interact at a party and make inferences about his or her social skills. But to advance science, our observations must be done under conditions that include some sort of control over the process of gathering those observations. The observations can be gained in a variety of ways: via formalized assessments or measures, from interviews or other kinds of narratives, through structured viewing and tracking of behavior, or even from cultural artifacts that provide information about the phenomenon of interest.

Broadly speaking, observations can be divided into those yielding numeric representations and those yielding linguistic representations, and each offers a unique and heuristically valuable approach to explaining the phenomenon of interest. The manner in which we examine the observations to make sense of them is determined by the nature of those observations, and science provides us with a wide variety of methods from which to choose. These methods are the subject of this chapter.

In this chapter, we begin with a review of the paradigmatic bases and epistemological stances of quantitative and qualitative research, followed by overviews of both approaches to empirical research in counseling psychology. In these reviews, our goal is to provide a broad conceptual understanding of the “why” of these methods rather than a detailed technical “how-to” description of any method. We note that our somewhat differential coverage of quantitative and qualitative approaches in the chapter reflects the disproportionate attention to quantitative methods currently seen in our field, an empirical imbalance that we hope will be rectified over the next decade. We also point out that, although we cover quantitative and qualitative approaches separately in this chapter for ease in presentation, combining these approaches can yield the most informative programs of research over time. Thus, we conclude the chapter with a brief discussion of mixed methods research and of future directions in research methodology.

Paradigmatic Bases and Epistemological Stances of Qualitative Research

Ponterotto (2002, 2005) has written cogently about the paradigmatic bases and epistemological stances underlying quantitative and qualitative research in counseling psychology, using a four-category system that distinguishes among positivist, post-positivist, constructivist-interpretivist, and critical-ideological paradigms. Beginning with several concepts integral to the philosophy of science, Ponterotto (2005) outlines the major premises of the four paradigms based on their assumptions regarding ontology (nature of reality), epistemology (acquisition of knowledge), axiology (role of values), rhetorical structure (language of research), and methodology (research procedures). Generally, it can be said that positivist and post-positivist paradigms most often undergird quantitative research, whereas constructivist-interpretivist and critical-ideological paradigms most often form the foundation for qualitative research. However, these distinctions are not completely orthogonal; the post-positivist position, for example, can typify the work of some qualitative as well as quantitative researchers, and some of the specific “qualitative” approaches (e.g., participatory action research, ethnography) may incorporate different forms of quantitative data if appropriate to the goals of the study.

Positivist and Post-positivist Paradigms

Positivism is often called the “received view” in psychology (Guba & Lincoln, 1994, cited in Ponterotto, 2005). It is based on the theory-driven, hypothesis-testing, deductive methods of the natural sciences, and involves a controlled approach to generating hypotheses about a phenomenon of interest, collecting carefully measured observations, testing hypotheses for verification using descriptive and inferential statistics, and producing general theories and cause-and-effect models that seek to predict and control the phenomenon under investigation. Ontologically, positivism assumes an objective reality that can be apprehended and measured, and epistemologically, it posits separation between the researcher and the participant, so that the researcher can maintain objectivity in knowing. Axiologically, it presumes the absence of values in research, and methodologically, it focuses on discovering reality as accurately and dispassionately as possible (hence the emphasis on experimentation and quasi-experimentation, as well as psychometric rigor). Finally, its rhetorical structure of detachment, neutrality, and third-person voice is aimed at capturing the objectivity that has characterized the entire research endeavor.

Post-positivism shares much in common with positivism, the main distinction being that post-positivism assumes human fallibility and, ontologically, accepts a true reality that can be apprehended and measured only imperfectly. Epistemologically, axiologically, methodologically, and rhetorically, the ideal of objectivity, the assumption of researcher–participant independence, the prevention of values from entering research, the controlled use of the hypothetico-deductive method, and the detached scientific voice, respectively, are acknowledged as goals that may be realized only imperfectly in actual practice. For this reason, the post-positivist paradigm embraces theory falsification rather than verification, while maintaining the remainder of the positivist assumptive structure. Both the positivist and post-positivist paradigms stand in contrast to the two broad classes of qualitative research paradigms described below.

Constructivist-Interpretivist Paradigms

In contrast to the ontological assumption of a fixed, external, measurable reality that can be apprehended, the constructivist-interpretivist paradigm assumes a relativist notion of multiple, equally valid realities that are constructed in the minds of actors and observers; that is, there is no objective reality that exists apart from the person who is either experiencing or processing the reality, or both. Thus, any account of a phenomenon is necessarily an experientially driven, co-constructed account influenced by the narrator/actor/participant and the listener/observer/researcher, both of whom bring their unique interpretive lenses (shaped by context, history, individual differences, and other forces) to the co-constructed account, which itself is created within a particular experiential context. Epistemologically speaking, then, there cannot be separation between the participant and the researcher, as the process of coming to know and understand is a transactional process that relies on the relationship between them and their mutual construction and interpretation of a lived experience. Imperative to building the kind of relationship that facilitates a full and detailed sharing of lived experience is connection to and entry into the participant’s world on the part of the researcher; thus, methodologically, the research process often involves intense and prolonged contact between researchers and participants, utilizing narrative, observational, contextual, historical, and other kinds of data that reveal deep or hidden aspects of the lived experiences under investigation.

It should be obvious that the role of the researcher differs markedly in this paradigm from the positivist or post-positivist researcher position. Researcher subjectivity in a constructivist-interpretivist paradigm is not only acknowledged but becomes an integral part of the research process. Axiologically, values are acknowledged as important, but “bracketed” so that they do not unduly influence the lived experience or the perception of that experience being shared and documented, and researcher reflexivity (in the form of deep reflection, self-monitoring, and immersion in the participant’s world) is vitally important to maintaining the integrity of the researcher–participant relationship. Rhetorically, the intense involvement of both the researcher and the participant in the constructivist-interpretivist research process is captured in first-person language, detailed descriptions of how interpretations were generated, direct quotations from the primary data sources (e.g., narratives), and personal reflections of the researcher, including statements regarding values and expectations that likely influenced the work.

Critical-Ideological Paradigms

The critical-ideological paradigm shares a great deal of its assumptive structure with the constructivist-interpretivist paradigm, upon which it is built. However, it is more radical in its goals, which include disruption of the power inequalities of the societal status quo and the liberation and transformation of individual lives. Ontologically, it assumes relativist, constructed realities, but it focuses on historically and societally situated power relations that permeate those realities, and it seeks to dismantle the power structures that have been socially constructed to oppress particular groups of people (e.g., women, people of color, sexual minorities). Epistemologically, the mutual, transactional relationship between participant and researcher embedded in the constructivist-interpretivist paradigm is presumed but expanded to a more dialectical aim of “inciting transformation in the participants that leads to group empowerment and emancipation from oppression” (Ponterotto, 2005, p. 131). Thus, axiologically speaking, values on the part of the researcher not only are acknowledged and described, but become the driving force behind the ultimate goal of social change.

As might be expected in an approach in which research constitutes a social intervention, the constructivist-interpretivist methodological practice of deeply connecting to individuals to document their lived experiences becomes liberationist in the critical-ideological paradigm. As such, the researcher joins with participants not only as a reflective, empathic chronicler of their lived experiences, but as a passionate advocate for the social change that would empower and emancipate them. This kind of approach necessitates much more prolonged engagement between researchers and participants than is typical in behavioral research and often results in deeply forged alliances that are maintained long after the formal research project has ended. The advocacy stance on the part of the researcher also is reflected in the rhetorical structures used in critical-ideological research, which include description of the societal (interpersonal and intergroup relationships, institutional, community, and policy) changes that resulted or are expected to result from the research.

The constructivist-interpretivist and critical-ideological paradigms give rise to and subsume most of the specific qualitative approaches being used in psychology currently. Although qualitative approaches share some basic philosophical and epistemological premises, each of the extant approaches has features that distinguish it from other approaches. However, some approaches are more fully developed and articulated than others, leading qualitative researchers to borrow and combine aspects of other approaches (e.g., coding procedures, interviewing techniques) in the implementation of studies that may emanate from very different philosophical and epistemological foundations and goals. In addition, a host of contextual factors shape the practical application of qualitative research aims, such as academic publish-or-perish institutional structures, lack of qualitative expertise and training in most graduate programs, and lack of resources or support for the kind of sustained effort considered ideal in these approaches. Thus, the variability among approaches (as well as the specific philosophical underpinnings of a study) may be masked by compromises in conducting the actual study.

Because qualitative approaches differ substantially based on their particular paradigms and epistemological assumptions throughout all phases of the research (and the closer the adherence to core tenets of the approaches, the greater the differences), every qualitative research project must begin with a thoughtful consideration of these issues (Denzin & Lincoln, 2000 ; Patton, 2002 ; Ponterotto, 2002, 2005 ). Quantitative approaches, on the other hand, because they tend to share in common a positivist or post-positivist paradigm involving hypothetico-deductive theory verification or theory falsification methods, find their variability in approach primarily in the kinds of measurement (e.g., interval or categorical) and statistical analyses (e.g., correlation or analysis of variance) utilized. In the following section, we provide an overview of quantitative methods.

Quantitative Methods

As noted above, quantitative methods are based on positivist or post-positivist epistemologies in which theories are used to guide hypothesis generation and hypothesis testing regarding phenomena of interest. Hypotheses are examined using carefully defined and obtained empirical observations, which are assumed to represent important abstract constructs, and are tested using descriptive and inferential statistics. The usefulness of quantitative methods depends on the quality of the observational data, which in this case refers to the quality of measurement. Measurement is the process by which we assign numbers to observations, usually of human characteristics or behaviors. Measures (scores) could be derived from a vocational interest inventory, a measure of depressive symptoms, a measure of clients' liking for the therapist, or an index of problematic eating behaviors. What is essential is the quality of measurement, usually grounded in the concepts of reliability and validity; without reliable and valid measures, further data analysis is futile. Detailed discussion of the quality of measures is outside the scope of this chapter, but see the chapter by Swanson (2011, Chapter 8, this volume).

The next sections will summarize different types of data analysis using quantitative methods. It is important to note that complete coverage of statistical methods would require several textbooks, obviously beyond the scope of this chapter. We cover the most commonly used methods in counseling psychology research and refer readers to well-known introductory statistics books and to advanced volumes such as Tabachnick and Fidell’s ( 2007 ) excellent text Using Multivariate Statistics.

Describing Observations

Two basic areas of introductory statistics are scales of measurement (Stevens, 1951) and descriptive statistics, and they will be mentioned here only briefly. Scales of measurement—nominal, ordinal, interval, and ratio—are important because our choice of quantitative methods often depends on the kind of data we have. Nominal, or categorical, scales do not actually represent measurement but rather category membership—for example, gender or marital status. Next in level of measurement is the ordinal scale, or rank order. League standings in baseball and class rank are ordinal scales. These numbers have a "greater than" or "less than" quality, but the intervals between numbers are not necessarily (or even usually) equal. Next in level is the interval scale, in which not only the order but also the size of the intervals between numbers is presumed to be meaningful. Most test data are interval scale data. Finally, ratio scales have a true zero point as well as ordinal meaning and equal intervals. Ratio scales are most often found in the physical or psychophysical sciences.

Basic descriptive statistics are also important—in their eagerness to move to inferential statistics, many researchers underemphasize the basic importance of descriptive statistics, or how the numbers “look.” If we recall that numbers reflect the observations of people, basic summary descriptions of those numbers become interesting. Observations can be described using frequency distributions (numbers of observations at each score point or score interval, often called histograms [bar diagrams] or frequency polygons). The most important frequency distribution is the normal distribution , or bell curve , which has certain very useful properties, most especially predicted percentages of cases that fall above and below each z score point on the normal curve. These points are used in determining such critical statistics as standard error of measurement and standard error of estimate.
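The normal curve's useful properties can be illustrated with a brief computation. The Python sketch below (a minimal standard-library example; the standard normal cumulative distribution is built from the error function) recovers two familiar benchmarks: roughly 68% of cases fall within one standard deviation of the mean, and roughly 95% within 1.96 standard deviations.

```python
import math

def normal_cdf(z: float) -> float:
    """Proportion of the standard normal distribution falling below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Proportion of cases within +/- 1 SD and +/- 1.96 SD of the mean
within_1_sd = normal_cdf(1.0) - normal_cdf(-1.0)     # about .68
within_1_96 = normal_cdf(1.96) - normal_cdf(-1.96)   # about .95
```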

The best known descriptive statistics are measures of central tendency and variability . Measures of central tendency include the arithmetic mean, median, and mode, and indices of variability include the range, the variance, and the standard deviation (the square root of the variance). Descriptive statistics are important in and of themselves. For example, we may want to answer the question, “How depressed are college students at our university?” Not only the mean score on our measure of depression but the range and variance of scores are critical information—for example, how many of our students are at risk of suicidal thoughts or behavior? We usually compare score means across gender or race/ethnicity, for which we need group standard deviations as well as means. We may want to compare the scores of a new sample with the original normative sample. We may want to compare the scores of those receiving an intervention to those in a control group. Thus, the mean, variance, and standard deviation are fundamental to many types of analysis.
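These descriptive statistics are simple to compute directly. The sketch below uses Python's standard-library statistics module on an invented set of depression-scale scores (the data are hypothetical, for illustration only):

```python
import statistics

# Hypothetical depression-scale scores for ten students
scores = [12, 15, 9, 22, 18, 30, 7, 15, 11, 25]

mean = statistics.mean(scores)          # arithmetic mean
median = statistics.median(scores)      # middle score
mode = statistics.mode(scores)          # most frequent score
score_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
sd = statistics.stdev(scores)           # square root of the variance
```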

Inferential Statistics—Group Differences

When we want to compare means—such as those of women and men, of normative versus new samples, or of treatment versus control groups—we usually begin with the assumption that we are examining a sample or samples drawn from a population or populations. Only in very rare instances would we assume that we have assessed the entire population, and we will not deal with that possibility here. Because we are assessing samples from populations, we are using inferential statistics, making educated guesses about population parameters from sample statistics. In doing so, we assume that a sample statistic (such as the mean) has a sampling distribution—the array of values the statistic would take across repeated samples from the population—with its own mean and standard deviation. Generally speaking, the larger the standard deviation of the sampling distribution, the more variation in means we will see when sampling from that population. The key question for our statistical methods—t-tests, analysis of variance (ANOVA), and multivariate analysis of variance (MANOVA)—is to estimate the probability that the observed means have come from the same versus different populations (we do not refer to z tests here because they assume that the population standard deviation is known).

Sum of Squares

Before proceeding, it may be helpful to review the concept of sum of squares (SS), which is critical to the variance calculations required by all these tests. Recall that a sample has a mean that is an estimate of the population parameter. And it has variability of individual scores around that mean. The wider the variability, the greater will be the variance, or its positive square root, the standard deviation. The variance and standard deviation are based on the sum of squared deviations of individual values from the sample mean, divided by the degrees of freedom, generally n -1 to reflect that the mean used is the sample rather than the population mean. The concept of SS is very important in ANOVA and regression as well, as it is a basis of the general linear model (GLM) fundamental to many of our statistical methods.
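The computation of SS, and the sample variance it yields, can be sketched in a few lines of Python (the scores are invented for illustration):

```python
# Sum of squared deviations from the sample mean, divided by df = n - 1
scores = [4, 7, 6, 3, 5]
n = len(scores)
mean = sum(scores) / n

ss = sum((x - mean) ** 2 for x in scores)  # sum of squares (SS)
variance = ss / (n - 1)                    # sample variance
sd = variance ** 0.5                       # standard deviation
```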

Given the above, we begin with a null hypothesis, which usually is that the group means do not differ from each other. The test statistic, in this case t, is the difference between our two means divided by the standard error of the difference. This latter value is computed from the pooled variance—the combined sums of squares divided by the combined degrees of freedom—and the two sample sizes. Most computer programs will give the value of t for both pooled and separate variances; pooled variance estimates are appropriate if it can be assumed that the two population variances are equal. Normally, the statistical software will provide as an option a test of homogeneity of variance, often Levene's test, Hartley's F max test (Kanji, 1993), or Bartlett's test (see Glass & Hopkins, 1984; Kanji, 1993). If the hypothesis of homogeneity of variance is rejected, then the separate variance estimates must be used.

The obtained value of t is taken to the table of percentile points of the t distribution, sometimes called the critical values of the t distribution. Using the degrees of freedom, we determine the critical value for rejection of the null hypothesis (Ho) at the prescribed value of α—for example, if our two groups each had an N of 21, then our df would be 40 and the critical values of t would be 2.021 (p < .05), 2.704 (p < .01), and 3.551 (p < .001). As sample sizes increase, t approaches the value of z. At df = infinity, the critical t at .05 is 1.96, which is also the .05 critical value for the z statistic.
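The pooled-variance t described above can be sketched directly in Python; the two small samples are invented for illustration:

```python
import math

def pooled_t(sample1, sample2):
    """Independent-samples t, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                        # pooled variance
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference
    return (m1 - m2) / se_diff, df

# Illustrative data: t = 3.0 with df = 8; compare |t| with the tabled
# critical value for df = 8 at the chosen alpha.
t, df = pooled_t([5, 7, 6, 9, 8], [3, 4, 6, 5, 2])
```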

Confidence Intervals and Effect Sizes

In addition to acceptance or rejection of the null hypothesis, we should compute both confidence intervals and effect sizes. There has been much criticism of null hypothesis significance testing (NHST) based on the fact that it allows us only to reject a null hypothesis (which usually is not what we actually want to know), at an arbitrary and necessarily dichotomous level, instead of giving us a probability that our desired hypothesis is true. Cohen (1994) argues that what we want to know is, "Given these data, what is the probability that Ho is true?" but that what NHST actually tells us is, "Given that Ho is true, what is the probability of these (or more extreme) data?" (p. 997).

Confidence intervals improve hypothesis testing by providing estimates of the range of values that likely includes the true population value. When based on a t test, the confidence interval is the mean difference plus or minus a value computed as the critical value of t , multiplied by the standard error of the difference. Thus, we derive a confidence interval that contains the true (population) difference with a probability of 1- p ; if that interval includes zero, then we have not rejected the null hypothesis of no difference between the means.
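The interval computation can be sketched with a hypothetical two-group example: the mean difference plus or minus the critical t times the standard error of the difference. The numbers here (mean difference 3.0, SE 1.0, and the two-tailed critical t of 2.306 for df = 8 at α = .05) are illustrative:

```python
# 95% confidence interval for a mean difference: (M1 - M2) +/- t_crit * SE_diff
mean_diff = 3.0   # illustrative mean difference
se_diff = 1.0     # illustrative standard error of the difference
t_crit = 2.306    # two-tailed critical t for df = 8, alpha = .05

lower = mean_diff - t_crit * se_diff
upper = mean_diff + t_crit * se_diff
# Because the interval (0.694, 5.306) excludes zero, the null hypothesis
# of no difference between the means is rejected at alpha = .05.
```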

Also, because the statistical significance of t tests depends on sample size, we could find that a trivial difference in means was statistically significant but practically unimportant. Likewise, we might overlook a potentially important difference if small sample sizes prevented it from reaching statistical significance. Because of this, it is always advisable also to calculate effect size, an index of the importance of the difference or, as noted by Kirk (1996), of "practical significance." There are several indices of effect size in terms of standardized mean differences, most commonly Cohen's (1969) d and Glass's (1976) Δ. Cohen's d is the mean difference divided by the combined (pooled) standard deviation. It essentially can be interpreted as the size of the difference in standard deviation units, and it is directly related to the amount of overlap between the two score distributions—the greater the effect size, the less the overlap. With the caution that all interpretations must be based on the context and purposes of the research, Cohen (1988) recommended the following ranges for d: .20 to .50 is a small to medium effect, .50 to .80 a medium to large effect, and over .80 a large effect. There is not a one-to-one correspondence between statistical significance and effect size—a difference can be statistically significant but not practically important, and vice versa.
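Cohen's d, as defined above, is easily computed from two samples; the data here are invented for illustration:

```python
import math

def cohens_d(sample1, sample2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    pooled_sd = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# For these illustrative samples the means differ by 3 points and the
# pooled SD is about 1.58, so d is roughly 1.9 -- a large effect.
d = cohens_d([5, 7, 6, 9, 8], [3, 4, 6, 5, 2])
```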

Effect size is especially important since a difference of a given magnitude can be more or less important based on the overall spread of the score distributions. Assume we have a mean difference of 5 score points between men and women; if the standard deviation of the combined distributions is 10, we have an effect size ( d  ) of .5, what Cohen would describe as a moderately large effect. But if the standard deviation were 30, d would be .17, less than what Cohen would require for a small effect. In the first case, the difference is one-half a standard deviation, but in the second case it is only one-sixth a standard deviation.

One major use of effect size data is in the technique of meta-analysis , first coined by Glass ( 1976 ). Meta-analysis is used to quantitatively summarize the findings of many studies of the same general topic. In the case of mean differences, we often use it to summarize a number of studies of a treatment versus a control group. In the classic study of Smith and Glass ( 1977 ), 833 comparisons of psychotherapy treatment versus control groups were done, yielding a definitive conclusion of the effectiveness of psychotherapy. Meta-analysis involves searching the literature for all relevant studies, extracting the statistic of interest (in this case the effect size of a comparison of means) and then averaging the effect sizes across studies, often weighted by sample size or, in some cases, by judgments of the quality of the studies. Meta-analyses can also be done within subgroups, such as gender or race/ethnicity; in the case of Smith and Glass ( 1977 ), analyses were done for each theoretical approach to therapy. Readers may consult Glass ( 2006 ) for a comprehensive discussion of how to conduct a meta-analysis.
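The core averaging step of a meta-analysis can be sketched simply. The example below weights each study's effect size by its sample size, as described above; the study values are invented, and published meta-analyses often use more refined (e.g., inverse-variance) weights:

```python
# Sample-size-weighted average effect size across hypothetical studies
studies = [
    {"d": 0.45, "n": 40},
    {"d": 0.80, "n": 25},
    {"d": 0.30, "n": 60},
]

total_n = sum(s["n"] for s in studies)
weighted_d = sum(s["d"] * s["n"] for s in studies) / total_n  # about .45
```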

ANOVA is used to simultaneously test the differences between two or more means. When there are only two means, its results are equivalent to those of a t test, but it is more versatile in that three or more means can be tested for difference simultaneously. The null hypothesis is that the population means (μ) are equal: μ₁ = μ₂ = μ₃, and so on. This is also called an omnibus test of the equality of means.

The method of ANOVA utilizes the decomposition of variance components. In a simple one-way analysis of variance, in which there is only one independent variable, the total variance is the sum of the variance between treatments and the variance within treatments. We estimate these using the concept of sums of squares mentioned previously. The total SS is the sum of the squared deviations of each score in the entire set from the grand mean of all the scores. The SS between is the sum of the squared deviations of each group mean from the grand mean (weighted by group size), and the SS within is the sum of the squared deviations of each individual's score from the mean of his or her group.

To convert these into variances, we obtain the mean squares (MS) by dividing each SS by its degrees of freedom, usually shown as the Greek letter ν. For the between-group SS, ν is J − 1, where J is the number of groups, and for the within-group SS, ν is n − J, where n is the total sample size. The test statistic F is MSB/MSW, the between-group variance divided by the within-group variance. We take this value to the F table using the between- and within-group degrees of freedom. The closer the value of F is to 1.0, the less likely it is to be statistically significant—if MSB and MSW are close, we can assume that they are estimating the same quantity, the population variance, and that the groups do not differ from each other. If the value of F is statistically significant, we can reject the null hypothesis of group mean equality. Note that, if there are three or more groups, a significant result does not tell us specifically which groups differ from each other—that is examined using post hoc tests of means, such as the Tukey, Tukey-B, and Scheffé tests.
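The decomposition just described can be sketched in Python; the three small groups are invented for illustration:

```python
def one_way_anova(groups):
    """Return F and its degrees of freedom for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    n, j = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n

    # SS between: squared deviations of group means from the grand mean,
    # weighted by group size; SS within: deviations from each group's mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)

    df_between, df_within = j - 1, n - j
    f = (ss_between / df_between) / (ss_within / df_within)  # MSB / MSW
    return f, df_between, df_within

# Illustrative data: F = 7.0 with df 2 and 6
f, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [4, 5, 6]])
```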

Next in complexity to one-way ANOVA is two- or more-way ANOVA, used with factorial designs. A factorial design involves two or more independent variables, for example, type of treatment and gender. The IVs can be manipulated or can be "natural groups" (such as gender or ethnicity—groups that existed a priori and are not manipulated). When each level of one IV is paired with each level of the other, we call it a completely crossed design. This is probably the most commonly used factorial design, although there are many others—see research design texts (e.g., Heppner, Wampold, & Kivlighan, 2008) for possibilities. A two-way ANOVA can yield one or more main effects and/or an interaction effect.

In a two-way ANOVA, we calculate the MS for each independent variable and the MS for the interaction between the independent variables. Each MS is divided by MSW(within) to get the F for that effect. These are taken to the appropriate cell in the F table to determine whether we accept or reject the null hypothesis for that effect. Again, with an interaction or any main effect involving more than two levels, we must do post hoc tests.

Effect Sizes

In the same way that we use effect sizes to evaluate the practical importance of differences when we have done t tests, effect sizes also should be provided for ANOVA and MANOVA. The effect sizes used here are expressed in terms of variance accounted for, also known as strength of association. η² is the proportion of sample total variance that is attributable to an effect in an ANOVA design—the ratio of effect variance to total variance; Cohen (1988) suggests that η² values of .01, .09, and .25 correspond to small, medium, and large effect sizes, respectively. ω² estimates the effect size in the population (Tabachnick & Fidell, 2007). The intraclass correlation also can be used as an index of effect size in ANOVA models.

Vacha-Haase and Thompson (2004) provide an excellent table summarizing strategies for obtaining effect sizes for different analyses using SPSS (p. 477).

Confidence Intervals for Effect Sizes

There are two final important trends to mention in hypothesis testing. First, the American Psychological Association (APA) Task Force on Statistical Inference recommended that confidence intervals be provided for effect sizes themselves (see Tabachnick & Fidell, 2007, and Vacha-Haase & Thompson, 2004, for further details). Moreover, some journal editors are now recommending that the statistic called p rep (probability of replication) be reported instead of p itself. The p rep is the probability of replicating an effect and is itself a function of the effect size and the sample size (Killeen, 2005). The value of p rep is inversely related to p—for example, in a given study, p values of .05, .01, and .001 might correspond respectively to p rep values of .88, .95, and .99 (Killeen, 2005). Note that p rep is a positive way of attaching a probability to the likelihood that we will find the same effect again, instead of a statistical signal that we should reject a null hypothesis.

MANOVA is appropriate when we have several dependent variables (DVs). Using the logic of type 1 error, rejecting the null hypothesis at p < .05 means that there is a .05 chance of error, that is, of falsely rejecting the null hypothesis. When we do several tests at the .05 level, we compound that probability of error—we have what is called “experiment-wise error” or “family-wise error.” With two DVs, the error rate is approximately .10, and with five DVs it is .23 (see Haase & Ellis, 1987 , for the formula). Clearly, these levels of experiment-wise error will lead to excessive type 1 error.
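The experiment-wise error rates quoted above follow from the formula 1 − (1 − α)^k for k independent tests, which can be checked in a couple of lines of Python (the approximation assumes the tests are independent):

```python
# Family-wise (experiment-wise) type 1 error for k independent tests at alpha
def familywise_error(alpha: float, k: int) -> float:
    return 1 - (1 - alpha) ** k

two_dvs = familywise_error(0.05, 2)   # about .10
five_dvs = familywise_error(0.05, 5)  # about .23
```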

One method of correction for this error is the Bonferroni correction, which sets the per-comparison level of α at approximately α/p, where p equals the number of dependent variables (Tabachnick & Fidell, 2007). This correction is adequate if the variables are uncorrelated or highly correlated, but it is not preferable if the variables are mildly correlated, which will be true in most cases. In these cases, MANOVA can be used. It controls the experiment-wise error at the original α—.05, .01, or .001, whatever probability had been set as the critical value. Like ANOVA, MANOVA can involve only one factor, or it can involve a factorial design with two or more IVs. The analysis yields a multivariate F based on an "omnibus," or simultaneous, test of means, which describes the probability of type 1 error over all the tests made. If the multivariate F is statistically significant, we may proceed to examine the univariate F statistics—the same as those we would receive in a univariate ANOVA—to determine which dependent variables are contributing to the significant overall F. The F statistics in MANOVA are provided by one or more of four statistical tests—Wilks's λ, Pillai's trace, Hotelling's trace, and Roy's criterion (see Haase & Ellis, 1987).

Describing Relationships

Methods for studying relationships between and among variables have at least two important uses within psychology. The most basic use is to further the understanding of human behavior by helping to elucidate the interrelationships of behavior, personality, and functioning. Another important use is that of prediction; when we understand relationships, we can use them to predict future behavior. The study of relationships, including those used in prediction, begins with the topics of correlation and regression. The ideas of association, or covariation, are fundamental in science, including psychology. Many of our quantitative methods are based on the study of relationships.

The index of correlation we use depends on the nature of our variables, but the best known and most frequently used is the Pearson product moment correlation r. This statistic describes the relationship between two interval scale variables, and its calculation is based on the cross-products of the deviations of each score from its own mean. The correlation can be positive, indicating that as one variable becomes larger, so does the other; negative, indicating that larger values of one are associated with smaller values of the other; or zero, meaning that there is no linear association between the two variables. Pearson correlations range from –1 to 1, and it is the absolute value of the correlation that indicates its strength—a correlation of –.5 is as strong as one of .5.
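The cross-products definition of r can be sketched directly in Python; the perfectly correlated data below are contrived to show the boundary values:

```python
import math

def pearson_r(xs, ys):
    """Pearson r from cross-products of deviation scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # perfect positive: 1.0
r_negative = pearson_r([1, 2, 3], [3, 2, 1])        # perfect negative: -1.0
```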

Correlations can be interpreted in terms of statistical significance, percentage of variance accounted for, and effect size. Statistical significance when applied to a correlation means that the correlation is statistically different from zero. The null hypothesis for a correlation is that the population parameter ρ is equal to 0. All statistics texts contain a table that presents this as a function of sample size—the larger the sample size, the smaller an r must be to be statistically different from zero. For example with an N of 100, a correlation of .20 is significant at p < .05, whereas if the N were 1,000, a correlation of .06 would be significant at that level. As with all hypothesis testing, we specify the level of type 1 error we are willing to tolerate, and test our hypothesis at the .05, .01, or .001 levels. If the N is large enough, very small correlations can be statistically significant. For example, the critical value of r ( p < .01) in a sample of 10,000 is .02.

The latter is an example of a case in which a statistically significant value may be practically insignificant. The square of r is the coefficient of determination, the proportion of variance shared by the two variables. Thus, a correlation of .10, when squared, is .01, meaning that only 1% of the variance is shared between the two variables, and 99% of their variance is not shared. Under most circumstances, this would be a trivial association. In the example given above—the statistically significant correlation of .02 when N = 10,000—we are accounting for only an infinitesimal .0004 of the shared variance between the two variables, or 4/100 of 1%.

In addition to the percentage of shared variance, practical importance also is reflected by effect size, as with the description of mean differences in t tests and ANOVA. The value of r is itself an established index of effect size, and its interpretation parallels that of the most common measures of effect size, such as Cohen's (1988) d. Cohen (1992) attaches values of r of .10, .30, and .50 to small, medium, and large effect sizes, respectively, but it should be recalled that these describe, respectively, only 1%, 9%, and 25% of the shared variance.

Table 9.1 contains a matrix of correlations among eight personality variables measured by the Healthy Personality Inventory (Borgen & Betz, 2008), an inventory designed to reflect the emphasis on positive psychology and the healthy personality. The table provides the bivariate correlations among the eight variables. As is standard practice, no values are shown on the diagonal, as they are all 1.0 (the correlation of a variable with itself), although in some cases the value of coefficient α is shown there instead. The matrix is symmetric, so it is necessary to show only the upper or the lower triangle. In some cases, values for one gender are shown above the diagonal and those for the other gender below it. Values of r that are statistically significant for an N of 206 are provided in a note below the table.

Note: Values are for 206 college students. For an N of 206, values of r of .14, .18, and .23 are significant at the .05, .01, and .001 levels, respectively, for a two-tailed test.

It is important to note four cautions. First, the value of r reflects only linear relationships; if there is curvilinearity in the relationship, then the relationship is better described by the statistic η. Second, the value of the correlation coefficient will be restricted if there is restriction in range in either or both of the two variables being studied. If there is little variability in the scores, there is less chance for changes in one variable to be reflected by changes in the other. There are formulas for correcting for restriction in range (see Hogan, 2007, p. 122), and these are often used in predictive validity studies (see Sireci & Talento-Miller, 2006). But unless there is some reasonable expectation that score ranges can be increased, this correction is unduly optimistic and not reflective of the relationships that will actually be found in the data.

A third caution is that we must resist the temptation to conclude that a statistically significant correlation is “significantly larger” than a nonsignificant correlation. For example, if we had an N of 50, a correlation of .28 would be significant (at p < .05), whereas one of .25 would not be statistically significant, yet they are not statistically different from each other. This must be tested by the z test for the significance of the difference between two values of ρ (see Glass & Hopkins, 1996 ).

Finally, correlation does not imply causation. Correlation reflects the covariation among two variables, but does not allow any assumptions about whether or not one causes the other or whether both are caused by a third (or more) variable. For example, we might find that depression and loneliness are correlated. We could postulate that depression leads to loneliness, that loneliness makes people depressed, or that some third variable like low self-esteem or perceived social inadequacy causes both depression and loneliness. Other research designs (e.g., experimental examinations of treatments for low self-esteem, depression, or loneliness, or structural equation modeling) are required to address questions of causality.

Other Correlation Coefficients

Although r is the most commonly used index of correlation, a number of others are suitable when one or both variables are not interval in nature. φ is often used with two dichotomous variables, whereas the contingency coefficient is used with two polychotomous variables (categorical variables with more than two categories, such as marital status or race/ethnicity). The relationship between a dichotomous variable (such as right–wrong or true–false answers) and a continuous variable, such as total test score, is indexed by the point-biserial coefficient if we assume a true dichotomy, or by the biserial coefficient, which assumes that the dichotomous answer actually reflects an underlying continuum that has been dichotomized. The biserial is not a Pearson product moment r, so its absolute value can exceed 1.0. When both variables are ordinal, the correlation can be computed using the Spearman rank correlation (see Glass & Hopkins, 1996).

Although the most basic reason for studying covariation is to understand the myriad relationships in human behavior, characteristics, and functioning, in many settings these relationships also are used to predict behavior, and in these cases we use the method of regression. One of the oldest uses of regression is based on the relationship between high school grades and performance in college. Using a scatter plot, we could place the predictor variable, high school grades, on the horizontal axis and the criterion, college grade point average (GPA), on the vertical axis. The relationship between these two sets of scores would be described by Pearson's r. A regression equation is an equation for a line, Y′ = bX + a, where X is the value of the predictor variable and Y′ is the predicted value of the criterion. In this equation, a is the Y intercept—the value of Y where the regression line crosses the Y axis or, in other words, the value of Y corresponding to an X of 0. The b is the slope of the line and is a direct function of the correlation r between X and Y; the slope is the rate of change in Y as a function of changes in X. Given the formula Y′ = bX + a, we can estimate a person's score on the criterion variable from his or her score on the predictor variable. The regression line is known as the "line of best fit" and is determined mathematically as the equation that minimizes the errors of prediction of the criterion from the predictor. In new samples, we then could use this equation to make predictions of collegiate performance from high school GPA.
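The least-squares slope and intercept can be computed directly; in this Python sketch the GPA pairs are invented for illustration:

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for the line Y' = bX + a."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx  # the line passes through (mean of X, mean of Y)
    return b, a

# Hypothetical high school GPAs (X) and college GPAs (Y)
b, a = fit_line([2.0, 3.0, 3.5, 4.0], [2.2, 2.8, 3.1, 3.6])
predicted = b * 3.2 + a  # predicted college GPA for a high school GPA of 3.2
```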

Multiple Regression

In many cases, we wish to use multiple predictor variables to predict a criterion—the simplest example is the use of both scholastic aptitude test scores and high school GPA to predict college GPA. In this case, the formula for a line is generalized to multiple predictors and takes the form Y′ = b₁X₁ + b₂X₂ + … + bₙXₙ + a. The quality of prediction is based on the strength of the multiple correlation coefficient R, describing the relationship between a linear composite, or summary, of the predictor variables and the criterion variable Y. And R², like r², is referred to as the coefficient of determination.
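
As a minimal sketch (with hypothetical scores invented for illustration), the weights and intercept can be obtained by least squares, and R² computed as the squared correlation between the criterion and the predicted composite:

```python
import numpy as np

# Hypothetical data: aptitude score (X1) and high school GPA (X2) predicting college GPA (Y).
X = np.array([[520, 2.8], [600, 3.1], [640, 3.5],
              [560, 3.0], [700, 3.8], [580, 3.3]], dtype=float)
y = np.array([2.5, 2.9, 3.2, 2.7, 3.7, 3.0])

# Append a column of 1s so the last fitted weight is the intercept a.
A = np.column_stack([X, np.ones(len(y))])
b1, b2, a = np.linalg.lstsq(A, y, rcond=None)[0]

y_pred = A @ np.array([b1, b2, a])   # the linear composite Y'
R = np.corrcoef(y, y_pred)[0, 1]     # multiple correlation coefficient
R2 = R ** 2                          # coefficient of determination
```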

Variables can be entered into a regression analysis in several different ways. In simultaneous entry, all variables are entered together, and each is evaluated according to what it adds after the other variables have been accounted for. In sequential or hierarchical regression, variables are entered in a specific order as determined by the researcher. In stepwise regression, variables are entered one at a time according to statistical criteria—forward, backward, and stepwise entry can be used (see Tabachnick & Fidell, 2007 ).

Regardless of the method of variable entry, the weights in the equation should be cross-validated. Because multiple regression is a maximization procedure, meaning that it selects the weights that maximize predictive efficacy in that particular sample, it is subject to shrinkage in subsequent samples. Cross-validation is done by dividing the original sample in two, with the development sample being the larger (see Tabachnick & Fidell, 2007). In the cross-validation step, the predictive weights derived for the first sample are applied to the second sample, and the resulting R² is determined. This second R² is probably a more realistic estimate of the predictive power of the set of variables. The method of double cross-validation involves separately obtaining a set of weights in each half of the sample and then applying each half's weights to the other half. The average of the two cross-applied R² values is probably a good estimate of the predictive efficacy of the variable set.
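
A sketch of the procedure on simulated data (all values generated here, not taken from any real study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a sample of 200 with three predictors modestly related to the criterion.
n = 200
X = rng.normal(size=(n, 3))
y = X @ np.array([0.4, 0.3, 0.0]) + rng.normal(size=n)

def fit_weights(X, y):
    """Least-squares regression weights (last element is the intercept)."""
    A = np.column_stack([X, np.ones(len(y))])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def r_squared(X, y, w):
    """Squared correlation between y and the composite produced by weights w."""
    y_pred = np.column_stack([X, np.ones(len(y))]) @ w
    return np.corrcoef(y, y_pred)[0, 1] ** 2

dev, cv = slice(0, 120), slice(120, None)   # larger development half, then cross-validation half
w_dev = fit_weights(X[dev], y[dev])
R2_dev = r_squared(X[dev], y[dev], w_dev)   # maximized in the development sample
R2_cv = r_squared(X[cv], y[cv], w_dev)      # typically shrinks: the more realistic estimate

# Double cross-validation: weights from each half applied to the other half, R^2s averaged.
w_cv = fit_weights(X[cv], y[cv])
R2_double = (r_squared(X[cv], y[cv], w_dev) + r_squared(X[dev], y[dev], w_cv)) / 2
```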

Meta-analysis

Meta-analysis, described previously in the discussion of group differences, is used frequently in the study of predictive validity. We use meta-analysis in this context to summarize across many predictive validity studies—when evidence for predictive validity holds across studies, the summary finding is often called "validity generalization." For example, DeNeve and Cooper (1998) published a meta-analysis of 1,538 correlation coefficients from 148 studies of the relationships of 137 personality variables to measures of subjective well-being. In brief, they found that Big Five Neuroticism was the strongest (negative) predictor of life satisfaction and happiness and the strongest predictor of negative affect. The strongest predictors of positive affect were Big Five Extraversion and Agreeableness.

Moderators and Mediators

Two other types of variables often are used in predictive and other correlational studies—moderator variables and mediator variables. Perhaps because the two terms are similar and/or perhaps because they both involve a “third variable” that influences the interpretation and meaning of a bivariate correlation, these terms are often confused or are assumed to be equivalent. This is not the case.

As mentioned, both moderators and mediators are third variables that can be involved in examining the relationship between two other variables. A moderator (see Frazier, Tix, & Barron, 2004 ) is a third variable that influences the strength of relationship of two other variables to each other but is not related itself to either one. That third variable can be categorical, such as gender, or interval, such as job satisfaction. For example, two variables may be more strongly related in women than in men or more strongly related in more highly versus less highly satisfied workers. A moderator in ANOVA terms is an interaction in which the effect of one variable depends on the level of the other. It may be suggested that a moderator leads to “differential predictability” of criterion from predictor variables.

The analytic methods used to identify moderators include the z test comparison of two correlations after conversion to Fisher’s Z (see Glass & Hopkins, 1996 ) and moderated multiple regression (see Tabachnick & Fidell, 2007 ). Shown in Figure 9.1A is a hypothetical example wherein social support moderates the relationship between stress and distress. It is postulated that, for people high in social support, the correlation between stress and distress is lower than for those low in social support. The moderator effect is shown by an arrow leading down to the arrow showing the relationship between stress and distress. Suppose that, for high support individuals, the correlation is only .10, whereas for low support individuals it is higher, .50. If these are shown to differ significantly using the z test following transformation to Fisher’s Z , then we can conclude a moderator effect for social support in this study. Moderator effects should always be replicated and the search for them should be based on theoretical considerations rather than “data snooping.”
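
The z test just described can be sketched directly; the correlations (.10 and .50) come from the hypothetical example above, while the per-group sample sizes of 100 are an assumption added here for illustration:

```python
import math

def fisher_z_compare(r1, n1, r2, n2):
    """z test for the difference between two independent correlations,
    after transforming each r to Fisher's Z = atanh(r)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se

# r = .10 for high-support individuals, r = .50 for low-support individuals.
z = fisher_z_compare(0.10, 100, 0.50, 100)
significant = abs(z) > 1.96   # two-tailed test at alpha = .05
```

A significant z here would support a moderator effect for social support, pending replication.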

A mediator is a variable that represents the generative mechanism by which one variable affects another (Baron & Kenny, 1986 ). In this case, we have a relationship between two variables, but we postulate that the intervening mechanism is the relationship of each with a third variable, the mediator. Figure 9.1B shows a postulated mediator relationship for the same variables shown in Figure 9.1A . Say, for example, that we postulate that stress causes people to avoid social support, which causes them distress. If the relationship between stress and distress is significantly reduced when the path to social support is considered, we may have a mediator. Baron and Kenny postulate that finding a mediator requires four steps (shown in the figure): (1) that the predictor is related to the criterion (c); (2) that the predictor is related to the mediator (a); (3) that the mediator is related to the criterion or outcome (b); and (4) that the strength of the relationship between predictor and criterion is significantly reduced (c') after the variance due to the mediator is removed. This can be tested using multiple regression, structural equation modeling, or the Sobel ( 1982 ) test, a handy and easy-to-use test available online (e.g., www.quantpsy.org ), or as a subroutine of SPSS and other software.
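
The Sobel test itself is a one-line formula, z = ab / sqrt(b²·SEa² + a²·SEb²), where a and b are the two path coefficients and SEa and SEb their standard errors; the numerical values below are invented for illustration:

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel (1982) test of the indirect effect a*b, where a is the
    predictor-to-mediator path and b the mediator-to-criterion path."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

# Hypothetical paths: stress -> social support (a) and support -> distress (b).
z = sobel_z(a=0.50, se_a=0.10, b=0.40, se_b=0.12)
mediation_indicated = abs(z) > 1.96   # indirect effect differs from zero at alpha = .05
```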

Discriminant Analysis

Discriminant analysis is a topic that could be covered either in the section on MANOVA or in this section on multiple regression in that it has purposes and procedures similar to both (see Sherry, 2006 , for an extended description of discriminant analysis). Probably the most common use of discriminant analysis is the use of multiple predictor variables to predict a categorical criterion—thus, it is like multiple regression except that the criterion variable is categorical rather than continuous. It is like MANOVA in that it tells us which of a set of variables differs significantly as a function of group membership. Probably its most frequent use would be to use a set of predictor variables to predict success versus failure, for example, in a job training program or in completion of a college degree. Like regression, it yields a set of weights that are applied to the predictors to yield the maximally predictive composite of scores to predict group membership.

As another possibility, discriminant analysis could be used as a follow-up to a significant MANOVA. MANOVA tells us whether or not a set of variables significantly differentiates two or more groups, controlling for the experiment-wise error by giving us a multivariate F . Post hoc univariate tests tell us for which variables significant group differences exist, but they do not tell us which variables contribute most strongly to the overall group separation. Discriminant analysis will give us discriminant weights, analogous to regression weights, which will tell us the strongest contributors to the group differences. Effect sizes could be used with the MANOVA to determine the variables leading to the largest differences between the groups, but that method would not control for the intercorrelations among the predictors.

Figure 9.1 (A) Hypothetical example of social support as moderator variable. (B) Hypothetical example of social support as mediator variable.

Like MANOVA, discriminant analysis requires a set of two or more variables for two or more groups. The method of analysis involves a search for the linear equation that best differentiates the groups, so it is (like multiple regression) a maximization procedure and must be cross-validated. The analysis yields at least one discriminant function, analogous to a regression equation, which contains a set of β weights that are applied to the variables. Like β weights in regression, the weights indicate the importance of the variables in separating, or differentiating, the groups. The maximum number of discriminant functions is the number of groups minus 1 or the number of predictors, whichever is smaller. Of the discriminant functions, none, one, or more can be statistically significant. If a function is not significant, it is not making a meaningful contribution to our understanding of group differences.

For predictive purposes, the weights are applied to each individual's scores and compared with what are termed group centroids to estimate the probabilities of group membership. If the discriminant weights are applied to the mean scores within each group, the results (two or more centroids, depending on the number of groups differentiated) will be maximally separated from each other. We assign each individual's score composite to the closest centroid. The number of correct versus incorrect assignments is known as the "hit rate" and is compared with the probability of correct assignment by chance. For example, if we have three groups of equal size, the probability of making correct assignments by chance is .333. To the extent that the discriminant weights can improve on that, which we examine using the z test for the difference between proportions (Glass & Hopkins, 1996), the discriminant function is enhancing prediction. Cross-validation can be done using a holdout sample, double cross-validation, or the "jackknife" method. In the jackknife method, one case is held out at a time, and the discriminant function is calculated from the remaining cases. The weights are applied to the held-out case to make a group assignment. This is done for each case, and the probability of correct classification is based on the cumulative number of correct classifications across all cases.
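
A sketch of the jackknife hit rate for the two-group case, using simulated data and the classical two-group discriminant weights (pooled within-groups covariance inverse applied to the mean difference); everything here is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated scores on three predictors for two groups of 40, with some mean separation.
g1 = rng.normal(loc=[0.0, 0.0, 0.0], size=(40, 3))
g2 = rng.normal(loc=[1.0, 0.5, 0.0], size=(40, 3))
X = np.vstack([g1, g2])
labels = np.array([0] * 40 + [1] * 40)

def discriminant_weights(X, labels):
    """Two-group discriminant weights: pooled within-groups covariance
    inverse applied to the difference between the group mean vectors."""
    a, b = X[labels == 0], X[labels == 1]
    pooled = ((len(a) - 1) * np.cov(a, rowvar=False)
              + (len(b) - 1) * np.cov(b, rowvar=False)) / (len(a) + len(b) - 2)
    return np.linalg.solve(pooled, a.mean(axis=0) - b.mean(axis=0))

# Jackknife: hold out each case, refit, assign the case to the nearer group centroid.
hits = 0
for i in range(len(X)):
    keep = np.arange(len(X)) != i
    w = discriminant_weights(X[keep], labels[keep])
    centroid0 = (X[keep][labels[keep] == 0] @ w).mean()
    centroid1 = (X[keep][labels[keep] == 1] @ w).mean()
    score = X[i] @ w
    predicted = 0 if abs(score - centroid0) < abs(score - centroid1) else 1
    hits += int(predicted == labels[i])

hit_rate = hits / len(X)   # compare with the .50 chance rate for two equal groups
```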

An excellent example of the use of discriminant analysis in counseling psychology research is the study of Larson, Wei, Wu, Borgen, and Bailey ( 2007 ) of the degree to which personality and confidence measures differentiated four college major groups in 312 Taiwanese college students. Personality and confidence measures each differentiated the college major groups well, but the combination of both significantly improved prediction beyond either set used alone.

Other Related Analyses

Less often used in counseling psychology research but worth knowing about are logistic regression and multiway frequency analysis (MFA; or its extension, log-linear analysis). Both are used with data in which some or all variables are categorical. Logistic regression (see Tabachnick & Fidell, 2007, for a full description) is used to predict a categorical dependent variable (criterion) from a set of interval and/or categorical variables. It is used extensively in the medical field. For example, gender and smoking status (both categorical), and body mass index and amount of exercise per week (both interval scale), could be used to predict whether or not someone has a heart attack before age 50. Logistic regression is similar to discriminant analysis except that the latter uses only continuous predictor variables (unless categorical variables are dummy coded, e.g., coding female as 0 and male as 1).
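
In practice one would use statistical software; purely as an illustration of the model (simulated data, with Newton-Raphson fitting of the logistic likelihood written out by hand), a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: smoking status (0/1, categorical) and BMI (interval)
# predicting a binary outcome; the generating coefficients are invented.
n = 300
smoker = rng.integers(0, 2, n).astype(float)
bmi = rng.normal(27, 4, n)
true_logit = -8 + 1.2 * smoker + 0.25 * bmi
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

X = np.column_stack([np.ones(n), smoker, bmi])   # intercept, categorical, interval

# Newton-Raphson maximization of the logistic log-likelihood.
w = np.zeros(3)
for _ in range(25):
    eta = np.clip(X @ w, -30, 30)
    p = 1 / (1 + np.exp(-eta))
    gradient = X.T @ (y - p)
    hessian = (X * (p * (1 - p))[:, None]).T @ X
    w += np.linalg.solve(hessian, gradient)

odds_ratio_smoker = np.exp(w[1])   # multiplicative effect of smoking on the odds
```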

Multiway frequency analysis, or its extension log-linear analysis, is used to examine the relationships among multiple categorical variables. If we have only two categorical variables, we use the chi-square test of independence to investigate the relationship (vs. independence) between the two variables. For example, we could examine the relationship between gender and whether or not a student dropped out of college before finishing. If we have three or more categorical variables (for example, race/ethnicity and whether or not the student is a first-generation college student, in addition to gender and completion of college), we would use MFA. For more information on all of these methods, see Tabachnick and Fidell (2007).

Finally, one problem common to all quantitative data analyses is missing data. Several papers (Schafer & Graham, 2002; Sterner, 2011) have detailed the types of missing data and methods for handling each type. Missing data are usually less problematic when they can be assumed to be missing at random; more serious problems arise when the missingness is non-random, or systematic, for example, when data are significantly more likely to be missing for one gender or ethnic group than for another. These articles provide excellent suggestions for handling missing data in each of these cases, along with recommendations for useful statistical software.

Examining Structure and Dimensionality

Factor analysis.

Factor analysis has been one of the most widely used analytical procedures in psychological research. It began with the work of Charles Spearman ( 1904 ) on the structure of mental abilities. He developed a mathematical model specifying that ability tests were composed of two factors—a general ability factor (  g ) and a specific factor ( s ). Factor analysis has grown into a family of methods that enable us to study the structure and dimensionality of measures and of sets of variables. For example, we could use factor analysis to determine dimensions underlying several indices of social behavior or to ask how many underlying dimensions of personality there are in a new measure we have constructed. In recent years, factor analytic methods have been differentiated as exploratory factor analyses (EFA) and confirmatory factor analyses (CFA).

Exploratory Factor Analysis

As defined by Fabrigar, Wegener, MacCallum, and Strahan ( 1999 ): “The primary purpose of EFA is to arrive at a more parsimonious conceptual understanding of a set of measured variables by determining the number and nature of common factors needed to account for the pattern of correlations among the measured variables” (p. 275). Exploratory factor analysis is used when the researcher has no a priori theories about the structure of the measure or construct, or when a priori theories have not been supported by confirmatory factor analyses. The method utilizes a matrix of either correlations or covariances describing the relationships among the variables to be analyzed. The variables can be measures or items, for each of which there is a matrix of scores for a sample of people. The correlation or covariance matrix is a symmetrical matrix showing the relationships of each variable with every other variable (or item).

Given such a matrix, software from packages such as SPSS, SAS, CEFA, BMDP, Systat, or RAMONA is used to do the analyses. However, any EFA involves a sequential series of considerations that will determine the results of the analysis. These considerations are the nature of the variables and sample, the appropriate method of analysis, the method of factor extraction, the number of factors to extract, and the method of rotation. In addition, the interpretation or naming of the factors and the decision as to whether or not to compute factor scores follow the analyses themselves.

Assumptions Regarding The Data

In both EFA and CFA, certain assumptions about the data are necessary. First, quality solutions result only from quality data—measures (or items, if it is to be a factor analysis of an item set) must be carefully selected to represent a defined domain of interest (Fabrigar et al., 1999). Just as in scale construction itself, the quality of the scale depends on the care put into defining the construct or domain of interest. There should be evidence for item or scale reliability. MacCallum, Widaman, Zhang, and Hong (1999) suggest that, if one has an idea of the common factors to be represented, three to five measured variables (MVs) per factor will provide stable and interpretable results. If the researcher does not have hypotheses about the number of common factors, then the domain of variables should be delineated carefully and as many of those variables as possible included in the study. The data should be interval or quasi-interval in nature and normally distributed, although the latter criterion depends on the method of factor extraction used. Some researchers have found that both EFA and CFA are relatively robust in the face of non-normality. However, less biased fit indices and more interpretable and replicable solutions may follow when data are normally distributed.

Sample Size and Variable Independence

Although there has been much discussion of necessary sample sizes for factor analysis, a generally accepted guideline is five to ten participants per variable or item, if the analysis is at the item level (Joreskog & Sorbom, 2008 ). If sample sizes are larger than that, they may be divided into subsamples, so that the solution can be replicated. However, other authors have demonstrated that when common factors are overdetermined (three or four variables per factor) and communalities are high (averaging at least .70), smaller sample sizes (e.g., N = 100) are often sufficient (MacCallum et al., 1999 ). When the reverse is true—that is, factors are less well determined or communalities are lower—even very large sample sizes (up to N = 800) may not be sufficient. It is clear there are no simple answers to the question of sample size.

Methods of Factor Extraction

It is necessary at the outset to differentiate two different types of analysis: principal components analysis (PCA) and factor analysis (EFA or CFA). The major difference between them is that PCA analyzes all the variance among the variables, both common and unique, where unique variance includes that specific to the variable and also error variance. It is designed to rescale the original variables into a new set of components that can be equal in number to the original set but that are now uncorrelated with each other. It is not designed to elucidate underlying structure or latent variables (LVs) but to rescale or reassign the variables in the analysis. Generally speaking, it is not considered a method of factor analysis (Fabrigar et al., 1999 ), but if the researcher’s goal is to determine the linear composite of variables that retains as much information as possible from the original set of variables, then PCA is appropriate. An example of an appropriate use would be analysis of a large set of vocational interest items, where the purpose was to assign them to interest scales, retaining as much variance as possible from the original set.

If the purpose of the analysis is to more parsimoniously describe the underlying dimensions common to a set of variables, also known as the underlying LVs , then common variance analysis is much more appropriate. The common factor model is implemented by model-fitting methods, also known as factor extraction techniques . The major ones are maximum likelihood (ML) and principal axis factoring (PAF). All use only common variance in the estimation of communalities. The advantage of ML procedures is that they are accompanied by a large number of fit indices that can be used to evaluate the goodness of fit of the factor model to the data (see Browne, Cudeck, Tateneni, & Mels, 2004 ). However, they also require the assumption of multivariate normality. Principal axis factoring does not require such distributional assumptions but also provides fewer fit indices.

Although PCA places 1's in the diagonal of the correlation matrix, common factor analysis uses a communality estimate in the diagonal, where communality refers to the variance a variable shares in common with the other variables in the set. Several communality estimates are typically used, including the largest correlation of a variable with any other variable in the set, the squared multiple correlation (SMC) of the variable with the remaining variables, and iterated estimates based on preliminary SMCs.
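
The SMC estimate, for instance, can be read off the inverse of the correlation matrix, since SMCᵢ = 1 − 1/(R⁻¹)ᵢᵢ. A sketch on a small hypothetical correlation matrix:

```python
import numpy as np

# Hypothetical 4-variable correlation matrix (symmetric, 1s on the diagonal).
R = np.array([
    [1.00, 0.60, 0.50, 0.30],
    [0.60, 1.00, 0.45, 0.25],
    [0.50, 0.45, 1.00, 0.35],
    [0.30, 0.25, 0.35, 1.00],
])

# Squared multiple correlation of each variable with all the others —
# a common initial communality estimate: SMC_i = 1 - 1 / (R^-1)_ii.
smc = 1 - 1 / np.diag(np.linalg.inv(R))
```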

Number of Factors To Extract

The decision regarding the number of factors to extract should be based on a balance of parsimony with theoretical meaningfulness. In theory, we want to arrive at a smaller number of fundamental LVs, but we also want those LVs to be important and to accurately define the domain of interest. Especially if our goal is to explore a reduced number of factors that reflect underlying LVs, it is pointless to extract minor or trivial factors. However, researchers generally agree that it is more problematic to underfactor (to select too few factors) than to overfactor—in the former case, we may overlook important aspects of the behavioral domain, whereas in the latter case, we may simply end up focusing on an unimportant or trivial aspect of behavior.

There are several approaches to determining how many factors to extract, all of them in some way attempting to operationalize factor importance, as we only want to extract important factors. One basis for decisions is how much variance a factor accounts for. A variable’s contribution to a factor is represented by the square of the factor loading (factor loadings are analogous to correlations, which, when squared, represent the proportion of variance accounted for). In an unrotated solution, the factor contribution is known as the eigenvalue . The best known and most commonly used is the Kaiser-Guttman criterion (Gorsuch, 1983 ), in which factors having eigenvalues greater than 1 are extracted. This method is appropriate only for PCA or for other methods where 1’s are in the diagonal (such as α or image FA) and should not be used in common factor analyses where communality estimates are in the diagonal. This is the default in some statistical packages, although it tends to lead to overfactoring (more than an optimal number of components or factors) (Zwick & Velicer, 1986 ).
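
A sketch of the eigenvalue computation behind this criterion, applied to simulated scores built from two underlying factors (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated scores on six variables driven by two underlying factors.
n = 300
factors = rng.normal(size=(n, 2))
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.1],
                     [0.0, 0.8], [0.1, 0.7], [0.0, 0.6]])
data = factors @ loadings.T + rng.normal(scale=0.5, size=(n, 6))

# Eigenvalues of the correlation matrix (i.e., with 1s in the diagonal, as in PCA).
R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser-Guttman criterion: retain components with eigenvalue > 1.
n_retained = int((eigenvalues > 1).sum())
```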

A frequently used method is the scree plot (Zwick & Velicer, 1986 ), in which the values of the eigenvalues are plotted, in order of factor extraction, on the vertical axis. The point at which the plot levels out (or the slope of the line approaches zero) is where factoring should stop. Common sense should be used, however, as there may be cases in which the scree plot would lead to the inclusion of factors with eigenvalues below 1.0. In some cases, there is no clear leveling off, or there is more than one leveling off point. A logical criterion for number of factors to extract is to include only factors having at least two or three variables loading highly on them. If only one variable loads on a factor, then it is questionable whether that variable reflects an underlying latent dimension. Other methods include parallel analysis (Hayton, Allen, & Scarpello, 2004 ) and root mean square error of approximation (RMSEA; Steiger, 1990 ), in which maximum likelihood estimation is used to extract factors (see also Browne & Cudeck, 1993 ).
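
Parallel analysis, one of the alternative methods just mentioned, compares the observed eigenvalues against mean eigenvalues from random data of the same dimensions; a sketch on simulated two-factor data (all values invented):

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Retain factors whose observed eigenvalues exceed the mean
    eigenvalues of uncorrelated random data of the same shape (Horn's method)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_means = np.mean([
        np.sort(np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(n, p)), rowvar=False)))[::-1]
        for _ in range(n_sims)
    ], axis=0)
    return int((observed > random_means).sum())

# Simulated scores on six variables driven by two factors.
rng = np.random.default_rng(4)
scores = rng.normal(size=(300, 2)) @ np.array([[0.8, 0.7, 0.6, 0.0, 0.0, 0.0],
                                               [0.0, 0.0, 0.0, 0.8, 0.7, 0.6]])
scores += rng.normal(scale=0.5, size=scores.shape)

n_factors = parallel_analysis(scores)
```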

The results of a factor analysis yield solutions based on mathematical maximization procedures, rather than solutions that are psychologically or intuitively satisfying. Factor rotation is designed to lead to a more interpretable set of factors. Methods of rotation are generally classified as orthogonal or oblique. An orthogonal rotation yields factors that are uncorrelated, whereas an oblique rotation allows factors to be correlated.

To understand rotation methods, it is necessary to understand Thurstone's (1947) concept of simple structure. Simple structure defines maximal interpretability and simplicity of a factor structure, such that each factor is described by fewer than the total number of variables, and each variable is described by only one factor. Ideally, each factor should be loaded on by at least two, but fewer than a majority, of the variables.

Orthogonal rotational methods include varimax and quartimax; varimax (Kaiser, 1958) is regarded as the best orthogonal rotation (Fabrigar et al., 1999 ) and is often the default in computer packages. Comparing the two, varimax is more likely to “spread out” the variance across factors, reducing the predominance of the general factor or of specific factors and increasing the number of common factors (factors on which a few variables load strongly). Quartimax has the opposite effect—emphasizing general and specific factors and de-emphasizing common factors.

Oblique rotations are generally considered preferable because they allow correlated factors, and the reality is that most psychological variables are naturally at least partially correlated. If the factors are truly uncorrelated, oblique rotations will yield an orthogonal set of factors. Oblique rotations also provide the correlations among factors and therefore permit second-order factor analyses, in which the factor intercorrelations are themselves factor analyzed to examine higher-order structures. Oblique rotations include direct oblimin and promax. Most quantitative researchers view direct oblimin as preferable because the mathematical functions minimized (and maximized) in factor rotation are made explicit (Browne, 2001).

Regardless of which method of rotation is used, a matrix of factor structure coefficients will result that is different from the coefficients generated before rotation. The structure coefficients represent the correlations between the variables and the factors. For clarity of interpretation, it is best if the coefficients are either large or very small (near zero). The rule of thumb is to retain on a factor any variable with a loading of .40 or greater, although there may be instances where loadings as small as .30 or as large as .50 are determined as the minimum (Floyd & Widaman, 1995). A loading of .40 indicates a reasonably strong contribution of the variable to determining the “nature” of that factor.

Another feature of the results will be the percentage of variance accounted for by each factor and the total percentage of variance accounted for by the solution. A general rule is that, to meaningfully explain the interrelationships in the data, a factor structure should account for from 50% to 80% of the common variance. If factors 1 and 2 account for 40% and 20% of the variance and factor 3 accounts for only 2% of the variance, we may decide that factor 3 is too trivial to divert attention from the two more significant factors.

Table 9.2 shows the factor matrix resulting when principal axis factor analysis is applied to the correlation matrix shown in Table 9.1 , the eight variables from the Healthy Personality Inventory. Direct oblimin (an oblique) rotation was used. Using the decision criterion that factors with eigenvalues over 1.0 should be retained led to the retention of two factors with eigenvalues of 4.1 and 1.8, respectively; two factors were also indicated by the scree plot. Table 9.2 shows the resulting factor structure matrix. The most important factor in terms of variance accounted for is shown first—this factor accounts for 47% of the common variance. The second factor accounts for an additional 18% of the common variance. For the factor loadings shown for each variable on the two factors, larger loadings mean that the variable is more important to the definition of the factor. As we used oblique rotation, we allowed the factors to be correlated (they correlate r = .37). It is most useful to name the factors based on the variables that load highly on them. Thus, in the example shown, Factor 1 was named Productivity Styles, as it had strong loadings from the variables of Confident, Organized, Detail Oriented, and Goal Directed; and Factor 2 was named Interpersonal Styles, as it included Outgoing, Energetic, Adventurous, and Assertive.

Note: N = 206; the highest loading of each scale on a factor is shown. Factor 1 accounts for 47% of the common variance, whereas Factor 2 accounts for an additional 18% of the common variance. Factor 1 was named Productivity Styles, whereas Factor 2 was named Interpersonal Styles. From Borgen, F. H., & Betz, N. E. (2008). Career self-efficacy and personality: Linking career confidence and the healthy personality. Journal of Career Assessment, 16, 22–43.

Scores on the factors themselves also can be computed. Factor scores are useful if we wish to predict some type of criterion behavior from a concise set of factor scores. For example, assume that we have a battery of 15 ability tests, measuring verbal, math, and spatial ability, that we wish to use to predict job performance. When calculated in the same sample, a factor score can be computed as the sum of each variable's score multiplied by its weight on the factor(s) on which it loads significantly. However, because factor analysis is a maximization procedure, simple unit weighting of all the variables loading on a factor has been shown to provide more stable results when the scores are used in subsequent samples (Gorsuch, 1983).
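
The contrast between loading-weighted and unit-weighted factor scores can be sketched with hypothetical loadings on a single factor:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical: standardized scores on four variables that load on one factor.
z_scores = rng.normal(size=(100, 4))
loadings = np.array([0.75, 0.70, 0.65, 0.60])   # invented factor loadings

weighted = z_scores @ loadings   # loading-weighted composite (optimal in the same sample)
unit = z_scores.sum(axis=1)      # unit weighting: each loading variable counted once
```

The two composites typically correlate very highly, so unit weights give up little in the derivation sample while holding up better in new samples.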

Confirmatory Factor Analysis

Confirmatory factor analysis is used when we have an a priori hypothesis about the structure or dimensionality of the data or domain of behavior. There are several different types of such uses. One use is to examine the factor structure and/or construct validity of a scale or a set of measures: If a measure is postulated to have three underlying factors, we could use CFA to verify (or not) that structure.

The first step in CFA is to specify the model to be tested; that is, we specify which measures should load on which factor. In many cases, the CFA was preceded by an EFA to get an idea of the factor structure of the domain in question, and this structure is then tested using CFA. Hypotheses about relationships (of items or measures to factors or among factors) are operationalized in the model, usually with Thurstone’s simple structure in mind. We typically desire high loadings of items on one and only one factor, and specification of a factor by a few strongly loading items. In some cases, we postulate that one or more factors may be correlated.

Statistical software is used to compare the covariance matrix implied by the specified model to the actual matrix of covariances found in the data. A number of software programs are available, all of them updated periodically. They include the SPSS subroutine AMOS (which must be purchased separately from the standard package), LISREL (Joreskog & Sorbom, 2008), EQS (Bentler, 1995), and Mplus (Muthén & Muthén, 1998). According to Kahn (2006), the latter three yield comparable results for a CFA, although Mplus may have more user-friendly syntax and also provides other multivariate analyses not available in the other packages.

A number of fit indices are available to evaluate the fit of the model to the data. Fit indices indicate how well the actual covariances (relationships) in the data correspond to those in the hypothesized model (see Kahn, 2006, for a full explanation). The traditional chi-square test of goodness of fit is best known and indexes the differences between the model-implied covariances and those found in the data; the larger the value of the chi-square statistic, the greater the discrepancy between the hypothesized model and the data. Thus, a statistically significant chi-square indicates a lack of fit of the hypothesized model to the data. However, the chi-square statistic is highly sensitive to large sample sizes and/or a large number of observed variables and often leads to the rejection of models that are good, if not perfect. One solution to this problem is the chi-square test of close fit (rather than perfect fit) developed by Browne and Cudeck (1993). This index seems to perform better across a range of sample sizes and models.

Other fit indices are not adversely affected by sample size. These include the Bentler-Bonett non-normed fit index (NNFI; also known as the Tucker-Lewis index, TLI) and the comparative fit index (CFI). Using criteria suggested by Browne and Cudeck (1993) and Hu and Bentler (1999), CFI and NNFI (TLI) values at or above .95 indicate an excellent fit, whereas values between .90 and .94 indicate an adequate fit. The standardized root mean-squared residual (SRMR) and the root mean-squared error of approximation (RMSEA; Bentler, 1995) are other fit indices; for these, values at or below .05 indicate an excellent fit, while those between .06 and .10 indicate an adequate fit. Confidence intervals also are provided for the RMSEA. Most authors (e.g., MacCallum & Austin, 2000) recommend using multiple fit indices, paying particular attention to the RMSEA due to its sensitivity and its provision of a confidence interval.
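
For readers who want to see how these indices relate to the chi-square statistics that SEM software reports, the following sketch computes the RMSEA, CFI, and TLI from the model and baseline (independence-model) chi-squares using their standard formulas; the numeric inputs are hypothetical, not taken from any cited study:

```python
import math

def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Common fit indices from model (m) and baseline (b) chi-squares."""
    # RMSEA: per-degree-of-freedom discrepancy, adjusted for sample size
    rmsea = math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))
    # CFI: proportional improvement over the baseline (independence) model
    cfi = 1.0 - max(chi2_m - df_m, 0.0) / max(chi2_b - df_b, chi2_m - df_m, 1e-12)
    # TLI/NNFI: compares chi-square/df ratios, penalizing model complexity
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)
    return rmsea, cfi, tli

# Hypothetical results: model chi2 = 85 on 40 df, baseline chi2 = 900 on 55 df, N = 300
rmsea, cfi, tli = fit_indices(85.0, 40, 900.0, 55, 300)
print(f"RMSEA = {rmsea:.3f}, CFI = {cfi:.3f}, TLI = {tli:.3f}")
```

By the criteria above, this hypothetical model would be judged adequate (RMSEA ≈ .06) but short of excellent (CFI and TLI below .95).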

Although it makes intuitive sense that CFA would be used most often to confirm structures (tentatively) established using EFA, there are instances in which the reverse sequence can be put to good use. One example of the latter can be found in Forester, Kahn, and Hesson-McInnis ( 2004 ), who reported the results of confirmatory and exploratory factor analyses of three previously published inventories of research self-efficacy. Using a sample of 1,004 graduate students in applied psychology programs, Forester et al. began with a CFA of each of the inventories separately, finding poor fit of each to its postulated factor structure. They then used EFA to evaluate the structure of the combined total of 107 items from the three inventories, arriving at a four-factor structure in which 58 items loaded at least .50 on one and only one factor. Of course, the next logical step in this research effort would be to return to CFA to examine whether the four-factor structure holds in new samples.

Confirmatory factor analysis also is well-suited to comparing factor structures in new demographic groups or across groups (such as gender or race/ethnicity). Often, EFA will have been used to derive a factor structure in original normative samples dominated by white males (particularly in older instruments), so it is crucial to demonstrations of construct validity that the factor structure be validated, or explored anew, in other groups with which we wish to use the measure(s). For example, Kashubeck-West, Coker, Awad, Stinson, Bledman, and Mintz (2008) used CFA to show that the factor structures of three inventories of body image and eating behavior that had been derived from white samples demonstrated poor fit in samples of African American women. They subsequently used EFA to explore the factor structure in the African American samples.

Multidimensional Scaling

Although it is used less frequently than factor analysis, multidimensional scaling (MDS) also can describe the structure of a set of variables or items. Multidimensional scaling locates the variables in multidimensional (usually two-dimensional) space. Analysis of proximity or similarity data (which can be represented as correlations between variables) yields a series of points in two-dimensional space, where each point represents a variable or item and the closeness of the points represents the variables’ similarity to each other. A good example of the use of MDS in counseling psychology research is the work of Hansen, Dik, and Zhou (2008). They analyzed 20 leisure interest scales using both EFA and nonmetric MDS and found two dimensions of leisure interests in college students and retirees—expressive-instrumental (e.g., arts and crafts vs. individual sports) and affiliative–nonaffiliative (e.g., shopping vs. gardening). With MDS, each leisure interest can be described on these two dimensions. For example, shopping would be a more affiliative expressive activity, whereas arts and crafts would be an expressive but less affiliative activity; team sports would be instrumental and affiliative, whereas building and repairing would be instrumental but less affiliative. For more information about MDS, readers may consult Fitzgerald and Hubert (1987).
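
The computation underlying metric MDS can be illustrated with classical (Torgerson) scaling, which recovers coordinates from a distance matrix by double-centering and eigendecomposition. The points and distances below are hypothetical; nonmetric MDS, as used by Hansen et al., adds an iterative monotone-regression step not shown here:

```python
import numpy as np

def classical_mds(d, k=2):
    """Classical (Torgerson) MDS: embed a distance matrix d in k dimensions."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ (d ** 2) @ j                # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)             # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]           # keep the k largest
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Hypothetical items that truly lie in two dimensions
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 2.0]])
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

coords = classical_mds(dist)
# Pairwise distances among the recovered coordinates preserve the originals
# (up to rotation/reflection of the configuration).
recovered = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
print(np.allclose(recovered, dist))
```

In applied use, the "distances" would be derived from similarity or correlation data among scales, and the two recovered axes would be interpreted substantively (e.g., expressive-instrumental).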

Examining Causal Models

Structural equation models.

Structural equation modeling is actually a family of methods that subsumes many of the methods we have discussed so far. In the general case, it is a method of statistically testing a network of interrelationships among variables. It subsumes multiple regression analysis and confirmatory factor analysis but also includes path analysis and testing of full structural equation models. To understand the distinctions among these methods, it is useful to define two possible components of a structural model.

Elements and Path Diagrams

The elements of a structural model are measured variables (MVs), usually shown as squares or rectangles, and latent variables (LVs), usually shown as ellipses or circles. Measured variables are (as the name implies) those that are measured directly, whereas LVs are unobservable constructs. In addition to measured and latent variables, the model must postulate relationships among variables, including error terms. These relationships are represented as unidirectional (one-way) and bidirectional (two-way) arrows. The values assigned to or resulting from directional relationships are regression coefficients, whereas those for nondirectional relationships are covariances (or correlations, if variables are standardized). Variables in the model can be endogenous or exogenous. Endogenous (dependent) variables are those that receive a directional influence from one or more other variables in the system. Exogenous (independent) variables are those that do not receive directional influence from within the system; their influences may be unknown or may not be of interest in the current model.

Figure 9.2 (A) Path model of predictors of loneliness in college students. (B) The path model predicted 45% of the variance in loneliness. From Hermann, K. (2005). Path models of the relationships of instrumentality and expressiveness, social self-efficacy, and self-esteem to depressive symptoms in college students. Unpublished doctoral dissertation, Department of Psychology, Ohio State University.

The simplest model is a path model, in which each construct is measured directly by a single measure—thus, a path model actually represents relationships among a series of measures (with directional relationships shown as arrows). A sample path model is shown in Figure 9.2A, where the researcher (Hermann, 2005) was examining variables related to loneliness in college students. Note that all variables are shown as rectangles, as the model does not incorporate latent variables; that is, each variable is assumed to be measured fully by one scale. Note also that only instrumentality in this model is exogenous—all other variables are endogenous; that is, they are postulated to be predicted by variables earlier in the model.
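
Because a path model contains no latent variables, it amounts to a system of regressions among measured variables, and its standardized path coefficients can be sketched with ordinary least squares on z-scores. The variables and coefficients below are simulated for illustration and do not reproduce Hermann's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated recursive system: x is exogenous; m and y are endogenous.
# Error variances are chosen so each variable has unit variance.
x = rng.standard_normal(n)
m = 0.6 * x + rng.standard_normal(n) * np.sqrt(1 - 0.6**2)   # path x -> m
y = -0.4 * m + rng.standard_normal(n) * np.sqrt(1 - 0.4**2)  # path m -> y

def std_beta(predictors, outcome):
    """Standardized regression weights via least squares on z-scores."""
    z = lambda v: (v - v.mean()) / v.std()
    design = np.column_stack([z(p) for p in predictors])
    return np.linalg.lstsq(design, z(outcome), rcond=None)[0]

print(std_beta([x], m))  # approximately 0.6
print(std_beta([m], y))  # approximately -0.4
```

SEM software estimates all such paths simultaneously and adds fit indices, but the coefficients themselves carry the same regression interpretation.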

A full structural model consists of two parts: a measurement model, which represents the relationships of LVs and their indicators; and a structural model, which represents the interrelationships among LVs, both independent and dependent. In a full structural equation model (SEM), the measurement model is tested first to evaluate the fidelity with which the measures serve as valid indicators of the constructs; following that, the full structural model is tested (see Figure 9.3 for an example of a full structural model).

Measurement and structural models of intuitive eating.

The steps in SEM are (1) model specification, (2) identification, (3) estimation, and (4) modification (Schumacker & Lomax, 2004). In the first step, the researcher hypothesizes the relationships (including lack of relationship) among all variables. As in CFA, relationships between variables, also known as parameters or paths, must be either specified in advance or determined from the analysis of the correlation or covariance matrix. A free parameter is one whose value is unknown and must be estimated, whereas a fixed parameter is one we determine in advance; the latter also is known as a constrained parameter (Weston & Gore, 2006). Three types of parameters are necessary in a structural model. First, direct-effects parameters specify relationships between an LV and its postulated MVs (known as factor loadings) and between LVs (known as path coefficients); to scale each LV, it is common to fix one of its factor loadings at 1.0, which gives the LV the metric of that indicator, and parameters other than those fixed at 1.0 must be estimated (shown as asterisks). Second, error terms for dependent measured and latent variables must be either fixed or estimated. Third, covariances among exogenous variables are specified as parameters as well.

Model Identification

Identification refers to the relationship between the number of parameters to be estimated and the number of data points in the correlation or covariance matrix. The number of unique elements in a correlation or covariance matrix of k variables is [k(k + 1)]/2; if there are six variables, there are [6(7)]/2 = 21 elements in the matrix. Subtracting the number of parameters to be estimated from the number of elements yields the degrees of freedom for the analysis—if this value is positive, the model is said to be overidentified, which is the optimal situation.
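
The counting rule is straightforward to express in code; the function below (a hypothetical helper, not part of any SEM package) classifies a model from its variable and free-parameter counts:

```python
def identification_status(k, n_free_params):
    """Degrees of freedom from k observed variables and the free-parameter count."""
    n_elements = k * (k + 1) // 2        # unique (co)variances in the matrix
    df = n_elements - n_free_params
    if df > 0:
        status = "overidentified"        # testable: the optimal situation
    elif df == 0:
        status = "just-identified"       # fits perfectly, cannot be tested
    else:
        status = "underidentified"       # cannot be estimated uniquely
    return n_elements, df, status

# Six variables and 13 free parameters: 21 elements, 8 degrees of freedom
print(identification_status(6, 13))  # (21, 8, 'overidentified')
```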

Structural equation modeling software is needed to estimate the free parameters and to provide indices of the fit of the postulated model to the data. This software includes the same programs used for CFA: LISREL, AMOS (SPSS), Mplus, and EQS. Most programs report path coefficients as standardized (β) or unstandardized (b) weights, along with standard errors, analogous to the results of a regression analysis. The statistical significance of the weights can be computed (or may be provided by the software), and the sizes of standardized weights may be compared directly as indicators of relative importance.

The worth of the model tested can be evaluated by the significance and size of the path coefficients (indicating the strength of relationships among the variables), the amount of variance accounted for in endogenous variables, and indices of model fit. As in CFA, fit indices include the chi-square test of goodness of fit, in which a nonsignificant value is indicative of fit, and the chi-square test of close fit, postulated to be a more realistic examination of fit. Other indices are the NNFI (Tucker-Lewis index), CFI, RMSEA, and SRMR (see Weston & Gore, 2006, p. 742, for full descriptions). Criteria for good and adequate fit were described previously for CFA, but it is important to recognize that fit indices do not always agree with one another, so the use of multiple indicators of fit is recommended (MacCallum & Austin, 2000).

Figure 9.2B shows the results of a simple path analysis of the model presented in Figure 9.2A using a sample of 696 college students (Hermann, 2005). The path coefficients are regression coefficients, ranging from .58 (between instrumentality and social self-efficacy) and –.44 (between social self-efficacy and loneliness) down to .03 (between instrumentality and self-esteem). All paths except the last were statistically significant in testing this model. Results concerning the fit of the model were mixed, with acceptable values of the RMSEA, NNFI, and CFI but a statistically significant chi-square (which, it should be recalled, is sensitive to large sample sizes). Further testing indicated that the model demonstrated a good fit in males (N = 346) but an inadequate fit in females (N = 350) (see Hermann & Betz, 2006, for the final published findings).

Figure 9.3 presents an example of a full structural equation model of intuitive eating developed and tested by Avalos ( 2005 ). Three indicators (or parcels of items) were constructed from the scales measuring each LV following the recommendations of Russell, Kahn, Spoth, and Altmaier ( 1998 ). Parcels were constructed by using EFA to derive the loadings of scale items on a single factor—items were successively assigned to the three parcels from highest to lowest loadings, so that the quality of the parcels as measures of the LVs is roughly comparable.
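
The successive-assignment scheme described above can be sketched as a simple round-robin allocation of items ranked by their loadings; the loadings below are hypothetical, and the function name is ours:

```python
def build_parcels(loadings, n_parcels=3):
    """Assign items to parcels round-robin, from highest to lowest loading."""
    ranked = sorted(range(len(loadings)), key=lambda i: loadings[i], reverse=True)
    parcels = [[] for _ in range(n_parcels)]
    for rank, item in enumerate(ranked):
        parcels[rank % n_parcels].append(item)   # spread strong items evenly
    return parcels

# Hypothetical single-factor loadings for nine scale items (indices 0-8)
loadings = [0.81, 0.47, 0.62, 0.75, 0.55, 0.68, 0.43, 0.71, 0.59]
print(build_parcels(loadings))
```

Because each parcel receives one of the top three items, one of the next three, and so on, the parcels end up roughly comparable as indicators of the latent variable.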

The measurement and structural components of the model were tested in 461 college women. Testing of the measurement model using CFA indicated fit indices ranging from adequate (RMSEA = .060) to excellent (CFI = .982, TLI = .975, and SRMR = .041). All indicators/item parcels loaded significantly on their latent factors, suggesting that all latent factors were adequately measured. The paths from the LVs to the parcels indicate the parcel loadings—in essence, the factor saturation of each parcel; as can be seen, all parcels loaded highly on their respective LVs. Fit indices for the structural model also were adequate (RMSEA = .058) to excellent (CFI = .982, TLI = .977, SRMR = .046). All paths between LVs were statistically significant and ranged from .28 to .63. Thus, this is a plausible model (although not the only plausible model that could be hypothesized), and it shows a possible causal pathway by which variables related to acceptance can facilitate intuitive eating.

Modification Indices

When the model does not fit optimally (one or more of the fit indices indicates poor or inadequate fit), some researchers use modification indices, through what is known as a specification search, to attempt to improve it. Two major modification indices are the Wald test, which uses a chi-square difference test to identify (non-zero) paths that might profitably be eliminated, and the Lagrange multiplier (LM) test, which uses a chi-square difference test to identify new paths that would significantly improve the model if added (Bentler, 1995).

MacCallum, Roznowski, and Necowitz ( 1992 ) suggested that, to avoid a data-driven model that capitalizes too much on sample specificity, only changes that are theoretically meaningful, based on prior evidence, should be made. And because modification of structural equation models based on statistical indices has been challenged as data-driven and often unstable across samples, it is important to cross-validate the modified model (MacCallum & Austin, 2000 ). This is done using calibration and validation samples, of which the first should be about two-thirds of the entire sample to provide stable initial parameter estimates.

Structural equation modeling can be used to compare models, for example, by testing models across populations (e.g., Fassinger, 1990; Lent et al., 2005). It also can be used to examine longitudinal designs (e.g., Tracey, 2008) and to explore experimental designs more generally (see Russell, Kahn, Spoth, & Altmaier, 1998).

It should be clear to readers that many analytic methods are appropriate for use with a variety of kinds of quantitative data. Careful consideration of the purposes of the research and the type of data at hand or accessible will facilitate meaningful and useful analyses. In the following section, we turn to qualitative research methods.

Qualitative Methods

Qualitative approaches to research increasingly are being used in counseling psychology, resulting in what Ponterotto ( 2005 ) described as “a gradual paradigm shift from a primary reliance on quantitative methods to a more balanced reliance on quantitative and qualitative methods” (p. 126). In contrast to the nomothetic perspective of quantitative approaches, which seeks to identify large-scale normative patterns and universal laws, qualitative approaches take an idiographic perspective, focusing instead on in-depth understanding of the lived experiences of individuals or small groups. As outlined in the first half of this chapter, quantitative methods rely on quantifying carefully measured observations amassed from large samples (with some measure of control over the variables as the ideal) and statistically analyzing data to produce models of relationship and prediction thought to apply to the general population. Qualitative methods, on the other hand, rely on detailed or “thick” description of context-specific phenomena, most typically as narratives voiced by relatively small numbers of persons, with transparent interpretation by the researcher into descriptions, summaries, stories, or theories thought to capture the complexity of the phenomena under investigation. In quantitative research, the researcher seeks to remain distant and objective to avoid contaminating the data gathering process, such that the data stand as accurately as possible as a representation of an assumed reality apart from the researcher. In qualitative research, however, data collection occurs through the relationship between the researcher and the participant(s) in a co-creative process, and consideration of the subjectivity of the researcher is woven deliberately into every phase of the research process.

Thus, the perspectives, purposes, processes, and products of qualitative research are very different from those of quantitative research, and they require different mind sets and different standards for assessing quality and rigor. Readers should keep these complexities in mind in reviewing the following section, in which we present brief overviews of the most commonly used qualitative approaches within psychology and/or those most likely to enter the repertoire of counseling psychologists. We note that these do not represent the full range of qualitative approaches available; discourse analysis and case study methods, for example, offer considerable possibilities in counseling psychology (the former in studying counseling interactions and the latter in organizational consultation, for example), but they do not appear to have been embraced within our field at this time. We also note that, due to space limitations, we simply present broad descriptions of some of the distinctive features of these approaches, and readers should consult several excellent handbooks and overviews of qualitative methods (e.g., Camic, Rhodes, & Yardley, 2003; Creswell et al., 2007 ; Denzin & Lincoln, 2000 ; Patton, 2002 ; Ponterotto, Haverkamp, & Morrow, 2005 ) to learn about these and other methods in greater detail.

Common Qualitative Research Methods

Grounded theory.

Rooted in sociology and symbolic interactionism, grounded theory is a highly influential qualitative approach that is widely used throughout the health, social, and behavioral sciences, including counseling psychology (Charmaz, 2000; Fassinger, 2005; Henwood & Pidgeon, 2003; Rennie, 2000). Developed by Glaser and Strauss (1967) and further articulated by these researchers and colleagues (e.g., Glaser, 1992, 2000; Strauss, 1987; Strauss & Corbin, 1998), grounded theory is so named because its aim is to produce theories that are “grounded” in participants’ lived experiences within a social context. The central question of grounded theory is: “What theory emerges from systematic comparative analysis and is grounded in fieldwork so as to explain what has been and is observed?” (Patton, 2002, p. 133).

Theory-building takes place inductively and iteratively using a method of “constant comparison,” in which data collection, coding, conceptualizing, and theorizing occur concurrently in a process of continually comparing new data to emerging concepts until theoretical saturation is reached (no new information is being generated); at this point, data collection/analysis ends and relationships among the emergent constructs are articulated in the form of an innovative theoretical statement about the behavior under investigation. Data usually consist of detailed narratives obtained in extensive interviews with participants, although other forms of data (e.g., observations, archival documents, case notes) can be used as well. The relationship between the researcher and the participant forms the foundation for the participant’s deep sharing of the lived experience, and there is an expectation that participants’ perspectives and feedback will be included throughout the process of data analysis and theory articulation, thus ensuring that the theory remains grounded in the participants’ lived experiences (Charmaz, 2000; Fassinger, 2005; Henwood & Pidgeon, 2003).

Although there is some debate about the appropriate paradigmatic home for grounded theory, it most often is presented as a constructivist-interpretivist approach (Charmaz, 2000; Fassinger, 2005; Henwood & Pidgeon, 2003). This makes sense, given its ontological and epistemological assumptions that researchers and participants will, through their relationships, co-construct accounts of the deep meanings of subjectively experienced realities, as well as its axiological and methodological foci on revealing, recording, and monitoring the expectations and interpretive lenses of the researcher. However, Fassinger (2005) has argued that the considerable flexibility of the grounded theory approach allows for its conceptualization and use across a broad paradigmatic range, from, for example, a post-positivist attempt to triangulate quantitative data to the liberationist aims of giving voice to and empowering marginalized populations characterized by the critical-ideological paradigm.

Fassinger ( 2005 ) further asserts that grounded theory can serve as a paradigmatic bridge for researchers. It allows those researchers holding fast to positivist and post-positivist empirical values to begin to venture into more naturalistic territory using the highly specified, rigorous analysis procedures of grounded theory. On the other hand, those who are oriented toward radical social reformation can find in this approach a means to tackle some of society’s most challenging problems. The adaptability of grounded theory is particularly well-suited to counseling psychology, as exemplified by the wide range of studies in our field that have used this approach successfully. Examples include the work of Fassinger and her colleagues (Gomez et al., 2001 ; Noonan et al., 2004 ; Richie et al., 1997 ), as well as Morrow and Smith ( 1995 ), Rennie ( 1994 ) and Kinnier, Tribbensee, Rose, and Vaughan ( 2001 ).

Narratology

Although narratives and narrative analysis techniques are used widely in many different approaches to qualitative research, Hoshmand (2005) uses the term “narratology” to denote a distinct qualitative perspective that is informed by narrative theory. Shaped broadly by the work of narrative theorists such as Foucault and Ricoeur and articulated within psychology by Polkinghorne (1988, 2005) and others, the narratological approach to research is “concerned with the structure, content, and function of the stories that we tell each other and ourselves in social interaction” (Murray, 2003, p. 95). Its central question is: “What does this narrative or story reveal about the person and the world from which it came? How can this narrative be interpreted to understand and illuminate the life and culture that created it?” (Patton, 2002, p. 133).

Narratology relies on a “narrative mode of understanding” human experience (Hoshmand, 2005, p. 180) in which the researcher interrogates narratives of individuals’ lived experiences for the story-like elements that underlie those narratives. In this approach, narratives are considered to be storied accounts of experience that have an internal, developmental coherence containing plot-like elements, thematic meanings, self-presentational style aspects, and temporal and causal sequences, and that are mediated by culture, historical time, and other contextual elements. Narratological inquiry seeks both to understand narratives and to construct storied accounts of particular lived phenomena. Data may consist of documents already rendered in narrative form (e.g., interviews, oral histories, biographies, journals) or may be more loosely organized pieces of information (e.g., chronological events, observations, cultural artifacts) that will be translated into narrative form by the researcher in the data analysis process. Analyzing data may take several forms (e.g., linguistic/literary, grounded, contextual), but each approaches the narrative holistically within its social context, and arranges its elements into a coherently and chronologically sequenced account of experience (Hoshmand, 2005; Murray, 2003).

Hoshmand ( 2005 ) asserts that narratological research approaches are still evolving, and that what exist currently to guide researchers are simply concepts and principles rather than a unified method per se. Paradigmatically, narratological approaches appear to be constructivist-interpretivist in their reliance on the co-construction of the storied account and the importance of researcher positionality. However, Hoshmand ( 2005 ) distinguishes the narrative mode of understanding, which is focused on “descriptive and discovery-oriented research involving configural patterns of interpretation and a part-to-whole logic of argumentation” (p. 181), from the “paradigmatic mode of interpretation brought to bear on narrative data such as the theorizing stage of grounded theory” (p. 181), reinforcing the difference between narrative analysis and narratological inquiry. The focus of the narratological approach on the formation and expression of individual and cultural identity through story also renders it particularly useful for multicultural research, an area of interest to many counseling psychologists. Examples include Winter and Daniluk (2004) and Hardy, Barkham, Field, Elliott, and Shapiro ( 1998 ).

Ethnography

Spawned from the cultural anthropology of the turn of the 20th century and the work of such giants as Boas and Malinowski, ethnography has found its way slowly into contemporary psychology, highlighted recently for counseling psychologists by Suzuki and her colleagues (Suzuki, Ahluwalia, Mattis, & Quizon, 2005). Focused on groups of people within their cultures and communities, ethnography asks as its central question: “What is the culture of this group of people?” (Patton, 2002, p. 132).

The ethnographic approach focuses on studying the cultural and community life (behaviors, language, artifacts) of individuals, and relies on the researcher functioning as a participant-observer in extensive fieldwork under conditions of prolonged engagement with the community (e.g., 6 months to 2 years or more). Interviewing and direct observation are the chief means of data collection, although archival records, surveys, and other documentation may be used as well. The end product of ethnographic research is the creation of narratives that are thought to capture the lived experiences of people in their complex cultural contexts, an aim that is consistent with and amenable to the multicultural emphasis within counseling psychology (Miller, Hengst, & Wang, 2003 ; Suzuki et al., 2005 ).

Ethnographic approaches can span the paradigmatic spectrum from post-positivist methods that rely largely on observations and quantitatively organized data (particularly in seeking out negative cases or contradictory information) to critical-ideological aims of giving voice to and thus empowering marginalized populations, especially if used in multicultural research in counseling psychology, as advocated by Suzuki et al. ( 2005 ). However, in its ideal form, ethnography probably most closely fits the constructivist-interpretivist paradigm in its epistemological focus on the awareness of the “subjectivities” and “guesthood” of the researcher, a position of genuine connection with participants balanced by enough distance to avoid compromising data collection or interpretation (Miller et al., 2003 ).

Indeed, one of the most intense debates within the ethnography literature concerns how and to what extent the insider or outsider status of the researcher influences the investigation, a debate that focuses, at its heart, on the relative roles of researchers and participants in co-constructing the final account of the lived experience under investigation (Miller et al., 2003 ; Suzuki et al., 2005 ). From a methodological perspective, the expectation that cultural immersion and a reflexive research stance will produce narratives and observational data that constitute an accurate or true reflection of lived cultural experience implicitly recognizes the subjectivity of the researcher in the co-construction of the account (as well as the need to monitor that subjectivity). Moreover, the assumption that ethnographers will decide upon “skill sets, material goods, or resources that they can and will gift to the community” (Suzuki et al., 2005 , p. 211; italics ours) also acknowledges the distance of the researcher even in the final procedural stages of a study that may have involved months or years of connection with participants.

The many types of ethnographic approaches available to researchers (e.g., memoir, life history, narrative ethnography, auto-ethnography) suggest wide variability in types of data and methods of interpreting those data (Miller et al., 2003; Suzuki et al., 2005). Moreover, aspects of the ethnographic approach can be found in similar research methods that may be more familiar to counseling psychology researchers (e.g., community-based research). These approaches offer considerable heuristic value, particularly in multicultural counseling psychology research. Examples can be found in Miller, Wang, Sandel, and Cho (2002), Pipher (2002), and Suzuki, Prendes-Lintel, Wertlieb, and Stallings (1999).

Phenomenology

Rooted in the work of philosopher Edmund Husserl and the later American phenomenological and existential psychologists, the phenomenological approach has as its central question: “What is the meaning, structure, and essence of the lived experience of this phenomenon for this person or group of people?” (Patton, 2002 , p. 132). Phenomenology is a descriptive method of investigating the life-worlds of individuals, wherein the researcher “attempts to grasp the essence of the individual’s life experience through imaginative variation” (Wertz, 2005 , p. 172).

In this approach, the researcher seeks to enter empathically into the participant’s life-world to understand and communicate the subjective meaning of an individual’s lived experience. This is accomplished through a reflective process of suspending assumptions and biases and focusing on a phenomenon itself, then imaginatively varying concrete instances of the phenomena to distill their essential features, culminating in a description that is thought to portray the essence of that lived experience (Giorgi & Giorgi, 2003 ; Wertz, 2005 ). This kind of “intentional analysis begins with a situation just as it has been experienced—with all its various meanings—and reflectively explicates the experiential processes through which the situation is lived” (Wertz, 2005 , p. 169).

Data are collected as descriptions, and although typically they are direct verbal or written accounts from participants and others who interact with and/or know participants or can provide some kind of insight on the phenomenon under investigation, data also may consist of other forms of expression such as drawings, fictional accounts, poetry, and the like. Analysis consists of generating “situated descriptions” of the participant’s experience, organized sequentially or thematically, that then are mined for underlying psychological meanings and processes. The descriptions finally are synthesized into a case study representation that can be considered together with other cases to locate general themes and experiences, as well as variations in “knowledge of types” (Wertz, 2005 , p. 173). The final product is a context-bound descriptive presentation of the psychological structure of participants’ experiences in a specific life domain (Giorgi & Giorgi, 2003 ; Wertz, 2005 ).

Wertz ( 2005 ) locates phenomenology as the historical birthplace of contemporary qualitative research, and yet he also distinguishes phenomenology from other qualitative approaches in its unwavering commitment to bracketing researcher presuppositions and biases and its singular emphasis on pure description. Giorgi and Giorgi ( 2003 ) argue that phenomenology as a method is distinct from phenomenology as a philosophical endeavor, and it generally is acknowledged that phenomenology shares many procedural elements with other qualitative approaches (e.g., Giorgi & Giorgi, 2003 ; Wertz, 2005 ). As a very well-established research approach (including an entire curriculum devoted to phenomenology at Duquesne University), and one with high relevance to many areas of psychology, phenomenology has much to offer counseling psychologists. Examples can be found in Arminio ( 2001 ), Friedman, Friedlander, and Blustein ( 2005 ), and Muller and Thompson (2003).

Participatory Action Research

Emanating from the work of Kurt Lewin (Fine et al., 2003) and embodied in the writings of liberationists such as Frantz Fanon and Paulo Freire (Kidd & Kral, 2005), participatory action research is widely used in community psychology, as well as in other social science fields. Participatory action research (PAR; Fine et al., 2003; Kidd & Kral, 2005) has as its goal the creation of knowledge that directly benefits a group or community (typically marginalized, disenfranchised, or disempowered in some way) through political and social empowerment. Its central question is: How are systems of power and privilege manifested in the lived experiences of this person or group of people, and how can knowledge be gained and used to raise consciousness, emancipate, and empower this person and group?

It is an approach in which researchers and participants work collaboratively over an extended period of time to assess a need or problem in a particular social group, gather and analyze data, and implement results aimed at the “conscientization” (raising consciousness) of and giving voice to individual participants, such that their collective empowerment leads directly to social action and change. Participatory action research takes an unabashedly political stance, and, ideally, the values of the researcher and the participants mesh to drive the social change agenda. The involvement of the researcher is prolonged and intensive, and the success of a PAR project is judged by the manner and extent of changes that have occurred in the lives of participants (Fine et al., 2003; Kidd & Kral, 2005).

Although PAR clearly fits within the critical-ideological paradigm, given its focus on power relations and structural inequality, its goals of individual and group empowerment and social change, and its positioning of the researcher as a collaborator, it actually is more of a hybrid approach in many of its features. Data in a PAR project can take virtually any form (including quantitative surveys and statistical analyses of archival data), and its final products may include a wide range of artifacts, such as position papers, policy statements, charts and tables, records, and even speaking or lobbying activities. Kidd and Kral (2005) assert that PAR is not actually a method but rather is “the creation of a context in which knowledge development and change might occur—much like building a factory in which tools may be made rather than necessarily using tools already at hand” (p. 187). In this sense, PAR is much like organizational consultation in its collaborative approach to assessing needs, gathering data about what is happening in the collective, ensuring that all are given voice in articulating problems and determining future directions, and building readiness for implementation of clearly specified changes and goals.

Almost all experts in PAR note its challenges in practical use, including lack of time and resources for the prolonged engagement that PAR requires, resistance within traditional psychology to the overtly radical change agenda PAR espouses, and deeply entrenched societal and professional disrespect and disdain for the stigmatized, disenfranchised groups that PAR usually seeks to empower. Moreover, lack of knowledge and training in the PAR approach, and the emotional and psychological energy PAR requires from researchers (including the need for flexibility, good group management skills, and the ability to share power), make it difficult for some researchers, particularly novices. Finally, the volatile and changing nature of social groups and social problems can render the research-intervention goal of PAR a moving target, and there are often contextual barriers that make community participation and change extraordinarily difficult. Nevertheless, PAR is well-suited to the diversity and social justice focus within counseling psychology, and it provides unprecedented ways to enact the scientist–practitioner–advocate model of professionalism (Fassinger, 2001; Fassinger & O’Brien, 2000) becoming ever more popular in our field. Examples include Leff, Costigan, and Power (2004), O’Neill, Small, and Strachan (2003), and Fine et al. (2003).

In this final section on the various qualitative methods, we include two approaches developed by counseling psychologists that are (so far) used by small groups of researchers largely within counseling psychology. These approaches are consensual qualitative research (CQR), developed by Hill and her colleagues (Hill, Knox, Thompson, Williams, Hess, & Ladany, 2005), and the action-project method of Young and his colleagues (Young, Valach, & Domene, 2005).

Consensual qualitative research, the better known of the two approaches, was developed in the mid-1990s in an effort to create an easy-to-use method of summarizing narrative data, and it has been used primarily in counseling-related investigations to date. Most qualitative methods assume, implicitly or explicitly, that more than one researcher will be participating in data gathering and/or data coding, monitoring, and interpretation. Consensual qualitative research, however, clearly delineates a team approach to collecting data through structured interviews (or counseling sessions) that are consistent across participants, with systematic coding and summarizing of data utilizing interjudge ratings, discussion, and consensus.

Hill and colleagues have claimed that CQR fits within a constructivist-interpretivist paradigm (Hill et al., 2005), but there is considerable disagreement on this point. Most experts acknowledge that it is strongly post-positivist in its use of theoretically or empirically generated structures for framing the study and starting the coding process, its reliance on consistency across participants in data gathering (including the search for negative cases or disconfirming evidence), its goal of achieving inter-rater agreement in coding, its quantitatively oriented analytic techniques and rhetorical structures, and its overall attempt to maintain researcher objectivity as much as possible throughout the research process. Indeed, it bears considerable similarity to simple content analysis in its aims and procedures. Nevertheless, it offers clearly specified procedures and a substantial number of model studies (especially related to counseling processes) for those interested in undertaking their own investigations. Moreover, it provides a viable starting point for researchers interested in qualitative techniques but needing more gradual movement in that direction. An example is Juntunen, Barraclough, Broneck, Seibel, Winrow, and Morin (2001).

The action-project method (Young, Valach, & Domene, 2005) is based in action theory and concerns itself with intentional, goal-directed behavior. It utilizes a three-dimensional model of action that incorporates perspectives on action and levels of action organization into four action systems: individual action, joint action, the action project, and the career (Young et al., 2005, p. 217). Individual and joint actions are short-term, everyday occurrences whose common themes and goals cumulatively compose the longer-term “project”; projects, in turn, are organized over the long term into a “career” of action that holds significant importance in one’s life.

Young et al. (2005) insist that the action-project method does not fit into any of the existing qualitative paradigms, but rather “represents a unique epistemology and research paradigm” (p. 218). Certainly, the data collection procedures in the action-project method are distinctive and worth noting. They can include several taped dialogues over an extended period of time (e.g., 6 months), each of which subsequently is replayed and commented upon separately by each participant; the data then are coded and summarized for distribution back to the participants, and supplemented by journal entries, phone conversations, and electronic communications. All of these data are captured in an analysis that is fed back to participants in what is essentially a behavioral intervention, a process that may be repeated many times with the same participants in the course of a study. Indeed, the action-project approach resembles a highly specified behavioral counseling process, and it is not surprising that it offers considerable utility in understanding interpersonal interactions and their impact on the goals and behaviors of the individuals involved. Counseling psychologists with strong interests in the integration of science and practice might find the action-project method especially compelling. An example is found in Young et al. (1997).

Basic Issues to Consider in Qualitative Research

Regardless of the specific qualitative approach a researcher decides to adopt, there are a number of basic issues and challenges with which every researcher must grapple. In this section, we review sampling, data collection, researcher role, data analysis and communication, evaluation, and ethical considerations in conducting qualitative inquiry. Although we present these issues separately, it is important to remember that these decisions are inextricably linked paradigmatically, ontologically, epistemologically, axiologically, methodologically, and rhetorically.

Sampling

Quantitative sampling strategies, because they are focused on generalizing findings, always are aimed at isolating a clearly bounded group of observations (represented numerically) that is sizable enough to support statistical inferences regarding the overall population of interest. In qualitative research, however, the goal is an in-depth understanding of the meaning of a particular life experience to those who live it, and data most often consist of narratives, observations, field notes, researcher journals, and other kinds of data that are represented (primarily) linguistically. Sample size depends entirely upon saturating the data set—that is, collecting enough data to satisfy the judgment of the researcher that no new information would be gained by additional cases. Thus, sample sizes in terms of actual participants typically are much smaller in qualitative than in quantitative studies, but the data sets themselves are much larger and more complex.

Given the aim of in-depth understanding, sampling in qualitative inquiry is always “purposeful,” that is, designed to select participants who will provide the most “information-rich” accounts of the phenomena of interest (Patton, 2002, p. 239). The purposes in “purposeful” sampling can be quite varied, depending on the focus of the research. Patton (2002), for example, includes 15 different types of sampling strategies that may be of interest to qualitative researchers, including maximum variation, homogeneous, extreme case, snowball, intensity, typical case, critical case, disconfirming case, and other kinds of sampling.

Qualitative sampling also is criterion-based, in that the specific criteria used in selecting participants are based on the research questions that guide the inquiry as well as the particular qualitative approach being used (Creswell, Hanson, Clark, & Morales, 2007; Morrow, 2007). In a phenomenological study, for example, the sample may consist of a small group of individuals who share a very specific common experience (e.g., priests accused of sexual abuse), whereas a participatory action research approach may call for a sample that includes an entire community or organization (e.g., a shelter providing services for women victimized by partner violence). In addition, there are decisions to be made about the extent and kind of contact with participants, ranging from a single lengthy interview with follow-up contact to immersion in and observation of a community over several years. These decisions about who will participate in the study and what the length and nature of the contact will be also determine how the process of actually gathering data will occur.

Data Collection

As the goal of data collection in qualitative inquiry is to ensure that all information relevant to understanding a particular phenomenon is obtained (i.e., the data set is saturated), the process of gathering data often is both prolonged and iterative. Because interviews, observations, extensive field notes, cultural artifacts, and other similar kinds of documentation form the corpus of data, it is not unusual for months and even years to be devoted to data collection. Moreover, most qualitative methods assume some sort of additional contact with participants to verify the researcher’s interpretations during the analysis process, creating an iterative cycle of data collection, researcher analysis, participant feedback, additional data collection and/or analysis, and repeated feedback from participants until no new information is emerging from the process.

Much has been written about the primary data tool in qualitative research: the individual interview. Researchers must conceptualize and articulate their interview strategy in terms of length, depth, kinds of open-ended questions, degree of structure, degree and kind of probing for sensitive information, ways of ensuring that participants’ words and ideas are being captured, and ways of monitoring their own reactivity. Patton (2002, as but one example) includes detailed discussion of theoretical and practical issues in planning and conducting interviews, including sections on focus groups and cross-cultural interviewing, as well as numerous tables and checklists. It is imperative that counseling psychologists undertaking qualitative research for the first time consult such resources, as there may be a tendency to assume that competent clinical interviewing skills fully prepare one for conducting interviews aimed at gathering data for research purposes. However, the roles of scientist and helper are very different, and, although good clinical skills may facilitate the kind of relationship-building that is critical to the success of any interview, acquiring an in-depth understanding for research purposes requires a different mindset and approach than coming to understand an individual therapeutically (an ethical issue to which we return below).

Because data collection in qualitative research is implemented in a deeply interpersonal manner, the researcher also must consider when and how entry into the research context will occur and how trust and rapport will be established. Again, the form that this process takes is determined largely by the qualitative approach being used. If interviews with participants who have no connection to one another constitute the primary means of data collection, then the task becomes one of establishing credibility and trust one individual at a time. But if data collection includes multiple interviews, behavioral observations, and scrutiny of organizational documents within a group of highly interconnected individuals, then the tasks of entry into the organization, identification of key informants, rapport-building, and role clarification will be considerably more complex. Similarly, exit from the research context also is driven by approach; a study that relies on single, isolated interviews requires a different process of following up and sharing findings than does a study of an entire community in which multiple stakeholders desire a product they can use to initiate political redress of identified problems. Of course, these different approaches to relationship-based data collection also have important implications for the role and stance of the researcher.

Researcher Role

Implementing interpersonally based inquiry requires a different researcher stance than that taken in most quantitatively based studies, in which the goals are appropriate distance, control, and avoidance of researcher contamination of data. Because qualitative research relies on co-constructed representations of lived experience, the researcher is rendered both a participant and an observer in the investigative process, with values, assumptions, and world views that must be made conscious and articulated clearly. As both participants and observers, researchers must grapple with the tension inherent in those roles, including the extent to which they want to function emically (as insiders) or etically (as outsiders), the degree to which they want their observations to be overt or covert, the amount of self-disclosure and collaboration they will offer, the expectation of entering a long-term or a shorter-term relationship with participants, and the extent to which they will function as catalysts for change (Fine, 1992, 2007; Morrow, 2007; Patton, 2002).

Clearly, a research approach that requires interpersonal connection as its foundation and squarely places the researcher within that connection calls for a researcher stance that differs markedly from the distanced position of quantitative approaches. Researcher “reflexivity” is the term used most often to capture this stance (e.g., Marecek, 2003; Morrow, 2005, 2007), and refers to the capacity to use one’s own experiences, thoughts, and feelings; to recognize and understand one’s own perspectives and world views; and to actively and constantly reflect upon the ways in which those might influence one’s experience of observing, collecting, understanding, interpreting, and communicating data. Rennie (2004, cited in Morrow, 2005) described reflexivity as self-awareness and agency within that self-awareness. Moreover, reflexivity is not just about the self; it also includes deep reflection about those studied, the intended and unintended audiences for the inquiry, and the cultural and historical context in which the scientific endeavor occurs. Fine (1992) captured the complexity of the reflexive stance in her description of qualitative researchers as “self-conscious, critical, and participatory analysts, engaged with but still distinct from [their] informants” (p. 254).

In actual practice, researcher reflexivity is facilitated through a variety of strategies that are articulated somewhat differently depending on the particular qualitative approach being used. These strategies may include publicly articulating one’s biases through researcher-as-instrument statements, bracketing and monitoring one’s biases, being rigorously subjective in one’s observations and interpretations, keeping and using field notes throughout the research process, continuously separating description from interpretation and judgment, using thick description to ensure remaining close to participants’ experiences, maintaining an appropriate balance between participation and observation, returning again and again to the data and/or participants to verify one’s interpretations, memoing or keeping a journal throughout the course of the study, and using external auditors or teams of multiple researchers to maintain systems of peer checking and review (Morrow, 2005). Researcher reflexivity is important throughout the entire inquiry, but it is especially critical during the process of analyzing, interpreting, and communicating the data in the study.

Analysis, Interpretation, and Communication of Data

As noted earlier, qualitative approaches differ in the extent to which systematic analytic principles have been detailed in specific how-to formats. However, all offer conceptual delineations of data analyses that parallel the core paradigmatic assumptions of the approach. Thus, a grounded theory analysis moves the researcher through a system of coding and constantly comparing data to an end point of generating an emergent theory grounded in the lived experiences of the participants. A narratological researcher, on the other hand, will (re)arrange narratives into a chronologically and psychologically coherent, storied account of lived experience. Participatory action researchers involve the constituent group(s) in making sense of the data and consciously use the data to mobilize individuals and the community into actions aimed at social change.

Regardless of specific approach, all qualitative methods rely heavily on researcher reflexivity in the analysis process. This reflexive stance compels a continual return to and immersion in the data—not only the narratives or other data gathered from participants, but also the memos, journals, field notes, research team notes, and other documentation of the extensive and intensive process of rigorous thinking that has occurred throughout the inquiry. Thus, six or eight interview transcripts alone (totaling, at minimum, about 150 pages of text) will generate hundreds more pages of an analysis record and audit trail. Moreover, it can be assumed that many hundreds of hours will be devoted to reading, coding, (re)arranging, thematizing or propertizing, theorizing, (re)checking, obtaining feedback, and discussing the data. When other kinds of data are added (e.g., behavioral observations, artifacts, historical records), the sheer size and complexity of the data set becomes quite challenging, and it should be clear why continually interrogating the data corpus is an absolute necessity in qualitative research.

Capturing the enormity and complexity of the data analysis process for purposes of communicating findings also is extremely challenging, and not particularly well-suited to the length and format constraints of most scholarly journals (Morrow, 2005). Morrow (2005) has provided a cogent guide to writing publishable versions of qualitative inquiries, and many excellent studies have been published in counseling psychology journals despite the difficulties. Unfortunately, the brevity of most published accounts belies the extensive work that undergirds those studies and provides limited information about why particular conceptual, sampling, data collection, or analytic decisions were made, thus offering little basis for judging the quality of the research.

Evaluating Qualitative Research

Because qualitative and quantitative research emanate from different paradigmatic assumptions, the criteria for judging the quality and rigor of quantitative research simply do not apply to qualitative studies. Attempts have been made to describe evaluation criteria for qualitative studies that parallel the quantitative criteria of validity, reliability, generalizability, and objectivity (probably developed, at least in part, to make qualitative work more acceptable to the positivist researchers comprising most editorial and review boards). However, Morrow (2005) argues that such criteria do not mean or accomplish the same things, and that qualitative studies are more appropriately evaluated using standards that are internally congruent with what qualitative research seeks to do. Morrow advises the development and use of “intrinsic standards of trustworthiness that have emerged more directly from the qualitative endeavor” (2005, p. 252).

Key to discussions of evaluating the rigor of qualitative research is the concept of trustworthiness or credibility (Morrow, 2005). A comprehensive evaluation framework offered by Morrow (2005) outlines four overarching or “transcendent” criteria (p. 250), so termed because they transcend the particular requirements of any specific approach and apply to the evaluation of all qualitative inquiry. The first criterion for judging the trustworthiness of a study is social validity, or the social value of the project. The second criterion addresses the way in which the study handles subjectivity and reflexivity on the part of the researcher, so that the reader can determine whether the participants’ accounts are being honored or whether the findings merely or predominantly reflect the opinions of the researcher. Morrow (2005) advises that, regardless of paradigmatic and axiological approach (i.e., whether researcher subjectivity is bracketed and monitored or incorporated as a driving force in the study), researchers must make their implicit assumptions and biases fully and clearly explicit to themselves and to all others.

The third criterion for judging the trustworthiness of a study lies in the adequacy of the data. Because sample size has little to do with the richness, breadth, and depth of qualitative research data, the study must demonstrate other forms of evidence that the data are maximally informative. Such evidence might include information-rich cases, appropriate sampling, saturated data sets, lengthy and open-ended interviews, feedback from participants, multiple data types and sources, field notes indicating rapport with participants, and inclusion of discrepant or disconfirming cases. The fourth criterion for evaluating the trustworthiness of the inquiry is the adequacy of the interpretation. There must be clear evidence of immersion in the data set during analysis, the use of a specified analytic framework and analytic memos, and a balance in the writing between the interpretations of the researcher and the direct words of the participants (Morrow, 2005).

In addition to these four transcendent criteria, Morrow (2005) also includes criteria that are more specific to the paradigm that undergirds a particular study. In a constructivist/interpretivist study, for example, the additional criteria of fairness, authenticity, and meaning would be important, whereas a critical/ideological study would be expected to include those criteria but also demonstrate consequential and transgressive evidence. Finally, regardless of approach, the trustworthiness of a study also must include evidence that the researcher attended to the social and ethical issues inherent in that study.

Ethics, Politics, and Social Responsibility

Social, political, and ethical considerations are not pertinent uniquely to qualitative inquiry, as all research is embedded in a sociopolitical and scientific context and therefore must attend to issues of social power, researcher responsibility, protection of people from harm, and potential (mis)use of findings. However, “[b]ecause qualitative methods are highly personal and interpersonal, because naturalistic inquiry takes the researcher into the real world where people live and work, and because in-depth interviewing opens up what is inside people—qualitative inquiry may be more intrusive and involve greater reactivity than surveys, tests, and other quantitative approaches” (Patton, 2002, p. 407). That is, relationship-based methods create unique challenges in the implementation of the standard ethical requirements of the scientific enterprise, and they “increase both the likelihood of ‘ethically relevant moments’ and the ambiguity of how, or whether, specific ethical standards apply to the question at hand” (Haverkamp, 2005, p. 148).

The relational focus of qualitative inquiry also is buttressed by the use of linguistically based data, which offer researchers considerable interpretive latitude in constructing meaning. Those constructions typically are supported by participant verification, which, in turn, is obtained through repeated and prolonged contact. In addition, qualitative inquiry transforms the notion of research benefit: particularly in the critical/ideological approach, outcomes of a study must include direct benefit to participants in the form of knowledge and empowerment. Finally, the very process of qualitative research has ethical implications, as its flexibility, fluidity, and changeability necessitate ethical decision making repeatedly throughout the entire inquiry process.

Many discussions of ethical, political, and social issues in qualitative research may be found in the literature (e.g., Fine, 1999, 2007; Haverkamp, 2005; Marecek, 2003; Morrow, 2007). Haverkamp (2005) offers a particularly useful discussion for counseling psychologists, recommending a synthesis of virtue ethics, principle ethics, and an ethic of care, all of which are central to graduate training in our field and thus known to counseling psychologists. Haverkamp (2005) calls for “professional reflexivity” (p. 152), the ethical counterpart to research reflexivity, which refers to a conscious consideration of the ways in which our social roles, skills, and knowledge base may influence our research practices, including relationships with participants. Professional reflexivity is the cornerstone of competence, and Haverkamp (2005) notes that professionally reflexive competence includes not only expertise in the populations and topics we wish to investigate, but also in the qualitative methods that we will use in the investigation.

Probably the most widely discussed ethical issue in qualitative research involves researcher boundaries and the complexities inherent in multiple relationships with participants. The deep and prolonged engagement between researcher and participants; the centrality of the researcher’s positionality and values; the public and individual perceptions and expectations of psychologists as healers; the skills of clinically trained psychologists in eliciting deeply private and even unconscious information; the co-creation of the meaning, interpretation, and form of the research product; and the focus of much qualitative inquiry on marginalized and disempowered social or cultural groups elicit a host of complex ethical issues that have clear social and political ramifications. Such issues include dual relationships, conflicts of roles and interests, confidentiality, informed consent, coercion, fiduciary responsibility, and use of professional and social power.

Although detailed discussion of these issues is well beyond the scope of this chapter, we return to the concept of trustworthiness, noted above as the primary criterion for evaluating the quality of a study. Haverkamp (2005) argues that trustworthiness does not pertain only to the rigor of methods used in a study, but that it “is an inherently relational construct with relevance for multiple dimensions of the research enterprise” (p. 146). Trustworthiness in the realm of ethics recognizes the potential vulnerability of participants involved in qualitative inquiry, and it reminds researchers that they must maintain constant vigilance in their responsibility to protect participants from harm. Fine (2007) extends this notion into the arena of political and social responsibility by urging a deeper kind of responsibility upon counseling psychologists. She asks that we consider the harm implicit in oppressive social structures and protect our participants by refusing to perpetuate their narratives of denial, blame, or victimization. Fine asserts that we “bear responsibility to theorize that which may not be spoken by those most vulnerable, or, for different reasons, by those most privileged” (p. 472), a clarion call for the kind of qualitative inquiry that also becomes individual and social intervention.

Mixed Methods and Future Prospects

It should be clear that both quantitative and qualitative methods have much to offer in understanding the kinds of issues of interest to most counseling psychologists—relationships, work, counseling, culture, health, and the like. It is our position that mixing these methods in creative ways offers much potential in solving some of the thorniest problems in our field today, and we urge researchers to consider mixed method approaches.

That being said, it is also true that quantitative and qualitative methods may not necessarily mesh well or complement one another within one study. Because they utilize different paradigms, different conceptions of researcher role, different approaches to interacting with participants, different ways of unearthing information, and different articulations of the research enterprise, the outcomes of quantitative and qualitative methods may be not only disparate but wholly incompatible. Dealing with this fundamental gap requires great caution and care, and it may be easier to alternate quantitative and qualitative approaches across studies within an ongoing program of research over many years, shifting the approach to illuminate different aspects of the same research problem (Ponterotto & Grieger, 2007).

Several authors have offered perspectives on the challenges and possibilities of mixed-methods approaches. Marecek ( 2003 ) offers a less polarized view of quantitative and qualitative methods, suggesting that the tension is not that one approach produces greater truth than the other, but that they offer different kinds of truths, and researchers must determine which truth is of greatest interest to them in understanding a particular phenomenon. In addition, she observes that any research approach, regardless of paradigmatic and methodological underpinnings, can be used oppressively or dismissively by researchers, and that no particular method guarantees appropriate handling of social justice goals or redress of social ills. Patton ( 2002 ) offers an assortment of possibilities that mix research design, measurement, and analysis in creative approaches to specific problems.

Ponterotto and Grieger ( 1999 , 2007 ) suggest that, just as psychologists can learn to embrace different cultures and languages and become bicultural, researchers can learn to be facile in both quantitative and qualitative inquiry and become bicultural or “merged” in their research identity (1999, p. 59), termed “bimethodological” (2007, p. 408). These authors argue that the flexibility of a merged identity produces scientific richness, but they caution that becoming truly bicultural methodologically requires immersion in the unfamiliar culture—that is, counseling psychologists must actively undertake qualitative research to learn qualitative research. Ponterotto and Grieger ( 1999 ) describe in detail two mixed-methods studies that they judge to be of high quality (Jick, 1983 , and Blustein et al., 1997 ), demonstrating how the researchers successfully navigated the complexities of contrasting paradigmatic approaches, and explaining how and why a mixed-methods approach was effective in these particular investigations. The work of Fassinger and her colleagues in women’s career development provides an example of the use of both quantitative (e.g., Fassinger, 1990 ) and qualitative (e.g., Gomez et al., 2001 ) approaches in different studies over time to explicate a vocational process. These examples may be of help to counseling psychologists wishing to stretch their scientific competence and work toward becoming more bimethodological.

In concluding this chapter, we express hope that more researchers will embrace the notion of using mixed methods in their research programs. In some cases, this “mixing” may be done by one researcher—within one study or over time in a programmatic series of studies designed to enhance understanding of some phenomenon of interest. In other cases, groups of researchers may combine their efforts, some engaged in qualitative and others in quantitative investigations of a particular problem. In all cases, we assert that researchers must become competent in any method they wish to use, and that, at the same time, we all become conversant enough in both quantitative and qualitative research approaches to appreciate their significant and unique contributions to scholarly progress.

AMOS. Structural equation modeling software. [Software]. Available from www.spss.com/Amos/ .

Arminio, J. ( 2001 ). Exploring the nature of race-related guilt.   Journal of Multicultural Counseling and Development , 29 , 239–252.

Avalos, L. ( 2005 ). An initial examination of a model of intuitive eating . Unpublished honors thesis, Department of Psychology, the Ohio State University.

Baron, R. M., & Kenny, D. A. ( 1986 ). The moderator-mediator variable distinction in social psychological research.   Journal of Personality and Social Psychology , 51 , 1173–1182.

Bentler, P. (1995). EQS: Structural equation modeling software. [Software]. Available from Multivariate Software: www.mvsoft.com/ .

Blustein, D. L., Phillips, S. D., Jobin-Davis, K., Finkelberg, S. L., & Rourke, A. E. ( 1997 ). A theory- building investigation of the school-to-work transition.   The Counseling Psychologist , 25 , 364–402.

Borgen, F. H., & Betz, N. E. ( 2008 ). Career self-efficacy and personality: Linking career confidence and the healthy personality.   Journal of Career Assessment , 16 , 22–43.

Browne, M. ( 2001 ). An overview of analytic rotation in exploratory factor analysis.   Multivariate Behavioral Research , 36 , 111–150.

Browne, M., & Cudeck, R. ( 1993 ). Alternative ways of assessing model fit. In K. A. Bollen, & J. Long (Eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.

Browne, M., Cudeck, R., Tateneni, K., & Mels, G. (2004). Comprehensive exploratory factor analysis (CEFA). Computer software and manual . Retrieved from http://faculty.psy.ohio-sttae.edu/browne/software.php .

Charmaz, K. ( 2000 ). Grounded theory: Objectivist and constructivist methods. In N. K. Denzin, & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 509–536). Thousand Oaks, CA: Sage Publications.

Cohen, J. ( 1969 ). Statistical power analysis for the behavioral sciences . New York: Academic Press.

Cohen, J. ( 1988 ). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Cohen, J. ( 1992 ). A power primer.   Psychological Bulletin , 112 , 155–159.

Cohen, J. ( 1994 ). The earth is round (p < .05). American Psychologist , 49 , 997–1003.

Creswell, J. W., Hanson, W. E., Plano Clark, V. L., & Morales, A. ( 2007 ). Qualitative research designs: Selection and implementation.   The Counseling Psychologist , 35 (2), 236–264.

DeNeve, K. M., & Cooper, H. ( 1998 ). The happy personality.   Psychological Bulletin , 124 , 197–229.

Denzin, N. K., & Lincoln, Y. S. ( 2000 ). Handbook of qualitative research (2nd ed.). Thousand Oaks, CA: Sage Publications.

Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. ( 1999 ). Evaluating the use of exploratory factor analysis in psychological research.   Psychological Methods , 4 , 272–299.

Fassinger, R. E. ( 1990 ). Causal models of career choice in two samples of college women.   Journal of Vocational Behavior , 36 , 225–248.

Fassinger, R. E. ( 2001 ). On remodeling the master’s house: Tools for dismantling sticky floors and glass ceilings . Invited keynote address, Fifth Biennial Conference of the Society for Vocational Psychology, Houston, TX.

Fassinger, R. E. ( 2005 ). Paradigms, praxis, problems, and promise: Grounded theory in counseling psychology research.   Journal of Counseling Psychology , 52 , 156–166.

Fassinger, R. E., & O’Brien, K. M. ( 2000 ). Career counseling with college women: A scientist-practitioner-advocate model of intervention. In D. Luzzo (Ed.), Career development of college students: Translating theory and research into practice (pp. 253–265). Washington, DC: American Psychological Association Books.

Fine, M. ( 2007 ). Expanding the methodological imagination.   The Counseling Psychologist , 35 , 459–473.

Fine, M., Torre, M. E., Boudin, K., Bowen, I., Clark, J., Hylton, D., et al. ( 2003 ). Participatory action research: From within and beyond prison bars. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design (pp. 173–198). Washington, DC: American Psychological Association Books.

Fitzgerald, L. F., & Hubert, L. J. ( 1987 ). Multidimensional scaling: Some possibilities for counseling psychology.   Journal of Counseling Psychology , 34 , 469–480.

Forester, M., Kahn, J., & Hesson-McInnis, M. ( 2004 ). Factor structure of three measures of research self-efficacy.   Journal of Career Assessment , 12 , 3–16.

Frazier, P. A., Tix, A. P., & Barron, K. E. ( 2004 ). Testing moderator and mediator effects in counseling psychology research.   Journal of Counseling Psychology , 51 , 115–134.

Friedman, M. L., Friedlander, M. L., & Blustein, D. L. ( 2005 ). Toward an understanding of Jewish identity: A phenomenological study.   Journal of Counseling Psychology , 52 , 77–83.

Giorgi, A. P., & Giorgi, B. M. ( 2003 ). The descriptive phenomenological psychological model. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design (pp. 243–274). Washington, DC: American Psychological Association Books.

Glaser, B. G. ( 1992 ). Basics of grounded theory analysis: Emergence vs . forcing . Mill Valley, CA: Sociology Press.

Glaser, B. G. ( 2000 ). The future of grounded theory.   Grounded Theory Review , 1 , 1–8.

Glaser, B. G., & Strauss, A. L. ( 1967 ). The discovery of grounded theory: Strategies for qualitative research . Chicago: Aldine.

Glass, G. V. ( 1976 ). Primary, secondary, and meta-analysis of research.   Educational Researcher , 5 , 3–8.

Glass, G. V. ( 2006 ). Meta-analysis: Quantitative synthesis of research findings. In J. Green, G. Camilli, & P. Elmore (Eds.), Handbook of complementary methods in education research (pp. 427–438). Mahwah, NJ: Erlbaum.

Glass, G. V., & Hopkins, K. ( 1996 ). Statistical methods in education and psychology (3rd ed.). Englewood Cliffs, NJ: Prentice Hall.

Gomez, M. J., Fassinger, R. E., Prosser, J., Cooke, K., Mejia, B., & Luna, J. ( 2001 ). Voces abriendo caminos (voices forging paths): A qualitative study of the career development of notable Latinas.   Journal of Counseling Psychology , 48 , 286–300.

Gorsuch, R. L. ( 1983 ). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.

Haase, R. F., & Ellis, M. V. ( 1987 ). Multivariate analysis of variance.   Journal of Counseling Psychology , 34 , 404–413.

Hansen, J. C., Dik, B., & Zhou, S. ( 2008 ). An examination of the structure of leisure interests in college students, working age adults, and retirees.   Journal of Counseling Psychology , 55 , 133–145.

Hardy, G. E., Barkham, M., Field, S. D., Elliott, R., & Shapiro, D. A. ( 1998 ). Whingeing versus working: Comprehensive process analysis of a “vague awareness” event in psychodynamic-interpersonal therapy.   Psychotherapy Research , 8 , 334–353.

Haverkamp, B. E. ( 2005 ). Ethical perspectives on qualitative research in applied psychology.   Journal of Counseling Psychology , 52 (2), 146–155.

Hayton, J. C., Allen, D. G., & Scarpello, V. ( 2004 ). Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis.   Organizational Research Methods , 7 , 191–205.

Henwood, K., & Pidgeon, N. ( 2003 ). Grounded theory in psychological research. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design (pp. 131–156). Washington, DC: American Psychological Association Books.

Heppner, P., Wampold, B., & Kivlighan, D. ( 2008 ). Research design in counseling (4th ed.). Belmont, CA: Brooks-Cole.

Hermann, K. ( 2005 ). Path models of the relationships of instrumentality and expressiveness, social self-efficacy, and self-esteem to depressive symptoms in college students . Unpublished doctoral dissertation, Department of Psychology, Ohio State University.

Hermann, K., & Betz, N. E. ( 2006 ). Path models of the relationships of instrumentality and expressiveness, social self-efficacy, and self-esteem to depressive symptoms in college students.   Journal of Social and Clinical Psychology , 25 , 1086–1106.

Hill, C. E., Knox, S., Thompson, B. J., Williams, E. N., Hess, S. A., & Ladany, N. ( 2005 ). Consensual qualitative research: An update.   Journal of Counseling Psychology , 52 (2), 196–205.

Hogan, T. P. ( 2007 ). Psychological testing: A practical introduction (2nd ed.). New York: Wiley.

Hoshmand, L. T. ( 2005 ). Narratology, cultural psychology, and counseling research.   Journal of Counseling Psychology , 52 (2), 178–186.

Hu, L., & Bentler, P. ( 1999 ). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives.   Structural Equation Modeling , 6 , 1–55.

Jick, T. D. ( 1983 ). Mixing qualitative and quantitative methods: Triangulation in action. In J. Van Maanen (Ed.), Qualitative methodology (pp. 135–148). Beverly Hills, CA: Sage.

Joreskog, K. G., & Sorbom, D. (2008). LISREL 8.8 . [Software]. Lincolnwood, IL: Scientific Software. Retrieved from www.ssicentral.com/lisrel/new.html .

Juntunen, C. L., Barraclough, D. J., Broneck, C. L., Seibel, G. A., Winrow, S. A., & Morin, P. M. ( 2001 ). American Indian perspectives on the career journey.   Journal of Counseling Psychology , 48 , 274–285.

Kahn, J. H. ( 2006 ). Factor analysis in counseling psychology research, training, and practice.   The Counseling Psychologist , 34 , 684–718, inside back cover.

Kanji, G. K. ( 1993 ). 100 statistical tests . Newbury Park, CA: Sage.

Kashubeck-West, S., Coker, A. D., Awad, G. H., Hix, R. D., Bledman, R. A., & Mintz, L. ( 2008 , August). Psychometric evaluation of body image measures in African American women . Poster presented at the meeting of the American Psychological Association, Boston, MA.

Kidd, S. A., & Kral, M. J. ( 2005 ). Practicing participatory action research.   Journal of Counseling Psychology , 52 , 187–195.

Killeen, P. R. ( 2005 ). An alternative to null-hypothesis significance tests.   Psychological Science , 16 , 345–353.

Kinnier, R. T., Tribbensee, N. E., Rose, C. A., & Vaughan, S. M. ( 2001 ). In the final analysis: More wisdom from people who have faced death.   Journal of Counseling & Development , 79, 187–195.

Kirk, R. ( 1996 ). Practical significance: A concept whose time has come.   Educational and Psychological Measurement , 56 , 746–759.

Larson, L., Wei, M., Wu, T., Borgen, F., & Bailey, D. ( 2007 ). Discriminating among educational majors and career aspirations in Taiwanese undergraduates.   Journal of Counseling Psychology , 54 , 395–408.

Leff, S. S., Costigan, T., & Power, T. J. ( 2004 ). Using participatory research to develop a playground-based prevention program.   Journal of School Psychology , 42 , 3–21.

Lent, R. W., Brown, S. D., Sheu, H.-B., Schmidt, J., Brenner, B., Gloster, C., et al. ( 2005 ). Social cognitive predictors of academic interest and goals in engineering: Utility for women and students at historically Black universities.   Journal of Counseling Psychology , 52 , 84–92.

MacCallum, R. C., & Austin, J. ( 2000 ). Applications of structural equation modeling in psychological research.   Annual Review of Psychology , 51 , 201–226.

MacCallum, R., Roznowski, M., & Necowitz, L. ( 1992 ). Model modifications in covariance structure analysis: The problem of capitalization on chance.   Psychological Bulletin , 111 , 490–504.

MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. ( 1999 ). Sample size in factor analysis.   Psychological Methods , 4 , 84–89.

Marecek, J. ( 2003 ). Dancing through minefields: Toward a qualitative stance in psychology. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design (pp. 49–70). Washington, DC: American Psychological Association Books.

Miller, P. J., Hengst, J. A., & Wang, S. ( 2003 ). Ethnographic methods: Applications from developmental cultural psychology. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design (pp. 49–70). Washington, DC: American Psychological Association Books.

Morrow, S. L. ( 2005 ). Quality and trustworthiness in qualitative research in counseling psychology.   Journal of Counseling Psychology , 52 (2), 250–260.

Morrow, S. L. ( 2007 ). Qualitative research in counseling psychology: Conceptual foundations.   The Counseling Psychologist , 35 (2), 209–235.

Morrow, S. L., & Smith, M. L. ( 1995 ). Constructions of survival and coping by women who have survived childhood sexual abuse.   Journal of Counseling Psychology , 42 , 24–33.

Murray, M. ( 2003 ). Narrative psychology and narrative analysis. In P. M. Camic, J. E. Rhodes, & L. Yardley (Eds.), Qualitative research in psychology: Expanding perspectives in methodology and design . Washington, DC: American Psychological Association Books.

Muthen, L. K., & Muthen, B. O. (2006). Mplus user’s guide (4th ed.). Los Angeles: Muthen and Muthen. Retrieved from www.statmodel.com/company.shtml .

Noonan, B. M., Gallor, S., Hensler-McGinnis, N., Fassinger, R. E., Wang, S., & Goodman, J. ( 2004 ). Challenge and success: A qualitative study of the career development of highly achieving women with physical and sensory disabilities. Journal of Counseling Psychology , 51 , 68–80.

Patton, M. Q. ( 2002 ). Qualitative research & evaluation methods (3rd ed.). Thousand Oaks, CA: Sage Publications.

Polkinghorne, D. ( 2005 ). Language and meaning: Data collection in qualitative research.   Journal of Counseling Psychology , 52 , 137–145.

Ponterotto, J. G. ( 2005 ). Qualitative research in counseling psychology: A primer on research paradigms and philosophy of science.   Journal of Counseling Psychology , 52 (2), 126–136.

Ponterotto, J. G., & Grieger, I. ( 1999 ). Merging qualitative and quantitative perspectives in a research identity. In M. Kopala, & L. A. Suzuki (Eds.), Using qualitative methods in psychology (pp. 49–61.). Thousand Oaks, CA: Sage Publications.

Ponterotto, J. G., & Grieger, I. ( 2007 ). Effectively communicating qualitative research.   The Counseling Psychologist , 35 (3), 431–458.

Rennie, D. L. ( 1994 ). Clients’ deference in psychotherapy.   Journal of Counseling Psychology , 41, 427–437.

Rennie, D. L. ( 2000 ). Grounded theory methodology as methodological hermeneutics.   Theory & Psychology , 10 (4), 481–502.

Rennie, D. L. ( 2004 ). Anglo-North American qualitative counseling and psychotherapy research.   Psychotherapy Research , 14 , 37–55.

Richie, B. S., Fassinger, R. E., Linn, S., Johnson, J., Prosser, J., & Robinson, S. ( 1997 ). Persistence, connection, and passion: A qualitative study of the career development of highly achieving African American/Black and White women.   Journal of Counseling Psychology , 44 , 133–148.

Russell, D., Kahn, J., Spoth, R., & Altmaier, E. ( 1998 ). Analyzing data from experimental studies.   Journal of Counseling Psychology , 45 , 18–29.

Schafer, J. L., & Graham, J. W. ( 2002 ). Missing data: Our view of the state of the art.   Psychological Methods , 7 , 147–177.

Schumacker, R. E., & Lomax, R. G. ( 2004 ). A beginner’s guide to structural equation modeling . Mahwah, NJ: Lawrence Erlbaum.

Sherry, A. ( 2006 ). Discriminant analysis in counseling psychology research.   The Counseling Psychologist , 34 , 661–683.

Sireci, S. G., & Talento-Miller, E. ( 2006 ). Evaluating the predictive validity of Graduate Management Admissions Test scores.   Educational and Psychological Measurement , 66 , 305–317.

Smith, M. L., & Glass, G. V. ( 1977 ). Meta-analysis of psychotherapy outcome studies.   American Psychologist , 32 , 752–760.

Sobel, M. E. ( 1982 ). Asymptotic intervals for indirect effects in structural equation models. In S. Leinhart (Ed.), Sociological methodology (pp. 290–312). San Francisco: Jossey-Bass.

Spearman, C. ( 1904 ). General intelligence: Objectively defined and measured.   American Journal of Psychology , 15 , 201–293.

Steiger, J. H. ( 1990 ). Structural model evaluation and modification: An interval estimation approach.   Multivariate Behavioral Research , 25 , 173–180.

Sterner, W. R. ( 2011 ). What is missing in counseling research? Reporting missing data.   Journal of Counseling and Development , 89 , 56–63.

Stevens, S. S. ( 1951 ). Handbook of experimental psychology . New York: Wiley.

Strauss, A. L. ( 1987 ). Qualitative analysis for social scientists . New York: Cambridge University Press.

Strauss, A. L., & Corbin, J. ( 1998 ). Basics of qualitative research: Techniques and procedures for developing grounded theory (2nd ed.). Thousand Oaks, CA: Sage Publications.

Suzuki, L. A., Ahluwalia, M. K., Mattis, J. S., & Quizon, C. A. ( 2005 ). Ethnography in counseling psychology research: Possibilities for application.   Journal of Counseling Psychology , 52 , 206–214.

Suzuki, L. A., Prendes-Lintel, M., Wertlieb, L., & Stallings, A. ( 1999 ). Exploring multicultural issues using qualitative methods. In M. Kopala, & L. A. Suzuki (Eds.), Using qualitative methods in psychology (pp. 123–134). Thousand Oaks, CA: Sage Publications.

Swanson, J. L. ( 2011 ). Measurement and assessment in counseling psychology. In E. M. Altmaier, & J. C. Hansen (Eds.), The Oxford handbook of counseling psychology . New York: Oxford University Press.

Tabachnick, B. G., & Fidell, L. S. ( 2007 ). Using multivariate statistics (5th ed.). New York: Harper Collins College Publishers.

Thurstone, L. L. ( 1947 ). Multiple factor analysis . Chicago: University of Chicago Press.

Tracey, T. J. G. ( 2008 ). Adherence to RIASEC structure as a key career decision construct.   Journal of Counseling Psychology , 55 , 146–157.

Vacha-Haase, T., & Thompson, B. ( 2004 ). How to estimate and interpret various effect sizes.   Journal of Counseling Psychology , 51 , 473–481.

Wertz, F. J. ( 2005 ). Phenomenological research methods for counseling psychology.   Journal of Counseling Psychology , 52 , 167–177.

Weston, R., & Gore, P. ( 2006 ). A brief guide to structural equation modeling.   The Counseling Psychologist , 34 , 719–751.

Wilkinson, L., & Task Force on Statistical Inference. ( 1999 ). Statistical methods in psychology journals: Guidelines and explanations.   American Psychologist , 54 , 594–604.

Young, R. A., Valach, L., & Domene, J. E. ( 2005 ). The action-project method in counseling psychology.   Journal of Counseling Psychology , 52 (2), 215–223.

Young, R. A., Valach, L., Paseluihko, M. A., Dover, C., Matthes, G. E., Paproski, D., et al. ( 1997 ). The joint action of parents and adolescents in conversation about career.   Career Development Quarterly , 46 , 72–86.

Zwick, W. R., & Velicer, W. F. ( 1986 ). A comparison of five rules for determining the number of components to retain.   Psychological Bulletin , 99 , 432–442.

Reporting Standards for Research in Psychology

In anticipation of the impending revision of the Publication Manual of the American Psychological Association , APA’s Publications and Communications Board formed the Working Group on Journal Article Reporting Standards (JARS) and charged it to provide the board with background and recommendations on information that should be included in manuscripts submitted to APA journals that report (a) new data collections and (b) meta-analyses. The JARS Group reviewed efforts in related fields to develop standards and sought input from other knowledgeable groups. The resulting recommendations contain (a) standards for all journal articles, (b) more specific standards for reports of studies with experimental manipulations or evaluations of interventions using research designs involving random or nonrandom assignment, and (c) standards for articles reporting meta-analyses. The JARS Group anticipated that standards for reporting other research designs (e.g., observational studies, longitudinal studies) would emerge over time. This report also (a) examines societal developments that have encouraged researchers to provide more details when reporting their studies, (b) notes important differences between requirements, standards, and recommendations for reporting, and (c) examines benefits and obstacles to the development and implementation of reporting standards.

The American Psychological Association (APA) Working Group on Journal Article Reporting Standards (the JARS Group) arose out of a request for information from the APA Publications and Communications Board. The Publications and Communications Board had previously allowed any APA journal editor to require that a submission labeled by an author as describing a randomized clinical trial conform to the CONSORT (Consolidated Standards of Reporting Trials) reporting guidelines ( Altman et al., 2001 ; Moher, Schulz, & Altman, 2001 ). In this context, and recognizing that APA was about to initiate a revision of its Publication Manual ( American Psychological Association, 2001 ), the Publications and Communications Board formed the JARS Group to provide itself with input on how the newly developed reporting standards related to the material currently in its Publication Manual and to propose some related recommendations for the new edition.

The JARS Group was formed of five current and previous editors of APA journals. It divided its work into six stages:

  • establishing the need for more well-defined reporting standards,
  • gathering the standards developed by other related groups and professional organizations relating to both new data collections and meta-analyses,
  • drafting a set of standards for APA journals,
  • sharing the drafted standards with cognizant others,
  • refining the standards yet again, and
  • addressing additional and unresolved issues.

This article is the report of the JARS Group’s findings and recommendations. It was approved by the Publications and Communications Board in the summer of 2007 and again in the spring of 2008 and was transmitted to the task force charged with revising the Publication Manual for consideration as it did its work. The content of the report roughly follows the stages of the group’s work. Those wishing to move directly to the reporting standards can go to the sections titled Information for Inclusion in Manuscripts That Report New Data Collections and Information for Inclusion in Manuscripts That Report Meta-Analyses.

Why Are More Well-Defined Reporting Standards Needed?

The JARS Group members began their work by sharing with each other documents they knew of that related to reporting standards. The group found that the past decade had witnessed two developments in the social, behavioral, and medical sciences that encouraged researchers to provide more details when they reported their investigations. The first impetus for more detail came from the worlds of policy and practice. In these realms, the call for use of “evidence-based” decision making had placed a new emphasis on the importance of understanding how research was conducted and what it found. For example, in 2006, the APA Presidential Task Force on Evidence-Based Practice defined the term evidence-based practice to mean “the integration of the best available research with clinical expertise” (p. 273; italics added). The report went on to say that “evidence-based practice requires that psychologists recognize the strengths and limitations of evidence obtained from different types of research” (p. 275).

In medicine, the movement toward evidence-based practice is now so pervasive (see Sackett, Rosenberg, Muir Gray, Haynes, & Richardson, 1996 ) that there exists an international consortium of researchers (the Cochrane Collaboration; http://www.cochrane.org/index.htm ) producing thousands of papers examining the cumulative evidence on everything from public health initiatives to surgical procedures. Another example of accountability in medicine, and the importance of relating medical practice to solid medical science, comes from the member journals of the International Committee of Medical Journal Editors (2007) , who adopted a policy requiring registration of all clinical trials in a public trials registry as a condition of consideration for publication.

In education, the No Child Left Behind Act of 2001 (2002) required that the policies and practices adopted by schools and school districts be “scientifically based,” a term that appears over 100 times in the legislation. In public policy, a consortium similar to that in medicine now exists (the Campbell Collaboration; http://www.campbellcollaboration.org ), as do organizations meant to promote government policymaking based on rigorous evidence of program effectiveness (e.g., the Coalition for Evidence-Based Policy; http://www.excelgov.org/index.php?keyword=a432fbc34d71c7 ). Each of these efforts operates with a definition of what constitutes sound scientific evidence. The developers of previous reporting standards argued that new transparency in reporting is needed so that judgments can be made by users of evidence about the appropriate inferences and applications derivable from research findings.

The second impetus for more detail in research reporting has come from within the social and behavioral science disciplines. As evidence about specific hypotheses and theories accumulates, greater reliance is being placed on syntheses of research, especially meta-analyses ( Cooper, 2009 ; Cooper, Hedges, & Valentine, 2009 ), to tell us what we know about the workings of the mind and the laws of behavior. Different findings relating to a specific question examined with various research designs are now mined by second users of the data for clues to the mediation of basic psychological, behavioral, and social processes. These clues emerge by clustering studies based on distinctions in their methods and then comparing their results. This synthesis-based evidence is then used to guide the next generation of problems and hypotheses studied in new data collections. Without complete reporting of methods and results, the utility of studies for purposes of research synthesis and meta-analysis is diminished.
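
The core arithmetic behind the meta-analyses described above is the pooling of effect sizes across studies, weighting each study by the precision of its estimate. The following is a minimal illustrative sketch of a fixed-effect, inverse-variance weighted average; the study values are hypothetical and not drawn from any study cited in this chapter.

```python
# Illustrative sketch (hypothetical data): fixed-effect meta-analytic pooling.
# Each study contributes an effect size weighted by the inverse of its
# sampling variance, so more precise studies count for more.

def pooled_effect(effects, variances):
    """Return the inverse-variance weighted mean effect size and its variance."""
    weights = [1.0 / v for v in variances]                    # precision weights
    total_w = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total_w
    return mean, 1.0 / total_w                                # pooled d, its variance

# Three hypothetical studies: standardized mean differences and their variances
d = [0.30, 0.55, 0.20]
v = [0.04, 0.09, 0.02]
est, var = pooled_effect(d, v)
print(f"pooled d = {est:.3f}, variance = {var:.4f}")
```

In a full synthesis, one would also cluster studies by methodological features (as the paragraph above describes) and compare the pooled estimates of the clusters.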

The JARS Group viewed both of these stimulants to action as positive developments for the psychological sciences. The first provides an unprecedented opportunity for psychological research to play an important role in public and health policy. The second promises a sounder evidence base for explanations of psychological phenomena and a next generation of research that is more focused on resolving critical issues.

The Current State of the Art

Next, the JARS Group collected efforts of other social and health organizations that had recently developed reporting standards. Three recent efforts quickly came to the group’s attention. Two efforts had been undertaken in the medical and health sciences to improve the quality of reporting of primary studies and to make reports more useful for the next users of the data. The first effort is called CONSORT (Consolidated Standards of Reporting Trials; Altman et al., 2001 ; Moher et al., 2001 ). The CONSORT standards were developed by an ad hoc group primarily composed of biostatisticians and medical researchers. CONSORT relates to the reporting of studies that carried out random assignment of participants to conditions. It comprises a checklist of study characteristics that should be included in research reports and a flow diagram that provides readers with a description of the number of participants as they progress through the study—and by implication the number who drop out—from the time they are deemed eligible for inclusion until the end of the investigation. These guidelines are now required by the top-tier medical journals and many other biomedical journals. Some APA journals also use the CONSORT guidelines.
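
The CONSORT flow diagram is, at bottom, a piece of bookkeeping: participant counts at each stage of the trial, from which attrition at each step can be read off. The sketch below illustrates that bookkeeping with invented stage names and counts; it is not part of the CONSORT specification itself.

```python
# Hypothetical sketch of the tallying behind a CONSORT-style flow diagram:
# for each ordered stage, report the number remaining and the number lost
# since the previous stage (the implied dropout the diagram makes visible).

def attrition(stages):
    """Given ordered (stage, n) pairs, return (stage, n, lost_since_prior) triples."""
    report, prior = [], None
    for stage, n in stages:
        lost = 0 if prior is None else prior - n
        report.append((stage, n, lost))
        prior = n
    return report

flow = [  # invented counts for illustration
    ("assessed for eligibility", 240),
    ("randomized", 180),
    ("received allocated intervention", 172),
    ("analyzed", 160),
]
for stage, n, lost in attrition(flow):
    print(f"{stage}: n={n} (lost {lost})")
```

Reporting these counts explicitly lets readers judge how much attrition occurred between eligibility and analysis, which is precisely the transparency the checklist is meant to enforce.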

The second effort is called TREND (Transparent Reporting of Evaluations with Nonexperimental Designs; Des Jarlais, Lyles, Crepaz, & the TREND Group, 2004 ). TREND was developed under the initiative of the Centers for Disease Control, which brought together a group of editors of journals related to public health, including several journals in psychology. TREND contains a 22-item checklist, similar to CONSORT, but with a specific focus on reporting standards for studies that use quasi-experimental designs, that is, group comparisons in which the groups were established using procedures other than random assignment to place participants in conditions.

In the social sciences, the American Educational Research Association (2006) recently published “Standards for Reporting on Empirical Social Science Research in AERA Publications.” These standards encompass a broad range of research designs, including both quantitative and qualitative approaches, and are divided into eight general areas, including problem formulation; design and logic of the study; sources of evidence; measurement and classification; analysis and interpretation; generalization; ethics in reporting; and title, abstract, and headings. They contain about two dozen general prescriptions for the reporting of studies as well as separate prescriptions for quantitative and qualitative studies.

Relation to the APA Publication Manual

The JARS Group also examined previous editions of the APA Publication Manual and discovered that for the last half century it has played an important role in the establishment of reporting standards. The first edition of the APA Publication Manual , published in 1952 as a supplement to Psychological Bulletin ( American Psychological Association, Council of Editors, 1952 ), was 61 pages long, printed on 6-in. by 9-in. paper, and cost $1. The principal divisions of manuscripts were titled Problem, Method, Results, Discussion, and Summary (now the Abstract). According to the first Publication Manual, the section titled Problem was to include the questions asked and the reasons for asking them. When experiments were theory-driven, the theoretical propositions that generated the hypotheses were to be given, along with the logic of the derivation and a summary of the relevant arguments. The method was to be “described in enough detail to permit the reader to repeat the experiment unless portions of it have been described in other reports which can be cited” (p. 9). This section was to describe the design and the logic of relating the empirical data to theoretical propositions, the subjects, sampling and control devices, techniques of measurement, and any apparatus used. Interestingly, the 1952 Manual also stated, “Sometimes space limitations dictate that the method be described synoptically in a journal, and a more detailed description be given in auxiliary publication” (p. 25). The Results section was to include enough data to justify the conclusions, with special attention given to tests of statistical significance and the logic of inference and generalization. The Discussion section was to point out limitations of the conclusions, relate them to other findings and widely accepted points of view, and give implications for theory or practice. 
Negative or unexpected results were not to be accompanied by extended discussions; the editors wrote, “Long ‘alibis,’ unsupported by evidence or sound theory, add nothing to the usefulness of the report” (p. 9). Also, authors were encouraged to use good grammar and to avoid jargon, as “some writing in psychology gives the impression that long words and obscure expressions are regarded as evidence of scientific status” (pp. 25–26).

Through the following editions, the recommendations became more detailed and specific. Of special note was the Report of the Task Force on Statistical Inference (Wilkinson & the Task Force on Statistical Inference, 1999), which presented guidelines for statistical reporting in APA journals that informed the content of the 5th edition of the Publication Manual. Although the 5th edition of the Manual does not contain a clearly delineated set of reporting standards, this does not mean the Manual is devoid of standards. Instead, recommendations, standards, and requirements for reporting are embedded in various sections of the text. Most notably, statements regarding the method and results that should be included in a research report (as well as how this information should be reported) appear in the Manual’s description of the parts of a manuscript (pp. 10–29). For example, when discussing who participated in a study, the Manual states, “When humans participated as the subjects of the study, report the procedures for selecting and assigning them and the agreements and payments made” (p. 18). With regard to the Results section, the Manual states, “Mention all relevant results, including those that run counter to the hypothesis” (p. 20), and it provides descriptions of “sufficient statistics” (p. 23) that need to be reported.

Thus, although reporting standards and requirements are not highlighted in the most recent edition of the Manual, they appear nonetheless. In that context, then, the proposals offered by the JARS Group can be viewed not as breaking new ground for psychological research but rather as a systematization, clarification, and—to a lesser extent than might at first appear—an expansion of standards that already exist. The intended contribution of the current effort, then, becomes as much one of increased emphasis as increased content.

Drafting, Vetting, and Refinement of the JARS

Next, the JARS Group canvassed the APA Council of Editors to ascertain the degree to which the CONSORT and TREND standards were already in use by APA journals and to learn of any other reporting standards in use. The JARS Group also requested from the APA Publications Office the data it had on the use of auxiliary websites by authors of APA journal articles. With this information in hand, the JARS Group compared the CONSORT, TREND, and AERA standards with one another and developed a combined list of the nonredundant elements contained in any or all of the three sets of standards. The JARS Group then examined the combined list, rewrote some items for clarity and ease of comprehension by an audience of psychologists and other social and behavioral scientists, and added a few suggestions of its own.

This combined list was then shared with the APA Council of Editors, the APA Publication Manual Revision Task Force, and the Publications and Communications Board, and these groups were asked to react to it. After receiving their reactions, along with anonymous reactions from reviewers chosen by the American Psychologist, the JARS Group revised its report and arrived at the list of recommendations contained in Tables 1, 2, and 3 and Figure 1. The report was then approved again by the Publications and Communications Board.

Figure 1. Flow of participants through each stage of an experiment or quasi-experiment.

Note. This flowchart is an adaptation of the flowchart offered by the CONSORT Group ( Altman et al., 2001 ; Moher, Schulz, & Altman, 2001 ). Journals publishing the original CONSORT flowchart have waived copyright protection.

Journal Article Reporting Standards (JARS): Information Recommended for Inclusion in Manuscripts That Report New Data Collections Regardless of Research Design

Module A: Reporting Standards for Studies With an Experimental Manipulation or Intervention (in Addition to Material Presented in Table 1 )

Reporting Standards for Studies Using Random and Nonrandom Assignment of Participants to Experimental Groups

Information for Inclusion in Manuscripts That Report New Data Collections

The entries in Tables 1 through 3 and Figure 1 divide the reporting standards into three parts. First, Table 1 presents information recommended for inclusion in all reports submitted for publication in APA journals. Note that these recommendations contain only a brief entry regarding the type of research design. Along with these general standards, then, the JARS Group also recommended that specific standards be developed for different types of research designs. Thus, Table 2 provides standards for research designs involving experimental manipulations or evaluations of interventions (Module A). Next, Table 3 provides standards for reporting either (a) a study involving random assignment of participants to experimental or intervention conditions (Module A1) or (b) quasi-experiments, in which different groups of participants receive different experimental manipulations or interventions but the groups are formed (and perhaps equated) using a procedure other than random assignment (Module A2). Using this modular approach, the JARS Group was able to incorporate the general recommendations from the current APA Publication Manual and both the CONSORT and TREND standards into a single set of standards. This approach also makes it possible for other research designs (e.g., observational studies, longitudinal designs) to be added to the standards by adding new modules.

The standards are categorized into the sections of a research report used by APA journals. To illustrate how the tables would be used, note that the Method section in Table 1 is divided into subsections regarding participant characteristics, sampling procedures, sample size, measures and covariates, and an overall categorization of the research design. Then, if the design being described involved an experimental manipulation or intervention, Table 2 presents additional information about the research design that should be reported, including a description of the manipulation or intervention itself and the units of delivery and analysis. Next, Table 3 presents two separate sets of reporting standards to be used depending on whether the participants in the study were assigned to conditions using a random or nonrandom procedure. Figure 1 , an adaptation of the chart recommended in the CONSORT guidelines, presents a chart that should be used to present the flow of participants through the stages of either an experiment or a quasi-experiment. It details the amount and cause of participant attrition at each stage of the research.

In the future, new modules and flowcharts regarding other research designs could be added to the standards to be used in conjunction with Table 1 . For example, tables could be constructed to replace Table 2 for the reporting of observational studies (e.g., studies with no manipulations as part of the data collection), longitudinal studies, structural equation models, regression discontinuity designs, single-case designs, or real-time data capture designs ( Stone & Shiffman, 2002 ), to name just a few.

Additional standards could be adopted for any of the parts of a report. For example, the Evidence-Based Behavioral Medicine Committee ( Davidson et al., 2003 ) examined each of the 22 items on the CONSORT checklist and described, for each, special considerations for the reporting of research on behavioral medicine interventions. This group also proposed five additional items, not included in the CONSORT list, that it felt should be included in reports on behavioral medicine interventions: (a) training of treatment providers, (b) supervision of treatment providers, (c) patient and provider treatment allegiance, (d) manner of testing and success of treatment delivery by the provider, and (e) treatment adherence. The JARS Group encourages other authoritative groups of interested researchers, practitioners, and journal editorial teams to use Table 1 as a similar starting point in their efforts, adding and deleting items and modules to fit the information needs dictated by research designs that are prominent in specific subdisciplines and topic areas. These revisions could then be incorporated into future iterations of the JARS.

Information for Inclusion in Manuscripts That Report Meta-Analyses

The same pressures that have led to proposals for reporting standards for manuscripts that report new data collections have led to similar efforts to establish standards for the reporting of other types of research. Particular attention has been focused on the reporting of meta-analyses.

With regard to reporting standards for meta-analysis, the JARS Group began by contacting the members of the Society for Research Synthesis Methodology and asking them to share with the group what they felt were the critical aspects of meta-analysis conceptualization, methodology, and results that need to be reported so that readers (and manuscript reviewers) can make informed, critical judgments about the appropriateness of the methods used for the inferences drawn. This query led to the identification of four other efforts to establish reporting standards for meta-analysis. These included the QUOROM Statement (Quality of Reporting of Meta-analysis; Moher et al., 1999 ) and its revision, PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses; Moher, Liberati, Tetzlaff, Altman, & the PRISMA Group, 2008 ), MOOSE (Meta-analysis of Observational Studies in Epidemiology; Stroup et al., 2000 ), and the Potsdam Consultation on Meta-Analysis ( Cook, Sackett, & Spitzer, 1995 ).

Next the JARS Group compared the content of each of the four sets of standards with the others and developed a combined list of nonredundant elements contained in any or all of them. The JARS Group then examined the combined list, rewrote some items for clarity and ease of comprehension by an audience of psychologists, and added a few suggestions of its own. Then the resulting recommendations were shared with a subgroup of members of the Society for Research Synthesis Methodology who had experience writing and reviewing research syntheses in the discipline of psychology. After these suggestions were incorporated into the list, it was shared with members of the Publications and Communications Board, who were requested to react to it. After receiving these reactions, the JARS Group arrived at the list of recommendations contained in Table 4 , titled Meta-Analysis Reporting Standards (MARS). These were then approved by the Publications and Communications Board.

Meta-Analysis Reporting Standards (MARS): Information Recommended for Inclusion in Manuscripts Reporting Meta-Analyses

Other Issues Related to Reporting Standards

A definition of “reporting standards”.

The JARS Group recognized that there are three related terms that need definition when one speaks about journal article reporting standards: recommendations, standards, and requirements. According to Merriam-Webster’s Online Dictionary (n.d.) , to recommend is “to present as worthy of acceptance or trial … to endorse as fit, worthy, or competent.” In contrast, a standard is more specific and should carry more influence: “something set up and established by authority as a rule for the measure of quantity, weight, extent, value, or quality.” And finally, a requirement goes further still by dictating a course of action—“something wanted or needed”—and to require is “to claim or ask for by right and authority … to call for as suitable or appropriate … to demand as necessary or essential.”

With these definitions in mind, the JARS Group felt it was providing recommendations regarding what information should be reported in the write-up of a psychological investigation and that these recommendations could also be viewed as standards or at least as a beginning effort at developing standards. The JARS Group felt this characterization was appropriate because the information it was proposing for inclusion in reports was based on an integration of efforts by authoritative groups of researchers and editors. However, the proposed standards are not offered as requirements. The methods used in the subdisciplines of psychology are so varied that the critical information needed to assess the quality of research and to integrate it successfully with other related studies varies considerably from method to method in the context of the topic under consideration. By not calling them “requirements,” the JARS Group felt the standards would be given the weight of authority while retaining for authors and editors the flexibility to use the standards in the most efficacious fashion (see below).

The Tension Between Complete Reporting and Space Limitations

There is an innate tension between transparency in reporting and the space limitations imposed by the print medium. As descriptions of research expand, so does the space needed to report them. However, recent improvements in the capacity of and access to electronic storage of information suggest that this trade-off could someday disappear. For example, the journals of the APA, among others, now make available to authors auxiliary websites that can be used to store supplemental materials associated with the articles that appear in print. Similarly, it is possible for electronic journals to contain short reports of research with hot links to websites containing supplementary files.

The JARS Group recommends an increased use and standardization of supplemental websites by APA journals and authors. Some of the information contained in the reporting standards might not appear in the published article itself but rather in a supplemental website. For example, if the instructions in an investigation are lengthy but critical to understanding what was done, they may be presented verbatim in a supplemental website. Supplemental materials might include the flowchart of participants through the study. It might include oversized tables of results (especially those associated with meta-analyses involving many studies), audio or video clips, computer programs, and even primary or supplementary data sets. Of course, all such supplemental materials should be subject to peer review and should be submitted with the initial manuscript. Editors and reviewers can assist authors in determining what material is supplemental and what needs to be presented in the article proper.

Other Benefits of Reporting Standards

The general principle that guided the establishment of the JARS for psychological research was the promotion of sufficient and transparent descriptions of how a study was conducted and what the researcher(s) found. Complete reporting allows clearer determination of the strengths and weaknesses of a study. This permits the users of the evidence to judge more accurately the appropriate inferences and applications derivable from research findings.

Related to quality assessments, it could be argued as well that the existence of reporting standards will have a salutary effect on the way research is conducted. For example, by setting a standard that rates of loss of participants should be reported (see Figure 1 ), researchers may begin considering more concretely what acceptable levels of attrition are and may come to employ more effective procedures meant to maximize the number of participants who complete a study. Or standards that specify reporting a confidence interval along with an effect size might motivate researchers to plan their studies so as to ensure that the confidence intervals surrounding point estimates will be appropriately narrow.
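To make the precision-planning idea concrete, the following is a minimal sketch, not part of the JARS recommendations themselves: the function name is ours, and the formula is the standard normal approximation for the confidence interval on a two-group mean difference with known, equal variances and equal group sizes.

```python
import math
import statistics


def n_per_group_for_ci_halfwidth(sigma, halfwidth, confidence=0.95):
    """Smallest per-group n so that the confidence interval for a
    two-group mean difference has at most the desired half-width
    (normal approximation; equal group sizes; sigma assumed known)."""
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Half-width h = z * sigma * sqrt(2 / n)  =>  n = 2 * (z * sigma / h)^2
    return math.ceil(2 * (z * sigma / halfwidth) ** 2)


print(n_per_group_for_ci_halfwidth(sigma=1.0, halfwidth=0.50))  # 31
print(n_per_group_for_ci_halfwidth(sigma=1.0, halfwidth=0.25))  # 123
```

Halving the target half-width roughly quadruples the required sample size per group; a reporting standard that asks for the interval makes this trade-off visible at the planning stage.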

Also, as noted above, reporting standards can improve secondary use of data by making studies more useful for meta-analysis. More broadly, if standards are similar across disciplines, a consistency in reporting could promote interdisciplinary dialogue by making it clearer to researchers how their efforts relate to one another.

And finally, reporting standards can make it easier for other researchers to design and conduct replications and related studies by providing more complete descriptions of what has been done before. Without complete reporting of the critical aspects of design and results, the value of the next generation of research may be compromised.

Possible Disadvantages of Standards

It is important to point out that reporting standards also can lead to excessive standardization with negative implications. For example, standardized reporting could fill articles with details of methods and results that are inconsequential to interpretation. The critical facts about a study can get lost in an excess of minutiae. Further, a forced consistency can lead to ignoring important uniqueness. Reporting standards that appear comprehensive might lead researchers to believe that “If it’s not asked for or does not conform to criteria specified in the standards, it’s not necessary to report.” In rare instances, then, the setting of reporting standards might lead to the omission of information critical to understanding what was done in a study and what was found.

Also, as noted above, different methods are required for studying different psychological phenomena. What needs to be reported in order to evaluate the correspondence between methods and inferences is highly dependent on the research question and empirical approach. Inferences about the effectiveness of psychotherapy, for example, require attention to aspects of research design and analysis that are different from those important for inferences in the neuroscience of text processing. This context dependency pertains not only to topic-specific considerations but also to research designs. Thus, an experimental study of the determinants of well-being analyzed via analysis of variance engenders different reporting needs than a study on the same topic that employs a passive longitudinal design and structural equation modeling. Indeed, the variations in substantive topics and research designs are factorial in this regard. So experiments in psychotherapy and neuroscience could share some reporting standards, even though studies employing structural equation models investigating well-being would have little in common with experiments in neuroscience.

Obstacles to Developing Standards

One obstacle to developing reporting standards encountered by the JARS Group was that differing taxonomies of research approaches exist and different terms are used within different subdisciplines to describe the same operational research variations. As simple examples, researchers in health psychology typically refer to studies that use experimental manipulations of treatments conducted in naturalistic settings as randomized clinical trials, whereas similar designs are referred to as randomized field trials in educational psychology. Some research areas refer to the use of random assignment of participants, whereas others use the term random allocation. Another example involves the terms multilevel model, hierarchical linear model, and mixed effects model, all of which are used to identify a similar approach to data analysis. There have been, from time to time, calls for standardized terminology to describe commonly but inconsistently used scientific terms, such as Kraemer et al.’s (1997) distinctions among words commonly used to denote risk. To address this problem, the JARS Group attempted to use the simplest descriptions possible and to avoid jargon and recommended that the new Publication Manual include some explanatory text.

A second obstacle was that certain research topics and methods will reveal different levels of consensus regarding what is and is not important to report. Generally, the newer and more complex the technique, the less agreement there will be about reporting standards. For example, although there are many benefits to reporting effect sizes, there are certain situations (e.g., multilevel designs) where no clear consensus exists on how best to conceptualize and/or calculate effect size measures. In a related vein, reporting a confidence interval with an effect size is sound advice, but calculating confidence intervals for effect sizes is often difficult given the current state of software. For this reason, the JARS Group avoided developing reporting standards for research designs about which a professional consensus had not yet emerged. As consensus emerges, the JARS can be expanded by adding modules.
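For the better-established two-group case, the calculation can be sketched as follows. This is a hedged illustration, not a JARS prescription: the function name and sample data are ours, and the interval uses a common large-sample approximation to the variance of d, which is only one of several available approaches.

```python
import math
import statistics


def cohens_d_with_ci(group1, group2, confidence=0.95):
    """Cohen's d for two independent groups, with a confidence interval
    based on a common large-sample approximation to the variance of d."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    # Pooled standard deviation across the two groups.
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    # Large-sample variance approximation for d.
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half = z * math.sqrt(var_d)
    return d, (d - half, d + half)


d, (low, high) = cohens_d_with_ci([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
# d is approximately -0.63, with a wide interval at these small n's
```

Even in this simple case the interval rests on an approximation; for multilevel and other complex designs, where the definition of the effect size itself is contested, no comparably standard recipe exists, which is why the JARS Group deferred standards there.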

Finally, the rapid pace of developments in methodology dictates that any standards would have to be updated frequently in order to retain currency. For example, the state of the art for reporting various analytic techniques is in a constant state of flux. Although some general principles (e.g., reporting the estimation procedure used in a structural equation model) can incorporate new developments easily, other developments can involve fundamentally new types of data for which standards must, by necessity, evolve rapidly. Nascent and emerging areas, such as functional neuroimaging and molecular genetics, may require developers of standards to be on constant vigil to ensure that new research areas are appropriately covered.

Questions for the Future

It has been mentioned several times that the setting of standards for reporting of research in psychology involves both general considerations and considerations specific to separate subdisciplines. And, as the brief history of standards in the APA Publication Manual suggests, standards evolve over time. The JARS Group expects refinements to the contents of its tables. Further, in the spirit of evidence-based decision making that is one impetus for the renewed emphasis on reporting standards, we encourage the empirical examination of the effects that standards have on reporting practices. Not unlike the issues many psychologists study, the proposal and adoption of reporting standards is itself an intervention. It can be studied for its effects on the contents of research reports and, most important, its impact on the uses of psychological research by decision makers in various spheres of public and health policy and by scholars seeking to understand the human mind and behavior.

The Working Group on Journal Article Reporting Standards was composed of Mark Appelbaum, Harris Cooper (Chair), Scott Maxwell, Arthur Stone, and Kenneth J. Sher. The working group wishes to thank members of the American Psychological Association’s (APA’s) Publications and Communications Board, the APA Council of Editors, and the Society for Research Synthesis Methodology for comments on this report and the standards contained herein.

  • Altman, D. G., Schulz, K. F., Moher, D., Egger, M., Davidoff, F., Elbourne, D., Gotzsche, P. C., & Lang, T. (2001). The revised CONSORT statement for reporting randomized trials: Explanation and elaboration. Annals of Internal Medicine, 663–694. Retrieved April 20, 2007, from http://www.consort-statement.org/
  • American Educational Research Association. (2006). Standards for reporting on empirical social science research in AERA publications. Educational Researcher, 35(6), 33–40.
  • American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
  • American Psychological Association, Council of Editors. (1952). Publication manual of the American Psychological Association. Psychological Bulletin, 49(Suppl., Pt. 2).
  • APA Presidential Task Force on Evidence-Based Practice. (2006). Evidence-based practice in psychology. American Psychologist, 61, 271–283.
  • Cook, D. J., Sackett, D. L., & Spitzer, W. O. (1995). Methodologic guidelines for systematic reviews of randomized control trials in health care from the Potsdam Consultation on Meta-Analysis. Journal of Clinical Epidemiology, 48, 167–171.
  • Cooper, H. (2009). Research synthesis and meta-analysis: A step-by-step approach (4th ed.). Thousand Oaks, CA: Sage.
  • Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.). (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.
  • Davidson, K. W., Goldstein, M., Kaplan, R. M., Kaufmann, P. G., Knatterud, G. L., Orleans, T. C., et al. (2003). Evidence-based behavioral medicine: What is it and how do we achieve it? Annals of Behavioral Medicine, 26, 161–171.
  • Des Jarlais, D. C., Lyles, C., Crepaz, N., & the TREND Group. (2004). Improving the reporting quality of nonrandomized evaluations of behavioral and public health interventions: The TREND statement. American Journal of Public Health, 361–366. Retrieved April 20, 2007, from http://www.trend-statement.org/asp/documents/statements/AJPH_Mar2004_Trendstatement.pdf
  • International Committee of Medical Journal Editors. (2007). Uniform requirements for manuscripts submitted to biomedical journals: Writing and editing for biomedical publication. Retrieved April 9, 2008, from http://www.icmje.org/#clin_trials
  • Kraemer, H. C., Kazdin, A. E., Offord, D. R., Kessler, R. C., Jensen, P. S., & Kupfer, D. J. (1997). Coming to terms with the terms of risk. Archives of General Psychiatry, 54, 337–343.
  • Merriam-Webster’s online dictionary. (n.d.). Retrieved April 20, 2007, from http://www.m-w.com/dictionary/
  • Moher, D., Cook, D. J., Eastwood, S., Olkin, I., Rennie, D., & Stroup, D., for the QUOROM Group. (1999). Improving the quality of reporting of meta-analysis of randomized controlled trials: The QUOROM statement. Lancet, 354, 1896–1900.
  • Moher, D., Schulz, K. F., & Altman, D. G. (2001). The CONSORT statement: Revised recommendations for improving the quality of reports of parallel-group randomized trials. Annals of Internal Medicine, 657–662. Retrieved April 20, 2007, from http://www.consort-statement.org
  • Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & the PRISMA Group. (2008). Preferred reporting items for systematic reviews and meta-analysis: The PRISMA statement. Manuscript submitted for publication.
  • No Child Left Behind Act of 2001, Pub. L. No. 107–110, 115 Stat. 1425 (2002, January 8).
  • Sackett, D. L., Rosenberg, W. M. C., Muir Grey, J. A., Hayes, R. B., & Richardson, W. S. (1996). Evidence based medicine: What it is and what it isn’t. British Medical Journal, 312, 71–72.
  • Stone, A. A., & Shiffman, S. (2002). Capturing momentary, self-report data: A proposal for reporting guidelines. Annals of Behavioral Medicine, 24, 236–243.
  • Stroup, D. F., Berlin, J. A., Morton, S. C., Olkin, I., Williamson, G. D., Rennie, D., et al. (2000). Meta-analysis of observational studies in epidemiology. Journal of the American Medical Association, 283, 2008–2012.
  • Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.

types of research in counseling

Promoting best practices

in assessment, research, & evaluation in counseling.

types of research in counseling

Share your passion

for the advancement of measurement and evaluation

Welcome to AARC

The Association for Assessment and Research in Counseling (AARC) is an organization of counselors, educators, and other professionals that advances the counseling profession by promoting best practices in assessment, research, and evaluation in counseling. The mission of AARC is to promote and recognize excellence in assessment, research, and evaluation in counseling.

Our Education Center has resources for educators, researchers, and practitioners.

  • Research Design Documents
  • Learn About Membership
  • Assessment and Testing Documents
  • American Counseling Association

types of research in counseling

Ready to become a member?

Upcoming events.

We are excited to share upcoming AARC events!

  • September 6th - 7th, 2024 2024 AARC Conference

Social Feed

View this profile on Instagram AARC (@ aarc.aarc ) • Instagram photos and videos

Become a member

AARC members are a dynamic group of professionals who share a common passion for measurement and evaluation.


The Latest

AARC Networking Brunch at the ACA National Conference

Welcome Newly Elected AARC Executive Council Members

AARC is pleased to announce the results of the recent election.  Thank you to all who participated as candidates and for members who took the…

2024 AARC Conference

Please visit the Conference 2024 page for more details.

Recruiting Participants for Quantitative Research Studies: Opportunities and Challenges

The AARC Research Committee is excited to present a panel discussion entitled "Recruiting Participants for Quantitative Research Studies: Opportunities and Challenges." The four expert panelists…

Call for 2024 Conference Proposals

The Association for Assessment and Research in Counseling is seeking proposals for the 2024 AARC Conference in Pittsburgh, Pennsylvania from September 6th to 7th, 2024….

Northwestern University The Family Institute


406-6: Research Methods in Counseling (1)

Provides an understanding of types of research methods, basic statistics, and ethical considerations in research; principles, practices, and applications of needs assessment and program evaluation.




Thirty years later, the Women’s Health Initiative provides researchers with key messages for postmenopausal women


Researchers from the NHLBI-supported Women's Health Initiative (WHI), the largest women's health study in the U.S., published findings from a 20-year review that underscores the importance of postmenopausal women moving away from a one-size-fits-all approach to medical decisions. Through this lens, the researchers encourage women and physicians to work together to make shared, individualized decisions based on a woman's medical history, age, lifestyle, disease risks, symptoms, and health needs and preferences, among other factors. These findings support the concept of "whole-person health" and were published in JAMA.

After reviewing decades of data from clinical trials that started between 1993 and 1998, the researchers explain that estrogen, alone or combined with progestin (two types of hormone replacement therapy), had varying outcomes for chronic conditions. The evidence does not support using these therapies to reduce risks for chronic diseases such as heart disease, stroke, cancer, and dementia. However, the authors caution that the study was not designed to assess the effects of FDA-approved hormone therapies for treating menopausal symptoms; these benefits had been established before the WHI study began.

Another finding from the study is that calcium and vitamin D supplements were not associated with reduced risks for hip fractures among postmenopausal women at average risk for osteoporosis. Still, the authors note that women concerned about getting sufficient intake of either nutrient should talk to their doctor. A third finding was that a low-fat dietary pattern with at least five daily servings of fruits and vegetables and increased grains did not reduce the risk of breast or colorectal cancer, but was associated with a reduced risk of death from breast cancer.

Media Coverage

  • The Women's Health Initiative trials: clinical messages
  • HRT for menopause is safe for some women, new study shows
  • Major study supports safety of HRT in early menopause
  • Hormone therapy for menopause doesn't reduce heart disease risk
  • No need to fear menopause hormone drugs, finds major women's health study
  • Researchers review findings and clinical messages from the Women’s Health Initiative 30 years after launch

Identifying the association between serum urate levels and gout flares in patients taking urate-lowering therapy: a post hoc cohort analysis of the CARES trial with consideration of dropout

  • http://orcid.org/0000-0001-9475-1363 Sara K Tedeschi 1 ,
  • Keigo Hayashi 1 ,
  • http://orcid.org/0000-0001-7638-0888 Yuqing Zhang 2 ,
  • Hyon Choi 2 ,
  • http://orcid.org/0000-0001-8202-5428 Daniel H Solomon 1
  • 1 Brigham and Women's Hospital , Boston , Massachusetts , USA
  • 2 Massachusetts General Hospital , Boston , Massachusetts , USA
  • Correspondence to Dr Sara K Tedeschi, Brigham and Women's Hospital, Boston, Massachusetts, USA; stedeschi1{at}bwh.harvard.edu

Objective To investigate gout flare rates based on repeated serum urate (SU) measurements in a randomised controlled trial of urate-lowering therapy (ULT), accounting for dropout and death.

Methods We performed a secondary analysis using data from Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout, which randomised participants to febuxostat or allopurinol, titrated to target SU <6 mg/dL with flare prophylaxis for 6 months. SU was categorised as ≤3.9, 4.0–5.9, 6.0–7.9, 8.0–9.9 or ≥10 mg/dL at each 3–6 month follow-up. The primary outcome was gout flare. Poisson regression models, adjusted for covariates and factors related to participant retention versus dropout, estimated gout flare incidence rate ratios by time-varying SU category.

Results Among 6183 participants, the median age was 65 years and 84% were male. Peak gout flare rates in all SU categories were observed in months 0–6, coinciding with the initiation of ULT, and again in months 6–12, after stopping prophylaxis. Flare rates were similar across SU groups in the initial year of ULT. During months 36–72, a dose–response relationship was observed between SU category and flare rate: lower flare rates were observed when SU ≤3.9 mg/dL and greater rates when SU ≥10 mg/dL, compared with SU 4.0–5.9 mg/dL (p for trend <0.01).

Conclusion Gout flare rates were persistently higher when SU ≥6 mg/dL after the first year of ULT after accounting for censoring. The spike in flares in all categories after stopping prophylaxis suggests a longer duration of prophylaxis may be warranted.

  • Crystal arthropathies

Data availability statement

Data are available upon reasonable request. This publication is based on research using data from data contributor Takeda that has been made available through Vivli.

https://doi.org/10.1136/ard-2024-225761


WHAT IS ALREADY KNOWN ON THIS TOPIC

A previous analysis of the Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout (CARES) dataset reported on the relationship between time-varying serum urate (SU) and gout flares starting at month 12 after randomisation and identified that gout flare rates were lowest when SU<4.0 mg/dL and highest when SU ≥6 mg/dL. A recent population-based study reported that individuals with baseline SU ≥10 mg/dL at enrollment in the UK Biobank had a 15-fold greater gout flare rate than those with baseline SU <6 mg/dL after 5 years of follow-up.

WHAT THIS STUDY ADDS

The present analysis of CARES data reports on gout flare rates in the 12 months after urate-lowering therapy (ULT) initiation (the highest-risk period for flares), takes censoring into account, evaluates smaller SU increments and reports flare incidence rate ratios in each of four time periods among participants who continued in follow-up. This analysis provides a more nuanced assessment of gout flare rates by SU measurement at standardised time points after initiation of ULT, whereas the UK Biobank study did not focus on ULT initiators.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

The spike in flares in all SU categories after stopping prophylaxis suggests that prophylaxis for longer than 6 months may be warranted. Aiming for lower SU levels has the potential to lower gout flare rates.

Introduction

Gout flares negatively impact quality of life, and patients with gout identify flare reduction as an important outcome. 1–3 Urate-lowering therapy (ULT) is a cornerstone of gout management, and achieving serum urate (SU) <6 mg/dL is recommended to reduce the risk of gout flare recurrence. 1 However, the precise SU target that minimises gout flares has not been empirically determined. The American College of Rheumatology (ACR) 2020 Gout Treatment Guidelines recommend titrating ULT to achieve a target SU <6 mg/dL, the target used in three large trials testing strategies to implement ULT. 4–6 Urate dissolves in serum at concentrations below 6.8 mg/dL, which provides the rationale for achieving SU <6 mg/dL. The utility of achieving and maintaining SU below 6 mg/dL has been tested in just one randomised controlled trial (RCT) among patients with erosive gout; the frequency of gout flares was not significantly different at months 12 or 24 in those randomised to achieve target SU <0.30 mmol/L (<5 mg/dL) compared with SU <0.20 mmol/L (<3.4 mg/dL). 7 Whether maintaining SU <6 mg/dL long term (eg, more than 2 years after initiating ULT) reduces gout flare rates has not been tested.

It is well recognised that gout flares are common in the months after initiating ULT. 8 The ACR 2020 Gout Treatment Guidelines recommend using flare prophylaxis for at least 3–6 months after starting ULT and to continue or resume prophylaxis on a case-by-case basis in patients with recurrent flares. 1 The benefit of a longer duration of gout flare prophylaxis for all patients starting ULT has not been tested.

The Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout (CARES) randomised controlled trial tested the effect of ULT with febuxostat or allopurinol on cardiovascular (CV) endpoints. 9 SU was measured at regular intervals and gout flares were assessed by self-report at each visit. We investigated gout flare rates based on repeated measurements of SU in the CARES trial, accounting for dropout and death.

CARES trial overview

CARES randomised 6190 participants with gout to febuxostat or allopurinol; the primary outcome was a composite of CV events. 9 The median follow-up was 3.2 years. SU was measured at months 0, 3, 6 and then every 6 months, and ULT was titrated to achieve SU <6 mg/dL. Participants received gout flare prophylaxis for 6 months with colchicine 0.6 mg daily or naproxen 250 mg two times per day if colchicine was not tolerated. The trial had a high dropout rate, with 45% of participants missing at least one visit.

Observational study design

We performed a secondary analysis using CARES trial data. Participants were followed prospectively from randomisation (month 0) through censoring (dropout or death) or the end of the trial, whichever came first. For this analysis, randomisation was broken and ULT was considered a covariate rather than the exposure. Follow-up intervals were 3 or 6 months, as per the trial study visit schedule. Participants with baseline SU <4 mg/dL were excluded.

Exposure and covariates

SU at the start of each interval was the exposure for the next interval. For example, SU at month 0 was the exposure for outcomes reported in the month 0–3 interval. Time-varying SU exposure categories were ≤3.9, 4.0–5.9, 6.0–7.9, 8.0–9.9 and ≥10 mg/dL.
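The lagging scheme described above (each interval's outcome is modelled against the SU measured at the interval's start) can be sketched in a few lines; the data frame below is hypothetical, with illustrative column names and values rather than the trial's actual variables:

```python
import pandas as pd

# Hypothetical visit-level data for two participants (not actual CARES data).
visits = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2],
    "month": [0, 3, 6, 0, 3, 6],
    "su":    [9.2, 5.8, 4.1, 10.4, 8.3, 7.0],  # serum urate, mg/dL
})

# SU at the start of each interval is the exposure for that interval's
# outcomes: e.g. the month-0 measurement is the exposure for month 0-3.
visits["su_exposure"] = visits.groupby("id")["su"].shift(1)

# Categorise into the paper's time-varying SU bands.
bins = [-float("inf"), 3.9, 5.9, 7.9, 9.9, float("inf")]
labels = ["<=3.9", "4.0-5.9", "6.0-7.9", "8.0-9.9", ">=10"]
visits["su_cat"] = pd.cut(visits["su_exposure"], bins=bins, labels=labels)
print(visits)
```

For participant 1, the row reporting the month 0–3 interval carries the month-0 value 9.2 mg/dL as its exposure, which falls in the 8.0–9.9 band.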

Baseline covariates included age, sex, race, body mass index (kg/m 2 ), gout duration in years, tophus (present or absent), SU, randomised ULT assignment (febuxostat or allopurinol), CV risk factors (diabetes, hypertension, hyperlipidaemia, myocardial infarction, unstable angina, coronary revascularisation, cerebral revascularisation, stroke and peripheral vascular disease) and creatinine clearance (mL/min).

Time-varying covariates included absolute change in SU since the prior interval (mg/dL), direction of change in SU since the prior interval (increase, decrease or no change), flare prophylaxis (colchicine, naproxen or none), ULT use (febuxostat, allopurinol or none) and gout flare in the prior interval (present if ≥1 flare or absent).

Self-reported gout flares were recorded at each visit. Multiple flares per interval were counted if the dates of flare onset were at least 14 days apart.
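One way to implement the 14-day rule is to count an onset only when it falls at least 14 days after the last counted onset; that anchoring choice is an assumption here, since the trial's exact de-duplication logic is not described:

```python
from datetime import date

def count_flares(onset_dates, min_gap_days=14):
    """Count flares in an interval, folding any onset fewer than
    min_gap_days after the last counted onset into the same flare.
    (Sketch of the 14-day rule; anchoring to the last counted onset
    is an assumption, not the trial's documented algorithm.)"""
    counted = 0
    last_counted = None
    for d in sorted(onset_dates):
        if last_counted is None or (d - last_counted).days >= min_gap_days:
            counted += 1
            last_counted = d
    return counted

# Three reported onsets; the second is only 5 days after the first,
# so it is folded into the same flare and only two flares are counted.
onsets = [date(2024, 1, 1), date(2024, 1, 6), date(2024, 2, 1)]
print(count_flares(onsets))  # → 2
```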

Statistical analysis

Baseline characteristics were summarised with descriptive statistics. We derived inverse probability of censoring weights (IPCW) for each follow-up interval using the baseline covariates, time-varying covariates and interval duration (3 or 6 months). IPCWs were used to account for potentially differential dropout between treatment groups (ie, informative censoring) by up-weighting participants who remained in the study and had similar traits to those who dropped out. Dropout and death were both treated as censoring; very few deaths occurred in the trial, providing a rationale for this approach, and Fine and Gray models for the competing risk of death were not suitable for estimating gout flare rates over time. Gout flare incidence rates (IRs) and incidence rate ratios (IRRs) were estimated using Poisson models adjusted for IPCW, baseline covariates and time-varying covariates. Poisson models were stratified into four time periods: 0–6, 6–12, 12–36 and 36–72 months postrandomisation. We performed a test for linear trend across SU categories within each time period to calculate p for trend. We also assessed whether the velocity of SU lowering affected the number of gout flares: we identified individuals with a 2.5–3.5 mg/dL decrease in SU over 3 months or over 6 months and used a t-test to compare the mean number of flares during that SU-lowering period (3 or 6 months). Missing SU values were imputed using multiple imputation. Analyses were performed using R V.4.1. A two-sided p-value <0.05 was considered significant. This study was exempt from Institutional Review Board approval.
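As a rough illustration of the modelling approach (not the authors' actual code, which also adjusts for covariates), the sketch below fits a weighted Poisson regression from scratch via iteratively reweighted least squares on simulated data; the SU grouping, weights, rates and sample size are all hypothetical:

```python
import numpy as np

def weighted_poisson_irr(y, X, weights, offset, n_iter=50):
    """Fit a log-link Poisson regression with per-observation weights
    (e.g. IPCW) and a log person-time offset via IRLS; returns
    exp(coefficients), i.e. incidence rate ratios."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)
        w = weights * mu                     # IRLS working weights
        z = eta - offset + (y - mu) / mu     # working response
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))
    return np.exp(beta)

# Simulated cohort: a hypothetical "SU >= 6" group flares at 0.8/person-year
# versus 0.4/person-year below target, each observed for 1 person-year.
rng = np.random.default_rng(0)
n = 4000
above_target = rng.integers(0, 2, n)
person_years = np.ones(n)
y = rng.poisson(np.where(above_target, 0.8, 0.4) * person_years)
X = np.column_stack([np.ones(n), above_target])
irr = weighted_poisson_irr(y, X, weights=np.ones(n),
                           offset=np.log(person_years))
print(irr[1])  # estimated IRR for the above-target group, near the true 2.0
```

With unit weights this reduces to an ordinary Poisson GLM; in an IPCW analysis the `weights` argument would instead carry each interval's inverse probability of remaining uncensored.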

Among the 6183 participants in this analysis, the median age was 65 (IQR 58–71) years and 84% were male. Baseline characteristics are provided in table 1 . The median follow-up was 32 months, and 442 participants (7.1%) died after randomisation. Median SU was 8.6 mg/dL (IQR 7.6–9.7) at randomisation. 71% achieved SU <6 mg/dL by month 3, and this percentage increased slightly over time among retained participants ( figure 1A,B ).


Serum urate levels over time in the Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout trial dataset. (A) Serum urate level over time by serum urate category at randomisation. (B) The proportion of subjects in each serum urate category over time.


Characteristics of CARES trial participants included in this secondary analysis

Gout flare rates were highest during nearly all intervals when SU ≥10 mg/dL and lowest when SU ≤3.9 mg/dL ( table 2 and figure 2 ). Peak gout flare rates for all SU categories were observed between months 0 and 3, coinciding with the initiation of ULT and the greatest change in SU. A second spike in gout flares occurred in all SU groups between months 6 and 12, coinciding with the discontinuation of prophylaxis ( figure 2 ).

Age-adjusted and sex-adjusted gout flare rates per person-year by serum urate category in the Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout trial dataset.

Time-varying serum urate category and adjusted gout flare IRR in the CARES trial dataset

In the initial year of ULT, flare rates did not significantly differ between SU groups, though they were consistently highest when SU ≥10 mg/dL ( table 2 , p for trend >0.5 during months 0–6 and 6–12). After the first year of ULT, a dose–response relationship was observed between SU category and flare rate after adjustment for IPCW, baseline covariates and time-varying covariates. Gout flare rates were significantly lower when SU ≤3.9 mg/dL compared with SU 4.0–5.9 mg/dL after month 12 (IRR during months 12–36: 0.80, 95% CI 0.66 to 0.98; IRR during months 36–72: 0.85, 95% CI 0.62 to 1.18) ( table 2 ). By contrast, flare rates were nearly 50% greater when SU ≥10 mg/dL compared with SU 4.0–5.9 mg/dL during months 12–36 (IRR 1.44, 95% CI 0.96 to 2.16; p for trend 0.01), and more than four times higher during months 36–72 (IRR 4.47, 95% CI 2.42 to 8.26; p for trend <0.01).

There were 607 participants with a 2.5–3.5 mg/dL decrease in SU over 3 months and 116 with the same decrease in SU over 6 months. The median number of gout flares was zero in both groups. The mean number of gout flares was 0.24 (SD 0.56) over 3 months if the SU decrease occurred over 3 months, and 0.38 flares (SD 0.71) over 6 months if the SU decrease occurred over 6 months (p=0.02).
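The velocity comparison above can be reproduced from the published summary statistics alone, assuming a pooled-variance two-sample t-test (the paper says only that "a t-test" was used):

```python
from scipy.stats import ttest_ind_from_stats

# Reported summary statistics: 607 participants with the SU decrease over
# 3 months (mean 0.24 flares, SD 0.56) versus 116 over 6 months
# (mean 0.38 flares, SD 0.71); pooled-variance two-sample t-test.
t, p = ttest_ind_from_stats(mean1=0.24, std1=0.56, nobs1=607,
                            mean2=0.38, std2=0.71, nobs2=116)
print(round(p, 2))  # ~0.02, matching the reported p-value
```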

A significant dose–response relationship was present between SU and gout flare rate after the first year of ULT, after accounting for censoring in this observational study of CARES trial data. Gout flare rates remained significantly higher after the first year of ULT when SU ≥6 mg/dL compared with <6 mg/dL (at target) and were significantly lower when SU ≤3.9 mg/dL. Peak flare rates coincided with two study protocol elements: first, the initiation of ULT, and next, the discontinuation of prophylaxis. These peaks occurred in all SU categories, suggesting that factors other than SU also contribute to flare risk. Gout flare rates were four times as high when SU was ≥10 mg/dL three or more years after starting ULT, compared with SU 4.0–5.9 mg/dL.

Nearly three-quarters of participants achieved target SU within 3 months of starting febuxostat or allopurinol, demonstrating the efficacy of xanthine oxidase inhibitors when taken in a trial setting. However, approximately 10% of SU levels were ≥10 mg/dL in each interval after the first year of follow-up, and gout flare rates were significantly higher in this SU category than all other categories. It is possible that participants whose SU remained ≥10 mg/dL differed from other participants in a number of ways, including genetic differences in SU metabolism, adherence to the study drug or reaching the protocol’s maximum dose of allopurinol (600 mg or 400 mg if renal impairment). Identifying which patients are at the greatest risk of persistent hyperuricaemia despite escalated ULT dosing might allow for different therapeutic approaches to reduce the risk of flares.

A previous analysis of the CARES dataset reported on the relationship between time-varying SU and gout flares starting at month 12. 10 Unadjusted gout flare rates were lowest when SU <4.0 mg/dL (0.27 flares/person-year, 95% CI 0.24 to 0.30) and highest when SU ≥6 mg/dL (0.50 flares/person-year, 95% CI 0.48 to 0.53). The present analysis takes censoring into account, meaning that the flare rates are more reflective of 'real-world' patients (although in a clinical trial population). Our analysis included the 12 months after randomisation, which is recognised as a high-risk period for gout flares. We identified gout flare rates for SU categories above the target of 6 mg/dL, which is pertinent to patient care as most patients with gout unfortunately have SU above target, and we reported flare IRRs in each of four time periods among participants who continued in follow-up. A recent population-based study reported that individuals with baseline SU ≥10 mg/dL at enrollment in the UK Biobank had a 15-fold greater gout flare rate than those with baseline SU <6 mg/dL after 5 years of follow-up, much higher than the fourfold greater rate when SU ≥10 mg/dL in the present analysis of clinical trial data. 11 A key reason underlying this difference is that the UK Biobank data reflect stable SU levels, which more accurately represent the total body urate pool than SU levels from participants in a treat-to-target trial of ULT. The present analysis of CARES trial data provides a more nuanced assessment of gout flare rates for up to 5 years after initiation of ULT and used time-varying SU measurements at standardised time points. Gout flares are common in the months after initiating ULT, and the present analysis additionally suggests that the rate of SU lowering may be relevant for gout flare risk. 8

Results from the present study raise questions about the optimal duration of gout flare prophylaxis after initiating ULT. In addition to causing joint pain and functional limitation, gout flares have been temporally associated with an increased risk of CV events, providing yet another reason to reduce gout flare risk. 12 Low-dose colchicine, which is commonly used for gout flare prophylaxis, reduced the risk of recurrent CV events in patients with prior CV events. 13 We observed a peak in gout flares during months 6–12, temporally coinciding with the protocol-specified cessation of prophylaxis at month 6. A recent RCT testing the effect of colchicine versus placebo on gout flares in the first 6 months of ULT similarly reported an increase in gout flares after stopping colchicine. 14 Taken together, these findings suggest a potential benefit of a longer duration of prophylaxis in all patients starting ULT.

The ultimate goal of ULT is to dissolve all monosodium urate crystals by normalising the enlarged total body urate pool, which would prevent future gout flares and might also mitigate the risk of CV events, venous thromboembolism and other complications that have been associated with gout flares. 12 15 16 During periods of very low SU (≤3.9 mg/dL), gout flare rates were lower than periods of SU 4.0–5.9 mg/dL. Both of these SU categories are at target, defined as <6 mg/dL. The ACR gout treatment guidelines do not recommend any alternative targets due to ‘a lack of supporting evidence for additional specific thresholds’. 1 The EULAR gout treatment guidelines recommend against maintaining SU <3 mg/dL long term due to concerns about associations between very low SU and neurodegenerative disease. 17 A recent RCT testing the effect of different ULT targets on bone erosions in gout reported that the frequency of gout flares was not significantly different at year 2 in those randomised to achieve target SU <0.30 mmol/L (<5 mg/dL) compared with SU <0.20 mmol/L (<3.4 mg/dL). 7 Achievement of SU <0.20 mmol/L was also noted to be challenging using oral ULT. 7 Results from the present analysis, showing a lower flare rate when SU is maintained below 4 mg/dL beyond year 2 of ULT, raise the question of whether shared decision-making in gout has a role in determining the target SU. Patients who express a strong preference to avoid flares for the rest of their lives might opt to aim for a lower SU well below 6 mg/dL, weighing the risk of flares against the possible risk of neurocognitive decline.

This secondary data analysis leveraged a large dataset generated from an RCT of ULT, in which frequent SU monitoring was performed. Data were available for more than 5 years, providing information about longitudinal SU values and flares. Limitations of this analysis include a decreased sample size after the first year of follow-up, though we accounted for this by including IPCW in our IRR estimates. Given the high amount of dropout in later years, the IRR in later years may be prone to selection bias and should be interpreted with caution. Gout flares were self-reported and did not require physician evaluation, though a self-reported gout flare measure (not employed in this trial) showed excellent accuracy. 18

In conclusion, gout flares were lower when SU was <6 mg/dL compared with ≥6 mg/dL after the first year of ULT. The potential benefit of maintaining very low long-term SU levels (≤3.9 mg/dL) for longer than 2 years to reduce flare rates deserves further study. It may be prudent to continue gout flare prophylaxis, especially with colchicine, beyond the first 6 months of ULT given the observed spike in gout flare rates after stopping prophylaxis, CV risks associated with gout flare and CV benefits of colchicine.

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

This was a secondary analysis of previously published clinical trial data, thus this study was exempt from Mass General Brigham Institutional Review Board approval.

Acknowledgments

This publication is based on research using data from data contributor Takeda that has been made available through Vivli. Vivli has not contributed to or approved and is not in any way responsible for the contents of this publication.

  • FitzGerald JD ,
  • Dalbeth N ,
  • Mikuls T , et al
  • Topless R ,
  • Noorbaloochi S ,
  • Merriman TR , et al
  • Doherty M ,
  • Jenkins W ,
  • Richardson H , et al
  • Goldfien R ,
  • Pressman A ,
  • Jacobson A , et al
  • Mikuls TR ,
  • Cheetham TC ,
  • Levy GD , et al
  • Billington K , et al
  • Yamanaka H ,
  • Ide Y , et al
  • Becker MA , et al
  • Becker MA ,
  • White WB , et al
  • McCormick N ,
  • Challener GJ , et al
  • Cipolletta E ,
  • Nakafero G , et al
  • Nidorf SM ,
  • Fiolet ATL ,
  • Mosterd A , et al
  • Mihov B , et al
  • Reginato AM , et al
  • Richette P ,
  • Pascual E , et al
  • Saag KG , et al

Handling editor Josef S Smolen

Contributors SKT, DHS, YZ and HC conceived the study and contributed to study design. KH performed data analyses. All authors contributed to data interpretation. SKT drafted the manuscript and all authors provided critical feedback and approved the final version of the manuscript. SKT is the guarantor and accepts full responsibility for the work, had access to the data and controlled the decision to publish.

Funding National Institutes of Health (K23 AR075070 (Tedeschi), L30 AR070514 (Tedeschi), R03 AR081309 (Tedeschi), P30 AR072577 (Solomon)).

Competing interests SKT: consulting fees for Novartis and Avalo Therapeutics. HC: research grants from Horizon; service on a board or committee for LG Chem, Shanton and ANI Pharmaceuticals. DHS: research grants from CorEvitas, Janssen, Moderna and Novartis. Royalties from UpToDate. KH and YZ: no competing interests reported.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.


COMMENTS

  1. Doing Research in Counselling and Psychotherapy

    Principle 3: Developing reliable and practically useful research-based knowledge in the field of counselling and psychotherapy requires the adoption of methodological pluralism. Principle 4: The purpose of therapy research is to produce practical knowledge that contributes to social justice. Principle 5: Good therapy research requires paying ...

  2. Counseling Psychology Research Methods: Qualitative Approaches

    … (Kidd & Kral, 2005) is receiving increasing attention as counseling psychologists explore more effective ways of pursuing our multicultural and social justice research agendas. In 2005 and 2007, respectively, the Journal of Counseling Psychology (JCP) and The Counseling Psychologist (TCP) published special ...

  3. Methodologies in Counseling Psychology

    Abstract. This chapter reviews quantitative and qualitative methodologies most frequently used in counseling psychology research. We begin with a review of the paradigmatic bases and epistemological stances of quantitative and qualitative research, followed by overviews of both approaches to empirical research in counseling psychology.

  4. Research methods for counseling: An introduction.

    This text provides a rich, culturally-sensitive presentation of current research techniques in counseling. Author Robert J. Wright introduces the theory and research involved in research design, measurement, and assessment with an appealingly clear writing style. He addresses ways to meet the requirements of providing the data needed to facilitate evidence-based therapy and interventions with ...

  5. Journal of Counseling Psychology

    The Journal of Counseling Psychology® publishes empirical research in the areas of counseling activities (including assessment, interventions, consultation, supervision, training, prevention, psychological education, and advocacy); career and educational development and vocational psychology; and diversity and underrepresented populations in ...

  6. Counseling Research: A Practitioner-Scholar Approach 2nd Edition

    This widely adopted and accessible introductory text for counselors-in-training and emerging researchers provides a foundational understanding of the primary research methods used in counseling and how these concepts can be applied to research design. Writing in a clear and ...

  7. Advancing the Counseling Profession Through Contemporary Quantitative

    In this article, we describe four unique quantitative methods: single-case research design, dyadic data analysis, profile analysis, and nonlinear analysis. We detail the utility, application, and recommendations for use associated with each approach, including advances in the methods reviewed for future consideration in counseling research.

  8. Counseling psychology research methods: Qualitative approaches

    The questions raised by these three studies are anchored in an understanding of research paradigms; thus, in this chapter, we first identify the paradigmatic issues that form the foundation of qualitative inquiry. Building on this framework, we describe the current status of the genre by reviewing content analyses of qualitative research in counseling and counseling psychology.

  9. Counseling research: A practitioner-scholar approach.

    This introductory text for counselors-in-training and emerging researchers focuses on research methodology, design, measurement, and evaluation. Richard Balkin and David Kleist explain the primary research methods used in counseling while emphasizing the importance of ethics and multicultural issues, demonstrating a professional counselor identity within the framework of research, and ...

  10. Counselling and Psychotherapy Research

    This virtual Research Methods edition of Counselling and Psychotherapy Research invites readers to consider and discuss the issue of therapist ... Therapy research has evolved in a multitude of directions since Freud's case study research. It is multifaceted and an ever-developing field, frequently positioned between art and science. ...

  11. Qualitative Methods in Mental Health Services Research

    In such designs, qualitative methods are used to explore and obtain depth of understanding while quantitative methods are used to test and confirm hypotheses and obtain breadth of understanding of the phenomenon of interest ( Teddlie & Tashakkori, 2003 ). Mixed method designs in mental health services research can be categorized in terms of ...
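    The division of labor described above, qualitative methods for depth and quantitative methods for breadth, can be sketched in miniature. In this hedged example, the interview codes and symptom scores are invented, and the quantitative strand is reduced to a single standardized effect size (Cohen's d with a pooled standard deviation); a real mixed methods study would integrate far richer strands.

    ```python
    from collections import Counter
    from statistics import mean, stdev

    # Hypothetical data for illustration only:
    # qualitative strand: codes assigned to interview excerpts
    codes = ["stigma", "access", "stigma", "cost", "access", "stigma"]
    # quantitative strand: symptom scores for a control and a counseled group
    control = [21.0, 19.5, 23.0, 20.5]
    counseled = [15.0, 16.5, 14.0, 17.5]

    # Depth: which themes dominate the interviews?
    theme_counts = Counter(codes)

    def cohens_d(a, b):
        """Standardized mean difference with a pooled standard deviation,
        used here as the confirmatory (quantitative) summary."""
        na, nb = len(a), len(b)
        pooled_sd = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                     / (na + nb - 2)) ** 0.5
        return (mean(a) - mean(b)) / pooled_sd

    print(theme_counts.most_common(1))        # most frequent qualitative theme
    print(round(cohens_d(control, counseled), 2))  # quantitative effect size
    ```

    Integration, the defining step of a mixed methods design, would then relate the two strands, for example by asking whether participants who raised the dominant theme also showed larger score changes.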

  12. How should we evaluate research on counselling and the treatment of

    In this context, 'counselling' has clear theoretical and empirical roots and is a synonym for a type of talking therapy. In contrast, a 2012 meta-analytic study by Cuijpers et al. examined the efficacy of 'nondirective supportive therapy' (NDST), which they stated is 'commonly described in the literature as ...

  13. Reporting Standards for Research in Psychology

    The methods used in the subdisciplines of psychology are so varied that the critical information needed to assess the quality of research and to integrate it successfully with other related studies varies considerably from method to method in the context of the topic under consideration.

  14. PDF Qualitative Research in Counseling: A Reflection for Novice ...

    This paper is thus written to support novice counselor researchers, and to inspire an emerging research culture through sharing formative experiences and lessons learned during a qualitative research project exploring minority issues in counseling. Key Words: Counseling, Health, Qualitative, Methods, and Narrative.

  15. AARC.com

    The Association for Assessment and Research in Counseling (AARC) is an organization of counselors, educators, and other professionals that advances the counseling profession by promoting best practices in assessment, research, and evaluation in counseling. The mission of AARC is to promote and recognize excellence in assessment, research, and ...

  16. Mixed methods research designs in counseling psychology.

    With the increased popularity of qualitative research, researchers in counseling psychology are expanding their methodologies to include mixed methods designs. These designs involve the collection, analysis, and integration of quantitative and qualitative data in a single or multiphase study. This article presents an overview of mixed methods ...

  17. 406-6: Research Methods in Counseling (1)

    Provides an understanding of types of research methods, basic statistics, and ethical considerations in research; principles, practices, and applications of needs assessment and program evaluation.

  18. Staff Members

    208 Raleigh Street CB #3916 Chapel Hill, NC 27515-8890 919-962-1053

  19. Best Online Therapy Services We Tried In 2024

    Types of therapy available: Individual, life coaching, relationship counseling and coaching, dating coaching, parent coaching, career coaching and more ... Some research shows that online therapy ...

  20. Thirty years later, the Women's Health Initiative provides researchers

    Researchers from the NHLBI-supported Women's Health Initiative, the largest women's health study in the U.S., published findings from a 20-year review that underscores the importance of postmenopausal women moving away from a one-size-fits-all approach to making medical decisions. The researchers explain that estrogen or a combination of estrogen and progestin, two types of hormone ...

  21. Cell therapy fails to slow type 1 diabetes, but safety is ...

    Cell therapy fails to slow early type 1 diabetes, but safety is established. By Elizabeth Cooney. Tolerance is the holy grail in calming autoimmune disease, a truce in the immune ...

  22. Improving the quality of research in counseling psychology: Conceptual

    Although we believe that experimental methods are an essential tool of scientific inquiry (and could be employed more frequently than they currently are in our field), we also recognize the growing interest among counseling psychology researchers in the effects of personal characteristics (e.g., ethnic identity, self-efficacy, body image ...

  23. Trial of Thrombectomy for Stroke with a Large Infarct of Unrestricted

    The median NIHSS score was 21, and thrombolysis therapy was administered intravenously to 34.9% of the patients. In 83.6% of the patients, MRI was the imaging method used for selection.

  24. Identifying the association between serum urate levels and gout flares

    Objective To investigate gout flare rates based on repeated serum urate (SU) measurements in a randomised controlled trial of urate-lowering therapy (ULT), accounting for dropout and death. Methods We performed a secondary analysis using data from Cardiovascular Safety of Febuxostat or Allopurinol in Patients with Gout, which randomised participants to febuxostat or allopurinol, titrated to ...

  25. Global Cell Therapy and Gene Therapy Market Report

    Dublin, May 13, 2024 (GLOBE NEWSWIRE) -- The "Cell Therapy and Gene Therapy Markets (Markets by Disease Type), 2023-2029" report has been added to ResearchAndMarkets.com's offering. A newly ...