Critical Thinking Testing and Assessment

The purpose of assessment in instruction is improvement. The purpose of assessing instruction for critical thinking is to improve the teaching of discipline-based thinking (historical, biological, sociological, mathematical, etc.), that is, to improve students’ abilities to think their way through content using disciplined skill in reasoning. The more particular we can be about what we want students to learn about critical thinking, the better we can devise instruction with that particular end in view.

The Foundation for Critical Thinking offers assessment instruments which share the same general goal: to enable educators to gather evidence relevant to determining the extent to which instruction is teaching students to think critically (in the process of learning content). To this end, the Fellows of the Foundation recommend:

that academic institutions and units establish an oversight committee for critical thinking, and

that this oversight committee utilize a combination of assessment instruments (the more the better) to generate incentives for faculty by providing them with as much evidence as feasible of the actual state of instruction for critical thinking.

The following instruments are available to generate evidence relevant to critical thinking teaching and learning:

Course Evaluation Form: Provides evidence of whether, and to what extent, students perceive faculty as fostering critical thinking in instruction (course by course). Machine-scoreable.

Online Critical Thinking Basic Concepts Test: Provides evidence of whether, and to what extent, students understand the fundamental concepts embedded in critical thinking (and hence tests student readiness to think critically). Machine-scoreable.

Critical Thinking Reading and Writing Test: Provides evidence of whether, and to what extent, students can read closely and write substantively (and hence tests students' abilities to read and write critically). Short-answer.

International Critical Thinking Essay Test: Provides evidence of whether, and to what extent, students are able to analyze and assess excerpts from textbooks or professional writing. Short-answer.

Commission Study Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Based on the California Commission Study. Short-answer.

Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Short-answer.

Protocol for Interviewing Students Regarding Critical Thinking: Provides evidence of whether, and to what extent, students are learning to think critically at a college or university. Can be adapted for high school. Short-answer.

Criteria for Critical Thinking Assignments: Can be used by faculty in designing classroom assignments, or by administrators in assessing the extent to which faculty are fostering critical thinking.

Rubrics for Assessing Student Reasoning Abilities: A useful tool in assessing the extent to which students are reasoning well through course content.

All of the above assessment instruments can be used as part of pre- and post-assessment strategies to gauge development over various time periods.

Consequential Validity

All of the above assessment instruments, when used appropriately and graded accurately, should lead to a high degree of consequential validity. In other words, the use of the instruments should cause teachers to teach in such a way as to foster critical thinking in their various subjects. In this light, if students are to perform well on the various instruments, teachers will need to design instruction that prepares them to do so. Students cannot become skilled in critical thinking without learning (first) the concepts and principles that underlie critical thinking and (second) applying them in a variety of forms of thinking: historical thinking, sociological thinking, biological thinking, etc. Students cannot become skilled in analyzing and assessing reasoning without practicing it. However, when they have routine practice in paraphrasing, summarizing, analyzing, and assessing, they will develop skills of mind requisite to the art of thinking well within any subject or discipline, not to mention thinking well within the various domains of human life.

Supplement to Critical Thinking

How can one assess, for purposes of instruction or research, the degree to which a person possesses the dispositions, skills and knowledge of a critical thinker?

In psychometrics, assessment instruments are judged according to their validity and reliability.

Roughly speaking, an instrument is valid if it measures accurately what it purports to measure, given standard conditions. More precisely, the degree of validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” (American Educational Research Association 2014: 11). In other words, a test is not valid or invalid in itself. Rather, validity is a property of an interpretation of a given score on a given test for a specified use. Determining the degree of validity of such an interpretation requires collection and integration of the relevant evidence, which may be based on test content, test takers’ response processes, a test’s internal structure, relationship of test scores to other variables, and consequences of the interpretation (American Educational Research Association 2014: 13–21). Criterion-related evidence consists of correlations between scores on the test and performance on another test of the same construct; its weight depends on how well supported is the assumption that the other test can be used as a criterion. Content-related evidence is evidence that the test covers the full range of abilities that it claims to test. Construct-related evidence is evidence that a correct answer reflects good performance of the kind being measured and an incorrect answer reflects poor performance.

An instrument is reliable if it consistently produces the same result, whether across different forms of the same test (parallel-forms reliability), across different items (internal consistency), across different administrations to the same person (test-retest reliability), or across ratings of the same answer by different people (inter-rater reliability). Internal consistency should be expected only if the instrument purports to measure a single undifferentiated construct, and thus should not be expected of a test that measures a suite of critical thinking dispositions or critical thinking abilities, assuming that some people are better in some of the respects measured than in others (for example, very willing to inquire but rather closed-minded). Otherwise, reliability is a necessary but not a sufficient condition of validity; a standard example of a reliable instrument that is not valid is a bathroom scale that consistently under-reports a person’s weight.
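
As a rough illustration of two of these reliability coefficients, here is a minimal sketch, assuming total scores from two administrations (or two parallel forms) are stored as NumPy arrays; each estimate is simply a Pearson correlation.

```python
# Minimal sketch: correlation-based reliability estimates for hypothetical
# score vectors (one entry per test taker).
import numpy as np

def test_retest_reliability(scores_time1: np.ndarray, scores_time2: np.ndarray) -> float:
    """Test-retest reliability: correlation between two administrations to the same people."""
    return float(np.corrcoef(scores_time1, scores_time2)[0, 1])

def parallel_forms_reliability(form_a_scores: np.ndarray, form_b_scores: np.ndarray) -> float:
    """Parallel-forms reliability: correlation between scores on two forms of the same test."""
    return float(np.corrcoef(form_a_scores, form_b_scores)[0, 1])
```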

Assessing dispositions is difficult if one uses a multiple-choice format with known adverse consequences of a low score. It is pretty easy to tell what answer to the question “How open-minded are you?” will get the highest score and to give that answer, even if one knows that the answer is incorrect. If an item probes less directly for a critical thinking disposition, for example by asking how often the test taker pays close attention to views with which the test taker disagrees, the answer may differ from reality because of self-deception or simple lack of awareness of one’s personal thinking style, and its interpretation is problematic, even if factor analysis enables one to identify a distinct factor measured by a group of questions that includes this one (Ennis 1996). Nevertheless, Facione, Sánchez, and Facione (1994) used this approach to develop the California Critical Thinking Dispositions Inventory (CCTDI). They began with 225 statements expressive of a disposition towards or away from critical thinking (using the long list of dispositions in Facione 1990a), validated the statements with talk-aloud and conversational strategies in focus groups to determine whether people in the target population understood the items in the way intended, administered a pilot version of the test with 150 items, and eliminated items that failed to discriminate among test takers or were inversely correlated with overall results or added little refinement to overall scores (Facione 2000). They used item analysis and factor analysis to group the measured dispositions into seven broad constructs: open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence (Facione, Sánchez, and Facione 1994). The resulting test consists of 75 agree-disagree statements and takes 20 minutes to administer. A repeated disturbing finding is that North American students taking the test tend to score low on the truth-seeking sub-scale (on which a low score results from agreeing to such statements as the following: “To get people to agree with me I would give any reason that worked”. “Everyone always argues from their own self-interest, including me”. “If there are four reasons in favor and one against, I’ll go with the four”.) Development of the CCTDI made it possible to test whether good critical thinking abilities and good critical thinking dispositions go together, in which case it might be enough to teach one without the other. Facione (2000) reports that administration of the CCTDI and the California Critical Thinking Skills Test (CCTST) to almost 8,000 post-secondary students in the United States revealed a statistically significant but weak correlation between total scores on the two tests, and also between paired sub-scores from the two tests. The implication is that both abilities and dispositions need to be taught, that one cannot expect improvement in one to bring with it improvement in the other.

A more direct way of assessing critical thinking dispositions would be to see what people do when put in a situation where the dispositions would reveal themselves. Ennis (1996) reports promising initial work with guided open-ended opportunities to give evidence of dispositions, but no standardized test seems to have emerged from this work. There are however standardized aspect-specific tests of critical thinking dispositions. The Critical Problem Solving Scale (Berman et al. 2001: 518) takes as a measure of the disposition to suspend judgment the number of distinct good aspects attributed to an option judged to be the worst among those generated by the test taker. Stanovich, West and Toplak (2011: 800–810) list tests developed by cognitive psychologists of the following dispositions: resistance to miserly information processing, resistance to myside thinking, absence of irrelevant context effects in decision-making, actively open-minded thinking, valuing reason and truth, tendency to seek information, objective reasoning style, tendency to seek consistency, sense of self-efficacy, prudent discounting of the future, self-control skills, and emotional regulation.

It is easier to measure critical thinking skills or abilities than to measure dispositions. The following eight currently available standardized tests purport to measure them: the Watson-Glaser Critical Thinking Appraisal (Watson & Glaser 1980a, 1980b, 1994), the Cornell Critical Thinking Tests Level X and Level Z (Ennis & Millman 1971; Ennis, Millman, & Tomko 1985, 2005), the Ennis-Weir Critical Thinking Essay Test (Ennis & Weir 1985), the California Critical Thinking Skills Test (Facione 1990b, 1992), the Halpern Critical Thinking Assessment (Halpern 2016), the Critical Thinking Assessment Test (Center for Assessment & Improvement of Learning 2017), the Collegiate Learning Assessment (Council for Aid to Education 2017), the HEIghten Critical Thinking Assessment (https://territorium.com/heighten/), and a suite of critical thinking assessments for different groups and purposes offered by Insight Assessment (https://www.insightassessment.com/products). The Critical Thinking Assessment Test (CAT) is unique among them in being designed for use by college faculty to help them improve their development of students’ critical thinking skills (Haynes et al. 2015; Haynes & Stein 2021). Also, for some years the United Kingdom body OCR (Oxford Cambridge and RSA Examinations) awarded AS and A Level certificates in critical thinking on the basis of an examination (OCR 2011). Many of these standardized tests have received scholarly evaluations at the hands of, among others, Ennis (1958), McPeck (1981), Norris and Ennis (1989), Fisher and Scriven (1997), Possin (2008, 2013a, 2013b, 2013c, 2014, 2020) and Hatcher and Possin (2021). Their evaluations provide a useful set of criteria that such tests ideally should meet, as does the description by Ennis (1984) of problems in testing for competence in critical thinking: the soundness of multiple-choice items, the clarity and soundness of instructions to test takers, the information and mental processing used in selecting an answer to a multiple-choice item, the role of background beliefs and ideological commitments in selecting an answer to a multiple-choice item, the tenability of a test’s underlying conception of critical thinking and its component abilities, the set of abilities that the test manual claims are covered by the test, the extent to which the test actually covers these abilities, the appropriateness of the weighting given to various abilities in the scoring system, the accuracy and intellectual honesty of the test manual, the interest of the test to the target population of test takers, the scope for guessing, the scope for choosing a keyed answer by being test-wise, precautions against cheating in the administration of the test, clarity and soundness of materials for training essay graders, inter-rater reliability in grading essays, and clarity and soundness of advance guidance to test takers on what is required in an essay. Rear (2019) has challenged the use of standardized tests of critical thinking as a way to measure educational outcomes, on the grounds that  they (1) fail to take into account disputes about conceptions of critical thinking, (2) are not completely valid or reliable, and (3) fail to evaluate skills used in real academic tasks. He proposes instead assessments based on discipline-specific content.

There are also aspect-specific standardized tests of critical thinking abilities. Stanovich, West and Toplak (2011: 800–810) list tests of probabilistic reasoning, insights into qualitative decision theory, knowledge of scientific reasoning, knowledge of rules of logical consistency and validity, and economic thinking. They also list instruments that probe for irrational thinking, such as superstitious thinking, belief in the superiority of intuition, over-reliance on folk wisdom and folk psychology, belief in “special” expertise, financial misconceptions, overestimation of one’s introspective powers, dysfunctional beliefs, and a notion of self that encourages egocentric processing. They regard these tests along with the previously mentioned tests of critical thinking dispositions as the building blocks for a comprehensive test of rationality, whose development (they write) may be logistically difficult and would require millions of dollars.

A superb example of assessment of an aspect of critical thinking ability is the Test on Appraising Observations (Norris & King 1983, 1985, 1990a, 1990b), which was designed for classroom administration to senior high school students. The test focuses entirely on the ability to appraise observation statements and in particular on the ability to determine in a specified context which of two statements there is more reason to believe. According to the test manual (Norris & King 1985, 1990b), a person’s score on the multiple-choice version of the test, which is the number of items that are answered correctly, can justifiably be given either a criterion-referenced or a norm-referenced interpretation.

On a criterion-referenced interpretation, those who do well on the test have a firm grasp of the principles for appraising observation statements, and those who do poorly have a weak grasp of them. This interpretation can be justified by the content of the test and the way it was developed, which incorporated a method of controlling for background beliefs articulated and defended by Norris (1985). Norris and King synthesized from judicial practice, psychological research and common-sense psychology 31 principles for appraising observation statements, in the form of empirical generalizations about tendencies, such as the principle that observation statements tend to be more believable than inferences based on them (Norris & King 1984). They constructed items in which exactly one of the 31 principles determined which of two statements was more believable. Using a carefully constructed protocol, they interviewed about 100 students who responded to these items in order to determine the thinking that led them to choose the answers they did (Norris & King 1984). In several iterations of the test, they adjusted items so that selection of the correct answer generally reflected good thinking and selection of an incorrect answer reflected poor thinking. Thus they have good evidence that good performance on the test is due to good thinking about observation statements and that poor performance is due to poor thinking about observation statements. Collectively, the 50 items on the final version of the test require application of 29 of the 31 principles for appraising observation statements, with 13 principles tested by one item, 12 by two items, three by three items, and one by four items. Thus there is comprehensive coverage of the principles for appraising observation statements. Fisher and Scriven (1997: 135–136) judge the items to be well worked and sound, with one exception. The test is clearly written at a grade 6 reading level, meaning that poor performance cannot be attributed to difficulties in reading comprehension by the intended adolescent test takers. The stories that frame the items are realistic, and are engaging enough to stimulate test takers’ interest. Thus the most plausible explanation of a given score on the test is that it reflects roughly the degree to which the test taker can apply principles for appraising observations in real situations. In other words, there is good justification of the proposed interpretation that those who do well on the test have a firm grasp of the principles for appraising observation statements and those who do poorly have a weak grasp of them.

To get norms for performance on the test, Norris and King arranged for seven groups of high school students in different types of communities and with different levels of academic ability to take the test. The test manual includes percentiles, means, and standard deviations for each of these seven groups. These norms allow teachers to compare the performance of their class on the test to that of a similar group of students.
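
As a hedged illustration of what such a norm-referenced interpretation involves computationally, the sketch below converts a raw score into a percentile rank and a standard-score equivalent against a hypothetical norm group; the numbers are invented and do not come from the Norris and King manual.

```python
# Illustrative norm-referenced interpretation of a raw score
# (hypothetical norm-group data, not the published norms).
import numpy as np

def percentile_rank(raw_score: float, norm_scores: np.ndarray) -> float:
    """Percentage of the norm group scoring at or below this raw score."""
    return 100.0 * float(np.mean(norm_scores <= raw_score))

def standard_score(raw_score: float, norm_scores: np.ndarray) -> float:
    """Raw score in standard-deviation units relative to the norm-group mean."""
    return (raw_score - float(norm_scores.mean())) / float(norm_scores.std(ddof=1))

norm_group = np.array([22, 25, 27, 28, 30, 31, 33, 35, 36, 38, 40, 42])
print(percentile_rank(34, norm_group))   # ≈ 58.3
print(standard_score(34, norm_group))    # ≈ 0.28
```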

Copyright © 2022 by David Hitchcock <hitchckd@mcmaster.ca>


Yes, We Can Define, Teach, and Assess Critical Thinking Skills

Jeff Heyck-Williams (He, His, Him), Director of the Two Rivers Learning Institute in Washington, DC

Today’s learners face an uncertain present and a rapidly changing future that demand far different skills and knowledge than were needed in the 20th century. We also know so much more about enabling deep, powerful learning than we ever did before. Our collective future depends on how well young people prepare for the challenges and opportunities of 21st-century life.

Critical thinking is a thing. We can define it; we can teach it; and we can assess it.

While the idea of teaching critical thinking has been bandied around in education circles since at least the time of John Dewey, it has taken greater prominence in the education debates with the advent of the term “21st century skills” and discussions of deeper learning. There is increasing agreement among education reformers that critical thinking is an essential ingredient for long-term success for all of our students.

However, there are still those in the education establishment and in the media who argue that critical thinking isn’t really a thing, or that these skills aren’t well defined and, even if they could be defined, they can’t be taught or assessed.

To those naysayers, I have to disagree. Critical thinking is a thing. We can define it; we can teach it; and we can assess it. In fact, as part of a multi-year Assessment for Learning Project, Two Rivers Public Charter School in Washington, D.C., has done just that.

Before I dive into what we have done, I want to acknowledge that some of the criticism has merit.

First, there are those who argue that critical thinking can only exist when students have a vast fund of knowledge; that is, a student cannot think critically if they don’t have something substantive about which to think. I agree. Students do need a robust foundation of core content knowledge to effectively think critically. Schools still have a responsibility for building students’ content knowledge.

However, I would argue that students don’t need to wait to think critically until after they have mastered some arbitrary amount of knowledge. They can start building critical thinking skills when they walk in the door. All students come to school with experience and knowledge which they can immediately think critically about. In fact, some of the thinking that they learn to do helps augment and solidify the discipline-specific academic knowledge that they are learning.

The second criticism is that critical thinking skills are always highly contextual. In this argument, the critics make the point that the types of thinking that students do in history are categorically different from the types of thinking students do in science or math. Thus, the idea of teaching broadly defined, content-neutral critical thinking skills is impossible. I agree that there are domain-specific thinking skills that students should learn in each discipline. However, I also believe that there are several generalizable skills that elementary school students can learn that have broad applicability to their academic and social lives. That is what we have done at Two Rivers.

Defining Critical Thinking Skills

We began this work by first defining what we mean by critical thinking. After a review of the literature and looking at the practice at other schools, we identified five constructs that encompass a set of broadly applicable skills: schema development and activation; effective reasoning; creativity and innovation; problem solving; and decision making.

We then created rubrics to provide a concrete vision of what each of these constructs looks like in practice. Working with the Stanford Center for Assessment, Learning and Equity (SCALE), we refined these rubrics to capture clear and discrete skills.

For example, we defined effective reasoning as the skill of creating an evidence-based claim: students need to construct a claim, identify relevant support, link their support to their claim, and identify possible questions or counterclaims. Rubrics provide an explicit vision of the skill of effective reasoning for students and teachers. By breaking the rubrics down for different grade bands, we have been able not only to describe what reasoning is but also to delineate how the skills develop in students from preschool through 8th grade.

Before moving on, I want to freely acknowledge that in narrowly defining reasoning as the construction of evidence-based claims we have disregarded some elements of reasoning that students can and should learn. For example, the difference between constructing claims through deductive versus inductive means is not highlighted in our definition. However, by privileging a definition that has broad applicability across disciplines, we are able to gain traction in developing the roots of critical thinking: in this case, the ability to formulate well-supported claims or arguments.

Teaching Critical Thinking Skills

The definitions of critical thinking constructs were only useful to us in as much as they translated into practical skills that teachers could teach and students could learn and use. Consequently, we have found that to teach a set of cognitive skills, we needed thinking routines that defined the regular application of these critical thinking and problem-solving skills across domains. Building on Harvard’s Project Zero Visible Thinking work, we have named routines aligned with each of our constructs.

For example, with the construct of effective reasoning, we aligned the Claim-Support-Question thinking routine to our rubric. Teachers then were able to teach students that whenever they were making an argument, the norm in the class was to use the routine in constructing their claim and support. The flexibility of the routine has allowed us to apply it from preschool through 8th grade and across disciplines from science to economics and from math to literacy.

Kathryn Mancino, a 5th grade teacher at Two Rivers, has deliberately taught three of our thinking routines to students using anchor charts. Her charts name the components of each routine and have a place for students to record when they’ve used it and what they have figured out about the routine. By using this structure with a chart that can be added to throughout the year, students see the routines as broadly applicable across disciplines and are able to refine their application over time.

Assessing Critical Thinking Skills

By defining specific constructs of critical thinking and building thinking routines that support their implementation in classrooms, we have operated under the assumption that students are developing skills that they will be able to transfer to other settings. However, we recognized both the importance and the challenge of gathering reliable data to confirm this.

With this in mind, we have developed a series of short performance tasks around novel discipline-neutral contexts in which students can apply the constructs of thinking. Through these tasks, we have been able to provide an opportunity for students to demonstrate their ability to transfer the types of thinking beyond the original classroom setting. Once again, we have worked with SCALE to define tasks where students easily access the content but where the cognitive lift requires them to demonstrate their thinking abilities.

These assessments demonstrate that it is possible to capture meaningful data on students’ critical thinking abilities. They are not intended to be high-stakes accountability measures. Instead, they are designed to give students, teachers, and school leaders discrete formative data on hard-to-measure skills.

While it is clearly difficult, and we have not solved all of the challenges of scaling assessments of critical thinking, we can define, teach, and assess these skills. In fact, knowing how important they are for the economy of the future and our democracy, it is essential that we do.

Jeff Heyck-Williams (He, His, Him)

Director of the Two Rivers Learning Institute

Jeff Heyck-Williams is the director of the Two Rivers Learning Institute and a founder of Two Rivers Public Charter School. He has led work around creating school-wide cultures of mathematics, developing assessments of critical thinking and problem-solving, and supporting project-based learning.

Christopher Dwyer Ph.D.

Critical Thinking About Measuring Critical Thinking

A list of critical thinking measures.

Posted May 18, 2018

In my last post, I discussed the nature of engaging the critical thinking (CT) process and made mention of individuals who draw a conclusion and wind up being correct. But just because they’re right, it doesn’t mean they used CT to get there. I exemplified this through an observation made in recent years regarding extant measures of CT, many of which assess CT via multiple-choice questions (MCQs). In the case of CT MCQs, you can guess the "right" answer 20-25% of the time, without any need for CT. So, the question is, are these CT measures really measuring CT?
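
To make that guessing figure concrete, here is a back-of-the-envelope calculation; the 40-item test length is hypothetical, and the per-item probabilities simply follow from having four or five answer options.

```python
# Expected score from blind guessing on a hypothetical 40-item multiple-choice CT test.
n_items = 40
for n_options in (4, 5):
    p_guess = 1 / n_options                 # 0.25 or 0.20 per item
    expected_correct = n_items * p_guess    # items answered "correctly" with no CT at all
    print(f"{n_options} options per item: ~{expected_correct:.0f} of {n_items} correct by chance")
```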

As my previous articles explain, CT is a metacognitive process consisting of a number of sub-skills and dispositions that, when applied through purposeful, self-regulatory, reflective judgment, increase the chances of producing a logical solution to a problem or a valid conclusion to an argument (Dwyer, 2017; Dwyer, Hogan & Stewart, 2014). Most definitions, though worded differently, tend to agree with this perspective – CT consists of certain dispositions, specific skills and a reflective sensibility that governs the application of these skills. That’s how it’s defined; however, it’s not necessarily how it has been operationally defined.

Operationally defining something refers to specifying the process or measure required to determine the nature and properties of a phenomenon. Simply put, it is defining the concept with respect to how it can be done, assessed or measured. If the manner in which you measure something does not match or assess the parameters set out in the way in which you define it, then you have not been successful in operationally defining it.

Though most theoretical definitions of CT are similar, the manner in which they vary often impedes the construction of an integrated theoretical account of how best to measure CT skills. As a result, researchers and educators must consider the wide array of CT measures available, in order to identify the best and the most appropriate measures, based on the CT conceptualisation used for training. There are various extant CT measures – the most popular amongst them include the Watson-Glaser Critical Thinking Assessment (WGCTA; Watson & Glaser, 1980), the Cornell Critical Thinking Test (CCTT; Ennis, Millman & Tomko, 1985), the California Critical Thinking Skills Test (CCTST; Facione, 1990a), the Ennis-Weir Critical Thinking Essay Test (EWCTET; Ennis & Weir, 1985) and the Halpern Critical Thinking Assessment (Halpern, 2010).

It has been noted by some commentators that these different measures of CT ability may not be directly comparable (Abrami et al., 2008). For example, the WGCTA consists of 80 MCQs that measure the ability to draw inferences; recognise assumptions; evaluate arguments; and use logical interpretation and deductive reasoning (Watson & Glaser, 1980). The CCTT consists of 52 MCQs which measure skills of critical thinking associated with induction; deduction; observation and credibility; definition and assumption identification; and meaning and fallacies. Finally, the CCTST consists of 34 multiple-choice questions (MCQs) and measures CT according to the core skills of analysis, evaluation and inference, as well as inductive and deductive reasoning.

As addressed above, the MCQ-format of these three assessments is less than ideal – problematic even, because it allows test-takers to simply guess when they do not know the correct answer, instead of demonstrating their ability to critically analyse and evaluate problems and infer solutions to those problems (Ku, 2009). Furthermore, as argued by Halpern (2003), the MCQ format makes the assessment a test of verbal and quantitative knowledge rather than CT (i.e. because one selects from a list of possible answers rather than determining one’s own criteria for developing an answer). The measurement of CT through MCQs is also problematic given the potential incompatibility between the conceptualisation of CT that shapes test construction and its assessment using MCQs. That is, MCQ tests assess cognitive capacities associated with identifying single right-or-wrong answers and as a result, this approach to testing is unable to provide a direct measure of test-takers’ use of metacognitive processes such as CT, reflective judgment, and disposition towards CT.

Instead of using MCQ items, a better measure of CT might ask open-ended questions, which would allow test-takers to demonstrate whether or not they spontaneously use a specific CT skill. One commonly used CT assessment, mentioned above, that employs an open-ended format is the Ennis-Weir Critical Thinking Essay Test (EWCTET; Ennis & Weir, 1985). The EWCTET is an essay-based assessment of the test-taker’s ability to analyse, evaluate, and respond to arguments and debates in real-world situations (Ennis & Weir, 1985; see Ku, 2009 for a discussion). The authors of the EWCTET provide what they call a “rough, somewhat overlapping list of areas of critical thinking competence”, measured by their test (Ennis & Weir, 1985, p. 1). However, this test, too, has been criticised – for its domain-specific nature (Taube, 1997), the subjectivity of its scoring protocol and its bias in favour of those proficient in writing (Adams, Whitlow, Stover & Johnson, 1996).

Another, more recent CT assessment that utilises an open-ended format is the Halpern Critical Thinking Assessment (HCTA; Halpern, 2010). The HCTA consists of 25 open-ended questions based on believable, everyday situations, followed by 25 specific questions that probe for the reasoning behind each answer. The multi-part nature of the questions makes it possible to assess the ability to use specific CT skills when the prompt is provided (Ku, 2009). The HCTA’s scoring protocol also provides comprehensible, unambiguous instructions for how to evaluate responses by breaking them down into clear, measurable components. Questions on the HCTA represent five categories of CT application: hypothesis testing (e.g. understanding the limits of correlational reasoning and how to know when causal claims cannot be made), verbal reasoning (e.g. recognising the use of pervasive or misleading language), argumentation (e.g. recognising the structure of arguments, how to examine the credibility of a source and how to judge one’s own arguments), judging likelihood and uncertainty (e.g. applying relevant principles of probability, how to avoid overconfidence in certain situations) and problem-solving (e.g. identifying the problem goal, generating and selecting solutions among alternatives).

Up until the development of the HCTA, I would have recommended the CCTST for measuring CT, despite its limitations. What’s nice about the CCTST is that it assesses the three core skills of CT: analysis, evaluation, and inference, which other scales do not (explicitly). So, if you were interested in assessing students’ sub-skill ability, this would be helpful. However, as we know, though CT skill performance is a sequence, it is also a collation of these skills – meaning that for any given problem or topic, each skill is necessary. Administering an analysis problem, an evaluation problem and an inference problem on which the student scores top marks for all three does not guarantee that the student will apply these three skills to a broader problem that requires all of them. That is, these questions don’t measure CT skill ability per se, but rather analysis skill, evaluation skill and inference skill in isolation. Simply put, scores may predict CT skill performance, but they don’t measure it.

What may be a better indicator of CT performance is assessment of CT application. As addressed above, there are five general applications of CT: hypothesis testing, verbal reasoning, argumentation, problem-solving and judging likelihood and uncertainty – all of which require a collation of analysis, evaluation, and inference. Though the sub-skills of analysis, evaluation, and inference are not directly measured in this case, their collation is measured through five distinct applications, which, as I see it, provides a 'truer' assessment of CT. In addition to assessing CT via an open-ended, short-answer format, the HCTA measures CT according to the five applications of CT; thus, I recommend its use for measuring CT.

However, that’s not to say that the HCTA is perfect. Though it consists of 25 open-ended questions, followed by 25 specific questions that probe for the reasoning behind each answer, when I first used it to assess a sample of students, I found that in setting up my data file there were actually 165 opportunities for scoring across the test. Past research recommends that the assessment takes roughly between 45 and 60 minutes to complete. However, many of my participants reported it requiring closer to two hours (sometimes longer). It’s a long assessment – thorough, but long. Fortunately, adapted, shortened versions are now available, and it’s an adapted version that I currently administer to assess CT. Another limitation is that, despite the rationale above, it would be nice to have some indication of how participants get on with the sub-skills of analysis, evaluation, and inference, as I do think there’s a potential predictive element in the relationship among the individual skills and the applications. With that, I suppose it is feasible to administer both the HCTA and CCTST to assess such hypotheses.

Though it’s obviously important to consider how assessments actually measure CT and the ways in which each is limited, the broader, macro-level problem still requires thought. Just as conceptualisations of CT vary, so too do the reliability and validity of the different CT measures, which has led Abrami and colleagues (2008, p. 1104) to ask: “How will we know if one intervention is more beneficial than another if we are uncertain about the validity and reliability of the outcome measures?” Abrami and colleagues add that, even when researchers explicitly declare that they are assessing CT, there still remains the major challenge of ensuring that measured outcomes are related, in some meaningful way, to the conceptualisation and operational definition of CT that informed the teaching practice in cases of interventional research. Often, the relationship between the concepts of CT that are taught and those that are assessed is unclear, and a large majority of studies in this area include no theory to help elucidate these relationships.

In conclusion, solving the problem of consistency across CT conceptualisation, training, and measurement is no easy task. I think recent advancements in CT scale development (e.g. the development of the HCTA and its adapted versions) have eased the problem, given that they now bridge the gap between current theory and practical assessment. However, such advances need to be made clearer to interested populations. As always, I’m very interested in hearing from any readers who may have any insight or suggestions!

Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., & Zhang, D. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102–1134.

Adams, M.H., Whitlow, J.F., Stover, L.M., & Johnson, K.W. (1996). Critical thinking as an educational outcome: An evaluation of current tools of measurement. Nurse Educator, 21, 23–32.

Dwyer, C.P. (2017). Critical thinking: Conceptual perspectives and practical guidelines. Cambridge, UK: Cambridge University Press.

Dwyer, C.P., Hogan, M.J. & Stewart, I. (2014). An integrated critical thinking framework for the 21st century. Thinking Skills & Creativity, 12, 43-52.

Ennis, R.H., Millman, J., & Tomko, T.N. (1985). Cornell critical thinking tests. CA: Critical Thinking Co.

Ennis, R.H., & Weir, E. (1985). The Ennis-Weir critical thinking essay test. Pacific Grove, CA: Midwest Publications.

Facione, P. A. (1990a). The California critical thinking skills test (CCTST): Forms A and B; The CCTST test manual. Millbrae, CA: California Academic Press.

Facione, P.A. (1990b). The Delphi report: Committee on pre-college philosophy. Millbrae, CA: California Academic Press.

Halpern, D. F. (2003). The “how” and “why” of critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory and practice. Cresskill, NJ: Hampton Press.

Halpern, D.F. (2010). The Halpern critical thinking assessment: Manual. Vienna: Schuhfried.

Ku, K.Y.L. (2009). Assessing students’ critical thinking performance: Urging for measurements using multi-response format. Thinking Skills and Creativity, 4(1), 70–76.

Taube, K.T. (1997). Critical thinking ability and disposition as factors of performance on a written critical thinking test. Journal of General Education, 46, 129-164.

Watson, G., & Glaser, E.M. (1980). Watson-Glaser critical thinking appraisal. New York: Psychological Corporation.

Christopher Dwyer Ph.D.

Christopher Dwyer, Ph.D., is a lecturer at the Technological University of the Shannon in Athlone, Ireland.

Critical Thinking Skills Toolbox

CTS Tools for Faculty and Student Assessment

A number of critical thinking skills inventories and measures have been developed:

  • Watson-Glaser Critical Thinking Appraisal (WGCTA)
  • Cornell Critical Thinking Test
  • California Critical Thinking Disposition Inventory (CCTDI)
  • California Critical Thinking Skills Test (CCTST)
  • Health Science Reasoning Test (HSRT)
  • Professional Judgment Rating Form (PJRF)
  • Teaching for Thinking Student Course Evaluation Form
  • Holistic Critical Thinking Scoring Rubric
  • Peer Evaluation of Group Presentation Form

With the exception of the Watson-Glaser Critical Thinking Appraisal and the Cornell Critical Thinking Test, the critical thinking skills instruments listed above were developed by Facione and Facione. It is important to point out, however, that all of these measures are of questionable utility for dental educators because their content is general rather than specific to dental education. (See Critical Thinking and Assessment.)

Table 7. Purposes of Critical Thinking Skills Instruments

Reliability and Validity

Reliability means that individual scores from an instrument should be the same or nearly the same from one administration of the instrument to another, indicating that the instrument is relatively free of bias and measurement error (68). Alpha coefficients are often used to report an estimate of internal consistency. Values of .70 or higher indicate that the instrument has adequate reliability when the stakes are moderate; values of .80 and higher are appropriate when the stakes are high.
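
To show how such an alpha coefficient is computed and checked against these thresholds, here is a minimal sketch, assuming item scores are held in an array with one row per test taker and one column per item (illustrative only, not part of the toolbox).

```python
# Sketch: Cronbach's alpha for an item-score matrix, plus the moderate-/high-stakes
# thresholds cited above (.70 and .80). Illustrative, not an official scoring tool.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1).sum()
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

def alpha_adequate(alpha: float, high_stakes: bool = False) -> bool:
    """Apply the rule of thumb quoted above: .80 for high stakes, .70 for moderate stakes."""
    return alpha >= (0.80 if high_stakes else 0.70)
```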

Validity means that individual scores from a particular instrument are meaningful, make sense, and allow researchers to draw conclusions from the sample to the population being studied (69). Researchers often refer to "content" or "face" validity. Content validity or face validity is the extent to which questions on an instrument are representative of the possible questions that a researcher could ask about that particular content or set of skills.

Watson-Glaser Critical Thinking Appraisal-FS (WGCTA-FS)

The WGCTA-FS is a 40-item inventory created to replace Forms A and B of the original test, which participants reported was too long (70). This inventory assesses test takers' skills in:

(a) Inference: whether an individual discriminates among degrees of truth or falsity of inferences drawn from given data
(b) Recognition of assumptions: whether an individual recognizes unstated assumptions in given statements
(c) Deduction: whether an individual decides if certain conclusions follow from the information provided
(d) Interpretation: whether an individual considers the evidence provided and determines whether generalizations from the data are warranted
(e) Evaluation of arguments: whether an individual distinguishes strong and relevant arguments from weak and irrelevant arguments

Researchers investigated the reliability and validity of the WGCTA-FS for subjects in academic fields. Participants included 586 university students. Internal consistencies for the total WGCTA-FS among students majoring in psychology, educational psychology, and special education, including undergraduates and graduates, ranged from .74 to .92. The correlations between course grades and total WGCTA-FS scores for all groups ranged from .24 to .62 and were significant at the p < .05 or p < .01 level. In addition, internal consistency and test-retest reliability for the WGCTA-FS have been measured at .81. The WGCTA-FS was found to be a reliable and valid instrument for measuring critical thinking (71).

Cornell Critical Thinking Test (CCTT)

There are two forms of the CCTT, X and Z. Form X is for students in grades 4-14. Form Z is for advanced and gifted high school students, undergraduate and graduate students, and adults. Reliability estimates for Form Z range from .49 to .87 across the 42 groups that have been tested. Measures of validity were computed under standard conditions, roughly defined as conditions that do not adversely affect test performance. Correlations between Level Z and other measures of critical thinking are about .50 (72). The CCTT is reportedly as predictive of graduate school grades as the Graduate Record Examination (GRE), a measure of aptitude, and the Miller Analogies Test, and tends to correlate with them between .2 and .4 (73).

California Critical Thinking Disposition Inventory (CCTDI)

Facione and Facione have reported significant relationships between the CCTDI and the CCTST. When faculty focus on critical thinking in planning curriculum development, modest cross-sectional and longitudinal gains have been demonstrated in students' CTS (74). The CCTDI consists of seven subscales and an overall score. The recommended cut-off score for each scale is 40, the suggested target score is 50, and the maximum score is 60. Scores below 40 on a specific scale indicate weakness in that CT disposition, and scores above 50 on a scale indicate strength in that dispositional aspect. An overall score below 280 shows serious deficiency in disposition toward CT, while an overall score above 350 (while rare) shows across-the-board strength. The seven subscales are analyticity, self-confidence, inquisitiveness, maturity, open-mindedness, systematicity, and truth seeking (75).
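
The cut-offs just described amount to a simple interpretation rule; the sketch below encodes them as stated in the passage (an illustrative rendering, not an official CCTDI scoring program).

```python
# Sketch of the CCTDI interpretation rules described above (illustrative only).
def interpret_subscale(score: float) -> str:
    """Subscale cut-off 40, target 50, maximum 60."""
    if score < 40:
        return "weak in this disposition"
    if score > 50:
        return "strong in this disposition"
    return "neither notably weak nor notably strong"

def interpret_overall(total: float) -> str:
    """Overall score across the seven subscales (maximum 420)."""
    if total < 280:
        return "serious deficiency in disposition toward CT"
    if total > 350:
        return "across-the-board strength"
    return "neither seriously deficient nor uniformly strong"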

In a study of instructional strategies and their influence on the development of critical thinking among undergraduate nursing students, Tiwari, Lai, and Yuen found that, compared with lecture students, PBL students showed significantly greater improvement in the overall CCTDI (p = .0048), Truth seeking (p = .0008), Analyticity (p = .0368) and Critical Thinking Self-confidence (p = .0342) subscales from the first to the second time points; in the overall CCTDI (p = .0083), Truth seeking (p = .0090), and Analyticity (p = .0354) subscales from the second to the third time points; and in the Truth seeking (p = .0173) and Systematicity (p = .0440) subscale scores from the first to the fourth time points (76).

California Critical Thinking Skills Test (CCTST)

Studies have shown that the California Critical Thinking Skills Test captures gain scores in students' critical thinking over one quarter or one semester. Multiple health science programs have demonstrated significant gains in students' critical thinking using site-specific curricula. Studies conducted to control for re-test bias showed no testing effect from pre- to post-test means using two independent groups of CT students. Since behavioral science measures can be affected by social-desirability bias (the participant's desire to answer in ways that would please the researcher), researchers are urged to have participants take the Marlowe-Crowne Social Desirability Scale at the same time when measuring pre- and post-test changes in critical thinking skills. The CCTST is a 34-item instrument. It has been correlated with the CCTDI in a sample of 1,557 nursing education students; results show that r = .201 and that the relationship between the CCTST and the CCTDI is significant at p < .001. Significant relationships between the CCTST and other measures, including the GRE total, GRE-Analytic, GRE-Verbal, GRE-Quantitative, the WGCTA, and the SAT Math and Verbal, have also been reported. The two forms of the CCTST, A and B, are considered statistically equivalent. Depending on the testing context, KR-20 alphas range from .70 to .75. The newest version is CCTST Form 2000; depending on the testing context, its KR-20 alphas range from .78 to .84 (77).
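
KR-20 is the form of the alpha coefficient used for dichotomously scored (right/wrong) items; here is a minimal sketch of the computation, assuming a 0/1 item matrix with one row per test taker.

```python
# Minimal KR-20 sketch for dichotomous (0/1) item scores; illustrative only.
import numpy as np

def kr20(item_scores: np.ndarray) -> float:
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = item_scores.shape[1]
    p = item_scores.mean(axis=0)        # proportion answering each item correctly
    q = 1 - p
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - (p * q).sum() / total_variance)
```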

The Health Science Reasoning Test (HSRT)

Items within this inventory cover the domain of CT cognitive skills identified by a Delphi group of experts whose work resulted in the development of the CCTDI and CCTST. This test measures health science undergraduate and graduate students' CTS. Although test items are set in health sciences and clinical practice contexts, test takers are not required to have discipline-specific health sciences knowledge. For this reason, the test may have limited utility in dental education (78).

Preliminary estimates of internal consistency show that overall KR-20 coefficients range from .77 to .83 (79). The instrument has moderate reliability on the analysis and inference subscales, although the factor loadings appear adequate. The low KR-20 coefficients may be a result of small sample size, variance in item response, or both (see the following table).

Table 8. Estimates of Internal Consistency and Factor Loading by Subscale for HSRT

Professional Judgment Rating Form (PJRF)

The scale consists of two sets of descriptors. The first set relates primarily to the attitudinal (habits of mind) dimension of CT. The second set relates primarily to CTS.

A single rater should know the student well enough to respond to at least 17 of the 20 descriptors with confidence. If not, the validity of the ratings may be questionable. If a single rater is used and ratings over time show some consistency, comparisons between ratings may be used to assess changes. If more than one rater is used, then inter-rater reliability must be established among the raters to yield meaningful results. While the PJRF can be used to assess the effectiveness of training programs for individuals or groups, participants' actual skills are best measured by an objective tool such as the California Critical Thinking Skills Test.
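
Where more than one rater is used, inter-rater reliability can be estimated with a chance-corrected agreement statistic such as Cohen's kappa; the sketch below uses scikit-learn and made-up ratings, and is not part of the PJRF materials.

```python
# Illustrative inter-rater agreement check for two raters' ratings of the same students.
# Cohen's kappa corrects raw percent agreement for agreement expected by chance.
from sklearn.metrics import cohen_kappa_score

rater_1 = [4, 3, 5, 2, 4, 4, 3, 5]   # hypothetical ratings on a 1-5 descriptor scale
rater_2 = [4, 3, 4, 2, 4, 5, 3, 5]

print(f"Cohen's kappa = {cohen_kappa_score(rater_1, rater_2):.2f}")
```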

Teaching for Thinking Student Course Evaluation Form

Course evaluations typically ask for responses of "agree" or "disagree" to items focusing on teacher behavior; typically, the questions do not solicit information about student learning. Because contemporary thinking about curriculum centers on student learning, this form was developed to address the differences in pedagogy, subject matter, learning outcomes, student demographics, and course level that characterize education today. The form also grew out of a recognition of the limitations of the "one size fits all" approach to teaching evaluations. It offers information about how a particular course enhances student knowledge, sensitivities, and dispositions, and it gives students an opportunity to provide feedback that can be used to improve instruction.

Holistic Critical Thinking Scoring Rubric

This assessment tool uses a four-point classification schema that lists particular opposing reasoning skills for select criteria. One advantage of a rubric is that it offers clearly delineated components and scales for evaluating outcomes. This rubric explains how students' CTS will be evaluated, and it provides a consistent framework for the professor as evaluator. Users can add or delete any of the statements to reflect their institution's effort to measure CT. Like most rubrics, this form is likely to have high face validity, since the items tend to be relevant to or descriptive of the target concept. The rubric can be used to rate student work or to assess learning outcomes. Experienced evaluators should engage in a process leading to consensus regarding what kinds of things should be classified and in what ways (80). If the rubric is used improperly or by inexperienced evaluators, unreliable results may occur.

Peer Evaluation of Group Presentation Form

This form offers a common set of criteria to be used by peers and the instructor to evaluate student-led group presentations regarding concepts, analysis of arguments or positions, and conclusions (81). Users have an opportunity to rate the degree to which each component was demonstrated. Open-ended questions give users an opportunity to cite examples of how concepts, the analysis of arguments or positions, and conclusions were demonstrated.

Table 8. Proposed Universal Criteria for Evaluating Students' Critical Thinking Skills 

Aside from the use of the above-mentioned assessment tools, Dexter et al. recommended that all schools develop universal criteria for evaluating students' development of critical thinking skills (82).

Their rationale for the proposed criteria is that if faculty give feedback using these criteria, graduates will internalize these skills and use them to monitor their own thinking and practice (see Table 4).


Original Research Article

Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation


  • 1 Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, United States
  • 2 Graduate School of Education, Stanford University, Stanford, CA, United States
  • 3 Department of Business and Economics Education, Johannes Gutenberg University, Mainz, Germany

Enhancing students’ critical thinking (CT) skills is an essential goal of higher education. This article presents a systematic approach to conceptualizing and measuring CT. CT generally comprises the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion. We further posit that CT also involves dealing with dilemmas involving ambiguity or conflicts among principles and contradictory information. We argue that performance assessment provides the most realistic—and most credible—approach to measuring CT. From this conceptualization and construct definition, we describe one possible framework for building performance assessments of CT with attention to extended performance tasks within the assessment system. The framework is a product of an ongoing, collaborative effort, the International Performance Assessment of Learning (iPAL). The framework comprises four main aspects: (1) The storyline describes a carefully curated version of a complex, real-world situation. (2) The challenge frames the task to be accomplished. (3) A portfolio of documents in a range of formats is drawn from multiple sources chosen to have specific characteristics. (4) The scoring rubric comprises a set of scales, each linked to a facet of the construct. We discuss a number of use cases, as well as the challenges that arise with the use and valid interpretation of performance assessments. The final section presents elements of the iPAL research program that involve various refinements and extensions of the assessment framework, a number of empirical studies, along with linkages to current work in online reading and information processing.

Introduction

In their mission statements, most colleges declare that a principal goal is to develop students’ higher-order cognitive skills such as critical thinking (CT) and reasoning (e.g., Shavelson, 2010 ; Hyytinen et al., 2019 ). The importance of CT is echoed by business leaders ( Association of American Colleges and Universities [AACU], 2018 ), as well as by college faculty (for curricular analyses in Germany, see e.g., Zlatkin-Troitschanskaia et al., 2018 ). Indeed, in the 2019 administration of the Faculty Survey of Student Engagement (FSSE), 93% of faculty reported that they “very much” or “quite a bit” structure their courses to support student development with respect to thinking critically and analytically. In a listing of 21st century skills, CT was the most highly ranked among FSSE respondents ( Indiana University, 2019 ). Nevertheless, there is considerable evidence that many college students do not develop these skills to a satisfactory standard ( Arum and Roksa, 2011 ; Shavelson et al., 2019 ; Zlatkin-Troitschanskaia et al., 2019 ). This state of affairs represents a serious challenge to higher education – and to society at large.

In view of the importance of CT, as well as evidence of substantial variation in its development during college, its proper measurement is essential to tracking progress in skill development and to providing useful feedback to both teachers and learners. Feedback can help focus students’ attention on key skill areas in need of improvement, and provide insight to teachers on choices of pedagogical strategies and time allocation. Moreover, comparative studies at the program and institutional level can inform higher education leaders and policy makers.

The conceptualization and definition of CT presented here is closely related to models of information processing and online reasoning, the skills that are the focus of this special issue. These two skills are especially germane to the learning environments that college students experience today when much of their academic work is done online. Ideally, students should be capable of more than naïve Internet search, followed by copy-and-paste (e.g., McGrew et al., 2017 ); rather, for example, they should be able to critically evaluate both sources of evidence and the quality of the evidence itself in light of a given purpose ( Leu et al., 2020 ).

In this paper, we present a systematic approach to conceptualizing CT. From that conceptualization and construct definition, we present one possible framework for building performance assessments of CT with particular attention to extended performance tasks within the test environment. The penultimate section discusses some of the challenges that arise with the use and valid interpretation of performance assessment scores. We conclude the paper with a section on future perspectives in an emerging field of research – the iPAL program.

Conceptual Foundations, Definition and Measurement of Critical Thinking

In this section, we briefly review the concept of CT and its definition. In accordance with the principles of evidence-centered design (ECD; Mislevy et al., 2003 ), the conceptualization drives the measurement of the construct; that is, implementation of ECD directly links aspects of the assessment framework to specific facets of the construct. We then argue that performance assessments designed in accordance with such an assessment framework provide the most realistic—and most credible—approach to measuring CT. The section concludes with a sketch of an approach to CT measurement grounded in performance assessment .

Concept and Definition of Critical Thinking

Taxonomies of 21st century skills ( Pellegrino and Hilton, 2012 ) abound, and it is neither surprising that CT appears in most taxonomies of learning, nor that there are many different approaches to defining and operationalizing the construct of CT. There is, however, general agreement that CT is a multifaceted construct ( Liu et al., 2014 ). Liu et al. (2014) identified five key facets of CT: (i) evaluating evidence and the use of evidence; (ii) analyzing arguments; (iii) understanding implications and consequences; (iv) developing sound arguments; and (v) understanding causation and explanation.

There is empirical support for these facets from college faculty. A 2016–2017 survey conducted by the Higher Education Research Institute (HERI) at the University of California, Los Angeles found that a substantial majority of faculty respondents “frequently” encouraged students to: (i) evaluate the quality or reliability of the information they receive; (ii) recognize biases that affect their thinking; (iii) analyze multiple sources of information before coming to a conclusion; and (iv) support their opinions with a logical argument ( Stolzenberg et al., 2019 ).

There is general agreement that CT involves the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion (e.g., Erwin and Sebrell, 2003 ; Kosslyn and Nelson, 2017 ; Shavelson et al., 2018 ). We further suggest that CT includes dealing with dilemmas of ambiguity or conflict among principles and contradictory information ( Oser and Biedermann, 2020 ).

Importantly, Oser and Biedermann (2020) posit that CT can be manifested at three levels. The first level, Critical Analysis , is the most complex of the three levels. Critical Analysis requires both knowledge in a specific discipline (conceptual) and procedural analytical (deduction, inclusion, etc.) knowledge. The second level is Critical Reflection , which involves more generic skills “… necessary for every responsible member of a society” (p. 90). It is “a basic attitude that must be taken into consideration if (new) information is questioned to be true or false, reliable or not reliable, moral or immoral etc.” (p. 90). To engage in Critical Reflection, one needs not only apply analytic reasoning, but also adopt a reflective stance toward the political, social, and other consequences of choosing a course of action. It also involves analyzing the potential motives of various actors involved in the dilemma of interest. The third level, Critical Alertness , involves questioning one’s own or others’ thinking from a skeptical point of view.

Wheeler and Haertel (1993) categorized higher-order skills, such as CT, into two types: (i) those employed when solving problems and making decisions in professional and everyday life, for instance, in relation to civic affairs and the environment; and (ii) those developed through formal instruction, usually in a discipline, in situations that exercise various mental processes (e.g., comparing, evaluating, and justifying). Hence, in both settings, individuals must confront situations that typically involve a problematic event, contradictory information, and possibly conflicting principles. Indeed, there is an ongoing debate concerning whether CT should be evaluated using generic or discipline-based assessments ( Nagel et al., 2020 ). Whether CT skills are conceptualized as generic or discipline-specific has implications for how they are assessed and how they are incorporated into the classroom.

In the iPAL project, CT is characterized as a multifaceted construct that comprises conceptualizing, analyzing, drawing inferences or synthesizing information, evaluating claims, and applying the results of these reasoning processes to various purposes (e.g., solve a problem, decide on a course of action, find an answer to a given question or reach a conclusion) ( Shavelson et al., 2019 ). In the course of carrying out a CT task, an individual typically engages in activities such as specifying or clarifying a problem; deciding what information is relevant to the problem; evaluating the trustworthiness of information; avoiding judgmental errors based on “fast thinking”; avoiding biases and stereotypes; recognizing different perspectives and how they can reframe a situation; considering the consequences of alternative courses of actions; and communicating clearly and concisely decisions and actions. The order in which activities are carried out can vary among individuals and the processes can be non-linear and reciprocal.

In this article, we focus on generic CT skills. The importance of these skills derives not only from their utility in academic and professional settings, but also the many situations involving challenging moral and ethical issues – often framed in terms of conflicting principles and/or interests – to which individuals have to apply these skills ( Kegan, 1994 ; Tessier-Lavigne, 2020 ). Conflicts and dilemmas are ubiquitous in the contexts in which adults find themselves: work, family, civil society. Moreover, to remain viable in the global economic environment – one characterized by increased competition and advances in second generation artificial intelligence (AI) – today’s college students will need to continually develop and leverage their CT skills. Ideally, colleges offer a supportive environment in which students can develop and practice effective approaches to reasoning about and acting in learning, professional and everyday situations.

Measurement of Critical Thinking

Critical thinking is a multifaceted construct that poses many challenges to those who would develop relevant and valid assessments. For those interested in current approaches to the measurement of CT that are not the focus of this paper, consult Zlatkin-Troitschanskaia et al. (2018) .

In this paper, we have singled out performance assessment because it offers important advantages for measuring CT. Extant tests of CT typically employ response formats such as forced-choice or short-answer, and scenario-based tasks (for an overview, see Liu et al., 2014 ). They all suffer from moderate to severe construct underrepresentation; that is, they fail to capture important facets of the CT construct such as perspective taking and communication. High-fidelity performance tasks are viewed as more authentic in that they provide a problem context and require responses that are more similar to what individuals confront in the real world than what is offered by traditional multiple-choice items ( Messick, 1994 ; Braun, 2019 ). This greater verisimilitude promises higher levels of construct representation and lower levels of construct-irrelevant variance. Such performance tasks have the capacity to measure facets of CT that are imperfectly assessed, if at all, using traditional assessments ( Lane and Stone, 2006 ; Braun, 2019 ; Shavelson et al., 2019 ). However, these assertions must be empirically validated, and the measures should be subjected to psychometric analyses. Evidence of the reliability, validity, and interpretative challenges of performance assessment (PA) is extensively detailed in Davey et al. (2015) .

We adopt the following definition of performance assessment:

A performance assessment (sometimes called a work sample when assessing job performance) … is an activity or set of activities that requires test takers, either individually or in groups, to generate products or performances in response to a complex, most often real-world task. These products and performances provide observable evidence bearing on test takers’ knowledge, skills, and abilities—their competencies—in completing the assessment ( Davey et al., 2015 , p. 10).

A performance assessment typically includes an extended performance task and short constructed-response and selected-response (i.e., multiple-choice) tasks (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). In this paper, we refer to both individual performance- and constructed-response tasks as performance tasks (PT) (For an example, see Table 1 in section “iPAL Assessment Framework”).


Table 1. The iPAL assessment framework.

An Approach to Performance Assessment of Critical Thinking: The iPAL Program

The approach to CT presented here is the result of ongoing work undertaken by the International Performance Assessment of Learning collaborative (iPAL 1 ). iPAL is an international consortium of volunteers, primarily from academia, who have come together to address the dearth in higher education of research and practice in measuring CT with performance tasks ( Shavelson et al., 2018 ). In this section, we present iPAL’s assessment framework as the basis of measuring CT, with examples along the way.

iPAL Background

The iPAL assessment framework builds on the Council for Aid to Education’s Collegiate Learning Assessment (CLA). The CLA was designed to measure cross-disciplinary, generic competencies, such as CT, analytic reasoning, problem solving, and written communication ( Klein et al., 2007 ; Shavelson, 2010 ). Ideally, each PA contained an extended PT (e.g., examining a range of evidential materials related to the crash of an aircraft) and two short PT's: one in which students critique an argument and one in which they provide a solution in response to a real-world societal issue.

Motivated by considerations of adequate reliability, the CLA was modified in 2012 to create the CLA+. The CLA+ includes two subtests: a PT and a 25-item Selected Response Question (SRQ) section. The PT presents a document or problem statement and an assignment based on that document which elicits an open-ended response. The CLA+ added the SRQ section (which is not linked substantively to the PT scenario) to increase the number of student responses and thereby obtain more reliable estimates of performance at the student level than could be achieved with a single PT ( Zahner, 2013 ; Davey et al., 2015 ).

iPAL Assessment Framework

Methodological Foundations

The iPAL framework evolved from the Collegiate Learning Assessment developed by Klein et al. (2007) . It was also informed by the results from the AHELO pilot study ( Organisation for Economic Co-operation and Development [OECD], 2012 , 2013 ), as well as the KoKoHs research program in Germany (for an overview see, Zlatkin-Troitschanskaia et al., 2017 , 2020 ). The ongoing refinement of the iPAL framework has been guided in part by the principles of Evidence Centered Design (ECD) ( Mislevy et al., 2003 ; Mislevy and Haertel, 2006 ; Haertel and Fujii, 2017 ).

In educational measurement, an assessment framework plays a critical intermediary role between the theoretical formulation of the construct and the development of the assessment instrument containing tasks (or items) intended to elicit evidence with respect to that construct ( Mislevy et al., 2003 ). Builders of the assessment framework draw on the construct theory and operationalize it in a way that provides explicit guidance to PT’s developers. Thus, the framework should reflect the relevant facets of the construct, where relevance is determined by substantive theory or an appropriate alternative such as behavioral samples from real-world situations of interest (criterion-sampling; McClelland, 1973 ), as well as the intended use(s) (for an example, see Shavelson et al., 2019 ). By following the requirements and guidelines embodied in the framework, instrument developers strengthen the claim of construct validity for the instrument ( Messick, 1994 ).

An assessment framework can be specified at different levels of granularity: an assessment battery (“omnibus” assessment, for an example see below), a single performance task, or a specific component of an assessment ( Shavelson, 2010 ; Davey et al., 2015 ). In the iPAL program, a performance assessment comprises one or more extended performance tasks and additional selected-response and short constructed-response items. The focus of the framework specified below is on a single PT intended to elicit evidence with respect to some facets of CT, such as the evaluation of the trustworthiness of the documents provided and the capacity to address conflicts of principles.

From the ECD perspective, an assessment is an instrument for generating information to support an evidentiary argument and, therefore, the intended inferences (claims) must guide each stage of the design process. The construct of interest is operationalized through the Student Model , which represents the target knowledge, skills, and abilities, as well as the relationships among them. The student model should also make explicit the assumptions regarding student competencies in foundational skills or content knowledge. The Task Model specifies the features of the problems or items posed to the respondent, with the goal of eliciting the evidence desired. The assessment framework also describes the collection of task models comprising the instrument, with considerations of construct validity, various psychometric characteristics (e.g., reliability) and practical constraints (e.g., testing time and cost). The student model provides grounds for evidence of validity, especially cognitive validity; namely, that the students are thinking critically in responding to the task(s).

In the present context, the target construct (CT) is the competence of individuals to think critically, which entails solving complex, real-world problems, and clearly communicating their conclusions or recommendations for action based on trustworthy, relevant and unbiased information. The situations, drawn from actual events, are challenging and may arise in many possible settings. In contrast to more reductionist approaches to assessment development, the iPAL approach and framework rests on the assumption that properly addressing these situational demands requires the application of a constellation of CT skills appropriate to the particular task presented (e.g., Shavelson, 2010 , 2013 ). For a PT, the assessment framework must also specify the rubric by which the responses will be evaluated. The rubric must be properly linked to the target construct so that the resulting score profile constitutes evidence that is both relevant and interpretable in terms of the student model (for an example, see Zlatkin-Troitschanskaia et al., 2019 ).

iPAL Task Framework

The iPAL ‘omnibus’ framework comprises four main aspects: a storyline, a challenge, a document library, and a scoring rubric. Table 1 displays these aspects, brief descriptions of each, and corresponding examples drawn from an iPAL performance assessment (version adapted from the original in Hyytinen and Toom, 2019). Storylines are drawn from various domains, for example, the worlds of business, public policy, civics, medicine, and family. They often involve moral and/or ethical considerations. Deriving an appropriate storyline from a real-world situation requires careful consideration of which features are to be kept in toto, which are to be adapted for purposes of the assessment, and which are to be discarded. Framing the challenge demands care in wording so that there is minimal ambiguity in what is required of the respondent. The difficulty of the challenge depends, in large part, on the nature and extent of the information provided in the document library, the amount of scaffolding included, as well as the scope of the required response. The amount of information and the scope of the challenge should be commensurate with the amount of time available. As is evident from the table, the characteristics of the documents in the library are intended to elicit responses related to facets of CT. For example, with regard to bias, the information provided is intended to play to judgmental errors due to fast thinking and/or motivational reasoning. Ideally, the situation should accommodate multiple solutions of varying degrees of merit.
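To make the four aspects of the omnibus framework concrete, the following sketch models a performance task as a small data structure. It is an illustration only, not an iPAL specification: the class and field names, the example storyline (loosely based on the refugee-crisis task cited in Table 1), and the planted document characteristics are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SourceDocument:
    title: str
    doc_format: str         # e.g., "news article", "blog post", "data table"
    trustworthy: bool       # planted characteristic the respondent should evaluate
    relevant: bool          # some documents are deliberately off-topic

@dataclass
class RubricDimension:
    facet: str              # facet of CT the scale is linked to
    scale_points: int = 6   # e.g., a six-point behaviorally anchored scale
    anchors: List[str] = field(default_factory=list)

@dataclass
class PerformanceTask:
    storyline: str                      # curated version of a real-world situation
    challenge: str                      # what the respondent must produce
    library: List[SourceDocument]       # portfolio of mixed-quality sources
    rubric: List[RubricDimension]       # one scale per construct facet

task = PerformanceTask(
    storyline="A municipality must decide how to house newly arrived refugees.",
    challenge="Recommend a course of action and justify it using the documents.",
    library=[
        SourceDocument("Mayor's press release", "press release", True, True),
        SourceDocument("Anonymous social media post", "blog post", False, True),
    ],
    rubric=[
        RubricDimension("recognizing and evaluating information"),
        RubricDimension("recognizing and evaluating arguments and making decisions"),
    ],
)
print(len(task.library), "documents;", len(task.rubric), "rubric dimensions")
```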

The dimensions of the scoring rubric are derived from the Task Model and Student Model ( Mislevy et al., 2003 ) and signal which features are to be extracted from the response and indicate how they are to be evaluated. There should be a direct link between the evaluation of the evidence and the claims that are made with respect to the key features of the task model and student model . More specifically, the task model specifies the various manipulations embodied in the PA and so informs scoring, while the student model specifies the capacities students employ in more or less effectively responding to the tasks. The score scales for each of the five facets of CT (see section “Concept and Definition of Critical Thinking”) can be specified using appropriate behavioral anchors (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). Of particular importance is the evaluation of the response with respect to the last dimension of the scoring rubric; namely, the overall coherence and persuasiveness of the argument, building on the explicit or implicit characteristics related to the first five dimensions. The scoring process must be monitored carefully to ensure that (trained) raters are judging each response based on the same types of features and evaluation criteria ( Braun, 2019 ) as indicated by interrater agreement coefficients.

The scoring rubric of the iPAL omnibus framework can be modified for specific tasks ( Lane and Stone, 2006 ). This generic rubric helps ensure consistency across rubrics for different storylines. For example, Zlatkin-Troitschanskaia et al. (2019 , p. 473) used the following scoring scheme:

Based on our construct definition of CT and its four dimensions: (D1-Info) recognizing and evaluating information, (D2-Decision) recognizing and evaluating arguments and making decisions, (D3-Conseq) recognizing and evaluating the consequences of decisions, and (D4-Writing), we developed a corresponding analytic dimensional scoring … The students’ performance is evaluated along the four dimensions, which in turn are subdivided into a total of 23 indicators as (sub)categories of CT … For each dimension, we sought detailed evidence in students’ responses for the indicators and scored them on a six-point Likert-type scale. In order to reduce judgment distortions, an elaborate procedure of ‘behaviorally anchored rating scales’ (Smith and Kendall, 1963) was applied by assigning concrete behavioral expectations to certain scale points (Bernardin et al., 1976). To this end, we defined the scale levels by short descriptions of typical behavior and anchored them with concrete examples. … We trained four raters in 1 day using a specially developed training course to evaluate students’ performance along the 23 indicators clustered into four dimensions (for a description of the rater training, see Klotzer, 2018).

Shavelson et al. (2019) examined the interrater agreement of the scoring scheme developed by Zlatkin-Troitschanskaia et al. (2019) and “found that with 23 items and 2 raters the generalizability (“reliability”) coefficient for total scores to be 0.74 (with 4 raters, 0.84)” ( Shavelson et al., 2019 , p. 15). In the study by Zlatkin-Troitschanskaia et al. (2019 , p. 478) three score profiles were identified (low-, middle-, and high-performer) for students. Proper interpretation of such profiles requires care. For example, there may be multiple possible explanations for low scores such as poor CT skills, a lack of a disposition to engage with the challenge, or the two attributes jointly. These alternative explanations for student performance can potentially pose a threat to the evidentiary argument. In this case, auxiliary information may be available to aid in resolving the ambiguity. For example, student responses to selected- and short-constructed-response items in the PA can provide relevant information about the levels of the different skills possessed by the student. When sufficient data are available, the scores can be modeled statistically and/or qualitatively in such a way as to bring them to bear on the technical quality or interpretability of the claims of the assessment: reliability, validity, and utility evidence ( Davey et al., 2015 ; Zlatkin-Troitschanskaia et al., 2019 ). These kinds of concerns are less critical when PT’s are used in classroom settings. The instructor can draw on other sources of evidence, including direct discussion with the student.
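The reported gain in reliability from two raters (0.74) to four raters (0.84) can be roughly reproduced with the Spearman-Brown prophecy formula, treating raters as the only source of measurement error. The sketch below is a simplification of the generalizability analysis actually used by Shavelson et al. (2019); the function name and the single-facet assumption are ours.

```python
def spearman_brown(reliability_k: float, k: int, k_new: int) -> float:
    """Project reliability from k raters (or tasks) to k_new parallel raters."""
    # Recover the implied single-rater reliability, then re-apply the formula.
    r1 = reliability_k / (k - (k - 1) * reliability_k)
    return k_new * r1 / (1 + (k_new - 1) * r1)

# A two-rater generalizability of 0.74 projects to roughly 0.85 with four
# raters, close to the 0.84 obtained from the full G-study.
print(round(spearman_brown(0.74, k=2, k_new=4), 2))
```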

Use of iPAL Performance Assessments in Educational Practice: Evidence From Preliminary Validation Studies

The assessment framework described here supports the development of a PT in a general setting. Many modifications are possible and, indeed, desirable. If the PT is to be more deeply embedded in a certain discipline (e.g., economics, law, or medicine), for example, then the framework must specify characteristics of the narrative and the complementary documents as to the breadth and depth of disciplinary knowledge that is represented.

At present, preliminary field trials employing the omnibus framework (i.e., a full set of documents) indicated that 60 min was generally an inadequate amount of time for students to engage with the full set of complementary documents and to craft a complete response to the challenge (for an example, see Shavelson et al., 2019 ). Accordingly, it would be helpful to develop modified frameworks for PT’s that require substantially less time. For an example, see a short performance assessment of civic online reasoning, requiring response times from 10 to 50 min ( Wineburg et al., 2016 ). Such assessment frameworks could be derived from the omnibus framework by focusing on a reduced number of facets of CT, and specifying the characteristics of the complementary documents to be included – or, perhaps, choices among sets of documents. In principle, one could build a ‘family’ of PT’s, each using the same (or nearly the same) storyline and a subset of the full collection of complementary documents.

Paul and Elder (2007) argue that the goal of CT assessments should be to provide faculty with important information about how well their instruction supports the development of students’ CT. In that spirit, the full family of PT’s could represent all facets of the construct while affording instructors and students more specific insights on strengths and weaknesses with respect to particular facets of CT. Moreover, the framework should be expanded to include the design of a set of short answer and/or multiple choice items to accompany the PT. Ideally, these additional items would be based on the same narrative as the PT to collect more nuanced information on students’ precursor skills such as reading comprehension, while enhancing the overall reliability of the assessment. Areas where students are under-prepared could be addressed before, or even in parallel with the development of the focal CT skills. The parallel approach follows the co-requisite model of developmental education. In other settings (e.g., for summative assessment), these complementary items would be administered after the PT to augment the evidence in relation to the various claims. The full PT taking 90 min or more could serve as a capstone assessment.

As we transition from simply delivering paper-based assessments by computer to taking full advantage of the affordances of a digital platform, we should learn from the hard-won lessons of the past so that we can make swifter progress with fewer missteps. In that regard, we must take validity as the touchstone – assessment design, development and deployment must all be tightly linked to the operational definition of the CT construct. Considerations of reliability and practicality come into play with various use cases that highlight different purposes for the assessment (for future perspectives, see next section).

The iPAL assessment framework represents a feasible compromise between commercial, standardized assessments of CT (e.g., Liu et al., 2014 ), on the one hand, and, on the other, freedom for individual faculty to develop assessment tasks according to idiosyncratic models. It imposes a degree of standardization on both task development and scoring, while still allowing some flexibility for faculty to tailor the assessment to meet their unique needs. In so doing, it addresses a key weakness of the AAC&U’s VALUE initiative 2 (retrieved 5/7/2020) that has achieved wide acceptance among United States colleges.

The VALUE initiative has produced generic scoring rubrics for 15 domains including CT, problem-solving and written communication. A rubric for a particular skill domain (e.g., critical thinking) has five to six dimensions with four ordered performance levels for each dimension (1 = lowest, 4 = highest). The performance levels are accompanied by language that is intended to clearly differentiate among levels. 3 Faculty are asked to submit student work products from a senior level course that is intended to yield evidence with respect to student learning outcomes in a particular domain and that, they believe, can elicit performances at the highest level. The collection of work products is then graded by faculty from other institutions who have been trained to apply the rubrics.

A principal difficulty is that there is neither a common framework to guide the design of the challenge, nor any control on task complexity and difficulty. Consequently, there is substantial heterogeneity in the quality and evidential value of the submitted responses. This also causes difficulties with task scoring and inter-rater reliability. Shavelson et al. (2009) discuss some of the problems arising with non-standardized collections of student work.

In this context, one advantage of the iPAL framework is that it can provide valuable guidance and an explicit structure for faculty in developing performance tasks for both instruction and formative assessment. When faculty design assessments, their focus is typically on content coverage rather than other potentially important characteristics, such as the degree of construct representation and the adequacy of their scoring procedures ( Braun, 2019 ).

Concluding Reflections

Challenges to Interpretation and Implementation

Performance tasks such as those generated by iPAL are attractive instruments for assessing CT skills (e.g., Shavelson, 2010 ; Shavelson et al., 2019 ). The attraction mainly rests on the assumption that elaborated PT's are more authentic (direct) and more completely capture facets of the target construct (i.e., possess greater construct representation) than the widely used selected-response tests. However, as Messick (1994) noted, authenticity is a “promissory note” that must be redeemed with empirical research. In practice, there are trade-offs among authenticity, construct validity, and psychometric quality such as reliability ( Davey et al., 2015 ).

One reason for Messick's (1994) caution is that authenticity does not guarantee construct validity. The latter must be established by drawing on multiple sources of evidence ( American Educational Research Association et al., 2014 ). Following the ECD principles in designing and developing the PT, as well as the associated scoring rubrics, constitutes an important type of evidence. Further, as Leighton (2019) argues, response process data (“cognitive validity”) is needed to validate claims regarding the cognitive complexity of PT's. Relevant data can be obtained through cognitive laboratory studies involving methods such as think-aloud protocols or eye-tracking. Although time-consuming and expensive, such studies can yield not only evidence of validity, but also valuable information to guide refinements of the PT.

Going forward, iPAL PT’s must be subjected to validation studies as recommended in the Standards for Psychological and Educational Testing by American Educational Research Association et al. (2014) . With a particular focus on the criterion “relationships to other variables,” a framework should include assumptions about the theoretically expected relationships among the indicators assessed by the PT, as well as the indicators’ relationships to external variables such as intelligence or prior (task-relevant) knowledge.

Complementing the necessity of evaluating construct validity, there is the need to consider potential sources of construct-irrelevant variance (CIV). One pertains to student motivation, which is typically greater when the stakes are higher. If students are not motivated, then their performance is likely to be impacted by factors unrelated to their (construct-relevant) ability ( Lane and Stone, 2006 ; Braun et al., 2011 ; Shavelson, 2013 ). Differential motivation across groups can also bias comparisons. Student motivation might be enhanced if the PT is administered in the context of a course with the promise of generating useful feedback on students’ skill profiles.

Construct-irrelevant variance can also occur when students are not equally prepared for the format of the PT or fully appreciate the response requirements. This source of CIV could be alleviated by providing students with practice PT’s. Finally, the use of novel forms of documentation, such as those from the Internet, can potentially introduce CIV due to differential familiarity with forms of representation or contents. Interestingly, this suggests that there may be a conflict between enhancing construct representation and reducing CIV.

Another potential source of CIV is related to response evaluation. Even with training, human raters can vary in accuracy and usage of the full score range. In addition, raters may attend to features of responses that are unrelated to the target construct, such as the length of the students’ responses or the frequency of grammatical errors ( Lane and Stone, 2006 ). Some of these sources of variance could be addressed in an online environment, where word processing software could alert students to potential grammatical and spelling errors before they submit their final work product.

Performance tasks generally take longer to administer and are more costly than traditional assessments, making it more difficult to reliably measure student performance ( Messick, 1994 ; Davey et al., 2015 ). Indeed, it is well known that more than one performance task is needed to obtain high reliability ( Shavelson, 2013 ). This is due to both student-task interactions and variability in scoring. Sources of student-task interactions are differential familiarity with the topic ( Hyytinen and Toom, 2019 ) and differential motivation to engage with the task. The level of reliability required, however, depends on the context of use. For use in formative assessment as part of an instructional program, reliability can be lower than use for summative purposes. In the former case, other types of evidence are generally available to support interpretation and guide pedagogical decisions. Further studies are needed to obtain estimates of reliability in typical instructional settings.

With sufficient data, more sophisticated psychometric analyses become possible. One challenge is that the assumption of unidimensionality required for many psychometric models might be untenable for performance tasks ( Davey et al., 2015 ). Davey et al. (2015) provide the example of a mathematics assessment that requires students to demonstrate not only their mathematics skills but also their written communication skills. Although the iPAL framework does not explicitly address students’ reading comprehension and organization skills, students will likely need to call on these abilities to accomplish the task. Moreover, as the operational definition of CT makes evident, the student must not only deploy several skills in responding to the challenge of the PT, but also carry out component tasks in sequence. The former requirement strongly indicates the need for a multi-dimensional IRT model, while the latter suggests that the usual assumption of local item independence may well be problematic ( Lane and Stone, 2006 ). At the same time, the analytic scoring rubric should facilitate the use of latent class analysis to partition data from large groups into meaningful categories ( Zlatkin-Troitschanskaia et al., 2019 ).
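As a rough illustration of the dimensionality concern, one can inspect the eigenvalues of the inter-item correlation matrix for analytic-rubric scores before committing to a unidimensional IRT model. The NumPy sketch below simulates 23 indicator scores driven by two latent factors (a CT factor and a writing factor); the data, factor structure, and eigenvalue screen are assumptions for illustration, not findings about iPAL assessments.

```python
import numpy as np

# Simulated analytic-rubric scores: 300 students on 23 indicators, generated
# from two correlated-with-nothing latent factors plus noise. Real iPAL data
# would be needed to say anything about the actual assessments.
rng = np.random.default_rng(1)
n_students, n_items = 300, 23
ct_skill = rng.normal(size=n_students)          # critical-thinking factor
writing = rng.normal(size=n_students)           # writing/communication factor
loadings_ct = rng.uniform(0.4, 0.8, n_items)
loadings_wr = rng.uniform(0.0, 0.5, n_items)
noise = rng.normal(scale=0.6, size=(n_students, n_items))
scores = np.outer(ct_skill, loadings_ct) + np.outer(writing, loadings_wr) + noise

eigvals = np.linalg.eigvalsh(np.corrcoef(scores, rowvar=False))[::-1]
print("largest eigenvalues:", np.round(eigvals[:4], 2))
# More than one eigenvalue well above 1 suggests the unidimensionality
# assumption of simple IRT models may be untenable for these responses.
```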

Future Perspectives

Although the iPAL consortium has made substantial progress in the assessment of CT, much remains to be done. Further refinement of existing PT’s and their adaptation to different languages and cultures must continue. To this point, there are a number of examples: The refugee crisis PT (cited in Table 1 ) was translated and adapted from Finnish to US English and then to Colombian Spanish. A PT concerning kidney transplants was translated and adapted from German to US English. Finally, two PT’s based on ‘legacy admissions’ to US colleges were translated and adapted to Colombian Spanish.

With respect to data collection, there is a need for sufficient data to support psychometric analysis of student responses, especially the relationships among the different components of the scoring rubric, as this would inform both task development and response evaluation ( Zlatkin-Troitschanskaia et al., 2019 ). In addition, more intensive study of response processes through cognitive laboratories and the like are needed to strengthen the evidential argument for construct validity ( Leighton, 2019 ). We are currently conducting empirical studies, collecting data on both iPAL PT’s and other measures of CT. These studies will provide evidence of convergent and discriminant validity.

At the same time, efforts should be directed at further development to support different ways CT PT’s might be used—i.e., use cases—especially those that call for formative use of PT’s. Incorporating formative assessment into courses can plausibly be expected to improve students’ competency acquisition ( Zlatkin-Troitschanskaia et al., 2017 ). With suitable choices of storylines, appropriate combinations of (modified) PT’s, supplemented by short-answer and multiple-choice items, could be interwoven into ordinary classroom activities. The supplementary items may be completely separate from the PT’s (as is the case with the CLA+), loosely coupled with the PT’s (as in drawing on the same storyline), or tightly linked to the PT’s (as in requiring elaboration of certain components of the response to the PT).

As an alternative to such integration, stand-alone modules could be embedded in courses to yield evidence of students’ generic CT skills. Core curriculum courses or general education courses offer ideal settings for embedding performance assessments. If these assessments were administered to a representative sample of students in each cohort over their years in college, the results would yield important information on the development of CT skills at a population level. For another example, these PA’s could be used to assess the competence profiles of students entering Bachelor’s or graduate-level programs as a basis for more targeted instructional support.

Thus, in considering different use cases for the assessment of CT, it is evident that several modifications of the iPAL omnibus assessment framework are needed. As noted earlier, assessments built according to this framework are demanding with respect to the extensive preliminary work required by a task and the time required to properly complete it. Thus, it would be helpful to have modified versions of the framework, focusing on one or two facets of the CT construct and calling for a smaller number of supplementary documents. The challenge to the student should be suitably reduced.

Some members of the iPAL collaborative have developed PT’s that are embedded in disciplines such as engineering, law and education ( Crump et al., 2019 ; for teacher education examples, see Jeschke et al., 2019 ). These are proving to be of great interest to various stakeholders and further development is likely. Consequently, it is essential that an appropriate assessment framework be established and implemented. It is both a conceptual and an empirical question as to whether a single framework can guide development in different domains.

Performance Assessment in Online Learning Environment

Over the last 15 years, increasing amounts of time in both college and work are spent using computers and other electronic devices. This has led to formulation of models for the new literacies that attempt to capture some key characteristics of these activities. A prominent example is a model proposed by Leu et al. (2020) . The model frames online reading as a process of problem-based inquiry that calls on five practices to occur during online research and comprehension:

1. Reading to identify important questions,

2. Reading to locate information,

3. Reading to critically evaluate information,

4. Reading to synthesize online information, and

5. Reading and writing to communicate online information.

The parallels with the iPAL definition of CT are evident and suggest there may be benefits to closer links between these two lines of research. For example, a report by Leu et al. (2014) describes empirical studies comparing assessments of online reading using either open-ended or multiple-choice response formats.

The iPAL consortium has begun to take advantage of the affordances of the online environment (for examples, see Schmidt et al. and Nagel et al. in this special issue). Most obviously, Supplementary Materials can now include archival photographs, audio recordings, or videos. Additional tasks might include the online search for relevant documents, though this would add considerably to the time demands. This online search could occur within a simulated Internet environment, as is the case for the IEA’s ePIRLS assessment ( Mullis et al., 2017 ).

The prospect of having access to a wealth of materials that can add to task authenticity is exciting. Yet it can also add ambiguity and information overload. Increased authenticity, then, should be weighed against validity concerns and the time required to absorb the content in these materials. Modifications of the design framework and extensive empirical testing will be required to decide on appropriate trade-offs. A related possibility is to employ some of these materials in short-answer (or even selected-response) items that supplement the main PT. Response formats could include highlighting text or using a drag-and-drop menu to construct a response. Students’ responses could be automatically scored, thereby containing costs. With automated scoring, feedback to students and faculty, including suggestions for next steps in strengthening CT skills, could also be provided without adding to faculty workload. Therefore, taking advantage of the online environment to incorporate new types of supplementary documents should be a high priority and, perhaps, to introduce new response formats as well. Finally, further investigation of the overlap between this formulation of CT and the characterization of online reading promulgated by Leu et al. (2020) is a promising direction to pursue.

Data Availability Statement

All datasets generated for this study are included in the article/supplementary material.

Author Contributions

HB wrote the article. RS, OZ-T, and KB were involved in the preparation and revision of the article and co-wrote the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This study was funded in part by the Spencer Foundation (Grant No. 201700123).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We would like to thank all the researchers who have participated in the iPAL program.

  • ^ https://www.ipal-rd.com/
  • ^ https://www.aacu.org/value
  • ^ When test results are reported by means of substantively defined categories, the scoring is termed “criterion-referenced.” This is in contrast to results reported as percentiles; such scoring is termed “norm-referenced.”

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (2014). Standards for Educational and Psychological Testing. Washington, D.C: American Educational Research Association.


Arum, R., and Roksa, J. (2011). Academically Adrift: Limited Learning on College Campuses. Chicago, IL: University of Chicago Press.

Association of American Colleges and Universities (n.d.). VALUE: What is Value? Available online at: https://www.aacu.org/value (accessed May 7, 2020).

Association of American Colleges and Universities [AACU] (2018). Fulfilling the American Dream: Liberal Education and the Future of Work. Available online at: https://www.aacu.org/research/2018-future-of-work (accessed May 1, 2020).

Braun, H. (2019). Performance assessment and standardization in higher education: a problematic conjunction? Br. J. Educ. Psychol. 89, 429–440. doi: 10.1111/bjep.12274


Braun, H. I., Kirsch, I., and Yamoto, K. (2011). An experimental study of the effects of monetary incentives on performance on the 12th grade NAEP reading assessment. Teach. Coll. Rec. 113, 2309–2344.

Crump, N., Sepulveda, C., Fajardo, A., and Aguilera, A. (2019). Systematization of performance tests in critical thinking: an interdisciplinary construction experience. Rev. Estud. Educ. 2, 17–47.

Davey, T., Ferrara, S., Shavelson, R., Holland, P., Webb, N., and Wise, L. (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Washington, DC: Center for K-12 Assessment & Performance Management, Educational Testing Service.

Erwin, T. D., and Sebrell, K. W. (2003). Assessment of critical thinking: ETS’s tasks in critical thinking. J. Gen. Educ. 52, 50–70. doi: 10.1353/jge.2003.0019


Haertel, G. D., and Fujii, R. (2017). “Evidence-centered design and postsecondary assessment,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education , 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 313–339. doi: 10.4324/9781315709307-26

Hyytinen, H., and Toom, A. (2019). Developing a performance assessment task in the Finnish higher education context: conceptual and empirical insights. Br. J. Educ. Psychol. 89, 551–563. doi: 10.1111/bjep.12283

Hyytinen, H., Toom, A., and Shavelson, R. J. (2019). “Enhancing scientific thinking through the development of critical thinking in higher education,” in Redefining Scientific Thinking for Higher Education: Higher-Order Thinking, Evidence-Based Reasoning and Research Skills , eds M. Murtonen and K. Balloo (London: Palgrave MacMillan).

Indiana University (2019). FSSE 2019 Frequencies: FSSE 2019 Aggregate. Available online at: http://fsse.indiana.edu/pdf/FSSE_IR_2019/summary_tables/FSSE19_Frequencies_(FSSE_2019).pdf (accessed May 1, 2020).

Jeschke, C., Kuhn, C., Lindmeier, A., Zlatkin-Troitschanskaia, O., Saas, H., and Heinze, A. (2019). Performance assessment to investigate the domain specificity of instructional skills among pre-service and in-service teachers of mathematics and economics. Br. J. Educ. Psychol. 89, 538–550. doi: 10.1111/bjep.12277

Kegan, R. (1994). In Over Our Heads: The Mental Demands of Modern Life. Cambridge, MA: Harvard University Press.

Klein, S., Benjamin, R., Shavelson, R., and Bolus, R. (2007). The collegiate learning assessment: facts and fantasies. Eval. Rev. 31, 415–439. doi: 10.1177/0193841x07303318

Kosslyn, S. M., and Nelson, B. (2017). Building the Intentional University: Minerva and the Future of Higher Education. Cambridge, MA: The MIT Press.

Lane, S., and Stone, C. A. (2006). “Performance assessment,” in Educational Measurement, 4th Edn, ed. R. L. Brennan (Lanham, MD: Rowman & Littlefield Publishers), 387–432.

Leighton, J. P. (2019). The risk–return trade-off: performance assessments and cognitive validation of inferences. Br. J. Educ. Psychol. 89, 441–455. doi: 10.1111/bjep.12271

Leu, D. J., Kiili, C., Forzani, E., Zawilinski, L., McVerry, J. G., and O’Byrne, W. I. (2020). “The new literacies of online research and comprehension,” in The Concise Encyclopedia of Applied Linguistics , ed. C. A. Chapelle (Oxford: Wiley-Blackwell), 844–852.

Leu, D. J., Kulikowich, J. M., Kennedy, C., and Maykel, C. (2014). “The ORCA Project: designing technology-based assessments for online research,” in Paper Presented at the American Educational Research Annual Meeting , Philadelphia, PA.

Liu, O. L., Frankel, L., and Roohr, K. C. (2014). Assessing critical thinking in higher education: current state and directions for next-generation assessments. ETS Res. Rep. Ser. 1, 1–23. doi: 10.1002/ets2.12009

McClelland, D. C. (1973). Testing for competence rather than for “intelligence.”. Am. Psychol. 28, 1–14. doi: 10.1037/h0034092

McGrew, S., Ortega, T., Breakstone, J., and Wineburg, S. (2017). The challenge that’s bigger than fake news: civic reasoning in a social media environment. Am. Educ. 4, 4-9, 39.

Mejía, A., Mariño, J. P., and Molina, A. (2019). Incorporating perspective analysis into critical thinking performance assessments. Br. J. Educ. Psychol. 89, 456–467. doi: 10.1111/bjep.12297

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educ. Res. 23, 13–23. doi: 10.3102/0013189x023002013

Mislevy, R. J., Almond, R. G., and Lukas, J. F. (2003). A brief introduction to evidence-centered design. ETS Res. Rep. Ser. 2003, i–29. doi: 10.1002/j.2333-8504.2003.tb01908.x

Mislevy, R. J., and Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educ. Meas. Issues Pract. 25, 6–20. doi: 10.1111/j.1745-3992.2006.00075.x

Mullis, I. V. S., Martin, M. O., Foy, P., and Hooper, M. (2017). ePIRLS 2016 International Results in Online Informational Reading. Available online at: http://timssandpirls.bc.edu/pirls2016/international-results/ (accessed May 1, 2020).

Nagel, M.-T., Zlatkin-Troitschanskaia, O., Schmidt, S., and Beck, K. (2020). “Performance assessment of generic and domain-specific skills in higher education economics,” in Student Learning in German Higher Education , eds O. Zlatkin-Troitschanskaia, H. A. Pant, M. Toepper, and C. Lautenbach (Berlin: Springer), 281–299. doi: 10.1007/978-3-658-27886-1_14

Organisation for Economic Co-operation and Development [OECD] (2012). AHELO: Feasibility Study Report, Vol. 1: Design and Implementation. Paris: OECD.

Organisation for Economic Co-operation and Development [OECD] (2013). AHELO: Feasibility Study Report, Vol. 2: Data Analysis and National Experiences. Paris: OECD.

Oser, F. K., and Biedermann, H. (2020). “A three-level model for critical thinking: critical alertness, critical reflection, and critical analysis,” in Frontiers and Advances in Positive Learning in the Age of Information (PLATO) , ed. O. Zlatkin-Troitschanskaia (Cham: Springer), 89–106. doi: 10.1007/978-3-030-26578-6_7

Paul, R., and Elder, L. (2007). Consequential validity: using assessment to drive instruction. Found. Crit. Think. 29, 31–40.

Pellegrino, J. W., and Hilton, M. L. (eds) (2012). Education for life and work: Developing Transferable Knowledge and Skills in the 21st Century. Washington DC: National Academies Press.

Shavelson, R. (2010). Measuring College Learning Responsibly: Accountability in a New Era. Redwood City, CA: Stanford University Press.

Shavelson, R. J. (2013). On an approach to testing and modeling competence. Educ. Psychol. 48, 73–86. doi: 10.1080/00461520.2013.779483

Shavelson, R. J., Zlatkin-Troitschanskaia, O., Beck, K., Schmidt, S., and Marino, J. P. (2019). Assessment of university students’ critical thinking: next generation performance assessment. Int. J. Test. 19, 337–362. doi: 10.1080/15305058.2018.1543309

Shavelson, R. J., Zlatkin-Troitschanskaia, O., and Marino, J. P. (2018). “International performance assessment of learning in higher education (iPAL): research and development,” in Assessment of Learning Outcomes in Higher Education: Cross-National Comparisons and Perspectives , eds O. Zlatkin-Troitschanskaia, M. Toepper, H. A. Pant, C. Lautenbach, and C. Kuhn (Berlin: Springer), 193–214. doi: 10.1007/978-3-319-74338-7_10

Shavelson, R. J., Klein, S., and Benjamin, R. (2009). The limitations of portfolios. Inside Higher Educ. Available online at: https://www.insidehighered.com/views/2009/10/16/limitations-portfolios

Stolzenberg, E. B., Eagan, M. K., Zimmerman, H. B., Berdan Lozano, J., Cesar-Davis, N. M., Aragon, M. C., et al. (2019). Undergraduate Teaching Faculty: The HERI Faculty Survey 2016–2017. Los Angeles, CA: UCLA.

Tessier-Lavigne, M. (2020). Putting Ethics at the Heart of Innovation. Stanford, CA: Stanford Magazine.

Wheeler, P., and Haertel, G. D. (1993). Resource Handbook on Performance Assessment and Measurement: A Tool for Students, Practitioners, and Policymakers. Palm Coast, FL: Owl Press.

Wineburg, S., McGrew, S., Breakstone, J., and Ortega, T. (2016). Evaluating Information: The Cornerstone of Civic Online Reasoning. Executive Summary. Stanford, CA: Stanford History Education Group.

Zahner, D. (2013). Reliability and Validity – CLA+. Council for Aid to Education. Available online at: https://pdfs.semanticscholar.org/91ae/8edfac44bce3bed37d8c9091da01d6db3776.pdf

Zlatkin-Troitschanskaia, O., and Shavelson, R. J. (2019). Performance assessment of student learning in higher education [Special issue]. Br. J. Educ. Psychol. 89, i–iv, 413–563.

Zlatkin-Troitschanskaia, O., Pant, H. A., Lautenbach, C., Molerov, D., Toepper, M., and Brückner, S. (2017). Modeling and Measuring Competencies in Higher Education: Approaches to Challenges in Higher Education Policy and Practice. Berlin: Springer VS.

Zlatkin-Troitschanskaia, O., Pant, H. A., Toepper, M., and Lautenbach, C. (eds) (2020). Student Learning in German Higher Education: Innovative Measurement Approaches and Research Results. Wiesbaden: Springer.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., and Pant, H. A. (2018). “Assessment of learning outcomes in higher education: international comparisons and perspectives,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education , 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 686–697.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., Schmidt, S., and Beck, K. (2019). On the complementarity of holistic and analytic approaches to performance assessment scoring. Br. J. Educ. Psychol. 89, 468–484. doi: 10.1111/bjep.12286

Keywords : critical thinking, performance assessment, assessment framework, scoring rubric, evidence-centered design, 21st century skills, higher education

Citation: Braun HI, Shavelson RJ, Zlatkin-Troitschanskaia O and Borowiec K (2020) Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation. Front. Educ. 5:156. doi: 10.3389/feduc.2020.00156

Received: 30 May 2020; Accepted: 04 August 2020; Published: 08 September 2020.


Copyright © 2020 Braun, Shavelson, Zlatkin-Troitschanskaia and Borowiec. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Henry I. Braun, [email protected]

This article is part of the Research Topic: Assessing Information Processing and Online Reasoning as a Prerequisite for Learning in Higher Education.


Critical Thinking Models: A Comprehensive Guide for Effective Decision Making

Critical Thinking Models

Critical thinking models are valuable frameworks that help individuals develop and enhance their critical thinking skills . These models provide a structured approach to problem-solving and decision-making by encouraging the evaluation of information and arguments in a logical, systematic manner. By understanding and applying these models, one can learn to make well-reasoned judgments and decisions.


Various critical thinking models exist, each catering to different contexts and scenarios. These models offer a step-by-step method to analyze situations, scrutinize assumptions and biases, and consider alternative perspectives. The goal of critical thinking models is to enhance an individual’s ability to think critically, thereby improving their reasoning and decision-making skills in both personal and professional settings.

Key Takeaways

  • Critical thinking models provide structured approaches for enhancing decision-making abilities
  • These models help individuals analyze situations, scrutinize assumptions, and consider alternative perspectives
  • The application of critical thinking models can significantly improve one’s reasoning and judgment skills.

Fundamentals of Critical Thinking


Definition and Importance

Critical thinking is the intellectual process of evaluating information logically, objectively, and systematically in order to form reasoned judgments grounded in evidence . It involves:

  • Identifying and questioning assumptions,
  • Applying consistent principles and criteria,
  • Analyzing and synthesizing information,
  • Drawing conclusions based on evidence.

The importance of critical thinking lies in its ability to help individuals make informed decisions, solve complex problems, and differentiate between true and false beliefs .

Core Cognitive Skills

Several core cognitive skills underpin critical thinking:

  • Analysis : Breaking down complex information into smaller components to identify patterns or inconsistencies.
  • Evaluation : Assessing the credibility and relevance of sources, arguments, and evidence.
  • Inference : Drawing conclusions by connecting the dots between analyzed information.
  • Synthesis : Incorporating analyzed information into a broader understanding and constructing one’s argument.
  • Logic and reasoning : Applying principles of logic to determine the validity of arguments and weigh evidence.

These skills enable individuals to consistently apply intellectual standards in their thought process, which ultimately results in sound judgments and informed decisions.

Influence of Cognitive Biases

A key aspect of critical thinking is recognizing and mitigating the impact of cognitive biases on our thought processes. Cognitive biases are cognitive shortcuts or heuristics that can lead to flawed reasoning and distort our understanding of a situation. Examples of cognitive biases include confirmation bias, anchoring bias, and availability heuristic.

To counter the influence of cognitive biases, critical thinkers must be aware of their own assumptions and strive to apply consistent and objective evaluation criteria in their thinking process. The practice of actively recognizing and addressing cognitive biases promotes an unbiased and rational approach to problem-solving and decision-making.

The Critical Thinking Process


Stages of Critical Thinking

The critical thinking process starts with gathering and evaluating data . This stage involves identifying relevant information and ensuring it is credible and reliable. Next, an individual engages in analysis by examining the data closely to understand its context and interpret its meaning. This step can involve breaking down complex ideas into simpler components for better understanding.

The next stage focuses on determining the quality of the arguments, concepts, and theories present in the analyzed data. Critical thinkers question the credibility and logic behind the information while also considering their own biases and assumptions. They apply consistent standards when evaluating sources, which helps them identify any weaknesses in the arguments.

Values play a significant role in the critical thinking process. Critical thinkers assess the significance of moral, ethical, or cultural values shaping the issue, argument, or decision at hand. They determine whether these values align with the evidence and logic they have analyzed.

After thorough analysis and evaluation, critical thinkers draw conclusions based on the evidence and reasoning gathered. This step includes synthesizing the information and presenting a clear, concise argument or decision. It also involves explaining the reasoning behind the conclusion to ensure it is well-founded.

Application in Decision Making

In decision making, critical thinking is a vital skill that allows individuals to make informed choices. It enables them to:

  • Analyze options and their potential consequences
  • Evaluate the credibility of sources and the quality of information
  • Identify biases, assumptions, and values that may influence the decision
  • Construct a reasoned, well-justified conclusion

By using critical thinking in decision making, individuals can make more sound, objective choices. The process helps them to avoid pitfalls like jumping to conclusions, being influenced by biases, or basing decisions on unreliable data. The result is more thoughtful, carefully-considered decisions leading to higher quality outcomes.

Critical Thinking Models

Critical thinking models are frameworks that help individuals develop better problem-solving and decision-making abilities. They provide strategies for analyzing, evaluating, and synthesizing information to reach well-founded conclusions. This section will discuss four notable models: The RED Model, Bloom’s Taxonomy, Paul-Elder Model, and The Halpern Critical Thinking Assessment.

The RED Model

The RED Model stands for Recognize Assumptions, Evaluate Arguments, and Draw Conclusions. It emphasizes the importance of questioning assumptions, weighing evidence, and reaching logical conclusions.

  • Recognize Assumptions: Identify and challenge assumptions that underlie statements, beliefs, or arguments.
  • Evaluate Arguments: Assess the validity and reliability of evidence to support or refute claims.
  • Draw Conclusions: Make well-reasoned decisions based on available information and sound reasoning.

The RED Model helps individuals become more effective problem solvers and decision-makers by guiding them through the critical thinking process.

Bloom’s Taxonomy

Bloom’s Taxonomy is a hierarchical model that classifies cognitive skills into six levels of complexity. These levels are remembering, understanding, applying, analyzing, evaluating, and creating. By progressing through these levels, individuals can develop higher-order thinking skills.

  • Remembering: Recall information or facts.
  • Understanding: Comprehend the meaning of ideas, facts, or problems.
  • Applying: Use knowledge in different situations.
  • Analyzing: Break down complex topics or problems into sub-parts.
  • Evaluating: Assess the quality, relevance, or credibility of information, ideas, or solutions.
  • Creating: Combine elements to form a new whole, generate new ideas, or solve complex issues.

Paul-Elder Model

The Paul-Elder Model takes a structured approach to critical thinking built around three components: intellectual standards, elements of thought, and intellectual traits.

  • Intellectual Standards: Apply standards such as clarity, accuracy, precision, and relevance to problem-solving and decision-making.
  • Elements of Thought: Consider purpose, question at issue, information, interpretation and inference, concepts, assumptions, implications, and point of view.
  • Intellectual Traits: Develop intellectual traits, such as intellectual humility, intellectual empathy, and intellectual perseverance.

This model fosters a deeper understanding and appreciation of critical thinking.

The Halpern Critical Thinking Assessment

The Halpern Critical Thinking Assessment is a standardized test developed by Diane Halpern to assess critical thinking skills. The evaluation uses a variety of tasks to measure abilities in core skill areas, such as verbal reasoning, argument analysis, and decision making. Pearson, a leading publisher of educational assessments, offers this test as a means to assess individuals’ critical thinking skills.

These four critical thinking models can be used as frameworks to improve and enhance cognitive abilities. By learning and practicing these models, individuals can become better equipped to analyze complex information, evaluate options, and make well-informed decisions.

Evaluating Information and Arguments

In this section, we will discuss the importance of evaluating information and arguments in the process of critical thinking, focusing on evidence assessment, logic and fallacies, and argument analysis.

Evidence Assessment

Evaluating the relevance, accuracy, and credibility of information is a vital aspect of critical thinking. In the process of evidence assessment, a thinker should consider the following factors:

  • Source reliability : Research and understand the expertise and credibility of the source to ensure that biased or inaccurate information is not being considered.
  • Currency : Check the date of the information to make sure it is still relevant and accurate in the present context.
  • Objectivity : Analyze the information for potential bias and always cross-reference it with other credible sources.

When practicing critical thinking skills, it is essential to be aware of your own biases and make efforts to minimize their influence on your decision-making process.

Logic and Fallacies

Logic is crucial for deconstructing and analyzing complex arguments, while identifying and avoiding logical fallacies helps maintain accurate and valid conclusions. Some common fallacies to watch out for in critical thinking include:

  • Ad Hominem : Attacking the person making the argument instead of addressing the argument itself.
  • Strawman : Misrepresenting an opponent’s argument to make it easier to refute.
  • False Dilemma : Presenting only two options when there may be multiple viable alternatives.
  • Appeal to Authority : Assuming a claim is true simply because an authority figure supports it.

Being aware of these fallacies enables a thinker to effectively evaluate the strength of an argument and make sound judgments accordingly.

Argument Analysis

Analyzing an argument is the process of evaluating its structure, premises, and conclusion while determining its validity and soundness. To analyze an argument, follow these steps:

  • Identify the premises and conclusion : Determine the main point being argued and the premises offered in its support.
  • Evaluate the validity : Assess whether the conclusion logically follows from the premises and if the argument’s structure is sound.
  • Test the soundness : Evaluate the truth and relevance of the premises. This may require verifying the accuracy of facts and evidence, as well as assessing the reliability of sources.
  • Consider counter-arguments : Identify opposing viewpoints and counter-arguments, and evaluate their credibility to gauge the overall strength of the original argument.

By effectively evaluating information and arguments, critical thinkers develop a solid foundation for making well-informed decisions and solving problems.

Enhancing Critical Thinking

Strategies for Improvement

To enhance critical thinking, individuals can practice different strategies, including asking thought-provoking questions, analyzing ideas and observations, and being open to different perspectives. One effective technique is the Critical Thinking Roadmap , which breaks critical thinking down into four measurable phases: execute, synthesize, recommend, and communicate. It’s important to use deliberate practice in these areas to develop a strong foundation for problem-solving and decision-making. In addition, cultivating a mindset of courage , fair-mindedness , and empathy will support critical thinking development.

Critical Thinking in Education

In the field of education, critical thinking is an essential component of effective learning and pedagogy. Integrating critical thinking into the curriculum encourages student autonomy, fosters innovation, and improves student outcomes. Teachers can use various approaches to promote critical thinking, such as:

  • Employing open-ended questions to stimulate ideas
  • Incorporating group discussions or debates to facilitate communication and evaluation of viewpoints
  • Assessing and providing feedback on student work to encourage reflection and improvement
  • Utilizing real-world scenarios and case studies for practical application of concepts

Developing a Critical Thinking Mindset

To truly enhance critical thinking abilities, it’s important to adopt a mindset that values integrity , autonomy , and empathy . These qualities help to create a learning environment that encourages open-mindedness, which is key to critical thinking development. To foster a critical thinking mindset:

  • Be curious : Remain open to new ideas and ask questions to gain a deeper understanding.
  • Communicate effectively : Clearly convey thoughts and actively listen to others.
  • Reflect and assess : Regularly evaluate personal beliefs and assumptions to promote growth.
  • Embrace diversity of thought : Welcome different viewpoints and ideas to foster innovation.

Incorporating these approaches can lead to a more robust critical thinking skillset, allowing individuals to better navigate and solve complex problems.

Critical Thinking in Various Contexts

The Workplace and Beyond

Critical thinking is a highly valued skill in the workplace, as it enables employees to analyze situations, make informed decisions, and solve problems effectively. It involves a careful thinking process directed towards a specific goal. Employers often seek individuals who possess strong critical thinking abilities, as they can add significant value to the organization.

In the workplace context, critical thinkers are able to recognize assumptions, evaluate arguments, and draw conclusions, following models such as the RED model . They can also adapt their thinking to suit various scenarios, allowing them to tackle complex and diverse problems.

Moreover, critical thinking transcends the workplace and applies to various aspects of life. It empowers an individual to make better decisions, analyze conflicting information, and engage in constructive debates.

Creative and Lateral Thinking

Critical thinking encompasses both creative and lateral thinking. Creative thinking involves generating novel ideas and solutions to problems, while lateral thinking entails looking at problems from different angles to find unique and innovative solutions.

Creative thinking allows thinkers to:

  • Devise new concepts and ideas
  • Challenge conventional wisdom
  • Build on existing knowledge to generate innovative solutions

Lateral thinking, on the other hand, encourages thinkers to:

  • Break free from traditional thought patterns
  • Combine seemingly unrelated ideas to create unique solutions
  • Utilize intuition and intelligence to approach problems from a different perspective

Both creative and lateral thinking are essential components of critical thinking, allowing individuals to view problems in a holistic manner and generate well-rounded solutions. These skills are highly valued by employers and can lead to significant personal and professional growth.

In conclusion, critical thinking is a multifaceted skill that comprises various thought processes, including creative and lateral thinking. By embracing these skills, individuals can excel in the workplace and in their personal lives, making better decisions and solving problems effectively.

Overcoming Challenges

Recognizing and Addressing Bias

Cognitive biases and thinking biases can significantly affect the process of critical thinking . One of the key components of overcoming these challenges is to recognize and address them. It is essential to be aware of one’s own beliefs, as well as the beliefs of others, to ensure fairness and clarity throughout the decision-making process. To identify and tackle biases, one can follow these steps:

  • Be self-aware : Understand personal beliefs and biases, acknowledging that they may influence the interpretation of information.
  • Embrace diverse perspectives : Encourage open discussions and invite different viewpoints to challenge assumptions and foster cognitive diversity.
  • Reevaluate evidence : Continuously reassess the relevance and validity of the information being considered.

By adopting these practices, individuals can minimize the impact of biases and enhance the overall quality of their critical thinking skills.

Dealing with Information Overload

In today’s world, information is abundant, and it can become increasingly difficult to demystify and make sense of the available data. Dealing with information overload is a crucial aspect of critical thinking. Here are some strategies to address this challenge:

  • Prioritize information : Focus on the most relevant and reliable data, filtering out unnecessary details.
  • Organize data : Use tables, charts, and lists to categorize information and identify patterns more efficiently.
  • Break down complex information : Divide complex data into smaller, manageable segments to simplify interpretation and inferences.

By implementing these techniques, individuals can effectively manage information overload, enabling them to process and analyze data more effectively, leading to better decision-making.

In conclusion, overcoming challenges such as biases and information overload is essential in the pursuit of effective critical thinking. By recognizing and addressing these obstacles, individuals can develop clarity and fairness in their thought processes, leading to well-informed decisions and improved problem-solving capabilities.

Measuring Critical Thinking

Assessment Tools and Criteria

There are several assessment tools designed to measure critical thinking, each focusing on different aspects such as quality, depth, breadth, and significance of thinking. One example of a widely used standardized test is the Watson-Glaser Critical Thinking Appraisal , which evaluates an individual’s ability to interpret information, recognize assumptions, and draw conclusions. Another test is the Cornell Critical Thinking Tests Level X and Level Z , which assess an individual’s critical thinking skills through multiple-choice questions.

Furthermore, criteria for assessing critical thinking often include precision, relevance, and the ability to gather and analyze relevant information. Some assessors utilize the Halpern Critical Thinking Assessment , which measures the application of cognitive skills such as deduction, observation, and induction in real-world scenarios.

The Role of IQ and Tests

It’s important to note that intelligence quotient (IQ) tests and critical thinking assessments are not the same. While IQ tests aim to measure an individual’s cognitive abilities and general intelligence, critical thinking tests focus specifically on one’s ability to analyze, evaluate, and form well-founded opinions. Therefore, having a high IQ does not necessarily guarantee strong critical thinking skills, as critical thinking requires additional mental processes beyond basic logical reasoning.

To build and enhance critical thinking skills, individuals should practice and develop higher-order thinking, such as critical alertness, critical reflection, and critical analysis. Using a Critical Thinking Roadmap , such as the four-phase framework of executing, synthesizing, recommending, and communicating, individuals can continuously work to improve their critical thinking abilities.

Frequently Asked Questions

What are the main steps involved in the Paul-Elder critical thinking model?

The Paul-Elder Critical Thinking Model is a comprehensive framework for developing critical thinking skills. The main steps include: identifying the purpose, formulating questions, gathering information, identifying assumptions, interpreting information, and evaluating arguments. The model emphasizes clarity, accuracy, precision, relevance, depth, breadth, logic, and fairness throughout the critical thinking process. By following these steps, individuals can efficiently analyze and evaluate complex ideas and issues.

Can you list five techniques to enhance critical thinking skills?

Here are five techniques to help enhance critical thinking skills:

  • Ask open-ended questions : Encourages exploration and challenges assumptions.
  • Engage in active listening: Focus on understanding others’ viewpoints before responding.
  • Reflect on personal biases: Identify and question any preconceived notions or judgments.
  • Practice mindfulness: Develop self-awareness and stay present in the moment.
  • Collaborate with others: Exchange ideas and learn from diverse perspectives.

What is the RED Model of critical thinking and how is it applied?

The RED Model of critical thinking consists of three key components: Recognize Assumptions, Evaluate Arguments, and Draw Conclusions. To apply the RED Model, begin by recognizing and questioning underlying assumptions, being aware of personal biases and stereotypes. Next, evaluate the strengths and weaknesses of different arguments, considering evidence, logical consistency, and alternative explanations. Lastly, draw well-reasoned conclusions that are based on the analysis and evaluation of the information gathered.

How do the ‘3 C’s’ of critical thinking contribute to effective problem-solving?

The ‘3 C’s’ of critical thinking – Curiosity, Creativity, and Criticism – collectively contribute to effective problem-solving. Curiosity allows individuals to explore various perspectives and ask thought-provoking questions, while Creativity helps develop innovative solutions and unique approaches to challenges. Criticism, or the ability to evaluate and analyze ideas objectively, ensures that the problem-solving process remains grounded in logic and relevance.

What characteristics distinguish critical thinking from creative thinking?

Critical thinking and creative thinking are two complementary cognitive skills. Critical thinking primarily focuses on analyzing, evaluating, and reasoning, using objectivity and logical thinking. It involves identifying problems, assessing evidence, and drawing sound conclusions. Creative thinking, on the other hand, is characterized by the generation of new ideas, concepts, and approaches to solve problems, often involving imagination, originality, and out-of-the-box thinking.

What are some recommended books to help improve problem-solving and critical thinking skills?

There are several books that can help enhance problem-solving and critical thinking skills, including:

  • “Thinking, Fast and Slow” by Daniel Kahneman: This book explores the dual process theory of decision-making and reasoning.
  • “The 5 Elements of Effective Thinking” by Edward B. Burger and Michael Starbird: Offers practical tips and strategies for improving critical thinking skills.
  • “Critique of Pure Reason” by Immanuel Kant: A classic philosophical work that delves into the principles of reason and cognition.
  • “Mindware: Tools for Smart Thinking” by Richard E. Nisbett: Presents a range of cognitive tools to enhance critical thinking and decision-making abilities.
  • “The Art of Thinking Clearly” by Rolf Dobelli: Explores common cognitive biases and errors in judgment that can affect critical thinking.


Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: a systematic review

Affiliations

  • 1 School of Nursing and Midwifery, Centre for Health Practice Innovation, Menzies Health Institute Queensland, Griffith University, Brisbane, Australia. Electronic address: [email protected].
  • 2 Centre for Health Practice Innovation, Menzies Health Institute Queensland, Griffith University, Brisbane, Australia.
  • 3 School of Nursing and Midwifery, Centre for Health Practice Innovation, Menzies Health Institute Queensland, Griffith University, Brisbane, Australia.
  • PMID: 25817987
  • DOI: 10.1016/j.nedt.2015.02.023

Background: Well developed critical thinking skills are essential for nursing and midwifery practices. The development of students' higher-order cognitive abilities, such as critical thinking, is also well recognised in nursing and midwifery education. Measurement of critical thinking development is important to demonstrate change over time and effectiveness of teaching strategies.

Objective: To evaluate tools designed to measure critical thinking in nursing and midwifery undergraduate students.

Data sources: The following six databases were searched and resulted in the retrieval of 1191 papers: CINAHL, Ovid Medline, ERIC, Informit, PsycINFO and Scopus.

Review methods: After screening for inclusion, each paper was evaluated using the Critical Appraisal Skills Programme Tool. Thirty-four studies met the inclusion criteria and quality appraisal. Sixteen different tools that measure critical thinking were reviewed for reliability and validity and extent to which the domains of critical thinking were evident.

Results: Sixty percent of studies utilised one of four standardised commercially available measures of critical thinking. Reliability and validity were not consistently reported and there was a variation in reliability across studies that used the same measure. Of the remaining studies using different tools, there was also limited reporting of reliability making it difficult to assess internal consistency and potential applicability of measures across settings.

Conclusions: Discipline specific instruments to measure critical thinking in nursing and midwifery are required, specifically tools that measure the application of critical thinking to practise. Given that critical thinking development occurs over an extended period, measurement needs to be repeated and multiple methods of measurement used over time.

Keywords: Critical thinking; Evaluation; Measures; Midwifery; Nursing; Scales.

Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.



9 Critical Thinking Tools for Better Decision Making

“A great many people think they are thinking when they are merely rearranging their prejudices.” William James

This article is a companion to my previous article about a Decision-Making Framework for Leaders and will refer to some of the concepts in that post. Today, I’m sharing an overview of 9 critical thinking tools you can use as a leader making decisions for your organisation or team. I have written a more in-depth article on each of the tools and you will find links to those articles below.


What Is Critical Thinking?

Critical thinking is the mode of thinking – about any subject, content, or problem – in which the thinker improves the quality of his or her thinking by skillfully analyzing, assessing, and reconstructing it.

It entails effective communication and problem-solving abilities, as well as a commitment to overcoming our biases.

Or, to put it another way – critical thinking is the art of thinking about our approach to thinking. It’s about gaining knowledge, comprehending it, applying that knowledge, analyzing and synthesizing.

Critical thinking can happen at any part of the decision making process. And the goal is to make sure we think deeply about our thinking and apply that thinking in different ways to come up with options and alternatives.

Think of it as a construct of moving through our thinking instead of just rushing through it.

Critical Thinking Is An Important Part of Decision-Making

It’s important to understand that critical thinking can sit outside of a specific decision-making process. And by the same token, decision making doesn’t always need to include critical thinking.

But for the purposes of this article, I’m addressing critical thinking within the problem and decision-making context.

And I’m sharing 9 critical thinking tools that are helpful for people at every stage of their leadership journey. There are so many tools out there and I’d love to hear from you if you have a favourite one that you’ve found useful.

So, whether you are:

  • just beginning to flex your critical thinking and decision-making muscles
  • or an experienced leader looking for tools to help you think more deeply about a problem

There is something here for you.

Let’s dive in.

9 Critical Thinking Tools For Leaders

  • Decision-Making Tree
  • Changing Your Lens
  • Active Listening & Socratic Method
  • Decision Hygiene Checklist
  • Where Accuracy Lives
  • The 5 Whys: Root Cause Analysis
  • RAID Log
  • The 7 So-Whats: Consequences of Actions
  • Overcoming Analysis Paralysis

Of course, there are many other tools available. But let’s look at how each of these can improve your decision-making and leadership skills .

1. Decision-Making Tree

The decision making tree can be useful before going into a decision-making meeting to determine how collaborative or inclusive you need to be and who should be included in the discussion on a particular issue.

This tree is a simple yes/no workflow in response to some specific questions that can guide you to identify if you need others to help you make a certain decision and if so, who you should include.

To take a deeper dive into the decision-making tree framework, read our latest article.
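
The specific questions in the tree come from the linked article; purely to illustrate how such a yes/no workflow hangs together, here is a minimal sketch in Python. The questions, function name, and recommendations below are hypothetical placeholders, not part of the original framework.

```python
# Hypothetical yes/no decision-involvement workflow (illustrative only;
# the real framework's questions are in the linked article).

def who_to_involve(answers: dict) -> str:
    """Walk a small yes/no tree and suggest how collaborative to be."""
    if not answers["is_decision_significant"]:
        return "Decide alone and simply inform the team."
    if answers["do_i_have_all_the_information"]:
        if answers["does_buy_in_matter"]:
            return "Decide alone, then consult key stakeholders before announcing."
        return "Decide alone."
    if answers["is_there_time_for_a_group_process"]:
        return "Run a collaborative session with the people closest to the problem."
    return "Consult one or two subject-matter experts, then decide."


print(who_to_involve({
    "is_decision_significant": True,
    "do_i_have_all_the_information": False,
    "does_buy_in_matter": True,
    "is_there_time_for_a_group_process": True,
}))
```

The point is not the code itself but the shape of the tool: a handful of yes/no questions, answered honestly, that tell you how widely to open up the decision.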

2. Changing Your Lens

Looking at problems through a different lens is about changing your point of view, changing the context, or changing the reality. Let’s go into each of those a little more.

Point of View

Ask yourself these questions as it relates to the problem at hand.

  • Can you change your point of view?
  • How is the problem defined from the perspective of the CEO, of the frontline staff, of customers, of adjacent groups? The goal is to look at the problem from the perspective of others within your specific organisation, so adjust these as needed.

They will all look at the problem in different ways as well as define it differently, depending upon their point of view. Understanding all of the viewpoints can give you a deeper understanding of all the ramifications of the problem at hand.

We tend to come at the problem from our own functional perspective. If I work in finance, well, it’s going to be a finance problem. If you ask someone who works in IT, they’ll likely look at the same thing and say, “It’s an IT problem.”

Change the Context

Can you change the context in terms of how you define the problem? Find someone from another area and ask them how they would define the problem. Use their perspective to generate that different point of view.

Change Your Reality

Ask yourself, “What if I …

  • Removed some of these constraints?
  • Had some of these resources?
  • Was able to do X instead of Y?

By changing the reality, you may find a different way to define the problem that enables you to pursue different opportunities.

3. Active Listening & Socratic Method

This is pairing active listening with the Socratic method. Active listening is one of the core skills you’ll want to develop to get better at critical thinking. I also touched on active listening / deep listening in my article on difficult conversations .

That’s because you need to turn down the volume on your own beliefs and biases and listen to someone else. It’s about being present and staying focused.

Listening Skills include:

  • Be present and stay focused
  • Ask open-ended and probing questions
  • Be aware of your biases
  • Don’t interrupt or preempt
  • Be curious and ask questions (80/20 talk time)
  • Recap facts – repeat back what you heard using their language
  • Allow the silence
  • Move from Cosmetic > Conversational > Active > Deep Listening

When you are trying to find the problem, talk about what success looks like, and think about what the real question is, you have to be aware of your own biases: the things that resonate with you simply because they match what you already believe.

Learn to ask questions and listen for insight.

When you’re trying to understand and gather information, it’s very easy to want to jump in to clarify your question when someone’s thinking.

But they’re actually thinking – so you need to sit back and allow it.

When you marry this type of active listening with some key questions that come from Socrates, it can help you understand problems at a deeper level.

To use this, just highlight one or two questions you’ve never used before to clarify, to understand the initial issue, or to bring up some assumptions. You can take just one question from each area to try out and listen for the answer.

As simple as this sounds, this is part of critical thinking. It’s about uncovering what’s actually going on to get to the root cause of a situation.

To take a deeper dive into the Socratic method framework and some workplace scenarios, read our latest article.

4. Decision Hygiene Checklist

When we think about active listening with great questions, we need to make sure that we are learning what someone else thinks without infecting them with what WE think.

That’s where the Decision Hygiene Checklist comes in. When we’re in this gathering and analysing data phase, you need to make sure you keep that analysis in a neutral environment. Don’t signal your conclusions.

You may want to quarantine people from past decisions, as well. Don’t bring up past decisions or outcomes because you want to get the information from them without it being polluted.

When you’re seeking feedback from others, exercise good decision hygiene in the following ways:

  • Quarantine others from your opinions and beliefs when asking for feedback.
  • Frame your request for feedback in a neutral fashion to keep from signalling your conclusions.
  • Quarantine others from outcomes when asking about past decisions.
  • Before you are in the middle of a decision, make a checklist of the facts and relevant information you would need in order to give feedback on such a decision.
  • Have the people seeking and giving feedback agree to be accountable to provide all the relevant information, ask for anything that’s not been provided, and refuse to give feedback if the person seeking feedback can’t provide relevant information.

When involved in a group setting, exercise these additional forms of decision hygiene:

  • Solicit feedback independently, before a group discussion or before members express their views to one another.
  • Anonymize the sources of the views and distribute a compilation to group members for review, in advance of group meetings or discussion.

5. Where Accuracy Lives

Staying with the theme that our own beliefs can distort reality and our decision making, another approach is to think about where accuracy lives.

The Inside View is from your own perspective, experiences, and beliefs. The Outside View is the way others see the world and the situation you’re in. And somewhere in the middle may be the reality.

This tool is quite simple. Start out with your inside view and describe the challenge from your perspective. Write down your understanding, your analysis, and maybe even your conclusions.

Then it’s almost like De Bono’s six hats where you take that hat off and you look at the outside view. Describe the situation from an outside view. Ask yourself if a co-worker had this problem, how would they view it? How might their perspective differ? What kind of solutions could they offer?

And then you marry those two narratives. One thing about the outside view is that you can get statistics around some of the information you’re looking at.

It can be quite helpful to get a base level of what is actually proven and true, statistically, that is not polluted by the inside view.

Once you’ve run through this process, ask yourself:

  • Did this actually change my view?
  • Can I see the biases that were sitting there?
  • And if Yes, why?

To learn more about how to use this framework and how to overcome some of the obstacles you might encounter read our deeper dive here.

6. The 5 Whys: Root Cause Analysis

This is a really simple tool: start by defining the problem or the defect, then keep asking why until you get to the fifth Why. This is usually where you’ll start to discover a possible solution.

Here’s a simple example:

  • Problem – I ran a red light.
  • Well, why did it happen? I was late for an appointment.
  • Why did that happen? Well, I woke up late.
  • And why did that happen? My alarm didn’t go off on my phone.
  • Why did that happen? I didn’t plug it into the charger.
  • And why wasn’t it plugged in? Because I forgot to plug it in.

So there’s the possible solution – I’ve got to set up a recurring alarm at 9pm to remind me to plug my phone in.

This tool is perfect for junior members of your team, or ones who come to you with a barrage of questions about a problem. Have them take the 5 Whys template, think it through, and ask themselves the five whys.
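
For teams that prefer to capture these chains digitally rather than on a paper template, a minimal interactive helper might look like the sketch below. It is purely illustrative; the prompt wording and function name are not from the article.

```python
# Minimal interactive 5 Whys helper (an illustrative sketch, not a formal tool).

def five_whys(problem: str, depth: int = 5) -> list:
    """Ask 'why' repeatedly and return the chain from problem to root cause."""
    chain = [problem]
    for i in range(1, depth + 1):
        answer = input(f"Why #{i}: why did '{chain[-1]}' happen? ")
        chain.append(answer)
    print("\nRoot-cause chain:")
    for step in chain:
        print("  ->", step)
    return chain


if __name__ == "__main__":
    five_whys("I ran a red light")
```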

Interested in learning more about how to use the 5 Why’s framework and how to overcome some of the obstacles you might encounter? Read our latest article with case studies.

7. RAID Log

RAID stands for:

  • Risks – What risks could have an adverse impact on the project?
  • Assumptions – What assumptions are we making?
  • Issues – What issues have already impacted, or could impact, the project?
  • Dependencies – What does the project depend on?

The RAID Log is often used when you’ve got multiple decisions about an ongoing project.

Whether you’ll be assessing your thinking by yourself, or with team members or customers, this is a great way to make sure you’re gathering all of the necessary information including the assumptions, any issues and dependencies.
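
Because a RAID Log is essentially a four-category record, it can be kept in a very lightweight structure. The sketch below is illustrative: the class name and example entries are invented, and only the four RAID categories come from the article.

```python
# Lightweight RAID Log structure (illustrative; entries are invented examples).
from dataclasses import dataclass, field

@dataclass
class RaidLog:
    risks: list = field(default_factory=list)         # what could adversely impact the project
    assumptions: list = field(default_factory=list)   # what we are taking as given
    issues: list = field(default_factory=list)        # problems already affecting the project
    dependencies: list = field(default_factory=list)  # what the project relies on

log = RaidLog()
log.risks.append("Key supplier may miss the Q3 delivery date")
log.assumptions.append("Budget approval arrives before design work starts")
log.issues.append("Two team members are double-booked this sprint")
log.dependencies.append("Final copy from marketing is needed before launch")

for category, entries in vars(log).items():
    print(category.upper() + ":", entries)
```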

8. The 7 So-Whats: Consequences of Actions

All of the previous tools are designed to help you define what the problem is. But it’s also important to think about the consequences of actions.

As you grow as a leader, you’ll need to be comfortable understanding both big thinking and little thinking. Big picture and little details so you are confident in your decisions.

A big part of that is understanding the consequences of your actions and decisions. That’s what the 7 So-Whats tool is about.

The 7 So-Whats is similar to the 5 Whys in that you ask the same question repeatedly to get the answer. Start with your recommendation or possible solution and then ask “So, what will that mean” 7 times.

For example, if you need to hire a new sales rep, the first ‘So, what’ would be something like, “We’ll need to have the right job description and salary package for them, and let the team know they’re coming on.”

And then you work your way through the rest of the ‘So, Whats’ to detail out the results or consequences of the action you’re thinking about.

To read more about the 7 So-Whats read our comprehensive article with case studies.

9. Overcoming Analysis Paralysis

A lot of people get caught up in analysis paralysis. I know I do. Whether it’s thinking about moving house or taking on a new hire, you get all the information but you still feel stuck.

What I find is that it’s usually because we are narrowing our focus too much, especially when it comes to advancement in your career or self-promotion.

So here are some questions to help you push through that analysis paralysis. Ask yourself:

  • How would I make this decision if I was focused on opening up opportunities for myself / the situation?
  • What would I advise my best friend to do? Or What would my successor do in this situation?
  • Your caution may be the result of short-term fears, such as embarrassment, that aren’t important in the long run. Can you create a timeline or deadline to make the decision that will give you some mental distance?

Basically, you want to ask yourself what is holding you back. Is it fear? Fear of disappointment? Or that you don’t have enough information?

Perhaps you think you could get more information, but can you get more information in the time available? If not, then make the decision with what you have.

If you hold back from making your decision, what will the impact be for your stakeholders, your career, and how people view you?

The purpose of this tool is to separate yourself from the situation a little bit so you can look at it more objectively, as if you were advising a friend. And push through the paralysis to make the decision.

9 Critical Thinking Tools For Better Decision-Making

Taking time to think about how you think and using tools like these can be the difference between becoming a good leader and a great one.

Use these nine critical thinking tools to empower you to make better decisions for your business, organisation, and career – and feel confident doing so.

For personalised guidance on how best to use critical thinking skills for your business or organisation, drop us a line . We would be happy to partner with you to create a plan tailored to your needs.



  • Open access
  • Published: 14 September 2022

Adaptation and validation of a critical thinking scale to measure the 3D critical thinking ability of EFL readers

  • Moloud Mohammadi   ORCID: orcid.org/0000-0001-7848-1869 1 ,
  • Gholam-Reza Abbasian   ORCID: orcid.org/0000-0003-1507-1736 2 &
  • Masood Siyyari 1  

Language Testing in Asia, volume 12, Article number: 24 (2022)


Thinking has always been an integral part of human life; whenever humanity has been thinking, it has, in a sense, been practicing a critique of the issues around it. This is the concept of critical thinking, which enhances individuals’ ability to identify problems and find solutions. Most previous research has focused on only one dimension of critical thinking, namely critical thinking skills, while the two other dimensions, criticality and critical pedagogy, should also have been included. In order to ensure the validity of the instrument designed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers’ Reading Ability: Asynchronous Web-based Collaborative vs. Question-Answer-Relationship Instructional Approach, under review), it was first adapted and then SEM modeling was used. Examination of the results of factor analysis and SEM modeling showed that the model satisfied the fit indices ( χ 2 /df, CFI, TLI, RMSEA), and all the factor loadings are greater than 0.4, which indicates that the items are defined properly. This research proposed an SEM model of critical thinking skills composed of six factors measured by twenty-two indices. The results of the PLS-SEM CFA showed that it is a valid structural model for measuring the critical thinking of EFL readers at three levels.
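
For readers unfamiliar with the fit indices named in the abstract, the standard general SEM definitions (not values specific to this study) are:

$$
\frac{\chi^2_M}{df_M}, \qquad
\mathrm{RMSEA} = \sqrt{\frac{\max\left(\chi^2_M - df_M,\ 0\right)}{df_M\,(N-1)}},
$$

$$
\mathrm{CFI} = 1 - \frac{\max\left(\chi^2_M - df_M,\ 0\right)}{\max\left(\chi^2_B - df_B,\ \chi^2_M - df_M,\ 0\right)}, \qquad
\mathrm{TLI} = \frac{\chi^2_B/df_B - \chi^2_M/df_M}{\chi^2_B/df_B - 1},
$$

where $M$ denotes the hypothesized model, $B$ the baseline (null) model, and $N$ the sample size (some software uses $N$ in place of $N-1$ for RMSEA). As a rough convention, $\chi^2/df$ below about 3, CFI and TLI above roughly 0.90, and RMSEA below about 0.08 are read as acceptable fit, which is the sense in which the abstract reports that the model satisfied the fit indices.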

Introduction

Recent research on reading has shown that, although it is generally established as the first skill in language learners, it is a complex cognitive activity that individuals must perform well in order to learn and obtain sufficient information from the target community (Shang, 2010). According to Krathwohl (2002), the cognitive domain is divided into two parts: knowledge (factual, conceptual, procedural, and metacognitive) and the cognitive processes (remembering, understanding, applying, analyzing, evaluating, and creating). In defining this skill, Chamot (2004) holds that reading is the process of activating language-acquired knowledge and skills to access information and transfer it. Swallow (2016) looks at it as a three-dimensional construct including content, perception, and understanding through thinking, metacognition, and meaning construction (Gear, 2006).

According to Rashel and Kinya ( 2021 ), the focus of education in this competitive period of time is on higher-level thinking skills (including critical thinking) rather than low-level thinking skills, and research into measuring critical thinking skills is growing. In the eyes of Ennis ( 2011 ), critical thinking ability is defined as clear and rational thinking that includes engaging in reflective and independent thinking. Moon ( 2008 ) and Willingham ( 2007 ) emphasized that the development of critical thinking in individuals is the goal of higher education and can be recognized as the primary goal of learning. Paul and Elder ( 2002 ), in describing a critical thinker, stated that in the eyes of such a person, all four skills of reading, writing, speaking, and listening are methods of skilled higher-order thinking. Such a person, while reading the text, finds it a representation of the author’s thinking and therefore tries to align with his point of view. In this regard, Din ( 2020 ) emphasizes that since a critical thinker has the ability to understand beyond the content of a text, they tend to react to the content being studied. Moreover, the tendency towards implementing critical thinking programs in the English language teaching context has increased as well (Heidari, 2020 ; Liu & Stapleton, 2014 ).

Besides these theory-oriented investigations, there are a number of studies with a practical orientation. Some research has examined the role of critical thinking in learning a language (e.g., Akyuz & Samsa, 2009; Willingham, 2007); other studies focused on the thinking strategies used by language learners to improve reading skills (Shang, 2010) or on the relationship between critical thinking and language learning strategies (Nikopour et al., 2011). A few studies confirmed the relationship between critical thinking ability and reading comprehension (e.g., Eftekhary & Besharati Kalayeh, 2014). In this area, only a limited number of studies have relied on the participation of the academic community (Hawkins, 2012; Hosseini et al., 2012), and the present study is also innovative in this respect. It can be inferred that in most of these studies, critical thinking is limited to the use of a definite set of basic skills (comparing and contrasting, concluding, inferencing, etc.). According to Facione (1990) and Rashel and Kinya (2021), most research on this topic has focused on general critical thinking skills (but not expertise), although these skills have been of interest for years. But is it enough to use these skills alone to understand content? Can critical thinking be summarized in terms of several sub-skills? Where and how are the role and influence of society reflected in critical thinking or critical reading? Does learning these sub-skills alone indicate the internalization of critical thinking and reading in individuals? These key questions have been left largely unanswered, mainly due to the lack of a specific and valid instrument, and this gap is the rationale behind the present study.

The novel point in the present study is that, despite the existence of the one-dimensional attitude towards critical thinking (Facione, 1992 ; Kember et al., 2000 ), it tries to highlight the concept of a three-dimensional critical thinking in academic context and in this regard developed a tool for measuring its subscales (and not just individual skills). Such a tool can measure the real needs of the next generation with evidence of real-life multifaceted critical thinking issues. The purpose of this study was to evaluate the validity of the questionnaire developed for assessing three-dimensional critical thinking skills in EFL readers. Moreover, the application of the partial least squares method (PLS-SEM) in the development and validation of the proposed model has also made this research innovative. The objectives of this study were (1) to assess the validity of the items introduced in the questionnaire, (2) to investigate the relationship between and among the identified components, and (3) to determine the validity and reliability of the questionnaire designed to assess three-dimensional critical thinking skills in EFL readers. The contribution of this article in the literature is to illustrate the importance of critical thinking both in personal and sociocultural aspects, to evaluate and validate the tool that was developed to measure the components of three-dimensional critical thinking (proposed by the same researchers), to provide the model fit indices for factor analysis, and to adapt the instrument to the conditions of English language readers. Therefore, an attempt was made to briefly introduce the components of the proposed model, and then to discuss the validation method of the developed instrument to measure these components, and finally to report the validation results of the introduced instrument. The pedagogical implications of this study include the following: using the presented concepts in research centers to identify and introduce the method of teaching and developing each of the sub-skills of critical thinking in different societies; identifying differences in instructional approaches for each of the sub-skills; applying both concepts (i.e., three-dimensional critical thinking and reading) in other areas and assessing the generalizability of findings; and reviewing the previous literature by looking at all three dimensions introduced and evaluated in order to identify their strengths and weaknesses in this regard.

Literature review

Now that critical thinking is more prominent in language teaching than ever (Li, 2016; Van Laar et al., 2017), there is a wealth of research on the need for, and importance of, fostering such thinking in language classrooms (Zhao et al., 2016), showing that developing such thinking facilitates language acquisition (Wang & Henderson, 2014; Wu et al., 2013) and equips learners with the self-criticism needed to develop an analytical and reflective view of themselves and their environment (Moghadam et al., 2021). Brookfield (2019), Dekker (2020), Haji Maibodi and Fahim (2012), and Zou and Lee (2021) acknowledged that teachers who emphasize the teaching and application of critical thinking increase learners’ awareness and understanding of socio-cultural concepts. In this regard, Crenshaw et al. (2011) stated that encouraging language learners to participate actively in thinking activities is essential, and McGregor (2007) and Rezaei et al. (2011) emphasized that engaging teachers and language learners in thinking and reflecting on the views and assumptions presented in a text is among the essential steps in the development of critical thinking in individuals. Rezaei et al. (2011) acknowledged that learners’ participation in critical thinking processes during teaching takes place through asking questions and providing answers, discussing topics, asking for explanations or elaborations of opinions, and so on. They also emphasized the need to provide teachers with accurate and comprehensive knowledge of critical thinking before they teach such classes. In addition, Tehrani and Razali (2018) and Li (2016) have suggested that critical thinking training should begin at an early age and within the natural process of learning the target language. However, despite the importance of and emphasis on its development, little progress has been made in its application and integration in education (Li, 2011; Pica, 2000); the reasons, according to Lin et al. (2016), can be found in its challenging and wide-ranging nature and in the ambiguity of its components.

The traditional definitions of critical thinking offered by philosophers do not necessarily help individuals become critical citizens or critical beings. However, the core characteristics of critical thinking introduced in these definitions remain fundamental to what is meant by critical thinking (Bali, 2015; Davies, 2015; Davies & Barnett, 2015; Renatovna & Renatovna, 2021; Widyastuti, 2018; Wilson, 2016). Although critical thinking is a pivotal learning skill, the acquisition of related skills in the traditional view was limited to practicing certain types of skills such as inferencing, reasoning, and analyzing (Davies, 2015). Davies emphasizes that one of the weaknesses of the traditional sense of critical thinking, crystallized in the critical thinking movement, is that the important component of action never takes shape. It is worth mentioning that paying insufficient attention to topics related to critical thinking in higher education may result in the lack of a proper, well-defined practical (and even theoretical) instruction, and, as Paulsen (2015) noted, little advancement can be made if critical thinking remains vague.

A model of critical thinking in higher education is suggested by Davies (2015), in which the basic focus is on individuals' critical rationality and critical character. He proposes six distinct dimensions of critical thinking: critical argumentation, critical judgment, critical dispositions, critical actions, critical social relations, and critical creativity or critical being. Each of these dimensions plays a significant role in the comprehensive model of critical thinking (Davies, 2015; Davies & Barnett, 2015).

There are many well-developed models of critical thinking that might be called "philosophical" models. These models can be placed on a continuum from taxonomies of pedagogical objectives (e.g., Airasian et al., 2001; Bloom, 1956) to the APA Delphi Report and Paul-Elder models (e.g., Paul & Elder, 2002; Sadler, 2010), and to the model of critical thinking by Ennis (1991), whose main emphasis is on cognitive decision-making. However, Davies (2015) showed that these models are used mostly for educating for critical thinking, where the main goal is to provide learners with activities through which they can improve basic judgment and decision-making, whereas critical thinking is a multidimensional concept containing both personal and social aspects. In support of the term multidimensional in relation to critical thinking, several existing challenges can be mentioned. Lun et al. (2010) and Manalo and Sheppard (2016) stated that a certain level of language proficiency is required to accomplish such thinking. Similarly, Peng (2014) stated that, for students, language deficiency is one of the main cognitive barriers preventing them from practicing critical thinking. Explaining other challenges, Liang and Fung (2020) and Merrifield (2018) stated that culture affects the application and practice of such thinking. For example, factors such as a significant decline in the quality and quantity of social interactions and intense attention to an individual's social status within a group (Suryantari, 2018), certain social standards explicit in Eastern settings (Bag & Gürsoy, 2021), socio-cultural factors (Imperio et al., 2020; Shpeizer, 2018), fear of being ridiculed when expressing an opinion (Tumasang, 2021), epistemic belief in the certainty of knowledge (Kahsay, 2019), the dominance of teacher-centered language classes (Fahim & Ahmadian, 2012; Hemmati & Azizmalayeri, 2022; Khorasani & Farimani, 2010), or weak critical thinking experience due to the lack of inductive education in the Iranian context (Birjandi et al., 2018) reduce the natural capacity for developing such a skill as well as the power of induction, especially in adults (Dornyei, 2010). Therefore, the subject of language learning, whether in a foreign or a second language context, complicates the cultivation of critical thinking in such a way that its development cannot be limited to learning a few individual skills. In this regard, Davies and Barnett (2015) attempted to bring together a set of perspectives and thus identified three broad perspectives on critical thinking in the literature. These perspectives often oppose one another while also overlapping and merging to a considerable degree (Frykholm, 2020; Ledman, 2019; Shpeizer, 2018; Wilson, 2016; Wilson & Howitt, 2018). Shpeizer (2018) also emphasized that this mutual influence and the lack of clear boundaries among the three areas have made the concept of critical thinking confusing, and perhaps daunting, for English teachers.

In addition, understanding the nature and dimensions of critical thinking in order to evaluate it is of crucial importance. Assessing an individual's critical thinking requires a measuring instrument that can precisely determine the true conditions. From a review of the literature, one can find several instruments for measuring students' critical thinking skills and abilities, each with its own perspective, definitions of criteria, and priorities. Among these instruments are the California Critical Thinking Skills Test (CCTST) by Facione (1992); the Critical Thinking Questionnaire by Kember et al. (2000); the Ricketts (2003) questionnaire; the Critical Reading Scale by Zhou et al. (2015); and the Critical Thinking Inventory by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review). The questionnaire designed by Mohammadi et al. (under review), unlike previous tools, addresses all three dimensions of critical thinking (i.e., individual skills, criticality, and critical pedagogy).

Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review), drawing on Davies (2015) and Davies and Barnett (2015), argue that critical thinking comprises personal critical thinking skills together with the skills gained at the criticality level and the critical pedagogy level. The levels, movements, and skills of each level introduced in their study are presented in the figure below.

As shown in Fig. 1, as one moves from the center outward (to the surface), the stages of critical thinking development appear: the process begins with the development of individual critical thinking skills, followed by the criticality movement and then the critical pedagogy movement. In this figure, the XY plane (drawn on the x and y axes) indicates the measurement subscales, the YZ plane (drawn on the y and z axes) represents the individual and socio-cultural dimensions, and the XZ plane (drawn on the x and z axes) represents the different movements.

Fig. 1. The model of critical thinking movements, skills and abilities, and assessing criteria, extracted from Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review)

The model indicates that improving critical thinking in a person requires considering both individual and socio-cultural aspects. In this figure, the XZ plane represents the various dimensions of critical thinking, the YZ plane represents cognitive-developmental skills, and the XY plane shows the sub-skills of each layer (i.e., the assessing criteria in this study). The aspects and skills of three-dimensional critical thinking previously introduced by Mohammadi et al. (under review) are briefly explained in Table 1.

Critical thinking and criticality are the concepts most closely interwoven with language skills acquisition in general, and with the processing and development of reading skills in particular; developing the skills related to each of these two movements, in turn, requires critical pedagogy. According to Haji Maibodi (2014), reading comprehension refers to the ability to construct meaning through thinking before, during, and after reading the text, as well as integrating the information presented in the text with one's prior knowledge. She also stated that texts of different types, difficulty levels, and topics are available to encourage people to read and thereby gain new knowledge and strengthen their reading skills. As people go through this process, they realize that in order to understand the texts they read as fully as possible, they have to use thinking skills (Haji Maibodi, 2014); this thinking takes different forms, and applying each form requires skills that are called critical thinking skills. Haji Maibodi (2014) also emphasized that practical teaching of reading comprehension requires developing the ability to understand, analyze, and recognize the various components of a text.

Reading is viewed as the most crucial academic skill for foreign language learners, one that can facilitate their professional progression, social success, and personal development. Berardo (2006) defines reading as a dynamic and complex phenomenon and considers it a source of language input: as a receptive skill, it requires interaction among the author of the text, his or her message, and the reader in order for the text to be comprehended. Therefore, in order to read, comprehend, and respond to written content, the reader is expected to have certain skills and abilities, including reading to grasp the message of each line and paragraph, reading to find the relationships between paragraphs, understanding the author's basic message, and finding the most appropriate response to the writer's idea (Berardo, 2006). According to Berardo (2006), these stages of reading require readers to apply a higher order of thinking, which Bloom (1956) called "critical reading." According to Hall and Piazza (2008), critical reading remains one of the skills that helps learners succeed in academic courses, yet it is still vague to many teachers, who usually fail to develop it in their students. They argue that if students lack the skill to analyze and challenge written content in the classroom, they will face many problems in understanding and questioning their living environment and society.

Wallace (2003) and Sweet (1993) view the critical reader as an active reader who is able to ask questions; to recognize, analyze, and confirm evidence; to detect the truth; to understand tone, bias, and persuasion; and to judge these throughout the reading process. Khonamri and Karimabadi (2015) state that effective reading requires readers to read with a critical eye, that is, to read and evaluate a text for its intentions and the reasons behind it, which is the ability to think critically.

Critical reading, as a key player in the development of core language skills, involves activities such as reasoning, questioning, evaluation, comparison, and inference (e.g., Adalı, 2010; Adalı, 2011; Söylemez, 2015). Regarding critical reading, Nemat Tabrizi and Akhavan Saber (2016) emphasized that this skill plays an important role in the formation of democratic societies, since it leads people to decide what they accept as reality only after reviewing, analyzing, and comparing the presented content with the knowledge and values of their internal and external worlds.

Instrument validation

Measurement validation, in the eyes of Zumbo (2005), is a continuous process in which evidence is collected to support the appropriateness, significance, and usefulness of the inferences derived from scores obtained from a sample. He also emphasizes that the method and process of validation are important in constructing and evaluating instruments in the social, behavioral, and health sciences and the humanities, since without this process any conclusions or inferences drawn from the obtained scores are meaningless.

Many have argued that, in the contemporary view, the main purpose is to extend the conceptual framework and power of the traditional vision of validity (Johnson & Plake, 1998; Kane, 2001; Messick, 1989; Messick, 1995), according to which validity is no longer a characteristic of measuring tools but a characteristic of the inferences made from scores, examined on a continuum rather than as a valid/invalid dichotomy. In this view, construct validity is the single most important feature in validation, and there are only different sources of evidence for supporting the validity of inferences. Zumbo (2005) stated that calculating validity with statistical methods such as correlation alone is not acceptable; a detailed theory and support for it are required, including analysis of the covariance matrices between experimental data and the covariance structure model. From previous research, two categories of models are introduced as key for validation: confirmatory factor analysis (CFA), which has a lengthy and rich history in research (e.g., Byrne, 1998; Byrne, 2001; Kaplan, 2000), and Multiple Indicators Multiple Causes (MIMIC) models, which have been generalized to linear structural equation models by integrating structural equation modeling and item response theory (Ullman, 2001). The multidimensional and hierarchical representation of the skills needed for critical thinking at each level is primarily based on theoretical reasoning (Davies, 2015; Davies & Barnett, 2015; Frykholm, 2020; Ledman, 2019; Shpeizer, 2018), as mentioned in the previous paragraphs.
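To make the CFA setup concrete, the sketch below shows how a first-order, six-factor measurement model of this general kind could be specified in Python with the semopy package, which follows the covariance-based tradition discussed here. It is an illustration only: the factor and item names are placeholders rather than the actual questionnaire items, the data are simulated, and the present study ran its own analyses in SmartPLS rather than with this library.

```python
import numpy as np
import pandas as pd
import semopy

# Hypothetical six-factor CFA specification mirroring the subscale structure;
# factor and item names are placeholders, not the CTI's actual items.
desc = """
Argumentation   =~ arg1 + arg2 + arg3
Judgment        =~ jud1 + jud2 + jud3
Disposition     =~ dis1 + dis2 + dis3
Criticality     =~ cri1 + cri2 + cri3
SocialCognition =~ soc1 + soc2 + soc3
Creativity      =~ cre1 + cre2 + cre3
"""

# Simulated Likert-style responses standing in for real questionnaire data.
rng = np.random.default_rng(0)
cols = [f"{f}{i}" for f in ("arg", "jud", "dis", "cri", "soc", "cre") for i in (1, 2, 3)]
data = pd.DataFrame(rng.integers(1, 6, size=(89, len(cols))), columns=cols)

model = semopy.Model(desc)
model.fit(data)                   # maximum likelihood estimation by default
print(model.inspect())            # loadings, standard errors, z- and p-values
print(semopy.calc_stats(model))   # chi-square, CFI, TLI, GFI, AGFI, RMSEA, ...
```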

Accordingly, this study attempted to adapt and establish the validity of the questionnaire proposed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) for measuring the criteria introduced in the XY plane of Fig. 1 (see Appendix A for the validated version). A review of previous studies showed that earlier research examined only individual skills and the various subskills within that area; none of the studies provided a comprehensive scale consisting of both individual and socio-cultural factors, or the validation of a common scale for measuring that set of factors. Accordingly, the present study assessed the three-level scale of critical thinking and validated the proposed model. A measurement and structural model is proposed according to the previous literature and the factor analysis method. This research is innovative in that it uses the partial least squares method (PLS-SEM) and multiple software packages to validate the proposed model. PLS relies on a series of consecutive ordinary least squares (OLS) regressions and therefore does not require a normal distribution of observations; this reliance on OLS also makes the partial least squares method compatible with small samples and hence suitable for the conditions of this research (Farahani, 2010). Furthermore, given that PLS assumes all blocks are linear combinations of their indicators, common problems such as improper solutions and factor indeterminacy that occur in covariance-based structural equation modeling (CB-SEM) techniques do not arise (Pirouz, 2006). The researchers aimed to answer the following question:

RQ. To what extent is the newly developed rating description a valid measure of critical thinkers' reading ability?

Methodology

In this study, an attempt was made to validate the three-dimensional critical thinking instrument developed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers’ Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach,  under review ) to assess critical thinking in English as a Foreign Language (EFL) readers (Tables 2 , 3 , 4 , 5 , and 6 ).

Participants

In order to answer the research question, 89 Iranian EFL undergraduate students (aged 18 to 35) were selected for the development and validation of a reading skill-oriented instrument for measuring critical thinkers. The participants were members of intact classes (with the aim of involving individuals with diverse abilities), and the homogeneity of the classes was assessed via the Preliminary English Test (PET scores above 147). Because the participants cooperated with the researchers during different phases of the study, the implementation steps were introduced to them, ethical approval was obtained, participants were assured that personal opinions would not be disclosed to third parties, and the final results were communicated to them.

Instruments

Critical thinking inventory: The CTI, developed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review), contains 59 five-point Likert-type items measuring the factors of argumentation (15 items), judgment (5 items), disposition (9 items), criticality (12 items), social cognition (9 items), and creativity (9 items), with a nominal administration time of 50 min. The minimum possible score on the questionnaire is 59 and the maximum is 295, and the participants were asked to respond within 60 min. The CR and AVE were reported in that work as 0.97 and 0.687, respectively (see Table 7).

Preliminary English Test (PET): This test was used to select groups of participants with similar language proficiency. It is an official intermediate English language test (designed by Cambridge ESOL Examinations) with a maximum achievable score of 170. The test includes sections on reading (five parts, thirty-five items, scoring range 0–35), writing (three parts, seven items, scoring range 0–15), listening (four parts, twenty-five items, scoring range 0–25), and speaking (four parts of face-to-face interview questions, scoring range 0–25). Two raters were asked to score the test to ensure interrater consistency. The intra-class correlation coefficient (ICC) was computed to determine whether the raters' judgments agreed. A high degree of reliability was found between the scores (F(88, 88) = 146.08, p < .001), with an average-measures ICC of .982.
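One way to reproduce this kind of interrater check is with the intraclass_corr function of the Python pingouin package, as in the minimal sketch below; the scores are simulated placeholders rather than the study's PET data.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Simulated PET scores for 89 students rated by two raters (placeholder data).
rng = np.random.default_rng(0)
true_ability = rng.normal(150, 8, 89)        # simulated "true" PET levels
r1 = true_ability + rng.normal(0, 2, 89)     # rater 1 scores with noise
r2 = true_ability + rng.normal(0, 2, 89)     # rater 2 scores with noise

scores = pd.DataFrame({
    "student": np.tile(np.arange(89), 2),
    "rater":   np.repeat(["rater1", "rater2"], 89),
    "score":   np.concatenate([r1, r2]),
})

# Long-format data: one row per (student, rater) pair.
icc = pg.intraclass_corr(data=scores, targets="student",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "F", "pval", "CI95%"]])
```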

Initially, written informed consent was obtained from the participants. Then, the PET was used to ensure the homogeneity of the participants, and those with similar performance were selected for this study. Next, participants were asked to respond to the questions used to assess the CTI's validity. After the data were collected, the relationships between the elements, skills, and concepts introduced in the questionnaire (see Table 1) were assessed. For this purpose, the validity of the model was tested through the CFA method of evaluating and comparing alternative models: CFA of the measurement model (first-order model) and CFA of the structural model (second-order model). In order to increase statistical power, the researchers used predictor variables (i.e., AWC, QAR, and classic instruction), considered fewer operating levels for continuous variables, used continuous variables instead of polarizing or grouping them, defined focused hypothesis tests, crossed the extracted factors, and so on, as described in Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review). Scale validation in this study used the PLS-SEM analysis technique, owing to the non-normal distribution of the gathered data (Afthanorhan, 2013), and model validation included the following tests:

Analysis of the convergent validity

Test of discriminant validity

Test of construct validity.

Data analysis

After the data from the designed inventory were collected in SPSS, the data related to the validity of the questionnaire were transferred to SmartPLS software to validate the proposed model through model validation techniques (FIT, GFI, RMR, etc.), SEM, and CR and AVE estimation.

In order to find the answer to the research question, a CFA-based approach was used as an MTMM technique to estimate the validity of the designed instrument (Bentler, 2000 ; Byrne, 2006 ; Koch et al., 2018 ). For this purpose, different types of validity of the developed inventory were evaluated.

Internal validity

Face validity: Face validity depends on the judgment of the test constructor and was approved according to the advisor's opinion.

Content validity: The various aspects of the construct are examined. Content validity was confirmed by the advisor.

Criterion-related validity (both concurrent and predictive validity): To appraise predictive validity, the instrument should be evaluated over a long period, for example once at the beginning of the undergraduate course and again at the end of the fourth year, comparing its predicted results with the actual results. To measure concurrent validity, it would be necessary to examine the tool with completely different content and with a completely different group of learners at the same time.

Construct validity: This category focuses on the structure of the questionnaire. SmartPLS software was used to measure the next three criteria.

Convergent validity: Estimation of CR and AVE

Discriminant (divergent) validity: Confirmatory factor analysis (t value)

Construct validity: Model validation (SRMR)

In examining the introduced validity criteria, the results of (a) checking the adequacy of the factor loadings, (b) investigating the structural equation model, and (c) estimating goodness of fit were examined as follows:

First, in order to investigate the effect of the items and factor loadings in measuring the intended construct, the model parameters and their significance were calculated (Fig. 2).

Fig. 2. Measurement model test

It can be observed that all factor loadings exceed 0.4 and are significant. Therefore, the studied items have a significant effect on the measurement of the construct (Table 2).
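As a small illustration, the Python sketch below applies this retention rule (a loading above .40 with a significant t value) to a hypothetical parameter table; the item names and values are placeholders, not those of Table 2.

```python
import pandas as pd

# Hypothetical parameter table (placeholder items and values) mimicking the
# structure of Table 2: one row per item with its loading, t value, and p value.
params = pd.DataFrame({
    "item":    ["arg1", "arg2", "jud1", "dis1"],
    "loading": [0.71, 0.46, 0.83, 0.39],
    "t":       [9.4, 3.1, 12.7, 1.7],
    "p":       [0.000, 0.002, 0.000, 0.089],
})

# Retention rule described in the text: loading > .40 and a significant t value (> 1.96).
params["retained"] = (params["loading"] > 0.40) & (params["t"] > 1.96)
print(params)
```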

The model parameter table shows that the p values are less than .001 and the t values are greater than 1.96, which represents good values. In the following table, the fit measures of the overall hypothesized model (i.e., the goodness-of-fit indicators) are calculated (Table 3).

According to the results, both the GFI and AGFI values are greater than 0.80, the RMR values are close to .00, the χ²/df ratios are less than 5, and the RMSEA estimates are less than 0.08, indicating reasonable approximation error in the population. Therefore, all indicators are in the desired range, so the results of the model can, in general, be trusted and used. It should be noted that variables with fewer than three items cannot be fitted, and accurate calculation of their indicators is not possible. In the following, the results of the detailed analysis of the model and the determination of the validity indicators are presented.
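As a minimal illustration of how such cut-offs can be screened programmatically, the Python sketch below checks a set of hypothetical fit values (not the study's actual estimates) against the thresholds just listed.

```python
# Hypothetical fit values (placeholders, not the study's estimates) screened
# against the cut-offs used in the text.
fit = {"GFI": 0.88, "AGFI": 0.84, "RMR": 0.04, "chi2": 310.0, "df": 194, "RMSEA": 0.062}

checks = {
    "GFI > .80":   fit["GFI"] > 0.80,
    "AGFI > .80":  fit["AGFI"] > 0.80,
    "RMR near 0":  fit["RMR"] < 0.05,
    "chi2/df < 5": fit["chi2"] / fit["df"] < 5,
    "RMSEA < .08": fit["RMSEA"] < 0.08,
}
for rule, passed in checks.items():
    print(f"{rule}: {'pass' if passed else 'fail'}")
```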

Next, the data analysis algorithm in SmartPLS software is presented. In this algorithm, after model formation and confirmatory factor analysis, the structural model is examined in three areas:

Measurement model test: To evaluate the validity and reliability of each construct, the AVE (average variance extracted) and CR (composite reliability) are calculated, respectively (Table 4).

According to the results, the validity criterion exceeds 0.4 and the reliability criterion for each construct is close to 0.7, so it can be said that, in terms of the convergent validity criteria, all constructs are in the desired range (Fig. 3).

Structural equation modeling: The results of the confirmatory factor analysis of the model indicated the following:

Fig. 3. Structural equation modeling results

It can be seen that all items have a significant effect (p < 0.001) on their construct. This shows that the items related to each construct measure the intended construct well (Table 5).

The estimated model parameters show that the p values are lower than .001 and the t values are greater than 1.96, meaning that each path is significant at the .05 level and that its estimated path parameter has a significant effect on the construct (Ullman, 2001). This shows that the items related to each variable measure the intended construct well.

Goodness of fit: For the confirmatory factor analysis (CFA), used as an MTMM technique to assess the divergent validity of the model, the goodness-of-fit indices were estimated as follows (Table 6):

According to the obtained indicators, the AGFI is greater than 0.80, the χ²/df ratio is less than 3, the RMSEA value is less than .08, and the CFI is greater than .95, which indicates a satisfactory fit. All in all, it can be concluded that the indicators are in the desired range and the results of the model are reliable. Finally, the results of the confirmatory factor analysis confirm the relationships and structure of the model, supporting the validity and reliability of the construct (Table 7):

Investigation of the significance of the covariance relations also shows that all covariance relationships between constructs have a p value below the 0.05 error level, so the relationships are significant. The advantage of composite reliability over Cronbach's alpha is that the reliability of constructs is not computed in isolation; rather, it is obtained by evaluating the correlations of the constructs with one another, and indicators with higher factor loadings carry more weight. Therefore, both criteria are used to better measure the reliability of this type of model. Moreover, the common measure for establishing convergent validity at the construct level is the average variance extracted (AVE), defined as the share of indicator variance captured by a construct. An acceptable value for CR is above .70, and an excellent value for AVE is above .50.
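The following Python sketch shows how CR and AVE are conventionally computed from standardized loadings under these definitions; the loadings shown are placeholders, not values from Table 4 or Table 7.

```python
import numpy as np

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]."""
    lam = np.asarray(loadings, dtype=float)
    theta = 1.0 - lam**2            # error variance of each standardized indicator
    return lam.sum()**2 / (lam.sum()**2 + theta.sum())

def average_variance_extracted(loadings):
    """AVE = mean of squared standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return (lam**2).mean()

# Hypothetical standardized loadings for one construct (placeholders only).
loadings = [0.72, 0.68, 0.81, 0.59]
print(f"CR  = {composite_reliability(loadings):.3f}")       # acceptable above .70
print(f"AVE = {average_variance_extracted(loadings):.3f}")  # excellent above .50
```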

Considering that the second generation of structural equation modeling is based on the variance approach, and in order to verify the covariance values and provide a complete report, the covariance relationships in this model were also examined and the results were reported (Table 8).

As the results show, all covariance relationships between constructs have a p value below the 0.05 error level and a t value greater than 1.96, meaning that the relationships between the latent variables are significant.

Campbell and Fiske (1959) and Langer et al. (2010) described CFA as an analysis for establishing construct validity. Putting the results observed in steps 2 and 3 together, it can be concluded that the absolute fit indices, parsimony fit indices, and incremental fit indices all have desirable values in the model and that the theoretical model is consistent with its empirical counterpart; therefore, the divergent validity of the construct is confirmed. The results of calculating the reliability of the inventory were presented in the "Instruments" section. Therefore, combining the results of the covariance analysis and the three-level analyses, the questionnaire can be seen to be valid and reliable.

Since there is little agreement on the nature and dimensions of the term critical thinking (Facione et al., 2000; Frykholm, 2020), the researchers of this study set out to provide a comprehensive picture of its various dimensions and to develop a valid tool for its measurement. Frykholm (2020) holds that no educator has proposed a comprehensive definition and model of critical thinking, and most previous studies have focused on only a few limited critical thinking skills. However, the results of the interviews in the first phase of this study (Mohammadi et al., Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) clearly showed that the socio-cultural dimensions are at least as important as the individual skills dimension. By relating the model proposed in the present study to the models of Davies (2015) and Ledman (2019), it can be inferred that the comprehensive model is well suited to the set of skills, judgments, and activities (especially for the investigating and questioning tasks of receptive skills) as well as to the expression of desires or attitudes (expressing ideas, creativity, analysis, and other productive skills). In review, the main objectives of this study were to investigate the validity of the items and components of the model and the validity of the tool designed by Mohammadi et al. (under review) to assess three-dimensional critical thinking in EFL readers, on the basis of which the following results were identified.

Examining the values obtained from the data, it was observed that the data distribution was not normal. Therefore, to assess the validity of the tool, confirmatory factor analysis (CFA) and PLS-SEM were used in SmartPLS software, because this method is suitable for non-normal data (Hair et al., 2014) and makes it possible to examine complex models with multiple exogenous and endogenous constructs as well as indicator variables (Hair Jr. et al., 2010; Hair et al., 2014; Hair et al., 2019). The examination of the structural equation model, the covariance relationships, and the model evaluation indices clearly showed that the components were selected correctly, the relationships between the components of the model were defined properly, and the questionnaire items were well designed; in this way, the study achieved its objectives.

The six-factor, twenty-two-item scale proposed by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) was validated using a hybrid technique, mainly because the data were not normally distributed. The results indicated that the PLS-SEM CFA provided the best fit for the proposed model in terms of factor loadings. The findings of the first phase of this study indicated valid relationships among the factors introduced in the three-level model of critical thinking. From the results obtained in this phase, it can be seen that focusing on all of the introduced skills and abilities (i.e., argumentation, judgment, disposition, action, social cognition, and creativity) is important in developing critical thinking in English readers.

Discussing the elements of the first movement, a comparison of the criteria introduced in this study with those mentioned in Kaewpet (2018) shows that the same measures were mentioned by the EFL learners. Focusing on the judgment factors, the elements of buck-passing and vigilance were extracted, which were also mentioned by Mann et al. (1997). They additionally referred to hypervigilance and defensive avoidance, which were not mentioned by the EFL learners. The last skill of the first movement was disposition, which was assessed on the basis of innovativeness, maturity, and engagement as introduced by Ricketts (2003).

The second movement of developing critical thinking concerned criticality, which learners described in terms of habitual action, understanding, reflection, and critical reflection. These factors were also used by Kember et al. (2000). The findings of this section, contrary to the view of Shpeizer (2018), in which the concepts introduced in the first and second movements were treated as essentially the same, clearly showed that the second movement involves the development of critical actions (and the introduced sub-actions) in individuals, whereas the first movement does not focus on developing action skills. The findings also confirm the views of Wilson and Howitt (2018), who held that critical thinking in this movement is self-centered and manifests itself in the form of introspection, self-adjustment, and metacognition. The set of abilities acquired at this stage makes a person a successful learner, specialist, and scholar, whereas the first movement focuses on applying rational-argumentative thinking in the form of training methods aimed at improving exactness, proficiency, and creativeness in individuals. Similarly, Ledman (2019) regards this dimension as disciplinary criticality, in which thinking tools and habits of mind promote epistemological structures.

The third movement in this study, namely the critical pedagogy movement, was composed of the two layers of social cognition and creativity. The first layer was assessed on the basis of factors such as social competence, literacy, cultural competence, and extraversion. The findings of this section are very similar to the criteria of Pishghadam et al. (2011), in which social competence, social solidarity, literacy, cultural competence, and extraversion were introduced as basic criteria for measuring social cognition. These findings, however, contrast with the criteria introduced by Pishvaei and Kasaian (2013), which include tenets of monolingualism, monoculturalism, native-speakerism, the native teacher, native-like pronunciation, and the authenticity of natively designed materials. Reasons for such a difference may include the nature of the classes, the objectives of the courses, and the interlocutors/participants. These findings are consistent with the work of Davies (2015) and Davies and Barnett (2015), who predicted that critical thinking is not limited to individual critical thinking skills and that other dimensions, such as socio-cultural dimensions and critical pedagogy, should also be considered. The last layer was creativity, which was assessed on the basis of fluency, flexibility, originality, and elaboration, factors also mentioned by O'Neil et al. (1992) and Abedi (2002).

Discussing this movement, the elements introduced for this dimension confirmed the orientations taken by Davies (2015), Davies and Barnett (2015), Rahimi and Asadi Sajed (2014), and Shpeizer (2018), according to which critical pedagogy has an impact on critical thinking. According to Shpeizer (2018), the fundamental difference between the school of critical thinking discourse and that of critical pedagogy lies in the contrast between the sociocultural, political, and moral tendencies of the latter and the apparently neutral tendencies of the school of critical thinking. According to Shpeizer (2018) and Freire (1993), in critical pedagogy it is not possible to separate epistemology from politics, and if a critical approach is taken, people's awareness of power relations and the structural inequalities of societies will be aroused. Shpeizer (2018) adds that advocates of critical thinking believe this approach is incompatible, inconsistent, and hazardous, since it starts from uncertain assumptions about a society and thus diverts us from the path of truth-seeking and enlightenment required of a critical thinker; perhaps the main reason for the slow and one-dimensional movement of critical thinking over the years can be found in this point. According to Shpeizer (2018) and Rahimi and Asadi Sajed (2014), the proponents of developing critical pedagogy argue that, since social, political, and educational structures in different societies have hitherto run in an inequitable and oppressive manner, disregarding such conditions (which undoubtedly shape the lives and thoughts of individuals) makes genuine critical development, and consequently the progress of community members and communities, impossible. They emphasized that to develop critical pedagogy, it is not possible to teach rational and critical thinking skills and dispositions without regard to other dimensions such as cultural, political, and religious awareness. The findings are also in line with Ledman (2019), who states that moral education (the name chosen for the third dimension) emphasizes the need to develop the capacity for moral thinking and judgment independent of official orders and requirements. Finally, by matching the findings of this study with those of Davies (2015) and Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review), it can be concluded that critical thinking can be defined in three complementary layers: critical thinking skills, criticality, and critical pedagogy. The more one strives to become more capable of thinking critically, the more he or she moves from gaining initial personal skills (critical thinker) to socio-cultural skills (critical being).

Regarding the methodology of the study, as explained, because the distribution of the data obtained from the questionnaire was not normal, the PLS-SEM method was used as a confirmatory factor analysis (CFA) technique. The validation of the model used in this study is based on the theoretical and experimental concepts developed in the previous study (Mohammadi et al., Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review). The model validity test was performed within the SEM approach to CFA, in which the first-order model (the measurement model) and the second-order model (the structural model) were investigated. Examination of the absolute values of skewness and kurtosis, as well as the data distribution, showed that the distribution was not normal; therefore, PLS-SEM confirmatory factor analysis was performed to determine the structural validity of the scale (Mâtă et al., 2020). In addition, this modeling approach is suitable for complex models with multiple endogenous and exogenous constructs and indicator variables (Hair et al., 2014). Also, because the sample size in this study exceeds the minimum recommended value (i.e., 50), it was considered the most appropriate method for model analysis (Mâtă et al., 2020).
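A minimal Python sketch of this kind of skewness and kurtosis screening is given below; it uses simulated placeholder responses rather than the CTI data and applies a common rule-of-thumb cut-off that the study itself does not specify.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated placeholder responses standing in for the CTI item data.
rng = np.random.default_rng(1)
items = pd.DataFrame(rng.integers(1, 6, size=(89, 10)),
                     columns=[f"item{i}" for i in range(1, 11)])

screen = pd.DataFrame({
    "skewness": items.apply(stats.skew),
    "kurtosis": items.apply(stats.kurtosis),  # excess kurtosis (0 for a normal distribution)
})
# Flag items whose skewness or kurtosis magnitude exceeds a common |1| rule of thumb;
# the study reports only that the overall distribution was judged non-normal.
screen["non_normal"] = screen.abs().gt(1).any(axis=1)
print(screen)
```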

The results of this study provide the following implications. The study investigated a framework for assessing EFL readers who think critically along the three main streams of individual skills, critical pedagogy, and criticality, and the results showed that in each of these streams there are criteria that can be used to assess learners' abilities. Students were interviewed in different phases of the study and offered a variety of views, not only on their attitudes toward critical thinking but also on their perceptions of the teaching instructions and the strengths and weaknesses of each, which can inform the design and implementation of critical thinking training sessions. A review of the previous literature on three-dimensional critical thinking provided a comprehensive overview of its strengths and weaknesses, as well as of its supporters and opponents, and the findings of this study offer a genuine validation of the studies supporting the views of all those who agree with a three-dimensional approach to critical thinking under any heading. Using the presented concepts in research and academic institutions to identify the most suitable training methods for each critical thinking sub-skill in different societies would be very helpful. Given that this study was conducted only in the field of English language and in a university context, applying it in other educational settings and with people from different academic backgrounds, and identifying differences in the application of various instructions for each sub-skill, would be very informative. Both concepts (i.e., three-dimensional critical thinking and reading) can also be applied in other areas to assess the generalizability of the findings. An interesting finding was that students engaged in group discussions sometimes returned to their first language, which could be a consequence of limited language proficiency. In such circumstances, Lun et al. (2010) have suggested reducing the emphasis on language processing in order to promote critical thinking, or, following Ko (2013), teachers should first describe the task in order to prepare students and initiate the analysis, and then ask them to complete it. The validity of the criteria proposed in the previous study (Mohammadi et al., Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) was evaluated through structural equation modeling, a relatively new method with a very limited history in language studies. This study showed that the method can be used for path analysis/regression, repeated measures analysis/latent change modeling, and confirmatory factor analysis.

This study was designed and conducted to confirm the subscales introduced by Mohammadi et al. (Characterization and Development of Critically-thinker EFL Readers' Reading Ability: Asynchronous Webbased Collaborative vs. Question-Answer-Relationship Instructional Approach, under review) for determining critical thinking ability in three different layers (i.e., individual critical thinking skills, criticality, and critical pedagogy) through assessing the validity of the proposed questionnaire. The model examined here confirmed the relationships between the factors identified in previous studies, and the proposed model, with six scales and twenty-two subscales, showed a good fit, indicating that argumentation, judgment, disposition, action, social cognition, and creativity are proper components for measuring three-level critical thinking in language learners.

The results of assessing the validity of the CTI through CFA showed that the absolute fit indices, parsimony fit indices, and incremental fit indices all have desirable values in the model and that the proposed model is consistent with its empirical counterpart, meaning that the divergent validity of the construct is confirmed. Therefore, combining the results of the covariance analysis, the three-level analyses, and the reliability calculations, the questionnaire can be considered valid and reliable. This means that a critically thinking EFL reader is an individual with the ability to make arguments (i.e., to find relevance, provide reasoning, recognize language use, comprehend the text's organization, and distinguish the author's voice), to make judgments (i.e., buck-passing and vigilance), to show dispositions (i.e., to innovate, be mature, and engage in activities), to act (i.e., to form habitual actions, to understand, to be reflective, and to reflect critically on issues), to possess social cognition (i.e., social competence, literacy, cultural competence, and extraversion), and to be creative (i.e., to elaborate, be flexible, show fluency, and propose original ideas).

Future research can examine the extent and manner of internalization of the introduced skills and the effectiveness of different internalization methods. In addition, it should be noted that this study examined the views of language learners; the introduced criteria should also be examined from the point of view of teachers and administrators in order to answer questions such as the following: Are teachers' perceptions different from students'? If so, what are the differences? What are the effective strategies for teaching these criteria? Such research can also determine whether students, teachers, and planners share the same understanding of the concepts and of the strategies used in the classroom, and whether their understanding of the criteria introduced in the first language is the same as in the second language. Moreover, owing to the distribution of the gathered data in this study, the factor analysis method with the partial least squares (PLS) approach was used. Subsequent researchers can use other analysis programs, such as LISREL or AMOS, for structural analysis based on larger populations.

Finally, it is necessary to mention that generalizing the results of this study to other fields and research communities is not possible owing to the limited number of participants and the specific field studied; it is recommended that the necessary research first be carried out to apply this scale in different educational fields and societies in order to achieve greater strength and generalizability.

Availability of data and materials

The datasets generated and analyzed during the current study are not publicly available due to ethical issues and privacy restrictions, but are available from the corresponding author on reasonable request.

Abbreviations

3D: Three dimensional

AGFI: Adjusted Goodness of Fit Index

APA: American Philosophical Association

Argumentation

AVE: Average variance extracted

CB-SEM: Covariance-based structural equation modeling

CFA: Confirmatory factor analysis

CFI: Comparative Fit Index

CR: Composite reliability

CTI: Critical thinking inventory

Cultural competence

EFL: English as a Foreign Language

GFI: Goodness of Fit Index

MIMIC: Multiple indicators multiple causes

OLS: Ordinary least squares

PET: Preliminary English Test

PLS-SEM CFA: Partial Least Squares-Structural Equation Modeling Confirmatory Factor Analysis

Reflective thinking

RMR: Root mean squared residual

RMSEA: Root mean square error of approximation

SEM: Structural equation modeling

Social cognition

Social competence

SPSS: Statistical Package for the Social Sciences

SRMR: Standardized root mean squared residual

TLI: Tucker-Lewis Index

Abedi, J. (2002). Standardized achievement tests and English language learners: psychometrics issues. Educational Assessment , 8 , 231–257. https://doi.org/10.1207/S15326977EA0803_02 .

Adalı, O. (2010). Interactive and critical reading techniques . Toroslu Library.

Adalı, O. (2011). Understand and tell . Pan Publishing.

Afthanorhan, W. M. A. (2013). A comparison of partial least square structural equation modeling (PLS-SEM) and covariance based structural equation modeling (CB-SEM) for confirmatory factor analysis. International Journal of Engineering Science and Innovative Technology , 2 , 198–205.

Airasian, P. W., Anderson, L. W., Krathwohl, D. R., & Bloom, B. S. (2001). A taxonomy for learning, teaching, and assessing: a revision of Bloom’s taxonomy of educational objectives . Longman.

Akyuz, H. I., & Samsa, S. (2009). The effects of blended learning environment on the critical thinking skills of students. Procedia Social and Behavioral Sciences , 1 , 1744–1748.

Bag, H. K., & Gürsoy, E. (2021). The effect of critical thinking embedded English course design to the improvement of critical thinking skills of secondary school learners. Thinking Skills and Creativity , 41 . https://doi.org/10.1016/j.tsc.2021.100910 .

Bali, M. (2015). Critical thinking through a multicultural lens: cultural challenges of teaching critical thinking. In M. Davies, & R. Barnett (Eds.), The Palgrave handbook of critical thinking in higher education , (pp. 317–334). Palgrave Macmillan.

Bentler, P. M. (2000). Rites, wrongs, and gold in model testing. Structural Equation Modeling , 7 , 82–91. https://doi.org/10.1207/S15328007SEM0701_04 .

Berardo, S. (2006). The use of authentic materials in the teaching of reading. The Reading Matrix , 6 (2), 60–69.

Birjandi, P., Bagheri, M. B., & Maftoon, P. (2018). The place of critical thinking in Iranian Educational system. Foreign Language Research Journal , 7 (2), 299–324. https://doi.org/10.22059/JFLR.2017.236677.353 .

Bloom, B. S. (1956). Taxonomy of educational objectives, handbook 1: Cognitive domain . Longmans Green.

Brookfield, S. (2019). Using discussion to foster critical thinking. In D. Jahn, A. Kergel, & B. Heidkamp-Kergel (Eds.), Kritische Hpschullehre. Diversitat und bildung im digitalen zeitalter , (pp. 135–151). Springer. https://doi.org/10.1007/978-3-658-25740-8_7 .

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS and SIMPLIS: basic concepts, applications and programming . Lawrence Erlbaum.

Byrne, B. M. (2001). Structural equation modeling with AMOS: basic concepts, applications, and programming . Lawrence Erlbaum Associates Publishers.

Byrne, B. M. (2006). Structural equation modeling with eqs: basic concepts, applications, and programming , (2nd ed., ). Lawrence Erlbaum Associates Publishers.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin , 56 , 81–105.

Chamot, A. U. (2004). Issues in language learning strategy research and teaching. Electronic Journal of Foreign Language Teaching , 1 , 14–26.

Crenshaw, P., Hale, E., & Harper, S. L. (2011). Producing intellectual labor in the classroom: the utilization of a critical thinking model to help students take command of their thinking. Journal of College Teaching & Learning , 8 (7), 13–26. https://doi.org/10.19030/tlc.v8i7.4848 .

Davies, M. (2015). A model of critical thinking in higher education. In M. B. Paulsen (Ed.), Higher Education: Handbook of Theory and Research , (vol. 30, pp. 41–92). Springer International Publishing.

Davies, M., & Barnett, R. (2015). Introduction. In M. Davies, & R. Barnett (Eds.), The Palgrave handbook of critical thinking in higher education , (pp. 1–26). Palgrave Macmillan.

Dekker, T. J. (2020). Teaching critical thinking through engagement with multiplicity. Thinking Skills and Creativity , 37 . https://doi.org/10.1016/j.tsc.2020.100701 .

Din, M. (2020). Evaluating university students’ critical thinking ability as reflected in their critical reading skill: a study at bachelor level in Pakistan. Thinking Skills and Creativity , 35 , 1–11. https://doi.org/10.1016/j.tsc.2020.100627 .

Dornyei, Z. (2010). Researching motivation: from integrativeness to the ideal L2 self. Introducing applied linguistics. Concepts and skills , 3 (5), 74–83.

Eftekhary, A. A., & Besharati Kalayeh, K. (2014). The relationship between critical thinking and extensive reading on Iranian intermediate EFL learners. Journal of Novel Applied Sciences , 3 (6), 623–628.

Ennis, R. (1991). Critical thinking: a streamlined conception. Teaching Philosophy , 14 (1), 5–24.

Ennis, R. H. (2011). Critical thinking: reflection and perspective, Part I. Inquiry: Critical Thinking across the Disciplines , 26 (1), 4–18. https://doi.org/10.5840/inquiryctnews20112613 .

Facione, P. A. (1990). Executive summary of critical thinking: a statement of expert consensus for purposes of educational assessment and instruction . The California Academic Press.

Facione, P. A. (1992). Critical thinking: what it is and why it counts . Insight Assessment and the California Academic Press Retrieved May 2019 from http://www.student.uwa.edu.au .

Facione, P. A., Facione, N. C., & Giancarlo, C. A. (2000). The disposition toward critical thinking: its measurement, and relationship to critical thinking skill. Informal Logic , 20 (1), 61–84. https://doi.org/10.22329/il.v20i1.2254 .

Fahim, M., & Ahmadian, M. (2012). Critical thinking and Iranian EFL context. Journal of Language Teaching and Research , 3 (4), 793–800. https://doi.org/10.4304/jltr.3.4.793-800 .

Farahani, H. A. (2010). A comparison of partial least squares (PLS) and ordinary least squares (OLS) regressions in predicting of couples’ mental health based on their communicational patterns. Procedia Social and Behavioral Sciences , 5 , 1459–1463. https://doi.org/10.1016/j.sbspro.2010.07.308 .

Freire, P. (1993). Pedagogy of the oppressed . Continuum.

Frykholm, J. (2020). Critical thinking and the humanities: a case study of conceptualizations and teaching practices at the Section for Cinema Studies at Stockholm University. Arts and Humanities in Higher Education , 20 (3), 253–273. https://doi.org/10.1177/1474022220948798 .

Gear, A. (2006). Reading power: teaching students to think while they read . Pembroke.

Hair, J., Hult, G. T. M., Ringle, C., & Sarstedt, M. (2014). A primer on partial least squares structural equation modeling (PLS-SEM) . SAGE Publications.

Hair, J. F., Risher, J. J., Sarstedt, M., & Ringle, C. M. (2019). When to use and how to report the results of PLS-SEM. European Business Review , 31 (1), 2–24. https://doi.org/10.1108/EBR-11-2018-0203 .

Hair Jr., J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis , (7th ed., ). Pearson Prentice-Hall.

Haji Maibodi, A. (2014). The effect of critical thinking skills on reading English novels. Research in English Language Pedagogy , 2 (2), 97–108.

Haji Maibodi, A., & Fahim, M. (2012). The impact of critical thinking in EFL/ESL literacy. The Iranian EFL Journal , 8 (3), 24–44.

Hall, L., & Piazza, S. (2008). Critically reading texts: what students do and how teachers can help. Reading Teacher , 62 (1), 32–41. https://doi.org/10.1598/RT.62.1.14 .

Hawkins, K. T. (2012). Thinking and reading among college undergraduates: an examination of the relationship between critical thinking skills and voluntary reading (Unpublished Doctoral Dissertation) University of Tennessee, USA.

Heidari, K. (2020). Critical thinking and EFL learners’ performance on textually-explicit, textually-implicit, and script-based reading items. Thinking Skills and Creativity , 37 . https://doi.org/10.1016/j.tsc.2020.100703 .

Hemmati, M. R., & Azizmalayeri, F. (2022). Iranian EFL teachers’ perceptions of obstacles to implementing student-centered learning: a mixed-methods study. International Journal of Foreign Language Teaching and Research , 10 (40), 133–152. https://doi.org/10.30495/JFL.2022.686698 .

Hosseini, E., Bakhshipour Khodaei, F., Sarfallah, S., & Dolatabad, H. R. (2012). Exploring the relationship between critical thinking, reading comprehension and reading strategies of English university students. World Applied Sciences Journal , 17 (10), 1356–1364.

Imperio, A., Staarman, J. K., & Basso, D. (2020). Relevance of the socio-cultural perspective in the discussion about critical thinking. Journal of Theories and Research in Education , 15 (1), 1–19. https://doi.org/10.6092/issn.1970-2221/9882 .

Johnson, J. L., & Plake, B. S. (1998). A historical comparison of validity standards and validity practices. Educational and Psychological Measurement , 58 , 736–753. https://doi.org/10.1177/0013164498058005002 .

Kaewpet, C. (2018). Quality of argumentation models. Theory and Practice in Language Studies , 8 (9), 1105–1113. https://doi.org/10.17507/TPLS.0809.01 .

Kahsay, M. T. (2019). EFL students’ epistemological beliefs and use of cognitive and metacognitive strategies in Bahir Dar University. International Journal of Foreign Language Teaching & Research , 7 (26), 69–83.

Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement , 38 , 319–342. https://doi.org/10.1111/j.1745-3984.2001.tb01130.x .

Kaplan, D. (2000). Structural equation modeling: foundations and extensions . Sage Publications.

Kember, D., Leung, D. Y. P., Jones, A., Loke, A. Y., McKay, J., Sinclair, K., & Yeung, E. (2000). Development of a questionnaire to measure the level of reflective thinking. Assessment and Evaluation in Higher Education , 25 (4), 382–395. https://doi.org/10.1080/713611442 .

Khonamri, F., & Karimabadi, M. (2015). Collaborative strategic reading and critical reading ability of intermediate Iranian learners. Theory and Practice in Language Studies , 5 (7), 1375–1382. https://doi.org/10.17507/tpls.0507.09 .

Khorasani, M. M., & Farimani, M. A. (2010). The analysis of critical thinking in Fariman’s teachers and factors influencing it. Journal of Social Science of Ferdowsi University , 6 (1), 197–230.

Ko, M. Y. (2013). Critical literacy practices in the EFL context and the English language proficiency: further exploration. English Language Teaching , 6 (11), 17–28.

Koch, T., Eid, M., & Lochner, K. (2018). Multitrait-multimethod-analysis: the psychometric foundation of CFA-MTMM models. In P. Irwing, T. Booth, & D. J. Hughes (Eds.), The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development , (vol. VII, pp. 781–846). Wiley-Blackwell Publishing Ltd.

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: an overview. Theory into Practice , 41 , 212–218.

Langer, D. A., Wood, J. J., Bergman, R. L., & Piacentini, J. C. (2010). A multitrait–multimethod analysis of the construct validity of child anxiety disorders in a clinical sample. Child Psychiatry & Human Development , 41 , 549–561. https://doi.org/10.1007/s10578-010-0187-0 .

Ledman, K. (2019). Discourses of criticality in Nordic countries’ school subject civics. Journal of Humanities and Social Science Education, 3 , 149–167.

Li, L. (2011). Obstacles and opportunities for developing thinking through interaction in language classrooms. Thinking Skills and Creativity , 6 (3), 146–158. https://doi.org/10.1016/j.tsc.2011.05.001 .

Li, L. (2016). Thinking skills and creativity in second language education: where are we now? Thinking Skills and Creativity , 22 , 267–272. https://doi.org/10.1016/j.tsc.2016.11.005 .

Liang, W., & Fung, D. (2020). Fostering critical thinking in English-as-a-second-language classrooms: challenges and opportunities. Thinking Skills and Creativity , 2020 . https://doi.org/10.1016/j.tsc.2020.100769 .

Lin, M., Preston, A., Kharrufa, A., & Kong, Z. (2016). Making L2 learners’ reasoning skills visible: the potential of computer supported collaborative learning environment. Thinking Skills and Creativity , 22 , 303–322. https://doi.org/10.1016/j.tsc.2016.06.004 .

Liu, F., & Stapleton, P. (2014). Counterargumentation and the cultivation of critical thinking in argumentative writing: investigating washback from a high-stakes test. System , 45 , 117–128. https://doi.org/10.1016/j.system.2014.05.005 .

Lun, V. M.-C., Fischer, R., & Ward, C. (2010). Exploring cultural differences in critical thinking: is it about my thinking style or the language I speak? Learning and Individual Differences , 20 (6), 604–616. https://doi.org/10.1016/j.lindif.2010.07.001 .

Manalo, E., & Sheppard, C. (2016). How might language affect critical thinking performance? Thinking Skills and Creativity , 21 , 41–49. https://doi.org/10.1016/j.tsc.2016.05.005 .

Mann, L., Burnett, P., Radford, M., & Ford, S. (1997). The Melbourne decision making questionnaire: an instrument for measuring patterns for coping with decisional conflict. Journal of Behavioral Decision Making, 10 (1), 1–19 https://doi.org/10.1002/(SICI)1099-0771(199703)10:1<1::AID-BDM242>3.0.CO;2-X .

Mâtă, L., Clipa, O., & Tzafilkou, K. (2020). The development and validation of a scale to measure university teachers’ attitude towards ethical use of information technology for a sustainable education. Sustainability , 12 (15), 1–20. https://doi.org/10.3390/su12156268 .

McGregor, D. (2007). Developing thinking; developing learning. A guide to thinking skills in education . Open University Press.

Merrifield, W. (2018). Culture and critical thinking: exploring culturally informed reasoning processes in a Lebanese university using think-aloud protocols (Unpublished doctoral dissertation). George Fox University, College of Education.

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement , (pp. 13–103). American Council on Education and Macmillan Publishing Co., Inc.

Messick, S. (1995). Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist , 50 , 741–749. https://doi.org/10.1037/0003-066X.50.9.741 .

Moghadam, Z. B., Narafshan, M. H., & Tajadini, M. (2021). Development of a critical self in the language reading classroom: an examination of learners’ L2 self. Thinking Skills and Creativity , 42 , 1–29. https://doi.org/10.1016/j.tsc.2021.100944 .

Moon, J. (2008). Critical thinking: an exploration of theory and practice . Routledge.

Nemat Tabrizi, A. R., & Akhavan Saber, M. (2016). The effect of critical reading strategies on EFL learners’ recall and retention of collocations. International Journal of Education & Literacy Studies , 4 (4), 30–37. https://doi.org/10.7575/aiac.ijels.v.4n.4p.30 .

Nikopour, J., Amini, F. M., & Nasiri, M. (2011). On the relationship between critical thinking and language learning strategies among Iranian EFL learners. Journal of Technology & Education , 5 (3), 195–199. https://doi.org/10.22061/TEJ.2011.283 .

O’Neil, H. F., Abedi, J., & Spielberger, C. D. (1992). The measurement and teaching of creativity. In H. F. O’Neil, & M. Drillings (Eds.), Motivation: Theory and Research , (pp. 245–264). Lawrence Erlbaum.

Paul, R., & Elder, L. (2002). Critical thinking: tools for taking charge of your professional and personal life . Financial Times Prentice Hall.

Paulsen, M. B. (2015). Higher education: handbook of theory and research . Springer.

Peng, J. (2014). Willingness to communicate in the Chinese EFL university classrooms: an ecological perspective . Multilingual Matters.

Pica, T. (2000). Tradition and transition in English language teaching methodology. System , 29 , 1–18.

Pirouz, D. M. (2006). An overview of partial least squares (Unpublished doctoral dissertation). University of California.

Pishghadam, R., Noghani, M., & Zabihi, R. (2011). The construct validation of a questionnaire of social and cultural capital. English Language Teaching , 4 (4), 195–203. https://doi.org/10.5539/elt.v4n4p195 .

Pishvaei, V., & Kasaian, S. A. (2013). Design, construction, and validation of a CP attitude questionnaire in Iran. European Online Journal of Natural and Social Sciences , 2 (2), 59–74.

Rahimi, A., & Asadi Sajed, M. (2014). The interplay between critical pedagogy and critical thinking: theoretical ties and practicalities. Procedia - Social and Behavioral Sciences , 136 , 41–45. https://doi.org/10.1016/j.sbspro.2014.05.284 .

Rashel, U. M., & Kinya, S. (2021). Development and validation of a test to measure the secondary students’ critical thinking skills: a focus on environmental education in Bangladesh. International Journal of Educational Research Review , 6 (3), 264–274.

Renatovna, A. G., & Renatovna, A. S. (2021). Pedagogical and psychological conditions of preparing students for social relations on the basis of the development of critical thinking. Psychology and Educational Journal , 58 (2), 4889–4902. https://doi.org/10.17762/pae.v58i2.2886 .

Rezaei, S., Derakhshan, A., & Bagherkazemi, M. (2011). Critical thinking in language education. Journal of Language Teaching and Research , 2 (4), 769–777. https://doi.org/10.4304/jltr.2.4.769-777 .

Ricketts, J. C. (2003). The efficacy of leadership development, critical thinking dispositions, and students’ academic performance on the critical thinking skills of selected youth leaders (Unpublished doctoral dissertation). University of Florida.

Sadler, D. R. (2010). Beyond feedback: developing student capability in complex appraisal. Assessment & Evaluation in Higher Education , 35 (5), 535–550. https://doi.org/10.1080/02602930903541015 .

Shang, H. F. (2010). Reading strategy use, self-efficacy and EFL reading comprehension. Asian EFL Journal , 12 (2), 18–42.

Shpeizer, R. (2018). Teaching critical thinking as a vehicle for personal and social transformation. Research in Education , 100 (1), 32–49. https://doi.org/10.1177/0034523718762176 .

Söylemez, Y. (2015). Development of critical essential language skills for middle school students (Unpublished doctoral dissertation). Ataturk University Educational Sciences Institute.

Suryantari, H. (2018). Children and adults in second language learning. Tell Teaching of English and Literature Journal , 6 (1), 30–38. https://doi.org/10.30651/tell.v6i1.2081 .

Swallow, C. (2016). Reading is thinking. BU Journal of Graduate Studies in Education , 8 (2), 27–31.

Sweet, A. P. (1993). Transforming ideas for teaching and learning to read . Office of Educational Research and Improvement CS 011 460.

Tehrani, H. T., & Razali, A. B. (2018). Developing thinking skills in teaching English as a second/foreign language at primary school. International Journal of Academic Research in Progressive Education and Development , 7 (4), 13–29. https://doi.org/10.6007/IJARPED/v7-i4/4755 .

Tumasang, S. S. (2021). How fear affects EFL acquisition: the case of “terminale” students in Cameroon. Journal of English Language Teaching and Applied Linguistics , 63–70. https://doi.org/10.32996/jeltal .

Ullman, J. R. (2001). Structural equation modeling in using multivariate statistics. In B. G. Tabachnick, & L. Fidell (Eds.), Understanding multivariate statistics , (4th ed., pp. 653–771). Allyn & Bacon.

Van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2017). The relation between 21st-century skills and digital skills: a systematic literature review. Computers in Human Behavior , 72 , 577–588. https://doi.org/10.1016/j.chb.2017.03.010 .

Wallace, C. (2003). Critical reading in language education . Palgrave Macmillan.

Wang, Y., & Henderson, F. (2014). Teaching content through Moodle to facilitate students’ critical thinking in academic reading. The Asian EFL Journal , 16 (3), 7–40.

Widyastuti, S. (2018). Fostering critical thinking skills through argumentative writing. Cakrawala Pendidikan , 37 (2), 182–189. https://doi.org/10.21831/cp.v37i2.20157 .

Willingham, D. (2007). Critical thinking: why is it so hard to teach? Arts Education Policy Review , 109 (4), 21–32. https://doi.org/10.3200/AEPR.109.4.21-32 .

Wilson, A. N., & Howitt, S. M. (2018). Developing critical being in an undergraduate science course. Studies in Higher Education , 43 (7), 1160–1171. https://doi.org/10.1080/03075079.2016.1232381 .

Wilson, K. (2016). Critical reading, critical thinking: delicate scaffolding in English for Academic Purposes (EAP). Thinking Skills and Creativity , 22 , 256–265. https://doi.org/10.1016/j.tsc.2016.10.002 .

Wu, W. C. V., Marek, M., & Chen, N. S. (2013). Assessing cultural awareness and linguistic competency of EFL learners in a CMC-based active learning context. System , 41 (3), 515–528. https://doi.org/10.1016/j.system.2013.05.004 .

Zhao, C., Pandian, A., & Singh, M. K. M. (2016). Instructional strategies for developing critical thinking in EFL classrooms. English Language Teaching , 9 (10), 14–21. https://doi.org/10.5539/elt.v9n10p14 .

Zhou, J., Jiang, Y., & Yao, Y. (2015). The investigation on critical thinking ability in EFL reading class. English Language Teaching , 8 (1), 84–93. https://doi.org/10.5539/elt.v8n1p83 .

Zou, M., & Lee, I. (2021). Learning to teach critical thinking: testimonies of three EFL teachers in China. Asia Pacific Journal of Education. https://doi.org/10.1080/02188791.2021.1982674 .

Zumbo, B. D. (2005). Structural equation modeling and test validation. In B. Everitt, & D. C. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science , (pp. 1951–1958). Wiley.

Acknowledgements

Not applicable.

Funding

The research received no specific grant from any funding agency in the public, commercial, or non-profit sectors.

Author information

Authors and Affiliations

Department of English, Science and Research Branch, Islamic Azad University, Tehran, Iran

Moloud Mohammadi & Masood Siyyari

Imam Ali University, Tehran, Iran

Gholam-Reza Abbasian

Contributions

M.M., G-R. A., and M. S. contributed to the design and implementation of the research, to the analysis of the results, and to the writing of the manuscript. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Gholam-Reza Abbasian.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Mohammadi, M., Abbasian, GR. & Siyyari, M. Adaptation and validation of a critical thinking scale to measure the 3D critical thinking ability of EFL readers. Lang Test Asia 12 , 24 (2022). https://doi.org/10.1186/s40468-022-00173-6

Received : 11 November 2021

Accepted : 05 July 2022

Published : 14 September 2022

DOI : https://doi.org/10.1186/s40468-022-00173-6

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Keywords

  • Critical thinking and reading
  • Criticality
  • Critical pedagogy
  • PLS-SEM factor analysis
  • Scale validation
