Assessment and Evaluation

Teacher-Made Assessments

One of the challenges for beginning teachers is to select and use appropriate assessment techniques. In this section, we summarize the wide variety of types of assessments that classroom teachers use. First, we discuss the informal techniques teachers use during instruction that typically require instantaneous decisions. Then we consider formal assessment techniques that teachers plan before instruction and allow for reflective decisions.

Teachers’ Observation, Questioning, and Record-Keeping

During teaching, teachers not only have to communicate the information they planned but also continuously monitor students’ learning and motivation in order to determine whether modifications have to be made (Airasian, 2005). Beginning teachers find this more difficult than experienced teachers because of the complex cognitive skills required to improvise and be responsive to students’ needs while simultaneously keeping in mind the goals and plans of the lesson (Borko & Livingston, 1989). The informal assessment strategies teachers most often use during instruction are  observation  and  questioning .

Observation

Effective teachers observe their students from the time they enter the classroom. Some teachers greet their students at the door not only to welcome them but also to observe their mood and motivation. Are Hannah and Naomi still not talking to each other? Does Ethan have his materials with him? Gaining information on such questions can help the teacher foster student learning more effectively (e.g. suggesting that Ethan go back to his locker to get his materials before the bell rings, or avoiding assigning Hannah and Naomi to the same group).

During instruction, teachers observe students’ behavior to gain information about students’ level of interest and understanding of the material or activity. Observation includes looking at non-verbal behaviors as well as listening to what the students are saying. For example, a teacher may observe that a number of students are looking out of the window rather than watching the science demonstration, or a teacher may hear students making comments in their group indicating they do not understand what they are supposed to be doing. Observations also help teachers decide which student to call on next, whether to speed up or slow down the pace of the lesson, when more examples are needed, whether to begin or end an activity, how well students are performing a physical activity, and if there are potential behavior problems (Airasian, 2005). Many teachers find that moving around the classroom helps them observe more effectively because they can see more students from a variety of perspectives. However, the fast pace and complexity of most classrooms make it difficult for teachers to gain as much information as they want.

Questioning

Teachers ask questions for many instructional reasons including keeping students’ attention on the lesson, highlighting important points and ideas, promoting critical thinking, allowing students to learn from each other’s answers, and providing information about students’ learning. Devising good questions and using students’ responses to make effective instantaneous instructional decisions is very difficult. Some strategies to improve questioning include planning and writing down the instructional questions that will be asked, allowing sufficient wait time for students to respond, listening carefully to what students say rather than listening for what is expected, varying the types of questions asked, making sure some of the questions are higher level, and asking follow-up questions.

While informal assessment based on spontaneous observation and questioning is essential for teaching, there are inherent problems with the validity, reliability, and bias of this information (Airasian, 2005; Stiggins, 2005). We summarize these issues and some ways to reduce the problems in the table below.

Record Keeping

Keeping records of observations improves reliability and can be used to enhance the understanding of one student, a group, or the whole class’ interactions. Sometimes this requires help from other teachers. For example, Alexis, a beginning science teacher, is aware of the research documenting that longer wait time enhances students’ learning (e.g. Rowe, 2003) but is unsure of her own behavior, so she asks a colleague to observe and record her wait times during one class period. Alexis learns her wait times are very short for all students, so she starts practicing silently counting to five whenever she asks students a question.

Teachers can keep anecdotal records about students without help from peers. These records contain descriptions of incidents of a student’s behavior, the time and place the incident takes place, and a tentative interpretation of the incident. For example, the description of the incident might involve Joseph, a second-grade student, who fell asleep during the mathematics class on a Monday morning. A tentative interpretation could be that the student did not get enough sleep over the weekend, but alternative explanations could be that the student is sick or is on medications that make him drowsy. Obviously, additional information is needed; the teacher could ask Joseph why he is so sleepy and also observe him to see if he looks tired and sleepy over the next couple of weeks.

Anecdotal records often provide important information and are better than relying on one’s memory but they take time to maintain and it is difficult for teachers to be objective. For example, after seeing Joseph fall asleep the teacher may now look for any signs of Joseph’s sleepiness—ignoring the days he is not sleepy. Also, it is hard for teachers to sample a wide enough range of data for their observations to be highly reliable.

Teachers also conduct more formal observations, especially for students with special needs who have IEPs. An example of the importance of informal and formal observations in a preschool follows:

The class of preschoolers in a suburban neighborhood of a large city has eight special needs students and four students—the peer models—who have been selected because of their well-developed language and social skills. Some of the special needs students have been diagnosed with delayed language, some with behavior disorders, and several with autism.

The students are sitting on the mat with the teacher who has a box with sets of three “cool” things of varying sizes (e.g. toy pandas) and the students are asked to put the things in order by size, big, medium, and small. Students who are able to are also requested to point to each item in turn and say “This is the big one,” “This is the medium one,” and “This is the little one.” For some students, only two choices (big and little) are offered because that is appropriate for their developmental level.

The teacher informally observes that one of the boys is having trouble keeping his legs still so she quietly asks the aide for a weighted pad that she places on the boy’s legs to help him keep them still. The activity continues and the aide carefully observes students’ behaviors and records on IEP progress cards whether a child meets specific objectives such as: “When given two picture or object choices, Mark will point to the appropriate object in 80 percent of the opportunities.” The teacher and aides keep records of the relevant behavior of the special needs students during the half-day they are in preschool. The daily records are summarized weekly. If not enough observations have been recorded for a specific objective, the teacher and aide focus their observations more on that child and, if necessary, try to create specific situations that relate to that objective. At the end of each month, the teacher calculates whether the special needs children are meeting their IEP objectives.

Selected Response Items

Common formal assessment formats used by teachers are  multiple-choice ,  matching , and  true/false items . In selected-response items, students have to select a response provided by the teacher or test developer rather than constructing a response in their own words or actions. Selected response items do not require that students  recall  the information but rather  recognize  the correct answer. Tests with these items are called  objective  because the results are not influenced by scorers’ judgments or interpretations and so are often machine scored. Eliminating potential errors in scoring increases the reliability of tests but teachers who only use objective tests are liable to reduce the validity of their assessment because objective tests are not appropriate for all learning goals (Linn & Miller, 2005). Effective assessment  for  learning as well as the assessment of learning must be based on aligning the assessment technique to the learning goals and outcomes.

For example, if the goal is for students to conduct an experiment then they should be asked to do that rather than being asked about  conducting an experiment.

Common problems

Selected-response items are easy to score but hard to devise. Teachers often do not spend enough time constructing items, and common problems include:

  • Long, complex statements that combine several facts into one item. For example: “True or False: Although George Washington was born into a wealthy family, his father died when he was only 11, he worked as a youth as a surveyor of rural lands, and later stood on the balcony of Federal Hall in New York when he took his oath of office in 1789.”
  • Length clues. A common clue is that all the true statements on a true/false test, or the correct alternatives on a multiple-choice test, are longer than the false statements or the incorrect alternatives.
  • Negative and double-negative wording. A poor item: “True or False: None of the steps made by the student was unnecessary.” A better item: “True or False: All of the steps were necessary.” Students often do not notice the negative terms or find them confusing, so avoiding them is generally recommended (Linn & Miller, 2005). However, since standardized tests often use negative items, teachers sometimes deliberately include some negative items to give students practice in responding to that format.
  • Taking sentences directly from the textbook or lecture notes. Removing the words from their context often makes them ambiguous or can change their meaning. For example, a statement from Chapter 3 taken out of context suggests all children are clumsy: “Similarly with jumping, throwing, and catching: the large majority of children can do these things, though often a bit clumsily.” A fuller quotation makes it clear that this sentence refers to 5-year-olds: “For some fives, running still looks a bit like a hurried walk, but usually it becomes more coordinated within a year or two. Similarly with jumping, throwing, and catching: the large majority of children can do these things, though often a bit clumsily, by the time they start school, and most improve their skills noticeably during the early elementary years.” If the abbreviated form were used as the stem of a true/false item, it would obviously be misleading.
  • Testing trivial information. While it is important to know approximately when Piaget made his seminal contributions to the understanding of child development, the exact year of his birth (1896) is not important.

Strengths and weaknesses

All types of selected-response items have a number of strengths and weaknesses. True/false items are appropriate for measuring factual knowledge such as vocabulary, formulae, dates, proper names, and technical terms. They are very efficient, as they use a simple structure that students can easily understand and take little time to complete. They are also easier to construct than multiple-choice and matching items. However, students have a 50 percent probability of getting the answer correct through guessing, so it can be difficult to interpret how much students know from their test scores. Examples of common problems that arise when devising true/false items were illustrated in the list above.

Common problems with matching and multiple-choice items include the following:

  • Matching: lists are not homogeneous (e.g. Column B is a mixture of generals and dates).
  • Matching: too many items in each list. Lists should be relatively short (4–7 items in each column); more than 10 are too confusing.
  • Matching: responses are not in logical order. In the example with Spanish and English words (Example 1), the responses should be in a logical order (there, alphabetical); if the order is not logical, students spend too much time searching for the correct answer.
  • Multiple choice: the problem (i.e. the stem) is not clearly stated. For example:

New Zealand

  • Is the world’s smallest continent
  • Is home to the kangaroo
  • Was settled mainly by colonists from Great Britain
  • Is a dictatorship

This is really a series of true/false items. Because the correct answer is the third alternative, a better version, with the problem in the stem, is “Much of New Zealand was settled by colonists from,” with “Great Britain” as the correct alternative (and other countries as the distractors).

  • Multiple choice: incorrect alternatives are implausible (e.g. “Gerald Ford” as an alternative in a question about theorists of development).
  • Multiple choice: the correct alternative is longer than the incorrect ones.
  • Multiple choice: incorrect alternatives are not grammatically consistent with the stem.
  • Multiple choice: too many correct alternatives are in position “b” or “c,” making it easier for students to guess. All the positions (e.g. a, b, c, d) should be used with approximately equal frequency (not exactly equal, as that also provides a clue).

In matching items, two parallel columns containing terms, phrases, symbols, or numbers are presented and the student is asked to match the items in the first column with those in the second column. Typically there are more items in the second column to make the task more difficult and to ensure that if a student makes one error they are not automatically forced into a second. Matching items are most often used to measure lower-level knowledge such as persons and their achievements, dates and historical events, terms and definitions, symbols and concepts, or plants and animals and their classifications (Linn & Miller, 2005). An example of Spanish language words and their English equivalents is in Example 1.

Example 1: SPANISH AND ENGLISH TRANSLATION

Directions: On the line to the right of the Spanish word in Column A, write the letter of the English word in Column B that has the same meaning.

While matching items may seem easy to devise, it is hard to create homogeneous lists. Other problems with matching items and suggested remedies are in Table 10.3.2.

Multiple-choice items are the most commonly used type of objective test item because they have a number of advantages over other objective test items. Most importantly they can be adapted to assess higher levels of thinking, such as application, as well as lower-level factual knowledge. The first example in Example 2 assesses knowledge of a specific fact whereas the second example assesses the application of knowledge.

Example 2: MULTIPLE-CHOICE EXAMPLES

Who is best known for their work on the development of the morality of justice?

Which one of the following best illustrates the law of diminishing returns?

  • A factory doubled its labor force and increased production by 50 percent
  • The demand for an electronic product increased faster than the supply of the product
  • The population of a country increased faster than agricultural self-sufficiency
  • A machine decreased in efficiency as its parts became worn out

(Adapted from Linn & Miller, 2005, p. 193).

There are several other advantages of multiple-choice items. Students have to recognize the correct answer rather than simply judge a single statement true or false. Also, the opportunity for guessing is reduced because four or five alternatives are usually provided, whereas in true/false items students only have to choose between two. Also, multiple-choice items do not need homogeneous material as matching items do. However, creating good multiple-choice test items is difficult, and students (maybe including you) often become frustrated when taking a test with poor multiple-choice items. Three steps have to be considered when constructing a multiple-choice item: formulating a clearly stated problem, identifying plausible alternatives, and removing irrelevant clues to the answer. Common problems in each of these steps are summarized in Table 10.3.3.
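The effect of the number of alternatives on guessing can be made concrete with a small binomial calculation. The sketch below uses plain Python; the 10-item test length and the 70 percent threshold are our illustrative assumptions, not figures from the text. It compares the chance that a student who guesses on every item scores at least 70 percent on a true/false test versus a four-alternative multiple-choice test.

```python
from math import comb

def p_score_at_least(n_items: int, n_correct: int, p_guess: float) -> float:
    """Probability of getting at least n_correct items right by random guessing."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess)**(n_items - k)
        for k in range(n_correct, n_items + 1)
    )

# 10-item test, threshold of 7 correct (70 percent)
tf = p_score_at_least(10, 7, 1 / 2)   # true/false: two choices per item
mc = p_score_at_least(10, 7, 1 / 4)   # multiple choice: four alternatives

print(f"true/false:      {tf:.3f}")   # about 0.172
print(f"multiple choice: {mc:.4f}")   # about 0.0035
```

On these assumptions, blind guessing reaches 70 percent or better roughly 17 percent of the time on true/false items but only about 0.35 percent of the time with four alternatives, which is one reason multiple-choice scores are easier to interpret.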

Constructed Response Items

Formal assessment also includes constructed-response items in which students are asked to recall information and create an answer—not just recognize if the answer is correct—so guessing is reduced. Constructed response items can be used to assess a wide variety of kinds of knowledge and two major kinds are discussed: completion or short answer (also called short response) and extended response.

Completion and Short Answer

Completion and short-answer items can be answered with a word, phrase, number, or symbol. The two types are essentially the same, varying only in whether the problem is presented as a statement or a question (Linn & Miller, 2005). Look at Example 3 for a sample:

Example 3: COMPLETION AND SHORT ANSWER QUESTIONS

Completion:  The first traffic light in the US was invented by ________.

Short Answer:  Who invented the first traffic light in the US?

These items are often used in mathematics tests, for example:

3 + 10 = ____?

If x = 6, then x(x − 1) = ________

[Figure: a simple D-shape]

A major advantage of these items is that they are easy to construct. However, apart from their use in mathematics, they are unsuitable for measuring complex learning outcomes and are often difficult to score. Completion and short-answer tests are sometimes called objective tests because the intent is that there is only one correct answer, and so no variability in scoring, but unless the question is phrased very carefully there are frequently several correct answers. For example, consider the short-answer question “Where was President Lincoln born?”

The teacher may expect the answer “in a log cabin” but other correct answers are also “on Sinking Spring Farm,” “in Hardin County,” or “in Kentucky.” Common errors in these items are summarized in the table below.

Extended Response

Extended response items are used in many content areas and answers may vary in length from a paragraph to several pages. Questions that require longer responses are often called essay questions. Extended response items have several advantages; the most important is their adaptability for measuring complex learning outcomes, particularly integration and application. These items also require that students write, and therefore give teachers a way to assess writing skills. A commonly cited advantage of these items is their ease of construction; however, carefully worded items that are related to learning outcomes and assess complex learning are hard to devise (Linn & Miller, 2005). Well-constructed items phrase the question so the task of the student is clear. Often this involves providing hints or planning notes. In the first example below, the actual question is clear not only because of the wording but because of the format (i.e. it is placed in a box). In the second and third examples planning notes are provided:

Example 4: THIRD GRADE MATHEMATICS

The owner of a bookstore gave 14 books to the school. The principal will give an equal number of books to each of three classrooms and the remaining books to the school library. How many books could the principal give to each classroom and to the school library?

Show all your work on the space below and on the next page. Explain in words how you found the answer. Tell why you took the steps you did to solve the problem.

(From Illinois Standards Achievement Test, 2006; http://www.isbe.state.il.us/assessment/isat.htm)
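For reference, the arithmetic this item targets is division with a remainder: 14 books shared equally among 3 classrooms leaves a remainder for the library. A minimal sketch (the variable names are ours, not from the test):

```python
books, classrooms = 14, 3

# Integer division gives the equal share; the remainder goes to the library.
per_classroom, to_library = divmod(books, classrooms)

print(per_classroom)  # 4 books to each classroom
print(to_library)     # 2 books to the school library
```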

Example 5: FIFTH GRADE SCIENCE: THE GRASS IS ALWAYS GREENER

Jose and Maria noticed that three different types of soil (black soil, sand, and clay) were found in their neighborhood. They decided to investigate the question, “How does the type of soil (black soil, sand, and clay) under grass sod affect the height of grass?”

Plan an investigation that could answer their question. In your plan, be sure to include:

  • Prediction of the outcome of the investigation
  • Materials needed to do the investigation
  • Logical steps to do the investigation
  • One variable kept the same (controlled)
  • One variable changed (manipulated)
  • Any variables being measured and recorded
  • How often measurements are taken and recorded

(From Washington State 2004 assessment of student learning; http://www.k12.wa.us/assessment/WASL/default.aspx)

Example 6: GRADES 9–11 ENGLISH

Writing prompt.

Some people think that schools should teach students how to cook. Other people think that cooking is something that ought to be taught in the home. What do you think? Explain why you think as you do.

Planning notes

Choose One:

  • I think schools should teach students how to cook
  • I think cooking should be taught in the home

I think cooking should be taught in ____________ because ________________________.

(From Illinois Measure of Annual Growth in English; http://www.isbe.state.il.us/assessment/image.htm)

A major disadvantage of extended-response items is the difficulty of scoring them reliably. Not only do various teachers score the same response differently, but the same teacher may score an identical response differently on various occasions (Linn & Miller, 2005). A variety of steps can be taken to improve the reliability and validity of scoring. First, teachers should begin by writing an outline of a model answer. This helps make clear what students are expected to include. Second, a sample of the answers should be read. This assists in determining what the students can do and whether there are any common misconceptions arising from the question. Third, teachers have to decide what to do about irrelevant information that is included (e.g. is it ignored or are students penalized?) and how to evaluate mechanical errors such as grammar and spelling. Then, a point-scoring guide or a scoring rubric should be used.

In point scoring, components of the answer are assigned points. This provides some guidance for evaluation and helps consistency, but point-scoring systems often lead the teacher to focus on facts (e.g. naming risk factors) rather than higher-level thinking, which may undermine the validity of the assessment if the teacher’s purposes include higher-level thinking. A better approach is to use a scoring rubric that describes the quality of the answer or performance at each level.

Example 7: Point Scoring for Written Response

For example, if students were asked: What are the nature, symptoms, and risk factors of hyperthermia?

Point Scoring Guide:

  • Definition (nature) 2 pts
  • Symptoms (1 pt for each) 5 pts
  • Risk factors (1 pt for each) 5 pts
  • Writing 3 pts
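Applied mechanically, a point-scoring guide is just capped counts summed to a maximum (here 15 points). The sketch below follows the hyperthermia guide above, assuming one point per symptom and per risk factor as stated; the function name and signature are ours:

```python
def point_score(definition_pts, symptoms_named, risk_factors_named, writing_pts):
    """Total a response against the hyperthermia point-scoring guide (max 15)."""
    score = min(definition_pts, 2)       # definition: up to 2 pts
    score += min(symptoms_named, 5)      # 1 pt per symptom, up to 5
    score += min(risk_factors_named, 5)  # 1 pt per risk factor, up to 5
    score += min(writing_pts, 3)         # writing quality: up to 3 pts
    return score

# A response with a full definition, 4 symptoms, 6 risk factors,
# and adequate writing:
print(point_score(2, 4, 6, 2))  # 13
```

Note how the total only counts discrete facts; nothing in the score reflects higher-level thinking, which is the limitation of point scoring discussed above.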

Scoring Rubrics

A rubric is a scoring guide used to assess performance given a set of criteria.

A basic scoring rubric could be a list of the required components for evaluating the assignment. More advanced rubrics divide an assignment into all component parts and provide explicit expectations of acceptable and unacceptable levels of performance for each component. More complex scoring rubrics can be holistic  or  analytical . In holistic scoring rubrics, general descriptions of performance are made and a single overall score is obtained.

Rubrics help teachers carefully plan assignments and set expectations for students before giving the assignment. The guidance provided by a rubric may reduce the time spent clarifying assignment requirements and helps students know precisely what is expected of them when completing the work. When students have clear expectations, the quality of their work increases. Rubrics also usually reduce time spent grading while still allowing teachers to give constructive feedback, and student complaints and questions about grades decrease when it is clear why they received a particular score. A rubric takes time to create, but it only has to be created once for each assignment; in the long run it is a time-saver, because less time is spent clarifying and grading.

The least complex type of scoring rubric is the checklist. A checklist allows the rater to simply indicate the presence of each expected component, a pass/fail of sorts, but it does not indicate the quality of the element. While these rubrics tend to be quick and easy to administer, they do not provide much feedback to the student. Checklists may be most useful for providing feedback on minor assignments or drafts of assignments. If the teacher wants to provide feedback about the quality of work, a rating scale is more appropriate.

Example 8: Checklist: Question Response Checklist

Assignment: read the chapter and write a reply to the assigned question. Define relevant terms and apply concepts in a real-world setting.


Basic Rating Scales

Rating scales  are like checklists with the addition of evaluating the quality of criteria using a scoring system. While rating scales may be more informative than checklists, the meaning of the numeric ratings is vague. Without a narrative for the ratings, the raters must make a judgment based on their perception of the meanings of the rating scale. For example, for an assignment, one rater might score it a “3,” presuming that it means “good” and another rater might also choose “3” but believe the work was “marginal.” Similarly, the rating may not mean the same thing to the student as it does to the teacher. The vagueness of the basic rating scale can be corrected by adding descriptors for the ratings, making it a holistic rubric.

Example 9: Basic Rating scale: Wiki Group Project

Assignment: students work with a group to create a wiki page on a given topic.


Holistic Scoring Rubrics

Holistic scoring rubrics  use a short narrative of characteristics associated with each numeric score based on an overall impression of a student’s performance on a task. The descriptions tend to be rather vague because the same score descriptions are applied to multiple components of the assignment. This type of rubric does not provide specific areas of strengths and weaknesses for each element and therefore the feedback is less directive for teachers and students to focus improvement efforts.

A holistic rubric may be more appropriate when the assignments to be assessed will vary significantly (e.g., independent study projects submitted in a capstone course) or when practicality dictates a speedier assessment due to the number of assignments or time limitations (e.g., reviewing all the essays from applicants to determine who will need developmental courses).

An example from grade 2 language arts in the Los Angeles Unified School District, which classifies responses into four levels (not proficient, partially proficient, proficient, and advanced), is in Example 10.

Example 10: HOLISTIC SCORING RUBRIC: ENGLISH LANGUAGE ARTS GRADE 2

Assignment: write about an interesting, fun, or exciting story you have read in class this year. Some of the things you could write about are:

  • What happened in the story (the plot or events)
  • Where the events took place (the setting)
  • People, animals, or things in the story (the characters)

In your writing make sure you use facts and details from the story to describe everything clearly. After you write about the story, explain what makes the story interesting, fun or exciting.

Analytical rubrics provide descriptions of levels of student performance on a variety of characteristics. For example, the six characteristics used for assessing writing developed by the Northwest Regional Educational Laboratory (NWREL) are:

  • ideas and content
  • organization
  • voice
  • word choice
  • sentence fluency
  • conventions

Descriptions of the high, medium, and low responses for each characteristic are available from Education Northwest.

Holistic rubrics have the advantage that they can be developed more quickly than analytical rubrics. They are also faster to use, as there is only one dimension to examine. However, they do not provide students with feedback about which aspects of the response are strong and which need improvement (Linn & Miller, 2005). This means they are less useful for assessment for learning. An important use of rubrics is as teaching tools: providing them to students before the assessment so they know what knowledge and skills are expected.

Analytic Scoring Rubrics

A scoring rubric that includes specific performance expectations for each rating for each criterion is an  analytic scoring rubric.  Analytic rating scales are especially appropriate for complex assignments with multiple criteria. These rubrics help raters with consistency and provide the most detailed feedback for teacher and student improvement. The disadvantage of using this type of rubric is that it can be time-consuming to construct and score.
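The structure of an analytic rubric can be made explicit as data: each criterion carries its own level descriptors, and a total score requires one rating per criterion. A minimal sketch follows; the two criteria and their descriptors are invented for illustration, loosely echoing the kind of rubric described here, not taken from any published example.

```python
# Each criterion maps a 1-3 rating to an explicit performance description.
rubric = {
    "terminology": {
        1: "few or no terms used accurately",
        2: "most terms used accurately",
        3: "all terms used accurately and appropriately",
    },
    "supporting details": {
        1: "no relevant details",
        2: "some relevant details",
        3: "relevant, specific details throughout",
    },
}

def total_score(ratings: dict) -> int:
    """Sum one rating per criterion, requiring every criterion to be rated."""
    assert ratings.keys() == rubric.keys(), "every criterion must be rated"
    return sum(ratings[c] for c in rubric)

ratings = {"terminology": 3, "supporting details": 2}
print(total_score(ratings))      # 5
print(rubric["terminology"][3])  # descriptor to return as feedback
```

Because every criterion-by-level cell needs its own descriptor, the table grows multiplicatively with criteria and levels, which is why analytic rubrics are slower to construct and score than holistic ones.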

Teachers can use scoring rubrics as part of instruction by giving students the rubric during instruction, providing several responses, and analyzing these responses in terms of the rubric. For example, the use of accurate terminology is one dimension of the science rubric in Example 11. An elementary science teacher could discuss why it is important for scientists to use accurate terminology, give examples of inaccurate and accurate terminology, provide that component of the scoring rubric to students, distribute some examples of student responses (maybe from former students), and then discuss how these responses would be classified according to the rubric. This strategy of assessment for learning should be more effective if the teacher (a) emphasizes to students why using accurate terminology is important when learning science rather than how to get a good grade on the test (we provide more details about this in the section on motivation later in this chapter); (b) provides an exemplary response so students can see a model; and (c) emphasizes that the goal is student improvement on this skill, not ranking students.

Example 11: Analytic Scoring Rubric

Assignment: write a response to questions using correct terminology and supporting details. Synthesize and apply relevant information to demonstrate understanding of the topic.

Rubric Maker

There are several online tools available to help teachers create rubrics. Rubric Maker is a website that offers sample rubrics for different subjects and activities, as well as allows you to create custom rubrics.

Performance Assessments

Typically in performance assessments, students complete a specific task while teachers observe the process or procedure (e.g. data collection in an experiment) as well as the product (e.g. completed report) (Popham, 2005; Stiggins, 2005). The tasks that students complete in performance assessments are not simple—in contrast to selected-response items—and include the following:

  • playing a musical instrument
  • performing athletic skills
  • producing an artistic creation
  • conversing in a foreign language
  • engaging in a debate about political issues
  • conducting an experiment in science
  • repairing a machine
  • writing a term paper
  • using interaction skills to play together

These examples all involve complex skills but illustrate that the term performance assessment is used in a variety of ways. For example, the teacher may not observe all of the processes (e.g. she sees a draft paper but the final product is written during out-of-school hours) and essay tests are typically classified as performance assessments (Airasian, 2000). In addition, in some performance assessments there may be no clear product (e.g. the performance may be group interaction skills).

Two related terms, alternative assessment and authentic assessment, are sometimes used instead of performance assessment, but they have different meanings (Linn & Miller, 2005). Alternative assessment refers to tasks that are not pencil-and-paper, and while many performance assessments are not pencil-and-paper tasks, some are (e.g. writing a term paper, essay tests). Authentic assessment is used to describe tasks that students do that are similar to those in the “real world.” Classroom tasks vary in their level of authenticity (Popham, 2005). For example, for a Japanese language class taught in a high school in Chicago, conversing in Japanese in Tokyo is highly authentic, but only possible in a study abroad program or a trip to Japan. Conversing in Japanese with native Japanese speakers in Chicago is also highly authentic, and conversing with the teacher in Japanese during class is moderately authentic. Much less authentic is a matching test on English and Japanese words. In a language arts class, writing a letter to an editor or a memo to the principal is highly authentic, as letters and memos are common work products. Writing a five-paragraph paper is less authentic, as such papers are not used in the world of work; however, a five-paragraph paper is a complex task and would typically be classified as a performance assessment.

Advantages and Disadvantages of Performance Assessments

There are several advantages of performance assessments (Linn & Miller, 2005). First, the focus is on complex learning outcomes that often cannot be measured by other methods. Second, performance assessments typically assess the process or procedure as well as the product. For example, the teacher can observe whether the students are repairing the machine using the appropriate tools and procedures as well as whether the machine functions properly after the repairs. Third, well-designed performance assessments communicate instructional goals and meaningful learning clearly to students. For example, if the topic in a fifth-grade art class is one-point perspective, the performance assessment could be drawing a city scene that illustrates one-point perspective. This assessment is meaningful and clearly communicates the learning goal. It is also a good instructional activity and has good content validity, which is common with well-designed performance assessments (Linn & Miller, 2005).

One major disadvantage of performance assessments is that they are typically very time-consuming for students and teachers. This means that fewer assessments can be gathered, so if they are not carefully devised, fewer learning goals will be assessed, which can reduce content validity. State curriculum guidelines can be helpful in determining what should be included in a performance assessment. For example, Eric, a dance teacher in a high school in Tennessee, learns that the state standards indicate that dance students at the highest level should be able to demonstrate consistency and clarity in performing technical skills by:

  • performing complex movement combinations to music in a variety of meters and styles
  • performing combinations and variations in a broad dynamic range
  • demonstrating improvement in performing movement combinations through self-evaluation
  • critiquing a live or taped dance production based on given criteria

Eric devises the following performance task for his eleventh-grade modern dance class:

In groups of 4–6, students will perform a dance at least 5 minutes in length. The dance selected should be multifaceted so that all the dancers can demonstrate technical skills, complex movements, and a dynamic range (Items 1–2). Students will videotape their rehearsals and document how they improved through self-evaluation (Item 3). Each group will view and critique the final performance of one other group in class (Item 4). Eric would need to scaffold most steps in this performance assessment. The groups would probably need guidance in selecting a dance that allows all the dancers to demonstrate the appropriate skills; critiquing their own performances constructively; working effectively as a team; and applying criteria to evaluate a dance.

Another disadvantage of performance assessments is that they are hard to score reliably, which can lead to inaccurate and unfair evaluation. As with any constructed-response assessment, clear scoring rubrics are very important. A rubric designed to assess the process of group interactions is in the table below.

This rubric was devised for middle-grade science but could be used in other subject areas when assessing the group process. Some performance assessments require several scoring rubrics. In the dance performance example above, Eric would need scoring rubrics for the performance skills, the improvement based on self-evaluation, the teamwork, and the critique of the other group. Obviously, devising a good performance assessment is complex, and Linn and Miller (2005) recommend that teachers:

  • Create performance assessments that require students to use complex cognitive skills. Sometimes teachers devise assessments that are interesting and that the students enjoy but do not require students to use higher-level cognitive skills that lead to significant learning. Focusing on high-level skills and learning outcomes is particularly important because performance assessments are typically so time-consuming.
  • Ensure that the task is clear to the students. Performance assessments typically require multiple steps so students need to have the necessary prerequisite skills and knowledge as well as clear directions. Careful scaffolding is important for successful performance assessments.
  • Specify expectations of the performance clearly by providing students with scoring rubrics during the instruction. This not only helps students understand what is expected but also ensures that teachers are clear about what they expect. Thinking this through while planning the performance assessment can be difficult for teachers but is crucial, as it typically leads to revisions of the actual assessment and of the directions provided to students.
  • Reduce the importance of unessential skills in completing the task. Which skills are essential depends on the purpose of the task. For example, for a science report, is the use of publishing software essential? If the purpose of the assessment is for students to demonstrate the process of the scientific method, including writing a report, then the format of the report may not be significant. However, if the purpose includes integrating two subject areas, science and technology, then the use of publishing software is important. Because performance assessments take time, it is tempting to include multiple skills without carefully considering whether all of them are essential to the learning goals.
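To make the scoring-rubric idea concrete, here is a minimal Python sketch of one common approach: an analytic rubric in which each criterion is scored on a 1–4 level and then weighted. The criteria names and weights below are hypothetical illustrations (loosely modeled on the dance-task example), not drawn from Linn and Miller:

```python
# Hypothetical analytic rubric: each criterion is scored on levels 1-4,
# then weighted by its share of the overall grade (weights sum to 1.0).
rubric_weights = {
    "technical skill": 0.40,
    "self-evaluation and improvement": 0.25,
    "teamwork": 0.20,
    "critique of another group": 0.15,
}

def weighted_score(scores, weights, max_level=4):
    """Combine per-criterion rubric levels into a single 0-100 score."""
    assert set(scores) == set(weights), "score every criterion exactly once"
    total = sum(weights[c] * (scores[c] / max_level) for c in weights)
    return round(100 * total, 1)

# One group's hypothetical ratings on the four criteria.
one_group = {
    "technical skill": 3,
    "self-evaluation and improvement": 4,
    "teamwork": 4,
    "critique of another group": 2,
}
print(weighted_score(one_group, rubric_weights))  # 82.5
```

Making the weights explicit forces the teacher to decide in advance which parts of the performance matter most, which is exactly the kind of up-front thinking the recommendations above call for.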

Portfolios

“A portfolio is a meaningful collection of student work that tells the story of student achievement or growth” (Arter, Spandel, & Culham, 1995, p. 2). Portfolios are a purposeful collection of student work, not just folders of all the work a student does. Portfolios are used for a variety of purposes, and developing a portfolio system can be confusing and stressful unless teachers are clear on their purpose. The varied purposes can be illustrated along four dimensions (Linn & Miller, 2005):

Assessment for Learning ↔ Assessment of learning

Current Accomplishments ↔ Progress

Best Work Showcase ↔ Documentation

Finished ↔ Working

When the primary purpose is assessment for learning, the emphasis is on student self-reflection and responsibility for learning. Students not only select samples of their work they wish to include, but also reflect on and interpret their own work. Portfolios containing this information can be used to aid communication, as students can present and explain their work to their teachers and parents (Stiggins, 2005). Portfolios focusing on the assessment of learning contain students’ work samples that certify accomplishments for a classroom grade, graduation, state requirements, etc. Typically, students have less choice in the work contained in such portfolios, as some consistency is needed for this type of assessment. For example, the writing portfolios that fourth and seventh graders are required to submit in Kentucky must contain a self-reflective statement and three pieces of writing (reflective, personal experience or literary, and transactive). Students do choose which pieces of writing of each type to include in the portfolio (Kentucky Student Performance Standards).

Portfolios can be designed to focus on student progress or current accomplishments. For example, audio recordings of English language learners speaking could be collected over one year to demonstrate growth in learning. Student progress portfolios may also contain multiple versions of a single piece of work. For example, a writing project may contain notes on the original idea, an outline, the first draft, comments on the first draft by peers or the teacher, the second draft, and the final finished product (Linn & Miller, 2005). If the focus is on current accomplishments, only recently completed work samples are included.

Portfolios can focus on documenting student activities or highlighting important accomplishments. Documentation portfolios are inclusive, containing all the work samples rather than focusing on one special strength, best work, or progress. In contrast, showcase portfolios focus on the best work, typically identified by the students themselves. One aim of such portfolios is for students to learn how to identify products that demonstrate what they know and can do. Students are not expected to identify their best work in isolation but to use feedback from their teachers and peers as well.

A final distinction can be made between a finished portfolio—which may be used for a job application—versus a working portfolio that typically includes day-to-day work samples. Working portfolios evolve over time and are not intended to be used for the assessment of learning. The focus of a working portfolio is on developing ideas and skills so students should be allowed to make mistakes, freely comment on their own work, and respond to teacher feedback (Linn & Miller, 2005). Finished portfolios are designed for use with a particular audience and the products selected may be drawn from a working portfolio. For example, in a teacher education program, the working portfolio may contain work samples from all the courses taken. A student may develop one finished portfolio to demonstrate she has mastered the required competencies in the teacher education program and a second finished portfolio for her job application.

Advantages and Disadvantages of Portfolios

Portfolios used well in classrooms have several advantages. They provide a way of documenting and evaluating growth in a much more nuanced way than selected-response tests can. Also, portfolios can be integrated easily into instruction, i.e. used for assessment for learning. Portfolios also encourage student self-evaluation and reflection, as well as ownership for learning (Popham, 2005). Using classroom assessment to promote student motivation is an important component of assessment  for  learning which is considered in the next section.

However, there are some major disadvantages of portfolio use. First, good portfolio assessment takes an enormous amount of teacher time and organization. Time is needed to help students understand the purpose and structure of the portfolio, decide which work samples to collect, and self-reflect; some of this time needs to be spent in one-to-one conferences. Reviewing and evaluating the portfolios outside of class time is also enormously time-consuming. Teachers have to weigh whether the time spent is worth the benefits of portfolio use.

Second, ensuring reliability and eliminating bias can be even more difficult for portfolios than for other constructed-response assessments because the products are more varied. The experience of the state-wide use of portfolios for assessment in writing and mathematics for fourth and eighth graders in Vermont is sobering. Teachers used the same analytic scoring rubric when evaluating the portfolios. In the first two years of implementation, samples from schools were collected and scored by an external panel of teachers. In the first year, the agreement among raters (i.e. inter-rater reliability) was poor for both mathematics and writing; in the second year, the agreement among raters improved for mathematics but not for writing. However, even with the improvement in mathematics, the reliability was too low to use the portfolios for individual student accountability (Koretz, Stecher, Klein, & McCaffrey, 1994). When reliability is low, validity is also compromised because unstable results cannot be interpreted meaningfully.
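Inter-rater reliability can be quantified in several ways. The sketch below, using hypothetical scores from two raters on a 1–4 analytic rubric, computes the two simplest indices: percent agreement and Cohen's kappa, which corrects agreement for chance. (The Vermont study itself used its own reliability analyses; this is only an illustration of the concept.)

```python
# Two raters each score the same 10 portfolios on a 1-4 analytic rubric.
# Data are hypothetical, for illustration only.
rater_a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
rater_b = [4, 3, 2, 2, 4, 2, 2, 3, 3, 2]

# Percent agreement: proportion of portfolios given identical scores.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)

# Cohen's kappa subtracts the agreement expected by chance, estimated
# from each rater's marginal score distribution.
categories = sorted(set(rater_a) | set(rater_b))
n = len(rater_a)
p_chance = sum(
    (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
)
kappa = (percent_agreement - p_chance) / (1 - p_chance)

print(f"percent agreement: {percent_agreement:.2f}")  # 0.70
print(f"Cohen's kappa:     {kappa:.2f}")              # 0.57
```

A kappa well below about 0.8 would generally be considered too unstable for individual-level accountability decisions, which is the problem the Vermont evaluation ran into.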

If teachers do use portfolios in their classroom, the series of steps needed for implementation are outlined in the table below. If the school or district has an existing portfolio system these steps may have to be modified.

Assessment that Enhances Motivation and Student Confidence

Studies on testing and learning conducted more than 20 years ago demonstrated that tests promote learning and that more frequent tests are more effective than less frequent ones (Dempster & Perkins, 1993). Frequent smaller tests encourage continuous effort rather than last-minute cramming and may also reduce test anxiety because the consequences of errors are reduced. College students report preferring more frequent testing over infrequent testing (Bangert-Drowns, Kulik, & Kulik, 1991). More recent research indicates that teachers’ assessment purposes and beliefs, the type of assessment selected, and the feedback given all contribute to the assessment climate in the classroom, which influences students’ confidence and motivation. The use of self-assessment is also important in establishing a positive assessment climate.

Candela Citations

  • Teacher-Made Assessments. Authored by : Nicole Arduini-Van Hoose. Provided by : Hudson Valley Community College. Retrieved from : https://courses.lumenlearning.com/edpsy/chapter/teacher-made-assessments/. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Educational Psychology. Authored by : Kelvin Seifert and Rosemary Sutton. Retrieved from : https://courses.lumenlearning.com/educationalpsychology. License : CC BY: Attribution
  • Creating Rubrics. Authored by : Tom Lombardo. Provided by : Rock Valley College. Retrieved from : https://www.rockvalleycollege.edu/Academics/ATLE/upload/Creating-Rubrics.pdf. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike

Educational Psychology Copyright © 2020 by Nicole Arduini-Van Hoose is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Your Article Library

Teacher-Made Test: Meaning, Features and Uses


After reading this article you will learn about: 1. Meaning of Teacher-Made Tests 2. Features of Teacher-Made Tests 3. Steps/Principles of Construction 4. Uses.

Meaning of Teacher-Made Tests:

Carefully constructed teacher-made tests and standardised tests are similar in many ways. Both are constructed on the basis of a carefully planned table of specifications, both use the same types of test items, and both provide clear directions to the students.

Still, the two differ. They differ in the quality of test items, the reliability of the measures, the procedures for administering and scoring, and the interpretation of scores. No doubt, standardised tests are better in quality, more reliable and more valid.

But a classroom teacher cannot always depend on standardised tests. These may not suit his local needs, may not be readily available, may be costly, and may have different objectives. In order to fulfill the immediate requirements, the teacher has to prepare his own tests, which are usually objective type in nature.

Teacher-made tests are normally prepared and administered for testing classroom achievement of students, evaluating the method of teaching adopted by the teacher and other curricular programmes of the school.

The teacher-made test is one of the most valuable instruments in the hands of the teacher to serve his purpose. It is designed to meet the problems or requirements of the class for which it is prepared.

It is prepared to measure the outcomes and content of the local curriculum. It is very flexible, so it can be adapted to any procedure and material. It does not require any sophisticated technique for preparation.

Taylor has highly recommended the use of these teacher-made objective type tests, which require neither all four steps of standardised tests nor the rigorous processes of standardisation. Only the first two steps, planning and preparation, are sufficient for their construction.

Features of Teacher-Made Tests:

1. The items of the tests are arranged in order of difficulty.

2. These are prepared by teachers and can be used for prognosis and diagnosis purposes.

3. The test covers the whole content area and includes a large number of items.

4. The preparation of the items conforms to the blueprint.

5. Test construction is not a single man’s business; rather, it is a co-operative endeavour.

6. A teacher-made test does not cover all the steps of a standardised test.

7. Teacher-made tests may also be employed as a tool for formative evaluation.

8. Preparation and administration of these tests are economical.

9. The test is developed by the teacher to ascertain the student’s achievement and proficiency in a given subject.

10. Teacher-made tests are least used for research purposes.

11. They do not have norms whereas providing norms is quite essential for standardised tests.

Steps/Principles of Construction of Teacher-made Test:

A teacher-made test does not require well-planned preparation. Even so, to make it a more efficient and effective tool of evaluation, careful consideration needs to be given while constructing such tests.

The following steps may be followed for the preparation of teacher-made test:

1. Planning:

Planning of a teacher-made test includes :

a. Determining the purpose and objectives of the test, i.e. what to measure and why to measure it.

b. Deciding the length of the test and portion of the syllabus to be covered.

c. Specifying the objectives in behavioural terms. If needed, a table can even be prepared for specifications and weightage given to the objectives to be measured.

d. Deciding the number and forms of items (questions) according to the blueprint.

e. Having a clear knowledge and understanding of the principles of constructing essay type, short answer type and objective type questions.

f. Deciding date of testing much in advance in order to give time to teachers for test preparation and administration.

g. Seeking the co-operation and suggestion of co-teachers, experienced teachers of other schools and test experts.
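Steps (c) and (d) above, the table of specifications and the blueprint, amount to simple proportional arithmetic: each objective receives a weight, and test items are allocated in proportion to that weight. A minimal sketch, with hypothetical objective weights and test length:

```python
# Hypothetical table of specifications: weight given to each instructional
# objective, and the planned total length of the test.
objective_weights = {
    "knowledge": 0.30,
    "comprehension": 0.30,
    "application": 0.25,
    "analysis": 0.15,
}
total_items = 40

# Allocate items in proportion to each weight. With these weights the
# rounded counts sum exactly to 40; in general, rounding may leave the
# total off by an item, which the teacher adjusts by hand.
blueprint = {obj: round(w * total_items) for obj, w in objective_weights.items()}
print(blueprint)
# {'knowledge': 12, 'comprehension': 12, 'application': 10, 'analysis': 6}
```

Writing the allocation down this way makes it easy to check that the item counts actually reflect the intended emphasis before any items are written.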

2. Preparation of the Test:

Planning is the philosophical aspect of test construction, and preparation is the practical aspect. All practical considerations have to be taken into account while constructing the test. It is an art, a technique; one either has it or must acquire it. It requires much thinking, rethinking and reading before constructing test items.

Different types of objective test items, viz. multiple choice, short-answer type and matching type, can be constructed. After construction, test items should be given to others for review and for seeking their opinions.

Suggestions may be sought on the language, the modalities of the items, the statements given, the correct answers supplied, and other possible errors anticipated. The suggestions and views thus sought will help the test constructor modify and verify his items afresh to make them more acceptable and usable.

After construction of the test, items should be arranged in simple-to-complex order. For arranging the items, a teacher can adopt many methods, viz. group-wise, unit-wise, topic-wise, etc. A scoring key should also be prepared forthwith to avoid delay in scoring.

Directions are an important part of test construction. Without proper directions or instructions, there is a risk of compromising the reliability of the test, and students may misunderstand what is required.

Thus, the directions should be simple and adequate to enable the students to know:

(i) The time for completion of test,

(ii) The marks allotted to each item,

(iii) Required number of items to be attempted,

(iv) How and where to record the answers, and

(v) The materials, like graph papers or logarithmic table to be used.

Uses of Teacher-Made Tests:

1. To help a teacher know whether the class is normal, average, above average or below average.

2. To help him in formulating new strategies for teaching and learning.

3. A teacher-made test may be used as a full-fledged achievement test which covers the entire course of a subject.

4. To measure students’ academic achievement in a given course.

5. To assess how far specified instructional objectives have been achieved.

6. To know the efficacy of learning experiences.

7. To diagnose students’ learning difficulties and to suggest necessary remedial measures.

8. To certify, classify or grade the students on the basis of resulting scores.

9. Skillfully prepared teacher-made tests can serve the purpose of standardised tests.

10. Teacher-made tests can help a teacher to render guidance and counseling.

11. Good teacher-made tests can be exchanged among neighbouring schools.

12. These tests can be used as a tool for formative, diagnostic and summative evaluation.

13. To assess pupils’ growth in different areas.


The Classroom | Empowering Students in Their College Journey

The Similarities and Difference of Classroom Test and Standardized Achievement Test

Melanie Forstall


The education process is very much a cycle: a continual exchange between teacher and student. The cycle focuses on planning, delivery and assessment, in which both teacher and student continually assess whether the content and its delivery were successful. For the teacher, it is important to determine whether the students received the information, retained it and were able to process it. For students, the process often involves determining whether the content was effectively presented and whether the information was retained. Also, the results of an assessment often give students critical information such as ranking, overall performance and whether specific criteria were met in order to pass on to the next class.

For both students and teachers, often the main tool used to determine the effectiveness of instruction is the test instrument. Teachers use the test to determine if the content was presented effectively, and students use the test to determine if they understood the content and, sometimes more importantly to the students, if they passed the class.

Different Tests Measure Different Things

It’s important to understand that within the scope of all test instruments there is a vast array of differences, and not all tests measure the same things. There are important differences between teacher-made tests and standardized tests, and there are differences even within standardized tests. The differences often have to do with how the scores may be interpreted. Therefore, it is critically important to understand what a test actually measures (and what it doesn’t) so that comparisons are made using similar measurements. It is also important for test takers to understand the types of tests and what their individual scores actually mean. Lastly, it is important for teachers and test takers to understand that there are many advantages and disadvantages to both teacher-made tests and standardized tests.

What Are Classroom Tests and Standardized Achievement Tests?

Classroom tests, also referred to as teacher-made tests, play a vital role in classroom assessments. These types of assessments are considered nonstandardized tests. Classroom assessments help determine if students mastered content. For most students, the grade on the test is important because it may determine if they pass or fail. Doing well on classroom assessments indicates mastery of content and also helps build student confidence along the way. Classroom assessments also give important insights to the instructor. For example, if a large majority of students miss a question or questions on a test, this signals to the instructor that students possibly needed more time or more clarification on that topic.

Classroom tests, or teacher-made tests, may be a combination of multiple choice, true/false, short answer or essay questions. The content of the test should be directly related to the content delivered and discussed in the class. It may also cover the content in textbook readings or articles shared with the class. The content for classroom assessments is directly related to the specific content taught in the class. The assessment may also require students to apply or explain what they have learned, as in an essay question or performance assessment.

However, there are differences between teacher-made tests and standardized tests. When you think of teacher-made tests versus standardized tests, it's important to understand the differences between the type of assessment and what they each measure. Classroom, or teacher-made tests, are not standardized and therefore may be open to broader interpretation. Additionally, teacher-made tests often assess very specific content or skills often associated with a unit of study. It is important to understand that there are advantages and disadvantages of nonstandardized tests.

In contrast to classroom tests, a standardized test is by definition standardized. A standardized test includes the same format, the same types of questions and the same content no matter when or where the test is administered and no matter who is taking the test. Questions may be multiple choice, true/false or short answer, and it is administered either by paper and pencil or on a computer. Additionally, standardized tests tend to measure a broader range of knowledge and skills. Examples of standardized tests used for undergraduate acceptance include the SAT and ACT. Standardized tests often used for acceptance into graduate school include the LSAT, MCAT and GRE.

What Are Norm-Referenced and Criterion-Referenced Tests?

Within the set of standardized tests, it is important to understand that there is a difference between norm-referenced and criterion-referenced tests. While both are standardized, norm-referenced tests measure and rank test takers against each other. A test taker’s score is compared to the norm of similar test takers and may be expressed as a percentile, grade equivalent or stanine. Criterion-referenced tests measure the number of correct responses against a specific criterion of what is expected to pass the exam or what counts as acceptable achievement. A criterion-referenced test score may be expressed as a percentage correct out of the total.
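The two kinds of score can be shown with a little arithmetic. In this sketch, the student's score, the norm group, and the 80% cut score are all hypothetical, and the percentile rank uses one common definition (the percent of the norm group scoring below the student; other definitions handle ties differently):

```python
# Hypothetical data: one test taker and a small norm (comparison) group.
student_correct = 42
total_questions = 50
norm_group_scores = [28, 31, 35, 36, 38, 40, 41, 43, 45, 47]

# Criterion-referenced report: percent correct against a fixed standard.
percent_correct = 100 * student_correct / total_questions
passed = percent_correct >= 80  # the 80% cut score is an assumption

# Norm-referenced report: percentile rank relative to the norm group,
# here the percent of the group scoring strictly below the student.
below = sum(s < student_correct for s in norm_group_scores)
percentile_rank = 100 * below / len(norm_group_scores)

print(f"{percent_correct:.0f}% correct, pass={passed}")  # 84% correct, pass=True
print(f"percentile rank: {percentile_rank:.0f}")         # percentile rank: 70
```

Note that the criterion-referenced result depends only on the student and the cut score, while the norm-referenced result would change if the comparison group changed, which is exactly the distinction drawn above.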

The two types of tests each serve a different purpose and the scores are used differently. Schools and teachers may use norm-referenced test scores to rank student achievement across broad areas of knowledge. Criterion-referenced scores may be used to determine if a student has mastered specific skills or concepts in specific areas of study.

Why Teachers Use Classroom and Standardized Achievement Tests

Classroom teachers may use tests for a variety of reasons. Most notably, a well-constructed teacher-made test will inform the teacher if students mastered the content taught in class. Classroom tests can be summative, in which the assessment comes at the end of a unit of study and encompasses all of the content covered in class. Think of this as a final exam. Additionally, teachers may conduct smaller formative assessments, in which a small amount of material is covered. A formative assessment is used by the teacher to determine if the class is ready to move on to the next phase of content. The results of formative assessments should inform the teacher of the next instructional steps.

Standardized tests, however, are used for very different reasons. Norm-referenced standardized tests are often used to rank students across a broad range of knowledge. Universities and colleges often use these test scores because of the ability to rank students to see how they compare to other test takers. When colleges and universities are looking for the top-performing students, these standardized test scores will illuminate the top performers. Since criterion-referenced standardized tests are designed to determine how much a student knows or if a student has mastered a specific skill set, these tests would be used to determine if a student can move on to the next grade level. In this case, the scores on a criterion-referenced test are not compared to other test takers. The performance of other test takers in this case is irrelevant; the focus is simply on the performance of each individual test taker.

Differences Between Standardized and Nonstandardized Tests

Whether norm-referenced or criterion-referenced, there are differences between standardized and nonstandardized tests. Both norm-referenced and criterion-referenced tests share the common characteristic of being measurable and quantifiable. Scores from standardized tests are reported as a numeric measure, often a percentile, percentage or grade equivalent. Nonstandardized assessments, however, are not quantifiable in the same way. Some examples of nonstandardized assessments include portfolios, observations or checklists. Often, these types of performance-based assessments are graded using a rubric. Nonstandardized assessments are usually less structured and more organic.

Both standardized and nonstandardized assessments play an important role in education. However, they are used in very different ways. While standardized assessments are used to rank students or determine individual student performance, nonstandardized assessments may be used to determine a student’s strengths or to highlight a particular skill. While a rubric may give some structure to this type of assessment, portfolios and performance assessments typically allow for high levels of student input and creativity. While these assessments are important in creating a holistic look at a student’s overall performance, they can’t necessarily be used in the same way as standardized assessments.

Advantages and Disadvantages of Standardized Tests

Standardized testing has been under fire recently by many who object to this type of testing for students. While opponents list several reasons why testing should be stopped, there is an equally long list of reasons why testing is still supported by schools and educators across the nation. Essentially, there are advantages and disadvantages of standardized tests. First, standardized testing provides an objective and reliable measure of student ability and achievement. These tests have undergone such a strict norming process that the results are statistically reliable and valid. Additionally, this form of measurement has increased accountability for many schools, requiring them to set rigorous yet attainable goals for long-term student success. As a result, many schools have made significant and sustainable gains in overall student achievement.

Because of the standardized nature of these tests, they are highly inclusive. A higher proportion of students with disabilities are taking standardized tests today than ever before, furthering inclusive practices for students of all abilities. The high-stakes nature of these tests has improved the focus and quality of the necessary skills being taught for students to be successful at the college level and beyond.

However, one main drawback of standardized tests is that not all students are successful at scoring well on them. For some students, this type of assessment does not accurately reflect their true abilities, which certainly can cause added stress and struggle for students. Opponents feel that standardized tests focus so narrowly on a few skills that they do not provide a true reflection of overall student achievement and should not be a central factor in the education process. Some suggest that the narrowing of the curriculum has actually lessened student preparation for college and career readiness. Essentially, opponents feel that there is too much emphasis and focus placed on the test.

Why Are Tests Important?

Assessment is a critical part of the teaching and learning cycle. Assessments provide an opportunity for students to demonstrate their understanding of material. It is also important to determine if mastery of standards for specific content has been met. For educators, assessment results help drive instruction and guide decisions about overall content. Additionally, assessments help highlight student performance. For high-scoring students, test results are critical for highlighting exceptional performance. The ability for test scores to set one test taker apart from all other test takers can be highly valuable for some students.

As with many things in education, there isn’t usually a one-size-fits-all solution when it comes to assessment. As students are multifaceted, so are the assessments available to them. It is important to remember that different forms of assessment are good for some purposes and not others, and each has strengths and weaknesses. Therefore, it’s important not to focus too much on any one specific test. While colleges and universities do look at test scores, they often look at other assessments too. Find the tests on which you do well and focus on those. If there is a certain test on which you don’t do well, be sure to include other stand-out assessments such as written essays, portfolios, or interviews to balance out the overall picture of your abilities. A combination of assessment results will provide a clear, multifaceted picture of you as an individual student.

Related Articles

The Difference Between Standardized & Norm Reference Tests

The Difference Between Standardized & Norm Reference Tests

Strengths & Weaknesses of the Kaufman Test of Educational Achievement

Strengths & Weaknesses of the Kaufman Test of Educational Achievement

High School Assessment & Test Analysis

High School Assessment & Test Analysis

Achievement Vs. Aptitude Tests

Achievement Vs. Aptitude Tests

Cognitive Testing Vs. Achievement Testing

Cognitive Testing Vs. Achievement Testing

Differences Between Assessment & Testing

Differences Between Assessment & Testing

Types of Standardized Test Scores

Types of Standardized Test Scores

Testing & Assessment in Education

Testing & Assessment in Education

  • TeachThought: 16 Standardized Tests Being Used in Education Today

Melanie Forstall has a doctorate in education and has worked in the field of education for over 20 years. She has been a teacher, grant writer, program director, and higher education instructor. She is a freelance writer specializing in education and education-related content. She writes for We Are Teachers, School Leaders Now, Classroom, Pocket Sense, local parenting magazines, and other professional academic outlets. Additionally, she has co-authored book chapters on providing services for students with disabilities.


12 Teacher-made assessment strategies

Kym teaches sixth grade students in an urban school where most of the families in the community live below the poverty line. Each year the majority of the students in her school fail the state-wide tests. Kym follows school district teaching guides and typically uses direct instruction in her Language Arts and Social Studies classes. The classroom assessments are designed to mirror those on the state-wide tests so the students become familiar with the assessment format. When Kym is in a graduate summer course on motivation she reads an article called “Teaching strategies that honor and motivate inner-city African American students” (Teel, Debrin-Parecki, & Covington, 1998) and she decides to change her instruction and assessment in the fall in four ways. First, she stresses an incremental approach to ability, focusing on effort, and allows students to revise their work several times until the criteria are met. Second, she gives students choices in performance assessments (e.g. oral presentation, art project, creative writing). Third, she encourages responsibility by asking students to assist in classroom tasks such as setting up video equipment, handing out papers, etc. Fourth, she validates students’ cultural heritage by encouraging them to read biographies and historical fiction from their own cultural backgrounds. Kym reports that the changes in her students’ effort and demeanor in class are dramatic: students are more enthusiastic, work harder, and produce better products. At the end of the year, twice as many of her students pass the state-wide test as in the previous year.

Afterward, Kym still teaches sixth grade in the same school district and continues to modify the strategies described above. Even though the performance of the students she taught improved, the school was closed because, on average, the students’ performance was poor. Kym earned a Ph.D. and teaches Educational Psychology to preservice and inservice teachers in evening classes.

Kym’s story illustrates several themes related to assessment that we explore in this chapter on teacher-made assessment strategies and in Chapter 13 on standardized testing. First, choosing effective classroom assessments is related to instructional practices, beliefs about motivation, and the presence of state-wide standardized testing. Second, some teacher-made classroom assessments enhance student learning and motivation; some do not. Third, teachers can improve their teaching through action research. This involves identifying a problem (e.g. low motivation and achievement), learning about alternative approaches (e.g. reading the literature), implementing the new approaches, observing the results (e.g. students’ effort and test results), and continuing to modify the strategies based on those observations.

Best practices in assessing student learning have undergone dramatic changes in the last 20 years. When Rosemary was a mathematics teacher in the 1970s, she did not assess students’ learning; she tested them on the mathematics knowledge and skills she had taught during the previous weeks. The tests varied little in format and students always did them individually with pencil and paper. Many teachers, including mathematics teachers, now use a wide variety of methods to determine what their students have learned and also use this assessment information to modify their instruction. In this chapter the focus is on using classroom assessments to improve student learning, and we begin with some basic concepts.

Basic concepts

Assessment is an integrated process of gaining information about students’ learning and making value judgments about their progress (Linn & Miller, 2005). Information about students’ progress can be obtained from a variety of sources including projects, portfolios, performances, observations, and tests. The information about students’ learning is often assigned specific numbers or grades, and this involves measurement. Measurement answers the question, “How much?” and is used most commonly when the teacher scores a test or product and assigns numbers (e.g. 28/30 on the biology test; 90/100 on the science project). Evaluation is the process of making judgments about the assessment information (Airasian, 2005). These judgments may be about individual students (e.g. should Jacob’s course grade take into account his significant improvement over the grading period?), the assessment method used (e.g. is a multiple choice test a useful way to obtain information about problem solving?), or one’s own teaching (e.g. most of the students this year did much better on the essay assignment than last year, so my new teaching methods seem effective).

The primary focus in this chapter is on assessment for learning, where the priority is designing and using assessment strategies to enhance student learning and development. Assessment for learning is often formative assessment, i.e. it takes place during the course of instruction by providing information that teachers can use to revise their teaching and students can use to improve their learning (Black, Harrison, Lee, Marshall & Wiliam, 2004). Formative assessment includes both informal assessment involving spontaneous unsystematic observations of students’ behaviors (e.g. during a question and answer session or while the students are working on an assignment) and formal assessment involving pre-planned, systematic gathering of data. Assessment of learning is formal assessment that involves assessing students in order to certify their competence and fulfill accountability mandates and is the primary focus of the next chapter on standardized tests but is also considered in this chapter. Assessment of learning is typically summative, that is, administered after the instruction is completed (e.g. a final examination in an educational psychology course). Summative assessments provide information about how well students mastered the material, whether students are ready for the next unit, and what grades should be given (Airasian, 2005).

Assessment for learning: an overview of the process

Using assessment to advance students’ learning, not just to check on learning, requires viewing assessment as a process that is integral to all phases of teaching, including planning, classroom interactions and instruction, communication with parents, and self-reflection (Stiggins, 2002). Essential steps in assessment for learning include:

Step 1: Having clear instructional goals and communicating them to students

In the previous chapter we documented the importance of teachers thinking carefully about the purposes of each lesson and unit. This may be hard for beginning teachers. For example, Vanessa, a middle school social studies teacher, might say that the goal of her next unit is: “Students will learn about the Civil War.” Clearer goals require that Vanessa decide what it is about the US Civil War she wants her students to learn, e.g. the dates and names of battles, the causes of the US Civil War, the differing perspectives of those living in the North and the South, or the day-to-day experiences of soldiers fighting in the war. Vanessa cannot devise appropriate assessments of her students’ learning about the US Civil War until she is clear about her own purposes.

For effective teaching, Vanessa also needs to communicate the goals and objectives clearly to her students so they know what is important for them to learn. No matter how thorough a teacher’s planning has been, if students do not know what they are supposed to learn they will not learn as much. Because communication is so important to teachers, a specific chapter is devoted to this topic (Chapter 9), so communication is not considered in any detail in this chapter.

Step 2: Selecting appropriate assessment techniques

Selecting and administering assessment techniques that are appropriate for the goals of instruction as well as the developmental level of the students are crucial components of effective assessment for learning. Teachers need to know the characteristics of a wide variety of classroom assessment techniques and how these techniques can be adapted for various content, skills, and student characteristics. They also should understand the role that reliability, validity, and the absence of bias should play in choosing and using assessment techniques. Much of this chapter focuses on this information.

Step 3: Using assessment to enhance motivation and confidence

Students’ motivation and confidence are influenced by the type of assessment used as well as the feedback given about the assessment results. Consider Samantha, a college student who takes a history class in which the professor’s lectures and textbook focus on really interesting major themes. However, the assessments are all multiple choice tests that ask about facts, and Samantha, who initially enjoys the classes and readings, becomes angry, loses confidence that she can do well, and begins to spend less time on the class material. In contrast, some instructors have observed that many students in educational psychology classes like the one you are now taking will work harder on assessments that are case studies rather than more traditional exams or essays. The type of feedback provided to students is also important, and we elaborate on these ideas later in this chapter.

Step 4: Adjusting instruction based on information

An essential component of assessment for learning is that the teacher uses the information gained from assessment to adjust instruction. These adjustments occur in the middle of a lesson, when a teacher may decide that students’ responses to questions indicate sufficient understanding to introduce a new topic, or that her observations of students’ behavior indicate that they do not understand the assignment and so need further explanation. Adjustments also occur when the teacher reflects on the instruction after the lesson is over and is planning for the next day. We provide examples of adjusting instruction in this chapter and consider teacher reflection in more detail in Chapter 2.

Step 5: Communicating with parents and guardians

Students’ learning and development are enhanced when teachers communicate with parents regularly about their children’s performance. Teachers communicate with parents in a variety of ways including newsletters, telephone conversations, email, school district websites, and parent-teacher conferences. Effective communication requires that teachers can clearly explain the purpose and characteristics of the assessment as well as the meaning of students’ performance. This requires a thorough knowledge of the types and purposes of teacher-made and standardized assessments (this chapter and Chapter 13) as well as clear communication skills (Chapter 9).

We now consider each step in the process of assessment for learning in more detail. In order to be able to select and administer appropriate assessment techniques teachers need to know about the variety of techniques that can be used as well as what factors ensure that the assessment techniques are high quality. We begin by considering high quality assessments.

Selecting appropriate assessment techniques I: high quality assessments

For an assessment to be high quality it needs to have good validity and reliability as well as absence of bias.

Validity is the evaluation of the “adequacy and appropriateness of the interpretations and uses of assessment results” for a given group of individuals (Linn & Miller, 2005, p. 68). For example, is it appropriate to conclude that the results of a mathematics test on fractions given to recent immigrants accurately represent their understanding of fractions? Is it appropriate for the teacher to conclude, based on her observations, that a kindergarten student, Jasmine, has Attention Deficit Disorder because she does not follow the teacher’s oral instructions? Obviously in each situation other interpretations are possible: the immigrant students may have poor English skills rather than poor mathematics skills, and Jasmine may be hearing impaired.

It is important to understand that validity refers to the interpretation and uses made of the results of an assessment procedure, not to the assessment procedure itself. For example, making judgments about the results of the same test on fractions may be valid if the students all understand English well. A teacher concluding from her observations that the kindergarten student has Attention Deficit Disorder (ADD) may be appropriate if the student has been screened for hearing and other disorders (although the classification of a disorder like ADD cannot be made by one teacher). Validity involves making an overall judgment of the degree to which the interpretations and uses of the assessment results are justified. Validity is a matter of degree (e.g. high, moderate, or low validity) rather than all-or-none (e.g. totally valid vs totally invalid) (Linn & Miller, 2005).

Three sources of evidence are considered when assessing validity: content, construct, and criterion-related. Content validity evidence is associated with the question: How well does the assessment include the content or tasks it is supposed to? For example, suppose your educational psychology instructor devises a mid-term test and tells you it includes chapters one to seven in the textbook. Obviously, all the items in the test should be based on the content from educational psychology, not your methods or cultural foundations classes. Also, the items in the test should cover content from all seven chapters and not just chapters three to seven—unless the instructor tells you that these chapters have priority.

Teachers have to be clear about their purposes and priorities for instruction before they can begin to gather evidence related to content validity. Content validation determines the degree to which assessment tasks are relevant to and representative of the tasks judged by the teacher (or test developer) to represent their goals and objectives (Linn & Miller, 2005). It is important for teachers to think about content validation when devising assessment tasks, and one way to do this is to devise a Table of Specifications. An example, based on Pennsylvania’s state standards for grade 3 geography, is in Table 35. In the left hand column is the instructional content for a 20-item test the teacher has decided to construct, with two kinds of instructional objectives: identification and use or location. The second and third columns identify the number of items for each content area and each instructional objective. Notice that the teacher has decided that six items should be devoted to the sub-area of geographic representations—more than any other sub-area. Devising a table of specifications helps teachers determine whether some content areas or concepts are over-sampled (i.e. there are too many items) and some are under-sampled (i.e. there are too few items).

Table 35: Example of Table of Specifications: grade 3 basic geography literacy
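
The bookkeeping behind a table of specifications can be sketched in code: tally the planned items by content area and flag areas that are over- or under-sampled. This is a minimal, hypothetical sketch; the content areas, objective labels, item counts, and sampling thresholds below are invented stand-ins, not the actual Pennsylvania standards.

```python
# (content area, instructional objective) -> planned number of items.
# All entries below are illustrative placeholders.
spec = {
    ("Geographic representations", "identifies"): 3,
    ("Geographic representations", "uses or locates"): 3,
    ("Physical features", "identifies"): 2,
    ("Physical features", "uses or locates"): 2,
    ("Human features", "identifies"): 3,
    ("Human features", "uses or locates"): 3,
    ("Regions", "identifies"): 2,
    ("Regions", "uses or locates"): 2,
}

def check_sampling(spec, total_items, max_share=0.35, min_share=0.05):
    """Sum items per content area and flag over-/under-sampled areas.
    The share thresholds are arbitrary rules of thumb."""
    by_area = {}
    for (area, _objective), n in spec.items():
        by_area[area] = by_area.get(area, 0) + n
    assert sum(by_area.values()) == total_items, "item counts must sum to test length"
    flags = {}
    for area, n in by_area.items():
        share = n / total_items
        if share > max_share:
            flags[area] = "over-sampled"
        elif share < min_share:
            flags[area] = "under-sampled"
    return by_area, flags

by_area, flags = check_sampling(spec, total_items=20)
print(by_area)   # items planned per content area
print(flags)     # any areas outside the planned balance
```

As in the chapter's example, geographic representations receives six items, more than any other sub-area, yet stays within the (invented) balance thresholds.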

Construct validity evidence is more complex than content validity evidence. Often we are interested in making broader judgments about students’ performances than specific skills such as doing fractions. The focus may be on constructs such as mathematical reasoning or reading comprehension. A construct is a characteristic of a person we assume exists in order to help explain behavior. For example, we use the concept of test anxiety to explain why some individuals, when taking a test, have difficulty concentrating, have physiological reactions such as sweating, and perform poorly on tests but not on class assignments. Similarly, mathematical reasoning and reading comprehension are constructs, as we use them to help explain performance on an assessment. Construct validation is the process of determining the extent to which performance on an assessment can be interpreted in terms of the intended constructs and is not influenced by factors irrelevant to the construct. For example, judgments about recent immigrants’ performance on a mathematical reasoning test administered in English will have low construct validity if the results are influenced by English language skills that are irrelevant to mathematical problem solving. Similarly, the construct validity of end-of-semester examinations is likely to be poor for those students who are highly anxious when taking major tests but not during regular class periods or when doing assignments. Teachers can help increase construct validity by trying to reduce factors that influence performance but are irrelevant to the construct being assessed. These factors include anxiety, English language skills, and reading speed (Linn & Miller, 2005).

A third form of validity evidence is called criterion-related validity. Selective colleges in the USA use the ACT or SAT among other criteria to choose who will be admitted because these standardized tests help predict freshman grades, i.e. they have high criterion-related validity. Some K-12 schools give students math or reading tests in the fall semester in order to predict which students are likely to do well on the annual state tests administered in the spring semester and which are unlikely to pass and will need additional assistance. If the tests administered in the fall do not predict students’ performances accurately, then the additional assistance may be given to the wrong students, illustrating the importance of criterion-related validity.
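
The fall-to-spring prediction just described is usually quantified as a correlation between the two sets of scores. Below is a hedged sketch with invented scores for ten hypothetical students; the Pearson coefficient serves as the criterion-related validity coefficient.

```python
# Invented scores: a fall screening test (predictor) and the spring
# state test (criterion) for the same ten students.
import statistics

fall   = [12, 15, 9, 20, 17, 7, 14, 18, 11, 16]
spring = [58, 66, 41, 88, 75, 35, 62, 80, 50, 70]

def pearson_r(x, y):
    """Pearson correlation between predictor and criterion scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

r = pearson_r(fall, spring)
print(f"criterion-related validity coefficient r = {r:.2f}")
```

A high coefficient means the fall test ranks students much as the spring test will, so extra help goes to the right students; a low coefficient warns that the screening is misdirecting that help.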

Reliability

Reliability refers to the consistency of the measurement (Linn & Miller, 2005). Suppose Mr Garcia is teaching a unit on food chemistry in his tenth grade class and gives an assessment at the end of the unit using test items from the teachers’ guide. Reliability is related to questions such as: How similar would the scores of the students be if they had taken the assessment on a Friday or a Monday? Would the scores have varied if Mr Garcia had selected different test items, or if a different teacher had graded the test? An assessment provides information about students by using a specific measure of performance at one particular time. Unless the results from the assessment are reasonably consistent over different occasions, different raters, or different tasks (in the same content domain), confidence in the results will be low, and the results cannot be useful in improving student learning.

Obviously we cannot expect perfect consistency. Students’ memory, attention, fatigue, effort, and anxiety fluctuate and so influence performance. Even trained raters vary somewhat when grading assessments such as essays, a science project, or an oral presentation. Also, the wording and design of specific items influence students’ performances. However, some assessments are more reliable than others, and there are several strategies teachers can use to increase reliability.

First, assessments with more tasks or items typically have higher reliability. To understand this, consider two tests, one with five items and one with 50 items. Chance factors influence the shorter test more than the longer test. If a student misunderstands one of the items on the five-item test, the total score is very highly influenced (it would be reduced by 20 per cent). In contrast, one confusing item on the 50-item test influences the total score much less (by only 2 per cent). Obviously this does not mean that assessments should be inordinately long, but, on average, enough tasks should be included to reduce the influence of chance variations. Second, clear directions and tasks help increase reliability. If the directions or wording of specific tasks or items are unclear, then students have to guess what they mean, undermining the accuracy of the results. Third, clear scoring criteria are crucial in ensuring high reliability (Linn & Miller, 2005). Later in this chapter we describe strategies for developing scoring criteria for a variety of types of assessment.
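
The arithmetic above (one item swings a five-item score by 20 per cent, but a 50-item score by only 2 per cent) can be extended into a small simulation. This is a rough sketch with invented numbers: each item is treated as a chance-influenced observation of a student's true ability, and reliability is estimated as the correlation between two parallel forms of the same length.

```python
# Simulate two parallel test forms and estimate reliability as the
# correlation between students' totals on the two forms.
import random
import statistics

random.seed(42)

def item_correct(ability):
    # Whether one item is answered correctly depends on true ability
    # plus chance (guessing, wording, momentary attention).
    return random.random() < ability

def two_forms(n_students, n_items):
    """Give every simulated student two parallel forms; return paired totals."""
    a_scores, b_scores = [], []
    for _ in range(n_students):
        ability = random.uniform(0.3, 0.9)  # hypothetical true ability
        a_scores.append(sum(item_correct(ability) for _ in range(n_items)))
        b_scores.append(sum(item_correct(ability) for _ in range(n_items)))
    return a_scores, b_scores

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

short_r = pearson_r(*two_forms(500, 5))
long_r = pearson_r(*two_forms(500, 50))
print(f"parallel-forms reliability with  5 items: r = {short_r:.2f}")
print(f"parallel-forms reliability with 50 items: r = {long_r:.2f}")
```

Because chance averages out over more items, the 50-item form agrees with its parallel form far more closely than the five-item form does, which is the point of the first strategy above.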

Absence of bias

Bias occurs in assessment when there are components in the assessment method or administration of the assessment that distort the performance of the student because of their personal characteristics such as gender, ethnicity, or social class (Popham, 2005). Two types of assessment bias are important: offensiveness and unfair penalization . An assessment is most likely to be offensive to a subgroup of students when negative stereotypes are included in the test. For example, the assessment in a health class could include items in which all the doctors were men and all the nurses were women. Or, a series of questions in a social studies class could portray Latinos and Asians as immigrants rather than native born Americans. In these examples, some female, Latino or Asian students are likely to be offended by the stereotypes and this can distract them from performing well on the assessment.

Unfair penalization occurs when items disadvantage one group not because they may be offensive but because of differential background experiences. For example, an item for math assessment that assumes knowledge of a particular sport may disadvantage groups not as familiar with that sport (e.g. American football for recent immigrants). Or an assessment on team work that asks students to model their concept of a team on a symphony orchestra is likely to be easier for those students who have attended orchestra performances—probably students from affluent families. Unfair penalization does not occur just because some students do poorly in class. For example, asking questions about a specific sport in a physical education class when information on that sport had been discussed in class is not unfair penalization as long as the questions do not require knowledge beyond that taught in class that some groups are less likely to have.

It can be difficult for new teachers teaching in multi-ethnic classrooms to devise interesting assessments that do not penalize any groups of students. Teachers need to think seriously about the impact of students’ differing backgrounds on the assessment they use in class. Listening carefully to what students say is important as is learning about the backgrounds of the students.

Selecting appropriate assessment techniques II: types of teacher-made assessments

One of the challenges for beginning teachers is to select and use appropriate assessment techniques. In this section we summarize the wide variety of types of assessments that classroom teachers use. First we discuss the informal techniques teachers use during instruction that typically require instantaneous decisions. Then we consider formal assessment techniques that teachers plan before instruction and allow for reflective decisions.

Teachers’ observation, questioning, and record keeping

During teaching, teachers not only have to communicate the information they planned but also continuously monitor students’ learning and motivation in order to determine whether modifications have to be made (Airasian, 2005). Beginning teachers find this more difficult than experienced teachers because of the complex cognitive skills required to improvise and be responsive to students’ needs while simultaneously keeping in mind the goals and plans of the lesson (Borko & Livingston, 1989). The informal assessment strategies teachers most often use during instruction are observation and questioning.

Observation

Effective teachers observe their students from the time they enter the classroom. Some teachers greet their students at the door not only to welcome them but also to observe their mood and motivation. Are Hannah and Naomi still not talking to each other? Does Ethan have his materials with him? Gaining information on such questions can help the teacher foster student learning more effectively (e.g. suggesting Ethan goes back to his locker to get his materials before the bell rings or avoiding assigning Hannah and Naomi to the same group).

During instruction, teachers observe students’ behavior to gain information about students’ level of interest and understanding of the material or activity. Observation includes looking at non-verbal behaviors as well as listening to what the students are saying. For example, a teacher may observe that a number of students are looking out of the window rather than watching the science demonstration, or a teacher may hear students making comments in their group indicating they do not understand what they are supposed to be doing. Observations also help teachers decide which student to call on next, whether to speed up or slow down the pace of the lesson, when more examples are needed, whether to begin or end an activity, how well students are performing a physical activity, and if there are potential behavior problems (Airasian, 2005). Many teachers find that moving around the classroom helps them observe more effectively because they can see more students from a variety of perspectives. However, the fast pace and complexity of most classrooms make it difficult for teachers to gain as much information as they want.

Questioning

Teachers ask questions for many instructional reasons including keeping students’ attention on the lesson, highlighting important points and ideas, promoting critical thinking, allowing students to learn from each other’s answers, and providing information about students’ learning. Devising appropriate questions and using students’ responses to make effective instantaneous instructional decisions is very difficult. Some strategies to improve questioning include planning and writing down the instructional questions that will be asked, allowing sufficient wait time for students to respond, listening carefully to what students say rather than listening for what is expected, varying the types of questions asked, making sure some of the questions are higher level, and asking follow-up questions.

While the informal assessment based on spontaneous observation and questioning is essential for teaching, there are inherent problems with the validity, reliability, and bias of this information (Airasian, 2005; Stiggins, 2005). We summarize these issues and some ways to reduce the problems in Table 36.

Table 36: Validity and reliability of observation and questioning

Record keeping

Keeping records of observations improves reliability and can be used to enhance understanding of one student, a group, or the whole class’s interactions. Sometimes this requires help from other teachers. For example, Alexis, a beginning science teacher, is aware of research documenting that longer wait times enhance students’ learning (e.g. Rowe, 2003), but she is unsure of her own behavior, so she asks a colleague to observe and record her wait times during one class period. Alexis learns that her wait times are very short for all students, so she starts practicing silently counting to five whenever she asks students a question.

Teachers can keep anecdotal records about students without help from peers. These records contain a description of an incident of a student’s behavior, the time and place the incident occurred, and a tentative interpretation of the incident. For example, the description of the incident might involve Joseph, a second grade student, who fell asleep during the mathematics class on a Monday morning. A tentative interpretation could be that the student did not get enough sleep over the weekend, but alternative explanations could be that the student is sick or is on medications that make him drowsy. Obviously additional information is needed, and the teacher could ask Joseph why he is so sleepy and also observe him to see whether he looks tired and sleepy over the next couple of weeks.
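
An anecdotal record of this kind can be kept as a simple structure that separates the observed incident from the teacher's tentative interpretations, which keeps the factual description distinct from guesses that may later be revised. This is a hypothetical sketch; the field names are invented, not a prescribed format.

```python
# A minimal anecdotal-record structure: observable facts in one field,
# tentative (revisable) interpretations kept separately.
from dataclasses import dataclass, field

@dataclass
class AnecdotalRecord:
    student: str
    date: str
    place: str
    incident: str                                         # observable behavior only
    interpretations: list = field(default_factory=list)   # tentative, open to revision

record = AnecdotalRecord(
    student="Joseph",
    date="Monday morning",
    place="mathematics class",
    incident="Fell asleep during the lesson.",
)
record.interpretations.append("May not have slept enough over the weekend")
record.interpretations.append("Could be ill or on medication that causes drowsiness")

print(record.student, "-", record.incident)
print("Tentative interpretations recorded:", len(record.interpretations))
```

Listing competing interpretations side by side is one small guard against the objectivity problem noted below: the record itself reminds the teacher that the first explanation is not the only one.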

Anecdotal records often provide important information and are better than relying on one’s memory but they take time to maintain and it is difficult for teachers to be objective. For example, after seeing Joseph fall asleep the teacher may now look for any signs of Joseph’s sleepiness—ignoring the days he is not sleepy. Also, it is hard for teachers to sample a wide enough range of data for their observations to be highly reliable.

Teachers also conduct more formal observations, especially for students with special needs who have IEPs. An example of the importance of informal and formal observations in a preschool follows:

The class of preschoolers in a suburban neighborhood of a large city has eight special needs students and four students—the peer models—who have been selected because of their well developed language and social skills. Some of the special needs students have been diagnosed with delayed language, some with behavior disorders, and several with autism. The students are sitting on the mat with the teacher, who has a box with sets of three “cool” things of varying size (e.g. toy pandas), and the students are asked to put the things in order by size: big, medium, and small. Students who are able are also asked to point to each item in turn and say “This is the big one,” “This is the medium one,” and “This is the little one.” For some students, only two choices (big and little) are offered because that is appropriate for their developmental level. The teacher informally observes that one of the boys is having trouble keeping his legs still, so she quietly asks the aide for a weighted pad that she places on the boy’s legs to help him keep them still. The activity continues and the aide carefully observes students’ behaviors and records on IEP progress cards whether a child meets specific objectives such as: “When given two picture or object choices, Mark will point to the appropriate object in 80 per cent of the opportunities.” The teacher and aides keep records of the relevant behavior of the special needs students during the half day they are in preschool. The daily records are summarized weekly. If not enough observations have been recorded for a specific objective, the teacher and aide focus their observations more on that child and, if necessary, try to create specific situations that relate to that objective. At the end of each month the teacher calculates whether the special needs children are meeting their IEP objectives.
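
The monthly calculation described in the vignette can be sketched as a simple tally of recorded opportunities against the IEP criterion. This is a hedged illustration: the observation log, the five-observation minimum, and the function name are invented, not part of any actual IEP procedure.

```python
# Tally recorded opportunities against an IEP criterion such as
# "points to the appropriate object in 80 per cent of opportunities".
def meets_objective(outcomes, criterion=0.80, min_observations=5):
    """outcomes: one True/False per recorded opportunity.
    Returns (met, rate); met is None when too few observations were
    recorded, the cue (as in the vignette) to observe that child more."""
    rate = sum(outcomes) / len(outcomes) if outcomes else 0.0
    if len(outcomes) < min_observations:
        return None, rate
    return rate >= criterion, rate

# Invented log: ten recorded opportunities for one child this month.
month = [True, True, False, True, True, True, True, False, True, True]
met, rate = meets_objective(month)
print(f"met objective: {met}, success rate: {rate:.0%}")
```

Returning `None` rather than a pass/fail when the log is thin mirrors the teachers' practice above: an objective with too few recorded observations triggers more focused observation, not a premature judgment.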

Selected response items

Common formal assessment formats used by teachers are multiple choice, matching, and true/false items. In selected response items students have to select a response provided by the teacher or test developer rather than constructing a response in their own words or actions. Selected response items do not require students to recall the information but rather to recognize the correct answer. Tests with these items are called objective because the results are not influenced by scorers' judgments or interpretations, and so they are often machine scored. Eliminating potential errors in scoring increases the reliability of tests, but teachers who use only objective tests are liable to reduce the validity of their assessment because objective tests are not appropriate for all learning goals (Linn & Miller, 2005). Effective assessment for learning as well as assessment of learning must be based on aligning the assessment technique with the learning goals and outcomes.

For example, if the goal is for students to conduct an experiment, then they should be asked to do that rather than being asked about conducting an experiment.
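Because every selected response item has a single keyed answer, scoring can be purely mechanical, which is why such tests can be machine scored. A minimal sketch of that scoring logic follows; the answer key and student responses are hypothetical.

```python
def score_selected_response(key, responses):
    """Count items where the student's response matches the keyed answer."""
    return sum(1 for keyed, given in zip(key, responses) if keyed == given)

key = ["c", "a", "d", "b", "true"]      # keyed answers for a 5-item test
student = ["c", "b", "d", "b", "true"]  # one student's responses

score_selected_response(key, student)   # 4 of the 5 responses match the key
```

The point of the sketch is that no judgment enters the scoring step at all: the reliability gain (and the validity limitation) of objective tests both follow from this.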

Common problems

Selected response items are easy to score but hard to devise. Teachers often do not spend enough time constructing items, and common problems include:

  • Wording that is too complex or that combines several propositions in one item. For example: "True or False: Although George Washington was born into a wealthy family, his father died when he was only 11, he worked as a youth as a surveyor of rural lands, and later stood on the balcony of Federal Hall in New York when he took his oath of office in 1789." A student who knows some but not all of these facts cannot sensibly answer.
  • Cues to the answer in the items themselves. A common clue is that all the true statements on a true/false test, or the correct alternatives on a multiple choice test, are longer than the untrue statements or the incorrect alternatives.
  • Negative or double-negative phrasing. A poor item: "True or False: None of the steps made by the student was unnecessary." A better item: "True or False: All of the steps were necessary."

Students often do not notice the negative terms or find them confusing, so avoiding them is generally recommended (Linn & Miller, 2005). However, since standardized tests often use negative items, teachers sometimes deliberately include some negative items to give students practice in responding to that format.

  • Sentences taken verbatim from the textbook. Removing the words from their context often makes them ambiguous or can change the meaning. For example, a statement from Chapter 4 taken out of context suggests all children are clumsy: "Similarly with jumping, throwing and catching: the large majority of children can do these things, though often a bit clumsily." A fuller quotation makes it clear that this sentence refers to 5-year-olds: "For some fives, running still looks a bit like a hurried walk, but usually it becomes more coordinated within a year or two. Similarly with jumping, throwing and catching: the large majority of children can do these things, though often a bit clumsily, by the time they start school, and most improve their skills noticeably during the early elementary years." If the abbreviated form were used as the stem in a true/false item it would obviously be misleading.
  • Trivia items, e.g. "Jean Piaget was born in what year?"

While it is important to know approximately when Piaget made his seminal contributions to the understanding of child development, the exact year of his birth (1896) is not important.

Strengths and weaknesses

All types of selected response items have a number of strengths and weaknesses. True/false items are appropriate for measuring factual knowledge such as vocabulary, formulae, dates, proper names, and technical terms. They are very efficient, as they use a simple structure that students can easily understand and take little time to complete. They are also easier to construct than multiple choice and matching items. However, students have a 50 per cent probability of getting the answer correct through guessing, so it can be difficult to interpret how much students know from their test scores. Examples of common problems that arise when devising selected response items are in Table 37.
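The 50 per cent guessing probability per item compounds across a whole test. A quick binomial calculation (illustrative only) shows, for example, the chance that a student who guesses blindly on every item of a 10-item true/false test still gets 7 or more correct:

```python
from math import comb

def prob_at_least(n_items, k_correct, p_guess=0.5):
    """Probability of at least k_correct successes in n_items
    independent guesses, each correct with probability p_guess."""
    return sum(comb(n_items, i) * p_guess**i * (1 - p_guess)**(n_items - i)
               for i in range(k_correct, n_items + 1))

prob_at_least(10, 7)  # 176/1024, about a 17 per cent chance from pure guessing
```

This is why scores on short true/false tests are hard to interpret: a respectable-looking score can arise from guessing alone.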

Table 37: Common errors in selected response items

Matching items

  • Lists are not homogeneous. For example, Column B is a mixture of generals and dates.
  • Too many items in each list. Lists should be relatively short (4-7 items in each column); more than 10 are too confusing.
  • Responses are not in logical order. In the example with Spanish and English words, the words should be in a logical order (here they are alphabetical). If the order is not logical, students spend too much time searching for the correct answer.

Multiple choice items

  • The problem (i.e. the stem) is not clearly stated. For example:

New Zealand

a) Is the world's smallest continent

b) Is home to the kangaroo

c) Was settled mainly by colonists from Great Britain

d) Is a dictatorship

This is really a series of true/false items. Because the correct answer is c), a better version, with the problem in the stem, is:

Much of New Zealand was settled by colonists from

a) Great Britain

  • Some of the alternatives are not plausible. For example, in the item

Who is best known for their work on the development of the morality of justice?

  • Gerald Ford

Obviously Gerald Ford is not a plausible alternative.

  • Irrelevant cues. For example: the correct alternative is longer; incorrect alternatives are not grammatically consistent with the stem; or too many correct alternatives are in position "b" or "c", making it easier for students to guess. All the positions (e.g. a, b, c, d) should be used with approximately equal frequency (not exactly equal, as that also provides clues).

  • Use of "all of the above". If "all of the above" is used, then each of the other alternatives must be correct. This means that a student may read the first alternative, mark it correct, and move on. Alternatively, a student may read the first two alternatives and, seeing that both are true, not need to read the remaining alternatives to know to circle "all of the above". The teacher probably does not want either of these outcomes.

In matching items, two parallel columns containing terms, phrases, symbols, or numbers are presented, and the student is asked to match the items in the first column with those in the second column. Typically there are more items in the second column, both to make the task more difficult and to ensure that one error does not automatically force a second. Matching items are most often used to measure lower-level knowledge, such as persons and their achievements, dates and historical events, terms and definitions, symbols and concepts, or plants and animals and their classifications (Linn & Miller, 2005). An example with Spanish language words and their English equivalents is below:

Directions: On the line to the left of the Spanish word in Column A, write the letter of the English word in Column B that has the same meaning.

While matching items may seem easy to devise, it is hard to create homogeneous lists. Other problems with matching items and suggested remedies are in Table 37.

Multiple choice items are the most commonly used type of objective test item because they have a number of advantages over the other types. Most importantly, they can be adapted to assess higher-level thinking, such as application, as well as lower-level factual knowledge. The first example below assesses knowledge of a specific fact, whereas the second example assesses application of knowledge.

Who is best known for their work on the development of the morality of justice?

b. Vygotsky

d. Kohlberg

Which one of the following best illustrates the law of diminishing returns?

a. A factory doubled its labor force and increased production by 50 per cent

b. The demand for an electronic product increased faster than the supply of the product

c. The population of a country increased faster than agricultural self sufficiency

d. A machine decreased in efficiency as its parts became worn out. (Adapted from Linn & Miller, 2005, p. 193)

There are several other advantages of multiple choice items. Students have to recognize the correct answer among incorrect alternatives, rather than simply judge a single statement true or false. Also, the opportunity for guessing is reduced because four or five alternatives are usually provided, whereas in true/false items students only have to choose between two options. In addition, multiple choice items do not require homogeneous material as matching items do.

However, creating good multiple choice test items is difficult, and students (maybe including you) often become frustrated when taking a test with poor multiple choice items. Three steps have to be considered when constructing a multiple choice item: formulating a clearly stated problem, identifying plausible alternatives, and removing irrelevant clues to the answer. Common problems in each of these steps are summarized in Table 37.

Constructed response items

Formal assessment also includes constructed response items, in which students are asked to recall information and create an answer (not just recognize whether an answer is correct), so guessing is reduced. Constructed response items can be used to assess a wide variety of kinds of knowledge, and two major kinds are discussed here: completion or short answer (also called short response), and extended response.

Completion and short answer

Completion and short answer items can be answered with a word, phrase, number, or symbol. The two types are essentially the same, varying only in whether the problem is presented as a statement or a question (Linn & Miller, 2005). For example:

Completion: The first traffic light in the US was invented by…………….

Short Answer: Who invented the first traffic light in the US?

These items are often used in mathematics tests, e.g.

3 + 10 = …………..?

If x = 6, what does x(x-1) =……….


A major advantage of these items is that they are easy to construct. However, apart from their use in mathematics, they are unsuitable for measuring complex learning outcomes and are often difficult to score. Completion and short answer tests are sometimes called objective tests, as the intent is that there is only one correct answer and so no variability in scoring, but unless the question is phrased very carefully there are frequently several correct answers. For example, consider the item

Where was President Lincoln born?………………..

The teacher may expect the answer “in a log cabin” but other correct answers are also “on Sinking Spring Farm”, “in Hardin County” or “in Kentucky”. Common errors in these items are summarized in Table 38 .

Table 38: Common errors in constructed response items

Extended response

Extended response items are used in many content areas, and answers may vary in length from a paragraph to several pages. Questions that require longer responses are often called essay questions. Extended response items have several advantages, the most important being their adaptability for measuring complex learning outcomes, particularly integration and application. These items also require that students write, and therefore provide teachers with a way to assess writing skills. A commonly cited advantage of these items is their ease of construction; however, carefully worded items that are related to learning outcomes and assess complex learning are hard to devise (Linn & Miller, 2005). Well-constructed items phrase the question so the task of the student is clear. Often this involves providing hints or planning notes. In the first example below, the actual question is clear not only because of the wording but because of the format (i.e. it is placed in a box). In the second and third examples, planning notes are provided:

Example 1: Third grade mathematics:

The owner of a bookstore gave 14 books to the school. The principal will give an equal number of books to each of three classrooms and the remaining books to the school library. How many books could the principal give to each classroom and the school library?

Show all your work in the space below and on the next page. Explain in words how you found the answer. Tell why you took the steps you did to solve the problem.

From Illinois Standards Achievement Test, 2006; ( http://www.isbe.state.il.us/assessment/isat.htm )
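The arithmetic behind the expected answer to this item is division with a remainder, which can be checked directly:

```python
books = 14
classrooms = 3

per_classroom = books // classrooms  # 4 books to each classroom
to_library = books % classrooms      # 2 books left for the school library

print(per_classroom, to_library)  # 4 2
```

What makes the item an extended response task, of course, is not this arithmetic but the requirement that students show their work and explain their reasoning in words.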

Example 2 : Fifth grade science: The grass is always greener

Jose and Maria noticed three different types of soil, black soil, sand, and clay, were found in their neighborhood. They decided to investigate the question, “How does the type of soil (black soil, sand, and clay) under grass sod affect the height of grass?”

Plan an investigation that could answer their new question. In your plan, be sure to include:

  • Prediction of the outcome of the investigation
  • Materials needed to do the investigation
  • Logical steps to do the investigation
  • One variable kept the same (controlled)
  • One variable changed (manipulated)
  • Any variables being measured and recorded
  • How often measurements are taken and recorded

(From Washington State 2004 Assessment of Student Learning: http://www.k12.wa.us/assessment/WASL/default.aspx)

Example 3: Grades 9-11 English:

Writing prompt

Some people think that schools should teach students how to cook. Other people think that cooking is something that ought to be taught in the home. What do you think? Explain why you think as you do.

Planning notes

Choose One:

  • I think schools should teach students how to cook
  • I think cooking should be taught in the home

I think cooking should be taught in _____________________because________________

      (school) or (the home)

(From Illinois Measure of Annual Growth in English http://www.isbe.state.il.us/assessment/image.htm )

A major disadvantage of extended response items is the difficulty of scoring them reliably. Not only do different teachers score the same response differently, but the same teacher may score an identical response differently on different occasions (Linn & Miller, 2005). A variety of steps can be taken to improve the reliability and validity of scoring. First, teachers should begin by writing an outline of a model answer. This helps make clear what students are expected to include. Second, a sample of the answers should be read. This assists in determining what the students can do and whether there are any common misconceptions arising from the question. Third, teachers have to decide what to do about irrelevant information that is included (e.g. whether it is ignored or students are penalized) and how to evaluate mechanical errors such as grammar and spelling. Then, point scoring or a scoring rubric should be used.

In point scoring components of the answer are assigned points. For example, if students were asked: What are the nature, symptoms, and risk factors of hyperthermia?

Point Scoring Guide:

This provides some guidance for evaluation and helps consistency, but point scoring systems often lead the teacher to focus on facts (e.g. naming risk factors) rather than higher-level thinking, which may undermine the validity of the assessment if the teacher's purposes include higher-level thinking. A better approach is to use a scoring rubric that describes the quality of the answer or performance at each level.
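Point scoring amounts to a simple tally: each expected component of the answer carries a fixed number of points, and the score is the sum over the components the student included. The components and point values below are hypothetical, not the actual guide for the hyperthermia question:

```python
# Hypothetical point allocation for the hyperthermia question
point_guide = {"nature": 2, "symptoms": 4, "risk factors": 4}

# Which components one (hypothetical) student's answer covered
covered = {"nature": True, "symptoms": True, "risk factors": False}

score = sum(points for component, points in point_guide.items() if covered[component])
print(score)  # 6 out of a possible 10 points
```

The sketch makes the limitation visible: the tally rewards naming components, not the quality of reasoning about them, which is exactly why a scoring rubric is often the better choice.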

Scoring rubrics

Scoring rubrics can be holistic or analytical. In holistic scoring rubrics, general descriptions of performance are made and a single overall score is obtained. An example from grade 2 language arts in the Los Angeles Unified School District, which classifies responses into four levels (not proficient, partially proficient, proficient, and advanced), is in Table 39.

Table 39: Example of holistic scoring rubric: English language arts grade 2

Analytical rubrics provide descriptions of levels of student performance on a variety of characteristics. For example, the six characteristics used for assessing writing developed by the Northwest Regional Educational Laboratory (NWREL) are:

  • ideas and content
  • organization
  • voice
  • word choice
  • sentence fluency
  • conventions

Descriptions of high, medium, and low responses for each characteristic are available from http://www.nwrel.org/assessment/toolkit98/traits/index.html.

Holistic rubrics have the advantage that they can be developed more quickly than analytical rubrics. They are also faster to use, as there is only one dimension to examine. However, they do not provide students feedback about which aspects of the response are strong and which need improvement (Linn & Miller, 2005). This means they are less useful for assessment for learning. An important use of rubrics is as teaching tools: providing them to students before the assessment lets students know what knowledge and skills are expected.

Teachers can use scoring rubrics as part of instruction by giving students the rubric during instruction, providing several responses, and analyzing these responses in terms of the rubric. For example, use of accurate terminology is one dimension of the science rubric in Table 40 . An elementary science teacher could discuss why it is important for scientists to use accurate terminology, give examples of inaccurate and accurate terminology, provide that component of the scoring rubric to students, distribute some examples of student responses (maybe from former students), and then discuss how these responses would be classified according to the rubric. This strategy of assessment for learning should be more effective if the teacher (a) emphasizes to students why using accurate terminology is important when learning science rather than how to get a good grade on the test (we provide more details about this in the section on motivation later in this chapter); (b) provides an exemplary response so students can see a model; and (c) emphasizes that the goal is student improvement on this skill not ranking students.

Table 40: Example of a scoring rubric, Science

*On the High School Assessment, the application of a concept to a practical problem or real-world situation will be scored when it is required in the response and requested in the item stem.

Performance assessments

Typically in performance assessments students complete a specific task while teachers observe the process or procedure (e.g. data collection in an experiment) as well as the product (e.g. the completed report) (Popham, 2005; Stiggins, 2005). The tasks that students complete in performance assessments are not simple, in contrast to selected response items, and include the following:

  • playing a musical instrument
  • athletic skills
  • artistic creation
  • conversing in a foreign language
  • engaging in a debate about political issues
  • conducting an experiment in science
  • repairing a machine
  • writing a term paper
  • using interaction skills to play together

These examples all involve complex skills but illustrate that the term performance assessment is used in a variety of ways. For example, the teacher may not observe all of the process (e.g. she sees a draft paper but the final product is written during out-of-school hours) and essay tests are typically classified as performance assessments (Airasian, 2000). In addition, in some performance assessments there may be no clear product (e.g. the performance may be group interaction skills).

Two related terms, alternative assessment and authentic assessment, are sometimes used instead of performance assessment, but they have different meanings (Linn & Miller, 2005). Alternative assessment refers to tasks that are not pencil-and-paper, and while many performance assessments are not pencil-and-paper tasks, some are (e.g. writing a term paper, essay tests). Authentic assessment is used to describe tasks that students do that are similar to those in the "real world". Classroom tasks vary in level of authenticity (Popham, 2005). For example, for students in a Japanese language class taught in a high school in Chicago, conversing in Japanese in Tokyo is highly authentic, but only possible in a study abroad program or a trip to Japan. Conversing in Japanese with native Japanese speakers in Chicago is also highly authentic, and conversing with the teacher in Japanese during class is moderately authentic. Much less authentic is a matching test on English and Japanese words. In a language arts class, writing a letter to an editor or a memo to the principal is highly authentic, as letters and memos are common work products. Writing a five-paragraph paper is not as authentic, as such papers are not used in the world of work; however, a five-paragraph paper is a complex task and would typically be classified as a performance assessment.

Advantages and disadvantages

There are several advantages of performance assessments (Linn & Miller, 2005). First, the focus is on complex learning outcomes that often cannot be measured by other methods. Second, performance assessments typically assess the process or procedure as well as the product. For example, the teacher can observe whether students are repairing a machine using the appropriate tools and procedures as well as whether the machine functions properly after the repairs. Third, well-designed performance assessments communicate the instructional goals and meaningful learning clearly to students. For example, if the topic in a fifth grade art class is one-point perspective, the performance assessment could be drawing a city scene that illustrates one-point perspective (http://www.sanford-artedventures.com). This assessment is meaningful and clearly communicates the learning goal. Such a performance assessment is also a good instructional activity and has good content validity, as is common with well-designed performance assessments (Linn & Miller, 2005).

One major disadvantage of performance assessments is that they are typically very time consuming for students and teachers. This means that fewer assessments can be gathered, so if they are not carefully devised, fewer learning goals will be assessed, which can reduce content validity. State curriculum guidelines can be helpful in determining what should be included in a performance assessment. For example, Eric, a dance teacher in a high school in Tennessee, learns that the state standards indicate that dance students at the highest level should be able to demonstrate consistency and clarity in performing technical skills by:

  • performing complex movement combinations to music in a variety of meters and styles
  • performing combinations and variations in a broad dynamic range
  • demonstrating improvement in performing movement combinations through self-evaluation
  • critiquing a live or taped dance production based on given criteria (http://www.tennessee.gov/education/ci/standards/music/dance912.shtml)

Eric devises the following performance task for his eleventh grade modern dance class.

In groups of 4-6, students will perform a dance at least 5 minutes in length. The dance selected should be multifaceted so that all the dancers can demonstrate technical skills, complex movements, and a dynamic range (Items 1-2). Students will videotape their rehearsals and document how they improved through self-evaluation (Item 3). Each group will view and critique the final performance of one other group in class (Item 4).

Eric would need to scaffold most steps in this performance assessment. The groups would probably need guidance in selecting a dance that allowed all the dancers to demonstrate the appropriate skills; critiquing their own performances constructively; working effectively as a team; and applying criteria to evaluate a dance.

Another disadvantage of performance assessments is that they are hard to score reliably, which can lead to inaccurate and unfair evaluation. As with any constructed response assessment, scoring rubrics are very important. Examples of holistic and analytic scoring rubrics designed to assess a completed product are in Table 39 and Table 40. A rubric designed to assess the process of group interactions is in Table 41.

Table 41: Example of group interaction rubric

This rubric was devised for middle grade science but could be used in other subject areas when assessing group process. In some performance assessments, several scoring rubrics should be used. In the dance performance example above, Eric should have scoring rubrics for the performance skills, the improvement based on self-evaluation, the teamwork, and the critique of the other group. Obviously, devising a good performance assessment is complex, and Linn and Miller (2005) recommend that teachers:

  • Create performance assessments that require students to use complex cognitive skills. Sometimes teachers devise assessments that are interesting and that the students enjoy but do not require students to use higher level cognitive skills that lead to significant learning. Focusing on high level skills and learning outcomes is particularly important because performance assessments are typically so time consuming.
  • Ensure that the task is clear to the students. Performance assessments typically require multiple steps so students need to have the necessary prerequisite skills and knowledge as well as clear directions. Careful scaffolding is important for successful performance assessments.
  • Specify expectations of the performance clearly by providing students with scoring rubrics during the instruction. This not only helps students understand what is expected but also guarantees that teachers are clear about what they expect. Thinking this through while planning the performance assessment can be difficult for teachers, but it is crucial, as it typically leads to revisions of the actual assessment and the directions provided to students.
  • Reduce the importance of unessential skills in completing the task. What skills are essential depends on the purpose of the task. For example, for a science report, is the use of publishing software essential? If the purpose of the assessment is for students to demonstrate the process of the scientific method including writing a report, then the format of the report may not be significant. However, if the purpose includes integrating two subject areas, science and technology, then the use of publishing software is important. Because performance assessments take time it is tempting to include multiple skills without carefully considering if all the skills are essential to the learning goals.

Portfolios

"A portfolio is a meaningful collection of student work that tells the story of student achievement or growth" (Arter, Spandel, & Culham, 1995, p. 2). Portfolios are a purposeful collection of student work, not just folders of all the work a student does. Portfolios are used for a variety of purposes, and developing a portfolio system can be confusing and stressful unless teachers are clear on their purpose. The varied purposes can be illustrated as four dimensions (Linn & Miller, 2005):

  • assessment for learning versus assessment of learning
  • progress versus current accomplishments
  • documentation versus showcase (best work)
  • working versus finished


When the primary purpose is assessment for learning, the emphasis is on student self-reflection and responsibility for learning. Students not only select samples of their work they wish to include, but also reflect on and interpret their own work. Portfolios containing this information can be used to aid communication, as students can present and explain their work to their teachers and parents (Stiggins, 2005). Portfolios focusing on assessment of learning contain students' work samples that certify accomplishments for a classroom grade, graduation, state requirements, etc. Typically, students have less choice in the work contained in such portfolios, as some consistency is needed for this type of assessment. For example, the writing portfolios that fourth and seventh graders are required to submit in Kentucky must contain a self-reflective statement and an example of three pieces of writing (reflective, personal experience or literary, and transactive). Students do choose which pieces of their writing of each type to include in the portfolio.

(http://www.kde.state.ky.us/KDE/Instructional+Resources/Curriculum+Documents+and+Resources/Student+Performance+Standards/)

Portfolios can be designed to focus on student progress or current accomplishments. For example, audio tapes of English language learners speaking could be collected over one year to demonstrate growth in learning. Student progress portfolios may also contain multiple versions of a single piece of work. For example, a writing project may contain notes on the original idea, outline, first draft, comments on the first draft by peers or teacher, second draft, and the final finished product (Linn & Miller 2005). If the focus is on current accomplishments, only recent completed work samples are included.

Portfolios can focus on documenting student activities or highlighting important accomplishments. Documentation portfolios are inclusive, containing all the work samples rather than focusing on one special strength, best work, or progress. In contrast, showcase portfolios focus on best work. The best work is typically identified by students. One aim of such portfolios is that students learn how to identify products that demonstrate what they know and can do. Students are not expected to identify their best work in isolation but to also use feedback from their teachers and peers.

A final distinction can be made between a finished portfolio, perhaps used for a job application, and a working portfolio that typically includes day-to-day work samples. Working portfolios evolve over time and are not intended to be used for assessment of learning. The focus in a working portfolio is on developing ideas and skills, so students should be allowed to make mistakes, freely comment on their own work, and respond to teacher feedback (Linn & Miller, 2005). Finished portfolios are designed for use with a particular audience, and the products selected may be drawn from a working portfolio. For example, in a teacher education program, the working portfolio may contain work samples from all the courses taken. A student may develop one finished portfolio to demonstrate she has mastered the required competencies in the teacher education program and a second finished portfolio for her job application.

Portfolios used well in classrooms have several advantages. They provide a way of documenting and evaluating growth in a much more nuanced way than selected response tests can. Also, portfolios can be integrated easily into instruction, i.e. used for assessment for learning. Portfolios also encourage student self-evaluation and reflection, as well as ownership for learning (Popham, 2005). Using classroom assessment to promote student motivation is an important component of assessment for learning which is considered in the next section.

However, there are some major disadvantages of portfolio use. First, good portfolio assessment takes an enormous amount of teacher time and organization. Time is needed to help students understand the purpose and structure of the portfolio, decide which work samples to collect, and self-reflect. Some of this time needs to be spent in one-to-one conferences. Reviewing and evaluating the portfolios outside class time is also enormously time consuming. Teachers have to weigh whether the time spent is worth the benefits of the portfolio use.

Second, evaluating portfolios reliably and eliminating bias can be even more difficult than with a constructed response assessment because the products are more varied. The experience of the state-wide use of portfolios for assessment in writing and mathematics for fourth and eighth graders in Vermont is sobering. Teachers used the same analytic scoring rubric when evaluating the portfolios. In the first two years of implementation, samples from schools were collected and scored by an external panel of teachers. In the first year, the agreement among raters (i.e. inter-rater reliability) was poor for both mathematics and writing; in the second year the agreement among raters improved for mathematics but not for writing. However, even with the improvement in mathematics, the reliability was too low to use the portfolios for individual student accountability (Koretz, Stecher, Klein & McCaffrey, 1994). When reliability is low, validity is also compromised because unstable results cannot be interpreted meaningfully.
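The simplest index of inter-rater agreement of the kind examined in the Vermont study is the proportion of portfolios to which two raters assign the same score. The sketch below uses hypothetical rubric scores; published studies typically report more robust statistics, such as correlations or Cohen's kappa, rather than raw percent agreement.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of cases where two raters assigned the same score."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Hypothetical rubric scores (1-4) given by two raters to six portfolios
rater_a = [3, 2, 4, 1, 3, 2]
rater_b = [3, 3, 4, 1, 2, 2]

percent_agreement(rater_a, rater_b)  # 4 of the 6 scores agree, about 0.67
```

Even this toy example illustrates the problem: when raters disagree on a third of the scores, an individual student's result depends heavily on who happened to score the portfolio.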

If teachers do use portfolios in their classroom, the series of steps needed for implementation is outlined in Table 42. If the school or district has an existing portfolio system, these steps may have to be modified.

Table 42: Steps in implementing a classroom portfolio program

Assessment that enhances motivation and student confidence

Studies on testing and learning conducted more than 20 years ago demonstrated that tests promote learning and that more frequent tests are more effective than less frequent tests (Dempster & Perkins, 1993). Frequent smaller tests encourage continuous effort rather than last-minute cramming and may also reduce test anxiety because the consequences of errors are reduced. College students report preferring frequent testing to infrequent testing (Bangert-Drowns, Kulik, & Kulik, 1991). More recent research indicates that teachers’ assessment purposes and beliefs, the type of assessment selected, and the feedback given all contribute to the assessment climate in the classroom, which influences students’ confidence and motivation. The use of self-assessment is also important in establishing a positive assessment climate.

Teachers’ purposes and beliefs

Student motivation can be enhanced when the purpose of assessment is promoting student learning and this is clearly communicated to students by what teachers say and do (Harlen, 2006). This approach to assessment is associated with what the psychologist Carol Dweck (2000) calls an incremental view of ability or intelligence. An incremental view assumes that ability increases whenever an individual learns more. This means that effort is valued because effort leads to knowing more and therefore having more ability. Individuals with an incremental view also ask for help when needed and respond well to constructive feedback, as the primary goal is increased learning and mastery. In contrast, a fixed view of ability assumes that some people have more ability than others and nothing much can be done to change that. Individuals with a fixed view often see effort as being in opposition to ability (“Smart people don’t have to study”), and so do not try as hard and are less likely to ask for help, as that would indicate that they are not smart. While there are individual differences in students’ beliefs about intelligence, teachers’ beliefs and classroom practices influence students’ perceptions and behaviors.

Teachers with an incremental view of intelligence communicate to students that the goal of learning is mastering the material and figuring things out. Assessment is used by these teachers to understand what students know so they can decide whether to move to the next topic, re-teach the entire class, or provide remediation for a few students. Assessment also helps students understand their own learning and demonstrate their competence. Teachers with these views say things like, “We are going to practice over and over again. That’s how you get good. And you’re going to make mistakes. That’s how you learn.” (Patrick, Anderman, Ryan, Edelin, & Midgley, 2001, p. 45).

In contrast, teachers with a fixed view of ability are more likely to believe that the goal of learning is doing well on tests, especially outperforming others. These teachers are more likely to say things that imply fixed abilities, e.g. “This test will determine what your math abilities are”, or stress the importance of interpersonal competition: “We will have a speech competition and the top person will compete against all the other district schools; last year the winner got a big award and their photo in the paper.” When teachers stress interpersonal competition some students may be motivated, but there can be only a few winners, so there are many more students who know they have no chance of winning. Another problem with interpersonal competition in assessment is that the focus can become winning rather than understanding the material.

Teachers who communicate to their students that ability is incremental and that the goal of assessment is promoting learning, rather than ranking students, awarding prizes to those who did very well, or catching those who did not pay attention, are likely to enhance students’ motivation.

Choosing assessments

The choice of assessment task also influences students’ motivation and confidence. First, assessments that have clear criteria that students understand and can meet, rather than assessments that pit students against each other in interpersonal competition, enhance motivation (Black, Harrison, Lee, Marshall, & Wiliam, 2004). This is consistent with the point we made in the previous section about the importance of focusing on enhancing learning for all students rather than ranking students. Second, meaningful assessment tasks enhance student motivation. Students often want to know why they have to do something and teachers need to provide meaningful answers. For example, a teacher might say, “You need to be able to calculate the area of a rectangle because if you want new carpet you need to know how much carpet is needed and how much it would cost.” Well designed performance tasks are often more meaningful to students than selected response tests, so students will work harder to prepare for them.

Third, providing choices of assessment tasks can enhance students’ sense of autonomy and motivation, according to self-determination theory (see Chapter 7). Kym, the sixth grade teacher whose story began this chapter, reports that giving students choices was very helpful. Another middle school social studies teacher, Aaron, gives his students a choice of performance tasks at the end of the unit on the US Bill of Rights. Students have to demonstrate specified key ideas but can do that by making up a board game, presenting a brief play, composing a rap song, etc. Aaron reports that students work much harder on this performance assessment, which allows them to use their strengths, than they did previously when he gave a more traditional assignment with no choices. Measurement experts caution that a danger of giving choices is that the assessment tasks are no longer equivalent, so the reliability of scoring is reduced; it is therefore particularly important to use well designed scoring rubrics. Fourth, assessment tasks should be challenging but achievable with reasonable effort (Elliott, McGregor & Thrash, 2004). This is often hard for beginning teachers, who may give assessment tasks that are too easy or too hard, because they are still learning to match their assessments to the skills of their students.

Providing feedback

When the goal is assessment for learning, providing constructive feedback that helps students know what they do and do not understand, as well as encouraging them to learn from their errors, is fundamental. Effective feedback should be given as soon as possible: the longer the delay between students’ work and feedback, the longer students will continue to hold misconceptions. Delays also weaken the relationship between students’ performance and the feedback, as students can forget what they were thinking during the assessment. Effective feedback should also inform students clearly what they did well and what needs modification. General comments such as “good work, A” or “needs improvement” do not help students understand how to improve their learning. Giving feedback to students using well designed scoring rubrics helps clearly communicate strengths and weaknesses. Obviously grades are often needed, but teachers can minimize the focus on them by placing the grade after the comments or on the last page of a paper. It can also be helpful to allow students to keep their grades private by making sure when returning assignments that the grade is not prominent (e.g. not using red ink on the top page) and never asking students to read their scores aloud in class. Some students choose to share their grades, but that should be their decision, not their teacher’s.

When grading, teachers often become angry at the mistakes that students make. It is easy for teachers to think something like: “With all the effort I put into teaching, this student could not even be bothered to follow the directions or spell check!” Many experienced teachers believe that communicating their anger is not helpful, so rather than saying: “How dare you turn in such shoddy work”, they rephrase it as, “I am disappointed that your work on this assignment does not meet the standards set” (Sutton, 2003). Research evidence also suggests that comments such as “You are so smart” for a high quality performance can be counterproductive. This is surprising to many teachers, but if students are told they are smart when they produce a good product, then if they do poorly on the next assignment the conclusion must be that they are “not smart” (Dweck, 2000). More effective feedback focuses on positive aspects of the task (not the person), as well as strategies and effort. The focus of the feedback should relate to the criteria set by the teacher and how improvements can be made.

When the teacher and student are from different racial or ethnic backgrounds, providing feedback that enhances motivation and confidence but also includes criticism can be particularly challenging, because students of color have historical reasons to distrust negative comments from a white teacher. Research by Cohen, Steele, and Ross (1999) indicates that “wise” feedback from teachers needs three components: positive comments, criticisms, and an assurance that the teacher believes the student can reach higher standards. We describe this research in more detail in “Deciding for yourself about the research” found in Appendix #2.

Self and peer assessment

In order to reach a learning goal, students need to understand the meaning of the goal, the steps necessary to achieve it, and whether they are making satisfactory progress towards it (Sadler, 1989). This involves self assessment, and recent research has demonstrated that well designed self assessment can enhance student learning and motivation (Black & Wiliam, 2006). For self assessment to be effective, students need explicit criteria such as those in an analytical scoring rubric. These criteria are either provided by the teacher or developed by the teacher in collaboration with students. Because students seem to find it easier to understand criteria for assessment tasks if they can examine other students’ work alongside their own, self assessment often involves peer assessment. One strategy teachers use is asking students to use “traffic lights” to indicate their confidence in their assignment or homework. Red indicates that they were unsure of their success, orange that they were partially unsure, and green that they were confident of their success. The students who labeled their own work as orange or green then worked in mixed groups to evaluate their own work, while the teacher worked with the students who had chosen red (Black & Wiliam, 2006).
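The traffic-light grouping described above can be sketched as a small routine. The student names and self-ratings here are invented for illustration:

```python
# Hypothetical self-ratings gathered at the end of an assignment.
self_ratings = {
    "Hannah": "green", "Ethan": "red", "Naomi": "orange",
    "Liam": "green", "Maya": "orange", "Noah": "red",
}

# Students who rated themselves orange or green peer-assess in mixed
# groups; the teacher works directly with the red group.
peer_groups = [name for name, colour in self_ratings.items()
               if colour in ("orange", "green")]
teacher_group = [name for name, colour in self_ratings.items()
                 if colour == "red"]

print(sorted(peer_groups))    # ['Hannah', 'Liam', 'Maya', 'Naomi']
print(sorted(teacher_group))  # ['Ethan', 'Noah']
```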

If self and peer assessment is used, it is particularly important that the teachers establish a classroom culture for assessment that is based on incremental views of ability and learning goals. If the classroom atmosphere focuses on interpersonal competition, students have incentives in self and peer assessment to inflate their own evaluations (and perhaps those of their friends) because there are limited rewards for good work.

Adjusting instruction based on assessment

Using assessment information to adjust instruction is fundamental to the concept of assessment for learning. Teachers make these adjustments “in the moment” during classroom instruction as well as during reflection and planning periods. Teachers use the information they gain from questioning and observation to adjust their teaching during classroom instruction. If students cannot answer a question, the teacher may need to rephrase the question, probe understanding of prior knowledge, or change the way the current idea is being considered. It is important for teachers to learn to identify when only one or two students need individual help because they are struggling with the concept, and when a large proportion of the class is struggling so whole group intervention is needed.

After the class is over, effective teachers spend time analyzing how well the lessons went, what students did and did not seem to understand, and what needs to be done the next day. Evaluation of student work also provides important information for teachers. If many students are confused about a similar concept the teacher needs to re-teach it and consider new ways of helping students understand the topic. If the majority of students complete the tasks very quickly and well, the teacher might decide that the assessment was not challenging enough. Sometimes teachers become dissatisfied with the kinds of assessments they have assigned when they are grading—perhaps because they realize there was too much emphasis on lower level learning, that the directions were not clear enough, or that the scoring rubric needed modification. Teachers who believe that assessment data provide information about their own teaching and that they can find ways to influence student learning have high teacher efficacy, or beliefs that they can make a difference in students’ lives. In contrast, teachers who think that student performance is mostly due to fixed student characteristics or the homes they come from (e.g. “no wonder she did so poorly considering what her home life is like”) have low teacher efficacy (Tschannen-Moran, Woolfolk Hoy, & Hoy, 1998).

Communication with parents and guardians

Clear communication with parents about classroom assessment is important—but often difficult for beginning teachers. The same skills that are needed to communicate effectively with students are also needed when communicating with parents and guardians. Teachers need to be able to explain to parents the purpose of the assessment, why they selected this assessment technique, and what the criteria for success are. Some teachers send home newsletters monthly or at the beginning of a major assessment task explaining the purpose and nature of the task, any additional support that is needed (e.g. materials, library visits), and due dates. Some parents will not be familiar with performance assessments or the use of self and peer assessment so teachers need to take time to explain these approaches carefully.

Many school districts now communicate through websites that offer a mixture of public information available to all parents in the class (e.g. curriculum and assessment details) and information restricted to the parents or guardians of specific students (e.g. attendance and grades). Teachers report this is helpful, as parents have immediate access to their child’s performance and, when necessary, can talk to their child and teacher quickly.

The recommendations we provided above on the type of feedback that should be given to students also apply when talking to parents. That is, the focus should be on students’ performance on the task, what was done well and what needs work, rather than general comments about how “smart” or “weak” the child is. If possible, comments should focus on strategies that the child uses well or needs to improve (e.g. reading test questions carefully, organization in a large project). When the teacher is white and the student or parents are from a minority group, trust can be an issue, so using “wise” feedback when talking to parents may help.

Action research: studying yourself and your students

Assessment for learning emphasizes collecting and using assessment data in order to improve teaching and learning, and so is related to action research (also called teacher research). In Chapter 1, we described action research as studies conducted by teachers of their own students or their own work. Action research can lead to decisions that improve a teacher’s own teaching or the teaching of colleagues. Kym, the teacher we described at the beginning of this chapter, conducted action research in her own classroom: she identified a problem of poor student motivation and achievement, investigated solutions during the course on motivation, tried new approaches, and observed the results.

Cycles of planning, acting and reflecting

Action research is usually described as a cyclical process with the following stages (Mertler, 2006).

  • Planning stage. Planning has three components. First, planning involves identifying and defining a problem. Problems sometimes start with some ill defined unease or feeling that something is wrong, and it can take time to identify the problem clearly so that it becomes a researchable question. The next step is reviewing the related literature, and this may occur within a class or workshop that the teachers are attending. Teachers may also explore the literature on their own or in teacher study groups. The third step is developing a research plan. The research plan includes what kind of data will be collected (e.g. student test scores, observations of one or more students) as well as how and when it will be collected (e.g. from files, in collaboration with colleagues, in the spring or fall semester).
  • Acting stage. During this stage the teacher is collecting and analyzing data. The data collected and the analyses do not need to be complex because action research, to be effective, has to be manageable.
  • Developing an action plan . In this stage the teacher develops a plan to make changes and implements these changes. This is the action component of action research and it is important that teachers document their actions carefully so that they can communicate them to others.
  • Communicating and reflecting. An important component of all research is communicating information. Results can be shared with colleagues in the school or district, in an action research class at the local college, at conferences, or in journals for teachers. Action research can also involve students as active participants and, if this is the case, communication may include students and parents. Communicating with others helps refine ideas and so typically aids in reflection. During reflection teachers/researchers ask such questions as: “What did I learn?” “What should I have done differently?” “What should I do next?” Questions such as these often lead to a new cycle of action research, beginning with planning and then moving to the other steps.

Ethical issues—privacy, voluntary consent

Teachers are accustomed to collecting students’ test scores, data about performances, and descriptions of behaviors as an essential component of teaching. However, if teachers are conducting action research and plan to collect data that will be shared outside the school community, then permission from parents (or guardians) and students must be obtained in order to protect the privacy of students and their families. Typically permission is obtained through an informed consent form that summarizes the research, describes the data that will be collected, indicates that participation is voluntary, and provides a guarantee of confidentiality or anonymity (Hubbard & Power, 2005). Many large school districts have procedures for establishing informed consent, as well as a person in the central office who is responsible for the district guidelines and the specific application process. If the action research is supported in some way by a college or university (e.g. through a class), then the informed consent procedures of that institution must be followed.

One common area of confusion for teachers is the voluntary nature of student participation in research. If the data being collected are for a research study, students can choose not to participate. This is contrary to much regular classroom instruction where teachers tell students they have to do the work or complete the tasks.

Grading and reporting

Assigning students grades is an important component of teaching, and many school districts issue progress reports, interim reports, or midterm grades as well as final semester grades. Traditionally these reports were printed on paper and sent home with students or mailed to students’ homes. Increasingly, school districts are using web-based grade management systems that allow parents to access their child’s grades on each individual assessment as well as the progress reports and final grades.

Grading can be frustrating for teachers as there are many factors to consider. In addition, report cards typically summarize in brief format a variety of assessments and so cannot provide much information about students’ strengths and weaknesses. This means that report cards focus more on assessment of learning than assessment for learning. There are a number of decisions that have to be made when assigning students’ grades and schools often have detailed policies that teachers have to follow. In the next section, we consider the major questions associated with grading.

How are various assignments and assessments weighted?

Students typically complete a variety of assignments during a grading period such as homework, quizzes, performance assessments, etc. Teachers have to decide—preferably before the grading period begins—how each assignment will be weighted. For example, a sixth grade math teacher may decide to give tests the greatest weight, performance assessments a moderate weight, and homework a small weight.

Deciding how to weight assignments should be done carefully as it communicates to students and parents what teachers believe is important, and also may be used to decide how much effort students will exert (e.g. “If homework is only worth 5 per cent, it is not worth completing twice a week”).
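A weighting scheme like the one described can be sketched as a weighted average. The category names and weights below are hypothetical, not a recommended scheme; the one real constraint is that the weights should sum to 1:

```python
# Hypothetical category weights; any real scheme should sum to 1.
weights = {"homework": 0.10, "quizzes": 0.20,
           "performance_tasks": 0.30, "tests": 0.40}
assert abs(sum(weights.values()) - 1.0) < 1e-9

def final_percentage(category_averages, weights):
    """Weighted mean of a student's category averages (each 0-100)."""
    return sum(weights[cat] * category_averages[cat] for cat in weights)

# One (invented) student's average in each category.
student = {"homework": 95, "quizzes": 82, "performance_tasks": 88, "tests": 76}
print(round(final_percentage(student, weights), 1))  # 82.7
```

Note how the weighting drives the result: this student's strong homework average barely moves the final grade because homework counts for only 10 per cent.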

Should social skills or effort be included?

Elementary school teachers are more likely than middle or high school teachers to include some social skills in report cards (Popham, 2005). These may be included as separate criteria in the report card or weighted into the grade for that subject. For example, the grade for mathematics may include an assessment of group cooperation or self regulation during mathematics lessons. Some schools and teachers endorse including social skills, arguing that developing such skills is important for young students and that students need to learn to work with others and manage their own behaviors in order to be successful. Others believe that grades in subject areas should be based on cognitive performance, and that if assessments of social skills are made, they should be clearly separated from the subject grade on the report card. Obviously, clear criteria such as those contained in analytical scoring rubrics should be used if social skills are graded.

Teachers often find it difficult to decide whether effort and improvement should be included as a component of grades. One approach is for teachers to ask students to submit drafts of an assignment and make improvements based on the feedback they received. The grade for the assignment may include some combination of the score for the drafts, the final version, and the amount of improvement the students made based on the feedback provided. A more controversial approach is basing grades on effort when students try really hard day after day but still cannot complete their assignments well. These students could have identified special needs or be recent immigrants who have limited English skills. Some school districts have guidelines for handling such cases. One disadvantage of using improvement as a component of grades is that the most competent students in class may do very well initially and have little room for improvement—unless teachers are skilled at providing additional assignments that will help challenge these students.

Teachers often use “hodgepodge grading”, i.e. a combination of achievement, effort, growth, attitude or class conduct, homework, and class participation. A survey of over 8,500 middle and high school students in the US state of Virginia supported the hodgepodge practices commonly used by their teachers (Cross & Frary, 1999).

How should grades be calculated?

Two options are commonly used: absolute grading and relative grading. In absolute grading, grades are assigned based on criteria the teacher has devised. If an English teacher has established a level of proficiency needed to obtain an A and no student meets that level, then no A’s will be given. Alternatively, if every student meets the established level, then all the students will get A’s (Popham, 2005). Absolute grading systems may use letter grades or pass/fail.

In relative grading, the teacher ranks the performances of students from worst to best (or best to worst): those at the top get high grades, those in the middle moderate grades, and those at the bottom low grades. This is often described as “grading on the curve” and can be useful to compensate for an examination or assignment that students find much easier or harder than the teacher expected. However, relative grading can be unfair to students because the comparisons are typically within one class, so an A in one class may not represent the level of performance of an A in another class. Relative grading systems may also discourage students from helping each other improve, as students are in competition for limited rewards. In fact, Bishop (1999) argues that grading on the curve gives students a personal interest in persuading each other not to study, because a serious student makes it more difficult for others to get good grades.
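The contrast between the two systems can be made concrete in a short sketch. The cutoffs, quotas, and scores below are all hypothetical; the point is only that absolute grading compares each score to fixed criteria, while curve grading fills fixed quotas by rank:

```python
def absolute_grade(score):
    """Grade against fixed criteria: every student can, in principle, earn an A."""
    cutoffs = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    for cutoff, letter in cutoffs:
        if score >= cutoff:
            return letter
    return "F"

def relative_grades(scores):
    """'Grading on the curve': rank the (distinct) scores and fill fixed
    quotas, so high grades stay scarce however well everyone did."""
    quotas = ["A"] * 2 + ["B"] * 3 + ["C"] * 3 + ["D"] * 2  # for 10 students
    ranked = sorted(scores, reverse=True)
    return {score: quotas[rank] for rank, score in enumerate(ranked)}

scores = [93, 91, 90, 88, 85, 84, 82, 79, 75, 70]
print([absolute_grade(s) for s in scores])  # the three scores of 90+ each earn an A
curve = relative_grades(scores)
print(curve[90])  # under the curve, 90 ranks third and earns only a B
```

The same score of 90 earns an A under absolute grading but only a B under the curve, which illustrates why curve grading puts students in competition for limited rewards.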

What kinds of grade descriptions should be used?

Traditionally a letter grade system (e.g. A, B, C, D, F) is used for each subject. The advantages of these grade descriptions are that they are convenient, simple, and can be averaged easily. However, they do not indicate what objectives the student has or has not met, nor students’ specific strengths and weaknesses (Linn & Miller, 2005). Elementary schools often use a pass-fail (or satisfactory-unsatisfactory) system, and some high schools and colleges do as well. Pass-fail systems in high school and college allow students to explore new areas and take risks on subjects for which they may have limited preparation, or that are not part of their major (Linn & Miller, 2005). While a pass-fail system is easy to use, it offers even less information about students’ level of learning. A pass-fail system is also used in classes that are taught under a mastery-learning approach, in which students are expected to demonstrate mastery on all the objectives in order to receive course credit. Under these conditions, it is clear that a pass means that the student has demonstrated mastery of all the objectives.

Some schools have implemented a checklist of the objectives in subject areas to replace the traditional letter grade system, and students are rated on each objective using descriptors such as Proficient, Partially Proficient, and Needs Improvement. For example, the checklist for students in a fourth grade class in California may include the four types of writing that are required by the English language state content standards (http://www.cde.ca.gov/be/st/ss/enggrade4.asp):

  • writing narratives
  • writing responses to literature
  • writing information reports
  • writing summaries

The advantages of this approach are that it communicates students’ strengths and weaknesses clearly, and it reminds students and parents of the objectives of the school. However, if too many objectives are included, the lists can become so long that they are difficult to understand.

Chapter summary

The purpose of classroom assessment can be assessment for learning or assessment of learning. Essential steps of assessment for learning include communicating instructional goals clearly to students; selecting appropriate high quality assessments that match the instructional goals and students’ backgrounds; using assessments that enhance student motivation and confidence; adjusting instruction based on assessment; and communicating assessment results to parents and guardians. Action research can help teachers understand and improve their teaching. A number of questions are important to consider when devising grading systems.

Airasian, P. W. (2000). Classroom assessment: A concise approach (2nd ed.). Boston: McGraw Hill.

Airasian, P. W. (2004). Classroom assessment: Concepts and applications (3rd ed.). Boston: McGraw Hill.

Bangert-Drowns, R. L., Kulik, J. A., & Kulik, C-L. C. (1991). Effects of frequent classroom testing. Journal of Educational Research, 85(2), 89-99.

Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2004). Working inside the black box: Assessment for learning in the classroom. Phi Delta Kappan, 86(1), 9-21.

Black, P., & Wiliam, D. (2006). Assessment for learning in the classroom. In J. Gardner (Ed.), Assessment and learning (pp. 9-25). Thousand Oaks, CA: Sage.

Bishop, J. H. (1999). Nerd harassment, incentives, school priorities, and learning. In S. E. Mayer & P. E. Peterson (Eds.), Earning and learning: How school matters (pp. 231-280). Washington, DC: Brookings Institution Press.

Borko, H., & Livingston, C. (1989). Cognition and improvisation: Differences in mathematics instruction by expert and novice teachers. American Educational Research Journal, 26, 473-498.

Cross, L. H., & Frary, R. B. (1999). Hodgepodge grading: Endorsed by students and teachers alike. Applied Measurement in Education, 21(1), 53-72.

Dempster, F. N., & Perkins, P. G. (1993). Revitalizing classroom assessment: Using tests to promote learning. Journal of Instructional Psychology, 20(3), 197-203.

Dweck, C. S. (2000) Self-theories: Their role in motivation, personality, and development. Philadelphia, PA: Psychology Press.

Elliott, A., McGregor, H., & Thrash, T. (2004). The need for competence. In E. Deci & R. Ryan (Eds.), Handbook of self-determination research (pp. 361-388). Rochester, NY: University of Rochester Press.

Harlen, W. (2006). The role of assessment in developing motivation for learning. In J. Gardner (Ed.), Assessment and learning (pp. 61-80). Thousand Oaks, CA: Sage.

Hubbard, R. S., & Power, B. M. (2003). The art of classroom inquiry: A handbook for teacher-researchers (2nd ed.). Portsmouth, NH: Heinemann.

Koretz, D., Stecher, B., Klein, S., & McCaffrey, D. (1994). The evolution of a portfolio program: The impact and quality of the Vermont program in its second year (1992-93) (CSE Technical Report 385). Los Angeles: University of California, Center for Research on Evaluation, Standards, and Student Testing. Accessed January 25, 2006 from http://www.csr.ucla.edu.

Linn, R. L., & Miller, M. D. (2005). Measurement and assessment in teaching (9th ed.). Upper Saddle River, NJ: Pearson.

Mertler, C. A. (2006). Action research: Teachers as researchers in the classroom. Thousand Oaks, CA: Sage.

Popham, W. J. (2005). Classroom assessment: What teachers need to know. Boston, MA: Pearson.

Rowe, M. B. (2003). Wait-time and rewards as instructional variables, their influence on language, logic and fate control: Part one-wait time. Journal of Research in Science Teaching, 40(Supplement), S19-S32.

Stiggins, R. J. (2002). Assessment crisis: The absence of assessment FOR learning . Phi Delta Kappan, 83 (10), 758-765.

Sutton, R. E. (2004). Emotional regulation goals and strategies of teachers. Social Psychology of Education, 7(4), 379-398.

Teel, K. M., Debrin-Parecki, A., & Covington, M. V. (1998). Teaching strategies that honor and motivate inner-city African American students: A school/university collaboration. Teaching and Teacher Education, 14(5), 479-495.

Tschannen-Moran, M., Woolfolk-Hoy, A., & Hoy, W. K. (1998). Teacher efficacy: Its meaning and measure. Review of Educational Research , 68 , 202-248.

Educational Psychology Copyright © 2019 by Kelvin Seifert and Rosemary Sutton is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.

The Advantages of Teacher-Made Tests


Formal assessments give teachers a way to test knowledge and plan future instruction, but standardized exams and commercially prepared tests don't always assess that knowledge accurately. The extra time required to prepare their own exams pays off in potentially more accurate assessments, and with these benefits in mind, teachers can monitor student learning and progress more closely.

Integration to Instruction

Teachers often expand on the textbook to make the information relevant and meaningful to their students. Instead of simply reading from the textbook, a teacher might use non-fiction books, guest speakers, experiments, field trips and demonstrations to teach course content. Because tests provided with a textbook don't include the knowledge students gain from these outside experiences, teacher-made tests better reflect what is taught in class and fit better with the teaching methods the teacher uses. With customized tests, teachers can assess the students as they progress to check for understanding.

Control Over Format

Commercially prepared tests are typically multiple choice or fill-in-the-blank, although some include short answer and essay questions. A teacher who creates her own tests has complete control over the format. Paper-and-pencil tests can include the types of questions and formats that best fit specific material. Multiple-choice questions may work well for certain sections, while essay questions are best for others. Teachers also have the option of alternative testing formats, such as oral exams or presentations.

Increased Frequency

Standardized testing typically only happens once per year, so the results don't necessarily give the teachers the tools to consistently improve teaching methods. Similarly, the tests provided by publishers to accompany textbooks are often only provided at the end of chapters or units. When a teacher makes her own exams, she can make as many or as few as she wants. More frequent testing gives a better look at the students' progress throughout the chapter and over the course of the year.

Modifications

The teacher knows her students better than any publisher. While commercial tests may cover the majority of information, they may not take into account students with special needs or different learning styles. A teacher who makes her own tests has the option to tailor the exams to the students in her classroom. Examples include adding pictures or diagrams, enlarging print or leaving extra space between sentences to allow for easier reading.


Based in the Midwest, Shelley Frost has been writing parenting and education articles since 2007. Her experience comes from teaching, tutoring and managing educational after school programs. Frost worked in insurance and software testing before becoming a writer. She holds a Bachelor of Arts in elementary education with a reading endorsement.




Teacher-made assessment strategies

Kym teaches sixth grade students in an urban school where most of the families in the community live below the poverty line. Each year the majority of the students in her school fail the statewide tests. Kym follows school district teaching guides and typically uses direct instruction in her Language Arts and Social Studies classes. The classroom assessments are designed to mirror those on the statewide tests so the students become familiar with the assessment format.

While taking a graduate summer course on motivation, Kym reads an article called “Teaching strategies that honor and motivate inner-city African American students” (Teel, Debrin-Parecki, & Covington, 1998), and she decides to change her instruction and assessment in the fall in four ways:

  • First, she stresses an incremental approach to ability focusing on effort and allows students to revise their work several times until the criteria are met.
  • Second, she gives students choices in performance assessments (e.g. oral presentation, art project, creative writing).
  • Third, she encourages responsibility by asking students to assist in classroom tasks such as setting up video equipment, handing out papers etc.
  • Fourth, she validates students’ cultural heritage by encouraging them to read biographies and historical fiction from their own cultural backgrounds.

Kym reports that the changes in her students’ effort and demeanor in class are dramatic: students are more enthusiastic, work harder, and produce better products. At the end of the year, twice as many of her students pass the statewide test as in the previous year.

Afterword: Kym still teaches sixth grade in the same school district and continues to modify the strategies described above. Even though the performance of the students she taught improved, the school was closed because, on average, the students’ performance was poor. Kym gained a Ph.D. and now teaches educational psychology to preservice and inservice teachers in evening classes.

Kym’s approach is an example of action research. This involves identifying a problem (e.g. low motivation and achievement), learning about alternative approaches (e.g. reading the literature), implementing the new approaches, observing the results (e.g. students’ effort and test results), and continuing to modify the strategies based on those observations.

To assess students’ learning, she tested them on the mathematics knowledge and skills she had taught during the previous weeks. The test formats varied little, and students always completed them individually with pencil and paper.

Basic concepts

Assessment is an integrated process of gaining information about students’ learning and making value judgments about their progress (Linn & Miller, 2005). Information about students’ progress can be obtained from a variety of sources, including projects, portfolios, performances, observations, and tests. The information about students’ learning is often assigned specific numbers or grades, and this involves measurement. Measurement answers the question, “How much?” and is used most commonly when the teacher scores a test or product and assigns numbers (e.g. 28/30 on the biology test; 90/100 on the science project).

Evaluation is the process of making judgments about the assessment information (Airasian, 2005). These judgments may be about individual students (e.g. should Jacob’s course grade take into account his significant improvement over the grading period?), the assessment method used (e.g. is the multiple choice test a useful way to obtain information about problem solving?), or one’s own teaching (e.g. most of the students this year did much better on the essay assignment than last year, so my new teaching methods seem effective).

Increasing emphasis is placed on assessment for learning, where the priority is designing and using assessment strategies to enhance student learning and development. Assessment for learning is often referred to as formative assessment: it takes place during the course of instruction and provides information that teachers can use to revise their teaching and students can use to improve their learning (Black, Harrison, Lee, Marshall & Wiliam, 2004).

A distinction is also made between informal assessment, involving spontaneous, unsystematic observations of students’ behaviors (e.g. during a question and answer session or while the students are working on an assignment), and formal assessment, involving pre-planned, systematic gathering of data.

Assessment of learning is formal assessment that involves assessing students in order to certify their competence and fulfill accountability mandates; it is the primary focus of the next chapter on standardized tests but is also considered in this chapter. Assessment of learning is typically summative, that is, administered after the instruction is completed (e.g. a final examination in an educational psychology course). Summative assessments provide information about how well students mastered the material, whether students are ready for the next unit, and what grades should be given (Airasian, 2005).

Assessment for learning: an overview of the process

Step 1: Having clear instructional goals and communicating them to students

Step 2: Selecting appropriate assessment techniques

Step 3: Using assessment to enhance motivation and confidence

Step 4: Adjusting instruction based on information

An essential feature of assessment for learning is that the teacher uses the information gained from assessment to adjust instruction. These adjustments occur during a lesson, when a teacher may decide that students’ responses to questions indicate sufficient understanding to introduce a new topic, or that her observations of students’ behavior indicate that they do not understand the assignment and so need further explanation. Adjustments also occur when the teacher reflects on the instruction after the lesson is over and is planning for the next day. We provide examples of adjusting instruction in this chapter.

Step 5: Communicating with parents and guardians


Selecting appropriate assessment techniques I: high quality assessments

For an assessment to be high quality it needs to have good validity and reliability as well as absence of bias.

Reliability

  • First, assessments with more tasks or items typically have higher reliability.
  • Second, clear directions and tasks help increase reliability.
  • Third, clear scoring criteria are crucial in ensuring high reliability (Linn & Miller, 2005).
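The first point, that assessments with more items typically have higher reliability, is quantified in classical test theory by the Spearman-Brown prophecy formula. This formula comes from standard psychometrics rather than from the passage above; a minimal sketch:

```python
def spearman_brown(reliability: float, k: float) -> float:
    """Predicted reliability when a test is lengthened by a factor of k,
    assuming the added items are comparable to the existing ones
    (Spearman-Brown prophecy formula from classical test theory)."""
    return k * reliability / (1 + (k - 1) * reliability)

# Doubling a quiz whose current reliability is 0.60:
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

Doubling a test with reliability 0.60 is predicted to raise reliability to about 0.75, which is one reason longer assessments tend to yield more stable scores.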

Absence of bias

  • Two types of assessment bias are important: offensiveness and unfair penalization.

Selecting appropriate assessment techniques II: types of teacher-made assessments


Teachers’ observation, questioning, and record keeping

  • The informal assessment strategies teachers most often use during instruction are observation and questioning.

Observation

Questioning

Effective questioning during instruction includes:

  • planning and writing down the instructional questions that will be asked
  • allowing sufficient wait time for students to respond
  • listening carefully to what students say rather than listening for what is expected
  • varying the types of questions asked
  • making sure some of the questions are higher level
  • asking follow-up questions

Table 36: Validity and reliability of observation and questioning (problems and strategies to alleviate them)

Record keeping

The class of preschoolers in a suburban neighborhood of a large city has eight special needs students and four students—the peer models—who have been selected because of their well-developed language and social skills. Some of the special needs students have been diagnosed with delayed language, some with behavior disorders, and several with autism.

The students are sitting on the mat with the teacher who has a box with sets of three “cool” things of varying size (e.g. toy pandas) and the students are asked to put the things in order by size, big, medium and small. Students who are able are also requested to point to each item in turn and say “This is the big one”, “This is the medium one” and “This is the little one”.

For some students, only two choices (big and little) are offered because that is appropriate for their developmental level. The teacher informally observes that one of the boys is having trouble keeping his legs still, so she quietly asks the aide for a weighted pad that she places on the boy’s legs to help him keep them still. The activity continues, and the aide carefully observes students’ behaviors and records on IEP progress cards whether a child meets specific objectives such as: “When given two picture or object choices, Mark will point to the appropriate object in 80 per cent of the opportunities.”

The teacher and aides keep records of the relevant behavior of the special needs students during the half day they are in preschool. The daily records are summarized weekly. If not enough observations have been recorded for a specific objective, the teacher and aide focus their observations more on that child and, if necessary, try to create specific situations that relate to that objective. At the end of each month the teacher calculates whether the special needs children are meeting their IEP objectives.
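The monthly check described above amounts to comparing each child's recorded success rate against the threshold written into the objective (80 per cent of opportunities in Mark's case). A minimal sketch, with hypothetical numbers:

```python
def meets_objective(successes: int, opportunities: int, threshold: float = 0.80) -> bool:
    """True if the child succeeded in at least `threshold` of recorded opportunities."""
    if opportunities == 0:
        return False  # no recorded observations, so the objective cannot be certified
    return successes / opportunities >= threshold

# Hypothetical month of records for Mark's pointing objective:
print(meets_objective(17, 20))  # 17/20 = 85% of opportunities -> True
print(meets_objective(12, 20))  # 12/20 = 60% of opportunities -> False
```

The zero-opportunity guard mirrors the teachers' practice of gathering more observations before judging an objective with too little data.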


Selected response items

Selected response items include multiple choice, matching, and true/false items. In selected response items students must select a response provided by the teacher or test developer rather than constructing a response in their own words or actions. Selected response items do not require that students recall the information, but rather that they recognize the correct answer.

Common problems

  • Unclear wording in the items
  • True or False: Although George Washington was born into a wealthy family, his father died when he was only 11, he worked as a youth as a surveyor of rural lands, and later stood on the balcony of Federal Hall in New York when he took his oath of office in 1789.
  • Cues that are not related to the content being examined.
  • A common clue is that the true statements on a true/false test, or the correct alternatives on a multiple-choice test, are longer than the untrue statements or the incorrect alternatives.
  • Using negatives (or double negatives) in the items.
  • A poor item: “True or False: None of the steps made by the student was unnecessary.”
  • A better item: “True or False: All of the steps were necessary.”
  • Taking sentences directly from a textbook or lecture notes.
  • Trivial questions

e.g. Jean Piaget was born in what year?

Strengths and weaknesses

  • True/False items are appropriate for measuring factual knowledge such as vocabulary, formulae, dates, proper names, and technical terms. They are very efficient as they use a simple structure that students can easily understand, and take little time to complete. They are also easier to construct than multiple choice and matching items. However, students have a 50 percent probability of getting the answer correct through guessing so it can be difficult to interpret how much students know from their test scores. Examples of common problems that arise when devising true/false items are in Table 37.
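The guessing problem noted above can be illustrated with a quick simulation (an illustration only, not part of the original text): a student who answers every true/false item with a coin flip still averages about half the items correct, which is why a score near 50 per cent carries little information about what the student knows.

```python
import random

def expected_guess_score(n_items: int, trials: int = 10_000, seed: int = 0) -> float:
    """Average proportion correct when every answer is a random guess
    on a true/false test, estimated over many simulated students."""
    rng = random.Random(seed)
    total_correct = 0
    for _ in range(trials):
        # Each item is guessed correctly with probability 1/2.
        total_correct += sum(rng.random() < 0.5 for _ in range(n_items))
    return total_correct / (trials * n_items)

# A 20-item true/false test answered purely by guessing:
print(expected_guess_score(20))  # close to 0.5
```

The same simulation with four-option multiple choice items (probability 1/4 per item) would hover near 0.25, one reason multiple choice scores are somewhat easier to interpret than true/false scores.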

Table 37: Common errors in selected response items (type of assessment item/ common errors/ example)

In matching items, two parallel columns containing terms, phrases, symbols, or numbers are presented, and the student is asked to match the items in the first column with those in the second column. Typically, there are more items in the second column, both to make the task more difficult and to ensure that if a student makes one error they do not have to make another.

Matching items most often are used to measure lower level knowledge, such as persons and their achievements, dates and historical events, terms and definitions, symbols and concepts, plants or animals and classifications (Linn & Miller, 2005). An example with Spanish language words and their English equivalents is below:


Multiple Choice items are the most commonly used type of objective test items because they have a number of advantages over other objective test items.

  • Most importantly, they can be adapted to assess higher level thinking, such as application, as well as lower level factual knowledge. The first example below assesses knowledge of a specific fact, whereas the second example assesses application of knowledge.

Who is best known for their work on the development of the morality of justice?

  • b) Vygotsky
  • d) Kohlberg

Which one of the following best illustrates the law of diminishing returns?

  • a) A factory doubled its labor force and increased production by 50 per cent
  • b) The demand for an electronic product increased faster than the supply of the product
  • c) The population of a country increased faster than agricultural self sufficiency
  • d) A machine decreased in efficacy as its parts became worn out

(Adapted from Linn and Miller, 2005, p. 193).

Table 38: Common errors in constructed response items

Constructed response items

  • Constructed response items can be used to assess a wide variety of kinds of knowledge; two major kinds are discussed here: completion or short answer (also called short response) and extended response.

Completion and short answer

Apart from their use in mathematics, completion and short answer items are unsuitable for measuring complex learning outcomes and are often difficult to score. They are sometimes called objective tests, as the intent is that there is only one correct answer and so no variability in scoring; but unless the question is phrased very carefully, there are frequently a variety of correct answers. For example, consider the item

Extended response

Extended response items are often called essay questions. They have several advantages, the most important being their adaptability for measuring complex learning outcomes, particularly integration and application. These items also require that students write, and therefore give teachers a way to assess writing skills. A commonly cited advantage is their ease of construction; however, carefully worded items that are related to learning outcomes and assess complex learning are hard to devise (Linn & Miller, 2005).

The owner of a bookstore gave 14 books to the school. The principal will give an equal number of books to each of three classrooms and the remaining books to the school library. How many books could the principal give to each classroom and to the school library?

Show all your work in the space below and on the next page. Explain in words how you found the answer. Tell why you took the steps you did to solve the problem.

(From Illinois Standards Achievement Test, 2006)
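For reference, the arithmetic this item expects can be checked with integer division (an illustrative sketch, not part of the test item):

```python
# 14 donated books shared equally among 3 classrooms; the remainder goes to the library.
books, classrooms = 14, 3
per_classroom, to_library = divmod(books, classrooms)
print(per_classroom, to_library)  # 4 books per classroom, 2 books to the library
```

Of course, the item scores the student's explanation of the steps, not just the pair of numbers.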

Jose and Maria noticed three different types of soil, black soil, sand, and clay, were found in their neighborhood. They decided to investigate the question, “How does the type of soil (black soil, sand, and clay) under grass sod affect the height of grass?”

Their investigation plan should include:

  • Prediction of the outcome of the investigation
  • Materials needed to do the investigation
  • Procedure that includes:
  • logical steps to do the investigation
  • one variable kept the same (controlled)
  • one variable changed (manipulated)
  • any variables being measured and recorded
  • how often measurements are taken and recorded

(From Washington State 2004 assessment of student learning)

Writing prompt

Choose One:

□ I think schools should teach students how to cook

□ I think cooking should be taught in the home

I think cooking should be taught in ______________ (school / the home) because ______________

(From Illinois Measure of Annual Growth in English)

What are the nature, symptoms, and risk factors of hyperthermia?

Point Scoring Guide:

Definition (nature) 2 pts

Symptoms (1 pt for each) 5 pts

Risk Factors (1 point for each) 5 pts

Writing 3 pts

Scoring rubrics

Scoring rubrics can be holistic or analytical. In holistic scoring rubrics, general descriptions of performance are made and a single overall score is obtained. An example from grade 2 language arts in Los Angeles Unified School District, which classifies responses into four levels (not proficient, partially proficient, proficient, and advanced), is shown in Table 39.

Table 39: Example of holistic scoring rubric: English language arts grade 2

Analytical rubrics provide descriptions of levels of student performance on a variety of characteristics. For example, the six characteristics used for assessing writing developed by the Northwest Regional Education Laboratory (NWREL) are:

  • ideas and content
  • organization
  • voice
  • word choice
  • sentence fluency
  • conventions

(See http://www.nwrel.org/assessment/toolkit98/traits/index.html .)

Holistic rubrics have the advantage that they can be developed more quickly than analytical rubrics. They are also faster to use, as there is only one dimension to examine. However, they do not provide students feedback about which aspects of the response are strong and which aspects need improvement (Linn & Miller, 2005). This means they are less useful for assessment for learning. An important use of rubrics is as teaching tools: providing them to students before the assessment lets students know what knowledge and skills are expected.
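The practical difference between the two rubric types can be sketched in code. The trait names below follow the NWREL list, but the 1-4 scale, the particular scores, and the idea of summing them into a total are hypothetical illustrations, not part of the NWREL model:

```python
# Hypothetical analytical scores for one essay, on an assumed 1-4 scale per trait.
trait_scores = {
    "ideas and content": 3,
    "organization": 2,
    "word choice": 4,
    "sentence fluency": 3,
    "conventions": 2,
}

total = sum(trait_scores.values())                 # what a holistic rubric would report
weakest = min(trait_scores, key=trait_scores.get)  # what analytical scoring adds
print(total)    # 14 out of a possible 20
print(weakest)  # "organization" (tied with "conventions"; min returns the first)
```

A holistic rubric would report only a single number like `total`; the per-trait scores of an analytical rubric are what make the targeted feedback needed for assessment for learning possible.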

This strategy of assessment for learning should be more effective if the teacher:

(a) emphasizes to students why using accurate terminology is important when learning science rather than how to get a good grade on the test (we provide more details about this in the section on motivation later in this chapter)

(b) provides an exemplary response so students can see a model

(c) emphasizes that the goal is student improvement on this skill not ranking students.

Table 40: Example of a scoring rubric, science. (*On the High School Assessment, the application of a concept to a practical problem or real-world situation will be scored when it is required in the response and requested in the item stem.)

Performance assessments

Typically, in performance assessments students demonstrate a skill or create a product, for example:

  • playing a musical instrument
  • athletic skills
  • artistic creation
  • conversing in a foreign language
  • engaging in a debate about political issues
  • conducting an experiment in science
  • repairing a machine
  • writing a term paper
  • using interaction skills to play together

Alternative assessment refers to tasks that are not pencil-and-paper; while many performance assessments are not pencil-and-paper tasks, some are (e.g. writing a term paper, an essay test).

  • Alternative assessment also refers to an assessment system used to assess students with the most significant cognitive disabilities, or multiple disabilities that significantly impact intellectual functioning and adaptive behavior.

A video on the Dynamic Learning Maps (DLM) assessment system (8:50 minutes) is available at https://granite.pressbooks.pub/teachingdiverselearners/?p=284

Authentic assessment is used to describe tasks that students do that are similar to those in the “real world”. Classroom tasks vary in their level of authenticity (Popham, 2005). For example, for a Japanese language class taught in a high school in Chicago, conversing in Japanese in Tokyo is highly authentic but only possible in a study abroad program or a trip to Japan. Conversing in Japanese with native Japanese speakers in Chicago is also highly authentic, and conversing with the teacher in Japanese during class is moderately authentic. Much less authentic is a matching test on English and Japanese words. In a language arts class, writing a letter to an editor or a memo to the principal is highly authentic, as letters and memos are common work products.

  • However, writing a five-paragraph paper is not as authentic, since such papers are not used in the world of work.
  • A five-paragraph paper is nonetheless a complex task and would typically be classified as a performance assessment.

Internet Resource on Performance Assessment

The Inside Mathematics website has Performance Assessment Tasks for grades 2 through 8 and high school math (algebra, functions, geometry, statistics and probability, and number and quantity). The assessments are aligned to the Common Core Standards for Mathematics: http://www.insidemathematics.org/performance-assessment-tasks You may download and use these tasks for professional development purposes without modifying the tasks.

Jay McTighe and Associates have a Performance Tasks Blog Series, “What is a Performance Task?”: jaymctighe.com/2015/04/what-is-a-performance-task/ Seven characteristics of performance tasks and a few examples are included.

Advantages and disadvantages

There are several advantages of performance assessments (Linn & Miller, 2005). First, the focus is on complex learning outcomes that often cannot be measured by other methods. Second, performance assessments typically assess process or procedure as well as the product. For example, the teacher can observe whether the students are repairing the machine using the appropriate tools and procedures, as well as whether the machine functions properly after the repairs. Third, well-designed performance assessments communicate the instructional goals and meaningful learning clearly to students. For example, if the topic in a fifth-grade art class is one-point perspective, the performance assessment could be drawing a city scene that illustrates one-point perspective. This assessment is meaningful and clearly communicates the learning goal. Such a performance assessment is a good instructional activity and has good content validity, which is common with well-designed performance assessments (Linn & Miller, 2005).

  • One major disadvantage with performance assessments is that they are typically very time consuming for students and teachers. This means that fewer assessments can be gathered so if they are not carefully devised fewer learning goals will be assessed—which can reduce content validity.
For example, Eric, a dance teacher, might design a performance assessment in which students are:

  • performing complex movement combinations to music in a variety of meters and styles
  • performing combinations and variations in a broad dynamic range
  • demonstrating improvement in performing movement combinations through self-evaluation
  • critiquing a live or taped dance production based on given criteria
  • Another disadvantage of performance assessments is that they are hard to score reliably, which can lead to inaccuracy and unfair evaluation. As with any constructed response assessment, scoring rubrics are very important.

Table 41: Example of group interaction rubric

The rubric in Table 41 was developed for middle grade science, but it could be used in other subject areas when assessing group process. In some performance assessments, several scoring rubrics should be used. In the dance performance example above, Eric should have scoring rubrics for the performance skills, the improvement based on self-evaluation, the teamwork, and the critique of the other group.

  • Create performance assessments that require students to use complex cognitive skills. Sometimes teachers devise assessments that are interesting and that the students enjoy but do not require students to use higher level cognitive skills that lead to significant learning. Focusing on high level skills and learning outcomes is particularly important because performance assessments are typically so time consuming.
  • Ensure that the task is clear to the students. Performance assessments typically require multiple steps so students need to have the necessary prerequisite skills and knowledge as well as clear directions. Careful scaffolding is important for successful performance assessments.
  • Specify expectations of the performance clearly by providing students with scoring rubrics during the instruction. This not only helps students understand what is expected but also guarantees that teachers are clear about what they expect. Thinking this through while planning the performance assessment can be difficult for teachers, but it is crucial, as it typically leads to revisions of the actual assessment and the directions provided to students.
  • Reduce the importance of unessential skills in completing the task. What skills are essential depends on the purpose of the task. For example, for a science report, is the use of publishing software essential? If the purpose of the assessment is for students to demonstrate the process of the scientific method including writing a report, then the format of the report may not be significant. However, if the purpose includes integrating two subject areas, science and technology, then the use of publishing software is important. Because performance assessments take time it is tempting to include multiple skills without carefully considering if all the skills are essential to the learning goals.

Portfolio

“A portfolio is a meaningful collection of student work that tells the story of student achievement or growth” (Arter, Spandel, & Culham, 1995, p. 2).

When the primary purpose is assessment for learning, the emphasis is on student self-reflection and responsibility for learning.

Portfolios can be designed to focus on student progress or current accomplishments.

Portfolios can focus on documenting student activities or highlighting important accomplishments.

A final distinction can be made between a finished portfolio (perhaps used for a job application) and a working portfolio that typically includes day-to-day work samples.

  • When reliability is low, validity is also compromised because unstable results cannot be interpreted meaningfully.

Steps in implementing a classroom portfolio program

  • Talk to your students about your ideas of the portfolio, the different purposes, and the variety of work samples. If possible, have them help make decisions about the kind of portfolio you implement.
  • Will the focus be on growth or current accomplishments? Best work showcase or documentation? Good portfolios can have multiple purposes, but the teacher and students need to be clear about the purpose.
  • For example, in writing, is every writing assignment included? Are early drafts as well as final products included?

Decide where the work sample will be stored. For example, will each student have a file folder in a file cabinet, or a small plastic tub on a shelf in the classroom?

Assessment that enhances motivation and student confidence

  • More recent research indicates that teachers’ assessment purposes and beliefs, the type of assessment selected, and the feedback given all contribute to the assessment climate in the classroom, which influences students’ confidence and motivation. The use of self-assessment is also important in establishing a positive assessment climate.

Teachers’ purposes and beliefs

An incremental view assumes that ability increases whenever an individual learns more. This means that effort is valued because effort leads to knowing more and therefore to having more ability. Individuals with an incremental view also ask for help when needed and respond well to constructive feedback, as the primary goal is increased learning and mastery.

Choosing assessments

First, assessments that have clear criteria that students understand and can meet, rather than assessments that pit students against each other in interpersonal competition, enhance motivation (Black, Harrison, Lee, Marshall, & Wiliam, 2004). This is consistent with the point we made in the previous section about the importance of focusing on enhancing learning for all students rather than ranking students.

Second, meaningful assessment tasks enhance student motivation. Students often want to know why they have to do something, and teachers need to provide meaningful answers. For example, a teacher might say, “You need to be able to calculate the area of a rectangle because if you want new carpet you need to know how much carpet is needed and how much it would cost.” Well-designed performance tasks are often more meaningful to students than selected-response tests, so students will work harder to prepare for them.

Third, providing choices of assessment tasks can enhance students’ sense of autonomy and motivation, according to self-determination theory. Kym, the sixth-grade teacher whose story began this chapter, reports that giving students choices was very helpful. Another middle school social studies teacher, Aaron, gives his students a choice of performance tasks at the end of the unit on the US Bill of Rights. Students have to demonstrate specified key ideas, but can do that by making up a board game, presenting a brief play, composing a rap song, etc.

Providing feedback

Self- and peer-assessment; adjusting instruction based on assessment; communication with parents and guardians; action research: studying yourself and your students; cycles of planning, acting, and reflecting.

Action research is usually described as a cyclical process with the following stages (Mertler, 2006).

  • Planning stage. Planning has three components. First, planning involves identifying and defining a problem. Problems sometimes start with some ill-defined unease or feeling that something is wrong, and it can take time to identify the problem clearly so that it becomes a researchable question. The next step is reviewing the related literature, and this may occur within a class or workshop that the teachers are attending. Teachers may also explore the literature on their own or in teacher study groups. The third step is developing a research plan. The research plan includes what kind of data will be collected (e.g. student test scores, observation of one or more students) as well as how and when it will be collected (e.g. from files, in collaboration with colleagues, in spring or fall semester).
  • Acting stage. During this stage, the teacher is collecting and analyzing data. The data collected and the analyses do not need to be complex because action research, to be effective, has to be manageable.
  • Developing an action plan. In this stage, the teacher develops a plan to make changes and implements these changes. This is the action component of action research and it is important that teachers document their actions carefully so that they can communicate them to others.

  • Communicating and reflecting. An important component of all research is communicating information. Results can be shared with colleagues in the school or district, in an action research class at the local college, at conferences, or in journals for teachers. Action research can also involve students as active participants, and if this is the case, communication may include students and parents. Communicating with others helps refine ideas and so typically aids in reflection. During reflection, teachers/researchers ask such questions as: “What did I learn?” “What should I have done differently?” “What should I do next?” Questions such as these often lead to a new cycle of action research beginning with planning and then moving to the other steps.

Ethical issues—privacy, voluntary consent

Grading and reporting

How are various assignments and assessments weighted?

Should social skills or effort be included?

How should grades be calculated?
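One common way to answer this question is a weighted average of category scores. The sketch below is illustrative only: the category names, weights, and scores are assumptions, not drawn from the chapter.

```python
# A minimal sketch of weighted grade calculation. The categories, weights,
# and scores are illustrative assumptions, not the chapter's own example.
def weighted_grade(scores, weights):
    """Return the weighted average; weights are fractions that sum to 1."""
    return sum(scores[category] * weights[category] for category in weights)

weights = {"tests": 0.4, "projects": 0.3, "homework": 0.2, "participation": 0.1}
scores = {"tests": 85, "projects": 90, "homework": 95, "participation": 100}

print(weighted_grade(scores, weights))  # 85*0.4 + 90*0.3 + 95*0.2 + 100*0.1 = 90.0
```

Making the weights explicit in this way also answers the earlier question about how various assignments and assessments are weighted, since the policy can be shared with students in advance.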

What kinds of grade descriptions should be used?

  • writing narratives
  • writing responses to literature
  • writing information reports
  • writing summaries

Chapter summary

References:

Seifert, K. and Sutton, R. (2009). Educational Psychology. Saylor Foundation. ( Chapter 11) Retrieved from open.umn.edu/opentextbooks/BookDetail.aspx?bookId=153 (CC BY)


What Teachers Should Know About Integrating Formative Assessment With Instruction


Clarification: This story has been updated to clarify formative assessment terminology.

Teachers need more support to move testing from a “necessary evil” to a classroom tool, experts say.

While summative assessments—like unit quizzes or annual state tests—are used for evaluation and accountability, research shows formative assessments—like puzzles, projects, and class error analyses—can help teachers and students identify misunderstandings and reflect on students’ progress as they’re learning.

While most teachers use at least some formative measures in daily classroom practice, experts at the American Educational Research Association conference here last week argued they need more support to integrate daily assessment practices with overall classroom instruction.


“We need to think about assessment more holistically,” said E. Caroline Wylie, a senior associate at the National Center for the Improvement of Educational Assessment and co-author of the National Academy of Education’s new report on assessment released last week. “Certainly we’ve got to be sharing learning goals for students, assessment, and learning in ways that are recognizable.”

For example, in one study, Dustin Van Orman, a STEM education research associate at Western Washington University, studied a national sample of more than 100 elementary student-teachers who participated in simulations of English, science, and math classes. The teachers were asked how much experience they had in using formative assessment, and then were asked to use information about students in the simulated classes to plan tasks and other formative assessments over three class periods.

About 1 in 5 of the preservice teachers had little to no prior training in formative assessment, and a third had experience only with formal testing, rather than informal measures through tasks and activities. One preservice teacher reported that her mentor-teacher downplayed the usefulness of classroom assessment and “only likes to do assessment if it’s like something she kind of has to do.”

While a majority of preservice teachers in the study could identify tasks students should be able to do as they learn particular academic content, Van Orman and his colleagues found many student-teachers did not set learning goals with their students or set criteria for tests and tasks based on learning goals. Rather, tests and tasks often could be disconnected from overarching instructional goals. Teachers less experienced in assessment also tended to give more static feedback—praising or correcting students—without providing information on which students were expected to act.

How to use formative assessments effectively

“Assessment shouldn’t be dropped in from the sky, disconnected from student experience,” said Wylie, who was not part of the preservice teacher study. It’s also important for teachers to understand their own and students’ cultural backgrounds when designing assessments, she noted.

Van Orman and Erin Riley-Lepo, a visiting assistant professor at the College of New Jersey, have been working with researchers, teachers, and principals to develop a framework for teachers’ own assessment literacy.

To effectively use formative assessment in class, they recommended:

  • Teachers and students should develop a shared understanding of learning goals.
  • Assessments should focus on what students are learning in ways that the students can recognize, so that they can understand their own progress.
  • Students should engage in self- and peer-assessment to help take ownership of their learning.
  • Assessments should make students’ thinking visible to both the teacher and students, to correct misconceptions and build on students’ strengths.
  • Assessments should directly inform teachers’ instruction.

Building better classroom assessment practice also requires support from principals and district leaders.

“We talk about teachers, but the reality is, teachers are embedded in schools and districts with particular approaches and constraints on their assessment that may or may not be—and often is not—supportive of active and focused assessments,” Wylie said.

For example, Wylie noted that in the last four months, she and her research team repeatedly had to cancel professional development for teachers because the principals could not secure enough substitute teachers to cover their classes.


  • Ugwu, N., & Mkpuma, S. O. (2019). Ensuring Quality in Education: Validity of Teacher-made Language Tests in Secondary Schools in Ebonyi State. American Journal of Educational Research, 7(7), 518-523. https://pubs.sciepub.com/education/7/7/12

Ensuring Quality in Education: Validity of Teacher-made Language Tests in Secondary Schools in Ebonyi State

The study was carried out to find out the extent to which teachers of English in secondary schools in Ebonyi State validate their test items. To guide the study, three research questions were formulated and one null hypothesis was postulated and tested at the 0.05 level of significance. The design of the study was the descriptive survey. The population consisted of all the teachers of English in all the government-owned secondary schools in the three education zones of Ebonyi State. A purposive sampling technique was used to select a sample of 367 teachers, drawn from 50% of the schools in the population. A 22-item researcher-developed questionnaire entitled Test Item Validation Questionnaire (TIVQ) was constructed, validated, trial tested, and used to elicit data from the respondents. Data obtained were presented and analyzed using mean and standard deviation to answer the research questions, and the t-test statistical tool was used to test the hypothesis. The study revealed, amongst other things, that the majority of the teachers of English in public secondary schools in Ebonyi State do not validate their test items before administration. The researchers recommended that test item review committees be set up, that training programs be provided for teachers of English, and that there be rigorous supervision of academic activities in secondary schools in Ebonyi State by both internal and external authorities.

1. Introduction

A test is a measurement device used by assessors to gather certain information about the testees in order to make important decisions. The test is the most commonly used instrument for assessing cognitive achievements 1 . Also, tests connote the presentation of a standard set of items to be responded to, and the responses derived provide a basis for determining the level of achievement. A language test, therefore, is a measurement device used for measuring the proficiency of an individual in using a particular language or in a language course.

Types of tests include diagnostic tests, proficiency tests, achievement tests, and aptitude tests. In classroom situations (which this study is concerned with), the achievement test is most commonly used. The achievement test is used to measure the degree of success attained in a specific area of learning 2 . It is the type of ability test that is concerned with what a person has learnt, and its importance lies in its use to find out the progress made in aspects of language that have been taught. The achievement test, thus, is closely tied to particular school subjects. The merits of achievement tests lie in their provision of objective, independent, and accurate measurement of what has been learned 3 . Achievement tests are also known to provide norms for comparing students’ performance with their counterparts within and outside their schools.

Achievement tests are of two types – the standardized test and the teacher-made test (also called the classroom test). Teacher-made tests are locally developed by subject teachers in schools to assess the achievement of their students in areas covered in instruction. While the standardized test is valid, reliable and has a table of norms, the teacher-made test does not possess any form of norms 2 . The standardized test is designed to be used on a much larger scale than the teacher-made test. As a result, test items are subjected to a series of standardization processes before they are administered to testees. The teacher-made test, on the other hand, has no specific method of assuring its quality; the method of assuring the quality of teacher-made tests varies from institution to institution.

In the selection of a measuring instrument, two fundamental questions arise. These are:

  • Does the instrument measure a variable consistently?
  • Is the instrument a true measure of the variable?

The former is an indication of reliability while the latter raises issues of validity. The adequacy (and quality) of a measuring instrument is determined by its reliability and validity. Reliability refers to the consistency or stability of measurement 4 . Further, reliability is viewed as the degree of consistency between two sets of scores or observations obtained with the same instrument or equivalent forms of an instrument 2 . Validity, on the other hand, has to do with the ability of test items to measure what they are meant to measure. In other words, while reliability is concerned with the consistency of scores, validity is closely tied to the adequacy of test items in testing a specified area of instruction. This study is particularly interested in the validity of teacher-made language tests in the study area.

Validity basically is the assessment of whether a test measures what it aims to measure. A test is valid when it measures what it sets out to measure; for instance, a reading comprehension test which asks, “What is the difference between a tropical climate and a temperate climate?” may not be valid, especially as the question looks like a Geography question; it would, however, be valid if it were tied to the passage 4 . Validity is believed to be the most important characteristic of a test, and a test that lacks validity is worthless 1 . The standard types of validity are content, criterion-related, and construct validity, but content validity is most vital for the classroom teacher 1 . Further, a teacher-made test must fulfil two conditions to be termed valid. These are:

  • It must measure achievement in the subject for which it was prepared.
  • It must measure achievement in the learning objectives defined for the subject.

A test that satisfies these two conditions would be seen to possess content validity. The question, however, is to what extent classroom teachers subject their test items to validity checks.

Tests are very important in the school system. This is because they give insight into how much the objective of learning is achieved, how well the method(s) of teaching have worked, and how worthwhile a programme is. Teachers construct and administer tests, and the learners’ performance will determine the level of achievement. Thus, test construction should not be taken lightly. One may argue that the teacher who taught a subject has the ability to develop valid test items; however, it is observed that the manner in which tests are developed in schools often presents problems in the scoring and grading of achievement 5 . Also, many teachers do not use correct procedures in preparing classroom tests 6 . From personal observation, classroom teachers in Ebonyi State sometimes construct test items on the day they are required to be taken. Even when the items are constructed earlier, there is often no time or resources to ensure their content validity before administration. These suggest that there is more to be desired. If the language teacher, who is the key agent of the implementation of the language policy in Nigeria, is not performing at the optimum level, there is a problem 7 . Effort should be made to ensure quality in the education system, and it should start from the classroom.

This study sets out to find out the extent to which language teachers in secondary schools in Ebonyi State subject their test items to processes of testing their validity (especially content validity). Also, the study seeks to discover the characteristics of validity that secondary school teachers in Ebonyi State use to test the validity of their test items. Finally, the study will proffer suggestions that will help to ensure quality of teacher-made tests in secondary schools in Ebonyi State.

The findings of this study are significant in that they expose the extent to which secondary school teachers in Ebonyi State subject their test items to validation processes. They also reveal the characteristic of validity that is employed by secondary school teachers in Ebonyi State to validate their test items. In essence, teachers, curriculum planners, students and parents will greatly benefit from the findings of this study. Teachers will see the need to improve the quality of their test items; curriculum planners will identify areas of need and pay attention to the right areas and not perceived ones; students and parents will benefit most as qualitative education will be provided and better qualified students will be produced. The findings also provide accurate information that will enable informed decisions on educational policies. Finally, the study serves as a reference material for future researchers who may wish to carry out research in similar areas.

Quality of tests is determined by the reliability, validity and usability; this study is delimited to the validity of teacher-made tests in secondary schools. Also, the study focused on government-owned (or public) secondary schools in the three educational zones of Ebonyi State.

2. Hypotheses

One null hypothesis was formulated and tested at 0.05 level of significance.

HO 1 : There is no significant difference in the mean rating of male and female teachers of English in secondary schools in Ebonyi State on the extent to which they validate their test items .

3. Conceptualisation

Quality in the education system has to do with the standard enforced in the implementation of programmes. Quality in education, thus, connotes standard of education, standard of service, management, relevance, significance, and efficiency of product. Quality is an inalienable index of education programmes, and it is imperative that every segment of the system establish and maintain quality 8 . To achieve the objective of education, therefore, the quality of tests must be assured. Quality assurance in testing is the systematic construction, administration and scoring of teacher-made tests using competent teachers and appropriate test items, among other things 6 . Thus, assuring the quality of tests will enable a high educational standard. Quality assurance is the process of setting, maintaining and improving standards in all aspects of the school system 9 . It is an all-embracing, ongoing and continuous process of improving the education system, institutions and programs 10 .

Quality assurance is aimed at preventing faults from occurring. Quality control is designed to ensure that products or services meet predetermined specifications, while quality assurance aims at providing products and services completely devoid of defects by doing things right at all times 11 . By implication, quality will be assured in the education system by feeding quality inputs into the system in order to get quality outputs.

The importance of tests in education cannot be overemphasized; they help the teacher to take decisions on course improvement; identify the needs of students; and help educational administrators and curriculum planners to judge how good the school system is. Also, tests help the evaluator to evaluate human ability, personality characteristics, as well as adjustment and mental health 12 . The purposes of tests include giving direction to instructional activities; measuring achievement; providing an empirical basis for curricular activities; determining the merits and limitations of the instructional program; and supplying the data for a comprehensive judgement of the learners 13 . Further, tests help the education administrator to make decisions in educational planning; determine strengths and weaknesses of instructional programs; identify areas where supervision is needed; and determine the overall effectiveness of schools 3 . Also, tests help teachers to gain understanding of the achievement and ability levels of individual students and classes; determine whether to adjust instructional practices; diagnose students’ learning difficulties; measure students’ attainment; and make decisions regarding grouping students within subject matter areas 3 .

From the foregoing, tests are devices used to evaluate whether learners are coping with the lesson being taught 14 . Tests are standard sets of questions which are administered to testees to determine the extent to which they have attained previously identified objectives 12 . Also, a test is a procedure used to evaluate human ability, personality, characteristics, adjustment and mental health. This means that before tests are administered to learners, they must have been exposed to learning experiences 15 .

The importance of validity in testing is often illustrated with the accuracy of a wristwatch: a wristwatch that is consistently five minutes late may be reliable but not valid, as it is consistently behind the accurate time 16 . In other words, validity assesses the relevance of an instrument to its purpose. Validity is also seen as the extent to which a test adequately measures what it is supposed to measure 1 . Validity, which could be content, construct or face related, is the most important characteristic of a test 1 . A test possesses content validity if it contains items that measure and cover the area intended to be covered by instruction. Usually, test content and classroom instruction are in close relationship.

This study is a descriptive survey. A survey research is a systematic collection of data or information from a population (sometimes referred to as universe) or sample of a population (considered to be a representative of the entire group of interest), through the use of personal interview and/or questionnaire 17 . This design was considered appropriate as the study collected data from the sample, with the aid of a researcher-developed questionnaire, to describe an entire population under study.

The area of the study comprised all the government-owned secondary schools in the three education zones of Ebonyi State. The three education zones are Abakaliki, Onueke and Afikpo. Also, data was obtained from the state Secondary Education Board (SEB) at Abakaliki to help the study. This is due to the fact that the government-owned secondary schools are centrally controlled by the Secondary Education Board.

The study was interested in the methods of ascertaining validity of teacher-made language (English) test items, so, the population of the study comprised all teachers of English in the two hundred and forty-three (243) government-owned (public) secondary schools in the three (3) education zones of Ebonyi State. The choice of the population was based on the fact that these institutions all offer English as a compulsory subject at all levels while other languages such as Igbo and French are compulsory only in junior classes but taken as optional subjects in the senior classes. Data obtained from the Secondary Education Board in Abakaliki revealed that there were six hundred and fifty-seven (657) teachers of English in the two hundred and twenty-seven (227) government-owned secondary schools. Thus, the population of the study was 657 secondary school teachers.

The purposive sampling technique was used to select one hundred and twenty-two (122) schools which represent 50% of the public secondary schools in Ebonyi State. All the teachers of English from the 122 schools were used since the number was not very large. The total number of the sample was three hundred and sixty-seven (367) teachers of English. The purposive sampling technique was deemed appropriate because the population was not very large as there were often few teachers of English in schools. Moreover, matters of quality should not be trifled with and the more responses sought, the better the result obtained.

The instrument for data collection was a researcher developed teachers’ questionnaire entitled Test Items Validation Questionnaire (TIVQ). The questionnaire items were generated from data gathered in the review of related literature. There were two parts in the questionnaire – Part A (which solicited information on respondents’ personal data) and Part B (which contained items on knowledge and practice of validation processes). Further, Part B was in three sections and the clustered items relate to and attempt to answer the three research questions.

Face and content validity of the instrument were determined by two experts from the department of Arts Education, Ebonyi State University, Abakaliki; and two experts from the Department of the same university. Copies of the questionnaire were given to these experts and their corrections and suggestions were incorporated. As a result, the instrument was seen to possess both content and face validity.

The reliability of the instrument was determined by pre-testing it on thirty (30) teachers of English in public secondary schools in Enugu State. The scores obtained from the respondents were collated and analyzed to determine the co-efficient of the set of scores for the items in each of the sections. The Cronbach Co-Efficient Alpha was used to obtain the reliability co-efficient of 0.85, 0.82 and 0.87 respectively for sections 1, 2 and 3.
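For readers unfamiliar with the statistic, Cronbach's coefficient alpha can be computed from the item variances and the variance of respondents' total scores. The sketch below uses made-up ratings, not the study's data.

```python
# Cronbach's alpha: a common internal-consistency reliability estimate.
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))
# The example ratings below are illustrative, not the study's actual data.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding that
    respondent's ratings on the k questionnaire items."""
    k = len(item_scores[0])                       # number of items
    columns = list(zip(*item_scores))             # transpose to per-item columns
    sum_item_vars = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Three respondents rating two perfectly consistent items -> alpha of 1.0
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

Values such as the study's 0.85, 0.82 and 0.87 indicate that responses to the items within each section varied together consistently.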

The researcher employed the services of six (6) research assistants to help in the administration and on-the-spot collection of questionnaires to avoid loss. The rationale behind the number of research assistants is that two (2) research assistants covered the schools in each of the three education zones of the state. All the questionnaires administered were returned and used in the study.

Data collected were analyzed using mean and standard deviation. A mean score of 2.50 and above indicates agreement with an item, while a mean score below 2.50 indicates disagreement. The t-test statistic was used to test the single hypothesis, and the 0.05 level of significance was adopted.

Data presented and analyzed here are based on the research questions guiding the study. The items are clustered according to the research questions and analyzed thus.

Research Question 1: To what extent do secondary school teachers in Ebonyi State validate their test items?

The result of data in Table 1 revealed that the respondents’ mean scores for items 1-10 were 1.46 ± 0.55, 1.29 ± 0.47, 2.04 ± 1.06, 1.41 ± 0.62, 1.72 ± 0.91, 1.38 ± 0.57, 1.59 ± 0.87, 1.63 ± 0.92, 1.48 ± 0.75 and 1.35 ± 0.49. This indicates that the respondents disagreed with all the items: that test items are constructed on the day of the test, that test items are sent to test experts to scrutinize before administration, that test items are analyzed before being administered to testees, that test items are submitted to the HOD for assessment, that corrections and inputs are made by the HOD before administration, that the HOD is often too busy to look at the test items before administration, that a committee reviews test items before administration, that every teacher handles his/her test items alone to avoid leakage, and that individual teachers are at liberty to do as they see fit with test items in their subjects. The grand mean score of all the respondents is 1.56 with a standard deviation of 0.72. This mean score is below the 2.50 benchmark of acceptance. Therefore, secondary school teachers in Ebonyi State do not validate their test items before administration.
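The decision rule can be made concrete by averaging the reported item means and comparing the result against the 2.50 benchmark. Note the simple average of the ten published item means is about 1.54; the paper's grand mean of 1.56 presumably reflects averaging over individual responses rather than over rounded item means.

```python
# Decision rule used in the study: a mean of 2.50 or above counts as
# agreement. The item means below are the ten reported for research question 1.
BENCHMARK = 2.50

item_means = [1.46, 1.29, 2.04, 1.41, 1.72, 1.38, 1.59, 1.63, 1.48, 1.35]
grand_mean = sum(item_means) / len(item_means)
decision = "agree" if grand_mean >= BENCHMARK else "disagree"

print(round(grand_mean, 2), decision)  # the simple average is about 1.54
```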

Research Question 2: What characteristic of validity do secondary school teachers in Ebonyi State use to test the validity of their test items?

The result of data in Table 2 revealed that the respondents’ mean scores for items 11-17 were 3.07 ± 0.84, 1.26 ± 0.45, 3.00 ± 0.89, 1.51 ± 0.73, 1.46 ± 0.73, 2.54 ± 0.75 and 2.72 ± 0.91. This indicates that the respondents disagreed with items 12, 14 and 15: that test items cover only some aspects of instruction, that test items do not necessarily look like language tests, and that test items are taken from any area of the content of instruction. The data also revealed that the greater number of respondents accepted items 11, 13, 16 and 17: that test items cover every aspect of instruction, that test items look like language tests, that test items are constructed using a test blueprint, and that test items correspond with the goals of instruction. The grand mean score of all the respondents was 2.22, which was lower than the 2.50 benchmark. Therefore, teachers in Ebonyi State secondary schools sometimes do not test the validity of their test items.

Research Question 3: What can be done to beef up the quality of teacher-made tests in secondary schools in Ebonyi State?

The data in Table 3 revealed that the respondents' mean scores for items 18-22 were 2.99 ± 0.83, 2.69 ± 0.91, 2.70 ± 0.86, 2.94 ± 0.87 and 2.92 ± 0.91. This indicates that the respondents accepted items 18, 19, 20, 21 and 22: that teachers should construct test items with a test blue print, that teachers must periodically attend seminars and workshops, that a committee should be formed to review test items before administration, that principals should supervise teachers to make sure they are doing the right thing, and that HODs must moderate test items before administration. The grand mean score of all the respondents is 2.84, which is higher than the 2.50 benchmark. Therefore, principals must supervise test items before administration in order to beef up their quality and avoid errors.

Table 1. Mean rating of questionnaire on the extent to which secondary school teachers in Ebonyi State validate their test items


Table 2. Mean rating of questionnaire on the characteristic of validity secondary school teachers in Ebonyi State use to test the validity of their test items


Table 3. Mean rating of questionnaire on what can be done to beef up the quality of teacher-made tests in secondary schools in Ebonyi State

6. Test of Hypotheses

Table 4. t-test summary on the significant difference in the mean rating of male and female teachers of English in secondary schools in Ebonyi State on the extent to which they validate their test items


The data in Table 4 showed that the mean scores of male and female teachers on the extent to which they validate their test items were 2.6377 and 2.4723, with standard deviations of 0.28449 and 0.23637 respectively. This indicates that male teachers in Ebonyi State secondary schools validate their test items to a greater extent than female teachers. The test also yielded a p-value of 0.000, which is lower than the chosen level of significance, 0.05. The null hypothesis, which states that there is no significant difference in the mean rating of male and female teachers of English in secondary schools in Ebonyi State on the extent to which they validate their test items, was consequently rejected.
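The comparison in Table 4 rests on an independent-samples t-test. A minimal sketch of the underlying statistic, using only Python's standard library and invented per-teacher mean ratings (the group means reported in the study were 2.6377 and 2.4723), might look like this:

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic for two independent samples
    (unequal variances assumed)."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

# Invented mean ratings for a handful of male and female teachers.
male = [2.7, 2.6, 2.9, 2.5, 2.8, 2.4]
female = [2.5, 2.4, 2.6, 2.3, 2.5, 2.2]
t = welch_t(male, female)
```

In practice one would use a library routine such as `scipy.stats.ttest_ind(male, female, equal_var=False)`, which also returns the p-value to compare against the 0.05 significance level.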

7. Discussion

The first research question sought information on the extent to which teachers of English in public secondary schools in Ebonyi State validate their test items. The findings indicate that teachers of English generally do not validate their test items, as 78.5% accepted that they construct test items on the day of administration; only 21.5% of the population indicated that their test items are constructed earlier than the administration day. From the data, there would be no time to subject test items to validation processes before administration if teachers hurriedly construct them on the actual day of administration. Also, lending credence to this claim, 95.9% rejected the statement that test items are analyzed before being administered to testees. This shows that no form of item analysis is done to determine the effectiveness of test items before administration.

Furthermore, it was discovered that test items were not submitted to any authority, such as an examination committee, the head of department or the dean of studies, for scrutiny. The data showed that 55.9% rejected the assertion that test items are submitted to the HOD for assessment; a whopping 97.3% rejected the claim that a committee reviews test items before administration; and 88.8% disagreed that test items are sent to test experts to scrutinize before administration. Finally, it was discovered that every teacher handles his/her test items alone. Available data revealed that 83.7% affirmed that every teacher handles his/her test items alone to avoid leakage, and 92.9% concurred that individual teachers are at liberty to do as they see fit with test items in their subjects.
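The percentages quoted in this discussion (e.g. 78.5% accepted, 21.5% rejected) come from collapsing the 4-point scale into agree versus disagree. A small sketch with made-up ratings shows the computation:

```python
def percent_accept(ratings, cutoff=3):
    """Percentage of respondents who agreed, i.e. chose Agree (3)
    or Strongly Agree (4) on the 4-point scale."""
    return 100.0 * sum(r >= cutoff for r in ratings) / len(ratings)

# Invented responses to one statement, e.g. "test items are
# constructed on the day of the test".
ratings = [4, 3, 3, 4, 3, 2, 1, 3, 4, 2]
accepted = percent_accept(ratings)   # share who agreed
rejected = 100.0 - accepted          # share who disagreed
```

With these invented ratings, 7 of 10 respondents chose 3 or 4, giving 70% acceptance and 30% rejection.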

Table 2 presented the data that answered the second research question, which sought answers on the characteristics of validity applied by teachers of English in their assessment of test items. Although Table 1 revealed that virtually no measures are taken to validate language test items, the data here showed that test items actually possess face and content validity, even if only by the assessment of the individual teachers. For instance, 56.1% of the population affirmed that their test items cover every aspect of instruction; 80.7% conceded that their test items resemble language tests; and 77.7% agreed that their test items correspond with the set goals of instruction. In other words, most of the respondents accepted that their language tests look like language tests and that their test items agree with the objectives of instruction.

However, these claims are one-sided, since no other person was required to verify them. Moreover, no specification or test blue print was used in the construction of test items by teachers of English in public secondary schools in Ebonyi State. This is seen from the 51% of the population who rejected the statement on the use of a test blue print to prepare test items.

Table 3 presented data on the strategies to be employed to improve the quality of teacher-made tests in secondary schools in Ebonyi State; evidence showed that all five items scored above 50%, indicating that they were all accepted. To elucidate further, 86.4% affirmed that test items should be constructed with a test blue print; 80.4% supported the view that teachers must periodically attend seminars and workshops if they are to improve in test item development and ensure quality in the system; 84.7% accepted that committees be set up to oversee test construction and administration in schools; 88% felt that principals should engage in active supervision; and 79% accepted that heads of departments must moderate test items before administration. This implies that teachers want the best for the system and are prepared to do the right thing in a bid to enhance the quality of their products.

8. Conclusion

The study explored the extent of validation of teacher-made language tests in secondary schools in Ebonyi State. Given the importance of evaluation in schools, it is appalling to note that no effort is made to ensure that tests are constructed and administered properly. Language test items are constructed in a hurry, and no time is taken to analyze and ascertain their effectiveness. Even though test items are constructed bearing in mind the objectives of instruction, they are not sent to any expert for validation; neither are they constructed with the aid of a test blue print or table of specification. Consequently, results may go either way: students may pass too well or they may fail drastically. Either way, the result will not be a true outcome of instruction.

However, reports over the years have shown that there is massive failure in English in external examinations such as the West African Senior Secondary School Certificate Examination (WASSSCE). One begins to wonder whether the inability to validate test items is a major cause of the failure. Ensuring quality in teacher-made language tests is a good way of improving the performance of students in secondary schools. Among the strategies that can help ensure that language test items are validated in secondary schools in Ebonyi State are that heads of departments should adequately supervise the teachers under them to ensure that they construct their test items with the aid of a test blue print, and that committees should be set up to review and validate test items.

9. Recommendations

Based on the findings of the study, the following recommendations are made.

  • Without proper supervision, activities in schools will be chaotic. The study recommends strenuous supervision of academic activities in secondary schools in Ebonyi State by both internal and external authorities.

  • Teachers of English can only give what they have. In this dispensation, when graduates of English are employed to teach English without teaching qualifications, adequate training in test item construction should periodically be provided for teachers of English to remedy the inadequacy.

  • Departments of English in public secondary schools in Ebonyi State must, as a matter of urgency, set up test item review committees to monitor test items before administration.

Published with license by Science and Education Publishing, Copyright © 2019 Nwani-Grace Ugwu and Solomon Okechukwu Mkpuma


Teacher-Made Tests vs Standardized Tests


Teacher-made tests are designed by teachers for the purpose of conducting classroom tests. They can take the form of oral tests or written tests. A standardized test, by contrast, is one in which the procedure, apparatus and scoring have been fixed so that precisely the same test can be given at different times and places.

A teacher is more concerned with teacher-made tests, as she is directly involved in their construction. Moreover, teacher-made tests have an advantage over standardized tests because they can be constructed to measure outcomes directly related to specific classroom objectives and particular class situations. These tests are within the means of every teacher and are most economical. Teacher-made oral tests are designed to measure students' performance in skills like listening and speaking in language learning. Written tests are designed to assess students' knowledge, comprehension and written expression.

Standardized Tests

A standardized test is one in which norms have been established: the test has been given to a large number of students, and a norm is an average score that measures achievement, so every standardized test has norms. It is intended for general use and covers a wider scope of material than an ordinary teacher-made test. The procedure, apparatus and scoring have been fixed so that precisely the same test can be given at different times and places. A standardized test has been given to so many people that the test makers can determine fairly accurately how well a typical person of a particular age or grade in school will succeed in it.
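Norms let a raw score be interpreted relative to the norm group. Assuming scores are roughly normally distributed, a percentile rank can be estimated from a standard score; the norm-group data below are invented for illustration:

```python
from statistics import NormalDist, mean, pstdev

# Invented norm-group raw scores gathered during standardization.
norm_scores = [42, 55, 61, 48, 70, 53, 66, 59, 47, 64]
norm_mean, norm_sd = mean(norm_scores), pstdev(norm_scores)

def percentile_rank(raw):
    """Estimated percentile rank of a raw score against the norm group,
    assuming scores are approximately normally distributed."""
    z = (raw - norm_mean) / norm_sd      # standard score
    return 100 * NormalDist().cdf(z)
```

A student scoring at the norm-group mean would land at about the 50th percentile; the further above the mean a raw score lies, the higher its estimated percentile rank.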

Role of a Standardized Test

A standardized test provides:

  • Information that makes it easier to convince the guardians of students.
  • Information in much less time than other devices provide.
  • Information useful to all guidance workers.
  • Information on aspects of behaviour that otherwise could not be obtained.
  • Objective and impartial information about an individual.

Steps Involved in Standardizing a Test

A standardized test is tried out and administered on a number of subjects for the express purpose of refining the items by subjecting the performances to pertinent statistical analysis. A standardized test is constructed by test specialists or experts, and the steps involved are:

  • Proper planning
  • Adequate preparations
  • Try-out of the test
  • Item analysis
  • Preparation of proper norms
  • Preparation of a manual containing instructions for administering the test
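The item-analysis step above can be illustrated with two classical indices, difficulty (p) and discrimination (D), computed from the conventional upper and lower 27% scorer groups. The student data here are hypothetical:

```python
def item_analysis(item_scores, total_scores, top_frac=0.27):
    """Classical item-analysis indices for one dichotomous item:
    difficulty p (proportion answering correctly) and discrimination D
    (upper-group minus lower-group proportion correct), using the
    conventional upper/lower 27% groups."""
    n = len(total_scores)
    k = max(1, round(top_frac * n))
    # Rank students by total test score, highest first.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p = sum(item_scores) / n
    d = (sum(item_scores[i] for i in upper)
         - sum(item_scores[i] for i in lower)) / k
    return p, d

# Invented data: 1 = answered this item correctly, 0 = wrong,
# alongside each student's total test score.
item_scores  = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
total_scores = [95, 88, 82, 40, 75, 35, 30, 70, 45, 25]
p, d = item_analysis(item_scores, total_scores)
```

An item with p near 0.5 and a clearly positive D separates strong from weak students; items with D near zero or negative would be revised or dropped during the try-out stage.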

Teacher-Made Tests vs. Standardized Tests

The standardized test is based on general content and objectives common to many schools all over the country, whereas the teacher-made test can be adapted to content and objectives specific to the teacher's own situation. The standardized test deals with large segments of knowledge or skill, whereas the teacher-made test can be prepared in relation to any specific, limited topic. The standardized test is developed with the help of professional writers, reviewers and editors of test items, whereas the teacher-made test usually relies upon the skill of one or two teachers. The standardized test provides norms for various groups that are broadly representative of performance throughout the country, whereas the teacher-made test lacks this external point of reference.

Characteristics of a Standardized Test

  • Standardized tests are based on the content and objectives of teaching common to many schools.
  • Not just one, but a team of experts are involved in the writing of test items.
  • Item analysis is done on the basis of a pilot study, unlike in the case of a classroom test.
  • Norms are calculated for the purpose of comparison between grades, schools, age levels and sexes.
  • They cover large segments of knowledge and skills.
  • Test manuals are prepared.
  • A fairly large sample, not just one class, is involved in the standardization of a test.

The teacher needs to test student performance. Test results are critical, not only because they affect careers but because of the influence they exercise on motivation to learn. The teacher must be aware of different testing techniques because they give useful information to both the teacher and the students. Testing techniques are often similar to teaching techniques, but with a different purpose.

Texas Education Agency must accommodate teachers during certification tests, DOJ says


The U.S. Department of Justice has reached a settlement with the Texas Education Agency over a 2022 complaint that accused the latter of not providing appropriate accommodations to a teacher taking a reading certification exam, the federal department announced Wednesday.

The settlement requires the TEA to allow testers with dyslexia or dysgraphia to use alternative exam arrangements, such as text-to-speech technology, when taking a reading certification exam.

The Justice Department opened the case after receiving a complaint alleging that the TEA had violated Title II of the Americans with Disabilities Act in administering a Science of Teaching Reading exam, a regular test for issuing certain teaching certifications.

The TEA didn’t respond to a request for comment Thursday.

A 2019 law requires all Texas teacher candidates who teach students from prekindergarten through sixth grade to demonstrate proficiency in the Science of Teaching Reading program.

In 2021, the TEA directed NCS Pearson Inc., which administers the exam, not to allow a reader for exam takers who weren’t blind on tests in which reading skills are measured, according to the settlement agreement. A reader in an exam will read out test materials for someone who needs accommodations.

The TEA determined that such an accommodation “could fundamentally alter the accurate measurement of knowledge or skills intended to be measured by that exam,” according to the settlement.

In 2022, a Science of Teaching Reading tester who had previously been diagnosed with dyslexia and dysgraphia requested extra test time, a scribe and either someone to read the test or speech-to-text technology, according to the complaint.

The TEA denied the tester’s request June 20, 2022, but the tester has received these same accommodations for at least three different teacher certification exams since then, according to the complaint.

“Preparing for professional certification examinations is a stressful time for anyone, let alone people with disabilities who may also worry that their requests for alternative testing arrangements may be rejected,” U.S. Attorney Jaime Esparza said. “People with dyslexia should not be denied the testing accommodations they deserve. The ADA requires such modifications to ensure that people with disabilities are not being graded on their disabilities, and unfairly denied access to their chosen professions.”

Despite the settlement, the TEA disputed the federal department’s conclusion that it had violated disability law, and the settlement didn’t determine any liability, according to the federal department.

The TEA oversees state education policy such as certifying teachers to educate in Texas.


  13. The Advantages of Teacher-Made Tests

    The Advantages of Teacher-Made Tests. Formal assessments give teachers a way to test knowledge and plan future instruction, but standardized exams or commercially prepared tests don't always accurately assess that information. The extra time required to prepare exams pays off with the potential for more accurate assessments, and with the ...

  14. Test, measurement, and evaluation: Understanding and use of the

    Since teachers adapt their teaching m ethods in their schemes of work, teacher-made tests are made by teach ers. Consequ ently, teachers are at liberty to customize th ese tests.

  15. [PDF] Assessing Teacher-Made Tests in Secondary Math and Science

    Assessing Teacher-Made Tests in Secondary Math and Science Classrooms. J. Oescher, P. Kirby. Published 1 April 1990. Mathematics, Education. ABSTRACT A model for use in identifying assessment needs in association with teacher-made mathematics and ncience tests at the secondary level was developed. The model focuses on the identification…. Expand.

  16. ERIC

    This study encompassed the collection of teacher reported (N=326) testing practices and the direct assessment of teacher-made tests (N=175) for item cognitive functioning levels and construction errors. Focus was on assessing the nature and quality of teacher-made tests used in public school classrooms and describing the classroom teachers' testing preferences.

  17. What is Teacher-Made Tests

    Definition of Teacher-Made Tests: Assessment instruments, typically with a summative purpose, designed to measure and promote student achievement of knowledge/skills specified in the course learning outcomes. ... (AL) has been widely reported in the education and language teaching literature, tools and methods to help teachers actually develop ...

  18. PDF UNIT 9 COMMONLY USED TESTS IN SCHOOLS

    9.3.4 Teacher-made Vs. Standardised Achievement Tests A cowparision of teacher-made and standardized achievement tests is given in the table below: - - - - Standardized Achievement Test They are used to evaluate outcomes and content that have been determined irrespective of what has been taught. General quality of items is high. They

  19. 1.15: Teacher made assessment strategies

    Assessment for learning: an overview of the process. Step 1: Having clear instructional goals and communicating them to students. Step 2: Selecting appropriate assessment techniques. Step 3: Using assessment to enhance motivation and confidence. Step 4: Adjusting instruction based on information.

  20. Validity of Teacher-Made Assessment: A Table of ...

    Items 1 to 4 examine teacher understanding of the table of specification while items 5 to 10 test the content validity of teacher-made assessment. The results showed that teachers exhibited a low ...

  21. What Teachers Should Know About Integrating Formative Tests With

    Students should engage in self- and peer-assessments, to help take ownership of their learning. Assessments should make students' thinking visible to both the teacher and students, to correct ...

  22. The Role of the Teacher-Made Test in Higher Education

    Views teacher-made tests as a fundamental part of the educational process, defining instructional purposes, influencing what students study, and helping instructors gain perspective on their courses. Offers examples of test items that can perform these functions, while assessing knowledge and skills. (DMM)

  23. Ensuring Quality in Education: Validity of Teacher-made Language Tests

    Teacher-made tests are locally developed by subject teachers in schools to assess the achievement of their students in areas covered in instruction. While the standardized test is valid, reliable and has a table of norms, the teacher-made test does not possess any form of norms 2. The standardized test is designed to be used on a much larger ...

  24. Teachers Made Test vs Standardized Tests

    The Teacher made Test vs. Standardized Tests: -. The standardized test is based on the general content and objectives common to many schools all over the country whereas the teacher made test can be adapted to content and objectives specific to his own situation. The standardized test deals with large segments of knowledge or skill whereas ...
