Module 1 Discussion
Data Use & Teaching
Directions:
To make the best use of data, educators must go beyond the big tests and involve teachers and students in collecting and analyzing data. After studying Module 1: Lecture Materials & Resources, discuss the following:
- Education Majors:
- How can educators effectively involve both teachers and students in the data collection and analysis process?
- What challenges might arise from this approach, and how can they be addressed to ensure data-driven decision-making benefits the learning environment?
- Instructional Design Majors:
- As an instructional designer, how would you design a framework that involves both teachers and students in the process of collecting and analyzing data?
- What tools, strategies, or methods would you recommend to ensure the data collected is meaningful and actionable for improving learning outcomes?
Submission Requirements:
- Each post must be clear and concise; points will be deducted for improper grammar, punctuation, and misspellings.
- Your initial post should be at least 200 words, formatted and cited in current APA style. You should research and reference at least one article in addition to the textbook readings.
Module 1: Lecture Materials & Resources
Introduction to Assessment & Data Analysis
Read and watch the lecture resources & materials below early in the week to help you respond to the discussion questions and to complete your assignment(s).
(Note: The citations below are provided for your research convenience. You should always cross-reference the current APA guide for correct styling of citations and references in your academic work.)
Read
· Popham, W. J. (2024). Classroom assessment: What teachers need to know (10th ed.). Pearson.
· Chapter 6: Selected-Response Tests
· Chapter 7: Constructed-Response Tests
· Chapter 8: Performance Assessment
· Chapter 9: Portfolio Assessment
· McDonald, J. P. (2019). Toward more effective data use in teaching. Phi Delta Kappan, 100(6), 50–54.
· As part of your readings in this module, please also review the following:
· Syllabus
Watch
· Teachings in Education. (2016, December 18). Assessment in education: Top 14 examples [Video]. YouTube. https://youtu.be/zTkQjH-_97c (4:21)
Supplemental Materials & Resources
· Popham, W. J. (2010). Everything school leaders need to know about assessment. Corwin Press. Print: 9781412979795 eText: 9781452271514
Chapter 6
Selected-Response Tests
Chief Chapter Outcome
The ability to accurately employ professionally accepted item-writing guidelines, both general and item-type specific, when constructing selected-response items or evaluating those items constructed by others
Learning Objectives
6.1 Using the “Five General Item-Writing Precepts” found in this chapter, identify and explain common characteristics of poorly constructed selected-response test items.
6.2 Differentiate between the four different varieties of selected-response test items (binary-choice items, multiple binary-choice items, multiple-choice items, matching items) and be able to create an example of each.
In this and the following four chapters, you will learn how to construct almost a dozen different kinds of test items you might wish to use for your own classroom assessments. As suggested in the preceding chapters, you really need to choose item types that mesh properly with the inferences you want to make about students—and to be sure those inferences are directly linked to the educational decisions you need to make. Just as the child who's convinced that vanilla is ice cream's only flavor won't benefit from a 36-flavor ice cream emporium, the more item types you know about, the more appropriate your selection of item types will be. So, in this chapter and the next four, you'll be learning about blackberry-ripple exams and mocha-mango assessment devices.
Realistically, what should you expect after wading through the exposition about item construction contained in the upcoming text? Unless you're a remarkably quick study, you'll probably finish this material and not be instantly transformed into a consummately skilled test constructor. It takes more than reading a brief explanation to turn someone into a capable item developer. But, just as the journey of a thousand miles begins with a single step, you'll have initiated a tentative trot toward the Test Construction Hall of Fame. You'll have learned the essentials of how to construct the most common kinds of classroom assessments.
What you'll need, after having completed your upcoming, fun-filled study of test construction, is tons of practice in churning out classroom assessment devices. And, if you remain in teaching for a while, such practice opportunities will surely come your way. Ideally, you'll be able to get some feedback about the quality of your classroom assessment procedures from a supervisor or colleague who is experienced and conversant with educational measurement. If a competent cohort critiques your recent test construction efforts, you'll profit by being able to make needed modifications in how you create your classroom assessment instruments.
Expanding Electronic Options
Once upon a time, when teachers churned out all their own classroom tests, about the only approach available was reliance on paper in order to present items that, back then, students responded to using pencils or pens. Oh, if you head back far enough in history, you might find Egyptian teachers relying on papyrus or, perhaps, pre-history teachers dishing up pop quizzes on tree bark.
But we have evolved (some more comfortably than others) into a technological age in which whole classrooms full of students possess laptop computers, electronic tablets, or super-smart cell phones that can be employed during instruction and assessment. Accordingly, because the availability of such electronically provided assessment options depends almost totally on what's available for use in a particular district or school, some of the test construction guidelines you will encounter here may need to be massaged because of electronic limitations in the way a test's items can be written. To illustrate, you'll learn how to create "matching" items for a classroom assessment. Well, one of the recommendations to teachers who use such items is that they put everything for a given item on a single page (of paper)—so that the students need not flip back and forth between pages when selecting their answers. But what if the electronic devices that a teacher's students have been given do not provide sufficient room to follow this all-on-one-page guideline?
Well, in that situation it makes sense for a teacher to arrive at the most reasonable/sensible solution possible. Thus, the test construction guidelines you'll encounter from here on in this book will be couched almost always in terms suitable for paper-presented tests. If you must create classroom assessments using electronic options that fail to permit implementation of the guidelines presented, just do the best job you can in adapting a guideline to the electronic possibilities at hand. Happily, the use of electronic hardware will typically expand, not truncate, your assessment options.
Before departing from our consideration of emerging digital issues concerning educators, a traditional warning is surely warranted. Although the mission and the mechanics of educational testing are generally understood by many educators, the arrival of brand-new digitally based assessment procedures is apt to baffle today's educators who fail to keep up with evolving versions of modern educational measurement. New ways of testing students and more efficient ways of doing so suggest that today's teachers simply must keep up with innovative testing procedures heretofore unimagined.
This advice was confirmed in an April 12, 2022, Education Week interview with Sal Khan, founder of the nonprofit Khan Academy, which now counts 137 million users in 190 nations. Khan was asked how best to employ new technology and software tools to close the learning gaps that emerged during the COVID-19 pandemic. He remarked, "I think that's going to be especially important because traditional testing regimes have been broken. And it's unclear what they're going back to."
Because of pandemic-induced limitations on large crowds, such as those routinely seen over the years when students were obliged to complete high-stakes examinations, several firms have been exploring the virtues of providing customers with digitalization software that can track a test-taker's eye movements and even sobbing during difficult exams. Although the bulk of these development efforts have been aimed at college-level students, it is almost certain that experimental electronic proctoring systems will soon be aimed at lower grade levels.
A May 27, 2022, New York Times report by Kashmir Hill makes clear that the COVID-19 pandemic, because of its contagion perils, created “a boom time for companies that remotely monitor test-takers.” Suddenly, “millions of people were forced to take bar exams, tests, and quizzes alone at home on their laptops.” Given the huge number of potential customers in the nation’s K–12 schools, is not a shift of digital proctoring companies into this enormous market a flat-out certainty?
Ten (Divided by Two) Overall Item-Writing Precepts
As you can discern from its title, this text is going to describe how to construct selected-response sorts of test items. You'll learn how to create four different varieties of selected-response test items—namely, binary-choice items, multiple binary-choice items, multiple-choice items, and matching items. All four of these selected-response kinds of items can be used effectively by teachers to derive defensible inferences about students' cognitive status—that is, the knowledge and skills that teachers typically try to promote in their students.
But no matter whether you're developing selected-response or constructed-response test items, there are several general guidelines that, if adhered to, will lead to better assessment procedures. Because many ancient sets of precepts have been articulated in a fairly stern "Thou shall not" fashion, and have proved successful in shaping many folks' behavior through the decades, we will now dish out five general item-writing commandments structured along the same lines. Following these precepts might not get you into heaven, but it will make your assessment schemes slightly more divine. All five item-writing precepts are presented in a box below. A subsequent discussion of each precept will help you understand how to adhere to the five item-writing mandates being discussed. It will probably help if you refer to each of the following item-writing precepts (guidelines) before reading the discussion of that particular precept. Surely, no one would be opposed to your doing just a bit of cribbing!
Five General Item-Writing Precepts
1. Thou shalt not provide opaque directions to students regarding how to respond.
2. Thou shalt not employ ambiguous statements in your assessment items.
3. Thou shalt not provide students with unintentional clues regarding appropriate responses.
4. Thou shalt not employ complex syntax in your assessment items.
5. Thou shalt not use vocabulary that is more advanced than required.
Opaque Directions
Our first item-writing precept deals with a topic most teachers haven't thought seriously about—the directions for their classroom tests. Teachers who have been laboring to create a collection of test items typically know the innards of those items very well. Thus, because of the teacher's intimate knowledge not only of the items, but also of how students are supposed to deal with those items, it is often the case that only sketchy directions are provided to students regarding how to respond to a test's items. Yet, of course, unclear test-taking directions can result in confused test-takers. And the responses of confused test-takers don't lead to very accurate inferences about those test-takers.
Flawed test directions are particularly problematic when students are being introduced to assessment formats with which they're not very familiar, such as the performance tests to be described in Chapter 8 or the multiple binary-choice tests to be discussed later in this chapter. It is useful to create directions for students early in the game when you're developing an assessment instrument. When generated as a last-minute afterthought, test directions typically turn out to be tawdry.
Ambiguous Statements
The second item-writing precept deals with ambiguity. In all kinds of classroom assessments, ambiguous writing is to be avoided. If your students aren't really sure about what you mean in the tasks you present to them, the students are apt to misinterpret what you're saying and, as a consequence, come up with incorrect responses, even though they might really know how to respond correctly. For example, sentences in which pronouns are used can fail to make it clear to which individual or individuals a pronoun refers. Suppose that, in a true–false test item, you asked your students to indicate whether the following statement was true or false: "Leaders of developing nations have tended to distrust leaders of developed nations due to their imperialistic tendencies." Because it is unclear whether the pronoun their refers to the "leaders of developing nations" or to the "leaders of developed nations," and because the truth or falsity of the statement depends on the pronoun's referent, students are likely to be confused.
Because you will typically be writing your own assessment items, you will know what you mean. At least you ought to. However, try to slide yourself, at least figuratively, into the shoes of your students. Reread your assessment items from the perspective of the students, and then modify any statements apt to be even a mite ambiguous for those less well-informed students.
Unintended Clues
The third of our item-writing precepts calls for you to intentionally avoid something unintentional. (Well, nobody said that following these assessment precepts was going to be easy!) What this precept is trying to sensitize you to is the tendency of test-development novices to inadvertently provide clues to students about appropriate responses. As a consequence, students come up with correct responses even if they don't possess the knowledge or skill being assessed.
For example, inexperienced item-writers often tend to make the correct answer to multiple-choice items twice as long as the incorrect answers. Even the most confused students will often opt for the lengthy response; they get so many more words for their choice. As another example of how inexperienced item-writers unintentionally dispense clues, absolute qualifiers such as never and always are sometimes used for the false items in a true–false test. Because even uninformed students know there are few absolutes in this world, they gleefully (and often unthinkingly) indicate such items are false. One of the most blatant examples of giving unintended clues occurs when writers of multiple-choice test items initiate those items with incomplete statements such as "The bird in the story was an . . ." and then offer answer options in which only the correct answer begins with a vowel. For instance, even though you had never read the story referred to in the previous incomplete statement, if you encountered the following four response options, it's almost certain that you'd know the correct answer: A. Falcon, B. Hawk, C. Robin, D. Owl. The article an gives the game away.
Unintended clues are seen more frequently with selected-response items than with constructed-response items, but even in supplying background information to students for complicated constructed-response items, the teacher must be wary of unintentionally pointing truly unknowledgeable students deftly down a trail to the correct response.
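To make these patterns concrete, here is a minimal, hypothetical sketch (in Python) of how a batch of draft items might be screened for the three clue patterns just described: an overly long correct answer, absolute qualifiers in false statements, and an article such as "an" that fits only one option. The item fields and thresholds are invented for illustration; these precepts call for an item-writer's judgment, and a sketch like this merely flags items worth a second look.

```python
# Hypothetical screen for the unintended-clue patterns described above.
# The item structure ("stem", "options", "answer", "is_true") is illustrative only.

ABSOLUTE_QUALIFIERS = {"always", "never", "all", "none", "every"}

def clue_warnings(item):
    warnings = []

    if item.get("options"):  # multiple-choice item
        correct = item["options"][item["answer"]]
        others = [text for key, text in item["options"].items() if key != item["answer"]]
        # Pattern 1: the correct answer is far longer than the distractors.
        if others and len(correct) > 2 * max(len(o) for o in others):
            warnings.append("Correct answer is much longer than the distractors.")
        # Pattern 3: a stem ending in "an" when only the correct option starts with a vowel.
        if item["stem"].rstrip(" .").lower().endswith(" an"):
            vowel_options = [o for o in item["options"].values() if o[0].lower() in "aeiou"]
            if vowel_options == [correct]:
                warnings.append('The article "an" points to the only vowel-initial option.')
    else:  # binary-choice (true-false) item
        # Pattern 2: absolute qualifiers such as "never" or "always" in a false statement.
        words = set(item["stem"].lower().replace(".", "").split())
        if not item["is_true"] and words & ABSOLUTE_QUALIFIERS:
            warnings.append("False statement relies on an absolute qualifier.")

    return warnings

# Example: the bird item from the text, with its give-away article.
bird_item = {
    "stem": "The bird in the story was an",
    "options": {"A": "Falcon", "B": "Hawk", "C": "Robin", "D": "Owl"},
    "answer": "D",
}
print(clue_warnings(bird_item))  # ['The article "an" points to the only vowel-initial option.']
```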
Computer-Adaptive Assessment: Pros and Cons
Large-scale assessments, such as statewide accountability tests or nationally standardized achievement tests, are definitely different from the classroom tests that teachers might, during a dreary weekend, whip up for their students. Despite those differences, the assessment tactics used in large-scale tests should not be totally unknown to teachers. After all, parents of a teacher's students might occasionally toss out questions at a teacher about such tests, and what teacher wants to be seen, when it comes to educational testing, as a no-knowledge ninny?
One of the increasingly prevalent variations of standardized achievement testing encountered in our schools is known as computer-adaptive assessment. In some instances, computer-adaptive testing is employed as a state's annual, large-scale accountability assessment. In other instances, commercial vendors offer computer-adaptive tests to cover shorter segments of instruction, such as two or three months. School districts typically purchase such shorter-duration tests in an attempt to assist classroom teachers in adjusting their instructional activities to the progress of their students. In general, these more instructionally oriented tests are known as interim assessments, and we will consider such assessments more deeply later (in Chapter 12).
Given the near certainty that students in many states will be tested via computer-adaptive assessments, a brief description of this distinctive assessment approach is in order.
Not all assessments involving computers, however, are computer-adaptive. Computer-based assessments rely on computers to deliver test items to students. Moreover, students respond to these computer-transmitted items by using a computer. In many instances, immediate scoring of students' responses is possible. This form of computer-abetted assessment is becoming more and more popular as (1) schools acquire enough computers to make the approach practicable and (2) states and school districts secure sufficient "bandwidth" (whatever that is!) to transmit tests and receive students' responses electronically. Computer-based assessments, as you can see, rely on computers only as delivery and retrieval mechanisms. Computer-adaptive assessment is something quite different.
Here's a shorthand version of how computer-adaptive assessment is usually described. Notice, incidentally, the key term adaptive in its name. That word is your key to understanding how this approach to educational assessment is supposed to function. As a student takes this kind of test, the student is given items of known difficulty levels. Then, based on the student's responses to those initial items, an all-knowing computer supplies new items that are tailored in difficulty level on the basis of the student's previous answers. For instance, if a student is correctly answering the early items doled out by the computer, then the next items popping up on the screen will be more difficult ones. Conversely, if the student stumbles on the initially presented items, the computer will then cheerfully provide easier items to the student, and so on. In short, the computer's program constantly adapts to the student's responses by providing items better matched to the student's assessment-determined level of achievement.
Using this adaptive approach, a student's overall achievement regarding whatever the test is measuring can be determined with fewer items than would typically be required. This is because, in a typical test, many of the test's items are likely to be too difficult or too easy for a given student. Accordingly, one of the advertised payoffs of computer-adaptive testing is that it saves testing time—that is, it saves those precious instructional minutes often snatched away from teachers because of externally imposed assessment obligations. And there it is, the promotional slogan for computer-adaptive testing: More Accurate Measurement in Less Time! What clear-thinking teacher does not get just a little misty eyed when contemplating the powerful payoffs of computer-massaged testing?
But there are also limitations of computer-adaptive testing that you need to recognize. The first of these limitations stems from the necessity for all the items in such assessments to be measuring a single variable, such as students' "mathematical mastery." Because many items are needed to make computer-adaptive assessment purr properly, and because the diverse difficulties of these items must all be linked to what's sometimes referred to as "a unidimensional trait" (for instance, a child's overall reading prowess), computer-adaptive assessment precludes the possibility of providing student-specific diagnostic data. Too few items dealing with a particular subskill or a body of enabling knowledge can be administered during a student's abbreviated testing time. In other words, whereas computer-adaptive assessment can supply teachers with an efficiently garnered general fix on a student's achievement of what's often a broadly conceptualized curricular aim, it often won't reveal a student's specific strengths and weaknesses. Thus, from an instructional perspective, computer-adaptive assessment usually falls short of supplying the instructionally meaningful results most teachers need.
Second, as students wend their way merrily through a computer-adaptive test, depending on how they respond to certain items, different students frequently receive different items. The adjustments in the items that are dished up to a student depend on the student's computer-determined status regarding whatever unidimensional trait is being measured. Naturally, because of differences in students' mastery of this "big bopper" variable, different students receive different items thereafter. Consequently, a teacher's students no longer end up taking the same exam. The only meaningful way of comparing students' overall test performances, therefore, is by employing a scale that represents the unidimensional trait (such as a student's reading capabilities) being measured. We will consider such scale scores later, in Chapter 13. Yet, even before we do so, it should be apparent that when a teacher tries to explain to a parent why a parent's child tackled a test with a unique collection of items, the necessary explanation is a challenging one—and sometimes an almost impossible one, when the teacher's explanations hinge on items unseen by the parent's child.
Finally, when most educators hear about computer-adaptive assessment and get a general sense of how it operates, they often assume that it does its digital magic in much the same way—from setting to setting. In other words, educators think the program governing the operation of computer-adaptive testing in State X is essentially identical to the way the computer-adaptive testing program operates in State Y. Not so! Teachers need to be aware that the oft-touted virtues of computer-adaptive testing are dependent on the degree of adaptivity embodied in the program that's being employed to analyze the results. Many educators, once they grasp the central thrust of computer-adaptive testing, believe that after a student's response to each of a test's items, an adjustment is made in the upcoming test items. This, of course, would represent an optimal degree of adaptivity. But item-adaptive adjustments are not often employed in the real world because of such practical considerations as the costs involved.
Accordingly, most of today's computer-adaptive tests employ what is called a cluster-adaptive approach. One or more clusters of items, perhaps a half-dozen or so items per cluster, are used to make an adjustment in the upcoming items for a student. Such adjustments are based on the student's performance on the cluster of items. Clearly, the more clusters of items that are employed, the more tailored to a student's status will be the subsequent items. Thus, you should not assume that the computer-adaptive test being employed in your locale is based on an item-adaptive approach when, in fact, it might be doing its adaptive magic based on only a single set of cluster-adaptive items. Computer adaptivity may be present, but it is a far cry from the item-by-item adaptivity that many educators mistakenly assume.
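For readers who prefer a sketch to prose, the cluster-adaptive logic just described can be summarized in a few lines of Python. Everything here is hypothetical: an operational computer-adaptive test selects items using item response theory estimates of a student's ability, whereas this toy version simply nudges a difficulty target up or down according to the proportion of items answered correctly in each cluster, and the item pool, difficulty scale, and cluster size are invented for illustration.

```python
# Hypothetical sketch of cluster-adaptive item selection. Operational systems use
# item response theory (IRT) ability estimates; this toy version adjusts a simple
# difficulty target after each cluster of items.

def run_cluster_adaptive_test(item_pool, respond, cluster_size=6, num_clusters=4):
    """item_pool: list of dicts such as {"id": 7, "difficulty": 3, "answer": "A"}.
    respond: function returning the student's answer to a given item."""
    target = 5                     # begin near the middle of a 1-10 difficulty scale
    seen, total_correct = set(), 0

    for _ in range(num_clusters):
        # Administer a cluster of unseen items closest to the current difficulty target.
        candidates = [i for i in item_pool if i["id"] not in seen]
        cluster = sorted(candidates, key=lambda i: abs(i["difficulty"] - target))[:cluster_size]

        correct = 0
        for item in cluster:
            seen.add(item["id"])
            if respond(item) == item["answer"]:
                correct += 1
        total_correct += correct

        # Adapt: strong cluster performance raises the target; weak performance lowers it.
        proportion = correct / len(cluster)
        if proportion >= 0.7:
            target = min(10, target + 2)
        elif proportion <= 0.3:
            target = max(1, target - 2)

    return total_correct, len(seen), target

# Example: a simulated student who answers correctly only on easier items.
pool = [{"id": n, "difficulty": (n % 10) + 1, "answer": "A"} for n in range(60)]
student = lambda item: "A" if item["difficulty"] <= 4 else "B"
print(run_cluster_adaptive_test(pool, student))
```

An item-adaptive test would re-estimate the target after every single response rather than after each cluster, which is the more finely tuned (and more expensive) arrangement that, as noted above, is not often employed in practice.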
Complex Syntax
Complex syntax, although it sounds something like an exotic surcharge on cigarettes and alcohol, is often encountered in the assessment items of neophyte item-writers. Even though some teachers may regard themselves as Steinbecks-in-hiding, an assessment instrument is no setting in which an item-writer should wax eloquent. This fourth item-writing precept directs you to avoid complicated sentence constructions and, instead, to use very simple sentences. Although esteemed writers such as Thomas Hardy and James Joyce are known for their convoluted and clause-laden writing styles, they might have turned out to be mediocre item-writers. Too many clauses, except at Christmastime, mess up test items. (For readers needing a clue regarding the previous sentence's cryptic meaning, think of a plump, red-garbed guy who brings presents.)
Difficult Vocabulary
Our fifth and final item-writing precept is straightforward. It indicates that when writing educational assessment items, you should eschew obfuscative verbiage. In other words—and almost any other words would be preferable—use vocabulary that is no more advanced than required.