“Assessment” has been an important aspect of teaching and learning (or perhaps more accurately, it has been a buzzword garnering much attention) for most of my career in education. Advocates for many positions (political as much as pedagogical) argue that assessment is the key to achieving their vision, thus “fixing the broken educational system” once and for all.
The reality, of course, is that assessment is a much more sophisticated and nuanced part of the educational experience than these advocates allow. Clearly, educators must determine what a student has learned, and (for many reasons) that learning must be reduced to a number of proxies, each designed to capture and reflect what the student has learned.
In many ways, the summaries we use to assess students’ learning are an attempt to reify what happens in schools. We reason, “my methods must work, because I observed these changes on these assessments.” Educators do not admit, however, that our instruments are weak (“aligning your assessments with your instruction” is worthwhile, but dubious), that they are subject to misuse (students don’t bother reading questions, educators’ biases affect their assessments), and that we can be quite unskilled at interpreting results.
The problem of defining and implementing appropriate assessment in schools is becoming more challenging as well. When print dominated, educators could be relatively certain of the skills that students needed. I have some of my grandfather’s college textbooks next to mine. We both studied science, which had largely changed in the 49 years between our graduation dates, but we both learned by reading textbooks and taking notes in those books. Today, students carry laptops and digital textbooks, and they are as likely to study from video as from textbooks. “Becoming educated” has been a more sophisticated endeavor for my children than it was for my grandfather and me. My experiences as someone who has succeeded in both of these worlds are interesting, but they are the topic of another post.
Largely because information (and other) technology is changing how individual humans understand, how we organize our institutions, and the norms society holds, educators cannot predict with the same certainty what students must learn and which proxies are appropriate for assessment purposes. This problem has occupied my professional attention in recent years, and thanks to continued efforts to collaboratively design a comprehensive assessment method, colleagues and I have arrived at a clearer, more complete, and simpler system for answering essential assessment questions.
First, we conclude that three questions are relevant to understanding what matters in students’ learning, and that each carries equal value:
- Does the student have the habits of effective learners and workers?
- Can the student produce polished solutions to sophisticated problems?
- How does the student compare to others?
These questions are answered in different ways, and together the three form a reasonable and complete system for assessing students’ learning.
In course grades, we answer the question “Does the student have the habits of effective learners and workers?” Consider the typical classroom. Over the course of months, students participate in a variety of activities and complete a range of assignments and tasks. Teachers make professional judgments about each student’s characteristics and the degree to which he or she has mastered the material and is prepared to learn more. Just as we do not always expect a supervisor to follow an objective instrument when judging workers’ performance, we should not expect educators to be completely objective.
Of course, as subjectivity enters the grading process, educators will find it necessary to defend decisions, which will motivate them to more deeply articulate expectations, observe learning, and record that learning. All of these are benefits of including educators’ judgments in course grades.
A performance is an activity in which we answer, “Can the student produce polished solutions to sophisticated problems?” Performances are projects and products that working professionals would recognize as familiar outcomes; professionals would be interested in the motivation for the performance, the nature of the work, and the quality of the result. Questions regarding a performance are best directed to the student, because it was selected, planned, and carried out by the student.
Teachers do have a role in setting the context of a performance, guiding decisions, and facilitating the student’s reflection on the activity; but through a performance, a student demonstrates the capacity to frame and solve complex problems and to complete complex communication tasks. While the “projects” included in course grades contribute to students’ ability to complete these assessments, performances are typically independently constructed and lie outside traditional curriculum boundaries.
Tests have been at the center of intense interest in educational policy in the 21st century. The political motivations for these tests have been challenged, but that debate is beyond the focus of this post. For the purposes of this essay it is sufficient to recognize that large-scale tests (think the SAT, ACT, SBAC, PARCC, AccuPlacer, and the like) can be used to determine how a particular student did in comparison to all of the others who took that test.
A few details are necessary to complete the picture of what these tests show. First, standardized tests were used almost exclusively for these purposes in the 20th century; this century, standards-based tests have become more common. A standardized test is a norm-referenced test, which means the scores are expected to follow a normal distribution (a bell curve) and an individual’s score is understood in terms of its position within that distribution. On a standards-based test, an individual’s score is instead compared to the items he or she would be expected to answer correctly if the standard has been met.
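The difference between the two interpretations can be sketched in a few lines of Python. The numbers here (mean, standard deviation, cut score) are invented for illustration and do not come from any real test:

```python
from statistics import NormalDist

def percentile_rank(score, mean, sd):
    """Norm-referenced interpretation: the fraction of the norming
    population scoring below `score`, assuming scores follow a
    normal distribution with the given mean and standard deviation."""
    return NormalDist(mu=mean, sigma=sd).cdf(score)

def meets_standard(score, cut_score):
    """Standards-based (criterion-referenced) interpretation: the
    score is compared to a fixed cut score, not to other test-takers."""
    return score >= cut_score

# Illustrative numbers only (not from any real test):
print(round(percentile_rank(600, mean=500, sd=100), 2))  # 0.84
print(meets_standard(600, cut_score=550))                # True
```

The same raw score of 600 yields two different statements: “better than about 84% of test-takers” (norm-referenced) versus “above the cut score, so the standard is met” (standards-based).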
Regardless of the exact nature of the tests, those interested in assessment of learning must recognize that these tests are administered for the purpose of comparison. Also, these tests are of dubious reliability. One of the fundamental ideas of all data collection is that measurements have errors, so a single measure, taken with one instrument, administered once, is nearly meaningless. While the test results of a large group of students may allow us to draw conclusions about the group as a whole, a single student’s score cannot be used to draw reasonable conclusions about that student.
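The measurement-error point can be made concrete with the classical standard error of measurement, SEM = SD × √(1 − reliability). The sketch below uses invented but plausible numbers (a test with standard deviation 100 and reliability 0.90):

```python
import math

def score_interval(observed, sd, reliability, z=1.96):
    """Approximate 95% confidence band around a single observed score,
    using the classical standard error of measurement:
    SEM = sd * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem

# Illustrative numbers only: a score of 500 on a test with sd=100
# and reliability 0.90 is consistent with "true" scores roughly
# anywhere in a band more than 120 points wide.
low, high = score_interval(500, sd=100, reliability=0.90)
print(round(low), round(high))  # 438 562
```

Even on a quite reliable test, the band around one administration is wide enough that treating the single number as a precise statement about the student is hard to justify.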
If we consider assessment as a method whereby educators can understand their program as much as they can understand students’ learning, then we see the three questions and the three types of assessments forming a meaningful and informative assessment system.