Duncan: Test Scores Not the Best Way to Evaluate Art, Music, Gym Teachers

This one is really wonky, folks, but if you're concerned about standardized testing, I urge you to read on!

Over breakfast this morning, Secretary of Education Arne Duncan acknowledged that the administration's success, via Race to the Top, in getting states to agree to evaluate teachers based on student achievement data has outpaced the ability of states to create the student assessments that make such teacher evaluation possible. 

"This is clearly going to be a choppy transition period," Duncan said, later adding, "We're clearly, as a country, in our infancy" on using student data to evaluate teachers.

Here's the problem: Currently, fewer than half of all public school teachers teach a tested subject in a tested grade. As states embrace value-added teacher evaluation, however, schools will need to collect data on the student achievement outcomes of all teachers. That means either issuing pencil and paper tests to students in every grade and subject area, or devising more complex (and potentially expensive to administer) assessments, such as portfolio systems that correspond to some kind of numerical scale.

In an upcoming American Prospect feature, I look at how one state, Colorado, is attempting to thread this needle. I visited Harrison District 2 in Colorado Springs, where schools are administering pencil and paper tests in grades K-12 in every single subject, including art, music, and physical education. 

When I asked Duncan about that model, he said he doesn't believe paper tests are the best way to collect value-added data in nontraditional subjects. Assistant Secretary Carmel Martin, who was also in the meeting, offered more detail: The DOE is currently "developing guidance for states, so they appreciate it doesn't have to be a paper and pencil test," Martin said. "In things like music and physical education, there are other ways" to assess students and teachers, such as presentations and portfolios of student work.

The challenge is that although most people agree that paper tests in kindergarten gym class are absurd, many districts will be sorely tempted to take the easy way out–testing–when told they must now collect "data" on every single teacher. Tests are cheap to administer and score and have the benefit of being "objective": unless there is outright cheating, two different evaluators will grade a test much the same way. In addition, there's the thorny question of whether it's fair to evaluate and pay a math teacher according to "objective" test-score data, while relying on a highly subjective portfolio system to evaluate and pay an art teacher.

I don't think we're anywhere near a consensus on this, nor a guarantee that value-added teacher evaluation won't lead to much more standardized testing of children. It's good news, however, that the Obama administration plans to jump in and defend the idea of comprehensive assessments that take into account "multiple measures"–not just test scores.

5 thoughts on “Duncan: Test Scores Not the Best Way to Evaluate Art, Music, Gym Teachers”

  1. Gideon

    Assuming consensus is reached that comprehensive assessments are a good idea, they are expensive to develop and implement. Perhaps this is a reason the federal government should get involved in developing assessments: it’s economically more efficient than having all the states or districts or schools develop their own assessments.

  2. Nancy Flanagan

    Tests aren’t cheap. And they’re not objective, either. Many standardized tests aren’t even aligned with the knowledge and skills in the curriculum (including subjects that have traditionally been tested). Further, tests would be the worst possible way to evaluate teachers in “untested” –but important–subjects.

    As a 30-year music teacher, I shudder to think of how good curriculum and instruction in the arts and physical education would be twisted to accommodate pointless testing that reveals nothing of value.

    Don’t misunderstand–I believe in rigorous evaluation of teaching and learning. And there are critical, essential skills and knowledge in the arts and other such disciplines. In early elementary music, for example, the most important musical concepts and skills can be observed: pulse and rhythm, movement, tone and pitch, listening and repetition, creative use of musical elements, transmission of culture through lyrics and dance, etc.

    Any teacher worth her salt could and should be able to identify these elements as goals, measure students’ growth, and demonstrate what they have learned, to a trained evaluator (perhaps a disciplinary peer). Such evaluations wouldn’t have to cost a nickel.

    Tests are not the “easy way out.”

  3. Sheila43

    I’m wondering how they are going to assess my effectiveness in pre-k when there is such a range of skill levels between the ages 3-5. Not to mention that some children come in late to class or don’t come at all if it’s raining, snowing, windy, or too hot.
    I have seen pre-k classes where the focus is on being able to write words and “read” sight words. Many “suits” are interested in the evidence of instruction. However, they don’t mean being able to work together cooperatively or becoming independent or knowing how to manage disappointment or sharing materials.
    We are destroying our children.

  4. TheHelpTheyNeed

    Am I mistaken, or shouldn’t the policy follow the science, not vice versa? I am all for evaluating teachers using multiple measures, as long as those measures are valid and reliable.

  5. Cynthia Liu

    Looking forward to your upcoming piece in American Prospect.

    I’m leery of the DOE’s desire to use the quantitative hammer on the humanities/non-traditional subject nail. And I’m despairing that due to the size and scale of the United States, standardized testing will only continue to grow in use. There are simply numerically too many students and too many testing companies that see dollar signs in testing them. This is to students’ and teachers’ detriment.

    And I can’t help but see that this is doubling down on RTTT’s “value-added” premise.

    Qualitative measures for non-STEM subjects, such as portfolios of work, narrative evaluations, (for older kids) peer review or written self-evaluation, and other techniques, could instead be borrowed from the college model and applied to grades 6-12. How do colleges of art evaluate advanced student work? That seems a more useful model for evaluating a kid’s painting *and* teaching what various artistic movements there are at the same time.

    Instead, quants want to “metric” the humanities, arts, and social sciences as if what students produce in those classes could be charted and measured like math problem sets. Essays, paintings, athletic endeavors, dance or theatre–these are all downright time-consuming and artisanal in practice and evaluation. Try bubbling in a, b, c, or d for that. Perhaps they are mistaking testing for knowledge about the history of art for actual classes where children make art (and learn some art history as they go). How do you measure “improvement” in art class or social studies year to year?

    And that time-consuming aspect of evaluating non-STEM subjects will suffer if made to fit into standardized test models. Yet the need for scale, speed, and convenience will demand it.

    So far as I can tell, what the DOE seeks is also completely disconnected from the class size issue, and it shouldn’t be. How will teachers with 40 students per class be able to grade portfolio style? It’s already grueling enough to grade 5 classes of high school essays when classrooms are 30 students per class.

    The next time you have Arne Duncan’s ear, can you tell him to back off this misguided project? Or at least speak to teachers of art, music, writing, etc., to ask them HOW LONG it takes to arrive at grading for subjects that are not math and reading? I fear he’s asking for the unrealistic and unhelpful. Test for assessment and subject mastery. However, don’t hang what’s an issue of teacher professional development and performance on student standardized test scores. Make teachers evaluate each other’s teaching, and make administrators responsible for demanding improvement in those who need it.

