Can machine scoring deal with broad and open writing tests as well as human readers?

Doug McCurry

Research output: Contribution to journal › Article › peer-review

Abstract

This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. Such claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason to ask whether machine scoring of writing requires specific and constrained tasks to produce results that mimic human judgements. The conclusion of a National Assessment of Educational Progress (NAEP) report on the online assessment of writing that ‘the automated scoring of essay responses did not agree with the scores awarded by human readers’ is discussed. The article presents the results of a trial in which two software programmes for scoring writing test responses were compared with the results of the human scoring of a broad and open writing test. The trial showed that ‘automated essay scoring’ (AES) did not grade the broad and open writing task responses as reliably as human markers.
Original language: English
Journal: Assessing Writing
Volume: 15
Issue number: 2
Publication status: Published - 2010
Externally published: Yes

Keywords

  • Automated essay scoring
  • Computer scoring of writing
  • Machine scoring of writing
  • Online assessment
  • Validity of writing tests
  • Writing test design

Disciplines

  • Educational Assessment, Evaluation, and Research
  • Educational Methods
