10-second review: Computers can grade essays about as accurately as two or three human raters, but they can be fooled: they cannot judge content. That is why a human rater must remain part of the evaluation.
Title: “An Apple for the Computer,” by Faye Flam.
Summary/Quotes: “E-rater comes from Educational Testing Service (ETS), the Princeton-based outfit that creates the SAT.”
“They [computers] don’t understand insight or humor and can be fooled into giving top marks to complete nonsense if it uses the right words and the right types of phrasing.”
“The programs grade not only on spelling, grammar and usage but on content, style, organization and clarity. And they do it in three to six seconds.”
“When it comes to what is called ‘high-stakes testing,’ E-rater is always used with human graders…. The computer differs from its human counterpart less than three percent of the time…. And when it does, a second person settles the difference.”
“Jones’ essay may have outsmarted the computer because it used many of the types of words and phrases found in good essays.”
“What it [the computer] seems to lack is the ability to see context and relevance. The software doesn’t care whether you’re a meticulous writer who uses only well-reasoned and well-known facts or a glib writer who pulls ‘facts’ from the air.”
“But with today’s level of computer intelligence…there’s no way these grading programs can have enough common sense to evaluate the content of what they read. And that can be dangerous if the human judges are taken out of the loop.”
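The quoted explanation — that the software rewards the kinds of words and phrases found in good essays without checking context or relevance — can be illustrated with a toy sketch. This is not ETS’s actual E-rater algorithm; the feature weights, phrase list, and sample essays below are all invented for illustration:

```python
# Toy surface-feature essay scorer (NOT E-rater's real algorithm).
# It rewards vocabulary variety, stock transition phrases, and length,
# and never checks whether the claims make any sense.

GOOD_PHRASES = ["moreover", "in conclusion", "for example",
                "on the other hand", "therefore"]

def toy_score(essay: str) -> float:
    words = essay.lower().split()
    if not words:
        return 0.0
    vocab_richness = len(set(words)) / len(words)       # varied word choice
    phrase_hits = sum(essay.lower().count(p) for p in GOOD_PHRASES)
    length_bonus = min(len(words) / 100, 1.0)           # longer looks "developed"
    return round(2 * vocab_richness + phrase_hits + 2 * length_bonus, 2)

coherent = ("Testing essays by computer saves time. For example, scores "
            "arrive in seconds. Therefore, schools can give feedback faster. "
            "In conclusion, automation helps, but humans must check content.")
nonsense = ("Moreover, the purple calculus digests triangular democracy. "
            "For example, therefore the moon audits breakfast. "
            "In conclusion, on the other hand, gravity applauds.")

print(toy_score(coherent), toy_score(nonsense))
```

Run on these samples, the glib nonsense essay actually outscores the coherent one, because it packs in more of the “right types of words and phrases” — exactly the failure mode the article describes, and why a human stays in the loop.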
Comment: As of 2004, at least, human raters must still be used alongside computerized grading. RayS.