Tuesday, 22 May 2012
 


E-assessment - Figures of speech?

Julie Nightingale goes behind the headlines to see why it seems that computers can't evaluate the work of great orators and writers and what this could mean for the assessment of English essays

Words Julie Nightingale

The recent revelation that software designed to test students' English essays has condemned Churchill's wartime rhetoric as "repetitive" will come as no surprise to those who have long believed that, as far as English is concerned, computers make good word-processors but poor assessors.

The Chartered Institute of Educational Assessors fed Churchill's "We shall fight on the beaches" speech into a computer loaded with a software package commonly used in assessment in some US states. Not only did it dislike the great Prime Minister's use of repetition for rhetorical emphasis, it also determined that he had used the word 'might' incorrectly, having read it as an auxiliary verb rather than as the noun he intended.

Churchill was in good company. The software also failed Nobel prize-winner William Golding for his 'erratic sentence structure' and practically blew a gasket when fed segments of Anthony Burgess's classic A Clockwork Orange and works by William S. Burroughs and Ernest Hemingway.

Martin Walker, an experienced English examiner for GCSE and A-level, who carried out the tests for the institute, says it quickly became clear that the software had severe limitations.

"It can recognise whether a short sentence is structured correctly - whether the full stop is in the right place - but I have grave concerns about it being used to mark any form of connected writing," he says.

Marking by computer lends itself more readily to subjects such as maths where, for example, the number of possible correct answers to a question is limited.

"What computers can't do very well is to understand the nature of language and the interaction between writer and reader, which are very human things," Walker says.

"Look at jokes: you could programme a computer to follow the formula of knock-knock jokes, but it can't understand wit, which humans understand at an early age. If a 15-year-old uses it in a piece of writing, a human marker would instantly recognise someone who is operating at a high level, but a computer wouldn't recognise it at all. Similarly, a computer can spot repetition but it doesn't have the ability to know whether it is being used as a rhetorical device or is simply a clunking way of describing something."

Even in the US, the limitations of the software have been exposed by young people themselves, says Graham Herbert, deputy head of the CIEA. "Students have learned to do what they call 'schmoozing the computer'," he explains. "They adapt their writing to give the computer the kinds of response it expects, which does not provide good practice, because you can end up with very formulaic answers."

Some experts agree that there may be some mileage in using software to assess knowledge of basic English grammar and usage - Pearson is introducing technology-based testing for that very purpose (see box) - but that is the extent of its application, and even that is limited.

"There are loads of established reading tests which require responses in the form of circling letters, multiple choice, and so on, and I don't see why software couldn't mark this type of test," acknowledges Ian McNeilly, director of the National Association for the Teaching of English (Nate). "But I would be worried about anything beyond this. And the vast majority of English language testing is well beyond this."

Tim Oates, director of assessment research and development for Cambridge Assessment, says: "Current artificial intelligence systems in essence compare the candidate's writing with model answers which have slowly been accumulated and continue to be accumulated within these systems. They are strictly limited in terms of the extent to which they can assess meaning in text or the intention of the candidate in terms of their writing. And intention in English is really fundamental."

He believes it is very likely that automated systems will be deployed extensively in educational assessment at some point in the future, but progress is extremely slow.

"We know that artificial intelligence systems are increasing in their complexity and sophistication. The problem is that people make crude assumptions about what is now possible and it's a lot less than people believe."

"All systems need to meet exacting quality criteria and should definitely not be adopted just to make life easier for exam boards," he adds. "Some approaches look like technology in search of a test, rather than assessment designed to support learning and to accurately report attainment."

Other awarding bodies are making no great leaps of faith about using technology to mark English either. A spokesperson for Edexcel says: "Edexcel are always looking at technologies that could improve assessment. We will continue to look at e-testing and other technologies; however, at present we have no plans to use them in any mainstream qualifications."

Ruth Goddard, assistant director (processing) for AQA, says the board is much more interested, first and foremost, in improving the design of assessments.

"We have the TEAL [Technology Enabled Assessment for Learning] programme and our focus is about what the future of assessment is, and how we will move from assessment which is designed for pen and paper to a style of assessment which is designed for young people in the 21st century."

AQA is "open" about the possibility of using e-testing in English, she adds. "But we would not go there without a lot of our own research."

Graham Herbert points out that, as with all subjects, there are some advantages to using technology for the management of assessment in English.

"There is a much quicker turnaround with e-assessment. You can also get a better random distribution of items - in other words, if you break the assessment down into three parts, they can be distributed to different examiners or put into different forms of online assessment. It uses less paper, there are greater levels of security and it provides instant feedback. It also may work out cheaper, which, cynics would say, is the driving force behind some of the enthusiasm for it, and is simpler than paper tests to administer."

"On the downside, it restricts the types of question that can be set, it requires ultra-reliable internet access and it still needs supervision," he says.

It could be that in 10 years' time technology will have been developed which is sophisticated enough to embrace all of the demands of English marking, says Martin Walker. And one major factor driving research into software development is that it could save governments significant amounts on the cost of testing.

"If it could save money and was incredibly accurate, fair and precise, then no one would object," he says. "But we just don't have that capability at the moment."

Computerised proficiency test

Pearson, US parent company of Edexcel, is the first company to introduce e-testing for English language in the UK. Its computerised test - Pearson Test of English Academic, to give its full title - is being made available to universities, colleges, professional bodies and other institutions looking to assess the English language skills of non-native English speakers.

It is designed to assess the English language proficiency of candidates whose first language is not English and who are applying to institutions in which English is the language of instruction.

It uses interactive task-based examples from academic settings to provide "a realistic reflection of how language is used". Sections tackle speaking and writing skills, reading skills and listening skills.

The test was developed through worldwide field tests involving more than 10,400 international students. It is designed, said Marjorie Scardino, Pearson's chief executive officer, "to give schools and employers a lot more to go on a lot more quickly when they're trying to decide whether a candidate is ready to learn in English or work in English."

A spokesman for Pearson said there were no plans to introduce the system to general qualifications in the UK.