Interesting paper about eAssessment

by Tim Hunt -
My Open University colleagues Phil Butcher and Sally Jordan have just had a paper accepted for publication about computer-marking of free text responses: http://dx.doi.org/10.1016/j.compedu.2010.02.012.

They compared three different computational approaches (regular expressions, our home-grown libraries, and a commercial system) with each other, and with human marking, with fairly impressive results. You can try it yourself with two of the technologies at https://students.open.ac.uk/openmark/omdemo.iat2009/ and https://students.open.ac.uk/openmark/omdemo.pm2009/
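If you are curious what the simplest of those three approaches looks like under the bonnet, here is a toy sketch in Python of regular-expression marking. The patterns are invented for illustration; real mark-schemes are far richer:

```python
import re

# Invented mark-scheme for a one-line answer along the lines of
# "the intrusive rock has larger crystals". A real scheme needs many
# more patterns, tuned against hand-marked student responses.
ACCEPT_PATTERNS = [
    r"(larg\w*|bigg\w*|coarse\w*).*crystal",     # "larger crystals"
    r"crystal\w*.*(larg\w*|bigg\w*|coarse\w*)",  # "crystals are larger"
]
REJECT_PATTERNS = [
    r"\bnot?\b",  # crude negation guard: "crystals are not larger"
]

def mark(response: str) -> bool:
    """Return True if the response should be marked correct."""
    text = response.lower()
    if any(re.search(p, text) for p in REJECT_PATTERNS):
        return False
    return any(re.search(p, text) for p in ACCEPT_PATTERNS)

print(mark("The intrusive one has larger crystals"))   # True
print(mark("Its crystals are not any larger"))         # False
```

Even this toy shows the difficulty: every paraphrase, synonym and negation has to be anticipated, which is exactly why the more sophisticated approaches in the paper are interesting.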


The bad news:

1. This is a paper due to be published in a commercial academic journal, so you can only get the full text if you belong to an academic institution that has subscribed to that journal.

2. This is not in Moodle yet.


The possible good news is that our home-grown library came out very strongly, and there is no reason (other than time) why we could not package it up as a Moodle question type to make the technology more widely available (both in the OU and beyond). Clearly Phil and I both think this is pretty cool, but we have to listen to our users and to institutional policy when prioritising our work. We can't just do whatever we think is cool. However, the OU has a good record of investing in innovative technology, so my guess is that it is a case of when, not if.

When I describe this as "our home-grown library", that is an over-simplification. You can trace its pedigree back to something Phil helped develop at Leeds University in the 1970s. It has been through several generations and different computer languages since then.

Also, I should point out that, even with the technology available, creating a question like this is a lot of work. You have to train the computer to recognise right and wrong answers, and the only reliable way to do that is to have a set of student answers that you have already marked by hand, which you can use to test your marking algorithm as you tinker with it. In Moodle, you would probably need to run the quiz the first time with essay questions that you mark manually. Then you would need a special editing screen where you could develop your mark-scheme, using those hand-marked responses as test data, with an option to save the result as a new question. Anyway, it is too early to be doing the detailed design yet!
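To make that workflow concrete, here is a rough sketch of the check such an editing screen might run, reusing the toy mark function from the sketch above; the hand-marked data is, of course, made up:

```python
# Hypothetical hand-marked responses from a first, manually-marked run
# of the quiz: (student response, mark awarded by the human marker).
hand_marked = [
    ("the intrusive rock has larger crystals", True),
    ("bigger crystals in the intrusive one", True),
    ("it is darker in colour", False),
]

def evaluate(mark_scheme, responses):
    """Report how often a candidate marking function agrees with the human."""
    disagreements = [(text, human) for text, human in responses
                     if mark_scheme(text) != human]
    agreement = 1 - len(disagreements) / len(responses)
    return agreement, disagreements

agreement, errors = evaluate(mark, hand_marked)
print(f"Agreement with human marker: {agreement:.0%}")
for text, human in errors:
    print(f"  mismarked: {text!r} (human said {human})")
```

The editing screen would then let you tinker with the patterns until the agreement is high enough, and save the result as a new question.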
In reply to Tim Hunt

Re: Interesting paper about eAssessment

by Itamar Tzadok -
Thanks for the link, Tim. Interesting approach. However, IMHO this is the wrong direction. You see, it may be much easier to train a human to recognize a simple formal language than to train a computer to recognize a rich and messy human language. And since, no matter which approach you take, "creating a question like this is a lot of work" (as you aptly put it), it may be much more productive (IMHO) to invest the time in creating learning domains with a definite set of entities, a definite set of relations, and rules of application, and let learners complete randomly-generated tasks with a simple but friendly UI.

Then, instead of

You are handed two rock specimens and you are told that one is an intrusive igneous rock whilst the other is an extrusive igneous rock. How would you know which was the intrusive specimen?

I would ask

Generate a sequence that produces an intrusive igneous rock, and a sequence that produces an extrusive igneous rock.

where the problem domain contains entities and predicates such as rock, crystal, large, narrow, push in, push out, etc.

The learner will have to construct the sequence from building blocks rather than merely find the answer in the textbook and then struggle with phrasing it such that the computer (or worse, the human marker) will understand. Moreover, with such systems you can present more complex and difficult tasks which won't fit into the short-answer scheme.
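To make the idea concrete, here is a toy sketch of such a domain; the entities and rules are deliberately over-simplified and invented for illustration:

```python
# Toy learning domain: entities and rules of application. The learner
# assembles a sequence of steps from building blocks; the computer
# checks what the sequence produces.
RULES = {
    # (current state, applied step) -> resulting state
    ("magma", "push out to surface"): "lava",
    ("magma", "cool slowly underground"): "rock with large crystals (intrusive)",
    ("lava", "cool quickly"): "rock with small crystals (extrusive)",
}

def run_sequence(start, steps):
    """Apply the learner's steps in order; return None if a step does not apply."""
    state = start
    for step in steps:
        state = RULES.get((state, step))
        if state is None:
            return None
    return state

# "Generate a sequence that produces an intrusive igneous rock":
print(run_sequence("magma", ["cool slowly underground"]))
# -> 'rock with large crystals (intrusive)'
print(run_sequence("magma", ["push out to surface", "cool quickly"]))
# -> 'rock with small crystals (extrusive)'
```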

:-)
In reply to Itamar Tzadok

Re: Interesting paper about eAssessment

by Tim Hunt -
I am not claiming that short, free-text response questions are the answer to every assessment problem. However, for many years, they have been an important and useful component in exams and other tests. They are clearly a good way of testing certain sorts of knowledge. So it is useful to know that computers can, in some circumstances, grade them sufficiently accurately. Another observation in the paper is that the wording of the question can make a big difference to how easy the responses are to mark, for both humans and computers.

I am very sceptical about expecting students to learn an invented formal language to answer questions. If you want simpler marking, go for something like drag-and-drop gap-fill, or even multiple choice. Those types can also be used to create effective assessment questions. But getting a student to express something in their own words makes them think in a different way.

In some fields, for example Maths, Music, Computer programming and Chemical diagrams, there already exist formal languages that humans use and computers can process. In those fields, it is absolutely right to use those formal languages to do excellent computer-marked assessment. For example, http://www.open.ac.uk/openmarkexamples/p5_4.shtml, http://stack.bham.ac.uk/course/view.php?id=3 and the JUnit question type.
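To illustrate the kind of marking those formal languages make possible, here is a minimal sketch of checking a maths answer for algebraic equivalence (nothing like STACK's actual implementation; it uses the sympy library just to show the idea):

```python
import sympy

def maths_answer_correct(student_answer: str, model_answer: str) -> bool:
    """Mark a maths response by algebraic equivalence, not string matching."""
    student = sympy.sympify(student_answer)
    model = sympy.sympify(model_answer)
    # If the difference simplifies to zero, the expressions are equivalent,
    # so any correct algebraic form of the answer is accepted.
    return sympy.simplify(student - model) == 0

print(maths_answer_correct("(x + 1)**2", "x**2 + 2*x + 1"))  # True
print(maths_answer_correct("x**2 + 1", "x**2 + 2*x + 1"))    # False
```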

I am well aware that the economics of the situation at the OU are not typical. We have courses with 1000+ students per presentation, so it is more likely to be worth our while to invest a day of someone's time in creating a good mark-scheme for a free-text question. Outside places like the OU, you probably need collaboration to make it worthwhile. One might hope that Creative Commons licensing, Moodle community hubs, or commercial publishers might eventually lead to shared banks of sophisticated questions like these.

In reply to Tim Hunt

Re: Interesting paper about eAssessment

by Itamar Tzadok -
I'm not saying that the IAT has not been, or is not, useful. But it seems to me that it falls prey to the same problem as human assessors: natural language is just not a good means of conveying precise meaning, and too many assessment problems are the result of misunderstandings (as the article acknowledges). Since the IAT is trained and based on a formal language, at the end of the day it is likely to perform better than human instructors for that particular purpose. But overall the whole approach is limited.

Note, by the way, that even the IAT approach is at least partly the approach I suggested, in disguise: the authors of the questions had to be trained to write quasi-formal questions and answers for the IAT. So why not train the learners in a simpler formal language, mediated by a UI?

"I am very sceptical about expecting students to learn an invented formal language to answer questions."


Don't be; you'd be amazed what students, and people in general, are capable of. But we're also very good at adjusting ourselves to expectations (as some sort of survival mechanism), so if you treat people as incompetent they will adjust and justify your expectations.

"In some fields, for example Maths, Music, Computer programming and Chemical diagrams, there already exist formal languages that humans use and computers can process."


Precisely. And my point is that any field that lends itself to systematic study can be treated this way. All arts/humanities disciplines fall into this category, if only by virtue of being academic disciplines.

:-)