Ungraded outcomes

by Christopher Sangwin

Dear all,

Following a recent conversation with colleagues, I'm posting to ask for some help and advice with a future design.  The proposal from a colleague was to have a question which is "mostly automatically scored", that is, where the software tries to mark an answer automatically one way or the other.  If a standard online assessment algorithm (e.g. a STACK potential response tree, or the equivalent in other software) definitely establishes an objective property then all well and good - the student gets the feedback and marks, and the statistics are generated and stored.

If the algorithm cannot establish the objective property, then the question type falls back to human marking, and the human can then assign feedback and marks.

This is a particular form of semi-automatic marking.
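To make the idea concrete, here is a minimal sketch in plain Python (not actual STACK or Moodle code; Outcome, auto_check and queue_for_marker are just made-up names) of the grading flow I have in mind:

# A minimal sketch (plain Python, not actual STACK/Moodle code) of the
# proposed flow; Outcome, auto_check and queue_for_marker are made-up names.
from enum import Enum

class Outcome(Enum):
    CORRECT = "correct"        # the algorithm established the property
    INCORRECT = "incorrect"    # the algorithm refuted it
    UNDECIDED = "undecided"    # neither: fall back to human marking

def grade(answer, auto_check, queue_for_marker):
    """Return a numerical mark, or None if a human still has to decide."""
    outcome = auto_check(answer)
    if outcome is Outcome.UNDECIDED:
        queue_for_marker(answer)   # marks and feedback assigned later by hand
        return None
    return 1.0 if outcome is Outcome.CORRECT else 0.0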

Consider the following question: Give me an example of a real function with a local minimum at x=1.

It is not possible to write an algorithm which will score this correctly 100% of the time.  (At least, I think this is the case!)  Some answers really are easy to score, e.g. f(x)=(x-1)^2 satisfies the second derivative test (f'(1)=0, f''(1)>0) and so does have a local minimum.  But there are lots of functions for which f''(1)=0, and worse, what about abs(x-1), where f' doesn't exist at x=1?
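Just to illustrate the kind of decision logic I mean, here is a sketch in Python/SymPy (not STACK's Maxima, and check_min_at_1 is a made-up name) of a check which deliberately gives up when the standard tests are silent:

# A sketch in Python/SymPy (not STACK's Maxima) of a check which deliberately
# gives up when the standard tests are silent; check_min_at_1 is a made-up name.
import sympy as sp

x = sp.Symbol('x', real=True)

def check_min_at_1(f):
    """Return True, False, or None (None = hand the attempt to a human)."""
    try:
        f1 = sp.diff(f, x).subs(x, 1)
        f2 = sp.diff(f, x, 2).subs(x, 1)
    except Exception:
        return None                  # derivatives could not be evaluated at x=1
    if sp.simplify(f1) != 0:
        return False                 # not even a critical point
    if f2.is_positive:
        return True                  # second derivative test: local minimum
    if f2.is_negative:
        return False                 # local maximum instead
    return None                      # e.g. f''(1)=0, or the CAS can't decide

# check_min_at_1((x - 1)**2)  -> True
# check_min_at_1((x - 1)**3)  -> None  (f''(1) = 0, so the test is silent)
# abs(x - 1) is also likely to come back as None rather than True.

Anything that comes back as None is exactly the case where a human would have to step in.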

How useful would such a feature be?  Beyond the above example, how many other situations would benefit from this kind of semi-automatic human marking?  Would you actually use such a feature which, after all, requires human intervention?

Chris

In reply to Christopher Sangwin

Re: Ungraded outcomes

by George Kinnear

I can't remember if I was one of the colleagues in this conversation!

But I think this would be a useful feature for questions that push at the boundaries of what can be assessed automatically.

I supervised an undergraduate project this year where the student wrote a STACK question like this, asking for an example of a function with a given domain. We could guess a priori what some common answers would be, and test for those automatically. But when the automatic checks were inconclusive, there was a node in the PRT that gave the feedback "Teacher will check your answer manually" - see section 3.5.2 of the report at https://osf.io/natcj/

The main use cases I can think of are these sorts of example-generation questions, where it is not clear how far we can get with automated assessment (see https://maths.github.io/e-assessment-research-agenda/questions/Q54).

The approach taken in the UG project question is almost workable, but there are a couple of issues:

(1) the feedback to the students was not ideal: because the quiz was in interactive mode, it led some students to replace a correct (but not automatically graded) answer with one that was incorrect (discussed in section 3.6.2 of the report).

(2) actually doing the manual grading would be quite awkward. The basic question usage report gives helpful stats about which types of answer are common, so the PRT could be updated to address those, followed by regrading. But it would be nice if there were a way for STACK to flag an attempt as "needing human intervention" so that it could be easily filtered in the "Manual grading" report.

e.g. it would be lovely to have a "Those that need manual review" item in the list of grading filters there.