Evaluation of the multi-answer questions

Juraj Giertl -

Hello,

I have discovered what I consider an inappropriate evaluation of multi-answer questions in the latest Moodle version. As I understand it, it is impossible to have such a question evaluated properly without using penalties, so I proposed an improvement on the Moodle Tracker. Please see the following link for more details:

https://tracker.moodle.org/browse/MDL-45270

However, Tim told me that opening an improvement request is not the right way to start, and recommended posting in the forum instead. Does anyone know how to achieve the expected behaviour?

Thank you, any advice is appreciated.

In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Joshua Bragg -

I see your point for why you want this changed.  However, this would make more sense as a separate question type rather than modifying the existing multiple choice type.  Perhaps you can engage someone to build that for you.

I sometimes ask questions like "Which of these acids are strong acids?".  In that situation, I have no interest in telling students how many correct answers there are.  If they don't know which acids are strong acids then I have no intention of helping them in the question itself while they're taking a quiz.

I'm sure there are situations where your idea of how a question should be graded makes sense, but I think situations like mine are much more likely. (But that's my personal bias...)

Can you give a specific example of when your question design is preferable?  Why do you need to tell them to select two?  Could you not just say, select all that apply?

In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Joseph Rézeau -

Hi Juraj,

You are not actually discussing the "multianswer" question type (aka Cloze or Embedded) but the multiple-choice question (MCQ) type with the multiple-answer option.

Actually, Tim, I feel this terminology point should be made clearer in the documentation of the Multiple Choice question type. I suggest replacing:

""Multiple answers" questions types in a quiz allow one or more answers to be chosen by providing check boxes next to the answers.

by

"In Multiple Choice Questions, the Multiple answers option allows one or more answers to be chosen by providing check boxes next to the answers."

Joseph

In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Jean-Michel Védrine -

Hello Juraj,

I must admit I don't follow your reasoning.

Do you realise that what you call an "inappropriate evaluation" is used by thousands of Moodle websites around the world, and that changing anything in the way multichoice multi-answer questions are graded would break millions of existing questions?

So such a change is completely impossible.

But as what you ask for is, after all, just a different grading scheme for the multichoice multi-answer question, could it be a new option, so that it doesn't break compatibility with existing questions?

A frequent complaint about Moodle is that it already has too many options, so before introducing a new one we must make sure it is useful to a large number of users.

I have been reading this forum since 2007, I think, and this has only been asked a few times, so I don't think there is wide demand for this new grading method for multichoice multi-answer questions.

And if there is demand for it, my personal vote would be to do it as a new question type (add-on) rather than as an option in the multichoice type.

Although I have absolutely no interest in such a new feature, I promise that if there is significant demand I will create this new add-on and submit it to the Moodle plugins directory so that interested people can download and install it. But I will not do it if only one or two people are interested!

In reply to Jean-Michel Védrine

Re: Evaluation of the multi-answer questions

Velson Horie -

Jean-Michel

You may not realise it, but in 1.9 the multi-choice options in Cloze were listed in the order given in the question. As you say, changing this to a random order and disrupting the many courses that used this feature was completely impossible. But guess what: it was changed, without enabling me, and presumably many others, to keep using existing questions. Not only did I have to change the questions, I also had to insert additional teaching material to reinforce the ordering concepts that the Cloze questions had provided. As you say, impossible.

I call that unnecessary deletion and replacement of a standard Moodle feature extraordinary.

What we are asking for is not a new feature but the reinstatement of a standard feature.  As I have said in a previous thread, if Moodle developers wish to introduce a feature, good. But do not delete an existing feature, so disrupting existing courses.

My hosting site does not allow installing plugins and uses the standard Moodle package, so your idea of providing a plugin to replace the deleted feature is impossible.

In reply to Velson Horie

Re: Evaluation of the multi-answer questions

Marcus Green -

Velson, is the key issue that your students are demoralised when they are penalised for incorrect choices within multi selection questions?

Many years ago I found this to be true with another VLE and decided I would never use this type of question, as students bitterly resented this type of marking. As a result I always used the radio-button, single-selection MCQ. I can see the merit in wanting an alternative to marking with penalties for multi-selection style questions.

By the way, planet Cloze is not planet Moodle, in my experience 🙂

In reply to Velson Horie

Re: Evaluation of the multi-answer questions

Jean-Michel Védrine -

Hello Velson,

I suppose you realise that your post is completely unrelated to what we were discussing in this thread 🙂

Maybe you don't know, but I am just a teacher (I teach maths and statistics in France), not a professional developer. This gives me the privilege of choosing the subjects I want to work on in Moodle.

Sometimes a mail, a post or an idea catches my eye and I decide I want to work on it. Or I stumble on a bug and decide I must absolutely kill it.

My offer to do a new question type was motivated by the fact that I found it interesting to see how much code I could inherit from multichoice, to avoid writing new code when creating this question type. This is why I made the proposal.

Most of my work is devoted to creating new add-ons for Moodle. The fact that you can't install them is a problem between you and your hosting company.

It won't stop me from saying that, rather than putting more and more options into standard Moodle, the way forward is to create more and more add-ons.

In reply to Jean-Michel Védrine

Re: Evaluation of the multi-answer questions

Marcus Green -

I completely agree with you, Jean-Michel, that adding more options is generally a bad idea. It is very possible to add so many options that people just do not use a feature.

In reply to Marcus Green

Re: Evaluation of the multi-answer questions

Juraj Giertl -

Hi all,

thank you so much for the promising contributions. Let me answer some of the points:

@Joshua - as an example of when this question design is preferable: any time you are interested in what a student actually knows. Obviously, I'm not talking about the question content, but about the goal (philosophy) of the testing itself. Some principles of pedagogy tell teachers that testing is (or should be) intended to reveal a student's knowledge, not to map the "white space" in his mind. Under this principle, any sort of penalty is inappropriate, insofar as the penalty hides partial knowledge. In other words, you don't need this type of question when you are interested only in 100% perfection, but you would appreciate it when you want to see at least partial results. As mentioned, it does not matter what the content is, but what the goal of the testing is; it could apply even to your example with acids, depending on the goal you want to pursue.

@Jean-Michel - first of all, I'm sorry for any confusion I may have caused. I'm not very confident with the Moodle terminology and architecture, so I didn't realise the difference, and the potential impact, of having this implemented as an improvement vs. as an add-on/plugin. I fully agree that, from the point of view of existing systems, it would definitely be better to add a new feature rather than to change the existing one.

Regarding the potential target group - this type of question has been used for ages in one of the biggest academic programs worldwide (I cannot name it in a public forum, but I'm happy to discuss it one-to-one). However, it has no support for teacher-managed tests, just for centrally prepared ones. I'm pretty sure that plenty of enthusiastic teachers would appreciate a tool with such functionality for preparing their own tests for students, as an alternative/addition to the existing ones, and/or for other motivating activities, e.g. competitions in the field. And that's also my case.

Thank you for considering this proposal.

In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Joshua Bragg -

Juraj,

I'm having a hard time taking your point of view seriously at this point for three reasons.

First, I asked for a specific example where your grading method would be preferable and got an answer which basically says "in all of them."  I'm asking you for an example and providing one myself.  If you basically refuse to give a specific example then I have to question your ability to do so.

Second, the crux of your argument is that I am not correctly evaluating my students because I do not have the proper goal in mind.  The tone of that statement is insulting both personally and professionally to me as a teacher.  This feeling will probably explain the length of my following reason...

Third, I find that your concept of not mapping a mind's "white space" when testing is fundamentally flawed. Student misconceptions are inherently more problematic than missing knowledge, and missing knowledge is itself worth examining. If we consider a particular topic, say local poisonous snakes, then we could divide a student's knowledge of those snakes into a few categories:

  1. Snakes that a student knows are poisonous and that are poisonous.
  2. Snakes that a student thinks are poisonous but are in fact non-poisonous.
  3. Snakes that a student thinks are non-poisonous but are in fact poisonous.
  4. Snakes that a student thinks are non-poisonous and are non-poisonous.
  5. Snakes that a student is unfamiliar with.

Suppose our student is taking a herpetology class that involves fieldwork and collecting snake specimens.  Groups 1 and 4 are good knowledge to have and will serve the student well in their fieldwork.  Group 5 can be somewhat acceptable for fieldwork since a student is likely to be cautious when approaching the snake for collection.  It is simply a lack of knowledge.  This does have some cost since the student may be needlessly careful when approaching a nonpoisonous snake.

However, Groups 2 and 3 are damaging to a student's performance.  If a student tries to collect a snake that is in Group 2, they will approach it in a much more careful way as they think it is poisonous.  This will take slightly longer and cause a great deal of needless panic if they are bitten during the collection process.  Even worse is Group 3.  A student who is collecting a snake in Group 3 will not give the snake the appropriate respect during the collection process and would be unlikely to be appropriately concerned about a bite until symptoms of the venom started to manifest.  Incorrect knowledge is fundamentally more of a problem than missing knowledge.

When I design questions and tests, my goal is to examine all facets of a student's knowledge: what they know, what is missing, and what they know incorrectly. I design my tests with the full range (or at least a good sample) of the knowledge I want to test, to make sure nothing is missing. I design specific questions that examine the common misconceptions and mistakes that students make.

First, it is entirely appropriate to penalize students for incorrect knowledge: incorrect knowledge is dangerous. Second, giving students help by stating "pick two" removes your ability to test properly for incorrect knowledge and missing knowledge. A student who would think there is only one correct answer under normal conditions will see the "pick two" directions and guess a second one. They will sometimes guess correctly, which over-represents their correct knowledge and under-represents their lack of knowledge. A student who thinks there are three correct answers will see the "pick two" and eliminate one of their answers. They may get lucky and eliminate the one that is incorrect knowledge, which under-represents their incorrect knowledge in the score; or they may be unlucky and eliminate one that is correct knowledge, under-representing their correct knowledge.

If you're going to argue that your method is inherently better, I'd again like to see an example that addresses these concerns.

In reply to Joshua Bragg

Re: Evaluation of the multi-answer questions

Juraj Giertl -

Hi Joshua,

thank you so much for sharing your thoughts; I'm happy to see you discussing a topic which is very often overlooked by many teachers. I will try to address the points you have raised:

First, let's take an example question:

Which of the following are public IP addresses?

  • 192.168.1.23
  • 150.23.45.0
  • 10.10.10.1
  • 172.32.16.1
  • 16.1.1.28
  • 192.168.0.1

Obviously, both evaluation methods are applicable to this question: either the "perfection" one (choose all that apply) or the "weighted" one (choose three). I could give a lot of other examples, but the perfection method would be usable for all of them. As mentioned, it is undoubtedly always applicable; it just depends on the general goals of the testing.
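As a side note, the public/private split above can be checked mechanically. Here is a quick sketch using Python's standard ipaddress module (the address list is just the one from the question; the comments state RFC 1918 facts, not anything specific to Moodle):

```python
import ipaddress

# The six candidate addresses from the example question
candidates = [
    "192.168.1.23",   # private (RFC 1918, 192.168.0.0/16)
    "150.23.45.0",    # public
    "10.10.10.1",     # private (RFC 1918, 10.0.0.0/8)
    "172.32.16.1",    # public: just outside 172.16.0.0/12
    "16.1.1.28",      # public
    "192.168.0.1",    # private (RFC 1918)
]

# Keep only the globally routable ("public") addresses
public = [a for a in candidates if not ipaddress.ip_address(a).is_private]
print(public)  # → ['150.23.45.0', '172.32.16.1', '16.1.1.28']
```

So there are exactly three correct answers, which is what makes both the "choose all that apply" and the "choose three" wordings possible for this question.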

Second, I'm so sorry about that; my aim is not to insult you either personally or professionally. I fully agree with using the perfection method when it is suitable - each teacher has that decision in his own hands, and based on your comments I assume that your goals are perfectly clear. Perhaps I did not express my standpoint in the best way. As a non-native English speaker I tend to use technical rather than polite language. Please accept my apologies; I didn't want to attack you in any way.

Third, yes, I see your points and fully agree that my way can bring guessing as an unpleasant side effect. However, my goal is not to support guessing, but to motivate students to think about their answers. Sure, some students will misuse the opportunity to be creative and will simply guess instead, but they are not my primary target group.

Let me put it another way - if I am a student aware that penalties are used for improper answers, I will never mark an answer which I'm not 100% sure about, because I would risk losing points even for those I consider correct. That means perfection is in place, and perhaps that is OK, at least for some of the exact sciences. I can imagine that your students keep their creativity along with the perfection, but it seems that mine are not so courageous 🙂.

Furthermore, I agree that both ways of grading are OK when the teacher incorporates them into the examination process properly. For instance, I never use electronic tests as the only form of evaluation, but combine them with others - at least with interviews and hands-on activities. Perhaps it sounds like an "old school" approach which is not so efficient, but I'm keen to reveal what students actually know.

Last but not least, the thoughts I'm trying to present are not my inventions. This is what I learned from masters of didactics - they consider multi-choice single-answer questions good ones, but multi-choice multi-answer questions inappropriate, at least at the level of primary and secondary schools. I think the point of their consideration is that the student should know exactly what is expected of him.

Obviously, I cannot argue the case here; I just take it as it is, since my primary field is not the science of pedagogy. Nevertheless, taking these thoughts forward, the multi-answer question without penalties is simply a special form of the single-answer question, with the possibility of dispersing the single correct answer into several parts and grading each of them separately. As a result, we have something that could be considered a single-answer question with "weighted" scoring, insofar as partially correct answers are accepted.

I hope my motives are clearer now and, again, please excuse my wording if it creates an improper tone. I don't want to argue, just to discuss the topic with all respect for ways other than my own.

Thank you for understanding,

Juraj


In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Joshua Bragg -

First, allow me to apologize a bit. Yesterday was a long day and I needlessly jumped down your throat; I'm sorry for that. I was overly sensitive to the "anytime when you are interested in what student actually knows" comment.

I actually like your naming my preferred way of doing things the "perfection" method. I am very interested in making sure my students know things as perfectly as possible. That is sometimes quite rough on students, and I appreciate both your point and Marcus's point that students dislike it and can be intimidated by it. I almost see the multiple-answer multiple-choice questions as a different version of the much-hated but highly effective questions shown in the last example here: http://en.wikipedia.org/wiki/Multiple_choice#Examples

I think all of the more brutal versions are especially helpful in formative assessments. They are great teaching tools. Likewise, I'm in the middle of trying out CBM (certainty-based marking) with my students to work on their metacognitive skills a bit.

Your example does help prove my point just a bit. My first learning about public vs. private IP addresses came from setting up a router in my apartment in college. My router assigned a 192.168.X.X address to everything. I got curious and started looking things up. I didn't learn about 10.X.X.X addresses until I got my first job and was on a "real" network for the first time. If I were in college taking a test with that question, I would quickly eliminate the 192 addresses and then struggle with which of the remaining ones to drop. I would eventually end up guessing which one to eliminate.

Your comments about multi-choice single-answer questions have, however, pointed out to me that your "weighted" version could be rewritten as:

Which set contains only public IP addresses?

  • 192.168.1.23, 150.23.45.0, 10.10.10.1
  • 192.168.1.23, 150.23.45.0, 172.32.16.1
  • 192.168.1.23, 150.23.45.0, 16.1.1.28
  • etc. etc. for the 20 possible combinations.

That would be a true multiple-choice, single-answer question. That is, of course, too many combinations for a test, but you get the idea.
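The count of 20 is simply "6 choose 3" - the number of ways to pick three of the six listed addresses. A one-line check (Python, purely illustrative):

```python
from math import comb

# Number of distinct three-address sets drawn from the six candidates
print(comb(6, 3))  # → 20
```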

In any case, I appreciate what you're shooting for with the question type, but I'm not really convinced it's the best course of action. I do hope, for your sake, that more people are interested in it and that you can convince Jean-Michel to build it for you.

In reply to Joshua Bragg

Re: Evaluation of the multi-answer questions

Juraj Giertl -

Hi Joshua,

you don't need to apologize; it was improper wording on my side. Regarding the example with IP addresses, hmm, now I see that it is not the best one 🙂. The problem is that the question is too simple, just at the Remembering level of Bloom's taxonomy of cognitive domains. Obviously, a student cannot find it difficult to answer correctly if he was not absent-minded during the course; otherwise he has to guess. With this kind of question, guessing is the only tool students have when they don't remember the correct answer. It's my fault; I wanted to offer a simple question understandable regardless of the reader's field.

Normally, we combine a variety of questions from the first four levels of Bloom's pyramid within a test (the last two are hardly coverable by multi-choice questions in our field). Especially with questions aimed at Applying and Analyzing, students can derive answers which are not obviously correct at first sight. They don't need to guess, and they do not guess, as guessing typically ends in failure.

We have used this type of question for a long time and have very good experience with it. However, as mentioned before, we combine several question types at several levels within a test, and finally we combine the test with other forms of examination, to get a comprehensive insight into the students' minds (and to cover the top Bloom levels as well).

In reply to Juraj Giertl

Re: Evaluation of the multi-answer questions

Itamar Tzadok -

Getting back to your proposal:

Answer 1, correct, 50%
Answer 2, incorrect, 0%
Answer 3, correct, 50%
Answer 4, incorrect, 0%
Answer 5, incorrect, 0%

When a participant marks two answers, he should get 100%, 50%, or 0% of the question score, depending on the selection. When a participant marks more than two answers, he should get 0%, even if he marked some of the correct answers.

It has been a while since I worked directly on question types, but I think this could work if you marked the incorrect options -100%.

Answer 1, correct, 50%
Answer 2, incorrect, -100%
Answer 3, correct, 50%
Answer 4, incorrect, -100%
Answer 5, incorrect, -100%

Insofar as negative marks are translated to 0, any combination of two or more answers where at least one is incorrect would yield 0. Only the two correct answers would yield 100%. Only one correct answer would yield 50%. Unsuccessful guessing would yield 0 (successful guessing is as good as knowledge for all the teacher knows without cross-questioning). 
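The arithmetic can be sketched in a few lines (a hypothetical helper, not actual Moodle code): sum the grade fractions of the chosen options and clamp the total to the 0..1 range, which is where the -100% marks do their work:

```python
def grade(selected, weights):
    """Sum the grade fractions of the chosen options, clamped to [0, 1]."""
    total = sum(weights[opt] for opt in selected)
    return max(0.0, min(1.0, total))

# Itamar's suggested weighting: correct options 50%, incorrect options -100%
weights = {1: 0.5, 2: -1.0, 3: 0.5, 4: -1.0, 5: -1.0}

print(grade({1, 3}, weights))     # both correct → 1.0
print(grade({1}, weights))        # one correct → 0.5
print(grade({1, 2, 3}, weights))  # any incorrect pick drags the total to 0.0
```

Any selection containing an incorrect option lands at or below zero before clamping, so only the exact pair of correct answers scores 100%, exactly as described above.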

I don't remember if that's standard behaviour or the behaviour of a variant of the standard multichoice type that I developed and used. I even put a note to myself in the code of that variant that not choosing incorrect answers also reflects knowledge (or successful guessing) and in some strict sense should be rewarded just like choosing a correct answer, which makes the marking pedagogy even more complex than it already is. Time permitting, I will go back to that variant question type and release it as a plugin, although, it being a multipurpose question type, some may find its flexible configuration UI too flexible. 🙂

In reply to Itamar Tzadok

Re: Evaluation of the multi-answer questions

Joseph Rézeau -

Itamar "[...] not choosing incorrect answers also reflects knowledge (or successful guessing) and in some strict sense should be rewarded just as choosing a correct answer, [...]"

Totally agree. 😎

In reply to Itamar Tzadok

Re: Evaluation of the multi-answer questions

Juraj Giertl -

Hi Itamar, Joseph,

thank you so much for the comments. Actually, the sentence "[...] not choosing incorrect answers also reflects knowledge (or successful guessing) and in some strict sense should be rewarded just as choosing a correct answer, [...]" forced me to rethink my evaluation strategy. Although I totally agree with the statement, at the end of the day I am staying with my original mindset. No doubt your marking pedagogy leads to something I called the "perfection method", because the only way to achieve success is to have perfect knowledge. However, in some cases I am aiming to reveal partial knowledge as well, so I cannot apply any of the methods leading to perfection.

I will try to go deeper with the explanation. Let's take the question listed above as an example:


Which of the following are public IP addresses? (choose all that apply)

  • 192.168.1.23
  • 150.23.45.0
  • 10.10.10.1
  • 172.32.16.1
  • 16.1.1.28
  • 192.168.0.1

This question is too simple (the Remembering level of Bloom's taxonomy), so it is not the best example for what I want to express, but some of the ideas can hopefully be illustrated with it. If we leave the question without specifying the number of correct answers (choose all that apply), then we can expect that students will not dare to guess at all and will mark only those options which they are 100% sure about. Obviously, when it is not stated how many answers are expected, the evaluation must use penalties for incorrect answers. Perhaps students would try to guess only when they have absolutely no clue about the correct answers, so they risk losing the points potentially gained from the answers they are sure about. However, even answers which are 100% correct from the student's perspective could be incorrect, so the student could lose the partial points. As a result, the probability of successful guessing is very low, but the probability of getting a score for partial knowledge is questionable and, in my opinion, rather low.

Moreover, the "old school" masters of didactics do not recommend using multi-answer questions, at least at certain levels of education. I'm not aware of their motives; I can only guess that the above reasoning lies behind it, and/or the principle that the student should know exactly what is expected of him. However, I don't want to argue on their behalf; I just take it as I learned it in classes of pedagogy, even though it is not a modern view. So, if I try to rework the question into a single-answer version, I get something like this:

Which set contains only public IP addresses? (choose one)

  • 192.168.1.23, 172.24.16.1, 10.10.10.1
  • 172.32.16.1, 16.1.1.28, 192.168.0.1
  • 194.168.45.0, 172.32.16.1, 16.1.1.28
  • 150.23.45.0, 16.1.1.28, 192.168.16.1
  • etc... more combinations can be added

This version is perfectly fine by the old-school didactics (a single-answer question with equally attractive options): students know clearly what is expected, and the probability of successful guessing is 1/(number of options), but, on the other hand, the probability of getting a score for partial knowledge is zero. To have both - keeping it clear for students and having a chance to score with partial knowledge - the single-answer question can be reworked into the weighted form, with the number of answers specified within the question, e.g.:

Which of the following are public IP addresses? (choose three)

  • 192.168.1.23
  • 150.23.45.0
  • 10.10.10.1
  • 172.32.16.1
  • 16.1.1.28
  • 192.168.0.1

This version is still fine by the old-school didactics, as from their perspective it is the same as the single-answer question, just with the added possibility of revealing partial knowledge. Obviously, the probability of successful guessing is higher than in the previous case, and your marking pedagogy is not followed, as not choosing the incorrect answers is not rewarded. On the other hand, with this question students can score partially if they don't remember the exact IP ranges but are at least aware of the classes of IP addresses. If they have absolutely no clue, they can still guess and get a partial or full score depending on their luck, but in my experience this typically leads to failing (assuming the options are equally attractive).
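The guessing odds for this "choose three" variant are easy to check by enumeration. A quick sketch (assuming, as in the example, that three of the six options are correct, that the score is the fraction of correct picks, and that there are no penalties):

```python
from itertools import combinations

# Three correct (public) and three incorrect (private) options
options = ["150.23.45.0", "172.32.16.1", "16.1.1.28",
           "192.168.1.23", "10.10.10.1", "192.168.0.1"]
correct = set(options[:3])

# Score of every possible random guess of exactly three options
scores = [len(correct & set(pick)) / 3 for pick in combinations(options, 3)]

print(len(scores))                      # 20 equally likely guesses
print(scores.count(1.0) / len(scores))  # P(full score) = 1/20 = 0.05
print(sum(scores) / len(scores))        # expected score ≈ 0.5
```

So a pure guesser gets the full score only once in twenty attempts, although partial credit raises his expected score to about 50%, which is exactly the trade-off discussed above.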

Nevertheless, as this is just a Remembering-level question, students either have full knowledge, remember at least the classes of IP addresses, or must guess randomly. With higher-level questions (Applying, Analyzing), when students have partial knowledge, the question forces them (gives them a chance) to think about the rest of the answers and derive the correct ones. Obviously, there are still some students who would rather guess than think, but they typically fail.

As a conclusion - from my point of view, the methods aiming at perfection are not always the best ones, even though I can clearly imagine their applicability. It would be possible to develop this chain of thought further and express exact formulas for the probabilities of successful guessing and of partial/full score achievement. Moreover, we could collect a mass of data from former testing and pass it through some statistical method, but I'm not sure whether that would be worth the effort. We have been using this question type for many years within the worldwide academic program and are satisfied with the results. Last but not least, this question type is also used in exams for industry certifications in the area.

In reply to Jean-Michel Védrine

Re: Evaluation of the multi-answer questions

Juraj Giertl -

Hello Jean-Michel,

it seems that there are some votes for this add-on. Perhaps not a significant number, but at least a couple have appeared. Would you mind considering the add-on implementation?

Thank you for understanding and support,

Juraj