Adaptive Quiz: CAT (Computer-Adaptive Testing) implementation for Moodle
The Adaptive Quiz activity enables a teacher to create tests that efficiently measure test-takers' abilities. An adaptive test is composed of questions selected from the question bank, each tagged with a difficulty score. The questions are chosen to match the estimated ability level of the current test-taker. If the test-taker succeeds on a question, a more challenging question is presented next. If the test-taker answers a question incorrectly, a less challenging question is presented next. This technique produces a sequence of questions that converges on the test-taker's effective ability level. The test stops when the test-taker's ability has been determined to the required accuracy.
The Adaptive Quiz activity uses the "Practical Adaptive Testing CAT Algorithm" by B.D. Wright published in Rasch Measurement Transactions, 1988, 2:2 p.24 and discussed in John Linacre's "Computer-Adaptive Testing: A Methodology Whose Time Has Come." MESA Memorandum No. 69 (2000).
This Moodle activity module was created as a collaborative effort between Middlebury College and Remote Learner. It was later adopted by Vitaly Potenko, who keeps it compatible with new Moodle versions and enhances it with new features.
Below you'll find brief documentation on the plugin explaining its essential concepts and flows.
The Question Bank
To begin with, questions to be used with this activity are added or imported into Moodle's question bank. Only questions that can be graded automatically may be used. In addition, questions should not award partial credit. The questions can be placed in one or more categories.
This activity is best suited to determining an ability measure along a unidimensional scale. While the scale can be very broad, the questions must all provide a measure of ability or aptitude on the same scale. In a placement test, for example, questions low on the scale that novices are able to answer correctly should also be answerable by experts, while questions higher on the scale should only be answerable by experts or a lucky guess. Questions that do not discriminate between test-takers of different abilities will make the test ineffective and may produce inconclusive results.
Take for example a language placement test. Low-difficulty vocabulary and reading-comprehension questions would likely be answerable by all but the most novice test-takers. Likewise, high-difficulty questions involving advanced grammatical constructs and nuanced reading comprehension would likely only be answered correctly by advanced, high-level test-takers. Such questions would all be good candidates for use in an adaptive test. In contrast, a question like "Is 25¥ a good price for a sandwich?" would not measure language ability but rather local knowledge: it would be as likely to be answered correctly by a novice speaker who has recently been to China as it would be answered incorrectly by an advanced speaker from Taiwan, where a different currency is used. Such questions should not be included in the question pool.
Questions must be tagged with a 'difficulty score' using the format 'adpq_n', where n is a positive integer, e.g. 'adpq_1' or 'adpq_57'. The range of the scale is arbitrary (e.g. 1-10, 0-99, 1-1000), but it should have enough levels to distinguish between question difficulties.
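Purely as an illustration of the tag format (this is not part of the plugin's code), a difficulty score could be extracted from a tag name like this:

```python
import re

def difficulty_from_tag(tag_name):
    """Return the difficulty score from a tag such as 'adpq_57',
    or None if the tag does not follow the 'adpq_n' format."""
    match = re.fullmatch(r"adpq_(\d+)", tag_name)
    return int(match.group(1)) if match else None

print(difficulty_from_tag("adpq_57"))    # 57
print(difficulty_from_tag("geography"))  # None (not a difficulty tag)
```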
The Testing Process
The Adaptive Test activity is configured with a fixed starting level. The test begins by presenting the test-taker with a random question from that starting level. As described in Linacre (2000), it often makes sense for the starting level to be in the lower part of the difficulty range, so that most test-takers answer at least one of the first few questions correctly, helping their morale.
After the test-taker submits an answer, the system calculates the target difficulty of the question it will select next. If the last question was answered correctly, the next question will be harder; if the last question was answered incorrectly, the next question will be easier. The system also calculates a measure of the test-taker's ability and the standard error of that measure. A random question at or near the target difficulty is then selected and presented to the test-taker.
This process of presenting harder questions after correct answers and easier questions after wrong answers continues until one of the stopping conditions is met; a minimal sketch of the loop follows the list. The possible stopping conditions are as follows:
- there are no remaining easier questions to ask after a wrong answer
- there are no remaining harder questions to ask after a correct answer
- the standard error in the measure has become precise enough to stop
- the maximum number of questions has been exceeded
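For illustration only, here is a minimal sketch of the adaptive loop described above, written in Python rather than the plugin's actual code. The `questions_by_level` structure and the `present` callback are assumptions made for the example, and the stopping standard error is expressed in logits, matching the Wright (1988) formula discussed below.

```python
import math
import random

def run_adaptive_test(questions_by_level, starting_level, min_questions,
                      max_questions, stop_standard_error, present):
    """Illustrative sketch of an adaptive testing loop (not the plugin's code).

    questions_by_level: dict mapping difficulty level -> list of questions
    present: hypothetical callback that shows a question and returns True
             if the test-taker answered it correctly
    stop_standard_error: target standard error in logits
    """
    level = starting_level
    right = wrong = 0
    difficulty_sum = 0

    while right + wrong < max_questions:
        pool = questions_by_level.get(level)
        if not pool:
            break  # no remaining easier/harder questions at the target level

        question = random.choice(pool)
        pool.remove(question)
        difficulty_sum += level

        if present(question):
            right += 1
            level += 1  # a harder question follows a correct answer
        else:
            wrong += 1
            level -= 1  # an easier question follows a wrong answer

        # Wright (1988): the standard error is defined once there is at least
        # one right and one wrong answer.
        if right and wrong:
            standard_error = math.sqrt((right + wrong) / (right * wrong))
            if right + wrong >= min_questions and standard_error <= stop_standard_error:
                break  # the measure is precise enough to stop

    answered = right + wrong
    if right and wrong:
        ability = difficulty_sum / answered + math.log(right / wrong)
    else:
        ability = None  # cannot be estimated without both right and wrong answers
    return ability, right, wrong
```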
Test Parameters and Operation
The primary parameters for tuning the operation of the test are:
- the starting level
- the minimum number of questions
- the maximum number of questions
- the standard error to stop
Relationship between Maximum Number of Questions and Standard Error
As discussed in Wright (1988), the formula for calculating the standard error is given by:
Standard Error (± logits) = sqrt((R+W)/(R*W))
where R is the number of right answers and W is the number of wrong answers. This value is on a logit scale, so we can apply the inverse-logit function to convert it to a percentage scale:
Standard Error (± %) = ((1 / ( 1 + e^( -1 * sqrt((R+W)/(R*W)) ) ) ) - 0.5) * 100
Looking at the Standard Error function, it is important to note that it depends only on the number of right answers and the number of wrong answers, not on which particular questions were answered right or wrong. For a given number of questions asked, the Standard Error will be smallest when half the answers are right and half are wrong. From this, we can deduce the minimum standard error achievable for any number of questions asked:
- 10 questions (5 right, 5 wrong) → Minimum Standard Error = ± 15.30%
- 20 questions (10 right, 10 wrong) → Minimum Standard Error = ± 11.00%
- 30 questions (15 right, 15 wrong) → Minimum Standard Error = ± 9.03%
- 40 questions (20 right, 20 wrong) → Minimum Standard Error = ± 7.84%
- 50 questions (25 right, 25 wrong) → Minimum Standard Error = ± 7.02%
- 60 questions (30 right, 30 wrong) → Minimum Standard Error = ± 6.42%
- 70 questions (35 right, 35 wrong) → Minimum Standard Error = ± 5.95%
- 80 questions (40 right, 40 wrong) → Minimum Standard Error = ± 5.57%
- 90 questions (45 right, 45 wrong) → Minimum Standard Error = ± 5.25%
- 100 questions (50 right, 50 wrong) → Minimum Standard Error = ± 4.98%
- 110 questions (55 right, 55 wrong) → Minimum Standard Error = ± 4.75%
- 120 questions (60 right, 60 wrong) → Minimum Standard Error = ± 4.55%
- 130 questions (65 right, 65 wrong) → Minimum Standard Error = ± 4.37%
- 140 questions (70 right, 70 wrong) → Minimum Standard Error = ± 4.22%
- 150 questions (75 right, 75 wrong) → Minimum Standard Error = ± 4.07%
- 160 questions (80 right, 80 wrong) → Minimum Standard Error = ± 3.94%
- 170 questions (85 right, 85 wrong) → Minimum Standard Error = ± 3.83%
- 180 questions (90 right, 90 wrong) → Minimum Standard Error = ± 3.72%
- 190 questions (95 right, 95 wrong) → Minimum Standard Error = ± 3.62%
- 200 questions (100 right, 100 wrong) → Minimum Standard Error = ± 3.53%
What this listing indicates is that for a test configured with a maximum of 50 questions and a "standard error to stop" of 7%, the maximum number of questions will always be reached first and stop the test. Conversely, if you are looking for a standard error of 5% or better, the test must ask at least 100 questions.
Note that these are best-case scenarios for the number of questions asked. If a test-taker answers a lopsided run of questions right or wrong, the test will require more questions to reach the target standard error.
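As a quick check, the listing above can be reproduced directly from the two formulas. The following Python sketch is illustrative only and prints the minimum standard error for the best case of half right and half wrong answers:

```python
import math

def min_standard_error_percent(num_questions):
    """Best-case standard error (Wright 1988), assuming half right, half wrong."""
    right = wrong = num_questions / 2
    se_logits = math.sqrt((right + wrong) / (right * wrong))
    # Inverse-logit conversion from logits to a percentage scale, as above.
    return (1 / (1 + math.exp(-se_logits)) - 0.5) * 100

for n in range(10, 201, 10):
    print(f"{n} questions -> ± {min_standard_error_percent(n):.2f}%")
```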
Minimum Number of Questions
For most purposes this value can be set to 1, since the standard error to stop will generally set a baseline for the number of questions required. Set it higher than the minimum number of questions needed to reach the stopping standard error if you wish to ensure that all test-takers answer additional questions.
Starting Level
As mentioned above, this usually will be set in the lower part of the difficulty range (about 1/3 of the way up from the bottom) so that most test-takers will be able to answer one of the first two questions correctly and get a morale boost from their correct answers. If the starting level is too high, low-ability test-takers will be asked several questions they can't answer before the test begins asking them questions at a level they can answer.
Scoring
As discussed in Wright (1988), the formula for calculating the ability measure is given by:
Ability Measure = H/L + ln(R/W)
where H is the sum of the difficulties of all questions answered, L is the number of questions answered, R is the number of right answers, and W is the number of wrong answers.
Note that this measure is not affected by the order of answers, just the total difficulty and the numbers of right and wrong answers. This measure depends on the test algorithm presenting alternating easier/harder questions as the test-taker answers wrong/right, and it may not be applicable to other algorithms. In practice, this means that the ability measure should not be greatly affected by a small number of spurious right or wrong answers.
As discussed in Linacre (2000), the ability measure of the test taker aligns with the question-difficulty at which the test-taker has a 50% probability of answering a question correctly.
For example, given a test with levels 1-10 and a test-taker that answered every question 5 and below correctly and every question 6 and up wrong, the test-taker's ability measure would fall close to 5.5. Remember that the ability measure does have error associated with it. Be sure to take the standard error amount into account when acting on the score.
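To make the arithmetic concrete, here is a small Python sketch (illustrative only, not plugin code) computing Wright's ability measure for a hypothetical attempt with one question at each of the levels 1-10, where levels 1-5 were answered correctly and levels 6-10 incorrectly:

```python
import math

def ability_measure(difficulties_answered, right, wrong):
    """Wright (1988): Ability Measure = H/L + ln(R/W)."""
    H = sum(difficulties_answered)  # total difficulty of answered questions
    L = len(difficulties_answered)  # number of questions answered
    return H / L + math.log(right / wrong)

# One question at each level 1-10; levels 1-5 right, levels 6-10 wrong.
print(ability_measure(list(range(1, 11)), right=5, wrong=5))  # 5.5
```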
We're very happy to hear that you're considering improving the report for the test takers as this has been one of the features that our students and teachers have been missing when using adaptive tests. We've actually been thinking about adding a dynamic type of feedback. We'd propose adding optional fields to the settings where teachers can describe (some of) the difficulty levels of the questions they included, outlining the competencies students need to master each level. These descriptions could then be used to generate the report by selecting the closest levels below (X) and above (Y) the student's ability measure, providing feedback such as: "You are currently developing the following skills: ... (level description X) and are expected to progress toward these abilities: ... (level description Y)." What do you think about it? Would that be viable?
Best,
Konstanze
Thank you for the response.
1. The timer would be super useful, as one of the purposes of CAT is to allow for a much "shorter" exam because students are tested on their ability. The timer would be invaluable for what I need it for, but I obviously cannot force you to do it. I will just ask that you make it an option you can turn on or off. The basic quiz module has a timer; can we just incorporate that one?
2. As for reporting, I am not looking for anything crazy; the basic Moodle reporting will be OK, i.e. students take a quiz and at the end they can view what they got right or wrong. Ideally, I would like to see specifics about the questions, i.e. what subject area/topic they got wrong so they can study that more, but I know that isn't a simple ask. For example, let's say they are taking a test on European history and the student is consistently getting capital cities wrong; the report would explain that.
Thanks again for the response.
To P P: hi,
1. You cannot incorporate the default quiz's timer, as it's not something pluggable. As a developer, one basically has to replicate that entire timer functionality in the adaptive quiz plugin. Moreover, you must consider CAT's specifics. For example, what's the desired behavior when the time ends and the student has progressed through, say, just 75% of the quiz? How should that unfinished assessment be handled correctly?
2. I don't think just knowing whether the student got certain questions right or wrong is what CAT is all about. However, I may be wrong, and perhaps this is worth discussing on the forum with the much wider audience there. I keep watching the forum, so if some well-shaped requests arise from there, they will all be considered for sure.
Thanks again for your ideas and comments on the plugin!
Happy to announce that version 2.4.0 (which initially targeted Moodle 4.4) officially supports Moodle 4.5. It has successfully passed automated and manual tests under 4.5. Happy Moodle'ing!
That's a frequently requested feature; however, I'd like you to share your opinion on the following:
- how do you see the scenario where the user is timed out - what should their ability measure be, given they haven't answered a certain number of questions at all because the quiz ended before the algorithm intended? Would such an ability measure be precise for the user? In that regard, the adaptive quiz differs from the default quiz - you cannot estimate the ability precisely if you just cut several questions out of the quiz for any reason. What's your vision on this? Thanks!
Thanks for your support of the plugin - it's a great tool that helps us a lot!
I have a very small question, or maybe even a request. In the activity Settings, you can enter the quiz Description using a text editor, but Attempt Feedback is just a plain text box. Is this easily fixable, or could this perhaps be added?
We would like to highlight some information and add clickable links to the students' next steps. I think it should be quite handy.
Have a great day!
Thanks for the support and for striving to make the plugin better.
The feature you're requesting is actually planned for the next release. However, as for the release itself, I'm not quite sure when it's going to come. There are some accompanying features/improvements planned which need more testing, but currently I'm struggling with my schedule a bit. Stay tuned though, I hope it won't take too long to release. Thank you!
We're patiently waiting for the next release whenever it happens
Not sure what particular tags you mean, but have you visited this - https://docs.moodle.org/403/en/Tags
Like in the screenshot at the bottom there, you should see a field to add tags when editing a question. Basically, adding a tag to a question is pure Moodle, the adaptive quiz activity then just uses those tags added to questions.
We have been using the CAT quiz for almost a year now, and there are 2 main issues I would like to ask about.
1. Is there a plan to implement a feature so that students can see their own mistakes at the end of the quiz?
2. Is it possible to implement a feature so that questions for the CAT quiz are connected to the Moodle System QB? We are not happy that we need to create copies of questions and put them into levels under the quiz. We have frequent changes in company procedures and, therefore, we need to correct questions in the System and then go to the CAT questions, find them there, and also correct them there. This is time-consuming and inefficient for us. Would it be possible to link CAT questions to the System QB?
1. Indeed, there's a plan to extend the student's feedback (optionally, controlled by a per-instance setting); perhaps a questions overview could be there as well. The bad news is that it's not planned for the next release.
2. I've made a quick check, and it seems like, historically, the adaptive quiz plugin doesn't respect any question categories (say, 'question pools') outside the current instance's course. I need to look into it a bit deeper and perform some testing to make sure the system context's questions work no differently from the course context's if you just add the system context to the instance's pool options. But I'm quite optimistic that this one will be added to the plugin pretty soon.
Thank you very much for your suggestions, they're always welcome!
If you look through the comments here, you'll find some similar to yours, with replies. To sum up, this is not currently available, and there are no immediate plans for it.
In general, it's a debatable question how to approach this in adaptive testing, as it differs in nature from regular quizzes. You're welcome to outline your vision on this here if you want to, or even raise this question in the Moodle forums to get some valuable input there.
Thank you for your interest in the plugin.