Can We Use AI to Grade Essay Questions?

by Mosaab Alsiddig -
Number of replies: 8

As the title suggests, we’re exploring the possibility of using AI to automatically grade essay questions in Moodle. We're particularly interested in:

  • Moodle plugins that support AI-based essay assessment

  • External AI services that can be integrated with Moodle (via API or LTI)

  • Any solutions that offer automated grading and/or constructive feedback

If you’ve used AI for essay grading in Moodle — whether through a plugin, custom integration, or third-party service — we’d love to hear your experience.

✅ What worked?
⚠️ What challenges did you face?
💡 Any recommendations or tips?

Thanks in advance for sharing your insights!

In reply to Mosaab Alsiddig

Can We Use AI to Grade Essay Questions?

by Deds Castillo -
Hi Mosaab,

I did something along these lines for a client, both for essay writing and for grading math questions. Structured output works well if there's a requirement for uniform fields. The challenge will always be how you craft the prompt so that it gets you what you want. Few-shot examples help a lot, and providing proper rubrics helps. Sometimes you get better results if you break the task down and simulate a conversation, as opposed to providing everything in one prompt with a single input. Lastly, the choice of AI model will really give you variance in the responses. In some cases the cheap model won't cut it and you'll need to use one that costs more.

Note that I did not use my own model or spin up my own instance. I just tried APIs from the different vendors, like OpenAI and Perplexity.
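
For anyone who wants to try this, here is a minimal sketch of that structured-output approach, assuming the OpenAI Python SDK and its JSON mode. The rubric, the few-shot example, the model name, and the output fields are all placeholders to adapt, not something from an existing plugin:

    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    RUBRIC = """Score 0-10: thesis clarity (0-3), use of evidence (0-4),
    organisation and grammar (0-3)."""

    student_answer = "The causes of the First World War were..."  # placeholder

    messages = [
        # The rubric and the required output fields go in the system prompt.
        {"role": "system", "content": (
            "You are a strict essay grader. Grade against this rubric:\n"
            + RUBRIC
            + '\nReply with JSON: {"score": <int>, "feedback": <string>}.'
        )},
        # One few-shot example pins down the expected format and severity.
        {"role": "user", "content": "Essay: The moon landing happened in 1969..."},
        {"role": "assistant",
         "content": '{"score": 6, "feedback": "Clear thesis but thin evidence."}'},
        # The real student response goes last.
        {"role": "user", "content": "Essay: " + student_answer},
    ]

    response = client.chat.completions.create(
        model="gpt-4o",          # a cheaper model may not be consistent enough
        temperature=0,           # reduce run-to-run variance
        response_format={"type": "json_object"},  # force parseable output
        messages=messages,
    )
    result = json.loads(response.choices[0].message.content)
    print(result["score"], result["feedback"])

Breaking the task down into a simulated conversation, as suggested above, just means extending the messages list turn by turn instead of packing everything into one prompt.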
Average of ratings: Useful (1)
In reply to Mosaab Alsiddig

Can We Use AI to Grade Essay Questions?

by Marcus Green -
 
Currently in use in the following countries, to my knowledge:
 
Australia
Germany
Japan
Vietnam
Turkey
Israel
Ukraine
Taiwan
Spain
Average of ratings: Useful (1)
In reply to Mosaab Alsiddig

Can We Use AI to Grade Essay Questions?

by Mosaab Alsiddig -
Thank you for your ideas.
I look forward to seeing this implemented as a grading activity rather than just feedback in the near future.
In reply to Mosaab Alsiddig

Can We Use AI to Grade Essay Questions?

by Marcus Green -
What do you mean by a "grading activity"?
 
In reply to Marcus Green

Can We Use AI to Grade Essay Questions?

by Mosaab Alsiddig -
Sorry, my mistake — I initially thought it only provided feedback, but I now see it can also grade the question.
Thank you very much!
In reply to Mosaab Alsiddig

Can We Use AI to Grade Essay Questions?

by Marcus Green -
Although it will grade questions, I suggest being cautious about it for several reasons. It is hard to craft a prompt that gives consistent grading, and even when the grading is consistent it typically reflects what a statistical average of random people on the web would do, not necessarily what actual experts would do. I have found it acceptable for English language constructions and grammar, but it may be less useful and reliable in other subjects. But please experiment and let us know what you find.
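
One rough way to test the consistency point before trusting any grades: grade the same answer repeatedly and look at the spread. A sketch under the same OpenAI API assumption as the earlier example, with a made-up essay:

    import json
    import statistics
    from openai import OpenAI

    client = OpenAI()

    def grade_essay(essay: str) -> int:
        """One grading call; returns just an integer score."""
        r = client.chat.completions.create(
            model="gpt-4o",
            temperature=0,
            response_format={"type": "json_object"},
            messages=[
                {"role": "system",
                 "content": 'Grade the essay 0-10. Reply with JSON: {"score": <int>}.'},
                {"role": "user", "content": essay},
            ],
        )
        return json.loads(r.choices[0].message.content)["score"]

    sample_essay = "The industrial revolution transformed society because..."
    scores = [grade_essay(sample_essay) for _ in range(10)]
    print(f"scores={scores}, stdev={statistics.stdev(scores):.2f}")
    # A spread wider than one grade band suggests the prompt is not
    # consistent enough for summative use.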
Average of ratings: Useful (2)
In reply to Marcus Green

Can We Use AI to Grade Essay Questions?

by Matt Bury -

I'll second that.

At the moment, the results from LLMs are "variable" at best, which in assessment is a critical issue (see https://en.wikipedia.org/wiki/Inter-rater_reliability). Although LLMs may help to reduce graders' workload, we still need expert graders in the process to ensure relevance, accuracy, & consistency.
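
To put a number on that, one common measure is Cohen's kappa between an expert grader and the LLM over the same set of essays. A sketch with made-up scores, assuming scikit-learn is available:

    from sklearn.metrics import cohen_kappa_score

    # Illustrative scores only. Quadratic weights are the usual choice
    # for ordinal grades, where near-misses count less than big misses.
    human_scores = [7, 4, 9, 6, 5, 8, 3, 7, 6, 5]
    llm_scores   = [8, 4, 7, 6, 6, 9, 2, 7, 5, 5]

    kappa = cohen_kappa_score(human_scores, llm_scores, weights="quadratic")
    print(f"quadratic-weighted kappa = {kappa:.2f}")

Low agreement on a sample like this is a strong signal to keep the expert graders in the loop.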

Re: prompting, yes, the more generic the prompts, the more generic the responses. An expert can write specific prompts for specific criteria & get more or less decent results, whereas someone with less expertise may find an LLM more hindrance than help. The same goes for formative assessment, AKA feedback: there's a difference between using trial & error to find prompts that get OK results & then having to edit them extensively, and actually understanding the underlying subject matter, the common issues that students tend to have with it, & the more effective feedback & follow-up activities that help to move learning forward (AKA "pedagogical content knowledge").

In reply to Matt Bury

Can We Use AI to Grade Essay Questions?

by Marcus Green -
I agree with Matt. There is very widespread overclaiming and hard selling of what AI/LLM systems can do. The capabilities have been improving, but we are in the "hype cycle". The world of AI investment is a bubble; at some point it will pop, and I suspect that point is not very far away. But we will be left with some interesting technologies, and perhaps the hype will ease back a little.

Having said all that, I have been working on a feature to make it easier to test prompts against various student responses:
https://fosstodon.org/@marcusgreen/114412400069226194
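
Not the plugin's actual code, but the general shape of that kind of prompt testing is easy to prototype: run each candidate prompt over a batch of stored responses and compare the grades side by side, using the same kind of hypothetical grading call as in the sketches above:

    import json
    from openai import OpenAI

    client = OpenAI()

    def grade(system_prompt: str, essay: str) -> int:
        r = client.chat.completions.create(
            model="gpt-4o", temperature=0,
            response_format={"type": "json_object"},
            messages=[
                {"role": "system",
                 "content": system_prompt + ' Reply with JSON: {"score": <int>}.'},
                {"role": "user", "content": essay},
            ],
        )
        return json.loads(r.choices[0].message.content)["score"]

    # Candidate prompts and stored student responses (placeholders).
    prompts = {
        "strict": "Grade 0-10 against the rubric; penalise missing evidence heavily.",
        "lenient": "Grade 0-10 against the rubric; reward any relevant point.",
    }
    responses = ["First stored student answer...", "Second stored student answer..."]

    for name, system_prompt in prompts.items():
        print(name, [grade(system_prompt, r) for r in responses])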
Average of ratings: Useful (1)