Hi Juan Carlos
Thanks for the detailed reply. Now that you explain it, I remember old discussions with the same message. I must say, I see the advantages of the current "memory-less" approach: it is the most "secure" (nothing to secure) and probably leaner.
Anyway, regarding recording or (live) monitoring, a colleague mentioned that a Grafana plug-in would be possible, if I knew how to pull the information as it happens. But as is customary here, the action begins only once it burns; to be specific, the exams in question were today!
They were on the same subject, conducted in two sessions with 231 and 230 candidates respectively. Net duration was 60 min, single attempt, in an open window of 90 min each. That float is an illusion: in the first session over 200 began within the first 5 min! (For the second one I don't know; I missed the first 30 min.)
I have no knowledge of the exam itself, as I'm only a non-editing teacher in that course. Judging by the large number of program compilations, I assume it had multiple questions for which the students developed their answers in VPL. The exam ran on SEB.
Only a single student got an error:

I found an old related script in my notes and modified it to:
#!/bin/bash
while true
do
    echo -n `date +%s,`
    echo -n `/usr/bin/pgrep -c vpl-jail-server`
    echo -n ','
    echo -n `uptime | cut -d' ' -f 11`
    free -m | awk '/Mem/{print $3}'
    sleep 10
done
running
# nohup script.sh >> monitor.csv &
collected the results. (attached)
Erroneously the script recorded the 15-min load average.
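For a next run, a variant along the following lines might avoid that mistake. It is only a sketch, assuming that the 1-minute load average is the value of interest and that the machine is Linux with /proc/loadavg available. The function name "sample", the "run" argument and the INTERVAL variable are my own inventions for illustration, not part of the script that produced the attached data. Reading /proc/loadavg avoids counting fields in the uptime output, whose position shifts with the uptime format.

```shell
#!/bin/bash
# Sketch of a sampler writing: epoch,jail process count,1-min load,used MiB
# Assumptions (not from the original script): Linux with /proc/loadavg,
# the function name "sample", the "run" argument and INTERVAL.

sample() {
    local ts procs load1 mem_used
    ts=$(date +%s)
    # pgrep -c prints the number of matching processes (0 if none)
    procs=$(pgrep -c vpl-jail-server 2>/dev/null)
    procs=${procs:-0}
    # first field of /proc/loadavg is the 1-minute load average
    read -r load1 _ < /proc/loadavg
    # "used" column of free, in MiB, as in the original script
    mem_used=$(free -m 2>/dev/null | awk '/^Mem:/ {print $3}')
    printf '%s,%s,%s,%s\n' "$ts" "$procs" "$load1" "$mem_used"
}

# Loop only when called with "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    while true
    do
        sample
        sleep "${INTERVAL:-10}"
    done
fi
```

It would be started much like before, for example: nohup ./script.sh run >> monitor.csv &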
The machine is a VMware VM with 16 vCPUs and 16 GB RAM, running Ubuntu 22.04. The network connecting them, from one provider to the other, is overly complicated for my taste.
I don't know anything about the Moodle side, other than that it is a "big" server.
The data correlate nicely with the usual system graphs and give some insight into the performance of VPL jail servers.