Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -

We need advice on handling a high volume of concurrent users and the sheer size of first.js.

We load tested our Moodle 3.7 site using LoadRunner Cloud. The script was recorded using VuGen and modified to select a random user from 30 test users.

As the load approaches 500 users, requests to the Moodle site start timing out after 30 seconds. We aim to support approximately 1500 concurrent users. The server hardware is substantial.

Our test configuration is as follows:
Moodle 3.7.2+ on CentOS 7.5

Apache 2.4
MySQL 5.7.21
PHP 7.1
PHP-FPM
Adaptable Theme

The load test scenario is quite simple:
login > entering course > entering quiz > starting a new quiz > pressing next > trying to submit the quiz > logout

We have tested with 50 to 500 concurrent users, drawn at random from the 30 test accounts.

We have tuned the server by moving from PHP-CGI to PHP-FPM and setting the number of threads and maxclients appropriately.
This has allowed 333 users to be served without excessive timeouts.

It seems that the network bandwidth with a 10Gbit/s connection is being saturated.
That is the effect of a 3 MB first.js alone, not including the overhead of pages, images etc.
first.js is served from lib/requirejs.php.
It doesn't seem to be retrieved from moodledata/cache.
The cost of retrieving all the JS files and globbing them together is about 44 ms.

Am I right in saying that the client's browser cache will be loaded with
first.js from the server and that on subsequent page requests it will be retrieved from the client-side cache?

The number of users that can be served per second with the 3 MB first.js alone seems to be:
Network configuration: 1 Gbit/s backbone
1 Gbit/s = 125 MB/s (megabytes per second)
125 / 3 ≈ 41.7 users per second
It is easy to see that the overhead of pages, images etc. could saturate the available network bandwidth when scaling up.
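
The back-of-the-envelope figure above can be reproduced with a short calculation. This is only a sketch: the function name is hypothetical, and it assumes the link carries nothing but first.js, with no protocol overhead, no compression and no client-side caching. It is an upper bound, not a measurement from the test.

```python
def max_downloads_per_second(link_gbit_per_s: float, file_mb: float) -> float:
    """Upper bound on full downloads of one file per second over a link.

    Assumes decimal units (1 Gbit = 1000 Mbit, 1 MB = 8 Mbit) and ignores
    protocol overhead, compression and browser caching.
    """
    link_mb_per_s = link_gbit_per_s * 1000 / 8  # Gbit/s -> MB/s
    return link_mb_per_s / file_mb


if __name__ == "__main__":
    # 1 Gbit/s backbone and a 3 MB first.js, as in the post above.
    print(round(max_downloads_per_second(1, 3), 1))  # prints 41.7
```

Any compression or client-side caching would raise this ceiling, while page, image and CSS traffic would lower it.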

Are there any other suggestions as to how this could be improved, especially with high instantaneous load? Does Moodle 3.9 use a different caching mechanism?


In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Howard Miller -
What are you measuring while running the load test? You will run out of bandwidth whenever you hit the "weakest link" so you have to be quite creative about measuring everything you can get at in your system.

I'm a bit of a load testing cynic. How hard are you driving these simulated 500 users - do you have random delays for example?
In reply to Howard Miller

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
Hi Howard. The recorded script uses a 2 second think-time between actions. By default the delay (think-time) is the actual recorded delay.
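
To put a rough number on how hard such a test drives the server, the think-time can be turned into an approximate action rate. This is a minimal sketch, assuming a closed workload in which each virtual user waits for a response and then pauses for the think-time. The function name is hypothetical and the 1 second response time in the example is an illustrative assumption, not a measured value.

```python
def actions_per_second(users: int, think_time_s: float, response_time_s: float) -> float:
    """Approximate steady-state rate of scripted actions for a closed workload.

    Each virtual user loops: send an action, wait response_time_s for the
    response, then pause for think_time_s before the next action.
    """
    cycle_s = think_time_s + response_time_s
    if cycle_s <= 0:
        raise ValueError("think time plus response time must be positive")
    return users / cycle_s


if __name__ == "__main__":
    # 500 virtual users with a 2 s think-time (from the post) and an
    # assumed 1 s average response time.
    print(round(actions_per_second(500, 2.0, 1.0), 1))  # prints 166.7
```

Each scripted action usually triggers several HTTP requests (page, JS, CSS, images), so the request rate seen by the web server would be higher than this figure.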
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Visvanath Ratnaweera -
Hi

You say, the usual culprits, high CPU load, high RAM utilization, long i/o wait, database congestion, ... all are well within the limit. It is the network which is saturated? That is unusual.

I am out of touch with the JavaScript developments in Moodle. Does $CFG->cachejs have an effect? Did you see these developments around release 3.7-3.8?
- Save and Show Next freezes on 2nd student
- https://tracker.moodle.org/browse/MDL-67327
- https://tracker.moodle.org/browse/MDL-67358
and many more.

P.S. @Howard, thanks for clearing my visibility https://moodle.org/mod/forum/discuss.php?d=414040#p1676280 !
In reply to Visvanath Ratnaweera

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Howard Miller -
We had a problem with our Redis server limiting performance. It turned out to be on a slower switch and the network bandwidth to Redis was causing the bottleneck. That took ages and a lot of panic to spot.
In reply to Howard Miller

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
>We had a problem with our Redis server limiting performance. It turned out to be on a slower switch and the network bandwidth to Redis was causing the bottle-neck.
We don't use Redis because it would take additional configuration. My colleague said that on other installations Redis can provide a nice performance boost. I believe we'd have to set up 2 Redis caches, one for data and one for sessions, otherwise everyone would be logged out when the caches are purged. Is that correct?
I thought it could have been a physical network problem, but the hosting company have assured us the network configuration is 1Gbit/sec throughout. The firewall is connected to a 1Gbit/sec port for instance.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Howard Miller -
On a "proper" Moodle install a "proper" cache is almost obligatory. Recent Moodle versions rely very heavily on the cache and the default method of writing to disk files is unlikely to give you sufficient performance. Redis is particularly easy to set up so there's really very little excuse. You do not need two installs (I do have two for full disclosure but it's not required). That's one of the advantages of using Redis over some other caches.

Note that the configuration for the cache is in the admin user interface, whereas sessions have to be set up in config.php (no UI).

The point about network capacity is not the speed of the interfaces on paper - it's what they are doing. Most firewalls and switches are capable of producing nice performance graphs. If you suspect the network then that's what you should be looking at.
In reply to Visvanath Ratnaweera

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
$CFG->cachejs does indeed have an effect. $CFG->cachejs = true; is enabled.
Here is the odd thing: under load, on the first request, I believe the request is falling through and serving an uncached version of first.js.

This might be expected because the virtual user has no cached version yet, but I think it should be pulled from the moodledata cache directory instead of globbing all the .js files together. I have purged all caches and visited the site as a normal user to regenerate the moodledata/cache files but it still globs them together when the performance test virtual users access the site.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Visvanath Ratnaweera -
Never had an issue from this corner. So the following is pure guess work.

I think with "uncached first.js" you are at the heart of the problem. Now the question is whether Moodle 3.7 never did this properly, i.e. whether it is a bug in 3.7. If so, it has been resolved in later versions, otherwise we would have had more such reports. You should study the tracker and/or ask in the General developer forum for confirmation. Whether anybody is motivated to fix it in the 3.7 series, for which bug fixes ended in May 2020, is a different question.

It is also possible that your caching mechanisms are not properly configured. Give as much detail as possible about the current caching mechanism(s) for the people here to have a look.
In reply to Visvanath Ratnaweera

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
Thank you all for your replies and information.

I have examined the differences between Moodle 3.7.2 and 3.8.1 and found significant differences throughout the system, including large changes in lib/requirejs.php.

Specifically the section the code falls through in 3.7.2 has now been usefully commented:
// If we've made it here then we're in "dev mode" where everything is lazy loaded.
// So all files will be served one at a time.

I have checked all the developer mode options and they are all disabled in our 3.7.2 and the JavaScript cache options are all enabled so I can only assume it is reaching the uncached serving code in error.

Adding debugging code, we see that $rev is -1 for the uncached client.

As a test in 3.7.2 I added the following code at line 42:
if ($rev == -1 && $file == "core/first.js") {
    $rev = 1;
}

With this change, the correct cached version of first.js is transmitted to the client, saving network bandwidth and allowing higher throughput:

Before change: [screenshot not reproduced]

After change: [screenshot not reproduced]

A further consequence of this change is the following:
Action.c(45): Error -27720: Step download timeout (120 seconds) has expired when downloading resource "https://domainname/theme/yui_combo.php?rollup/3.17.2/yui-moodlesimple-min.js". Set the "Step Timeout caused by resources is a warning" Runtime Setting to Yes/No to have this message as a warning/error, respectively [MsgId: MERR-27720]

We intend to upgrade to 3.9 in the near future. As a test, I may upgrade the test version of the site to 3.8.1 if that requires fewer changes to the theme etc., but either way we will repeat these load tests and compare the results with what we have at the moment.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
I have upgraded our staging site on the live hardware to Moodle 3.9.3+ (Build: 20201224) and PHP 7.3
This isn't a complete upgrade as some theme styling isn't present. There is also some custom code which isn't present but that isn't used in this load test. We will fix that later.

The results are a big improvement:


In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Alex Rowe -
Something else to look at when trying to support a lot of users at the same time is a cache in front of Moodle.

Something like NGINX, Varnish or Cloudflare/Akamai allows all the JS/CSS/images to be cached, so that those requests are served directly from cache (memory or fast disk) rather than being sent to Moodle to process.

Moodle also serves the majority of its images/CSS/JS via PHP scripts rather than directly from disk, so removing this aspect will free up a lot of processing on your actual Moodle servers.
In reply to Alex Rowe

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
Using NGINX seems to offer better performance, but common cPanel installations only provide Apache.
For easy management of NGINX we might have to install something like https://applications.cpanel.net/listings/view/Engintron-Nginx-on-cPanel but then we'd have to consider whether that will keep working with future cPanel upgrades.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Howard Miller -
As you sound like you know what you're doing, it's difficult to understand why you are restricting yourself by using cPanel.
In reply to Howard Miller

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Michael Lynn -
I'm not a sysadmin and cPanel is there because it is a managed server.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Visvanath Ratnaweera -
Then the sysadmin should do his job. His first task is to put a decent caching mechanism in place.
In reply to Michael Lynn

Re: Load Testing Moodle 3.7 with Adaptable Theme

by Alex Rowe -
I was primarily referring to a server that sits in front of the Moodle application servers. Normally in a load balanced environment you would have NGINX/HAProxy or other load balancer in front of Moodle, and with NGINX or Varnish (and others) as well you can add a layer of caching for images/js/css that Moodle serves.

CDNs like Akamai/Cloudflare do something similar, so if you aren't able to manage the server installs yourself, these can normally work with managed/cPanel type services.

You will also never get the same performance out of managed cPanel hosting as from VPS type hosting, but that's a trade-off you need to decide on: performance vs $$$.