Hello
We were having performance problems with our infrastructure. At that time we had around 200 users. Our setup:
HAProxy load balancer
two backends on Ubuntu 14.04
one database (MySQL 5.5)
NFS server: Dell FS8600
Everything is virtualized under VMware. We have no problems with our other virtualized servers, whether Windows or Linux.
History:
We migrated in July from 2.8 to 3.1 and everything was fine. But in September, when the students came back, it was awful.
Sometimes it took 5 minutes to get a page, with the load at 200.
We checked everything and tested lots of possibilities:
haproxy: transparent mode or not
CPU/RAM: added to the backends and the SQL server
/var/www/moodle on local disks
We also installed an Ubuntu 16.04 server with PHP 7, 8 GB of RAM and 4 vCPUs: same problem.
We then spent some time on caching. From the beginning, the sessions were stored in memcached (defined in config.php). We noticed that it is possible to add another cache store using memcached. At first we mapped just some of the caches to this store. Things got better, but we lost sessions. So we set up two memcached instances, one for sessions and one for the cache store, and everything was fine.
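For anyone who wants to try the same thing, here is a rough sketch of what the session part of config.php can look like. The setting names are the ones documented in Moodle's config-dist.php. The host, the ports and the idea of running the second instance on port 11212 are only assumptions for illustration, not our exact values. A likely reason for the lost sessions is that purging a memcached cache store flushes the whole memcached server, so sessions kept in the same instance disappear too.

```php
// Fragment for config.php (goes before the require of lib/setup.php).
// Sessions go to a dedicated memcached instance.
// 127.0.0.1:11211 is an assumed address, adjust it to your own server.
$CFG->session_handler_class = '\core\session\memcached';
$CFG->session_memcached_save_path = '127.0.0.1:11211';
$CFG->session_memcached_prefix = 'memc.sess.key.';
$CFG->session_memcached_acquire_lock_timeout = 120;
$CFG->session_memcached_lock_expire = 7100;

// The application cache store is NOT defined here. It is added in the
// admin interface (Site administration > Plugins > Caching > Configuration)
// as a memcached store pointing at a second instance, for example
// 127.0.0.1:11212 (assumed port), so that purging caches never touches
// the instance that holds the sessions.
```

With both backends behind the load balancer, both of them have to point at the same two memcached instances, otherwise users would lose their session when they switch backend.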
Three of us spent around 2 weeks on this. During the panic it was impossible to say whether a particular script was at fault: a few hours later, you could take the same URL and it was fast. We were using the performance stats in error_log, but nothing obvious appeared.
Conclusion: it seems that if you run a cluster with NFS, it's mandatory to use two memcached instances, one for sessions and one for the cache. NFS requests dropped from 6000 requests/sec to almost nothing. No more CPU or load problems, and pages are fast. What is still very bad is displaying user avatars through pluginfile.php (sometimes 5 seconds or more).
Hope this helps
Dom