We've got an "interesting" (read: worrying!) problem with our (fairly new) live Moodle setup.
Moodle 2.3.1 on SuSE Enterprise Linux based Apache 2.2.10 / PHP5 (x 4 servers). The moodledata area is on NFS.
Cisco ACE providing load balancing / system availability control / SSL offloading (SSL certificates are installed here)
PHP5 acceleration (xcache provided by PHP module php-xcache-1.3.2-2.32)
We're just ramping the usage up, as students are starting to return after their summer break. We're now routinely hitting 120 concurrent users, but this is set to spiral (hopefully not out of control...!) over the next few weeks.
The problems we're seeing are around response. This is just so unpredictable. We've managed to determine that it's not MySQL (at the moment!), but basically anything else in the application stack is up for questioning.
We can remove the Cisco load balancer and things remain the same, so it's very probably not that either.
Interestingly, if we ping (nping) port 80 (which is the moodle port, as SSL offloading is done on the Cisco) then we'll get occasional packet loss. Port 443 works correctly 100% of the time.
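For anyone wanting to repeat the port comparison without nping, here is a minimal sketch in Python. It only counts TCP connects that succeed or fail within a timeout. The function name `probe_tcp` and its parameters are made up for illustration and are not part of any tool mentioned above.

```python
import socket
import time


def probe_tcp(host, port, attempts=20, timeout=2.0, pause=0.0):
    """Open `attempts` TCP connections to host:port, one after another.

    Returns (successes, failures, latencies), where latencies holds the
    connect time in seconds for each successful attempt.
    """
    successes, failures, latencies = 0, 0, []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection raises OSError (including timeouts and
            # refused connections) when the connect does not complete.
            with socket.create_connection((host, port), timeout=timeout):
                successes += 1
                latencies.append(time.monotonic() - start)
        except OSError:
            failures += 1
        if pause:
            time.sleep(pause)
    return successes, failures, latencies
```

Running it against port 80 and then port 443 from the same client gives two failure counts that can be compared directly. A failure here means the connect timed out or was refused, which is related to, but not the same thing as, the packet loss nping reports.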
So, does anyone have any advice with respect to tuning apache, TCP kernel parameter adjustments, etc?
Our suggested eventual level of concurrency is up to 2000 users logged in, probably 10% busy at any given time.
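To put rough numbers on that target, here is a back-of-envelope sketch. It assumes each "busy" user has about one request in flight and that the load balancer spreads requests evenly over the 4 web servers. Both are simplifying assumptions, and the function names are illustrative only.

```python
def busy_users(logged_in, busy_percent):
    """Number of users expected to be active at a given moment."""
    return logged_in * busy_percent // 100


def busy_per_server(busy, servers):
    """Active users per web server, rounded up, assuming an even spread."""
    return -(-busy // servers)  # ceiling division


target_busy = busy_users(2000, 10)              # 200 busy users overall
per_server = busy_per_server(target_busy, 4)    # 50 per server
print(target_busy, per_server)
```

So the target works out to about 200 simultaneously active users, or roughly 50 per server if the spread is even. Real browsers open several connections each, so the number of Apache connections needed may be noticeably higher than this.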
Thanks very much
Worrying? Yes, unfortunately. Your setup is not going to stand the expected load.
From your post I can think of a couple of areas:
1. Moodle 2.3 seems to be a performance hog even compared to 2.2. Go through "Slow page loading in Moodle 2.3.1? Try this:" http://moodle.org/mod/forum/discuss.php?d=210777 and the two other discussions linked from there.
2. Virtualisation. This is a recurrent topic in this forum. See "Slow disk reads with Linux VMs, Moodle 1.9" http://moodle.org/mod/forum/discuss.php?d=211249. (Yes, it is about 1.9, but that was lean compared to all the 2.x releases.)
3. NFS. There are hints in the discussion mentioned above. Also search the forum for more ('Advanced search' mentioned in the intro.)
4. Your network setup. It seems to be pretty complicated. Is your site fully https, or is just the login https? Can your load balancer cope with sticky sessions? Do you also have proxy caches, Varnish
You have to dig deeper and localize the bottleneck. Linux is marvelous for that. There are some hints in the forum documentation (linked from the intro). You can also read other discussions in this forum to see how others went about it.
P.S. Kernel parameters are the last place I would look. By spending weeks or months on job scheduling, the Linux IP stack, file system intricacies, etc. you might gain a few percent. A single careless line in the application code (Moodle) could cost you factors!
Thanks very much for the information; that's all really helpful.
We think we've managed to isolate this particular problem to an Apache setting that was chosen based on the desired number of connections and the TCP/IP queue length. It turns out that the equations given on the Apache website (and echoed here) may be a little on the low side for our setup.
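The post does not name the setting, so the following is only a guess at how a queue-length problem of this kind could be confirmed. It assumes the suspect is the TCP listen queue on the web servers (on the Apache side this would involve the `ListenBacklog` directive, on the kernel side `net.core.somaxconn`). The sketch reads the `ListenOverflows` and `ListenDrops` counters from the `TcpExt` lines of `/proc/net/netstat` on Linux. The function names are illustrative.

```python
def parse_tcpext(text):
    """Parse the TcpExt header/value line pair from /proc/net/netstat text.

    Returns a dict mapping counter name to integer value, or an empty
    dict if the two TcpExt lines are not present.
    """
    lines = [ln for ln in text.splitlines() if ln.startswith("TcpExt:")]
    if len(lines) < 2:
        return {}
    names = lines[0].split()[1:]    # first line: counter names
    values = lines[1].split()[1:]   # second line: counter values
    return dict(zip(names, (int(v) for v in values)))


def listen_queue_counters(path="/proc/net/netstat"):
    """Return the listen-queue counters, or None for any that are absent."""
    with open(path) as handle:
        counters = parse_tcpext(handle.read())
    return {name: counters.get(name)
            for name in ("ListenOverflows", "ListenDrops")}
```

If these counters keep rising on a web server while users report stalls, connections are being dropped before Apache accepts them, which would be consistent with a backlog that is too small. If they stay flat, the cause is more likely elsewhere.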
I'll update this later with the exact details if it turns out to be the case (fingers crossed) - I don't want to post false solutions here until we've had a chance to absolutely check things ourselves.
The post regarding virtualisation is very interesting, and something that I have to agree with. I'm not convinced it should be used for busy transactional stuff at all. It's just too unpredictable. Development - yes. Perhaps functional testing - yes. But not live. Cost often wins though, and the ease of adding infrastructure is often seen as being a distinct advantage - even if it does all suffer from the same issues!