So my questions:
- Is there any step-by-step guide to installing a Moodle cluster?
- Do I need shared storage on the web servers (i.e. in some systems, everything is stored in the database so nothing actually gets stored on the application servers)?
- If so, do I need a clustered filesystem like GFS or OCFS to prevent locking problems? Are there any alternatives?
It would be great for there to be some documentation on moodle.org about this since I think it would be of interest to quite a few sites.
Have you considered Linux Virtual Server (LVS) www.linuxvirtualserver.org for your requirements?
If you run a search on clustering, you will find quite a few posts within the moodle forums.
I did a search on the forums (and elsewhere) and there were a few bits of good information, but mostly it was people either saying clustering is a good idea or asking how to do it.
http://catalyst.net.nz/moodle/enterprise/ seems to refer to a lot of what we're looking for, but since it's a presentation, it doesn't give any real details.
The presentation talks about a cluster that uses LVS with Keepalived, pretty close to what http://www.austintek.com/LVS/LVS-HOWTO/ outlines. That's your blow-by-blow Moodle cluster setup.
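For anyone wanting a concrete starting point, a minimal director fragment in that LVS + Keepalived style might look like this sketch (all IPs, the interface name and the real-server list are placeholders, not a tested config):

```
# /etc/keepalived/keepalived.conf (sketch only - IPs/interface are placeholders)

# Failover of the virtual IP between the two directors:
vrrp_instance VI_1 {
    state MASTER            # BACKUP on the second director
    interface eth0
    virtual_router_id 51
    priority 100            # lower on the second director
    virtual_ipaddress {
        192.168.0.100
    }
}

# The LVS virtual service balancing across the web servers:
virtual_server 192.168.0.100 80 {
    delay_loop 6
    lb_algo wlc             # weighted least-connection scheduling
    lb_kind NAT
    protocol TCP
    real_server 192.168.0.11 80 {
        TCP_CHECK { connect_timeout 3 }
    }
    real_server 192.168.0.12 80 {
        TCP_CHECK { connect_timeout 3 }
    }
}
```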
I haven't yet gotten around to writing a specific "what do we do to tune Moodle once the LVS cluster is set up". At the Sydney Moot a guy (can't remember his name) gave a great presentation on setting up an LVS for Moodle, and then he grabbed me and we had an impromptu one-hour in-depth discussion on how to tune the cluster for performance and reliability, including kernel tunables, Apache tuning, DB tuning, disk partitioning strategies, etc.
Was anyone who was there keeping notes...? Miles?
Gary Benner also did a lot of work on clustering. I am not certain where he finally finished up, but I have a small development cluster (7 servers - 2 LB, 3 web, 2 DB). There were some notes developed for the Moodle Moots (Feb 2005 in NZ and July 2005) which are available on DVD - but as for actual setup notes, I don't believe there are any. Benoit Brosseau from Canada set up a two-server heartbeat environment that I have been trying to get notes from for over 6 months, but to no avail - he might be another source though.
Over the coming months I will be developing a how-to set of notes for the cluster we have, but at this stage it is just a development environment. I would like to build on our general single-server setup notes so we can readily duplicate the environment, and then produce a set of administration and maintenance notes / manuals. If that would be of use further down the track, let me know.
There was also an admin course started a while ago that fell into a hole for a number of reasons - I would like to resurrect that at some stage, if there is enough interest, as there seem to be a number of people out there who are setting up environments but do not know how to properly configure or administer them.
I've only had time to have a brief look at the LVS docs so far, but it seems (I might be wrong here!) that this doesn't take care of the issue of shared storage for /moodledata. http://www.linuxvirtualserver.org/architecture.html suggests using "data based systems" for "the data that server nodes need to update dynamically". I suppose that even with clustered file systems like GFS, if a file (or part of a file) is locked by one node and another node tries to access it, the second node will have problems. Any pointers on this would be much appreciated!
Yeah mate - thanks again for your time on that in Sydney.
I met up with Nick Cross from Scots College (the other fella in that discussion) yesterday - he's gearing up now to go live with his cluster... however I've been a little more slack...
I'll be sure to get my notes together and post them here...
It'll be particularly valuable, I think, for those working in a high-usage K12 setting. It seems to me that although many of the universities have high user numbers, they rarely have a high percentage of them working concurrently. In our installation we have only about 2000 users - but frequently see 150-250 concurrently doing active work (quizzes/assignments/forums). 7 servers would be nice, but for many of us a smaller-scale LVS cluster is just the ticket.
It seems to have gone a little quiet since the previous discussion.
I plan to try out a small Moodle cluster using LVS/NAT with round robin (which will allow me to change to an F5 balancer in future without much pain): 2 LVS load balancers + 2 Apache web servers + 1 database server (MySQL + moodledata).
Since I do not have a SAN/NAS currently, I plan to set up the DB server as an iSCSI target (http://iscsitarget.sourceforge.net/) with GFS.
For your information, we are running on a 1 Gb UTP network. All hardware is DL385 G2 (4GB RAM + SAS hard disks, AMD Opteron 2.6GHz dual-core with 2MB L2).
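For what it's worth, the LVS/NAT round-robin part of that plan boils down to a few ipvsadm commands on the director - something like this sketch (all IPs are placeholders):

```
# Define the virtual service on the VIP with round-robin (-s rr) scheduling:
ipvsadm -A -t 10.0.0.100:80 -s rr
# Add the two Apache real servers in NAT/masquerading mode (-m):
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.12:80 -m
# NAT mode also needs IP forwarding enabled on the director:
sysctl -w net.ipv4.ip_forward=1
```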
My questions are :-
1. Is this setup workable? Will moodledata be easily corrupted?
2. I know LVS/NAT is not a good choice and there is a single point of failure on my DB side. Can I use a cold-standby unit for the DB server, synced with rsync?
3. Authentication is currently SSL-enabled. Do you foresee any problems that might arise?
4. We had a hard time with CPU overload on the Windows platform, so we plan to move to Linux. May I know if our hardware can handle 5000-8000 users with 300 concurrent logins?
I'd appreciate your knowledge sharing on performance tuning, since Aus & NZ are too far away from here for me to meet up with you guys.
This only provides a standby for the database and moodledata elements since AFAIK MySQL clustering isn't supported by Moodle. A true active/active clustered database may well be easier with other database types.
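A cold-standby sync like the one being asked about could be sketched as a nightly cron job on the primary (hostnames and paths here are placeholders, and a dump is safer to copy than live database files):

```
# Take a consistent dump of the Moodle database, then push the dump and
# moodledata across to the standby box. "standby" and all paths are
# placeholders for this sketch.
mysqldump --single-transaction --all-databases > /backup/moodle.sql
rsync -az /backup/moodle.sql standby:/backup/
rsync -az --delete /var/moodledata/ standby:/var/moodledata/
```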
Our configuration is as follows:
2 - Dell 1850's (Both configured as master, so we have redundant "master" boxes)
1 - Dell 2850 (MySQL/MoodleData)
We've had no problems thus far with this configuration, simply put, it rocks, and keeps up with our demand. If we ever need another box in the chain, we just purchase it, image it, toss it into the LVS configs, and it's go time.
After nearly 2 months of research, we plan to use this structure. We are in the implementation phase. The failover of DRBD sometimes causes hiccups.
It is amazing how fast delivery can be when it comes from an optimized cache - varnish should give another boost, but squid can do a lot of other useful things:
"Heavy"-load users can be restricted using spawn-fcgi (e.g they will get only two fcgi-childs). xfs seems to be a good choiće for the /moodledata-files. At the moment anything runs on a single machine - I'll divide in the following order if load rises.
1. MySQL to a separate server
2. Squid to a separate server
3. /moodledata to a separate server
4. three webservers for delivery
5. adding a MySQL-slave
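The spawn-fcgi restriction for "heavy" users mentioned above can be sketched roughly like this (the port, user and php-cgi path are assumptions):

```
# Start a dedicated FastCGI pool with only two PHP children (-C 2);
# the web server then routes the "heavy" users to this port.
spawn-fcgi -a 127.0.0.1 -p 9001 -C 2 -u www-data -f /usr/bin/php-cgi
```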
They are also the makers of http://www.danga.com/memcached/
Any one here had any experience with Perlbal? If so could you please share your experience with us? Thanks!
Speed heavily depends on what can be delivered NOT from cache. If the requested item is not in there, then you'll have to wait for the backend. Only these are the interesting requests.
And then we are talking about database tuning - lots of performance can be gained by optimizing your database system. Then we talk about RAID10 and RAM, RAM, RAM, so that you can hold the whole database in RAM. The last thoughts (RAID10, XFS, RAM) are not mine, but from Kristian Koehntop, a core MySQL developer.
He also says that for a database it's not the highest data rate of an HDD that matters, but the shortest average seek time. The more heads, the lower the seek time: better an array of several small disks than one big disk with only two heads.
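The "whole database in RAM" advice translates mostly into buffer sizing, e.g. a my.cnf fragment along these lines (the sizes are purely illustrative and must be matched to your RAM and database size):

```
[mysqld]
# Aim to fit the working set (ideally the whole DB) in the buffer pool:
innodb_buffer_pool_size = 12G
# For MyISAM tables, key_buffer_size plays a similar role for indexes:
key_buffer_size = 512M
```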
PHP & web server load is trivial compared to database load (with Moodle, in my experience).
just my two cents...
Do you know of any objective data regarding the number of users a web server would be able to support when running Moodle and a PHP accelerator?
I read in one post that we need to estimate 50 simultaneous users per 1G of RAM.
We are preparing an install for a university with 70000 students; I assume we will have around 5000 simultaneous users max. If I do the math, it would mean we will need 13 web servers with 8G each.
Now if we run that many web servers, I'm afraid we will start seeing bottlenecks in other places.
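The back-of-envelope math above, using the 50-users-per-GB rule of thumb from that post, checks out:

```python
# Sizing from the rule of thumb quoted above:
# 50 simultaneous users per 1 GB of RAM, 8 GB of RAM per web server.
import math

simultaneous_users = 5000
users_per_gb = 50
gb_per_server = 8

ram_needed_gb = simultaneous_users / users_per_gb   # 100 GB in total
servers = math.ceil(ram_needed_gb / gb_per_server)  # round up to whole boxes
print(servers)  # -> 13
```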
I know the thread is old... I hope I can still get an answer.
Actually, I am a little bit concerned about the session of a user when LVS is deployed.
1- Will the cookie information be enough to identify the user, regardless of which machine serves him?
2- Is the data in the cookie plain text (and hence not secure)?
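For what it's worth, a PHP session cookie normally carries only a session ID, not the session data itself, so the second question mostly comes down to where the session store lives. And LVS itself can pin a client to one real server with its persistence option, e.g. (IPs are placeholders for this sketch):

```
# -p 1800 gives each client a 30-minute persistence window, so repeated
# requests from the same address hit the same real server:
ipvsadm -A -t 10.0.0.100:80 -s rr -p 1800
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m
```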
I plan to upgrade from a high-availability setup to a load-balanced + high-availability one, catering to approximately 21000+ users, with frequent quizzes and some classes of over 1000 students.
This setup will have the following average spec:
Load balancers: dual-core processors, 8GB RAM, 160GB HDDs, running CentOS 32-bit / Ultramonkey
Web servers: quad-core, 12GB RAM, CentOS 32-bit and Apache (not familiar with lighttpd or nginx)
DB servers: quad-core, 16GB RAM, CentOS 64-bit with MySQL. No clustering YET, just a backup with scripted hotcopying of the DB overnight.
What do you guys think of this setup? I am open to any ideas suggestions.
- 8GB RAM on a load balancer is overkill; put 2GB RAM in the LBs and the rest in your web servers, and run memcached on them.
- get more web servers. Are you able to run all your load on just one server? Seriously, I wouldn't want one of these two servers to die; get a whole bunch of web servers, so that one or more of them can die / be offline for testing / whatever.
- how will you handle the file aspect? Do you already have a storage box? How will you handle the moodledata files?
How are you handling the load at this moment? What boxes do you have now and how are they being used?
You really want redundancy here! Make sure you get an extra of everything.
Thanks for the input.
The 8GB machines already exist; all I am going to do is change their roles in the overall schema. Regarding the memcached option, I will be running eAccelerator - should I also run memcached on the webservers?
Regarding the multi-webserver setup, do you have any sort of documentation on how to properly set up and configure this option?
The files will be on separate RAIDed drives on a backend machine.
eAccelerator is a PHP compiler cache: it caches the compiled PHP scripts, so that if 100 users hit the same PHP script, it has to be compiled only once.
Memcached is a database query (or other object) cache, thus lowering the load on your database.
In my opinion they complement each other.
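Getting memcached running on each web server is a one-liner; the cache size and user here are illustrative:

```
# Run memcached as a daemon (-d) with 512 MB of cache (-m 512),
# listening on the default port 11211 (-p), as an unprivileged user (-u):
memcached -d -m 512 -p 11211 -u nobody
```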
The multi-webserver setup is rather straightforward. We even have the same config files on all our webservers. The only file that lives on the physical local box is apache2local.conf, which contains the local ServerName directive, I think.
Our setup is similar to this: http://www.howtoforge.com/haproxy_loadbalancer_debian_etch
The webservers all have only private IP addresses; only the load balancers have public IPs, sharing one failover IP (3 IPs across 2 servers: one physical per box plus one shared virtual).
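The shared-config-plus-one-local-file pattern can be as small as this (the filename matches the post above; the hostname is a placeholder):

```
# apache2local.conf - the only per-box file; every other config file is
# identical across the webservers. The hostname below is a placeholder.
ServerName web1.cluster.internal
```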
Will you be exporting the files over NFS, or just rsyncing them?
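If NFS is the answer, the export on the storage box can be as simple as this sketch (the path and webserver IPs are placeholders):

```
# /etc/exports on the storage box: share moodledata read-write with the
# webservers; "sync" trades some write speed for safety.
/var/moodledata  192.168.1.11(rw,sync) 192.168.1.12(rw,sync)
```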
I notice that you have twin load balancers and twin web servers. How are they coupled? I mean, do they share load all the time, or does one wait to take over if the other fails?
Or to put it another way, is the goal high availability or high performance (or both)? 1000 students taking a test (quiz module) together need a lot of resources, so performance is a concern.
It would be easier to understand if you said which technologies you are using: LB appliances? LVS? Heartbeat? ...
The goal is both high availability, plus load balanced high performance.
I will be modelling after this example:
Regarding memcached, I will look into it - thanks for the advice.
Any documentation on installation and configuration specific to Moodle, or can I just take standard web documentation as gospel?
Again folks, thanks for the input!