Information on Moodle Scalability (100,000+ users)

by Bryan Chapman -
Number of replies: 20

Hi;

I would like to talk to anyone who is using Moodle for very large implementations with over 100,000 users.

Any information about Moodle stability at that level would be most appreciated.

--Bryan Chapman

e-learning analyst

Brandon Hall Research

bryan@brandon-hall.com

In reply to Bryan Chapman

Re: Information on Moodle Scalability (100,000+ users)

by Martin Dougiamas -
Bryan, sorry to see no replies on this query from you.

I know you found moodle.com eventually but just to provide some closure to this post here ....

Moodle should be able to scale that high without problems (moodle.org has nearly that number of users for example), providing that you have good hardware and IT people capable of monitoring and tuning the server to keep up. Moodle can be clustered if required to help it scale.

The key issue for scaling hardware is the maximum number of "concurrent" users you expect to be hitting the server at any given 10-second window.
In reply to Martin Dougiamas

Re: Information on Moodle Scalability (100,000+ users)

by Samuli Karevaara -
Looking for a hard yardstick (no need to be exact, obviously, just one to use) for the "concurrent users" here, the 10-second window is one. Would you calculate the concurrent users by checking how many different user accounts have loaded a page at any given 10-second window?

Or by calculating the "logged-in users" by for example counting the total of user accounts that have had a page loaded in the last 10 minutes?
In reply to Samuli Karevaara

Re: Information on Moodle Scalability (100,000+ users)

by Martin Dougiamas -
Well, to me concurrent means "placing load on the server at any given moment". When you request a page the server is made to work hard for a short while (exact length depends on complexity of the page, usually it's under a second - my 10 seconds was an estimate that covered more complex pages).

While you are reading that page in your browser, though, the load you are placing on that server is zero (even if you are logged in).

So yes, counting hits from different user accounts in a given x-second window (let's say 10) could be one way to work out a number for concurrency. You might want to average these counts for many sequential windows, over a longer period etc.
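A minimal sketch of that windowed count, assuming the hits are already available as (timestamp, user id) pairs. The function names and the sample data are made up for illustration and are not part of Moodle:

```python
from collections import defaultdict


def concurrent_users_per_window(hits, window_seconds=10):
    """Count distinct users per fixed window.

    hits: iterable of (unix_timestamp, user_id) pairs (hypothetical input).
    Returns {window_start: number_of_distinct_users}.
    """
    users_by_window = defaultdict(set)
    for timestamp, user_id in hits:
        # Align each hit to the start of the window it falls in.
        window_start = int(timestamp) - int(timestamp) % window_seconds
        users_by_window[window_start].add(user_id)
    return {start: len(users) for start, users in sorted(users_by_window.items())}


def average_concurrency(hits, window_seconds=10):
    """Average the per-window counts over every window in the observed span.

    Windows with no hits at all count as zero, so quiet periods pull the
    average down instead of being ignored.
    """
    counts = concurrent_users_per_window(hits, window_seconds)
    if not counts:
        return 0.0
    first, last = min(counts), max(counts)
    total_windows = (last - first) // window_seconds + 1
    return sum(counts.values()) / total_windows


# Invented sample: users "a" and "b" hit in the first 10 seconds,
# "a" again in the second window, "c" in the fourth.
sample_hits = [(0, "a"), (3, "b"), (4, "a"), (12, "a"), (31, "c")]
print(concurrent_users_per_window(sample_hits))
print(average_concurrency(sample_hits))
```

The window length is a parameter, so the same function gives a "concurrent page loads" figure with 10 seconds or a "logged-in users" style figure with 600 seconds.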

As you can see it's all very inexact - there are too many variables. Personally I treat server health more like biology than mathematics. I watch symptoms like my weight or body temperature (server load) rather than counting calories.

Also, it's best to have heaps of extra headroom to cope with the spikes. I try to keep my average Unix load under 0.1 ... when it starts running hotter than 0.3 or so all the time, it's time to double the server size or reduce the number of users. :-) Hardware is cheap!
In reply to Martin Dougiamas

Re: Information on Moodle Scalability (100,000+ users)

by Samuli Karevaara -
Thanks for the insight! I'll use the term "concurrent page loads" internally to avoid confusion, then have another one for "concurrent logins" or such with a 10 minute span or so... I understand that it's the "real thing" that counts ("How does the site run when you use it with your browser?"), not these numbers. It's just that this gets asked a lot, so I'm putting together a figure or two.

The server load (the Unix "load") is another confusing number. Usually "1" means that the processor is running at 100%, and this is how I interpret it too. Sometimes (seldom?) this is not the case, however.

With multiple processors it's trickier: I haven't had the time to investigate (and haven't found articles about) if our dual processor could have a load of "2" to somehow indicate 100% load per processor. Furthermore, as we are hyperthreading, we have four virtual processors, so could we go to "4"... No worries yet though, looking at less than 0.1 at peak hours (start of schoolday and after lunch) for the 15 minute average.
In reply to Samuli Karevaara

Re: Information on Moodle Scalability (100,000+ users)

by Iñaki Arenaza -
Unix load numbers are interpreted a little differently. The numbers you get are the number of processes in the system run queue, averaged over the last minute, the last 5 minutes and the last 15 minutes.

Thus a load of 1 means there is just one process (on average) waiting for the processor to become free and get some CPU time. So your system is getting all the work done pretty well (using whatever % of CPU is needed). You are in real trouble when your *sustained* load average is over ~10. Numbers between ~3-10 mean the machine is overloaded, but still running along.

By the way, you can have very high load averages while your CPUs are otherwise idle. I have seen this on one of my machines (load over 60 for 60+ minutes, and CPUs at 5-10%) due to an I/O bottleneck in the SCSI bus.

Saludos. Iñaki.

In reply to Iñaki Arenaza

Re: Information on Moodle Scalability (100,000+ users)

by Samuli Karevaara -
Ok, thanks for clarifications! I take it that the number of processes in the queue is per system, so with four virtual processors there are four slots processing the queue so I can divide the queue by four to compare with a single processor system.
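A small sketch of that division, for illustration only. It assumes a Unix-like system where Python's os.getloadavg() is available. The helper name load_per_cpu is made up, and note that os.cpu_count() counts logical CPUs, so hyperthreaded siblings are counted even though they do not deliver the throughput of a full core:

```python
import os


def load_per_cpu(load_averages=None, cpu_count=None):
    """Divide the 1, 5 and 15 minute load averages by the CPU count.

    Both arguments are optional so the function can be tried with fixed
    numbers; by default they are read from the running system.
    """
    if load_averages is None:
        # Unix only: returns the (1 min, 5 min, 15 min) load averages.
        load_averages = os.getloadavg()
    if cpu_count is None:
        # Logical CPUs; falls back to 1 if the count cannot be determined.
        cpu_count = os.cpu_count() or 1
    return tuple(load / cpu_count for load in load_averages)


# Invented figures: a system-wide load of 2.0 on four logical CPUs
# compares to 0.5 on a single-processor machine.
print(load_per_cpu((2.0, 1.0, 0.4), cpu_count=4))
```

On Linux the reported load average is a system-wide figure and is not scaled by the number of processors, so this kind of division is only a rough way of comparing against single-processor numbers.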
In reply to Samuli Karevaara

Re: Information on Moodle Scalability (100,000+ users)

by Bernard Boucher -
Hi all,
         two links to explain the average methods used:
1
2

All this mathematics to simulate a thermometer...

I hope it may help,

Bernard

In reply to Bernard Boucher

Re: Information on Moodle Scalability (100,000+ users)

by Samuli Karevaara -
I stumbled upon those while googling for this originally, but couldn't find whether our system compensates the multiple processors while calculating the load average, or if I should divide the load by four before comparing to a single processor load average numbers...

Awful lot of math there, seems that less would do :-)
In reply to Iñaki Arenaza

Re: Information on Moodle Scalability (100,000+ users)

by Martín Langhoff -
Yup -- Iñaki's got it right. There are great utilities to capture stats on the server usage.

We combine these stats (using the sar utility and NRPE) with stats of unique user entries in mdl_log in 5 minute tracts. It gives us a great correlation between unique users performing actions and server load (cpu and iowait states), in 5 minute increments.
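A rough sketch of this kind of correlation, not our actual tooling. It assumes the log rows have already been exported as (time, userid) pairs and that a load figure per 5 minute bucket has been collected separately (for example from sar output). All function names and sample numbers below are invented for illustration:

```python
from collections import defaultdict
from math import sqrt


def unique_users_per_bucket(log_rows, bucket_seconds=300):
    """Count distinct user ids per bucket (5 minutes by default).

    log_rows: iterable of (unix_timestamp, user_id) pairs (assumed export).
    """
    buckets = defaultdict(set)
    for timestamp, user_id in log_rows:
        bucket_start = int(timestamp) // bucket_seconds * bucket_seconds
        buckets[bucket_start].add(user_id)
    return {start: len(users) for start, users in buckets.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def correlate_users_with_load(log_rows, load_by_bucket, bucket_seconds=300):
    """Correlate unique users per bucket with a load figure per bucket.

    Only buckets present in both data sets are compared.
    """
    users = unique_users_per_bucket(log_rows, bucket_seconds)
    shared = sorted(set(users) & set(load_by_bucket))
    return pearson([users[b] for b in shared], [load_by_bucket[b] for b in shared])


# Invented sample: 2, 1 and 3 unique users in three consecutive buckets,
# with a load figure that rises and falls in step with them.
sample_rows = [(0, 1), (10, 2), (20, 1), (300, 1), (600, 1), (610, 2), (620, 3)]
sample_load = {0: 0.2, 300: 0.1, 600: 0.3}
print(correlate_users_with_load(sample_rows, sample_load))
```

A coefficient near 1 means the load tracks the number of active users closely; a low value suggests something else (cron jobs, backups, I/O waits) is driving the load.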

In reply to Bryan Chapman

Re: Information on Moodle Scalability (100,000+ users)

by Martín Langhoff -
(My fault in not answering earlier -- been buried in things to do!)

As MD says, "user accounts" can easily reach 100K without pushing your scalability, it's the concurrent users that count. As part of our work for the NZVLE project (see http://eduforge.org/projects/nzvle ) we are running a Moodle cluster which hosts ~40K users / 11K courses and growing. There was a lot of scalability work done in the 1.4.x series related to our setup.

We are now looking into a cluster setup with up to 300K accounts, and high number of concurrent logins (>1K).

Moodle scales very well, and we are putting significant efforts in reaching better scalability.
In reply to Martín Langhoff

Re: Information on Moodle Scalability (100,000+ users)

by Visvanath Ratnaweera -
Hello Martin (L)

I've heard about your project. Could you recommend literature, preferably books, on the topics involved: MySQL (or PostgreSQL) clustering and tuning?
In reply to Visvanath Ratnaweera

Re: Information on Moodle Scalability (100,000+ users)

by Martín Langhoff -
Search this forum! :-) I've posted a lot with good hints, links to important tuning guides, etc.
In reply to Bryan Chapman

Re: Information on Moodle Scalability (100,000+ users)

by Visvanath Ratnaweera -
This is a recurrent topic, although we don't hear about 100K users every week ;-) For example, check this thread: http://moodle.org/mod/forum/discuss.php?d=6920

As somebody has pointed out, "there is so much a machine can do", with or without Moodle! With the numbers you are talking about, you'll have to do a few calculations.

My _guess_ is that once the core issues are solved, you'll have to consider the supporting structure, like back-ups, network bandwidth, fail-over strategies, etc.
In reply to Visvanath Ratnaweera

Re: Information on Moodle Scalability (100,000+ users)

by Bhupinder Singh -

This is interesting.

Can anyone suggest what basic calculations need to be done to decide the infrastructure sizing?

It would be interesting to have data from anyone running a site of any reasonable size (in users), so that we can look at the sizing metrics and apply them on a proportional basis.

Any data available will be helpful to all of us. :-)

Garry

In reply to Bhupinder Singh

Re: Information on Moodle Scalability (100,000+ users)

by Howard Miller -
This question, or some variation of it, is asked very frequently. And, regrettably, there isn't an answer. The factors involved are numerous and complex - certainly the maximum number of users in the database is only one factor. Then there's the type of materials served and the maximum number of concurrent users. Even then the machine hardware is only one part of the equation.

I would make two points. Firstly, if you can split up your installation into more than one Moodle instance then do so, or at least don't do anything to prevent it. This means that if you run out of capacity you simply buy another server and have Moodle on more than one server. Secondly, as Martin has said many times, buy the biggest fastest machine you can afford. Favour fast discs and lots of memory.

Don't forget things like how well connected your machines are to the network compared to how your users will be accessing Moodle.
In reply to Howard Miller

Re: Information on Moodle Scalability (100,000+ users)

by Martín Langhoff -
Wise Words Howard!

There isn't "a metric", even less a formula. We run a Moodle with a lot of users, very heavy load, on a high-end cluster. The load on the server is very organic, and changes a lot with how it is being used.

Some factors:
* Concurrent users (usually sees periodic spikes)
* Total users
* Ratio of lurkers vs active users
* Modules used (some are very light on the system, some are murder!)
* Whether "groups" are being used
* Filters in use

Our approach has been to monitor our cluster very closely -- we get daily stats of what parts of the site are popular, and how they perform (memory footprint, database queries, etc). If we spot a potential performance problem, we review the code, and usually find that we can make small changes that lead to sizeable performance improvements.

The other aspect is OS tuning. A well tuned system will give you several times more throughput than a "Default RedHat Install".

Mind you -- this is true of any web-based software. Any "sizing" info you get is usually very fuzzy and "indicative". Good for a starting point, but nothing beyond that.
In reply to Martín Langhoff

Re: Information on Moodle Scalability (100,000+ users)

by Gavin McCullagh -
Here's a thing which would be extremely useful and might cut down the recurring conversation.

Could we have a forum thread or a page where a bunch of people running moodle instances post their detailed experiences, eg:

"I run Moodle [1.5.2] on [Debian GNU/Linux] with [1000] users. 

We use [PostgreSQL 7.3], [Apache 2], [PHP4].

We regularly have [30] concurrent users. 

Hardware is a Dual CPU HT Xeon w 4GB RAM and 72GB SCSI disks Raid1.

Performance is generally [good, bad, mostly good], [except when we have...].

I have tuned postgresql setting.....
I have tuned apache setting "

People could then look at this page to get a ballpark figure based on total/concurrent number of users versus hardware.  It might be best as a moderated page so as not to mislead but it would hopefully give a guide.  A few data points like the above would make a really good guide for people.

Gavin

In reply to Gavin McCullagh

Re: Information on Moodle Scalability (100,000+ users)

by Steve Hyndman -
That is a good idea and we do need a place where all this information is stored and easily retrieved. I would vote for having a section in the documentation wiki dedicated to this so that it could be kept up to date.
In reply to Steve Hyndman

Re: Information on Moodle Scalability (100,000+ users)

by koen roggemans -
Yes, I like that idea a lot.

I would very much appreciate some documentation on the sentence "The other aspect is OS tuning. A well tuned system will give you several times more throughput than a 'Default RedHat Install'".

I know this can hardly be the place to document how to maintain a RedHat web server, but some hints on how to optimize your server for Moodle would be welcome :-)
Note there is already a modest start for that here