Handling Moodle database load - a solution!

Handling Moodle database load - a solution!

by Nitin Parmar -
Number of replies: 9
Recently, at the University of Bath, we've really been struggling with service performance with our MySQL-backed Moodle 1.9.7+ installation.

We've been aware that the recent poor, perhaps inconsistent, service performance has been a major issue for all users and have been working hard as a team to try to deal with such issues and to find ways to improve the user experience. This has been indicated within posts on our Moodle service blog.

A recent post on the blog indicated that we've turned a real corner with performance, so I wanted to post to this forum to elaborate on the work that we've done, and to share our solution with the community.

Last Monday (22nd February) was a particularly bad day in terms of performance. This was down to a combination of factors, namely:
  • Monday's are the busiest day of the week in terms of usage. I recently showed a graph at a Moodle Advisory Group meeting which indicated this - slide 5 at http://go.bath.ac.uk/fltx. The key thing to notice is the spike at the start of each week.
  • We identified that one department had four submission deadlines for Monday, where hundreds of students were submitting work online. This had a rather large impact on the database server. [I've since spoken to the department, and asked if they might be able to consider staggering submission dates. Not a longer term or scalable solution, but it'd certainly help in the short/medium term].
  • The same department had a couple of hundred students logging into Moodle at the start of an "open week".
All these factors combined, made Moodle crawl for most of the day (11.30am - 6pm, at least) and close to unusable.

The image below shows an illustration of our Moodle database load for the period Tuesday 16th February - Tuesday 23rd February.

duplo-load-averages-1.jpg

On a typical Monday, the load is near to 1.5 (not shown on the graph above). Monday 22nd February 2010 was the exception to the rule - you'll see the spike at 2.0 for much of the day. You'll also be able to see that during the previous week, Wednesday through the Friday, the database load isn't as much and runs between 0.5 and 1.0.

As well as having this information, we receive email pings from the database every hour when there are "too many connections". On a typical Monday we might receive 3 or 4 such emails; on Monday 22nd, we got 7!

During our maintenance period on Tuesday 23rd, we implemented something which has reduced the load on the database server. Previously, the database server was used to store information about current sessions. Now however, this information is stored on the filesystem. (To take a look at this in your own installation, go to: Site Administration >> Server >> Session Handling >> Use database for session information)

A typical load for Tuesday is between 1.2 and 1.5; the load on Tuesday 23rd February (and subsequent days) was close enough to a peak of 0.2 (see graph below). The change detailed above has had a significant impact on Moodle performance, on one of our busiest days... and we didn't receive any email pings, even during the middle of the day!


duplo-load-averages.jpg

The number of users logging into the service Tuesday 23rd compared to Monday 22nd was reduced admittedly, but following the change to how sessions are handled, we did not encounter the same types of problems as the previous day (or previous Tuesday's) and the impact on the database server was markedly different.

We will continue to monitor this over the next couple of weeks, but clearly the real test will come on Monday, when usage is at its very highest. However, we're confident that our improved configuration would be able to handle the same number of users (and associated activity, such as coursework submissions) in a much, much better fashion.

A few weeks ago, we released our Moodle Development Plan for 2010 to colleagues (and the Moodle community) which details further service performance projects are planned from now until June 2010. This includes:
  • Moving Moodle 'log tables'
  • Moodle housekeeping, in particular, deleting 20000 old users that no longer exist in our LDAP.
I hope that this forum post, whilst lengthy, is of some use to Moodle administrators.

I'd be happy to answer any further questions about this, or indeed, our ongoing Moodle-related work at the University of Bath smile



Average of ratings: Useful (3)
In reply to Nitin Parmar

Re: Handling Moodle database load - a solution!

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
I thought it was well known that database sessions were probably a performance hog. Oh well.

Also, are you using InnoDB or MyISAM tables? ISAM has a crappy locking scheme (it locks whole tables, not just rows) which really hurts concurrency on things like logs and quiz tables. Use InnoDB, or Postgres.
Average of ratings: Useful (2)
In reply to Tim Hunt

Re: Handling Moodle database load - a solution!

by sam marshall -
Picture of Core developers Picture of Peer reviewers Picture of Plugin developers
If it is well known it should probably be documented on that settings screen...
In reply to sam marshall

Re: Handling Moodle database load - a solution!

by sam marshall -
Picture of Core developers Picture of Peer reviewers Picture of Plugin developers
grrr my editing timeout timed out.

I was going to say, it is actually documented on the moodle docs wiki: Performance.
In reply to sam marshall

Re: Handling Moodle database load - a solution!

by Frank Ralf -
While adding a link to this discussion to Performance FAQ I noticed a lot of overlapping content with Performance. Might be a good idead to merge those two articles?
In reply to Frank Ralf

Re: Handling Moodle database load - a solution!

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Hi Frank

I started filling the stub http://docs.moodle.org/en/Performance_FAQ after explaining things like number of concurrent users is that counts over and over. http://docs.moodle.org/en/Performance is definitely more comprehensive but is topic-based, not in question-and-answer form.

My idea was to compile the references to a FAQ and put it in one place. Those references could be subtopics from other wiki pages or references to discussion threads or individual postings or even to Xref http://xref.moodle.org/

'fcourse if you want to try another plan, you are welcome. That's why we have wikis!
In reply to sam marshall

Re: Handling Moodle database load - a solution!

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
It has only been documented on Performance since this morning, when I added it wink
In reply to Tim Hunt

Re: Handling Moodle database load - a solution!

by J B -
Hello

I have recently joined Nitin in the elearning team at Bath.

> I thought it was well known that database sessions were probably a performance hog. Oh well.

Well that had been my experience too, albeit on a postgres-backed moodle instance: I understand the logic fo setting it that way was so as not to lose the user session in case of server fallover, and the notes by the checkbox seem so promising - "especially useful for large/busy sites" that I expect the change was made and forgotten about before the usage scaled up to meet the performance at what became a limiting value! Perhaps there should be more of a caveat there.

> Also, are you using InnoDB or MyISAM tables?

I will have to find out on Monday, but I seem to recall seeing a note from my predecessor regarding the need for InnoDB.
In reply to Nitin Parmar

Re: Handling Moodle database load - a solution!

by sam marshall -
Picture of Core developers Picture of Peer reviewers Picture of Plugin developers
Hi,

I don't think storing sessions in database is at all a sensible idea, either; database is basically the slowest way to store anything so you should use it only for data that actually needs database functions (searchable, rigid control to guarantee data validity, transactional updates, etc.) or where performance doesn't matter.

I'm guessing you have a single-server system, but here are our experiences with a system that has multiple webservers, with storing sessions on the filesystem:

1) If you have more than one webserver, you will need to make sure that EITHER the load-balancing system uses 'sticky sessions' to ensure that a single user keeps seeing the same server OR you store the sessions in shared storage (NFS). Obviously if you don't do this it totally won't work so that's easy to spot.

2) Currently we do not use sticky sessions because we had an issue with certain front-end servers potentially (when the dice fall the wrong way) directing a huge proportion of traffic consistently to a single server, causing it to fall over - i.e. the front-end servers get 'stuck' on the sticky sessions.

3) As a consequence we suspect our NFS servers (or our SAN, whatever) as the cause of some minor continuing performance problems. This is also definitely the cause of some obscure problems that happen when your session is updated by two requests at once (a fairly rare timing issue); this results in session corruption and the user gets re-logged-in. [Not something they would notice usually - but if this happens at a bad time, it might e.g. lose a forum post due to incorrect sesskey.]

We propose to fix this by addressing the issue with the front-end servers then switching on sticky sessions. For the moment it does work OK, but it's definitely not ideal...

So in summary if you have one webserver my opinion is that disk (real, local disk) is a good place to store sessions. If you have multiple load-balanced webservers, local disk is still a good place to store sessions, but you need to ensure that requests from a single user within a session always hit the same server via your load balancer's 'sticky sessions' feature. And don't try to put sessions on NFS.

--sam

PS Note that I am a developer and not actually directly involved with running our server system smile That's somebody else's job. But I stick my nose in now and then, hence this post.

PPS Off topic, but historically, we got a large database load reduction by turning off ('No') the text cache feature ('cachetext' in admin). I am not certain this applies to everyone or indeed if it even still applies to us, but it's worth a try if you want to reduce database load. I mention it since it's a bit counterintuitive; you'd think a cache feature would improve performance, and in some cases, it possibly does - but that cache is in the database so is itself a performance problem.
Average of ratings: Useful (2)
In reply to sam marshall

Re: Handling Moodle database load - a solution!

by Nitin Parmar -
Thanks for your replies! I've asked one of the software developers in my team to reply to your post too, as he'll be able to give more effective answers to your questions.

However, I just wanted to say that as part of the process of tweaking Moodle to improve performance, we changed the 'cachetext' option from 1 hour to 15 minutes. I'm not sure if this has had a significant impact on service though. Might it be a good idea to set this to 'No' given your advice above?

We've also turned off recently glossary auto-linking which we also think might've had a positive effect on service performance...