MySQL Clustering?

MySQL Clustering?

by Michael Penney -
Number of replies: 16

Has anyone tried this out? Is there a reason it won't work with Moodle?

MySQL Cluster enables you to:

  • Cost-effectively deliver 5 nines availability using parallel server architecture with no single point of failure.
  • Deliver the performance and high throughput required to meet the most demanding enterprise applications.
  • Incrementally scale your applications in a linear fashion as your needs grow without having to invest in expensive hardware.

High Performance Only a Main Memory Database Can Deliver

MySQL Cluster achieves its performance advantage by being a main memory clustered database solution, which keeps all data in memory and limits IO bottlenecks by asynchronously writing transaction logs to disk. MySQL Cluster also enables servers to share processing within a cluster, taking full advantage of all hardware. Typical response times for MySQL Cluster are in the range of a few milliseconds and MySQL Cluster has been proven to handle tens of thousands of distributed transactions per second that are also replicated across database nodes.


Many folks I talk to say that the db servers can't be clustered, resulting in the db server being the bottleneck of any cluster, but my server admin pointed out that MySQL claims it can cluster.

"which keeps all data in memory and limits IO bottlenecks by asynchronously writing transaction logs to disk."

So can we use it to cluster moodle db servers?

Average of ratings: -
In reply to Michael Penney

Re: MySQL Clustering?

by Martín Langhoff -
DB servers can't be clustered and be reliable, like ACID reliable. I haven't heard good stories about mysql clusters, really, but I do find the idea intriguing.

For example:

> asynchronously writing transaction

This means that you are open to a power outage causing serious dataloss. If you are going to be running risks like that, just disable the MySQL (or Postgres!) transaction logs and see your server be 5x faster for DB writes. Don't need fancy clustering for that ;)

What I am working on instead is integrating Slony, a Postgres replication mechanism. Useful for hot-failover, and perhaps separate read/write handles.

Long-term, I want to have a memcached writethrough cache for the busiest Moodle tables. In the end, DB writes are your insurmountable bottleneck, the bottleneck that draws the real scalability limits. But we are hitting the DB aplenty with silly SELECTs which we can cache.

I had a recent chat with MD about MySQL's query cache vs a Moodle level cache using memcached. MySQL query cache has some warts to it, but it's a good thing. An app-level cache is even better, because it can do writethrough, which MySQL's cache can't handle, and it saves you trips to the actual DB backend.

When your DB is on a separate machine, this is invaluable.
In reply to Martín Langhoff

Re: MySQL Clustering?

by Michael Penney -
Hi Martin, we have UPSs on the servers, and a generator for the server room, so power failure should not be a problem. We have had a good number of outtages and brownouts of the public power here in the past, so our server admins have independent power and redundent battery backup down very well.

What happened with Wikipedia I think is an example of poor planning, one shouldn't of course try to run a mysql cluster when one isn't allowed to have UPSs on the servers!

As I understand it, in the MySQL/Apache/PHP cluster if one server fails (due to power supply failure or other hardware failure), the other servers in the cluster will take up the load, so things may slow down but not lose data or corrupt tables?

The rest of our cluster plan includes a networked RAID backed up to our SAN, further backed up to tape and offsite storage. This (and the independent power) is part of our overall architecture, not just for the LMS.

So if power is taken care of do you see other problems with clustering MySQL?

In reply to Michael Penney

Re: MySQL Clustering?

by Martín Langhoff -

So if power is taken care of do you see other problems with clustering MySQL?

Hmmm, yeah.

Imagine a mysql process goes bonkers and eats away your CPU -- hey, it happens sometimes. You "kill" the process, either -15 or a more violent -9. It'll leave some corrupt indexes or even table data, which will cause other processes to lock up.

The fix is to run isamchk, but that takes many hours.

Postgres writes data safely, that's part of the ACID thing. All through kernel panics, kill -9, or power cuts, the data on disk is consistent, everytime.

In reply to Martín Langhoff

Re: MySQL Clustering?

by Samuli Karevaara -
If the DB itself is very reliable, then the "bottleneck" of reliability is the Moodle code itself? Many operations require things in multiple tables to be consistent, but I'm quite sure that there is no transactional code everywhere to take care of these. If the power/server goes down in the middle of a db write, then the db itself (like indices and such) is not corrupted (a biiig plus smile ), but moodle-wise the db might still be quite out of sync...

Maybe as part of this quest for reliability some of the bigger things (like deleting a course) could be made transactional. Maybe the AdoDB functions for transactions could be used here? (Like StartTrans etc.)
In reply to Samuli Karevaara

Re: MySQL Clustering?

by Martín Langhoff -
That's a very good point. We've started doing a few transactional writes (look! there's sql_begin(), sql_commit() and sql_rollback()). But until MySQL is widely used with a transactional backend like InnoDB, I don't know how we can turn Moodle towards a more transactional approach.

The issue I see is that the logic of the code, and in particular the error handling, is very different when you're writing it "transaction-oriented". Maintaning transactional and non-transactional code paths is a real challenge.
In reply to Martín Langhoff

Re: MySQL Clustering?

by Samuli Karevaara -
It is reasonable to say that if the db doesn't support transactions, then that installation of Moodle won't support transactions.

I'm not sure, however, why the transactional code wouldn't work with the transactions "off". It would require a dummy-layer for that, but isn't it so in AdoDB (for example) already?

With the dummy-transactions the actualization wouldn't be transactional, of course, but it would be no worse that what it's now. If you have stuff that only touches the db and not the files on the data folder or such (like most of the critical parts in Moodle are?), then it's just a case of start_trans() and commit_trans() of doing nothing. The rollback would have to be done on the db layer anyway?

Some parts would be critical, like deleting a course: there the deletion of the files should be carried out only after the final "ok" from the db. A way to make the actual file deletion to be transactional too would require quite a lot of effort, and might be out of scope for the mid-future...
In reply to Samuli Karevaara

Re: MySQL Clustering?

by Martín Langhoff -

I'm not sure, however, why the transactional code wouldn't work with the transactions "off". It would require a dummy-layer for that, but isn't it so in AdoDB (for example) already?

The error handling, therefore the logic flow, would have to be completely different. It would take a serious re-think of some operations to make things transactional in a meaningful way.

At this point in time the important operations are fail-safe, so that the database doesn't end up in wildly inconsistent state if something goes wrong. So the lack of transactions is not too serious.

In reply to Martín Langhoff

Re: MySQL Clustering?

by Oksana Delgado -
so, I still didn't understand, is clustered database an option for Moodle? is it possible to run clustered mysql/postgresql with Moodle?
In reply to Oksana Delgado

Re: MySQL Clustering?

by Michael Penney -
is clustered database an option for Moodle?

As far as I can tell MySQL in memory clustering works as advertised, it should work as well for Moodle as any other database driven application.

I think the big question is whether MySQL high performance clustering provides sufficient data integrity when run in a cluster, so it's a MySQL cluster question rather than a Moodle question. There are some interesting discussions about this on the MySQL mailing lists Martin L. pointed to.

Moodle can also be run on Oracle, so it should have no problem running on an Oracle cluster.

We'll be running some test on the high performance clustering of MySQL, hopefully in the fall.

I think high availability clustering via replication is older technology and should work fine for Moodle (though Martin L. may correct mesmile.

is it possible to run clustered mysql/postgresql with Moodle?

I don't think Postgres does high performance clustering, for Postgress you want a very fast db server connected to an apache/php cluster, as Martin Langhoff describes, with data replication to a spare db server for failover.



In reply to Michael Penney

Re: MySQL Clustering?

by Oksana Delgado -
Michael, thnks for the references and info. Seems like I'm surrounded by your replies here and forwarded by my bossbig grin
In reply to Martín Langhoff

Re: MySQL Clustering?

by ryan wise -
So just to confirm, if I'm installing MySQL I should install the non-transactional Database only?

Or Should I install the tranactional database?

From what I"m reading here. MySQL 4.1 includes the InnoDB storage engine. I'm guessing this is a recent update?
In reply to ryan wise

Re: MySQL Clustering?

by Samuli Karevaara -
Currently I don't think it makes a big difference as most parts of Moodle don't use the transactional features, whether the DB supports it or not. Might make sense to go with the transactional anyhow.

The InnoDB table type, which supports transactions, has been in MySQL for some time, but in the recent versions it has been made the default table type (instead of MyISAM that was the previous default).
In reply to Michael Penney

Re: MySQL Clustering?

by Richard Jensen -
Please see my reply in this other, very relevant thread:

http://moodle.org/mod/forum/discuss.php?d=121665

It appears that Moodle does not play nice with the ndbcluster table type required by MySQL clustering. The ndbcluster table type imposes serious limitations that existing Moodle schemas do not appear to address. It appears that the Moodle team will need to rethink existing schemas in light of the ndbcluster table type limitations before Moodle will become an enterprise grade application.

The aforementioned thread addresses problems with Moodle on Oracle as well. Again, it is not well-supported; although, aside from it not being open-source, Oracle RAC is a superb RDBMS platform (we use it for our PeopleSoft system). For enterprise grade db clustering, at present Oracle RAC is really the only game in town. MySQL clustering will never be RAC, but it could be a reasonable open-source alternative for Moodle, if Moodle was complient with the ndbcluster table type.

Postgres has several add-ons to provide HA capacity, but we have found only one add-on to create an actual db cluster. It appears to be developed and supported by a single individual in Japan (with marginal English skills), and that is not our idea of a "well supported" approach to an enterprise grade application! So, at present, even Postgres does not seem to offer a well-supported clustering option.

I consider this problem to be critical, and I would urge the Moodle team to view as top priority making Moodle into an enterprise grade, open source application by ensuring support for the ndbcluster table type.

In reply to Richard Jensen

Re: MySQL Clustering?

by Martín Langhoff -
Hi Richard,

IIRC, the main problem with ndbcluster is that it has to fit in memory. It's not a practical thing to expect of any sizable system, and it's one of the main limitations of MySQL cluster. (I personally think it's kind of ridiculous.)

Oracle has, as others have mentioned (and I've researched in depth) some serious limitations in its TCP/IP protocol, specifically in the handling of CLOBs (or perhaps in the lack of an intermediate column type). I've worked on this with the low level PHP driver trying to address this problem. No dice. It's just not a good platform for CMS-style applications. (But the results of my work are available as a complete patchset for Moodle code on Oracle installations. The patches improve the performance on Oracle significantly. Francois Marier and Dan Marsden are the current keepers of those patches.)

A possible option for you is get a support contract for Moodle on Oracle, so that you have experienced developers helping sort out any issues. Eloy (Moodle.com) and the Catalyst team have been doing a ton of work on it. They'd be my first recommendation.

(You might want to read my profile page for disclaimers and professional associations smile )
In reply to Richard Jensen

Re: MySQL Clustering?

by Valery Fremaux -
Hi Richard,

here is some experience that may be of some use.

Actually Martin pointed out the real limitation of the ndbcluster engine. We use such an engine for driving a 40 Moodle Grid for the french Pairformance project.

Our experience with ndbcluster concluded with the folliwing.
  • Moodle cluster may be used for parts of the data model that may face a big increase of stress.
  • MySQL allows choosing the engine for each table within the same database.
  • Moodle tables can be divided within three sets :
    • High read stressed tables (because of being accessed a lot or having a lot of data the php caches need to make a page
    • Write stressed tables (having mostly scarse reads)
    • Moderately read/write stressed tables
    • Fortunately didn't I noticed tables that would have a constant read/write stress
The first set is interesting for cluster, and mostly relates to central tables such as the course, course_module, categories, course_sections tables one side, and the role and capability definition tables the other side. (around 15 tables per Moodle)

None of the other tables are interesting for the cluster. The majoritary written tables are not a good choice as writing synchronously to all the cluster nodes takes time.

The other tables have barer use in the whole load.

I may be able to tell you more about this strategy in a few weeks, after our LoadRunner intensive test and measurement campain.

Cheers.