Please, rewind my Moodle to 10:29am this morning

Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -
Number of replies: 13
There's a nice feature in PostgreSQL called PITR. People normally use it to run warm standby servers (in case their main DB server dies). But it has a few more tricks in the bag.

If you set things up appropriately, you can rewind/replay your DB to a given point-in-time. So you have a 2nd Moodle install "shadowing" your main installation -- from a special admin webpage, you can order the shadow moodle to be rewound to any time from 5 minutes ago to 30 days ago, with granularity down to seconds/miliseconds.

So if a teacher calls saying they have *just* deleted a quiz he shouldn't have, you rewind the shadow Moodle install, export the course/quiz to a moodle backup file, and either restore it or email it to the teacher.

Something like a having your own personal Moodle Tardis.

Depending on HW, traffic and time of the day, the "rewind" operation takes 5-20 minutes. How many days of rewind-ability do you keep around is configurable - naturally, this feature eats hard drives for breakfast on a busy Moodle setup, and more days mean more HD wink

The only caveat is that it would not cover - or rewind - files held in Moodledata. There's a plan I've drafted with Petr for smarter file storage using a content-addressable strategy. Once that's in place, it will be able to magically "rewind" Moodledata too.
Average of ratings: -
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by Pieterjan Heyse -
omg, Martin, this sounds so super-neat!

I really think the sysadmins in this forum are all thinking ... darn, wish I'd be using/knowing/migrating to postgres...

Do you have a list of caveats/specificities of postgres, because if moodle runs better on postgres, I'm sure there are some admins out there willing to learn how to manage it, to run a better moodle (I'm sure I would want to.)...

Sorry, I could not control my enthousiasm.
In reply to Pieterjan Heyse

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -

Not many downsides (that I'm aware of!):

  • Contrib modules may have SQL errors/oddities that MySQL accepts. From v1.7 onwards, thanks to XMLDB and the new dmllib, this is rather unlikely. Still, it can happen.
  • The default PostgreSQL install usually needs a bit of tuning, mainly around telling PostgreSQL how much memory to use (it assumes it can only use something really small, like 256MB). Pg 8.3, due Real Soon Now is supposed to be a bit more autotuning in that regard. Still, expect to do a bit of tuning and testing.
  • Disable autovacuum and run it from cron.
  • For small/medium sites, it may be slower than MySQL.
  • Persistent connections are more important than under MySQL -- this makes it impractical for bulk hosting.

On the other hand

  • Migration is relatively easy. I think I've posted some notes and links to scripts in the past.
  • Never ever gets index or data corruption. No repair tables.
  • Very good scalability, specially with large sites and Moodle v1.9.
  • Cool tools like PITR wink
  • If you want to run a national election, or the DNS backend of your country... you have all the DB you're going to need wide eyes
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by Dan Poltawski -

> Disable autovacuum and run it from cron.

Oh, why? I was under the impression that autovacuum was relatively good (not that i've ever looked into it in detail).

> Migration is relatively easy. I think I've posted some notes and links to scripts in the past.

It is? Would be interesting to see them! The thought scares me! Although I guess it should 'just work' if you were to do it with the XMLDB layer.

We are traditionally a postgres shop, but I went with mysql for our moodle hosting, mostly due to migrating a large number of existing moodles from mysql hosting and the fear of such a migration ( that was pre XMLDB and we'd had v.bad recent experiences with drupal).

I think we'd like to move to postgres one day (if only so my colleagues dont kill me when they hit CTRL-C on the mysql console to quit a query), but we do use features Mysql Replication, which i've not really seen a winning alternative in postgres yet. I'd never really thought about the persistent connections problem before now either.

In reply to Dan Poltawski

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -
Autovacuum sometimes gets its timing wrong, and starts vacuuming just because the traffic slackened a bit. On large DBs this can be murder. (German Poo Cama~no, an old-school Gnome hacker, did some work on the autovacuum scheduler recently. If that gets merged, it might make things a whole lot better.)

I'm sure the MySQL-to-Pg scripts are posted / linked somewhere, the only "tricky" bit was re-setting the sequences. We have a newer one that uses XMLDB to do it across any DB implementations that Moodle supports, and any direction. But that's newish and still unpublished.

In terms of replication - does PITR not meet your needs? I tend to find MySQL replication... more configurable, but slow and somewhat flaky mixed
In reply to Dan Poltawski

Re: Please, rewind my Moodle to 10:29am this morning

by David Mudrák -
Picture of Core developers Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers Picture of Plugins guardians Picture of Testers Picture of Translators
Hi there

> > Migration is relatively easy. I think I've posted some notes and links to scripts in the past.
> It is?

I have managed to migrate from MySQL to Postgres during my 1.5 >> 1.6 upgrade (so it was a little bit complicated due to UTF8 migration). IIRC I was trying to use one of those scripts caught somewhere in the web. But finally I just made a mysqldump (single inserts with column names!), opened it in my favourite editor and performed several regexp search and replace to get correct SQL syntax. Then I just changed DB info in config.php, let Moodle to create new and empty Postgres tables and imported data from modified dump.

p.s. I like the idea of a repository based moodledata - great solution for backups as well cool
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by emanuele della torre -

Morning, i would like to do two parallel sites, i did it, but the replication tool, is "me" making export from a site and import from the other, regulary. Now my problem is that all these two sites are in an hosting server on the web. how to automate replication ? So, i did'nt find nothing possible to do with phpmydmin for mysql, and nor with phppgadmin for postgre.

am i in error?

hav i to do a mine stand alone server, on a mine machine  with static ip,dns, etc?

may on dedicated server on the web too.

I do not find any documentation on phppgadmin !

Can you help me to understand, please ?

Thanks in advance, EDT 

How about a php "dedicated" page on moodle to reade data from a db and write it on the other ?

Again, EDT

In reply to emanuele della torre

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -
I would suggest you use your DB's replication facilities. Moodle-level export/import are too slow, and not reliable enough for what you are doing.
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -

Actually - over the xmas break the office has been mostly quiet, and I've had time to implement this end-to-end.

  • A nice web UI to rewind things wink that also shows a few stats. Needs CSS love though.
  • The database is "rewindable" to any timestamp, down to the second (or microsecond, if we care to extend things a bit).
  • It covers moodledata as well -- with snapshots taken at configurable intervals (every hour is recommended), so we pick the snapshot that is at or just before the time requested.
  • Extremely cheap performance-wise - the DB snapshotting and transaction log is virtually cost-free, the moodledata snapshots have a bit of disk-io cost.
  • It is fairly space efficient, my estimate is that for a reasonable-traffic moodle, the overhead for keeping 30 days worth of rewind data is around 50%.
  • Storage can be on NFS so it could be on a 'remote' and is replication-friendly (to keep a warm-spare around very efficiently).
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by Dan Poltawski -
What approach do you take to moodledata? Hardlinked snapshots?

The moodledata side would be my 'performance cost concern' as our Moodles are getting more and more loaded up with huge numbers of media files.

Having said that, we are now storing data on ZFS and doing hourly snapshots (which basically cost nothing performance wise), so would be able to plug-in to the .zfs snapshot directory structure.
In reply to Dan Poltawski

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -
I was going down the hardlinked snapshots route, but realised I knew better wink Both moodledata and pgdata are stored in git (each in its own repo). The cost is very low for incrementals - the initial import takes a bit of time - and then we do a daily repack that is costly but saves a ton in storage.

ZFS is quite close to what I am doing, with a possibly lower cost in performance, but higher disk footprint (unless it's doing magic compression/diffing behind the scenes like git does).
In reply to Martín Langhoff

Re: Please, rewind my Moodle to 10:29am this morning

by Dan Poltawski -
>ZFS is quite close to what I am doing, with a possibly lower cost in performance, but higher disk footprint

I don't know much [anything] about how git does its packing (know any good overviews? Google fails me!).

But in the case of moodledata (/big media files) I assume it can't really help too much?

Edit: Except git would do single instancing would it?
In reply to Dan Poltawski

Re: Please, rewind my Moodle to 10:29am this morning

by Martín Langhoff -
Depends on what you store in moodledata -- if it's big media files, or large zipfiles, it's "just" going to know that they haven't changed, and store only one copy, ever. Even if you have 300 copies of it in moodledata. That in itself can save quite a bit of space.

Don't have any good readable overviews of the git packing format, sorry sad the git documentation is pretty technical.