Making backup more reliable

by Colin Chambers -
Because Offline Moodle will rely heavily on backup, we've commissioned a project with Catalyst to add incremental backup and restore capability to the backup component. The development work is scheduled to end around mid-June. Since so much depends on it, we also need to make sure the backup and restore process is reliable.

We've asked Catalyst to improve backup reliability, but many of the issues we initially raised have now been fixed. That's good to know, but it raises the question of what we should focus on now. Obviously there's no point fixing things that other people are already fixing.

For the Offline Moodle project we're initially focusing on delivering only 3 core modules: resource, label and forum. So we're really interested in bugs that affect all modules, bugs in the backup and restore process as a whole, or bugs affecting these specific modules. We'll address bugs in the other modules when we start looking at those modules.

So far we've asked Catalyst to look at bug MDL-12037, which asks for improved error reporting when things go wrong so that it's easier to fix the problem. We've found that, particularly with the module backups, there can be issues that are hard to track down, so it makes sense to address this.

Tim Hunt just brought bug MDL-14302 to my attention. While this is listed as a quiz backup issue, it seems likely it could affect all large backups. The problem, as I understand it, comes down to the main difference between DOM and SAX in reading XML: DOM builds the whole document in memory, whereas SAX processes it as a stream. Since the restore process reads the XML into a DOM using the xmlize function in /lib/xmlize.php, the memory required becomes too large for big backups. This therefore appears to be a bug worth tackling, since it affects all modules.
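
To make the DOM versus SAX difference concrete, here is a minimal sketch. It is written in Python for brevity and is not Moodle code; PHP's xml_parser functions and XMLReader offer the same streaming style. The element names (BACKUP, MOD, ID) and the tiny sample are hypothetical stand-ins for a real backup file.

```python
import io
import xml.sax
from xml.dom import minidom

# Hypothetical, tiny stand-in for a backup file; real backups are far larger.
SAMPLE = (
    b"<BACKUP>"
    b"<MOD><ID>1</ID></MOD>"
    b"<MOD><ID>2</ID></MOD>"
    b"<MOD><ID>3</ID></MOD>"
    b"</BACKUP>"
)


def count_with_dom(data, tag):
    """DOM style: the whole document becomes a tree before we can query it."""
    tree = minidom.parseString(data)
    return len(tree.getElementsByTagName(tag))


class TagCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls us once per element; no tree is built."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.count = 0

    def startElement(self, name, attrs):
        if name == self.tag:
            self.count += 1


def count_with_sax(stream, tag):
    """Stream the document through the handler and return its tally."""
    handler = TagCounter(tag)
    xml.sax.parse(stream, handler)
    return handler.count


dom_count = count_with_dom(SAMPLE, "MOD")
sax_count = count_with_sax(io.BytesIO(SAMPLE), "MOD")
print(dom_count, sax_count)
```

Both functions give the same answer, but only the DOM version has to hold the full document in memory, which is where large backups would run into trouble.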

So if you have any major concerns you'd like us to address while we have resource committed, please raise them and we can discuss the pros and cons.
In reply to Colin Chambers

Re: Making backup more reliable

by Dan Marsden -
I'm wondering if we should look at using the new DOM extension in PHP 5 for Moodle 2.0. It would be interesting to see whether that makes much of a difference in memory usage.

http://nz.php.net/manual/en/intro.dom.php

Dan

In reply to Dan Marsden

Re: Making backup more reliable

by Dan Marsden -
PHP 5's SimpleXML might be another alternative. I wonder how it compares in terms of performance.
In reply to Colin Chambers

Re: Making backup more reliable

by Dan Marsden -
...me needs to research before posting... This is an interesting article on processing large XML files (although his test file isn't all that 'large'):

http://blog.liip.ch/archive/2004/05/10/processing_large_xml_documents_with_php.html

Dan
In reply to Dan Marsden

Re: Making backup more reliable

by Martín Langhoff -
Still -- all this DOM stuff is memory-bound. The only memory-efficient way to deal with XML is to use stream parsers mixed
In reply to Martín Langhoff

Re: Making backup more reliable

by Colin Chambers -
Yes, that's my experience. DOM is always memory-intensive given what it does. So if you want to use its features, you can only work with relatively small XML nodes rather than the whole document. Just loading the document into memory can add a significant lag.

So a balance between SAX and DOM could be a compromise. That depends on how good the PHP tools for working with XML are. I think that's the kind of thing the article is getting at.
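
A rough sketch of that kind of compromise: stream through the file, but build a small tree for one record at a time and discard it once it has been handled. This is Python using the standard library's iterparse, purely as an illustration; it is not the restore code, and the element names (BACKUP, MOD, NAME) and the iter_records helper are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; each MOD is one small, self-contained record.
SAMPLE = (
    b"<BACKUP>"
    b"<MOD><NAME>resource</NAME></MOD>"
    b"<MOD><NAME>label</NAME></MOD>"
    b"<MOD><NAME>forum</NAME></MOD>"
    b"</BACKUP>"
)


def iter_records(stream, tag):
    """Yield each completed `tag` element as a small tree, then empty it."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            # The caller gets tree-style access to this one record only.
            yield elem
            # Drop the record's children and text once the caller is done,
            # so only an empty shell stays attached to the parent.
            elem.clear()


names = [rec.findtext("NAME") for rec in iter_records(io.BytesIO(SAMPLE), "MOD")]
print(names)
```

The caller still gets convenient tree access (findtext and friends) inside each record, while peak memory is governed by the size of the largest record rather than the whole file.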

On other projects in Moodle I've chosen to use XML as the data layer, but used the serialise and unserialise methods of the PEAR XML objects to get native PHP, which should behave better programmatically and with memory.
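
To illustrate the general idea of unserialising XML into native data, here is a small Python sketch. It does not reproduce the PEAR API; the to_native helper and the FORUM/NAME/POST element names are hypothetical, and attributes are ignored to keep it short.

```python
import xml.etree.ElementTree as ET


def to_native(elem):
    """Convert an element into plain strings, dicts and lists."""
    children = list(elem)
    if not children:
        # Leaf element: return its text content.
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = to_native(child)
        if child.tag in result:
            # Repeated tag: collect the values in a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Hypothetical record, small enough to convert in one go.
record = ET.fromstring(
    "<FORUM><NAME>News</NAME><POST>Hello</POST><POST>World</POST></FORUM>"
)
native = to_native(record)
print(native)
```

Once the data is in native structures, the rest of the code no longer needs to touch the XML tree at all.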

I can imagine that approach is too late in the day, though, since XML is used throughout backup.