Backup Large Course - work-around? ...

by Ken Task -
Number of replies: 11

Just spent an hour or two with a troublesome course backup and thought I'd share a 'discovery'.  The key part is bolded below.

Moodle 2.4
RHEL 6.5 - 16Gig mem.  Plenty of space remaining.
PHP 5.3.3

Backup of a large course failed - temp file backup (.mbz) that was being written
disappeared.  The log file that remained shows:

[root@elearning backup]# cat 19977830a7faa33f05211b8873975a50.log
[Thu 16 Jan 2014 08:18:57 PM GMT] [error] backup_auto_failed_on_course SS K-12
[Thu 16 Jan 2014 08:18:57 PM GMT] [error]   Exception: error_zip_packing


Yet in moodledata/temp/backup/[tempfoldername], all the folders and files created for the construction of the .mbz were there, including the key moodle_backup.xml file.

cd moodledata/temp/backup/[tempfoldername]
Run head moodle_backup.xml to get the file name Moodle would have given the backup.
Then while in that directory:

zip -r backup-moodle2-course-24-ss_k-12-20140116-1945.mbz *

One could give a path in front of the .mbz file.

which will 'expand' into:

zip -r backup-moodle2-course-24-ss_k-12-20140116-1945.mbz activities completion.xml course files files.xml gradebook.xml groups.xml moodle_backup.xml outcomes.xml questions.xml roles.xml scales.xml sections users.xml

zip will create a temp name in that directory ...

-rw-------.   1 root root 14650963476 Jan 16 20:57 ziXqv59d

this one did finish:

-rw-r--r--.   1 root root 15558171441 Jan 16 20:58 backup-moodle2-course-24-ss_k-12-20140116-1945.mbz

One **cannot copy it** but one can *move* it.   Say like to the designated directory when doing automated backups and saving to only that designated directory.  Think the PHP routine attempts to copy rather than move.


Wonder IF PHP can do a move rather than a copy?

Then clean up that [tempdirectory]:

cd ../
rm -fR 19977830a7faa33f05211b8873975a50
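
The manual steps above could be wrapped in a small shell sketch like the one below. It is only an illustration, not a tested tool: backup_name and finish_backup are made-up helper names, and it assumes that the first <name> element in moodle_backup.xml holds the backup file name and sits on a single line.

```shell
#!/bin/sh
# Sketch only: finish a failed course backup by hand.
# backup_name and finish_backup are hypothetical helper names.

# Print the backup file name recorded in moodle_backup.xml.
# Assumption: the first <name> element in the file is the backup name.
backup_name() {
    sed -n 's:.*<name>\(.*\)</name>.*:\1:p' "$1/moodle_backup.xml" | head -n 1
}

# Zip the contents of temp directory $1 and move the archive into $2.
finish_backup() {
    tmpdir=$1
    destdir=$2
    name=$(backup_name "$tmpdir")
    [ -n "$name" ] || { echo "no backup name found in $tmpdir" >&2; return 1; }
    # Build the archive inside the temp directory, as in the manual steps,
    # then move (not copy) it to the destination directory.
    (cd "$tmpdir" && zip -r -q "$name" *) || return 1
    mv "$tmpdir/$name" "$destdir/"
}
```

Usage would be something like finish_backup moodledata/temp/backup/[tempfoldername] /path/to/backups. Removing the temp directory is deliberately left as a manual step, to be done once the .mbz has been checked.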

Also noticed, the zip command from CLI deflates files which is nothing new for cli.  Wonder if the PHP routine does that?

  adding: files/38/38f70e63a58bc7f4b233888a9351382887070866 (deflated 11%)
  adding: files/38/38958315323edc507e1ce3b17df03cdcee0c08c9 (deflated 31%)
  adding: files/38/38762eecd19f4ed552605a807f05815bfe17893e (deflated 3%)
  adding: files/38/38ce0a5921f0d7595392cf24d08f04813bd73256 (deflated 10%)

May not get much deflation for digital content like videos, but for other types of documents there
is (sometimes) lots of 'air' that can be removed.

All this to say, if a backup fails, all may NOT be lost!   Work is required, however! :\

Restoring, however, now necessitates the use of a File System repository for such large backups - and (maybe) some re-tweaking of PHP settings before one attempts a restore.
This does assume the .mbz files created are valid zips.  On checking, zip does appear to be able to read/list the files.
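
One way to run that kind of check from the shell, sketched under the assumption that the Info-ZIP unzip tool is installed (check_mbz is a made-up name):

```shell
#!/bin/sh
# Sketch only: test whether a zip-format .mbz can be read end to end.
# check_mbz is a hypothetical helper name.
check_mbz() {
    # -t tests every entry against its stored CRC, -q keeps the output short.
    # The exit status is non-zero if the archive cannot be read.
    unzip -tq "$1"
}
```

For example: check_mbz backup-moodle2-course-24-ss_k-12-20140116-1945.mbz && echo "archive reads OK". To list the contents instead of testing them, unzip -l does the job.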

This is for what it's worth ... worked for me ... hope it works for anyone/everyone that finds themselves in a similar situation.

'spirit of sharing', Ken

Average of ratings: Useful (2)
In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Jordi Pujol-Ahulló -

Hi!

I found myself in the same situation, in my Moodle 2.4.

I performed the zip operation manually and it worked OK for me too. It took 11 minutes to zip a raw 11 GB (the site course).

Thanks for your information!

Jordi

In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Jordi Pujol-Ahulló -

Actually, since we have so many big courses, I found it worthwhile to automate this task: manually building the target zips that build correctly from the command line but not from the PHP script.

The server runs Solaris. I wrote this script to help me, hoping it may be of interest:

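<?php
// Finish automated course backups whose zip step failed: every temp backup
// directory whose log contains an error is zipped from the command line.

define('CLI_SCRIPT', true);

require_once __DIR__ . '/config.php';

ini_set('display_errors', true);
ini_set('error_reporting', E_ALL | E_STRICT);

global $CFG, $DB;

$nonzipped = array();

// Fallback for files so big that filemtime() fails: prints the mtime as YYYYmmddHHMM.
$modtimecommand = 'perl -MPOSIX -le \'print strftime "%Y%m%d%H%M", localtime((lstat)[9]) for @ARGV\' ';
// Collect the backup logs that recorded an error.
exec('grep -l error ' . escapeshellarg($CFG->dataroot) . '/temp/backup/*.log', $nonzipped);

// Destination directory for automated backups (a config.php value wins over the stored setting).
$backupdir = (isset($CFG->backup_auto_destination))
    ? $CFG->backup_auto_destination
    : get_config('backup', 'backup_auto_destination');

foreach ($nonzipped as $logfile) {
    // The temp directory has the same name as its log file, minus ".log".
    $dir = substr($logfile, 0, -strlen('.log'));
    echo '### Processing unfinished backup course directory: ' . $dir . PHP_EOL;
    $xmlfile = $dir . '/moodle_backup.xml';
    $xml = simplexml_load_file($xmlfile);
    if ($xml === false) {
        echo '*** Could not load xml file: ' . $xmlfile . PHP_EOL;
        continue;
    }
    // moodle_backup.xml holds the file name Moodle would have given the backup.
    $targetzip = $backupdir . '/' . (string)$xml->information->name;
    if (file_exists($targetzip)) {
        $zipmodtime = filemtime($targetzip);
        if ($zipmodtime === false) {
            // Take both times from perl so they are in the same format.
            $zipmodtime = exec($modtimecommand . escapeshellarg($targetzip));
            $dirmodtime = exec($modtimecommand . escapeshellarg($dir . '/.'));
        } else {
            $dirmodtime = filemtime($dir . '/.');
        }
        if ((int)$zipmodtime > (int)$dirmodtime) {
            echo '*** Skipping. Zip is up-to-date: ' . $targetzip . PHP_EOL;
            continue;
        }
    }
    // Paths are quoted so spaces or shell metacharacters cannot break the command.
    $command = 'cd ' . escapeshellarg($dir) . ' && time zip -r ' . escapeshellarg($targetzip) . ' *';
    echo '===> Running command: ' . $command . PHP_EOL;
    // passthru() sends the output of zip straight to this script's output.
    passthru($command);
}

exit(0);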

There are some tricks in there:

  1. The $modtimecommand is used to get the modification time of a file that is so big that filemtime does not work properly (and returns false).
  2. By using passthru instead of exec, the standard output of the command is passed through to the output of our PHP script.
  3. I save this script as post-automated-backups.php in the MOODLE HOME and run it like this: time php post-automated-backups.php | tee post-automated-backups.log. This way we get the total time spent executing all of it at once.

Probably you can build your own automated_backup.php script to just run:

  1. the admin/cli/automated_backups.php script, and
  2. this post-automated-backups.php script, to complete unfinished zips.
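
The two steps just listed could also be chained from a small shell wrapper instead of a PHP one. This is a minimal sketch, not part of the original script: run_backups is a made-up name, and the PHP variable (defaulting to php) is an assumption added so the interpreter can be swapped.

```shell
#!/bin/sh
# Sketch only: run Moodle's automated backups, then the completion script.
# run_backups is a hypothetical name; $1 is the Moodle code directory.
# Set PHP to override the interpreter (defaults to "php").
run_backups() {
    moodle_home=$1
    (
        cd "$moodle_home" || exit 1
        # Step 1: the stock CLI automated backups.
        "${PHP:-php}" admin/cli/automated_backups.php
        # Step 2: zip whatever step 1 left unfinished. It runs even if
        # step 1 reported errors, since failed backups are what it handles.
        "${PHP:-php}" post-automated-backups.php
    )
}
```

It would be called as run_backups /path/to/moodle, for instance from cron, with the output piped through tee as in trick 3 if a log is wanted.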

Hoping this helps!

Jordi

Average of ratings: Useful (2)
In reply to Jordi Pujol-Ahulló

Re: Backup Large Course - work-around? ...

by Ken Task -

Thanks for sharing back! ;)

Ken


In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Colin Fraser -

Hi Ken, have you seen the suggestion to use Site Administration > Development > Experimental Settings > Enable New Backup format? Apparently it uses the tar.gz compression format.

AFAIK, it removes the size restriction of the zip compression format. It may be a more efficacious method for large backups.   

Average of ratings: Useful (1)
In reply to Colin Fraser

Re: Backup Large Course - work-around? ...

by Ken Task -

Yes, Colin, thanks!   Have turned on the experimental and run it on a 2.8 version.   It does appear to be faster.

For folks that do this ... the routine doesn't change the extension of the backup file ... still .mbz.  IF, however, you've learned about renaming the file to a zip extension then running unzip, that will fail ... duh!  Well - yeah!

The file is now a tar.gz ... which means to un-compress/expand/extract contents, one must use

tar -zxvf nameofbackup.mbz
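
Since the extension no longer says which format a given .mbz is in, one can look at the first two bytes of the file instead. A minimal sketch (mbz_format is a made-up name): zip archives start with the bytes "PK" (hex 50 4b) and gzip streams start with hex 1f 8b.

```shell
#!/bin/sh
# Sketch only: report whether an .mbz is a zip or a tar.gz.
# mbz_format is a hypothetical helper name.
mbz_format() {
    # Read the first two bytes as hex with no offsets, e.g. "504b".
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \t\n')
    case $magic in
        504b) echo zip ;;      # "PK": extract with unzip
        1f8b) echo tar.gz ;;   # gzip: extract with tar -zxvf
        *)    echo unknown ;;
    esac
}
```

For example, mbz_format nameofbackup.mbz prints zip or tar.gz, which tells you whether to reach for unzip or tar.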

Am experimenting, right now, with what appears to be a 14Gig backup, with the command line backup.php script found in moodlecode/admin/cli/ to see if it checks for the experimental setting or not.

Still think there is an issue with the method by which large .mbz's are 'copied' from the moodledatadir/temp/backup/[hashiddirectory]/backup.zip to the designated area ... filedir, or alternative directory as given in auto_backups when manually run.

As root user on a system, it shows:

file size               (blocks, -f) unlimited

So there's some setting for PHP ... maybe?

Have found what appears to be all of the xml files and their related folders in such a temp directory with what appears to be a backup.mbz that couldn't be copied.   One could, however, manually 'mv' the file.

So there's a limit on 'cp' as well ... me thinks.

Wow ...

drwxrwxrwx. 258 root root        4096 Mar 19 15:36 files
-rw-rw-rw-.   1 root root     4285026 Mar 19 16:01 files.xml

Naturally, IF one can successfully create a valid backup file, that doesn't necessarily mean one will be able to restore it.   Uploading via normal means not possible.   File System repo only way to go.

Hmmmm ... feature request?  Now that there is a backup.php for backing up courses, will there be, in the future, a restore.php?    Hoping so.

'spirit of sharing', Ken


Average of ratings: Useful (1)
In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Jordi Pujol-Ahulló -

That's really interesting, both to know about the tar.gz compression method and the tests you already did.

We are on Moodle 2.4, and this feature is not available there yet. However, I'll check out this feature in Moodle 2.8 or so.

Thank you very much!

In reply to Jordi Pujol-Ahulló

Re: Backup Large Course - work-around? ...

by Ken Task -

True, Moodle 2.4 doesn't include an admin/cli/backup.php script, but like I said (maybe not clearly enough or in a related post), one could acquire the code for a 2.7 or 2.8 in a test directory (via git right on the server), then copy the backup.php script to the older version's moodlecode/admin/cli/ directory.

Might work in 2.4 as long as all the files referenced in /backup/util/includes/backup_includes.php are present on the system.

Like I mentioned, does work in a 2.6 (and that didn't include the backup.php script either).

BTW, did run my experiment again ... with experimental turned on, the backup.php script did produce a smaller .mbz file - 14Gig -> 12Gig.   But, still didn't complete the last step ... that of moving the file to the area for backups ... alternate directory IF running autobackups manually and designating a directory or in the backup file area (which means filedir).

Definitely desire to keep that size of backup out of filedir ... that's a mess as it is now without adding to it.

IMHO, it sure would be nice for the true Admin user (user ID 1) to be the only one that could have access to a tool to at least help locate course backups in filedir and their sizes.   Home grown script the only way right now.
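
As a rough idea of what such a home grown script might start from, here is a sketch that lists the big files under a directory such as moodledata/filedir. It is an illustration only: list_large_files is a made-up name, and it works on size alone. Because filedir stores files under their content hashes, it cannot tell which of the big files are course backups; that mapping lives in Moodle's files table.

```shell
#!/bin/sh
# Sketch only: list files larger than a given size, smallest first.
# list_large_files is a hypothetical helper name.
# $1: directory to search, $2: minimum size in MiB.
list_large_files() {
    # find -size counts 512-byte blocks, so 1 MiB is 2048 blocks.
    # Field 5 of ls -l output is the size in bytes.
    find "$1" -type f -size +"$(( $2 * 2048 ))" -exec ls -l {} + | sort -k5,5n
}
```

For example, list_large_files /path/to/moodledata/filedir 1024 would show everything over 1 GiB.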

Ran into a course from H---- not long ago ... Digital Photography ... teacher backed up every week for a period of time due to assignments.   Backup increased by about 1Gig after each assignment.  Course finally reached limits and required manual manipulation.

Have a site from H---- right now that has several courses designed for the entire staff of a large school district - all teachers have accounts - and the courses are resources related to/and for teachers only.  Massive amounts of PDF/Word Docs acquired from State Education Agency per grade level/subject.   Each year they update those courses and rather than deleting links to older/out-of-date files (and the files to which they link), they hide them!   So they have documents from the year 2012 still there that are no longer applicable.   Easier to hide rather than delete ... so am told.   Oops ... sorry for the soap box there ... my problem, and just sharing that situation 'cause it's related.

'spirit of sharing', Ken



In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Jordi Pujol-Ahulló -

Hi Ken!

In our Moodle instance, we set up capabilities in such a way that the editingteacher role can neither back up nor restore Moodle courses. That way, you automatically prevent teachers and other staff from making periodic backups.

Yep... I am making an assumption: your institution has a corresponding backup system (in any form) so that you can restore the whole Moodle instance in case of failure. Individual losses (due to human error) are not addressed by this mechanism though, since this backup would be of the whole Moodle, not just a course.

Best,

Jordi


In reply to Jordi Pujol-Ahulló

Re: Backup Large Course - work-around? ...

by Ken Task -

Totally understand.  Entity with which I am working, that is totally un-acceptable.   Can't blame them ... normally their network/workstations are in lock down mode and Moodle is about the only thing where they aren't 'restricted'.   Personally, they need to be empowered to facilitate use of tech in their classroom.  Powers that be don't quite see it that way, however.  So it's Yin/Yang.

But thanks for the tip ... others might be able to set those permissions without disruption and none the wiser 'cept the server admin person (most oft left out of the planning/'dreams' concerning usage, etc.).

'spirit of sharing', Ken




In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Colin Fraser -

Interesting perceptions... the need to control is not bounded by other considerations. The people who most need to be involved are most often excluded. Funnily enough, the same things happen here too. Must be a reason for it. Although, a sad indictment of the human condition, don't you think?

In reply to Ken Task

Re: Backup Large Course - work-around? ...

by Ken Task -

Follow Up ... using the command line backup.php script ...

from moodlecode/admin/cli/

nohup php backup.php --courseid=86 --destination=/home/backup/ &

nohup means no hang up ... i.e., the script keeps running after the terminal hangs up.
Be sure to use the & at the end because that tells the system to
put the command in the background immediately.   This allows one to log out.
OR if your terminal session disconnects (due to time outs for shells), the
process won't be killed; it runs until it's completed.

One can see the output of what the backup script would have shown via:

cat nohup.out

== Performing backup... ==
Writing /home/backup/backup-moodle2-course-25-k-12_math-20150320-0748.mbz
Backup completed.

== Performing backup... ==
Writing /home/backup/backup-moodle2-course-86-techframe-20150320-0751.mbz
Backup completed.
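
If several backups are started this way, they all append to the same nohup.out. A small sketch of a variation that gives each run its own log file (start_detached is a made-up name, not something Moodle provides):

```shell
#!/bin/sh
# Sketch only: start a command detached from the terminal with its own log.
# start_detached is a hypothetical helper name.
# $1: log file, remaining arguments: the command to run.
start_detached() {
    logfile=$1
    shift
    # nohup makes the command ignore the hang-up signal; & backgrounds it.
    # Both stdout and stderr go to the chosen log instead of nohup.out.
    nohup "$@" > "$logfile" 2>&1 &
    # Print the process id so the run can be checked on later.
    echo $!
}
```

From moodlecode/admin/cli/ it could be used like this: start_detached backup-86.log php backup.php --courseid=86 --destination=/home/backup/ and then cat backup-86.log to see how it went.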

'spirit of sharing', Ken

Average of ratings: Useful (2)