Full Site Backups

by Ken Task -
Number of replies: 10

I didn't find a forum specifically for this, so I figured this might be the appropriate place to ask.  Scenario: a virtualized environment (VMware; virtual backups and restores not tested/trusted), no NAS or 'shared folder' on another server, and a single site for multiple campuses of a medium-sized school district that uses Moodle for flipped and blended courses (there is always traffic, almost truly 24/7).  It is always wise to do a full site backup when upgrading to a higher version ... easy enough to do on Linux (a tarball of the code directory, a tarball of the data directory, and an SQL dump, all in one script) that saves to the largest partition of the drive.  Last night I didn't think it through beforehand: I ran the tarballs and the dump from a single script on the command line while connected remotely via ssh, and waited about 2 hours for the tarball of the data directory!  (Yeah, I know, stupid me!  Sure could have used some zzzzz's instead of watching the screen and checking free space with df under watch ... yawn!)
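
A script along the lines described above might look like the following minimal sketch. The function name backup_site, the archive names, and every path are assumptions made for illustration, and the SQL dump is left as a placeholder comment because it depends on the database in use.

```shell
# Sketch: tar the code and data directories into a timestamped folder.
# All names and paths here are hypothetical, not taken from the thread.
set -eu

backup_site() {
    local code_dir="$1"
    local data_dir="$2"
    local dest_root="$3"
    local dest
    dest="$dest_root/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest"

    # -C switches to the parent directory so the archives hold relative paths.
    tar -czf "$dest/code.tar.gz" -C "$(dirname "$code_dir")" "$(basename "$code_dir")"
    tar -czf "$dest/data.tar.gz" -C "$(dirname "$data_dir")" "$(basename "$data_dir")"

    # The SQL dump would go here (the command depends on the database).

    # Print the backup folder so a caller can log or inspect it.
    echo "$dest"
}
```

If a script file (say backup.sh, a hypothetical name) ends with a call such as backup_site /var/www/moodle /var/moodledata /backup, starting it with nohup ./backup.sh > backup.log 2>&1 & lets it keep running after the ssh session closes, so nobody has to watch the screen.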

Aside from running cron before the full site backup (hopefully the trashdir will be emptied), are there directories inside moodledata which one could skip ... to reduce the size and therefore the time of a full site backup?   Obvious ones are moodledata/temp (?) and moodledata/sessions if not using DB sessions.  But how about moodledata/cache, or others one wouldn't need for a complete site restore should an upgrade fail whale for some reason?

Come to think of it, that would be a desired admin add-on tool.  Is there one?

Thanks in advance for any suggestions/thoughts.

'spirit of sharing', Ken

In reply to Ken Task

Re: Full Site Backups

by Emma Richardson -

I run in a virtual environment and actually just copy the whole vhd before backing up.  Takes a while but then I know that I can just reattach the whole hard drive and be back up and running in no time should there be a disaster.

I like the idea of trying to minimize the size of the moodledata folder but I suspect that the majority of the size of the directory comes from course resources especially if you are hosting your flipped video through Moodle.

In reply to Emma Richardson

Re: Full Site Backups

by Ken Task -

Agree that a VMware backup is or should be done regularly; trouble is, I don't admin that portion of this server ... just the guest OS and the back end of Moodle.   Yep, there are some flipped courses that are using the Moodle server as the resource.  Noticed that.

Also noticed what I'd call 'strange' behavior from dropbox repo ... in the settings it does have an option setting the cache size:

"Enter the maximum size of files (in bytes) to be cached on server for Dropbox aliases/shortcuts. Cached files will be served when the source is no longer available. Empty value or zero mean caching of all files regardless of size."

Their value is 0.   One looks in /moodledata/temp/download/repository_dropbox/ and one finds many many files like 52892f7fc14483.47875921_1384722303.tmp which appear to be all Dropbox - 404 pages!   File dates go back to when site was first used by teachers (May).  In looking at settings for all the repos they have turned on, it is the only one that has an option like that.

So how does the system admin find out if any of these files were flipped videos?

IMHO, true system admins have hands tied with the new file system and few tools to use to find out such info ... but, of course, that's why they make the big bucks [NOT!], huh?

Thanks for sharing, Ken

In reply to Ken Task

Re: Full Site Backups

by Visvanath Ratnaweera -
Hi Ken

You wrote:
> did tar balls and dump via command line via single script while connected via ssh remotely, but waited about 2 hours for the tar ball of the data directory!

I would investigate that further. What is the total size of moodledata? How many files? How busy were the host and the guest machines at that time? Could you pull moodledata to a standard workstation, Linux native, and compare the times for the same script?
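
The size and file-count questions above can be answered with standard tools. This is a minimal sketch; the function name moodledata_report is made up for illustration, and the path passed to it is whatever the site actually uses.

```shell
# Sketch: report total size, file count, and the largest top-level entries.
set -eu

moodledata_report() {
    local data_dir="$1"
    echo "Total size (KB): $(du -sk "$data_dir" | cut -f1)"
    echo "File count: $(find "$data_dir" -type f | wc -l | tr -d ' ')"
    echo "Largest top-level entries (KB):"
    # One line per top-level directory or file, biggest first.
    du -sk "$data_dir"/* 2>/dev/null | sort -rn | head -n 10
}
```

Called as moodledata_report /path/to/moodledata, it shows at a glance which directories dominate. Prefixing the tar command with time on both machines then gives comparable numbers for the same script.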

> Aside from running cron before the full site backup (hopefully the trashdir will be emptied), are there directories inside moodledata which one could skip ... to reduce the size and therefore the time of a full site backup? Obvious ones are moodledata/temp (?) and moodledata/sessions if not using DB sessions. But how about moodledata/cache, or others one wouldn't need for a complete site restore should an upgrade fail whale for some reason?

Good question. Sorry, I don't know the answer! Searching the forums found these:
- https://moodle.org/mod/forum/discuss.php?d=166792
- https://moodle.org/mod/forum/discuss.php?d=191275

Two thoughts:
- I expect these to change along major releases
- For a clean backup you should put the site in maintenance mode and then clean session data which logs out everybody

> Come to think of it, that would be a desired admin add-on tool. Is there one?

I'm not aware of such a tool. If there's one it should be a shell script, or PHP CLI at most.
In reply to Visvanath Ratnaweera

Re: Full Site Backups

by Ken Task -

Am investigating today ... Sunday afternoon here in Texas ... and already replied to Emma with one finding of the behavior of dropbox_repo.

drwxrwxrwx. 2 apache apache 1003520 Nov 17 16:15 repository_dropbox
drwxrwxrwx. 2 apache apache    4096 Jun 24 10:52 repository_flickr_public
drwxrwxrwx. 2 apache apache    4096 Nov  7 23:20 repository_googledocs
drwxrwxrwx. 2 apache apache    4096 Nov  7 09:34 repository_url
drwxrwxrwx. 2 apache apache    4096 Sep 22 16:54 repository_wikimedia

Will look for other 'weirdness'. :|  Gee, I forgot about responding to one of those other forum queries.   Thanks for the links. Will have to go back and attempt to review what I've learned about administering a Moodle! :|

Thanks for sharing, Ken

In reply to Ken Task

Re: Full Site Backups

by Andrew Lyons -

Hi Ken,

So aside from running cron, and putting the site into maintenance mode as Visvanath has suggested, there isn't a huge amount that you need to do. However, rather than using tar, I'd recommend something slightly different.

Since the moodledata contents don't ordinarily change a huge amount between versions, and since you're already setting aside the space needed for your backup tarballs, I'd suggest using rsync. rsync is a wonderful tool for this kind of thing and should be a *lot* faster than using tar since, once you've completed the first upgrade, you no longer need to copy all the data - only those bits that have changed. The first time you run a full site upgrade, you can run a full rsync:

rsync -avi /path/to/src /path/to/backup

Then, unless you have a pressing requirement to keep your backups after your upgrade has completed, when you do any subsequent upgrade, you can run:

rsync -avi --delete /path/to/src /path/to/backup

As long as you don't delete the backup directory between upgrades, it should be nice and fast. If you do (for some unknown reason) need to keep a backup hanging around for posterity, you can do so from your backup directory. This means that you can run your completed backup, and your complete upgrade, in a relatively short period of time.

Additionally, you can start the rsync when your site is still live. This will mean that you can bring the delta right down before putting it into maintenance mode.

As an example, when upgrading from 2.4 to 2.5, I perform an initial sync when the server is still in production:

rsync -avi /srv/www/mysite.com/data/moodledata /srv/www/mysite.com/data/backup/moodledata

And then because that took a few hours and I have a busy site, I'll run the rsync again (maybe in a loop because I want the delta to stay low):

rsync -avi --delete /srv/www/mysite.com/data/moodledata /srv/www/mysite.com/data/backup/moodledata
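
The "maybe in a loop" idea could be sketched as follows. The function name sync_until_quiet, the default threshold, and the pass limit are assumptions. Note that this sketch puts a trailing slash on both paths, so rsync copies the contents of the source straight into the destination rather than creating a directory inside it.

```shell
# Sketch: repeat rsync until a pass changes few enough items.
set -eu

sync_until_quiet() {
    local src="$1"
    local dest="$2"
    local threshold="${3:-10}"    # stop once a pass changes this few items
    local max_passes="${4:-5}"    # give up after this many passes
    local pass=1
    local out changed
    while [ "$pass" -le "$max_passes" ]; do
        # -i itemizes one line per changed or deleted item.
        out=$(rsync -ai --delete "$src/" "$dest/")
        changed=$(printf '%s\n' "$out" | sed '/^$/d' | wc -l | tr -d ' ')
        echo "pass $pass: $changed items changed"
        if [ "$changed" -le "$threshold" ]; then
            return 0
        fi
        pass=$((pass + 1))
    done
    return 1
}
```

A non-zero return means the site was still changing faster than the threshold after the last pass, which is a hint to pick a quieter time before switching to maintenance mode.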

Now I put my site into maintenance mode, and do it again, and then run the upgrade:

php admin/cli/maintenance.php --enable

rsync -avi --delete /srv/www/mysite.com/data/moodledata /srv/www/mysite.com/data/backup/moodledata
# DB backup goes here
# commands to upgrade go here
php admin/cli/upgrade.php
php admin/cli/maintenance.php --disable
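
As a rough illustration of the sequence above with the DB backup filled in, here is a sketch that only prints the commands unless DRY_RUN is set to 0. It assumes MySQL with InnoDB tables and credentials in ~/.my.cnf; the database name, the code directory, the dump file name, and the DRY_RUN switch are all hypothetical. The rsync line uses trailing slashes so the contents map one-to-one onto the backup directory.

```shell
# Sketch: maintenance mode, file sync, DB dump, upgrade, back online.
set -eu

DRY_RUN="${DRY_RUN:-1}"                                 # 1 = print only
MOODLE_DIR="${MOODLE_DIR:-/srv/www/mysite.com/htdocs}"  # hypothetical
DATA_DIR="${DATA_DIR:-/srv/www/mysite.com/data/moodledata}"
BACKUP_DIR="${BACKUP_DIR:-/srv/www/mysite.com/data/backup}"
DB_NAME="${DB_NAME:-moodle}"                            # hypothetical

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_with_backup() {
    run php "$MOODLE_DIR/admin/cli/maintenance.php" --enable
    run rsync -a --delete "$DATA_DIR/" "$BACKUP_DIR/moodledata/"
    # --single-transaction gives a consistent dump of InnoDB tables.
    run mysqldump --single-transaction --quick \
        --result-file="$BACKUP_DIR/moodle.sql" "$DB_NAME"
    # Replacing the code with the new version would happen here.
    run php "$MOODLE_DIR/admin/cli/upgrade.php" --non-interactive
    run php "$MOODLE_DIR/admin/cli/maintenance.php" --disable
}
```

Because of set -e, a failed backup or upgrade stops the script before maintenance mode is disabled, which leaves the site offline until someone has looked at the error.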

Then from 2.5 to 2.5.1:

rsync -avi --delete /srv/www/mysite.com/data/moodledata /srv/www/mysite.com/data/backup/moodledata

Then from 2.5.1 to 2.6:

rsync -avi --delete /srv/www/mysite.com/data/moodledata /srv/www/mysite.com/data/backup/moodledata

etc.

In addition to making this much faster, it also means that, in the unlikely event that it all does indeed go belly up, you can revert to your backup *very* fast too:

rsync -avi --delete /srv/www/mysite.com/data/backup/moodledata /srv/www/mysite.com/data/moodledata
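
A detail worth checking before running any of the commands above: rsync treats a source path differently depending on whether it ends in a slash. Without one, the directory itself is created inside the destination; with one, only its contents are copied. The sketch below shows the difference on throwaway directories (the function name and the temporary paths are illustrative only).

```shell
# Sketch: compare rsync with and without a trailing slash on the source.
set -eu

show_trailing_slash() {
    local work
    work=$(mktemp -d)
    mkdir -p "$work/moodledata/filedir" "$work/nested" "$work/flat"
    echo demo > "$work/moodledata/filedir/file"

    # No trailing slash: the file lands in nested/moodledata/filedir/
    rsync -a "$work/moodledata" "$work/nested"
    # Trailing slash: the file lands in flat/filedir/
    rsync -a "$work/moodledata/" "$work/flat"

    # List where the copies ended up, then clean up.
    (cd "$work" && find nested flat -type f | sort)
    rm -rf "$work"
}
```

With --delete in play this matters for a restore, since files are removed relative to whichever directory rsync believes it is synchronizing. Adding -n (dry run) to a restore command first shows what would be copied and deleted without touching anything.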

If you read the manpage for rsync, you'll find that you can also pass it exclusions - I'd recommend the cache and temp directories.
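
The exclusions mentioned above are passed with rsync's --exclude option. A minimal sketch, assuming the cache and temp directories sit at the top level of moodledata; the function name sync_moodledata is made up for illustration.

```shell
# Sketch: sync moodledata while skipping the cache and temp directories.
set -eu

sync_moodledata() {
    local src="$1"
    local dest="$2"
    # A leading slash anchors each pattern at the top of the transfer,
    # so only the top-level cache/ and temp/ directories are skipped.
    rsync -a --delete \
        --exclude='/cache/' \
        --exclude='/temp/' \
        "$src/" "$dest/"
}
```

Excluded paths are also protected from --delete on the destination side, so an old copy of cache or temp left in the backup is not removed unless --delete-excluded is added.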

Hope that this helps you in your moodlequest :)

Andrew

Average of ratings: Useful (1)
In reply to Andrew Lyons

Re: Full Site Backups

by Ken Task -

Imagine, if you will, the original poster slapping himself in the front of the head (DUH!) ... uhhhh, that's me!  Andrew, thanks for reminding me about rsync (I have used it many times, but somehow focused on tarballing everything for backups).   Yes, a restore from rsync, if needed, would be much faster.   Will look into the options for excluding.  Might have two sets of backup scripts now ... one for updating within a series (2.4.1 to 2.4.2 kinda thing) and one for upgrading (2.4.x to 2.5.x).

Thanks for sharing, Ken

 

In reply to Ken Task

Re: Full Site Backups

by Andrew Lyons -

IIRC, it's the --exclude option (not -x, which is --one-file-system), but all of the above has been from the top of my head without reading the man page - read before you use!

You shouldn't need any different scripts for minor and major releases really - why would you need two?

Andrew

In reply to Andrew Lyons

Re: Full Site Backups

by J S -

Some other options for quick backups include:

- snapshotting the VMDKs via VMware (you mentioned you don't have access, but you could work with your administrator to do this or ask for access)

- using filesystem or LVM snapshots (if your filesystem supports this)

- using alternative tools to tar or rsync to speed the process up (pigz, parallel, etc)

- logically breaking up the moodledata dir and running multiple tars at the same time (poor man's parallel processing); there won't be much gain, though, if all your data falls into one folder
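
As one illustration of the pigz option in the list above, tar can write an uncompressed stream that is then compressed on all CPU cores. This is a sketch under assumptions: the function name tar_compressed is made up, and it falls back to plain gzip when pigz is not installed.

```shell
# Sketch: compress a tar stream with pigz when present, else gzip.
set -eu

tar_compressed() {
    local src_dir="$1"
    local archive="$2"
    local compressor=gzip
    if command -v pigz >/dev/null 2>&1; then
        compressor=pigz    # multi-threaded, writes gzip-compatible output
    fi
    tar -cf - -C "$(dirname "$src_dir")" "$(basename "$src_dir")" \
        | "$compressor" > "$archive"
}
```

This only helps when compression is the bottleneck. If the disk itself is slow, as suggested below, the archive will take about as long either way.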

You mentioned that the tar took 2 hours or more.  That sounds pretty normal if you have a fairly sizable moodledata directory.  However, it's possible that your disk is just slow.

In reply to Ken Task

Re: Full Site Backups

by Adam Durana -

Rsync is a really nice option, but you might also want to check out Duplicity, http://duplicity.nongnu.org.  Duplicity even uses the Rsync algorithm via librsync.

The advantage that I see to using Duplicity over Rsync'ing to a backup directory is that Duplicity allows you to keep separate incremental backups in addition to full backups.  This means that you get the same speed as Rsync after you do an initial full backup and you are able to restore to any point in time that you have an incremental backup for.  You could use Duplicity for both your routine backups and your pre-upgrade backups, instead of having separate methods for each backup.

For example, you might set things up to do a routine full backup every Sunday, and then do incremental backups every day except Sunday.  Then when you are ready to do an upgrade, put the site in maintenance mode, do an incremental backup, and then do the upgrade.  You'll have daily backups, and backups from right before upgrades.  You'll be able to restore to any of those backups and not just the last one like with Rsync.
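
The Sunday-full, otherwise-incremental schedule described above could be driven by a small wrapper like this sketch. It only prints the duplicity command rather than running it; the function names, the source path, and the file:// target are hypothetical.

```shell
# Sketch: choose a full backup on Sunday and an incremental otherwise.
set -eu

duplicity_mode_for_day() {
    # $1 is the weekday from `date +%u` (1 = Monday ... 7 = Sunday)
    if [ "$1" -eq 7 ]; then
        echo full
    else
        echo incremental
    fi
}

duplicity_command() {
    local weekday="$1"
    local src="$2"
    local target_url="$3"
    echo "duplicity $(duplicity_mode_for_day "$weekday") $src $target_url"
}

# Print (not run) today's command; both paths are made-up examples.
duplicity_command "$(date +%u)" /srv/moodledata file:///backup/moodledata
```

Duplicity encrypts with GnuPG by default, so a real run also needs a passphrase or key configured, or the --no-encryption option. The pre-upgrade backup is then just one more incremental run after enabling maintenance mode.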

Duplicity also interfaces with external services like S3 so doing off-site backups is really easy.

Average of ratings: Useful (1)
In reply to Adam Durana

Re: Full Site Backups

by Albert Ramsbottom -

I have been reading this thread and it's very interesting. No mention of MySQL backups and restores, though! We are looking at Percona XtraBackup for this. In testing, a 9.6GB MySQL database took 3 minutes to dump and 4 minutes to restore. Using mysqldump, it took 30 minutes to dump and just over two hours to restore.

We did have an issue with the ibdata1 file becoming too large, so we do need to give this some more attention before we decide to go with this solution.

For me it has to be rsync as Duplicity is still in Beta form and is very unlikely to be allowed to be used in a medium to large organisation. We for instance have to justify all technologies used whenever we need to introduce something new.