Backup policy

by Albert Ramsbottom -

I have a rather large 7TB Moodle data folder to back up on a regular basis: twice a day, 7 days a week, 5 times a month, 13 times a year, etc.

Now this is massive; it's a Moodle 1.9 installation that has had its data extracted and placed into a Moodle 3.4 installation.

Anyway, when I programmatically back up all my elements, is there a way of excluding /moodledata/cache, /moodledata/localcache, /moodledata/sessions, /moodledata/temp, or /moodledata/trashdir?

That would considerably improve both the performance of the backup and the size of each moodledata copy.
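To illustrate the kind of thing I am after (the paths and output filename here are only examples, not our actual setup), something along these lines:

# run from the directory that contains moodledata
tar -cvf /backup/moodledata-$(date +%Y%m%d).tar \
    --exclude='moodledata/cache' \
    --exclude='moodledata/localcache' \
    --exclude='moodledata/sessions' \
    --exclude='moodledata/temp' \
    --exclude='moodledata/trashdir' \
    moodledata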


Thanks

In reply to Albert Ramsbottom

Re: Backup policy

by Ken Task -

Sounds like a job for a progressive rsync with --delete, run only on the filedir directory of moodledata, to a large archive drive.

First do a dry run to see how long it takes and how large it is. Then run for real. The first run will take a long time and acquire it all, but from that point on the progressive run would update the 'backup' with only the files that are new, and the 'delete' would compare what's in filedir with what is archived: if a file has been deleted from filedir, the same file would be deleted on the rsync'd drive.

See man rsync.
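A rough sketch of the kind of command I mean (the moodledata path and the archive mount point are only examples - and always look at the dry run first):

# dry run - see what it would transfer and how big it is
rsync -avh --dry-run --delete /path/to/moodledata/filedir/ /mnt/archive/filedir/

# then for real
rsync -avh --delete /path/to/moodledata/filedir/ /mnt/archive/filedir/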

'spirit of sharing', Ken


In reply to Ken Task

Re: Backup policy

by Albert Ramsbottom -

Can I come back to this?

Currently, we have a Perl script running from cron that collects the Moodle data directory from our NFS server and sends it to our "Swift" backup server.

I haven't seen the script, as I have no access to the Swift server. Why, I have no idea.

Anyway, how would we do a progressive rsync with delete on such a system?

So, we would need to back up the whole moodledata directory on our NFS share (now only 1.2 GB). We would then need to collect this backup from the NFS server using rsync and compare it to what? Bearing in mind we would want multiple copies, i.e. daily, weekly and monthly.


Sorry I am a little confused


Cheeeers Ken

In reply to Albert Ramsbottom

Re: Backup policy

by Ken Task -

I did say 'it sounds like'! ;)

Hmmmm ... no Vulcan Mind Meld possible ... perl script?  If you haven't seen it, then I certainly have no knowledge of it either.

Are you asking for an example rsync command?

man rsync is your friend there ... but ... I will share one I used this week. The dry run shows what it would have done - a brief look at what files/folders it was working with and, also important, the summary at the end ... how long it took, totals of files/sizes, etc.

[root@sos backup]# cat syncbucketdry
rsync -avzh --dry-run --progress --delete ./ /root/gcloud* /mnt/gbucket/

command + options: rsync -avzh --dry-run --progress --delete

man rsync will show all the options/switches, etc. ... no sense re-inventing the wheel here.

source: ./ /root/gcloud*

I was issuing it from /home/backup/ [the ./], which contained tar balls of code/data and sql dumps in m## directories, etc. ... i.e., local server backups. The dry run also rsync'd the /root/gcloud* files and directories.

[root@sos home]# du -h ./backup
256K    ./backup/webmin
70M    ./backup/mysql
6.7G    ./backup/m27
16G    ./backup/m30
485M    ./backup/blog
12M    ./backup/m32/auto
3.8G    ./backup/m32
1.2G    ./backup/m31/courses
16G    ./backup/m31
103M    ./backup/webmindb
12G    ./backup/unirepo
17G    ./backup/m33
13G    ./backup/m34
6.9G    ./backup/m35
92G    ./backup

destination: /mnt/gbucket/ (a Google Bucket - 1.0P - that's a petabyte)

So let's say I ran backups of a moodle35 site, which would then put new tar balls and an sql dump in /home/backup/m35/. The rsync would pick up the new files not present before and transfer them to the bucket. And if I had deleted all the previous tar balls and sql dump from /home/backup/m35/, the --delete would remove those deleted files from /mnt/gbucket/m35 and transfer only the new ones.

That's clear as mud, isn't it! :|  Again, man rsync is your friend.

And rsync does require some study and dry runs to get it right.

One doesn't need multiple copies ... I run an rsync to duplicate a moodledata directory on another Moodle site once a day. It's a school district site, so almost all summer long there were hardly any changes. With school about to begin another academic year, teachers have been into their courses adding/deleting, etc., so recently there have been more changes.
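As a cron entry that might look something like this (the schedule, paths and remote host are purely illustrative):

# daily at 02:00, push moodledata to the other site over ssh
0 2 * * * rsync -avh --delete /var/moodledata/ backupuser@othersite:/srv/moodledata-copy/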

How is one assured that rsync is working? By comparing what was rsync'd with what is at the archive location.

An example command was provided:

[root@sos home]# du -h ./backup shows the totals on the server to be backed up - shown above.

Run the same command on the destination location.
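In this example that would be something like (destination as mounted above):

[root@sos home]# du -h /mnt/gbucket

... and the totals should roughly match the source side.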

Now if one insists on daily/weekly/monthly copies ... the destination better be 1.0P at least ... I would think. And maybe 3 rsync scripts (maybe with different options) ... one for daily, one for weekly, one for monthly.
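As a very rough sketch only (the script names, paths and times are made up), the crontab for that could look something like:

# daily rsync at 01:00
0 1 * * * /root/rsync-daily.sh
# weekly rsync, Sundays at 02:00
0 2 * * 0 /root/rsync-weekly.sh
# monthly rsync, 1st of the month at 03:00
0 3 1 * * /root/rsync-monthly.sh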

That help any?

'spirit of sharing', Ken



In reply to Albert Ramsbottom

Re: Backup policy

by Matteo Scaramuccia -

Hi Albert,

is there a way of excluding /moodledata/cache, /moodledata/localcache, /moodledata/sessions, /moodledata/temp, or /moodledata/trashdir

you should not exclude moodledata/trashdir since, by Moodle File API design, you have 4 days in which its contents can be automatically recycled back into the Moodle storage: a properly configured cron will take care of that folder within that time range, and ideally that folder will be "almost empty" on average, week by week.

HTH,
Matteo

In reply to Matteo Scaramuccia

Re: Backup policy

by Albert Ramsbottom -

OK, I have a little more information now: we are using MySQL Enterprise, which includes MySQL Enterprise Backup, and it is incredibly quick. I mean hours down to minutes compared with mysqldump.

We can exclude anything from the backup, so I think I need a reliable list of directories that we can exclude to get the size down. We are at 7TB at the moment and I am trying to talk the education team out of automated course backups. We keep 7 of them, and they really should be turned off; as I have pointed out, they are not and should not be part of an emergency backup policy - only full site backups should be used for that.

So what can I exclude?


Thanks

In reply to Albert Ramsbottom

Re: Backup policy

by Albert Ramsbottom -

Bump :)

In reply to Albert Ramsbottom

Re: Backup policy

by Conn Warwicker -

Honestly, I'd look at reducing the size of that folder first. I can't imagine you have 7TB worth of stuff you actually need. I suspect you have many years' worth of backups and old files in there. You could reduce that down by a significant chunk with a clear-out.



In reply to Albert Ramsbottom

Re: Backup policy

by Ken Task -

Recently, I lost a free resource for hosting sandbox Moodles and had to go the commercial route ... decided on Rackspace. CentOS 7, and re-learning the differences between CentOS 6 and 7.

Anyhoo ... also decided I needed to replicate the mounting of a Google Bucket ... 1.0P ... more storage than I'll use in my remaining years, me thinks!!!!

First, you do have to acquire a Google account (it doesn't have to be one from a Google for Education domain) and a Google Bucket. Those steps I'll leave out here ... but the key is mounting the bucket from any server you want to back up.

https://www.assistanz.com/mount-google-cloud-storage-bucket-linux/

The 'key' for me to get it working easily was this bit of info:

/root/.config/gcloud/application_default_credentials.json

Notice I did this as the root user ... which, due to the nature of what one is doing, could be even safer ... the devil is in the details (as always) ... note the .config directory - 'dot' config.

I didn't set it up in fstab, as that might stall a reboot of the server IF the bucket cannot be mounted.
Instead I have a 'mountgbucket' script in /root that does the mounting.

export GOOGLE_APPLICATION_CREDENTIALS=/root/.config/gcloud/application_default_credentials.json;

then

mount -t gcsfuse -o rw,allow_other,nonempty busrv /mnt/gbucket

[root@someserver ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      158G   14G  137G  10% /
busrv          1.0P     0  1.0P   0% /mnt/gbucket
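So the whole /root/mountgbucket script is really just those two steps together, something like:

#!/bin/bash
# point gcsfuse at the credentials file
export GOOGLE_APPLICATION_CREDENTIALS=/root/.config/gcloud/application_default_credentials.json;
# mount the bucket (busrv) on /mnt/gbucket
mount -t gcsfuse -o rw,allow_other,nonempty busrv /mnt/gbucket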

Then run your backup scripts ... this one is for a moodle35:

tar -cvf /mnt/gbucket/m35/moodle-code-351+-$(date +%Y%m%d-%H%M%S).tar /var/www/html/moodle35;

tar -cvf /mnt/gbucket/m35/moodle-data-351+-$(date +%Y%m%d-%H%M%S).tar /var/moodle35data;

Note that's not a minimal backup ... it's a full backup of moodledata.

Send your mysql dump to the same location on the bucket as well.
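For example, something along these lines (the database name, user and the --single-transaction option are only illustrative - adjust to your own setup):

# -p prompts for a password; use a credentials file instead for unattended runs
mysqldump --single-transaction -u moodleuser -p moodle35 > /mnt/gbucket/m35/moodle-db-351+-$(date +%Y%m%d-%H%M%S).sql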

The great thing about this ... one could mount the same bucket from multiple servers ... and the servers don't have to be on the same network. Like I said above ... I was on tcea.org's server, have now moved to a Rackspace server, and one should be able to do the same with any Linux/Mac flavored box.

OR ...

When done with one server, umount the bucket ... go to another server ... set up the same bucket ... create directories in /mnt/gbucket/ for the server you are on.

Run your scripts ... done ... umount ... next server.

For someone who just has to stick to a GUI ... I did find a commercial product ... it does as advertised.

On this Mac I have mounted and use the same Google Bucket. One can drag and drop from one mounted device to another.

Called ExpanDrive:

https://www.expandrive.com/docs/

And check out what it will do ...


IMHO, worth every dime!

Anyhoo ... thought I'd share ...

'spirit of sharing', Ken