Greedy time consuming backups

by Bente Olsen -

I just moved a Moodle site to a new and better web host (Linux with Apache, PHP 7 and MariaDB; 8 GB RAM, SSD disk). But it takes hours to back up courses as well as data. I only back up courses that have changed since the last backup, and my data backup is incremental. Nevertheless they take 5 hours and 1.5 hours respectively. Is that normal?

This morning only five courses were backed up. The backup starts at 3:00 AM. The time stamps for the five courses were 03:57 (559M), 05:53 (750M), 07:50 (235M), 07:52 (178M), and 07:53 (314M). The total number of courses is about 50, I think.
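
The gaps between those time stamps can be worked out with a small awk sketch. It assumes the 03:00 start is the reference point and that all times fall on the same day; the function name gaps_in_minutes is made up for illustration.

```shell
# Read HH:MM values, one per line (first line = start time), and print
# each later time with the number of minutes since the previous one.
gaps_in_minutes() {
    awk -F: '
        { now = $1 * 60 + $2 }                          # minutes since midnight
        NR > 1 { printf "%s +%d min\n", $0, now - prev }
        { prev = now }
    '
}
```

Feeding it 03:00, 03:57, 05:53, 07:50, 07:52 and 07:53 gives gaps of 57, 116, 117, 2 and 1 minutes, so the long waits do not follow the file sizes: the 235M course comes after the longest gap.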

In reply to Bente Olsen

Re: Greedy time consuming backups

by Ken Task -

The size of the backup file may not be an indicator of the time and power it requires to make a backup ... as in the case of quizzes.

Try this ... courses can be backed up individually and from command line ... which is like the backup routine as scheduled in autobackups.   Look in moodlecode/admin/cli/ for a backup.php script.  Help on it:

Perform backup of the given course.

Options:
--courseid=INTEGER          Course ID for backup.
--courseshortname=STRING    Course shortname for backup.
--destination=STRING        Path where to store backup file. If not set the backup
                            will be stored within the course backup file area.
-h, --help                  Print out this help.

so to issue:

php backup.php --courseid=# --destination=/someotherlocation/

Choose the largest course you see.

While that is running, open another terminal session (ssh) and go to:

moodledata/temp/backup/

You'll see a long directory name (looks like a contenthash).

cd [longdirectoryname]

Then issue:

watch "ls -lR"

You'll see what Moodle is doing to build a backup.

Pay attention to the activities directory and its contents where the course contains many quizzes.

The above won't solve the problem but it might give some more info upon which one might be able to tweak things that might need tweaking ... thus making the process faster (like running tuner on the DB).
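
To make the 'watch' output easier to read, here is a minimal shell sketch that lists the largest entries in the temporary backup directory. The function name largest_entries is made up for illustration, and the directory you pass in is whatever long directory name shows up under moodledata/temp/backup/ on your system.

```shell
# Print the N largest entries (sizes in KiB) directly under a directory,
# biggest first. Meant for the temporary backup directory while a
# backup is running. N defaults to 10.
largest_entries() {
    dir="$1"
    n="${2:-10}"
    du -sk -- "$dir"/* | sort -rn | head -n "$n"
}
```

Called repeatedly as largest_entries moodledata/temp/backup/[longdirectoryname] 5, it should show whether the activities directory is where the time goes.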

'spirit of sharing', Ken



In reply to Ken Task

Re: Re: Greedy time consuming backups

by Bente Olsen -

Thanks Ken, I'll try that.

By the way, rsync and the course backup also use a lot of memory: rsync uses all of it, the course backup almost all. Afterwards it takes quite a while (hours) before memory usage is back to normal.

I have seen elsewhere that rsync uses whatever memory is available, and that you should be able to clear the cache with

sh -c "sync; echo 3 > /proc/sys/vm/drop_caches"

But I do not have permissions for that on the web server. The hosting support cannot help me.
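
Without root it may still be possible to check how much of the 'used' memory is just file cache. The sketch below assumes a Linux kernel that reports MemAvailable in /proc/meminfo and that the hosting account is allowed to read that file (on some shared hosts it shows the limits of the account rather than of the whole machine). The function name mem_summary is made up for illustration.

```shell
# Summarise total, available and cached memory in MiB from a
# meminfo-style file (defaults to /proc/meminfo; values there are in kB).
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "Cached:"       { cached = $2 }
        END {
            printf "total_mib=%d available_mib=%d cached_mib=%d\n",
                   total / 1024, avail / 1024, cached / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

If available_mib stays high while cached_mib is large after an rsync run, the memory is most likely page cache that the kernel hands back on demand, which would not need clearing.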

 

In reply to Ken Task

Re: Re: Greedy time consuming backups

by Bente Olsen -

Now I have tried backing up some courses with admin/cli/backup.php. They all went smoothly.

I tried the largest course and some courses that usually take a long time to back up. In addition I found some log files in temp/backup with an error: "Orphan course module (id: 400) found. This module will not be backed up." I tried backing up the courses whose backup file time stamps were close to the time stamp of the log file. All went smoothly, nothing odd showed up with watch "ls -lR", and only empty log files were created.

In reply to Bente Olsen

Re: Re: Re: Greedy time consuming backups

by Ken Task -

Ok, trying to determine if the backups were faster - suspect they were.  Empty log files are normal upon successful completion of a backup.   Will have to research that 'orphaned' stuff. :\

But, this could indicate you have some other processes (not Moodle and maybe not PHP) running at the time the backups are normally run that could be taking processing away from the backups.   So in a server environment ... do you have any other cron jobs for other things running ... like the processing of logwatch or the processing of webalizer or the backup of all databases ... etc.

Total guess, of course!   Dig around some more on your system. ;)

Oh ... next time you run a CLI backup ... have another terminal window open running 'top' ... you'll be able to see when the backup php script is running and you might see mysqld bounce in and climb higher from time to time as well.   I have seen large course backups (PHP) peg CPU at 100% at times.

'spirit of sharing', Ken

In reply to Ken Task

Re: Re: Re: Re: Greedy time consuming backups

by Bente Olsen -

Well, the cli initiated backups ran fast, some of the courses I backed up were the ones that had been very slow during the auto-backup.

I have no other processes running on the server while running the auto backup, no cron jobs or anything else.

I do run top in another session now and then, but it is rather confusing. I am on a shared server; my guess is that we are five users. According to 'top' the total mem is 43112192k and I have 8G I can use, so there are probably four more users on the server. As I write, 'top' only shows a few running commands: cpanel, sshd, bash, top and php. Often 'top' does not show that I use a lot of mem, but cpanel does. I run 'top' now and then anyway because it updates live, whereas cpanel only shows static data. But so far it has not made me any wiser.

So at the moment I know that the automatic backups are slow and the CLI backup is not. And rsync takes its time too. I really have no idea what to look for, apart from the so-called orphaned module, whatever that is. Luckily the users get a really good experience with Moodle's performance.

In reply to Bente Olsen

Re: Re: Re: Re: Re: Greedy time consuming backups

by Ken Task -

Hmmm .... 'shared hosting' .... there are sites on the net that, given an IP, will show how many hosts are on that IP - dunno how accurate they really are, but .... the 5 users you saw (customers - how do you know that?) are 5 web based applications being hosted on a server that has 43112192k (according to top) of memory.

There are probably memory hungry processes/apps that run on those other hosts from time to time, which have to be metered on a shared host so that no one customer can use more than their allocated share.

On a standalone (not shared), like I mentioned, while watching top I've seen PHP peg the CPU at 100% for a few seconds on backing up what ended up as a 60+ Gig Moodle backup via CLI (have seen the same on other servers that have larger courses - 90+ Gig).

Ok, so the backups are slow when running the autobackup via Moodle.   Faster when running single course backups via CLI ... as long as they complete and the users' experience is acceptable then think you'll have to live with it.  However, have you considered that one could have a 'nibuc' (non-interactive backup courses) bash shell script that runs the CLI single course backup in a loop over course ID numbers?  Have such a script on systems where there are large courses ... two scripts, as a matter of fact ... one for the 'normal courses' and one that backs up only the large courses.   Can't run autobackups on those boxen due to those large courses.
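
A minimal sketch of what such a 'nibuc' loop might look like, using only the --courseid and --destination options from the help text quoted earlier. This is not the actual script mentioned above: the function name backup_courses, the PHP_BIN and MOODLE_DIR variables, the nibuc.log file name and the /path/to/moodle placeholder are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical 'nibuc' sketch: back up a list of courses one at a time
# with Moodle's admin/cli/backup.php and report how long each one took.
PHP_BIN="${PHP_BIN:-php}"                   # PHP CLI binary (assumed name)
MOODLE_DIR="${MOODLE_DIR:-/path/to/moodle}" # placeholder, adjust to your code directory

# Usage: backup_courses DESTINATION_DIR COURSEID [COURSEID ...]
backup_courses() {
    dest="$1"
    shift
    for id in "$@"; do
        start=$(date +%s)
        # Output of backup.php is appended to a log file in the destination.
        if "$PHP_BIN" "$MOODLE_DIR/admin/cli/backup.php" \
               --courseid="$id" --destination="$dest" >>"$dest/nibuc.log" 2>&1; then
            status=ok
        else
            status=FAILED
        fi
        end=$(date +%s)
        echo "course $id: $status in $((end - start))s"
    done
}
```

Something like backup_courses /someotherlocation 12 34 56 could then be run from cron at a quiet hour, with one list of IDs for the normal courses and another for the large ones.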

You also mentioned that rsync was slow ... you are doing that incremental, yes?   That shouldn't be slow I would think (of course that depends upon what is being added to courses ... videos/audios? - don't forget if you have the backups going to filedir as opposed to an alt directory then backups of courses would take a little longer)

Let's face it ... Moodle (full featured + plugins), as it moves upwards, is more resource hungry.   The days of shared hosted Moodle (full featured) might be numbered with some providers.

'spirit of sharing', Ken

In reply to Ken Task

Re: Re: Re: Re: Re: Re: Greedy time consuming backups

by Bente Olsen -

Well, I did not say that I saw five users on the server; I do not know how many we are. I assume we are five, or as I said, I guess. 8G are assigned to me, so when 'top' shows 43112192k, which is approximately 41G, I get the idea that there are four other users like me. But it could of course be a lot more, or even fewer, and they could have 10 Moodles running each or none at all, or they could have less memory to use than I do.
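
The arithmetic behind that guess, as a small awk sketch. It assumes that the 'k' in top's output means KiB (1024 bytes) and that every account gets the same 8G share, which is only a guess; both function names are made up for illustration.

```shell
# Convert a KiB figure (as shown by top) to GiB, one decimal place.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / 1048576 }'
}

# Estimate how many equal shares of SHARE GiB fit into TOTAL KiB.
accounts_estimate() {
    awk -v kib="$1" -v share="$2" 'BEGIN { printf "%.1f\n", kib / 1048576 / share }'
}
```

kib_to_gib 43112192 prints 41.1 and accounts_estimate 43112192 8 prints 5.1, which is where "about five users" comes from.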

The cpanel has a lot of features. Among them is 'CPU and Concurrent Connection Usage', which gives me a 'resource usage overview'. I get graphs of 'CPU Usage DB usage included, Virtual Memory Usage, Physical Memory Usage, Input/Output Usage. DB usage included, only if restricted, Io operations, Entry Processes and Faults' for a period of my choice, e.g. today, the last hour or the last 30 days. I can get a table with the values as well. It lets me see that sometimes there are IOfs (I guess input/output faults). There were some IOfs this morning while the automated backup was running, and there were also some while I ran the CLI backups, which were fast. So I guess it is not I/O faults that cause the course backup to run so slowly. But identifying these faults is not easy; the error log that I have access to does not reflect them.

It seems that the reason why the course backup takes such an amount of time is not because of the backup itself, but the process to determine whether a course should be backed up or not due to the backup settings. When I look at Moodle's backup report I can see that for some courses there are 2 hours between skipping courses that should not be backed up. So now the question is how to dig into that?

In reply to Bente Olsen

Re: Re: Re: Re: Re: Re: Re: Greedy time consuming backups

by Ken Task -

Ahhh ... but Moodle (and other customer apps) are multiple user apps.  5 customers doesn't equal 5 users using customer apps.

Have read that on shared systems there are 'system monitors' that, when a customer's app/web site, etc. reaches the max CPU/resource limits, kill its scripts ... and no errors are displayed ... kinda like a kill -9.    Only your provider could answer that one.

Now the part about '2 hours between skipping courses that should not be backed up' is a puzzler but there are scheduled tasks now in Moodle.   One of those is automated backups - another ... clean up backup tables and logs.  Moodle Admin can set times to be executed.

'spirit of sharing', Ken

In reply to Ken Task

Re: Re: Re: Re: Re: Re: Re: Re: Greedy time consuming backups

by Bente Olsen -

There is no sign that my provider's system kills anything. They just ensure that I do not get more than my 8Gs. But fair enough.

I already have automated 'Clean backup tables and logs' on my scheduled tasks list, so as you say, I'll probably have to live with it.

Thanks for your contribution on this. I will come back if any new aspects turn up.