Restoring/downloading files over 2GB

by J Stringer -

We plan on using Moodle next year. We have installed it on a Mac OS X Lion server and we're currently running Moodle 2.3. Most things have gone very smoothly. However, we've had two issues when dealing with large files over 2GB.

When restoring a course from a backup file larger than 2GB we get the message "error/tmp_backup_directory_not_found". I have read that this may be because the PHP zip utilities cannot handle files over 2GB. I installed the External zip utility plugin, which uses the Mac's (or Ubuntu's) built-in zip and unzip utilities. With it enabled we get the same error when restoring. Restoring course backups under 2GB is fine.

When using a Mac computer (regardless of browser or OS version) all downloads stop at exactly 467MB. However, when using Windows to download the same file it completes (4.7GB). I have tried this on 5 different Macs running 10.6, 10.7, 10.8 using Safari, Firefox, and Chrome. Same result each time.

Anyway if anyone can help solve these two issues, it would be greatly appreciated!!

Thanks!

John

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by J Stringer -

Hi again,

Just to add some more info, I tried to restore in Debug Mode and got additional errors:

Debug info:
Error code: tmp_backup_directory_not_found
$a contents: /Applications/MAMP/data/moodle23/temp/backup/dca63bd27d4555d5af611f6c1e8f10af
Stack trace:

line 141 of /backup/util/helper/convert_helper.class.php: convert_helper_exception thrown
line 250 of /backup/util/helper/backup_general_helper.class.php: call to convert_helper::detect_moodle2_format()
line 188 of /backup/util/ui/restore_ui_stage.class.php: call to backup_general_helper::detect_backup_format()
line 67 of /backup/restore.php: call to restore_ui_stage_confirm->display()

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by Ken Task -

There is an issue with PHP zip and files over 2GB.  A search of the Tracker shows the issue is across the board ... Mac/Windows/Linux.  Only the makers of PHP can really fix it ... no change in Moodle code will matter.

To the best of my knowledge there are only two approaches for large courses: 

1) in the course, delete large resources - uploaded videos/audios if any (make sure you have those individual files archived or have access to them for later re-linking/uploading).  Purpose: get the backup below the 2Gig limit.

2) Warning ... it involves 'manual labor' and the command line.  First a question: after the failure, does the dca63bd27d4555d5af611f6c1e8f10af folder still exist in temp/backup/?

If so, have you inspected its contents?

'spirit of sharing', Ken

In reply to Ken Task

Re: Restoring/downloading files over 2GB

by J Stringer -

Hi Ken,

Unfortunately the backup of the course is all we have. One of our technicians extracted it just prior to a server crash. We then had to revert to a backup of the database from 2 weeks prior when the course didn't exist. So now we need to restore the 4GB mbz file into a new course.

I read the articles on PHP zip and the 2GB limitation. To get around this I tried using the externalzip plugin which directs moodle to use the zip and unzip commands rather than using PHP. However, this has not worked either. I found an interesting line in the apache_error.log that says:

warning [/Applications/MAMP/data/moodle23/temp/backup/d12ce02b02a712a6e833cf865dd9cefc]: 4294967296 extra bytes at beginning or within zipfile (attempting to process anyway)

I then tried renaming the backup file with a .zip extension because an mbz file is essentially a zip file. When I use unarchive utility it extracts all the files with no issues. When I use terminal I get this:

wiki:~ moodle$ unzip /Users/moodle/Desktop/test/test.zip
Archive: /Users/moodle/Desktop/test/test.zip
warning [/Users/moodle/Desktop/test/test.zip]: 4294967296 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 4294967296
(attempting to re-compensate)
creating: 0/
inflating: 0/backup-moodle2-course-23-dp_history-20130415-1350.mbz
error: invalid compressed data to inflate

So the unzip utility is not working properly either. We are running unzip 6.0, which is supposed to handle large files. However, it seems many others are running into this issue as well. I thought it might be a corrupt zip file, so I extracted the files with unarchive utility and then re-compressed them. However, unzip gave me the same results. Anyway, I'm going to keep trying to create a 4GB zip file that unzip can extract.
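One detail in the warnings above may be worth noting: 4294967296 is exactly 2^32, one more than the largest value a 32-bit size or offset field in the classic zip format can hold. The following is only an arithmetic sketch in Python (not something used in this thread; `needs_zip64` is a made-up helper name):

```python
# Illustrative arithmetic only; needs_zip64 is a hypothetical helper name.
ZIP32_MAX = 0xFFFFFFFF               # largest value a 32-bit size/offset field can hold
REPORTED_EXTRA_BYTES = 4294967296    # number copied from the unzip warning in this post


def needs_zip64(size_in_bytes: int) -> bool:
    """Return True when a size or offset no longer fits in a 32-bit zip field."""
    return size_in_bytes > ZIP32_MAX


print(REPORTED_EXTRA_BYTES == 2 ** 32)    # True
print(REPORTED_EXTRA_BYTES - ZIP32_MAX)   # 1
print(needs_zip64(4 * 1024 ** 3))         # True: a 4 GiB archive
print(needs_zip64(1 * 1024 ** 3))         # False: a 1 GiB archive
```

If that reading is right, the warning could point to the tool mis-handling large (Zip64) offsets rather than to a damaged file, but nothing in this thread confirms it.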

The dca63bd27d4555d5af611f6c1e8f10af no longer exists. I'll report back if I figure anything out.

John

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by J Stringer -

So I created a 4.86GB zip file using zip command on our server (zip -r). The file was created without error. However, using the unzip -l command to get a file listing of the resulting zip gave me the same "extra bytes at beginning or within zipfile" error. As it's giving me this error based on a zip file created by the same set of commands, I am beginning to think that, similar to PHP unarchiving tools, unzip (at least on a Mac) cannot handle files over 2GB either. Ken, have you or has anyone else experienced something similar on a Windows or Ubuntu server? Or perhaps there's something amiss with our Lion server.
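As an independent cross-check on `unzip -l`, the archive can also be listed and verified with Python's standard `zipfile` module, which reads Zip64 archives. This is a minimal sketch, assuming Python is available on the server; the function names are invented for illustration:

```python
import zipfile


def list_archive(path):
    """Return (name, uncompressed_size) pairs read from the archive's central directory."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def first_bad_member(path):
    """Read every member and return the name of the first one that fails its check, or None."""
    with zipfile.ZipFile(path) as archive:
        return archive.testzip()
```

For example, `first_bad_member("/Users/moodle/Desktop/test/test.zip")` returning None would suggest the archive itself is readable and that the trouble lies with the unzip build. Reading every member of a multi-gigabyte file will take a while.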

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by Ken Task -

Does your unzip give you anything at all?  Any files/folders?

If not, there is a Mac OS X command line utility called 'ditto' that will attempt to recover what it can from an .mbz renamed to .zip.  From terminal on the Mac box, 'man ditto' will show all the options/switches.

Have been working with a user that has similar issues with hosting provider and corrupted zips.  As a result of that, have discovered that ditto will recover some ... but not all ... resources/files from the zip.  Therefore not a total loss.

In a work folder and from the command line in that folder:

ditto -x -k backup-moodle2-course-3-tech-20130407-1400-nu-noattend-noquiz.zip extracted

The 'extracted' is a folder that ditto will create when attempting to recover.

It, too, will fail with an error like:

ditto: files/f0/f064fabdcf36f97bec8075102850fa814e59feec: No such file or directory
ditto: Couldn't read pkzip signature.

**BUT** if one checks out the 'extracted' directory in that same folder:

Ken-Tasks-MacBook-Pro:nubackup ktask$ ls -l extracted
total 8
drwx------  49 ktask  staff  1666 Apr 11 12:43 activities
-rw-------   1 ktask  staff    79 Apr 11 12:42 completion.xml
drwx------   8 ktask  staff   272 Apr 11 12:43 course
drwx------  73 ktask  staff  2482 Apr 18 09:19 files
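On a machine without ditto, the same "keep whatever can be read" idea can be sketched with Python's `zipfile` module. This is not what ditto does internally, and `salvage` is a name made up here. It also assumes the archive's central directory is still readable; if it is not, opening the file fails before any member is tried.

```python
import zipfile
import zlib


def salvage(zip_path, dest_dir):
    """Extract members one at a time; return (recovered, failed) lists of member names."""
    recovered, failed = [], []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            try:
                archive.extract(info, dest_dir)
                recovered.append(info.filename)
            except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
                # A damaged member is recorded and skipped so the rest can still be tried.
                failed.append(info.filename)
    return recovered, failed
```

A member that fails part-way may leave a truncated file behind in the destination folder, so the `failed` list is worth checking before trusting anything in there.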

Now the 'fun' begins ...

This is NOT a complete extract ... there is no moodle_backup.xml file, for one. Therefore re-archiving what has been recovered will NOT produce a backup that Moodle can restore.

The files folder is a miniature copy of filedir on the server and should have contained all the uploaded files in the course.  Unfortunately, there is no associated files.xml file (which maps the directories/contenthash filenames in the 'files' folder to the original file names).   But, one can get an idea of which contenthash file is what:

file ./files/*/*

Will render something like:

./files/02/020958777556801f5f5f7649f8e20f326c562829: JPEG image data, JFIF standard 1.01
./files/02/025864fdbd016e966ac6f091e4969e2a43d5f46a: JPEG image data, JFIF standard 1.01
./files/07/071ec456214ae7ec778919abc5cf8251a1954b11: JPEG image data, JFIF standard 1.02
./files/0d/0dbc72e5e6bcdb127994c65cf4e4d6c87fe95352: JPEG image data, JFIF standard 1.01
./files/0e/0ee6fb014df0bc7b4b2aa80beb47a8a87df6ca72: JPEG image data, JFIF standard 1.01
./files/0f/0f7637e3d6298d70afe8ff48250a49fd13f0d12a: Macromedia Flash Video

Using your Finder (not command line), one can then navigate to the files of interest ... i.e., to see the last one listed, open files, then open 0f, and then double click on 0f7637e3d6298d70afe8ff48250a49fd13f0d12a, which should launch whatever you have on the Mac that plays Flash Videos.

At this point, one could copy the 0f7637e3d6298d70afe8ff48250a49fd13f0d12a file and then rename it to something human: myflashvideo.flv
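Where the `file` command is not available (a Windows box, say), a rough stand-in for `file ./files/*/*` is to look at the first few bytes of each contenthash file. The sketch below only knows a handful of common signatures and is far less thorough than `file`; the function names are invented for this example.

```python
import os

# A few well-known file signatures ("magic numbers"); extend the table as needed.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF8", "GIF image"),
    (b"%PDF", "PDF document"),
    (b"FLV", "Flash video (FLV)"),
    (b"FWS", "Flash data (SWF)"),
    (b"CWS", "Flash data (compressed SWF)"),
    (b"PK\x03\x04", "Zip-based file (zip, docx, odt, ...)"),
]


def guess_type(path):
    """Guess a file's type from its leading bytes; return 'unknown' if nothing matches."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"


def describe_files(files_dir):
    """Map each file path under files_dir to a guessed type."""
    result = {}
    for folder, _subdirs, names in os.walk(files_dir):
        for name in names:
            path = os.path.join(folder, name)
            result[path] = guess_type(path)
    return result
```

Calling `describe_files("extracted/files")` gives a dictionary that can be scanned for the videos and images worth copying out and renaming.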

Also, depending upon how much one wants to recover, one could inspect the .xml files found and acquire at least the text of the item.

Example: folder (which may or may not exist in your extraction) called 'activities' has subfolders some named 'page_###'.  Changing into extracted/activities/page_275 and listing will show xml files.

One will be page.xml

Opening that with TextEdit one will see a section such as:

    <name>How To Build A Tower (Part Two)</name>
    <intro>&lt;p&gt;For setting up storage folder and cutting wood&lt;/p&gt;</intro>
    <introformat>1</introformat>
    <content>&lt;p&gt;&lt;span style="font-size: large;"&gt;In order to create a tower, the following steps MUST be followed &lt;strong&gt;in order&lt;/strong&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: large;"&gt;Step 1: Get a manila folder from the teacher&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: large;"&gt;Step 2: Write &lt;span style="color: #ff0000;"&gt;your own name&lt;/span&gt; &lt;span style="color: #ff0000;"&gt; and period&lt;/span&gt; like the pictures below &lt;span style="font-size: large;"&gt;(not George Smith and not 6.1.A)&lt;/span&gt;. Use a Sharpie in dark black ink and PRINT in &lt;span style="color: #ff0000;"&gt;neat block letters&lt;/span&gt;. Start in the top left corner of the tab with your name. Put the period just below your first name.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: large;"&gt;You will not get a second chance, so make sure you do it right the first time. If you make a mistake, you will have to provide your own manila folder.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://moodle.org/pluginfile.php/127/mod_forum/post/985482/tower%20folder%20start%20writing.jpg" height="260" width="347" /&gt;&lt;img src="https://moodle.org/pluginfile.php/127/mod_forum/post/985482/tower%20george%20smith%20writing.jpg" height="259" width="389" /&gt;&lt;/p&gt;</content>

One could do a few (ok, a lot) of search and replaces to remove the xml tags and other stuff one doesn't need.   Then save the file as something.txt for later copy and paste into the page resource one is rebuilding in a Moodle course.
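The search-and-replace chore can be shortened with a small script. The sketch below assumes the file looks like the page.xml excerpt above, with a name element and a content element holding escaped HTML; `page_text` is a hypothetical helper, not part of Moodle.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes and drop the HTML tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def page_text(xml_path):
    """Return (name, plain_text_content) from a page.xml-style file."""
    root = ET.parse(xml_path).getroot()
    name = next((el.text or "" for el in root.iter("name")), "")
    content = next((el.text or "" for el in root.iter("content")), "")
    stripper = _TextOnly()
    stripper.feed(content)  # the XML parser has already turned &lt;p&gt; back into <p>
    stripper.close()
    return name, "".join(stripper.parts).strip()
```

The returned text loses its formatting and its images, so it is a starting point for copy and paste into the rebuilt page, not a finished page.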

Granted, it's a lot of work, but we do what we must IF we must! :|

'spirit of sharing', Ken

In reply to Ken Task

Re: Restoring/downloading files over 2GB

by Ken Task -

Hmmmm ... ran out of time while editing ... addition:

If you have xmllint installed (on a Mac I think it's part of the xpath install - Google for that, or just go to the command line and type: whereis xmllint), there is a way to acquire the text of some resources that might be extracted via ditto. (CentOS and Ubuntu, BTW, do have xmllint.)

In the extracted Activities folder (if there is one), folders like the following might exist:

assign_48    assign_68    page_275    page_58        page_86        resource_50    resource_57
assign_59    assign_80    page_276    page_69        page_91        resource_51    url_71
assign_60    assign_82    page_277    page_72        page_92        resource_52    url_74
assign_61    attforblock_37    page_278    page_76        page_93        resource_53    url_75
assign_62    choice_88    page_44        page_77        quiz_40        resource_54    url_89
assign_63    label_64    page_45        page_79        quiz_90        resource_55
assign_67    label_65    page_47        page_81        resource_49    resource_56

In each one will see an .xml file larger than the others.

Example: page_275 has a 'page.xml' file larger than the other .xml files

-rw-------  1 ktask  staff  1951 Apr 11 12:42 page.xml

xmllint --xpath //content page.xml > content.txt

extracts the <content> tag and the text up to the closing </content> tag into a content.txt file which means fewer search and replace actions to recover text.
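To avoid opening every activity folder by hand, the "largest .xml file in each folder" rule of thumb above can be automated. A minimal sketch, with a made-up function name, that only reports file names and changes nothing:

```python
import os


def largest_xml_per_activity(activities_dir):
    """Map each activity folder name to the name of its largest .xml file."""
    result = {}
    for entry in sorted(os.listdir(activities_dir)):
        folder = os.path.join(activities_dir, entry)
        if not os.path.isdir(folder):
            continue
        xml_files = [n for n in os.listdir(folder) if n.endswith(".xml")]
        if xml_files:
            result[entry] = max(
                xml_files, key=lambda n: os.path.getsize(os.path.join(folder, n))
            )
    return result
```

For the listing above one would expect an entry such as page_275 mapped to page.xml, which is then the file to feed to xmllint.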

'spirit of sharing', Ken

In reply to Ken Task

Re: Restoring/downloading files over 2GB

by J Stringer -

Hi Ken,

I can extract the backup just fine using archive utility. The contents appear to have all of the course content including moodle_backup.xml. Is there a way I can restore the course using the extracted files? I just assumed that it was only possible to restore using the compressed file.

thanks again!

John

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by Ken Task -

Oh, thought you said couldn't unarchive! Sorry.  Still the info is kinda related to the following.

One has to reverse engineer the .mbz file.

One can remove the references to the largest items (size wise) in the corresponding .xml files.  Suspect the largest items involve the 'files' directory and the 'files.xml' file.

In a work folder unzip the renamed .mbz.  That should create a folder with the same title as the backup file.

drwx------@ 18 ktask  staff    612 Sep 22  2012 backup-moodle2-course-2-tinker-20120914-2047

cd backup-moodle2-course-2-tinker-20120914-2047

ls -lR files

And note the largest files - copy and paste the contenthash name to textedit for later reference/use

example:

./b1:
total 40216
-rwxr-xr-x@ 1 ktask  staff  20586557 Sep 14  2012 b16cc8ce0dcdee28dfba17a27b753802e89659af

this one file is 20M

file b1/*

Ken-Tasks-MacBook-Pro:files ktask$ file b1/*
b1/b16cc8ce0dcdee28dfba17a27b753802e89659af: Macromedia Flash data, version 8

It's a flash file.
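Scanning `ls -lR` output by eye gets tedious when the files folder holds hundreds of entries. A short sketch that sorts them by size instead (`largest_files` is a name invented here):

```python
import os


def largest_files(files_dir, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, biggest first."""
    sized = []
    for folder, _subdirs, names in os.walk(files_dir):
        for name in names:
            path = os.path.join(folder, name)
            sized.append((os.path.getsize(path), path))
    sized.sort(reverse=True)
    return sized[:count]
```

The last path component of each result is the contenthash to note down for the files.xml search.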

Move the b1/b16cc8ce0dcdee28dfba17a27b753802e89659af file to a 'savedfiles' folder outside of the present location and rename it to something recognizable - flashfile1.flv.  Then remove the b1/ folder (it held only that one file).

Open the files.xml file and do a search for the contenthash name.

Finding the reference in files.xml, remove the reference ... be sure you remove all lines contained in the tag references for the file:

  <file id="126">    <contenthash>b16cc8ce0dcdee28dfba17a27b753802e89659af</contenthash>
    <contextid>33</contextid>
    <component>mod_resource</component>
    <filearea>content</filearea>
    <itemid>0</itemid>
    <filepath>/</filepath>
    <filename>mdl-files.png</filename>
    <userid>3</userid>
    <filesize>20586557</filesize>
    <mimetype>image/png</mimetype>
    <status>0</status>
    <timecreated>1343260609</timecreated>
    <timemodified>1343260609</timemodified>
    <source>mdl-files.png</source>
    <author>Ken Task</author>
    <license>allrightsreserved</license>
    <sortorder>0</sortorder>
    <repositorytype>$@NULL@$</repositorytype>
    <repositoryid>$@NULL@$</repositoryid>
    <reference>$@NULL@$</reference>
  </file>

NOTE: Don't let the example above confuse you.  I know it gives the mime type as png.  This was a tinker backup and course I had been fooling with.  The file is really a flv file:

./files/b1/b16cc8ce0dcdee28dfba17a27b753802e89659af: Macromedia Flash data, version 8

Continue until all the largest items have been copied out and files.xml no longer refers to them.
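Hand-editing files.xml works, but it is easy to leave half a tag behind. The sketch below removes whole file elements by contenthash, assuming the layout shown in the example above. The function name is made up. Work on a copy, since the file is rewritten in place and its whitespace may change.

```python
import xml.etree.ElementTree as ET


def drop_file_entries(files_xml_path, content_hashes):
    """Remove every <file> element whose <contenthash> is in content_hashes.

    Rewrites the file in place and returns the number of elements removed.
    """
    wanted = set(content_hashes)
    tree = ET.parse(files_xml_path)
    removed = 0
    for parent in list(tree.getroot().iter()):
        for child in list(parent):
            if child.tag == "file" and child.findtext("contenthash") in wanted:
                parent.remove(child)
                removed += 1
    tree.write(files_xml_path, encoding="UTF-8", xml_declaration=True)
    return removed
```

Other .xml files in the backup may still refer to the removed file by its id. This thread only covers files.xml, so whether those references matter to the restore is untested here.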

The tricky part (not really, just have to  do this step at the right location) - create a zip **in** the backup-moodle2-course-2-tinker-20120914-2047 folder (as per example).

zip -r amoodle2backup.zip *

Must zip it in such a fashion that when unzipped by Moodle, the restore process can find the moodle_backup.xml file at the top level.  After re-creating the zip, note the file size.  Should be smaller now! ;)

Then, rename amoodle2backup.zip to amoodle2backup.mbz.

Makes no difference what you name it just as long as the extension is .mbz.  All the information Moodle will use are in the .xml files.
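The "zip from inside the folder" requirement is the step most likely to go wrong, so here is a sketch that enforces it: paths are stored relative to the backup folder, which puts moodle_backup.xml at the top level of the archive. `build_mbz` is a hypothetical helper, and whether Moodle accepts the result still depends on the edits made to the .xml files.

```python
import os
import zipfile


def build_mbz(backup_dir, mbz_path):
    """Zip the contents of backup_dir so moodle_backup.xml sits at the archive root."""
    if not os.path.isfile(os.path.join(backup_dir, "moodle_backup.xml")):
        raise FileNotFoundError("moodle_backup.xml not found in " + backup_dir)
    with zipfile.ZipFile(mbz_path, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as archive:
        for folder, _subdirs, names in os.walk(backup_dir):
            for name in names:
                full_path = os.path.join(folder, name)
                # Store paths relative to backup_dir, not to its parent.
                archive.write(full_path, os.path.relpath(full_path, backup_dir))
    return mbz_path
```

Write the .mbz somewhere outside the backup folder, otherwise the walk may pick up the archive while it is being written. Unlike `zip -r`, this sketch does not store empty directories.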

Upload to moodle server.  Think I'd use a file system repository called 'restores' and scp it there (this keeps the .mbz file out of the DB).

X your fingers and restore.

'spirit of sharing', Ken

In reply to Ken Task

Re: Restoring/downloading files over 2GB

by J Stringer -

Hi Ken,

I'm going to try doing this. At the same time I'm also going to try to figure out how to work with zip files larger than 2GB from the command line. I'm new to Moodle but hopefully soon I'll be able to contribute to this forum by answering questions rather than by just posing them.

Thanks so much for all your help!!

John

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by JF Bill -

Hi,

The problem here is probably that you are using a 32-bit system. If so, PHP will use 32-bit integers for file sizes and offsets, which limits the files it can handle to 2GB.

You basically have 2 options:

Update your system to 64-bit

Or modify the Moodle core to dump the content of the zip to a stream (stdout) and then dump stdout to a file

In reply to J Stringer

Re: Restoring/downloading files over 2GB

by Colin Fraser -

For anybody reading this thread to resolve similar problems, and if you do not get what the conversation is about, it is Ken suggesting that *.mbz files can be renamed to zip files, unzipped, then edited, rezipped and restored. This is a tried and tested technique that has saved my bacon more than once, but the language used here is difficult to follow for anyone not fluent in geekspeak, and it obscures too much. Also, it does not matter if you are using a Linux or Windows server, a local host or anything else, the technique is the same.

The editing comes in two parts. First, the resources, activities, quizzes, images, video files and so on are listed and referred to in the backup's .xml files (files.xml in Ken's example). You can find the start and end of each resource's entry, as Ken's example shows, and delete that entry from the .xml file. Second, locate the actual resource, if it is an image, a separate file or a video, and delete it. Really large mbz files tend to have a lot of videos, often flv files, or uncompressed images, like tiffs. They can be found, and deleted easily, in the directory tree of the backup.

You can then rezip the edited file, rename it to an mbz and, if you have edited it right, it should restore. You can use the original file to break down really large backups over and over into four or five smaller mbz files, as many as you like.

I would recommend that you test the technique first on a smaller file; it is easier to follow and gets you used to xml structuring and so on. A course with a couple of pages, a number of different image types and a couple of videos will help you immensely.

You do not have to worry about permissions on a Windows machine, or concern yourself with editing rights usually.

Thanks to Ken for his detailed explanation.

In reply to Colin Fraser

Re: Restoring/downloading files over 2GB

by Juanjo Florido -

WOW! I'm really scared... I have a 300 MB course backup (Moodle 2.5.2+) and I can only upload 150 MB. It's not related to php.ini (upload_max_filesize = 512M) but to my hosting account. I don't know where, but I think I have read that a file of any size can be uploaded by FTP. If so, are backups located in moodledata/filedir? Then, how could I "transform" my backup into one of those long strings of numbers and letters? I think the answer is in this thread, but I can't understand it.

If it is too hard, I would rather build another course (the truth is that it is a workmate's course and I can't spend too much time on it. My own backups weigh less than 20MB because all the heavy files are linked to Dropbox).

Thanks so much for your help!

In reply to Juanjo Florido

Re: Restoring/downloading files over 2GB

by Ken Task -

Use the file system repository.  One doesn't have to use Moodle or anything browser based to put files (of any size) into a file system repository.

See: http://docs.moodle.org/25/en/File_system_repository

You might also check your php settings for timeouts and increase those timeout numbers.

'spirit of sharing', Ken
