Another approach in getting rid of the course backups in the repository


by Visvanath Ratnaweera -
Number of replies: 7
Hi all

I'm having a look at an ever-expanding repository. The initial suspect is teachers taking course backups "in case" and not deleting them. In the directory tree filedir/xx/xx/40charhash I find thousands of files of the type "gzip compressed data, from Unix, original size modulo 2^32 nnnnnnn".

I can find them in the database and confirm that the file name extension is mbz. The question is: what is the safe way of deleting the files and cleaning up the database at the same time?
In reply to Visvanath Ratnaweera

Re: Another approach in getting rid of the course backups in the repository

by Davo Smith -
The only safe way to delete files would be through a command line PHP script that calls $file->delete() in each case.

So, you'd need a script that does some sort of DB search to find all the files on the system whose names end in .mbz. The script would then need to create a stored_file instance from each result and call delete() on each of those instances.

Something a Moodle developer should be able to put together in a few minutes (maybe an hour).
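
The steps described above can be sketched roughly as follows. This is only a sketch, not a tested tool: purge_mbz_files() is a hypothetical helper name, $db stands for Moodle's global $DB and $fs for the object returned by get_file_storage(), so the function has to be called from a CLI script that has already loaded config.php. It defaults to a dry run that only counts.

```php
<?php
/**
 * Sketch: delete every file whose name ends in ".mbz" through the File API.
 * purge_mbz_files() is a hypothetical helper, not part of Moodle.
 * $db is expected to be Moodle's $DB, $fs the result of get_file_storage().
 */
function purge_mbz_files($db, $fs, bool $dryrun = true): int {
    // Portable, case-insensitive "filename LIKE '%.mbz'" condition.
    $select = $db->sql_like('filename', ':pattern', false);
    // Fetch only the ids first, so no result set stays open while deleting.
    $ids = $db->get_fieldset_select('files', 'id', $select, ['pattern' => '%.mbz']);

    $count = 0;
    foreach ($ids as $id) {
        // Build a stored_file instance from the record id.
        $file = $fs->get_file_by_id($id);
        if (!$file) {
            continue; // record disappeared in the meantime
        }
        if (!$dryrun) {
            // Removes the database record through the File API.
            $file->delete();
        }
        $count++;
    }
    return $count; // number of .mbz files found (and deleted unless dry run)
}
```

A call such as purge_mbz_files($DB, get_file_storage()) would report how many files match; passing false as the third argument would actually delete them. Note that this matches every .mbz file on the site, including ones that were uploaded on purpose.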
In reply to Davo Smith

Re: Another approach in getting rid of the course backups in the repository

by Visvanath Ratnaweera -
Hi Davo
also to the others

Thanks for the critical hint, $file->delete()! If the solution is that simple, I thought of taking a shot myself.
;)

The $file->delete() took me to the File API. Looking at the examples there, I imagine this naive idea could lead to a solution.

A. Through SQL I dump the data of the files to be deleted to a file 'filedump.txt' in the following format:

id,contenthash,pathnamehash,contextid,component,filearea,itemid,filepath,filename
12345,40charconthash,40charpathhash,23456,course,legacy,0,/praktikum/,FlipFlop_Loesung.pdf
B. Iterate through the list and dissect the (comma-separated) fields:

$fs = get_file_storage();

foreach(file("filedump.txt") as $fileline) {
    $fieldlist = explode(',', $fileline);
    // Block "delete file"
}

C. Block "delete file"

// Prepare file record object
$fileinfo = array(
    'component' => ???,
    'filearea' => ???,
    'itemid' => ???,
    'contextid' => ???,
    'filepath' => ???,
    'filename' => ???);

// Get file
$file = $fs->get_file($fileinfo['contextid'],
        $fileinfo['component'],
        $fileinfo['filearea'],
        $fileinfo['itemid'],
        $fileinfo['filepath'],
        $fileinfo['filename']);

// Delete it if it exists
if ($file) {
    $file->delete();
}
Yet to figure out what those ??? are; something $fileline[i], I guess.

Another question is, do I need all that $fileinfo in the $fs->get_file() call?
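
One way the ??? could be filled is to give the comma-separated fields names before using them. The following is a minimal sketch in plain PHP: parse_dump_line() is a hypothetical helper, the column order is taken from the header line of the dump in step A, and it assumes that no file name or path contains an unquoted comma. As for needing all the fields: file_storage also has get_file_by_id() and get_file_by_hash(), so the id or the pathnamehash column alone may be enough to fetch the file.

```php
<?php
/**
 * Sketch: split one line of filedump.txt into named fields.
 * parse_dump_line() is a hypothetical helper; the column order follows the
 * header line of the dump (id,contenthash,pathnamehash,...).
 */
function parse_dump_line(string $line): ?array {
    $columns = ['id', 'contenthash', 'pathnamehash', 'contextid', 'component',
                'filearea', 'itemid', 'filepath', 'filename'];
    // Strip the trailing newline that file() leaves on each line by default.
    $fields = str_getcsv(rtrim($line, "\r\n"), ',', '"', '\\');
    if (count($fields) !== count($columns) || !is_numeric($fields[0])) {
        return null; // header line, blank line or malformed line
    }
    $record = array_combine($columns, $fields);
    // The numeric columns arrive as strings; cast them to integers.
    foreach (['id', 'contextid', 'itemid'] as $key) {
        $record[$key] = (int) $record[$key];
    }
    return $record;
}

$record = parse_dump_line("12345,40charconthash,40charpathhash,23456,course,legacy,0,/praktikum/,FlipFlop_Loesung.pdf\n");
// $record['contextid'] now holds the integer 23456,
// $record['filename'] the string 'FlipFlop_Loesung.pdf'.
```

With such a record, the ??? in block C would become $record['component'], $record['filearea'] and so on.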
In reply to Visvanath Ratnaweera

Re: Another approach in getting rid of the course backups in the repository

by Lawrence N -
I wrote something similar to automatically remove, every 30 days, the files they backed up. It took me a while to figure out, but it works.
In reply to Lawrence N

Re: Another approach in getting rid of the course backups in the repository

by Visvanath Ratnaweera -
That's it, delete automatically 30 days later! But I expect a huge backlog, so the first run will purge 95% of the backup files.

Question: How did you catch the permanent mbz files, like course backups attached to the courses as file resources?

If you remember the details of your work, you might see whether my idea (in the OP) makes sense. Appreciate all feedback.
In reply to Visvanath Ratnaweera

Re: Another approach in getting rid of the course backups in the repository

by Lawrence N -
I don't care about the backups attached to courses as file resources... users are made aware they will be deleted LOL.

the hard part is making people understand that they need to clean up
L
In reply to Lawrence N

Re: Another approach in getting rid of the course backups in the repository

by Visvanath Ratnaweera -
I do care! The fact that a file is an mbz doesn't mean it can be deleted after some time. Think of a CC-licensed course which provides its own backup to be downloaded, or an administrator providing a collection of course backups to the teachers.
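
One way to spare such deliberately published backups might be to look at where a file lives, not only at its extension. The sketch below is plain PHP and rests on assumptions: is_expired_backup() is a hypothetical helper, the list of component/filearea pairs is my guess at the dedicated backup areas and should be checked against the local files table, and the record is assumed to carry the timecreated column in addition to the fields of the dump.

```php
<?php
/**
 * Sketch: is this row of the files table an old backup stored in one of the
 * dedicated backup areas? Files attached elsewhere (for example to a file
 * resource) are left alone even if they end in ".mbz".
 * is_expired_backup() is a hypothetical helper; $backupareas is an assumption.
 */
function is_expired_backup(array $record, int $now, int $maxagedays = 30): bool {
    // Assumed component/filearea pairs used for manual backups.
    $backupareas = ['backup/course', 'backup/section', 'backup/activity', 'user/backup'];

    $area = $record['component'] . '/' . $record['filearea'];
    $ismbz = substr($record['filename'], -4) === '.mbz';
    // timecreated is a Unix timestamp; 86400 seconds per day.
    $isold = $record['timecreated'] < $now - $maxagedays * 86400;

    return $ismbz && $isold && in_array($area, $backupareas, true);
}
```

A backup linked from a course page would typically sit under a component such as mod_resource and so would not match, whatever its age.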
In reply to Visvanath Ratnaweera

Re: Another approach in getting rid of the course backups in the repository

by Visvanath Ratnaweera -
I experimented a bit further:

$fs = get_file_storage();
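// FILE_IGNORE_NEW_LINES: otherwise the last field (filename) ends in "\n" and never matches.
foreach(file("filedump.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $filepath) {
    $fieldlist = explode(',', $filepath);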
    // echo $fieldlist[0];  // id
    // echo $fieldlist[1];  // contenthash
    // echo $fieldlist[2];  // pathnamehash
    // $fieldlist[3] = $context->$fieldlist[3];  // contextid ???
    // echo $fieldlist[4];  // component
    // echo $fieldlist[5];  // filearea
    // echo $fieldlist[6];  // itemid
    // echo $fieldlist[7];  // filepath
    // echo $fieldlist[8];  // filename
    // 'contextid' => $context->id, // ID of context
    // Get file
    // $file = $fs->get_file($fileinfo['contextid'], $fileinfo['component'], $fileinfo['filearea'], $fileinfo['itemid'], $fileinfo['filepath'], $fileinfo['filename']);
    $file = $fs->get_file($fieldlist[3], $fieldlist[4], $fieldlist[5], $fieldlist[6], $fieldlist[7], $fieldlist[8]);

    // Delete it if it exists
    if ($file) {
        echo "Found";
        $file->delete();
    } else {
        echo "Not found";
    }
}

Sorry, I have to leave some debugging lines in the source. I think, my current problem is how to get the context from the context id. (See the ??? mid right.) If I uncomment that line I get:

PHP Notice:  Undefined variable: context in /var/www/html/bztfdevmoodle/delfiledump.php on line 22
PHP Notice:  Array to string conversion in /var/www/html/bztfdevmoodle/delfiledump.php on line 22
PHP Notice:  Trying to get property 'Array' of non-object in /var/www/html/bztfdevmoodle/delfiledump.php on line 22
PHP Notice:  Trying to access array offset on value of type null in /var/www/html/bztfdevmoodle/delfiledump.php on line 22

The cause must be obvious to the developers, who master Moodle's "context".
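
A possible way around the notices: the first argument of get_file() is the numeric context id itself, so the value from the dump can be passed as it is and no $context object is needed. The notices above appear because $context is never defined in the script. Below is a minimal sketch under that reading: delete_listed_files() is a hypothetical helper, $fs is the object returned by get_file_storage() in a script that has already loaded config.php, and the dump is assumed to have the nine columns from step A with no commas inside file names.

```php
<?php
/**
 * Sketch: look up and delete the files listed in a dump, passing the context
 * id from the dump straight to get_file().
 * delete_listed_files() is a hypothetical helper, not part of Moodle.
 */
function delete_listed_files($fs, iterable $lines, bool $dryrun = true): array {
    $result = ['found' => 0, 'notfound' => 0];
    foreach ($lines as $line) {
        // Drop the trailing newline, then split into the nine dump columns.
        $f = str_getcsv(rtrim($line, "\r\n"), ',', '"', '\\');
        if (count($f) !== 9 || !is_numeric($f[0])) {
            continue; // skip header, blank and malformed lines
        }
        // $f[3] is the context id: a plain integer, not a context object.
        $file = $fs->get_file((int) $f[3], $f[4], $f[5], (int) $f[6], $f[7], $f[8]);
        if ($file) {
            $result['found']++;
            if (!$dryrun) {
                $file->delete();
            }
        } else {
            $result['notfound']++;
        }
    }
    return $result;
}
```

Called as delete_listed_files(get_file_storage(), file('filedump.txt')), it would only count matches; passing false as the third argument would delete them.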