Course deletion not freeing disk space

by Artur Welp -
Number of replies: 5

Hello to all... Yet another question about disk space on the forum


Our Moodle installation is almost 15 years old; the first graduation category dates back to 2005. Over that period we scaled from 500GB to 1TB and then to 2TB of disk space. Now it is time for a cleanup. Legislation in Brazil requires the files to be kept for 5 years, so we decided to remove all courses older than 2015. Courses are placed in categories according to the period in which they ran.


So I decided to use the Moodle API to remove the old courses. The Moodle version is 3.5.9, and this is the script I used:

<?php

define('CLI_SCRIPT', true);
require_once(__DIR__.'/../../../config.php');

// IDs of the categories whose courses should be deleted.
$category_list = [1, 2, 3, 4, 5 ...];
$category_list = implode(', ', $category_list);
$courses = $DB->get_records_sql("SELECT mco.id, mco.fullname, mcc.name
                                   FROM {course} mco
                                   JOIN {course_categories} mcc ON mcc.id = mco.category
                                  WHERE mco.category IN ($category_list)");
$amount = count($courses);
$item = 1;
foreach ($courses as $course) {
    echo "Category $course->name -- Course $course->fullname $item / $amount \n";
    try {
        delete_course($course->id);
    } catch (Exception $e) {
        // Report the failure and carry on with the next course.
        echo "Failed to delete course $course->id: " . $e->getMessage() . "\n";
    }
    $item++;
}


I noted down the disk usage, ran the script on a few categories as a test, and then ran the scheduled tasks that clean up unused files:


php admin/tool/task/cli/schedule_task.php --execute='\core\task\file_temp_cleanup_task'
php admin/tool/task/cli/schedule_task.php --execute='\core\task\file_trash_cleanup_task'
php admin/tool/task/cli/schedule_task.php --execute='\core_files\task\conversion_cleanup_task'

Before the first test the system had 27289 courses and was using 1,7T / 2,0T

#df -h | grep /var
/dev/sdb1       2,0T  1,7T  217G  89% /var
#Postgres Select
moodle=# select count(id) from mdl_course;
 count
-------
 27289
(1 row)


After the test, the system had 20395 courses (6894 fewer) and somehow the disk usage had increased to 1,9T / 2,0T


#df -h | grep /var
/dev/sdb1       2,0T  1,9T   45G  98% /var
#Postgres Select
moodle=# select count(id) from mdl_course;
 count
-------
 20395
(1 row)
So, what is the correct way of removing old courses to free disk space? Preferably without moosh.

In reply to Artur Welp

Re: Course deletion not freeing disk space

by Ken Task -

Please see:

https://docs.moodle.org/35/en/Recycle_bin

and it has carried forward to 3.9/3.10 so

https://docs.moodle.org/310/en/Recycle_bin

Have been bit by recyclebin ... so now I have it displayed by default in course (course admin menu) and set time to keep to lowest (least time) values.

As far as I know, there is no admin tool to show orphaned files ... exist in moodledata/filedir/ but not in mdl_files table ... nor vice versa ... so you might need to use moosh (even if you don't like it) cause it does have a command to find orphaned files.

Yeah, I know ... with the new file system since version 2.0 of Moodle, such things are not supposed to happen ... but ... things happen! :|

'SoS', Ken

In reply to Artur Welp

Re: Course deletion not freeing disk space

by Howard Miller -
Also, Moodle does not delete files immediately. It can take several days for a background task to get to it. I have no idea why it was designed like this either.

You may want to take a look at 'Moodle Moosh'. There are some command line tools that will help.
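
One way to check whether deleted files are still waiting in the trash is to compare the size of the two directories under moodledata. This is a minimal sketch, assuming the default layout in which live files sit under filedir/ and deleted ones under trashdir/. The function name report_sizes and the example path are placeholders, not anything that ships with Moodle.

```shell
#!/bin/sh
# report_sizes DATAROOT
# Prints the size (in KiB) of filedir/ and trashdir/ under the given
# moodledata directory. Assumes the default directory names.
report_sizes() {
    dataroot="$1"
    for sub in filedir trashdir; do
        if [ -d "$dataroot/$sub" ]; then
            du -sk "$dataroot/$sub"
        else
            echo "missing: $dataroot/$sub" >&2
        fi
    done
}
```

For example, `report_sizes /path/to/moodledata` run before and after a cleanup task shows whether the trash actually shrank.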
In reply to Howard Miller

Re: Course deletion not freeing disk space

by Artur Welp -
Yes.

I emptied the trash by running the scheduled task right after the script:
php admin/tool/task/cli/schedule_task.php --execute='\core\task\file_trash_cleanup_task'
I also tried the moosh file commands, but they won't clear any space either...

sudo -u www-data moosh file-dbcheck   
sudo -u www-data moosh file-delete --flush

I am writing a script to remove orphaned files. I will post it soon and tell the results.

Average of ratings: Useful (1)
In reply to Artur Welp

Re: Course deletion not freeing disk space

by Ken Task -

mysql> select contenthash,filename,filesize,component  from mdl_files where filename like '%.mbz';

See any .mbz files whose component is recyclebin?

There are 2 scheduled tasks specifically for recyclebin backups.

\tool_recyclebin\task\cleanup_category_bin

\tool_recyclebin\task\cleanup_course_bin
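
Both tasks can be run by hand with the same CLI runner used earlier in the thread. This is only a sketch: it assumes it is started from the Moodle root as the web server user, and the function name run_recyclebin_tasks is made up for illustration. Note that these tasks only remove items older than the retention periods configured for the recycle bin, so they may do nothing until those settings are lowered.

```shell
#!/bin/sh
# run_recyclebin_tasks [RUNNER]
# Runs the two recycle bin cleanup tasks one after the other.
# RUNNER defaults to Moodle's scheduled task CLI, relative to the Moodle root.
run_recyclebin_tasks() {
    runner="${1:-php admin/tool/task/cli/schedule_task.php}"
    for task in '\tool_recyclebin\task\cleanup_category_bin' \
                '\tool_recyclebin\task\cleanup_course_bin'; do
        # $runner is deliberately unquoted so that it splits into
        # the command and its arguments.
        $runner --execute="$task"
    done
}
```

Something like `sudo -u www-data sh -c '. ./tasks.sh; run_recyclebin_tasks'` would run it as the web server user (tasks.sh being whatever file the function was saved in).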

'SoS', Ken


In reply to Artur Welp

Re: Course deletion not freeing disk space

by Artur Welp -

Update 1


The delete_course function actually sends the course to the recycle bin. Each deleted course is kept there as a backup, which presumably is why the disk usage went up instead of down. The cron tasks I listed run once a day, not once a week, but running them did nothing, because the recycle bin cleanup uses the recycle bin settings to decide how long items are kept. (Ashamed of missing this part.)

I updated the script from my first post so that it empties the category recycle bin and then runs the cleanup that clears the trashdir.

<?php
define('CLI_SCRIPT', true);
// Load required files
require_once(__DIR__.'/../../../config.php');
require_once($CFG->libdir . '/filelib.php');
require_once($CFG->libdir . '/filestorage/file_system.php');
require_once($CFG->libdir . '/filestorage/file_system_filedir.php');
// Define the CLI script user -- required to empty the bin.
// User id 2 is assumed to be the site administrator.
\core\session\manager::init_empty_session();
\core\session\manager::set_user($DB->get_record('user', ['id' => 2]));
$categories = [...];
// Prepare and load courses
$categories_list = implode(', ', $categories);
$courses = $DB->get_records_sql("SELECT mco.id, mco.fullname, mcc.name
                                   FROM {course} mco
                                   JOIN {course_categories} mcc ON mcc.id = mco.category
                                  WHERE mco.category IN ($categories_list)");
$amount = count($courses);
$item = 1;

// Delete the courses (this moves them to the category recycle bin)
foreach ($courses as $course) {
    echo "Category $course->name -- Course $course->fullname $item / $amount \n";
    try {
        delete_course($course->id);
    } catch (Exception $e) {
        echo "Failed to delete course $course->id: " . $e->getMessage() . "\n";
    }
    $item++;
}

// Clear the category recycle bin
foreach ($categories as $category) {
    try {
        $context = context_coursecat::instance($category);
        $recyclebin = new \tool_recyclebin\category_bin($context->instanceid);
        $recyclebin->delete_all_items();
    } catch (Exception $e) {
        echo "Failed to empty the bin of category $category: " . $e->getMessage() . "\n";
    }
}

// Delete the files left in the trashdir
$fs = new file_system_filedir();
$fs->cron();

----------------

I also wrote a script to search for and delete files that exist on disk but not in the files table of the database. Somehow I had more than a hundred files in this state.

<?php

define('CLI_SCRIPT', true);
require_once(__DIR__.'/../../../config.php');

// filedir layout: filedir/<hash chars 1-2>/<hash chars 3-4>/<contenthash>
$first_level_folders = glob($CFG->dataroot . '/filedir/*', GLOB_ONLYDIR);
foreach ($first_level_folders as $firstlevel) {
    $second_level_folders = glob("$firstlevel/*", GLOB_ONLYDIR);

    foreach ($second_level_folders as $secondlevel) {
        // Map each file name (the contenthash) to its full path.
        $files = [];
        foreach (glob("$secondlevel/*") as $path) {
            $files[basename($path)] = $path;
        }
        if (empty($files)) {
            continue;
        }

        // Load the hashes from this folder that the database still references.
        list($insql, $params) = $DB->get_in_or_equal(array_keys($files));
        $files_on_database = $DB->get_fieldset_sql(
            "SELECT DISTINCT contenthash
               FROM {files}
              WHERE contenthash $insql", $params);

        // Anything on disk that the database does not know about is an orphan.
        $orphans = array_diff_key($files, array_flip($files_on_database));

        foreach ($orphans as $path) {
            echo "Removing Orphan:\n   $path \n";
            unlink($path);
        }
    }
}
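
Since unlink cannot be undone, it may be worth listing the candidates first and reviewing them before deleting anything. The sketch below is a dry run in shell under a few assumptions: the distinct contenthash values have already been exported from the files table into a text file, one per line (for example with a SELECT DISTINCT contenthash query); find supports -mindepth and -maxdepth (GNU and BSD find do); and the names list_orphans and known_hashes.txt are made up for illustration. It only prints paths and removes nothing.

```shell
#!/bin/sh
# list_orphans FILEDIR KNOWN_HASHES
# Prints every file stored as FILEDIR/xx/yy/<contenthash> whose name is
# not listed in KNOWN_HASHES (one contenthash per line). Nothing is deleted.
list_orphans() {
    filedir="$1"
    known="$2"
    find "$filedir" -mindepth 3 -maxdepth 3 -type f |
        awk -F/ -v known="$known" '
            # Load the hashes the database still references.
            BEGIN { while ((getline h < known) > 0) seen[h] = 1 }
            # The last path component is the contenthash (40 characters).
            length($NF) == 40 && !($NF in seen)
        '
}
```

Running `list_orphans /path/to/moodledata/filedir known_hashes.txt > orphans.txt` gives a list that can be checked by hand. Files uploaded after the hash list was exported will show up as false orphans, so the export should be fresh and the site ideally in maintenance mode.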