Multiple discrepancies between mdl_files & moodledata

by Michael Shen -

Hi everyone

Been searching the forums/tracker for days and not getting anywhere (though occasionally getting close).

I've recently taken over as the sysadmin of a Moodle site which looks fairly well run. The only issue is that we're exceeding the hosting plan's data allowance, so I began looking at how to reduce our usage.

The obvious low-hanging fruit was the bulky annotated PDFs and student-submitted PDFs, which take up the majority of the space. We took advantage of the new school term by taking local backups and then clearing all assignments in preparation for a fresh intake, while keeping data only for the prior term's courses. I also made sure the recycle bin and trashdirs are empty.

While there was a small reduction (from around 35 GB to 29 GB), I wanted to get this below 15 GB, which should be possible given the fairly 'fresh' start.

After spot-checking mdl_files (where I initially ran an SQL query for all large files) against moodledata, I realised that many files exist in moodledata but not in mdl_files. I have no idea just how many, given the thousands of subfolders. The implication is that a lot of files in moodledata are not referenced by any existing component of the Moodle site (assuming the DB is the accurate 'source of truth'). The secondary implication is that they are potentially safe to delete, as removing them shouldn't break anything on the site's frontend.

So my questions are

1) Is my assumption correct that files not referenced by mdl_files in the DB are safe to delete? (Bold, I know.)

2) Is there a way to do a diff-style query that separates the files in moodledata that have explicit references in the DB from those that don't? And then to uncover what their filenames are?
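
To make question 2 concrete, this is roughly the comparison I have in mind. It is only a sketch and rests on assumptions I'd welcome corrections on: that stored files live under moodledata/filedir in nested two-character subfolders, that each file on disk is named after the contenthash value (a 40-character SHA-1) in mdl_files, and that I first export the hashes with something like SELECT DISTINCT contenthash FROM mdl_files; into a text file, one per line. The function names and the export file are my own placeholders, not anything Moodle provides.

```python
import os
import re

# Assumption: files on disk are named after their 40-character SHA-1 contenthash.
SHA1_NAME = re.compile(r"[0-9a-f]{40}")


def load_referenced_hashes(export_path):
    """Read one contenthash per line from a text export of mdl_files (hypothetical file)."""
    with open(export_path, encoding="utf-8") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def find_unreferenced(filedir, referenced):
    """Return {contenthash: (path, size_in_bytes)} for files on disk not listed in `referenced`."""
    orphans = {}
    for root, _dirs, names in os.walk(filedir):
        for name in names:
            # Skip anything that is not a hash-named file (e.g. a readme or warning file).
            if SHA1_NAME.fullmatch(name) and name not in referenced:
                path = os.path.join(root, name)
                orphans[name] = (path, os.path.getsize(path))
    return orphans


# Example use (both paths are placeholders):
#   referenced = load_referenced_hashes("contenthashes.txt")
#   orphans = find_unreferenced("/path/to/moodledata/filedir", referenced)
#   total_bytes = sum(size for _path, size in orphans.values())
```

This only reports candidates and deletes nothing. If the naming assumption holds, a file with no row in mdl_files would have no original filename recorded anywhere in the DB, which may be why I can't work out what these files are.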

My current theory is that these are leftover files from a migration done about a year ago from a local server to a professionally managed service. However, this theory has holes like Swiss cheese: I can easily find many files in moodledata (and not in mdl_files) that are less than 6 months old, so they must have been created after the migration.

Stats below:

Moodle 3.5.13+ (Build: 20200822) [Linux]

PHP 5.6.4

MySQL 5.6.49

Thanks
Michael
