Context - I work with both dedicated physical servers running 10+ Moodle sites each and with clients wanting a Moodle install on a VM. In the former case, copying the moodledata folder is advised when upgrading, and that copy can take 40+ minutes. In the latter case disc space is a significant cost factor.
Example case - one install has a moodledata folder of 21G - assuming my SQL is correct, 8.4G is duplicated files (this test is based on mdl_files.contenthash), perhaps arising from copied courses or the same resource in multiple courses.
Could the resource file upload process check for duplication (contenthash) and, in mdl_files, use the same pathnamehash for the new record?
On deletion of a course or resource, could the process check whether another file record uses the same contenthash and, if so, keep the file on disc while still removing the mdl_files record?
> one install has a moodledata folder of 21G - assuming my sql is correct 8.4G is duplicated files (this test based on mdl_files.contenthash)
Any documentation of your method?
> perhaps arising from copied courses or same resource in multiple courses.
As Rex already pointed out, the repository is made to keep only a single copy of a file across the whole site. That said, there have been cases of Moodle wasting disk space. Before diving into that, we need to be certain that your Moodle really is not behaving the way it is supposed to.
Thanks all for the information and explanations.
I had not understood that the location of a file in the moodledata folder is determined by its contenthash. That misunderstanding led to my incorrect claim of duplicated files.
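For anyone wanting to repeat the measurement, here is a minimal sketch of the arithmetic in Python. The rows are made-up (contenthash, filesize) pairs standing in for mdl_files records, and storage_summary is a hypothetical helper, not Moodle code. It assumes that records sharing a contenthash share one stored file, as explained earlier in the thread.

```python
def storage_summary(rows):
    """Return (logical_bytes, physical_bytes, apparent_duplicate_bytes).

    rows: iterable of (contenthash, filesize) pairs, one per file record.
    Assumption: every record with the same contenthash points at the
    same single file on disc, so it is only stored once.
    """
    rows = list(rows)
    # Logical size: what you get by summing filesize over every record.
    logical = sum(size for _, size in rows)
    # Physical size: each distinct contenthash is counted once.
    size_by_hash = {}
    for contenthash, size in rows:
        size_by_hash[contenthash] = size
    physical = sum(size_by_hash.values())
    # The difference looks like duplication in the table,
    # but takes up no extra space on disc.
    return logical, physical, logical - physical


# Hypothetical data: the same 100-byte file used in two courses,
# plus one unrelated 50-byte file.
example_rows = [("aaa", 100), ("aaa", 100), ("bbb", 50)]
logical, physical, apparent = storage_summary(example_rows)
```

If the on-disc size of moodledata is close to the physical figure rather than the logical one, the repeated contenthash values are shared references and not wasted space.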
And just to add to what others have already stated: contenthash is used to determine where the file is stored within moodledata, whereas pathnamehash is a hash of the identifiers in mdl_files and is used to uniquely identify that instance of the file in the Moodle code. The id field is avoided because it would change if the file was deleted and recreated, whereas the pathnamehash stays unchanged if an updated version of a file is uploaded to the same location in Moodle with the same filename.
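To make the difference between the two hashes concrete, here is a rough Python sketch. It is not Moodle's code: the function names are made up for illustration. It assumes that contenthash is the SHA-1 of the file contents, that files live under filedir/ in subfolders named after the first two pairs of hash characters, and that pathnamehash is the SHA-1 of "/contextid/component/filearea/itemid" followed by the file path and name, which is how I understand Moodle's file storage to work. Please check against your own Moodle version.

```python
import hashlib
from pathlib import PurePosixPath


def content_hash(data: bytes) -> str:
    # Assumption: contenthash is the SHA-1 of the raw file contents.
    return hashlib.sha1(data).hexdigest()


def filedir_path(contenthash: str) -> PurePosixPath:
    # Assumption: layout is filedir/<chars 1-2>/<chars 3-4>/<full hash>,
    # relative to the moodledata folder.
    return PurePosixPath("filedir") / contenthash[:2] / contenthash[2:4] / contenthash


def pathname_hash(contextid, component, filearea, itemid, filepath, filename) -> str:
    # Assumption: pathnamehash is the SHA-1 of the identifiers that say
    # where the file is used, not of what the file contains.
    full = f"/{contextid}/{component}/{filearea}/{itemid}{filepath}{filename}"
    return hashlib.sha1(full.encode("utf-8")).hexdigest()


# The same PDF used in two courses (hypothetical context ids 10 and 20).
pdf = b"same bytes in both courses"
stored_at = filedir_path(content_hash(pdf))
record_a = pathname_hash(10, "mod_resource", "content", 0, "/", "notes.pdf")
record_b = pathname_hash(20, "mod_resource", "content", 0, "/", "notes.pdf")
```

Both records end up pointing at the single file at stored_at, while record_a and record_b differ because the context differs. Uploading new contents under the same identifiers would change the contenthash (and so the storage location) but leave the pathnamehash as it was.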