Assignment files

by Fred Riley -

Hi to Moodle folk

When a user submits assignment files, their details are stored in the files table. The handy File API Internals doc has detailed information on this table, and although it's marked as v2.0 I'm assuming that it's still valid for v2.6. What I need to find out is which files in the table are related to assignments. I thought that maybe I could use the files.component column, which has strings such as assignfeedback_editpdf, and search for the substring "assign", but this fails because even simple images (e.g. smile.png) are stored as being part of such a component.

Another possibility might be the files.filearea column, which contains strings such as "submissions".

I've two related questions which I'd be grateful if someone would answer:

1. What's the best way of differentiating assignment files from other files?

2. Is there a doc which lists the values that files.component and files.filearea can have?

This is not unrelated to a post by Jez H today...

I'm happy to RTFM if someone can point me to the relevant bits of TFM. I'm an old Moodle hand as an admin and course designer, but a naive newbie to Moodle development.

A tangential question: does Moodle use foreign keys at all, or is referential integrity enforced in code?

Cheers

Fred

In reply to Fred Riley

Re: Assignment files

by Stuart Mealor -

No foreign keys in Moodle (which surprised me when I first worked that out).
There was some talk about this in the developer forums, but I think it went in the too-hard basket.

In reply to Stuart Mealor

Re: Assignment files

by Fred Riley -

That's useful to know about FKs. I knew one very good PHP developer who strongly believed that referential integrity should be enforced in code, not structure, which was contrary to how I'd been formally taught database design. If it works, it works, but it's a minor nuisance when trying to understand a complex DB because you can't, from the design alone, see the table relations.

Cheers

Fred

In reply to Fred Riley

Re: Assignment files

by Davo Smith -

Each assignment submission plugin can decide which component to use to store the files.

If you want to find submitted files, then you need to search for the name of the component that submitted the files. In core Moodle the component will be 'assignsubmission_file' and the filearea will be 'submission_files' (if you install any 3rd-party submission plugins then they will choose their own name for the 'filearea' field, but the 'component' field will match the name of the plugin).

As for question 2, it is up to every plugin to choose its own names for the 'filearea' (as only the plugin itself knows what different file areas it will use). These are not published anywhere publicly (other than in the code), as they are internal implementation details for the plugin and shouldn't be needed anywhere outside the plugin itself. If you want to use code to get all the files submitted via a particular plugin, you can use the plugin's 'get_files' function to retrieve them. The 'component' field should match the frankenstyle name of the plugin that saved it (e.g. 'mod_forum', 'block_html', 'assignsubmission_file', etc.).

As an aside, you will find lots of images stored in the 'assignfeedback_editpdf' component, as that is where the generated images of each page of the PDF are stored. The 'smile.png' file you found was probably an image added either to an assignment 'description' or as part of an online text submission from a student.

In reply to Davo Smith

Re: Assignment files

by Fred Riley -

Thanks for the reply, Davo

The detail is useful, and I'm trying to digest it with my current poor state of knowledge of back-end Moodle. Essentially:

  • plugins do their own things (within Moodle rules)
  • each plugin has a get_files() method to retrieve files it's stored
  • there's no easy way to determine if a file in the files table is from a submitted assignment

Does that sum it up?

Speaking to a colleague, it appears that assignment submissions are usually handled by the mod_assign core module/plugin, so I'll have a look at that.

Cheers

Fred

PS: That's one scary avatar you've got there! Cybermen really gave me the willies when I was a kid.

In reply to Fred Riley

Re: Assignment files

by Davo Smith -
  • Plugins do their own thing, as long as they set the 'component' to their frankenstyle name; the 'filearea', 'itemid' and, to a certain extent, 'filepath' and 'filename' are all chosen by the plugin.
  • Assign plugins (both feedback and submission) have a function called get_files, but it only does something useful if the developer implements the code within it, and not all assignment plugins have files they can return. Other plugin types do not have such a function (unless they define it for themselves).
  • If the only plugin submitting files on your site is 'assignsubmission_file', then it is very easy to determine if a file has been submitted for an assignment - the component will be 'assignsubmission_file' and the filearea will be 'submission_files' and the itemid will match the id for the record in 'mdl_assign_submission'; the only thing that isn't easy is to generalise this if you have other plugins also submitting files - the only consistent method is to use the get_files function
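To illustrate the simple case in the last point, here is a minimal plain-PHP sketch that filters rows shaped like mdl_files records down to core file submissions. It is only an illustration: the helper name is_core_assign_submission() and the sample rows are made up, and on a real site the rows would come from Moodle rather than a hard-coded array. It also skips the directory placeholder entries that the files table stores with a filename of '.'.

```php
<?php
// Hypothetical helper (not part of Moodle): true if a row shaped like an
// mdl_files record was stored by the core 'assignsubmission_file' plugin.
function is_core_assign_submission(array $row): bool {
    return $row['component'] === 'assignsubmission_file'
        && $row['filearea'] === 'submission_files'
        && $row['filename'] !== '.'; // ignore directory placeholder entries
}

// Invented sample rows, only for demonstration.
$rows = [
    ['component' => 'assignsubmission_file', 'filearea' => 'submission_files', 'itemid' => 42, 'filename' => 'essay.docx'],
    ['component' => 'assignsubmission_file', 'filearea' => 'submission_files', 'itemid' => 42, 'filename' => '.'],
    ['component' => 'assignfeedback_editpdf', 'filearea' => 'example_area', 'itemid' => 7, 'filename' => 'page0.png'],
    ['component' => 'mod_assign', 'filearea' => 'example_area', 'itemid' => 0, 'filename' => 'smile.png'],
];

$submissions = array_values(array_filter($rows, 'is_core_assign_submission'));

foreach ($submissions as $row) {
    // itemid should match an id in mdl_assign_submission, as described above.
    echo "submission {$row['itemid']}: {$row['filename']}\n";
}
```

This does not generalise to third-party submission plugins, which is why get_files remains the only consistent method.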

PS The avatar is just a picture of me ... with a few bits of extra cardboard, papier mache and silver spray paint (and the autograph of a relevant actor on the back).

In reply to Fred Riley

Re: Assignment files

by Jez H -

Assuming it's possible to identify assignment files (which I think it is), would it be possible to modify entries in the table to point at "placeholder" or "dummy" files?

So we will be getting perhaps 100,000 assignment files per year in Moodle which we want to clear down.

Resetting the course is no good as we want students to be able to refer back to past forum discussions etc. later in their course, but we do want our storage space back after x period of time.

One idea is to create a reference (entry in files table) to a pdf with text along the lines of:

"Looks like your old assignment was deleted in line with retention policy..."

Then write a script that deletes old assignment files from the file system and updates the files table, setting references to the deleted file to point at the placeholder / notice pdf. Thus we don't end up with broken links or missing file references, and users who do click through to their old assignment get some information as to why it's no longer there.

The main issue seems to be pathnamehash as it is:

  1. Unique
  2. Contains the file name / extension (/filename.ext)

So it would seem we cannot simply re-use the value from our dummy / placeholder file and cannot leave it as it is because the /filename.ext will be different. We will be removing things like "my-stunning-assignment.docx" and replacing them with "we-culled-your-out-of-date-file.pdf".

One way would be to regenerate pathnamehash with the new $filename = "we-culled-your-out-of-date-file.pdf"; using:

lib/filestorage/file_storage.php

public static function get_pathname_hash($contextid, $component, $filearea, $itemid, $filepath, $filename) {
    return sha1("/$contextid/$component/$filearea/$itemid".$filepath.$filename);
}

Every other field is already in the file table so it should be relatively easy to regenerate that?
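As a rough sketch of what regenerating the hash would involve, the snippet below reimplements the quoted formula as a standalone function and shows that changing only the filename yields a different pathnamehash. The context id, item id and file path are made-up example values, and the function name pathname_hash() is mine. It only computes hashes and does not touch the files table.

```php
<?php
// Standalone copy of the formula quoted above, for illustration only.
function pathname_hash(int $contextid, string $component, string $filearea,
                       int $itemid, string $filepath, string $filename): string {
    return sha1("/$contextid/$component/$filearea/$itemid" . $filepath . $filename);
}

// Made-up example values; a real script would read them from the files record.
$old = pathname_hash(123, 'assignsubmission_file', 'submission_files', 42, '/',
    'my-stunning-assignment.docx');
$new = pathname_hash(123, 'assignsubmission_file', 'submission_files', 42, '/',
    'we-culled-your-out-of-date-file.pdf');

echo "old: $old\n";
echo "new: $new\n";
echo ($old === $new) ? "hashes match\n" : "hashes differ\n";
```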

I hope I am making some sense here. In a nutshell, we need to delete 100k assignments per year without resetting the course, orphaning records or showing students broken links.

Sanity checks would be most appreciated!

 

In reply to Jez H

Re: Assignment files

by Davo Smith -

Major disclaimer - messing around with the contents of mdl_files is a very, very bad idea.

Now that is out of the way, the link to the file storage is not 'pathnamehash', but 'contenthash'. 'pathnamehash' is a way of quickly retrieving the file if you know the full details of it; 'contenthash' is the way of uniquely linking to the data on the server (it works such that 2 files with identical content will be stored only once).
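The "stored only once" behaviour can be pictured with a toy in-memory model. This is not Moodle code, just a sketch: the add_file() helper, the arrays and the sample paths are all invented, and the only real detail it relies on is that both hashes are SHA-1 digests.

```php
<?php
// Toy model, not Moodle code: $files holds one record per pathnamehash,
// $pool holds one blob per contenthash (SHA-1 of the bytes).
function add_file(array &$files, array &$pool, string $path, string $content): void {
    $contenthash = sha1($content);
    $pool[$contenthash] = $content; // identical content lands in the same slot
    $files[sha1($path)] = ['path' => $path, 'contenthash' => $contenthash];
}

$files = [];
$pool = [];

// Invented sample paths and contents.
add_file($files, $pool, '/1/assignsubmission_file/submission_files/10/essay.docx', 'same bytes');
add_file($files, $pool, '/1/assignsubmission_file/submission_files/11/copy.docx', 'same bytes');
add_file($files, $pool, '/1/assignsubmission_file/submission_files/12/other.docx', 'different bytes');

echo count($files) . " file records, " . count($pool) . " stored blobs\n";
// prints "3 file records, 2 stored blobs"
```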

If you are wanting to replace files with dummy content, then you really want to be using the files API to do this:

$fs = get_file_storage();
$dummyfile = $fs->get_file_by_hash($dummypathnamehash);
$file = $fs->get_file_by_hash($pathnamehash);
$file->replace_content_with($dummyfile);

This code assumes that:
a) You've already created the dummy file using, for example, $fs->create_file_from_pathname() OR $fs->create_file_from_string()
b) You have extracted/generated the $dummypathnamehash that refers to this file.
c) You have extracted/generated the $pathnamehash for the file you want to replace.

 

In reply to Davo Smith

Re: Assignment files

by Jez H -

Thanks Davo,

I understood the difference between pathnamehash and contenthash; it is the fact that the pathnamehash contains the filename / suffix that seemed like it could cause a problem.

We have no control over the method used to retrieve the files, so if mod_assign uses the express method (pathnamehash) now or in the future, that needs to be supported.

Thanks very much for the tips on the file API, I will have to take a closer look at how that works, specifically what would:

$file->replace_content_with($dummyfile);

actually replace...

Your a, b, c assumptions are all correct: we would have a single known "dummy" file, and not actually a dummy but something sensible. The equivalent of a 404 page for deleted assignments :)

In reply to Jez H

Re: Assignment files

by Davo Smith -

$file->replace_content_with($dummyfile);

Would:

  1. update the mdl_files record, so that, whilst the filename would appear unchanged, the content of the file (when downloaded) would be the content of the dummy file
  2. check to see if any other entries in mdl_files were using the same content as the original file and, if that was the last reference to it, delete the actual file from the server
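Those two steps can be mimicked in a small standalone model. To be clear, this is an illustrative sketch and not how Moodle implements it: the replace_content() function, the arrays and the sample data are all hypothetical.

```php
<?php
// Toy model, not Moodle's implementation. $files maps a record id to its
// filename and contenthash; $pool maps a contenthash to the stored bytes.
function replace_content(array &$files, array &$pool, int $id, string $newcontent): void {
    $oldhash = $files[$id]['contenthash'];
    $newhash = sha1($newcontent);
    $pool[$newhash] = $newcontent;
    $files[$id]['contenthash'] = $newhash; // step 1: filename stays unchanged

    // Step 2: drop the old bytes only if no record still points at them.
    foreach ($files as $record) {
        if ($record['contenthash'] === $oldhash) {
            return;
        }
    }
    unset($pool[$oldhash]);
}

// Invented sample data: two records sharing the same content.
$essay = 'essay bytes';
$notice = 'This file was removed in line with the retention policy.';
$pool = [sha1($essay) => $essay];
$files = [
    1 => ['filename' => 'a.docx', 'contenthash' => sha1($essay)],
    2 => ['filename' => 'b.docx', 'contenthash' => sha1($essay)],
];

replace_content($files, $pool, 1, $notice);
$keptafterfirst = isset($pool[sha1($essay)]);  // record 2 still uses the essay

replace_content($files, $pool, 2, $notice);
$keptaftersecond = isset($pool[sha1($essay)]); // last reference is gone

var_dump($keptafterfirst, $keptaftersecond, $files[1]['filename']);
```

Both records end up sharing the notice content, and the original bytes disappear only when the second record is replaced.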

The major disclaimer was about making any changes directly to mdl_files (or the files on the disk). Using the files API to make those changes (as I outlined above) should be safe.

If you wanted you could also rename the file, with something like this:

$filename = $file->get_filename();
$newfilename = do_something_to_the_filename($filename);
$file->rename($file->get_filepath(), $newfilename);

 

In reply to Davo Smith

Re: Assignment files

by Jez H -

Hi Davo,

Thanks very much for your reply. I am not sure about this:

1. update the mdl_files record, so that, whilst the filename would appear unchanged, the content of the file (when downloaded) would be the content of the dummy file

If the filename remains the same would the file still open?

By that I mean, if the original document was docx and we replace it with the contenthash of a PDF, wouldn't the suffix / mime type both need to be updated in order for the browser to handle the file correctly?

 

In reply to Davo Smith

Re: Assignment files

by Jez H -

Oh, regarding the major disclaimer: are you able to suggest an alternative approach to clearing down assignment files independently of the rest of the course content / grades etc?