Windows server: Cron hangs on "Delete old draft files..."

Windows server: Cron hangs on "Delete old draft files..."

by Maxime Taisne -
Number of replies: 18

Hi everyone,

Last week our server's disk crashed and our hosting provider had to reinstall the server (Win 2008, IIS7, MSSQL). Fortunately he could find a backup from the beginning of the week, so the loss of data is quite small.

Now the site runs perfectly except for one important thing: the cron!

As a matter of fact, I tried to launch it manually but the browser hung on a blank screen, ending a few minutes later with an error 500 (fastCGI timeout...).

Without any indication concerning what was happening, I launched the cron in a prompt using this command line: "c:\php\php.exe -f c:\mysite\admin\cli\cron.php".

Everything went well (sending email...done, refreshing rss...done, etc.) until this message: "Delete old draft files...".

Is there a way to manually delete those draft files or to bypass this step?

Thanks in advance for any help! smile

Cheers,
Maxime 

Average of ratings: -
In reply to Maxime Taisne

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi Maxime,

Those activities couldn't be finished as the allowed time was exceeded, however they will be done next time cron runs.

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by Maxime Taisne -

Hi Guillermo,

After several manual attempts, I finally and successfully managed to launch cron.php! smile

Actually it seems that there were too many unprocessed draft files in the system, explaining why so many cron jobs were needed!

Cheers,
Maxime 

In reply to Maxime Taisne

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Good afternoon,

I am having the same issue. I have run the cron.php from Windows Task Scheduler and manually throuigh cmd upwards of twenty times. The cron log has always hung at "deleting old draft files..." Does anyone know how to remedy this issue? My cron job has never completed successfully.  

Thanks,

John

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

What Moodle version? You could start by checking your server error log.

As unused files (draft files and/or deleted files) are not deleted immediately, but remain for a period of time (4 days?) before being deleted by the cron process, you might have quite a number of draft files and so the process might be timing-out.

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hello Guillermo,

Thank you for the response. Here is some more info on my setup:

Moodle Version: 2.2

OS: Windows Server 2008 R2

DB: SQL Server 2008 R2

PHP version: 5.3.10

IIS version: 7.5

I did check the server logs, however; there is no evidence of any errors or problems that I can see. You mentioned that it takes 4 days to delete old draft files, how should the cron job be timed to coincide with this? I have been trying to run the cron job every night at 12:00 am, but no matter how long I leave it running, it never passes "Deleting old draft fles". If I run multiple simultaneous instances of the cron job (Using the task scheduler) it is the same outcome. 

Thank you,

John 

 

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

From what I understand, a file won't be deleted until its "grace" period has lapsed, so if an unused file has a time-stamp July 2, it will be deleted four days later.

Does your cron report show the same files being processed? I think that the first files shown should be different, assuming that several days have transcurred.

Maybe you could try and increase the max_execution_time, max_input_time and memory_limit parameters of the php.ini configuration file. You might also want to check the FastCGI's maximum ActivityTimeout.

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hello Guillermo,

 

Thanks again for the reply. Here is what I have tried so far, I have changed the following entries to the values below:

 

max_execution_time = 3600

max_input_time = 7200

memory_limit = 2048M

FastCGI ActivityTimeout = 7200

If I try to move the values any higher, the website will not load- and a 500 internal server error occurs.

When I run the cron, I still See "Deleting Old Draft Files..." and it never moves on from there. I do not see any evenidence or report of any file(s) being processed. Where would I find the cron log? Currently (for testing) I output the results to the DOS prompt.

Thanks again,

John

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

Well, those values should really be more than enough. And yes, it's not good to have them so high all the time: they should be increased only when needed (e.g. when a backup process needs more time to finish) and then brought back to normal operating values.

I'm not into PHP programming, but looking at the code:

mtrace('Deleting old draft files... ', '');
$old = time() - 60*60*24*4;
$sql = "SELECT *
  FROM {files}
  WHERE component = 'user' AND filearea = 'draft' AND filepath = '/' AND filename = '.'
  AND timecreated < :old";
$rs = $DB->get_recordset_sql($sql, array('old'=>$old));
foreach ($rs as $dir) {
$this->delete_area_files($dir->contextid, $dir->component, $dir->filearea, $dir->itemid);
}
$rs->close();
mtrace('done.');

I notice that it doesn't trace which files are being processed. From this code though, you could try and see if you notice something strange, either in the DB or the files themselves.

You might want to start with the following SQL command:

SELECT *
FROM `mdl_files`
WHERE component = 'user'
AND filearea = 'draft'
AND filepath = '/'
AND filename = '.'
AND timecreated < timelimit;

If you defined some other table prefix (by default is "mdl"), then correct the red colored text.

To define the "timelimit" value (current time minus four days) you can use the Epoch Converter page. If I were to run this SQL right now, the value of timelimit would be: 1338783408.

This SQL statement will let you see how many (and which) files are to be processed by Cron.

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hello Guillermo,

Thanks so much for all the great information. I will try your SQL command and let you know the outcome.

Thanks again,

John

 

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hello Guillermo,

I have run the query you have given me with values obtained from an online Epoch converter utility. The query returns 318 rows. I then let the cron run again for one hour. After terminating the cron, 318 rows were still returned from your query. It seems like the cron is not deleting anything from the mdl_files table. 

In your previous post you mention "if you see anything strange in the db or files"- where do the draft files actually exist? The database only refferences the GUID or pointer to these files with the filepath represented as / . Do these files have an actual Windows location, or must they be deleted from the db itself through the cron script? Also, the issue does not seem permissions related, as there are no audit/security failure logs or messages on the SQL DB.

Thanks for your help,

John

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

Unless I'm missing some particular issue, it really seems strange that the cleanup process is not working. I just made two tests at my local installation (Win XP, Apache 2.2.20, MySQL 5.5.8, PHP 5.3.5, Moodle 2.2.1 build: 20120109), and everything ran as it should.

Before I describe the tests I did, maybe you would like to read the following post I made, just to get familiar with some aspects of the mdl_files table and the underlying file system:

http://moodle.org/mod/forum/discuss.php?d=203204#p889847

Well, to test the cron system and the code, I began by creating a new page with some text, saved it and then edited it to add a new image; however, once I uploaded the png file, I clicked Cancel on both the "Insert/edit image" window and the "Updating Page in Topic 1" page. This of course left me with an unused image file (or a draft file).

When I went to browse the mdl_files table, two new entries were there (I still don't understand the reason behind many of the intricacies and mechanisms of the mdl_files table and of the file system, being one of them why an empy file is created along with the corresponding table entry, but that's how it is):

id            157          158
contenthash   ded4b5...    da39a3...
pathnamehash  084da3...    3ce712...
contextid     5            5
component     user         user
filearea      draft        draft
itemid        517639160    517639160
filepath      /            /
filename      imiamec.png  .
filesize      5289         0
timecreated   1339475765   1339475765

As you can see, the first entry (id=157) corresponds with the file that was actually uploaded, as it has the original file name and a non-zero file size, while the second entry doesn't have a name and is an empty file. Evidently, the itemid field creates the relationship between both registers.

Based on the value of the contenthash field, I went and confirmed that the "ded4b5..." file was at moodledata/filedir/de/d4, and that the "da39a3..." (empty) file was at moodledata/filedir/da/39.

To be able to test the deletion of these draft files, I opened file_storage.php (at moodle/lib/filestorage), deleted the "- 60*60*24*4" part of the $old variable, and then manually ran the cron.php script, which ended normally:

Deleting old draft files... done.
Cleaning up files from deleted contexts... done.
Deleting trash files... done.
Cron script completed correctly

Finally I checked the mdl_files table and the two directories: both registers and both files were deleted.

For the second test, I created a new page, uploaded and saved the image and the page. This time, the mdl_files table had four new registers:

id            173         174         175         176
contenthash   580c7a...   da39a3...   580c7a...   da39a3...
pathnamehash  9cd34f...   9e7e6a...   81ddc2...   6a5bea...
contextid     5           5           35          35
component     user        user        mod_page    mod_page
filearea      draft       draft       content     content
itemid        285766602   285766602   0           0
filepath      /           /           /           /
filename      viv.jpg     .           viv.jpg     .
filesize      31745       0           31745       0
mimetype      image/jpeg  NULL        image/jpeg  NULL
timecreated   1339479656  1339479656  1339479656  1339479656

After I ran the cron.php script, only the last two (id=175, 176) remained; those whose filearea=draft, were deleted. Obviously, none of the files ("580c7a..." & "da39a3...") were deleted, as the DB says they are being used (component=mod_page).

I hope some of this will let you find what is going wrong with your installation. If not, maybe you should open a tracker (you need to create an account for this):

http://tracker.moodle.org/

Maybe the problem you're having is related to one of the WISP components (e.g. some particular setting).

---

Funny thing, one thing I just noticed while doing this test that uploaded images are not being shown once I save them!! Neither in IE nor in FF. The link looks Ok (moodle/pluginfile.php/35/mod_page/content/1/viv.jpg) and the file is in its corresponding folder, but the browser just shows the classic white square with a red X dead

If I try to open the image directly (via the image URL), FF says:
The image xxx cannot be displayed, because it contains errors. And the error console reports:

Error: Image corrupt or truncated:
moodle/pluginfile.php/35/mod_page/content/1/viv.jpg

I just checked the SHA1 of both the local and the "uploaded" files, and they are the same, and both browsers display Ok the local file... sigh! (sorry for this extra rant).

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hi Guillermo,

I have read your post very carefully and did a lot of digging around based on your information. My mdl_files table entries look as follows:

contenthash:          da39...

pathnamehash:       385b...

contextid:               15

component:            user

filearea:                  draft

itemid:                    4547120

filepath:                  /

filename:                .

filesize:                  0

 The interesting thing is that all my values in the contenthash column are the same value- they all point to the da\39 directory on my server. This directory in drivename\moodledata\da\39 has a 0byte file with the fuill name of the contenthash value.

- All filesize values show a value of 0 

- All filename values show a value of . 

I modified the filestorage.php and removed the - 60*60*24*4 value, hoping it would delete past 4 days, but that didn't seem to help in any way- the cron never finished. The only customizations made to the site were that I uploaded a custom .png for the site logo for use with the 'Afterburner' theme. Switching themes or deleting the .png didn't help either.

Thanks for the idea of opening a tracker case. I may need to do that if I can't find the solution. Thanks again for your help, and let me know if you have any other ideas.

John

 

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

Yes, that is right, as I mentioned, an empty file with its corresponding table entry is created for each file uploaded, unless all the uploaded files belong to a same course, in which case only one file/entry will be created for the zero sized file.

Have you checked the value of the timecreated field? Does it actually correspond to the date the file was created?

Besides having looked at the server error log and the database logs, have you checked by enabling the Moodle debugging system:

http://docs.moodle.org/22/en/Debugging

If none of the logs give evidence of a problem, then the tracker will be the way to go.

Cheers!

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Hi Guillermo,

The files were created a long time ago. I loaded all the SCORM content to the site and then ran the cron job after all content was loaded. I have a feeling that there are just too many draft files in the table (> 320 now), and the system is timing out trying to delete them. Using procmon I see that PHP is opening the database successfully, after a few minutes there is no other activity and no more entries in procmon related to PHP or the db.

I just want to confirm something you mentioned in your post when you ran the test. You said when the cron finished it deleted all the draft files in the mdl_files table and also in the Windows directory for the entry starting with da39... . When I ran the cron on our test system (which is Redhat not Windows) it did complete successfully but it never deleted the 0 byte file of da39.... . Moreover, the second 'contenthash' entry that was created in mdl_files was nowhere in the filesystem to be found, before or after the cron ran. Are you sure that the cron will delete entires in the db as well as in the file system? If it only deletes in the db, my thought would be to run the SQL delete statement directly from SQL. However, if it also deletes files in the moodle filedir, then I cannot run the SQL delete because that would only get the db entires. Thanks again for all your help.

The other thing I also pondered is - is there any detrimental affect to not deleting draftfiles in the first place- other than storage concerns of course?

John

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hello John,

Just a few comments before I start.

In my second to last entry, I didn't use the SQL query I had previously posted, instead, I directly browsed the mdl_files table to see all the latest entries created during my tests. It would be best if you did the same (directly browse this table), as this will let you see why some entries are not deleted.

Also, in my last entry I mistakenly used the word "created" (Does it actually correspond to the date the file was created?), instead of "uploaded", as the timecreated field contains the timestamp of when the file was uploaded, not of when it was actually created. Sorry for not having been clear about this. My intention with that question was for you to make sure that the server clock setting (or the timezone setting, either in the date.timezone parameter --in the php.ini configuration file-- or in the Default timezone option --in Moodle--) is correct.

You wrote:

> I have a feeling that there are just too many draft files in the table (> 320 now), and the system is timing out trying to delete them.

I really don't think that 320+ entries should cause a system time-out, unless some setting is really low in the database, the server or the php configuration files.

> When I ran the cron on our test system (which is Redhat not Windows) it did complete successfully but it never deleted the 0 byte file of da39.

Beware here! Remember that four entries are created in the database table for each file actually uploaded and used:

  • Two which have the filearea field = draft: one for the real file and another for the zero-sized file (da39).
  • And two which have the filearea field = content (or any other value different from "draft"): once again, one for the real file and another for the zero-sized file (da39).

However, only two files are created in the filesystem: one for the real file and another one for the zero-sized file.

As the "draft cleaning process" only deletes "draft files", then remaining da39 files must belong to a non-draft filearea (e.g. "package", "content").

Just as a side note, when a SCORM package is uploaded, a set of records is created for the SCORM package itself (filearea=package), and a set for each of its components (filearea=content).

> Moreover, the second 'contenthash' entry that was created in mdl_files was nowhere in the filesystem to be found, before or after the cron ran.

I'm not sure what do you mean with the "second 'contenthash'". If you refer to the pathnamehash field, then you didn't find it because that hash is not used to determine the location of the file in the filesystem:

  • contenthash is the SHA1 hash of the file content and it is used to rename the file and to create the two-level directory path (moodledata/filedir/##/##) where it will be saved.
  • pathnamehash is a SHA1 hash created based in the following fields (from the mdl_files table): contextid, component, filearea, itemid, filepath and filename.extension; then, this hash is based on the pathname in which the file is being used inside Moodle (i.e. its context).

> Are you sure that the cron will delete entires in the db as well as in the file system?

  • Database records that have the filearea field = draft, yes, they will be deleted (assuming that the four day period has elapsed).
  • Files in the filesystem will only be deleted if they do not have a database record with filearea field ≠ draft.

If you reread what I wrote about the second test, records with id=173 and id=174 were deleted because they had filearea=draft, but records with id=175 and id=176 were not deleted because they had filearea=content; as these last two records referenced a file actually being used (filearea=content, component=mod_page), then the corresponding files ("580c7a..." & "da39a3...") were not deleted from the filesystem.

> my thought would be to run the SQL delete statement directly from SQL. However, if it also deletes files in the moodle filedir, then I cannot run the SQL delete because that would only get the db entires.

As you have seen, the file deletion process depends on the particular values that a set of records might have.

> is there any detrimental affect to not deleting draftfiles in the first place- other than storage concerns of course?

I don't think there would be any: http://dev.mysql.com/doc/refman/5.0/en/table-size-limit.html

--- A brief recapitulation:

When a file is uploaded, but not used, either two or four entries might be created, depending on how things are done. For example:

  • If I create a new page, upload an image, and then cancel both the "Insert/edit image" window and the page creation process, only two records will be created (as initially described in my second to last post).
  • However, if I create a new page, upload an image, and then only cancel the "Insert/edit image" window, but save the page, then four records will be created. Even more, every time I open the page (which doesn't have the image, as the Insertion process was cancelled) for editing, regardless of whether I save it or not, two additional entries will be created (each time with a different itemid value).
In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by John Cusimano -

Guillermo,

I have an update on the issue. I have used a combination of both Procmon and Network Monitor traces to actually see what is going on behind the scenes. I also looked directly into the SQL tables like you suggested after uploading some pages/pictures as drafts.

When the cron runs, as I mentioned in the previous post, it connects to the Moodle DB and just sits there. The connection to the DB is never closed, PHP never instantiates any new threads or processes, and the batch window remains opened indefinitely. All of the logs and evidence I gathered pointed to a time-out of some sort- even though there is no error message or indication of such.

On my test server I ran a MYSQL command to delete all the files that were marked as draft in mdl_files. I then ran the cron normally and monitored the system for any strange behaviour. The cron completed successfully again after the manual delete. 

I decided to bite the bullet and do this on the soon to be prod server. I ran the SQL command and deleted all the files from the mdl_files table that were marked as draft. I then uploaded some pictures and pages to the site and refreshed the table to see some new draft files. Then, I ran the cron job again through Windows and ... low and behold ... IT WORKED! 

The only conclusion that I can come to is that there were too many files for the cron to process. However, that is only a guess. Moreover, like you said in your reply, why wouldn't the system be able to delete 300 entries, especially with all the PHP, IIS and .ini values increased?! I guess we will never know for sure. As far as the remnants of files that might be left in the file system from the drafts- I will not know either. I will continue to monitor the system to see if anything weird happens, but so far there is nothing to report.

In the very first entry of this post, the user had run the cron script over and over and eventually it succeeded to delete all the back files. In my case, this never happened. I am very new to Moodle and did know about or set up the cron job until all the courses were added. It seems that this is what caused my problem in the first place. I am curious to know if there is a cap to the amount of files the cron can delete, or if some other issue caused this. I did check my server clock and the timezone is set to use the server local time. When I look at the logs after launching a course, the timestamp is correct- so the cron should have been able to successfully delete all the old files based on the Epoch timestamps against the server clock time.

Guillermo, thank you so much for your help! Without all this great information I would have never found the resolution to this issue. Please let me know if you have any thoughts or ideas as to why this might have happened and/or this fix.

               

In reply to John Cusimano

Re: Windows server: Cron hangs on "Delete old draft files..."

by Guillermo Madero -

Hi John,

Well, it's good to know that the cron script now ended without problems.

I did the same test you did. In my case I had 3 real files (and 1 da39 file, as all the files belonged to the same course), along with 16 records, where 12 were draft and the rest were content. About the draft records, 1 was for a really non-used file (it didn't have any related content records), 1 for a used file, 4 for a stray (as I'll call it, for a lack of a better term) file, and 6 for the da39 zero-sized files (related to the previous 6). I deleted the draft records with an SQL command (12 rows were deleted) and then ran the cron script. As expected, everything ran without problems (I was left with 4 content records); of course, the process didn't delete any files, so I was left with a stranded file: the one that wasn't really used. I then deleted manually that astray file.

If you saved a list of the real files that only had a record with filearea=draft, then you might use the value of the contenthash field to locate them and delete them manually. I think this will reduce the probability of having problems if you end up uploading a file with that same hash.

Maybe the whole problem was caused by a file permission issue? To me it seems really illogical to think that the database had problems deleting 300+ records; maybe if one were talking about 300 million records, that could be plausible. Who knows; as you said, it seems we will never know what really caused this situation.

Cheers!

In reply to Guillermo Madero

Re: Windows server: Cron hangs on "Delete old draft files..."

by Andrea Martin -
Hi Guillermo, hi John, some time has passed after your discussion, but we are stuck in that Problem, because 'our' cron script blocks and never runs again if we don´t delete those columns in mdl_files manually. The database administrator found out that the delete command "delete from mdl_file where id=..." doesn´t work because the clause "select * from mdl_file where component='user' and filearea='draft' and filepath='.' " is not finished (BatchStarting, but not BatchCompleted in database trace). The administrator told us that the recordset of the selet-clause is not taken by moodle script, though the row is not set free by the database system (in our case MS SQL 2005 on Windows Server 2003 with IIS/FastCGI with Moodle 2.2.11 =last version). Does anybody have an idea (please not updating, it´s not possible for the next year..., and yes, we try to switch to LAMP configuration one day...)? Yours Andrea stuck in old Moodle and on the way to loose the application, because it´s not stable in 2.2.11 on Windows ;-(