PHP process for cron job 'stuck'

Re: PHP process for cron job 'stuck'

by Thomas College -
Number of replies: 24

We experienced the same or similar problems twice within the past week.  Once the scheduled task would complete, but with a last run result of 0x1.  Restarting the server in that case resolved the issue.  The second time the cron job seemed to be stuck on deleting a file from the recycle bin.  Disabling the scheduled task, restarting the server, and then running the cron job manually from a command prompt fixed the issue.  We have since upgraded the server OS to Windows Server 2016 and IIS 10, so the issue doesn't seem to be Windows or IIS related. 

Has anyone experienced this type of problem before, or have any insight on how to prevent it from happening in the future?

Thanks

Average of ratings: Useful (1)
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

We experienced the same problem a few days ago.  We've since upgraded some components of our environment to try to resolve this issue.  Our current environment is: Moodle 3.3.2, Windows Server 2016, IIS 10, MS SQL Server 2014, PHP 7.1.

In our PHP error logs, we received this error around the time the problem started to happen:

PHP Fatal error:  Maximum execution time of 600 seconds exceeded in C:\Moodle\lib\classes\session\handler.php on line 42

Does anyone know the function of line 42 in handler.php, or if it could be related to this type of problem?

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

We experienced the same problem yesterday.  There were several errors in our PHP error log file, which I've attached.  Has anyone experienced these errors before? 

The problem seems to happen once a week on Saturday, when the Cleanup files in trash task is run.  This task does not seem to cause this problem on other days of the week. 

Is anyone familiar with the cleanup files in the trash task, or why it may be causing this problem?

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

The Site registration scheduled task may be getting 'stuck' as well.  Has anyone experienced any problems with the site registration task?

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

Is there a way to determine what files are being deleted when the cleanup files in trash task runs?  We have full access to the database, so would there be a SQL query we could run?


In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Mark Sharp -
Picture of Core developers Picture of Particularly helpful Moodlers Picture of Plugin developers

I've seen that. The culprit for me was a mod plugin I had written. It was throwing an error when the course_delete_modules adhoc task was trying to run for an instance of that plugin.

So the question is which plugin is causing the issue?

  1. Go to the mdl_task_adhoc table
  2. SELECT the records. The first one is probably your chap.
  3. In customdata you should see "module":"number", so that's the ID of the mod that is probably problematic - cf with mdl_modules. If it's a 3rd party or self-created plugin, it's very likely to be bad code in there.
  4. You might try manually deleting the instance of the course_module, then deleting the item in the mdl_task_adhoc table. The rest of your items should now run through.
Average of ratings: Useful (1)
In reply to Mark Sharp

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Mark,

Thanks for the reply.  We currently do not have any records in our mdl_task_adhoc table.  Is there a cron process that needs to run to populate this table?  If so, is this something we could run manually?

Looking back over our records, the problem seems have started a few days after we upgraded to Moodle 3.3.2.  We are now at Moodle 3.3.3 (Build 20171113) and are still experiencing the problem.  Do you know if others have reported this type of problem with Moodle 3.3.2?

Thanks

Average of ratings: Useful (1)
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

When we've experienced this problem, the cron job has been stuck on either deleting items from the recycle bin, or deleting old draft files.  The scheduled task is 'file_trash_cleanup_task'.  This task is currently set to run every six hours, and does not get 'stuck' normally.  The last time we experienced this problem, the error in the cron job log was:

Execute scheduled task: Cleanup files in trash (core\task\file_trash_cleanup_task)
... started 17:00:58. Current memory use 14MB.
Deleting old draft files... ... started 17:00:58. Current memory use 14.8MB.
... used 3 dbqueries
... used 418.41858696938 seconds
Scheduled task failed: Cleanup files in trash (core\task\file_trash_cleanup_task),Error writing to database
!!! Error reading from database !!!

The error in our PHP log seems to be more specific:

[12-Nov-2017 17:07:56 America/New_York] Default exception handler: Error reading from database Debug: SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Shared Memory Provider: The pipe is being closed.
<br>
SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Communication link failure<br>

SELECT * FROM mdl_task_scheduled WHERE classname = N'\core\task\file_trash_cleanup_task'
[array (
  0 => '\\core\\task\\file_trash_cleanup_task',
)]
Error code: dmlreadexception
* line 486 of \lib\dml\moodle_database.php: dml_read_exception thrown
* line 324 of \lib\dml\sqlsrv_native_moodle_database.php: call to moodle_database->query_end()
* line 431 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->query_end()
* line 896 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->do_query()
* line 972 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->get_recordset_sql()
* line 1558 of \lib\dml\moodle_database.php: call to sqlsrv_native_moodle_database->get_records_sql()
* line 1530 of \lib\dml\moodle_database.php: call to moodle_database->get_record_sql()
* line 1509 of \lib\dml\moodle_database.php: call to moodle_database->get_record_select()
* line 630 of \lib\classes\task\manager.php: call to moodle_database->get_record()
* line 169 of \lib\cronlib.php: call to core\task\manager::scheduled_task_failed()
* line 67 of \lib\cronlib.php: call to cron_run_inner_scheduled_task()
* line 61 of \admin\cli\cron.php: call to cron_run()

[12-Nov-2017 17:07:56 America/New_York] PHP Warning:  sqlsrv_free_stmt(): supplied resource is not a valid ss_sqlsrv_stmt resource in C:\Moodle\lib\dml\sqlsrv_native_moodle_recordset.php on line 135
[12-Nov-2017 17:07:56 America/New_York] PHP Fatal error:  Uncaught coding_exception: Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
C:\Moodle\lib\classes\task\manager.php on line 507

 Code should look like:

 $factory = \core\lock\lock_config::get_lock_factory('type');
 $lock = $factory->get_lock(Resource id #656);
 $lock->release();  // Locks must ALWAYS be released like this.

 in C:\Moodle\lib\classes\lock\lock.php:117
Stack trace:
#0 [internal function]: core\lock\lock->__destruct()
#1 {main}
  thrown in C:\Moodle\lib\classes\lock\lock.php on line 117
[12-Nov-2017 17:09:17 America/New_York] PHP Notice:  Undefined index: SERVER_ADDR in C:\Moodle\lib\dmllib.php on line 351
[12-Nov-2017 17:09:22 America/New_York] PHP Notice:  Undefined index: SERVER_ADDR in C:\Moodle\lib\dmllib.php on line 351


So it seems that the task is becoming disconnected from the database.  The cron job is running on the database server, so the connection should be local, and network problems shouldn't be a factor.  I'm not sure at this point if the disconnect is being caused by SQL Server or the cron job.  Does anyone know what these two errors mean?

[12-Nov-2017 17:07:56 America/New_York] PHP Warning:  sqlsrv_free_stmt(): supplied resource is not a valid ss_sqlsrv_stmt resource in C:\Moodle\lib\dml\sqlsrv_native_moodle_recordset.php on line 135
[12-Nov-2017 17:07:56 America/New_York] PHP Fatal error:  Uncaught coding_exception: Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
C:\Moodle\lib\classes\task\manager.php on line 507

I'm hoping these two errors will provide more insight as to what is going on.

Thanks


Average of ratings: Useful (1)
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers

Qualifications for this response ... don't run windows, but ...

This bit of info begs a question:

"The cron job is running on the database server, so the connection should be local, and network problems shouldn't be a factor."

So the setup is a web server where moodle code resides (scripts run by cron job are located here) and where moodledata resides.   Then you have  a dedicated DB server.  Did I understand that correctly?  

The question: why would one not run the cron jobs on the web server?   As far as cron jobs/task etc to clean up trashdir etc. it would seem to me cron/task should be executed from the web server and not remotely from the DB server.

The way you've described it ... cron on the DB server involves networking first ... which adds a network layer (so to speak) in the execution of those scripts and in the things the scripts must be able to do in regards to files.  You've said that networking isn't the issue ... but do know that even if on the same segment of the network and one hop away, there is usually something hidden there ... like a switch and the switch ports.   A flakey switch or port on that switch could be involved and is a point of failure.

Am wondering if the issue wouldn't be resolved or at least perform better if whatever you have running the cron/task were run from the Web server alone.

'spirit of sharing', Ken




In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Thomas College -

Ken,

Thanks for the response.  Sorry if I wasn't clear.  Our Moodle environment is one server.  The server is running both SQL Server and IIS, and both the Moodle code and the moodledata folder reside on the server.  The cron job is set to run on the server using Windows task scheduler.  So there shouldn't be any network segment that would cause the cron job to become disconnected from SQL Server. 

Going by the error in the PHP log (posted above) it seems that the cron job is becoming disconnected from SQL Server.  My guess is that the 'file_trash_cleanup_task' process puts a lock on something, then the job loses its connection to SQL Server, therefore is unable to remove the lock, resulting in our stuck symptom.

Right now I'm not sure why it losing its connection, or if this theory is even correct.

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers
Again ... don't run windows ... so consider the following questions/comments to be 'food for thought'.

Ok ... one server then.

Locks ... if I am not mistaken, the clean up task looks at file system and determines if (via DB query) that files are no longer linked anywhere in moodle thus can be moved to /moodledata/trashdir/.   In four days time, those are re-checked and if status has not changed, they are removed from the system (ie, erases).   During that time, there are, of course, new files added, etc..   Since the task is getting stalled, seems to me that process sets a lock that isn't then removed .... process failed to complete.

Question ... are there files remaining in /moodledata/trashdir/ after the clean up is supposed to have run?
Have you attempted to manually remove all files/folders in moodledata/trashdir/ after failure of the task to clean them up?
It is supposedly safe to manually remove them ... just like it's safe to manually remove files from /moodledata/temp/backup/ as long as autobackup or a course backup proecess isn't running.

What are ownerships/permissions on moodledata/trashdir ... does that user have full rights?
What are php settings for time for a script to run and memory to allocate a script?

Have you gone though the task list (they have a schedule of their own) to see if there are any potential conflicts with other task/cron jobs?

How often are you running the cron job?

'spirit of sharing', Ken

In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers

Just a follow up ...

There is a table called mdl_task_scheduled.   All the task are shown (classname column) along with their settings for minute/hour/day .... \core\task\file_trash_cleanup_task also has a couple of columns for lastruntime nextruntime ... epoch time stamps.   Also there is a column for 'faileddelay' which I've never seen contain a value ... guess cause I've been fortunate enough never to have a task fail to run.

Since the DB is being use to track stuff related, am wondering if seeing what's contained therein might give you a clue.

Do know that cron jobs need to run more frequently now than ever before .... and I think I've seen other users who had problems with autobackups and cleanup timing before .... autobackups not finished before cleanup came along.

'spirit of sharing', Ken


In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Ken,

Thanks for the update.  When we last experienced the problem there were 4 folders with files in them in /moodledata/trashdir/ .  I manually deleted them while the cron job was 'stuck'.  Unfortunately I'm not sure if this fixed the issue or not.  I do know that after several restarts and manually running the cron job afterwards, it eventually completed successfully.  I'm not sure if this was immediately after I deleted the contents of the trashdir folder, or a couple restarts after.

The local administrators group on the server has full control over the trashdir folder.  The account that the scheduled cron job task runs under is part of this group.

I haven't gone through our task list to see if there are any conflicts (thank you for the suggestion).  I'll do that shortly and let you know the results.

The cron job runs every 5 minutes.

Thanks for the suggestion about Windows.  I'll pass it along.

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Ken,

The faileddelay value for \core\task\file_trash_cleanup_task is 0.  It is set to run every 6 hours at 55 minutes. 

The faileddelay value for \tool_recyclebin\task\cleanup_category_bin is 86400.  When I've experienced the problem it has been stuck on deleting draft files (per the output of the cron job when running manually).  I'm not sure if \tool_recyclebin\task\cleanup_category_bin is the task responsible for deleting draft files. 

I've started going through when all of our tasks are scheduled to run to see if there are any conflicts.  We haven't manually changed any of these so they should all be at the Moodle default.

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers

The draft setting means the file is sorta in limbo ... teacher might change their minds ... just like students (assignments if allowed) ... so draft is the first setting.   Later, when determined no more changes concerning the status of the file (student has submitted assignment and can no longer change their minds), it's changed to another component - ie, not draft.

How about setting the task for cleaning up trash to half of it's current setting?   Just how heavy that process gets dependent upon behavoir of users and usage.   Also consider setting Recycle-bin sooner.

Recycle bin ... it's amazing how users would use the 'trash' on workstations .... some would clean up their desktop by putting files in the trash, but never emptied it.  And they couldn't figure out why their computer was saying the drive was almost full and they needed to free up some space! ;)

'spirit of sharing', Ken



In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers

Am almost certain that just removing files/folders from moodledata/trashdir/ actually fixed the issue as that action did nothing to actually address the cause.

You say you forced the running of the cron job several times until it finally completed.  Was that done via command line?   And each time you ran it, did it stop on the same task/job or did it progress a little further each time?

Using any addon/plugin that might be 'heavy processing'?  Running unoconv and annotation of assignments?   Any large courses that might have assignments due or testing frequently for some reason?

Have read in these forums that the cron should run every minute now ... every 5 minutes might mean something is building up to be too much for the server to complete.

I know that sounds crazy ... but I've just about all of the servers I admin now running cron jobs every 1 to 2 minutes and even though that drags some of those servers down for a bit (speed wise), have not had any issues.

Run the cron job more frequently ... the day you do that .... monitor!

'spirit of sharing', Ken

In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Ken,

Yes, I ran the cron job via command line.  When we were experiencing the problem last, my process was:

1. Disable the scheduled task and reboot the server to stop the 'stuck' cron job

2. Run the cron job via command line to see the output and were it was getting stuck.  I've attached a screen shot of the point were it was 'stuck'.

We have 16 additional plugins installed, most of which are the Office 365 integration plugins.  We have the Sharing Cart plugin but we disabled it shortly after the last time we experienced the stuck cron job, due to an unrelated problem.  We were unable to access courses with it added (we received a general database error).  The Sharing Cart plugin may or not have caused the cron job to become stuck, but without a good way to trigger the problem, I can't be certain.

We had been running the cron job every 1 minute a few years ago, but we experienced a problem where the job would take longer than a minute to run, and then would cause subsequent scheduled jobs to fail to start. 

Thanks

Attachment cron-error2.PNG
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Ken Task -
Picture of Particularly helpful Moodlers

So you've also upgraded php to 7.1?

https://docs.moodle.org/dev/Moodle_and_PHP7#Can_I_use_PHP7_yet.3F

First line in the URL above.

The fact that cron job run started where it did kinda indicates to me
it didn't finish the trash clean up and deleting old drafts. 

Don't think I've ever had a cron job run from CLI begin with that.
But then again, don't run Office 365 Integration either.

You've not shared, previously, that relates me thinks ...
"the Office 365 integration plugins" ... would think  some of those related to files and drafts, etc.. and would/might involve network - if not local, then upstream 'cloud'.

Think the resident expert in Windows and Office 365 integrations is
Mike Churchwood
https://moodle.org/user/view.php?id=9561&course=5

Might ask Mike to review this thread.

Think I've taken this as far as I can go ... being a non-Windows person.

Good luck!

'spirit of sharing', Ken

In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Ken,

Thanks for the update.  We upgraded to PHP 7.1 after the problem started happening, with the hope of fixing it.  The problem occurred when we were running 5.6 as well.

It looks like most of the of the Office 365 integration plugins should support Moodle 3.3, but a few of them don't currently, so I'll try to reach out to Mike to see if he has any insight on this issue.

Thanks

In reply to Ken Task

Re: PHP process for cron job 'stuck'

by Thomas College -

Ken, sorry, PDF annotation is disabled, and we're not running unoconv either.

Thanks

In reply to Mark Sharp

Re: PHP process for cron job 'stuck'

by Thomas College -

Hi Mark,

I've read your forum thread from September here: https://moodle.org/mod/forum/discuss.php?d=357938.  I was able to populate the mdl_task_adhoc table by deleting some activities in a test course.  It seems that what ever purges the mdl_task_adhoc table seems to be working properly as the records in the table are being purged (we don't have a large amount of them, or one that is consistently in the table).

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

We experienced the same problem yesterday.  David Monllaó in this bug tracker: https://tracker.moodle.org/browse/MDL-60417 recommended we enable some additional debugging.  With that enabled, the only additional error I was able to find was this (in bold):

[01-Dec-2017 13:22:51 America/New_York] PHP Warning:  rename(C:\moodledata/cache/cachestore_file/default_application/core_capabilities/cor-cache/VgmzEWOuiE.5a219dfb844f31.71469117.temp,C:\moodledata/cache/cachestore_file/default_application/core_capabilities/cor-cache/core_capabilities-c6b8356e23680a6513bfebdff942a22b.cache): Access is denied. (code: 5) in C:\Moodle\cache\stores\file\lib.php on line 733
[01-Dec-2017 13:35:58 America/New_York] Default exception handler: Error reading from database Debug: SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Shared Memory Provider: The pipe is being closed.

Does anyone know if renaming these two files is part of the trash cleanup scheduled task?  Would failing to rename the files cause the database error that occurs afterwards?

We are going to setup a test copy of our Moodle server and add the trash directory files that we had to manually remove back.  We will then delete the files one at a time until the cron job finishes, to try to identify the file causing the problem.

Thanks

In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

We set up a test copy of our Moodle server.  The troubleshooting steps we followed were:

  1. Disable scheduled task which runs the cron job
  2. Move the files that were left in the trashdir folder when we last experienced this problem back to the trashdir folder
  3. Change run time of 'Cleanup files in trash' scheduled task to run in 5 minutes
  4. Change system date to 4 days ahead
  5. Run cron job from CLI
    It was at this point we were hoping to replicate the issue, but the cleanup files in trash task did not get stuck.  Instead the entire trashdir was deleted. 

Does anyone happen to know if this result is to be expected?  Would this indicate that the problem is not a specific file or file type? 

Average of ratings: Useful (1)
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

We also receive the following message in our site's notification page:

The database is not using the expected READ_COMMITTED_SNAPSHOT mode which can lead to wrong results, especially under high concurrency scenarios. Please enable it for correct behaviour. For further information, see the documentation Installing MSSQL for PHP.

Our next step is to enable READ_COMMITTED_SNAPSHOT to see if that helps the issue.

Does anyone know if not having READ_COMMITTED_SNAPSHOT enabled would cause this type of problem?

Thanks

Average of ratings: Useful (1)
In reply to Thomas College

Re: PHP process for cron job 'stuck'

by Thomas College -

Just a follow up on this issue.  We have not experienced the problem since we enabled READ_COMMITTED_SNAPSHOT on our database.

Thanks

Average of ratings: Useful (1)