General help

PHP process for cron job 'stuck'

 
Picture of Thomas College
PHP process for cron job 'stuck'
 

We recently experienced a problem where we believe the php process for our site's cron job was getting 'stuck' for lack of a better word.  The site is running on Windows, and the cron job is initiated through Windows task scheduler.  The scheduled task is set to run every 5 minutes.  The task was failing to run because there was already an instance of the same task is already running.  During this time, users were unable to upload files to the site.  From a user experience, the upload process would 'spin' until our php timeout value was reached.  There was a corresponding .tmp file created during this process.  Stopping the running cron job through task scheduler fixed the file upload problem.  

At this point it seems that a cron job failed to finish, and its php process was preventing file uploads.  

Has anyone experienced this type of problem before?  Is there anything that would prevent a cron job from finishing?  Is there anything that we can do to prevent the cron job php process from impacting user's ability to upload files?

Our current environment is: Moodle 3.3.2, Windows Server 2012R2, IIS 8, SQL Server 2014, PHP 5.6

Any insight is appreciated.

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We experienced the same or similar problems twice within the past week.  Once the scheduled task would complete, but with a last run result of 0x1.  Restarting the server in that case resolved the issue.  The second time the cron job seemed to be stuck on deleting a file from the recycle bin.  Disabling the scheduled task, restarting the server, and then running the cron job manually from a command prompt fixed the issue.  We have since upgraded the server OS to Windows Server 2016 and IIS 10, so the issue doesn't seem to be Windows or IIS related. 

Has anyone experienced this type of problem before, or have any insight on how to prevent it from happening in the future?

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We experienced the same problem a few days ago.  We've since upgraded some components of our environment to try to resolve this issue.  Our current environment is: Moodle 3.3.2, Windows Server 2016, IIS 10, MS SQL Server 2014, PHP 7.1.

In our PHP error logs, we received this error around the time the problem started to happen:

PHP Fatal error:  Maximum execution time of 600 seconds exceeded in C:\Moodle\lib\classes\session\handler.php on line 42

Does anyone know the function of line 42 in handler.php, or if it could be related to this type of problem?

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We experienced the same problem yesterday.  There were several errors in our PHP error log file, which I've attached.  Has anyone experienced these errors before? 

The problem seems to happen once a week on Saturday, when the Cleanup files in trash task is run.  This task does not seem to cause this problem on other days of the week. 

Is anyone familiar with the cleanup files in the trash task, or why it may be causing this problem?

Thanks

 
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

The Site registration scheduled task may be getting 'stuck' as well.  Has anyone experienced any problems with the site registration task?

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Is there a way to determine what files are being deleted when the cleanup files in trash task runs?  We have full access to the database, so would there be a SQL query we could run?


 
Average of ratings: -
Picture of Mark Sharp
Re: PHP process for cron job 'stuck'
Core developersParticularly helpful MoodlersPlugin developers

I've seen that. The culprit for me was a mod plugin I had written. It was throwing an error when the course_delete_modules adhoc task was trying to run for an instance of that plugin.

So the question is which plugin is causing the issue?

  1. Go to the mdl_task_adhoc table
  2. SELECT the records. The first one is probably your chap.
  3. In customdata you should see "module":"number", so that's the ID of the mod that is probably problematic - cf with mdl_modules. If it's a 3rd party or self-created plugin, it's very likely to be bad code in there.
  4. You might try manually deleting the instance of the course_module, then deleting the item in the mdl_task_adhoc table. The rest of your items should now run through.
 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Mark,

Thanks for the reply.  We currently do not have any records in our mdl_task_adhoc table.  Is there a cron process that needs to run to populate this table?  If so, is this something we could run manually?

Looking back over our records, the problem seems have started a few days after we upgraded to Moodle 3.3.2.  We are now at Moodle 3.3.3 (Build 20171113) and are still experiencing the problem.  Do you know if others have reported this type of problem with Moodle 3.3.2?

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

When we've experienced this problem, the cron job has been stuck on either deleting items from the recycle bin, or deleting old draft files.  The scheduled task is 'file_trash_cleanup_task'.  This task is currently set to run every six hours, and does not get 'stuck' normally.  The last time we experienced this problem, the error in the cron job log was:

Execute scheduled task: Cleanup files in trash (core\task\file_trash_cleanup_task)
... started 17:00:58. Current memory use 14MB.
Deleting old draft files... ... started 17:00:58. Current memory use 14.8MB.
... used 3 dbqueries
... used 418.41858696938 seconds
Scheduled task failed: Cleanup files in trash (core\task\file_trash_cleanup_task),Error writing to database
!!! Error reading from database !!!

The error in our PHP log seems to be more specific:

[12-Nov-2017 17:07:56 America/New_York] Default exception handler: Error reading from database Debug: SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Shared Memory Provider: The pipe is being closed.
<br>
SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Communication link failure<br>

SELECT * FROM mdl_task_scheduled WHERE classname = N'\core\task\file_trash_cleanup_task'
[array (
  0 => '\\core\\task\\file_trash_cleanup_task',
)]
Error code: dmlreadexception
* line 486 of \lib\dml\moodle_database.php: dml_read_exception thrown
* line 324 of \lib\dml\sqlsrv_native_moodle_database.php: call to moodle_database->query_end()
* line 431 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->query_end()
* line 896 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->do_query()
* line 972 of \lib\dml\sqlsrv_native_moodle_database.php: call to sqlsrv_native_moodle_database->get_recordset_sql()
* line 1558 of \lib\dml\moodle_database.php: call to sqlsrv_native_moodle_database->get_records_sql()
* line 1530 of \lib\dml\moodle_database.php: call to moodle_database->get_record_sql()
* line 1509 of \lib\dml\moodle_database.php: call to moodle_database->get_record_select()
* line 630 of \lib\classes\task\manager.php: call to moodle_database->get_record()
* line 169 of \lib\cronlib.php: call to core\task\manager::scheduled_task_failed()
* line 67 of \lib\cronlib.php: call to cron_run_inner_scheduled_task()
* line 61 of \admin\cli\cron.php: call to cron_run()

[12-Nov-2017 17:07:56 America/New_York] PHP Warning:  sqlsrv_free_stmt(): supplied resource is not a valid ss_sqlsrv_stmt resource in C:\Moodle\lib\dml\sqlsrv_native_moodle_recordset.php on line 135
[12-Nov-2017 17:07:56 America/New_York] PHP Fatal error:  Uncaught coding_exception: Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
C:\Moodle\lib\classes\task\manager.php on line 507

 Code should look like:

 $factory = \core\lock\lock_config::get_lock_factory('type');
 $lock = $factory->get_lock(Resource id #656);
 $lock->release();  // Locks must ALWAYS be released like this.

 in C:\Moodle\lib\classes\lock\lock.php:117
Stack trace:
#0 [internal function]: core\lock\lock->__destruct()
#1 {main}
  thrown in C:\Moodle\lib\classes\lock\lock.php on line 117
[12-Nov-2017 17:09:17 America/New_York] PHP Notice:  Undefined index: SERVER_ADDR in C:\Moodle\lib\dmllib.php on line 351
[12-Nov-2017 17:09:22 America/New_York] PHP Notice:  Undefined index: SERVER_ADDR in C:\Moodle\lib\dmllib.php on line 351


So it seems that the task is becoming disconnected from the database.  The cron job is running on the database server, so the connection should be local, and network problems shouldn't be a factor.  I'm not sure at this point if the disconnect is being caused by SQL Server or the cron job.  Does anyone know what these two errors mean?

[12-Nov-2017 17:07:56 America/New_York] PHP Warning:  sqlsrv_free_stmt(): supplied resource is not a valid ss_sqlsrv_stmt resource in C:\Moodle\lib\dml\sqlsrv_native_moodle_recordset.php on line 135
[12-Nov-2017 17:07:56 America/New_York] PHP Fatal error:  Uncaught coding_exception: Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
C:\Moodle\lib\classes\task\manager.php on line 507

I'm hoping these two errors will provide more insight as to what is going on.

Thanks


 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers

Qualifications for this response ... don't run windows, but ...

This bit of info begs a question:

"The cron job is running on the database server, so the connection should be local, and network problems shouldn't be a factor."

So the setup is a web server where moodle code resides (scripts run by cron job are located here) and where moodledata resides.   Then you have  a dedicated DB server.  Did I understand that correctly?  

The question: why would one not run the cron jobs on the web server?   As far as cron jobs/task etc to clean up trashdir etc. it would seem to me cron/task should be executed from the web server and not remotely from the DB server.

The way you've described it ... cron on the DB server involves networking first ... which adds a network layer (so to speak) in the execution of those scripts and in the things the scripts must be able to do in regards to files.  You've said that networking isn't the issue ... but do know that even if on the same segment of the network and one hop away, there is usually something hidden there ... like a switch and the switch ports.   A flakey switch or port on that switch could be involved and is a point of failure.

Am wondering if the issue wouldn't be resolved or at least perform better if whatever you have running the cron/task were run from the Web server alone.

'spirit of sharing', Ken




 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Ken,

Thanks for the response.  Sorry if I wasn't clear.  Our Moodle environment is one server.  The server is running both SQL Server and IIS, and both the Moodle code and the moodledata folder reside on the server.  The cron job is set to run on the server using Windows task scheduler.  So there shouldn't be any network segment that would cause the cron job to become disconnected from SQL Server. 

Going by the error in the PHP log (posted above) it seems that the cron job is becoming disconnected from SQL Server.  My guess is that the 'file_trash_cleanup_task' process puts a lock on something, then the job loses its connection to SQL Server, therefore is unable to remove the lock, resulting in our stuck symptom.

Right now I'm not sure why it losing its connection, or if this theory is even correct.

Thanks

 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers
Again ... don't run windows ... so consider the following questions/comments to be 'food for thought'.

Ok ... one server then.

Locks ... if I am not mistaken, the clean up task looks at file system and determines if (via DB query) that files are no longer linked anywhere in moodle thus can be moved to /moodledata/trashdir/.   In four days time, those are re-checked and if status has not changed, they are removed from the system (ie, erases).   During that time, there are, of course, new files added, etc..   Since the task is getting stalled, seems to me that process sets a lock that isn't then removed .... process failed to complete.

Question ... are there files remaining in /moodledata/trashdir/ after the clean up is supposed to have run?
Have you attempted to manually remove all files/folders in moodledata/trashdir/ after failure of the task to clean them up?
It is supposedly safe to manually remove them ... just like it's safe to manually remove files from /moodledata/temp/backup/ as long as autobackup or a course backup proecess isn't running.

What are ownerships/permissions on moodledata/trashdir ... does that user have full rights?
What are php settings for time for a script to run and memory to allocate a script?

Have you gone though the task list (they have a schedule of their own) to see if there are any potential conflicts with other task/cron jobs?

How often are you running the cron job?

'spirit of sharing', Ken

 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers

Just a follow up ...

There is a table called mdl_task_scheduled.   All the task are shown (classname column) along with their settings for minute/hour/day .... \core\task\file_trash_cleanup_task also has a couple of columns for lastruntime nextruntime ... epoch time stamps.   Also there is a column for 'faileddelay' which I've never seen contain a value ... guess cause I've been fortunate enough never to have a task fail to run.

Since the DB is being use to track stuff related, am wondering if seeing what's contained therein might give you a clue.

Do know that cron jobs need to run more frequently now than ever before .... and I think I've seen other users who had problems with autobackups and cleanup timing before .... autobackups not finished before cleanup came along.

'spirit of sharing', Ken


 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Ken,

Thanks for the update.  When we last experienced the problem there were 4 folders with files in them in /moodledata/trashdir/ .  I manually deleted them while the cron job was 'stuck'.  Unfortunately I'm not sure if this fixed the issue or not.  I do know that after several restarts and manually running the cron job afterwards, it eventually completed successfully.  I'm not sure if this was immediately after I deleted the contents of the trashdir folder, or a couple restarts after.

The local administrators group on the server has full control over the trashdir folder.  The account that the scheduled cron job task runs under is part of this group.

I haven't gone through our task list to see if there are any conflicts (thank you for the suggestion).  I'll do that shortly and let you know the results.

The cron job runs every 5 minutes.

Thanks for the suggestion about Windows.  I'll pass it along.

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Ken,

The faileddelay value for \core\task\file_trash_cleanup_task is 0.  It is set to run every 6 hours at 55 minutes. 

The faileddelay value for \tool_recyclebin\task\cleanup_category_bin is 86400.  When I've experienced the problem it has been stuck on deleting draft files (per the output of the cron job when running manually).  I'm not sure if \tool_recyclebin\task\cleanup_category_bin is the task responsible for deleting draft files. 

I've started going through when all of our tasks are scheduled to run to see if there are any conflicts.  We haven't manually changed any of these so they should all be at the Moodle default.

Thanks

 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers

The draft setting means the file is sorta in limbo ... teacher might change their minds ... just like students (assignments if allowed) ... so draft is the first setting.   Later, when determined no more changes concerning the status of the file (student has submitted assignment and can no longer change their minds), it's changed to another component - ie, not draft.

How about setting the task for cleaning up trash to half of it's current setting?   Just how heavy that process gets dependent upon behavoir of users and usage.   Also consider setting Recycle-bin sooner.

Recycle bin ... it's amazing how users would use the 'trash' on workstations .... some would clean up their desktop by putting files in the trash, but never emptied it.  And they couldn't figure out why their computer was saying the drive was almost full and they needed to free up some space! ;)

'spirit of sharing', Ken



 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers

Am almost certain that just removing files/folders from moodledata/trashdir/ actually fixed the issue as that action did nothing to actually address the cause.

You say you forced the running of the cron job several times until it finally completed.  Was that done via command line?   And each time you ran it, did it stop on the same task/job or did it progress a little further each time?

Using any addon/plugin that might be 'heavy processing'?  Running unoconv and annotation of assignments?   Any large courses that might have assignments due or testing frequently for some reason?

Have read in these forums that the cron should run every minute now ... every 5 minutes might mean something is building up to be too much for the server to complete.

I know that sounds crazy ... but I've just about all of the servers I admin now running cron jobs every 1 to 2 minutes and even though that drags some of those servers down for a bit (speed wise), have not had any issues.

Run the cron job more frequently ... the day you do that .... monitor!

'spirit of sharing', Ken

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Ken,

Yes, I ran the cron job via command line.  When we were experiencing the problem last, my process was:

1. Disable the scheduled task and reboot the server to stop the 'stuck' cron job

2. Run the cron job via command line to see the output and were it was getting stuck.  I've attached a screen shot of the point were it was 'stuck'.

We have 16 additional plugins installed, most of which are the Office 365 integration plugins.  We have the Sharing Cart plugin but we disabled it shortly after the last time we experienced the stuck cron job, due to an unrelated problem.  We were unable to access courses with it added (we received a general database error).  The Sharing Cart plugin may or not have caused the cron job to become stuck, but without a good way to trigger the problem, I can't be certain.

We had been running the cron job every 1 minute a few years ago, but we experienced a problem where the job would take longer than a minute to run, and then would cause subsequent scheduled jobs to fail to start. 

Thanks


 
Average of ratings: -
Picture of Ken Task
Re: PHP process for cron job 'stuck'
Particularly helpful Moodlers

So you've also upgraded php to 7.1?

https://docs.moodle.org/dev/Moodle_and_PHP7#Can_I_use_PHP7_yet.3F

First line in the URL above.

The fact that cron job run started where it did kinda indicates to me
it didn't finish the trash clean up and deleting old drafts. 

Don't think I've ever had a cron job run from CLI begin with that.
But then again, don't run Office 365 Integration either.

You've not shared, previously, that relates me thinks ...
"the Office 365 integration plugins" ... would think  some of those related to files and drafts, etc.. and would/might involve network - if not local, then upstream 'cloud'.

Think the resident expert in Windows and Office 365 integrations is
Mike Churchwood
https://moodle.org/user/view.php?id=9561&course=5

Might ask Mike to review this thread.

Think I've taken this as far as I can go ... being a non-Windows person.

Good luck!

'spirit of sharing', Ken

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Ken,

Thanks for the update.  We upgraded to PHP 7.1 after the problem started happening, with the hope of fixing it.  The problem occurred when we were running 5.6 as well.

It looks like most of the of the Office 365 integration plugins should support Moodle 3.3, but a few of them don't currently, so I'll try to reach out to Mike to see if he has any insight on this issue.

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Ken, sorry, PDF annotation is disabled, and we're not running unoconv either.

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

Hi Mark,

I've read your forum thread from September here: https://moodle.org/mod/forum/discuss.php?d=357938.  I was able to populate the mdl_task_adhoc table by deleting some activities in a test course.  It seems that what ever purges the mdl_task_adhoc table seems to be working properly as the records in the table are being purged (we don't have a large amount of them, or one that is consistently in the table).

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We experienced the same problem yesterday.  David Monllaó in this bug tracker: https://tracker.moodle.org/browse/MDL-60417 recommended we enable some additional debugging.  With that enabled, the only additional error I was able to find was this (in bold):

[01-Dec-2017 13:22:51 America/New_York] PHP Warning:  rename(C:\moodledata/cache/cachestore_file/default_application/core_capabilities/cor-cache/VgmzEWOuiE.5a219dfb844f31.71469117.temp,C:\moodledata/cache/cachestore_file/default_application/core_capabilities/cor-cache/core_capabilities-c6b8356e23680a6513bfebdff942a22b.cache): Access is denied. (code: 5) in C:\Moodle\cache\stores\file\lib.php on line 733
[01-Dec-2017 13:35:58 America/New_York] Default exception handler: Error reading from database Debug: SQLState: 08S01<br>
Error Code: 232<br>
Message: [Microsoft][ODBC Driver 11 for SQL Server]Shared Memory Provider: The pipe is being closed.

Does anyone know if renaming these two files is part of the trash cleanup scheduled task?  Would failing to rename the files cause the database error that occurs afterwards?

We are going to setup a test copy of our Moodle server and add the trash directory files that we had to manually remove back.  We will then delete the files one at a time until the cron job finishes, to try to identify the file causing the problem.

Thanks

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We set up a test copy of our Moodle server.  The troubleshooting steps we followed were:

  1. Disable scheduled task which runs the cron job
  2. Move the files that were left in the trashdir folder when we last experienced this problem back to the trashdir folder
  3. Change run time of 'Cleanup files in trash' scheduled task to run in 5 minutes
  4. Change system date to 4 days ahead
  5. Run cron job from CLI
    It was at this point we were hoping to replicate the issue, but the cleanup files in trash task did not get stuck.  Instead the entire trashdir was deleted. 

Does anyone happen to know if this result is to be expected?  Would this indicate that the problem is not a specific file or file type? 

 
Average of ratings: -
Picture of Thomas College
Re: PHP process for cron job 'stuck'
 

We also receive the following message in our site's notification page:

The database is not using the expected READ_COMMITTED_SNAPSHOT mode which can lead to wrong results, especially under high concurrency scenarios. Please enable it for correct behaviour. For further information, see the documentation Installing MSSQL for PHP.

Our next step is to enable READ_COMMITTED_SNAPSHOT to see if that helps the issue.

Does anyone know if not having READ_COMMITTED_SNAPSHOT enabled would cause this type of problem?

Thanks

 
Average of ratings: -