## General developer forum

### Cron Development


There has been a fair amount of concern about the Moodle cron task.

MDL-17783 (https://tracker.moodle.org/browse/MDL-17783) has looked at the notion that there are issues with Moodle only permitting a single Cron instance to run.

This causes problems for large sites (such as ours) where some cron functions, such as the Forum digest, can take a VERY long time to run.

The limitation to a single cron instance means that, regardless of the order in which the cron_* functions are called, the "takt time" (http://en.wikipedia.org/wiki/Takt_time) is always going to be the sum of each function's execution time, and so functions that want to run every minute will end up running once every "total cron execution time".
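As a rough illustration of that arithmetic (the function names and durations below are made-up numbers, not measurements from any site):

```php
<?php
// Hypothetical per-function run times in seconds (illustrative only).
$durations = [
    'enrolment_sync' => 120,
    'forum_digest'   => 3600,
    'quiz_attempts'  => 60,
];

// With a single cron instance every function runs back to back, so one full
// pass takes the sum of all durations.
$cycleseconds = array_sum($durations);

// A function that asks to run every 60 seconds cannot run more often than
// once per full pass.
$requestedinterval = 60;
$effectiveinterval = max($requestedinterval, $cycleseconds);
```

With these numbers a function that wants to run every minute actually runs once every 3780 seconds, i.e. about once every 63 minutes.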

In addition, there are notional dependencies on module cron functions running in a particular order. I've struggled to find this documented, so it would be good to get some sort of statement about what is dependent on what, and why.

From looking at the cron lib code it would seem that the order of things (so notional dependencies) are:

2. Account updates and enrolments are done first so that everything else has up-to-date user and course enrolment information
3. Activity Modules (Examples)
   1. Forum
      1. Digest (>1 hour, 1x a day)
   2. Quiz
      1. Attempt/quiz session updating (?? mins, every 1 minute (I think))
4. Block crons
7. Events
8. Completion
9. Blog related
10. Question Bank
11. Site Update
12. "All other plugins"
13. Automated backups

The problem that we've identified as being an issue isn't the overall order of cron tasks but the independence of 3. Activity Modules.

MDL-17783 used the session locking code to let *all* of the cron elements run independently, when I think what we really need is a mechanism that allows:

1. (1)->(2) to be run in series,
2. All of (3) to run in parallel,
3. Steps 4->end run in series

Could this (this is me spitballing) be done in a way that could be retrospectively applied to previous versions?

In theory I think it could be by:

1. Define new cron locking API (instead of re-using existing session locking API).
2. Re-model the cron script to understand 3 states:
   1. CORE_PRE
   2. MODULES
   3. CORE_POST

The logic would be:

When a cron instance is started, CORE_PRE moodle core and CORE_PRE registered functions are executed. Any subsequent instances that are started would then check the state and exit straight away.

Once ALL CORE_PRE functions are handled, the 1st instance starts doing the MODULES registered functions. This would effectively say "we've updated core standing data and it can be considered valid for the immediate future". Each module function obtains a lock when it is initiated and releases it once it's done. The duration to execute the function would be tracked.

Since the cron is in state MODULES, subsequent instances would skip the CORE_PRE stuff and look at the MODULES and would process the next available module function. The duration information that is recorded over iterations could be used to automatically re-order the execution so that small "fast" items are run first.
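The re-ordering idea could look something like the sketch below. The function name `order_by_average_duration()`, the module function names and the recorded timings are all hypothetical, and treating functions with no history as "fast" is just one possible choice:

```php
<?php
/**
 * Order module function names so that the fastest (by average recorded
 * duration) come first. Functions with no history are treated as instant.
 *
 * @param string[] $functions names of registered module functions
 * @param array $history map of function name => list of past durations (seconds)
 * @return string[] function names, fastest first
 */
function order_by_average_duration(array $functions, array $history): array {
    $average = function (string $name) use ($history): float {
        $runs = $history[$name] ?? [];
        return $runs ? array_sum($runs) / count($runs) : 0.0;
    };
    usort($functions, function (string $a, string $b) use ($average): int {
        return $average($a) <=> $average($b);
    });
    return $functions;
}

// Hypothetical timings recorded over previous cron passes.
$order = order_by_average_duration(
    ['forum_cron', 'quiz_cron', 'assign_cron'],
    ['forum_cron' => [3600, 4000], 'quiz_cron' => [30, 50], 'assign_cron' => [200]]
);
```

Here `$order` ends up as quiz, assign, forum, so the slow digest no longer holds up the quick jobs.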

Only once ALL MODULES registered functions are completed does the cron script move into its final CORE_POST state, where it runs the registered functions whilst preventing any additional cron instances from doing any work.
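To make the logic above a bit more concrete, here is a minimal sketch of the decision a newly started cron instance would make. Everything in it (`next_action()`, the return strings, passing the shared state in as plain arguments) is made up for illustration; a real version would need to read and claim that state atomically through the new locking API, and would also need the last module worker to flip the state to CORE_POST.

```php
<?php
const CORE_PRE = 'CORE_PRE';
const MODULES = 'MODULES';
const CORE_POST = 'CORE_POST';

/**
 * Decide what a newly started cron instance should do.
 *
 * @param string $phase current shared phase
 * @param bool $serialbusy true if another instance is already doing the
 *                         serial (CORE_PRE or CORE_POST) work
 * @param string[] $pending module functions not yet completed in this pass
 * @param string[] $locked module functions currently locked by other instances
 * @return string 'exit', 'run_serial' or 'run_module:<name>'
 */
function next_action(string $phase, bool $serialbusy, array $pending, array $locked): string {
    if ($phase === CORE_PRE || $phase === CORE_POST) {
        // Serial phases: only one instance works, everyone else leaves.
        return $serialbusy ? 'exit' : 'run_serial';
    }
    // MODULES phase: pick the next module function nobody else holds.
    $available = array_values(array_diff($pending, $locked));
    return $available ? 'run_module:' . $available[0] : 'exit';
}

// A second instance arriving while the first is still in CORE_PRE.
$duringpre = next_action(CORE_PRE, true, [], []);
// A second instance arriving while the first is busy with the forum digest.
$duringmodules = next_action(MODULES, false, ['forum', 'quiz'], ['forum']);
```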

In my head this "new" cron script could live alongside the current one (since administrators have to configure it outside of Moodle anyway). I think that this is important as the Moodle release schedule doesn't always fit with institutions being able to go to the latest version (in our case we typically upgrade in June, so there isn't enough time to fully evaluate a new release and update everything in time), and the issues in the cron are likely to affect institutions of a size where they (we) can't just take it from the latest version.

Thoughts....

M

Re: Cron Development

Hi,

Nice post Michael. The "cron problems" (concurrent cron => multiple mails...) are very old.

It would really need to have an integrated solution in core Moodle.
There had also been MDL-25499 open some years ago for that.

Séverin

Re: Cron Development

Well, we run a pile of big sites and just let their crons run concurrently - it doesn't really cause any problems for us, with a handful of needed patches.

For some tasks it is necessary to set the "last run" time at the start rather than the end of the process, so the next run won't attempt to do the same thing right away. Forum digest and statistics come to mind. It would be nice to see better/more use of transactions/locks with concurrency in mind though.
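The "set last run at the start" pattern is roughly the following. This is only a sketch: `run_if_due()` is a made-up helper, and the `$lastrun` array stands in for whatever config or database value a real task stores. Note that the check and the update are still not atomic here, which is where proper locks or transactions would come in.

```php
<?php
/**
 * Run a task only if it is due, recording the start time BEFORE doing the
 * work so that a concurrent cron instance sees it as already handled.
 *
 * @param array $lastrun map of task name => unix timestamp (by reference)
 * @param string $task task name
 * @param int $interval minimum number of seconds between runs
 * @param int $now current unix timestamp
 * @param callable $work the actual job
 * @return bool true if the work was run
 */
function run_if_due(array &$lastrun, string $task, int $interval, int $now, callable $work): bool {
    if ($now - ($lastrun[$task] ?? 0) < $interval) {
        return false;
    }
    $lastrun[$task] = $now; // Claim the run first, then do the slow part.
    $work();
    return true;
}

$lastrun = [];
$runs = 0;
$work = function () use (&$runs) {
    $runs++;
};
// First cron instance starts the daily digest...
$first = run_if_due($lastrun, 'forum_digest', 86400, 1000000, $work);
// ...and an instance starting a minute later skips it.
$second = run_if_due($lastrun, 'forum_digest', 86400, 1000060, $work);
```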

Haven't seen any problems caused by out-of-order execution, so I don't think this is a big deal.

Re: Cron Development

Hi,

Please note that the strategy used in Mahara for cron jobs is much less sensitive to 'big sites', since every task that may be launched by the overall cron (which is launched, as in Moodle, from the command line, by web access or from a crontab entry) is regulated by an entry in the database (table cron) with a Unix-like syntax for the next execution time; thus not all tasks are run every time. Furthermore, they have a per-task cron lock mechanism to avoid concurrent runs.
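To give an idea of how per-task scheduling with a Unix-like syntax works in general, here is a small sketch. It is NOT Mahara's code or schema: the row layout, the task names and both functions are made up for illustration, and only `*`, `*/N` and comma-separated numbers are handled.

```php
<?php
// Check one cron-style field against a value.
// Supports "*", "*/N" and comma-separated numbers only (no ranges).
function cron_field_matches(string $field, int $value): bool {
    if ($field === '*') {
        return true;
    }
    if (preg_match('#^\*/(\d+)$#', $field, $m)) {
        $step = (int)$m[1];
        return $step > 0 && $value % $step === 0;
    }
    return in_array($value, array_map('intval', explode(',', $field)), true);
}

// Return the names of the tasks that are due at the given minute and hour.
function due_tasks(array $rows, int $minute, int $hour): array {
    $due = [];
    foreach ($rows as $row) {
        if (cron_field_matches($row['minute'], $minute)
                && cron_field_matches($row['hour'], $hour)) {
            $due[] = $row['name'];
        }
    }
    return $due;
}

// Hypothetical rows standing in for a per-task schedule table.
$rows = [
    ['name' => 'send_notifications', 'minute' => '*',    'hour' => '*'],
    ['name' => 'forum_digest',       'minute' => '0',    'hour' => '17'],
    ['name' => 'cleanup',            'minute' => '*/15', 'hour' => '2,14'],
];
$at1700 = due_tasks($rows, 0, 17);
$at1430 = due_tasks($rows, 30, 14);
```

The point is that the overall cron can fire every minute while each task only runs when its own entry says so.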

Unfortunately there is no dev documentation on their wiki, but peeking at the file htdocs/lib/cron.php is really instructive.

Cheers

Edit: the screenshot below shows a few lines of Mahara's cron table:

Re: Cron Development

That was basically the idea behind MDL-25499 and http://docs.moodle.org/en/Development:Scheduled_Tasks_Proposal. Parts of that proposal are implemented in Remote-Learner's ELIS (full disclosure: I was the one who implemented it. I also borrowed some code from Mahara for that), but there would still be a lot of work to finish implementing it for general Moodle use.

Re: Cron Development
Yes, this (major piece of development) is the goal that we are all striving for, but it has not really moved on in quite a number of years.

Re: Cron Development

It being a major piece of development makes it difficult for people to tackle. Given my recent experiences with an attempt at changes in this space, I am not confident that spending heaps of time implementing it would end in a positive result. To be able to do it, you need backing from someone who can continue to direct you and ensure that it gets in. Who could that be? Without discussions and direct access to core developers it's very difficult for somebody on the outside to tackle it.

I would do some of the work in the task but I'm not really in a position to attempt to tackle it all. My suggestion in my recent comments on MDL-17783 is that I could implement the general locking class and make adjustments to the existing cron to remove the concurrency issues many are experiencing. This would still require someone to agree to provide reviews and pointers and agree to integrate just part of the work, including accepting the locking rules that would be implemented. The existing implementation of the lock class has sat dead for a long time. Why wasn't it integrated? Applying the locking to the existing cron would resolve many of the issues people are experiencing. It could be completed in a single release cycle, and then incremental improvements could be made as time went on. My current belief is that this piece of work will never make it to the top of the pile if you try to complete it as a single chunk.

That is my proposed strategy for HEAD. I've also commented in the bug MDL-17783 that the current back-patching policies would indicate to me that it's impossible to resolve the issues big Moodle installs are experiencing with cron. The reason I say that is that currently no locking infrastructure exists, and delivering it would take a large, complicated patch that creates new APIs or alters existing ones. For MDL-17783 I originally went down the route of altering existing APIs while keeping backwards compatibility, and Petr strongly disagreed with it as too great a change to put on stable. The approach was also not seen positively. Does anybody see a different outcome being possible for stable branches? And what would that development look like? A single lock on cron is not an acceptable outcome given my and Michael's comments about how long it takes some operations to run.

I'm working hard here trying to come to a position where we could implement something and provide forward movement for all Moodlers. It's easy to just say this is too hard and we will worry about it another day. That won't move us forward from the position we are in now, which is that a proposal has been sitting doing nothing for a long time. So I would appreciate comments, and it would help if objections could include alternatives, thus allowing someone to start on something. Offers of mentoring/review capacity would also increase the speed at which we could deliver something that would be acceptable to HQ without wasting lots of effort.

Thanks

Russell

Re: Cron Development
Hi Russell,

I'm really sorry that you've not had a positive experience with this contribution and I agree with many of your comments and suggestions of mentoring/review. Unfortunately it mostly comes down to not having unlimited resources; it pains me to see that we have over 300 patched and unresolved issues. I don't like to see your work hit a wall like you seem to have, and as I said in a recent presentation, I LOVE core contributors - thank you!

Having said that, I'm afraid I don't agree that this change would solve the major issues in cron. This hasn't had a higher priority because many, many large sites are coping OK by using tools outside of Moodle; they will use something like dotlockfile to achieve the locking outside of Moodle.

I also think that if we're adding locks, then we really do need to have a way of examining the status of the locks and clearing them as an administrator, otherwise we could end up with significant Moodle maintenance tasks being stalled indefinitely, without any visibility for the admin. Which brings us towards the more fully featured cron overhaul.

And to get to that better-featured cron overhaul, it just comes down to time and priority. At the moment I think the community is coping and nobody is really moving this forward. I'd love to help progress this as it seems interesting and useful, but at Moodle HQ it's not on our immediate roadmap because we're working on other features which people are crying out for, especially those which affect students and teachers in a more direct way.

With regard to backporting, it's not impossible to get an issue like this resolved in the stable branches. We have a process for backporting improvements; once a solution has landed in master we would be able to follow this process to integrate it into the stable branches. Our default position is a cautious one, to try and bring stability to our community.

Re: Cron Development

Thanks Dan, that is all very helpful.  I enjoyed the presentation slides.

I still do not understand how features are rated as being important. If you add up the votes from all 3 issues that relate to the same root cause, it's the 11th most-voted issue.

I've done enough pushing and will leave this one alone.  We are happy with the patch we developed and will continue to use that until the next upgrade when I will develop an upgraded version or use something in core that has come in.  Others are free to implement it if they feel it tests safely in their environment.

I'm still trying to figure out whether implementing just the locking class and putting some locks into cron, based on MDL-25499 and its associated specification, would likely be accepted as a first step towards implementing cron upgrades. Is anybody able to comment on that?

Re: Cron Development

Just noting that this is blocking some work on mod_assign for me so I have started picking up the subtasks of MDL-25499 and implementing them. So far the locking framework (with 4 kinds of lock) and the black magic allocator are done and waiting for peer review.

Re: Cron Development

Update - the locking changes have been integrated and the cron changes are up for review. I started some dev docs for the new API here: http://docs.moodle.org/dev/Task_API

Re: Cron Development
Thanks Damyon. I really like the look of this - it's something which many people have wanted for years but nobody ever actually implemented.

One question is whether there are restrictions on what the admin can do with the task? I am thinking in particular about a developer writing some code that really depends on being executed by some time and the admin could change the schedule which might have unintended consequences (Though, as I was writing that comment it made me realise that you don't have many guarantees at the moment with a cron task).

Another thing is that for this API to be effective we really need cron to be executed much more frequently than some sites do now - this might be tricky whilst the old cron and new scheduled tasks coexist. I think that is good motivation to try and move as many tasks as we can to the new API. Note: I'm not arguing we should deprecate the old API, just try and move core towards using the new one.

Re: Cron Development

Wow! Why did this get submitted for integration before any post in the forum? This new system is a really good idea in general (one that was first talked about in the Czech Republic in 2008), but before the code was written, it would have been good to have an up-to-date proposal discussed in the forums.

What they have implemented so far is described at http://docs.moodle.org/dev/Task_API, which looks good, but there are some issues:

1. For ad-hoc tasks, then some sort of handle should be returned, that can be used to query the state of the task.

For example, suppose backups are converted to be background tasks. Then, once a backup of a course has been requested, we need to be able to display in the UI that a backup is still in progress. Otherwise the user may wonder what is happening, and keep starting new ones, in the same way that if a form is slow to submit, some people will click again, which leads to problems.

2. Should what happens at the end of the task be handled centrally?

Continuing the backup example, suppose we want to send an email to the user when their backup is complete. Well, it could be that the backup code needs to implement it. Which would work fine. However, it might be the case that we then end up with very similar code in many task implementations. It might be better overall to have a range of standard actions that can be taken when a task completes.

Also, what if the important task completes successfully (e.g. doing the backup and saving it in the course backup area) but then the message sending code throws an exception? Do we really want to repeat the backup? I'm not sure.

3. Are we sure pseudo-unix-cron notation is the best way to specify the frequency?

4. This code must have potential race conditions, but there is no mention of them in the testing instructions for MDL-25505?

Actually, for such a big change, the testing instructions seem woefully inadequate.

Re: Cron Development

My opinion (note I haven't been involved in this and I might be slightly biased by the basis of wanting to get this done!):

1. For ad-hoc tasks, then some sort of handle should be returned, that can be used to query the state of the task.
2. Should what happens at the end of the task be handled centrally?

I don't see this as part of the job for this API, I don't think you will construct a model which fits all needs well, and so we shouldn't. Some tasks will need to be presented to students, others might be general background processing of interest only to the admins. I think that trying to construct something centralised like this will just result in a bit of worst of all worlds compromise.

3. Are we sure pseudo-unix-cron notation is the best way to specify the frequency?

Yes. I think it is the de facto standard and for once we should try and avoid re-engineering a solved problem. (Alternatively, tell us what's wrong with using it?)

Re: Cron Development

Thanks for looking at the issue Tim - more feedback is good.

1. For ad-hoc tasks, then some sort of handle should be returned, that can be used to query the state of the task.

I think that each plugin will have different needs in regards to ad-hoc tasks. I have thought about this for editpdf - and I will use locking there to prevent collisions. My case is that I will have a task that can be done in either the background or the foreground. At the time a student submits a PDF - I will create a background task to process it into images ready for the edit pdf interface. If the teacher uses the edit pdf tool before the background task has run - the pdf conversion will run in the foreground and the user will have to wait for the progress bar (like they do now). Locking will be used to prevent a background and a foreground conversion from running at the same time. This example shows (to me) that each situation has many nuances and my requirements are different to the backup situation you described above - if the backup code needs to track the backups waiting to complete - they should do it themselves.
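A very rough sketch of that foreground/background arrangement is below. It deliberately does not use the real lock classes: `demo_lock_store`, `convert_submission()` and the `'pdf_conversion'` lock type are stand-ins made up for this example, and the in-memory store only illustrates the control flow (a real lock has to be shared between the cron process and the web request).

```php
<?php
// In-memory stand-in for a lock store (hypothetical, for illustration only).
final class demo_lock_store {
    private array $held = [];

    public function try_lock(string $type, string $resource): bool {
        $key = $type . ':' . $resource;
        if (isset($this->held[$key])) {
            return false;
        }
        $this->held[$key] = true;
        return true;
    }

    public function release(string $type, string $resource): void {
        unset($this->held[$type . ':' . $resource]);
    }
}

/**
 * Convert a submission unless it is already converted or being converted.
 * $converted stands in for the stored page images, keyed by submission id.
 */
function convert_submission(demo_lock_store $locks, array &$converted, int $submissionid, string $caller): string {
    if (isset($converted[$submissionid])) {
        return 'already converted';
    }
    if (!$locks->try_lock('pdf_conversion', (string)$submissionid)) {
        return 'busy'; // Someone else is converting right now.
    }
    try {
        $converted[$submissionid] = $caller; // The slow conversion goes here.
        return 'converted by ' . $caller;
    } finally {
        $locks->release('pdf_conversion', (string)$submissionid);
    }
}

$locks = new demo_lock_store();
$converted = [];
// The teacher opens the editor before cron has got to submission 42.
$foreground = convert_submission($locks, $converted, 42, 'foreground');
// The background task runs later and finds nothing left to do.
$background = convert_submission($locks, $converted, 42, 'background');
// Submission 7 is mid-conversion elsewhere, so a second attempt backs off.
$locks->try_lock('pdf_conversion', '7');
$busy = convert_submission($locks, $converted, 7, 'foreground');
```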

Re: Cron Development

More responses to Tim

2. Should what happens at the end of the task be handled centrally?

There is nothing wrong with an adhoc task queuing another adhoc task just before it completes. This covers the situation that you describe.
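In outline it would behave like the sketch below. This uses a plain `SplQueue` of closures as a stand-in for the adhoc task queue (it is not the Task API itself), and the failing "email" step is contrived to show that a failure in the follow-up does not touch the backup that already completed.

```php
<?php
$queue = new SplQueue();
$log = [];

// The "backup" task: does its work, then queues the follow-up as a separate task.
$queue->enqueue(function () use ($queue, &$log) {
    $log[] = 'backup done';
    $queue->enqueue(function () {
        // Contrived failure in the notification step.
        throw new RuntimeException('mail server down');
    });
});

// Minimal runner: each task succeeds or fails on its own.
while (!$queue->isEmpty()) {
    $task = $queue->dequeue();
    try {
        $task();
    } catch (Throwable $e) {
        $log[] = 'task failed: ' . $e->getMessage();
    }
}
```

Only the notification would need retrying here; the backup itself has already finished as its own task.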

Re: Cron Development

More responses to Tim:

3. Are we sure pseudo-unix-cron notation is the best way to specify the frequency?

If this was shown to teachers/students I would be concerned - but this is for admins only and the syntax is well known and well documented. Is it worth spending weeks devising some new super-duper reinvention of the Google Calendar interface for a single admin page? I think not.

4. This code must have potential race conditions, but there is no mention of them in the testing instructions for MDL-25505?

Locking was already integrated - and the use of locking code in this new patch has been peer reviewed many times. I don't have any suggestions for a test that could cover this sensibly. Running cron in a loop would not tell you much - how would you detect a potential issue?

Actually, for such a big change, the testing instructions seem woefully inadequate.

I will add to the testing instructions - but the new code is covered by unit tests and the legacy crons and new scheduled tasks are mostly just cut/pasted to new locations.

Re: Cron Development

This is a good example:

"At the time a student submits a PDF - I will create a background task to process it into images ready for the edit pdf interface. If the teacher uses the edit pdf tool before the background task has run - the pdf conversion will run in the foreground and the user will have to wait for the progress bar (like they do now). Locking will be used to prevent a background and a foreground conversion from running at the same time."

What happens in this case?

Teacher tries to use the edit pdf tool after the background task has started, but before it has finished.

In this case, ideal behaviour is

The UI shows the same progress bar that the teacher would see if the background task had not started yet. That shows the progress of the background task until it has completed, and then, when the background task completes, the UI continues as if it had been a foreground task.

The only difference would be that the teacher sees a progress bar that starts somewhere in the middle, rather than at the start, because the background task has already started.

Is the task API going to be able to support this use case?

Re: Cron Development
> Is the task API going to be able to support this use case?

I really feel that is for the higher level (e.g. mod_assign) to track the status of the task (the assignment), not this API. Especially as you are going to have to store mapping to correlate the task with the user/ what should be displayed etc. anyway.

Re: Cron Development

It may be higher level, but that does not mean we want mod_assign, mod_quiz, backup, ... all implementing this themselves with duplicate code.

Yes, mod_assign will need to store the mapping. Hence my original question: is it possible to get a 'handle' for a scheduled task that uniquely identifies it? I guess it might actually be better to ask how locks are identified, since the key thing is the lock on 'convert XXX pdf is running now'.

I guess I need to read http://docs.moodle.org/dev/Lock_API, and the code example there makes it clear.

$locktype would be 'mod_assign_pdf_conversion' and $resource would be submission id or something.

That uniquely identifies the lock / task.

Now, we don't want to mix progress tracking into the lock API, which is meant to be low-level, simple, and reliable, but we can build on top of it. (And also build on the progress tracking API that Sam Marshall built for the 2.6 backup work.)

The higher level API would let you report / track progress for a lock. In the scheduled task you would do something like

    $progress = new \core\lock\progress_reporter($locktype, $resource);
    // Or should that be \core\progress\progress_for_lock?
    // Anyway, $progress would be a subclass of core\progress\base.
    $progress->start_progress('Description here', $max);
    // ...
    // During the task
    $progress->progress($currentprogress);
    // ...
    // Finally
    $progress->end_progress();

Then, some other code that needed to track progress of another task would do:

    $tracker = new \core\lock\progress_tracker($locktype, $resource);
    $tracker->display_progress_bar_until_complete();

Hmm, this needs more thought. The 'get progress tracker' API really needs to be a logically atomic operation: if no one has this lock, claim the lock; otherwise, if someone else has the lock and it has a progress tracker, then give me that progress tracker.
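Something like this, perhaps. It is purely a sketch of the "claim or observe" shape: `demo_progress` and `demo_progress_registry` are invented names, and because this toy version lives in one process the operation is trivially atomic, whereas a real one would have to make the check-and-claim atomic across processes (e.g. on top of the lock itself).

```php
<?php
// Hypothetical progress record shared between the owner and any observers.
final class demo_progress {
    public int $current = 0;
    public int $max;

    public function __construct(int $max) {
        $this->max = $max;
    }
}

final class demo_progress_registry {
    /** @var demo_progress[] active trackers keyed by "locktype:resource" */
    private array $active = [];

    /**
     * If nobody holds the lock, claim it and get a fresh tracker; otherwise
     * get the existing holder's tracker to observe.
     *
     * @return array [bool $owner, demo_progress $tracker]
     */
    public function claim_or_observe(string $locktype, string $resource, int $max): array {
        $key = $locktype . ':' . $resource;
        if (isset($this->active[$key])) {
            return [false, $this->active[$key]];
        }
        $this->active[$key] = new demo_progress($max);
        return [true, $this->active[$key]];
    }

    public function release(string $locktype, string $resource): void {
        unset($this->active[$locktype . ':' . $resource]);
    }
}

$registry = new demo_progress_registry();
// The background task claims the conversion and gets part way through.
[$bgowns, $bgtracker] = $registry->claim_or_observe('mod_assign_pdf_conversion', '42', 100);
$bgtracker->current = 40;
// The teacher's request arrives mid-way and is handed the same tracker.
[$fgowns, $fgtracker] = $registry->claim_or_observe('mod_assign_pdf_conversion', '42', 100);
```

The foreground request would then draw its progress bar starting at 40 out of 100 rather than at zero.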

But, you get the idea. I am sure something like that is doable, but we agree it should be a layer on top of the existing classes.

Re: Cron Development

That is a good point you raise about running cron more frequently but not wanting to affect legacy cron.

There are 2 things to consider here:
1. the legacy cron is now run by a new scheduled task - so the schedule for it can be adjusted. The default for this is set to run every minute.

2. the new scheduled task uses locking, so no matter how often the legacy cron is due to run, there will only ever be a single instance of it running.

At NetSpot we ran the crons every few minutes because people wanted notification emails to be sent out ASAP. Cron has always had a cronlock - so it would not run multiple crons at the same time, and running it as often as possible had no real downsides.

I don't think there is anything that needs changing here - let me know if you think otherwise.

Re: Cron Development

Hi Damyon,

I assume that you mean that Cron always had a lock at NetSpot because this wasn't a core feature.

At Lancaster University, we ran our cron jobs every minute and wrapped them in a per-site dotlockfile. As you say, it had no downside and many positives. I think this should be encouraged more and it will no longer be necessary to wrap now that we have locking from 2.7.
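For anyone who has not used that kind of wrapper, the idea is simply "skip this run if the previous one is still going". The sketch below shows the same idea in PHP using `flock()` rather than dotlockfile (so it is a swapped-in technique, not what we actually ran); `run_exclusively()` and the lock file name are made up for the example.

```php
<?php
/**
 * Run $work only if no other process currently holds the lock file.
 *
 * @return bool true if the work ran, false if another run was in progress
 */
function run_exclusively(string $lockfile, callable $work): bool {
    $handle = fopen($lockfile, 'c'); // Create if missing, never truncate.
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file: $lockfile");
    }
    try {
        // Non-blocking: give up straight away if someone else has the lock.
        if (!flock($handle, LOCK_EX | LOCK_NB)) {
            return false;
        }
        try {
            $work();
        } finally {
            flock($handle, LOCK_UN);
        }
        return true;
    } finally {
        fclose($handle);
    }
}

// One lock file per site, so different sites never block each other.
$lockfile = sys_get_temp_dir() . '/demo_site_cron.lock';
$ran = run_exclusively($lockfile, function () {
    // The real cron work would go here.
});
```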

Andrew

Re: Cron Development

Maybe it was a customisation (looks like it was) - but the point is that there is no downside to running the legacy cron as often as possible as long as it doesn't run simultaneously.
