Need advise on moodledata/lock (Lock API)

Need advise on moodledata/lock (Lock API)

by Visvanath Ratnaweera -
Number of replies: 11
Picture of Particularly helpful Moodlers Picture of Translators

Ping! To the address of Moodle developers. Need advise on a moodledata/lock/ went out-of-control. Not that there is a crisis right now, but I want to be ready when that happens again. ;)

Here is the discussion: Has Ghostscript something to do with a half a million lock files?


Average of ratings: -
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API)

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
I need to come back to the developers, since I suspect a bug, a race condition in the processing of ad-hoc tasks, specially when they are slow.

The original thread Has Ghostscript something to do with a half a million lock files? was about 500 k lock files which were created a long time ago, June/July 2021. After we deleted them on 23 Nov they didn't reappear. No problem so far.

But then on the 26 Nov morning the CPU load average took a steep uphill, passed 20 (in a server with 40 cpu threads) and the Moodle notices page showed cron.php not run a whole day - although it was set to run, and was running, every 2 minutes.

The cron.php log looks like this:
===
Server Time: Fri, 26 Nov 2021 08:54:00 +0530

Execute scheduled task: Background processing for caches (core\task\cache_cron_task)
... started 08:54:00. Current memory use 18.9MB.
Cleaning up stale session data from cache stores.
... used 0 dbqueries
... used 0.009213924407959 seconds
Scheduled task complete: Background processing for caches (core\task\cache_cron_task)
Execute scheduled task: Automated backups (core\task\automated_backup_task)
... started 08:54:00. Current memory use 19.8MB.
Checking automated backup status...INACTIVE
... used 0 dbqueries
... used 0.20417094230652 seconds
Scheduled task complete: Automated backups (core\task\automated_backup_task)
[many more lines]
Execute scheduled task: Background processing for scheduled allocation (workshopallocation_scheduled\task\cron_task)
... started 08:54:01. Current memory use 41.7MB.
... no workshops awaiting scheduled allocation. ... used 1 dbqueries
... used 0.0008080005645752 seconds
Scheduled task complete: Background processing for scheduled allocation (workshopallocation_scheduled\task\cron_task)
Ran 0 adhoc tasks found at Fri, 26 Nov 2021 08:54:00 +0530
Cron script completed correctly
Cron completed at 08:54:01. Memory used 41.7MB.
Execution took 0.987506 seconds
--> Now here there is a huge binary blob!!!
--> followed by
Convert 1 submission attempt(s) for assignment 808
Convert 1 submission attempt(s) for assignment 808
Convert 1 submission attempt(s) for assignment 808
Convert 1 submission attempt(s) for assignment 808
[repeated close to 100 times]
... used 83686 dbqueries
... used 4059.3367171288 seconds
Scheduled task complete: Prepare submissions for annotation (assignfeedback_editpdf\task\convert_submissions)
Stopping processing of scheduled tasks as time limit has been reached.
Ran 0 adhoc tasks found at Fri, 26 Nov 2021 08:30:00 +0530
Cron script completed correctly
Cron completed at 09:37:50. Memory used 1.2GB.
Execution took 4070.286554 seconds
===

Notice that this log file started at 08:54:00 and reports a task which was aborted at 09:37:50 after consuming 68 minutes (4070 seconds). Which means that the process must have started at 08:30:00!

What does that mean? Was there a race condition? The binary data in the log file are typical for a lost or confused file descriptors. Is it possible that the same thing happened in June/July and created that sea of lock files? Or, the deleted lock files broke the logic of the Lock API.

This is an important production server of an academic institution which has become "strategically important" during the pandemic! It is not a server I maintain, just a trouble shooter.
;-(
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators

RESOLVED

The cause it not directly connected to Moodle. This is what happened.

On 24 late evening we moved the Moodle crons to user www. There were a dozen smaller, less active sites. The earlier owner was root. All tested and worked. But what we, the Linux admins, could not know that the log rotation mechanism in the FreeBSD box, called newsyslog, would change the ownership of the cron logs back to root, at least in its default configuration.
sad

That happened midnight and for 32 hours, until 26th morning, no cron.php ran. On 26th morning the sysadmin ran cron.php manually, means without the log files, and the process ran. But the backlog sent the CPU load to the ceiling - at the most inconvenient time. The log file above is from that time, when the sysadmins were trying to get the crontab controlled cron.php running. Either a change in file permissions or swapping the redirection in the crontab from overwrite '>' to append '>>' made _two_ processes to write to the same log file. Somehow an ad-hoc task, the document converter, which is called every 30 min woke up and started writing in to a file descriptor which was 30 min old corrupting it. That is the indirect connection to Moodle. The processes may wake up and write to the same standard input and standard error streams!

The developers may relax!
wink

In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Well, never say never!

The (FreeBSD) system specialist has doubts whether Moodle's delayed cron jobs are compatible with redirecting cron job outputs to a (normal) file, meaning things like > /path/to/logfile or >> /path/to/logfile in Apache user's crontab. I noticed that the docs https://docs.moodle.org/311/en/Cron#Finding_the_right_place_to_put_the_command mentions only > /dev/null, which is not the same!

So, the remaining question is, whether > /path/to/logfile or >> /path/to/logfile are allowed in the crontab of a production server?
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Ken Task -
Picture of Particularly helpful Moodlers

I see we are in 'developers forum' ... I probably shouldn't respond, but ... here am I ... legwork?

>/dev/null 2>&1

2>&1 indicates that the standard error (2>) is redirected to the same file descriptor that is pointed by standard output (&1).

So, both standard output and error will be redirected to >/dev/null

which is la-la land

check out:
https://fullstack-tutorials.com/linux/cron-job-save-output-to-a-log-file

>> appends ... which means that .log file should grow larger.
Even to the point where nano cannot open! smile  Not sure I'd leave the cron job like that once you've ironed out issues.

https://docs.moodle.org/311/en/Cron
* * * * * /usr/bin/php  /path/to/moodle/admin/cli/cron.php >/dev/null

NOTE: the final >/dev/null sends all the output to the 'bin' and stops you getting an email every 1 minute.

Also see:
https://docs.moodle.org/311/en/Administration_via_command_line#Scheduled_tasks

Think, for what you want, where in the command you place  >2 is key.   Tricky stuff and I've not had enough coffee to explain! [actually, coffee + asprin cause it makes my 'feeble' hurt thinking about it! smile]

'SoS'. Ken


In reply to Ken Task

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Hi Ken

Glad you came in. After talking to myself twice I was starting to get bored.
;-(

That's all well and good, but my question was whether one or both of these crontabs are incompatible with Moodle?
* * * * * /path/to/php  /path/to/moodle/admin/cli/cron.php > /path/to/logfile
* * * * * /path/to/php  /path/to/moodle/admin/cli/cron.php >> /path/to/logfile
As mentioned there, I am thinking of the delayed execution of ad-hoc tasks.

BTW, nitpicking as a true sysadmin, I want to correct this:
> * * * * * /usr/bin/php  /path/to/moodle/admin/cli/cron.php >/dev/null
> NOTE: the final >/dev/null sends all the output to the 'bin' and stops you getting an email every 1 minute.

The final /dev/null sends only the standard output (file descriptor 1) to Nirwana. The error output (file descriptor 2) will still be mailed. If you want both to be sent to Nirwana, add the 2>&1 thing.
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Ken Task -
Picture of Particularly helpful Moodlers

adhoc task is called at the tail end of cron.php if there are any task
that failed to complete from the last call of cron - I thought.

In a 311, moodlecode/admin/cli/
cron.php and adhoc_task.php exist

php adhoc_task.php -e (should render to screen)
Server Time: Sun, 12 Dec 2021 14:09:40 -0600

Ran 0 adhoc tasks found at Sun, 12 Dec 2021 14:09:40 -0600

touch /var/log/m311adhoc.log
to create empty .log file.

php adhoc_task.php -e > /var/log/m311adhoc.log

gets desired text in log file.  No email.

So cron job setup:

/usr/bin/php /path/to/moodlecode/admin/cli/adhoc_task.php -e > /var/log/m311adhoc.log

should write once ... next execution m311adhoc.log file over written.

A >> redirect would/should append to m311adhoc.log.

CLI test ...

[root@server cli]# php adhoc_task.php -e >> /var/log/m311adhoc.log
[root@server cli]# php adhoc_task.php -e >> /var/log/m311adhoc.log
[root@server cli]# php adhoc_task.php -e >> /var/log/m311adhoc.log

[root@sos cli]# cat /var/log/m311adhoc.log
Server Time: Sun, 12 Dec 2021 14:18:18 -0600

Ran 0 adhoc tasks found at Sun, 12 Dec 2021 14:18:18 -0600
Server Time: Sun, 12 Dec 2021 14:27:31 -0600

Ran 0 adhoc tasks found at Sun, 12 Dec 2021 14:27:31 -0600
Server Time: Sun, 12 Dec 2021 14:27:39 -0600

Ran 0 adhoc tasks found at Sun, 12 Dec 2021 14:27:39 -0600
Server Time: Sun, 12 Dec 2021 14:27:42 -0600

Ran 0 adhoc tasks found at Sun, 12 Dec 2021 14:27:42 -0600

'SoS', Ken

Average of ratings: Useful (1)
In reply to Ken Task

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
> touch /var/log/m311adhoc.log
> to create empty .log file.
>
> php adhoc_task.php -e > /var/log/m311adhoc.log
>
> gets desired text in log file. No email.

No email because you do it manually (in the CLI). Sending mails is a business of the cron environment.

From 'man cron'
       cron  then wakes up every minute, examining all stored crontabs, check‐
       ing each command to see if it should be  run  in  the  current  minute.
       When  executing  commands,  any  output  is  mailed to the owner of the
       crontab (or to the user named in the MAILTO environment variable in the
       crontab,  if  such  exists).  The children copies of cron running these
       processes have their name coerced to uppercase, as will be seen in  the
       syslog and ps output.


Anyway, your evidence shows that both crontabs in my previous post are feasible, right?
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Ken Task -
Picture of Particularly helpful Moodlers

This part of your question:

* * * * * /path/to/php  /path/to/moodle/admin/cli/cron.php > /path/to/logfile
* * * * * /path/to/php  /path/to/moodle/admin/cli/cron.php >> /path/to/logfile
As mentioned there, I am thinking of the delayed execution of ad-hoc tasks.

Both/either would/should work ... the >> will increase the size of that one log file as output is appended to the logfile.   If allowed to do that forever, that log file could get huge ... in a year or two.   You might have to add that log to log rotations to prevent other issues down the road with logging.

But ... thought adhoc_task is the last thing cron job does:

Ran 0 adhoc tasks found at Mon, 13 Dec 2021 10:45:54 -0600

so not sure how one can 'delay' execution of.   That is supposed to be a cache of jobs that did not complete for some reason ... delaying further makes me think those won't get better but the opposite ... worse.   One of those task is prep for grading using PDF annotation of assignments.

IF that is the ID'd problem, log the adhoc_task just for that ... think there are two:
\assignfeedback_editpdf\task\convert_submissions
\mod_assign\task\cron_task

In a sandboxed 311 task list:
\assignfeedback_editpdf\task\convert_submissions   */15 * * * * *    Tuesday, 7 December 2021, 8:30 AM        Tuesday, 7 December 2021, 8:45 AM

'SoS', Ken


In reply to Ken Task

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Hi Ken

> Both/either would/should work ...

In that case, our suspicion has no ground. We suspected that the first one, the single arrow, corrupts the log file.

> the >> will increase the size of that one log file as output is appended to the logfile. If allowed to do that forever, that log file could get huge ... in a year or two. You might have to add that log to log rotations to prevent other issues down the road with logging.

Year or two? Never! The log rotate mechanism takes a bz compacted copy and resets the file. The whole idea is that the system admins should check those files regularly.

> But ... thought adhoc_task is the last thing cron job does:
> ...
> so not sure how one can 'delay' execution of.

"last", yes. But unfinished tasks get awakened, say in 15 min intervals, which appear like delays. The fear is that the parent cron.php process keeps a tab at the place when it started, and the child process writes to that place, even if that tab is history!

We leave at the second option, the double arrow. Need to observe the files, when there are slow tasks like the pdf conversion of scanned exam papers.
In reply to Visvanath Ratnaweera

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Ken Task -
Picture of Particularly helpful Moodlers

Hmmm .... differences between CentOS and Debian?

Those cron logs in my /var/log/ are not rotated automagically.   Since am using the single > log file is overwritten ... thus varies size wise but I've never run across one of those corrupted.

Thinking out-loud here ... wonder if disk has some sector/node issues?

'SoS', Ken


In reply to Ken Task

Re: Need advise on moodledata/lock (Lock API) [RESOLVED]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
> differences between CentOS and Debian?

Neither nor. These are Jails in FreeBSD, old 12.1. Not a server I maintain. For details see "our sister thread" Has Ghostscript something to do with a half a million lock files? (linked to the OP)

> Those cron logs in my /var/log/ are not rotated automagically.

Log rotation is I believe a site-topic. It is in place since years. Other than the newsyslog meddling with the file owner overnight, it was doing its job.

> Since am using the single > log file is overwritten ... thus varies size wise but I've never run across one of those corrupted.

So, you have '> /var/log/cronlog' in your (Linux) web servers, no problems there, even under Moodle rel. 3.11? The ad-hoc processing was continuously expanding.

> Thinking out-loud here ... wonder if disk has some sector/node issues?

The server was so badly maintained, anything is possible. Before I gladly give it back, I want to know the cause. It is a postmortem; it is worse, the patient is back walking, working. Once he falls flat, the kin will visit the last doctor who treated him.
sad