Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -

Hi all,

I need help VERY fast here.

We updated from 3.8.1 to 3.10.3, and some time later, when cron kicked in, this task started:

10.6G -  2h42:35 - php admin/cli/cron.php (March 27, 09:13:21 Scheduled task: assignfeedback_editpdf\task\convert_submissions)

It has now been running for 6 hours and has eaten roughly 50GB of space (the whole Moodle is ~600GB and right now we are at 99% disk usage).

Most of the space is used in:
 /tmp/requestdir/SfMN
Inside there are a couple of folders like
/605edb11af11a
/605e5ca9ad441

and inside those are further folders, each with a "source.pdf" file

using an enormous amount of space.

It's still running, and I'm wondering:
1. What does it do?
2. How can I stop it?
3. How can I reclaim the space it used?

Thanks in advance. I hope someone can answer within the next hour, as otherwise I'll need to kill processes and stop Moodle.

In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Addition: it looks like it is processing the same assignments again and again. (This is the log, processed and sorted in Excel. Within one run it only picks distinct entries, but it comes back to the same assignment on the next cron run.)
Any thoughts?

Convert 1 submission attempt(s) for assignment 44710
Convert 1 submission attempt(s) for assignment 44710
Convert 1 submission attempt(s) for assignment 44710
Convert 1 submission attempt(s) for assignment 44710
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44711
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44712
Convert 1 submission attempt(s) for assignment 44727
Convert 1 submission attempt(s) for assignment 44730
Convert 1 submission attempt(s) for assignment 44730
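
The same per-assignment tally can be produced directly on the server instead of in Excel. A minimal sketch, assuming the cron output has been saved to a file (`cron.log` is a hypothetical name) and that the message format is exactly the one shown in the log above:

```shell
# Count "Convert ... for assignment N" lines per assignment ID, busiest first.
# Reads cron output on stdin and prints "<assignment id> <count>" per line.
count_conversions() {
    awk '$1 == "Convert" && $6 == "assignment" { print $7 }' \
        | sort -n \
        | uniq -c \
        | sort -k1,1nr -k2,2n \
        | awk '{ print $2, $1 }'
}

# Example usage (cron.log is a hypothetical file name):
# count_conversions < cron.log
```

An ID that keeps a high count across several cron runs is a likely candidate for a submission that never converts successfully.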
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

That shows five assignment IDs ... find out what course they are in and inspect the parameters of those assignments.

It appears you have command-line access, so maybe this will help:

getassigninfo script

mysql -u root -p'' -e "use moodle;select id,course,name,completionsubmit from mdl_assign;" > assignments.txt; cat assignments.txt;wc -l assignments.txt

Output will show at tail end something like:

6607    28    Unit 11 Quiz 3rd Period Remote Non Calculator    1
6608    28    Unit 11 Quiz 3rd Period Remote Calculator    1
6609    162    Debi's Story     1
6610    9    Unit 11 Quiz Work    1
3706 assignments.txt

The second column above is the course ID so you can find the courses where your 447xx assignments are located.

'SoS', Ken

In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Visvanath Ratnaweera -
Visit your-site/mod/assign/view.php?id=447XX to find the offending assignments.
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Visvanath Ratnaweera -
Luckily we never had problems from this end. So I'm only guessing.

There must be a "workflow" on a mass scale where the students submit their work as PDFs and teachers correct that work in the built-in PDF editor of Moodle. This tends to generate ridiculously big image files compared to text or even office formats. It reminded me of this long and involved discussion in the German community: https://moodle.org/mod/forum/discuss.php?d=388993.

99% disk usage is very dangerous. Unless you can release some space in moodledata immediately, I would put the site in maintenance mode! Find the disk usage of the sub-directories under moodledata. Only filedir is absolutely critical.
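
A minimal sketch for finding which sub-directories hold the space, assuming GNU coreutils as shipped with Ubuntu. The path `/var/moodledata` is a placeholder, not taken from this thread; use the dataroot configured for your site.

```shell
# List the immediate subdirectories of a directory, largest first (sizes in KiB).
# $1 = directory to inspect (default: current dir), $2 = how many rows to show.
largest_subdirs() {
    dir=${1:-.}
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example usage (/var/moodledata is a hypothetical path):
# largest_subdirs /var/moodledata
# largest_subdirs /tmp/requestdir
```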
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Might be a good idea to give a description of your hosting and setup.  Only by looking at other posts you've made do we see hints/clues ... the path in info shared in one posting contains /client/, which is a clue that there are multiple sites running on one server.   So where do you host? Are you leasing a shared hosting solution/package?

We can see Linux clues (forward leaning slashes) and the fact you say 'killing processes' indicates command line access?

How did you upgrade?   Does provider have a cPanel icon for upgrading your Moodle?

Conversion of assignments in Moodle can be done in 2 ways. The first is unoconv, which involves Python and a minimal install of LibreOffice. The unoconv script calls LibreOffice in headless mode via a 'listener', which converts documents to PDF (if it can) one at a time for annotation/grading.

The other method of converting files is to use Google Document Conversion.

Which is your setup ... looks like unoconv to me from info shared so far.

In newer versions of Moodle, assignment setups have an option for restricting the type of file submitted.   A * means any document type ... it could be restricted to .pdf, .doc, or .docx or ? upon submission.   So do you have a course where students might have uploaded (no restriction) some file type that can't be converted to a PDF?

Also, is there any assignment made where students had to upload multiple files?   Those would require the joining of the converted PDF's to show in one PDF in the grading/annotation area.  That equals heavier processing.

In the setup of doc conversions, the Moodle admin interface for that does show what unoconv can convert as opposed to using Google.   While unoconv supports more formats, it can be problematic (such as you are experiencing) and less reliable than using Google.

Rather than shutting down ... put your Moodle in maintenance mode to keep students/teachers out.   That will allow you to log in at admin level and work on stuff.
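
With command-line access, maintenance mode can also be switched on from the shell using Moodle's `admin/cli/maintenance.php` script. A minimal sketch: `/var/www/moodle` is a placeholder for the Moodle code directory, and on most setups the command should be run as the web server user.

```shell
# Assumption: MOODLE_DIR is the Moodle code root; /var/www/moodle is a placeholder.
MOODLE_DIR="${MOODLE_DIR:-/var/www/moodle}"

if [ -f "$MOODLE_DIR/admin/cli/maintenance.php" ]; then
    # Use --disable afterwards to reopen the site.
    php "$MOODLE_DIR/admin/cli/maintenance.php" --enable
else
    echo "maintenance.php not found under $MOODLE_DIR" >&2
fi
```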

Suggest attempting to see what's in one of those tmp .pdf files.

Sorry ... no quick fix! :|

Space ... one can, if there is anything in it, remove any/all files/folders in moodledata/trashdir/ manually.   If cron isn't reaching its cleanup task, that directory could contain lots of files no longer needed.

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi Ken!
Thanks for the suggestions!

Hosting: it is our own VPS, Ubuntu + ISPConfig. I'm in charge here, and everything was set up properly back with Moodle 3.0. I don't think I have made any major changes to that config since.

I always upgrade manually, and I've been doing this for 6 years now :D So nothing should have been broken here.

It used to use unoconv, but I see BOTH Google and unoconv disabled in /admin/settings.php?section=managefileconverterplugins. As there are also admins who look after the inner side of Moodle, I'll check why this was done. (However, /tmp is still filled with converted PDFs, so I believe conversion is still happening even when both of those are turned off?)

Concerning the multiple files: we have a powerful 8-core dedicated CPU, so this should not be a problem, as we're usually at 0.1 load and never had this trouble.

If I stop cron and rerun it, it looks like the /tmp folder cleans itself, so I have the space back for now and will experiment further to find out why this breaks.

Also, as I'm mostly a server support guy (not actually a Moodle expert), can you point me to the way to limit the file types for conversion (* or doc, ...)? Is it done per course, or can it be done globally?

Thanks again for the reply. I'm going to check things further and will report once I find a solution :)
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

+1 to what Mr. V said! :)

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi guys,
Thanks to both of you for the suggestions.

I have re-enabled unoconv now, however this doesn't help much: the conversions keep going. (See my example cron log: https://pastebin.com/PGHsJjjk. I didn't want to paste it here, to save space.) Scroll down and you'll see it just picks random submissions each time, however in the long run most of them would be duplicates ...

The link Mr. V. suggested shows the assignment by ID, so yes, I can see it there, however I don't know how to use that information to fix my trouble :D

My idea: could I somehow mark ALL assignments as processed, or as not requiring any conversion, so that only NEW submissions would be processed? I do have direct DB access and can run queries easily, so I just need a hint on where those fields are stored. (Do you think this could work?)
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Did you look at mdl_assign? Didn't share that query for nothin'! :|

Pastebin clips ...

Conversion failed with error:nopermission
several different ID numbers there - submission attempts ... that's submission attempts.

No permission?   Have never seen before.

Are you using some plagiarism plugin?

It's never a good idea to fix a problem via direct editing of DB tables ... it might have a domino effect, as you know tables are relational.   Have had to do that seldom ... and from that day forward, the instance had issues in other areas.

In code/admin/tool/task/cli/ there is schedule_task.php

On one site I have some bash shell scripts in that directory for cleanup ... called cleanup

Contents:

php schedule_task.php --execute="\logstore_standard\task\cleanup_task";
php schedule_task.php --execute="\core\task\backup_cleanup_task";
php schedule_task.php --execute="\core\task\cache_cleanup_task";
php schedule_task.php --execute="\core\task\file_temp_cleanup_task";
php schedule_task.php --execute="\core\task\file_trash_cleanup_task";
php schedule_task.php --execute="\core\task\session_cleanup_task";
php schedule_task.php --execute="\core_files\task\conversion_cleanup_task";
php schedule_task.php --execute="\tool_recyclebin\task\cleanup_category_bin";
php schedule_task.php --execute="\tool_recyclebin\task\cleanup_course_bin";
php schedule_task.php --execute="\core\task\messaging_cleanup_task";
php schedule_task.php --execute="\core\task\delete_incomplete_users_task";
php schedule_task.php --execute="\core\task\delete_unconfirmed_users_task";

Not sure what version of Moodle you are running ... but ... higher versions of Moodle have an addition to cron ... adhoc_task

path:

code/admin/tool/task/cli/

script: adhoc_task.php

Those are jobs Moodle couldn't finish for some reason and are now in a queue to be executed from where they left off ... those can cause such issues.

And still, even if your role is admin backend, you do need to visit those courses and assignments to see what they are!!!!

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi again,
just some short info (and no, I'm not using a plagiarism plugin).
I started checking all IDs with "/mod/assign/view.php?id=447XX"
and what I found is that the response is either:

"Invalid course module ID"
(Debug info:
Error code: invalidcoursemodule
Stack trace:
  • line 2245 of /lib/modinfolib.php: moodle_exception thrown
  • line 30 of /mod/assign/view.php: call to get_course_and_cm_from_cmid())
or

"Can't find data record in database."

However, I do see them in the mdl_assign table... (see image attached)

Any ideas what's going on here? :D
Attachment moodle.jpg
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Really have no idea as to what is going on with your server!

But, from what you've last shared, looks like I'm gonna go back on what I said about not editing tables (in this case removing rows in related tables).

Says module not found in Web interface but you do see references to them in that table.

A module in Moodlese terms is an activity in a course.
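
One possible explanation for the "Invalid course module ID" message: the stack trace shows view.php calling get_course_and_cm_from_cmid(), so the `id` in `/mod/assign/view.php?id=...` is a course module ID, not the `id` column of mdl_assign. A minimal sketch of a read-only query that maps one to the other, assuming the default `mdl_` prefix; a row with an empty `cmid` would point to an assignment that no longer has a course module.

```shell
# Print a query that maps assignment instance IDs (mdl_assign.id) to the
# course module IDs that mod/assign/view.php?id=... expects.
# Assumes the default "mdl_" table prefix.
cmid_query() {
    ids=$(printf '%s,' "$@")   # join the arguments with commas
    ids=${ids%,}               # drop the trailing comma
    cat <<SQL
SELECT a.id AS assignid, a.course, cm.id AS cmid, a.name
FROM mdl_assign a
LEFT JOIN mdl_modules m ON m.name = 'assign'
LEFT JOIN mdl_course_modules cm ON cm.module = m.id AND cm.instance = a.id
WHERE a.id IN ($ids);
SQL
}

cmid_query 44710 44711 44712 44727 44730
# To run it (database name "moodle" as in the earlier mysql example):
# cmid_query 44710 44711 | mysql -u root -p moodle
```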

If you had a clone of the production server, I'd tell you to make sure you had a backup of the DB - mysqldump - then remove one of those rows and see if the course that contains it has issues.

Risky on production server .. but ... same ... get a mysqldump of your database and then 'gulp' ... remove one row.   Go to course where that module was located ... issue gone ... or other modules not found?   Continue to remove rows - one at a time -  *IF* in the same course ... continue testing course.

Best of luck!

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Visvanath Ratnaweera -
Unless the database ran out of disk space and lost some data!

@OP, Are the root partition (/), the partition where the database keeps its data and moodledata the same, or two or three different ones?
In reply to Visvanath Ratnaweera

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Visvanath Ratnaweera -
Noticed too late: Is it really the /tmp which is swelling?
In reply to Visvanath Ratnaweera

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi,
Yes it is /tmp that is swelling

On the whole, the situation is becoming very weird.

I was so happy when I came in this morning: no conversion tasks had been running since 4 am, and the cron log was clean of conversion tasks.
BUT! The CPU was loaded, 32GB of RAM was in use, SWAP was half used, and there were about 20 LibreOffice processes hanging. So in fact the cron log was empty only because the conversions were hanging and not generating any log output :(
So I rebooted the VPS and got several other errors in the log (see image). It's not in English, but here is the idea:

The top part of the image shows the scheduled task being force-closed (it reports an error working with the DB, as the DB was being stopped for the reboot). But look at the highlighted parts of the queries and the execution seconds of those hung processes! So it looks like there is more than one problem:

1. The task scheduler keeps trying to convert the same submissions
2. Some submissions are from assignments that were made years ago
3. Some assignments cannot be accessed in Moodle
4. The task scheduler fills "/tmp" with temporary PDF files using tens of gigabytes of space
5. The task scheduler (with LibreOffice) hangs for good on some conversions, eating RAM + CPU, and, judging by the cron log, seems to get into an infinite loop
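
Hung converter processes like the ones in point 5 can be spotted without waiting for the cron log. A minimal sketch, assuming the procps `ps` shipped with Ubuntu (for the `etimes` column) and that the LibreOffice processes have `soffice` in their command line:

```shell
# Filter "pid etimes rss args" rows (as printed by ps) down to LibreOffice
# processes that have been running for at least the given number of seconds.
# Prints "<pid> <elapsed seconds> <resident memory in KiB>" per match.
old_soffice() {
    min_age=${1:-3600}
    awk -v min="$min_age" '$2 + 0 >= min && /soffice/ { print $1, $2, $3 }'
}

# Example usage on a live system:
# ps -eo pid=,etimes=,rss=,args= | old_soffice 3600
```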

Concerning the clone dev server: the whole Moodle occupies 700GB (600GB files + 100GB DB), and I don't even have the free space to clone it 1:1 :(
Another headache: why does the conversion task run over those submissions in random order? If it ran them sorted, I could at least see which ID is really stuck and work it out by checking the result of every run.

Concerning the cleanup - i ran all these:
php admin/cli/fix_course_sequence.php  --courses=*
php admin/cli/fix_deleted_users.php
php admin/cli/fix_orphaned_question_categories.php
php admin/cli/check_database_schema.php
php admin/cli/checks.php

and they don't show any errors.

I'm kind of stuck now... Is there any way I can get a comment from a Moodle dev? Or give a dev access to check things, and pay them if this gets sorted?




Attachment 2moodle.jpg
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi Guys,
Thank you once again for your help !

I think I found a solution. I'm not sure if it is permanent or not, but I should see soon.

So, the "mdl_assignfeedback_editpdf_queue" table had 15K records in it. I found out that they start from September 4, 2020 (probably stuck after one of the updates). As most of those assignments were closed, and even the courses were closed, this probably resulted in the process being stuck.
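
A minimal read-only sketch for seeing which assignments the queued rows belong to. It assumes the default `mdl_` prefix and that each queue row points at mdl_assign_submission through a `submissionid` column; check both against your own schema before relying on it.

```shell
# Assumptions: default "mdl_" prefix; queue rows reference
# mdl_assign_submission.id through a "submissionid" column.
QUEUE_SQL='SELECT s.assignment, COUNT(*) AS queued
FROM mdl_assignfeedback_editpdf_queue q
LEFT JOIN mdl_assign_submission s ON s.id = q.submissionid
GROUP BY s.assignment
ORDER BY queued DESC
LIMIT 20;'

printf '%s\n' "$QUEUE_SQL"
# To run it (database name "moodle" as in the earlier mysql example):
# printf '%s\n' "$QUEUE_SQL" | mysql -u root -p moodle
```

Rows where `assignment` comes back empty would be queue entries whose submission no longer exists.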

Another problem:
Convert 1 submission attempt(s) for assignment 85641
Conversion failed with error:Could not find readonly pages for grade 612120
Convert 1 submission attempt(s) for assignment 85641
Conversion failed with error:Could not find readonly pages for grade 612111

What could this be?

PS: the problem in the image is not relevant anymore. It seems it is being cleaned up at the end.
Attachment 3moodle.jpg
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Your comment about LibreOffice processes hanging is a hint that the listener for converting is not working correctly.

Please see:

https://docs.moodle.org/310/en/Universal_Office_Converter_(unoconv)#Run_a_unoconv_listener

and

https://docs.moodle.org/310/en/mod/assign/feedback/editpdf/testunoconv/upstart

Second link above has this comment:

# The home folder for this listener will point to /tmp/ and any temporary files used by
# libreoffice will be created there.

In addition, the courses where those assignments reside (closed or open to students) probably now have grade book issues.   The only way to see that is to log in to the Moodle as admin level, go to the course, and look in the course admin menu for a grades link.    Click it.  Do you see the title of the assignment that is now gone in the column heading row?    If so, one now has to inspect any/all tables related to grades and that assignment to manually remove those rows from the DB ... no admin interface in Moodle has the ability to do that.

Comment: remember me saying editing tables was a bad idea?   Above is why ... that approach of editing DB tables directly has a domino effect ... unintended consequences ... which then have to be resolved in non-traditional ways. :|

'SoS', Ken
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Of the cleanup script lines I shared, did you try to run this one:

php schedule_task.php --execute="\core_files\task\conversion_cleanup_task";

Your issues are related to conversion of files to pdf for providing feedback to an assignment.

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi again!
Yes, I'm seeing the logic more clearly now :D

In fact, I think I cleaned up all the previous mess, however there's now another problem with feedback. This is mostly related to the "Annotate PDF" plugin and unoconv (you were right about this).

I will try to set up the listener a bit later (thank you for pointing this out!), but right now I'm fighting an interesting fact. I have Ubuntu 18 + ISPConfig with PHP 7.2.

Once it is switched to FastCGI mode, conversions work fine 99% of the time (except for several files), however FastCGI is not working well for us in other places.

When switched to ModPHP, it starts showing " Unoconv conversion for '/' from 'docx' to 'pdf' was unsuccessful; returned with exit status code (251). Please check the unoconv configuration and conversion file content / format." in the error log. And if I go to "files/converter/unoconv/testunoconv.php" and click "Download the converted pdf test file." I get:

Cannot open file ()
More information about this error

And the same error in the PHP log.
I assume this might be something to do with either the HOME dir, the temp dir or permissions, but whatever I do, I can't fix it.
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Regardless of how you run Apache ... FastCGI or mod ... the listener still applies.   So however you had the Apache server running (FastCGI/mod) originally, leave it like that and get the listener to work first.

All of this used to work, right?

One problem at a time ... a major change such as FastCGI vs mod could send you down another rabbit hole, and trying to fix that one could have unintended effects elsewhere - you've hinted at but not said specifically what was affected.

That 'More information about this error' link did you follow it?  Where did it link to ... url?

Some more reading/info material on unoconv:

https://github.com/unoconv/unoconv/issues
https://github.com/unoconv/unoconv

unoconv is python ...
Since you are running Ubuntu 18, what versions of:
python
libreoffice
do you have installed on server?

which python
/usr/bin/python
/usr/bin/python --version

which libreoffice
/usr/bin/libreoffice
/usr/bin/libreoffice --version

in moodle config paths, do you have a path set for python?

Somewhat related is also ghostscript ... short gs.

which gs to find path to executable

example of which gs output on my CentOS 7 server:

/usr/local/bin/gs

Ubuntu 18 might be different.

Then use that path in following example:

/usr/local/bin/gs --version
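
The individual checks above can be rolled into one loop. A minimal sketch; the list of tool names is an assumption about what a typical unoconv setup uses, and `report_tool` is a helper name made up for this example.

```shell
# Print the path and first version line for a tool, or note that it is missing.
report_tool() {
    tool=$1
    if path=$(command -v "$tool"); then
        printf '%s: %s\n' "$tool" "$path"
        "$tool" --version 2>&1 | head -n 1
    else
        printf '%s: not found in PATH\n' "$tool"
    fi
}

for t in python python3 libreoffice unoconv gs; do
    report_tool "$t"
done
```

Run it as the same user the web server or cron job runs as, since that user's PATH may differ from your login shell's.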

If you noticed .... all the above is focused just on unoconv and doc conversions ... as well as 'pdf annotations'.

Speaking of PDF annotations, do you have a PDF annotations plugin?

Please see:

https://docs.moodle.org/31/en/PDF_assignment_feedback_plugin

Note the above link is for version 3.1 of Moodle.

'SoS', Ken


In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Hi,
I will report on things tomorrow - dead tired now :)

Latest news: I decided to switch back to FastCGI (where annotation works fine) and connected the Google converter for docs, so conversions are almost working now.

However, the reason why I switched away from FastCGI earlier is in the video below (I wanted to start a new topic, but maybe you'll know this?)



Any ideas appreciated :)

In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Ken Task -

Don't know what to say about the video except a question ... in what normal use of admin of Moodle or taking a course would someone refresh the screen they were on 5 or 6 times using Chrome?

Since we're having some 'fun', now, how about a serious moodle admin/backend server admin question ... how could you, as admin, make sure only one converter was enabled and visible regardless of bug in GUI and playing around?    There is an answer!

Sometimes ... sometimes ... we are our own worst enemies!

Am tired also!  :)

'SoS', Ken

In reply to Ken Task

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Morning Ken,
Well, no one would refresh the page that often, however when you SAVE something and the page refreshes and shows the OLD values, this becomes visible instantly. As this happens randomly with FastCGI, I only wanted to show why I moved to ModPHP :) Using ModPHP I was sure that I could check the right settings :) However, even though ModPHP works excellently for everything else, unoconv fails to find files :D

PS: I tried the unoconv listener and did everything as written in the doc, however I ended up with "Failed to start unoconv.service: Unit unoconv.service not found." and the status says "Unit unoconv.service could not be found.". Yet another question: what went wrong here? :)
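
systemd reports "Unit ... not found" when it has no unit file of that name in the directories it searches, or when it has not reloaded its configuration since the file was added. A minimal sketch for checking the first case; the three directories are the usual locations on Ubuntu and are an assumption here, and `find_unit_file` is a helper name made up for this example.

```shell
# Print the first directory (from the given list) that contains the unit file.
find_unit_file() {
    unit=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$unit" ]; then
            printf '%s\n' "$dir/$unit"
            return 0
        fi
    done
    return 1
}

find_unit_file unoconv.service \
    /etc/systemd/system /lib/systemd/system /usr/lib/systemd/system \
    || echo "no unoconv.service unit file in the usual systemd directories"
# After creating or copying the unit file: systemctl daemon-reload
```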

More information:
Python - I've got both 2 and 3 (I tried setting each of them as the path in Moodle, using one or the other, and it made no difference)
Python 2.7.17
Python 3.6.9
GPL Ghostscript 9.53.3 (updated today from 9.26, which doesn't seem to have made any major difference). BTW gs itself works fine even with ModPHP

And yes, "PDF annotations" is the plugin that fails. It is enabled, and as I mentioned earlier, this is the guy that doesn't want to work with ModPHP :D

In fact I have 2 ways forward now:
1. Fix unoconv working with the listener and ModPHP (I don't see any way yet)
2. Fix the cached pages being displayed with FastCGI (I should probably play with the APC cache, as it might be the module causing that tricky bug)

Thank you so much Ken,
In fact I have used lots of your other topics to play with settings (not only this time) :) so I really appreciate that you spend so much time helping people! :)
You should have a "Buy me a beer" button at the end of your posts, or a good old patreon.com account ;)
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Kodo Mafija -
Update for whoever is watching this :)

Fixed FastCGI. It looks like the APC PHP module plays a very bad trick on Moodle with FastCGI. I'm not sure why or how, however WITHOUT the APC / APCu modules no cached versions are served, and everything works fine.

So the only thing left to figure out is unoconv. It works OK with DOC / DOCX, however it fails on JPGs and some other files ...
In reply to Kodo Mafija

Re: Running out of space with assignfeedback_editpdf\task\convert_submissions

by Visvanath Ratnaweera -
Maybe there are hints on unoconv and JPG in these discussions:
- Unoconv not converting jpeg to PDF https://moodle.org/mod/forum/discuss.php?d=399755

- unoconv and HEIF image problems https://moodle.org/mod/forum/discuss.php?d=415781

- Unoconv not converting some file types https://moodle.org/mod/forum/discuss.php?d=402880

Note: Your issue is unlikely to be related to performance. It is therefore better suited to the General help forum than to Hardware and performance.