Quiz: Mystery slow/bottlenecked updates

Quiz: Mystery slow/bottlenecked updates

by Martín Langhoff -
Number of replies: 10

(This is fairly development-oriented, after a bit of mulling, posted here. Could move it to quiz forum...)

On a heavily loaded M2.6.x running on MariaDB 5.5.x. Bouncing between innotop's view of Locks and Queries, I see that this setup gets hit by queries following this format

UPDATE mdl_question_attempts 
    SET maxmark = 2 
WHERE 
   questionusageid IN (
        SELECT quiza.uniqueid 
        FROM mdl_quiz_attempts quiza 
        WHERE quiza.quiz = '23440') 
   AND slot = '1'

these queries can take 5~10 minutes to complete. They hold on to a table-wide lock, so nobody can touch mdl_question_attempts. To make things more colorful the subselect yields an empty result set. I feel like this:

To illustrate, checking for empty results like this makes the query take less than 10ms

UPDATE mdl_question_attempts 
SET maxmark = 2 
WHERE
     (   SELECT COUNT(quiza.uniqueid) 
         FROM mdl_quiz_attempts quiza 
         WHERE quiza.quiz = '23441') > 0  
    AND questionusageid IN (
        SELECT quiza.uniqueid 
        FROM mdl_quiz_attempts quiza 
        WHERE quiza.quiz = '23441') 
    AND slot = '2' ; 

Clearly MariaDB is being rather stupid.

Following the code around, qubaids_for_quiz::__construct() has bits of the subselect, and quiz_update_question_instance() calls it thus:

    question_engine::set_max_mark_in_attempts(new qubaids_for_quiz($quiz->id),
            $slot, $grade);

so I think I found the code path that leads to that SQL. However... we see a lot these updates. Are there code paths or uses cases that run over this in a loop? All I see seems to come from the quiz edit form, which I'd say edits one quiz at a time...

Average of ratings: -
In reply to Martín Langhoff

Re: Quiz: Mystery slow/bottlenecked updates

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

In case you have not found it, the source of this query is when you go to the Edit quiz page, and change the Max mark for one of the questions in the quiz. (https://github.com/moodle/moodle/blob/272fec367fba2957cb0818460c31d10d09352bca/question/engine/datalib.php#L1078)

The particular MySQL/MariaDB stupidity that is killing us here is that these databases don't understand that there is such a thing as a non-correlated subquery, so it is probably re-evaluating (the fast and well-indexed) sub-query for every row of the very large question-attempts table. As you say: moron!

We were promised a fix for this obvious query optimiser bug in MySQL 6. Don't know about the corresponding timeline for MariaDB.

I am afraid this is a known issue: MDL-32616.

The short story for why this is not fixed is

  • We know the query that is a problem.
  • But, it is an UPDATE statement, with a subquery, and the only way to fix it would be to change to to be and UPDATE ... JOIN.
  • Unfortunately, the SQL for that is different between all databases.
  • So we need a $BD method $DB->update_join(...) or something, but no-one can work out the right abstraction to make that work.
  • So it has just been sitting there (not helped by the fact the OU uses postgres, so I have not personal itch here).

If you want to break the deadlock, we may have to resort to a MySQL-specific hack, like we did once before: https://github.com/moodle/moodle/blob/272fec367fba2957cb0818460c31d10d09352bca/question/engine/datalib.php#L932. Obviously much better to avoid that, but if you want a quick win, that is probably the way.

I don't particularly like your rewritten query. It only fixes one case (the one where the subquery returns no rows. A more typical case is that it returns one row for each attempt at the quiz, and that will still perform terribly.) Also, surely it should be EXISTS, not COUNT > 0 if you want efficiency. As I say, look at an UPDATE ... JOIN for MySQL only.

Average of ratings: Useful (1)
In reply to Tim Hunt

Re: Quiz: Mystery slow/bottlenecked updates

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

I was thinking about this cycling in to work, but was then too busy to write it up. How about this for an API ...

... ah, while researching options, I just found the horrible mess in quiz_update_open_attempts. That is a useful case-study. ...


$DB->set_field_join($table, $alias, $newfield, $newvaluesql,
        $joins, $wheres, $params);

Example calls

In the quiz_update_open_attempts example, we would have
$table:       'quiz_attempts'
$alias: 'quiza'
$newfield: 'timecheckstate'
$newvaluesql: $timecheckstatesql
$joins: '{quiz} quiz ON quiz.id = quiza.quiz
                        JOIN ( $quizausersql ) quizauser ON quizauser.id = quiza.id'
$wheres: $attemptselect

In the query from this thread, we would have

$table:       'question_attempts'
$alias: 'qa'
$newfield: 'maxmark'
$newvaluesql: 6.7
$joins: '{quiz_attempts} quiza ON quiza.uniqueid = qa.questionusageid' AND quiza.quiz = ?'
$wheres: 'slot = ?'

Example SQL

(I really don't trust the Oracle case here, and don't have a test DB.)

Postgres
UPDATE {$table} $alias
SET $newfield = $newvaluesql
FROM $joins
WHERE $where
MySQL
UPDATE {$table} $alias
JOIN $joins
SET $alias.$newfield = $newvaluesql
WHERE $where
MSSQL
UPDATE $alias
SET $newfield = $newvaluesql
FROM {$table} $alias
JOIN $joins
WHERE $where
Oracle
UPDATE {$table} $alias
SET $newfield = (
SELECT $newvaluesql
FROM $joins
)
WHERE $where
In reply to Tim Hunt

Re: Quiz: Mystery slow/bottlenecked updates

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

There is a very good reason no-one has managed to solve this: it is bloody hard.

Here is a failed attempt:

https://github.com/timhunt/moodle/compare/master...MDL-32616-failure

Please feel free to build on that if you think it helps. I give up for now. (I should not really have been working on this, but I thought it might be fun.)

To run just the new unit test, you need

phpunit --filter test_set_field_join lib/dml/tests/dml_test.php

That test fails on Postgres with

1) core_dml_testcase::test_set_field_join
dml_write_exception: Error writing to database (ERROR:  syntax error at or near
"ON"
LINE 3:                   FROM test_table2 t2 ON t2.table1id = t1.id...
                                              ^
UPDATE test_table1 t1
                   SET valuetoset = SUM(t2.valuetouse)
                  FROM test_table2 t2 ON t2.table1id = t1.id AND t2.valuetotest
= $1
                 WHERE t1.valuetotest = $2
[array (
  0 => 'ok',
  1 => 'match',
)])
In reply to Tim Hunt

Re: Quiz: Mystery slow/bottlenecked updates

by Martín Langhoff -

There is a very good reason no-one has managed to solve this: it is bloody hard.

Doing it right is bloody hard. Arguably, using MySQL is doing it wrong wink

I'm not above fixing one wrong thing with another...

In reply to Tim Hunt

Re: Quiz: Mystery slow/bottlenecked updates

by Martín Langhoff -

A narrow fix here could be to run the subselect first, and only work if there's work to do. Then, if the number of qubaids is short, just put them literally in an IN () section.

Fugly, untested pseudocode patch follows; thoughts?

diff --git a/question/engine/datalib.php b/question/engine/datalib.php
index 9694851..4008658 100644
--- a/question/engine/datalib.php
+++ b/question/engine/datalib.php
@@ -901,9 +901,25 @@ ORDER BY
      * @param number $newmaxmark the new max mark to set.
      */
     public function set_max_mark_in_attempts(qubaid_condition $qubaids, $slot, $newmaxmark) {
-        $this->db->set_field_select('question_attempts', 'maxmark', $newmaxmark,
-                "questionusageid {$qubaids->usage_id_in()} AND slot = :slot",
-                $qubaids->usage_id_in_params() + array('slot' => $slot));
+        if ($mysql_is_bad) {
+            $qubaidslist=$this->db->get_fields_sql('id', $qubaids->gimme_where_clause());
+            if ($qubaidslist > 0) { /* only work if there's work to do */
+                if ($qbaudslist < 1000) { /* evil implode */
+                    $this->db->set_field_select('question_attempts', 'maxmark', $newmaxmark,
+                            "questionusageid IN (" . implode(',', $qubaidslist) . ") AND slot = :slot",
+                            $qubaids->usage_id_in_params() + array('slot' => $slot));
+                } else {
+                    /* lots of matches in subselect, take it in the chin */
+                    $this->db->set_field_select('question_attempts', 'maxmark', $newmaxmark,
+                            "questionusageid {$qubaids->usage_id_in()} AND slot = :slot",
+                            $qubaids->usage_id_in_params() + array('slot' => $slot));
+                }
+
+        } else {
+            $this->db->set_field_select('question_attempts', 'maxmark', $newmaxmark,
+                    "questionusageid {$qubaids->usage_id_in()} AND slot = :slot",
+                    $qubaids->usage_id_in_params() + array('slot' => $slot));
+        }
     }

     /**
In reply to Martín Langhoff

Re: Quiz: Mystery slow/bottlenecked updates

by Martín Langhoff -
The only bit missing in this patch is changing the get_fields_sql() line with something that runs

```
SELECT quiza.uniqueid
FROM mdl_quiz_attempts quiza
WHERE quiza.quiz = '23440'
```

-- after following __construct() overrides around, I wonder why do we have so many layers of indirection? Does this subselect really change for some quiz types?
In reply to Martín Langhoff

Re: Quiz: Mystery slow/bottlenecked updates

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

Come on Martín, you can do better than that.

If you look at the other hack I linked to: https://github.com/moodle/moodle/blob/272fec367fba2957cb0818460c31d10d09352bca/mod/quiz/locallib.php#L969 you will see that you can write this sort of query as a single UPDATE ... JOIN in a way that MySQL optimiser can cope with. Do that, even, if you have to do it as a special case in this code with if($dbfamily=='mysql'){, and not as a general solution in the DB library.

As for the indirection, I don't think it is complicated. There are just two components involved here: the quiz, and the question engine. They each own their own database tables, and so the quiz should not poke around directly in the question tables if there is a higher-level API provided.

In order to make that work, there is a single abstraction, the qubaid_condition class, to let the quiz pass the quiz-specific bits of SQL (SELECT a list of IDs that the question engine needs). That is used throughout the question engine, when bulk operations are involved, and it works quite nicely. (Though i realise I sat that as the person who invented it, so of course I would think that.)

If you cannot work it out well enough to do your own good patch, then I will do you a deal: if you write some unit tests for quiz_update_slot_maxmark, then I will code an implementation that works well in MySQL and passes your tests. I don't have time to think through the required unit tests.

In reply to Tim Hunt

Re: Quiz: Mystery slow/bottlenecked updates

by Martín Langhoff -

Come on Martín, you can do better than that.

thank you smile ~ though I'll be honest: this is one of the ~5 serious bottlenecks. And every bottleneck I remove uncovers 3 that were hidden behind. Hard to tell where the rabbit-hole ends.

I am burning to prep short-term patches or fixes for each of them to get to the point where the system behaves more reasonably. At that point I can come back and work on turning each of them into good patches smile

Naturally, we'll work on it with you for something better. James McQuillan is looking into it as well.

In reply to Martín Langhoff

Re: Quiz: Mystery slow/bottlenecked updates

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

It is absolutely fabulous that you are exploring this rabbit hole.

However, there are short-term patches, and short-term patches, and it is a false economy to do a completely slap-dash job now, and have to return to it soon. Better to do a reasonable job while you understand that bit of the code. Also, it is not just your time. You are expecting Moodle HQ to put in the time to do integration review and testing.

It is also the case that if you are doing a fix that is in any degree short-term, then it is more worthwhile than normal to write unit tests. It is highly likely that that code will change again in future, so you will get more value from your tests. Writing unit test does not take that long. (If it takes you longer because you aren't yet familiar with Moodle's tools for doing unit tests, then you will have a hard time convincing me that is a waste of your time.)

Anyway, I look forwards to working with you to get this sorted.