A Giant Query of 1.8 +

A Giant Query of 1.8 +

by Jai Gupta -
Number of replies: 16
I recently upgraded Moodle from 1.7.X to 1.8 +

The average load on the CPU of my server is around 30%-40% but after upgrading i found that it has raised to 300%... to further find out the cause i looked at running process. Mysqld was eating CPU... when i went to mysql processes, it showed me a running query which can be found below.

SELECT rc .capability, c1 .id AS id1 , c1.id AS id2, (
c1.contextlevel * 100
) AS aggrlevel, SUM ( rc.permission ) AS sum
FROM mdl_role_assignments ra , mdl_role_capabilities rc , mdl_context c1
WHERE ra .contextid = c1 .id
AND ra.roleid = rc .roleid
AND ra.userid = 1753
AND c1.id
IN ( 4273, 1 , 7 )
AND rc.contextid = 1
AND (
rc.capability = 'moodle/site:viewfullnames'

OR rc.capability = 'moodle/site:doanything'
)
GROUP BY rc. capability, c1. id, c1. contextlevel *100
HAVING SUM( rc.permission ) !=0
UNION ALL
SELECT rc.capability , c1.id AS id1, c2 .id AS id2 , (
c1. contextlevel *100 + c2.contextlevel
) AS aggrlevel, SUM( rc. permission ) AS sum
FROM mdl_role_assignments ra
LEFT JOIN mdl_role_capabilities rc ON ra .roleid = rc. roleid
LEFT JOIN mdl_context c1 ON ra.contextid = c1.id
LEFT JOIN mdl_context c2 ON rc.contextid = c2.id
LEFT JOIN mdl_context_rel cr ON cr. c1 = c2.id
WHERE ra.userid =1753
AND c1 .id
IN ( 4273, 1, 7 )
AND rc.contextid !=1
AND (
rc.capability = 'moodle/site:viewfullnames'

OR rc. capability = 'moodle/site:doanything'
)
AND cr. c2 = c1.id
GROUP BY rc.capability , c1.id , c2.id , c1.contextlevel *100 + c2. contextlevel
HAVING SUM( rc .permission ) != 0
ORDER BY aggrlevel ASC

this query was running from last 4 hours... i killed the query but after 2-3 min's it started gain, this time the query has some changes...

SELECT rc .capability, c1 .id AS id1 , c1.id AS id2, (
c1.contextlevel * 100
) AS aggrlevel, SUM ( rc.permission ) AS sum
FROM mdl_role_assignments ra , mdl_role_capabilities rc , mdl_context c1
WHERE ra .contextid = c1 .id
AND ra.roleid = rc .roleid
AND ra.userid = 1753
AND c1.id
IN ( 4273, 1 , 7 )
AND rc.contextid = 1
AND (
rc.capability = 'moodle/site:viewfullnames'
OR rc.capability = 'moodle/site:doanything'
)
GROUP BY rc. capability, c1. id, c1. contextlevel *100
HAVING SUM( rc.permission ) !=0
UNION ALL
SELECT rc.capability , c1.id AS id1, c2 .id AS id2 , (
c1. contextlevel *100 + c2.contextlevel
) AS aggrlevel, SUM( rc. permission ) AS sum
FROM mdl_role_assignments ra
LEFT JOIN mdl_role_capabilities rc ON ra .roleid = rc. roleid
LEFT JOIN mdl_context c1 ON ra.contextid = c1.id
LEFT JOIN mdl_context c2 ON rc.contextid = c2.id
LEFT JOIN mdl_context_rel cr ON cr. c1 = c2.id
WHERE ra.userid =1753
AND c1 .id
IN ( 4273, 1, 7 )
AND rc.contextid !=1
AND (
rc.capability = 'moodle/site:viewfullnames'
OR rc. capability = 'moodle/site:doanything'
)
AND cr. c2 = c1.id
GROUP BY rc.capability , c1.id , c2.id , c1.contextlevel *100 + c2. contextlevel
HAVING SUM( rc .permission ) != 0
ORDER BY aggrlevel ASC

this query is using a hell lot of recourses... please help.
Average of ratings: -
In reply to Jai Gupta

Re: A Giant Query of 1.8 +

by Ethan Sommer -
This has been a huge problem for everyone who has lots of courses.

I "fixed" the problem by commenting out the second half of the union (from UNION to (but not including) the ORDER BY -- so the order by should still be part of the query but the union shouldn't be.)

I'm not exactly sure what I'm giving up by doing that, but things seem to work (quickly) and without doing that our installation is completely unusable.
In reply to Jai Gupta

Re: A Giant Query of 1.8 +

by Ken Wilson -
This is logged as a bug in MDL-8120 with an interesting commentary from Ethan. The tracker entry is marked as "resolved" so it may help the developers to add comments that you're still getting the same problem. Keep an eye on MDL-7416 as well!
In reply to Ken Wilson

Re: A Giant Query of 1.8 +

by Jai Gupta -
thanks for your reply... i hope moodle will be able to solve it fast.

I guess i have to rollback to previous version sad

I loved the moodle 1.6 but all versions after that seems to be buggy prototypes
In reply to Jai Gupta

Re: A Giant Query of 1.8 +

by Martin Dougiamas -
Picture of Core developers Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Plugin developers Picture of Testers
Please help us identify the problem (it's a performance-related problem that only shows up in some environments) by providing test data and feedback on MDL-8120.
In reply to Martin Dougiamas

Re: A Giant Query of 1.8 +

by Martín Langhoff -
+1 - yes, please - let's collect as much data as possible. MattC and I have some time set aside for some serious optimisations there, but I'm a being all consumed by a gigantic v1.5 -> v1.8 upgrade that is almost almost almost ready.

Having more info will help us focus. We specially need perf numbers like dbqueries.

[Of course, we'll also have our own 1.8 to profile]
In reply to Martin Dougiamas

Re: A Giant Query of 1.8 +

by Jai Gupta -
i would love to help, i Went to MDL-8120 but was confused what to do... how can i help Moodle?
In reply to Jai Gupta

Re: A Giant Query of 1.8 +

by Max Garrick -
Just as Ethan described, we also had to disable the second part of the SQL query. MDL-8120 describes how to do this if you are interested.

I am curious what functionality this change removes--it seems like the query is doing some custom role magic.

We are hoping that this will be resolved soon, as we are hesitant to upgrade our version of Moodle with this query still in there...

-Max
In reply to Max Garrick

Re: A Giant Query of 1.8 +

by Martin Dougiamas -
Picture of Core developers Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Plugin developers Picture of Testers
Removing that will ignore overrides to permissions. So if you added overrides to give certain roles more or less permissions in a forum (for example) then removing this bit of SQL will ignore those.

This may mean that students might see posts in teacher's forums, for example, things like that.
In reply to Martin Dougiamas

Re: A Giant Query of 1.8 +

by Markus Hagman -

Could you please list all the effects these problems have. I'm planning to upgrade our 1.64+ to 1.8+ in few days and this might change my mind.

My test installation uses ~50-80seconds to load users profile (user who is teacher and has multiple courses.. more than 10).

Logins with these users won't take so long.

If profile page is the only bottleneck I could continue with upgrade and make a quickfix to profile page to the point this problem is solved.

In reply to Markus Hagman

Re: A Giant Query of 1.8 +

by Samuli Karevaara -
Markus, could you test this: add some test account as a teacher to a three dozen courses or so and see if that profile page is slow. We had a big difference between "fresh" courses and those that came from the upgrade. So it might be that the upgrade is not calculating some "caching" type of info.
In reply to Samuli Karevaara

Re: A Giant Query of 1.8 +

by Markus Hagman -

Samuli, this seems to be working. Now only thing needed is how this cache is being fixed.

In reply to Markus Hagman

Solution?

by Markus Hagman -

Problem:

Massive load times on profile pages, almost 80secs loading on big queries etc.

Solution:

Massive slowing down was caused by the unsynchronized database table indexes. (I got my profile page loading from 60sec to 0.5sec).

You'll have to make a clean installation of 1.8.1 moodle and then use mysqldiff (http://www.mysqldiff.org/downloads.php) (or other) software to check the differences between Upgraded 1.81 and Clean 1.81 versions.

Mine were massive check this thread: http://moodle.org/mod/forum/discuss.php?d=74211

I made alter table actions to my Upgraded 1.81 to the point it was 100% sync with Clean 1.8.1.

This caused the query load times go down, which is great.

Steps for me to next update:

When your upgrading your moodle from previous versions database sync doesn't seem to keep up with the newest clean version of moodle 1.81

My upgrade steps to 1.81 are:

1. 1.63+ -> 1.65+
2. 1.65+ -> 1.71
3. 1.71  -> 1.81
4. synchronize database, mostly indexes (1.81 upgraded to 1.81 clean)

I do this leveled upgrade because I've noticed that you run in troubles trying to upgade straight to newest version, by skipping one or two released versions of Moodle (this is my opinion, you can disagree).

Cheers,

Markus

--

Im not sure if this is issue only with Moodle installations which has been installed from versions prior 1.63 and upgraded from those to newest, but I find it really strange that this issue hasnt been adressed, or am I alone with this problem?

 

In reply to Markus Hagman

Re: Solution?

by Samuli Karevaara -
Markus, I described the differences that we have in Bug 6855. Eloy suggested that the differences are not critical, maybe trivial even, but I'm worried about the missing index for the column mdl_role_capabilities.capability. That might be the one missing index that makes all the difference for the query at the top of this thread...

We have the same slow-down also. I'm not at my test box right now so I can't confirm if adding just the index for mdl_role_capabilities.capability would be responsible for 99% of the speed-up you experienced.

Edit: that missing index still doesn't explain why courses added after the upgrade don't slow down the profile page but the courses that a user had before the upgrade do. It might be the mdl_context_rel-table, as Ethan S. suggested somewhere.
In reply to Samuli Karevaara

Re: Solution?

by Markus Hagman -

--
Edit: that missing index still doesn't explain why courses added after the upgrade don't slow down the profile page but the courses that a user had before the upgrade do. It might be the mdl_context_rel-table, as Ethan S. suggested somewhere.
--

Maybe the indexes are not updated on right way during the upgrade. New courses you add when upgrade is done will get right indexes.

When I make sync, it removes all the old indexes and adds the new ones, which also recreates all the indexes.

edit: Im only guessing, havent tested it.

In reply to Markus Hagman

Re: Solution?

by Samuli Karevaara -

"I find it really strange that this issue hasnt been adressed."

I have wondered similar things with earlier upgrades when we had some severe performance issues. One explanation might be that most of this is MySQL specific and "the big ones" are using PostgreSQL (OU + "New Zealand" / Catalyst)? Further, Catalyst was using 1.6 till quite recently (?) and OU is using a customised version, so they might have resolved these with some custom code earlier on, not publishing the fixes as it applies to their custom version only (?). I'm mostly guessing here, mind you.

moodle.org is big but only has few courses, so most of these issues don't happen here at moodle.org.

I hope that the core team would have an automated way to generate huge test sites to upgrade and fiddle with after the upgrade (100K users, 10K courses, 1M resources, for example).
In reply to Samuli Karevaara

Re: Solution?

by Dean Thayer -
My site (1.8.2) is really struggling with load_user_capability() killing our performance. It seems to be linked to the execution of cron.php and to the viewing of Journal activities. Anyway, I just tried something crazy on our test server and truncated context_rel. Performance is super quick, but what are the implications? What breaks when context_rel is empty?