strategies for reducing cost of common course query

strategies for reducing cost of common course query

by David Ackerman -
Number of replies: 5

I'm looking at a moodle postgresql database with about 4000 courses, 300K mdl_context records and 200 course categories. The following query gets executed on pretty much every page load I've seen so far and even operating off of fully cached data consistently takes around 200ms. Because of the frequency with which the query gets executed, with most other common queries falling into the sub-millisecond territory, I'd like to find a way to make it faster. I don't know that much can be done with indexing, as it's not a problem with seeking in any one of those tables. I'd think making a view out of the query and/or using pgmemcache would result in being able to run it substantially more quickly (but would require application code changes). Are there any suggested database optimizations that would be transparent to the application and would help this query run faster? Is it something that's already been addressed in current development (I'm using version 2.2)? Am I thinking it's a bigger problem than it actually is? Right now it seems to have a substantial impact on the number of req/s I can handle, but that may be due to testing against pages that aren't normally hit as heavily.

Here's the query:

SELECT c.id, c.sortorder, c.visible, c.fullname, c.shortname, c.category, cat.path AS categorypath , ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel, ctx.instanceid AS ctxinstance
FROM mdl_course c
LEFT JOIN mdl_context ctx ON (ctx.instanceid = c.id AND ctx.contextlevel = 50)
LEFT JOIN mdl_course_categories cat ON cat.id=c.category
WHERE c.id <> $1
ORDER BY c.sortorder ASC LIMIT 20 OFFSET 0
2012-04-20 22:50:47 UTC DETAIL: parameters: $1 = '1'
Average of ratings: -
In reply to David Ackerman

Re: strategies for reducing cost of common course query

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

This looks like a useful observation. I think it is worth reporting as a performance bug in the Tracker (http://docs.moodle.org/dev/Tracker_introduction).

It might be worth testing to see if

SELECT * FROM mdl_course c ORDER BY c.sortorder ASC LIMIT 20

is also slow. Then test

SELECT * FROM mdl_course c WHERE c.id <> 1 ORDER BY c.sortorder ASC LIMIT 20

If both of those are fast, we can rewrite the SQL. If those are slow, then we need a fundamental rethink of what is being done here.

In reply to Tim Hunt

Re: strategies for reducing cost of common course query

by David Ackerman -

Hmm... the two queries you suggest take about 3.8 and 4.6 ms respectively. I'm attaching the EXPLAIN ANALYZE results. Will file a ticket in tracker tomorrow. Thanks.

In reply to David Ackerman

Re: strategies for reducing cost of common course query

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

Right, you can see that the query optimiser gets it totally wrong for the more complex query.

For 

EXPLAIN ANALYZE SELECT * FROM mdl_course c WHERE c.id <> 1 ORDER BY c.sortorder ASC LIMIT 20;

It basically does a Sort Method: top-N heapsort on the courses table, which is fast.

For the original table, it does a Sort Method: quicksort of the entire courses table. Then does the left join on all rows of the courses table with the other tables, and only limits to 20 rows at the end.

Stupid thing. I wonder how we can make it realise that it only needs to find the top 20 rows of the courses table before doing the LEFT JOINs? I can't think of a way right now, but there must be a way. I wonder if it is worth reporting this example in the MySQL but database too?

In reply to Tim Hunt

Re: strategies for reducing cost of common course query

by Michael Gwynne -

Forgive my ignorance if I am COMPLETELY wrong with this... but wouldn't the following work?

SELECT courses.*,ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel, ctx.instanceid AS ctxinstance,cat.path AS categorypath
FROM
(
    SELECT
    c.id, c.sortorder, c.visible, c.fullname, c.shortname, c.category
    FROM
    mdl_course c
    WHERE c.id <> $1
    ORDER BY c.sortorder ASC LIMIT 20
) as courses
LEFT JOIN
mdl_context ctx ON (ctx.instanceid = courses.id AND ctx.contextlevel = 50)
LEFT JOIN
mdl_course_categories cat ON cat.id=c.category

In reply to Michael Gwynne

Re: strategies for reducing cost of common course query

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers

I think you are not allowed to use LIMIT in a subquery, at least not according to the SQL standard, and alos on some of our supported DBs.

Also, on a practical level, the syntax for LIMIT on different database is completely different, so you could not just use SQL like that.