We have being seeing an odd behavior with postgresql 13.14 and Moodle 4.1.3 (and I have recreated the issue on Moodle 4.3.3 and postgresql 14.11). A partner we manage complained of occasional session lock errors related to the course logs report. When I investigated I did see that some queries against the mdl_logstore_standard_log table were taking much longer than they should. Specifically the query the report does most often.SELECT * FROM mdl_logstore_standard_log WHERE courseid = ? order by timecreated,id DESC LIMIT 100
The table itself is fairly large with 93 million rows, but as long as queries use the indexes we see acceptable performance. So I started experimenting. I know there is active discussion around limits and postgresql, to the point where their CEO described it as harmful (https://twitter.com/Xof/status/1413542818673577987), so I altered the limits and got some data.EXPLAIN ANALYZE SELECT * FROM mdl_logstore_standard_log WHERE courseid = 3212 order by timecreated,id DESC; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=209513.03..209657.23 rows=57682 width=209) (actual time=18.574..18.576 rows=7 loops=1) Sort Key: timecreated, id DESC Sort Method: quicksort Memory: 27kB -> Bitmap Heap Scan on mdl_logstore_standard_log (cost=643.60..204951.58 rows=57682 width=209) (actual time=7.270..18.548 rows=7 loops=1) Recheck Cond: (courseid = 3212) Heap Blocks: exact=7 -> Bitmap Index Scan on mdl_logsstanlog_cou_ix (cost=0.00..629.18 rows=57682 width=0) (actual time=6.146..6.146 rows=7 loops=1) Index Cond: (courseid = 3212) Planning Time: 0.133 ms Execution Time: 18.621 ms
With no limit the query performs as expected using the mdl_logsstanlog_cou_ix index.
EXPLAIN ANALYZE SELECT * FROM mdl_logstore_standard_log WHERE courseid = 3212 order by timecreated,id DESC LIMIT 1000; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=208114.23..208116.73 rows=1000 width=209) (actual time=0.042..0.044 rows=7 loops=1) -> Sort (cost=208114.23..208258.43 rows=57682 width=209) (actual time=0.041..0.042 rows=7 loops=1) Sort Key: timecreated, id DESC Sort Method: quicksort Memory: 27kB -> Bitmap Heap Scan on mdl_logstore_standard_log (cost=643.60..204951.58 rows=57682 width=209) (actual time=0.022..0.032 rows=7 loops=1) Recheck Cond: (courseid = 3212) Heap Blocks: exact=7 -> Bitmap Index Scan on mdl_logsstanlog_cou_ix (cost=0.00..629.18 rows=57682 width=0) (actual time=0.016..0.017 rows=7 loops=1) Index Cond: (courseid = 3212) Planning Time: 0.146 ms Execution Time: 0.074 ms
With a high limit of 1000 we again see no issues.
EXPLAIN ANALYZE SELECT * FROM mdl_logstore_standard_log WHERE courseid = 3212 order by timecreated,id DESC LIMIT 5; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------- Limit (cost=824.02..3672.30 rows=5 width=209) (actual time=278284.531..278284.534 rows=5 loops=1) -> Incremental Sort (cost=824.02..32859763.52 rows=57682 width=209) (actual time=278284.530..278284.531 rows=5 loops=1) Sort Key: timecreated, id DESC Presorted Key: timecreated Full-sort Groups: 1 Sort Method: quicksort Average Memory: 27kB Peak Memory: 27kB -> Index Scan using mdl_logsstanlog_tim_ix on mdl_logstore_standard_log (cost=0.57..32857689.28 rows=57682 width=209) (actual time=145599.714..27828 4.502 rows=6 loops=1) Filter: (courseid = 3212) Rows Removed by Filter: 93360886 Planning Time: 0.146 ms Execution Time: 278284.563 ms
But with a lower limit (5 here) we suddenly see postgresql strangely tries to use the mdl_logsstanlog_tim_ix index which is not expected and takes much longer. I did further queries and found that when the limit was somewhere below 400 i'd see it using the wrong index.
If you temporarily remove the mdl_logsstanlog_tim_ix low limit queries will use the correct index and perform as expected. So the fault here seems to be the query planner is for some reason choosing the wrong index. Unfortunately I just don’t know enough about postgresql’s internal workings to be able to pin down what exactly in going on here. I considered creating a Moodle Tracker ticket, but hesitated as I can’t really justify that this is bad\buggy behavior from Moodle. So I’m post this here to see if anyone else has seen this or has any better insight.
Thanks.