Enabling/disabling filters by context

by Tim Hunt -
Number of replies: 8
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
The aim is to allow things like
  • disable the glossary auto-linking filter in a quiz
  • enable the TeX filter only in the Maths course category
I have just done a design of how this could work: Development:Filter_enable/disable_by_context. I would appreciate comments and criticism to ensure I have not overlooked anything.

This has been sitting in the tracker as MDL-7336 for a while and has a moderate number of votes. Periodically I see duplicates of the specific 'disable the glossary auto-linking filter in a quiz' use case. Since it is not a huge amount of work, I would like to do it for Moodle 2.0 as part of the Navigation etc. changes.
In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Tim Hunt -
Well, that post was met with a resounding silence, even though MDL-7336 has 10 votes. Following a Jabber chat with Petr, I have just amended the proposal to allow for per-context settings, for filters that want to offer that. The use case that motivates that is:

Suppose in Forum A, you want the glossary auto-linking to link words from Glossary A, while in Forum B, you want the glossary auto-linking to link words from Glossary B.

Anyway, I may implement this soon, so if you want to object, please do so ASAP.
In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Martín Langhoff -
Hmmm. The filters have a huge performance impact. What is the DB-query cost of checking for context and for context-specific settings?

(And of course, I can imagine the UI being a bit of a challenge... but perhaps there's a clever trick there...)
In reply to Martín Langhoff

Re: Enabling/disabling filters by context

by Tim Hunt -
Have you actually read the proposal which explains all that?

Anyway, this is probably going to be my first development on a branch in git, so we can find out before it is too late.
In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Martín Langhoff -
Sorry! I have now :-)

Still...

you are adding an expensive query that will hit us per-page. The 1.9.x we have right now has a "baseline" of quite a few queries per page, but those are all fairly cheap. The queries that were expensive have been all pushed into various caches that try to be compact. Is there a way to make a compact cache for this data?

The LEFT JOIN to filter_config... if you have config settings that are per-context, it won't grab the correct config settings. I am not sure if the MAX()/HAVING strategy will let you grab the correct settings in one query.

It would also be useful to understand whether this will cause significant duplication in the cache tables -- the content usually appears in various locations, which may be under different contexts, so it'll be recomputed and stored many times over... wondering what % is the impact of this.
In reply to Martín Langhoff

Re: Enabling/disabling filters by context

by Tim Hunt -
I don't think that query is expensive. If your Moodle site has N filters installed, and you are at context.depth = D, then it will examine at most N*D rows from filter_active (but more typically N plus a few), with the where and join to context fully indexed.

Then it fiddles some integers to get down to a list of N or fewer active filters.

Then it does an indexed join on filter_config that will typically match very very few rows.
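
To make the cost argument concrete, here is a minimal in-memory sketch of the "fiddles some integers" step. It is not the query from the patch. It assumes each override row carries the depth of the context it was set on, that the deepest override wins, and that a filter disabled at the system level cannot be switched back on lower down. The constant names, the row layout and the function name are all hypothetical.

```php
<?php
// Hypothetical state values; the real constants in the patch may differ.
const FILTER_ON = 1;
const FILTER_OFF = -1;
const FILTER_DISABLED = -9999;

/**
 * Resolve which filters are active, given the override rows found on the
 * current context and its ancestors (at most N*D rows for N filters, depth D).
 *
 * @param array $rows each row is ['filter' => string, 'depth' => int, 'active' => int]
 * @return string[] sorted names of the filters that end up switched on
 */
function resolve_active_filters(array $rows): array {
    $winner = [];   // filter name => deepest row seen so far
    $disabled = []; // filter name => true if disabled anywhere on the path
    foreach ($rows as $row) {
        $name = $row['filter'];
        if ($row['active'] === FILTER_DISABLED) {
            $disabled[$name] = true;
        }
        if (!isset($winner[$name]) || $row['depth'] > $winner[$name]['depth']) {
            $winner[$name] = $row;
        }
    }
    $active = [];
    foreach ($winner as $name => $row) {
        if (empty($disabled[$name]) && $row['active'] === FILTER_ON) {
            $active[] = $name;
        }
    }
    sort($active);
    return $active;
}

// Rows for a quiz at depth 4: glossary is on site-wide but off in the quiz,
// tex is off site-wide but on in the category, and censor is disabled.
$rows = [
    ['filter' => 'glossary', 'depth' => 1, 'active' => FILTER_ON],
    ['filter' => 'glossary', 'depth' => 4, 'active' => FILTER_OFF],
    ['filter' => 'tex', 'depth' => 1, 'active' => FILTER_OFF],
    ['filter' => 'tex', 'depth' => 2, 'active' => FILTER_ON],
    ['filter' => 'censor', 'depth' => 1, 'active' => FILTER_DISABLED],
    ['filter' => 'censor', 'depth' => 3, 'active' => FILTER_ON],
];
print_r(resolve_active_filters($rows)); // only 'tex' survives
```

The loop touches each row once, so the PHP-side work is proportional to the number of rows returned, which is the N*D bound above.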


You are right, we should measure this query on a large DB with some typical data.


I can't think of any content that is filtered and appears in many different contexts. If you have filterall on, then there are activity titles which appear in two contexts.

That is, if you turn on text caching. Text caching adds one or two queries per filtered string on the page (one to see if it is already cached, a second if it isn't). The OU found filtered text caching slowed things down on their servers.

The alternative, which leads to more cache hits but more expensive PHP code, is to put serialize(get_active_filters($context)) into the hash key, rather than $context->id. Hmm. That seems like a good idea.
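
A minimal sketch of that idea, assuming the key is an md5 of the serialised filter list plus the text. The function name filtered_text_cache_key and the filter lists are hypothetical, and the real cache code may build its key differently.

```php
<?php
/**
 * Build a cache key for a piece of filtered text.
 *
 * @param string $text the unfiltered text
 * @param string[] $activefilters names of the filters active where the text is shown
 * @return string md5 hash usable as a cache key
 */
function filtered_text_cache_key(string $text, array $activefilters): string {
    // Serialising the filter list (rather than using the context id) means
    // that contexts with identical filter set-ups share cache entries.
    return md5(serialize($activefilters) . $text);
}

$forumfilters = ['glossary', 'tex']; // active in a forum
$quizfilters = ['tex'];              // glossary switched off in a quiz
$labelfilters = ['glossary', 'tex']; // a different context, same filters as the forum

$keyforum = filtered_text_cache_key('See the glossary.', $forumfilters);
$keyquiz = filtered_text_cache_key('See the glossary.', $quizfilters);
$keylabel = filtered_text_cache_key('See the glossary.', $labelfilters);

var_dump($keyforum === $keylabel); // bool(true): same filters, so the entry is shared
var_dump($keyforum === $keyquiz);  // bool(false): different filters, separate entries
```

If a filter has per-context config that changes its output, that config would presumably need to go into the key as well.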


Note that the result of the query depends only on contextid, and probably returns a small amount of data. As such, it is highly amenable to caching by the DB server. And if we had to, we could cache it on the Moodle side.


It does grab the correct config settings. It is intentional that filter config does not inherit. I implemented it today, with unit tests. Code so far on MDL-7336 if you want to look.


In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Tim Hunt -

Test script

OK, I now have a test script that sets up the following:
  1. Creates a database connection with a fake table name prefix.
  2. Creates copies of the 'context', 'filter_active' and 'filter_config' tables from the definitions in install.xml (useful method in HEAD if you are writing test code: $dbman->install_one_table_from_xmldb_file)
  3. Creates 100 course category contexts, for each one choosing a parent randomly from the previously created contexts and the system context.
  4. Creates 1000 course contexts, picking a parent category randomly. (Seems to give an average of 5 levels of nesting, max about 10)
  5. Creates 10000 module contexts, picking a parent course randomly.
  6. Randomly chooses a system level setting (disabled, off, on) for each filter.
  7. Randomly sets up 50000 local overrides (to on or off, filter and context chosen randomly).
  8. Randomly sets up 50000 random local config variables (filter and context chosen randomly, variable name chosen from a short list, value generated by random_string(rand(20, 40))).
(Setting this up takes a long time!)
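
For anyone wondering where the nesting figures come from, here is a small stand-alone sketch of the "choose a parent randomly from what exists so far" step. It only tracks depths and does not touch the database. The function name random_tree_depths is hypothetical, and counting the root as depth 1 is an assumption.

```php
<?php
/**
 * Build a random tree by giving each new node a parent chosen uniformly from
 * the nodes created so far (node 0 plays the role of the system context).
 *
 * @param int $count number of nodes to add below the root
 * @return int[] depth of every node, indexed by node id; the root has depth 1
 */
function random_tree_depths(int $count): array {
    $depths = [0 => 1];
    for ($id = 1; $id <= $count; $id++) {
        $parent = mt_rand(0, $id - 1); // any earlier node, including the root
        $depths[$id] = $depths[$parent] + 1;
    }
    return $depths;
}

$depths = random_tree_depths(100);
printf("average depth %.1f, max depth %d\n", array_sum($depths) / count($depths), max($depths));
```

The numbers vary from run to run, since the parents are chosen randomly.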

Then I have a test harness that basically looks like:

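// Load every context once, so each call can pick one at random.
$contexts = $DB->get_records('context');
$starttime = microtime(true);
for ($j = 0; $j < $numcalls; $j++) {
    // $function holds the name of the function being timed.
    $function($contexts[array_rand($contexts)]);
}
$duration = microtime(true) - $starttime;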

I call that with three functions:
  • noop($context) {}. Turns out that randomly picking an object from an array in PHP is quite slow :-( so we have to adjust for that in the real timing runs.
  • simple_get_record_by_id($context) { $DB->get_record('context', array('id' => $context->id)); }
  • filter_get_active_in_context - which is the function we are actually worried about.
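
The baseline adjustment can be sketched without a database as follows. This is a stand-alone illustration, not the attached script: time_calls, $noop and $work are hypothetical stand-ins, and $work just burns some CPU in place of a real query.

```php
<?php
/**
 * Time $numcalls calls to $function, each with a randomly chosen item.
 *
 * @return float elapsed wall-clock time in seconds
 */
function time_calls(callable $function, array $items, int $numcalls): float {
    $starttime = microtime(true);
    for ($j = 0; $j < $numcalls; $j++) {
        $function($items[array_rand($items)]);
    }
    return microtime(true) - $starttime;
}

// Stand-ins for the real functions under test (hypothetical).
$noop = function ($item) {};
$work = function ($item) { return md5(str_repeat((string) $item, 100)); };

$items = range(1, 10000);
$numcalls = 1000;

$baseline = time_calls($noop, $items, $numcalls); // cost of the loop and array_rand alone
$raw = time_calls($work, $items, $numcalls);
$net = max($raw - $baseline, 0.0);                // time attributable to $work itself

printf("Time for %d calls: %.3fs (%.3f - %.3fs)\n", $numcalls, $net, $raw, $baseline);
```

Subtracting the noop time is only a rough correction, since the two runs pick different random items.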

Running the tests

The test script is attached if you wish to review it.

To run it, you need to
  1. save it to lib/simpletest
  2. apply the patch series from MDL-7336 to a HEAD checkout
  3. set $CFG->unittestprefix to something safe in your config.php
  4. go to the URL .../lib/simpletest/filtersettingsperformancetester.php
  5. Click 'Set up test tables'
  6. Wait
  7. Click 'Run tests'
  8. Click 'Drop test tables' (or don't bother)
You can, of course, play with the numbers in the script.

Results

The simple summary is that filter_get_active_in_context seems to take only about twice as long as a simple_get_record_by_id! I was expecting it to be worse than that.

In terms of scalability, that simple summary seems pretty stable. Dropping the test dataset size by a factor of 10 pushes the ratio closer to 2.5. It seems very insensitive to the density of local overrides and local config. That is all on Postgres running on my desktop machine. I guess I should try an install on MySQL now. Typical output copied and pasted below.


Time for 1000 calls to noop: 0.596s (0.596 - 0.000s) which is 1678 calls per second.
Time for 1000 calls to simple_get_record_by_id: 1.192s (1.788 - 0.596s) which is 839 calls per second.
Time for 1000 calls to filter_get_active_in_context: 1.821s (2.417 - 0.596s) which is 549 calls per second.
Total of 11101 contexts, 41721 filter_active and 48160 filter_config rows in the database.
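
As a check on the arithmetic, the calls-per-second figures and the ratio between the two queries can be recomputed from the net times in this sample output. The variable names are just for illustration.

```php
<?php
// Net times in seconds for 1000 calls, copied from the sample output above.
$numcalls = 1000;
$simple = 1.192;
$filter = 1.821;

printf("simple_get_record_by_id: %d calls per second\n", round($numcalls / $simple));
printf("filter_get_active_in_context: %d calls per second\n", round($numcalls / $filter));
printf("ratio: %.2f\n", $filter / $simple);
```

For this particular run the ratio works out at roughly 1.5.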

In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Tim Hunt -
On MySQL the ratio of speed of the two queries seems closer to four or five to one. Not quite as good, but acceptable I think.
In reply to Tim Hunt

Re: Enabling/disabling filters by context

by Tim Hunt -
OK, I think I am pretty much done with this, apart from a couple of small bits that rely on other Navigation 2.0 changes.

(Regrettably, one of those small changes is that format_text currently has no way of finding out what the current context is, which means that the things that were the whole point of the changes don't work yet. Still, the underlying code has been tested.)

So I would be really grateful if anyone had time to review the patch attached to MDL-7336 before I commit it. Thanks.