General developer forum

MUC and localcache

Picture of Russell Smith

I'm again beginning to look at performance issues on clusters to determine where gains can be made.  It's good that the string cache (MDL-41019) has been included.  However, having a central store for that means I am seeing 100-700 calls to MUC during a page load just for strings.  With the network overhead of shared storage, that can add up to 700ms to page load time.
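To make the arithmetic above concrete, here is a rough sketch. The 1 ms per call figure is an assumption, taken from what "700 calls, up to 700ms" implies; real round-trip times will vary with the network and the store.

```python
# Rough estimate of page time spent waiting on sequential cache calls.
# ASSUMPTION: about 1 ms per call to shared storage (implied by the post,
# not measured here).
PER_CALL_MS = 1.0


def cache_overhead_ms(calls: int, per_call_ms: float = PER_CALL_MS) -> float:
    """Total time spent on `calls` sequential cache round trips."""
    return calls * per_call_ms


for calls in (100, 700):
    print(f"{calls} calls -> {cache_overhead_ms(calls):.0f} ms")
```

With a local store the per-call cost drops to a filesystem or memory read, so the same call count matters much less.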

The creation of localcachedir was a great step towards speeding up accesses.  There is also the wiki page which describes the different core caches and whether they need to be local or shared.  I have spent quite some time reading and working on the MUC code, and working out which caches could safely be configured as local is still very much out of my league.

With the above in mind, along with Petr's original proposal to add the ability to use localcache for revision based systems, I'd like to discuss options for progressing that forward, particularly for caches like string and database metadata.

Things I've thought about so far:

1. Add a revision tag to the loader class in MUC as well as a definition item for revisions.  If those are set by the developer, you don't get to configure the cache in MUC; it's automatically a localcache, but the API for MUC users is basically the same.  The only added requirement is to set a revision before get/set.

2. Do the items in 1, but also add the data to a shared storage as a backup cache if local doesn't have it.  If it's expensive to build the cache then this addition may be helpful to warm a cache in a new node without too much user impact.

3. Upgrade the file cachestore to be the backend for all the local caches, so it uses all the same mechanisms as other caches and is well tested.  This would just require detection that it's a localcache to ensure the directory is created in the correct location.

With those in place, any cache with a revision flag, either at the cache or key level, would be able to move to local.  I believe this would improve performance on networked clusters without any impact on single server installations.  Immediate candidates appear to be stringcache, langcache, database-meta and course_modinfo.
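Options 1 and 2 above could look roughly like the sketch below. It is not MUC code: `DictStore` and `LocalFirstCache` are hypothetical names, and plain dicts stand in for the file store and the shared store.

```python
class DictStore:
    """Stand-in for a cache store (a real one would be file or memcached)."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class LocalFirstCache:
    """Hypothetical loader: local store first, optional shared store as backup."""

    def __init__(self, revision, local, shared=None):
        self.revision = revision
        self.local = local
        self.shared = shared

    def get(self, key, build):
        full_key = f"{self.revision}/{key}"   # revision is part of every key
        value = self.local.get(full_key)
        if value is not None:
            return value
        if self.shared is not None:
            value = self.shared.get(full_key)  # warm a new node cheaply
        if value is None:
            value = build()                    # expensive rebuild, last resort
            if self.shared is not None:
                self.shared.set(full_key, value)
        self.local.set(full_key, value)
        return value


shared = DictStore()
builds = []


def build_strings():
    builds.append(1)
    return {"hello": "Hello"}


node_a = LocalFirstCache(revision=5, local=DictStore(), shared=shared)
node_b = LocalFirstCache(revision=5, local=DictStore(), shared=shared)
node_a.get("strings_en", build_strings)  # builds, fills local and shared
node_b.get("strings_en", build_strings)  # warmed from shared, no rebuild
node_b.get("strings_en", build_strings)  # served from node_b's local store
print(len(builds), shared.reads)         # 1 2
```

The second node never rebuilds the data and touches the shared store only once; every later read stays local.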

I would appreciate any feedback and further ideas before I push any further ahead with development in this area.



Re: MUC and localcache
As for which cache would be fastest for every setup - there is no right answer - that is why things should be configurable.

Localcache is generally fast because it is local, i.e. not shared storage, so there are 0 network accesses. It is possible to speed localcache up even more by e.g. mounting it on a RAM disk.

MUC can be faster or slower than localcache. For example, a single server accessing memcache through a socket is extremely fast, faster than local storage; a memcache on the network is probably slower than localcache because of the network access.

Instead of mandating things like "all strings should go to localcache" I would rather see a localcache MUC plugin which incorporates the jsrev in the keys. This would only be safe to use for certain types of cached info (static things that only change on upgrade etc). This allows the admin to configure whatever is fastest for their environment.

Average of ratings: Useful (1)
Picture of Russell Smith
Re: MUC and localcache
Thanks for the input Damyon, it has sparked other thoughts.

In part, I'm then confused about the reason localcachedir was created at all and why it wasn't implemented in MUC initially.  My current guess is that there are things like minified JS and HTML that are better served from a filesystem.

So that puts us in a position where we need:

1. A configurable store for the revision data, could be per key or per cache type.
2. A notification that shared cache is not needed and selection of a cache store that is not shared.  In the current setup that would need to be the same for each server as it's not easily configurable otherwise.

I wish I really understood the use case for the Primary and Secondary stores.  Maybe this is it except with some additional information stored in cache definitions to indicate to the administrator that you can configure your primary store to non-shared if you like.

A cache definition item, like "supports local caching", is required and can then drive the loader appropriately.

I think I'm now unclear on how best to manage the revision parameter, as we have database stored revisions for modinfo and global configuration revisions for language information.  Maybe that would be out of scope for a first implementation.  Providing only a per server cache, with a possibly shared secondary cache, is a good option.  The cache revision can be set as part of the loader and incorporated into the key that is sent to the store.  There would only need to be some extra checks in the loader to ensure you are specifying that value correctly, mainly to protect against programming errors by users implementing localcache type caches.
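The loader checks described above might be sketched like this. `RevisionedLoader` and `RevisionNotSet` are hypothetical names, not part of MUC, and a dict stands in for the store.

```python
class RevisionNotSet(Exception):
    """Raised when a revisioned cache is used before its revision is set."""


class RevisionedLoader:
    """Hypothetical wrapper that prefixes every key with the cache revision."""

    def __init__(self, store: dict):
        self._store = store
        self._revision = None

    def set_revision(self, revision: int) -> None:
        # Reject anything that is not a non-negative integer (bool is an int
        # subclass in Python, so it is excluded explicitly).
        if isinstance(revision, bool) or not isinstance(revision, int) or revision < 0:
            raise ValueError(f"invalid revision: {revision!r}")
        self._revision = revision

    def _store_key(self, key: str) -> str:
        # Guard against the programming error of forgetting the revision.
        if self._revision is None:
            raise RevisionNotSet("call set_revision() before get/set")
        return f"{self._revision}_{key}"

    def get(self, key: str, default=None):
        return self._store.get(self._store_key(key), default)

    def set(self, key: str, value) -> None:
        self._store[self._store_key(key)] = value


loader = RevisionedLoader({})
loader.set_revision(42)
loader.set("lang_en", "cached strings")
print(loader.get("lang_en"))  # cached strings
```

Failing loudly when the revision is missing keeps stale, unversioned keys out of the store.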

The implementation strategy for that should be pretty simple and the changes required should not be large.  cachestore_file already allows for a custom path to be set, so a new cachestore definition for that store would provide local storage automatically.  That can then just be selected from the existing administration panel.

I suspect it's the purging of older cache data that is of concern when the size grows.  The localcache dir has a global config for managing this, I would think to control the size of the cache folder.  I wonder how important that is as part of this.  Is this folder getting large an issue?  What strategies could be in place to manage the size?  An enforced large TTL comes to mind, as does a special key to monitor for purge_all_caches.

Do revision numbers go up on purge_all_caches?  If not, then a key requirement is managing a purge_all_caches situation.

Handling that situation with a special cache key much like .lastpurged and $CFG->localcachedirpurged is probably the clearest option I can see.  On cache initialisation we could check those values, purge the cache if required and move on from there.
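A minimal sketch of that check on initialisation, assuming a marker file holding the timestamp of the last purge applied on this node and a globally configured "purged at" timestamp. The marker name is borrowed from the post; `init_local_cache` and the rest of the logic are illustrative only.

```python
import os
import shutil
import tempfile

MARKER = ".lastpurged"  # name borrowed from the post; logic is an assumption


def init_local_cache(cache_dir: str, purged_at: int) -> bool:
    """Purge cache_dir if its marker is older than purged_at.

    Returns True if the directory was purged on this call.
    """
    os.makedirs(cache_dir, exist_ok=True)
    marker = os.path.join(cache_dir, MARKER)
    try:
        with open(marker) as fh:
            last = int(fh.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        last = -1  # no usable marker: treat the directory as stale
    if last >= purged_at:
        return False
    # Remove everything in the directory, then record the purge we applied.
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    with open(marker, "w") as fh:
        fh.write(str(purged_at))
    return True


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = os.path.join(tmp, "localcache")
    init_local_cache(cache_dir, purged_at=100)         # first run writes marker
    open(os.path.join(cache_dir, "strings.cache"), "w").close()
    print(init_local_cache(cache_dir, purged_at=100))  # False, nothing to do
    print(init_local_cache(cache_dir, purged_at=200))  # True, cache wiped
```

Each node compares its own marker against the shared value, so a purge triggered anywhere is applied the next time every node initialises its cache.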

Re: MUC and localcache
Yup - if you look into jsrev / themerev they are incremented when caches are purged ( js_reset_all_caches() ) - this is the only way to effectively "purge" localcache folders (the old data is still there but will never be looked at again).
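A toy illustration of why bumping the revision acts as a purge. The dict stands in for a localcache folder and `revision` for jsrev / themerev; the function names are made up for the example.

```python
store = {}    # stand-in for a localcache folder
revision = 1  # stand-in for jsrev / themerev


def cache_key(name: str) -> str:
    return f"{revision}/{name}"


def purge_all() -> None:
    """Bump the revision; nothing is deleted from the store."""
    global revision
    revision += 1


store[cache_key("theme.css")] = "old css"
purge_all()
print(store.get(cache_key("theme.css")))  # None: old entry is never looked up
print(len(store))                         # 1: old data is still on disk
```

The stale entry stays behind until something cleans it up, which is where the folder size question comes from.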
