A tricky code-profiling situation

by Tim Hunt -
Number of replies: 1

Yesterday I hit a tricky situation with profiling, and I just wanted to write it up where other people might see what I learned.

The situation was that since we upgraded our development environment to Moodle 3.8.2 (from 3.7.x), some of our Behat tests were failing because they hit the PHP 30-second time limit, and this was happening in the minify library. We realised that it was minifying the CSS for our theme.

Profiling theme/styles.php

So, I wanted to profile (as in https://docs.moodle.org/dev/Profiling_PHP) theme/styles.php. However, that did not work, because that script sets ABORT_AFTER_CONFIG for performance reasons.

This caused Mark Johnson to tell me about $CFG->earlyprofilingenabled = true; which is worth knowing about. (The link there goes to where it is documented in config-dist.php.) However, that did no good. It still does not let you profile ABORT_AFTER_CONFIG scripts.

Then, thanks to a hint from Andrew Nicols, I just hacked the code in theme/styles.php to remove the ABORT_AFTER_CONFIG define. It is not quite as simple as that, because the script uses functions like min_clean_param, so I then needed to add a manual require_once of lib/configonlylib.php. Also, lower down the script, where it does define('ABORT_AFTER_CONFIG_CANCEL', true);, I had to comment that bit out to prevent errors.
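For illustration, the hack looked roughly like this. This is a non-runnable patch sketch of theme/styles.php, not the actual diff; the exact positions and surrounding code in the real file differ:

```php
// Near the top of theme/styles.php, before config.php is included:
// define('ABORT_AFTER_CONFIG', true);  // <-- commented out so profiling can run

require_once(__DIR__ . '/../config.php');

// Added: min_clean_param() and friends live here, and are normally only
// loaded on the ABORT_AFTER_CONFIG fast path.
require_once($CFG->dirroot . '/lib/configonlylib.php');

// Lower down, the matching cancel define also has to go:
// define('ABORT_AFTER_CONFIG_CANCEL', true);  // <-- commented out to prevent errors
```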

However, with those things done, the profiling worked, and I found that almost all of the 40-second execution time (when it was re-building the CSS after the theme cache had been cleared) was due to calls from the minify library to preg_match and preg_replace.

Finding the issue in minify

So, then the question was: which preg calls in minify were causing the bad performance? The only way I could think of was to add some debug code in the library, to count and time the preg calls with different regexes. That led to the first half of this patch. For each regex, we write the number of times it is used, and the total time taken, to a comment at the top of the minified CSS.
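A minimal sketch of that kind of instrumentation (the function names here are hypothetical; the real change is in the patch linked above):

```php
<?php
// Accumulate per-pattern call counts and total time.
$pregStats = [];

// Drop-in replacement for preg_replace that records how often each
// pattern is used and how long it takes in total.
function timed_preg_replace(string $pattern, string $replacement, string $subject): string {
    global $pregStats;
    $start = microtime(true);
    $result = preg_replace($pattern, $replacement, $subject);
    $elapsed = microtime(true) - $start;
    if (!isset($pregStats[$pattern])) {
        $pregStats[$pattern] = ['count' => 0, 'time' => 0.0];
    }
    $pregStats[$pattern]['count']++;
    $pregStats[$pattern]['time'] += $elapsed;
    return $result;
}

// After minifying, the stats can be dumped as a comment at the top of the output.
function stats_comment(array $stats): string {
    $lines = [];
    foreach ($stats as $pattern => $s) {
        $lines[] = sprintf('%s: %d calls, %.4fs', $pattern, $s['count'], $s['time']);
    }
    return "/*\n" . implode("\n", $lines) . "\n*/\n";
}
```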

Then, to use that, it was a major pain that Moodle aggressively caches the minified CSS. So I went through the relevant code relentlessly hacking if statements so that it never used the caches and always did the minifying. This meant that I could open the direct URL to the CSS in my browser, and just keep pressing F5 to reload as I changed things. That is the second part of the patch.

(As an aside, it is often worth putting a little extra effort into code like this, which you will later throw away, to make the feedback cycle as short as possible while you work on the code changes you want to keep.)

So, anyway, it then became really obvious which regex was the problem, and I was able to fix it. For the rest of the story (which is not finished yet), see https://github.com/matthiasmullie/minify/issues/317 and MDL-68191.

Hope this is at least vaguely useful/interesting to someone.

In reply to Tim Hunt

Re: A tricky code-profiling situation

by David Scotson -
I'd seen a similar issue with this code, but it had the extra complication of getting really bad intermittently and on certain servers.

I'd originally assumed the SCSS compiling would be the slow part of generating the Moodle CSS, so was somewhat surprised to see the minification taking up so much time. (I originally noticed this when some work was done to let you use the C-based SCSS compiler instead of the PHP one. There wasn't as much of an overall speedup as I expected, because it turned out the minifier was taking a while.)

The quick workaround I came up with for the extremely bad performance was to only minify the CSS in sections, which Moodle itself mostly does. But if you go crazy in the theme and use Sass compiling to build up one big input file, you can end up with one giant lump of CSS getting minified in one go. The same amount of minification happens, but it is much faster when broken down into sections and minified piece by piece.

(Another clue was that pre-minifying the CSS before it got minified seemed to help, particularly if that got it under the 1MB size limit, though this also means the regexes will match less, so it's not totally obvious that the size reduction alone is what helps.)

The issue I was seeing was performance falling off a cliff at certain points, on certain servers, but I think the same root cause is responsible for most of the slowness that Moodle experiences with this library. The consistent thing that seems to affect users of this library is slowness as the size of the input increases.

If you take a small chunk of CSS and repeat it multiple times (I did the following while testing the JS side of the same code: for i in {1..5}; do cat jquery.js >> 5jquery.js; done, and generated a bunch of files of different sizes), you'll see that it gets slower as you increase the size of the input file, but more than linearly. So minifying one copy of jQuery 5 times over is much faster than minifying 5 concatenated copies of jQuery in one go, and the difference increases with size.

I'm fairly certain this comes down to the section of code that takes the initial input, runs regexes on it (as timed above), then takes the first regex that matches, substrs the match out of the input, and appends the result of the match to the processed string. So there are two big strings: one being assembled piece by piece, and one being reduced piece by piece, in a loop.

This is the kind of thing you'd use a StringBuilder for in Java, to avoid creating lots of String objects that get quickly destroyed.

I believe that each time PHP copies a chunk of those two strings (and it will copy them a lot, and more for longer inputs) it basically duplicates the whole string. PHP's refcounting garbage collection seems to cope with this very well at small sizes, but I think it must still be generating a lot of strings that are only briefly used and then dropped. And if your string is, say, 1MB long (which is where things got really bad in my testing), then each pass through the loop creates roughly 1MB of strings that will be thrown away on the next iteration. I would guess that if these copies could be avoided it might speed things up for larger files; e.g. rather than constantly cutting down the input file as we progress, it might be easier to keep track of an index into it (possibly you could also only search a little way ahead each time, but the regex engine is so hyper-optimised that this probably wouldn't be worth the trouble).
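The cost of the cut-down-the-input pattern can be shown in isolation. A synthetic sketch (not the library's actual code): both functions produce the same output, but the first copies the entire remainder of the input on every iteration, so the bytes copied grow quadratically with input size, while the second just advances an index.

```php
<?php
// The "cut down the input" approach, roughly what the library does:
// each iteration copies the whole remaining string.
function consume_by_substr(string $input): string {
    $output = '';
    while ($input !== '') {
        $chunk = substr($input, 0, 10);           // take the next piece
        $input = substr($input, strlen($chunk));  // copies the entire remainder
        $output .= $chunk;
    }
    return $output;
}

// The offset-tracking alternative: the input string is never copied,
// we just remember how far we have consumed.
function consume_by_offset(string $input): string {
    $output = '';
    $offset = 0;
    $len = strlen($input);
    while ($offset < $len) {
        $chunk = substr($input, $offset, 10);
        $offset += strlen($chunk);
        $output .= $chunk;
    }
    return $output;
}
```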

Measuring memory usage from within PHP, it seems to handle this well, but I think PHP just reports what the Zend Memory Manager is looking after, which doesn't directly account for what's going on behind the scenes as we continually ask the OS for fresh memory, particularly odd sizes that might not fit in the gap we've just freed (or which the Memory Manager would have to be pretty smart to notice do fit).

If anyone's got the time to look into this, I think it might make a big difference to the overall CSS compile speed. One way to test would be to skip the minification entirely and see how long the CSS generation takes. Then take that output and run it through the minifier. Then split it into multiple pieces, run each through the minifier, time them, and add them up. That should give a rough idea of the extra time being used up by the memory juggling in the background.

One relatively quick intervention might be to store the output as an array of string chunks, and only combine them back into one big string at the end (which is basically what a StringBuilder does, I think). It's harder on the input side because we need to run regexes on it, but if we can provide a start index for where to search from, I think it would still be possible to search the single large string each time.
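PHP's preg functions do support this: preg_match takes an $offset argument, and the \G assertion anchors the match at that offset, so the one big input string can be scanned in place without being cut down. A toy sketch (stripping CSS comments, not the library's real tokeniser), collecting output pieces in an array and imploding once at the end:

```php
<?php
// Strip /* ... */ comments from CSS by walking an offset through the
// input string, never copying the remainder.
function strip_comments(string $css): string {
    $pieces = [];
    $offset = 0;
    $len = strlen($css);
    while ($offset < $len) {
        // \G anchors the match at $offset, so we only ever test whether a
        // comment starts exactly here, against the original string.
        if (substr($css, $offset, 2) === '/*'
                && preg_match('/\G\/\*.*?\*\//s', $css, $m, 0, $offset)) {
            $offset += strlen($m[0]);   // skip the comment
        } else {
            $pieces[] = $css[$offset];  // keep this character
            $offset++;
        }
    }
    // One final join, StringBuilder-style, instead of repeated concatenation.
    return implode('', $pieces);
}
```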

An alternative approach, if there's an easy way to split the input CSS (e.g. maybe split every time a /*! comment starts, and add more of those if we need more chunks), would be to just pass it in in smaller chunks and leave the library as it is.
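That splitting could be sketched like this; the minifier is passed in as a callable here for illustration (in real use it would wrap MatthiasMullie\Minify\CSS):

```php
<?php
// Split the CSS at each /*! comment and minify the chunks separately,
// so the library never sees one giant input string.
function minify_in_chunks(string $css, callable $minify): string {
    // Lookahead split: the /*! marker stays at the start of its chunk.
    $chunks = preg_split('/(?=\/\*!)/', $css);
    $out = [];
    foreach ($chunks as $chunk) {
        $out[] = $minify($chunk);
    }
    return implode('', $out);
}
```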

A third possibility is that it's PHP's cycle-collecting garbage collector kicking in repeatedly but not actually finding any cycles to clear. Disabling it was part of a change that sped Composer up by 90% a while back, and since we're generating so many short-lived values, it might also be the case here. You can switch it off and then back on again from within PHP, which might be worth checking to see if it makes any difference on larger inputs.
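Checking that is cheap: gc_disable() and gc_enable() are built in, so the collector can be switched off just around the expensive call. A small sketch (the helper name and the minifier usage are illustrative):

```php
<?php
// Run a piece of work with PHP's cycle-collecting GC disabled,
// restoring it afterwards even if the work throws.
function without_gc(callable $work) {
    gc_disable();
    try {
        return $work();
    } finally {
        gc_enable();
    }
}

// Hypothetical usage around the minifier:
// $css = without_gc(fn() => $minifier->minify());
```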