Re: A tricky code-profiling situation

by David Scotson
I'd seen a similar issue with this code, but it had the extra complication of getting really bad intermittently and on certain servers.

I'd originally assumed it was the SCSS compiling that would be the slow part of generating the Moodle CSS, so was somewhat surprised to see the minification taking up so much time. (I originally noticed this when some work was done to let you use the C-based SCSS compiler instead of the PHP one. There wasn't as much of an overall speedup as I expected, because it turned out the minifier was taking a while.)

The quick workaround I came up with for the extremely bad performance was to only minify the CSS in sections, which Moodle itself mostly does anyway; but if you go crazy in the theme and use Sass compilation to build up one big input file, you can end up with one giant lump of CSS getting minified in one go. The same amount of minification happens either way, but it's much faster when the CSS is broken down into sections and minified piece by piece (see the sketch below).
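
To make that concrete, here's roughly what I mean, assuming the matthiasmullie/minify library (that's my assumption about which minifier is in play; this is a sketch, not a tested patch). The brace counting is deliberately naive:

    <?php
    // Sketch of the workaround: split the CSS at top-level rule boundaries
    // and minify it piece by piece. Ignores braces inside comments/strings.
    require 'vendor/autoload.php';

    use MatthiasMullie\Minify;

    function split_top_level(string $css, int $targetsize = 200000): array {
        $chunks = [];
        $depth = 0;
        $start = 0;
        $len = strlen($css);
        for ($i = 0; $i < $len; $i++) {
            if ($css[$i] === '{') {
                $depth++;
            } elseif ($css[$i] === '}') {
                $depth--;
                // Only cut at top level, so @media blocks stay intact.
                if ($depth === 0 && $i - $start >= $targetsize) {
                    $chunks[] = substr($css, $start, $i - $start + 1);
                    $start = $i + 1;
                }
            }
        }
        if ($start < $len) {
            $chunks[] = substr($css, $start);
        }
        return $chunks;
    }

    $out = [];
    foreach (split_top_level(file_get_contents('theme.css')) as $chunk) {
        $minifier = new Minify\CSS();
        $minifier->add($chunk);
        $out[] = $minifier->minify();
    }
    file_put_contents('theme.min.css', implode('', $out));

(theme.css is a hypothetical file name; substitute whatever your Sass build produces.)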

(Another clue was that pre-minifying the CSS before it got minified seemed to help, particularly if that got it under the 1MB size limit, though pre-minified CSS also gives the regexes less to match, so it's not totally obvious that the size reduction alone is what helps.)

The issue I was seeing was performance falling off a cliff at certain points, on certain servers, but I think the same root cause is responsible for most of the slowness Moodle experiences with this library: the consistent complaint from its users is that it gets slower as the size of the input increases.

If you take a small chunk of CSS and repeat it multiple times (while testing the JS side of the same code I ran for i in {1..5}; do cat jquery.js >> 5jquery.js; done and generated a bunch of files of different sizes), you'll see that minification gets slower as you increase the size of the input file, but more than linearly. So minifying jQuery five separate times takes less total time than minifying five concatenated copies of jQuery in one go, and the difference increases with size.
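
A quick way to see that super-linear growth from PHP, again assuming matthiasmullie/minify (chunk.css is just any smallish CSS file you have lying around):

    <?php
    // Sketch: time minification as the input grows by whole multiples,
    // to show the cost grows worse than linearly.
    require 'vendor/autoload.php';

    use MatthiasMullie\Minify;

    $base = file_get_contents('chunk.css');
    foreach ([1, 2, 3, 4, 5] as $n) {
        $css = str_repeat($base, $n);
        $start = microtime(true);
        $minifier = new Minify\CSS();
        $minifier->add($css);
        $minifier->minify();
        printf("%d copies, %7d bytes: %.3fs\n", $n, strlen($css), microtime(true) - $start);
    }
    // If the cost were linear, seconds divided by copies would stay flat.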

I'm fairly certain this comes down to the section of code that takes the initial input, runs the regexes on it (as timed above), then takes the first regex that matches, substr()s the matched text out of the input, and appends it to the processed string. So there are two big strings, one being assembled piece by piece and one being reduced piece by piece, in a loop.
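
In outline, the pattern I'm describing looks something like this; it's a simplified sketch of the shape of the code, not the library's actual source:

    <?php
    // Two big strings, one shrinking and one growing, reallocated in a loop.
    function process(string $input): string {
        $output = '';
        while ($input !== '') {
            if (preg_match('/^\/\*.*?\*\//s', $input, $m)) {
                $token = $m[0];       // e.g. a comment matched at the front
            } else {
                $token = $input[0];   // fall back to a single character
            }
            $output .= $token;                        // grow one big string...
            $input = substr($input, strlen($token));  // ...and shrink the other
        }
        return $output;
    }
    // Each substr() copies the whole remaining input, so total work grows
    // roughly quadratically with input size.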

This is the kind of thing you'd use a StringBuilder for in Java, to avoid creating lots of String objects that get quickly destroyed.

I believe that each time PHP copies a chunk of either of those two strings (and it copies them a lot, and more often for longer inputs), it essentially duplicates the whole string. PHP's refcounting garbage collection seems to cope with this very well at small sizes, but I think it must still be generating a lot of values that are used only briefly and then dropped. And if your string is, say, 1MB long (which is where things got really bad in my testing), then each pass through the loop creates roughly 1MB of strings that will be thrown away on the next pass. I would guess that if these copies could be avoided it might speed things up for larger files, e.g. rather than constantly cutting down the input as we progress, it might be easier to keep track of an index into it (possibly you could also only search a little way ahead each time, but the regex engine is so hyper-optimised that this probably wouldn't be worth the trouble).
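
The index idea is feasible because preg_match() already accepts a start offset, so you can walk a big string without ever copying it; a tiny demo:

    <?php
    // Sketch: scan a large string via preg_match()'s $offset parameter
    // instead of substr()ing the consumed prefix away each time.
    $subject = str_repeat("/* c */ .rule { color: red; }\n", 100000);
    $offset = 0;
    $count = 0;
    while (preg_match('/\/\*.*?\*\//s', $subject, $m, PREG_OFFSET_CAPTURE, $offset)) {
        // $m[0][1] is the byte position of the match; jump past it.
        $offset = $m[0][1] + strlen($m[0][0]);
        $count++;
    }
    echo "$count comments found without copying the subject once\n";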

Measuring memory usage from within PHP, it seems to handle this well, but I think PHP just reports what the Zend Memory Manager is looking after, which doesn't directly account for what's going on behind the scenes as we continually ask the OS for fresh memory, particularly in odd sizes that might not fit in the gap we've just given up (or that the Memory Manager would have to be pretty smart to notice do fit).
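
One way to peek behind the Zend Memory Manager, on Linux at least, is to compare PHP's own figure with the resident set size the kernel reports; rough sketch:

    <?php
    // Sketch (Linux-only): PHP's accounting vs the OS-level RSS, before
    // and after a burst of big short-lived string copies.
    function rss_kb(): int {
        preg_match('/^VmRSS:\s+(\d+) kB/m', file_get_contents('/proc/self/status'), $m);
        return (int) $m[1];
    }

    printf("before: php=%.1fMB rss=%dkB\n", memory_get_usage(true) / 1048576, rss_kb());
    $s = str_repeat('x', 1048576); // a 1MB string, like the bad case above
    for ($i = 0; $i < 1000; $i++) {
        $s = substr($s, 1) . 'x';  // force a fresh ~1MB copy every pass
    }
    printf("after:  php=%.1fMB rss=%dkB\n", memory_get_usage(true) / 1048576, rss_kb());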

If anyone's got the time to look into this, I think it might make a big difference to the overall CSS compile speed. One way to test would be to skip the minification entirely and see how long the CSS generation takes. Then take that output and run it through the minifier in one go. Then split it into multiple pieces, run each piece through the minifier, and add the times up. That should give us a rough idea of the extra time being eaten by the memory juggling in the background.
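
The test could look something like this, again assuming matthiasmullie/minify, with moodle.css standing in for a dump of the unminified generator output:

    <?php
    // Sketch: time the minifier over the whole generated CSS, then over
    // the same CSS in pieces, and compare totals.
    require 'vendor/autoload.php';

    use MatthiasMullie\Minify;

    function time_minify(string $css): float {
        $start = microtime(true);
        $minifier = new Minify\CSS();
        $minifier->add($css);
        $minifier->minify();
        return microtime(true) - $start;
    }

    $css = file_get_contents('moodle.css');
    printf("one lump: %.2fs\n", time_minify($css));

    // Same CSS split after every closing brace (naive: this will cut inside
    // @media blocks; the top-level splitter sketched earlier is safer).
    $pieces = preg_split('/(?<=\})/', $css, -1, PREG_SPLIT_NO_EMPTY);
    $total = 0.0;
    foreach (array_chunk($pieces, (int) ceil(count($pieces) / 10)) as $chunk) {
        $total += time_minify(implode('', $chunk));
    }
    printf("ten pieces: %.2fs total\n", $total);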

One relatively quick intervention might be to store the output as an array of string chunks, and only combine them back into one big string at the end (which is basically what a StringBuilder does, I think). It's harder on the input side because we need to run regexes on it, but if we can provide a start index for where we want to search from, then I think it would still be possible to search the single large string in place each time; see the sketch below.
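
Putting both halves together, the earlier loop sketch becomes something like this (note \G, which anchors the pattern at the offset, where ^ would not):

    <?php
    // Sketch: same toy tokeniser as before, but output collected in an
    // array (StringBuilder-style) and input scanned via an offset rather
    // than being substr()'d down on each pass.
    function process(string $input): string {
        $parts = [];
        $offset = 0;
        $len = strlen($input);
        while ($offset < $len) {
            if (preg_match('/\G\/\*.*?\*\//s', $input, $m, PREG_OFFSET_CAPTURE, $offset)) {
                $token = $m[0][0];          // a comment matched exactly at $offset
            } else {
                $token = $input[$offset];   // fall back to a single character
            }
            $parts[] = $token;              // no big-string copy here...
            $offset += strlen($token);      // ...and none here either
        }
        return implode('', $parts);         // one final join at the end
    }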

An alternative approach, if there's an easy way to split the input CSS (e.g. maybe split every time a /*! comment starts, and add some more of those if we need more chunks), would be to just pass it in smaller chunks and leave the library as it is.
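
The splitting itself is easy, because minifiers preserve /*! comments anyway; e.g.:

    <?php
    // Sketch: split CSS at each /*! comment so every chunk starts on one
    // of those markers. moodle.css is again a hypothetical dump file.
    $css = file_get_contents('moodle.css');

    // Lookahead split keeps each /*! at the start of the following chunk.
    $chunks = preg_split('/(?=\/\*!)/', $css, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($chunks as $i => $chunk) {
        printf("chunk %d: %d bytes\n", $i, strlen($chunk));
        // ...then feed each chunk through the minifier as in the sketches above.
    }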

A third alternative is that possibly it's PHP's cycle-collecting garbage collector kicking in repeatedly but not actually finding any cycles to clear. Disabling it was part of a change that sped Composer up by 90% a while back, and since we're generating so many short-lived values, the same might apply here. You can switch the collector off and back on again from within PHP, which might be worth trying to see whether it makes any difference on larger inputs.
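
That check is cheap to try, since the collector can be toggled at runtime:

    <?php
    // Sketch: bracket the minification with the cycle collector disabled
    // to see whether it contributes to the slowdown on large inputs.
    require 'vendor/autoload.php';

    use MatthiasMullie\Minify;

    $css = file_get_contents('moodle.css'); // hypothetical unminified dump

    gc_disable();                 // stop the cycle collector mid-run
    $start = microtime(true);
    $minifier = new Minify\CSS();
    $minifier->add($css);
    $minifier->minify();
    printf("GC disabled: %.2fs\n", microtime(true) - $start);
    gc_enable();                  // switch it back on...
    gc_collect_cycles();          // ...and run one deliberate collection to catch up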