Moodle and HipHop for PHP (optimized C++)

by Håvard Sørli -
Number of replies: 10

Hi,

Is there anyone who has attempted to compile Moodle with HipHop?
https://github.com/facebook/hiphop-php/wiki/

"HipHop transforms your PHP source code into highly optimized C++ and then compiles it with g++ to build binary files. You keep coding in simpler PHP, then HipHop executes your source code in a semantically equivalent manner and sacrifices some rarely used features – such as eval() – in exchange for improved performance.

Facebook sees about a 50% reduction in CPU usage when serving equal amounts of Web traffic when compared to Apache and PHP. Facebook’s API tier can serve twice the traffic using 30% less CPU."

In reply to Håvard Sørli

Re: Moodle and HipHop for PHP (optimized C++)

by sam marshall -

' eval\(' -326 matches in 'core-moodle-github'

:-)

--sam

In reply to sam marshall

Re: Moodle and HipHop for PHP (optimized C++)

by Hubert Chathi -

Yes, I don't know how they can call eval a 'rarely used feature'.

There are other PHP compiler projects, notably Roadsend and phc.  It would be interesting to see if those would work with Moodle (although Roadsend looks to only be compatible with PHP 5.2).

In reply to sam marshall

Re: Moodle and HipHop for PHP (optimized C++)

by Mark Johnson -

Also, as I understand, HipHop requires any additional PHP modules used to have a C++ equivalent written, since the code's converted before compilation.  There's a list around somewhere of the currently supported modules, but I suspect that Moodle requires some that aren't on that list.

In reply to sam marshall

Re: Re: Moodle and HipHop for PHP (optimized C++)

by Håvard Sørli -

My first impression of Moodle is that the platform has a problem with basic security. This is a personal first impression, not based on a thorough analysis on my part.

When a function like eval() is used over three hundred times in the core code, I'm even more skeptical.

If you do a Google search for: php eval security
you will quickly see a number of recommendations against using eval() from a security standpoint. [1] [3] [4] [5] [6]

Michał Rudnicki wrote: [8]
"As a rule of thumb I tend to follow this:
    1. Sometimes eval is the only / the right solution.
    2. For most cases one should try something else.
    3. If unsure, goto 2.
    4. Else, be very, very careful."

From a performance standpoint, alternative code should be tested thoroughly before resorting to eval(). Simple tests show that code using eval() is significantly slower. [2] [7] [9]

This test is an example: [9]

Name of the test                   Test duration (sec)   Test duration (%)
static code (test #0)              0.039707               100.00%
eval (test #1)                     0.943396              2375.89%
create_function (test #2)          0.100959               254.26%
create_function (ref) (test #3)    0.085249               214.70%
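
A comparison of this kind can be reproduced roughly with a small timing harness. The following is only a sketch, not the code behind [9]: the helper time_calls(), the workload and the iteration count are made up for illustration, and create_function() is left out because it no longer exists in current PHP versions.

```php
<?php
// Time $iterations calls of $fn and return [elapsed seconds, last result].
// (Hypothetical helper, not taken from the benchmark cited above.)
function time_calls(callable $fn, int $iterations): array {
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = $fn($i);
    }
    return [microtime(true) - $start, $result];
}

// The same expression, once as ordinary code and once passed through eval().
$static = function (int $i): int {
    return $i * 2 + 1;
};
$evaled = function (int $i): int {
    return eval('return $i * 2 + 1;');
};

list($statictime, $staticresult) = time_calls($static, 10000);
list($evaltime, $evalresult) = time_calls($evaled, 10000);

printf("static: %.6f s, eval: %.6f s\n", $statictime, $evaltime);
```

Absolute timings will vary between machines and PHP versions, so only the ratio between the two numbers is of interest.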


[1] http://php.net/manual/en/function.eval.php "Jürgen at person dot be 12-Mar-2009 9:46"
[2] http://php.net/manual/en/function.eval.php "Luke at chaoticlogic dot net 02-Apr-2008 8:26"
[3] http://www.hardened-php.net/suhosin/a_feature_list:eval_black_and_whitelist.html
[4] http://php.robm.me.uk/ "is a useful but very dangerous function"
[5] http://en.wikipedia.org/wiki/Eval
[6] http://www.blog.highub.com/php/php-core/php-eval-is-evil/
[7] http://blog.joshuaeichorn.com/archives/2005/08/01/using-eval-in-php/
[8] http://stackoverflow.com/questions/951373/when-is-eval-evil-in-php
[9] http://php.webtutor.pl/en/2011/06/13/eval-counterparts-in-php-how-to-do-something-wrong-faster/

In reply to Håvard Sørli

Re: Re: Re: Moodle and HipHop for PHP (optimized C++)

by Dan Marsden -

I don't think anyone here would disagree that using eval is ugly in a lot of cases. A massive chunk of those eval calls are actually JS (not PHP), and 159 of them come from JS code in /mod/scorm - feel free to provide tested patches in any area of SCORM!!! ;-) A quick look shows that some of the others come from things like YUI, PEAR libs, TinyMCE etc., so the number of PHP instances of eval is a lot lower (I haven't checked how much lower).

In reply to Dan Marsden

Re: Re: Re: Moodle and HipHop for PHP (optimized C++)

by Dan Marsden -

In fact, taking a quick look at the directories the evals come from, I'd guess that fewer than 40 are actually PHP eval calls in the latest master code.

In reply to Håvard Sørli

Re: Moodle and HipHop for PHP (optimized C++)

by Eloy Lafuente (stronk7) -

Just wondering whether that 50% reduction in CPU usage (or doubled request capacity) is comparable to the one obtained by using a simple, pluggable opcode cache.

I know they are totally different (and incompatible) beasts... so avoid any arguing about that, please. I'm just asking about real improvements.

Also, I think (for both HipHop and opcode caches) that neither of them is really going to help if the bottleneck in the site is disk or DB access. And of course, not being a drop-in replacement (with some extensions or things like eval() / create_function() not being supported), I think it will hardly work with Moodle.

Just my 1 cent reflection, based on my complete lack of knowledge of HipHop, ciao :-)

PS: About security, I don't think eval() (or $DB->delete_records()) is evil per se. If you use them correctly, they are NOT insecure. If you use them incorrectly, they can certainly do really nasty things. ;-)

PS2: About current uses, approx. (=36, look at the occurrences, just a dozen are "ours"):

grep -r ' eval(' * | grep '\.php:' | grep -v 'scorm.*js' | wc -l 
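
A grep like this still counts ' eval(' inside PHP strings and comments. A rough alternative, assuming PHP's bundled tokenizer extension is available, is to count only real T_EVAL tokens. The function name count_php_evals() and the sample source are invented for illustration; this has not been run against the Moodle tree.

```php
<?php
// Count real PHP eval() calls in a source string using the tokenizer,
// so that 'eval(' inside strings, comments or embedded JavaScript is ignored.
// (Hypothetical helper, for illustration only.)
function count_php_evals(string $source): int {
    $count = 0;
    foreach (token_get_all($source) as $token) {
        // Tokens are either single characters or [id, text, line] arrays.
        if (is_array($token) && $token[0] === T_EVAL) {
            $count++;
        }
    }
    return $count;
}

// Made-up sample: one comment, one JavaScript string, one real eval() call.
$sample = <<<'CODE'
<?php
// eval('not counted: this is a comment');
$js = "var x = eval('1 + 1');"; // JavaScript inside a PHP string
eval('$y = 2;');
CODE;

$found = count_php_evals($sample);
```

For the sample, only the last line is a genuine PHP eval() call, so $found ends up as 1. To survey a whole checkout, the same function could be applied to the contents of each .php file.
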
In reply to Eloy Lafuente (stronk7)

Re: Moodle and HipHop for PHP (optimized C++)

by Danny Wahl -

I was thinking along the same lines - there are other much easier (at least to Moodle core) things you can do to majorly boost your performance:

op-code, as you mentioned

php5-fastcgi

mpm-worker instead of prefork

nginx?

then you can get into "harder" but still easier things like load balanced separate db servers (super simple in postgres9), reverse proxies, SAS drives, etc... etc... that I think are better in the long run.

In reply to Danny Wahl

Re: Moodle and HipHop for PHP (optimized C++)

by sam marshall -

I agree postgres 9 is great, but I'm not sure that 'load balanced separate db servers' are 'super-simple' :-)

In fact, I'm pretty sure the correct description is 'impossible', with standard postgres 9 out of the box.

Postgres 9 does have a great new 'hot standby' feature which means that you can run multiple servers to handle 'read' transactions. However you can still only have a single server that handles 'write' transactions.

As the Moodle API doesn't provide a clear way to distinguish transactions before they start, it is not easily possible to send 'read' transactions to a load-balanced mirror and 'write' transactions to a master server. This is made even more difficult by the limited use of explicit transactions in Moodle. Commonly, without indicating an explicit transaction, you might write code like the following:

- Make a database change (write transaction)
- Do some query that depends on the change you just made (read transaction)

This scheme will not work in a system where you try to distinguish between read and write transactions, serving the former through mirrored servers to gain performance. The write transaction will likely not have been passed to the mirror servers yet. It would be necessary to correctly mark these as dependent on each other by having a single transaction that encompasses the write plus any reads which might depend on it.
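
The problem can be illustrated with a toy model. This is only a sketch under simple assumptions: the class lagging_pair and its methods are hypothetical, and plain arrays stand in for the master and mirror servers, with an explicit replicate() call standing in for replication lag.

```php
<?php
// Toy model: the replica only sees writes once replicate() is called.
// All names here are hypothetical; nothing below talks to a real database.
class lagging_pair {
    public $master = [];
    public $replica = [];

    // A 'write' transaction reaches the master only.
    public function write(string $key, string $value): void {
        $this->master[$key] = $value;
    }

    public function read_from_replica(string $key): ?string {
        return $this->replica[$key] ?? null;
    }

    public function read_from_master(string $key): ?string {
        return $this->master[$key] ?? null;
    }

    // Replication catches up with the master.
    public function replicate(): void {
        $this->replica = $this->master;
    }
}

$db = new lagging_pair();
$db->write('course:2:fullname', 'Physics 101');

// A dependent read sent to the mirror before replication does not see the change.
$stale = $db->read_from_replica('course:2:fullname');
$fresh = $db->read_from_master('course:2:fullname');

$db->replicate();
$later = $db->read_from_replica('course:2:fullname');
```

Here $stale is null while $fresh already holds the new value. The mirror only returns it once replication has caught up, which is why a write and the reads that depend on it need to go to the same server.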

What's the way forward without rewriting all Moodle? Well, the majority of transactions in Moodle are 'read' transactions although each request typically makes at least one write transaction (log table). If we were to make progress in allowing Moodle to benefit from this type of optimisation, we would probably want to have some way where frequently-used pages are identified using a special new API, along the lines of:

$DB->enter_read_only_mode();

At this point, attempts to write to the database would throw an exception. We would also add a new API:

$DB->execute_okay_if_delayed($sql);

Which can be used for the mdl_log update (and similar) to indicate that you want to execute an update, but it is OK if the result of the update is not available until after this PHP request has finished.

With this API, it would then be possible to configure a moodle_database subclass so that if you call the enter_read_only_mode, it makes all requests from a read-only database, except for execute_okay_if_delayed which would use a second db connection to the master database. Pages which don't call the enter_read_only_mode would use only the master database.
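
As a very rough sketch of that routing, and not of anything that exists in Moodle: the classes fake_connection and read_write_split_db below are hypothetical, query() and execute() are simplified stand-ins rather than the real moodle_database methods, and the connections only record the SQL they receive.

```php
<?php
// Toy stand-in for a database connection: it only records the SQL it is given.
// (Hypothetical; a real implementation would wrap an actual driver.)
class fake_connection {
    public $name;
    public $log = [];

    public function __construct(string $name) {
        $this->name = $name;
    }

    public function run(string $sql): void {
        $this->log[] = $sql;
    }
}

// Hypothetical router implementing the two proposed calls.
class read_write_split_db {
    private $master;
    private $replica;
    private $readonly = false;

    public function __construct(fake_connection $master, fake_connection $replica) {
        $this->master = $master;
        $this->replica = $replica;
    }

    // After this call, reads go to the replica and ordinary writes are refused.
    public function enter_read_only_mode(): void {
        $this->readonly = true;
    }

    // Reads: replica in read-only mode, master otherwise.
    public function query(string $sql): void {
        $target = $this->readonly ? $this->replica : $this->master;
        $target->run($sql);
    }

    // Ordinary writes: always the master, and not allowed in read-only mode.
    public function execute(string $sql): void {
        if ($this->readonly) {
            throw new RuntimeException('Write attempted in read-only mode');
        }
        $this->master->run($sql);
    }

    // Writes whose result this request never reads back (e.g. the log insert)
    // go to the master even in read-only mode.
    public function execute_okay_if_delayed(string $sql): void {
        $this->master->run($sql);
    }
}

$master = new fake_connection('master');
$replica = new fake_connection('replica');
$DB = new read_write_split_db($master, $replica);

$DB->enter_read_only_mode();
$DB->query('SELECT * FROM mdl_course WHERE id = 2');
$DB->execute_okay_if_delayed("INSERT INTO mdl_log (action) VALUES ('view')");

try {
    $DB->execute('UPDATE mdl_course SET visible = 0 WHERE id = 2');
    $blocked = false;
} catch (RuntimeException $e) {
    $blocked = true;
}
```

In this sketch the SELECT lands on the replica, the log insert on the master, and the ordinary UPDATE is rejected. A page that never calls enter_read_only_mode() would keep using the master for everything.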

In most moodle instances the vast majority of usage is probably concentrated around a few pages (course view, forum view, forum discussion) which are generally read-only so if you could ship all those queries onto separate servers you have solved the scalability problem.

Removing scalability limitations from Moodle is a fairly big potential improvement, and implementing this change would probably be less work than replacing all the existing uses of 'eval' :-) which won't gain anyone anything.

So anyway, that would be nice and would allow for hugely greater scalability of moodle instances using postgres 9, but it doesn't exist at the moment, so the best you can do with the new 'hot standby' feature is just that; have your failover server ready to kick in the instant there is a problem.

Feel free to correct me if this is wrong...

--sam

PS So far we have been able to handle high load just with a single fast database server. (It's actually a pair in failover configuration - so yes we are wasting good hardware by not being able to use it in the 'hot standby' manner for read-only queries.)