Awkward usage of get_string

Awkward usage of get_string

by Alexey Veretennikov -
Number of replies: 28
Hello everybody!

Recently I've decided to get into some performance issues and found out that in many cases when pages load a decent part of time is spent on get_list_of_timezones() function.
But this function, in turn, makes use of get_string() function: after fetching the list of timezones from database it calls get_string() for each timezone. In version 1.8.2 it happens 390 times. And each time get_string() tries to find the appropriate translation for a given timezone reading the language files and seeking only one string in them. But as get_list_of_timezones() is called twice, this process is repeated 780 times. The question is: why not to pass the whole array of timezones and process it at once? Or maybe there are some better options...

I really don't know if I'd better post it in the tracker.Sorry if posting it here was unappropriate.
Average of ratings: -
In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Alexey Veretennikov -
Some more details on get_string()...

I have just benchmarked this function, and the results are very surprising for me:
Page
Function calls
Function time (en_utf8)
Total time (en_utf8)
Function time (ru_utf8)
Total time (ru_utf8)
Main, before login
1433
7.7
8.9
16.9
18.2
Main, after student's login
1477
8.0
9.6
16.7
18.4
Main, after admin's login
1488
6.3
8.0
19.3
21.0
The first page of some course, editing off
282
1.2
3.0
1.4
3.7
The first page of some course, editing on
595
2.9
5.5
3.0
5.8
Edit user's own profile
492
2.1
3.2
7.0
8.2
View user's grades
83
0.5
1.8
0.6
1.9
Edit course setitings
635
2.7
4.1
3.3
5.0
Where
Page - is the page that was loading;
Function calls - number of calls to the get_string() function while loading the page;
Function time - the execution time of get_string() function while loading the page (separately for English and Russian locale), in seconds;
Total time - the execution time of the whole page, in seconds.

Note: the server was performing another tasks, so it makes sense to consider these timings only relatively to each other.

As you can see, the get_string() is called too many times on most pages. All it should do is just get the translated strings but the way it is done cannot be called efficient.

Few words about becnhmarking. I modified the get_string() function to count for how long and how many times it is executed (see attachment). In order to see the results I put the lines
$execution_time = 0;
$number_of_calls = 0;
in the beginning of the loading file (e.g. /index.php)
and the line
echo "execution time of get_string(): $
execution_time seconds, called $number_of_calls times<br>";
in the end of the file.
In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
Thanks for looking into thi. It's a pretty bad problem you seem to have found. However, it is a bit odd, becuase most people don't experience problems like this. If they were, we would have heard about it by now. And get_string has had optimisation work done on it before. For instance it should not read files repeatedly. It should read each needed file once at most.

So can you tell us a bit about your set-up. Which operating system, phpversion, moodle version, database, and so on. Obviously, get_string is doing something that is slow on your system, and ones like it, but which is fast everywhere else, and we need to fix that.
In reply to Tim Hunt

Re: Awkward usage of get_string

by Alexey Veretennikov -
Thank you for your answer.
I have a quite old server 2 x 1 GHz, 896 MB, SCSI drives 10000 rpm running Windows Server 2003 Standard with latest patches, Apache 2.2.4, PHP 5.2.3 (running as module), MySQL 5.0.41-community-nt, and Moodle 1.8.2. There are also some other web-projects on this server (such as Drupal CMS) using the same Apache+PHP+MySQL but I don't see any performance problems there.

Using File Monitor during the mainpage loading I see that the httpd.exe process requests each language file for reading only once - you are right. But there are also many requests called "query information". For example, for /lang/en_utf8/timezones.php there are 145 such requests.

The magic is that all other functions seem to works relatively fast and it is only get_string() that makes it all moving at the speed ot turtle. mixed
In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Martín Langhoff -
That "query information" means we are rpobably stat()ing it to hell. Must be a loop saying include_once() somewhere.
In reply to Tim Hunt

Re: Awkward usage of get_string

by David Contreras -
Similar behavior has happened to me, with Moodle 1.8+, Windows XP, Apache 2.0.59, PHP 5.2.1, Intel T2060 1.6GHz, 800MHz bus speed, 1Gb RAM. The problem started when I upgraded from Moodle 1.5 to 1.8. I haven't benchmarked the performance of system, but page loading takes ages (especially for the main page), and when I set

max_execution_time = 5

in php.ini, I always get an error inside get_string(). Updating language packs (Spanish in my case) helped somewhat but didn't solve the problem completely (Apache CPU time share still reaches 50% for 3-4 seconds while downloading the main page, even when I'm the only user connected).

I don't have a PHP accelerator installed. I'm going to try one and see if that helps.
In reply to David Contreras

Re: Awkward usage of get_string

by David Contreras -
Update: the bulk of the overhead is due to the administration block (module?). If you remove that block from the main page, downloading time shortens remarkably.

I have installed XCache (as a dll extension), and it doesn't seem to help.
In reply to Tim Hunt

Re: Awkward usage of get_string

by Stephen Anderson -

I am having the same issues.  While loading files for the courses quite often I get this error. "too many requests received, please wait a minute and try again."

Is there a fix for this?

In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Martín Langhoff -

Alexey,

get_string() should be bound by the delay of reading the lang file from disk, parsing the PHP, and then doing the array lookup. As the array is cached, this price is only paid the first time you request something from the file.

An OS with good IO will show better scores for get_string() everytime. Lately that means recent releases of linux wink

  • Does the machine have fast IO? Does bonnie++ think it has fast seeks?
  • Can you make sure you do several runs of it so the lang files are in the kernel buffers? When doing multiple runs, ideally do one with cold-cache (on linux, echo 2 > /proc/sys/vm/drop_caches -- anywhere else, restart the OS) and save those numbers, and then do several runs hot-cache and average the hot-cache runs.
  • In your profiling code -- it'd be great if you could capture numbers of the first time you call get_string() that will trigger the reading/parsing of a new lang file, and compare that with subsequent calls to get_string()
  • Does it change if you run it with tuck-mmcache, eaccelerator or APC?

A combination of cached precompiled lang files should show get_string() to be very fast. If it doesn't, we need to profile it and make it better.

And perhaps we do have some IO that could be done better in there. Last time I reviewed the perf of get_string was around 1.4/1.5 when displaying the lang menu was 50% of the cost of the homepage -- and there wasn't anything obvious to my eyes.

Edit: wanted to say - great that you are benching parts of moodle. More of this needs to happen - and a better understanding of what is fast, and what is slow, emerge. We want to be hitting the fast paths of the code as much as possible, and making sure all the core functions/methods are fast fast fast.

In reply to Martín Langhoff

Re: Awkward usage of get_string

by Alexey Veretennikov -
Well, you say that the array should be cached and, indeed, the time it takes to execute the string
$timezones[$timezone->name] = get_string(strtolower($timezone->name), 'timezones');
inside get_list_of_timezones() shows that the first call lasts for about 0.011 seconds while all the others - for about 0.004 seconds. But if we compare it with other functions it is still too long, especially for processing a cached data.

I know that Windows is not the best choice from the point of view of performance and maybe some day I'll change to Linux, but our college is tightly bound to this OS and we don't have a dedicated server for Moodle, so it won't be fast.

As I said in the previous answer I have SCSI drives 10000 rpm, the filesystem is NTFS. Actually, couldn't find bonnie++ clone for Windows to measure the seeks.

Windows restart doesn't lead to any improvements while loading one page several times leads to decreasing of get_string() calls and load time, though not significantly.

Could you say this in more details - which numbers I should capture? The duration of get_string() execution the first time it is called or what? smile

I don't have any accelerators now. The internal cache system was on all the time. When I turn it off the pages load only about 5 % longer.

Thanks for assistence smile
In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Tim Hunt -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
If you are not using a PHP accellerator, then installing one would be the first thing I would try.
In reply to Alexey Veretennikov

Re: Awkward usage of get_string

by Rimas Abromavičius -
I'm experiencing the same problems with Moodle being extremely slow. PHP profiling showed this is due to get_string() function being called ~700 times/page.

So script execution for the main admin page was 6-10 seconds.
I optimized the get_string() a bit (removed lookups of local module translations and other stuff that I will never need) and I'm down to 0.5-1 seconds per page now. This is quite acceptable, though Moodle could be generally more optimized.

The hardware is: Intel Xeon 2.4GHz, 2.5GB RAM, SCSI (1000rpm) RAID3.
And I'm using: Windows 2003, Apache 2.2.3, PHP 5.2.3 + eAccelerator 0.9.5.1, MySQL 5.0.37.
Latest Moodle 1.8.2+

I suppose someone could implement Moodle translations in gettext and do a comparison. I'm sure it would be more efficient.

In reply to Rimas Abromavičius

Re: Awkward usage of get_string

by Martin Dougiamas -
Picture of Core developers Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Plugin developers Picture of Testers
I think the key problem there for you is Windows ... under Linux the OS seems to cache the files a lot more efficiently, so even though get_string is being called a lot the files are only being loaded once.

Still 700 times sounds like way too much, I can't think of any pages that have that many strings ... exactly what URL were you on?

I honestly can't see gettext being much more efficient on Linux (I considered gettext and rejected it before the files were implemented in the current way).
In reply to Martin Dougiamas

Re: Awkward usage of get_string

by Rimas Abromavičius -
I'm happy to report I traced down the problem smile
It turns out, open_basedir restriction in php5 made all the 'include' and 'require' functions awfully slow.

I noticed this when Zend framework took 0.6s to generate a "hello world" page, and another similar server it did it in 0.03s. Now I'm down to 0.05s on Zend's "hello world" smile
And I have ~0.4s on Moodle. It gets down to ~0.3s with eaccelerator.check_mtime="0", though varies depending on the general server load (it serves as a Windows 2003 Domain, Exchange server, file server, etc.)

As for gettext, it would be interesting to do a benchmark on both approaches. With the include statements being fast as they are now, gettext might not give an adequate advantage.


Keep up the good work smile
In reply to Rimas Abromavičius

Re: Awkward usage of get_string

by Martín Langhoff -

Good one!

One other thing to remember WRT gettext() is that when you use it in PHP, you have to be changing your locale all the time so gettext knows what locale, lang & collation to use.

Changing the locale means calls to putenv() and putenv() in PHP is very buggy, and gets broken/fixed/broken again quite often. The "breakage" is that putenv() uses pointers, badly, and ends up SIGSEGV'ing. And it gets worse if you are using turckmmcache or eaccelerator.

We had some really bad apache crashes that would leave the servers half-locked up. After much debugging over about 1 year I traced it down to mod/questionnaire which included something called ESP that had some gettext() calls, and therefore had to putenv().

Once I removed the putenv() calls it - it was like magic. No more crashing! Wohoo.

In reply to Rimas Abromavičius

Re: Awkward usage of get_string

by Martín Langhoff -
Very strange. I agree with MartinD - gettext under PHP isn't a speed demon -- or didn't used to be at least. Using eaccelerator should keep all your lang files precompiled and in an in-memory cache.

What happens if you tell eaccelerator to skip the stat()ing? I think there's an option in php.ini (eaccelerator.mtime?). Under windows, the stat() call is _very_ slow.
In reply to Martín Langhoff

Big Problem - eaccelerator.class.php

by PROFessor VAZZ - Brazil -
Dear Martin,
I changed to eaccelerator, but this locked my MOODLE 1.8 with message:
Fatal error: Call to undefined function _rm() in /home/eadcom/public_html/lib/eaccelerator.class.php on line 65
Now I have a BIG PROBLEM, because my moodle is out.
How Can I go back to NONE ?
Tks a lot.
PROFessor VAZZ
Brazil
In reply to PROFessor VAZZ - Brazil

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -

As an emergency fix, you can put in your config.php file a line saying

 $CFG->rcache = false;

right after the lines that show the database connection details.

In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by PROFessor VAZZ - Brazil -
Martin,

The same problem:
Fatal error: Call to undefined function _get() in /home/eadcom/public_html/lib/eaccelerator.class.php on line 56

Tks.
PROFessor VAZZ
Brazil


<?php /// Moodle Configuration File

unset($CFG);

$CFG->dbtype = 'mysql';
$CFG->dbhost = 'localhost';
$CFG->dbname = 'eadcom';
$CFG->dbuser = 'eadcom';
$CFG->dbpass = 'eadcom';
$CFG->dbpersist = false;
$CFG->prefix = '_mdl';
$CFG->rcache = false;

$CFG->wwwroot = 'http://e-ead.com';
$CFG->dirroot = '/home/eadcom/public_html';
$CFG->dataroot = '/home/eadcom/moodle-eeadcom';
$CFG->admin = 'teste';
$CFG->directorypermissions = 00777; // try 02777 on a server in Safe Mode

require_once("$CFG->dirroot/lib/setup.php");
// MAKE SURE WHEN YOU EDIT THIS FILE THAT THERE ARE NO SPACES, BLANK LINES,
// RETURNS, OR ANYTHING ELSE AFTER THE TWO CHARACTERS ON THE NEXT LINE.
?>
In reply to PROFessor VAZZ - Brazil

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -

Try adding

  $CFG->cachetype = 'internal';
In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by PROFessor VAZZ - Brazil -
Working....

Tks..tks..tks...

PROFessor VAZZ
Brazil
professor-vazz@e-ead.com
Average of ratings: Useful (1)
In reply to PROFessor VAZZ - Brazil

Re: Big Problem - eaccelerator.class.php

by Marco Rojas -
Professor,

You save me from an awful night...can't thank you enough.

Thank you and thank you and thank you!
In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by David Le Blanc -
Martin,
First I wanted to offer my thanks for your suggestions in this thread. I was at my wit's end trying to figure out how to get back into my moodle after this same problem occurred for me. black eye

I wonder if you might have a suggestion about how the settings for that page should be set. I have enabled the APC module on my server but I have no idea what the settings should be for the performance page. I don't want to screw it up again but uploading of files is painful for my learners.
In reply to PROFessor VAZZ - Brazil

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -
Applied a patch to 1.8 and 1.9/HEAD to prevent Moodle from breaking if you say you have eaccelerator... and you don't. Or you did, but was disabled.

The code was catering for the situation, but there was a subtle error in the logic.

Unfortunately, APC is not supported in rcache (Moodle still takes advantage of the main aspect of APC and other precompilers... the precompiler itself!).

Does APC have shmem abilities these days? If it does, we could have a plugin - just someone needs to write it wink
In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by Anthony Borrow -
Picture of Core developers Picture of Plugin developers Picture of Testers
Martin - APC uses shmem by default. I've created MDL-11375 as a request to create the plugin. I believe this would be especially helpful considering that PHP6 plans to include APC by default. Peace - Anthony
In reply to Anthony Borrow

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -
Hmmm. Looking at http://au2.php.net/apc it looks like it offers shmem services (apc_add(), apc_store(), etc). So yep, an APC user could roll up their sleeves and add APC support.

If the semantics of the shmem API are the same as for eaccelerator, then you can probably just make a copy of the eaccelerator class and hack it till it works.
In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -
> we could have a plugin - just someone needs to write it

And I have an initial implementation of this. All I need is testers wink

http://tracker.moodle.org/browse/MDL-11375
In reply to Martín Langhoff

Re: Big Problem - eaccelerator.class.php

by Anthony Borrow -
Picture of Core developers Picture of Plugin developers Picture of Testers
Martin - Could you upload the plugin to MDL-11375 so that I can test it. Peace - Anthony
In reply to Anthony Borrow

Re: Big Problem - eaccelerator.class.php

by Martín Langhoff -