New e-peen script! Er, I mean, system performance analyser

by sam marshall -
Number of replies: 12

Hi all!

Quite a few people liked the 'perspectives' script I wrote ages ago (it's in another thread in this forum). The script was intended to compare different types of action in PHP, so as to inform development decisions, and it does a sort of okay job of that. But people ended up using it for an informal evaluation of their Moodle hardware instead.

I don't think it's very good at that. :-)

The main problem is that evaluating performance from a single script isn't very useful with modern servers, which are designed to do more than one thing at a time. For example, suppose you have eight processor cores running at 1.5GHz, and somebody else has a processor of the exact same type but with only a single core running at 1.6GHz. The CPU result on the 'perspectives' script will be higher in the second case, but for a server workload the eight-core machine is clearly much better.
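To put rough numbers on that comparison, here is a back-of-envelope sketch. It assumes (unrealistically) that work done scales linearly with clock speed and that requests spread perfectly across cores; the function names are made up for illustration.

```python
def single_request_speed(clock_ghz: float) -> float:
    """What a one-at-a-time script sees: only one core's clock matters."""
    return clock_ghz


def total_capacity(cores: int, clock_ghz: float) -> float:
    """Idealised server capacity: every core busy with its own request."""
    return cores * clock_ghz


eight_core = {"cores": 8, "clock_ghz": 1.5}
one_core = {"cores": 1, "clock_ghz": 1.6}

# The single-script view prefers the 1.6GHz single-core machine...
single_core_looks_faster = (
    single_request_speed(one_core["clock_ghz"])
    > single_request_speed(eight_core["clock_ghz"])
)
# ...while the idealised capacity view prefers the eight-core machine.
capacity_ratio = total_capacity(**eight_core) / total_capacity(**one_core)

print(single_core_looks_faster, round(capacity_ratio, 2))
```

Under those assumptions the single-core machine 'wins' a one-at-a-time benchmark by a few percent, while the eight-core machine has about 7.5 times the capacity for parallel requests.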

Similarly, if you put the database onto a separate machine from your webserver, this will make the database requests in the perspectives script appear slower (due to the time taken by the network connection between the servers). In fact you've just increased capacity, because the time the webserver spends sitting waiting for the database server is time it can spend servicing another request - but the perspectives script doesn't reflect that.
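A small simulation can illustrate why waiting time is not lost capacity. This is only a sketch: `time.sleep` stands in for a request that spends its time waiting on a remote database, and the 50 ms latency and the function names are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DB_WAIT_S = 0.05  # hypothetical network + database latency per request


def handle_request(_):
    """Stand-in for a page request that mostly waits on a remote database."""
    time.sleep(DB_WAIT_S)  # the web server's CPU is idle during this wait
    return True


def requests_per_second(n_requests: int, workers: int) -> float:
    """Run n_requests using the given number of parallel workers."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(handle_request, range(n_requests)))
    return n_requests / (time.perf_counter() - start)


one_at_a_time = requests_per_second(10, workers=1)
ten_at_once = requests_per_second(10, workers=10)
print(f"one at a time: {one_at_a_time:.1f} req/s, ten at once: {ten_at_once:.1f} req/s")
```

Each individual request is just as slow in both cases, but the parallel run gets through far more requests per second because the waits overlap. A single-script benchmark only ever sees the first number.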

So! I decided to do something about it. In my own time, i.e. nobody's paying me for this. (Yes, I'm weird.) Anyway, I wrote a new load testing system that runs in your browser using a Java applet - because unlike PHP scripts, Java applets can easily do loads of things at once. This one makes up to 20 simultaneous requests to your server. Basically it's a convenient and visually interesting way to initiate a DoS attack. :-)

I also wrote a page for it to request, which does some CPU stuff, some memory stuff, some database stuff, and some file stuff. Should cover most things that Moodle does; the proportions probably aren't right at all and the detail isn't realistic, but it probably doesn't matter too much.
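For readers who want a feel for what such a mixed-workload page looks like, here is a rough analogue in Python. It is not the real test page (which is a PHP script inside Moodle); the function names, sizes and the use of an in-memory SQLite database are all assumptions made for this sketch.

```python
import hashlib
import sqlite3
import tempfile


def cpu_work(rounds: int = 2000) -> str:
    """Burn some CPU by hashing repeatedly."""
    digest = b"seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def memory_work(items: int = 50_000) -> int:
    """Allocate a list of strings and join them."""
    return len("".join(str(i) for i in range(items)))


def database_work(rows: int = 200) -> int:
    """Insert and count rows (in-memory SQLite stands in for the real database)."""
    db = sqlite3.connect(":memory:")
    try:
        db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
        db.executemany("INSERT INTO t (v) VALUES (?)", [(str(i),) for i in range(rows)])
        return db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        db.close()


def file_work(size: int = 64 * 1024) -> int:
    """Write a temporary file and read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.seek(0)
        return len(f.read())


def test_page() -> str:
    """Do a bit of everything, then emit a marker the load tester can look for."""
    cpu_work()
    memory_work()
    database_work()
    file_work()
    return "Finished OK"


print(test_page())
```

As with the real page, the proportions between the four kinds of work here are arbitrary.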

Finally I wrapped all this up into an admin report. Want to try it? Because this places heavy load on your server, if you run it on a live system you should do it in scheduled maintenance time when the system is not accessible to users. Also, by the way, there's a log entry for each request made by this system, so expect typically a couple of thousand extra rows in mdl_log. Hopefully that doesn't bother anyone.

To install, download the zip file ('Moodle load testing admin report') from my GitHub downloads page. Unzip it and stick the 'loadtest' folder in the admin/report folder of your Moodle 1.9 install. That's it; then just go look at it in a browser that doesn't suck (on my Mac, Safari sucks, dunno why, Firefox is fine) on a computer that has Java 6 installed.

Here's an example of the results. I ran it on the OU developer server; it would be interesting to run it on our live servers, but I probably can't do that... As you can hopefully see, the dev server scores around 7.94, or as we call it in the trade 'basically 8', requests per second. Woohoo.

Requests/s (attempted)  Requests/s (actual)  Median time  Successful
1                       1.04                 249 ms       100%
3                       2.92                 274 ms       100%
5                       4.99                 244 ms       100%
7                       6.96                 272 ms       100%
9                       8.09                 866 ms       100%
8                       7.94                 319 ms       100%
9                       8.08                 948 ms       98.3%
8.5                     7.76                 1282 ms      100%
8.25                    7.55                 554 ms       100%
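Why 7.94 rather than 8.09? The 9 requests/s run was 100% successful, but its median time jumped to 866 ms, so presumably it doesn't count as a pass. A sketch of that reasoning follows, assuming a run passes only if every request succeeded and the median time is no more than twice the first run's median. The factor of 2 is a guess picked for illustration, not the rule the report actually uses.

```python
# (attempted req/s, actual req/s, median ms, successful %) from the table above
RUNS = [
    (1, 1.04, 249, 100.0),
    (3, 2.92, 274, 100.0),
    (5, 4.99, 244, 100.0),
    (7, 6.96, 272, 100.0),
    (9, 8.09, 866, 100.0),
    (8, 7.94, 319, 100.0),
    (9, 8.08, 948, 98.3),
    (8.5, 7.76, 1282, 100.0),
    (8.25, 7.55, 554, 100.0),
]

SLOWDOWN_FACTOR = 2.0  # assumption: not taken from the real report


def passed(run, baseline_ms):
    """A run passes if nothing failed and the median time hasn't ballooned."""
    _, _, median_ms, successful_pct = run
    return successful_pct == 100.0 and median_ms <= SLOWDOWN_FACTOR * baseline_ms


def best_rate(runs):
    """Highest actual requests/s among passing runs; the first run is the baseline."""
    baseline_ms = runs[0][2]
    return max(run[1] for run in runs if passed(run, baseline_ms))


print(best_rate(RUNS))
```

With that assumed threshold, the runs at 8.09, 8.08, 7.76 and 7.55 requests/s are all rejected and 7.94 is the best passing score.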

As noted, this doesn't translate directly into real usage, but if somebody really does want to turn this report's number into a 'this server can handle x simultaneous users' lie statistic, it's probably a much better number to start with when constructing your lie statistic than the perspectives script's one.
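For anyone who does want to construct such a statistic, the usual back-of-envelope step is to multiply the request rate by the average gap between one user's requests. The 30-second gap below is a made-up figure, and the function name is hypothetical.

```python
def estimated_users(requests_per_second: float, seconds_between_requests: float) -> float:
    """Users the server could keep up with if each one makes a request every N seconds."""
    return requests_per_second * seconds_between_requests


# 7.94 requests/s from the dev server run, with an assumed 30 s between requests per user
print(round(estimated_users(7.94, 30)))
```

That gives a couple of hundred users for the dev server, and the answer moves in direct proportion to whatever gap you pick, which is why it's a lie statistic.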

Note the 'probably' though. I haven't been able to run it on the live system, like I say. If the live system (4 front-end servers, separate db server) doesn't get a better score than the dev server, then this script sucks too. :-) Dunno if I'll get to try it.

Post here if you find problems with this system, or if you like it, or you just want to compare scores. :-) Comparisons between test systems and known-faster live ones might be interesting too...

All the source is right there on that GitHub site so you can fork it or send me patches or stuff if you feel like it. It should be easy to convert the report to Moodle 2 as it really doesn't do much; if anyone wants to do so, that would be cool too.

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

Did anyone try this report yet? I'd be interested in comparisons, even if they're only against people's test platforms and not real hardware.

I've now run it on our acceptance test platform and have got very poor results (2.5 requests/second).

An individual run of the test is slightly slower (I might expect that - acct uses NFS for file storage, which is bound to be slower than local disk), but I would have expected parallel performance to be higher. What really causes the problem is that some requests seem to time out [the load test script abandons a request if it takes more than 10 seconds]. At present I'm not sure if this is a problem in the load tester, or a real problem in the system - the leading candidate for a real problem would probably be something to do with storing sessions in NFS.

As far as I'm aware the load tester only does the same thing that your browser would do if you requested that page (i.e. it makes an HTTP request, passing on the same cookies) so I don't see why it would fail but...

Would be interested to hear if others have the same problem, or if you achieve better results.

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by Pieterjan Heyse -

Yes, I have tried it, and it didn't work. Cannot figure out why, though.

When I call the test script manually (i.e. in a browser), I get Finished OK, so the script should be working. The Java applet loads fine and I get one pink box. But... the script stops after one run, telling me this:

 

Requests/s (attempted)  Requests/s (actual)  Median time  Successful
1                       1.05                 2 ms         0%
0                       90.91                2 ms         0%

 

Firefox also tells me a script is using too many resources and needs to be stopped. No candy for me :-(

In reply to Pieterjan Heyse

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

Thanks! This is a bug in the script:

1) The script thinks all the requests failed immediately ('successful 0%'). I don't know why this would happen.

2) After an unsuccessful run it tries to 'back off' from attempting that number of requests/second (currently 1), reducing it to, er, 0. I'm guessing this results in it trying to do a mental number of requests and then crashing the JavaScript, hence the Firefox error. This one I should be able to fix, will do that shortly.
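To make the second point concrete, here is a guess at the kind of step-and-back-off rule involved, reconstructed only from the attempted rates in the first results table in this thread. The real applet code may well differ: the starting step of 2, the halving, the minimum step of 0.25 and the `floor` parameter are all assumptions.

```python
def next_rate(rate, step, succeeded, floor=None, min_step=0.25):
    """Return (next attempted requests/s, next step size)."""
    if succeeded:
        return rate + step, step
    # On failure: halve the step and back off by the new step.
    step = max(step / 2, min_step)
    rate = rate - step
    if floor is not None:
        rate = max(rate, floor)  # never drop below the floor
    return rate, step


# Failing straight away at 1 request/s with the initial step of 2:
unguarded = next_rate(1, 2, succeeded=False)
guarded = next_rate(1, 2, succeeded=False, floor=1)
print(unguarded[0], guarded[0])

# Replaying the pass/fail pattern of the first results table:
outcomes = [True, True, True, True, False, True, False, False]
rate, step, attempted = 1, 2, [1]
for ok in outcomes:
    rate, step = next_rate(rate, step, ok, floor=1)
    attempted.append(rate)
print(attempted)
```

Without a floor, a failure at the very first rate backs off from 1 to 0 requests/s, which matches the '0' row in the failing output above. Clamping at 1 is one obvious way to guard against it.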

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

I'll put this on the GitHub download link this evening hopefully but I can't access that right now, so for the moment, here is a new version attached as a zip file.

This one:

- Fixes the bug about going down to 0 requests (will not go lower than 1).

- Changes it to not use your login. Instead the script is set to not require login, and the applet will not send any cookies, so it's getting a fresh session every time. This avoids performance problems caused by session management, i.e. the system doesn't appear to work well when you get many requests from the same session. As these are all different sessions it should be OK now.

- Tweaks the size of the file transfer to be small (this was unrealistically large; in reality most requests don't involve a file transfer at all).

- Relaxes the condition under which an increase in median time causes a run to be considered as failed.

I now get even faster performance from the dev server and... well... the numbers for the acct server are better... I want to rerun these this evening when nobody else is using the servers though.

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by Pieterjan Heyse -

Okay, thanks. Tried the new script; Firefox does not crash, but it is still stuck at 1 request. I will try it later today when I'm at home, maybe firewalls/proxies are bugging me here.

Is there a way to debug why the counter never gets higher than 1? Our setup is a bit complicated: 2 load balancers with 2 backend webservers, separate DB and file server. So maybe there's something this script does not understand there?

In reply to Pieterjan Heyse

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

Hi,

The script will not go past 1 if the success rate is less than 100%.

I think you said you were getting 0%. Just to explain: to work out the success rate, the applet looks for text like 'Finished test' or something in the result it gets back. A request counts as failed if either the response didn't have this text in it, or it didn't get a response within 10 seconds. Two things to try:

1) Make sure you can view the test script - even if you are not logged into anything (e.g. clear cookies in your browser, or open it without going anywhere, then go to the test.php url). The applet doesn't send any cookies so the script has to work when not logged in. (It should do but if you have any authentication systems on top of Moodle, maybe it won't.)

2) The applet will not work through a proxy - you need to be able to directly connect to the URL being tested.
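The success rule described above can be sketched as follows, which may help when debugging by hand. This is not the applet's code: the marker text is an assumption (the exact string isn't given in this thread), and the function names are made up. `urllib.request.urlopen` sends no cookies by default, which mimics the cookie-less requests.

```python
import time
import urllib.request

MARKER = "Finished OK"  # assumption: the exact text the applet looks for isn't stated here
TIMEOUT_S = 10.0        # requests slower than this count as failed


def is_success(body, elapsed_s, marker=MARKER, timeout_s=TIMEOUT_S):
    """A request succeeds only if the marker text came back within the time limit."""
    return body is not None and marker in body and elapsed_s <= timeout_s


def fetch_once(url, timeout_s=TIMEOUT_S):
    """One cookie-less GET; returns (body or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:  # covers URLError, HTTP error statuses and socket timeouts
        body = None
    return body, time.monotonic() - start


# A login page returned quickly still counts as a failure:
print(is_success("<html>Please log in</html>", 0.2))
print(is_success("<html>Finished OK</html>", 0.2))
```

Calling something like `is_success(*fetch_once(url))` with the address of the test page, from the same machine the browser runs on, would show whether an authentication layer or a proxy is getting in the way.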

Sorry :-(

Still interested if anyone else can get this to work... I'll try to run on our dev and acct systems again just before I go home today, when use is light, and post numbers for comparison with the latest scripts.

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

Here are the results on our systems of the current script:

Our acceptance test server

Requests/s (attempted)  Requests/s (actual)  Median time  Successful
1                       1.04                 218 ms       100%
3                       3.01                 172 ms       100%
5                       4.97                 312 ms       100%
7                       6.99                 172 ms       100%
9                       8.97                 203 ms       100%
11                      10.73                172 ms       95%
10                      9.64                 187 ms       100%
11                      10.77                171 ms       97.3%
10.5                    10.47                157 ms       97.1%
10.25                   10.22                172 ms       100%
10.5                    10.45                156 ms       100%
10.75                   10.7                 157 ms       100%
11                      10.97                156 ms       100%
11.25                   11.2                 172 ms       98.2%
11.25                   11.2                 172 ms       77.8%
11.25                   11.15                171 ms       72%

Our developer server

Requests/s (attempted)  Requests/s (actual)  Median time  Successful
1                       1.04                 297 ms       100%
3                       3.01                 281 ms       100%
5                       4.98                 266 ms       100%
7                       6.95                 266 ms       100%
9                       8.94                 266 ms       100%
11                      10.9                 266 ms       100%
13                      12.89                266 ms       100%
15                      14.85                266 ms       100%
17                      16.83                266 ms       98.8%
16                      15.83                266 ms       100%
17                      16.83                266 ms       100%
18                      17.81                266 ms       100%
19                      18.82                266 ms       100%
20                      19.81                266 ms       100%
21                      20.79                281 ms       99.5%
20.5                    19.25                266 ms       100%
21                      20.78                281 ms       100%
21.5                    21.27                281 ms       97.4%
21.25                   21.01                281 ms       99.3%
21.25                   21.03                266 ms       100%
21.5                    21.28                281 ms       100%
21.75                   21.53                281 ms       97.5%
21.75                   21.52                266 ms       92.7%
21.75                   21.53                281 ms       99.3%

I think the script probably should be changed not to do the fractional values (because really who cares) and maybe to go up by bigger numbers to start with. Other than that it seems to be sort of working.

The failure pattern for the acct run was basically that it was handling the load OK most of the time, but then something seems to get 'stuck' and a stack of requests piles up; these are eventually cleared, but by then some of them have timed out, so they count as fails.
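That pattern would also explain why the acct table can show a healthy median time next to a poor success rate: the median barely notices a minority of stuck requests. A sketch with made-up numbers follows. The `summarise` function is hypothetical, and whether the real report includes timed-out requests in its median isn't stated here.

```python
from statistics import median


def summarise(results, duration_s):
    """results: list of (elapsed_ms, ok) pairs collected over duration_s seconds."""
    times = [elapsed_ms for elapsed_ms, _ in results]
    ok_count = sum(1 for _, ok in results if ok)
    return {
        "actual_per_s": len(results) / duration_s,
        "median_ms": median(times),
        "successful_pct": 100.0 * ok_count / len(results),
    }


# Hypothetical run: 80 quick requests plus 20 that got stuck and timed out at 10 s.
results = [(170, True)] * 80 + [(10_000, False)] * 20
summary = summarise(results, duration_s=10)
print(summary)
```

Here a fifth of the requests time out, yet the median stays at the fast value, so the 'Successful' column is the one that gives the problem away.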

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by sakai user -

Sam -

This is pretty slick (once you find a browser that will play nice) :-)

I'm playing with different tweaks on our test install here such as:

  • memcache on or off.
  • various settings in apc.ini
  • php memory limit

Wondering though, this utility will never know the difference if you're using db sessions or shared storage, right?

Base run:

[Screenshot: initial run of e-peen]

Using remote memcache server and with apc.shm_size upped to 64:

[Screenshot: e-peen w/memcache]

This is on my test system:

  • appserver vmguest RHEL5 i386 2g mem
  • dbserver vmguest RHEL5 x86_64 8g mem
  • shared memory server vmguest RHEL5 2g mem 1.5 dedicated to memcached

-kevin

In reply to sakai user

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

thanks! interesting results.

As for sessions - the latest version of this script (...which I think I never did update on github? oh dear, sorry - the zip file posted in this discussion is latest) does not use cookies but it should still test session management to an extent because PHP will create a new session for each request. It will never read an existing session though.

(I think this is a better test than the previous version because it was unrealistic to send many requests at once from the 'same user'; this way, it's equivalent to many requests at once from different users, albeit that they're all the 'first request' from that user - so, still unrealistic, but in a different and less limiting way :-) )

--sam

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by sakai user -

I pulled the GitHub version.

I'm anxious to do more tests when I return from holiday.

I'll pull again if you manage to update GitHub by the time I return to work.

Also interested in another thread you're on re: sessions in db vs. filesystem. Will find the thread and follow up on that particular thought :-)

-kevin

In reply to sakai user

Re: New e-peen script! Er, I mean, system performance analyser

by sam marshall -

I'm unlikely to update it until after I get back (and I have two weeks off) so it may be a while - I'll post here once I've sorted it out.

In reply to sam marshall

Re: New e-peen script! Er, I mean, system performance analyser

by AL Rachels -

I've been having fun playing with both versions trying to improve my Moodle classroom performance (my son hates it because I'm creating too much network traffic here at home so that he can't play WOW :-) ). The one on GitHub right now shows my server can handle 5.48 actual requests/s versus 3.02 shown by the one listed up above.

This is on an ASUS Mobo with Intel Core 2 Duo at 2.6 GHz with 8GB DDR2 mem. From just watching the server across the room, I think a limiting factor right at the moment is that my server is using one SATA drive. It shows constant activity as long as the test is running. :-/

I think I will try moving to a used 1U server I just acquired last week...SCSI should improve things I think. I will even try it on my little Zotac backup server and see what it can handle.

Both tests are using Moodle internal cache, Extra PHP mem 512M, Int cache max 100, statistics off, mysql max_allowed_packet=4M, innodb_buffer_pool_size=4GB.

Version from GitHub:

[Screenshot: v1]

Version from message above:

[Screenshot: v2]