Apache CPU usage shoots to >= 100%. Why?

by Brad Smith -
Number of replies: 12

This happens pretty regularly, but if there's a pattern to it I'm not sure I see it (I think it often happens the first time I've accessed the site in a while, but I'm not certain about that). 

Even with only one person trying to access the site, it will grind to a complete halt for a minute or two, and then become perfectly normal. I've set up a script that runs iostat and vmstat whenever it detects the load average going above 1, and there doesn't appear to be any swapping going on. This, plus the fact that performance is fine when it's not completely stopped, leads me to think it's a software problem rather than a hardware problem. 

I do have some custom modules included, but they're things like a custom authentication module that would be a problem to just disable, so what I would really like is something that can profile moodle and report back which modules are responsible for the most work. Does such a thing exist? Any other advice for how to deal with this?

It may be worth noting that I'm running an old version (2.2) on a shared VM (though I don't think another VM could cause the CPU usage to skyrocket on mine?), and am in the process of upgrading and moving to a dedicated VM, but that's going to take some time, and I'd really like to figure out what's going on here. 

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Visvanath Ratnaweera -
A couple of thoughts:

- Moodle is a heavy application. It is quite possible that just a single user can overload a shared server. Of course, I can't be sure unless you post the specifications of your shared server.

- Also, depending on how the shared system is built, the load on another shared server could show up in your vmstat. But if you see that it is always a single Apache process, that must be your Moodle. Can you reproduce this peak by repeating some steps in your Moodle?

- Is there a difference whether the user is 'admin' or a normal user?

- What kind of caching, especially PHP caching, is active? Can you reproduce the load by clearing a (the) cache?
In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Howard Miller -

VMs are not all equal. Some have dire performance, particularly in respect of disk IO speed. 

Is the performance noticeably worse logging in as admin compared to, say, a student in a course? If so, you might suspect that you have caching problems, as Moodle is trying to cache to slow disk. 

In any case, you need to tell us a bit more about what resources you have given Moodle in your VM. 

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Brad Smith -

Thanks very much for the replies!

I think the most interesting/promising point raised here is about admin accounts affecting performance. My account is set up as a site admin because, at the time, I thought "why not"? Perhaps I'm finding that out now. Could someone please say more about exactly how admin privs could cause this problem? If I remove the Site Admin role from my main account, but leave myself with full privileges otherwise, should that work if this is the cause? 

If so, are there best practices around this? For example, do you just do everything that requires admin privileges outside of normal usage hours?


Just in case it's something else, I'll also do my best to provide the rest of the information you requested.  


Unfortunately I don't know what hypervisor is used for the VM, but I can ask. 


/proc/cpuinfo shows two 3.4 GHz Xeons


free shows 2053 / 5984M of RAM free, and 3071 / 3071M of swap free.


Visvanath, you asked, "What kind of caching, specially PHP caching, are active? Can you reproduce the load by clearing a (the) cache?"

To be honest, I'm not sure what PHP caching is active. What should I be checking for? Here's this, if it helps:

"""
$ grep '^[^;].*cache' /etc/php5/apache2/php.ini
pdo_mysql.cache_size = 2000
mysql.cache_size = 2000
mysqli.cache_size = 2000
session.cache_limiter = nocache
session.cache_expire = 180
soap.wsdl_cache_enabled=1
soap.wsdl_cache_dir="/tmp"
soap.wsdl_cache_ttl=86400
soap.wsdl_cache_limit = 5
"""

One of the reasons I'm working on upgrading is because I've read about opcache helping with newer versions, but that's still a work in progress.


When you say "clearing the cache", do you mean the "purge all caches" link in Moodle? If so, I have done that and it doesn't trigger the massive slowdown I'm talking about. 




In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Howard Miller -

Asking about the admin account was just checking a symptom rather than suggesting a solution. The admin account (and any other site-based role) has to create the site-administration menu structure every time you load a page. If loading all those settings pages and their associated resources is considerably slower than for a "normal" account, it can be a useful indicator. 

Can you go to site-administration > Plugins > Caching > Test performance

Tell me what numbers you get for the File cache. The top "Application cache" will do. 

In reply to Howard Miller

Re: Apache CPU usage shoots to >= 100%. Why?

by Brad Smith -

Hmm... I'm afraid I don't seem to have that option. Maybe that wasn't yet introduced in 2.2? 

The closest thing I could find was an option to turn on printing of performance stats in the footer of each page. I've enabled that, and here's what I see right now:

1.834125 secs
RAM: 95.4MB
RAM peak: 95.6MB
Included 874 files
Contexts for which filters were loaded: 0
Filters created: 0
Pieces of content filtered: 0
Strings filtered: 0
get_string calls: 8689
strings mem cache hits: 8472
strings disk cache hits: 244
Included YUI modules: 54
Other JavaScript modules: 2
DB reads/writes: 850/3
ticks: 183 
user: 111 
sys: 34 
cuser: 0 
csys: 0
Load average: 0.92
Session: 3.1KB

As you can see from the load average near the end, the server is just coming down from one of its "fits" right now. A couple of minutes earlier, this is what I saw in top:
top - 20:56:51 up 44 days, 17:17,  3 users,  load average: 2.82, 3.50, 5.02
Tasks: 118 total,   3 running, 115 sleeping,   0 stopped,   0 zombie
Cpu(s): 56.3%us, 37.5%sy,  0.0%ni,  4.6%id,  0.0%wa,  0.0%hi,  1.5%si,  0.0%st
Mem:   6127908k total,  4349024k used,  1778884k free,   396688k buffers
Swap:  3145720k total,        0k used,  3145720k free,  3011128k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16014 www-data  20   0  318m  77m 6272 R  100  1.3   0:15.27 apache2
 9634 www-data  20   0  287m  46m 6296 S   59  0.8   2:50.46 apache2
 9097 www-data  20   0  321m  82m 8592 R   37  1.4   1:51.53 apache2
Any thoughts?

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Ken Task -

Maybe Apache config ... I see www-data, which suggests an Ubuntu/Debian flavor, but what about the Apache config? Is PHP running as a module, as FastCGI, or something else? Your top shows only 3 apache2 processes.

The following is for CentOS/RHEL flavors, but you should be able to do the same if PHP is running as a module:

/usr/sbin/httpd -t -D DUMP_MODULES

What does that show?

Also, in config for apache ...  prefork MPM and/or worker MPM values/variables?

As far as a VM box and it affecting an Apache-run Moodle ... yep, that can happen. Like you said, however, it won't show in top or anything you do in the virtual OS, but if there is another VM guest that has been given the ability to use all the resources of that system, Apache, and thus Moodle, would appear to become unresponsive. Not crash ... but no one can talk to it.

Does apache crash?  Can you find any core dump files?

'spirit of 'thinkering', Ken

In reply to Ken Task

Re: Apache CPU usage shoots to >= 100%. Why?

by Brad Smith -

Below I'll attempt to answer all the questions that have been asked...


What version of $SOFTWARE are you using?

Ubuntu 10.04

Apache 2.2.14 

PHP 5.3.2

MySQL 14.14

Moodle 2.2.9

Yes, it's pretty old. Hopefully this whole thing will be resolved once I get the new server up and running, but I've still got some obstacles to overcome before that can happen, and in the mean time I've got new users, and hopefully new content creators, on whom I'd like to make a not-terrible impression, so thanks again for everyone's help!


Is PHP running as module or CGI?

Module. At least, I infer that from apache's config:

LoadModule php5_module /usr/lib/apache2/modules/libphp5.so

One of the annoying things about this situation is I don't actually have root on the server, so when I try the -D DUMP_MODULES trick you mention it tries to access a directory of SSL keys that I can't read and fails.


Is Apache using prefork, or worker?

Prefork.


Your top only shows 3 Apache processes...

This is a very lightly used server (but when it is used, it's important). Also, top was sorting by CPU usage, so there could have been less busy processes further down. 


Are your users performing backup/restores?

No. I'm the only admin / course manager on the site


Is moodle's cron job running?

This was one of my first suspects, so I wrote a wrapper around Moodle's cron that watches CPU usage and reports more detailed information if the cron job takes > 1 second to execute and the load average was over 1 as it executed. There is obviously some overlap, since cron runs every 20 minutes (which I'm led to believe is about right?), but there seem to be lots of times when the cron job starts in the middle of a spike, or a spike happens when the cron job isn't running. Plus, top always shows Apache using all the CPU, not the cron script. 
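
For anyone who wants to try the same kind of check, here is a minimal Python sketch of the idea. It is not the actual wrapper I used; the function names and default thresholds are hypothetical, and the command to wrap is whatever you already have in your crontab.

```python
import os
import subprocess
import time


def is_suspicious(elapsed, load1, max_seconds=1.0, max_load=1.0):
    """True when the job was slow AND the 1-minute load average was high."""
    return elapsed > max_seconds and load1 > max_load


def run_and_check(cmd, max_seconds=1.0, max_load=1.0):
    """Run cmd (a list of arguments), then return (elapsed, load1, suspicious)."""
    start = time.monotonic()
    subprocess.run(cmd, check=False)  # the wrapped job's own exit code is ignored
    elapsed = time.monotonic() - start
    load1 = os.getloadavg()[0]  # 1-minute load average (Unix only)
    return elapsed, load1, is_suspicious(elapsed, load1, max_seconds, max_load)
```

You would call run_and_check() with your existing cron command as a list of arguments and only gather extra diagnostics when the third return value is true. Requiring both conditions keeps it quiet when the job is merely slow on an idle box.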


Is memcached installed?

Apparently not. I'll file a ticket to get that done and see if it helps. 



Thanks again for everyone's help! I'll report back once I get memcached and/or the updated server running. In the mean-time, if anyone has more ideas, please let me know! 

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Visvanath Ratnaweera -
Without superuser privileges your radius of action is highly limited. Did you know that in Ubuntu you don't need to be 'root'? By running "sudo -i" you can get a root shell, provided that sudo is configured that way (check /etc/sudoers).
In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by James McLean -

Are your users performing course backups/restores at all? We've seen these spike a CPU and can last for anything up to 10 minutes for a large course.

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Howard Miller -

Oh sorry - 2.2. This cache stuff doesn't apply to you in that case.

In reply to Brad Smith

Re: Apache CPU usage shoots to >= 100%. Why?

by Brad Smith -

In case anyone is curious, I finally found the problem! I've also included a script I wrote that helped me find it.

I did some extra research (largely thanks to this thread!) and found that basically I'd been reading iostat all wrong. I'd just been looking at r/s, w/s, and await and trying to figure out what values seemed reasonable. The trick, as most of you probably know, is to watch for differences between await and svctm (as explained here). Once I started looking for it, I started noticing differences of literally 100x!
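
To make the await vs. svctm comparison concrete, here is a rough Python sketch that computes the ratio per device from one `iostat -x` report. The sample numbers below are made up for illustration (they are not from my logs), the function name is hypothetical, and the parser assumes the older column layout that has both an await and a svctm column. I believe newer sysstat releases no longer print svctm, in which case this simply returns nothing.

```python
# Made-up sample in the older `iostat -x` layout; not real measurements.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 12.00 0.50 40.00 4.00 416.00 10.37 35.20 850.00 8.50 34.43
sdb 0.00 1.00 2.00 3.00 16.00 32.00 9.60 0.02 4.00 2.00 1.00
"""


def await_svctm_ratios(iostat_text):
    """Map device name -> await/svctm ratio for each device row found."""
    ratios = {}
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            columns = fields  # remember the header so we can find columns by name
            continue
        if not columns or "await" not in columns or "svctm" not in columns:
            continue
        if len(fields) != len(columns):
            continue  # skip rows that do not match the header
        wait = float(fields[columns.index("await")])
        service = float(fields[columns.index("svctm")])
        if service > 0:
            ratios[fields[0]] = wait / service
    return ratios


for device, ratio in await_svctm_ratios(SAMPLE).items():
    print(device, round(ratio, 1))
```

A ratio near 1 means requests are serviced about as fast as they arrive. A ratio in the tens or hundreds means requests spend most of their time queued, which is the pattern I was seeing.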

As an experiment, I wrote 500M to the disk on another server I control. It took about 8 seconds without the CPU breaking a sweat, as expected. On the moodle server it took 8 MINUTES, with the CPU pegged the entire time. My IT folks didn't say exactly what the issue was, but my guess is the VM was attached to some kind of storage array that was optimized for read-heavy applications and pretty crap at anything else.
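
If you want to run a comparable write test yourself, something like the following Python sketch would do. This is not necessarily how I ran mine; the function name and chunk size are arbitrary choices, and it assumes you have write access to the directory you point it at.

```python
import os
import tempfile
import time


def timed_write(directory, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of zeros to a temp file in directory; return seconds taken."""
    chunk = b"\0" * chunk_bytes
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        start = time.monotonic()
        remaining = total_bytes
        while remaining > 0:
            n = min(remaining, chunk_bytes)
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before stopping the clock
        elapsed = time.monotonic() - start
    # The temp file is deleted automatically when the with-block exits.
    return elapsed
```

For example, timed_write("/var/tmp", 500 * 1024 * 1024) writes roughly 500M. The fsync matters: without it you may only be timing how fast the page cache accepts data, not the storage behind it.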

Anyway, I sent them some iostat logs taken during the test and they immediately moved me to a new storage back-end, where the same test now completes in < 1 second, and I haven't seen the slightest slowdown from Moodle in days. Success!

Thanks for all the ideas, folks. Even when they didn't turn out to be the problem, they at least got me thinking about it.

In case it helps anyone out, here's a script I wrote to run as a regular cron job. It checks for a load average > .95, and if it sees one generates a report on the state of the system. If I'd known what to look for, the data this was sending me would have solved my mystery long ago!

https://github.com/usernamenumber/sysadmisc/blob/master/chkload.sh
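
For those who would rather not read shell, the general shape of that approach can be sketched in Python as below. This is an illustration only and not a port of chkload.sh; the function names are hypothetical, and the commands listed in the usage note are just common choices that may not be installed on your system.

```python
import os
import subprocess


def load_is_high(threshold=0.95):
    """True when the 1-minute load average exceeds threshold (Unix only)."""
    return os.getloadavg()[0] > threshold


def collect_report(commands):
    """Run each command and return its output, keyed by the command line."""
    report = {}
    for cmd in commands:
        key = " ".join(cmd)
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
            report[key] = result.stdout
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung tool should not stop the rest of the report.
            report[key] = "failed: %s" % exc
    return report
```

From cron you would call collect_report() only when load_is_high() returns true, passing something like [["vmstat", "1", "5"], ["iostat", "-x", "1", "5"], ["top", "-b", "-n", "1"]], and then mail or log the result.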

Average of ratings: Useful (3)