Hope someone can help with some performance issues we're having. I've tried to summarise everything below, but if I've missed anything I apologise - I'm learning all this as I go along!
We've got a mid-size Moodle installation at our Further Education college - just 4000 users. However, due to an upstairs managerial decision, Moodle has now become the default homepage for all our staff and students.
We're using the built-in NTLM functionality to automatically log people in to Moodle. Combine this with the default homepage thing and we're getting quite a load on the Moodle server!
Before Moodle became the default homepage for everyone it ran perfectly fine, on exactly the same hardware setup. The only thing that has changed is the use of NTLM auto login and the addition of some custom code on the site frontpage. The custom code doesn't make any database calls, only retrieving and displaying an XML feed of student data. We've tried removing it temporarily, to eliminate it as a cause, but no performance difference ensues.
Moodle version is 1.9.2+ (updated from Catalyst's Git repository) and is running on top of Ubuntu 8.04, Apache 2.2.8, MySQL 5.0.51 and PHP 5.2.4
The software resides on a VMware virtual machine, setup with a 500gb (virtual) drive, 4gb of memory and dual-core 1.6ghz Intel Xeon CPU.
The main problem is that the system slows to a crawl at peak times of the day, when all students are opening Internet Explorer windows and therefore logging into Moodle. As a result we see around 30-50 concurrent Moodle users at any one time during the teaching day.
The scariest symptom of this is that both the CPUs are running at 100%, with top reporting MySQL as using the vast majority. The memory is just fine, only using a few hundred Mb of the 4Gb available. Apache is behaving just fine.
The system is essentially unusable during peak times, with pages timing out and people unable to login.
The attempted fixes
I'm using a mix of my-huge.cnf and the suggestions from the Moodle performance docs page.
Could it be because we're using dbsessions? I'd like to disable it to see if we get a performance increase, but this seems to disable the NTLM authentication, forcing everyone back to logging in manually. The mdl_session2 table is currently at 975Mb with 57000 rows. We've trimmed the log files to 30 days, back from 365. We've installed eaccelerator. UPDATE: I've just tried the suggestions from this thread and got some marginal performance improvements and the session queries have at least disappeared from the MySQL slow queries log.
Any suggestions are welcome, as I've run out of ideas. And coffee.
Performance stats from footer
15.030725 secs RAM: 8.3MB RAM peak: 8.4MB Included 213 files ticks: 1503 user: 85 sys: 54 cuser: 0 csys: 0 Load average: 5.39 Record cache hit/miss ratio : 0/0
key_buffer = 384M
max_allowed_packet = 8M
table_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8
max_connections = 5000
wait_timeout = 300
I'm not familiar with the NTLM auto-login feature so not sure how that is releated. But here are some general tools/suggestions that might help you on the road to finding the issue. Traditionally even when you think mysql is CPU bound its actually memory or diskbound. Try running theses tools from the CLI when you are running hot:
vmstat 1 10
iostats -d -x 1 10
See if there are any anomalies here compared to when the system is stable.
Your 5000 connections with the stats you provide would allow your system to theoretically allocate more than the 4gigs you've given it.
Connect to your database using mysql administrator. Look at the catalogs and take a look at the index_length of your tables. If their combined length is more than 384M then your indexes are being swapped to disk. This is very bad for myISAM tables as you are going to disk to look up the index and then going to disk to get the data.
I'd also take a look at the slow query log. See if there are a lot of slow queries. Run an explain on that query and look to see about adding an index on it based on the results.
Also run from the mysql CLI: show global status;
Check the following values:
ratio: Table_locks_immediate vs. table_locks_waited (immediate should be much higher)
ratio: qcach_hits vs. qcach_not_cached (Hits should be >60%)
ratio: qcach_lowmem_prunes vs. qucachinserts
max_used_connections. (use to determine how many connections you actually need. Always buffer this number with some)
All of this information might point you in the right direction for what is wrong with the NTLM module or what specifically is causing your mysql to run hot.
Best of luck.
(P.S. I'm not a DBA, the above has just been what knowledge I've gleened in the past couple years working on moodle.)
We are experiencing problems that overlap in some ways, but not in others. I really am not a DB Admin or Apache or MySQL expert, so some of the tests and results you've posted and how to get my own stats to compare is above me.
Here's the situation though - we are a further education college with approx 4000 users of Moodle. We've had the same hardware and Moodle as the students homepage for a year and a bit now, and this summer we've enabled the automatic NTLM login. The server is a Quad 1.6GHz with 8GB RAM, Windows 2003 RC2 and recently SP2, Apache 2.2.2 and PHP 5.1.6.
This term we've been having lots and lots of server crashes. Many times the server will have an emalloc or sometimes an erralloc error and restart. For the most part this is a minor nuisance and it recovers by itself. Less often we get Just-In-Time Debugger pop-ups on the server which need clearing before service will resume.
Now the (possibly) good news - I found a very old (2004) PHP bug report that mentions these errors and also the "(OS 64)The specified network name is no longer available. : winnt_accept: Asynchronous AcceptEx failed" errors we get because DisableAcceptEx causes restarts every minute or so!
I thought it was worth posting to see if any other of my symptoms match yours. Here's the bug report - https://issues.apache.org/bugzilla/show_bug.cgi?id=34664