I can also make this problem occur by dropping the refresh rate in a chat room to 1 second but this is not the cause in this case. What can I do to fix it?
What platform are you running on??
However, I have used MySql in the past on enourmous, mission critical systems and it performed perfectly. I can't even think of a way that Moodle could be wiping out the database. Even with too many connections (and the default is 100) its you application that will lock up waiting for the db not the other way around.
Interesting that its not just me! Have you looked in you mysql error logs to see if there is any evidence?
To be honest I have seen this two or three times since upgrading to 1.2b, but please don't take that as any kind of proof because I just happen to have been doing a lot of work with the db since then. Nothing stops working, but doing a status sometimes shows that mysql has only been up for an hour or so when the system has been running for a year!!
I am running on RedHat 8.0 with mysql 3.23.54 which is the one the RedHat came packaged with. I have 4G of Ram twin processors, vast amounts of disk space. Should all be cool! My system is fairly lightly loaded, our biggest course has about 60 users, plus another few courses with around 15.
I am running the CVS version from a day-or-two after 1.2b came out, with the exception of my test system which is more-or-less current CVS.
version 1.1.1. had this problem under Windows
$CFG->dbpersist = false;
I did not notice any important differences in the speed of access to moodle pages when I did this change, but I did notice that if I left $CFG->dbpersist = true; (the default configuration), Moodle would hang the Apache web server, and also the whole Linux server!!!
So, trust me, I do believe that changing that default to false would produce a much better default performance. I have seen may posts here related to hosting companies complaining about moodle installations because of the "heavy use of database resources", that got solved by changing this simple line in config.php.
Martin, could you change the default config.php to $CFG->dbpersist = false; ? I think it would be a good improvement.
However, stability is obviously far more important, and after seeing many sites now I must say I haven't noticed any difference in performance either, so I completely agree with you, David.
I'll set the default to "false" in Moodle 1.2 to help new installations.
This issue has been around from before Moodle 1.1.1. It's a combination of apache & mysql, and how the database is being queried from Moodle.
Some symptoms of the same problem:
- The dreaded 'Error: Moodle could not connect to the database.' message. The server is busy (locked up) and it cannot reply to other database requests. Note: this error is not always related to this problem. It could happen for other reasons as well (eg. wrong info in your config.php).
- Users refreshing pages too many times at the same time. The server then becomes slow to respond or simply times out.
- Apache & mysql processes take 100% of the CPU time. Using command "top" (unix/linux), you can detect when apache & mysql processes are stuck, and the CPU usage goes to 100%, to the point that no other process can continue (not even login or shutdown!).
To replicate this problem:
- Flood the server with page requests (any Moodle page, for that matter) using continuous refresh (press continuously <ctrl><f5> in IE), and the server locks up.
We get this issue once or twice a week. It is a problem for us which we have not been able to solve yet.
Apache and mysql are resilient indeed. Some things that I've tried already:
- Flood with page requests another non-moodle site. I have another site (on the same server) using php & mysql. I flood that site with page requests (same process as Moodle), but this application does not get overloaded. It responds to all requests (and these are many!). So it has to be with one of the Moodle queries that makes apache & mysql get stuck.
- Database persistent connection. In config.php (Moodle), setting $CFG->dbpersist = false or true does not make a difference.
- Tuned Apache to start fewer servers. It has helped because now it takes only 5 min to recover (as opposed to 30 min before). But this is still not good.
We are interested in finding a solution, if anyone knows. Thanks in advance.
See related postings:
Continuous browser refresh overwhelms moodle-based site
Some symptoms of the same problem:
My local test server has about 20 various PHP/MySQL packages installed of various sizes. I tried most with them with the F5 trick and they all had exactly the same effect on the system load. It's possible the package you tried is only a very light user of PHP/MySQL.
I think mod_throttle might be the solution:
on my system, shows about 15 connections for mysql. Why? As far as I can see you only use one adodb object so surely that should make one connection to mysql, persistent or not.
Is this perhaps an issue with adodb? Is moodle running the latest/best version or are there any know problems with adodb?
Having said that I have never seen the symptoms of actually running out of connections. I know what that looks like (everthig just stops) from previous screw-ups!
That's not much use if it doesn't work for Apache 2
Anyway, I finding all this a bit much to accept. I have done loads of work on *very* big systems in my pre-academic life using Java, PHP, Delphi etc etc. And *never* had a problem remotely like this. I've occasionaly run over the mysql connection limit, but have never broken the thing!
| Id | User | Host | db | Command | Time | State | Info |
| 109 | ctiuser | localhost | cti | Sleep | 2196 | | NULL |
| 110 | ctiuser | localhost | cti | Sleep | 2251 | | NULL |
| 111 | ctiuser | localhost | cti | Sleep | 1474 | | NULL |
| 112 | ctiuser | localhost | cti | Sleep | 1173 | | NULL |
| 113 | ctiuser | localhost | cti | Sleep | 2074 | | NULL |
| 114 | moodleuser | localhost | medlaw | Sleep | 237 | | NULL |
| 115 | moodleuser | localhost | medlaw | Sleep | 255 | | NULL |
| 116 | moodleuser | localhost | medlaw | Sleep | 250 | | NULL |
| 117 | moodleuser | localhost | medlaw | Sleep | 280 | | NULL |
| 118 | moodleuser | localhost | medlaw | Sleep | 182 | | NULL |
| 119 | ctiuser | localhost | cti | Sleep | 572 | | NULL |
| 120 | ctiuser | localhost | cti | Sleep | 874 | | NULL |
| 121 | ctiuser | localhost | cti | Sleep | 274 | | NULL |
| 122 | moodleuser | localhost | medlaw | Sleep | 182 | | NULL |
| 123 | moodleuser | localhost | medlaw | Sleep | 161 | | NULL |
| 124 | ctiuser | localhost | cti | Sleep | 1773 | | NULL |
| 125 | moodleuser | localhost | medlaw | Sleep | 244 | | NULL |
| 126 | moodleuser | localhost | medlaw | Sleep | 244 | | NULL |
| 127 | ctiuser | localhost | cti | Sleep | 2250 | | NULL |
| 128 | moodleuser | localhost | medlaw | Sleep | 256 | | NULL |
| 129 | moodleuser | localhost | medlaw | Sleep | 250 | | NULL |
| 130 | ctiuser | localhost | cti | Sleep | 2259 | | NULL |
| 136 | root | localhost | NULL | Query | 0 | NULL | show processlist |
Each user is for a different moodle install. Those sleep times sem high, especially considering the time of day when I wouldn't expect anything to be particularly busy.
From my J2EE experience, we would normally sort something like this out with a database connection pool with a fixed number of connections per 'site'. I have never used adodb myself so only have a slight familiarity with it, so does it implement pools and if not what does it do?
Lots of questions
It's mostly only the use of F5 in a very-low-latency environment (like a LAN) that can cause this kind of overload. I bet any system would crumble under it.
Why are you using Apache 2? Is there something in it that you really need? If not then use Apache 1 with the rate limiting.
I'm not sure, but I don't think that the load on the server is, per se., the root of the problem, if there is a problem. I can guarantee that my system is not heavily loaded and I do occasionaly see mysql falling over.
Many people do of course use Moodle on intranets and people have reported having problems without artificialy increasing the refresh rate. Besides, I really do have trouble with the idea that refreshing the page makes a significant difference to overall system load and if it did I don't see how it would crash mysql in the way that we are seeing. However, I would be prepared to explore the possibility that this is surfacing some obscure problem in the underlying database code.
Regarding the Apache business; for one thing using rate limiting may indeed cure the symptoms but it doesn't really get to the heart of the problem. In any case Apache 2 is the latest version and increasingly is going to be the prefered choice, or the only supported choice, or a requirement of other software.
I'm a bit worried about adodb's implementation of persistent database connections. I can't find much in the way of documentation and so have had a quick look at the code. It seems to rely on session persistence to do this, and I can't find any reference to any provision to limit the number of persistent connections. Is anybody out there familiar with how adodb implements this feature?
Anyway, I have switch persistence off on all my systems and we'll see what happens.
- the flooding problem, which is definitely an overall load problem - I think I got it up to a load of 60 and everything was brought to a standstill. A long time later everything gets processed and the server does recover itself, like the original poster mentioned. This should be preventable by using mod_throttle to set up a policy like "each individual IP can only request 100 items within a ten second period.". As and when Apache 2 gains in popularity I'm sure mod_throttle will support it.
- the persistent connections problem, which is solved by turning persistent connections off. I've got no idea what causes this though or even how it works, all I know is that it was supposed to be a Good Thing, but doesn't seem to be.
If the adodb persistence relies on user session support and doesn't care about a connection limit, then these could really be the same problem. What I have seen is a mysql server with a lot of concurrent connections start to grind to a holt due to mutual table locking.
It would be very interesting to run your load test with somebody checking the mysql processlist at the same time. If I was a betting man, I would expect to see a huge list of connections with many/most of them in a wait state.
I have now turned off database persistenced and restarted my database. So far I have no open connections in Sleep state, which is what I would expect. Now all I have to do is wait and see if it stays up.
as I had no idea about the exact implementation of Persistent DB Connections in ADOdb, I've take a look to its web site.
Then I found this:
If he prefers to use non-persistent conections and he's the lib author...you know
My class still looses connection but I now think that the problem is between the classroom and the school web server. I think it's the school server that's not handling all the requests. If the Mysql database is failiing then it would take longer than a few minutes to restart itself. (I think) Also there are plenty of documented situations where Moodle is running in the midst of thousands of students and this problem is not occuring. Alternatively, perhaps there is an old switch still lurking somewhere in our system that needs updating.
Does this make sense?