I now have a 1.9.15 and a 2.2.1 on the same server, and from reading the code and seeing how version 2 has been restructured, I was expecting it to run faster than 1.9... but in fact it feels like it is running slower. I don't have any real clock numbers, but just judging from doing X in 1.9 and then doing the same X in 2.2, the 2.2 seems to be taking more than twice the time (X in this case being: sending me a quiz page to answer, or grading a quiz page). I was impressed by how 2.2 had reduced the size of the includes in the code - I was/am assuming this was done so as to only "compile" the functions that would be needed, vs including a monster library and only using a few of its functions. So, my question is: is there something at the core of 2.2 that is taking more cycles, and is it offsetting any gains from the smaller, more specific includes?
many thanks - greg
Many thanks for the reply. Is it your impression that these db references are reads, ie, the cost/time of them can be somewhat mitigated by memcaching, or are they writes that just have to happen, ie, 2.2 is simply a more sophisticated CMS and there is no free lunch?
They are reads. Under debugging options, turn on 'performance info' and you'll see the counts. (This appears to everyone, so be careful if you're on a live system. Also, don't test as admin; the numbers for admins are bigger than for normal students, and the latter are rather more important.)
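If it helps, I believe you can also get the extra detail from config.php rather than the admin UI - these defines are documented in config-dist.php (double-check the exact names for your release):

    // Hedged sketch: add near the top of config.php, before setup.php is included.
    // (Check config-dist.php for the exact define names in your Moodle version.)
    define('MDL_PERF', true);       // collect detailed performance data per page
    define('MDL_PERFDB', true);     // include database query counts/timings
    define('MDL_PERFTOFOOT', true); // print the results in the page footer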
A memcache option isn't built into Moodle 2, but yes, an approach like that could mitigate the problem. A longer-term solution would be a focus on query-count performance (analysing all the queries made for each page and how they can be reduced/combined) in a future Moodle 2.x version.
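To show the sort of mitigation I mean, here is a rough sketch using the stock PHP Memcached extension around an ordinary Moodle read - nothing like this is built in, and the key name, TTL and lookup are made up for illustration:

    // Plain PHP Memcached extension (assumed installed); $DB is the normal
    // Moodle 2.x database object available inside Moodle code.
    $cache = new Memcached();
    $cache->addServer('localhost', 11211);

    $key = 'course_format_3';                 // hypothetical cache key
    $format = $cache->get($key);
    if ($format === false) {                  // cache miss: fall back to the DB
        $format = $DB->get_field('course', 'format', array('id' => 3));
        $cache->set($key, $format, 300);      // remember the read for 5 minutes
    }

That only helps for reads against fairly static data, of course; writes still have to hit the database.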
Basically there are some fundamental reasons why it's slower (blocks on every page; filter settings; fancier theme system; etc) but I'm sure there are also cases where, e.g., it does something using 3 separate queries that would logically make more sense as a single query, or where it makes the same (or essentially similar) query more than once in a single page request.
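Purely as an illustration of that pattern (hypothetical page code, not anything real in the source):

    // Two separate round trips to the database...
    $quiz   = $DB->get_record('quiz', array('id' => $quizid));
    $course = $DB->get_record('course', array('id' => $quiz->course));

    // ...where one combined query would usually do:
    $row = $DB->get_record_sql("
        SELECT q.*, c.fullname AS coursename
          FROM {quiz} q
          JOIN {course} c ON c.id = q.course
         WHERE q.id = ?", array($quizid));

Multiply that by the blocks and filters on every page and the count adds up fast.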
Many thanks for the reply. And I'll take a look at the counts. ... But at many levels I'm not offended by the greater resource demands - clearly 2.x is a superior piece of code and a more function rich CMS - there are no free lunches. But on the other hand, being an almost half-century computer type, one always likes to understand what's happening under the covers. If most of the load is DB related and most of that is reads against static or semi-static parts of the db, then, yes, throw memory at it, turn on the caching, and see if the slowdown doesn't go away. [I know from personal experience with a small 1.9 server and enough memory (16GB): between APC and memcache, plus giving MySQL enough memory to drown in, 99.999% of the db activity was served from memory, and that allowed for 40ms typical response times 99.999% of the time.]
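By "giving MySQL enough memory to drown in" I mean my.cnf settings along these lines (the numbers are purely illustrative for a 16GB box, not tuned recommendations):

    [mysqld]
    # Illustrative sizes only - the point is to keep the working set in RAM.
    innodb_buffer_pool_size = 8G     # InnoDB data + index cache
    key_buffer_size         = 512M   # MyISAM index cache, if any MyISAM tables remain
    query_cache_size        = 64M    # cache repeated identical SELECTs (MySQL 5.x)
    tmp_table_size          = 256M   # let implicit temporary tables stay in memory
    max_heap_table_size     = 256M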
W/re careful analysis of the code etc. - in some ways I would rather not go down that path. In the past I worked on systems that were so tightly tuned (e.g. manned spaceflight) that (a) they were truly impressive in how optimized they were, but (b) at the same time were so fragile that one could change just a few lines of code and increase the CPU, disk or drum loads by 200%, making the system unmaintainable. Personally, given the almost free cost of hardware and cycles nowadays, I would be more inclined to (intelligently) throw hardware at the problem and benefit from the added function and at the same time a more forgiving nature of the code and system. Apollo ran on 4MHz, 4MB, $4M mainframes (8 of them), where the lifetime cost per line of code was approx $20 (8M lines of code, 1970 dollars), and the system was insanely fragile. Today I can throw 64GB at the problem for less than $600 [and more if needed - I'm not offended by 256GB servers] and move hopefully most of the activity to RAM, and what I can't I can page to RAID-10 SSDs... and have a malleable, flexible, affordable, high-performance system, with pennies per line of code costs.
Following up on my own post: then it sounds like the biggest mistake in trying to speed up a slow system would be to move the db onto a separate server - even gigabit links are orders of magnitude slower than intra-memory transfers. The correct approach would be to go with many processors, lots of memory and multiple gigabit net connections to the outside world, but all on a single box (with failover capability to others), ie, process a transaction in very sub-millisecond time and be done with it.
Interesting view. Some clarifications though:
> clearly 2.x is a superior piece of code and a more function rich CMS - there are no free lunches.
This has been the subject of a continuing debate since 2.0 was released. This one, "Upgrading from Moodle 1.9 to 2.0" http://moodle.org/mod/forum/discuss.php?d=194572, is only a couple of days old.
> In the past I worked on systems that were so tightly tuned (e.g. manned spaceflight) that (a) they were truly impressive in how optimized they were, but (b) at the same time were so fragile that one could change just a few lines of code and increase the CPU, disk or drum loads by 200%, making the system unmaintainable.
How does the Linux kernel or the Moodle core fare against those? Or are the times of "A Plea for Lean Software" http://www.duke.edu/~afd3/cps108/leansoftware.html (PDF: http://cr.yp.to/bib/1995/wirth.pdf) finally over?
> a more forgiving nature of the code and system
Could you please explain?
> [and more if needed - I'm not offended by 256GB servers] and move hopefully most of the activity to RAM, and what I can't I can page to RAID-10 SSDs... and have a malleable, flexible, affordable, high-performance system, with pennies per line of code costs.
Considering what ultimately arrives at the user's end, do you believe that such an armada is warranted?
> and multiple gigabit net connections to the outside world, but on a single box ...
Agree on multiple Gbit peering from the core switch upwards. But how about the link from the network card to the switch? Are you talking about the Linux bonding driver?
It looks like I stirred up a hornet's nest here...
W/re rich function vs not - I think there's very much a place for both (and let the marketplace sort it out). Clearly Google has done well with the simple, lean-and-mean approach... but clearly there are more feature-rich products which are also doing well. I think the Moodle developers are in a pretty sweet position: those that don't want the richer functions can stay with 1.9, probably for years and years, and not be crippled in their teaching; conversely, for those that want more features/function, there will be 2.x. Clearly bloatware is a liability, and that's one of the dangers of rich features, ie, the size and resource demands must not scale faster than the function set. MS with some of its mid-generation Windows clearly added bloat much faster than usable features... and the users noticed (shall we say).
W/re optimization - I have never seen anything even approaching the level of packing and cycle optimization that was in the Apollo code, and having said that, I never hope to again. The *ix kernel is orders of magnitude removed from that level of optimization. I could probably do a multi-hour talk on what we had to do to make the thing work/fit. [Ever written 12M lines of assembly code with an instruction execution timing chart at hand? Let's see: an add fullword (memory to register) runs 1.5usec but uses up an extra 2 bytes of data memory, whereas an add halfword runs 1.9usec and doesn't use those 2 bytes... so a visit to the costing committee to see which we have less of in that overlay, cycles or memory, then choose.]
W/re forgiving code: if the code is perfect, and I literally mean that, then one can strip out all the recovery code, all the value checking, everything that isn't directly related to getting three men in a can home safely. But when one did get a core dump (yes, core), generally there was so little left that one couldn't debug it. Forgiving code is exactly that - the user can do something unexpected, and it doesn't crash and burn, but does something useful/meaningful about it. I think users today are much less tolerant of blue-screen system lockups. That comes at a cost.
W/re the armada - we're not talking about that much of a computer, really, more like $2500 +/- (though not big-name built, but one that one puts together oneself, ie, no 400% profit margins for Dell or HP etc). And yes, the same bits hopefully will arrive no matter what... but user time is also valuable, beyond instructor time, beyond IT support people's time. But beyond that: by dispatching a transaction as quickly as possible, there is a much higher probability of the hw cache working in one's favor - if the transaction is spread out over time, then the cache will probably contain bits and pieces of multiple transactions, meaning one is running from RAM, which for most every modern processor is a huge bottleneck. If one can run from cache, avoiding processor stalls, one can get by with 1/5th the raw processing capability. I think most people designing high-performance transaction processing systems learned this long ago: the longer the period of time over which a sequence of instructions is spread out, the longer (time^2) it takes. And given the instruction path lengths within Moodle (at least 1.9), even a 3.0GHz proc (maybe 4 cores) allowed to run from cache delivers very impressive performance.
W/re parallel NICs - two approaches: one, yes, bonded, aka shotgun; the other is to have more than one T3 coming in from a backbone. In either case, have one or two cores aggregating incoming traffic and feeding the other 6-14 cores.
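For the bonded case, something like this Debian-style /etc/network/interfaces sketch is what I have in mind (interface names and addresses are made up; 802.3ad/LACP mode needs switch support):

    auto bond0
    iface bond0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bond-slaves eth0 eth1     # the two physical NICs being aggregated
        bond-mode 802.3ad         # LACP; mode 0 (balance-rr) is the simpler "shotgun"
        bond-miimon 100           # link-check interval in ms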
Whether DB on a separate server is a good idea or not depends on whether you are after single-user performance, or scalability.
If you want single-user performance, it is clearly a bad idea.
If you want good performance for many concurrent users, you probably want one separate DB server, and about 6 load-balanced web-servers sharing it.
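Roughly like this (an nginx front end is one way to do the balancing; hostnames here are placeholders, and the web servers all need to share moodledata on something like NFS):

    upstream moodle_web {
        server web1.example.edu;
        server web2.example.edu;
        # ...and so on, up to however many front ends you need
    }
    server {
        listen 80;
        server_name moodle.example.edu;
        location / {
            proxy_pass http://moodle_web;
            proxy_set_header Host $host;
        }
    }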
To expand on Tim's comment: in the world of the infinitely-fast server, which I think Greg was positing, there is never any need for more than one server in order to achieve performance, so a single server is definitely the best option.
In the real world we don't have infinitely fast servers. That means we need to look at a system where we delegate all the CPU stuff that Moodle wastes time with (e.g. overhead due to being written in PHP, inefficient coding) to front-end servers. Performance of front-end servers literally doesn't matter because you can have as many as you need. That's where the 'throw hardware at it rather than coding efficiently' approach comes in.
Unfortunately at the moment it is difficult to configure Moodle to work with multiple database servers (master/read-only slave clusters), which means that the critical bottleneck is database performance, which means it's important to reduce unnecessary database queries.
I think our database server currently has 64GB RAM. Maybe the next upgrade will have 256GB; that would be enough to hold the entire DB in RAM again I guess...
Talking of infinite computing power, does anybody have a record of huge (computing) power spills? Here is my global candidate: "Facebook's Oregon Data Center Uses As Much Power As Entire County" http://hardware.slashdot.org/story/12/01/31/0355228/facebooks-oregon-data-center-uses-as-much-power-as-entire-county. In the Moodle landscape I saw this one just today in the "Windows-based servers" forum: "Slow upload files" http://moodle.org/mod/forum/discuss.php?d=194935, IIS7 grinding 64GB RAM and 24 cores to a halt behind a 10 Gbit/s link.
Sam, sir -
Where can I buy one of these infinitely-fast processors? I think what I was actually trying to argue is the same point I would make in my systems class: a 3GHz processor is only 3GHz if it's running from cache; as soon as it runs from RAM it's a 240MHz processor; and as soon as it starts paging it's only a 2.4kHz processor. One of the experiments we conduct is to disable the hw L1 and L2 caches and do performance testing/benchmarking, then likewise force the machine into severe paging mode. Then we play games with compact reference sets vs scattered ones. After 12 weeks the systems engineers have a better handle on what the numbers really mean. I know of several bank transaction processing systems that run with 1TB of RAM and simply start blocking incoming work if there's a danger of not being able to map everything to memory, ie, the cost of overcommitting and having to flush is sooo severe that it's better to refuse transactions than to allow an overcommit to occur. Just crossing that overcommit line by 0.1% reduces the entire system throughput by 90% or worse. In that mode one needs 10-12x the processing capability to recover; or, one can buy one additional box, split the load 50-50, and have two boxes with almost 2x headroom. Not to be able to entirely map your db into RAM is a crime. And to upgrade only after one has overflowed what one has is equally a crime. Generally we teach that 2x headroom is the minimum safe margin. I've seen managers told they had 30 minutes to clean out their desk for not bringing in additional boxes before they were needed.
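To put some arithmetic behind those numbers, here's a back-of-envelope sketch (the rates are my illustrative figures from above, not measurements; the effective speed is just the time-weighted harmonic mean of the tiers):

    <?php
    // Illustrative tier speeds in "operations per second".
    $rates = array('cache' => 3.0e9, 'ram' => 240e6, 'paging' => 2.4e3);

    // Effective rate, given what fraction of operations land in each tier.
    function effective_rate(array $mix, array $rates) {
        $time_per_op = 0;
        foreach ($mix as $tier => $fraction) {
            $time_per_op += $fraction / $rates[$tier]; // seconds spent per op in this tier
        }
        return 1 / $time_per_op;
    }

    // 99% cache hits, 1% RAM: still roughly 2.7e9 ops/s.
    printf("%.2e ops/s\n", effective_rate(array('cache' => 0.99, 'ram' => 0.01), $rates));

    // Shift just 0.1% of operations into paging and the box collapses to ~2.4e6 ops/s.
    printf("%.2e ops/s\n", effective_rate(array('cache' => 0.989, 'ram' => 0.01, 'paging' => 0.001), $rates));

That 0.1% of paging costing three orders of magnitude is exactly the overcommit cliff I'm describing.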
There are probably a lot of memory utilisation problems with multicore processors in general; I'm not an expert on the hardware details.
Regarding the database fitting in RAM or not: I think the Postgres recommendation is that if you want your DB to actually sit in RAM, your machine's RAM needs to be at least twice the size of the DB. Our 1.9 database is about 250GB, and I assume we're using relatively cheap commodity servers; I don't think you can fit 512GB of RAM in one of those. (And yes - this does cause problems with certain queries - but mostly, it's okay if rarely-used data slips out of file cache.)
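If anyone wants to check their own numbers, the standard Postgres catalog functions will tell you ('moodle' below is just a placeholder database name):

    -- Total size of the database on disk.
    SELECT pg_size_pretty(pg_database_size('moodle'));

    -- The ten biggest tables (including their indexes) - in a Moodle site
    -- the log table is usually near the top.
    SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY pg_total_relation_size(oid) DESC
     LIMIT 10;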
Basically we are trying to run some course websites and it doesn't seem like we should have to get a huge box with 'Cray' on the front to achieve that.
Anyhow it does work okay without a supercomputer - but I think database query count optimisation (reducing the number of queries used for a page) is definitely the best way to improve performance (if there is nothing critically wrong with it otherwise). It'll help performance both for places like here, where it basically means we can just be a bit more relaxed about it and maybe not have to upgrade a server quite so soon, and for people who run the system on a $20/month hosting account that not only puts their database and their webserver on the same box, but also 100 other people's databases and webservers.
PS Regarding processor cache - as noted I'm not an expert, but I would think there might be better results on this if you put PHP requests on one box and database requests on a different one, so it isn't constantly switching between totally different code bases and datasets.
Clearly there is a point at which one must separate, and yes, after one does that there is much more linear scaling... and with the number of students at OU you're clearly above that point. But my experience has been that too many shops make that jump way too early (*), before really optimizing what they have... as I posted before: 64GB of memory is less than $600 and can go a long way toward making much better use of the resources one already has. I strongly suspect most 2-year and 4-year schools can get by with a lot less in terms of hardware if they'd balance the resources and, more importantly, understand how they're being used. To have all of Moodle in the APC cache, the entire db in memory, the current working set of data memory-mapped, and each transaction processed from end to end in one go, ie, no task swaps, no processor stalls, is clearly the optimal way to go. Anything else becomes exponentially expensive.
(*) I also see a lot of "mine's bigger than yours" syndrome, where there seems to be a point of pride in splitting off the db (and then hanging it off a 100Mbps link :-O ), just to be able to say they have a separate db server, with zero clue that doing so just cost them 500% in response time, besides a huge hit to total system reliability.
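W/re the APC cache part of that - it's just a couple of php.ini lines (sizes illustrative; older APC versions want a plain number of megabytes for shm_size):

    ; Illustrative values - the point is to size the cache so the whole code
    ; base fits without the cache ever filling and fragmenting.
    apc.enabled  = 1
    apc.shm_size = 256M
    apc.stat     = 1   ; set to 0 only if code files never change in place
                       ; (faster, but needs a cache clear after every code change)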
Yes. I think you are correct here.
However, another issue is robustness, as well as performance. If you have several web servers, and one of them dies, then your site does not go down.
Similarly, if you want to set up master-slave replication, so you have a hot standby of your database that you can switch to with minimal down-time if your master DB server crashes, that is also easier to set up if your DB server is just a DB server.
I am not saying you must do it that way, I am just saying that these are also important issues that must be considered.